From ANN to Transformers: Understanding Modern Deep Learning Architectures