Electrical Engineering Seminar and Special Problems – Advanced Topics in Deep Learning

Advanced deep learning techniques have revolutionized the field, enabling remarkable progress across various applications. This course will provide a comprehensive understanding of the latest models and methods that are shaping the future of deep learning, with a particular focus on probabilistic and geometric deep learning, and deep reinforcement learning.

We will cover advanced topics, including:


Probabilistic Deep Learning:

Large Language Models (LLMs): Delve into the architecture and training of large language models such as GPT, BERT, and their derivatives. Explore their probabilistic foundations, applications, and the challenges of scaling and fine-tuning these models.
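To give a concrete flavor of the probabilistic foundation, the autoregressive factorization log p(x) = Σ_t log p(x_t | x_<t) that underlies such models can be sketched in a few lines of NumPy. This is a toy illustration only: a hand-built bigram table stands in for the transformer's learned conditional distribution, and all tokens and probabilities are invented for the example.

```python
import numpy as np

# Toy autoregressive model: log p(x) = sum_t log p(x_t | x_<t).
# A bigram table stands in for a learned conditional distribution.
vocab = ["<s>", "the", "cat", "sat"]

# probs[i, j] = p(next token = j | current token = i); each row sums to 1.
probs = np.array([
    [0.00, 1.00, 0.00, 0.00],   # <s>  -> the
    [0.00, 0.00, 0.70, 0.30],   # the  -> cat / sat
    [0.00, 0.00, 0.00, 1.00],   # cat  -> sat
    [0.25, 0.25, 0.25, 0.25],   # sat  -> uniform
])

def sequence_log_prob(tokens):
    """Sum the log-probabilities of each token given the previous one."""
    idx = [vocab.index(t) for t in tokens]
    return sum(np.log(probs[i, j]) for i, j in zip(idx[:-1], idx[1:]))

print(round(np.exp(sequence_log_prob(["<s>", "the", "cat", "sat"])), 6))  # 0.7
```

Training an LLM amounts to fitting this conditional distribution with a neural network instead of a lookup table, typically by minimizing the negative log-likelihood over a large corpus.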

Latent Variable Models: Explore models such as variational auto-encoders (VAEs) and Bayesian neural networks. Understand inference techniques, including amortized and variational inference, as well as learning strategies such as REINFORCE and continuous relaxations.
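As a small preview of the VAE machinery, the reparameterization trick and the closed-form KL term of the evidence lower bound can be sketched in NumPy. This is an illustrative fragment under standard Gaussian assumptions, not a full model; the function names are invented for the example.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, sigma^2) as z = mu + sigma * eps, eps ~ N(0, I).

    Moving the randomness into eps makes the sample differentiable with
    respect to mu and log_var -- the reparameterization trick.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, I)), summed over dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

rng = np.random.default_rng(0)
mu, log_var = np.zeros(4), np.zeros(4)   # posterior equals the prior
z = reparameterize(mu, log_var, rng)
print(z.shape)                            # (4,)
print(kl_to_standard_normal(mu, log_var))  # 0.0
```

In a VAE, an encoder network predicts mu and log_var per data point (amortized inference), and this KL term regularizes the approximate posterior toward the prior.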

Diffusion Models and Flow Models: Study the principles and applications of diffusion models, a class of generative models that iteratively refine random noise into coherent data samples. Explore their advantages in terms of stability and diversity of generated samples, and their applications in image and audio generation. Additionally, study flow models, which transform simple probability distributions into complex ones through a series of invertible transformations, and understand their applications in density estimation and generative tasks.
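The invertible-transformation idea behind flow models can be previewed with a single affine layer and the change-of-variables formula, log p_Y(y) = log p_X(f⁻¹(y)) − log|det J_f|. This minimal sketch assumes a standard-normal base distribution and an elementwise affine map; the class and function names are illustrative.

```python
import numpy as np

class AffineFlow:
    """One invertible affine layer: y = a * x + b (elementwise, a != 0)."""
    def __init__(self, a, b):
        self.a = np.asarray(a, dtype=float)
        self.b = np.asarray(b, dtype=float)

    def forward(self, x):
        return self.a * x + self.b

    def inverse(self, y):
        return (y - self.b) / self.a

    def log_abs_det_jacobian(self):
        # The Jacobian of an elementwise affine map is diagonal with entries a.
        return np.sum(np.log(np.abs(self.a)))

def log_prob(y, flow):
    """Change of variables: log p_Y(y) = log p_X(f^{-1}(y)) - log|det J_f|."""
    x = flow.inverse(y)
    base = -0.5 * np.sum(x**2) - 0.5 * x.size * np.log(2 * np.pi)  # N(0, I)
    return base - flow.log_abs_det_jacobian()
```

Real flow models stack many such invertible layers (with learned, input-dependent parameters) so the composed map can turn a Gaussian into a complex data distribution while keeping exact densities tractable.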

Other Generative Models: Learn about Generative Adversarial Networks (GANs) and their probabilistic foundations. Discuss advanced GAN architectures, training stability issues, and their applications in image generation, data augmentation, and unsupervised learning.
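The adversarial objective at the heart of GAN training reduces to two binary cross-entropy losses: the discriminator labels real samples 1 and generated samples 0, while the generator tries to make its samples score 1. A minimal numeric sketch (the discriminator outputs here are invented placeholder values, not results from a trained model):

```python
import numpy as np

def bce(p, y):
    """Binary cross-entropy of predicted probabilities p against labels y."""
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Placeholder discriminator outputs D(x) on real and generated batches.
d_real = np.array([0.9, 0.8])
d_fake = np.array([0.2, 0.3])

# Discriminator: push D(real) toward 1 and D(fake) toward 0.
disc_loss = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

# Generator (non-saturating form): push D(fake) toward 1.
gen_loss = bce(d_fake, np.ones_like(d_fake))

print(disc_loss > 0 and gen_loss > 0)  # True
```

The training instability discussed in the course stems from this minimax structure: each network's loss landscape shifts as the other updates.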


Geometric Deep Learning:

Graph Neural Networks (GNNs): Study GNNs and their extensions for processing non-Euclidean data such as graphs and manifolds, along with their applications in fields like social network analysis, molecular modeling, and 3D computer vision.
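The core message-passing operation of a GNN layer can be previewed in a few lines: each node aggregates its neighbors' features and then applies a shared transformation. This is a simplified mean-aggregation sketch (the graph, features, and weights are toy values for illustration).

```python
import numpy as np

def gnn_layer(adj, features, weight):
    """One mean-aggregation message-passing layer.

    adj: (n, n) adjacency matrix; features: (n, d); weight: (d, d_out).
    Each node averages its neighbors' features (including itself via a
    self-loop), applies a shared linear map, then a ReLU nonlinearity.
    """
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                   # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)    # node degrees
    h = (a_hat / deg) @ features              # mean over neighborhoods
    return np.maximum(h @ weight, 0.0)        # linear map + ReLU

# Tiny triangle graph with 2-d node features.
adj = np.array([[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]], dtype=float)
x = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
out = gnn_layer(adj, x, np.eye(2))
print(out.shape)  # (3, 2)
```

Stacking such layers lets information propagate k hops across the graph after k layers, which is what makes GNNs effective on relational data like molecules and social networks.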

Transformers: Examine the architecture and applications of Transformers and attention mechanisms, with an emphasis on their probabilistic interpretations and applications in natural language processing and beyond.
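The attention mechanism admits a direct probabilistic reading: each query induces a softmax distribution over keys, and the output is the expected value under that distribution. A minimal scaled dot-product attention sketch in NumPy (single head, no masking; names are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    Each row of the weight matrix is a probability distribution over
    the keys, so the output is an expectation of the values under it.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)        # pairwise query-key similarities
    weights = softmax(scores, axis=-1)   # rows sum to 1
    return weights @ v, weights

q = np.array([[1.0, 0.0]])
k = np.array([[1.0, 0.0], [0.0, 1.0]])
v = np.array([[1.0, 2.0], [3.0, 4.0]])
out, w = attention(q, k, v)
print(w.sum(axis=-1))  # [1.]
```

Multi-head attention runs several such maps in parallel with different learned projections and concatenates the results.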

Group Equivariant Networks: Learn about group equivariant networks (G-CNNs) which leverage symmetries in data to improve learning efficiency and robustness. Study their applications in areas such as computer vision and 3D shape analysis, where capturing rotational and translational invariances is crucial.
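The defining property of these networks, equivariance, can be checked numerically in one dimension: a circular convolution commutes with cyclic shifts, i.e. conv(shift(x)) = shift(conv(x)). This toy check illustrates the translation case underlying G-CNNs (the signal and kernel values are arbitrary examples):

```python
import numpy as np

def circular_conv(signal, kernel):
    """1-D circular convolution (cross-correlation form)."""
    n, m = len(signal), len(kernel)
    return np.array([sum(signal[(i + j) % n] * kernel[j] for j in range(m))
                     for i in range(n)])

def shift(x, s):
    """Cyclic shift by s positions."""
    return np.roll(x, s)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kern = np.array([0.5, 0.25, 0.25])

# Equivariance: convolving a shifted signal equals shifting the convolved one.
lhs = circular_conv(shift(x, 2), kern)
rhs = shift(circular_conv(x, kern), 2)
print(np.allclose(lhs, rhs))  # True
```

Group equivariant networks generalize exactly this property from translations to richer symmetry groups such as rotations and reflections, which is why they excel on rotationally symmetric data like 3D shapes.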


Deep Reinforcement Learning (DRL):

Study policy gradient and actor-critic methods such as PPO, value-based methods such as deep Q-networks (DQNs), and model-based RL.
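The bootstrapped update that DQNs learn with a neural network can be previewed in its tabular form: regress Q(s, a) toward the target r + γ max_a' Q(s', a'). A minimal sketch with invented example values:

```python
import numpy as np

def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.99, done=False):
    """One tabular Q-learning step.

    The target r + gamma * max_a' Q(s', a') is the same bootstrapped
    target a DQN regresses onto with a neural network and replay buffer.
    """
    target = r if done else r + gamma * q[s_next].max()
    q[s, a] += alpha * (target - q[s, a])
    return q

# Toy table: 2 states x 2 actions, all values initialized to zero.
q = np.zeros((2, 2))
q = q_update(q, s=0, a=1, r=1.0, s_next=1, done=True)
print(q[0, 1])  # 0.1
```

Deep RL replaces the table with a function approximator, which introduces the stability issues (moving targets, correlated samples) that techniques like target networks and experience replay address.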


Prerequisites: A solid understanding of basic machine learning, probability, statistics, and linear algebra is essential. Proficiency in Python and familiarity with modern deep learning software packages, such as PyTorch or TensorFlow, are required to successfully complete the course project. Additionally, prior exposure to fundamental deep learning concepts is highly recommended.

3 credits