Weekly
Deep Learning
Neural networks, architectures, and the math that powers modern AI.
Transformers, diffusion models, state space models, CNNs, GNNs — the architectures that actually power the models everyone talks about. This newsletter goes deeper than "GPT is good" into how and why these systems work, what's changing, and where the field is heading.
Latest Issue
deep-learning News: NVIDIA validates NVFP4 pretraining on a 12B Mamba-Transformer at 10T t — May 26, 2026
NVFP4 makes the leap from theory to 10T tokens, and Mamba-3 keeps eating into the attention budget. The week's center of gravity sits in numerical precision: NVIDIA published a 4-bit pretraining recipe that holds at the multi-trillion-token horizon, which is the regime where every prior low-precisi...
Free. Includes the main AI Weekly briefing.
What we cover
- Neural network architectures
- Training techniques & optimization
- Transformers & attention mechanisms
- Diffusion models & generative architectures
- Hardware & compute for deep learning
Who it's for
ML engineers, researchers, and students who want to understand AI at the architecture level.