Weekly

Deep Learning

Neural networks, architectures, and the math that powers modern AI.

Transformers, diffusion models, state space models, CNNs, GNNs — the architectures that actually power the models everyone talks about. This newsletter goes deeper than "GPT is good" into how and why these systems work, what's changing, and where the field is heading.

Latest Issue

deep-learning News: NVIDIA validates NVFP4 pretraining on a 12B Mamba-Transformer at 10T t — May 26, 2026

NVFP4 makes the leap from theory to 10T tokens, and Mamba-3 keeps eating into the attention budget. The week's center of gravity sits in numerical precision: NVIDIA published a 4-bit pretraining recipe that holds at the multi-trillion-token horizon, which is the regime where every prior low-precisi...

What we cover

  • Neural network architectures
  • Training techniques & optimization
  • Transformers & attention mechanisms
  • Diffusion models & generative architectures
  • Hardware & compute for deep learning

Who it's for

ML engineers, researchers, and students who want to understand AI at the architecture level.