AI News Weekly


Clement Delangue

684 trust @clementdelangue · 413,575 followers
What they're sharing

Articles & links

RT @sundeep: https://t.co/6TzHB4ujWb

nvidia/GLM-5.2-NVFP4 · Hugging Face huggingface.co
3d ago

Kog open-sourced on @huggingface the 2B model that they used to show a model running at 3,000+ tokens per second. Very cool work! https://t.co/fjCnAwQoWe https://t.co/k8hD7xW0F7

Kog Laneformer 2B: The Latency-First Model Behind Kog Inference Engine huggingface.co
AI Weekly's analysis:
  • Kog released Laneformer 2B, a 2.3B-parameter instruction-tuned coding model built around decoding speed rather than benchmark score.
  • The team reports 3,000 output tokens/s on 8× AMD MI300X and 2,100 on 8× NVIDIA H200 at FP16, batch size 1.
  • Laneformer 2B scores 45.1% on HumanEval+ and 51.6% on MBPP+ in greedy decoding, with sliding-window attention on 10 of 15 layers.
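
At batch size 1, the reported throughput converts directly into per-token latency, which is the quantity a latency-first model targets. The sketch below does only that arithmetic on the two figures quoted above. The function and variable names are illustrative rather than taken from Kog's release, and it assumes the quoted tokens/s are steady-state decoding rates, ignoring prompt processing time.

```python
# Convert the reported batch-size-1 decoding throughput into per-token latency.
# Assumption: the quoted tokens/s are steady-state decode rates (no prefill time).

def ms_per_token(tokens_per_second: float) -> float:
    """Milliseconds spent per generated token at a given decode rate."""
    return 1000.0 / tokens_per_second

# Figures as quoted in the summary above (FP16, batch size 1).
reported_tokens_per_second = {
    "8x AMD MI300X": 3000.0,
    "8x NVIDIA H200": 2100.0,
}

latency_ms = {
    hardware: ms_per_token(rate)
    for hardware, rate in reported_tokens_per_second.items()
}

# Relative decode speed of the MI300X setup over the H200 setup.
speed_ratio = (
    reported_tokens_per_second["8x AMD MI300X"]
    / reported_tokens_per_second["8x NVIDIA H200"]
)

for hardware, ms in latency_ms.items():
    print(f"{hardware}: {ms:.3f} ms per token")
print(f"MI300X / H200 speed ratio: {speed_ratio:.2f}")
```

Under these assumptions the figures work out to roughly a third of a millisecond per token on the MI300X setup and just under half a millisecond on the H200 setup.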
5d ago

©2015-2026 AI News Weekly
