-
Drifting VQ-VAE -- How "drifting models" fix failure modes of VQ-VAE
-
Loss landscape visualization 1 -- Seeing sticky plateau
-
Research agent 1 -- Reproducing 2026-01-01 blog (physics of feature learning)
-
Research agents should target knowledge graphs, not papers
-
181-parameter transformer-like models for 10-digit addition