Ziming Liu
  • Sparse attention 3 -- inefficiency of extracting similar content

    5 min read   ·   January 12, 2026

    2026   ·   Physics-of-AI   Sparse-attention     ·   AI  

  • Emergence of Induction Head Depends on Learning Rate Schedule

    8 min read   ·   January 11, 2026

    2026   ·   Physics-of-AI   Induction-head   Learning-rate-schedule     ·   AI  

  • Sparse attention 2 -- Unattention head, branching dynamics

    6 min read   ·   January 10, 2026

    2026   ·   Physics-of-AI   Sparse-attention     ·   AI  

  • Sparse attention 1 -- sticky plateau and rank collapse

    6 min read   ·   January 09, 2026

    2026   ·   Physics-of-AI   Sparse-attention     ·   AI  

  • Unigram toy model is surprisingly rich -- representation collapse, scaling laws, learning rate schedule

    10 min read   ·   January 08, 2026

    2026   ·   Physics-of-AI   Toy-language     ·   AI  

© Copyright 2026 Ziming Liu.