Ziming Liu
  • Sparse attention 4 -- previous token head

    6 min read   ·   January 13, 2026

    2026   ·   Physics-of-AI     ·   AI  

  • Sparse attention 3 -- inefficiency of extracting similar content

    5 min read   ·   January 12, 2026

    2026   ·   Physics-of-AI     ·   AI  

  • Emergence of Induction Head Depends on Learning Rate Schedule

    7 min read   ·   January 11, 2026

    2026   ·   Physics-of-AI     ·   AI  

  • Sparse attention 2 -- Unattention head, branching dynamics

    6 min read   ·   January 10, 2026

    2026   ·   Physics-of-AI     ·   AI  

  • Sparse attention 1 -- sticky plateau and rank collapse

    6 min read   ·   January 09, 2026

    2026   ·   Physics-of-AI     ·   AI  

© Copyright 2026 Ziming Liu.