- Prefix Linear Attention Can Outspeed Causal Linear Attention
  Notes on Prefix Language Modeling--and a surprising observation that PrefixLM can be *faster* than Causal LM under some architectural conditions.
- Linear Attention Fundamentals
  The basics of linear attention in sub-quadratic language model architectures.