- Prefix Linear Attention Can Outspeed Causal Linear Attention
  Notes on Prefix Language Modeling--and a surprising observation that PrefixLM can be *faster* than Causal LM under some architectural conditions.
- Linear Attention Fundamentals
  The basics of linear attention in sub-quadratic language model architectures.