On Subquadratic Architectures: From Applications to Principles
Abstract
Transformers dominate modern sequence modeling, but their quadratic attention incurs substantial computational cost. Subquadratic architectures offer a scalable alternative. However, it remains unclear which designs yield the most effective sequence models. We compare three leading approaches: xLSTM…