Startup Subquadratic Claims Sparse-Attention LLM Cuts Costs and Boosts Speed | TekBrief

Executive Briefing

  • Announces SubQ, a sparse-attention LLM that the company says runs 56 times faster than FlashAttention-based models in benchmarks
  • Scores 89.7% on LiveCodeBench, placing it alongside top coding models from OpenAI, Google DeepMind, and Anthropic
  • Claims dramatic cost reduction, citing $8 to run a large-dataset retrieval test versus $2,600 for the same test on Anthropic's Opus 4.6
  • Offers a 12-million-token context window, roughly 12 times the window of most current frontier models
  • Third-party evaluator Appen validates core architectural claims, though SubQ remains unavailable for broad public testing
  • Founders assert transformers could become obsolete within years if sparse-attention approaches gain wider adoption
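
For readers who want to sanity-check the headline numbers, the arithmetic behind the briefing can be sketched in a few lines of Python. The figures are the company's claims as quoted above, not independent measurements. The function names are hypothetical, and the last calculation assumes a conventional dense-attention model whose pairwise token comparisons grow with the square of the context length.

```python
# Figures as quoted in the briefing (company claims, not independent measurements).
SUBQ_COST_USD = 8.0               # claimed cost of the retrieval test on SubQ
OPUS_COST_USD = 2600.0            # claimed cost of the same test on Opus 4.6
SUBQ_CONTEXT_TOKENS = 12_000_000  # claimed SubQ context window
CONTEXT_MULTIPLE = 12             # "roughly 12 times" other frontier models


def cost_ratio(baseline_usd: float, candidate_usd: float) -> float:
    """How many times more the baseline costs than the candidate."""
    return baseline_usd / candidate_usd


def implied_typical_context(large_context: int, multiple: int) -> int:
    """Context window implied for a model with 1/multiple of the large window."""
    return large_context // multiple


def dense_attention_growth(small_n: int, large_n: int) -> float:
    """Growth in pairwise token comparisons under dense (n^2) attention.

    Assumption: full attention compares every token with every other token,
    so the work scales with the square of the sequence length.
    """
    return (large_n / small_n) ** 2


ratio = cost_ratio(OPUS_COST_USD, SUBQ_COST_USD)
typical = implied_typical_context(SUBQ_CONTEXT_TOKENS, CONTEXT_MULTIPLE)
growth = dense_attention_growth(typical, SUBQ_CONTEXT_TOKENS)

print(f"Claimed cost ratio: {ratio:.0f}x")
print(f"Implied typical context window: {typical:,} tokens")
print(f"Dense-attention work at 12M vs. typical window: {growth:.0f}x")
```

On these figures, the claimed cost gap works out to 325 times, and the comparison implies a typical frontier window of about one million tokens. Under the dense-attention assumption, stretching a window by 12 times multiplies the comparison work by 144, which is the scaling problem that sparse-attention designs aim to avoid.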