FROST: Filtering Reasoning Outliers with Attention for Efficient Reasoning
Abstract
FROST is an attention-aware method that improves reasoning efficiency by pruning uncritical reasoning paths and removing reasoning outliers, reducing token usage while improving accuracy.
We propose FROST, an attention-aware method for efficient reasoning. Unlike traditional approaches, FROST leverages attention weights to prune uncritical reasoning paths, yielding shorter and more reliable reasoning trajectories. Methodologically, we introduce the concept of reasoning outliers and design an attention-based mechanism to remove them. Theoretically, FROST preserves and enhances the model's reasoning capacity while eliminating outliers at the sentence level. Empirically, we validate FROST on four benchmarks using two strong reasoning models (Phi-4-Reasoning and GPT-OSS-20B), outperforming state-of-the-art methods such as TALE and ThinkLess. Notably, FROST achieves an average 69.68% reduction in token usage and a 26.70% improvement in accuracy over the base model. Furthermore, in evaluations of attention outlier metrics, FROST reduces the maximum infinity norm by 15.97% and the average kurtosis by 91.09% compared to the base model. Code is available at https://github.com/robinzixuan/FROST.
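To make the abstract's two ingredients concrete, here is a minimal, hypothetical Python sketch of (a) scoring sentences by the attention mass their tokens receive and pruning low-scoring ("uncritical") ones, and (b) the attention outlier statistics the abstract reports (infinity norm and kurtosis). All function names, the head/layer-averaged attention matrix, the `keep_ratio` parameter, and the exact metric definitions are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of attention-based sentence pruning and outlier metrics.
# Names and metric definitions are assumptions, not the FROST codebase API.
import numpy as np

def sentence_attention_scores(attn, sentence_spans):
    """Score each sentence by the attention mass its tokens receive.

    attn: (num_tokens, num_tokens) attention matrix, assumed averaged over heads/layers.
    sentence_spans: list of (start, end) token-index ranges, one per sentence.
    """
    received = attn.sum(axis=0)  # total attention each token receives from all queries
    return np.array([received[s:e].mean() for s, e in sentence_spans])

def prune_uncritical_sentences(sentences, scores, keep_ratio=0.5):
    """Keep the highest-scoring sentences; drop low-attention ones."""
    k = max(1, int(len(sentences) * keep_ratio))
    keep = set(np.argsort(scores)[-k:])
    return [s for i, s in enumerate(sentences) if i in keep]

def attention_outlier_metrics(attn):
    """Outlier statistics of an attention matrix.

    'Infinity norm' is taken here as the elementwise max |a_ij| (an assumption;
    the paper may use the operator norm). Kurtosis is the standard fourth moment
    normalized by squared variance, over all entries.
    """
    inf_norm = np.abs(attn).max()
    x = attn.ravel()
    kurtosis = ((x - x.mean()) ** 4).mean() / (x.var() ** 2 + 1e-12)
    return inf_norm, kurtosis

# Toy usage: random "attention" over 12 tokens split into 3 reasoning sentences.
rng = np.random.default_rng(0)
attn = rng.random((12, 12))
attn /= attn.sum(axis=-1, keepdims=True)  # rows sum to 1, like softmax attention
spans = [(0, 4), (4, 8), (8, 12)]
sentences = ["Step 1 ...", "Step 2 ...", "Step 3 ..."]
scores = sentence_attention_scores(attn, spans)
print(prune_uncritical_sentences(sentences, scores, keep_ratio=0.67))
print(attention_outlier_metrics(attn))
```

In this reading, sentences whose tokens attract little attention are treated as reasoning outliers and dropped, which is what would shorten trajectories; lower max infinity norm and kurtosis would then indicate a flatter, less outlier-dominated attention distribution.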
Community
ICLR2026
Similar papers recommended by the Semantic Scholar API (via Librarian Bot):
- Correct, Concise and Complete: Multi-stage Training For Adaptive Reasoning (2026)
- Long-Chain Reasoning Distillation via Adaptive Prefix Alignment (2026)
- Mid-Think: Training-Free Intermediate-Budget Reasoning via Token-Level Triggers (2026)
- How Does Prefix Matter in Reasoning Model Tuning? (2026)
- Anti-Length Shift: Dynamic Outlier Truncation for Training Efficient Reasoning Models (2026)
- CtrlCoT: Dual-Granularity Chain-of-Thought Compression for Controllable Reasoning (2026)
- ENTRA: Entropy-Based Redundancy Avoidance in Large Language Model Reasoning (2026)