Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models Paper • 2511.18890 • Published 15 days ago • 29
SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs Paper • 2510.05069 • Published Oct 6 • 12
Large Reasoning Models Learn Better Alignment from Flawed Thinking Paper • 2510.00938 • Published Oct 1 • 58
Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks Paper • 2510.02286 • Published Oct 2 • 28