ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models Paper • 2510.06014 • Published Oct 7 • 10
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published about 1 month ago • 208
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27 • 96
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models Paper • 2503.09567 • Published Mar 12
Efficient Process Reward Model Training via Active Learning Paper • 2504.10559 • Published Apr 14 • 13
End-to-end Task-oriented Dialogue: A Survey of Tasks, Methods, and Future Directions Paper • 2311.09008 • Published Nov 15, 2023
OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation Paper • 2505.23885 • Published May 29
AI4Research: A Survey of Artificial Intelligence for Scientific Research Paper • 2507.01903 • Published Jul 2 • 4
Cross-lingual Prompting: Improving Zero-shot Chain-of-Thought Reasoning across Languages Paper • 2310.14799 • Published Oct 23, 2023
CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models Paper • 2505.19108 • Published May 25
Aware First, Think Less: Dynamic Boundary Self-Awareness Drives Extreme Reasoning Efficiency in Large Language Models Paper • 2508.11582 • Published Aug 15 • 1
Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures Paper • 2510.14616 • Published Oct 16 • 11
COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes Paper • 2510.14763 • Published Oct 16 • 13
Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1 Paper • 2510.19600 • Published Oct 22 • 68
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13 • 176