- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 30
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 142
- Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
  Paper • 2504.13837 • Published • 138
- Learning to Reason under Off-Policy Guidance
  Paper • 2504.14945 • Published • 88
Collections
Collections including paper arxiv:2410.05779
- Financial Fraud Detection: A Comparative Study of Quantum Machine Learning Models
  Paper • 2308.05237 • Published • 1
- LightRAG: Simple and Fast Retrieval-Augmented Generation
  Paper • 2410.05779 • Published • 23
- Realistic Synthetic Financial Transactions for Anti-Money Laundering Models
  Paper • 2306.16424 • Published
- Combating Financial Crimes with Unsupervised Learning Techniques: Clustering and Dimensionality Reduction for Anti-Money Laundering
  Paper • 2403.00777 • Published

- LLMs + Persona-Plug = Personalized LLMs
  Paper • 2409.11901 • Published • 35
- LightRAG: Simple and Fast Retrieval-Augmented Generation
  Paper • 2410.05779 • Published • 23
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 200
- PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
  Paper • 2501.03936 • Published • 23

- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 158
- LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
  Paper • 2403.13372 • Published • 173
- MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
  Paper • 2508.14704 • Published • 43
- Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
  Paper • 2508.09834 • Published • 53

- HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking
  Paper • 2505.02322 • Published • 1
- Affordable AI Assistants with Knowledge Graph of Thoughts
  Paper • 2504.02670 • Published • 2
- PolyG: Effective and Efficient GraphRAG with Adaptive Graph Traversal
  Paper • 2504.02112 • Published • 2
- Retrieval-Augmented Generation with Hierarchical Knowledge
  Paper • 2503.10150 • Published • 2

- PretrainZero: Reinforcement Active Pretraining
  Paper • 2512.03442 • Published • 39
- UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs
  Paper • 2512.03383 • Published • 3
- ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
  Paper • 2511.21689 • Published • 96
- Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
  Paper • 2511.18890 • Published • 29

- MASS: Motion-Aware Spatial-Temporal Grounding for Physics Reasoning and Comprehension in Vision-Language Models
  Paper • 2511.18373 • Published • 5
- Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
  Paper • 2511.13288 • Published • 17
- Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
  Paper • 2511.19418 • Published • 26
- SAM 3: Segment Anything with Concepts
  Paper • 2511.16719 • Published • 107