- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 30
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 143
- Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
  Paper • 2504.13837 • Published • 139
- Learning to Reason under Off-Policy Guidance
  Paper • 2504.14945 • Published • 88
Collections
Collections including paper arxiv:2410.05779
- Tabular Transformers for Modeling Multivariate Time Series
  Paper • 2011.01843 • Published • 2
- AnomalyBERT: Self-Supervised Transformer for Time Series Anomaly Detection using Data Degradation Scheme
  Paper • 2305.04468 • Published
- Realistic Synthetic Financial Transactions for Anti-Money Laundering Models
  Paper • 2306.16424 • Published
- Challenges and Complexities in Machine Learning based Credit Card Fraud Detection
  Paper • 2208.10943 • Published

- LLMs + Persona-Plug = Personalized LLMs
  Paper • 2409.11901 • Published • 35
- LightRAG: Simple and Fast Retrieval-Augmented Generation
  Paper • 2410.05779 • Published • 24
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 200
- PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
  Paper • 2501.03936 • Published • 23

- HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking
  Paper • 2505.02322 • Published • 1
- Affordable AI Assistants with Knowledge Graph of Thoughts
  Paper • 2504.02670 • Published • 2
- PolyG: Effective and Efficient GraphRAG with Adaptive Graph Traversal
  Paper • 2504.02112 • Published • 2
- Retrieval-Augmented Generation with Hierarchical Knowledge
  Paper • 2503.10150 • Published • 2

- SAM 3: Segment Anything with Concepts
  Paper • 2511.16719 • Published • 109
- Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
  Paper • 2512.08765 • Published • 93
- Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
  Paper • 2512.04677 • Published • 163
- LongCat-Image Technical Report
  Paper • 2512.07584 • Published • 15

- PretrainZero: Reinforcement Active Pretraining
  Paper • 2512.03442 • Published • 44
- UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs
  Paper • 2512.03383 • Published • 3
- ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
  Paper • 2511.21689 • Published • 100
- Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
  Paper • 2511.18890 • Published • 29

- MASS: Motion-Aware Spatial-Temporal Grounding for Physics Reasoning and Comprehension in Vision-Language Models
  Paper • 2511.18373 • Published • 5
- Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
  Paper • 2511.13288 • Published • 17
- Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
  Paper • 2511.19418 • Published • 27
- SAM 3: Segment Anything with Concepts
  Paper • 2511.16719 • Published • 109