- Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
  Paper • 2505.22618 • Published • 45
- DINGO: Constrained Inference for Diffusion LLMs
  Paper • 2505.23061 • Published • 31
- Discrete Diffusion in Large Language and Multimodal Models: A Survey
  Paper • 2506.13759 • Published • 43
- LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
  Paper • 2506.14429 • Published • 44
Collections
Collections including paper arxiv:2602.22661

- dLLM: Simple Diffusion Language Modeling
  Paper • 2602.22661 • Published • 153
- OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data
  Paper • 2603.15594 • Published • 149
- Qianfan-OCR: A Unified End-to-End Model for Document Intelligence
  Paper • 2603.13398 • Published • 154
- Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
  Paper • 2603.06569 • Published • 119

- THINKSAFE: Self-Generated Safety Alignment for Reasoning Models
  Paper • 2601.23143 • Published • 39
- PaperBanana: Automating Academic Illustration for AI Scientists
  Paper • 2601.23265 • Published • 227
- Agentic Reasoning for Large Language Models
  Paper • 2601.12538 • Published • 204
- BabyVision: Visual Reasoning Beyond Language
  Paper • 2601.06521 • Published • 201

- MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild
  Paper • 2603.17187 • Published • 139
- Attention Residuals
  Paper • 2603.15031 • Published • 183
- MOSS-TTS Technical Report
  Paper • 2603.18090 • Published • 12
- MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens
  Paper • 2603.23516 • Published • 49

- dLLM: Simple Diffusion Language Modeling
  Paper • 2602.22661 • Published • 153
- Fara-7B: An Efficient Agentic Model for Computer Use
  Paper • 2511.19663 • Published • 17
- LMCache: An Efficient KV Cache Layer for Enterprise-Scale LLM Inference
  Paper • 2510.09665 • Published • 5
- PersonaLive! Expressive Portrait Image Animation for Live Streaming
  Paper • 2512.11253 • Published • 40

- IndustryShapes: An RGB-D Benchmark dataset for 6D object pose estimation of industrial assembly components and tools
  Paper • 2602.05555 • Published
- MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
  Paper • 2410.13790 • Published
- dLLM: Simple Diffusion Language Modeling
  Paper • 2602.22661 • Published • 153
- VGG-T^3: Offline Feed-Forward 3D Reconstruction at Scale
  Paper • 2602.23361 • Published • 15

- Towards Scalable Pre-training of Visual Tokenizers for Generation
  Paper • 2512.13687 • Published • 106
- MMGR: Multi-Modal Generative Reasoning
  Paper • 2512.14691 • Published • 121
- Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
  Paper • 2512.23447 • Published • 99
- LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
  Paper • 2512.23576 • Published • 66