Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion Paper • 2512.04926 • Published 6 days ago • 40
RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning Paper • 2510.02240 • Published Oct 2 • 17
ReasonMap Collection A fine-grained visual reasoning benchmark (We show more question types in the extension dataset.) • 3 items • Updated Oct 1 • 8
A Coarse-to-Fine Approach to Multi-Modality 3D Occupancy Grounding Paper • 2508.01197 • Published Aug 2 • 5
Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport Paper • 2308.01779 • Published Aug 3, 2023 • 1
Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning Paper • 2503.00513 • Published Mar 1 • 1
Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction Paper • 2503.23109 • Published Mar 29
PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance Paper • 2406.09326 • Published Jun 13, 2024 • 1