Grid2Matrix: Revealing Digital Agnosia in Vision-Language Models
Paper • 2604.09687 • Published • 7
Shared Nature, Unique Nurture: PRISM for Pluralistic Reasoning via In-context Structure Modeling
PALMS+: Modular Image-Based Floor Plan Localization Leveraging Depth Foundation Model