Multi-Vector Index Compression in Any Modality Collection Models and Paper for Multi-Vector Index Compression in Any Modality • 15 items • Updated Mar 9 • 3
MOSS-ChatV: Reinforcement Learning with Process Reasoning Reward for Video Temporal Reasoning Paper • 2509.21113 • Published Sep 25, 2025 • 6
EffiReason-Bench: A Unified Benchmark for Evaluating and Advancing Efficient Reasoning in Large Language Models Paper • 2511.10201 • Published Nov 13, 2025
Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models Paper • 2603.17541 • Published Mar 18 • 20
UME-R1 Collection UME-R1 is a framework designed to endow multimodal embedding models with the flexibility to switch between discriminative and generative embeddings • 5 items • Updated Mar 23 • 9
AnglE-based Embeddings Collection This collection consists of popular embeddings trained with AnglE: https://github.com/SeanLee97/AnglE • 9 items • Updated Aug 1, 2024 • 4
MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model Paper • 2406.11193 • Published Jun 17, 2024
Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality Paper • 2410.04780 • Published Oct 7, 2024 • 1
UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web Paper • 2310.18340 • Published Oct 22, 2023