LOVE-R1: Advancing Long Video Understanding with an Adaptive Zoom-in Mechanism via Multi-Step Reasoning • Paper • 2509.24786 • Published Sep 29, 2025 • 5
Temporal Memory Attention for Video Semantic Segmentation • Paper • 2102.08643 • Published Feb 17, 2021
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion • Paper • 2407.07844 • Published Jul 10, 2024 • 1
FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models • Paper • 2508.20586 • Published Aug 28, 2025 • 3
WebNovelBench: Placing LLM Novelists on the Web Novel Distribution • Paper • 2505.14818 • Published May 20, 2025 • 4
CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation • Paper • 2501.11325 • Published Jan 20, 2025 • 5
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding • Paper • 2406.08877 • Published Jun 13, 2024
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models • Paper • 2407.15886 • Published Jul 21, 2024 • 3