FaithLens: Detecting and Explaining Faithfulness Hallucination Paper • 2512.20182 • Published Dec 23, 2025 • 9
VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos Paper • 2510.19488 • Published Oct 22, 2025 • 20
BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions Paper • 2510.05318 • Published Oct 6, 2025 • 22