Cai
Mona8834
AI & ML interests
None yet
Organizations
None yet
LLM Papers
- MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data
  Paper • 2603.25319 • Published • 32
- Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale
  Paper • 2603.25040 • Published • 133
- MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding
  Paper • 2603.22458 • Published • 137
- Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model
  Paper • 2603.21986 • Published • 125
models 0
None public yet
datasets 0
None public yet