nielsr (HF Staff) committed
Commit fe8d8fa · verified · parent b145030

Add metadata and link to paper


Hi! I'm Niels from the Hugging Face community team. I'm opening this PR to enhance your model card with standard metadata:
- Added `pipeline_tag: image-to-image` to ensure the model appears in the correct category on the Hub.
- Added `library_name: diffusers` as the configuration indicates compatibility with the diffusers ecosystem.
- Linked the model to its [Hugging Face paper page](https://huggingface.co/papers/2603.13089).

This metadata helps researchers find and use your work more easily!
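
As a quick illustration of why `library_name: diffusers` is useful, the Hub can then surface a generic loading snippet along these lines. This is only a minimal sketch: the repository id below is a placeholder, and whether this checkpoint loads directly through the generic `DiffusionPipeline` entry point is an assumption, not something this PR verifies.

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder repo id -- replace with the actual Hub id of this model.
repo_id = "your-org/V-Bridge"

# Generic diffusers entry point; the concrete pipeline class is resolved
# from the repository's model_index.json, assuming one is present.
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
pipe.to("cuda")
```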

Files changed (1)
  1. README.md +21 -6
README.md CHANGED
@@ -1,24 +1,29 @@
  ---
  license: apache-2.0
+ library_name: diffusers
+ pipeline_tag: image-to-image
  ---
+
  <p align="center">
- 📄 <a href="https://arxiv.org/pdf/2603.13089" target="_blank">Paper</a> &nbsp; | &nbsp;
+ 📄 <a href="https://huggingface.co/papers/2603.13089" target="_blank">Paper</a> &nbsp; | &nbsp;
  🖥️ <a href="https://github.com/Zhengsh123/V-Bridge" target="_blank">Code</a> &nbsp; &nbsp;
  🌐 <a href="https://zhengsh123.github.io/V-Bridge/" target="_blank">Website</a> &nbsp; &nbsp;
  </p>

- This repo contains the model for the paper V-Bridge: Bridging Video Generative Priors to Versatile Few-shot Image Restoration.
+ This repository contains the model for the paper [V-Bridge: Bridging Video Generative Priors to Versatile Few-shot Image Restoration](https://huggingface.co/papers/2603.13089).

  # Overview
- Large-scale video generative models are trained on vast and diverse visual data, enabling them to internalize rich structural, semantic, and dynamic priors of the visual world. While these models have demonstrated impressive generative capability, their potential as general-purpose visual learners remains largely untapped. In this work, we introduce V-Bridge, a framework that bridges this latent capacity to versatile few-shot image restoration tasks. We reinterpret image restoration not as a static regression problem, but as a progressive generative process, and leverage video models to simulate the gradual refinement from degraded inputs to high-fidelity outputs. Surprisingly, with only 1,000 multi-task training samples (less than 2% of existing restoration methods), pretrained video models can be induced to perform competitive image restoration, achieving multiple tasks with a single model, rivaling specialized architectures designed explicitly for this purpose. Our findings reveal that video generative models implicitly learn powerful and transferable restoration priors that can be activated with only extremely limited data, challenging the traditional boundary between generative modeling and low-level vision, and opening a new design paradigm for foundation models in visual tasks.
+ Large-scale video generative models are trained on vast and diverse visual data, enabling them to internalize rich structural, semantic, and dynamic priors of the visual world. V-Bridge is a framework that bridges this latent capacity to versatile few-shot image restoration tasks. By reinterpreting image restoration as a progressive generative process, V-Bridge leverages video models to simulate the gradual refinement from degraded inputs to high-fidelity outputs.
+
+ Surprisingly, with only 1,000 multi-task training samples (less than 2% of existing restoration methods), pretrained video models can be induced to perform competitive image restoration, achieving multiple tasks with a single model and rivaling specialized architectures designed explicitly for this purpose.

  # Details

  Our model uses a full fine-tuning approach, with the base model being [Wan2.2-TI2V-5B](https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B).

- The following are some of the detailed parameters for inference.
+ The following are some of the detailed parameters for inference:

- ```
+ ```python
  cfg_skip_ratio = 0.15

  sampler_name = "Flow_Unipc"
@@ -58,4 +63,14 @@ num_inference_steps = 50
  More details and usage instructions can be found on [GitHub](https://github.com/Zhengsh123/V-Bridge).

  # Acknowledgements
- We would like to thank the contributors to [Wan-AI](https://huggingface.co/Wan-AI), [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun) and HuggingFace repositories, for their open research.
+ We would like to thank the contributors to [Wan-AI](https://huggingface.co/Wan-AI), [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun) and HuggingFace repositories, for their open research.
+
+ # Citation
+ ```bibtex
+ @article{zheng2026V-Bridge,
+ title={V-Bridge: Bridging Video Generative Priors to Versatile Few-shot Image Restoration},
+ author={Zheng, Shenghe and Jiang, Junpeng and Li, Wenbo},
+ journal={arXiv preprint arXiv:2603.13089},
+ year={2026}
+ }
+ ```