update README.md
README.md
CHANGED
|
@@ -1,3 +1,71 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
base_model:
|
| 4 |
+
- Qwen/Qwen3-VL-2B-Instruct
|
| 5 |
+
language:
|
| 6 |
+
- en
|
| 7 |
+
tags:
|
| 8 |
+
- text
|
| 9 |
+
- image
|
| 10 |
+
- video
|
| 11 |
+
- multimodal-embedding
|
| 12 |
+
- vidore
|
| 13 |
+
- colpali
|
| 14 |
+
- colqwen3
|
| 15 |
+
- multilingual-embedding
|
| 16 |
+
new_version: goodman2001/colqwen3-vlembed-base
|
| 17 |
+
pipeline_tag: visual-document-retrieval
|
| 18 |
+
---
|
| 19 |
+
|
| 20 |
+
# ColQwen3-vlembed-base: a visual retriever built by merging Qwen3-VL-2B-Instruct with Qwen3-VL-Embedding-2B using the ColBERT strategy
|
| 21 |
+
|
| 22 |
+
## 🙏🙏🙏 Why always 2B ??
|
| 23 |
+
|
| 24 |
+
Due to my limited computing resources, I can currently only conduct some interesting experiments with 2B/4B models.😝😝😝
|
| 25 |
+
|
| 26 |
+
<p align="center"><img width=800 src="https://github.com/illuin-tech/colpali/blob/main/assets/colpali_architecture.webp?raw=true"/></p>
|
| 27 |
+
|
| 28 |
+
## Usage
|
| 29 |
+
|
| 30 |
+
> [!WARNING]
|
| 31 |
+
> Do not use this version directly: it is only the base checkpoint, intended for deterministic LoRA initialization.
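
As a rough illustration of why a fixed base checkpoint helps with LoRA (this is an assumption about the intent, based on the usual LoRA formulation `W = W_base + B @ A`, not code from this repository): an adapter only reproduces the same effective weights when it is applied to exactly the same base weights. Every name and value below is hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, seed: int) -> Matrix:
    """Stand-in for a freshly initialised layer (hypothetical helper)."""
    rng = random.Random(seed)  # private generator, global state untouched
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain matrix product x @ y."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def apply_lora(base: Matrix, b: Matrix, a: Matrix) -> Matrix:
    """Effective weights: base + B @ A."""
    delta = matmul(b, a)
    return [[w + d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]


# A rank-1 adapter for a 2x3 layer (toy values, not trained weights).
lora_b = [[0.5], [-0.5]]      # shape 2 x 1
lora_a = [[1.0, 0.0, 2.0]]    # shape 1 x 3

# Applying the adapter to the same saved base always gives the same result.
saved_base = random_matrix(2, 3, seed=0)
same = apply_lora(saved_base, lora_b, lora_a) == apply_lora(saved_base, lora_b, lora_a)

# Re-initialising the layer with another seed changes the effective weights.
other_base = random_matrix(2, 3, seed=1)
differs = apply_lora(saved_base, lora_b, lora_a) != apply_lora(other_base, lora_b, lora_a)

print(same, differs)
```

Under this reading, publishing the base checkpoint plays the role of `saved_base`: adapters trained by different people start from, and are later loaded onto, identical weights.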
|
| 32 |
+
|
| 33 |
+
This model is built by merging **[Qwen/Qwen3-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct)** with **[Qwen/Qwen3-VL-Embedding-2B](https://huggingface.co/Qwen/Qwen3-VL-Embedding-2B)**.
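
The "ColBERT strategy" in the title refers to late-interaction scoring: a query and a page are each represented by many token (or image patch) embeddings, and the score is the sum, over query tokens, of the best match among the page embeddings (MaxSim). The following is a minimal pure-Python sketch of that scoring rule on toy vectors. It is not this model's inference code; the function names and the embedding values are made up for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """ColBERT-style late-interaction score.

    For each query token embedding, take the maximum similarity against all
    document token (or image patch) embeddings, then sum over query tokens.
    """
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank(query_tokens: List[Vector], pages: List[List[Vector]]) -> List[int]:
    """Return page indices sorted from best to worst MaxSim score."""
    scores = [maxsim_score(query_tokens, p) for p in pages]
    return sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)


# Toy 2-dimensional embeddings (hypothetical values, not model outputs).
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
page_b = [[0.1, 0.1], [0.3, 0.2]]

print(maxsim_score(query, page_a))
print(rank(query, [page_b, page_a]))
```

In a real retriever the embeddings would come from the model and the computation would be batched on tensors; the scoring rule itself stays the same.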
|
| 34 |
+
|
| 35 |
+
<p align="center">
|
| 36 |
+
<img src="https://model-demo.oss-cn-hangzhou.aliyuncs.com/Qwen3-VL-Embedding.png" width="400"/>
|
| 37 |
+
</p>
|
| 38 |
+
|
| 39 |
+
|
| 40 |
+
## Contact
|
| 41 |
+
|
| 42 |
+
- Mungeryang: mungerygm@gmail.com / yangguimiao@iie.ac.cn
|
| 43 |
+
|
| 44 |
+
|
| 45 |
+
## Acknowledgments
|
| 46 |
+
|
| 47 |
+
❤️❤️❤️
|
| 48 |
+
|
| 49 |
+
> [!NOTE]
|
| 50 |
+
> Thanks to the **ColPali team** and the **Qwen team** for their excellent open-source work!
|
| 51 |
+
> I accomplished this work by **standing on the shoulders of giants~**
|
| 52 |
+
|
| 53 |
+
<p align="center">
|
| 54 |
+
<img src="https://cdn.mos.cms.futurecdn.net/pqHroHNqYyQoJvEPrYkbcj-1200-80.jpg" width="80%"/>
|
| 55 |
+
</p>
|
| 56 |
+
|
| 57 |
+
|
| 58 |
+
## Citation
|
| 59 |
+
|
| 60 |
+
If you use any datasets or models from this organization in your research, please cite the original ColPali paper as follows:
|
| 61 |
+
|
| 62 |
+
```bibtex
|
| 63 |
+
@misc{faysse2024colpaliefficientdocumentretrieval,
|
| 64 |
+
title={ColPali: Efficient Document Retrieval with Vision Language Models},
|
| 65 |
+
author={Manuel Faysse and Hugues Sibille and Tony Wu and Bilel Omrani and Gautier Viaud and Céline Hudelot and Pierre Colombo},
|
| 66 |
+
year={2024},
|
| 67 |
+
eprint={2407.01449},
|
| 68 |
+
archivePrefix={arXiv},
|
| 69 |
+
primaryClass={cs.IR},
|
| 70 |
+
url={https://arxiv.org/abs/2407.01449},
|
| 71 |
+
}
```
|