---
license: apache-2.0
base_model:
- Qwen/Qwen3-VL-2B-Instruct
language:
- en
tags:
- text
- image
- video
- multimodal-embedding
- vidore
- colpali
- colqwen3
- multilingual-embedding
new_version: goodman2001/colqwen3-vlembed-base
pipeline_tag: visual-document-retrieval
---

# ColQwen3-vlembed-base: Visual Retriever built by merging Qwen3-VL-2B-Instruct with Qwen3-VL-Embedding-2B through ColBERT strategy

## Why always 2B?

Due to my limited computing resources, I can currently only conduct some interesting experiments with 2B/4B models.

## Citation

If you use any datasets or models from this organization in your research, please cite the original dataset as follows:

```bibtex
@misc{faysse2024colpaliefficientdocumentretrieval,
  title={ColPali: Efficient Document Retrieval with Vision Language Models},
  author={Manuel Faysse and Hugues Sibille and Tony Wu and Bilel Omrani and Gautier Viaud and Céline Hudelot and Pierre Colombo},
  year={2024},
  eprint={2407.01449},
  archivePrefix={arXiv},
  primaryClass={cs.IR},
  url={https://arxiv.org/abs/2407.01449},
}
```