Update README.md
README.md
@@ -9,7 +9,7 @@ tags:
 - sentence-embedding
 - mteb
 model-index:
-- name: bilingual-embedding
+- name: bilingual-document-embedding
   results:
   - task:
       type: Clustering
@@ -1527,9 +1527,9 @@ metrics:
 - spearmanr
 ---
 
-# [bilingual-embedding](https://huggingface.co/Lajavaness/bilingual-embedding)
+# [bilingual-document-embedding](https://huggingface.co/Lajavaness/bilingual-document-embedding)
 
-bilingual-embedding is an embedding model for bilingual French-English text. It is a specialized sentence-embedding model trained specifically for this language pair, leveraging the robust capabilities of [BGE M3](https://huggingface.co/BAAI/bge-m3), a pre-trained language model based on the XLM-RoBERTa architecture. The model uses XLM-RoBERTa to encode English-French sentences into a 1024-dimensional vector space, supporting a wide range of applications from semantic search to text clustering. The embeddings capture the nuanced meanings of English-French sentences, reflecting both the lexical and contextual layers of the language.
+bilingual-document-embedding is an embedding model for documents in French and English, with a context length of up to 8096 tokens. It is a specialized sentence-embedding model trained specifically for this language pair, leveraging the robust capabilities of [BGE M3](https://huggingface.co/BAAI/bge-m3), a pre-trained language model based on the XLM-RoBERTa architecture. The model uses XLM-RoBERTa to encode English-French sentences into a 1024-dimensional vector space, supporting a wide range of applications from semantic search to text clustering. The embeddings capture the nuanced meanings of English-French sentences, reflecting both the lexical and contextual layers of the language.
 
 
 ## Full Model Architecture
@@ -1568,7 +1568,7 @@ from sentence_transformers import SentenceTransformer
 
 sentences = ["Paris est une capitale de la France", "Paris is a capital of France"]
 
-model = SentenceTransformer('Lajavaness/bilingual-embedding', trust_remote_code=True)
+model = SentenceTransformer('Lajavaness/bilingual-document-embedding', trust_remote_code=True)
 print(embeddings)
 
 ```
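The usage snippet touched by the last hunk prints `embeddings` without showing where it comes from; in the sentence-transformers API the missing step would be `embeddings = model.encode(sentences)`. Below is a minimal self-contained sketch of how such 1024-dimensional embeddings are compared for semantic search, using plain NumPy cosine similarity on stand-in vectors (the real model call is shown in comments, since it requires downloading the checkpoint with `trust_remote_code=True`):

```python
import numpy as np

# With the real model (requires network access and trust_remote_code):
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("Lajavaness/bilingual-document-embedding", trust_remote_code=True)
#   embeddings = model.encode(sentences)  # shape: (len(sentences), 1024)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in 1024-dimensional embeddings: two near-paraphrases share a
# common direction, a third vector is unrelated.
rng = np.random.default_rng(0)
base = rng.normal(size=1024)
emb_fr = base + 0.1 * rng.normal(size=1024)   # "Paris est une capitale de la France"
emb_en = base + 0.1 * rng.normal(size=1024)   # "Paris is a capital of France"
emb_other = rng.normal(size=1024)             # an unrelated sentence

print(cosine_similarity(emb_fr, emb_en))      # high: near-paraphrases
print(cosine_similarity(emb_fr, emb_other))   # low: unrelated
```

With real model outputs the same comparison ranks candidate documents for a query; here the stand-in vectors only illustrate that paraphrases score far higher than unrelated text.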