Tags: Feature Extraction, Transformers, PyTorch, Safetensors, Hebrew, bert, custom_code, text-embeddings-inference
Instructions to use dicta-il/dictabert-joint with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use dicta-il/dictabert-joint with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="dicta-il/dictabert-joint", trust_remote_code=True)

# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("dicta-il/dictabert-joint", trust_remote_code=True)
model = AutoModel.from_pretrained("dicta-il/dictabert-joint", trust_remote_code=True)
- Notebooks
- Google Colab
- Kaggle
Update BertForMorphTagging.py
BertForMorphTagging.py CHANGED (+2 -0)
@@ -177,6 +177,8 @@ def parse_logits(input_ids: List[List[int]], sentences: List[str], tokenizer: Be
     # { pos: str, feats: dict, prefixes: List[str], suffix: str | bool, suffix_feats: dict | None}
     special_toks = tokenizer.all_special_tokens
     special_toks.remove(tokenizer.unk_token)
+    special_toks.remove(tokenizer.mask_token)
+
     ret = []
     for sent_idx,sentence in enumerate(sentences):
         input_id_strs = tokenizer.convert_ids_to_tokens(input_ids[sent_idx])
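The effect of the added line can be sketched in isolation. The snippet below is a minimal illustration, not the repository's code: `StubTokenizer`, `skippable_special_tokens`, and `keep_word_tokens` are hypothetical names, and the stub only mimics the three tokenizer attributes the diff touches. It also assumes that `special_toks` is later used to skip tokens during parsing, which the hunk itself does not show.

```python
from typing import List


class StubTokenizer:
    """Hypothetical stand-in exposing only the attributes used in the diff."""

    unk_token = "[UNK]"
    mask_token = "[MASK]"

    @property
    def all_special_tokens(self) -> List[str]:
        # Assumption: a fresh list is returned on every access,
        # so removing items does not alter the tokenizer itself.
        return ["[UNK]", "[SEP]", "[PAD]", "[CLS]", "[MASK]"]


def skippable_special_tokens(tokenizer) -> List[str]:
    # Mirrors the patched lines: start from all special tokens,
    # then take [UNK] and [MASK] out of the skip list.
    special_toks = tokenizer.all_special_tokens
    special_toks.remove(tokenizer.unk_token)
    special_toks.remove(tokenizer.mask_token)
    return special_toks


def keep_word_tokens(tokens: List[str], tokenizer) -> List[str]:
    # Assumed downstream use: drop every token still in the skip list.
    skip = set(skippable_special_tokens(tokenizer))
    return [tok for tok in tokens if tok not in skip]


tokens = ["[CLS]", "שלום", "[MASK]", "[UNK]", "[SEP]", "[PAD]"]
kept = keep_word_tokens(tokens, StubTokenizer())
print(kept)
```

Under these assumptions, `[MASK]` survives the filter alongside `[UNK]` and ordinary word pieces, whereas before the change it would have been discarded with `[CLS]`, `[SEP]`, and `[PAD]`.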