How to use thomaskim1130/stella_en_400M_v5-FinanceRAG with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("thomaskim1130/stella_en_400M_v5-FinanceRAG", trust_remote_code=True)
sentences = [
"Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: Title: \nText: by what percentage did the aeco natural gas sales index decline from 2011 to 2013?",
"Title: \nText: | | 2017 | 2016 |\n| Projected benefit obligation | $74,953 | $76,586 |\n| Accumulated benefit obligation | 71,975 | 74,081 |\n| Fair value of plan assets | $58,353 | $56,530 |\nAssumptions The following assumptions, which are the weighted average for all plans, are used to calculate the benefit obligation at December 31 of each year and the net periodic benefit cost for the subsequent year.",
"Title: \nText: discount to Brent was narrower in 2013 than in 2012 and 2011.\nAs a result of the significant increase in U. S. production of light sweet crude oil, the historical relationship between WTI, Brent and LLS pricing may not be indicative of future periods.\nComposition – The proportion of our liquid hydrocarbon sales volumes that are NGLs continues to increase due to our development of United States unconventional liquids-rich plays.\nNGLs were 15 percent of our North America E&P liquid hydrocarbon sales volumes in 2013 compared to 10 percent in 2012 and 7 percent in 2011.\nNatural gas – A significant portion of our natural gas production in the U. S. is sold at bid-week prices, or first-of-month indices relative to our specific producing areas.\nAverage Henry Hub settlement prices for natural gas were 31 percent higher for 2013 than for 2012. International E&P Liquid hydrocarbons – Our International E&P crude oil production is relatively sweet and has historically sold in relation to the Brent crude benchmark, which on average was 3 percent lower for 2013 than 2012.\nNatural gas – Our major International E&P natural gas-producing regions are Europe and E. G. Natural gas prices in Europe have been considerably higher than the U. S. in recent years.\nIn the case of E. G. , our natural gas sales are subject to term contracts, making realized prices in these areas less volatile.\nThe natural gas sales from E. G. are at fixed prices; therefore, our reported average International E&P natural gas realized prices may not fully track market price movements.",
"Title: \nText: american tower corporation and subsidiaries notes to consolidated financial statements towerco ghana for an agreed purchase price of up to approximately $ 430 million , of which the company will pay up to approximately $ 220 million for its 51% ( 51 % ) stake in the holding company .\nmtn ghana will be the anchor tenant , on commercial terms , on each of the towers being purchased .\nthe company also expects that towerco ghana will build at least an additional 400 sites for both mtn ghana and other wireless operators in ghana over the next five years .\nthe company expects to close on an initial tranche of towers in the first half of 2011 , subject to customary closing conditions .\n6 .\nlong-term obligations outstanding amounts under the company 2019s long-term financing arrangements consist of the following as of december 31 , ( in thousands ) : .\n\n | 2010 | 2009 \n----------------------------------------------------------- | ---------------- | ----------------\ncommercial mortgage pass-through certificates series 2007-1 | $ 1750000 | $ 1750000 \nrevolving credit facility | 300000 | 550000 \nterm loan | 325000 | 325000 \nxcel credit facility | 2014 | 73367 \ncolombian short-term credit facility | 72889 | 2014 \n4.50% ( 4.50 % ) senior notes | 999216 | 2014 \n5.05% ( 5.05 % ) senior notes | 699186 | 2014 \n4.625% ( 4.625 % ) senior notes | 599346 | 599210 \n7.00% ( 7.00 % ) senior notes | 500000 | 500000 \n7.25% ( 7.25 % ) senior notes | 295420 | 295038 \n5.0% ( 5.0 % ) convertible notes | 2014 | 59683 \n7.25% ( 7.25 % ) senior subordinated notes | 2014 | 288 \nnotes payable and capital leases | 46331 | 58995 \ntotal | 5587388 | 4211581 \nless current portion of long term obligations | -74896 ( 74896 ) | -70521 ( 70521 )\nlong-term obligations | $ 5512492 | $ 4141060 \n\ncommercial mortgage pass-through certificates , series 2007-1 2014during the year ended december 31 , 2007 , the company completed a securitization transaction ( the 201csecuritization 201d ) involving assets related to 5295 broadcast and wireless communications towers ( the 201csecured towers 201d ) owned by two special purpose subsidiaries of the company , through a private offering of $ 1.75 billion of commercial mortgage pass-through certificates , series 2007-1 ( the 201ccertificates 201d ) .\nthe certificates were issued by american tower trust i ( the trust ) , a trust established by american tower depositor sub , llc ( the 201cdepositor 201d ) , an indirect wholly owned special purpose subsidiary of the company .\nthe assets of the trust consist of a recourse loan ( the 201cloan 201d ) initially made by the depositor to american tower asset sub , llc and american tower asset sub ii , llc ( the 201cborrowers 201d ) , pursuant to a loan and security agreement among the foregoing parties dated as of may 4 , 2007 ( the 201cloan agreement 201d ) .\nthe borrowers are special purpose entities formed solely for the purpose of holding the secured towers subject to the securitization .\nthe certificates were issued in seven separate classes , comprised of class a-fx , class a-fl , class b , class c , class d , class e and class f .\neach of the certificates in classes b , c , d , e and f are subordinated in right of payment to any other class of certificates which has an earlier alphabetical designation .\nthe certificates were issued with terms identical to the loan except for the class a-fl certificates , which bear interest at a floating "
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model fine-tuned from dunzhang/stella_en_400M_v5. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: NewModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 1024, 'out_features': 1024, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
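The Pooling and Dense modules listed above can be illustrated with a small sketch. This is a minimal pure-Python illustration that assumes toy 2-dimensional vectors instead of the model's 1024 dimensions; `mean_pool` and `dense` are hypothetical helper names, not functions from sentence-transformers, and the identity weight matrix is made up for the example.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sequence, skipping padded positions."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def dense(vector, weight, bias):
    """Linear layer with identity activation: weight @ vector + bias."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weight, bias)]


# Toy input: two real tokens followed by one padding token (mask value 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)             # padding row is ignored
identity = [[1.0, 0.0], [0.0, 1.0]]          # hypothetical Dense weights
embedding = dense(pooled, identity, [0.0, 0.0])
print(embedding)  # [2.0, 3.0]
```

Because `pooling_mode_mean_tokens` is the only pooling mode set to True, the sentence vector is the masked average of the token vectors, which the Dense layer then projects from 1024 to 1024 dimensions.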
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("thomaskim1130/stella_en_400M_v5-FinanceRAG", trust_remote_code=True)
# Run inference
sentences = [
'Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: Title: \nText: what was the average for "other" loans held in 2012 and 2011?',
'Title: \nText: LOANS HELD FOR SALE Table 15: Loans Held For Sale\n| In millions | December 312012 | December 312011 |\n| Commercial mortgages at fair value | $772 | $843 |\n| Commercial mortgages at lower of cost or market | 620 | 451 |\n| Total commercial mortgages | 1,392 | 1,294 |\n| Residential mortgages at fair value | 2,096 | 1,415 |\n| Residential mortgages at lower of cost or market | 124 | 107 |\n| Total residential mortgages | 2,220 | 1,522 |\n| Other | 81 | 120 |\n| Total | $3,693 | $2,936 |\nWe stopped originating commercial mortgage loans held for sale designated at fair value in 2008 and continue pursuing opportunities to reduce these positions at appropriate prices.\nAt December 31, 2012, the balance relating to these loans was $772 million, compared to $843 million at December 31, 2011.\nWe sold $32 million in unpaid principal balances of these commercial mortgage loans held for sale carried at fair value in 2012 and sold $25 million in 2011.',
'Title: \nText: Investments and Derivative Instruments (continued) Security Unrealized Loss Aging The following tables present the Company’s unrealized loss aging for AFS securities by type and length of time the security was in a continuous unrealized loss position.\n| | December 31, 2011 |\n| | Less Than 12 Months | 12 Months or More | Total |\n| | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized |\n| | Cost | Value | Losses | Cost | Value | Losses | Cost | Value | Losses |\n| ABS | $629 | $594 | $-35 | $1,169 | $872 | $-297 | $1,798 | $1,466 | $-332 |\n| CDOs | 81 | 59 | -22 | 2,709 | 2,383 | -326 | 2,790 | 2,442 | -348 |\n| CMBS | 1,297 | 1,194 | -103 | 2,144 | 1,735 | -409 | 3,441 | 2,929 | -512 |\n| Corporate [1] | 4,388 | 4,219 | -169 | 3,268 | 2,627 | -570 | 7,656 | 6,846 | -739 |\n| Foreign govt./govt. agencies | 218 | 212 | -6 | 51 | 47 | -4 | 269 | 259 | -10 |\n| Municipal | 299 | 294 | -5 | 627 | 560 | -67 | 926 | 854 | -72 |\n| RMBS | 415 | 330 | -85 | 1,206 | 835 | -371 | 1,621 | 1,165 | -456 |\n| U.S. Treasuries | 343 | 341 | -2 | — | — | — | 343 | 341 | -2 |\n| Total fixed maturities | 7,670 | 7,243 | -427 | 11,174 | 9,059 | -2,044 | 18,844 | 16,302 | -2,471 |\n| Equity securities | 167 | 138 | -29 | 439 | 265 | -174 | 606 | 403 | -203 |\n| Total securities in an unrealized loss | $7,837 | $7,381 | $-456 | $11,613 | $9,324 | $-2,218 | $19,450 | $16,705 | $-2,674 |\nDecember 31, 2010\n| | December 31, 2010 |\n| | Less Than 12 Months | 12 Months or More | Total |\n| | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized | Amortized | Fair | Unrealized |\n| | Cost | Value | Losses | Cost | Value | Losses | Cost | Value | Losses |\n| ABS | $302 | $290 | $-12 | $1,410 | $1,026 | $-384 | $1,712 | $1,316 | $-396 |\n| CDOs | 321 | 293 | -28 | 2,724 | 2,274 | -450 | 3,045 | 2,567 | -478 |\n| CMBS | 556 | 530 | -26 | 3,962 | 3,373 | -589 | 4,518 | 3,903 | -615 |\n| Corporate | 5,533 | 5,329 | -199 | 4,017 | 3,435 | -548 | 9,550 | 8,764 | -747 |\n| Foreign govt./govt. agencies | 356 | 349 | -7 | 78 | 68 | -10 | 434 | 417 | -17 |\n| Municipal | 7,485 | 7,173 | -312 | 1,046 | 863 | -183 | 8,531 | 8,036 | -495 |\n| RMBS | 1,744 | 1,702 | -42 | 1,567 | 1,147 | -420 | 3,311 | 2,849 | -462 |\n| U.S. Treasuries | 2,436 | 2,321 | -115 | 158 | 119 | -39 | 2,594 | 2,440 | -154 |\n| Total fixed maturities | 18,733 | 17,987 | -741 | 14,962 | 12,305 | -2,623 | 33,695 | 30,292 | -3,364 |\n| Equity securities | 53 | 52 | -1 | 637 | 506 | -131 | 690 | 558 | -132 |\n| Total securities in an unrealized loss | $18,786 | $18,039 | $-742 | $15,599 | $12,811 | $-2,754 | $34,385 | $30,850 | $-3,496 |\n[1] Unrealized losses exclude the change in fair value of bifurcated embedded derivative features of certain securities.\nSubsequent changes in fair value are recorded in net realized capital gains (losses).\nAs of December 31, 2011, AFS securities in an unrealized loss position, comprised of 2,549 securities, primarily related to corporate securities within the financial services sector, CMBS, and RMBS which have experienced significant price deterioration.\nAs of December 31, 2011, 75% of these securities were depressed less than 20% of cost or amortized cost.\nThe decline in unrealized losses during 2011 was primarily attributable to a decline in interest rates, partially offset by credit spread widening.\nMost of the securities depressed for twelve months or more relate to structured securities with exposure to commercial and residential real estate, as well as certain floating rate corporate securities or those securities with greater than 10 years to maturity, concentrated in the financial services sector.\nCurrent market spreads continue to be significantly wider for structured securities with exposure to commercial and residential real estate, as compared to spreads at the security’s respective purchase date, largely due to the economic and market uncertainties regarding future performance of commercial and residential real estate.\nIn addition, the majority of securities have a floating-rate coupon referenced to a market index where rates have declined substantially.\nThe Company neither has an intention to sell nor does it expect to be required to sell the securities outlined above.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
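The example inputs above follow a fixed template: the query carries an `Instruct: ...\nQuery: ` prefix, and every text is wrapped as `Title: ...\nText: ...`. The sketch below rebuilds that template and ranks passages from one row of similarity scores. `format_query`, `format_passage`, and `rank_passages` are hypothetical helper names, and the passage texts and scores are made-up values for illustration only.

```python
INSTRUCTION = (
    "Given a web search query, retrieve relevant passages that answer the query."
)


def format_query(question, title=""):
    """Wrap a question in the instruction template seen in the example queries."""
    return f"Instruct: {INSTRUCTION}\nQuery: Title: {title}\nText: {question}"


def format_passage(text, title=""):
    """Wrap a passage in the Title/Text template seen in the example passages."""
    return f"Title: {title}\nText: {text}"


def rank_passages(scores):
    """Return passage indices ordered from most to least similar to the query."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


query = format_query("by what percentage did revenue change?")
passages = [format_passage("first passage"), format_passage("second passage")]

# Hypothetical query-to-passage similarity scores, one per passage.
hypothetical_scores = [0.31, 0.62]
print(rank_passages(hypothetical_scores))  # [1, 0]
```

With the real model, the query would be encoded together with the passages as in the snippet above, and the query's row of the similarity matrix (for example `similarities[0, 1:].tolist()`) could be passed to `rank_passages`.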
Information retrieval results on Evaluate, computed with InformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.3617 |
| cosine_accuracy@3 | 0.5194 |
| cosine_accuracy@5 | 0.6092 |
| cosine_accuracy@10 | 0.7015 |
| cosine_precision@1 | 0.3617 |
| cosine_precision@3 | 0.1788 |
| cosine_precision@5 | 0.1267 |
| cosine_precision@10 | 0.0752 |
| cosine_recall@1 | 0.331 |
| cosine_recall@3 | 0.4768 |
| cosine_recall@5 | 0.5614 |
| cosine_recall@10 | 0.6548 |
| cosine_ndcg@10 | 0.496 |
| cosine_mrr@10 | 0.4668 |
| cosine_map@100 | 0.4482 |
| dot_accuracy@1 | 0.3325 |
| dot_accuracy@3 | 0.5243 |
| dot_accuracy@5 | 0.5922 |
| dot_accuracy@10 | 0.6748 |
| dot_precision@1 | 0.3325 |
| dot_precision@3 | 0.1796 |
| dot_precision@5 | 0.1248 |
| dot_precision@10 | 0.0726 |
| dot_recall@1 | 0.3059 |
| dot_recall@3 | 0.4762 |
| dot_recall@5 | 0.5446 |
| dot_recall@10 | 0.6273 |
| dot_ndcg@10 | 0.4723 |
| dot_mrr@10 | 0.4422 |
| dot_map@100 | 0.4264 |
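To make the metric names in the table concrete, here is a minimal sketch of how accuracy@k, precision@k, recall@k, and the reciprocal rank behind MRR@k are commonly computed for a single query. It assumes the usual definitions and is not the evaluator's own code; `metrics_at_k` and the document ids are hypothetical. The reported values would be averages of such per-query numbers.

```python
def metrics_at_k(ranked_ids, relevant_ids, k):
    """Compute per-query retrieval metrics over the top-k ranked documents."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]

    # Reciprocal rank of the first relevant document within the top k.
    reciprocal_rank = 0.0
    for rank, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            reciprocal_rank = 1.0 / rank
            break

    return {
        "accuracy": 1.0 if hits else 0.0,          # any relevant doc in top k
        "precision": len(hits) / k,                # relevant share of top k
        "recall": len(hits) / len(relevant_ids),   # share of relevant docs found
        "mrr": reciprocal_rank,
    }


# Hypothetical ranking for one query with two relevant documents.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(metrics_at_k(ranked, relevant, k=3))
```

Under these definitions precision@1 and accuracy@1 coincide, which matches the table (0.3617 for cosine and 0.3325 for dot product).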
Columns: sentence_0 and sentence_1

| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |

Samples:

| sentence_0 | sentence_1 |
|---|---|
| Instruct: Given a web search query, retrieve relevant passages that answer the query. | Title: |
| Instruct: Given a web search query, retrieve relevant passages that answer the query. | Title: |
| Instruct: Given a web search query, retrieve relevant passages that answer the query. | Title: |
MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
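The two parameters above can be read as follows: each `sentence_0` is scored against every `sentence_1` in the batch with cosine similarity, the scores are multiplied by `scale`, and a cross-entropy loss treats the matching pair as the correct class while the other pairs act as in-batch negatives. The code below is a minimal pure-Python sketch of that idea under those assumptions, not the library's implementation; the function names and toy vectors are hypothetical.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch: matched pairs give a loss near 0, swapped pairs a loss near 20.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = multiple_negatives_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = multiple_negatives_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

A larger `scale` sharpens the softmax over candidates, so small differences in cosine similarity translate into larger differences in the loss.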
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 2
- fp16: True
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: round_robin

| Epoch | Step | Evaluate_cosine_map@100 |
|---|---|---|
| 0 | 0 | 0.2566 |
| 1.0 | 141 | 0.3931 |
| 2.0 | 282 | 0.4482 |
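The logged steps can be related to the learning-rate settings listed in the hyperparameters (`learning_rate: 5e-05`, `lr_scheduler_type: linear`, `warmup_steps: 0`). The sketch below assumes that training ran for 282 optimizer steps in total, taken from the last logged step, and that the linear scheduler decays the rate to zero at the final step; `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step, base_lr=5e-05, total_steps=282, warmup_steps=0):
    """Learning rate under a linear schedule with optional linear warmup."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Rates at the three logged evaluation points (start, end of epoch 1, end of epoch 2).
for step in (0, 141, 282):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at the full 5e-05, is about half of that when epoch 1 ends at step 141, and reaches zero at step 282.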
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}