---
datasets:
- cuad
- theatticusproject/cuad
language:
- en
pipeline_tag: question-answering
---

# BERT-large fine-tuned on CUAD

This is a **BERT-large** model ([`bert-large-uncased-whole-word-masking`][2]) fine-tuned on the [**CUAD**][3] dataset
from [*CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review* (Hendrycks et al., 2021)][1], using the `BertForQuestionAnswering` model architecture.

Each question asks for a kind of information commonly found in contracts.
If that information is present, the model returns the relevant text span and its starting index in the given document.
The CUAD dataset is in SQuAD 2.0 format, so a question may have no answer in a given contract.

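To make the span-plus-index output concrete, here is a minimal sketch of SQuAD 2.0-style answer decoding in plain Python. It is an illustration under stated assumptions, not the decoding code from the CUAD repo: the function name `best_span`, the `max_answer_tokens` limit, the word-level tokenization, and the toy logits are all hypothetical. A real run would take WordPiece offsets and start/end logits from the model.

```python
import re
from typing import List, Optional, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    offsets: List[Tuple[int, int]],
    context: str,
    max_answer_tokens: int = 30,  # hypothetical cap on answer length
) -> Optional[dict]:
    """Pick the highest-scoring (start, end) token pair.

    Position 0 is assumed to be the [CLS] token; its combined score is
    treated as the "no answer" score. Returns None if no span beats it.
    """
    null_score = start_logits[0] + end_logits[0]
    best_score, best = null_score, None
    for i in range(1, len(start_logits)):
        for j in range(i, min(i + max_answer_tokens, len(end_logits))):
            score = start_logits[i] + end_logits[j]
            if score > best_score:
                best_score, best = score, (i, j)
    if best is None:
        return None
    # Map the token span back to character positions in the document.
    char_start = offsets[best[0]][0]
    char_end = offsets[best[1]][1]
    return {
        "answer": context[char_start:char_end],
        "start": char_start,
        "score": best_score,
    }


context = "This Agreement is governed by the laws of Delaware."
# Toy word-level "tokens"; entry 0 stands in for [CLS].
offsets = [(0, 0)] + [m.span() for m in re.finditer(r"\w+", context)]
n = len(offsets)

# Made-up logits that peak on the last word ("Delaware").
start_logits = [0.0] * n
end_logits = [0.0] * n
start_logits[-1] = 5.0
end_logits[-1] = 5.0
answerable = best_span(start_logits, end_logits, offsets, context)

# Made-up logits that peak on [CLS]: the question has no answer here.
unanswerable = best_span([5.0] + [0.0] * (n - 1), [5.0] + [0.0] * (n - 1), offsets, context)

print(answerable)
print(unanswerable)
```

In practice, the `question-answering` pipeline in the `transformers` library performs comparable decoding and returns a dictionary with `answer`, `start`, `end`, and `score` keys.
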
For details of the dataset and usage of the relevant training and testing scripts, see the paper and the authors' [GitHub repo][4].

[1]: https://arxiv.org/abs/2103.06268
[2]: https://huggingface.co/bert-large-uncased-whole-word-masking
[3]: https://www.atticusprojectai.org/cuad
[4]: https://github.com/TheAtticusProject/cuad