mBART-07-TextSimp-LT-BatchSize8-lr1e-4

This model is a fine-tuned version of facebook/mbart-large-50 on an unspecified dataset (the model name suggests a Lithuanian text-simplification corpus). It achieves the following results on the evaluation set:

  • Loss: 0.0856
  • ROUGE-1: 0.7721
  • ROUGE-2: 0.6212
  • ROUGE-L: 0.7638
  • SacreBLEU: 53.4548
  • Gen Len (average generated length): 36.8624
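
The card ships without a usage snippet, so here is a minimal inference sketch. The repository id comes from this card; treating Lithuanian (`lt_LT`) as both source and target language is an assumption based on the "-LT" suffix in the model name, as are the generation settings.

```python
# Hedged inference sketch for this checkpoint. Assumptions are marked below.
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model_id = "eglkan1/mBART-07-TextSimp-LT-BatchSize8-lr1e-4"
# Assumption: Lithuanian on both sides, inferred from the "-LT" model name.
tokenizer = MBart50TokenizerFast.from_pretrained(
    model_id, src_lang="lt_LT", tgt_lang="lt_LT"
)
model = MBartForConditionalGeneration.from_pretrained(model_id)

text = "..."  # a complex Lithuanian sentence to simplify (placeholder)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
outputs = model.generate(
    **inputs,
    # mBART-50 needs the target language forced as the first generated token.
    forced_bos_token_id=tokenizer.lang_code_to_id["lt_LT"],
    max_length=128,  # assumption; eval outputs average ~37 tokens
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```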

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 8
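
For readers who want to reproduce the setup, the listed values map onto transformers.Seq2SeqTrainingArguments roughly as follows. This is a sketch, not the author's actual training script: output_dir and predict_with_generate are assumptions, and the Trainer wiring (model, datasets, data collator) is omitted.

```python
# Hedged sketch mapping the listed hyperparameters onto training arguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mBART-07-TextSimp-LT-BatchSize8-lr1e-4",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 16
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=8,
    predict_with_generate=True,  # assumption: needed for ROUGE/BLEU eval
)
```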

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | SacreBLEU | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 14.4684 | 0.06 | 10 | 13.1518 | 0.2872 | 0.186 | 0.272 | 6.3114 | 512.0 |
| 12.7706 | 0.12 | 20 | 11.9531 | 0.2443 | 0.1569 | 0.2281 | 8.8122 | 512.0 |
| 11.7916 | 0.18 | 30 | 11.2548 | 0.2758 | 0.1818 | 0.2591 | 3.1927 | 512.0 |
| 10.9334 | 0.24 | 40 | 10.5034 | 0.4334 | 0.2805 | 0.4034 | 7.3173 | 512.0 |
| 10.2486 | 0.3 | 50 | 9.6013 | 0.2897 | 0.1829 | 0.2717 | 7.6318 | 512.0 |
| 9.3552 | 0.36 | 60 | 8.6828 | 0.1181 | 0.0715 | 0.1106 | 3.5634 | 512.0 |
| 8.511 | 0.42 | 70 | 7.7545 | 0.0752 | 0.0479 | 0.0716 | 2.2899 | 512.0 |
| 7.6434 | 0.48 | 80 | 6.7397 | 0.2452 | 0.1558 | 0.2328 | 9.5121 | 505.5873 |
| 6.5921 | 0.54 | 90 | 5.6501 | 0.5661 | 0.3905 | 0.5478 | 32.2949 | 62.8677 |
| 5.4669 | 0.6 | 100 | 4.4328 | 0.6102 | 0.4252 | 0.5925 | 34.814 | 39.0847 |
| 4.2038 | 0.66 | 110 | 3.1788 | 0.6166 | 0.4326 | 0.5996 | 35.9449 | 38.3915 |
| 2.9406 | 0.72 | 120 | 1.8820 | 0.6071 | 0.4296 | 0.5934 | 36.3309 | 37.7884 |
| 1.6293 | 0.78 | 130 | 0.8202 | 0.6135 | 0.4347 | 0.6006 | 36.6343 | 37.1164 |
| 0.6931 | 0.84 | 140 | 0.3292 | 0.6273 | 0.4488 | 0.6152 | 37.7992 | 36.8677 |
| 0.3063 | 0.9 | 150 | 0.2047 | 0.6322 | 0.461 | 0.6219 | 39.5968 | 36.8624 |
| 0.2044 | 0.96 | 160 | 0.1721 | 0.6349 | 0.4521 | 0.6207 | 38.1827 | 36.8624 |
| 0.1642 | 1.02 | 170 | 0.1527 | 0.6414 | 0.472 | 0.6312 | 40.84 | 36.8624 |
| 0.1379 | 1.08 | 180 | 0.1307 | 0.6506 | 0.4813 | 0.6424 | 41.2202 | 36.8624 |
| 0.1401 | 1.14 | 190 | 0.1184 | 0.6212 | 0.4465 | 0.6119 | 38.9759 | 36.8201 |
| 0.1234 | 1.2 | 200 | 0.0963 | 0.6719 | 0.4862 | 0.6592 | 42.0111 | 36.8677 |
| 0.1042 | 1.26 | 210 | 0.0820 | 0.6604 | 0.477 | 0.6506 | 41.6463 | 36.8836 |
| 0.0933 | 1.32 | 220 | 0.0821 | 0.6645 | 0.49 | 0.6542 | 42.6241 | 36.8624 |
| 0.095 | 1.38 | 230 | 0.0967 | 0.6581 | 0.4837 | 0.6492 | 42.1158 | 36.8624 |
| 0.1022 | 1.44 | 240 | 0.1332 | 0.6133 | 0.4489 | 0.6034 | 40.4641 | 36.873 |
| 0.7413 | 1.5 | 250 | 1.4191 | 0.0 | 0.0 | 0.0 | 0.0 | 40.0 |
| 1.3629 | 1.56 | 260 | 1.0858 | 0.0 | 0.0 | 0.0 | 0.0 | 37.8624 |
| 1.0025 | 1.62 | 270 | 0.8858 | 0.0196 | 0.0003 | 0.0196 | 0.0151 | 37.8624 |
| 0.7626 | 1.68 | 280 | 0.7693 | 0.0 | 0.0 | 0.0 | 0.0213 | 37.8624 |
| 0.7308 | 1.74 | 290 | 0.6842 | 0.0008 | 0.0 | 0.0008 | 0.0214 | 37.8624 |
| 0.6324 | 1.8 | 300 | 0.5928 | 0.0074 | 0.0 | 0.0075 | 0.0257 | 36.836 |
| 0.5934 | 1.86 | 310 | 0.5517 | 0.0 | 0.0 | 0.0 | 0.022 | 36.8783 |
| 0.5535 | 1.92 | 320 | 0.5225 | 0.0319 | 0.0026 | 0.0312 | 0.0565 | 36.8624 |
| 0.5843 | 1.98 | 330 | 0.5135 | 0.0438 | 0.0012 | 0.0435 | 0.0694 | 36.8624 |
| 0.5422 | 2.04 | 340 | 0.5055 | 0.028 | 0.0 | 0.0281 | 0.0237 | 36.8519 |
| 0.4415 | 2.1 | 350 | 0.2156 | 0.4248 | 0.2316 | 0.4101 | 14.9745 | 36.8624 |
| 0.263 | 2.16 | 360 | 0.2392 | 0.3194 | 0.1424 | 0.3038 | 11.6071 | 36.8677 |
| 0.2213 | 2.22 | 370 | 0.1329 | 0.554 | 0.3556 | 0.5421 | 27.6303 | 36.873 |
| 0.121 | 2.28 | 380 | 0.1066 | 0.5935 | 0.4046 | 0.5802 | 35.6023 | 36.8624 |
| 0.1027 | 2.34 | 390 | 0.0934 | 0.6287 | 0.4401 | 0.6155 | 39.0308 | 36.8624 |
| 0.0858 | 2.4 | 400 | 0.0902 | 0.6394 | 0.4678 | 0.6278 | 40.2148 | 36.8624 |
| 0.0937 | 2.46 | 410 | 0.0852 | 0.6721 | 0.4854 | 0.6577 | 41.3938 | 36.8624 |
| 0.0765 | 2.51 | 420 | 0.0798 | 0.7085 | 0.53 | 0.6967 | 43.7951 | 36.8624 |
| 0.0708 | 2.57 | 430 | 0.0770 | 0.703 | 0.5241 | 0.6921 | 42.5005 | 36.8624 |
| 0.0717 | 2.63 | 440 | 0.0775 | 0.7108 | 0.5289 | 0.6997 | 43.3111 | 36.8624 |
| 0.0742 | 2.69 | 450 | 0.0754 | 0.7182 | 0.5422 | 0.7088 | 44.229 | 36.8624 |
| 0.0646 | 2.75 | 460 | 0.0789 | 0.7023 | 0.525 | 0.692 | 43.6914 | 36.8624 |
| 0.0995 | 2.81 | 470 | 0.0839 | 0.6559 | 0.4866 | 0.6486 | 42.6205 | 36.8624 |
| 0.0705 | 2.87 | 480 | 0.0751 | 0.7197 | 0.5478 | 0.7083 | 44.2463 | 36.8624 |
| 0.076 | 2.93 | 490 | 0.0719 | 0.7296 | 0.5522 | 0.7181 | 44.3324 | 36.8624 |
| 0.0669 | 2.99 | 500 | 0.0706 | 0.7263 | 0.5594 | 0.7172 | 45.7606 | 36.8624 |
| 0.0519 | 3.05 | 510 | 0.0712 | 0.7328 | 0.5579 | 0.7233 | 47.2644 | 36.8624 |
| 0.0478 | 3.11 | 520 | 0.0711 | 0.7298 | 0.5574 | 0.7189 | 46.5134 | 36.8624 |
| 0.0487 | 3.17 | 530 | 0.0737 | 0.7258 | 0.558 | 0.7171 | 47.3831 | 36.8624 |
| 0.0508 | 3.23 | 540 | 0.0719 | 0.7269 | 0.5577 | 0.7171 | 46.1187 | 36.8624 |
| 0.0495 | 3.29 | 550 | 0.0719 | 0.7229 | 0.547 | 0.7113 | 46.7407 | 36.8624 |
| 0.0437 | 3.35 | 560 | 0.0703 | 0.7346 | 0.5625 | 0.7238 | 47.6379 | 36.8624 |
| 0.046 | 3.41 | 570 | 0.0688 | 0.7349 | 0.5632 | 0.7252 | 47.7757 | 36.8624 |
| 0.0449 | 3.47 | 580 | 0.0688 | 0.7357 | 0.568 | 0.726 | 47.9775 | 36.8624 |
| 0.0496 | 3.53 | 590 | 0.0690 | 0.7392 | 0.5694 | 0.7301 | 48.3888 | 36.8624 |
| 0.051 | 3.59 | 600 | 0.0675 | 0.7434 | 0.5745 | 0.7349 | 48.565 | 36.8624 |
| 0.0522 | 3.65 | 610 | 0.0677 | 0.7465 | 0.5811 | 0.735 | 49.6128 | 36.8624 |
| 0.0529 | 3.71 | 620 | 0.0671 | 0.7463 | 0.5825 | 0.7374 | 49.4657 | 36.8624 |
| 0.0497 | 3.77 | 630 | 0.0657 | 0.7448 | 0.5814 | 0.7348 | 48.9751 | 36.8624 |
| 0.0432 | 3.83 | 640 | 0.0670 | 0.7351 | 0.5648 | 0.7253 | 48.0065 | 36.8624 |
| 0.0468 | 3.89 | 650 | 0.0664 | 0.7451 | 0.5827 | 0.7365 | 49.2015 | 36.8624 |
| 0.0453 | 3.95 | 660 | 0.0656 | 0.7391 | 0.5708 | 0.729 | 48.0288 | 36.8624 |
| 0.0417 | 4.01 | 670 | 0.0662 | 0.7408 | 0.5711 | 0.7328 | 49.2061 | 36.8624 |
| 0.02 | 4.07 | 680 | 0.0719 | 0.743 | 0.582 | 0.7343 | 49.2463 | 36.8624 |
| 0.0302 | 4.13 | 690 | 0.0699 | 0.752 | 0.584 | 0.7423 | 48.8187 | 36.8624 |
| 0.0278 | 4.19 | 700 | 0.0701 | 0.7452 | 0.5796 | 0.7374 | 50.1071 | 36.8624 |
| 0.024 | 4.25 | 710 | 0.0698 | 0.7567 | 0.5966 | 0.7481 | 50.1833 | 36.8624 |
| 0.024 | 4.31 | 720 | 0.0707 | 0.7547 | 0.5932 | 0.7453 | 50.0661 | 36.8624 |
| 0.0302 | 4.37 | 730 | 0.0681 | 0.7572 | 0.5918 | 0.7489 | 50.3441 | 36.8624 |
| 0.0253 | 4.43 | 740 | 0.0683 | 0.7593 | 0.5998 | 0.7504 | 50.9046 | 36.8624 |
| 0.0246 | 4.49 | 750 | 0.0683 | 0.7567 | 0.5909 | 0.7467 | 49.9375 | 36.8624 |
| 0.0253 | 4.55 | 760 | 0.0682 | 0.755 | 0.584 | 0.7458 | 49.9308 | 36.8624 |
| 0.0245 | 4.61 | 770 | 0.0712 | 0.7492 | 0.5807 | 0.7403 | 50.1708 | 36.8624 |
| 0.0274 | 4.67 | 780 | 0.0689 | 0.7516 | 0.5846 | 0.7431 | 49.6346 | 36.8624 |
| 0.0261 | 4.73 | 790 | 0.0687 | 0.7477 | 0.5824 | 0.7403 | 50.6044 | 36.8624 |
| 0.0304 | 4.79 | 800 | 0.0666 | 0.7534 | 0.5926 | 0.7443 | 49.9827 | 36.8624 |
| 0.0279 | 4.85 | 810 | 0.0675 | 0.7559 | 0.5913 | 0.7452 | 49.9368 | 36.8624 |
| 0.0277 | 4.91 | 820 | 0.0685 | 0.754 | 0.5895 | 0.7432 | 50.8755 | 36.8624 |
| 0.0278 | 4.97 | 830 | 0.0676 | 0.753 | 0.592 | 0.744 | 50.6179 | 36.8624 |
| 0.0215 | 5.03 | 840 | 0.0690 | 0.7626 | 0.6059 | 0.7535 | 51.7703 | 36.8624 |
| 0.0154 | 5.09 | 850 | 0.0732 | 0.7626 | 0.6036 | 0.7543 | 51.3579 | 36.8624 |
| 0.0141 | 5.15 | 860 | 0.0767 | 0.7582 | 0.5963 | 0.75 | 50.8613 | 36.8624 |
| 0.0144 | 5.21 | 870 | 0.0754 | 0.7564 | 0.5948 | 0.7475 | 50.5094 | 36.8624 |
| 0.0174 | 5.27 | 880 | 0.0736 | 0.7613 | 0.6032 | 0.7527 | 51.4235 | 36.8624 |
| 0.0141 | 5.33 | 890 | 0.0746 | 0.7661 | 0.6085 | 0.7572 | 52.4218 | 36.8624 |
| 0.0139 | 5.39 | 900 | 0.0753 | 0.7661 | 0.6079 | 0.7576 | 51.9856 | 36.8624 |
| 0.0165 | 5.45 | 910 | 0.0746 | 0.7636 | 0.6063 | 0.7556 | 51.6622 | 36.8624 |
| 0.0129 | 5.51 | 920 | 0.0732 | 0.7661 | 0.6106 | 0.7579 | 51.1851 | 36.8624 |
| 0.0166 | 5.57 | 930 | 0.0721 | 0.7665 | 0.6108 | 0.7588 | 51.9259 | 36.8624 |
| 0.0137 | 5.63 | 940 | 0.0702 | 0.7607 | 0.6007 | 0.7521 | 51.835 | 36.8624 |
| 0.0153 | 5.69 | 950 | 0.0731 | 0.765 | 0.605 | 0.7563 | 52.1179 | 36.8624 |
| 0.0165 | 5.75 | 960 | 0.0737 | 0.7641 | 0.6045 | 0.7559 | 52.2019 | 36.8624 |
| 0.0117 | 5.81 | 970 | 0.0742 | 0.7648 | 0.6057 | 0.7559 | 51.9825 | 36.8624 |
| 0.015 | 5.87 | 980 | 0.0735 | 0.7635 | 0.6061 | 0.7537 | 52.2846 | 36.8624 |
| 0.0171 | 5.93 | 990 | 0.0719 | 0.764 | 0.6031 | 0.7546 | 51.4322 | 36.8624 |
| 0.0143 | 5.99 | 1000 | 0.0733 | 0.7661 | 0.611 | 0.7572 | 51.9965 | 36.8624 |
| 0.0079 | 6.05 | 1010 | 0.0773 | 0.7635 | 0.6114 | 0.7554 | 52.298 | 36.8624 |
| 0.0077 | 6.11 | 1020 | 0.0801 | 0.7608 | 0.6046 | 0.7532 | 52.0944 | 36.8624 |
| 0.0094 | 6.17 | 1030 | 0.0806 | 0.7593 | 0.6008 | 0.7508 | 52.0165 | 36.8624 |
| 0.0078 | 6.23 | 1040 | 0.0798 | 0.7634 | 0.6064 | 0.7549 | 52.6951 | 36.8624 |
| 0.0078 | 6.29 | 1050 | 0.0790 | 0.7604 | 0.6039 | 0.7514 | 51.9559 | 36.8624 |
| 0.0088 | 6.35 | 1060 | 0.0783 | 0.7604 | 0.5987 | 0.7522 | 50.9347 | 36.8624 |
| 0.0095 | 6.41 | 1070 | 0.0787 | 0.7686 | 0.6095 | 0.7599 | 52.0715 | 36.8624 |
| 0.0093 | 6.47 | 1080 | 0.0794 | 0.7647 | 0.6098 | 0.757 | 52.3564 | 36.8624 |
| 0.009 | 6.53 | 1090 | 0.0786 | 0.7591 | 0.6012 | 0.75 | 52.058 | 36.8624 |
| 0.0099 | 6.59 | 1100 | 0.0770 | 0.7537 | 0.5954 | 0.7454 | 52.2023 | 36.8624 |
| 0.0091 | 6.65 | 1110 | 0.0770 | 0.7583 | 0.6003 | 0.7507 | 52.3379 | 36.8624 |
| 0.0085 | 6.71 | 1120 | 0.0777 | 0.7583 | 0.5992 | 0.7503 | 52.3369 | 36.8624 |
| 0.0085 | 6.77 | 1130 | 0.0787 | 0.76 | 0.6014 | 0.7523 | 52.5729 | 36.8624 |
| 0.0089 | 6.83 | 1140 | 0.0779 | 0.7629 | 0.6028 | 0.7553 | 52.0656 | 36.8624 |
| 0.0096 | 6.89 | 1150 | 0.0777 | 0.7606 | 0.5998 | 0.7527 | 51.8477 | 36.8624 |
| 0.0087 | 6.95 | 1160 | 0.0783 | 0.768 | 0.6076 | 0.7593 | 52.1456 | 36.8624 |
| 0.0081 | 7.01 | 1170 | 0.0791 | 0.7705 | 0.6152 | 0.7624 | 52.803 | 36.8624 |
| 0.0054 | 7.07 | 1180 | 0.0798 | 0.7667 | 0.6084 | 0.7584 | 52.7433 | 36.8624 |
| 0.0045 | 7.13 | 1190 | 0.0815 | 0.7646 | 0.6055 | 0.7565 | 53.0281 | 36.8624 |
| 0.0051 | 7.19 | 1200 | 0.0828 | 0.7609 | 0.5999 | 0.7526 | 52.6461 | 36.8624 |
| 0.0055 | 7.25 | 1210 | 0.0841 | 0.7608 | 0.5994 | 0.7529 | 52.6382 | 36.8624 |
| 0.0047 | 7.31 | 1220 | 0.0847 | 0.7637 | 0.6035 | 0.7554 | 52.5827 | 36.8624 |
| 0.0049 | 7.37 | 1230 | 0.0851 | 0.7673 | 0.6088 | 0.7589 | 52.9688 | 36.8624 |
| 0.0045 | 7.43 | 1240 | 0.0853 | 0.7661 | 0.6083 | 0.7575 | 52.9357 | 36.8624 |
| 0.0045 | 7.49 | 1250 | 0.0856 | 0.7667 | 0.6107 | 0.7583 | 53.1254 | 36.8624 |
| 0.0054 | 7.54 | 1260 | 0.0857 | 0.7691 | 0.615 | 0.7604 | 53.3457 | 36.8624 |
| 0.0052 | 7.6 | 1270 | 0.0857 | 0.7703 | 0.6158 | 0.7617 | 53.3061 | 36.8624 |
| 0.0055 | 7.66 | 1280 | 0.0858 | 0.7729 | 0.6205 | 0.7644 | 53.4336 | 36.8624 |
| 0.0053 | 7.72 | 1290 | 0.0858 | 0.7718 | 0.6163 | 0.7636 | 53.3378 | 36.8624 |
| 0.0052 | 7.78 | 1300 | 0.0859 | 0.7724 | 0.6206 | 0.7643 | 53.248 | 36.8624 |
| 0.0043 | 7.84 | 1310 | 0.0858 | 0.7715 | 0.6194 | 0.7633 | 53.441 | 36.8624 |
| 0.0048 | 7.9 | 1320 | 0.0857 | 0.7715 | 0.6194 | 0.7632 | 53.3966 | 36.8624 |
| 0.0053 | 7.96 | 1330 | 0.0856 | 0.7721 | 0.6212 | 0.7638 | 53.4548 | 36.8624 |
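
The ROUGE and SacreBLEU columns above can be reproduced with the evaluate library; a minimal sketch, assuming you already have decoded prediction and reference strings (the example strings below are placeholders):

```python
# Hedged sketch: compute the card's metrics from decoded strings.
import evaluate

rouge = evaluate.load("rouge")
sacrebleu = evaluate.load("sacrebleu")

predictions = ["Simplified output sentence."]    # placeholder
references = ["Reference simplified sentence."]  # placeholder

rouge_scores = rouge.compute(predictions=predictions, references=references)
bleu = sacrebleu.compute(
    predictions=predictions,
    # sacrebleu expects a list of reference candidates per prediction
    references=[[r] for r in references],
)
# ROUGE values are fractions in [0, 1]; SacreBLEU is on a 0-100 scale,
# matching the ranges reported in the table above.
print(rouge_scores["rouge1"], rouge_scores["rouge2"], rouge_scores["rougeL"])
print(bleu["score"])
```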

Framework versions

  • Transformers 4.33.0
  • PyTorch 2.1.2+cu121
  • Datasets 2.14.4
  • Tokenizers 0.13.3