WinstonWmj0512 committed on
Commit 4739542 · verified · 1 Parent(s): 795c9b8

Update README.md

Files changed (1):
  1. README.md (+3 −1)
README.md CHANGED
@@ -76,7 +76,9 @@ We trained four models using RLinf:
 ### Benchmark Results
 
 SFT models for LIBERO-90 and LIBERO-130 are trained by ourselves following the training recipe from [OpenVLA-OFT](https://github.com/moojink/openvla-oft/blob/main/vla-scripts/finetune.py); the other SFT models are from [SimpleVLA-RL](https://huggingface.co/collections/Haozhan72/simplevla-rl-6833311430cd9df52aeb1f86).
-> We conduct two evaluation runs using libero_seed = 0. Each run contains 500 episodes for the Object, Spatial, Goal, and Long suites, 4,500 episodes for LIBERO-90, and 6,500 episodes for LIBERO-130. We set do_sample = True during evaluation. The final results are reported as the average across the two runs.
+> We evaluate each model according to its training configuration, using libero_seed = 0 with 500 episodes for the Object, Spatial, Goal, and Long suites, 4,500 episodes for LIBERO-90, and 6,500 episodes for LIBERO-130.
+> For the SFT-trained (LoRA-based) models, we set do_sample = False.
+> For the RL-trained models, we set do_sample = True, temperature = 1.6, and rollout_epoch = 2; the final results are reported as the average across the two runs.
 
 | Model | Object | Spatial | Goal | Long | 90 | Average |
 | ------------------ | ------ | ------- | ----- | ----- | ------- | ------- |
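
The two evaluation modes described in the updated blockquote can be sketched as plain dictionaries. This is a hypothetical illustration only: the key names (`libero_seed`, `do_sample`, `temperature`, `rollout_epoch`) follow the wording of the README, and the helper `episodes_per_suite` is an invented name; the actual RLinf configuration schema may differ.

```python
# Hypothetical sketch of the two evaluation modes described in the README.
# Key names are illustrative; the real RLinf config keys may differ.

SFT_EVAL = {
    "libero_seed": 0,
    "do_sample": False,   # greedy decoding for SFT-trained (LoRA-based) models
}

RL_EVAL = {
    "libero_seed": 0,
    "do_sample": True,    # stochastic decoding for RL-trained models
    "temperature": 1.6,
    "rollout_epoch": 2,   # final results averaged across the two runs
}


def episodes_per_suite(suite: str) -> int:
    """Episode counts per benchmark suite, as stated in the README."""
    counts = {
        "Object": 500,
        "Spatial": 500,
        "Goal": 500,
        "Long": 500,
        "LIBERO-90": 4500,
        "LIBERO-130": 6500,
    }
    return counts[suite]
```

With these counts, one full pass over the four small suites plus LIBERO-90 and LIBERO-130 totals 13,000 episodes per run.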