WinstonWmj0512 committed on
Commit 4739542 · verified · 1 Parent(s): 795c9b8

Update README.md

Files changed (1):
  1. README.md (+3 −1)
README.md CHANGED
@@ -76,7 +76,9 @@ We trained four models using RLinf:
 ### Benchmark Results
 
 SFT models for LIBERO-90 and LIBERO-130 are trained by ourselves following the training recipe from [OpenVLA-OFT](https://github.com/moojink/openvla-oft/blob/main/vla-scripts/finetune.py); the other SFT models are from [SimpleVLA-RL](https://huggingface.co/collections/Haozhan72/simplevla-rl-6833311430cd9df52aeb1f86).
-> We conduct two evaluation runs using libero_seed = 0. Each run contains 500 episodes for the Object, Spatial, Goal, and Long suites, 4,500 episodes for LIBERO-90, and 6,500 episodes for LIBERO-130. We set do_sample = True during evaluation. The final results are reported as the average across the two runs.
+> We evaluate each model according to its training configuration, using libero_seed = 0 with 500 episodes for the Object, Spatial, Goal, and Long suites, 4,500 episodes for LIBERO-90, and 6,500 episodes for LIBERO-130.
+> For the SFT-trained (LoRA-based) models, we set do_sample = False.
+> For the RL-trained models, we set do_sample = True, temperature = 1.6, and rollout_epoch = 2; the final results are reported as the average across the two runs.
 
 | Model | Object | Spatial | Goal | Long | 90 | Average |
 | ------------------ | ------ | ------- | ----- | ----- | ------- | ------- |
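
The two evaluation modes described in the updated blockquote can be sketched as plain dictionaries. This is a hypothetical illustration only: the key names (`libero_seed`, `do_sample`, `temperature`, `rollout_epoch`) follow the wording of the README, and the helper `episodes_per_suite` is an invented name; the actual RLinf configuration schema may differ.

```python
# Hypothetical sketch of the two evaluation modes described in the README.
# Key names are illustrative; the real RLinf config keys may differ.

SFT_EVAL = {
    "libero_seed": 0,
    "do_sample": False,   # greedy decoding for SFT-trained (LoRA-based) models
}

RL_EVAL = {
    "libero_seed": 0,
    "do_sample": True,    # stochastic decoding for RL-trained models
    "temperature": 1.6,
    "rollout_epoch": 2,   # final results averaged across the two runs
}


def episodes_per_suite(suite: str) -> int:
    """Episode counts per benchmark suite, as stated in the README."""
    counts = {
        "Object": 500,
        "Spatial": 500,
        "Goal": 500,
        "Long": 500,
        "LIBERO-90": 4500,
        "LIBERO-130": 6500,
    }
    return counts[suite]
```

With these counts, one full pass over the four small suites plus LIBERO-90 and LIBERO-130 totals 13,000 episodes per run.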