RoadMa

RoadQAQ
AI & ML interests

None yet

Recent Activity

upvoted a paper 6 days ago
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters
liked a model 9 days ago
stepfun-ai/Step-3.5-Flash
liked a dataset 9 days ago
stepfun-ai/CF-Div2-Stepfun

Organizations

OpenDCAI

RoadQAQ's collections (1)

ReLIFT
ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT), achieving superior performance and efficiency compared to using RL or supervised fine-tuning (SFT) alone.
  • RoadQAQ/ReLIFT-Qwen2.5-7B-Zero

    Question Answering • 8B • Updated Jun 18, 2025 • 1 • 2
  • RoadQAQ/ReLIFT-Qwen2.5-Math-1.5B-Zero

    Question Answering • 2B • Updated Jun 12, 2025 • 61
  • RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero

    Question Answering • 8B • Updated Aug 27, 2025 • 70
  • Elliott/Openr1-Math-46k-8192

    Viewer • Updated Apr 23, 2025 • 45.8k • 210 • 9