Datasets and model checkpoints of our Group Relative Reward Model (GRRM) framework
Sen Yang
double7
AI & ML interests
None yet
Recent Activity
updated a collection 23 days ago
EnAnchored-X2X
updated a collection 23 days ago
EnAnchored-X2X
updated a collection 23 days ago
EnAnchored-X2X
Organizations
None yet
models 13
double7/Qwen2.5-7B-GRRM
Text Generation • 8B • Updated • 3
double7/Qwen2.5-7B-MT-GRRM-Optimized-CLA
Text Generation • 8B • Updated • 4
double7/Qwen2.5-7B-MT-GRRM-Optimized
Text Generation • 8B • Updated • 8
double7/mt.sft.v2
Text Generation • 333k • Updated • 2
double7/Qwen2.5-7B-SQM-GenRM
8B • Updated
double7/Qwen2.5-7b-EAX
Text Generation • 8B • Updated • 4
double7/Tower-7b-EAX
Text Generation • 7B • Updated • 7
double7/Llama-2-7b-EAX
Text Generation • 7B • Updated • 7
double7/Llama-2-7b-MT-SFT
Text Generation • 7B • Updated • 11
double7/Tower-7b-MT-SFT
Text Generation • 7B • Updated • 10