Cross-lingual Transfer of Reward Models

This is the collection of synthetic preference data and trained reward models in "Cross-lingual Transfer of Reward Models in Multilingual Alignment".

Cross-lingual Transfer of Reward Models in Multilingual Alignment
Paper • 2410.18027 • Published Oct 23, 2024

iqwiki-kor/Llama3.2-3B-MP-RM
3B • Updated Nov 19, 2024 • 10

iqwiki-kor/Qwen2.5-3B-MP-RM
Text Classification • 3B • Updated Nov 19, 2024 • 16 • 1

iqwiki-kor/MP-86k
Updated Oct 23, 2024 • 86.4k • 21 • 3