mradermacher/Qwen3-14B-ARPO-DeepSearch-GGUF Reinforcement Learning • 15B • Updated Aug 12, 2025 • 339 • 5
ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q8 Reinforcement Learning • 8B • Updated Mar 28, 2025 • 1.45k • 195
Open-Reasoner-Zero/Open-Reasoner-Zero-Critic-32B Reinforcement Learning • 32B • Updated Apr 7, 2025 • 10 • 7
ValueFX9507/Tifa-Deepsex-14b-CoT-GGUF-Q4 Reinforcement Learning • 15B • Updated Feb 13, 2025 • 2.26k • 824
ValueFX9507/Tifa-Deepsex-14b-CoT-Q8 Reinforcement Learning • 15B • Updated Feb 13, 2025 • 7.84k • 186