view article Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL +4 Jun 3 • 96
view article Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Feb 7 • 255
view article Article How to generate text: using different decoding methods for language generation with Transformers Mar 1, 2020 • 270