Zichen

lkevinzc

https://lkevinzc.github.io/

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

Diffusion Language Models are Super Data Learners

upvoted a paper 2 months ago

Defeating the Training-Inference Mismatch via FP16

upvoted a paper 3 months ago

Imperceptible Jailbreaking against Large Language Models

Organizations

upvoted a paper about 2 months ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5, 2025 • 128

upvoted a paper 2 months ago

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30, 2025 • 29

upvoted 5 papers 3 months ago

upvoted a paper 6 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30, 2025 • 50

upvoted 4 papers 7 months ago

SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

Paper • 2506.02096 • Published Jun 2, 2025 • 52

Fostering Video Reasoning via Next-Event Prediction

Paper • 2505.22457 • Published May 28, 2025 • 29

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27, 2025 • 26

Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published May 26, 2025 • 23

upvoted a paper 8 months ago

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

Paper • 2505.13438 • Published May 19, 2025 • 36

upvoted 2 papers 9 months ago

Efficient Process Reward Model Training via Active Learning

Paper • 2504.10559 • Published Apr 14, 2025 • 13

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26, 2025 • 59

upvoted a collection 9 months ago

🌾Oat-Zero: Understanding R1-Zero-Like Training

Collection

5 items • Updated Apr 10, 2025 • 7

upvoted a paper 10 months ago

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

Paper • 2502.12982 • Published Feb 18, 2025 • 19

upvoted a paper about 1 year ago

Sample-Efficient Alignment for LLMs

Paper • 2411.01493 • Published Nov 3, 2024 • 12

upvoted a collection over 1 year ago

💡 DICE

Collection

Self-alignment with DPO Implicit Rewards • 5 items • Updated Jul 28, 2024 • 10

upvoted a paper over 1 year ago

RegMix: Data Mixture as Regression for Language Model Pre-training

Paper • 2407.01492 • Published Jul 1, 2024 • 40
