SEGAgentRL

non-profit

AI & ML interests

We aim to improve agent reinforcement learning along three axes: stability (S), efficiency (E), and generalization (G).

Recent Activity

dwenlong updated a collection about 17 hours ago

dwenlong updated a model about 18 hours ago

SEGAgentRL/LLDS-A-GRPO-Llama3.2-3B-Base-MA

dwenlong published a model about 18 hours ago

SEGAgentRL/LLDS-A-GRPO-Llama3.2-3B-Base-MA

SEGAgentRL's models (10)

SEGAgentRL/LLDS-A-GRPO-Llama3.2-3B-Base-MA

Reinforcement Learning • 4B • Updated about 18 hours ago • 4

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated 1 day ago • 27

SEGAgentRL/LLDS-R-GRPO-Qwen2.5-3B-Base

Reinforcement Learning • 3B • Updated 1 day ago • 22 • 1

SEGAgentRL/LLDS-R-GSPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated 1 day ago • 27

SEGAgentRL/LLDS-A-GSPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated 1 day ago • 32

SEGAgentRL/LLDS-R-GRPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated 1 day ago • 28

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-3B-Base

Reinforcement Learning • 3B • Updated 1 day ago • 17

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-3B-Base-MA

Reinforcement Learning • 3B • Updated 1 day ago • 34

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-7B-Base

Reinforcement Learning • 8B • Updated 1 day ago • 57 • 2

SEGAgentRL/LLDS-A-GRPO-Qwen2.5-7B-Ins

Reinforcement Learning • 8B • Updated 1 day ago • 81 • 2