PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning Paper • 2601.05593 • Published 6 days ago • 71
The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix Article • Nov 3, 2025 • 54
The Smol Training Playbook 📚 Space • Featured • The secrets to building world-class LLMs • 2.85k
MobileLLM-R1 Collection MobileLLM-R1, a series of sub-billion-parameter reasoning models • 10 items • Updated Nov 21, 2025 • 27
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 211
Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 1 day ago • 91
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14, 2025 • 145