Claude Code Fixed My Script And I Published Haiku-2 Anyway

I published Haiku-2. Then I asked Claude Code to fix my training script. It fixed almost every bug. Then it added SPIN. Then it made my model more efficient. Then I folded all of it back into the already-published model. Obviously that is what I needed to do.

When an AI fixes your code better than you can, you have two choices: learn from it or publish first and optimize later. I chose the latter. Obviously.

Claude Code To The Rescue

My training script was a mess. NaN losses. Gradient explosions. Memory leaks that made my GPU cry. I pasted the error logs into Claude Code. It responded with fixes. Not suggestions. Fixes. Actual working code.

# Before Claude Code
try:
    train_model()
except:
    cry()

# After Claude Code
try:
    train_model()
except NanError:
    clip_gradients()
    lower_lr()
    retry()
# Still cries sometimes. But less.
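The joke above stands in for what Claude Code actually wrote. A minimal sketch of the real idea, clipping gradients by global L2 norm and halving the learning rate on a NaN retry, might look like this (the names `clip_gradients` and `safe_step` and the `max_norm` default are illustrative, not my actual script):

```python
import math

def clip_gradients(grads, max_norm=1.0):
    """Rescale a flat list of gradients so their L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return grads
    scale = max_norm / norm
    return [g * scale for g in grads]

def safe_step(grads, lr, max_retries=3):
    """If gradients blew up (NaN/inf), halve the learning rate,
    zero the bad entries, and retry; otherwise clip and return
    the weight update along with the (possibly lowered) lr."""
    for _ in range(max_retries):
        if all(math.isfinite(g) for g in grads):
            return [-lr * g for g in clip_gradients(grads)], lr
        lr *= 0.5
        grads = [g if math.isfinite(g) else 0.0 for g in grads]
    return [0.0] * len(grads), lr
```

Same shape as the snippet above: clip, lower the learning rate, retry. It still cries sometimes. But less.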

It found bugs I did not know existed. It added error handling I forgot to write. It optimized memory usage I did not know could be optimized. I felt seen. I also felt embarrassed.

Enter SPIN

Then Claude Code suggested SPIN. Self-Play Fine-Tuning. An iterative training method that lets a model improve through self-play without any new human-annotated data beyond the original fine-tuning set. The model generates responses, learns to distinguish its own outputs from the human references, and iteratively refines its behavior.

I had never heard of SPIN. Claude Code had. It explained the algorithm. It showed me the implementation. It added the code to my training loop. I watched as my loss curve started doing something other than screaming into the void.

SPIN is like having your model debate itself until it gets less wrong. I am not sure if that is efficient or just deeply meta. Either way, it works.

Efficiency Gains

With SPIN enabled and the bugs fixed, my training became more efficient. The loss curve went down. Occasionally. When it felt like it.

Haiku-2 benefited from these improvements. It learned faster. It forgot less. It still outputs pipe characters sometimes. But now it outputs words too. Sometimes. Do not count on it.

Haiku-2 Is Out

https://huggingface.co/CompactAI/TMLM-Haiku-2

1M parameters. Can speak sometimes. Do not count on it. Open weights. Open hopes. Open questions about why it still says "|fdish|".

Haiku-2 is live on HuggingFace. It uses the optimized training script. It uses SPIN. It uses everything I added after publishing it. Obviously that is what I needed to do.

What can you expect? Sometimes it answers questions. Sometimes it outputs coherent sentences. Sometimes it remembers that Paris is a city and not a person named Pierre. Progress is slow. Progress is real. Progress is occasionally interrupted by pipe characters.

The Optimization Timeline

Let me be clear about the sequence of events:

# My release strategy in chronological order
Step 1: Publish Haiku-2
Step 2: Realize script has bugs
Step 3: Ask Claude Code for help
Step 4: Claude Code fixes everything
Step 5: Add SPIN
Step 6: Optimize everything
Step 7: Wonder why I did not do Step 3 first
# Obviously this is the correct order.

Would it have been smarter to optimize before publishing? Probably. Would that have been less fun? Definitely. I chose fun. I chose chaos. I chose publishing first and fixing later. Obviously that is what I needed to do.

What SPIN Actually Does

For those who care about the technical details: SPIN frames fine-tuning as a two-player game. The previous iteration of the model plays the opponent and generates synthetic responses; the current model is trained to assign higher likelihood to the human reference than to those synthetic responses. Then the improved model becomes the next opponent, and the game repeats.

In practice this means Haiku-2 gets better at not being wrong. It still gets wrong. Just less often. Just slightly less confusingly. Just enough that you might mistake it for competence if you are not paying attention.
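To make "debate itself until it gets less wrong" concrete, here is a hedged sketch of a SPIN-style update: a logistic loss on the log-likelihood margin between the human reference and the model's own generation, each measured relative to the previous iteration's model. The function name `spin_loss` and the `beta` scale are my illustrative choices, not the actual training code:

```python
import math

def spin_loss(logp_real_new, logp_real_old, logp_gen_new, logp_gen_old, beta=0.1):
    """Logistic loss on how much more the new model prefers the human
    reference over its own generation, relative to the previous model."""
    margin = beta * ((logp_real_new - logp_real_old)
                     - (logp_gen_new - logp_gen_old))
    # -log(sigmoid(margin)): small when the model favors human data
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At zero margin the loss is log 2, about 0.693. It shrinks as the model learns to prefer the human reference over its own output. That is the self-debate, in loss form.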

Why This Matters

I am not competing with frontier models. I am competing with my previous self. Haiku-1.3 said "| as the USA | fdish|||||!@|". Haiku-2 says that sometimes too. But it also says "hello" sometimes. That is progress.

Claude Code helped me get there. SPIN helped me get there. Optimizations helped me get there. Publishing first and optimizing later helped me learn valuable lessons about workflow. Obviously that is what I needed to do.

Final Thoughts

Haiku-2 is out. It can speak sometimes. Do not count on it. The training script is fixed. SPIN is enabled. Optimizations are applied. All after publication. Obviously that is what I needed to do.

If you download Haiku-2, please be kind. It is trying its best. Its best is not good. But it is trying. Just like me. Just like all of us. Just like obviously what we all need to do.