| # YourMT3+ Local Setup Guide | |
| ## Quick Start (Local Installation) | |
| ### 1. Install Dependencies | |
| ```bash | |
| pip install torch torchaudio transformers gradio pytorch-lightning einops numpy librosa | |
| ``` | |
| ### 2. Setup Model Weights | |
| - Download YourMT3 model weights | |
| - Place them in: `amt/logs/2024/` | |
| - Default expected: `mc13_256_g4_all_v7_mt3f_sqr_rms_moe_wf4_n8k2_silu_rope_rp_b36_nops@last.ckpt` | |
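
Before launching anything, it can help to confirm that the checkpoint is where the guide says it should be. The snippet below is a minimal sketch, not part of the project: `check_checkpoint` is a hypothetical helper, and it takes the directory and filename above literally. If your download uses a different layout, adjust the two constants.

```python
from pathlib import Path
from typing import List, Tuple

# Values copied from this guide; change them if your checkpoint differs.
CHECKPOINT_DIR = Path("amt/logs/2024")
DEFAULT_CHECKPOINT = (
    "mc13_256_g4_all_v7_mt3f_sqr_rms_moe_wf4_n8k2_silu_rope_rp_b36_nops@last.ckpt"
)


def check_checkpoint(repo_root: Path, name: str = DEFAULT_CHECKPOINT) -> Tuple[bool, List[str]]:
    """Return (found, entries), where entries lists what the checkpoint directory contains."""
    ckpt_dir = repo_root / CHECKPOINT_DIR
    entries = sorted(p.name for p in ckpt_dir.iterdir()) if ckpt_dir.is_dir() else []
    return (ckpt_dir / name).exists(), entries


if __name__ == "__main__":
    found, entries = check_checkpoint(Path("."))
    if found:
        print("Checkpoint found")
    else:
        print(f"Checkpoint missing; {CHECKPOINT_DIR} contains: {entries}")
```

Run it from the repository root. If the checkpoint is missing, the printed directory listing makes a misspelled filename easy to spot.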
| ### 3. Run Setup Check | |
| ```bash | |
| cd /path/to/YourMT3 | |
| python setup_local.py | |
| ``` | |
| ### 4. Quick Test | |
| ```bash | |
| python test_local.py | |
| ``` | |
| ### 5. Launch Web Interface | |
| ```bash | |
| python app.py | |
| ``` | |
| Then open: http://127.0.0.1:7860 | |
| ## New Features | |
| ### Instrument Conditioning | |
| - **Problem**: YourMT3+ switches instruments mid-track (vocals → violin → guitar) | |
| - **Solution**: Select the target instrument from the dropdown | |
| - **Options**: Auto, Vocals, Guitar, Piano, Violin, Drums, Bass, Saxophone, Flute | |
| ### How It Works | |
| 1. **Upload audio** or paste a YouTube URL | |
| 2. **Select an instrument** from the dropdown menu | |
| 3. **Click Transcribe** | |
| 4. **Get a focused transcription** without instrument confusion | |
| ## Troubleshooting | |
| ### "Unknown event type: transcribe_singing" | |
| **This is expected.** The error means your checkpoint has no dedicated task tokens, which is normal. The system will: | |
| 1. Try task tokens first (this may fail, which is fine) | |
| 2. Fall back to post-processing filtering | |
| 3. Still produce a transcription filtered toward the instrument you selected | |
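
The try-then-fall-back behaviour described above can be sketched as follows. This is an illustration of the control flow only, not the code in `model_helper.py`: `transcribe_with_fallback` and its three callables are hypothetical, notes are reduced to `(program, pitch)` tuples, and the exception types caught here are an assumption.

```python
from typing import Callable, List, Tuple

Note = Tuple[int, int]  # (MIDI program, pitch): simplified stand-in for a note event


def transcribe_with_fallback(
    run_with_task_token: Callable[[], List[Note]],
    run_default: Callable[[], List[Note]],
    post_filter: Callable[[List[Note]], List[Note]],
) -> Tuple[List[Note], str]:
    """Try the task-token path first; if it fails, transcribe normally and filter."""
    try:
        return run_with_task_token(), "task_token"
    except (KeyError, ValueError) as err:
        # Assumed failure mode, e.g. "Unknown event type: transcribe_singing"
        print(f"Task token unavailable ({err}); falling back to post-processing")
        return post_filter(run_default()), "post_filter"
```

The second element of the return value records which path produced the result, which is handy when reading the debug output.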
| ### Debug Output | |
| Look for these messages in the console: | |
| ``` | |
| === TRANSCRIBE FUNCTION CALLED === | |
| Audio file: /path/to/audio.wav | |
| Instrument hint: vocals | |
| === INSTRUMENT CONDITIONING ACTIVATED === | |
| Model Task Configuration Debug: | |
| ✓ Model has task_manager | |
| Task name: mc13_full_plus_256 | |
| Available subtask prefixes: ['default'] | |
| === APPLYING INSTRUMENT FILTER === | |
| Found instruments in transcription: {0: 45, 100: 123, 40: 12} | |
| Primary instrument: 100 (68% of notes) | |
| Target program for vocals: 100 | |
| Converted 57 notes to primary instrument 100 | |
| ``` | |
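
To make the `APPLYING INSTRUMENT FILTER` lines easier to read, here is a rough sketch of the kind of bookkeeping they suggest: count notes per program, pick the most common one, and relabel the rest. It is an assumption about the approach, not the project's implementation. `convert_to_primary` is a hypothetical name, notes are simplified to `(program, pitch)` tuples, and the sketch ignores the instrument hint entirely.

```python
from collections import Counter
from typing import List, Tuple

Note = Tuple[int, int]  # (MIDI program, pitch): simplified stand-in for a note event


def convert_to_primary(notes: List[Note]) -> Tuple[List[Note], int, float, int]:
    """Relabel every note with the most common program.

    Returns (relabeled_notes, primary_program, primary_share, converted_count).
    """
    if not notes:
        raise ValueError("no notes to filter")
    counts = Counter(program for program, _ in notes)
    primary, primary_count = counts.most_common(1)[0]
    share = primary_count / len(notes)
    relabeled = [(primary, pitch) for _, pitch in notes]
    return relabeled, primary, share, len(notes) - primary_count


if __name__ == "__main__":
    # Same per-program counts as the sample log: {0: 45, 100: 123, 40: 12}
    notes = [(0, 60)] * 45 + [(100, 62)] * 123 + [(40, 64)] * 12
    relabeled, primary, share, converted = convert_to_primary(notes)
    print(f"Primary instrument: {primary} ({share:.0%} of notes)")
    print(f"Converted {converted} notes to primary instrument {primary}")
```

The converted count is simply the number of notes that did not already belong to the primary program.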
| ### Common Issues | |
| **1. Import Errors** | |
| ```bash | |
| pip install torch torchaudio transformers gradio pytorch-lightning einops numpy librosa | |
| ``` | |
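
If you are not sure which package is missing, a small check like the one below can narrow it down. It is a sketch using only the standard library. The pip-name-to-module mapping is taken from the install command in step 1, and `missing_packages` is a hypothetical helper, not something shipped with the project.

```python
import importlib.util
from typing import Dict, List

# pip package name -> importable module name
REQUIRED: Dict[str, str] = {
    "torch": "torch",
    "torchaudio": "torchaudio",
    "transformers": "transformers",
    "gradio": "gradio",
    "pytorch-lightning": "pytorch_lightning",
    "einops": "einops",
    "numpy": "numpy",
    "librosa": "librosa",
}


def missing_packages(required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All required packages are importable")
```

`find_spec` only checks that a module can be located. It does not import it, so a broken installation can still fail later.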
| **2. Model Not Found** | |
| - Download the model weights to `amt/logs/2024/` | |
| - Check that the filename matches the expected checkpoint name exactly | |
| **3. No Audio Examples** | |
| - Place test audio files in `examples/` folder | |
| - Supported formats: .wav, .mp3 | |
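
To see which files in `examples/` would be picked up, a quick listing such as this sketch can help. `list_examples` is a hypothetical helper. It matches on file extension only, using the two formats named above, and does not verify that the files actually decode.

```python
from pathlib import Path
from typing import List

SUPPORTED = {".wav", ".mp3"}  # formats named in this guide


def list_examples(folder: Path = Path("examples")) -> List[Path]:
    """Return the supported audio files in folder, sorted by name."""
    if not folder.is_dir():
        return []
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )


if __name__ == "__main__":
    files = list_examples()
    print(f"Found {len(files)} example file(s)")
    for path in files:
        print(f"  {path}")
```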
| **4. Port Already in Use** | |
| - Web interface runs on port 7860 | |
| - If busy, it will try 7861, 7862, etc. | |
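
If you want to know in advance which port is likely to be used, you can probe for the first free one yourself. The sketch below is independent of however `app.py` launches the interface: `first_free_port` is a hypothetical helper that tries to bind each candidate port on the loopback address. A port that is free now can still be taken by another process before the app starts.

```python
import socket
from typing import Optional


def first_free_port(start: int = 7860, attempts: int = 10, host: str = "127.0.0.1") -> Optional[int]:
    """Return the first port in [start, start + attempts) that can be bound, or None."""
    for port in range(start, min(start + attempts, 65536)):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is in use (or not bindable); try the next one
            return port
    return None


if __name__ == "__main__":
    port = first_free_port()
    print(f"First free port: {port}" if port is not None else "No free port found in range")
```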
| ## Expected Results | |
| ### Before (Original YourMT3+) | |
| - Vocals file → outputs: vocals + violin + guitar tracks | |
| - Saxophone solo → incomplete transcription | |
| - Flute solo → single note only | |
| ### After (With Instrument Conditioning) | |
| - Select "Vocals/Singing" → clean vocal transcription only | |
| - Select "Saxophone" → complete saxophone solo | |
| - Select "Flute" → full flute transcription | |
| ## Advanced Usage | |
| ### Command Line | |
| ```bash | |
| python transcribe_cli.py audio.wav --instrument vocals --verbose | |
| ``` | |
| ### Python API | |
| ```python | |
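| from model_helper import transcribe, load_model_checkpoint | |
| # Load the model. `model_args` is a placeholder for the checkpoint arguments | |
| # your setup uses; it is not defined in this snippet. | |
| model = load_model_checkpoint(args=model_args, device="cuda")  # use "cpu" if no GPU is available | |
| # Transcribe with instrument conditioning. `audio_info` is a placeholder that | |
| # describes the input audio and must be prepared by the caller. | |
| midifile = transcribe(model, audio_info, instrument_hint="vocals") | |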
| ``` | |
| ### Confidence Tuning | |
| - High confidence (0.8): Strict instrument filtering | |
| - Low confidence (0.4): Allows more mixed content | |
| - Auto-adjusts based on task token availability | |
| ## Files Modified | |
| - `app.py` - Added instrument dropdown to web interface | |
| - `model_helper.py` - Enhanced transcription with conditioning | |
| - `transcribe_cli.py` - New command-line interface | |
| - `setup_local.py` - Local setup checker | |
| - `test_local.py` - Quick functionality test | |
| ## Enjoy Better Transcriptions! | |
| No more instrument confusion: you now have full control over what gets transcribed! | |