Upload folder using huggingface_hub

Files changed:
- DO_THIS_NOW.md +139 -0
- GET_ERROR_LOGS.md +131 -0
- HOW_TO_DEBUG.md +284 -0
- app.py +440 -391
- debug_hf_space.py +182 -0
- get_logs.py +85 -0
- simple_check.py +119 -0
- test_local.py +127 -0
DO_THIS_NOW.md
ADDED
@@ -0,0 +1,139 @@

# ⚡ DO THIS NOW - Immediate Debugging Steps

## Your Problem
- You gave a prompt on the HF Space
- It took a long time
- It finally showed an error

---

## Run These 2 Commands (Takes 1 Minute)

### Command 1: Check HF Space Status
```powershell
python debug_hf_space.py
```

**This tells you:**
- Is your Space running?
- Is it using CPU or GPU?
- Are the models uploaded?
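If you want to script this kind of status check yourself, here is a minimal sketch using `huggingface_hub`'s `get_space_runtime`. This is my assumption about the sort of query `debug_hf_space.py` performs, not its actual contents, and `summarize_runtime` is a hypothetical helper:

```python
# Hypothetical sketch of a Space status check; debug_hf_space.py's real
# implementation may differ. The live call (commented out) needs network
# access and huggingface_hub installed:
#
#   from huggingface_hub import HfApi
#   runtime = HfApi().get_space_runtime("nocapdev/my-gradio-momask")
#   print(summarize_runtime(runtime.stage, runtime.hardware or "unknown"))

def summarize_runtime(stage: str, hardware: str) -> str:
    """Map the raw runtime fields onto the advice in this guide."""
    if stage != "RUNNING":
        return f"Space is not running (stage: {stage}) - check the build logs"
    if "cpu" in hardware.lower():
        return "Running on CPU - expect 10-30 minutes per prompt"
    return f"Running on {hardware} - generation should take well under a minute"
```

`stage` values like `RUNNING`, `SLEEPING`, or `RUNTIME_ERROR` come straight from the Space runtime API.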
### Command 2: Check What Error Actually Happened

**Manual step (30 seconds):**
1. Go to: https://huggingface.co/spaces/nocapdev/my-gradio-momask/logs
2. Scroll to the **BOTTOM**
3. Look for lines with **ERROR** or **Exception**
4. Copy the last 30 lines

---

## Share These Results

After running the commands above, you'll see one of these scenarios:

### Scenario A: Models Missing ❌
```
⚠️ NO checkpoint files found!
ERROR: Model checkpoints not found!
```

**Fix:**
Upload your `checkpoints/` folder to the HF Space (I'll show you how).

### Scenario B: Using CPU (Slow but OK) ⏳
```
⚠️ Using CPU (FREE tier)
• Generation time: 10-30 minutes per prompt
```

**Fix:**
This is NORMAL! Just wait 20-30 minutes for the first prompt,
or upgrade to a GPU for ~30x the speed.

### Scenario C: Out of Memory 💥
```
Killed
SIGKILL
OutOfMemoryError
```

**Fix:**
Upgrade the Space's RAM or optimize the model.

### Scenario D: Other Error ❓
```
Some other error message...
```

**Fix:**
Copy the full error and share it with me!

---

## Quick Actions Based on Results

### If models are missing:
```powershell
# I'll help you upload them - just let me know
# We'll use Git LFS or the HF web interface
```

### If using CPU (slow):
**Option 1:** Wait it out (FREE)
- Generation takes 15-30 minutes on CPU
- This is NORMAL behavior

**Option 2:** Upgrade to GPU
- Go to Space Settings → Hardware
- Select "T4 small" GPU
- Costs ~$0.60/hour
- ~30x faster (30 seconds vs 30 minutes)

### If out of memory:
```powershell
# Upgrade hardware in Space Settings
# Or we can optimize the code
```

---

## What to Share

Run this and copy the output:
```powershell
python debug_hf_space.py
```

Then go to the HF Logs and copy:
- The last 30-50 lines
- Any lines with "ERROR"
- Any lines with "Exception" or "Traceback"

Share both and I'll give you the exact fix!

---

## This Takes 2 Minutes

1. **Run:** `python debug_hf_space.py` (30 seconds)
2. **Visit:** the Logs tab on the HF Space (30 seconds)
3. **Copy:** the error messages (30 seconds)
4. **Share:** the results here (30 seconds)

Then I can tell you EXACTLY what to fix!

---

## Most Likely: One of These Two

**90% chance it's one of:**

1. **Models not uploaded to the HF Space**
   - Fix: upload the checkpoints folder

2. **Using CPU, so it's very slow**
   - Fix: wait longer OR upgrade to a GPU

The debug script will tell you which one!
GET_ERROR_LOGS.md
ADDED
@@ -0,0 +1,131 @@

# How to Get Your Error Logs

Since you saw an error, here's how to get the details:

## Step 1: Visit Your Space Logs

**Click this link:**
https://huggingface.co/spaces/nocapdev/my-gradio-momask/logs

## Step 2: What to Look For

Scroll through the logs and find:

### A. Startup Messages
Look for:
```
Using device: cpu
Loading models...
Models loaded successfully!
```

**OR error messages like:**
```
ERROR: Model checkpoints not found!
FileNotFoundError: ...
```

### B. When You Submitted the Prompt
Look for:
```
Generating motion for: 'your prompt here'
[1/4] Generating motion tokens...
```

**What happened after this?**
- Did it get stuck?
- Did it show an error?
- Did it say "Killed"?

### C. Any ERROR Lines
Search for (Ctrl+F):
- `ERROR`
- `Exception`
- `Traceback`
- `Killed`
- `SIGKILL`
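If scrolling the Logs tab is tedious, you can save its contents to a text file and filter it. The keyword list below just mirrors the search terms above; the helper itself is an illustration, not one of the repo's scripts:

```python
# Filter a saved copy of the Logs tab down to the lines worth sharing.
KEYWORDS = ("ERROR", "Exception", "Traceback", "Killed", "SIGKILL")

def interesting_lines(log_text: str) -> list[str]:
    """Return only the lines containing one of the error keywords."""
    return [line for line in log_text.splitlines()
            if any(keyword in line for keyword in KEYWORDS)]
```

Paste the copied log text in, and share whatever comes back along with the last 20 lines.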
## Step 3: Copy These Sections

Copy and share:

1. **Lines showing device and model loading** (first 20 lines)
2. **Lines from when you submitted your prompt** (around your text)
3. **Any ERROR or Exception messages** (full traceback)
4. **The last 20 lines** of the log

## Quick Copy Template

Share this format:

```
=== STARTUP ===
[First 20 lines from logs showing "Using device" and "Loading models"]

=== WHEN I SUBMITTED PROMPT ===
[Lines showing "Generating motion for: 'your prompt'"]
[What happened next]

=== ERROR (if any) ===
[Any ERROR or Exception messages]

=== LAST 20 LINES ===
[Last 20 lines from the bottom]
```

---

## Common Patterns to Look For

### Pattern 1: Models Missing
```
ERROR: Model checkpoints not found!
Looking for: ./checkpoints
FileNotFoundError: [Errno 2] No such file or directory
```
**→ Models weren't uploaded to the HF Space**

### Pattern 2: Using CPU (Slow but Normal)
```
Using device: cpu
[1/4] Generating motion tokens...
[stuck here for 20 minutes]
```
**→ CPU is slow; wait 30 minutes OR upgrade to a GPU**

### Pattern 3: Out of Memory
```
Killed
Process finished with exit code 137
```
**→ Ran out of RAM**

### Pattern 4: Import Error
```
ModuleNotFoundError: No module named 'xxx'
ImportError: cannot import name 'xxx'
```
**→ Missing dependency**

---

## Fast Way (If a Token Is Set)

If you have HUGGINGFACE_TOKEN set:
```powershell
python simple_check.py
```

This shows:
- Space status (Running/Stopped)
- Hardware (CPU/GPU)
- Whether the checkpoints exist

---

## What to Share

Just copy the logs and share them here. I'll identify the exact issue and give you the fix!

**Direct link to logs:**
https://huggingface.co/spaces/nocapdev/my-gradio-momask/logs
HOW_TO_DEBUG.md
ADDED
@@ -0,0 +1,284 @@

# How to Debug Your HF Space

## Your Situation
✅ Deployed successfully
⏳ Took a long time to respond
❌ Finally showed an error

---

## Step-by-Step Debugging

### Step 1: Run Local Diagnosis (30 seconds)

```powershell
# Check your HF Space status
python debug_hf_space.py
```

This will tell you:
- ✅ Whether the Space is running
- ✅ What hardware it's using (CPU vs GPU)
- ✅ Whether the model files are uploaded
- ✅ Common issues

### Step 2: Get the Actual Error (MOST IMPORTANT)

Go to your Space and copy the error:

1. **Visit:** https://huggingface.co/spaces/nocapdev/my-gradio-momask
2. **Click:** the "Logs" tab (top right)
3. **Scroll** to the bottom
4. **Copy** the last 30-50 lines

**What to look for:**
- Lines with `ERROR` or `Exception`
- Lines with `Traceback`
- The very last error message

### Step 3: Common Error Patterns

#### Error A: "Model checkpoints not found"
```
ERROR: Model checkpoints not found!
Looking for: ./checkpoints
FileNotFoundError: [Errno 2] No such file or directory
```

**Cause:** Model files weren't uploaded to the HF Space
**Solution:** Upload the checkpoints (see below)

#### Error B: "CUDA out of memory"
```
RuntimeError: CUDA out of memory
torch.cuda.OutOfMemoryError
```

**Cause:** Model too large for GPU RAM
**Solution:** Use a larger GPU or optimize the model

#### Error C: "Killed" or "SIGKILL"
```
Killed
Process finished with exit code 137
```

**Cause:** Out of RAM (CPU memory)
**Solution:** Upgrade the Space's RAM or optimize the code

#### Error D: Stuck at "Generating motion tokens..."
```
[1/4] Generating motion tokens...
[No more output for 20+ minutes]
```

**Cause:** Using CPU (very slow, not an error!)
**Solution:** Wait 20-30 minutes OR upgrade to a GPU

---

## Solutions for Common Issues

### Solution 1: Upload Model Checkpoints

**If the error shows:** `Model checkpoints not found`

#### Option A: Upload via Git (for files <10GB)

```bash
# Clone your Space
git clone https://huggingface.co/spaces/nocapdev/my-gradio-momask
cd my-gradio-momask

# Install Git LFS (one time)
git lfs install

# Track large files
git lfs track "checkpoints/**/*.tar"
git lfs track "checkpoints/**/*.pth"
git lfs track "checkpoints/**/*.npy"

# Copy your checkpoints
# FROM: C:\Users\purva\OneDrive\Desktop\momaskhg\checkpoints
# TO: current directory
cp -r /path/to/checkpoints ./

# Commit and push
git add .gitattributes
git add checkpoints/
git commit -m "Add model checkpoints"
git push
```

#### Option B: Upload via the HF Web UI

1. Go to: https://huggingface.co/spaces/nocapdev/my-gradio-momask/tree/main
2. Click "Add file" → "Upload files"
3. Drag your `checkpoints/` folder in
4. Click "Commit"

**Note:** This works for files <50MB. For larger files, use Git LFS.

#### Option C: Host Models Separately

Upload the models to the HF Model Hub, then download them in app.py:

```python
from huggingface_hub import snapshot_download

# Add to app.py before initializing the generator
if not os.path.exists('./checkpoints'):
    print("Downloading models from HF Hub...")
    snapshot_download(
        repo_id="YOUR_USERNAME/momask-models",
        local_dir="./checkpoints"
    )
```
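Before re-uploading anything, you can verify locally that every file `app.py` tries to load is actually present. The path layout below is taken from the `torch.load`/`np.load` calls in `app.py`; the directory names (`t2m` and the vq/transformer run names) are placeholders you would replace with your own:

```python
import os

# Files app.py loads, relative to the checkpoints root. The run names
# below are placeholders - substitute the ones your Space actually uses.
REQUIRED = [
    "{ds}/{vq}/opt.txt",
    "{ds}/{vq}/model/net_best_fid.tar",
    "{ds}/{vq}/meta/mean.npy",
    "{ds}/{vq}/meta/std.npy",
    "{ds}/{t2m}/opt.txt",
    "{ds}/{t2m}/model/latest.tar",
    "{ds}/length_estimator/model/finest.tar",
]

def missing_checkpoints(root="./checkpoints", ds="t2m", vq="my_vq_run", t2m="my_t2m_run"):
    """List the expected checkpoint files that are absent under root."""
    paths = [os.path.join(root, rel.format(ds=ds, vq=vq, t2m=t2m)) for rel in REQUIRED]
    return [p for p in paths if not os.path.exists(p)]
```

Run it with your own run names both locally and (via a temporary print in app.py) on the Space; any path it returns is a file the Space will fail to load.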
---

### Solution 2: Upgrade Hardware (for speed)

If you're using CPU and it's too slow:

1. Go to: https://huggingface.co/spaces/nocapdev/my-gradio-momask/settings
2. Scroll to "Hardware"
3. Select:
   - **T4 small** (~$0.60/hour) - good for this app
   - **A10G small** (~$3/hour) - faster
4. Click "Save"
5. Wait for the rebuild (~2 minutes)
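To estimate the cost before clicking Save, the helper below just multiplies out the approximate rates quoted above (ballpark figures, not official pricing). The hardware can also be switched from a script via `huggingface_hub`, shown in the comment as an assumed alternative:

```python
# Rough cost estimate using the ballpark hourly rates quoted above.
# (Programmatic alternative to the Settings page, assuming a write token:
#    from huggingface_hub import HfApi
#    HfApi().request_space_hardware("nocapdev/my-gradio-momask", hardware="t4-small"))
RATES_PER_HOUR = {"t4-small": 0.60, "a10g-small": 3.00}

def session_cost(hardware: str, hours: float) -> float:
    """Approximate cost of keeping the given hardware up for `hours`."""
    return round(RATES_PER_HOUR[hardware] * hours, 2)
```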
---

### Solution 3: Test Locally First

Before debugging on HF, test locally:

```powershell
# 1. Test your setup
python test_local.py

# 2. Run the app locally
python app.py

# 3. Visit http://localhost:7860
# 4. Try a prompt
# 5. Check the terminal for errors
```

**If it works locally but fails on HF:**
- The models probably weren't uploaded to the HF Space
- Or the HF Space is using different Python/package versions

---

## Debugging Checklist

Run through this checklist:

### ✅ Pre-deployment
- [ ] `python test_local.py` passes
- [ ] App works locally at http://localhost:7860
- [ ] Models are in the `./checkpoints/` directory
- [ ] `python pre_deploy_check.py` shows 8/8 PASS

### ✅ Post-deployment
- [ ] Space shows "Running" status
- [ ] Logs show "Using device: cpu/cuda"
- [ ] Logs show "Models loaded successfully!"
- [ ] No error messages in the logs

### ✅ During generation
- [ ] Logs show "[1/4] Generating motion tokens..."
- [ ] Logs show progress through [2/4], [3/4], [4/4]
- [ ] No "Killed" or "SIGKILL" messages

---

## Quick Diagnosis Commands

```powershell
# Check HF Space status
python debug_hf_space.py

# Test local setup
python test_local.py

# Validate before deploy
python pre_deploy_check.py

# Deploy with latest fixes
python deploy.py
```

---

## Expected Logs (Healthy Run)

### Startup (you should see this):
```
Using device: cpu (or cuda)
Loading models...
✓ VQ model loaded
✓ Transformer loaded
✓ Residual model loaded
✓ Length estimator loaded
Models loaded successfully!
Running on local URL: http://0.0.0.0:7860
```

### During generation (you should see this):
```
======================================================================
Generating motion for: 'a person walks forward'
======================================================================
[1/4] Generating motion tokens...
✓ Generated 80 frames
[2/4] Converting to BVH format...
✓ BVH conversion complete
[3/4] Rendering video...
✓ Video saved to ./gradio_outputs/motion_12345.mp4
[4/4] Complete!
======================================================================
```

---

## Still Stuck?

### Share these with me:

1. **Output from:**
   ```powershell
   python debug_hf_space.py
   ```

2. **The last 50 lines from the HF Space Logs**
   - Go to the Logs tab
   - Copy from the bottom
   - Include any ERROR or Traceback

3. **What you see in the browser**
   - A screenshot of the error
   - Or a copy of the error message

Then I can give you the exact fix!

---

## Most Likely Issues (90% of cases)

1. **CPU is slow** (not an error!)
   - Logs show: "Using device: cpu"
   - Solution: wait 20 minutes OR upgrade to a GPU

2. **Models not uploaded**
   - Logs show: "Model checkpoints not found"
   - Solution: upload the checkpoints to the HF Space

3. **Out of memory**
   - Logs show: "Killed" or "SIGKILL"
   - Solution: upgrade to more RAM

Run `python debug_hf_space.py` first - it will identify which one!
app.py
CHANGED
@@ -1,391 +1,440 @@

Previous version (the removed lines of the diff; the excerpt is truncated partway through the file):

```python
import os
from os.path import join as pjoin
import gradio as gr
import torch
import torch.nn.functional as F
import numpy as np
from torch.distributions.categorical import Categorical

from models.mask_transformer.transformer import MaskTransformer, ResidualTransformer
from models.vq.model import RVQVAE, LengthEstimator
from utils.get_opt import get_opt
from utils.fixseed import fixseed
from visualization.joints2bvh import Joint2BVHConvertor
from utils.motion_process import recover_from_ric
from utils.plot_script import plot_3d_motion
from utils.paramUtil import t2m_kinematic_chain

clip_version = 'ViT-B/32'

class MotionGenerator:
    def __init__(self, checkpoints_dir, dataset_name, model_name, res_name, vq_name, device='cuda'):
        self.device = torch.device(device if torch.cuda.is_available() else 'cpu')
        self.dataset_name = dataset_name
        self.dim_pose = 251 if dataset_name == 'kit' else 263
        self.nb_joints = 21 if dataset_name == 'kit' else 22

        # Load models
        print("Loading models...")
        self.vq_model, self.vq_opt = self._load_vq_model(checkpoints_dir, dataset_name, vq_name)
        self.t2m_transformer = self._load_trans_model(checkpoints_dir, dataset_name, model_name)
        self.res_model = self._load_res_model(checkpoints_dir, dataset_name, res_name, self.vq_opt)
        self.length_estimator = self._load_len_estimator(checkpoints_dir, dataset_name)

        # Set to eval mode
        self.vq_model.eval()
        self.t2m_transformer.eval()
        self.res_model.eval()
        self.length_estimator.eval()

        # Load normalization stats
        meta_dir = pjoin(checkpoints_dir, dataset_name, vq_name, 'meta')
        self.mean = np.load(pjoin(meta_dir, 'mean.npy'))
        self.std = np.load(pjoin(meta_dir, 'std.npy'))

        self.kinematic_chain = t2m_kinematic_chain
        self.converter = Joint2BVHConvertor()

        print("Models loaded successfully!")

    def _load_vq_model(self, checkpoints_dir, dataset_name, vq_name):
        vq_opt_path = pjoin(checkpoints_dir, dataset_name, vq_name, 'opt.txt')
        vq_opt = get_opt(vq_opt_path, device=self.device)
        vq_opt.dim_pose = self.dim_pose

        vq_model = RVQVAE(vq_opt,
                          vq_opt.dim_pose,
                          vq_opt.nb_code,
                          vq_opt.code_dim,
                          vq_opt.output_emb_width,
                          vq_opt.down_t,
                          vq_opt.stride_t,
                          vq_opt.width,
                          vq_opt.depth,
                          vq_opt.dilation_growth_rate,
                          vq_opt.vq_act,
                          vq_opt.vq_norm)

        ckpt = torch.load(pjoin(checkpoints_dir, dataset_name, vq_name, 'model', 'net_best_fid.tar'),
                          map_location=self.device)
        model_key = 'vq_model' if 'vq_model' in ckpt else 'net'
        vq_model.load_state_dict(ckpt[model_key])
        vq_model.to(self.device)

        return vq_model, vq_opt

    def _load_trans_model(self, checkpoints_dir, dataset_name, model_name):
        model_opt_path = pjoin(checkpoints_dir, dataset_name, model_name, 'opt.txt')
        model_opt = get_opt(model_opt_path, device=self.device)

        model_opt.num_tokens = self.vq_opt.nb_code
        model_opt.num_quantizers = self.vq_opt.num_quantizers
        model_opt.code_dim = self.vq_opt.code_dim

        # Set default values for missing attributes
        if not hasattr(model_opt, 'latent_dim'):
            model_opt.latent_dim = 384
        if not hasattr(model_opt, 'ff_size'):
            model_opt.ff_size = 1024
        if not hasattr(model_opt, 'n_layers'):
            model_opt.n_layers = 8
        if not hasattr(model_opt, 'n_heads'):
            model_opt.n_heads = 6
        if not hasattr(model_opt, 'dropout'):
            model_opt.dropout = 0.1
        if not hasattr(model_opt, 'cond_drop_prob'):
            model_opt.cond_drop_prob = 0.1

        t2m_transformer = MaskTransformer(code_dim=model_opt.code_dim,
                                          cond_mode='text',
                                          latent_dim=model_opt.latent_dim,
                                          ff_size=model_opt.ff_size,
                                          num_layers=model_opt.n_layers,
                                          num_heads=model_opt.n_heads,
                                          dropout=model_opt.dropout,
                                          clip_dim=512,
                                          cond_drop_prob=model_opt.cond_drop_prob,
                                          clip_version=clip_version,
                                          opt=model_opt)

        ckpt = torch.load(pjoin(checkpoints_dir, dataset_name, model_name, 'model', 'latest.tar'),
                          map_location=self.device)
        model_key = 't2m_transformer' if 't2m_transformer' in ckpt else 'trans'
        t2m_transformer.load_state_dict(ckpt[model_key], strict=False)
        t2m_transformer.to(self.device)

        return t2m_transformer

    def _load_res_model(self, checkpoints_dir, dataset_name, res_name, vq_opt):
        res_opt_path = pjoin(checkpoints_dir, dataset_name, res_name, 'opt.txt')
        res_opt = get_opt(res_opt_path, device=self.device)

        # The res_name appears to be the same as vq_name, so res_opt is actually vq_opt
        # We need to use proper model architecture parameters
        res_opt.num_quantizers = vq_opt.num_quantizers
        res_opt.num_tokens = vq_opt.nb_code

        # Set architecture parameters for ResidualTransformer
        # These should match the main transformer architecture
        res_opt.latent_dim = 384  # Match with main transformer
        res_opt.ff_size = 1024
        res_opt.n_layers = 9  # Typically slightly more layers for residual
        res_opt.n_heads = 6
        res_opt.dropout = 0.1
        res_opt.cond_drop_prob = 0.1
        res_opt.share_weight = False

        print(f"ResidualTransformer config - latent_dim: {res_opt.latent_dim}, ff_size: {res_opt.ff_size}, nlayers: {res_opt.n_layers}, nheads: {res_opt.n_heads}, dropout: {res_opt.dropout}")

        res_transformer = ResidualTransformer(code_dim=vq_opt.code_dim,
                                              cond_mode='text',
                                              latent_dim=res_opt.latent_dim,
                                              ff_size=res_opt.ff_size,
                                              num_layers=res_opt.n_layers,
                                              num_heads=res_opt.n_heads,
                                              dropout=res_opt.dropout,
                                              clip_dim=512,
                                              shared_codebook=vq_opt.shared_codebook,
                                              cond_drop_prob=res_opt.cond_drop_prob,
                                              share_weight=res_opt.share_weight,
                                              clip_version=clip_version,
                                              opt=res_opt)

        ckpt = torch.load(pjoin(checkpoints_dir, dataset_name, res_name, 'model', 'net_best_fid.tar'),
                          map_location=self.device)

        # Debug: check available keys
        print(f"Available checkpoint keys: {ckpt.keys()}")

        # Try different possible keys for the model state dict
        model_key = None
        for key in ['res_transformer', 'trans', 'net', 'model', 'state_dict']:
            if key in ckpt:
                model_key = key
                break

        if model_key:
            print(f"Loading ResidualTransformer from key: {model_key}")
            res_transformer.load_state_dict(ckpt[model_key], strict=False)
        else:
            print("Warning: Could not find model weights in checkpoint. Available keys:", list(ckpt.keys()))
            # If this is actually a VQ model checkpoint, we might need to skip loading or handle differently
            if 'vq_model' in ckpt or 'net' in ckpt:
                print("This appears to be a VQ model checkpoint, not a ResidualTransformer checkpoint.")
                print("Skipping weight loading - using randomly initialized ResidualTransformer.")

        res_transformer.to(self.device)

        return res_transformer

    def _load_len_estimator(self, checkpoints_dir, dataset_name):
        model = LengthEstimator(512, 50)
        ckpt = torch.load(pjoin(checkpoints_dir, dataset_name, 'length_estimator', 'model', 'finest.tar'),
                          map_location=self.device)
        model.load_state_dict(ckpt['estimator'])
        model.to(self.device)
        return model

    def inv_transform(self, data):
        return data * self.std + self.mean

    @torch.no_grad()
    def generate(self, text_prompt, motion_length=0, time_steps=18, cond_scale=4,
                 temperature=1, topkr=0.9, gumbel_sample=True, seed=42):
        """
        Generate motion from text prompt

        Args:
            text_prompt: Text description of the motion
            motion_length: Desired motion length (0 for auto-estimation)
            time_steps: Number of denoising steps
            cond_scale: Classifier-free guidance scale
            temperature: Sampling temperature
            topkr: Top-k filtering threshold
            gumbel_sample: Whether to use Gumbel sampling
            seed: Random seed
        """
        fixseed(seed)

        # Convert motion_length to int if needed
        if isinstance(motion_length, float):
            motion_length = int(motion_length)

        # Estimate length if not provided
        if motion_length == 0:
            text_embedding = self.t2m_transformer.encode_text([text_prompt])
            pred_dis = self.length_estimator(text_embedding)
            probs = F.softmax(pred_dis, dim=-1)
            token_lens = Categorical(probs).sample()
        else:
            token_lens = torch.LongTensor([motion_length // 4]).to(self.device)

        m_length = token_lens * 4

        # Generate motion tokens
        mids = self.t2m_transformer.generate([text_prompt], token_lens,
                                             timesteps=int(time_steps),
                                             cond_scale=float(cond_scale),
                                             temperature=float(temperature),
                                             topk_filter_thres=float(topkr),
                                             gsample=gumbel_sample)

        # Refine with residual transformer
        mids = self.res_model.generate(mids, [text_prompt], token_lens,
                                       temperature=1, cond_scale=5)

        # Decode to motion
        pred_motions = self.vq_model.forward_decoder(mids)
        pred_motions = pred_motions.detach().cpu().numpy()

        # Denormalize
        data = self.inv_transform(pred_motions)
        joint_data = data[0, :m_length[0]]

        # Recover 3D joints
        joint = recover_from_ric(torch.from_numpy(joint_data).float(), self.nb_joints).numpy()

        return joint, int(m_length[0].item())


def create_gradio_interface(generator, output_dir='./gradio_outputs'):
    os.makedirs(output_dir, exist_ok=True)

    def generate_motion(text_prompt):
        try:
```
| 359 |
-
|
| 360 |
-
|
| 361 |
-
|
| 362 |
-
|
| 363 |
-
|
| 364 |
-
|
| 365 |
-
|
| 366 |
-
|
| 367 |
-
|
| 368 |
-
|
| 369 |
-
|
| 370 |
-
|
| 371 |
-
|
| 372 |
-
|
| 373 |
-
|
| 374 |
-
|
| 375 |
-
|
| 376 |
-
|
| 377 |
-
|
| 378 |
-
|
| 379 |
-
|
| 380 |
-
|
| 381 |
-
|
| 382 |
-
|
| 383 |
-
|
| 384 |
-
|
| 385 |
-
|
| 386 |
-
|
| 387 |
-
|
| 388 |
-
|
| 389 |
-
|
| 390 |
-
|
| 391 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```python
import os
from os.path import join as pjoin
import gradio as gr
import torch
import torch.nn.functional as F
import numpy as np
from torch.distributions.categorical import Categorical

from models.mask_transformer.transformer import MaskTransformer, ResidualTransformer
from models.vq.model import RVQVAE, LengthEstimator
from utils.get_opt import get_opt
from utils.fixseed import fixseed
from visualization.joints2bvh import Joint2BVHConvertor
from utils.motion_process import recover_from_ric
from utils.plot_script import plot_3d_motion
from utils.paramUtil import t2m_kinematic_chain

clip_version = 'ViT-B/32'


class MotionGenerator:
    def __init__(self, checkpoints_dir, dataset_name, model_name, res_name, vq_name, device='cuda'):
        self.device = torch.device(device if torch.cuda.is_available() else 'cpu')
        self.dataset_name = dataset_name
        self.dim_pose = 251 if dataset_name == 'kit' else 263
        self.nb_joints = 21 if dataset_name == 'kit' else 22

        # Load models
        print("Loading models...")
        self.vq_model, self.vq_opt = self._load_vq_model(checkpoints_dir, dataset_name, vq_name)
        self.t2m_transformer = self._load_trans_model(checkpoints_dir, dataset_name, model_name)
        self.res_model = self._load_res_model(checkpoints_dir, dataset_name, res_name, self.vq_opt)
        self.length_estimator = self._load_len_estimator(checkpoints_dir, dataset_name)

        # Set to eval mode
        self.vq_model.eval()
        self.t2m_transformer.eval()
        self.res_model.eval()
        self.length_estimator.eval()

        # Load normalization stats
        meta_dir = pjoin(checkpoints_dir, dataset_name, vq_name, 'meta')
        self.mean = np.load(pjoin(meta_dir, 'mean.npy'))
        self.std = np.load(pjoin(meta_dir, 'std.npy'))

        self.kinematic_chain = t2m_kinematic_chain
        self.converter = Joint2BVHConvertor()

        print("Models loaded successfully!")

    def _load_vq_model(self, checkpoints_dir, dataset_name, vq_name):
        vq_opt_path = pjoin(checkpoints_dir, dataset_name, vq_name, 'opt.txt')
        vq_opt = get_opt(vq_opt_path, device=self.device)
        vq_opt.dim_pose = self.dim_pose

        vq_model = RVQVAE(vq_opt,
                          vq_opt.dim_pose,
                          vq_opt.nb_code,
                          vq_opt.code_dim,
                          vq_opt.output_emb_width,
                          vq_opt.down_t,
                          vq_opt.stride_t,
                          vq_opt.width,
                          vq_opt.depth,
                          vq_opt.dilation_growth_rate,
                          vq_opt.vq_act,
                          vq_opt.vq_norm)

        ckpt = torch.load(pjoin(checkpoints_dir, dataset_name, vq_name, 'model', 'net_best_fid.tar'),
                          map_location=self.device)
        model_key = 'vq_model' if 'vq_model' in ckpt else 'net'
        vq_model.load_state_dict(ckpt[model_key])
        vq_model.to(self.device)

        return vq_model, vq_opt

    def _load_trans_model(self, checkpoints_dir, dataset_name, model_name):
        model_opt_path = pjoin(checkpoints_dir, dataset_name, model_name, 'opt.txt')
        model_opt = get_opt(model_opt_path, device=self.device)

        model_opt.num_tokens = self.vq_opt.nb_code
        model_opt.num_quantizers = self.vq_opt.num_quantizers
        model_opt.code_dim = self.vq_opt.code_dim

        # Set default values for missing attributes
        if not hasattr(model_opt, 'latent_dim'):
            model_opt.latent_dim = 384
        if not hasattr(model_opt, 'ff_size'):
            model_opt.ff_size = 1024
        if not hasattr(model_opt, 'n_layers'):
            model_opt.n_layers = 8
        if not hasattr(model_opt, 'n_heads'):
            model_opt.n_heads = 6
        if not hasattr(model_opt, 'dropout'):
            model_opt.dropout = 0.1
        if not hasattr(model_opt, 'cond_drop_prob'):
            model_opt.cond_drop_prob = 0.1

        t2m_transformer = MaskTransformer(code_dim=model_opt.code_dim,
                                          cond_mode='text',
                                          latent_dim=model_opt.latent_dim,
                                          ff_size=model_opt.ff_size,
                                          num_layers=model_opt.n_layers,
                                          num_heads=model_opt.n_heads,
                                          dropout=model_opt.dropout,
                                          clip_dim=512,
                                          cond_drop_prob=model_opt.cond_drop_prob,
                                          clip_version=clip_version,
                                          opt=model_opt)

        ckpt = torch.load(pjoin(checkpoints_dir, dataset_name, model_name, 'model', 'latest.tar'),
                          map_location=self.device)
        model_key = 't2m_transformer' if 't2m_transformer' in ckpt else 'trans'
        t2m_transformer.load_state_dict(ckpt[model_key], strict=False)
        t2m_transformer.to(self.device)

        return t2m_transformer

    def _load_res_model(self, checkpoints_dir, dataset_name, res_name, vq_opt):
        res_opt_path = pjoin(checkpoints_dir, dataset_name, res_name, 'opt.txt')
        res_opt = get_opt(res_opt_path, device=self.device)

        # The res_name appears to be the same as vq_name, so res_opt is actually vq_opt
        # We need to use proper model architecture parameters
        res_opt.num_quantizers = vq_opt.num_quantizers
        res_opt.num_tokens = vq_opt.nb_code

        # Set architecture parameters for ResidualTransformer
        # These should match the main transformer architecture
        res_opt.latent_dim = 384  # Match with main transformer
        res_opt.ff_size = 1024
        res_opt.n_layers = 9  # Typically slightly more layers for residual
        res_opt.n_heads = 6
        res_opt.dropout = 0.1
        res_opt.cond_drop_prob = 0.1
        res_opt.share_weight = False

        print(f"ResidualTransformer config - latent_dim: {res_opt.latent_dim}, ff_size: {res_opt.ff_size}, "
              f"n_layers: {res_opt.n_layers}, n_heads: {res_opt.n_heads}, dropout: {res_opt.dropout}")

        res_transformer = ResidualTransformer(code_dim=vq_opt.code_dim,
                                              cond_mode='text',
                                              latent_dim=res_opt.latent_dim,
                                              ff_size=res_opt.ff_size,
                                              num_layers=res_opt.n_layers,
                                              num_heads=res_opt.n_heads,
                                              dropout=res_opt.dropout,
                                              clip_dim=512,
                                              shared_codebook=vq_opt.shared_codebook,
                                              cond_drop_prob=res_opt.cond_drop_prob,
                                              share_weight=res_opt.share_weight,
                                              clip_version=clip_version,
                                              opt=res_opt)

        ckpt = torch.load(pjoin(checkpoints_dir, dataset_name, res_name, 'model', 'net_best_fid.tar'),
                          map_location=self.device)

        # Debug: check available keys
        print(f"Available checkpoint keys: {ckpt.keys()}")

        # Try different possible keys for the model state dict
        model_key = None
        for key in ['res_transformer', 'trans', 'net', 'model', 'state_dict']:
            if key in ckpt:
                model_key = key
                break

        if model_key:
            print(f"Loading ResidualTransformer from key: {model_key}")
            res_transformer.load_state_dict(ckpt[model_key], strict=False)
        else:
            print("Warning: Could not find model weights in checkpoint. Available keys:", list(ckpt.keys()))
            # If this is actually a VQ model checkpoint, we might need to skip loading or handle differently
            if 'vq_model' in ckpt or 'net' in ckpt:
                print("This appears to be a VQ model checkpoint, not a ResidualTransformer checkpoint.")
                print("Skipping weight loading - using randomly initialized ResidualTransformer.")

        res_transformer.to(self.device)

        return res_transformer

    def _load_len_estimator(self, checkpoints_dir, dataset_name):
        model = LengthEstimator(512, 50)
        ckpt = torch.load(pjoin(checkpoints_dir, dataset_name, 'length_estimator', 'model', 'finest.tar'),
                          map_location=self.device)
        model.load_state_dict(ckpt['estimator'])
        model.to(self.device)
        return model

    def inv_transform(self, data):
        return data * self.std + self.mean

    @torch.no_grad()
    def generate(self, text_prompt, motion_length=0, time_steps=18, cond_scale=4,
                 temperature=1, topkr=0.9, gumbel_sample=True, seed=42):
        """
        Generate motion from text prompt

        Args:
            text_prompt: Text description of the motion
            motion_length: Desired motion length (0 for auto-estimation)
            time_steps: Number of denoising steps
            cond_scale: Classifier-free guidance scale
            temperature: Sampling temperature
            topkr: Top-k filtering threshold
            gumbel_sample: Whether to use Gumbel sampling
            seed: Random seed
        """
        fixseed(seed)

        # Convert motion_length to int if needed
        if isinstance(motion_length, float):
            motion_length = int(motion_length)

        # Estimate length if not provided
        if motion_length == 0:
            text_embedding = self.t2m_transformer.encode_text([text_prompt])
            pred_dis = self.length_estimator(text_embedding)
            probs = F.softmax(pred_dis, dim=-1)
            token_lens = Categorical(probs).sample()
        else:
            token_lens = torch.LongTensor([motion_length // 4]).to(self.device)

        m_length = token_lens * 4

        # Generate motion tokens
        mids = self.t2m_transformer.generate([text_prompt], token_lens,
                                             timesteps=int(time_steps),
                                             cond_scale=float(cond_scale),
                                             temperature=float(temperature),
                                             topk_filter_thres=float(topkr),
                                             gsample=gumbel_sample)

        # Refine with residual transformer
        mids = self.res_model.generate(mids, [text_prompt], token_lens,
                                       temperature=1, cond_scale=5)

        # Decode to motion
        pred_motions = self.vq_model.forward_decoder(mids)
        pred_motions = pred_motions.detach().cpu().numpy()

        # Denormalize
        data = self.inv_transform(pred_motions)
        joint_data = data[0, :m_length[0]]

        # Recover 3D joints
        joint = recover_from_ric(torch.from_numpy(joint_data).float(), self.nb_joints).numpy()

        return joint, int(m_length[0].item())


def create_gradio_interface(generator, output_dir='./gradio_outputs'):
    os.makedirs(output_dir, exist_ok=True)

    def generate_motion(text_prompt, progress=gr.Progress()):
        try:
            import time
            start_time = time.time()

            print(f"\n{'='*70}")
            print(f"[START] Generating motion for: '{text_prompt}'")
            print(f"Device: {generator.device}")
            print(f"{'='*70}")

            # Use default parameters for simplicity
            motion_length = 0  # Auto-estimate
            time_steps = 18
            cond_scale = 4.0
            temperature = 1.0
            topkr = 0.9
            use_gumbel = True
            seed = 42
            use_ik = True

            # Show warning about CPU
            if str(generator.device) == 'cpu':
                print("\n⚠️ WARNING: Using CPU - this will take 10-30 minutes!")
                print("For faster inference, upgrade Space to GPU hardware.")

            # Generate motion
            progress(0.1, desc="[1/4] Generating motion tokens (10-20 mins on CPU)...")
            print("[1/4] Generating motion tokens...")

            joint, actual_length = generator.generate(
                text_prompt,
                motion_length,
                time_steps,
                cond_scale,
                temperature,
                topkr,
                use_gumbel,
                seed
            )

            elapsed = time.time() - start_time
            print(f"✓ Generated {actual_length} frames in {elapsed:.1f}s")

            # Save BVH and video
            progress(0.6, desc="[2/4] Converting to BVH format...")
            print("[2/4] Converting to BVH format...")

            timestamp = str(np.random.randint(100000))
            video_path = pjoin(output_dir, f'motion_{timestamp}.mp4')

            # Convert to BVH with foot IK
            _, joint_processed = generator.converter.convert(
                joint, filename=None, iterations=100, foot_ik=True
            )

            print("✓ BVH conversion complete")

            # Create video
            progress(0.8, desc="[3/4] Rendering video...")
            print("[3/4] Rendering video...")

            plot_3d_motion(video_path, generator.kinematic_chain, joint_processed,
                           title=text_prompt, fps=20)

            print(f"✓ Video saved: {video_path}")

            progress(1.0, desc="[4/4] Complete!")
            total_time = time.time() - start_time
            print(f"[4/4] Complete! Total time: {total_time:.1f}s")
            print(f"{'='*70}\n")

            return video_path

        except Exception as e:
            import traceback
            error_msg = f"Error: {str(e)}\n\nTraceback:\n{traceback.format_exc()}"
            print("="*70)
            print("ERROR during generation:")
            print("="*70)
            print(error_msg)
            print("="*70)
            return None

    # Create Gradio interface with Blocks for custom layout
    with gr.Blocks(theme=gr.themes.Base(
        primary_hue="blue",
        secondary_hue="gray",
    ).set(
        body_background_fill="*neutral_950",
        body_background_fill_dark="*neutral_950",
        background_fill_primary="*neutral_900",
        background_fill_primary_dark="*neutral_900",
        background_fill_secondary="*neutral_800",
        background_fill_secondary_dark="*neutral_800",
        block_background_fill="*neutral_900",
        block_background_fill_dark="*neutral_900",
        input_background_fill="*neutral_800",
        input_background_fill_dark="*neutral_800",
        button_primary_background_fill="*primary_600",
        button_primary_background_fill_dark="*primary_600",
        button_primary_text_color="white",
        button_primary_text_color_dark="white",
        block_label_text_color="*neutral_200",
        block_label_text_color_dark="*neutral_200",
        body_text_color="*neutral_200",
        body_text_color_dark="*neutral_200",
        input_placeholder_color="*neutral_500",
        input_placeholder_color_dark="*neutral_500",
    ),
    css="""
    footer {display: none !important;}
    .video-fixed-height {
        height: 600px !important;
    }
    .video-fixed-height video {
        max-height: 600px !important;
        object-fit: contain !important;
    }
    """) as demo:

        gr.Markdown("# 🚶 Text-to-Motion Generator")
        gr.Markdown("Generate 3D human motion animations from text descriptions")

        # Show CPU warning if applicable
        device_str = str(generator.device)
        if 'cpu' in device_str:
            gr.Markdown("""
            ### ⚠️ Performance Notice
            This Space is running on **CPU** (free tier). Generation takes **15-30 minutes** per prompt.
            Please be patient! For faster results (~30 seconds), upgrade to GPU in Space Settings.
            """)
        else:
            gr.Markdown(f"### ✅ Running on: {device_str.upper()}")

        with gr.Row():
            with gr.Column():
                text_input = gr.Textbox(
                    label="Describe the motion you want to generate",
                    placeholder="e.g., 'a person walks forward and waves'",
                    lines=3
                )
                submit_btn = gr.Button("Generate Motion", variant="primary")

                gr.Examples(
                    examples=[
                        ["a person walks forward"],
                        ["a person jumps in place"],
                        ["someone performs a dance move"],
                        ["a person sits down on a chair"],
                        ["a person runs and then stops"],
                    ],
                    inputs=text_input,
                    label="Try these examples"
                )

            with gr.Column():
                video_output = gr.Video(label="Generated Motion", elem_classes="video-fixed-height")

        submit_btn.click(
            fn=generate_motion,
            inputs=text_input,
            outputs=video_output
        )

    return demo


if __name__ == '__main__':
    # Configuration
    CHECKPOINTS_DIR = './checkpoints'
    DATASET_NAME = 't2m'  # or 'kit'
    MODEL_NAME = 't2m_nlayer8_nhead6_ld384_ff1024_cdp0.1_rvq6ns'
    RES_NAME = 'rvq_nq6_dc512_nc512_noshare_qdp0.2'
    VQ_NAME = 'rvq_nq6_dc512_nc512_noshare_qdp0.2'

    # Initialize generator
    generator = MotionGenerator(
        checkpoints_dir=CHECKPOINTS_DIR,
        dataset_name=DATASET_NAME,
        model_name=MODEL_NAME,
        res_name=RES_NAME,
        vq_name=VQ_NAME,
        device='cuda'
    )

    # Create and launch Gradio interface
    demo = create_gradio_interface(generator)
    demo.launch(server_name="0.0.0.0", server_port=7860)
```
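One detail of `generate` worth calling out: motion lengths are handled in units of 4-frame tokens (`token_lens = motion_length // 4`, then `m_length = token_lens * 4`), so any explicitly requested length is silently rounded down to a multiple of 4. A minimal standalone sketch of that rounding, with no models involved:

```python
FRAMES_PER_TOKEN = 4  # the VQ decoder emits 4 motion frames per token


def requested_to_actual_frames(motion_length: int) -> int:
    """Mirror the rounding in MotionGenerator.generate:
    token_lens = motion_length // 4, then m_length = token_lens * 4."""
    token_lens = motion_length // FRAMES_PER_TOKEN
    return token_lens * FRAMES_PER_TOKEN


print(requested_to_actual_frames(196))  # 196 (already a multiple of 4)
print(requested_to_actual_frames(99))   # 96 (rounded down)
```

So a user asking for 99 frames actually gets 96; if an exact length matters, pick a multiple of 4.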
debug_hf_space.py ADDED (+182 lines)

```python
"""
Debug Hugging Face Space - Check logs and diagnose issues
"""
import os
import sys
from huggingface_hub import HfApi
import time

# Configuration
YOUR_USERNAME = "nocapdev"
SPACE_NAME = "my-gradio-momask"
TOKEN = os.getenv("HUGGINGFACE_TOKEN")

def main():
    print("=" * 80)
    print(" " * 25 + "HF Space Debugger")
    print("=" * 80)

    if not TOKEN:
        print("\n❌ ERROR: HUGGINGFACE_TOKEN not set")
        print("Set it with: $env:HUGGINGFACE_TOKEN = 'hf_your_token'")
        print("\nAlternatively, check logs manually at:")
        print(f"https://huggingface.co/spaces/{YOUR_USERNAME}/{SPACE_NAME}/logs")
        return

    api = HfApi(token=TOKEN)
    repo_id = f"{YOUR_USERNAME}/{SPACE_NAME}"

    print(f"\n🔍 Space: {repo_id}")
    print(f"🌐 URL: https://huggingface.co/spaces/{repo_id}")
    print(f"📋 Logs: https://huggingface.co/spaces/{repo_id}/logs")

    try:
        # Get space runtime info
        print("\n" + "─" * 80)
        print("🔧 RUNTIME INFORMATION")
        print("─" * 80)

        runtime = api.get_space_runtime(repo_id=repo_id)

        print(f"Status: {runtime.stage}")
        print(f"Hardware: {runtime.hardware or 'CPU basic (free)'}")

        # Try to get SDK info if available
        try:
            print(f"SDK: {runtime.sdk}")
        except AttributeError:
            print(f"SDK: gradio (inferred)")

        try:
            print(f"SDK Version: {runtime.sdk_version or 'N/A'}")
        except AttributeError:
            print(f"SDK Version: N/A")

        # Analyze status
        if runtime.stage == "RUNNING":
            print("\n✅ Space is RUNNING")
        elif runtime.stage == "BUILDING":
            print("\n⏳ Space is BUILDING... (wait a few minutes)")
        elif runtime.stage == "STOPPED":
            print("\n⚠️ Space is STOPPED (may have crashed)")
        elif runtime.stage == "SLEEPING":
            print("\n😴 Space is SLEEPING (will wake on visit)")
        else:
            print(f"\n⚠️ Unexpected stage: {runtime.stage}")

        # Hardware analysis
        print("\n" + "─" * 80)
        print("💻 HARDWARE ANALYSIS")
        print("─" * 80)

        hardware = str(runtime.hardware or 'cpu-basic').lower()

        if 'cpu' in hardware or runtime.hardware is None:
            print("⚠️ Using CPU (FREE tier)")
            print("   • Generation time: 10-30 minutes per prompt")
            print("   • This is NORMAL for free tier")
            print("   • Recommendation: Upgrade to GPU or be patient")
        elif 't4' in hardware:
            print("✅ Using T4 GPU")
            print("   • Generation time: 20-60 seconds per prompt")
            print("   • Good performance")
        elif 'a10' in hardware or 'a100' in hardware:
            print("✅ Using High-end GPU")
            print("   • Generation time: 10-30 seconds per prompt")
            print("   • Excellent performance")

        # Get space info
        print("\n" + "─" * 80)
        print("📦 SPACE FILES")
        print("─" * 80)

        try:
            files = api.list_repo_files(repo_id=repo_id, repo_type="space")

            # Check critical files
            critical_files = ['app.py', 'requirements.txt', 'README.md']
            for file in critical_files:
                if file in files:
                    print(f"✅ {file}")
                else:
                    print(f"❌ {file} - MISSING!")

            # Check for checkpoints
            checkpoint_files = [f for f in files if 'checkpoint' in f.lower() or f.endswith('.tar') or f.endswith('.pth')]

            if checkpoint_files:
                print(f"\n✅ Found {len(checkpoint_files)} checkpoint files")
                print("   Sample files:")
                for f in checkpoint_files[:5]:
                    print(f"   • {f}")
                if len(checkpoint_files) > 5:
                    print(f"   ... and {len(checkpoint_files) - 5} more")
            else:
                print("\n⚠️ NO checkpoint files found!")
                print("   • Models may not be uploaded")
                print("   • App will fail to initialize")
                print("   • Action: Upload checkpoints/ directory")

        except Exception as e:
            print(f"⚠️ Could not list files: {e}")

        # Provide debugging steps
        print("\n" + "=" * 80)
        print("🐛 DEBUGGING STEPS")
        print("=" * 80)

        print("\n1. CHECK LOGS MANUALLY:")
        print(f"   Visit: https://huggingface.co/spaces/{repo_id}/logs")
        print("   Look for:")
        print("   • 'Using device: cpu' or 'Using device: cuda'")
        print("   • Any ERROR messages")
        print("   • 'Model checkpoints not found'")
        print("   • Traceback or exception messages")

        print("\n2. COMMON ERROR PATTERNS:")
        print("   • 'FileNotFoundError' → Models not uploaded")
        print("   • 'CUDA out of memory' → Need more GPU RAM")
        print("   • 'Killed' or 'SIGKILL' → Out of RAM")
        print("   • Hangs at '[1/4] Generating...' → CPU is slow (wait 20 mins)")

        print("\n3. QUICK TESTS:")
        print("   • Visit the Space URL")
        print("   • Try prompt: 'a person walks forward'")
        print("   • Monitor Logs tab while it runs")

        if 'cpu' in hardware or runtime.hardware is None:
            print("\n⚠️ CPU PERFORMANCE WARNING:")
            print("   Your Space is using CPU. Expected behavior:")
            print("   • First load: 2-5 minutes (loading models)")
            print("   • Each generation: 10-30 minutes")
            print("   • This is NORMAL for CPU!")
            print("   • Solutions:")
            print("     - Wait patiently (free)")
            print("     - Upgrade to T4 GPU (~$0.60/hour)")

        print("\n4. IMMEDIATE ACTION:")
        print("   Copy the ERROR message from Logs tab and share it")

        print("\n" + "=" * 80)

    except Exception as e:
        print("\n" + "=" * 80)
        print("❌ ERROR CHECKING SPACE")
        print("=" * 80)
        print(f"Error: {e}")
        print("\nManual debugging required:")
        print(f"1. Visit: https://huggingface.co/spaces/{repo_id}")
        print(f"2. Click 'Logs' tab")
        print(f"3. Copy the last 50 lines")
        print(f"4. Share the error messages")
        print("\n" + "=" * 80)

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\n\nCancelled by user")
    except Exception as e:
        print(f"\n\nUnexpected error: {e}")
        import traceback
        traceback.print_exc()
```
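The debug script branches on `runtime.stage` in several places. As a hedged sketch (a hypothetical refactor, not part of the uploaded script), that mapping can be factored into a pure helper so the stage-to-advice logic is testable without touching the Hub API:

```python
def advice_for_stage(stage: str) -> str:
    """Map an HF Space runtime stage to the advice the debugger prints.
    Hypothetical helper: the uploaded debug_hf_space.py inlines this logic."""
    advice = {
        "RUNNING": "Space is RUNNING",
        "BUILDING": "Space is BUILDING... (wait a few minutes)",
        "STOPPED": "Space is STOPPED (may have crashed)",
        "SLEEPING": "Space is SLEEPING (will wake on visit)",
    }
    return advice.get(stage, f"Unexpected stage: {stage}")


print(advice_for_stage("BUILDING"))  # Space is BUILDING... (wait a few minutes)
```

Keeping the network call and the interpretation separate makes it easy to unit-test the advice table.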
get_logs.py
ADDED
|
@@ -0,0 +1,85 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```python
"""
Simple script to fetch and display HF Space logs
"""
import os
from huggingface_hub import HfApi

YOUR_USERNAME = "nocapdev"
SPACE_NAME = "my-gradio-momask"
TOKEN = os.getenv("HUGGINGFACE_TOKEN")

print("=" * 80)
print(" " * 25 + "HF Space Log Viewer")
print("=" * 80)

if not TOKEN:
    print("\n⚠️ HUGGINGFACE_TOKEN not set")
    print("Manual access:")
    print(f"Visit: https://huggingface.co/spaces/{YOUR_USERNAME}/{SPACE_NAME}/logs")
else:
    api = HfApi(token=TOKEN)
    repo_id = f"{YOUR_USERNAME}/{SPACE_NAME}"

    print(f"\n📍 Space: {repo_id}")
    print(f"📋 Logs: https://huggingface.co/spaces/{repo_id}/logs")

    try:
        print("\n" + "─" * 80)
        print("🔧 SPACE STATUS")
        print("─" * 80)

        runtime = api.get_space_runtime(repo_id=repo_id)

        print(f"\nStatus: {runtime.stage}")

        hardware = str(runtime.hardware) if runtime.hardware else "CPU basic (free tier)"
        print(f"Hardware: {hardware}")

        if runtime.stage == "RUNNING":
            print("✅ Space is RUNNING")
        elif runtime.stage == "BUILDING":
            print("⏳ Space is BUILDING")
        elif runtime.stage == "STOPPED":
            print("⚠️ Space STOPPED (may have crashed)")

        # Hardware warning
        if not runtime.hardware or 'cpu' in str(runtime.hardware).lower():
            print("\n⚠️ PERFORMANCE WARNING:")
            print("   Using CPU (free tier)")
            print("   Expected generation time: 15-30 minutes per prompt")
            print("   This is NORMAL for free tier!")

        print("\n" + "─" * 80)
        print("📋 WHAT TO CHECK IN LOGS")
        print("─" * 80)
        print("\n1. Visit the logs manually:")
        print(f"   https://huggingface.co/spaces/{repo_id}/logs")
        print("\n2. Look for these key lines:")
        print("   • 'Using device: cpu' or 'Using device: cuda'")
        print("   • 'Loading models...'")
        print("   • 'Models loaded successfully!'")
        print("   • Any lines with 'ERROR' or 'Exception'")
        print("   • 'Model checkpoints not found'")
        print("   • '[1/4] Generating motion tokens...'")
        print("\n3. Common issues to look for:")
        print("   • FileNotFoundError → Models not uploaded")
        print("   • Killed/SIGKILL → Out of memory")
        print("   • Stuck at [1/4] → CPU is slow (wait 20 mins)")

        print("\n" + "─" * 80)
        print("💡 NEXT STEPS")
        print("─" * 80)
        print("\n1. Click the link above to view logs")
        print("2. Copy the LAST 50 LINES from the logs")
        print("3. Share them here so I can diagnose the issue")
        print("\nSpecifically look for:")
        print("   • Any ERROR messages")
        print("   • Any Exception or Traceback")
        print("   • What happened after you submitted your prompt")

    except Exception as e:
        print(f"\n❌ Error: {e}")
        print("\nManual check required:")
        print(f"Visit: https://huggingface.co/spaces/{repo_id}/logs")

print("\n" + "=" * 80)
```
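The no-token fallback in get_logs.py only prints a manually composed URL; the composition is plain f-string formatting, sketched here as a standalone helper (the `logs_url` name is mine, not part of the script):

```python
def logs_url(username: str, space_name: str) -> str:
    """Build the public logs URL for a Hugging Face Space."""
    return f"https://huggingface.co/spaces/{username}/{space_name}/logs"

# Same URL the script tells you to open when HUGGINGFACE_TOKEN is unset
print(logs_url("nocapdev", "my-gradio-momask"))
```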
simple_check.py
ADDED

@@ -0,0 +1,119 @@

```python
"""
Simple HF Space checker without Unicode issues
"""
import os
import sys

# Fix Windows encoding
if sys.platform == 'win32':
    import io
    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')

from huggingface_hub import HfApi

YOUR_USERNAME = "nocapdev"
SPACE_NAME = "my-gradio-momask"
TOKEN = os.getenv("HUGGINGFACE_TOKEN")

print("=" * 80)
print("HF SPACE STATUS CHECK")
print("=" * 80)

repo_id = f"{YOUR_USERNAME}/{SPACE_NAME}"

if not TOKEN:
    print("\nWARNING: HUGGINGFACE_TOKEN not set")
    print(f"\nTo check your Space manually:")
    print(f"1. Visit: https://huggingface.co/spaces/{repo_id}")
    print(f"2. Click 'Logs' tab")
    print(f"3. Copy the last 50 lines")
    print(f"4. Look for ERROR messages")
    sys.exit(0)

try:
    api = HfApi(token=TOKEN)

    print(f"\nSpace: {repo_id}")
    print(f"URL: https://huggingface.co/spaces/{repo_id}")
    print(f"Logs: https://huggingface.co/spaces/{repo_id}/logs")

    print("\n" + "-" * 80)
    print("RUNTIME INFO")
    print("-" * 80)

    runtime = api.get_space_runtime(repo_id=repo_id)

    print(f"\nStatus: {runtime.stage}")

    hardware = str(runtime.hardware) if runtime.hardware else "cpu-basic"
    print(f"Hardware: {hardware}")

    # Status analysis
    print("\n" + "-" * 80)
    print("ANALYSIS")
    print("-" * 80)

    if runtime.stage == "RUNNING":
        print("\n[OK] Space is RUNNING")
    elif runtime.stage == "BUILDING":
        print("\n[WAIT] Space is BUILDING (wait 2-3 minutes)")
    elif runtime.stage == "STOPPED":
        print("\n[ERROR] Space STOPPED - may have crashed")
        print("Check logs for errors!")
    else:
        print(f"\n[WARNING] Unexpected status: {runtime.stage}")

    # Hardware check
    if 'cpu' in hardware.lower() or hardware == "cpu-basic":
        print("\n[SLOW] Using CPU (free tier)")
        print("  - Generation time: 15-30 minutes per prompt")
        print("  - This is NORMAL for free tier")
        print("  - Solution: Wait OR upgrade to GPU")
    else:
        print(f"\n[FAST] Using GPU: {hardware}")
        print("  - Generation time: 30-60 seconds per prompt")

    # Check files
    print("\n" + "-" * 80)
    print("FILES CHECK")
    print("-" * 80)

    files = api.list_repo_files(repo_id=repo_id, repo_type="space")

    # Critical files
    for f in ['app.py', 'requirements.txt', 'README.md']:
        if f in files:
            print(f"[OK] {f}")
        else:
            print(f"[MISSING] {f}")

    # Checkpoints
    checkpoint_files = [f for f in files if 'checkpoint' in f.lower() or f.endswith('.tar')]

    if checkpoint_files:
        print(f"\n[OK] Found {len(checkpoint_files)} checkpoint files")
    else:
        print("\n[WARNING] No checkpoint files found!")
        print("  Models may not be uploaded")
        print("  App will fail to load")

    print("\n" + "=" * 80)
    print("NEXT STEPS")
    print("=" * 80)
    print(f"\n1. View logs at: https://huggingface.co/spaces/{repo_id}/logs")
    print("\n2. Look for:")
    print("   - 'Using device: cpu' or 'cuda'")
    print("   - 'Loading models...'")
    print("   - Any ERROR messages")
    print("   - 'Model checkpoints not found'")
    print("\n3. Copy the last 50 lines from logs")
    print("   Especially any lines with ERROR or Exception")
    print("\n4. Share those lines to get exact solution")

    print("\n" + "=" * 80)

except Exception as e:
    print(f"\nERROR: {e}")
    print(f"\nManual check:")
    print(f"Visit: https://huggingface.co/spaces/{repo_id}/logs")
    print("Copy the last 50 lines and share them")
```
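The checkpoint detection in simple_check.py is a plain substring/suffix filter over the repo file list. Factored out as a pure function (the `find_checkpoint_files` name is mine, not part of the script), the same logic looks like this:

```python
def find_checkpoint_files(files):
    """Return repo paths that look like model checkpoints:
    any path containing 'checkpoint' (case-insensitive) or ending in '.tar'."""
    return [f for f in files if 'checkpoint' in f.lower() or f.endswith('.tar')]

sample = ['app.py', 'checkpoints/t2m/model.tar', 'README.md', 'Checkpoint_meta.json']
print(find_checkpoint_files(sample))
# → ['checkpoints/t2m/model.tar', 'Checkpoint_meta.json']
```

Note the heuristic is loose: any file with "checkpoint" in its path counts, which is why an empty result strongly suggests the model weights were never uploaded.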
test_local.py
ADDED

@@ -0,0 +1,127 @@

```python
"""
Test your setup locally before deploying to HF
This helps identify issues without waiting for HF Space builds
"""
import os
import sys
import torch

print("=" * 80)
print(" " * 25 + "Local Setup Test")
print("=" * 80)

# Test 1: Python version
print("\n[1/7] Python Version")
print(f"✓ Python {sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}")

# Test 2: PyTorch
print("\n[2/7] PyTorch")
try:
    print(f"✓ PyTorch version: {torch.__version__}")
    print(f"✓ CUDA available: {torch.cuda.is_available()}")
    if torch.cuda.is_available():
        print(f"✓ CUDA version: {torch.version.cuda}")
        print(f"✓ GPU: {torch.cuda.get_device_name(0)}")
    else:
        print("⚠️ No GPU detected (will use CPU)")
except Exception as e:
    print(f"✗ Error: {e}")

# Test 3: Critical imports
print("\n[3/7] Critical Dependencies")
deps = {
    'gradio': 'Gradio',
    'numpy': 'NumPy',
    'scipy': 'SciPy',
    'matplotlib': 'Matplotlib',
    'trimesh': 'Trimesh',
    'einops': 'Einops',
    'clip': 'OpenAI CLIP'
}

for module, name in deps.items():
    try:
        __import__(module)
        print(f"✓ {name}")
    except ImportError as e:
        print(f"✗ {name} - NOT INSTALLED")

# Test 4: Model checkpoints
print("\n[4/7] Model Checkpoints")
checkpoints_dir = './checkpoints'
dataset_name = 't2m'

if os.path.exists(checkpoints_dir):
    print(f"✓ Checkpoints directory exists: {checkpoints_dir}")

    # Check for specific model directories
    models_to_check = [
        f'{checkpoints_dir}/{dataset_name}/t2m_nlayer8_nhead6_ld384_ff1024_cdp0.1_rvq6ns',
        f'{checkpoints_dir}/{dataset_name}/rvq_nq6_dc512_nc512_noshare_qdp0.2',
        f'{checkpoints_dir}/{dataset_name}/length_estimator',
    ]

    for model_path in models_to_check:
        if os.path.exists(model_path):
            # Count files
            files = []
            for root, dirs, filenames in os.walk(model_path):
                files.extend(filenames)
            print(f"✓ {os.path.basename(model_path)} ({len(files)} files)")
        else:
            print(f"✗ {os.path.basename(model_path)} - NOT FOUND")
else:
    print(f"✗ Checkpoints directory NOT FOUND: {checkpoints_dir}")
    print("  Models must be present for the app to work!")

# Test 5: Try loading app.py
print("\n[5/7] App.py Syntax")
try:
    with open('app.py', 'r', encoding='utf-8') as f:
        compile(f.read(), 'app.py', 'exec')
    print("✓ app.py syntax is valid")
except FileNotFoundError:
    print("✗ app.py not found")
except SyntaxError as e:
    print(f"✗ Syntax error: {e}")

# Test 6: Required files
print("\n[6/7] Required Files")
required = ['app.py', 'requirements.txt', 'README.md']
for file in required:
    if os.path.exists(file):
        size = os.path.getsize(file)
        print(f"✓ {file} ({size} bytes)")
    else:
        print(f"✗ {file} - NOT FOUND")

# Test 7: Disk space for outputs
print("\n[7/7] Output Directory")
output_dir = './gradio_outputs'
try:
    os.makedirs(output_dir, exist_ok=True)
    print(f"✓ Output directory ready: {output_dir}")
except Exception as e:
    print(f"✗ Error creating output directory: {e}")

# Summary
print("\n" + "=" * 80)
print("SUMMARY")
print("=" * 80)

# Check if ready
if os.path.exists(checkpoints_dir) and os.path.exists('app.py'):
    print("\n✅ Basic setup looks good!")
    print("\nNext steps:")
    print("1. Test locally: python app.py")
    print("2. Visit http://localhost:7860 in browser")
    print("3. Try a prompt and check for errors")
    print("4. If it works locally, redeploy to HF")
else:
    print("\n⚠️ Setup incomplete!")
    if not os.path.exists(checkpoints_dir):
        print("\n✗ Missing: Model checkpoints")
        print("  • Download models to ./checkpoints/")
        print("  • Or configure model download in app.py")

print("\n" + "=" * 80)
```
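Test [3/7] above checks each dependency with a bare `__import__` in a try/except. The same idea can be factored into a reusable helper that returns the missing modules instead of printing; a minimal sketch (the `missing_modules` name and the fake module name are mine):

```python
import importlib

def missing_modules(modules):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for mod in modules:
        try:
            importlib.import_module(mod)
        except ImportError:
            missing.append(mod)
    return missing

# stdlib modules import fine; the made-up name is reported as missing
print(missing_modules(['os', 'sys', 'definitely_not_installed_xyz']))
# → ['definitely_not_installed_xyz']
```

Returning a list rather than printing makes the check easy to reuse in app.py startup, e.g. to fail fast with one clear error message listing everything that needs `pip install`.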