Powered by Facebook's TimeSformer model, this application can identify and classify human actions in video clips with state-of-the-art accuracy.
This GitHub Pages site shows the project information. To actually upload videos and test the AI model, you need to run the application locally or deploy it to a cloud platform.
Choose one of the deployment options below to start using the video action recognition feature!
Download and run the application on your computer. This gives you full control and doesn't require any cloud credits.
Setup Guide
Run the app in Google Colab with GPU acceleration. Perfect for quick testing without local installation.
Open Colab
Try the live demo hosted on Hugging Face Spaces. Upload your video directly in the browser.
Live Demo
Uses Facebook's TimeSformer model, fine-tuned on the Kinetics-400 dataset (400 action classes), for accurate predictions.
Efficiently processes videos using GPU acceleration when available, with fallback to CPU for universal compatibility.
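The GPU-with-CPU-fallback behavior can be sketched roughly as follows. This is a minimal illustration that assumes the app runs on PyTorch; `pick_device` is a hypothetical helper name, not taken from the project source.

```python
def pick_device() -> str:
    """Return "cuda" when a usable GPU is present, otherwise "cpu".

    Assumption: the app uses PyTorch. If PyTorch is not installed,
    this sketch simply reports "cpu" instead of raising.
    """
    try:
        import torch  # imported lazily so the sketch runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Running inference on: {pick_device()}")
```

The returned string can be passed to `model.to(...)` in PyTorch code, so the same script runs unchanged on machines with or without a GPU.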
Simple drag-and-drop interface supporting multiple video formats (MP4, MOV, AVI, MKV) up to 200MB.
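As a rough sketch of how an upload might be checked against these limits: the helper below is hypothetical (`validate_upload` and its constants are not from the project source) and assumes "200MB" means 200 × 1024 × 1024 bytes.

```python
from pathlib import Path
from typing import Tuple

# Formats listed on this page; the extension check is case-insensitive.
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv"}
# Assumption: the 200MB limit is interpreted as 200 MiB.
MAX_BYTES = 200 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> Tuple[bool, str]:
    """Return (ok, message) for a candidate upload."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"unsupported format: {ext or 'no extension'}"
    if size_bytes > MAX_BYTES:
        return False, "file exceeds the 200MB limit"
    return True, "ok"


if __name__ == "__main__":
    print(validate_upload("dance.MP4", 12_000_000))
    print(validate_upload("notes.txt", 1_000))
```

Checking the extension and size before decoding any frames lets the app reject unsuitable files without spending time on inference.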
Get top-k predictions with confidence scores and visual feedback for better understanding of model decisions.
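To illustrate what top-k predictions with confidence scores mean, here is a small sketch that turns raw model scores (logits) into probabilities with a softmax and keeps the k most likely labels. The function names and the example labels are illustrative assumptions, not the project's actual code.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: Sequence[float], labels: Sequence[str],
          k: int = 5) -> List[Tuple[str, float]]:
    """Return the k (label, confidence) pairs with the highest confidence."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical scores for three action labels.
    for label, conf in top_k([2.0, 1.0, 0.1], ["dancing", "running", "cooking"], k=2):
        print(f"{label}: {conf:.1%}")
```

Showing several ranked labels rather than a single answer makes it easier to see when the model is torn between similar actions.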
Recognizes sports, daily activities, musical performances, exercise, work activities, and social interactions.
Complete source code available on GitHub with detailed documentation and setup instructions.
./run_fix.sh first
http://localhost:8501 in your browser