DeepGuard AI vs Real Deepfake Model

Model Overview

This is a fine-tuned version of google/siglip2-base-patch16-224, specifically trained for binary image classification to detect AI-generated and deepfake images. It is the core inference engine powering the DeepGuard AI Media Forensics App.

The model distinguishes Real photographs from Fake (AI-generated or deepfake) images. It leverages the powerful SigLIP2 vision-language encoder, fine-tuned on a balanced subset drawn from a diverse, multi-source pool of over 330,000 images, and demonstrates robust performance in identifying synthetic media, including outputs from modern generators such as Midjourney, Stable Diffusion, and DALL·E.

| Metric | Value |
|---|---|
| Architecture | SigLIP2 (Vision Transformer) |
| Base Model | google/siglip2-base-patch16-224 |
| Input Resolution | 224x224 pixels |
| Number of Classes | 2 (Real, Fake) |
| Model Size | ~372 MB |
| License | Apache 2.0 |

Datasets

The model was trained on a carefully curated, balanced dataset of 40,000 images (20,000 real, 20,000 fake), sampled from five diverse, high-quality sources to ensure robustness and generalization across various forgery types.

| Dataset Name | Source | Description |
|---|---|---|
| Deepfake and Real Images | manjilkarki/deepfake-and-real-images | A foundational dataset of 190k human faces, split evenly between real and manipulated images created by various deepfake techniques. Images are 256x256 pixels. |
| HardFake vs Real Faces | hamzaboulahia/hardfakevsrealfaces | A challenging test-oriented dataset of 1,288 high-quality images (700 fake, 589 real) designed to push the limits of detection models. Fake faces are generated using StyleGAN2, and real faces feature diverse attributes. |
| GRAVEX-200K | muhammadbilal6305/200k-real-vs-ai-visuals-by-mbilal | A comprehensive multi-source dataset of 200,000 face images, curated from six major sources including FaceForensics++, DFDC, Celeb-DF, and Stable Diffusion outputs (SD 1.5, 2.1, XL). |
| DeepDetect-2025 | ayushmandatta1/deepdetect-2025 | A large-scale dataset of over 112,000 images spanning diverse categories (people, animals, nature, urban, artworks), generated by cutting-edge models like DALL·E 3, Midjourney, and Stable Diffusion 3. |
| Super GenAI (SUT-Project) | hiddenplant/sut-project | A dataset featuring high-fidelity images from the latest generative models, including Midjourney V6, Flux, and NanoBanana (SDXL), covering landscapes, portraits, and urban scenes. |
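The balanced sampling described above can be sketched as follows. This is an illustrative reconstruction, not the actual curation script: the `pool` mapping, file names, and per-class count are stand-ins (the real dataset draws 20,000 paths per class from the five sources).

```python
import random

# Stand-in candidate pools; in practice these would be file paths
# collected from the five source datasets listed above.
pool = {
    "real": [f"real_{i}.jpg" for i in range(100)],
    "fake": [f"fake_{i}.jpg" for i in range(100)],
}
PER_CLASS = 50  # 20_000 in the actual dataset

rng = random.Random(42)  # fixed seed for reproducibility
sample = {cls: rng.sample(paths, PER_CLASS) for cls, paths in pool.items()}

# Label convention from the model card: 0 = Real, 1 = Fake.
dataset = [(p, 0) for p in sample["real"]] + [(p, 1) for p in sample["fake"]]
rng.shuffle(dataset)
```

Sampling without replacement per class keeps the two classes exactly balanced, which avoids the classifier learning a trivial prior toward the majority class.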

Training Procedure

The model was fine-tuned using a progressive unfreezing strategy to adapt the pre-trained SigLIP2 encoder while preventing catastrophic forgetting. All training was performed on a Tesla T4 GPU in Google Colab.

Training Hyperparameters

| Stage | Epochs | Learning Rate | Trainable Parameters | Description |
|---|---|---|---|---|
| Stage 1 | 2 | 1e-3 | Classifier head only | Warm-up phase to adapt the new binary classification head. |
| Stage 2 | 3 | 5e-5 | Classifier + Top 6 Transformer Blocks | Gradual unfreezing to allow the model to learn task-specific features. |
| Stage 3 | 2 | 1e-5 | All layers | Full model fine-tuning with a very low learning rate for final convergence. |
  • Batch Size: 32
  • Optimizer: AdamW
  • Scheduler: Cosine Annealing
  • Loss Function: Cross-Entropy Loss
  • Data Augmentation: Random Horizontal Flip, Random Rotation (10°), Color Jitter
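The three-stage unfreezing schedule can be illustrated with a toy stand-in model. This is a minimal sketch of the pattern, not the actual training code: the 12-block `nn.Sequential` encoder and `set_stage` helper are invented here to show how `requires_grad` is toggled per stage.

```python
import torch.nn as nn

# Toy stand-ins for the SigLIP2 encoder (12 transformer blocks in the
# base model) and the new binary classification head.
encoder = nn.Sequential(*[nn.Linear(8, 8) for _ in range(12)])
classifier = nn.Linear(8, 2)

def set_stage(stage: int) -> None:
    """Freeze/unfreeze parameters according to the fine-tuning stage."""
    for p in encoder.parameters():
        p.requires_grad = False
    for p in classifier.parameters():
        p.requires_grad = True          # head is trainable in every stage
    if stage >= 2:                       # Stage 2: also the top 6 blocks
        for block in list(encoder)[-6:]:
            for p in block.parameters():
                p.requires_grad = True
    if stage == 3:                       # Stage 3: the full model
        for p in encoder.parameters():
            p.requires_grad = True

set_stage(2)
n_trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
```

In each stage, only the parameters with `requires_grad=True` are handed to the optimizer, so the frozen layers retain their pre-trained weights and catastrophic forgetting is limited.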

Performance Metrics

Results on a held-out validation set:

| Metric | Score |
|---|---|
| Accuracy | 78.5% |
| AUC | > 0.86 |
| F1 Score | ~0.78 |
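For reference, these metrics can be computed from model outputs with scikit-learn. The arrays below are made-up examples, not the model's actual predictions; `y_prob` is the predicted probability of the Fake class (label 1).

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Made-up labels and Fake-class probabilities for illustration only.
y_true = np.array([0, 0, 1, 1, 1, 0])
y_prob = np.array([0.2, 0.6, 0.8, 0.9, 0.4, 0.1])
y_pred = (y_prob > 0.5).astype(int)  # threshold at 0.5

print(f"Accuracy: {accuracy_score(y_true, y_pred):.3f}")
print(f"AUC:      {roc_auc_score(y_true, y_prob):.3f}")
print(f"F1:       {f1_score(y_true, y_pred):.3f}")
```

Note that AUC is threshold-free (it scores the raw probabilities), while accuracy and F1 depend on the 0.5 decision threshold used here.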

Usage

You can load and use this model directly with the Hugging Face transformers library.

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "king1oo1/ai-vs-real-deepfake-model"  # Replace with your actual model ID
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)
model.eval()

# Load and preprocess an image
image = Image.open("path/to/your/image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Run inference
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    fake_prob = probs[0][1].item() * 100  # class index 1 = Fake
    real_prob = probs[0][0].item() * 100  # class index 0 = Real

print(f"Fake probability: {fake_prob:.2f}%")
print(f"Real probability: {real_prob:.2f}%")
print(f"Verdict: {'FAKE' if fake_prob > 50 else 'REAL'}")
```
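Rather than hard-coding class indices, the label mapping can be read from `model.config.id2label`. The helper below is a hypothetical sketch of that pattern; it is shown with made-up logits so it runs standalone, but in practice you would pass `outputs.logits` and `model.config.id2label` from the snippet above.

```python
import torch

def classify(logits: torch.Tensor, id2label: dict) -> dict:
    """Map raw logits to per-label percentage scores via id2label."""
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    return {id2label[i]: round(p.item() * 100, 2) for i, p in enumerate(probs)}

# Made-up logits and label map for demonstration.
scores = classify(torch.tensor([[1.0, 3.0]]), {0: "Real", 1: "Fake"})
print(scores)
```

Using `id2label` keeps the code correct even if a checkpoint stores the classes in a different order.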