# DeepGuard AI vs Real Deepfake Model

## Model Overview
This is a fine-tuned version of google/siglip2-base-patch16-224, specifically trained for binary image classification to detect AI-generated and deepfake images. It is the core inference engine powering the DeepGuard AI Media Forensics App.
The model distinguishes between Real photographs and Fake (AI-generated or deepfake) images. By leveraging the powerful SigLIP2 vision-language encoder and fine-tuning it on a balanced 40,000-image training set sampled from a multi-source pool of over 330,000 images (see Datasets below), the model demonstrates robust performance in identifying synthetic media, including outputs from modern generators such as Midjourney, Stable Diffusion, and DALL·E.
| Property | Value |
|---|---|
| Architecture | SigLIP2 (Vision Transformer) |
| Base Model | google/siglip2-base-patch16-224 |
| Input Resolution | 224x224 pixels |
| Number of Classes | 2 (Real, Fake) |
| Model Size | ~372 MB |
| License | Apache 2.0 |
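As a quick sanity check, the class count and label mapping above can be read back from the checkpoint's configuration. A minimal sketch, assuming the model ID used in the Usage section below:

```python
from transformers import AutoConfig

# Assumes the repo ID shown in the Usage section below.
config = AutoConfig.from_pretrained("king1oo1/ai-vs-real-deepfake-model")

print(config.num_labels)  # expected: 2
print(config.id2label)    # label names as stored in the checkpoint
```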
## Datasets
The model was trained on a carefully curated, balanced dataset of 40,000 images (20,000 real, 20,000 fake), sampled from five diverse, high-quality sources to ensure robustness and generalization across various forgery types.
| Dataset Name | Source | Description |
|---|---|---|
| Deepfake and Real Images | manjilkarki/deepfake-and-real-images | A foundational dataset of 190k human faces, split evenly between real images and images manipulated with various deepfake techniques. Images are 256x256 pixels. |
| HardFake vs Real Faces | hamzaboulahia/hardfakevsrealfaces | A challenging, test-oriented dataset of 1,289 high-quality images (700 fake, 589 real) designed to push the limits of detection models. Fake faces are generated with StyleGAN2; real faces feature diverse attributes. |
| GRAVEX-200K | muhammadbilal6305/200k-real-vs-ai-visuals-by-mbilal | A comprehensive multi-source dataset of 200,000 face images, curated from six major sources including FaceForensics++, DFDC, Celeb-DF, and Stable Diffusion outputs (SD 1.5, 2.1, XL). |
| DeepDetect-2025 | ayushmandatta1/deepdetect-2025 | A large-scale dataset of over 112,000 images spanning diverse categories (people, animals, nature, urban scenes, artworks), generated by cutting-edge models such as DALL·E 3, Midjourney, and Stable Diffusion 3. |
| Super GenAI (SUT-Project) | hiddenplant/sut-project | A dataset of high-fidelity images from the latest generative models, including Midjourney V6, Flux, and NanoBanana (SDXL), covering landscapes, portraits, and urban scenes. |
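The exact sampling code is not part of this card, but a minimal sketch of how a balanced 20,000/20,000 training set could be drawn from these collections is shown below; the directory layout and the `sample_split` helper are hypothetical.

```python
import random
from pathlib import Path

random.seed(42)  # reproducible sampling

def sample_split(source_dirs, label, per_label_total):
    """Draw a roughly equal number of images from each source directory.

    The directories are placeholders; adjust them to wherever the five
    datasets are stored locally.
    """
    per_source = per_label_total // len(source_dirs)
    sampled = []
    for d in source_dirs:
        files = sorted(Path(d).glob("**/*.jpg"))
        sampled += random.sample(files, min(per_source, len(files)))
    return [(f, label) for f in sampled]

# Hypothetical local paths, one real/ and one fake/ folder per source dataset.
real_sources = ["data/deepfake_and_real/real", "data/gravex/real", "data/deepdetect/real"]
fake_sources = ["data/deepfake_and_real/fake", "data/gravex/fake", "data/deepdetect/fake"]

# Label convention matches the Usage section: 0 = Real, 1 = Fake.
dataset = sample_split(real_sources, 0, 20_000) + sample_split(fake_sources, 1, 20_000)
random.shuffle(dataset)
print(len(dataset))  # ~40,000 (path, label) pairs
```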
## Training Procedure
The model was fine-tuned using a progressive unfreezing strategy to adapt the pre-trained SigLIP2 encoder while preventing catastrophic forgetting. All training was performed on a Tesla T4 GPU in Google Colab.
### Training Hyperparameters
| Stage | Epochs | Learning Rate | Trainable Parameters | Description |
|---|---|---|---|---|
| Stage 1 | 2 | 1e-3 | Classifier head only | Warm-up phase to adapt the new binary classification head. |
| Stage 2 | 3 | 5e-5 | Classifier + Top 6 Transformer Blocks | Gradual unfreezing to allow the model to learn task-specific features. |
| Stage 3 | 2 | 1e-5 | All layers | Full model fine-tuning with a very low learning rate for final convergence. |
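In PyTorch terms, each stage toggles `requires_grad` on a progressively larger part of the network before the optimizer is rebuilt with that stage's learning rate. A minimal sketch follows; the `vision_model.encoder.layers` and `classifier` attribute paths follow the SigLIP classes in `transformers` and should be verified against the loaded model.

```python
from transformers import AutoModelForImageClassification

# Start from the pre-trained backbone with a fresh 2-class head.
model = AutoModelForImageClassification.from_pretrained(
    "google/siglip2-base-patch16-224", num_labels=2
)

def set_trainable(modules, flag):
    # Toggle gradient tracking for every parameter in the given modules.
    for m in modules:
        for p in m.parameters():
            p.requires_grad = flag

layers = model.vision_model.encoder.layers  # assumed attribute path

# Stage 1 (2 epochs, lr 1e-3): train only the new classifier head.
set_trainable([model], False)
set_trainable([model.classifier], True)

# Stage 2 (3 epochs, lr 5e-5): also unfreeze the top 6 transformer blocks.
set_trainable(layers[-6:], True)

# Stage 3 (2 epochs, lr 1e-5): unfreeze everything for final convergence.
set_trainable([model], True)
```

The remaining fixed hyperparameters: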
- Batch Size: 32
- Optimizer: AdamW
- Scheduler: Cosine Annealing
- Loss Function: Cross-Entropy Loss
- Data Augmentation: Random Horizontal Flip, Random Rotation (10°), Color Jitter
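Continuing the sketch above, a stage-2 training setup consistent with these settings might look as follows; the jitter strengths and the `train_dataset` object are illustrative placeholders.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch.utils.data import DataLoader
from torchvision import transforms

# Augmentations listed above; normalization should match the SigLIP2 processor in practice.
train_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),  # strengths assumed
    transforms.ToTensor(),
])

loader = DataLoader(train_dataset, batch_size=32, shuffle=True)  # train_dataset: placeholder
optimizer = AdamW((p for p in model.parameters() if p.requires_grad), lr=5e-5)  # stage-2 lr
scheduler = CosineAnnealingLR(optimizer, T_max=len(loader) * 3)  # anneal across 3 epochs
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(3):  # stage-2 epoch count
    for pixel_values, labels in loader:
        logits = model(pixel_values=pixel_values).logits
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()
```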
## Performance Metrics
Results on a held-out validation set:
| Metric | Score |
|---|---|
| Accuracy | 78.5% |
| AUC | > 0.86 |
| F1 Score | ~0.78 |
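For reference, these metrics can be reproduced from raw predictions with scikit-learn; in this sketch, `y_true` holds ground-truth labels and `y_score` the model's softmax probability for the Fake class (placeholder values shown).

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# y_true: ground-truth labels (0 = Real, 1 = Fake); y_score: P(Fake) per image.
y_true = np.array([0, 1, 1, 0, 1])            # placeholder values
y_score = np.array([0.1, 0.9, 0.4, 0.2, 0.8])
y_pred = (y_score > 0.5).astype(int)          # 50% threshold, as in the Usage example

print("Accuracy:", accuracy_score(y_true, y_pred))
print("AUC:", roc_auc_score(y_true, y_score))
print("F1:", f1_score(y_true, y_pred))
```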
## Usage
You can load and run this model directly with the Hugging Face `transformers` library:

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "king1oo1/ai-vs-real-deepfake-model"  # Replace with your actual model ID
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)
model.eval()

# Load and preprocess an image
image = Image.open("path/to/your/image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to class probabilities (index 0 = Real, index 1 = Fake)
probs = torch.softmax(outputs.logits, dim=1)
real_prob = probs[0][0].item() * 100
fake_prob = probs[0][1].item() * 100

print(f"Fake probability: {fake_prob:.2f}%")
print(f"Real probability: {real_prob:.2f}%")
print(f"Verdict: {'FAKE' if fake_prob > 50 else 'REAL'}")
```