GuardrailsAI/prompt-saturation-attack-detector Text Classification • 4.39M • Updated Nov 14, 2024 • 12.4k • 2
intelliway/deberta-v3-base-prompt-injection-v2-mapa Text Classification • 0.2B • Updated Jul 3, 2025 • 2
qualifire/prompt-injection-jailbreak-sentinel-v2 Text Classification • 0.6B • Updated Sep 28, 2025 • 3.07k • 28
llm-semantic-router/mmbert-jailbreak-detector-lora Text Classification • Updated about 1 month ago • 8
llm-semantic-router/mmbert-jailbreak-detector-merged Text Classification • 0.3B • Updated about 1 month ago • 105
abdulmunimjemal/Sentinel-Rail-A-Prompt-Attack-Guard Text Classification • Updated about 1 month ago • 1
llm-semantic-router/mmbert-safety-classifier-level1 Text Classification • Updated about 1 month ago • 3
llm-semantic-router/mlcommons-safety-classifier-level1-binary Text Classification • Updated 30 days ago • 10
llm-semantic-router/mmbert32k-jailbreak-detector-merged Text Classification • 0.3B • Updated 19 days ago • 593
satyamg1620/mmbert32k-jailbreak-detector-healthcare-merged Text Classification • 0.3B • Updated 6 days ago • 18