Introduction
🔍 What is Mithra?
Mithra is LLM vulnerability scanner that helps security professionals, developers, and researchers test the security posture of Large Language Model (LLM) applications via API-based scanning. It automatically sends crafted payloads to detect common and emerging risks such as prompt injection, sensitive data leakage, and unauthorized output manipulation — based on structured configurations.
⚠️ Why is LLM Vulnerability Scanning Needed?
As LLMs are rapidly integrated into web apps, chatbots, and SaaS products, new attack surfaces emerge. These models can:
Be tricked into bypassing filters (jailbreaking)
Leak sensitive information from training data
Echo or amplify malicious input
Misinterpret user roles or instructions
Traditional scanners don’t understand the semantic behavior of LLMs. This is where Mithra comes in — by simulating real-world prompts and analyzing the model's language-based responses to detect exploitable behavior.
🧩 What Problems Does Mithra Solve?
Detects Prompt Injection and response manipulation vulnerabilities
Identifies LLMs that leak secrets, instructions, or internal context
Finds misconfigurations in LLM-based APIs
Helps enforce secure prompt engineering practices
Aids in verifying LLM guardrails and filters
Provides a reproducible testing framework for LLM bug bounty or red team engagements
👥 Who Should Use Mithra?
This tool is designed for:
Application Security Engineers To integrate LLM scanning into your existing AppSec workflows
Red Teams & Penetration Testers To assess the robustness of deployed LLM systems during offensive engagements
LLM-Powered Product Teams To test your own API before exposing it to users or the public
Researchers and Builders Exploring how LLMs behave under malicious input or curious to test custom model behavior
Last updated