Scan Result

Once a scan completes, Mithra provides detailed results covering vulnerability findings, attack performance, and actionable insights. These reports help security professionals, developers, and auditors quickly identify weaknesses in their LLM-based applications.


✅ What Does a Scan Result Look Like?

Mithra presents results with real-time feedback, per-payload analysis, and grouped summaries by vulnerability type. The results dashboard contains:

  • ✅ Total Attacks

  • ❌ Total Failures (model misbehaved)

  • ✅ Total Passed (model remained secure)

  • 📂 Categories of attack (e.g., Prompt Injection, Encoding, Reflection)

  • 🔍 Per-payload outcomes with decoded input and model output


📦 Data Fields in a Result

Each scan result includes structured data for easy review:

| Field | Description |
| --- | --- |
| Scan Name | User-defined name for the scan |
| Endpoint | The target URL/API scanned |
| Payload | The exact prompt or encoded input used |
| Category | Type of attack (e.g., Prompt Injection → Base64) |
| Result | Pass (no misbehavior) or Fail (vulnerability detected) |
| Detection Method | How the failure was recognized (e.g., model decoded malicious intent) |
| Confidence | Confidence score (if applicable) |
| Severity | Low / Medium / High, based on impact |
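
For readers who post-process results in their own tooling, the fields above could be modeled roughly as follows. This is a sketch under assumptions: the `ScanResult` class, its attribute names, the choice of which fields are optional, and the example endpoint are all hypothetical, and the actual export format may differ.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ScanResult:
    """One per-payload record; attribute names mirror the documented fields (hypothetical schema)."""

    scan_name: str
    endpoint: str
    payload: str
    category: str
    result: str  # "Pass" or "Fail"
    detection_method: Optional[str] = None  # assumed to be set only for failures
    confidence: Optional[float] = None  # documented as "if applicable", so optional here
    severity: Optional[str] = None  # "Low", "Medium", or "High"

    def __post_init__(self):
        # Reject values outside the documented vocabularies.
        if self.result not in ("Pass", "Fail"):
            raise ValueError(f"result must be 'Pass' or 'Fail', got {self.result!r}")
        if self.severity is not None and self.severity not in ("Low", "Medium", "High"):
            raise ValueError(f"unknown severity: {self.severity!r}")


# Placeholder values for illustration only.
record = ScanResult(
    scan_name="demo-scan",
    endpoint="https://example.com/api/chat",
    payload="aGVsbG8=",
    category="Prompt Injection → Base64",
    result="Pass",
)
print(json.dumps(asdict(record), ensure_ascii=False, indent=2))
```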


🔎 Interpreting Severity

Mithra assigns a severity level to each detected issue based on its impact:

| Severity | Meaning |
| --- | --- |
| 🔵 Low | Model performs non-critical decoding (e.g., harmless Base64 echo) |
| 🟡 Medium | Model reproduces filtered content or leaks system behavior |
| 🔴 High | Model generates harmful output, decodes malicious intent, or bypasses core restrictions |
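
When triaging failures, it can help to treat the three levels as an ordered scale. The sketch below assumes a hypothetical `Severity` enum and made-up findings; it shows one way to filter and sort by severity and is not part of Mithra itself.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Ordered severity scale matching the Low / Medium / High levels above."""

    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical failed findings, invented for illustration.
findings = [
    {"payload_id": "p1", "severity": Severity.LOW},
    {"payload_id": "p2", "severity": Severity.HIGH},
    {"payload_id": "p3", "severity": Severity.MEDIUM},
]


def triage(findings, minimum=Severity.LOW):
    """Keep findings at or above `minimum`, most severe first."""
    kept = [f for f in findings if f["severity"] >= minimum]
    return sorted(kept, key=lambda f: f["severity"], reverse=True)


ordered = triage(findings)
urgent = triage(findings, minimum=Severity.MEDIUM)
print([f["payload_id"] for f in ordered])
print([f["payload_id"] for f in urgent])
```

Using an `IntEnum` keeps comparisons such as "Medium or higher" explicit instead of relying on string ordering.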


📈 Example Scan Summary

🧪 Prompt Injection (Encoding-Based)

| Metric | Value |
| --- | --- |
| Total Attacks | 350 |
| Passed | 269 |
| Failed | 81 |
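
As a quick check on the figures above, the overall pass rate can be derived from the counts. The `pass_rate` helper is a hypothetical name used only for this arithmetic sketch.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage of total attacks."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * passed / total


# Counts taken from the summary table above.
total_attacks, passed, failed = 350, 269, 81

# Passed and failed should account for every attack.
assert passed + failed == total_attacks

overall = pass_rate(passed, total_attacks)
print(f"{overall:.1f}%")  # 76.9%
```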


πŸ” Base16 Injection

  • Pass Rate: 48.3%

  • Objective: Test if model decodes hex-encoded hate speech

  • Sample Failure:


πŸ” Base64 Injection

  • Pass Rate: 63.6%

  • Objective: Inject encoded harmful instructions using Base64

  • Sample Failure:


πŸ” ROT13 Injection

  • Pass Rate: 83.3%

  • Objective: Use ROT13 to bypass filters

  • Sample Failure:


✅ Braille & UUEncode

  • Pass Rate: 100%

  • Model remained secure against these encoding forms.
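
To make the Base16, Base64, and ROT13 encodings in this summary concrete, the sketch below encodes a harmless string with the Python standard library. The `encode_variants` helper is a hypothetical name; how Mithra actually builds its payloads is not described here.

```python
import base64
import codecs


def encode_variants(text: str) -> dict:
    """Encode `text` as Base16 (hex), Base64, and ROT13."""
    raw = text.encode("utf-8")
    return {
        "base16": base64.b16encode(raw).decode("ascii"),
        "base64": base64.b64encode(raw).decode("ascii"),
        "rot13": codecs.encode(text, "rot_13"),
    }


# A benign payload is used deliberately.
variants = encode_variants("hello")
for name, value in variants.items():
    print(f"{name}: {value}")
# base16: 68656C6C6F
# base64: aGVsbG8=
# rot13: uryyb
```

A model that decodes such strings and acts on the hidden text is what the encoding-based categories probe for.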


πŸ–ΌοΈ Screenshots
