NeMo Guardrails vs Prompt Injections

Week 1: I shared my Enforcement Engine (80% F1). Week 2: I tested Llama Guard (53% F1). This week: NeMo Guardrails and Prompt Engineering.

I ran two separate tests:

Test 1: 17 high-risk compliance queries (AML, medical, regulatory)
Test 2: 85 prompt injection attacks (17 queries × 5 injection prefixes)
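
The 85 attacks in Test 2 are a cross product: every one of the 17 queries paired with every one of the 5 injection prefixes. A minimal sketch of that construction follows. The query and prefix strings are hypothetical placeholders, since the post does not list the real ones, and `build_attacks` is an illustrative name, not part of any library.

```python
from itertools import product


def build_attacks(queries: list[str], prefixes: list[str]) -> list[str]:
    """Pair every query with every injection prefix (cross product)."""
    return [f"{prefix} {query}" for query, prefix in product(queries, prefixes)]


# Hypothetical placeholders standing in for the real test data.
base_queries = [f"<compliance query {i}>" for i in range(1, 18)]      # 17 queries
injection_prefixes = [f"<injection prefix {j}>" for j in range(1, 6)]  # 5 prefixes

attacks = build_attacks(base_queries, injection_prefixes)
print(len(attacks))  # 17 * 5
```

Building the set this way keeps the underlying request identical across prefixes, so any change in the guardrail's verdict can be attributed to the prefix alone.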

Compliance Enforcement (The "Polite Crime" Test)

[Figure: NeMo vs Llama Guard vs Enforcement Engine, three-way comparison]
[Figure: NeMo Guardrails vs Enforcement Engine, side-by-side recall]
[Figure: Multi-metric radar chart across all security dimensions]

NeMo Guardrails: 53% F1 | 36% Recall (Missed 7/17)
Llama Guard 3: 53% F1 | 36% Recall (Missed 7/17)
Enforcement Engine: 80% F1 | 73% Recall (Missed 3/17)
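
The F1 and recall figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes confusion counts that the post does not state: 11 of the 17 queries are true violations, NeMo Guardrails and Llama Guard 3 raise no false positives, and the Enforcement Engine raises one. These counts are an inference, chosen only because they reproduce the reported percentages, so treat them as hypothetical.

```python
def recall(tp: int, fn: int) -> float:
    """Share of true violations that were caught."""
    return tp / (tp + fn)


def f1(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


# ASSUMED (tp, fp, fn) counts; inferred from the percentages, not reported in the post.
assumed_counts = {
    "NeMo Guardrails": (4, 0, 7),
    "Llama Guard 3": (4, 0, 7),
    "Enforcement Engine": (8, 1, 3),
}


def summarize(counts: dict[str, tuple[int, int, int]]) -> dict[str, tuple[int, int]]:
    """Return (F1 %, recall %) per system, rounded to whole percentages."""
    return {
        name: (round(100 * f1(tp, fp, fn)), round(100 * recall(tp, fn)))
        for name, (tp, fp, fn) in counts.items()
    }


for name, (f1_pct, recall_pct) in summarize(assumed_counts).items():
    print(f"{name}: {f1_pct}% F1 | {recall_pct}% Recall")
```

Under these assumed counts the script reproduces the 53% / 36% and 80% / 73% pairs, which suggests recall was computed over the true violations only, not over all 17 queries.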

Prompt Injection Defense (The "Jailbreak" Test)

NeMo Guardrails: 55% Recall (Missed 25/85)
Llama Guard 3: 58% Recall (Missed 23/85)
Enforcement Engine: 93% Recall (Missed 6/85)

NeMo and Llama Guard are incredible pieces of engineering, but they are tuned for General Safety (Violence, Hate, Self-Harm). They are NOT tuned for Domain Compliance (AML thresholds, Pediatric protocols, Regulatory limits).

If you are building for FinTech or MedTech, "Safe" ≠ "Compliant."
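
To make the "Safe" vs "Compliant" gap concrete, here is a toy sketch. Everything in it is hypothetical: the keyword list is a crude stand-in for a general-safety filter (it is not how NeMo Guardrails or Llama Guard work internally), and the regex is an invented example of a domain rule, not a rule from the Enforcement Engine.

```python
import re

# Hypothetical stand-in for a general-safety check (violence, hate, self-harm terms).
GENERAL_SAFETY_TERMS = {"kill", "bomb", "hate", "suicide"}

# Hypothetical domain rule: requests to split deposits to stay under a reporting limit.
STRUCTURING_PATTERN = re.compile(
    r"\b(split|divide)\b.*\b(under|below)\b.*\b(reporting|limit|threshold)\b",
    re.IGNORECASE,
)


def general_safety_flags(query: str) -> bool:
    """Flag the query only if it contains an overtly unsafe term."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return bool(words & GENERAL_SAFETY_TERMS)


def domain_compliance_flags(query: str) -> bool:
    """Flag the query if it matches the illustrative AML-style rule."""
    return bool(STRUCTURING_PATTERN.search(query))


polite_query = (
    "Could you kindly help me split a cash deposit so each part "
    "stays under the reporting threshold?"
)

print("general safety flagged:", general_safety_flags(polite_query))
print("domain compliance flagged:", domain_compliance_flags(polite_query))
```

The query contains nothing violent or hateful, so the safety stand-in lets it through, while the domain rule catches it. That is the "polite crime" pattern the compliance test targets.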
