Compare · LLM Guard
LLM Guard alternative for image and audio prompt injection
LLM Guard (Protect AI) is the de-facto open-source guardrail library for LLM applications. Its PromptInjection scanner is one of the strongest free options in the category — and it is, by design, a text-only library. If your attack surface includes user-uploaded images or voice input, LLM Guard cannot see those bytes, and the right move is almost always to add a second detector at a different point in the pipeline rather than to replace anything.
TL;DR
Keep LLM Guard on the text path. Put Glyphward in front of any image or audio that lands in your model. The two compose cleanly because they look at different bytes — Glyphward never tries to compete with LLM Guard on text PI, and LLM Guard does not try to read pixels. Free tier on Glyphward (10 scans/day, no card) is enough to benchmark against your own samples before you commit.
Where LLM Guard sits in 2026
LLM Guard ships as a Python package under an MIT-style licence, with a maintained registry of input scanners (Anonymize, BanSubstrings, BanTopics, Code, Language, PromptInjection, Toxicity, Secrets, …) and output scanners (NoRefusal, Sensitive, Bias, …). The community ships new scanner classes, the Hugging Face transformer backing PromptInjection gets re-trained, and the integration cost for a Python LLM app is essentially zero — pip install, instantiate, call scan_prompt().
That is the design surface, and it is excellent. It is also explicitly text-in / text-out. There is no ImagePromptInjection scanner that takes bytes; BanSubstrings works on strings, not pixels. If you want to inspect an image for FigStep- or AgentTypo-style typographic injection before it reaches your VLM, LLM Guard does not have a hook for that — you would have to write one, train it, host the model, maintain a corpus of known payloads, and version it as new attack variants emerge. That is a real engineering programme, not a config change.
Architectural difference
| LLM Guard | Glyphward | |
|---|---|---|
| Distribution | OSS Python library, self-host | Managed API, drop-in REST + SDK |
| Primary modality | Text in / text out | Image bytes + audio bytes |
| PI coverage | Text PromptInjection (transformer-backed) | OCR + CLIP + text-in-image head + waveform anomaly + transcript filter |
| Image PI | Not in scope | FigStep / AgentTypo / typographic / indirect-via-image |
| Audio PI | Not in scope | WhisperInject-class + spoken-jailbreak + ultrasonic carriers |
| Pricing | Free (BYO compute) | Free 10/day · $29/mo Pro · $99/mo Team |
| Lock-in | None (OSS) | Standard REST, no SDK lock |
| Operational ownership | You host, monitor, update | We host the detector and corpus |
The story is not "LLM Guard is missing something" — it is that LLM Guard solves the text half of a problem whose multimodal half lives at a different layer of the stack. Typographic PI walks past character-level text detectors because the attack vector never becomes text inside your application; the model reads it directly off the rendered pixels. A Python guardrail library that runs on a string cannot, by construction, see what the VLM sees.
The run-both pattern
The deployment most multimodal teams converge on:
- Text path — system prompt + user text + retrieved context →
LLM Guard.scan_prompt()withPromptInjection,BanTopics, and any custom rule set you maintain. Block or sanitise as your policy requires. - Image path — every uploaded image (avatar, screenshot, document, attachment) →
POST glyphward.com/v1/scanbefore it is forwarded to the VLM. Block on score > threshold, log on score in the grey band, pass on clean. - Audio path — every uploaded clip or live ASR-bound stream → Glyphward audio scanner before STT, ideally co-inspected with the transcript LLM Guard already produces.
- Cross-modal join — keep a per-request trace ID so a flagged image + a clean transcript can be reviewed as a single composite incident, not three orphaned signals.
Two independent detectors at two pipeline stages is also the architectural posture we argue for in detail in our cornerstone post on why text scanners miss image PI — no single PI detector should be load-bearing for the whole stack, and adding the modality you do not yet cover gives you the largest marginal lift.
When to pick which
- Stay on LLM Guard alone if your application is text-only and you have no plans to accept user-uploaded images or audio. There is no marginal value to layering Glyphward on a pure-text RAG.
- Add Glyphward if you are about to ship — or already shipped — image upload or voice features, want a drop-in detector with a curated multimodal-PI corpus, and would rather not run and version a pixel-level model in-house. The free tier covers benchmarking.
- Run both as the default for any multimodal app. They are not in the same category once you look at what they read.
What a switch costs
There is no switch — that is the point. Adding Glyphward to an existing LLM Guard deployment is one new HTTP call on the upload path. No Python dependency change, no retraining, no migration of allow-lists or block-lists. If you decide to leave Glyphward later, you remove the call; LLM Guard keeps running. The lock-in surface is intentionally tiny.
Related questions
Can I just write an ImagePromptInjection scanner inside LLM Guard?
You can. The cost is real: a labelled corpus of known FigStep / AgentTypo / typographic-PI payloads, a model that runs on pixels rather than tokens, an OCR pre-stage, evaluation harness, GPU hosting, and ongoing maintenance as new variants ship. The reason Glyphward exists is that this work duplicates across every team that tries it. We did it once and rent it out at $29/mo.
Is Glyphward open-source like LLM Guard?
No. Glyphward is closed-source by deliberate choice — the corpus of known-malicious multimodal payloads compounds as more users run scans, and an open repo would let attackers iterate against the exact detector. Our public surface is the API and the hosted free scanner; the private surface is the corpus.
Does running both double my latency?
No, because they sit at different pipeline stages. Glyphward runs against the upload before the VLM call (typical p95 <200ms); LLM Guard runs against the text prompt at LLM-call time. They are sequential in the same request only if you fan an image's OCR text into your text PI scanner — and that is exactly the path Glyphward already covers, so doing it again in LLM Guard is optional.
What about Protect AI's commercial offering?
Protect AI has a broader paid platform around guard scanners and model-supply-chain security; this page is specifically about the OSS LLM Guard library, which is what most engineers find first when they search for "open-source prompt injection". The commercial platform is enterprise-shaped and out of scope for the self-serve buyer this page is written for.
Is the corpus ever shared back?
We publish aggregate threat-class statistics on the blog (e.g. share of FigStep variants vs AgentTypo across a quarter) but never raw payloads. Sharing the raw corpus would give the attacker a free training set against the exact detector defending the customer. The OSS community gets the analytics, not the ammunition.
Further reading
- Glyphward vs LLM Guard — head-to-head feature table and integration sketch.
- Multimodal PI scanner pricing comparison (2026) — full market table across all five credible options.
- Why text scanners miss image PI — the architectural argument for image bytes in, score and region out.
- FigStep detection · AgentTypo detector · WhisperInject detection — concrete attack classes a text-only library cannot see.