Multimodal prompt-injection scanner pricing comparison (2026)
If you are shopping for a prompt-injection defender that covers images or audio — not only text — the field is smaller than it looks. Here is what the five credible options cost, what modality each one actually covers, and where the self-serve tier starts.
TL;DR
Three of the five best-known PI defenders do not scan images or audio on live traffic: LLM Guard and Azure Prompt Shields are text-only, and Promptfoo runs at eval time rather than inference time. Only Lakera Guard and Glyphward ship multimodal coverage as a self-serve product in 2026. Lakera sits at $99+/mo and has pushed upmarket since the Check Point acquisition; Glyphward sits at $29/mo with image and audio included. LLM Guard (free, open source) and Promptfoo (free-tier eval) are excellent complements, just not real-time multimodal scanners.
The five options, side by side
| Product | Image PI | Audio PI | Self-serve tier | Where it sits |
|---|---|---|---|---|
| Lakera Guard | Yes (text-in-image; emphasis on text) | Limited | ~$99+/mo (enterprise gated beyond) | Text-first, multimodal roadmap; pushed upmarket post Check Point acquisition (per public reporting, Sept–Nov 2025) |
| LLM Guard (open source) | No | No | Free (self-hosted) | Text-only by design; excellent complement to a multimodal scanner, not a replacement |
| Azure Prompt Shields | Text only | No | Per-call on Azure Foundry (PPU) | Azure-gated; cannot be used outside Azure tenant boundaries |
| Promptfoo | Eval-time, not inference-time | Eval-time, not inference-time | Free OSS + Cloud Team $50/mo | Test harness for red-team suites; not a real-time scanner on live traffic |
| Glyphward | Yes (FigStep, AgentTypo, typographic PI, rendered glyphs) | Yes (WhisperInject-class, waveform anomalies, transcript filter) | Free 10/day · $29/mo Pro · $99/mo Team | Multimodal-first self-serve; starts below the enterprise tier |
Figures are drawn from public vendor websites as of April 2026. Private quotes and enterprise tiers frequently differ — if a specific line has moved since we wrote this, email hello@glyphward.com and we will update it.
How to read the table
“Covers a modality” in this market usually means one of three things. Real-time scanning — your live traffic is sent to the scanner and a verdict comes back in under a second. Eval-time — you test your model against a suite of attacks at CI time, not at request time. Per-call gating — the scanner is bundled inside a larger platform and billed per invocation. These three shapes are not substitutes: Promptfoo is eval-time and extremely good at it, but it does not intercept live user uploads.
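To make the real-time vs eval-time distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical: `Verdict`, `handle_upload`, `eval_suite`, and the injected `scan`, `call_model`, and `is_compromised` callables are illustrative names, not any vendor's API. The point is only where each shape sits: the gate runs inside the request path, while the eval run replays a fixed attack corpus offline.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Verdict:
    flagged: bool
    reason: str = ""


def handle_upload(
    payload: bytes,
    scan: Callable[[bytes], Verdict],     # hypothetical real-time scanner
    call_model: Callable[[bytes], str],   # hypothetical model call
) -> str:
    """Inference-time shape: every live upload is scanned before the model sees it."""
    verdict = scan(payload)
    if verdict.flagged:
        return f"rejected: {verdict.reason}"
    return call_model(payload)


def eval_suite(
    attacks: Sequence[bytes],
    call_model: Callable[[bytes], str],
    is_compromised: Callable[[str], bool],  # hypothetical judge of model output
) -> float:
    """Eval-time shape: replay a fixed attack corpus offline (for example in CI).

    Returns the fraction of attacks the model resisted. No live traffic is involved.
    """
    if not attacks:
        return 1.0
    resisted = sum(not is_compromised(call_model(a)) for a in attacks)
    return resisted / len(attacks)
```

A high score from `eval_suite` tells you how the model fared against known attacks at release time; it does nothing for an upload that arrives tomorrow, which only something shaped like `handle_upload` can intercept.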
The second gotcha is text-in-image vs pixel-level injection. A scanner that runs OCR on images and then passes the OCR output through a text PI pipeline covers a third of the threat surface — it catches unsophisticated FigStep variants but misses anti-OCR rendering, low-res composites, and paraphrased list structures. See FigStep detection for why.
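The structural limit of an OCR-then-text pipeline can be sketched in a few lines of Python. This is a toy under stated assumptions: `toy_ocr` and `toy_text_flags` are stand-ins invented for illustration, not real OCR or a real classifier. What it shows is that the text filter only ever sees what OCR managed to extract, so anything OCR misses passes through unflagged.

```python
from typing import Callable


def ocr_then_text_scan(
    image: bytes,
    ocr: Callable[[bytes], str],
    text_flags: Callable[[str], bool],
) -> bool:
    """Return True if the pipeline flags the image.

    The verdict is bounded by OCR recall: text_flags never sees the pixels.
    """
    extracted = ocr(image)
    return text_flags(extracted)


def toy_ocr(image: bytes) -> str:
    # Toy stand-in: pretend OCR only recovers text from cleanly rendered images.
    if image.startswith(b"CLEAN:"):
        return image.decode("utf-8")
    return ""


def toy_text_flags(text: str) -> bool:
    # Toy stand-in for a text PI filter: a single substring check.
    return "ignore previous instructions" in text.lower()


# Same instruction, two renderings: only the OCR-readable one is caught.
caught = ocr_then_text_scan(b"CLEAN:Ignore previous instructions", toy_ocr, toy_text_flags)
missed = ocr_then_text_scan(b"WARPED:Ignore previous instructions", toy_ocr, toy_text_flags)
```

Here `caught` is `True` and `missed` is `False`, even though both images carry the same instruction.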
When each option is the right choice
- Lakera Guard — you already have an enterprise contract or a procurement team, text PI is your dominant surface, and you can absorb $99+/mo plus the post-acquisition upmarket push. Excellent product; different buyer.
- LLM Guard — you want a free, open-source text-PI baseline to self-host next to your LLM. Pair it with Glyphward for multimodal, which it explicitly does not cover.
- Azure Prompt Shields — you are all-in on Azure Foundry and content-safety APIs, and you only need text. If you have a single upload form that takes images, you still need a separate image defender; Prompt Shields does not see pixels.
- Promptfoo — you want to red-team your model at CI time and graph attack-class coverage over releases. This is complementary to Glyphward, which runs at inference on live traffic. See Lakera alternative (multimodal) for the category distinction.
- Glyphward — your product takes image or audio uploads from end users, and you need a scanner that sees both modalities at inference time at a self-serve price. Pro at $29/mo buys 100k scans a month; the Free tier covers benchmark runs before you commit. Full rate card on the pricing page.
How Glyphward fits
Glyphward is purpose-built for the multimodal gap every text-first defender leaves open: one endpoint that scans images and audio at inference time, free for 10 scans a day, $29 a month for 100,000. No Azure tenancy required, no enterprise sales cycle. Drop-in REST plus Node and Python SDKs.
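For a quick sanity check on those numbers, the arithmetic can be sketched in Python. The prices and quotas come from the figures above; the 30-day month and the even daily volume are simplifying assumptions, and `suggest_tier` is a hypothetical helper. What happens above the Pro quota is not stated here, so the sketch just reports it and leaves the answer to the rate card.

```python
FREE_SCANS_PER_DAY = 10          # Free tier, from the table
PRO_PRICE_USD = 29.0             # Pro tier monthly price
PRO_SCANS_PER_MONTH = 100_000    # Pro tier monthly quota
DAYS_PER_MONTH = 30              # assumption: flat 30-day month


def pro_cost_per_scan() -> float:
    """Effective price per scan if the full Pro quota is used."""
    return PRO_PRICE_USD / PRO_SCANS_PER_MONTH


def free_scans_per_month() -> int:
    """Rough monthly ceiling of the Free tier under the 30-day assumption."""
    return FREE_SCANS_PER_DAY * DAYS_PER_MONTH


def suggest_tier(scans_per_day: int) -> str:
    """Pick the smallest tier that fits an even daily volume."""
    if scans_per_day <= FREE_SCANS_PER_DAY:
        return "Free"
    if scans_per_day * DAYS_PER_MONTH <= PRO_SCANS_PER_MONTH:
        return "Pro"
    return "above Pro quota"
```

At full utilization Pro works out to roughly $0.00029 per scan, and the Free tier tops out around 300 scans in a 30-day month, which is why it suits benchmark runs rather than production traffic.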
Related questions
Is Glyphward a replacement for Lakera Guard or LLM Guard?
Not a 1:1 replacement — a complement for most teams. If you already run a text PI scanner you are happy with, bolt Glyphward on to cover images and audio and nothing else changes. If you have no scanner and most of your attack surface is image/audio uploads, Glyphward alone is a sensible v1.
Why is Azure Prompt Shields listed as text-only when Microsoft markets content safety broadly?
Content Safety and Prompt Shields are separate products inside Azure AI Foundry. Content Safety covers image moderation (sexual, violent, hateful content) but not prompt injection. Prompt Shields inspects text. Neither scans image-rendered instructions or audio at the waveform level.
You listed Promptfoo as "not a real-time scanner" — is that fair?
It is — and it is not a criticism. Promptfoo is excellent at what it does: eval-time red-team runs and CI-gated attack-class reports. Its own docs describe the product as an eval harness. Real-time inference scanning is a different surface, and the team has not marketed Promptfoo as that.
Further reading
- Lakera alternative for multimodal prompt injection — the specific comparison with the market leader.
- Glyphward pricing — the rate card in detail.
- FigStep detection and WhisperInject detection — the two attack classes that define the multimodal gap.