Platform guide · Gradio & Streamlit
Prompt injection scanner for Gradio and Streamlit AI apps
Gradio and Streamlit are the two most widely used Python frameworks for building and deploying AI applications with image upload. Gradio's gr.Image() component and Streamlit's st.file_uploader() are the default entry points for user-submitted images in the Python ML ecosystem, used in production apps, Hugging Face Spaces deployments, and internal tools alike. Both frameworks make it trivially easy to pass the uploaded image to a vision LLM: four lines of code connect gr.Image() to an Anthropic messages.create() call. Neither framework includes multimodal prompt injection detection. The image validation available by default (file type checking, size limits, image dimension checks) operates on the file metadata, not the pixel content. A JPEG that passes all metadata checks can contain typographic injection text that the vision LLM processes as an instruction. The gap between "the file uploaded successfully" and "the file is safe for the LLM to process" is the attack surface. Glyphward's scan gate belongs in that gap: after the Gradio or Streamlit upload handler receives the image bytes, and before those bytes are passed to the vision LLM.
TL;DR
Add a Glyphward scan call inside your Gradio inference function or Streamlit processing block, between the upload handler and the messages.create() / generate_content() call. Reject images with score ≥ 65 and return an error message to the user instead of a model response. The scan adds under 200ms to most image processing workflows. Free tier — 10 scans/day, no card required.
Four attack surfaces in Gradio and Streamlit image apps
1. gr.Image() input in Gradio inference functions. A Gradio app that accepts image input calls a Python function with the image as a NumPy array, PIL Image, or file path. That function typically calls a vision model API. An adversarial image uploaded through the gr.Image() component passes through Gradio's file handling without content inspection. Gradio validates the MIME type and optionally enforces a file size limit; it does not inspect pixel content for injection payloads. Public Gradio apps on Hugging Face Spaces (which are indexed and publicly accessible) are particularly high-risk: any internet user can upload an image, and the app's system prompt and model configuration are often visible in the Space's app.py — allowing an attacker to craft a payload targeted specifically at the app's context.
2. Streamlit st.file_uploader() with vision model integration. Streamlit apps that call st.file_uploader("Upload an image", type=["jpg", "png"]) and pass the uploaded bytes to a vision model API have the same gap: the uploaded file is validated for type but not content. Streamlit reruns the script on every interaction, and the uploader widget keeps its file across those reruns, so a user who uploads an adversarial image and triggers multiple interactions can cause repeated injection attempts against the same app state. For Streamlit apps deployed internally (behind SSO or an IP allowlist), the threat model is an insider attack or a compromised internal account rather than a public internet attacker.
3. Gradio Chatbot with image attachment (gr.MultimodalTextbox()). Newer Gradio interfaces use gr.MultimodalTextbox() to combine text and image input in a chat interface. When the user sends a message with an image attachment, the Gradio event handler receives a dict containing both the text and the image file path. Applications that pass the image from this dict directly to a multimodal LLM (appending it as an image_url content block to the conversation history) inherit the same injection risk. The conversational context compounds the attack: an adversarial image injected early in a multi-turn conversation can affect all subsequent turns until the conversation is reset.
4. Batch processing via Gradio or Streamlit with external data sources. Production deployments sometimes use Gradio or Streamlit as a processing UI for batch jobs: a data analyst uploads a folder of images to process, or a researcher uploads a dataset of documents. The batch source is often an S3 bucket URL, a Google Drive folder link, or a local file upload. In batch mode, one adversarial image in a large upload can affect the entire batch if the app processes images sequentially (for example, within a shared model context) and does not scan each one independently. The injection is also harder to notice when the operator reviews results only at the aggregate level: a single anomalous model output in a 500-image batch is easy to miss.
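The batch case can be sketched as a per-image gate that partitions the upload into accepted and rejected items. This is a minimal sketch, not a definitive implementation: `scan_batch`, `BatchResult`, and the stand-in `fake_scan` are hypothetical names, and the `scan` callable is assumed to return a dict with a `score` key, like the scan response used elsewhere in this guide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

INJECTION_THRESHOLD = 65  # same threshold as the rest of this guide


@dataclass
class BatchResult:
    accepted: List[str] = field(default_factory=list)              # names safe to send to the model
    rejected: List[Tuple[str, int]] = field(default_factory=list)  # (name, score) kept for review


def scan_batch(
    images: Dict[str, bytes],
    scan: Callable[[bytes], dict],
    threshold: int = INJECTION_THRESHOLD,
) -> BatchResult:
    """Scan every image on its own and skip any image that fails."""
    result = BatchResult()
    for name, data in images.items():
        try:
            score = int(scan(data)["score"])
        except Exception:
            score = 100  # fail closed if the scan errors or returns no score
        if score >= threshold:
            result.rejected.append((name, score))
        else:
            result.accepted.append(name)
    return result


# Stand-in scanner for illustration only (assumption: replace with a real scan call)
def fake_scan(data: bytes) -> dict:
    return {"score": 90 if b"IGNORE" in data else 5}


batch = {"a.jpg": b"cat photo", "b.jpg": b"IGNORE PREVIOUS", "c.jpg": b"dog photo"}
outcome = scan_batch(batch, fake_scan)
print(outcome.accepted, outcome.rejected)
```

Keeping the rejected list separate from the accepted list gives the operator an explicit review queue, so a flagged image does not get lost among hundreds of normal outputs.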
Integration: Gradio scan gate (Python)
import base64
import io
import logging
import os

import anthropic
import gradio as gr
import requests
from PIL import Image

GLYPHWARD_KEY = os.environ["GLYPHWARD_API_KEY"]
INJECTION_THRESHOLD = 65
anthropic_client = anthropic.Anthropic()
logger = logging.getLogger(__name__)


def to_jpeg_bytes(pil_image: Image.Image) -> bytes:
    """Single encoding path so the scanner and the LLM receive the same JPEG bytes."""
    buffer = io.BytesIO()
    # JPEG has no alpha channel; convert so RGBA or palette images do not raise
    pil_image.convert("RGB").save(buffer, format="JPEG", quality=85)
    return buffer.getvalue()


def scan_pil_image(pil_image: Image.Image, source: str = "gradio_upload") -> dict:
    """Scan a PIL Image for prompt injection before sending to the LLM."""
    try:
        resp = requests.post(
            "https://glyphward.com/v1/scan",
            json={
                "image": base64.b64encode(to_jpeg_bytes(pil_image)).decode(),
                "source": source,
            },
            headers={"Authorization": f"Bearer {GLYPHWARD_KEY}"},
            timeout=8,
        )
        resp.raise_for_status()
        return resp.json()
    except Exception:
        return {"score": 100, "scan_id": None}  # Fail-closed


def analyse_image(pil_image: Image.Image, user_prompt: str) -> str:
    """Gradio inference function: scan, then analyse."""
    if pil_image is None:
        return "Please upload an image."

    # Scan gate: runs before the LLM sees the image
    scan = scan_pil_image(pil_image)
    if scan["score"] >= INJECTION_THRESHOLD:
        # Log the score internally; do not show it to the user
        logger.warning("Rejected upload: score=%s scan_id=%s", scan["score"], scan.get("scan_id"))
        return (
            "⚠️ This image could not be processed. "
            "It failed our security scan. "
            "Please upload a different image."
        )

    # Safe to send to vision LLM
    b64 = base64.b64encode(to_jpeg_bytes(pil_image)).decode()
    message = anthropic_client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {"type": "base64", "media_type": "image/jpeg", "data": b64},
                    },
                    {"type": "text", "text": user_prompt or "Describe this image."},
                ],
            }
        ],
    )
    return message.content[0].text


# Build Gradio interface
demo = gr.Interface(
    fn=analyse_image,
    inputs=[
        gr.Image(type="pil", label="Upload image"),
        gr.Textbox(label="Question about the image", placeholder="What do you want to know?"),
    ],
    outputs=gr.Textbox(label="Analysis"),
    title="Image Analyser",
    description="Upload an image and ask a question. Images are scanned for security before processing.",
)

if __name__ == "__main__":
    demo.launch()
# Streamlit equivalent: same scan gate, different framework
import base64
import logging
import os

import anthropic
import requests
import streamlit as st

GLYPHWARD_KEY = os.environ["GLYPHWARD_API_KEY"]
INJECTION_THRESHOLD = 65
client = anthropic.Anthropic()
logger = logging.getLogger(__name__)

st.title("Image Analyser")
uploaded_file = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png", "webp"])
user_prompt = st.text_input("Question", "Describe this image.")

if uploaded_file and st.button("Analyse"):
    # getvalue() returns the full contents regardless of the current read position
    image_bytes = uploaded_file.getvalue()

    # Scan gate: runs before the LLM sees the image
    try:
        scan_resp = requests.post(
            "https://glyphward.com/v1/scan",
            json={"image": base64.b64encode(image_bytes).decode(), "source": "streamlit_upload"},
            headers={"Authorization": f"Bearer {GLYPHWARD_KEY}"},
            timeout=8,
        )
        scan_resp.raise_for_status()
        scan = scan_resp.json()
    except Exception:
        scan = {"score": 100, "scan_id": None}  # Fail-closed

    if scan["score"] >= INJECTION_THRESHOLD:
        # Log the score internally; do not show it to the user
        logger.warning("Rejected upload: score=%s scan_id=%s", scan["score"], scan.get("scan_id"))
        st.error("This image failed the security scan. Please upload a different image.")
    else:
        b64 = base64.b64encode(image_bytes).decode()
        # Determine MIME type from uploaded file
        mime = "image/jpeg" if uploaded_file.type in ("image/jpg", "image/jpeg") else uploaded_file.type
        with st.spinner("Analysing..."):
            message = client.messages.create(
                model="claude-opus-4-7",
                max_tokens=1024,
                messages=[{
                    "role": "user",
                    "content": [
                        {"type": "image", "source": {"type": "base64", "media_type": mime, "data": b64}},
                        {"type": "text", "text": user_prompt or "Describe this image."},
                    ],
                }],
            )
        st.write(message.content[0].text)
Coverage matrix
| Defence layer | gr.Image() upload | st.file_uploader() upload | gr.MultimodalTextbox() chat | Batch processing mode |
|---|---|---|---|---|
| Gradio file type validation | Validates MIME type — not pixel content | N/A | Validates MIME type — not pixel content | Validates MIME type — not pixel content |
| Streamlit file type filter | N/A | Filters by extension — not pixel content | N/A | N/A |
| Vision model system prompt hardening | Probabilistic — model may still follow injected instructions | Probabilistic | Lower effectiveness in chat context — instructions feel conversationally natural | Probabilistic, harder to audit at batch scale |
| Glyphward scan gate (pre-model) | Yes — scan PIL Image before messages.create() | Yes — scan bytes before LLM call | Yes — scan image attachment before appending to conversation | Yes — scan each image independently; reject and skip adversarial images |
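For the gr.MultimodalTextbox() column, the gate might look like the sketch below. It assumes the handler receives a dict shaped like `{"text": ..., "files": [path, ...]}`; the function names, the `image_path` content-block shape, and the stand-in `stub_scan` are all hypothetical, so adapt them to the message format your model API expects.

```python
from typing import Callable, List, Optional, Tuple

INJECTION_THRESHOLD = 65
REJECTION_MESSAGE = "An attached image failed our security scan. Please try a different image."


def gate_multimodal_message(
    message: dict,
    scan_file: Callable[[str], dict],
    threshold: int = INJECTION_THRESHOLD,
) -> Tuple[List[str], Optional[str]]:
    """Return (safe_paths, error). If any attachment fails, no path is returned."""
    safe_paths: List[str] = []
    for path in message.get("files") or []:
        try:
            score = int(scan_file(path)["score"])
        except Exception:
            score = 100  # fail closed
        if score >= threshold:
            return [], REJECTION_MESSAGE
        safe_paths.append(path)
    return safe_paths, None


def add_user_turn(history: list, message: dict, scan_file: Callable[[str], dict]) -> list:
    """Append the user's turn only after every attachment has passed the scan."""
    safe_paths, error = gate_multimodal_message(message, scan_file)
    if error:
        # The rejected image never enters the history, so later turns cannot inherit it
        return history + [{"role": "assistant", "content": error}]
    turn = [{"type": "text", "text": message.get("text", "")}]
    turn += [{"type": "image_path", "path": p} for p in safe_paths]  # hypothetical block shape
    return history + [{"role": "user", "content": turn}]


# Stand-in scanner for illustration only (assumption: replace with a real scan call)
def stub_scan(path: str) -> dict:
    return {"score": 95 if "evil" in path else 3}


history = add_user_turn([], {"text": "What is this?", "files": ["cat.png"]}, stub_scan)
history = add_user_turn(history, {"text": "And this?", "files": ["evil.png"]}, stub_scan)
```

Because the rejected attachment is never written into the conversation history, resetting the conversation is not needed to recover from a blocked upload.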
Related questions
Does this apply to Hugging Face Spaces running Gradio apps?
Yes, and Hugging Face Spaces represent the highest-risk Gradio deployment context. Spaces are publicly accessible by default, indexed by search engines, and easy to discover. The Space's app.py source code is often public, revealing the app's system prompt, model choice, and prompt template, which gives an attacker precise knowledge of what injection payload to craft. Store the Glyphward API key as a Space secret named GLYPHWARD_API_KEY (not in the source code) so the app can read it from the environment, and add the scan gate to your inference function before the model call.
How do I handle the scan latency in a Gradio app where users expect fast responses?
For interactive Gradio demos, the scan (80–150ms) is typically masked by the vision model response latency (500ms–3s), so users are unlikely to notice the difference. If you are building a latency-critical production app using Gradio as a UI layer, consider running the scan concurrently with the model call: start both at once and withhold the model's output until the scan returns. If the scan reports a high score, cancel the model call (or discard its response) and surface the error. This keeps a response generated from adversarial content from reaching the user, even if the scan takes slightly longer than the model's initial tokens. In most cases, synchronous scanning is simpler and sufficient.
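The concurrent pattern can be sketched with asyncio. This is an illustrative sketch under stated assumptions: `scan_and_answer` is a hypothetical helper, and `scan` and `call_model` stand for async wrappers around your scan request and vision model call. The model output is released only after the scan has passed.

```python
import asyncio
from typing import Awaitable, Callable

INJECTION_THRESHOLD = 65
REJECTION_MESSAGE = "This image failed our security scan. Please upload a different image."


async def scan_and_answer(
    image_bytes: bytes,
    scan: Callable[[bytes], Awaitable[dict]],
    call_model: Callable[[bytes], Awaitable[str]],
    threshold: int = INJECTION_THRESHOLD,
) -> str:
    """Start both calls at once, but return the model output only if the scan passes."""
    model_task = asyncio.create_task(call_model(image_bytes))
    try:
        score = int((await scan(image_bytes))["score"])
    except Exception:
        score = 100  # fail closed
    if score >= threshold:
        model_task.cancel()  # stop the in-flight call, or discard a finished one
        try:
            await model_task
        except (asyncio.CancelledError, Exception):
            pass  # the model result is intentionally thrown away
        return REJECTION_MESSAGE
    return await model_task


async def _demo() -> tuple:
    # Stand-ins for illustration only (assumption: replace with real async calls)
    async def slow_model(_: bytes) -> str:
        await asyncio.sleep(0.05)
        return "a cat on a sofa"

    async def clean_scan(_: bytes) -> dict:
        return {"score": 4}

    async def flagged_scan(_: bytes) -> dict:
        return {"score": 88}

    ok = await scan_and_answer(b"img", clean_scan, slow_model)
    blocked = await scan_and_answer(b"img", flagged_scan, slow_model)
    return ok, blocked


ok_text, blocked_text = asyncio.run(_demo())
```

In this pattern the model may still receive the image before the scan finishes; only its output is withheld. If the model can call tools or trigger side effects, the synchronous gate is the safer choice.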
What should the Gradio app display when an image is rejected?
Return a user-facing message that is informative but not exploitable. "This image failed our security scan" is appropriate. Avoid including the raw scan score in the public UI (it helps attackers calibrate their payload). Log the full scan result (score, scan_id, user session ID) internally for audit. If you are running a public-facing app, consider adding a brief cooldown on rejected uploads per session to rate-limit adversarial probing.
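One possible shape for this rejection path is sketched below. The names, the 30-second cooldown, and the in-memory store are all hypothetical; a multi-process deployment would need a shared store, and the session identifier is assumed to come from your app.

```python
import logging
import time
from typing import Dict, Optional

logger = logging.getLogger("scan_gate")

REJECTION_MESSAGE = "This image failed our security scan. Please upload a different image."
COOLDOWN_SECONDS = 30  # hypothetical value; tune for your app

_last_rejection: Dict[str, float] = {}  # session_id -> time of the last rejected upload


def in_cooldown(session_id: str, now: Optional[float] = None) -> bool:
    """True while the session is still inside the cooldown window after a rejection."""
    now = time.monotonic() if now is None else now
    last = _last_rejection.get(session_id)
    return last is not None and (now - last) < COOLDOWN_SECONDS


def handle_rejection(scan: dict, session_id: str, now: Optional[float] = None) -> str:
    """Log the full scan result internally and return a message without the score."""
    _last_rejection[session_id] = time.monotonic() if now is None else now
    logger.warning(
        "image rejected score=%s scan_id=%s session=%s",
        scan.get("score"), scan.get("scan_id"), session_id,
    )
    return REJECTION_MESSAGE


# Usage with fixed timestamps so the behaviour is easy to follow
shown_to_user = handle_rejection({"score": 91, "scan_id": "abc123"}, "session-1", now=100.0)
still_blocked = in_cooldown("session-1", now=110.0)
released = in_cooldown("session-1", now=131.0)
```

Calling in_cooldown() at the top of the inference function, before the scan, means repeated probing during the window costs the attacker time without returning any new signal.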
Does this apply to Gradio apps that use Google Gemini or OpenAI GPT-4o instead of Claude?
Yes. The scan gate is model-agnostic: it runs on the uploaded image before it is sent to any vision LLM. The Gradio code pattern above can be adapted to any vision API: replace anthropic_client.messages.create() with openai_client.chat.completions.create(), generate_content() on a genai.GenerativeModel instance, or any other vision model call. The scan logic before the model call is identical regardless of the downstream LLM.
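One way to keep the gate independent of the provider is to wrap the model call in a decorator. This is a sketch: `with_scan_gate` is a hypothetical helper, `scan` is assumed to return a dict with a `score` key, and `describe` is a stand-in for whichever vision API call you use.

```python
import functools
from typing import Callable

INJECTION_THRESHOLD = 65
REJECTION_MESSAGE = "This image failed our security scan. Please upload a different image."


def with_scan_gate(scan: Callable[[bytes], dict], threshold: int = INJECTION_THRESHOLD):
    """Wrap any (image_bytes, prompt) -> str model call with the same pre-model scan."""
    def decorator(model_call: Callable[[bytes, str], str]) -> Callable[[bytes, str], str]:
        @functools.wraps(model_call)
        def gated(image_bytes: bytes, prompt: str) -> str:
            try:
                score = int(scan(image_bytes)["score"])
            except Exception:
                score = 100  # fail closed
            if score >= threshold:
                return REJECTION_MESSAGE  # the model call never runs
            return model_call(image_bytes, prompt)
        return gated
    return decorator


# Stand-in scanner for illustration only (assumption: replace with a real scan call)
def stub_scan(image_bytes: bytes) -> dict:
    return {"score": 90 if image_bytes.startswith(b"BAD") else 2}


@with_scan_gate(stub_scan)
def describe(image_bytes: bytes, prompt: str) -> str:
    # Stand-in for an Anthropic, OpenAI, or Gemini call
    return f"model saw {len(image_bytes)} bytes"


clean_reply = describe(b"good-image", "Describe this image.")
blocked_reply = describe(b"BAD-image", "Describe this image.")
```

Swapping providers then means changing only the body of the wrapped function; the threshold and fail-closed behaviour stay in one place.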
Further reading
- Prompt-injection scanner for Hugging Face Transformers — self-hosted vision model scan patterns for the HF ecosystem.
- Prompt-injection scanner for chatbots with image upload — general chatbot image upload attack surface.
- Real-time vs batch prompt injection scanning — architecture guide for batch mode scan patterns.
- Prompt injection prevention best practices — full six-layer defence stack.
- Multimodal LLM security API — Glyphward API overview and authentication.