Prompt-injection scanner for Dify agents
Dify is one of the most widely deployed open-source LLM application frameworks: teams use it to build chatbots, RAG pipelines, and multi-step agents with a visual workflow editor. When a Dify workflow includes a vision node or a file-upload variable that passes image bytes to an LLM node, it creates a multimodal prompt-injection attack surface. Dify's built-in content moderation applies to the text returned by the model, not to the image bytes provided as input. A user who uploads a FigStep-class adversarial image can therefore inject instructions that the vision model reads from the pixel stream, while the moderation layer never inspects the image at all. The fix is a Glyphward scan HTTP node placed between the file-input variable and the LLM node in the Dify workflow: scan the image, gate on the score, and pass it to the model only if it is clean.
TL;DR
In any Dify workflow that accepts an image or file upload and passes it to a vision-capable LLM node: add an HTTP Request node that POSTs the image bytes to Glyphward's /v1/scan, then add an IF/ELSE node that gates on the returned score. Score ≥ 70 → terminate the workflow with an error message before the LLM node fires. Score < 70 → pass to the LLM node as normal. Two nodes and under 200 ms of scan overhead on clean inputs, which is small next to a typical LLM call. Free tier: 10 scans/day, no card.
Where multimodal PI enters a Dify workflow
Vision-node image inputs. Dify supports a built-in Vision input type on the LLM node that passes an uploaded image as a vision content block to GPT-4o, Claude 3.x, Gemini 1.5, or any other vision-capable model. When the workflow's start node includes an image-type variable, users can upload any image. The image is passed directly to the LLM node without any content inspection. A FigStep-class payload in the uploaded image — a text overlay, a typographic instruction, or a high-frequency-steganography payload — is invisible to Dify's moderation layer (which operates on model output text) and passes to the vision model unfiltered.
File-upload tool with vision routing. Dify's file-upload feature accepts PDF, DOCX, PPTX, and image files. When a PDF is processed through a Dify knowledge-base node or document-extractor node, pages are rendered and passed to the LLM. Any image within those pages — embedded photos, scanned pages, charts — reaches the vision encoder. The indirect-PI attack pattern for Dify RAG pipelines follows the same path described in indirect prompt injection via image: the payload is in an image embedded in a knowledge-base document, not in the user's direct message.
Tool-use agents that fetch external URLs. Dify agents with web-browsing or URL-fetch tools retrieve web pages that may contain images. If the agent passes retrieved page content (including embedded images) to a vision model for analysis, the retrieved image is an untrusted external input. This is the indirect-PI via image pattern applied to agentic browsing.
Screenshot agents. Dify workflows that capture screenshots of web pages or application UIs and pass them to a vision model for analysis (a common pattern for UI testing agents and monitoring agents) have a screenshot PI attack surface. See prompt-injection scanner for screenshot agents.
Adding Glyphward to a Dify workflow (HTTP Request node)
Dify's workflow editor supports an HTTP Request node that makes arbitrary HTTP calls and injects the response into workflow variables. Add a scan gate in four steps:
Step 1: Add an HTTP Request node after your start node (or after any node that produces the image variable), before the LLM node.
Configure the HTTP Request node:
Method: POST
URL: https://glyphward.com/v1/scan
Headers:
  Authorization: Bearer {{env.GLYPHWARD_API_KEY}}
  Content-Type: application/json
Body (JSON):
{
  "image": "{{sys.files[0].content_base64}}",
  "source": "dify_workflow",
  "metadata": { "workflow_id": "{{sys.workflow_id}}" }
}
Step 2: Map the response. The HTTP node returns the Glyphward response body as a JSON variable. Name the output variable scan_result. The scan_result.score field is the 0–100 PI risk score.
Step 3: Add an IF/ELSE node after the HTTP Request node.
Condition: scan_result.score >= 70
True branch → End node (output: "We were unable to process this image. Please try a different image or contact support.")
False branch → LLM node (pass image as normal)
Step 4: Set the API key as a Dify environment variable (GLYPHWARD_API_KEY) in your workspace settings — do not hardcode it in the node configuration.
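As a rough sketch, the request body from Step 1 and the branch condition from Step 3 reduce to the logic below. The helper names `build_scan_body` and `should_block` are hypothetical and exist only for illustration; they are not part of Dify or of the Glyphward API. The field names mirror the JSON body and the `score` field described in the steps.

```python
import base64

BLOCK_THRESHOLD = 70  # same cut-off as the IF/ELSE condition in Step 3


def build_scan_body(image_bytes: bytes, workflow_id: str) -> dict:
    """Build the JSON body that the HTTP Request node POSTs to /v1/scan."""
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "source": "dify_workflow",
        "metadata": {"workflow_id": workflow_id},
    }


def should_block(scan_result: dict, threshold: int = BLOCK_THRESHOLD) -> bool:
    """Mirror the IF/ELSE node: True means route to the End node."""
    return scan_result["score"] >= threshold
```

Note that the comparison is inclusive, so a score of exactly 70 takes the blocking branch.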
For PDF file uploads, use a Document Extractor node first to get the rendered page images, then loop over pages with an HTTP Request + IF/ELSE per page before ingesting into the knowledge base.
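The per-page loop can be pictured as follows. This is a minimal sketch, assuming each page is already available as image bytes and that a caller-supplied `scan_page` function (hypothetical) wraps the /v1/scan call and returns a dict with a `score` field. Injecting the scan function keeps the loop itself independent of the network.

```python
from collections.abc import Callable, Iterable
from typing import Optional


def first_blocked_page(
    pages: Iterable[bytes],
    scan_page: Callable[[bytes], dict],
    threshold: int = 70,
) -> Optional[int]:
    """Return the 1-based number of the first flagged page, or None if all pages are clean."""
    for page_number, page_image in enumerate(pages, start=1):
        result = scan_page(page_image)
        if result["score"] >= threshold:
            # Stop early: one flagged page is enough to reject the document.
            return page_number
    return None
```

A document would be ingested only when the function returns `None`; any other return value identifies the page to log and review.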
Python integration for Dify self-hosted custom tools
If you run Dify self-hosted and want a custom Python tool rather than an HTTP node, add a scan_image tool to your Dify tool set:
from collections.abc import Generator

import httpx
from dify_plugin import Tool
from dify_plugin.entities.tool import ToolInvokeMessage


class ScanImageTool(Tool):
    """Pre-flight PI scan for image inputs."""

    def _invoke(self, tool_parameters: dict) -> Generator[ToolInvokeMessage]:
        image_b64 = tool_parameters.get("image_base64", "")
        threshold = int(tool_parameters.get("threshold", 70))
        # The API key comes from the tool's credentials, not from the parameters.
        api_key = self.runtime.credentials["glyphward_api_key"]

        resp = httpx.post(
            "https://glyphward.com/v1/scan",
            headers={"Authorization": f"Bearer {api_key}"},
            json={"image": image_b64, "source": "dify_tool"},
            timeout=5.0,
        )
        # Raise on any 4xx/5xx so a failed scan is never read as a clean result.
        resp.raise_for_status()
        result = resp.json()
        blocked = result["score"] >= threshold

        # Human-readable summary for the agent's reasoning trace.
        yield self.create_text_message(
            f"scan_id={result['scan_id']} score={result['score']} "
            f"blocked={blocked}"
        )
        # Structured result for downstream nodes to branch on.
        yield self.create_json_message({
            "scan_id": result["scan_id"],
            "score": result["score"],
            "blocked": blocked,
            "flagged_region": result.get("flagged_region"),
        })
Use this tool in an Agent node before any vision-model call. The agent checks the scan result and terminates the workflow if blocked is true.
Coverage matrix
| Defence layer | Detects FigStep in user-uploaded image | Detects PI in file-upload PDF pages | Detects indirect PI via fetched URLs | Blocks before LLM node fires |
|---|---|---|---|---|
| Dify built-in content moderation | No (checks model output text) | No | No | No (output-side only) |
| LLM system-prompt instruction | Unreliable (bypassed by payload) | Unreliable | Unreliable | No hard block |
| Lakera Guard (text) | No (text only) | No | No | Text only |
| Glyphward HTTP node | Yes — pixel-level | Yes — page-render scan | Yes — image scan | Yes — hard IF/ELSE block |
Related questions
Which Dify LLM providers expose vision input?
As of mid-2026: OpenAI (GPT-4o, GPT-4V), Anthropic (Claude 3 Opus/Sonnet/Haiku, Claude 3.5 Sonnet), Google (Gemini 1.5 Pro/Flash), Mistral (Pixtral), and any Ollama/local model with vision support (LLaVA, BakLLaVA, InternVL). All of these accept image inputs via the Dify Vision node and are susceptible to FigStep-class payloads in uploaded images. The scan gate is model-agnostic — scan the image bytes before they reach any of these providers.
Does the HTTP node add noticeable latency to my Dify workflow?
The Glyphward scan returns in under 200 ms. For most Dify workflows where the LLM node itself takes 2–10 seconds, the scan adds less than 10% overhead on clean inputs. For workflows with very tight latency requirements (< 500 ms total), use Glyphward's async scan endpoint and return the scan result asynchronously while the workflow waits at the IF/ELSE node.
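The overhead figure is simple arithmetic, sketched here with 200 ms taken as the upper bound for the scan and the LLM latencies taken from the 2–10 second range quoted in this answer. The function name is illustrative only.

```python
def scan_overhead_percent(scan_ms: float, llm_ms: float) -> float:
    """Scan time expressed as a percentage of the LLM node's own latency."""
    return 100.0 * scan_ms / llm_ms


for llm_ms in (2_000, 5_000, 10_000):
    overhead = scan_overhead_percent(200, llm_ms)
    print(f"LLM {llm_ms} ms: scan adds {overhead:.0f}%")
# LLM 2000 ms: scan adds 10%
# LLM 5000 ms: scan adds 4%
# LLM 10000 ms: scan adds 2%
```

Because 200 ms is a ceiling rather than a typical value, the real overhead on a 2-second LLM call stays below the 10% shown for the worst case.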
What about Dify's knowledge-base documents — not user uploads?
Dify knowledge-base documents are typically uploaded by developers or administrators, not by end users — they are a lower-risk input channel than live user uploads. However, if your knowledge base ingests documents from external sources (web crawl, API feeds, customer-submitted content), those documents are untrusted external inputs. For external-origin knowledge-base documents, run a per-page scan at ingestion time, not at query time. See prompt-injection scanner for RAG pipelines for the ingestion-time pattern.
Can I use Glyphward in a Dify self-hosted deployment behind a firewall?
Yes, with a network egress allowance for glyphward.com/v1/scan. The scan POST sends the image bytes as a base64-encoded field in the JSON body — it does not require inbound connectivity. If your Dify deployment is on an air-gapped network, contact Glyphward for on-premises deployment options. The free tier and Pro tier are cloud-hosted; the Team tier includes on-premises deployment under a custom agreement.
How do I handle a workflow where the user's image is flagged — what do I tell the user?
Return a generic error from the End node: "We were unable to process this image. Please try a different image or contact support." Do not reveal that a PI scan blocked the request: doing so confirms that a scanner is in the path and gives an attacker feedback for probing the threshold. Log the scan_id, the workflow_id, and the score to your Dify workflow logs or an external SIEM. A score above 85 warrants immediate security incident review; a score from 70 to 85 may be a false positive that a human can check against the uploaded image.
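One way to encode these bands is sketched below. The function names, action labels, and log-record shape are hypothetical; only the thresholds (70 and 85), the logged fields, and the generic user message come from the guidance in this answer.

```python
GENERIC_USER_MESSAGE = (
    "We were unable to process this image. "
    "Please try a different image or contact support."
)


def triage(score: int) -> str:
    """Map a scan score to an action: 'incident', 'review', or 'pass'."""
    if score > 85:
        return "incident"  # immediate security incident review
    if score >= 70:
        return "review"    # blocked, but possibly a false positive
    return "pass"          # below the blocking threshold


def build_log_record(scan_id: str, workflow_id: str, score: int) -> dict:
    """Assemble the fields to send to workflow logs or a SIEM."""
    return {
        "scan_id": scan_id,
        "workflow_id": workflow_id,
        "score": score,
        "action": triage(score),
    }
```

The user sees `GENERIC_USER_MESSAGE` for both blocked bands; the distinction between `review` and `incident` lives only in the log record.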
Further reading
- FigStep detection — the typographic attack class that vision models read from pixel streams.
- Indirect prompt injection via image — PI payloads in knowledge-base documents and retrieved URLs.
- Prompt-injection scanner for screenshot agents — Dify UI-automation agents.
- Prompt-injection scanner for RAG pipelines — knowledge-base ingestion-time scanning.
- Prompt-injection scanner for LangChain agents — Python-based agent framework comparison.
- Vision language model security — why the visual token stream is invisible to text-layer defences.
- Why text-only scanners miss image prompt injection — architectural background.