Prompt-injection scanner for Flowise agents
Flowise is the leading open-source drag-and-drop tool for building LLM chatflows and agent pipelines. Its visual node library includes ChatOpenAI, ChatAnthropic, and ChatGoogleGenerativeAI nodes that accept image inputs when Allow Image Uploads is enabled, as well as PDF File Loader, Image File Loader, and Document QA Chain nodes that process external files. None of these nodes includes built-in multimodal prompt-injection scanning; they pass image and document content directly to the underlying model API. A user who uploads a FigStep-class adversarial image can inject instructions into the vision model's token stream that are invisible to every text-level content filter in the pipeline. To close the gap, add a Glyphward scan step before the vision-capable LLM node, using either an Express middleware hook (self-hosted) or a Custom JS Function node in the chatflow.
TL;DR
In any Flowise chatflow with Allow Image Uploads enabled or a document-loader node feeding a vision-capable model: intercept the image bytes via a custom tool or middleware hook, POST to Glyphward's /v1/scan, and reject inputs with score ≥ 70 before they reach the ChatOpenAI/ChatAnthropic node. Under 200 ms scan latency. Free tier — 10 scans/day, no card.
Where multimodal PI enters a Flowise chatflow
ChatOpenAI / ChatAnthropic with Allow Image Uploads. When you enable Allow Image Uploads on a vision-capable chat model node in Flowise, end users of your chatflow can upload images alongside their text messages. Flowise passes these images to the model API as image_url or image content blocks, with no built-in scan of the image content. Any user can therefore upload an adversarial image that the vision model reads and acts on: for example, a FigStep-class PNG that renders instructions as typography, or an image carrying a steganographically encoded instruction.
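As a rough illustration of where the image bytes can be intercepted, the sketch below pulls inline images out of a prediction request body. It assumes the body carries an `uploads` array of `{ data, type, name, mime }` objects whose `data` field is a base64 data URI; that shape is an assumption here, and `extractImageUploads` is a hypothetical helper name, so check both against your Flowise version.

```javascript
// Assumed request shape (not confirmed by this page): body.uploads is an
// array of { data, type, name, mime }, where data is a data URI such as
// "data:image/png;base64,....".
const DATA_URI = /^data:([^;,]+);base64,(.+)$/s;

// Hypothetical helper: return the inline images found in a request body.
function extractImageUploads(body) {
  const uploads = Array.isArray(body && body.uploads) ? body.uploads : [];
  const images = [];
  for (const upload of uploads) {
    if (!upload || typeof upload.data !== "string") continue;
    const match = DATA_URI.exec(upload.data);
    if (!match) continue; // e.g. a URL reference rather than inline bytes
    const mime = match[1].toLowerCase();
    if (!mime.startsWith("image/")) continue; // trust the data URI, not upload.mime
    images.push({ name: upload.name || "unnamed", mime, base64: match[2] });
  }
  return images;
}
```

Each returned `base64` string is what would be sent to the scanner. The MIME type is read from the data URI itself rather than from the client-supplied `mime` field, since the latter is untrusted.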
PDF File Loader in Document QA Chains. Flowise's PDF File loader (using pdf-parse or pdfjs-dist under the hood) extracts text from uploaded PDFs. Pages that are scanned images, or that contain embedded image content, yield little or no extractable text, so text-level filters see nothing on them. When a chatflow renders such pages as images and passes them to a multimodal LLM node, it exposes the same attack surface described in PDF prompt-injection detection: a payload on an image-only page reaches the vision encoder unfiltered.
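One way to decide which PDF pages need an image scan is to flag pages whose extracted text is nearly empty. The sketch below assumes you already have one extracted-text string per page from whatever extractor the chatflow uses; `findImageOnlyPages` and the `minChars` cutoff of 20 are illustrative choices, not part of Flowise or Glyphward.

```javascript
// pageTexts: one string per page, in page order (hypothetical input for
// this sketch). Pages with fewer than minChars non-whitespace characters
// are treated as image-only and returned as 1-based page numbers.
function findImageOnlyPages(pageTexts, minChars = 20) {
  const pages = [];
  pageTexts.forEach((text, index) => {
    const visible = (text || "").replace(/\s+/g, "");
    if (visible.length < minChars) pages.push(index + 1);
  });
  return pages;
}

// Example: pages 2, 3, and 4 carry almost no text and would be rendered
// and scanned as images.
const flagged = findImageOnlyPages([
  "Invoice 2024-001 for ACME Corp, total due in 30 days",
  "",
  "  \n ",
  "p. 4",
]);
```

A low cutoff errs toward scanning more pages, which is the safer direction for a pre-LLM gate.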
Image File Loader + Vision Chain. Flowise supports image loading for multi-image analysis chains. A user who can supply images to this chain can supply adversarial images. For document-processing chatflows where users upload product photos, receipt images, or screenshot captures, each user-supplied image is an untrusted input.
Agent tools that fetch images. Flowise agents configured with web-search or URL-fetch tools may retrieve images from external URLs and pass them to a vision model for analysis. External-origin images are untrusted. See indirect prompt injection via image for the retrieved-image attack pattern.
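For the tool-fetch path, a pre-scan step first has to find the image references in a tool's output. The following is a minimal sketch under the assumption that the tool returns text, markdown, or HTML containing absolute image URLs; `findImageUrls` is a hypothetical helper, and the extension list is illustrative.

```javascript
// Hypothetical helper: pull image URLs out of a tool's text output so each
// one can be fetched and scanned before it is handed to a vision model.
const IMAGE_URL =
  /https?:\/\/[^\s"'<>)]+\.(?:png|jpe?g|gif|webp|bmp)(?:\?[^\s"'<>)]*)?/gi;

function findImageUrls(toolOutput) {
  const matches = String(toolOutput).match(IMAGE_URL) || [];
  return [...new Set(matches)]; // de-duplicate, keep first-seen order
}
```

Extension matching will miss images served from URLs without a file extension, so a production gate would more likely key off the fetched response's content type. The sketch only shows where the untrusted inputs come from.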
Option 1: Express middleware hook (self-hosted Flowise)
For self-hosted Flowise, the cleanest approach is to add an Express middleware that intercepts multipart uploads before they reach the Flowise route handler:
// flowise-pi-scan-middleware.js
const axios = require("axios");

const GLYPHWARD_API_KEY = process.env.GLYPHWARD_API_KEY;
const GLYPHWARD_SCAN_URL = "https://glyphward.com/v1/scan";
const SCAN_THRESHOLD = 70; // block at or above this score

// Send one file's bytes to Glyphward and return the parsed scan result.
async function scanImageBuffer(buffer, filename) {
  const base64 = buffer.toString("base64");
  const response = await axios.post(
    GLYPHWARD_SCAN_URL,
    { image: base64, source: "flowise_upload", metadata: { filename } },
    {
      headers: { Authorization: `Bearer ${GLYPHWARD_API_KEY}` },
      timeout: 5000,
    }
  );
  return response.data; // { score, flagged_region, scan_id, modality }
}

// Attach to the Flowise Express app before the Flowise routes.
// Assumes an upstream multipart parser has populated req.files with
// objects exposing data (Buffer), name, and mimetype.
module.exports = function piScanMiddleware(req, res, next) {
  // Only intercept prediction requests that carry file uploads
  if (!req.path.includes("/api/v1/prediction") || !req.files) {
    return next();
  }
  // Images and PDFs are both sent to the scanner
  const files = Object.values(req.files).flat().filter(
    (f) => f.mimetype.startsWith("image/") || f.mimetype === "application/pdf"
  );
  if (files.length === 0) return next();

  Promise.all(
    files.map(async (file) => {
      const result = await scanImageBuffer(file.data, file.name);
      if (result.score >= SCAN_THRESHOLD) {
        throw Object.assign(
          new Error(`Image blocked: PI score ${result.score}`),
          { scan_id: result.scan_id, status: 400 }
        );
      }
      return result;
    })
  )
    .then(() => next())
    .catch((err) => {
      // Scanner and network failures also land here, so the request is
      // rejected rather than forwarded unscanned (fail closed).
      res.status(err.status || 500).json({
        error: err.message,
        scan_id: err.scan_id,
      });
    });
};
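Mount this middleware in Flowise's main index.js or in your custom Docker entrypoint, before the app.use("/api/v1", flowiseRouter) line and after a multipart parser that populates req.files with data and name fields (express-fileupload uses this shape):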
const piScan = require("./flowise-pi-scan-middleware");
app.use(piScan);
app.use("/api/v1", flowiseRouter);
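The threshold check in the middleware can be pulled out into a small pure function, which makes the block/allow policy easy to unit-test without calling the scanner. This is a sketch under assumptions: `evaluateScan` and its `reason` strings are hypothetical names, and treating a response without a numeric `score` as a block is a design choice made here, not documented Glyphward behavior.

```javascript
const SCAN_THRESHOLD = 70; // same cutoff the middleware uses

// Hypothetical helper: turn a scan response into an allow/block decision.
// Anything that is not a numeric score below the threshold is blocked.
function evaluateScan(result, threshold = SCAN_THRESHOLD) {
  const score = result ? result.score : undefined;
  if (typeof score !== "number" || Number.isNaN(score)) {
    return { allow: false, reason: "malformed_scan_response" };
  }
  if (score >= threshold) {
    return {
      allow: false,
      reason: "score_at_or_above_threshold",
      scanId: result.scan_id,
    };
  }
  return { allow: true, reason: "score_below_threshold", scanId: result.scan_id };
}
```

Failing closed on malformed responses matters because a comparison such as `undefined >= 70` evaluates to false in JavaScript, which would otherwise let an unscanned file through.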
Option 2: Custom tool node (cloud-hosted or self-hosted)
If you use Flowise Cloud or cannot modify the server code, create a Custom JS Function node in your chatflow that calls Glyphward before the LLM node:
// Flowise Custom JS Function node
// Input: $image (base64 string from upload variable)
// Output: { passed, scan_id, score }, or throws to stop the chain
const response = await fetch("https://glyphward.com/v1/scan", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${$vars.GLYPHWARD_API_KEY}`,
  },
  body: JSON.stringify({
    image: $image,
    source: "flowise_custom_fn",
  }),
});
// Fail closed: an HTTP error must not be treated as a passing scan
if (!response.ok) {
  throw new Error(`PI scan failed (HTTP ${response.status}); image not forwarded.`);
}
const result = await response.json();
// A missing or non-numeric score is also treated as a block
if (typeof result.score !== "number" || result.score >= 70) {
  throw new Error(
    `Image blocked by PI scanner (score ${result.score}). ` +
    `Please upload a different image. scan_id=${result.scan_id}`
  );
}
return { passed: true, scan_id: result.scan_id, score: result.score };
Wire this node between the Image Upload variable and the ChatOpenAI node. If the function throws, Flowise terminates the chain and returns the error message to the chat interface.
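If both options are in use, it can help to build the scan request in one place so the payload stays identical. The helper below is a sketch: `buildScanRequest` is a hypothetical name, the payload fields are the ones used in the snippets on this page, and stripping a data-URI prefix before sending is an assumption about what the endpoint expects (Option 1 sends raw base64).

```javascript
const GLYPHWARD_SCAN_URL = "https://glyphward.com/v1/scan";

// Hypothetical helper: build the URL and fetch options for one scan call.
function buildScanRequest(apiKey, image, source, filename) {
  if (!apiKey) throw new Error("Missing Glyphward API key");
  // Accept raw base64 or a data URI; send raw base64 (assumption).
  const base64 = image.replace(/^data:[^;,]+;base64,/, "");
  const payload = { image: base64, source };
  if (filename) payload.metadata = { filename };
  return {
    url: GLYPHWARD_SCAN_URL,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(payload),
    },
  };
}
```

Inside the function node this would be used as `const { url, options } = buildScanRequest($vars.GLYPHWARD_API_KEY, $image, "flowise_custom_fn");` followed by `await fetch(url, options)`.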
Coverage matrix
| Defence layer | Detects FigStep in image uploads | Detects PI in PDF loader image pages | Works with Flowise Cloud (no server access) | Blocks before LLM API call |
|---|---|---|---|---|
| Flowise built-in moderation node | No (text prompt moderation only) | No | Yes (but text only) | Text only |
| OpenAI content filter | No (content categories, not PI) | No | Yes (but post-call) | No (output-side) |
| Lakera Guard (text API) | No (text only) | No | Requires custom wiring | Text only |
| Glyphward scan (middleware hook or custom function node) | Yes, pixel-level | Yes, image scan | Yes (custom function node) | Yes, hard block before API |
Related questions
Which Flowise LLM nodes support image inputs?
As of Flowise v2.x: ChatOpenAI (GPT-4o, GPT-4V), ChatAnthropic (Claude 3 family, Claude 3.5 Sonnet), ChatGoogleGenerativeAI (Gemini 1.5 Pro, Gemini 1.5 Flash), ChatOllama (LLaVA and other vision-capable Ollama models), and Azure ChatOpenAI (GPT-4V deployments). Each requires the Allow Image Uploads checkbox to be enabled in the node configuration before it accepts image content from the chat interface. The scan applies to all of them, because the Glyphward API is model-provider agnostic.
Does the scan work with Flowise Cloud (hosted version)?
Yes. Use Option 2 (Custom JS Function node), which makes an outbound HTTPS call to Glyphward from within the Flowise Cloud execution environment. You do not need server access. Store your Glyphward API key as a Flowise variable (the $vars namespace) and reference it as $vars.GLYPHWARD_API_KEY in the function code. The function node executes server-side, so the API key is not exposed to the end user.
How does this interact with Flowise's memory nodes?
Flowise memory nodes (Buffer Memory, Redis Memory) store conversation history as text. If an adversarial image payload causes the LLM to generate a payload-containing text response, that response could be stored in memory and re-injected into future conversation turns. The scan gate prevents the adversarial image from reaching the LLM in the first place, eliminating the risk of payload persistence in memory. Scan at upload time, not at memory retrieval time.
What about the Flowise Document Store — are stored documents scanned?
Flowise's Document Store allows administrators to upload reference documents that are indexed for retrieval. These are uploaded by administrators, not end users — lower risk than live user uploads. However, if your Document Store ingests content from external URLs (via the Web Scraper loader) or from third-party APIs, those are untrusted external sources. For external-origin document-store content, scan at ingestion time using the middleware hook or a separate pre-ingestion pipeline. See prompt-injection scanner for RAG pipelines.
Can I run the scan asynchronously to avoid slowing the chat response?
For user-facing chat latency, the 200 ms scan overhead is typically imperceptible. If you want to avoid adding any synchronous latency, you can run the scan in parallel with an initial LLM request and cancel the LLM request if the scan returns a high score. However, this approach requires careful error handling to avoid leaking partial LLM output to the user before the scan completes. For most Flowise deployments, the synchronous gate (scan, then LLM) is simpler and safer.
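The parallel variant described above can be sketched as follows. The names are hypothetical: `scanFn` stands in for the Glyphward call, and `llmFn` for a model request that honours an `AbortSignal`. The LLM reply is buffered rather than streamed, so nothing reaches the user until the scan has passed.

```javascript
// Hypothetical sketch: start the scan and the LLM call together, release
// the reply only if the scan passes, and abort the LLM call otherwise.
async function scanThenRelease(scanFn, llmFn, threshold = 70) {
  const controller = new AbortController();
  // Capture the LLM outcome so a rejection is never left unobserved
  // while the scan is still running.
  const llmOutcome = llmFn(controller.signal).then(
    (reply) => ({ ok: true, reply }),
    (error) => ({ ok: false, error })
  );

  let scan;
  try {
    scan = await scanFn();
  } catch (error) {
    controller.abort(); // scanner unavailable: fail closed
    throw error;
  }

  if (typeof scan.score !== "number" || scan.score >= threshold) {
    controller.abort(); // stop the in-flight LLM request
    return { blocked: true, scanId: scan.scan_id };
  }

  const outcome = await llmOutcome;
  if (!outcome.ok) throw outcome.error;
  return { blocked: false, reply: outcome.reply };
}
```

The trade-off is that a blocked upload still costs the tokens the model consumed before the abort, and a streaming UI cannot be used with this pattern without reintroducing the leak it is meant to prevent.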
Further reading
- FigStep detection — the typographic attack class in image uploads.
- Indirect prompt injection via image — PI in document-loader and URL-fetcher paths.
- PDF prompt-injection detection — scanning PDF pages rendered as images.
- Prompt-injection scanner for Dify agents — analogous HTTP-node approach for Dify workflows.
- Prompt-injection scanner for LangChain agents — Flowise uses LangChain under the hood.
- Prompt-injection scanner for RAG pipelines — document ingestion PI patterns.
- Vision language model security — why the visual token stream bypasses text defences.