Platform guide · Google Gemini Flash

Prompt injection scanner for Google Gemini Flash

Gemini Flash (gemini-1.5-flash, gemini-2.0-flash) is the cost-optimized tier of the Gemini family and is multimodal by default: text, images, audio, video, and PDF all pass through the same model endpoint with no extra configuration. That makes it the default pick for high-throughput document processing pipelines: receipt extraction, form digitisation, invoice classification, medical-report parsing, product-image cataloguing. Unlike platforms where image input is an opt-in feature, every Flash API call is capable of receiving image data. The attack surface scales with throughput: a pipeline that calls Flash 10,000 times per day with user-supplied images has 10,000 injection opportunities per day. Two Flash features compound this. The 1M-token context window lets a single request carry hundreds of pages, so one adversarial page sits alongside everything else in the call. And a single adversarial image uploaded to the Files API and referenced across hundreds of model calls affects every one of those calls until the file is deleted or expires. To contain the blast radius, Glyphward's scan gate must be applied at the point where images enter the pipeline, not at the model call level.

TL;DR

For Gemini Flash pipelines, scan images before they are uploaded to the Files API or passed as inline base64 data. Use POST https://glyphward.com/v1/scan and reject images with score ≥ 65. Files API uploads that pass the scan should carry the returned scan_id in the file's display name, so downstream calls can check that the image was pre-screened. Free tier: 10 scans/day, no card required.

Four attack surfaces specific to Gemini Flash

1. Files API reuse across model calls. The Gemini Files API lets you upload a file once and reference it by URI in multiple generateContent() calls for up to 48 hours. In a batch document-processing pipeline, a user uploads a PDF; your pipeline converts it to images and uploads each page to the Files API; your Flash calls reference those file URIs. If one page contains a typographic prompt injection — text rendered in the page's font that instructs the model to change its output — every subsequent model call that includes that file URI is affected. The Files API has no built-in content inspection; it stores and serves bytes. A pre-upload scan gate is the only way to prevent a malicious page from propagating through the entire batch.

2. Long-context multi-image batches. Gemini Flash's 1M token context window enables passing hundreds of document pages in a single API call — a 300-page PDF renders as ~300 image frames, all ingestible in one request. An attacker who controls even one page in a large document batch can inject instructions that affect the model's interpretation of all subsequent pages in the same call. This is particularly relevant for contract review, financial statement analysis, and legal discovery workflows where documents come from counterparties or external sources. Scanning each frame individually before constructing the batch request prevents a single adversarial page from poisoning the entire context.

3. Inline base64 images in high-throughput pipelines. Pipelines that skip the Files API and pass images as inline base64 data in each request are vulnerable at every call site. At high throughput, the scanning overhead must be minimised. Gemini Flash's latency target (designed for real-time and near-real-time use) creates pressure to skip pre-call validation. The correct architecture is to scan asynchronously during image ingestion (when the user uploads or the pipeline fetches the source image) and cache the scan result with the image hash. The inline base64 call then uses the cached result rather than blocking on a synchronous scan.

4. Multimodal streaming responses. generateContentStream() returns partial responses as the model generates output. In agentic pipelines, downstream code processes each streamed chunk as it arrives — extracting structured fields, updating a UI, or triggering side effects. An injection instruction embedded in an image can cause the model to output a structured payload mid-stream (for example, a JSON fragment that overrides a previously parsed field, or a tool-call instruction that triggers an unintended action before the stream closes). Standard text-injection defences that scan the final assembled response are blind to mid-stream injection artefacts. The scan gate on the input image — before the stream starts — is the correct prevention layer.
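The scan-at-ingestion pattern from surfaces 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not Glyphward's implementation: `ingest_image`, `build_inline_parts`, and the injected `scan_fn` callable are hypothetical names, and the returned dict shape follows the REST-style inline_data part, which you may need to adapt to your SDK. The idea is that each frame is scored once when it enters the pipeline, and the request builder only consults the cached verdict.

```python
import base64
import hashlib
from typing import Callable

INJECTION_THRESHOLD = 65  # same cut-off as in the TL;DR

# sha256 hex digest -> score recorded at ingestion time (hypothetical in-memory store)
_verdicts: dict[str, int] = {}


def ingest_image(image_bytes: bytes, scan_fn: Callable[[bytes], int]) -> str:
    """Scan once at ingestion and remember the score under the content hash."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    if digest not in _verdicts:
        _verdicts[digest] = scan_fn(image_bytes)
    return digest


def build_inline_parts(frames: list[bytes], mime_type: str = "image/png") -> list[dict]:
    """Build inline-data parts for one request.

    Raises ValueError if any frame was never scanned or scored at or above
    the threshold, so one bad page blocks the whole batch.
    """
    parts = []
    for i, frame in enumerate(frames):
        score = _verdicts.get(hashlib.sha256(frame).hexdigest())
        if score is None:
            raise ValueError(f"frame {i} was never scanned")
        if score >= INJECTION_THRESHOLD:
            raise ValueError(f"frame {i} rejected (score={score})")
        parts.append({
            "inline_data": {
                "mime_type": mime_type,
                "data": base64.b64encode(frame).decode(),
            }
        })
    return parts
```

Because the builder never calls the scanner itself, the model call path carries no scan latency. A frame that bypassed ingestion fails loudly instead of being sent unscreened.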

Integration: Files API scan gate (Python)

import base64, hashlib, io, os, time
import google.generativeai as genai
import requests

GLYPHWARD_KEY = os.environ["GLYPHWARD_API_KEY"]
INJECTION_THRESHOLD = 65  # reject at or above this score
CACHE_TTL_SECONDS = 3600  # reuse a cached verdict for up to one hour
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

_scan_cache: dict[str, dict] = {}  # sha256 → {score, scan_id, ts}

def scan_image_bytes(image_bytes: bytes, source: str) -> dict:
    """Scan with result caching by content hash."""
    h = hashlib.sha256(image_bytes).hexdigest()
    cached = _scan_cache.get(h)
    if cached and time.time() - cached["ts"] < CACHE_TTL_SECONDS:
        return cached

    try:
        resp = requests.post(
            "https://glyphward.com/v1/scan",
            json={"image": base64.b64encode(image_bytes).decode(), "source": source},
            headers={"Authorization": f"Bearer {GLYPHWARD_KEY}"},
            timeout=8,
        )
        resp.raise_for_status()
        result = resp.json()
        # Built inside the try so a malformed response also fails closed
        entry = {"score": result["score"], "scan_id": result["scan_id"], "ts": time.time()}
    except Exception:
        # Fail-closed: scanner unreachable or response unusable → treat as high-risk.
        # The failure is not cached, so the next call retries the scan.
        return {"score": 100, "scan_id": None}

    _scan_cache[h] = entry
    return entry


def safe_upload_to_files_api(image_bytes: bytes, display_name: str, mime_type: str) -> str | None:
    """
    Upload an image to the Gemini Files API after scanning.
    Returns the file URI on success, None if the image was rejected.
    """
    scan = scan_image_bytes(image_bytes, f"files_api_upload:{display_name}")

    if scan["score"] >= INJECTION_THRESHOLD:
        print(
            f"Files API upload rejected: display_name={display_name}, "
            f"score={scan['score']}, scan_id={scan['scan_id']}"
        )
        return None

    # Upload the image that passed the scan; the scan_id travels in the display name
    uploaded = genai.upload_file(
        io.BytesIO(image_bytes),
        mime_type=mime_type,
        display_name=f"{display_name} [gwscan:{scan['scan_id']}]",
    )
    return uploaded.uri


def process_document_batch(page_images: list[tuple[bytes, str]], prompt: str) -> str | None:
    """
    Process a batch of page images with Gemini Flash.
    Scans each page; if any page fails, aborts the entire batch.
    """
    model = genai.GenerativeModel("gemini-2.0-flash")
    content_parts = [prompt]

    for image_bytes, display_name in page_images:
        uri = safe_upload_to_files_api(image_bytes, display_name, "image/png")
        if uri is None:
            # Abort batch: adversarial image detected.
            # Pages uploaded earlier in this loop remain in the Files API until deleted or expired.
            return None
        content_parts.append({"file_data": {"mime_type": "image/png", "file_uri": uri}})

    response = model.generate_content(content_parts)
    return response.text

Coverage matrix

Defence layer	Files API reuse	Multi-image batch	Inline base64 calls	Streaming responses
Google AI Studio built-in safety filters	Harm-category flagging (violence, explicit), not adversarial prompt injection	Harm-category only	Harm-category only	Harm-category only
Files API access controls	Controls who can reference a file URI, not the file's content	N/A	N/A	N/A
Google Cloud DLP	Scans text for PII patterns, not adversarial pixel-layer content	No	No	No
Glyphward pre-upload scan gate	Yes: scan before Files API upload; tag the file's display name with scan_id	Yes: scan each frame before batch construction	Yes: scan and cache at ingestion; reuse result per content hash	Yes: input-side scan prevents injection appearing in stream

Related questions

Does this apply to Gemini 2.0 Flash and Gemini 2.5 Flash as well?

Yes. All Gemini Flash variants share the same Files API and inline image input pattern. The multimodal attack surface is identical across Flash versions — the model version affects capabilities and pricing, not the injection risk profile. The scan gate described above applies to any gemini-*-flash model ID.

How does the Files API reuse risk differ from the Vertex AI Gemini API?

The Files API (available via generativelanguage.googleapis.com, the AI Studio endpoint) and the Vertex AI Multimodal API are separate services. On Vertex AI, images are typically passed as inline base64 or as GCS URIs rather than through the Files API. The Vertex AI attack surface is covered on the Vertex AI Agent Builder page. For the AI Studio / Files API pattern described here, the pre-upload scan gate is the primary control.

What is the latency impact of scanning before every Files API upload?

A Glyphward scan on a typical document page image (<1MB, 1-3MP) takes 80–150ms. For batch document processing where upload latency is already in the hundreds of milliseconds (network + Files API write), the scan adds <20% overhead. For synchronous pipelines where latency matters, use the content-hash cache: scan once per unique image, reuse the result for identical re-uploads. In practice, document processing pipelines contain many duplicate or near-duplicate pages (blank pages, standard headers) that collapse to a single scan.
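
To see how de-duplication changes the overhead, a rough estimate can be worked through in code. This is an illustrative sketch only: `estimate_scan_overhead` is a hypothetical helper, the per-page scan and upload times are inputs you supply from your own measurements, and it assumes every page is uploaded while only byte-identical pages share a scan.

```python
import hashlib


def estimate_scan_overhead(pages: list[bytes], scan_ms: float, upload_ms: float) -> dict:
    """Estimate scan time relative to upload time for one batch.

    Byte-identical pages share one scan (content-hash cache); every page
    is still uploaded. Near-duplicates are not collapsed here.
    """
    unique_scans = len({hashlib.sha256(p).digest() for p in pages})
    total_scan_ms = unique_scans * scan_ms
    total_upload_ms = len(pages) * upload_ms
    overhead_pct = 100 * total_scan_ms / total_upload_ms if total_upload_ms else 0.0
    return {
        "pages": len(pages),
        "unique_scans": unique_scans,
        "total_scan_ms": total_scan_ms,
        "total_upload_ms": total_upload_ms,
        "overhead_pct": overhead_pct,
    }
```

With five pages of which three are unique, a 100 ms scan, and a 500 ms upload per page (figures chosen only for illustration), the estimate comes to 300 ms of scanning against 2,500 ms of uploads.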

Should we scan audio and video inputs as well?

Yes, if your Gemini Flash pipeline processes audio or video, but coverage is partial today. Glyphward's current scanner covers images and OCR-extracted text from PDF page images. Audio injection (WhisperInject-style attacks on spoken-word content) and video frame injection are on the roadmap. In the meantime, apply Glyphward to any image frames you extract from video, and use a transcript-level text scanner on audio content. See the voice agent scanning page for the audio pattern.
