Prompt-injection scanner for Vertex AI Agent Builder

Vertex AI Agent Builder (successor to Dialogflow CX + Vertex AI Search for Commerce and Enterprise) lets you build production-ready RAG agents by connecting a Data Store of PDFs, web pages, or multimodal documents to a Gemini-powered reasoning engine via the Agent Builder console or Agent APIs. Unlike a direct Gemini API call, Agent Builder's grounding layer automatically retrieves relevant chunks from the Data Store, including image-rich PDF pages, and passes them as context to the vision model. This creates an indirect injection path: an adversarial image embedded in any Data Store document enters the agent's context window uninspected every time a user query retrieves the chunk that contains it. Scanning with Glyphward at document ingestion, or at query time on retrieved chunks, closes this path before the Gemini model sees the content.

TL;DR

At document ingestion time (before adding to the Data Store), scan each image page with POST https://glyphward.com/v1/scan. At query time, if your Grounding API response returns a snippet that includes image bytes or image-derived text, scan it before passing it to Gemini. If the score is ≥ 70, quarantine the document from the Data Store or block the retrieved chunk. The free tier allows 10 scans/day, no card required.
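The threshold rule above can be sketched as a small helper. This is a minimal illustration, assuming the scan response is a JSON object with a numeric "score" field on a 0-100 scale (as the integration code on this page uses it); the function name verdict_for and the choice to block on a malformed response are assumptions for illustration, not part of the Glyphward API.

```python
from typing import Literal

GLYPHWARD_THRESHOLD = 70  # threshold quoted in the TL;DR


def verdict_for(
    scan_result: dict, threshold: int = GLYPHWARD_THRESHOLD
) -> Literal["block", "allow"]:
    """Map a scan response to an action.

    Hypothetical helper: a missing or non-numeric score is treated as
    "block" so that a malformed response fails closed.
    """
    score = scan_result.get("score")
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return "block"
    return "block" if score >= threshold else "allow"
```

Keeping the comparison in one place means the ingestion path and the query-time path cannot drift apart on what counts as a blocked image.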

Attack surface: where Vertex AI Agent Builder receives unscanned images

Data Store document ingestion. Agent Builder Data Stores accept structured data, unstructured documents (PDFs, HTML), and website URLs via the Discovery Engine API (POST /v1/projects/{project}/locations/{location}/dataStores/{dataStoreId}/branches/default_branch/documents:import). PDFs containing images — architecture diagrams, scanned invoices, signed contracts, compliance certificates — are parsed page-by-page and stored as multimodal chunks. Any image in the ingested PDFs that contains adversarial pixel-level instructions is stored verbatim in the Data Store and will be retrieved and forwarded to the Gemini model whenever a query matches that chunk's relevance score.
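If your pipeline imports whole PDFs rather than individual pages, one option is to make the decision per document: quarantine the document when any page image is flagged. The sketch below assumes you have already rendered each page to image bytes and that you pass in a scan callable returning a dict with a "score" field; IngestDecision and decide_document are hypothetical names, and the actual Discovery Engine import call is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class IngestDecision:
    """Outcome of scanning every page image of one document."""
    doc_id: str
    allowed: bool
    flagged_pages: list[int] = field(default_factory=list)


def decide_document(
    doc_id: str,
    page_images: list[bytes],
    scan: Callable[[bytes], dict],
    threshold: int = 70,
) -> IngestDecision:
    """Allow the document only if every page scores below the threshold."""
    flagged: list[int] = []
    for idx, page in enumerate(page_images):
        try:
            score = scan(page)["score"]
        except Exception:
            # Fail closed: a scanner error or malformed response flags the page.
            flagged.append(idx)
            continue
        if score >= threshold:
            flagged.append(idx)
    return IngestDecision(doc_id=doc_id, allowed=not flagged, flagged_pages=flagged)
```

Only documents whose decision has allowed set to True would then be handed to the import step; flagged_pages gives a reviewer the page indices to inspect.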

Agent Playbooks with multimodal tool outputs. Agent Builder's Playbook feature (the declarative task-planning engine) supports tool actions that can return structured data, including image references. If a Playbook tool fetches an image from an external URL and includes it as part of the tool response, the image is passed directly to the Gemini grounding model as context without any intermediate inspection. External URLs returned by tools are under the control of the data sources the tool queries — which may include user-controlled external systems.
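One way to gate this path is to post-process the tool response in your own tool backend before it is returned to the agent. The sketch below walks a JSON-like tool output, scans anything that looks like an image URL, and replaces failing URLs with None. The names redact_unsafe_images and looks_like_image_url, the suffix-based URL check, and the injected fetch and scan callables are all assumptions for illustration; nothing here is an Agent Builder API.

```python
from typing import Any, Callable
from urllib.parse import urlparse

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".webp")


def looks_like_image_url(value: str) -> bool:
    """Heuristic: http(s) URL whose path ends in a common image suffix."""
    try:
        parsed = urlparse(value)
    except ValueError:
        return False
    return parsed.scheme in ("http", "https") and parsed.path.lower().endswith(
        IMAGE_SUFFIXES
    )


def redact_unsafe_images(
    tool_output: Any,
    fetch: Callable[[str], bytes],
    scan: Callable[[bytes], dict],
    threshold: int = 70,
) -> Any:
    """Return a copy of tool_output with failing image URLs replaced by None."""
    if isinstance(tool_output, dict):
        return {
            key: redact_unsafe_images(value, fetch, scan, threshold)
            for key, value in tool_output.items()
        }
    if isinstance(tool_output, list):
        return [redact_unsafe_images(v, fetch, scan, threshold) for v in tool_output]
    if isinstance(tool_output, str) and looks_like_image_url(tool_output):
        try:
            score = scan(fetch(tool_output))["score"]
        except Exception:
            return None  # fail closed if the fetch or the scan fails
        return tool_output if score < threshold else None
    return tool_output
```

The suffix check is only a heuristic: image URLs without a file extension would need a content-type check on the fetched response instead.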

Search and Summarize with image-rich results. Agent Builder's Search and Summarize feature extracts snippets from retrieved documents and asks Gemini to synthesise an answer. For image-heavy documents (technical manuals, scientific papers, compliance reports with embedded charts), the image is included in the Gemini multimodal context as a rendered page image. Any adversarial payload in the image portion of a retrieved chunk reaches Gemini as part of the summarisation context.

Conversation history with image uploads. Agent Builder conversations support multi-turn interactions where users can upload images as part of their query. Unlike grounding-driven injection (which comes from documents in the Data Store), this is a direct injection vector: the user sends an image that is passed verbatim to the Gemini model as the most recent conversational turn. Agents deployed in external-facing products (customer support bots, document Q&A interfaces, e-commerce advisors) accept these images from the full public user base.

Integration: Python with google-cloud-discoveryengine and vertexai

The patterns below show how to insert a Glyphward scan at both the document-ingestion layer (Pattern A) and the conversation-turn layer (Pattern B). Both use the same /v1/scan endpoint and the same threshold logic.

import base64
import requests
from google.cloud import discoveryengine_v1 as discoveryengine
from vertexai.generative_models import GenerativeModel, Image

PROJECT_ID = "your-gcp-project"
LOCATION = "global"
DATA_STORE_ID = "your-data-store"
GLYPHWARD_KEY = "<your-glyphward-api-key>"
GLYPHWARD_THRESHOLD = 70

def scan_image_bytes(image_bytes: bytes, source: str = "vertex_agent_builder") -> dict:
    encoded = base64.b64encode(image_bytes).decode()
    resp = requests.post(
        "https://glyphward.com/v1/scan",
        json={"image": encoded, "source": source},
        headers={"Authorization": f"Bearer {GLYPHWARD_KEY}"},
        timeout=8,
    )
    resp.raise_for_status()
    return resp.json()

# Pattern A: scan at document ingestion time
def safe_ingest_pdf_pages(pdf_page_images: list[bytes], doc_id: str) -> list[bytes]:
    """Return only pages that pass the scan threshold."""
    clean_pages = []
    for page_idx, page_bytes in enumerate(pdf_page_images):
        try:
            result = scan_image_bytes(page_bytes, source="vertex_data_store_ingest")
        except Exception:
            # Fail-closed: scanner unreachable → quarantine the page
            print(f"Scanner unavailable for page {page_idx} of {doc_id} — quarantined.")
            continue
        if result["score"] >= GLYPHWARD_THRESHOLD:
            print(
                f"Page {page_idx} of {doc_id} blocked: score {result['score']}/100 "
                f"(ref {result['scan_id']})"
            )
        else:
            clean_pages.append(page_bytes)
    return clean_pages

# Pattern B: scan user-supplied image at conversation turn
def safe_agent_conversation_with_image(user_text: str, image_bytes: bytes) -> str:
    try:
        scan = scan_image_bytes(image_bytes, source="vertex_agent_conversation")
    except Exception as exc:
        # Fail-closed: do not forward the image if the scanner cannot be reached
        raise RuntimeError("Image security check unavailable. Please retry.") from exc

    if scan["score"] >= GLYPHWARD_THRESHOLD:
        raise ValueError(
            f"Image blocked (score {scan['score']}/100, ref {scan['scan_id']})"
        )

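    # Assumes vertexai.init(project=..., location=...) has already been called,
    # or that a default project and credentials are configured in the environment.
    model = GenerativeModel("gemini-2.0-flash")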
    response = model.generate_content([
        user_text,
        Image.from_bytes(image_bytes),
    ])
    return response.text

Pattern A (ingestion-time scan) keeps adversarial images out of the Data Store, which removes the risk for every future query that would have retrieved the contaminated chunk. Pattern B (conversation-turn scan) gates user-uploaded images before they reach the model. Both patterns use the same Glyphward endpoint; combine them for defence-in-depth.
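Because the same image bytes can pass through both patterns, or be re-ingested repeatedly, a content-hash cache in front of the scan call may help stay within a scan quota. This is an optional sketch, not part of Glyphward: CachedScanner is a hypothetical wrapper around any callable with the same shape as scan_image_bytes, and the in-memory dict stands in for whatever store you would use in production.

```python
import hashlib
from typing import Callable


class CachedScanner:
    """Wrap a scan callable and reuse results for byte-identical images."""

    def __init__(self, scan: Callable[[bytes], dict]) -> None:
        self._scan = scan
        self._cache: dict[str, dict] = {}
        self.remote_calls = 0  # number of times the wrapped scanner was invoked

    def __call__(self, image_bytes: bytes) -> dict:
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._cache:
            self.remote_calls += 1
            # If the wrapped call raises, nothing is cached and the error propagates.
            self._cache[key] = self._scan(image_bytes)
        return self._cache[key]
```

A cached verdict reflects the scanner as it was when the image was first seen, so an expiry policy is worth considering if detection rules are updated over time.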

Coverage matrix

Defence layer	Data Store PDF ingestion	Playbook tool image output	User conversation image upload
Cloud DLP API (content inspection)	Partial — scans for PII patterns, not adversarial pixel-level text	No	No
Google Cloud CCAI content safety	No — text moderation only, not pixel-level image inspection	No	No
VPC Service Controls	No — network perimeter, not content inspection	No	No
Glyphward pre-model scan	Yes — ingestion-time quarantine of adversarial pages	Yes	Yes

Related questions

How does this differ from calling the Gemini API on Vertex AI directly?

The prompt-injection-scanner-for-google-vertex-ai-gemini page covers the raw Gemini API via vertexai.generative_models.GenerativeModel — a direct single-turn or chat call where your application controls every input. Agent Builder adds an abstraction layer: it retrieves grounding chunks from a Data Store autonomously, decides which chunks to include as context, and can invoke Playbook tools that produce their own outputs. You may not see every image that reaches the Gemini model because Agent Builder's retrieval step decides what to include at query time. The injection surface is larger because it includes documents that were ingested weeks ago and are now silently forwarded as context. Scanning at ingestion time (Pattern A) addresses this class of injection that a direct-API scan cannot catch retrospectively.

Does this work with Vertex AI Search (Data Store with website URLs)?

Vertex AI Agent Builder supports Data Stores backed by website crawls in addition to document uploads. Pages crawled from external URLs may contain images with adversarial content. For web-crawled Data Stores, the most practical protection is to scan retrieved multimodal chunks at query time rather than at ingestion time (Pattern B applied to grounding context rather than conversation images). Contact Glyphward if you need a bulk-scan API for web-crawled Data Store contents.
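A query-time filter over retrieved chunks might look like the following. It is a sketch under stated assumptions: RetrievedChunk is a hypothetical shape for a chunk after you have extracted any image bytes from the grounding response (the real response schema is not shown here), and scan is any callable returning a dict with a "score" field.

```python
from typing import Callable, Iterable, Optional, TypedDict


class RetrievedChunk(TypedDict):
    """Hypothetical chunk shape; adapt to your actual grounding response."""
    chunk_id: str
    text: str
    image_bytes: Optional[bytes]


def filter_chunks(
    chunks: Iterable[RetrievedChunk],
    scan: Callable[[bytes], dict],
    threshold: int = 70,
) -> list[RetrievedChunk]:
    """Keep text-only chunks and chunks whose image scores below the threshold."""
    safe: list[RetrievedChunk] = []
    for chunk in chunks:
        image = chunk.get("image_bytes")
        if image is None:
            safe.append(chunk)  # nothing to scan
            continue
        try:
            score = scan(image)["score"]
        except Exception:
            continue  # fail closed: drop the chunk if the scan fails
        if score < threshold:
            safe.append(chunk)
    return safe
```

Only the chunks returned by filter_chunks would then be assembled into the context passed to Gemini.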

What about Vertex AI Workbench notebook integrations?

Vertex AI Workbench notebooks that call Agent Builder APIs directly (via discoveryengine client or raw REST) follow the same patterns. Insert the scan_image_bytes() call between the image source and any Gemini multimodal content block. Workbench-based pipelines often process batch documents for ML experiments — use Pattern A (ingestion-time scan on each page image) before adding documents to the Data Store.

How does this interact with VPC Service Controls?

VPC Service Controls creates a security perimeter around Google Cloud APIs — it restricts which projects and networks can call the Discovery Engine and Vertex AI APIs. VPC SC is a network-layer control: it prevents unauthorised callers but does not inspect the content of API requests. An adversarial image submitted by an authorised caller inside the VPC SC perimeter is not flagged by the perimeter. Glyphward operates at the content layer — call it from inside your VPC (the Glyphward endpoint is reachable over HTTPS from any GCP region) before forwarding image bytes to Agent Builder or Vertex AI.
