Platform guide · AWS Bedrock Agents
Prompt injection scanner for AWS Bedrock Agents
AWS Bedrock Agents is not the same as calling the Bedrock API directly. Agents add two autonomous attack surfaces that the basic Bedrock API does not have: Knowledge Bases (which retrieve document chunks from S3, Confluence, SharePoint, and Salesforce data sources — including rendered PDF page images) and ActionGroups (Lambda or OpenAPI-backed functions that the agent invokes autonomously and whose return values feed back into the agent's reasoning). Bedrock Guardrails — AWS's built-in content filter — inspects the text layer only. It does not decode or scan image bytes embedded in Knowledge Base chunks or ActionGroup responses. An adversarial image planted in an S3-stored PDF, a Confluence attachment, or a Lambda function's return payload can instruct the agent to call unintended actions, exfiltrate data via the next tool call, or produce a false final answer — all without triggering any Guardrail. Glyphward's scan gate closes this gap at the point where image bytes enter the agent's context, before the next reasoning step.
TL;DR
Add a Lambda layer or wrapper function that scans image bytes from Knowledge Base retrieval results and ActionGroup responses before returning them to the Bedrock Agent runtime. Call POST https://glyphward.com/v1/scan with the base64-encoded image; if score ≥ 65, replace the image with a redaction notice and log the event. Free tier — 10 scans/day, no card required.
The four multimodal attack surfaces in Bedrock Agents
1. Knowledge Base document retrieval. When you create a Bedrock Knowledge Base, you connect it to an S3 bucket (or Confluence, SharePoint, Salesforce). The Knowledge Base ingestion pipeline chunks documents and stores embeddings in OpenSearch Serverless or Amazon Aurora PostgreSQL. For PDFs, the ingestion pipeline renders each page as an image for chunking purposes. When the agent retrieves chunks at runtime (via RetrieveAndGenerate or the agent's built-in knowledgeBase action), those rendered page images may be included in the chunk payload alongside the extracted text. The agent then passes both the text and the image to the foundation model (Claude, Titan, Llama) for reasoning. If the PDF was contributed by an external user (a support ticket attachment, a public S3 prefix, a Confluence space with external write access), the adversarial image was planted before your Guardrail had any opportunity to inspect it.
2. ActionGroup Lambda return values with image data. ActionGroups invoke Lambda functions (or OpenAPI-endpoint URLs) to perform actions — querying databases, calling external APIs, reading files from S3. A Lambda function that reads from S3, calls a third-party REST API, or scrapes a URL can return a base64-encoded image in its JSON response. The Bedrock Agents runtime passes this image to the foundation model as part of the tool result. Guardrails does not intercept Lambda return values for image content. The scan must happen inside the Lambda function or in a wrapper invoked before the result is returned to the agent runtime.
3. Multi-agent orchestration with sub-agent results. Bedrock Agents supports multi-agent patterns where a supervisor agent invokes sub-agents and aggregates their responses. If a sub-agent's response includes an image (from its own Knowledge Base retrieval or ActionGroup call), that image enters the supervisor agent's context without the supervisor having any native inspection hook. The image can carry instructions that override the supervisor's system prompt, redirecting the orchestration chain.
4. Inline agents with session-level image context. The Bedrock Agents inline API (InvokeInlineAgent) allows passing inline tool definitions and session-state variables at invocation time. An attacker who can influence session state can inject an image into the agent's context before the first reasoning step. Session-state variables are not filtered by Guardrails.
Integration: Lambda wrapper scanning Knowledge Base results
import base64, json, boto3, requests, os
GLYPHWARD_KEY = os.environ["GLYPHWARD_API_KEY"]
SCAN_THRESHOLD = 65
bedrock_agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
def scan_image_bytes(image_bytes: bytes) -> dict:
resp = requests.post(
"https://glyphward.com/v1/scan",
json={"image": base64.b64encode(image_bytes).decode(), "source": "bedrock_agents_kb"},
headers={"Authorization": f"Bearer {GLYPHWARD_KEY}"},
timeout=8,
)
resp.raise_for_status()
return resp.json()
def safe_retrieve_and_generate(session_id: str, input_text: str, kb_id: str, model_arn: str) -> dict:
"""
Retrieve from Knowledge Base, scan any images in retrieved chunks,
then pass clean chunks to the foundation model.
"""
# Step 1: Retrieve chunks from Knowledge Base (without generate)
retrieve_resp = bedrock_agent_runtime.retrieve(
knowledgeBaseId=kb_id,
retrievalQuery={"text": input_text},
retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 5}},
)
clean_results = []
for result in retrieve_resp.get("retrievalResults", []):
content = result.get("content", {})
# Check for image content in the retrieval result
if content.get("type") == "IMAGE" or "imageContent" in content:
image_b64 = content.get("imageContent", {}).get("data", "")
if image_b64:
try:
scan = scan_image_bytes(base64.b64decode(image_b64))
if scan["score"] >= SCAN_THRESHOLD:
# Replace adversarial image with a redaction notice
result["content"] = {"type": "TEXT", "text": "[Image redacted: adversarial content detected]"}
print(f"KB image redacted: score={scan['score']}, scan_id={scan['scan_id']}, loc={result.get('location')}")
clean_results.append(result)
continue
except Exception as e:
# Fail-closed: scanner unreachable → redact
result["content"] = {"type": "TEXT", "text": "[Image redacted: scan unavailable]"}
clean_results.append(result)
continue
clean_results.append(result)
# Step 2: Generate with clean retrieved context
generate_resp = bedrock_agent_runtime.retrieve_and_generate(
sessionId=session_id,
input={"text": input_text},
retrieveAndGenerateConfiguration={
"type": "KNOWLEDGE_BASE",
"knowledgeBaseConfiguration": {
"knowledgeBaseId": kb_id,
"modelArn": model_arn,
},
},
)
return generate_resp
# ActionGroup Lambda handler pattern
def action_group_handler(event, context):
"""
ActionGroup Lambda: scan any image bytes before returning to agent runtime.
"""
# ... your business logic ...
result_payload = your_business_logic(event)
# Scan image bytes in result before returning to Bedrock Agents
if "imageData" in result_payload:
image_bytes = base64.b64decode(result_payload["imageData"])
try:
scan = scan_image_bytes(image_bytes)
if scan["score"] >= SCAN_THRESHOLD:
result_payload["imageData"] = None
result_payload["warning"] = "Image redacted by security scan"
except Exception:
result_payload["imageData"] = None # Fail-closed
return {
"messageVersion": "1.0",
"response": {"actionGroup": event["actionGroup"], "function": event["function"], "functionResponse": {"responseBody": {"TEXT": {"body": json.dumps(result_payload)}}}},
}
The Knowledge Base retrieval pattern uses the two-step retrieve() then retrieve_and_generate() approach to interpose the scan between the vector search result and the foundation model call. The ActionGroup Lambda pattern wraps the return value before the Bedrock Agents runtime receives it. Both patterns work with any foundation model available in Bedrock (Claude, Titan, Llama, Mistral).
Coverage matrix
| Defence layer | KB PDF page image | ActionGroup Lambda image return | Multi-agent sub-agent image result | Inline session-state image |
|---|---|---|---|---|
| Bedrock Guardrails (text filters) | No — image bytes not inspected | No | No | No |
| Bedrock Guardrails (grounding check) | Partial — checks factual grounding of text output, not image input | No | No | No |
| S3 bucket ACL / IAM policy | Prevents unauthorised access, not content inspection | No | No | No |
| Amazon Macie (S3 data classification) | Classifies sensitive data types (PII, credentials) — not adversarial image payloads | No | No | No |
| Glyphward scan at KB retrieval + ActionGroup return | Yes — scan before FM reasoning step | Yes — scan in Lambda handler | Yes — scan sub-agent result before aggregation | Yes — scan session-state images at entry |
Related questions
How is this page different from the existing AWS Bedrock page?
The Bedrock API page covers the direct converse() and invokeModel() calls where your application controls every image that enters the request. Bedrock Agents is fundamentally different: the agent runtime autonomously decides what to retrieve and what tools to call. Your application code does not see the retrieved images before they enter the foundation model's context. This page covers the hooks (Knowledge Base retrieval wrapper, ActionGroup Lambda pattern) that are specific to the Agents runtime and not needed for the basic Bedrock API.
Does Bedrock Guardrails cover image inputs at all?
As of 2026, Bedrock Guardrails supports content filtering for text, and can be configured to block specific topics or deny lists of phrases in model outputs. Guardrails does not decode, resize, or run classifiers on image bytes. Amazon has published guidance on using Guardrails with Claude's vision input, but Guardrails' image-handling is limited to content policy (e.g. blocking sexual content) rather than adversarial pixel-level injection detection. Glyphward is complementary to Guardrails — Guardrails handles policy compliance on text, Glyphward handles adversarial image injection at the input layer.
Which Bedrock foundation models support multimodal Knowledge Base retrieval?
Bedrock Knowledge Bases with multimodal retrieval is supported for Claude 3 models (Haiku, Sonnet, Opus) and Amazon Nova (Pro, Lite). The multimodal retrieval capability was introduced in late 2024 and is enabled per-Knowledge-Base during configuration. If your Knowledge Base is configured for multimodal retrieval, the agent runtime will pass rendered PDF page images to the vision model. If your Knowledge Base is text-only, the image attack surface is limited to ActionGroup return values and session-state injection.
What about the Bedrock Agent's session state and memory features?
Bedrock Agents supports session memory (via Amazon MemoryStore) and inline session state variables. If your agent stores images in session state between turns (e.g. a screensharing assistant that saves the last screenshot to memory), those stored images re-enter the agent's context on every subsequent turn. An adversarial image that was scanned and passed on turn 1 may have been crafted to activate only when seen in context with a specific follow-up instruction on turn N. We recommend scanning images at both the initial ingestion step and on every retrieval from session memory to prevent delayed-activation injection.
Does the scan add significant latency to Bedrock Agent invocations?
Glyphward's scan API returns in under 200ms for images up to 4 MB. Bedrock Agent invocations typically involve 2–4 seconds of latency for the Knowledge Base retrieval and FM reasoning step, so the scan adds less than 10% to total agent latency. For ActionGroup Lambdas with tight timeouts, set the Glyphward request timeout to 5 seconds and configure fail-closed behavior (redact the image if the scan times out) rather than fail-open. The Pro tier ($29/mo) includes a 99.9% uptime SLA on the scan API.
Further reading
- Prompt-injection scanner for AWS Bedrock API — direct
converse()andinvokeModel()calls. - Multimodal prompt injection in agentic RAG pipelines — LangGraph, LlamaIndex, AutoGen patterns.
- Prompt-injection scanner for RAG pipelines — simple one-hop retrieval.
- Real-time vs batch scanning — architecture guide for latency-sensitive Bedrock deployments.
- Multimodal LLM security API — Glyphward API overview.