Prompt injection scanner for Azure Functions AI workloads

Azure Functions is the primary serverless compute platform for AI image processing workloads in the Microsoft Azure ecosystem: Blob Storage–triggered document analysis, HTTP-triggered functions with direct image upload, Azure OpenAI Service multi-modal GPT-4o invocations, and Document Intelligence (formerly Form Recognizer) OCR pipelines. Azure provides enterprise-grade security at every layer: Entra ID (formerly Azure Active Directory) manages function authentication and managed identity; Azure Front Door WAF applies managed rule sets at the edge; Microsoft Defender for Cloud monitors function execution and storage access; Azure Key Vault stores secrets with hardware-backed encryption; VNet integration and Azure Private Endpoints keep function traffic on private network paths.

None of these controls inspect image pixel content for embedded prompt injection payloads. Entra ID validates caller identity, not image content. Azure WAF managed rule sets detect OWASP web application attacks in HTTP request bodies (SQL injection, XSS, RFI), not image pixel content. Microsoft Defender for Cloud monitors API call patterns, file access events, and anomalous network traffic, not image pixel entropy or injection payload structure.

An adversarially crafted image submitted to a Blob Storage container, uploaded to an HTTP-triggered function, or passed as a base64 input to an Azure OpenAI GPT-4o call reaches the vision model with its injection payload intact. Glyphward provides the pre-VLM scan gate that closes this platform gap within Azure Functions' own execution context.

TL;DR

Azure Functions AI pipelines have no platform-layer multimodal prompt injection (PI) scanning: Entra ID, WAF, Defender for Cloud, and Document Intelligence pre-processing all operate at layers above image pixel content. Add POST https://glyphward.com/v1/scan as the first call in every function that passes images to Azure OpenAI or Document Intelligence. Store the Glyphward key in Azure Key Vault and retrieve it via managed identity. Reject images with score >= 65. Fail closed: if the scan API is unavailable, return HTTP 503 rather than proceeding to the VLM call. The free tier allows 10 scans/day, no card required.

The four multimodal attack surfaces in Azure Functions AI pipelines

1. Blob Storage–triggered Azure Functions — adversarial images in automated document processing pipelines. The canonical Azure pattern for AI document processing is Blob Storage event binding: a document uploaded to an Azure Blob container triggers a Blob-triggered function, which reads the blob, extracts page images, and passes them to Azure OpenAI or Document Intelligence for analysis. Microsoft Defender for Storage can scan uploaded files for malware signatures and sensitive data patterns; Azure Blob lifecycle management policies control retention; SAS tokens and Entra ID object-level RBAC control upload access — none of these controls inspect image pixel content for embedded prompt injection payloads. An adversarially crafted image uploaded by an external user (a customer submitting an invoice, a supplier uploading a certificate, a partner delivering a contract) to a monitored Blob container triggers the Azure Function, which reads the blob bytes and passes them to the VLM. Defender for Storage will classify the image as a standard JPEG with no malware signature — because the adversarial payload is not malware code but structured pixel content that affects VLM interpretation. The function's Entra managed identity controls which Azure resources the function can access, not which image content it will process.

2. HTTP-triggered Azure Functions with Azure Front Door WAF — adversarial images bypassing WAF managed rule sets. Azure Functions deployed behind Azure Front Door use WAF policies with Microsoft-managed rule sets to inspect incoming HTTP traffic. The Microsoft managed rule set (DRS 2.1) covers OWASP CRS rules for web application attacks: SQL injection, XSS, protocol attacks, and path traversal — all text-layer attack patterns in HTTP request headers and bodies. Image content submitted as a multipart form upload or base64-encoded JSON body within the WAF-inspected request passes through WAF inspection without pixel-level analysis: WAF validates that the content type is image/jpeg, checks byte limits on the request body, and applies text-pattern rules to the surrounding JSON structure. Adversarial pixel content within the image bytes does not match any WAF managed rule and passes through to the Azure Function handler. The function's HTTP trigger binding extracts the image from the request, and the image is passed to Azure OpenAI with the WAF-approved request metadata that provides no indication of adversarial content.

3. Azure OpenAI Service multi-modal GPT-4o calls: adversarial images in managed Azure AI deployments. Azure Functions that invoke Azure OpenAI Service (AOAI) GPT-4o or GPT-4o-mini multi-modal endpoints pass images as base64-encoded content blocks within the chat completion request. Azure OpenAI Service provides content filtering on both inputs and outputs, configurable per deployment. These filters classify content into harm categories (hate, violence, sexual content, and self-harm) in the prompt and the model-generated response; for image inputs they check for harmful visual content such as violence or explicit material. They do not inspect image pixel content for embedded prompt injection instructions. Azure AI Content Safety Prompt Shields (a separate service) provides text-based prompt injection detection for the text portions of multi-modal requests; it does not analyse image pixel content. An adversarially crafted image passed to a GPT-4o multi-modal deployment via an Azure Function therefore reaches the model with its injection payload unexamined. Azure OpenAI's responsible AI layer is a different concern from the adversarial input layer: the former addresses harmful content categories; the latter addresses adversarial manipulation of model behaviour through inputs.

4. Event Grid–orchestrated multi-function pipelines — adversarial images propagating through chained function invocations. Azure Event Grid routes events between Azure services and functions, enabling multi-stage AI pipelines where images move through a chain of Azure Functions: a pre-processing function resizes and validates the image, publishes an Event Grid event; an analysis function subscribes to the event, invokes Document Intelligence for OCR, publishes another event; a post-processing function subscribes, calls Azure OpenAI with the extracted text and original image, writes results to Cosmos DB. Each function in the chain is an independently deployed Azure Function with its own Entra managed identity and Key Vault references. The Event Grid subscription schema validates event structure (event type, subject, data schema) — not the image content within the event data payload. An adversarial image that passes the pre-processing function's file type validation enters the Event Grid pipeline and propagates through every subsequent function in the chain, reaching Document Intelligence and Azure OpenAI with the adversarial pixel payload intact. The Event Grid dead-letter destination handles events that fail delivery — not events whose content causes adversarial VLM behaviour within the receiving function.
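One way the first subscriber in such a chain might stop propagation is to publish a rejection event instead of forwarding the image. The sketch below only builds that event. The event type Glyphward.ImageRejected is the one this article recommends; the helper name build_rejection_event, the subject layout, and the data fields are assumptions made for illustration. The envelope keys follow the Event Grid event schema, and publishing is left to whichever Event Grid client the pipeline already uses.

```python
import uuid
from datetime import datetime, timezone

REJECT_THRESHOLD = 65  # same threshold used elsewhere in this article


def build_rejection_event(blob_path: str, scan: dict) -> dict:
    """Build an Event Grid-schema event for a rejected image (hypothetical helper)."""
    return {
        "id": str(uuid.uuid4()),
        "eventType": "Glyphward.ImageRejected",
        "subject": f"/images/{blob_path}",  # assumed subject layout
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "dataVersion": "1.0",
        "data": {  # assumed payload fields
            "scan_id": scan["scan_id"],
            "score": scan["score"],
            "threshold": REJECT_THRESHOLD,
        },
    }


def should_forward(scan: dict) -> bool:
    """True only when the image may continue to the next function in the chain."""
    return scan["score"] < REJECT_THRESHOLD
```

A subscriber would call should_forward(scan) after scanning and publish build_rejection_event(...) when it returns False, so later stages never receive the image.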

Integration: Glyphward pre-scan gate in Azure Functions (Python)

import azure.functions as func
import base64
import json
import logging
import os
import requests
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from azure.storage.blob import BlobServiceClient
from openai import AzureOpenAI

GLYPHWARD_THRESHOLD = 65
KEY_VAULT_URL = os.environ["KEY_VAULT_URL"]  # e.g. https://myvault.vault.azure.net/

def get_glyphward_key() -> str:
    """Retrieve Glyphward API key from Azure Key Vault via managed identity."""
    credential = DefaultAzureCredential()
    client = SecretClient(vault_url=KEY_VAULT_URL, credential=credential)
    return client.get_secret("glyphward-api-key").value

def scan_image(image_bytes: bytes, glyphward_key: str) -> dict:
    """Scan image bytes with Glyphward. Raises on scan API failure (fail-closed)."""
    encoded = base64.b64encode(image_bytes).decode()
    response = requests.post(
        "https://glyphward.com/v1/scan",
        headers={"Authorization": f"Bearer {glyphward_key}"},
        json={"image": encoded},
        timeout=5,
    )
    response.raise_for_status()
    return response.json()

app = func.FunctionApp(http_auth_level=func.AuthLevel.FUNCTION)

@app.blob_trigger(arg_name="blob", path="uploads/{name}", connection="AzureWebJobsStorage")
def process_document_blob(blob: func.InputStream):
    """Blob-triggered function with Glyphward pre-scan gate before Azure OpenAI call."""
    glyphward_key = get_glyphward_key()
    image_bytes = blob.read()

    # Glyphward pre-scan gate
    try:
        scan = scan_image(image_bytes, glyphward_key)
    except Exception as e:
        logging.error(f"Glyphward scan failed: {e}")
        # Fail-closed: raise so the Functions runtime retries the blob; after
        # repeated failures the blob trigger writes it to its poison queue
        raise RuntimeError("Scan unavailable — aborting VLM call") from e

    if scan["score"] >= GLYPHWARD_THRESHOLD:
        logging.warning(f"Adversarial image rejected: score={scan['score']} scan_id={scan['scan_id']}")
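        # Copy the blob to the quarantine container (the original upload is not deleted).
        # Note: blob.name includes the source container prefix, e.g. "uploads/<name>".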
        blob_service = BlobServiceClient.from_connection_string(os.environ["AzureWebJobsStorage"])
        blob_service.get_blob_client("quarantine", blob.name).upload_blob(image_bytes, overwrite=True)
        return  # Do not invoke Azure OpenAI

    # Azure OpenAI GPT-4o multi-modal call — only reached by non-adversarial images
    aoai_client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_version="2025-04-01-preview",
        azure_ad_token_provider=lambda: DefaultAzureCredential().get_token(
            "https://cognitiveservices.azure.com/.default"
        ).token,
    )

    encoded = base64.b64encode(image_bytes).decode()
    completion = aoai_client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_DEPLOYMENT"],
        max_tokens=512,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{encoded}"}},
                    {"type": "text", "text": "Extract the key information from this document."},
                ],
            }
        ],
    )

    logging.info(f"Document processed: scan_id={scan['scan_id']} result_length={len(completion.choices[0].message.content)}")

Retrieve the Glyphward key from Azure Key Vault using DefaultAzureCredential and a managed identity with the Key Vault Secrets User role assignment on the vault; never store the key in plaintext in application settings or environment variables. For HTTP-triggered functions returning results to callers, return HTTP 400 with the scan rejection reason (do not raise an exception, which would cause a 500 response). For Blob-triggered and Event Grid–triggered functions, the example above quarantines a rejected image and returns. If you would rather surface rejections through the platform's failure path, raise an exception on adversarial image detection instead: Event Grid retries delivery and then routes the event to the configured dead-letter storage, and the Blob trigger retries and then writes a message to its poison queue. Add the quarantine blob copy before raising so the image is preserved for security analysis. For Event Grid–orchestrated pipelines, add the scan gate in the first function subscriber and publish a Glyphward.ImageRejected Event Grid event on rejection so downstream security monitoring functions can react.
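For the HTTP-triggered case, the status-code guidance can be sketched as a small, framework-free decision function. This is a minimal illustration, not the Glyphward SDK: gate_http_upload and the JSON body shapes are assumptions, and scan_fn stands in for any callable that returns a dict with a score (for example a wrapper around the scan_image function above).

```python
import json

SCORE_THRESHOLD = 65  # reject at or above this score


def gate_http_upload(image_bytes: bytes, scan_fn) -> tuple[int, str]:
    """Return (status_code, json_body); 200 means the caller may proceed to the VLM."""
    try:
        scan = scan_fn(image_bytes)
        score = scan["score"]
    except Exception:
        # Fail-closed: a timeout, HTTP error, or malformed reply blocks the VLM call.
        return 503, json.dumps({"error": "scan_unavailable"})

    if score >= SCORE_THRESHOLD:
        # Rejection is a client-side problem, so answer 400 rather than raising.
        body = {"error": "image_rejected", "scan_id": scan.get("scan_id"), "score": score}
        return 400, json.dumps(body)

    return 200, json.dumps({"scan_id": scan.get("scan_id"), "score": score})
```

Inside an HTTP-triggered function, the returned pair could be passed to func.HttpResponse(body, status_code=status, mimetype="application/json") whenever the status is not 200.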

Coverage matrix

| Azure security control | Blob-triggered function pipeline | HTTP-triggered function (WAF) | Azure OpenAI GPT-4o multi-modal | Event Grid multi-function chain |
| --- | --- | --- | --- | --- |
| Entra ID managed identity and RBAC | Controls which identities can access Azure resources; does not inspect image content | Controls function caller authentication; does not inspect image pixel content | Controls AOAI deployment access; does not inspect image payload content | Controls Event Grid publisher/subscriber identities; does not inspect event data image content |
| Azure Front Door WAF managed rules | Not applicable to Blob trigger path | Inspects HTTP text patterns (OWASP CRS); does not inspect image pixel content | Not applicable to internal AOAI API calls from function VNet | Not applicable to Event Grid internal routing |
| Azure OpenAI content filters | Not applicable at Blob trigger layer | Not applicable at WAF layer | Filters text output content categories (hate, violence); does not inspect image pixel PI payloads | Applies only at AOAI call; does not inspect images at pre-processing function stages |
| Microsoft Defender for Storage | Scans for malware signatures and sensitive data; does not detect pixel-level PI payloads | Not applicable to HTTP function trigger | Not applicable to AOAI image payload | Not applicable to Event Grid in-flight data |
| Glyphward pre-VLM scan gate (multimodal PI detection) | Yes: scans blob bytes before Azure OpenAI call; routes adversarial images to quarantine | Yes: scans image bytes at function entry; returns HTTP 400 on adversarial detection | Yes: scans base64 image before AOAI multi-modal call; prevents adversarial payload reaching GPT-4o | Yes: scan gate in first subscriber terminates adversarial event chains before pre-processing |

Related questions

Does Azure AI Content Safety Prompt Shields cover multimodal prompt injection?

Azure AI Content Safety Prompt Shields detects prompt injection attacks in text content — it analyses the text portions of a request for embedded instructions that attempt to override system prompts or redirect model behaviour. Prompt Shields supports a user prompt analysis mode and a document analysis mode (for RAG-retrieved text documents). Neither mode analyses image pixel content for embedded adversarial payloads. In a multi-modal Azure OpenAI request where the content array contains both an image block and a text block, Prompt Shields analyses the text block for injection patterns; the image block is passed to the model without pixel-level injection analysis. Glyphward and Prompt Shields are complementary controls: Prompt Shields covers text-layer injection in Azure AI workloads; Glyphward covers image-layer injection in the same workloads. Both should be applied in multi-modal pipelines where the image source is not fully trusted. See the comparison of multimodal PI scanners for a full coverage breakdown.
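To apply both controls to one request, the content array has to be split so that text goes to the text-layer check and images go to the image-layer check. The following is a rough sketch of that routing step only, assuming the chat-completions content block shape used earlier in this article; split_content_blocks is a hypothetical helper, and the actual calls to Prompt Shields and Glyphward are deliberately left out.

```python
import base64


def split_content_blocks(content: list[dict]) -> tuple[list[str], list[bytes]]:
    """Separate a multi-modal content array into text parts and decoded image bytes."""
    texts: list[str] = []
    images: list[bytes] = []
    for block in content:
        kind = block.get("type")
        if kind == "text":
            texts.append(block["text"])
        elif kind == "image_url":
            url = block["image_url"]["url"]
            if not (url.startswith("data:") and ";base64," in url):
                # Refuse rather than skip: an unscanned remote image would bypass the gate.
                raise ValueError("only base64 data URLs are supported in this sketch")
            images.append(base64.b64decode(url.split(";base64,", 1)[1]))
        else:
            raise ValueError(f"unexpected content block type: {kind!r}")
    return texts, images
```

The texts list would then be sent to the text-layer injection check and each entry in images to the image scan, with the model call made only if both pass.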

How does the Glyphward scan gate integrate with Azure Durable Functions orchestration?

Azure Durable Functions orchestrate long-running workflows through orchestrator and activity functions. The Glyphward scan gate belongs in a dedicated activity function, ScanImageActivity, that is called from the orchestrator as the first activity in the workflow. The activity function makes the POST /v1/scan call and returns the scan result to the orchestrator. If the scan result indicates an adversarial image (score >= 65), the orchestrator records the rejection with context.set_custom_status() for monitoring and returns early without scheduling any downstream activities; an external client can also stop a running instance through the Durable Functions client's terminate API. This pattern keeps the scan gate within Durable Functions' at-least-once execution model: if ScanImageActivity fails transiently and was scheduled with a retry policy (call_activity_with_retry), Durable Functions retries it automatically before the orchestrator advances to downstream activity functions. Store the Glyphward API key in Azure Key Vault and retrieve it in the activity function via managed identity; the orchestrator should not handle secrets directly. For fan-out patterns where multiple images are processed in parallel, apply the scan gate as a parallel activity for each image before the analysis fan-in step.
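The fan-out variant of this pattern might look roughly like the generator below. It is a sketch under assumptions: it uses only orchestration-context methods from the Durable Functions Python programming model (get_input, call_activity, task_all, set_custom_status), omits the trigger registration, and returns early on rejection rather than terminating the instance. ScanImageActivity is the activity named in this article; AnalyseImagesActivity and the input shape (a list of blob paths) are hypothetical.

```python
SCORE_THRESHOLD = 65  # reject at or above this score


def image_pipeline_orchestrator(context):
    """Scan every image in parallel; analyse only if none is rejected."""
    image_refs = context.get_input()  # assumed: list of blob paths

    # Fan out: schedule one scan activity per image, then wait for all of them.
    scan_tasks = [context.call_activity("ScanImageActivity", ref) for ref in image_refs]
    scans = yield context.task_all(scan_tasks)

    rejected = [s for s in scans if s["score"] >= SCORE_THRESHOLD]
    if rejected:
        # Expose the rejection to monitoring and stop before any downstream activity.
        context.set_custom_status(
            {"state": "rejected", "scan_ids": [s["scan_id"] for s in rejected]}
        )
        return {"analysed": False, "rejected": len(rejected)}

    # Fan in: only cleared images reach the (hypothetical) analysis activity.
    analysis = yield context.call_activity("AnalyseImagesActivity", image_refs)
    return {"analysed": True, "result": analysis}
```

To get automatic retries on transient scan failures, the scan tasks would be scheduled with call_activity_with_retry and a retry policy instead of call_activity.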

Can I use Azure Policy to enforce Glyphward scan gate implementation across my function apps?

Azure Policy enforces governance rules on Azure resource properties — SKU restrictions, tag requirements, network configurations, diagnostic settings — not on application code logic within function apps. Azure Policy cannot directly enforce that every Azure Function that calls Azure OpenAI also calls the Glyphward scan gate. However, you can use complementary approaches: (1) Azure Policy deny effects to require specific application settings (like GLYPHWARD_ENABLED=true) on function app resources, creating a soft enforcement signal; (2) a custom Azure Policy combined with Azure Monitor alerts to detect Azure OpenAI API calls from function apps that do not have a matching Glyphward scan call in the same invocation log window (detectable from Application Insights telemetry if the scan gate logs the scan ID alongside the AOAI call); (3) a shared base class or middleware pattern in your internal function app template that makes the Glyphward scan gate non-optional for any function that processes images. For organisations with a centralised platform engineering team, option 3 — embedding the scan gate in the internal image processing SDK — is the most reliable enforcement mechanism.
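Option 3 could be realised as a decorator shipped in the internal SDK, so that an image handler cannot run without a cleared scan. The sketch below is one possible shape, not an existing Glyphward or Azure API: requires_scan, ImageRejected, and the convention that the handler receives the scan result as a keyword argument are all assumptions.

```python
import functools

SCORE_THRESHOLD = 65  # reject at or above this score


class ImageRejected(Exception):
    """Raised when the scan score meets or exceeds the threshold."""


def requires_scan(scan_fn):
    """Decorator factory: the wrapped handler runs only after scan_fn clears the image."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(image_bytes: bytes, *args, **kwargs):
            # Any exception from scan_fn propagates, so the handler fails closed.
            scan = scan_fn(image_bytes)
            if scan["score"] >= SCORE_THRESHOLD:
                raise ImageRejected(f"scan_id={scan.get('scan_id')} score={scan['score']}")
            # Pass the scan result through so the handler can log the scan ID.
            return handler(image_bytes, *args, scan=scan, **kwargs)
        return wrapper
    return decorator
```

A team template would then expose only decorated handlers, for example @requires_scan(my_scan_client) above def analyse(image_bytes, scan=None), where my_scan_client is whatever scan callable the platform team provides.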

What is the latency impact of adding the Glyphward scan call to an Azure Function?

The Glyphward POST /v1/scan API call adds 1–2 seconds of wall-clock latency to an Azure Function invocation for standard image sizes (up to 4096×4096 pixels). For HTTP-triggered functions serving synchronous API requests, this additional latency should be evaluated against the downstream VLM call latency: Azure OpenAI GPT-4o multi-modal calls typically take 3–8 seconds for standard image analysis tasks, so the scan adds between roughly 12% (1 s on an 8 s call) and 67% (2 s on a 3 s call) on top of the VLM latency. For latency-sensitive workloads where this is unacceptable, consider: (1) running the scan in parallel with lightweight pre-processing steps (image resize, format conversion) rather than sequentially, using asyncio.gather(); (2) using the Glyphward async scan endpoint if your function processes images asynchronously (Blob-triggered, Event Grid–triggered); (3) caching scan results for identical images by content hash, so that if the same image is submitted multiple times within a session the cached scan result is returned rather than re-scanning. Blob-triggered and Event Grid–triggered functions are asynchronous and have no synchronous latency budget, so the 1–2 second scan overhead is generally not a concern for those patterns.

Further reading