FedRAMP AI security: prompt injection controls for multimodal federal systems

FedRAMP Moderate and High authorizations require cloud service providers to implement NIST SP 800-53 Rev 5 controls across every attack surface their system exposes, including AI inference inputs. When a federal AI system processes claimant-submitted ID documents, contractor field photos, or agency-internal scanned forms through a vision-language model, the image byte stream is an input channel that standard text-only prompt-injection scanners do not inspect. Adversarial text embedded in image pixels, rendered so that a human reviewer cannot see it but a vision model can read it, can redirect model behavior, suppress output, exfiltrate retrieved context, or cause the system to produce fraudulent determinations. Five NIST 800-53 Rev 5 control areas (SI-10, SA-11, RA-5, SI-3, and AU-2/AU-12) bear directly on multimodal prompt-injection detection, and a system that lacks such detection has a gap that a 3PAO can find when it reviews SA-11 testing evidence.

TL;DR

Call POST https://api.glyphward.com/v1/scan with the base64-encoded image bytes before passing the image to your vision model. The response includes a 0–100 score, a decision of allow or block, and a scan_id that becomes your per-transaction audit evidence for SI-10 and AU-12. Block at a score of 70 or above for FedRAMP Moderate systems and 60 or above for High-impact systems. Log the scan_id and the image SHA-256 to your SIEM as continuous monitoring evidence. The free tier covers 1,000 scans per month; get early access to start building before your next 3PAO engagement.

Relevant NIST 800-53 Rev 5 controls for AI image inputs

FedRAMP does not yet publish a dedicated AI security baseline overlay, but the Joint Authorization Board (JAB) has confirmed that existing controls apply to AI system components. The five controls below have direct, documentable relationships to multimodal prompt-injection risk in vision AI systems.

SI-10 — Information Input Validation

SI-10 requires the system to check the validity of information inputs. The control discussion describes verifying that inputs match specified definitions for format and content, and nothing in it limits that check to text fields. For an AI system that routes uploaded images to a vision-language model, the image byte stream is an information input. A multimodal prompt-injection payload (adversarial text rendered in the image that instructs the model to alter its behavior) is an invalid input under SI-10: it is not a legitimate documentary image; it is an instruction masquerading as one. Two enhancements are relevant. SI-10(3), Predictable Behavior, requires the system to behave in a predictable and documented manner when it receives invalid input, and SI-10(6), Injection Prevention, addresses untrusted data injection directly. For a microservices AI pipeline, both point to validation at the inference gateway, not only at the web application firewall. A pre-inference scan that returns a block decision before the image reaches the model is the SI-10 control implementation for multimodal AI inputs.

Assessors look for documented validation logic and evidence that it fires on every image input. The scan_id per-image log entry is that evidence.
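
One way to back the "fires on every image input" claim is a coverage check that compares the images the model actually received against the scan log. The sketch below is a minimal illustration, not part of any Glyphward tooling; the function name and the sample records are hypothetical, and it assumes both logs carry the image SHA-256.

```python
from typing import Iterable


def find_unscanned_images(
    inference_hashes: Iterable[str],
    scan_log_entries: Iterable[dict],
) -> set[str]:
    """Return hashes of images that reached the model without an 'allow' scan entry."""
    allowed = {
        entry["image_sha256"]
        for entry in scan_log_entries
        if entry.get("decision") == "allow"
    }
    return set(inference_hashes) - allowed


# Hypothetical sample data for illustration only.
scan_log = [
    {"scan_id": "scan-001", "image_sha256": "aaa", "decision": "allow"},
    {"scan_id": "scan-002", "image_sha256": "bbb", "decision": "block"},
]
model_inputs = ["aaa", "bbb", "ccc"]

gaps = find_unscanned_images(model_inputs, scan_log)
print(sorted(gaps))  # ['bbb', 'ccc']
```

Here "bbb" is reported because it was blocked yet still reached the model, and "ccc" because it was never scanned. An empty result over a reporting period is the condition an assessor would expect to see.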

SA-11 — Developer Testing and Evaluation

SA-11 requires developers to perform security testing and evaluation of the system and to produce evidence of that testing. Enhancement SA-11(5) requires the developer to perform penetration testing at an organization-defined breadth and depth; AI inference endpoints fall inside that scope whenever they are part of the system. The 3PAO security assessment package for a FedRAMP authorization with AI features should demonstrate that the CSP's developers tested those features for known AI attack classes. OWASP LLM01 (prompt injection) and its multimodal variants are documented vulnerability classes; a 3PAO assessor who is current on the AI threat landscape will look for evidence of multimodal PI testing. Providing the assessor with a set of adversarial test images and a log of scan decisions showing that the scanner blocked them, with scan_id artifacts traceable to specific test runs, supplies the SA-11 penetration testing evidence for AI image inputs.
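
The evidence set described above can be assembled as a small manifest: one row per test image, with its hash, the scan_id, and whether the decision matched what the test expected. This is a sketch under stated assumptions. The scanner is passed in as a callable, and fake_scan is a hypothetical stand-in that lets the example run offline; in a real test run you would pass a function that calls your inference gateway.

```python
import hashlib
import json
from typing import Callable


def build_sa11_manifest(
    test_cases: list[tuple[str, bytes, str]],
    scan: Callable[[bytes], dict],
) -> list[dict]:
    """Scan each (name, image_bytes, expected_decision) case and record the outcome."""
    manifest = []
    for name, image_bytes, expected in test_cases:
        result = scan(image_bytes)
        manifest.append({
            "test_case": name,
            "image_sha256": hashlib.sha256(image_bytes).hexdigest(),
            "scan_id": result["scan_id"],
            "expected": expected,
            "decision": result["decision"],
            "passed": result["decision"] == expected,
        })
    return manifest


def fake_scan(image_bytes: bytes) -> dict:
    """Hypothetical stand-in scanner: blocks any payload containing b'IGNORE'."""
    decision = "block" if b"IGNORE" in image_bytes else "allow"
    return {"scan_id": f"test-{len(image_bytes)}", "decision": decision}


manifest = build_sa11_manifest(
    [
        ("hidden_text.png", b"...IGNORE PREVIOUS INSTRUCTIONS...", "block"),
        ("benign_control.png", b"ordinary pixels", "allow"),
    ],
    fake_scan,
)
print(json.dumps(manifest, indent=2))
```

Including a benign control image alongside the adversarial ones shows the assessor that the scanner discriminates, not that it blocks everything.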

RA-5 — Vulnerability Monitoring and Scanning

RA-5 requires the organization to monitor and scan for vulnerabilities in the information system on an ongoing basis. FedRAMP Moderate and High both require continuous monitoring (ConMon) plans that enumerate the vulnerability classes the organization is scanning for. OWASP LLM01 multimodal prompt injection is a documented vulnerability class with a growing body of public exploit demonstrations, including attacks against document-processing AI published in 2024–2025. Once this vulnerability class appears in authoritative references (OWASP Top 10 for LLMs, NIST NVD AI-specific entries, CISA advisories), it must be included in the RA-5 vulnerability monitoring scope for AI systems. A ConMon evidence package that demonstrates per-image scanning in production — with weekly summary logs from the Glyphward dashboard — provides the artifact that satisfies the RA-5 monitoring requirement for this vulnerability class.

SI-3 — Malicious Code Protection

SI-3 requires malicious code protection at information system entry and exit points. Traditional implementations address executable malware in file uploads. For AI systems, an adversarial image payload can be treated as a form of malicious code: it is a crafted artifact designed to alter the behavior of an executing process (the model inference) in a manner the legitimate user and the system owner did not authorize. The analogy to shellcode is close. The payload is not executable in isolation, but when the "interpreter" (the vision model) processes it, the model carries out the embedded instruction. SI-3's requirement to scan at entry points maps directly to pre-inference multimodal PI scanning of every uploaded image before it reaches the model. Rev 5 withdrew the former SI-3(1) (central management) and SI-3(2) (automatic updates) enhancements and folded them into PL-9 and the base SI-3 control, which still requires protection mechanisms to be updated as new releases become available. An API-based scanner that Glyphward updates as new attack patterns emerge addresses that requirement, so the CSP does not need to maintain adversarial pattern libraries in-house.

AU-2 / AU-12 — Audit Events and Audit Record Generation

AU-2 requires the organization to determine which events the system must audit. AU-12 requires the system to generate audit records for those events. For FedRAMP Moderate and High systems, every significant data-processing event must produce an auditable record. A vision AI system processing a claimant's uploaded document image is performing a significant data-processing event that affects the claimant's benefit determination. Each such event must appear in the audit log. The scan step yields four audit-relevant fields per image: scan_id (returned by the API, unique per scan), image_sha256 (a content-addressable identifier of the specific image, computed by the caller), decision (allow/block), and timestamp (recorded by the caller when the scan completes). Logging these four fields to your SIEM alongside the application-layer transaction identifier provides the per-transaction audit trail AU-12 requires for AI inference events: a trail that shows which image was processed, when, by which system component, and what security decision was made about it.
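
Joining the four scan fields to the application-layer transaction identifier can be as simple as emitting one JSON line per image. The sketch below is an assumption about shape, not a prescribed schema: build_audit_entry, transaction_id, and component are hypothetical names chosen for illustration.

```python
import json

REQUIRED_SCAN_FIELDS = ("scan_id", "image_sha256", "decision", "timestamp")


def build_audit_entry(scan_record: dict, transaction_id: str, component: str) -> str:
    """Merge the four scan fields with application context into one JSON log line."""
    missing = [name for name in REQUIRED_SCAN_FIELDS if not scan_record.get(name)]
    if missing:
        # Refuse to write a partial record: an incomplete entry is weak audit evidence.
        raise ValueError(f"scan record missing fields: {missing}")
    entry = {name: scan_record[name] for name in REQUIRED_SCAN_FIELDS}
    entry["transaction_id"] = transaction_id
    entry["component"] = component
    return json.dumps(entry, sort_keys=True)


# Hypothetical values for illustration only.
line = build_audit_entry(
    {
        "scan_id": "scan-001",
        "image_sha256": "ab" * 32,
        "decision": "allow",
        "timestamp": "2025-01-06T12:00:00Z",
    },
    transaction_id="txn-42",
    component="inference-gateway",
)
print(line)
```

Sorting the keys keeps the line format stable, which makes SIEM field extraction and later diffing easier.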

AI attack surfaces in federal deployments

The federal government's expansion of AI into citizen-facing and operational systems has created several distinct image-processing attack surfaces. Each has characteristics that affect the threat model and the priority level for multimodal PI controls.

Benefits processing: SSA, VA, HHS

AI systems at the Social Security Administration, Department of Veterans Affairs, and Department of Health and Human Services increasingly process claimant-submitted document images — identity documents, medical records, supporting evidence for disability claims, provider attestation forms. These images originate from members of the public who may or may not be the claimant of record. An adversarial claimant or third party can craft an ID scan or supporting document image containing hidden text that, when processed by the agency's vision AI, instructs the model to approve a claim, suppress a fraud flag, or return a determination the model's developers did not intend. The impact of a successful injection is an erroneous benefit determination affecting a federal entitlement program — a high-severity outcome that elevates the priority of this control relative to most commercial deployments.

Immigration and border: USCIS, CBP

U.S. Citizenship and Immigration Services processes millions of document images annually — petitions, supporting evidence, identity documents from applicants in jurisdictions where document fraud is a documented threat vector. An adversarial image payload submitted in a visa petition or naturalization application could, against a vulnerable vision AI, alter the AI's characterization of the document, suppress anomaly flags, or redirect model output in ways that affect an adjudication. CBP's image classification systems at ports of entry face a similar threat from cargo and traveler-submitted images. The FedRAMP authorization boundary for USCIS and CBP AI systems must account for this external-input attack surface explicitly.

Defense contractor pipelines: DoD NIPR

DoD AI vision systems on Non-classified Internet Protocol Router (NIPR) networks process imagery submitted by contractors: field photographs, technical diagrams, equipment condition reports, inspection images. Contractors are external parties to the DoD authorization boundary; their submitted images are untrusted inputs. A contractor whose subcontractor has been compromised, or a contractor acting adversarially, can embed instructions in a submitted image that affect downstream DoD AI processing. NIPR handles controlled unclassified information (CUI), including data that still carries the legacy For Official Use Only (FOUO) marking, so a successful injection attack against a DoD AI vision system that processes such data constitutes a CUI handling incident. Impact Level 2, which maps roughly to FedRAMP Moderate, applies to NIPR AI deployments that handle only public and non-CUI information; deployments that handle CUI fall under IL4 or above.

Grant management: NSF, NIH

The National Science Foundation and National Institutes of Health use AI systems to assist in processing grant proposals, including research diagrams, figures, and supplementary materials submitted as image attachments. A research submitter with adversarial intent — whether an applicant attempting to manipulate AI scoring or a nation-state actor targeting US research programs — can embed prompt-injection payloads in submitted figures. The vision AI that processes those figures to assist reviewers may return altered characterizations of the research if it processes an adversarial image. NSF and NIH grant systems often operate at the FedRAMP Moderate baseline.

Tax and revenue: IRS document AI

IRS AI document processing systems handle uploaded W-2 images, scanned tax forms, supporting schedules, and identity verification documents. Taxpayers who submit these images are external actors with direct financial motivation to alter the AI system's output. Adversarial image payloads in tax document uploads represent a revenue-integrity risk. The IRS's authorization boundary for its document-processing AI must include controls for the image-input channel — SI-10 input validation and SI-3 malicious artifact scanning at the inference gateway are the applicable controls.

Implementation pattern

import base64
import hashlib
import json
import logging
import time
import uuid
from dataclasses import asdict, dataclass
from typing import Literal

import httpx

GLYPHWARD_API_KEY = "gw_..."          # placeholder: load from Secrets Manager at runtime, never commit a key
GLYPHWARD_SCAN_URL = "https://api.glyphward.com/v1/scan"

# Threshold by FedRAMP impact level.
# High impact uses a lower (more sensitive) threshold because the cost of a
# missed injection in a High-impact system exceeds the cost of a false positive.
FEDRAMP_THRESHOLDS: dict[str, int] = {
    "Moderate": 70,
    "High":     60,
}

# Dedicated logger so the SIEM forwarder can route these events separately.
audit_logger = logging.getLogger("audit.ai.image")


@dataclass
class FederalScanRecord:
    """
    Per-image audit record.  Every field is written to the SIEM; scan_id,
    image_sha256, decision, and timestamp are the four that AU-12 audit
    record generation relies on for AI inference events.

    classification_context notes the FedRAMP impact level or system name.
    It MUST NOT contain actual CUI; use system identifiers only.
    """
    scan_id:                str
    image_sha256:           str
    score:                  int
    decision:               Literal["allow", "block"]
    timestamp:              str
    classification_context: str   # e.g. "FedRAMP-Moderate-BenefitsProcessing"


def log_to_audit_trail(record: FederalScanRecord) -> None:
    """
    Write the scan record to the system's authoritative audit log.

    In practice: publish to the CloudWatch Logs group / Splunk index /
    Azure Sentinel workspace that feeds the ConMon dashboard.  The
    scan_id ties back to the Glyphward dashboard for weekly anomaly review
    (RA-5 continuous monitoring evidence).
    """
    # One JSON object per line: the event name followed by every record field.
    audit_logger.info(json.dumps({"event": "multimodal_pi_scan", **asdict(record)}))


def scan_federal_image(
    image_bytes: bytes,
    impact_level: Literal["Moderate", "High"],
    system_identifier: str,
) -> FederalScanRecord:
    """
    Scan an image for multimodal prompt-injection before passing it to a
    vision-language model in a FedRAMP-authorized system.

    Implements SI-10 (input validation), SI-3 (malicious artifact detection),
    and AU-12 (audit record generation) for AI image inputs.

    Fail-closed: any API error or timeout results in a block decision.
    The image is never forwarded to the vision model unless this function
    returns a record with decision == "allow".

    Args:
        image_bytes:       Raw bytes of the image to scan.
        impact_level:      "Moderate" or "High"; sets the block threshold.
        system_identifier: Non-CUI system name for the audit log context
                           (e.g. "BenefitsProcessing-Prod").

    Returns:
        FederalScanRecord with scan_id, image_sha256, score, decision,
        timestamp, and classification_context.

    Raises:
        ValueError if image_bytes is empty or impact_level is unsupported.
    """
    if not image_bytes:
        raise ValueError("image_bytes must not be empty")
    if impact_level not in FEDRAMP_THRESHOLDS:
        # Reject unknown levels up front instead of raising a bare KeyError.
        raise ValueError(f"unsupported impact_level: {impact_level!r}")

    image_sha256 = hashlib.sha256(image_bytes).hexdigest()
    classification_context = f"FedRAMP-{impact_level}-{system_identifier}"
    threshold = FEDRAMP_THRESHOLDS[impact_level]

    try:
        response = httpx.post(
            GLYPHWARD_SCAN_URL,
            headers={
                "Authorization": f"Bearer {GLYPHWARD_API_KEY}",
                "Content-Type":  "application/json",
            },
            json={
                "image":  base64.b64encode(image_bytes).decode("ascii"),
                "source": classification_context,
            },
            timeout=5.0,   # fail-closed on timeout; do not stall the inference pipeline
        )
        response.raise_for_status()
        payload = response.json()
        score    = int(payload["score"])
        scan_id  = str(payload["scan_id"])
        decision: Literal["allow", "block"] = (
            "block" if score >= threshold else "allow"
        )

    except Exception as exc:
        # Fail-closed: network error, API error, malformed response, or timeout -> block.
        # Record the failure type so the SIEM alert fires instead of passing silently.
        audit_logger.warning("multimodal_pi_scan_failed: %s", type(exc).__name__)
        score    = 100
        scan_id  = f"ERR-{uuid.uuid4()}"
        decision = "block"

    record = FederalScanRecord(
        scan_id=scan_id,
        image_sha256=image_sha256,
        score=score,
        decision=decision,
        timestamp=time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        classification_context=classification_context,
    )
    log_to_audit_trail(record)
    return record


# ---------------------------------------------------------------------------
# Inference gateway — call this instead of calling the vision model directly
# ---------------------------------------------------------------------------

def process_document_image(
    image_bytes: bytes,
    impact_level: Literal["Moderate", "High"],
    system_id: str,
) -> dict:
    """
    SI-10 compliant entry point for document image processing.
    The vision model is only called when the scan decision is 'allow'.
    """
    record = scan_federal_image(image_bytes, impact_level, system_id)

    if record.decision == "block":
        # Return a structured rejection.  Do not surface score to the submitter
        # (avoids oracle attacks that let an adversary tune the payload).
        return {
            "status":  "rejected",
            "reason":  "Image failed security validation.",
            "scan_id": record.scan_id,   # for audit cross-reference only
        }

    # Safe to forward to vision model.
    return call_vision_model(image_bytes)


def call_vision_model(image_bytes: bytes) -> dict:
    """Placeholder for the actual vision model inference call."""
    raise NotImplementedError("Replace with your GPT-4V / Claude / Gemini call")

The critical invariant is that call_vision_model is never reached when record.decision == "block". The scan sits between the image ingestion handler and the model inference call — not as an async side-channel, not as a post-processing filter, but as a synchronous gate. The scan_id returned in the rejection response lets the operations team locate the full scan record in the audit log without exposing the score, which would let an attacker iterate toward a payload that falls below the threshold. For the same reason, the threshold value should not appear in any user-facing API response or error message.
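
The invariant is easy to pin down with a regression test. The sketch below restates the gate with the scanner and the model injected as callables so it runs without network access; gated_inference, recording_model, and the sample scan results are hypothetical names for illustration, not part of the listing above.

```python
from typing import Callable


def gated_inference(
    image_bytes: bytes,
    scan: Callable[[bytes], dict],
    model: Callable[[bytes], dict],
) -> dict:
    """Call the model only when the scan explicitly says 'allow'."""
    result = scan(image_bytes)
    if result["decision"] != "allow":
        # Any value other than 'allow' is treated as a block (fail-closed).
        # The score is deliberately left out of the rejection.
        return {
            "status": "rejected",
            "reason": "Image failed security validation.",
            "scan_id": result["scan_id"],
        }
    return model(image_bytes)


calls: list[bytes] = []


def recording_model(image_bytes: bytes) -> dict:
    """Test double that records every image it is asked to process."""
    calls.append(image_bytes)
    return {"status": "ok"}


blocked = gated_inference(
    b"adversarial",
    lambda _: {"scan_id": "s1", "decision": "block", "score": 95},
    recording_model,
)
assert calls == []               # the model was never reached
assert "score" not in blocked    # no oracle for the submitter

allowed = gated_inference(
    b"benign",
    lambda _: {"scan_id": "s2", "decision": "allow", "score": 3},
    recording_model,
)
assert calls == [b"benign"]
```

Comparing against "allow" instead of "block" is a design choice: an unexpected decision value then results in a rejection, not an inference call.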

The FedRAMP continuous monitoring package for this control should include: the scan log export (grouped by week), a count of block decisions per system, and any anomalous spikes in score distribution — available from the Glyphward dashboard and exportable as CSV for inclusion in the monthly ConMon report.
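
If you prefer to derive the weekly grouping and per-system block counts from your own audit log instead of the dashboard export, a few lines of aggregation are enough. This sketch assumes log entries shaped like the audit records above (timestamp in the gateway's UTC format, decision, classification_context); the function name and sample entries are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def weekly_block_summary(entries: list[dict]) -> dict:
    """Count scans and blocks per (ISO week, system context)."""
    summary = defaultdict(lambda: {"scans": 0, "blocks": 0})
    for entry in entries:
        ts = datetime.strptime(entry["timestamp"], TIMESTAMP_FORMAT)
        iso_year, iso_week, _ = ts.isocalendar()
        key = (f"{iso_year}-W{iso_week:02d}", entry["classification_context"])
        summary[key]["scans"] += 1
        if entry["decision"] == "block":
            summary[key]["blocks"] += 1
    return dict(summary)


# Hypothetical sample entries for illustration only.
system = "FedRAMP-Moderate-BenefitsProcessing-Prod"
sample = [
    {"timestamp": "2025-01-06T09:00:00Z", "decision": "allow", "classification_context": system},
    {"timestamp": "2025-01-07T10:30:00Z", "decision": "block", "classification_context": system},
    {"timestamp": "2025-01-14T08:15:00Z", "decision": "allow", "classification_context": system},
]

for (week, context), counts in sorted(weekly_block_summary(sample).items()):
    print(week, context, counts)
```

With the sample data, the first two entries fall in ISO week 2025-W02 (two scans, one block) and the third in 2025-W03.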

Coverage matrix

| Control | Claimant-submitted ID document | Agency-internal diagram/chart | Contractor-submitted field image | User-uploaded form scan |
| --- | --- | --- | --- | --- |
| SI-10 Input Validation | Pre-inference scan required; external origin = untrusted input | Lower risk (internal origin) but still an information input that SI-10 covers | Pre-inference scan required; contractor = external party | Pre-inference scan required; public-facing upload = untrusted |
| SI-3 Malicious Code Protection | Adversarial payload in claimant ID image is a malicious artifact at entry point | Reduced risk from internal origin; scan still satisfies SI-3 completeness requirement | Contractor-origin adversarial image is a malicious artifact; SI-3 entry-point scanning applies | Public upload is highest-risk SI-3 entry point; scanning mandatory |
| RA-5 Vulnerability Scanning | OWASP LLM01 multimodal PI is the documented vulnerability class; scan log is ConMon evidence | Internal images still within RA-5 continuous monitoring scope | External-origin images must be included in RA-5 monitoring scope | Highest-exposure RA-5 surface; anomaly spike detection required |
| SA-11 Penetration Testing | 3PAO must test with adversarial claimant ID images; scan_id is test artifact | Internal diagram surface should be included in SA-11 scope for completeness | 3PAO test scope must include contractor-submitted image channel | Primary SA-11 test surface for public-facing AI document systems |
| AU-12 Audit Record Generation | scan_id + image_sha256 + decision per document = per-transaction AU-12 record | Same audit fields required even for internal origin images | AU-12 record required; contractor identity logged at application layer, not in scan record | AU-12 record required; session token ties scan_id to authenticated user session |

Related questions

Does Glyphward have a FedRAMP authorization?

No. Glyphward does not currently hold a FedRAMP Provisional Authorization to Operate (P-ATO) or an agency ATO. Glyphward processes image bytes outside your authorization boundary. Before integrating Glyphward into a FedRAMP-authorized system, you must confirm with your Authorizing Official whether routing images through a non-FedRAMP API constitutes a boundary expansion that requires a significant change notification or a new ATO. As a practical guideline: Glyphward is appropriate for scanning Low-impact images where the image content itself is not CUI and the system's FedRAMP impact level permits external processing of that image category. CUI-containing images — such as claimant medical records, taxpayer financial documents, or law enforcement submissions — should remain within the authorization boundary and should not be routed through any external API, including Glyphward, without explicit AO approval and a boundary assessment. Contact us if you are evaluating on-premise deployment options that keep all image data within your authorization boundary.
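
The boundary guidance above can be enforced in code with a default-deny routing check in front of the scan call. This is a sketch only: the category names are hypothetical placeholders, and the actual allowlist and the approval flag must come from your Authorizing Official's decision, not from this example.

```python
# Hypothetical image categories that the AO has confirmed contain no CUI.
EXTERNAL_SCAN_ALLOWED = frozenset({"blank_form_template", "public_notice_scan"})


def may_route_externally(image_category: str, ao_approved: bool) -> bool:
    """Default-deny: external scanning needs AO approval AND an allowlisted category."""
    return ao_approved and image_category in EXTERNAL_SCAN_ALLOWED


print(may_route_externally("blank_form_template", ao_approved=True))      # True
print(may_route_externally("claimant_medical_record", ao_approved=True))  # False
print(may_route_externally("blank_form_template", ao_approved=False))     # False
```

Anything not explicitly allowlisted, including categories added later, stays inside the authorization boundary until someone makes a deliberate decision about it.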

How does this map to the FedRAMP continuous monitoring (ConMon) requirement?

FedRAMP ConMon requires monthly reporting to the authorizing agency covering the current security posture of the system, including evidence of ongoing control effectiveness. For the SI-10 multimodal input validation control, the ConMon evidence artifact is the per-image scan log: each log entry includes scan_id, image_sha256, decision, and timestamp, satisfying the AU-12 audit record requirement for AI inference events. The Glyphward dashboard exports a weekly anomaly report showing block rate, score distribution, and flagged image counts by system identifier — this report is the ConMon evidence for RA-5 vulnerability monitoring coverage of the OWASP LLM01 multimodal PI vulnerability class. Include the weekly export in your monthly ConMon package as an artifact under the SI-10 and RA-5 control implementation statement. The control implementation statement in your System Security Plan (SSP) should reference the Glyphward integration as the mechanism by which SI-10 input validation is implemented for AI image inputs, with the scan log as the automated evidence.
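
For the anomaly portion of the report, one simple heuristic is to flag any week whose block rate exceeds a multiple of the average of the preceding weeks. The sketch below is an illustrative heuristic, not the method the Glyphward dashboard uses; the factor of 2.0, the minimum history of three weeks, and the sample rates are all assumptions.

```python
from statistics import mean


def flag_block_rate_spikes(
    weekly_rates: list[tuple[str, float]],
    factor: float = 2.0,
    min_history: int = 3,
) -> list[str]:
    """Return the weeks whose block rate exceeds factor x the mean of all prior weeks."""
    flagged = []
    for index, (week, rate) in enumerate(weekly_rates):
        history = [prior_rate for _, prior_rate in weekly_rates[:index]]
        if len(history) < min_history:
            continue  # not enough baseline yet
        baseline = mean(history)
        if baseline > 0 and rate > factor * baseline:
            flagged.append(week)
    return flagged


# Hypothetical weekly block rates (blocks / scans) for illustration only.
rates = [
    ("2025-W01", 0.010),
    ("2025-W02", 0.012),
    ("2025-W03", 0.011),
    ("2025-W04", 0.050),
    ("2025-W05", 0.012),
]
print(flag_block_rate_spikes(rates))  # ['2025-W04']
```

A flagged week is a prompt for manual review of that week's blocked images, not a finding in itself.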

Is this relevant for DoD IL2, IL4, or IL5 environments?

Impact Level 2 (IL2) maps roughly to FedRAMP Moderate and governs DoD systems handling public and non-CUI information on DoD networks. The controls and threat model described on this page apply directly to IL2. The Glyphward API can be used for IL2 image scanning subject to the same AO confirmation process described in the FedRAMP authorization question above — specifically, that the images being scanned are not CUI. IL4 and IL5 are a different matter. IL4 governs CUI, and IL5 governs CUI and National Security Systems (NSS) data; both impact levels require all processing to occur within DoD-controlled networks and with DoD-authorized services. Glyphward is not currently authorized at IL4 or IL5. For IL4 and IL5 deployments, on-premise deployment is the only viable integration pattern that keeps image data within the DoD authorization boundary. Contact us to discuss on-premise or air-gapped deployment options for IL4/IL5 environments.

How does the 3PAO test multimodal prompt-injection controls during a FedRAMP security assessment?

The 3PAO security assessment includes penetration testing under SA-11, and for a FedRAMP system with AI image-processing features, that penetration testing must include the AI image-input channel. In practice, the assessor will request evidence that the CSP's developers have defined and tested controls for OWASP LLM01 (prompt injection) at the multimodal input layer. To satisfy this: first, provide the assessor with a set of adversarial test images — images containing embedded text instructions designed to redirect model behavior. Second, demonstrate in the test environment that submitting these images to the inference gateway results in a block decision from the scanner, that the vision model is not called, and that the block decision is recorded in the audit log with a scan_id. The scan_id values from the test run are the SA-11 penetration testing evidence artifacts — they appear in the audit log with timestamps that match the test window, proving that the control was active and effective during testing. The assessor includes these scan_id artifacts in the SAR (Security Assessment Report) as evidence for the SI-10 and SA-11 control findings. If the scanner is not in place, the assessor's finding will be an open finding (OF) or a risk acceptance that must be tracked in the Plan of Action and Milestones (POA&M) until the control is implemented.
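
Before handing scan_id artifacts to the assessor, it is worth confirming that every adversarial test image has a block entry inside the test window. The sketch below assumes audit log entries with image_sha256, decision, and a timestamp in the gateway's UTC format; the function names, hashes, and window are hypothetical.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc(timestamp: str) -> datetime:
    """Parse a gateway timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(timestamp, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def unblocked_test_images(
    log_entries: list[dict],
    test_hashes: list[str],
    window_start: str,
    window_end: str,
) -> list[str]:
    """Return test image hashes with no 'block' entry inside the test window."""
    start, end = parse_utc(window_start), parse_utc(window_end)
    blocked_in_window = {
        entry["image_sha256"]
        for entry in log_entries
        if entry["decision"] == "block"
        and start <= parse_utc(entry["timestamp"]) <= end
    }
    return sorted(set(test_hashes) - blocked_in_window)


# Hypothetical audit log excerpt for illustration only.
audit_log = [
    {"scan_id": "s1", "image_sha256": "h1", "decision": "block", "timestamp": "2025-03-03T14:05:00Z"},
    {"scan_id": "s2", "image_sha256": "h2", "decision": "allow", "timestamp": "2025-03-03T14:06:00Z"},
    {"scan_id": "s3", "image_sha256": "h3", "decision": "block", "timestamp": "2025-03-04T09:00:00Z"},
]

missing = unblocked_test_images(
    audit_log, ["h1", "h2", "h3"], "2025-03-03T14:00:00Z", "2025-03-03T15:00:00Z"
)
print(missing)  # ['h2', 'h3']
```

In this excerpt "h2" was allowed and "h3" was blocked outside the window, so both need follow-up before the evidence package is complete.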