Prompt injection in AI-powered document review platforms

AI-powered document review platforms process contracts, scanned court exhibits, due diligence document packs, and regulatory submissions at scale. Examples include Kira Systems (acquired by Litera), Luminance, Harvey AI, Ironclad AI, Evisort (acquired by Workday), and the growing range of GPT-4o-based legal AI tools. These platforms use vision-language models (VLMs) or multimodal LLMs to extract clauses, flag risks, compare contract versions, summarise document content, and draft review memos.

A large share of the documents they process comes from external counterparties: opposing counsel's draft contract, the target company's data room in an M&A review, a supplier's scanned master service agreement, a court exhibit submitted by the plaintiff. These documents are untrusted, and they contain images: company logos, scanned signatures, exhibit stamps, embedded charts, and in some cases fully scanned page images rather than machine-readable text. A counterparty who knows the target organisation uses an AI review platform can embed pixel-level prompt injection (PI) instructions in a document image (a scanned page, a chart, a logo, or a signature block) to steer the AI's clause extraction, risk flagging, or summarisation outputs.

The legal stakes are high: errors introduced into AI-assisted contract review can affect deal terms, litigation strategy, and regulatory filings. Glyphward provides the missing pre-VLM scan gate for organisations that build on these platforms' APIs or deploy custom document review pipelines.

TL;DR

Any AI-powered document review pipeline that extracts images from uploaded documents (PDF page renders, scanned contract images, embedded charts) and passes them to a VLM for analysis should scan those images with Glyphward before the model call. Block any document containing an image that scores 65 or higher and route it to human review. Documents from external counterparties are untrusted input: the hostile-input assumption that applies to user-submitted web forms applies equally to opposing counsel's contract attachments. Free tier: 10 scans/day, no card required.
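The gate rule can be written down in a few lines. This is a minimal sketch, assuming the scanner returns one numeric score per image; the function name `gate_decision` and the two return labels are illustrative and not part of any Glyphward SDK.

```python
from typing import Iterable

THRESHOLD = 65  # cut-off quoted in this guide; adjust to your own risk tolerance


def gate_decision(scores: Iterable[float], threshold: float = THRESHOLD) -> str:
    """Return 'human_review' if any image meets the threshold, else 'proceed'."""
    return "human_review" if any(s >= threshold for s in scores) else "proceed"


print(gate_decision([3, 12, 64]))  # proceed
print(gate_decision([3, 65, 10]))  # human_review
```

The comparison is inclusive, so a score of exactly 65 blocks the document, and one flagged image is enough to hold back the whole file.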

The four multimodal attack surfaces in legal AI document review

1. Scanned contract pages — rasterised PDF documents from external counterparties. Many contracts in legal practice exist as scanned PDFs rather than machine-readable text documents: signed physical originals, older agreements from archives, documents from jurisdictions where original wet signatures are required, and contracts deliberately sent as scanned images rather than native PDFs by counterparties who do not want the text extracted. When an AI review platform processes a scanned contract PDF, it renders each page as an image and passes that image to a VLM or OCR-then-LLM pipeline for text extraction and clause analysis. A counterparty who embeds adversarial pixel content in a scanned page — placing a typographic injection payload in a page image that appears visually as a normal contract page — can inject instructions into the AI review layer without the instruction being present in any text layer of the document. The adversarial instruction is not in the document's text stream, is not detectable by text-based document analysis tools, and passes undetected through standard PDF validation (the image is a valid JPEG or PNG embedded in a valid PDF). Legal AI platforms that trust scanned page images as inputs to VLMs are exposed to this attack on every external document they process.

2. Due diligence data rooms — high-volume document processing with minimal per-document review. M&A due diligence involves processing hundreds to thousands of documents uploaded by the target company to a virtual data room (Datasite, Intralinks, iDeals, or a custom SharePoint/Google Drive repository). AI review platforms are used precisely to make this volume tractable — processing documents faster than a human team could review individually. The high-volume, low-per-document-scrutiny nature of AI due diligence creates an adversarial opportunity: a single adversarially crafted document embedded in a 500-document data room may be processed without ever triggering manual review of its images. If the data room upload controls validate file format and size but do not inspect image content (which is typical), the adversarially crafted document enters the review corpus with no friction. The AI platform processes it alongside legitimate documents, and the injected output — a risk flag suppressed, a clause misdescribed, a liability incorrectly classified — is presented in the review memo alongside findings from legitimate documents. Because due diligence AI outputs are reviewed under time pressure and high document volume, anomalous findings in a single document may not receive the scrutiny that would catch the injection.

3. Court exhibit processing — adversarial image injection in litigation AI workflows. Legal teams processing court exhibits, discovery productions, and deposition exhibits via AI review tools face the same pixel-level injection risk with heightened adversarial incentive. In litigation, the opposing party — who may have strong motivation to distort the AI analysis — controls the content of exhibits they submit. A PDF exhibit containing an embedded image (a photograph, a scanned document, a chart) can carry adversarially crafted pixel content that instructs the AI to mischaracterise the exhibit's content. If a legal team uses AI summarisation to generate exhibit summaries or deposition question guides from exhibits reviewed at volume, an injected exhibit summary could misdirect litigation strategy. Unlike commercial contract review (where errors may be caught at negotiation), litigation AI errors may persist into legal filings or trial preparation without an equivalent review checkpoint.

4. Harvey AI API and GPT-4o-based custom legal pipelines — teams building on top of foundation models. The rise of Harvey AI and similar GPT-4o-based legal AI tools has led many law firms and legal operations teams to build custom document processing pipelines directly on foundation model APIs. These pipelines are often built in Python using the OpenAI or Anthropic SDK with document parsing libraries (PyMuPDF and pdfplumber for PDFs, python-docx for Word files). They extract images from documents and pass them to GPT-4o or Claude with a legal review prompt. Because they are custom application code, they have no platform-layer content filtering, and their developers typically focus on legal accuracy and prompt engineering rather than adversarial input security. A law firm deploying a custom GPT-4o review pipeline for contract analysis or discovery document processing should treat the image extraction and VLM call path as requiring the same pre-scan gate as any other multimodal production application.
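A pipeline can at least report which pages have no text layer, since those are the pages where text-based analysis sees nothing at all. The sketch below is illustrative only: `classify_page` and `classify_pdf` are hypothetical helper names, and the labels are arbitrary. It assumes PyMuPDF is installed for the PDF step and uses the `Page.get_text()` and `Page.get_images()` calls. The classification is for triage and reporting; it is not a reason to skip the scan, because a scanned page with an OCR text layer still carries an image.

```python
def classify_page(text: str, image_count: int) -> str:
    """Label a page by whether it has extractable text and embedded images."""
    has_text = bool(text.strip())
    if has_text and image_count == 0:
        return "text_only"
    if has_text:
        return "text_with_images"
    if image_count > 0:
        return "image_only"  # typical of a scanned page with no OCR layer
    return "empty"


def classify_pdf(pdf_bytes: bytes) -> list[str]:
    """Classify every page of a PDF (requires PyMuPDF)."""
    import fitz  # imported lazily so classify_page can be used without PyMuPDF

    doc = fitz.open(stream=pdf_bytes, filetype="pdf")
    try:
        return [
            classify_page(page.get_text(), len(page.get_images(full=True)))
            for page in doc
        ]
    finally:
        doc.close()


print(classify_page("", 1))            # image_only
print(classify_page("Clause 1.1", 0))  # text_only
```

A document whose pages are mostly `image_only` is one where the review depends almost entirely on the VLM's reading of pixels, which is the situation surface 1 describes.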

Integration: PDF document review pipeline with Glyphward pre-scan gate

import base64
import fitz  # PyMuPDF
import requests
from openai import OpenAI

GLYPHWARD_KEY = "<your-glyphward-api-key>"
GLYPHWARD_THRESHOLD = 65

client = OpenAI()


def scan_image_bytes(image_bytes: bytes, source: str = "document_review") -> dict:
    """Scan image for adversarial PI before legal AI review call."""
    encoded = base64.b64encode(image_bytes).decode()
    resp = requests.post(
        "https://glyphward.com/v1/scan",
        json={"image": encoded, "source": source},
        headers={"Authorization": f"Bearer {GLYPHWARD_KEY}"},
        timeout=8,
    )
    resp.raise_for_status()
    return resp.json()


def extract_page_images_from_pdf(pdf_bytes: bytes, dpi: int = 150) -> list[bytes]:
    """Render each PDF page as a PNG image at the specified DPI."""
    # PDF user space is 72 points per inch, so dpi / 72 is the zoom factor.
    mat = fitz.Matrix(dpi / 72, dpi / 72)
    doc = fitz.open(stream=pdf_bytes, filetype="pdf")
    try:
        return [
            page.get_pixmap(matrix=mat, colorspace=fitz.csRGB).tobytes("png")
            for page in doc
        ]
    finally:
        doc.close()  # release the document even if rendering fails


def review_contract_safe(
    pdf_bytes: bytes,
    review_prompt: str,
    document_name: str = "contract",
) -> dict:
    """
    Legal AI document review pattern: scan all page images BEFORE VLM review call.
    Returns dict with 'status' of 'ok' (all pages clean) or 'blocked' (adversarial content).
    """
    page_images = extract_page_images_from_pdf(pdf_bytes)
    flagged_pages = []

    for page_num, page_bytes in enumerate(page_images, start=1):
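        try:
            scan = scan_image_bytes(page_bytes, source=f"legal_review_{document_name}")
        except Exception as exc:
            # Fail closed: if the scanner is unreachable or errors, block the
            # document rather than sending unscanned pages to the model.
            return {
                "status": "blocked",
                "reason": f"scanner_unavailable_on_page_{page_num}: {exc}",
                "action": "route_to_manual_review",
            }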
        if scan["score"] >= GLYPHWARD_THRESHOLD:
            flagged_pages.append({
                "page": page_num,
                "score": scan["score"],
                "scan_id": scan["scan_id"],
            })

    if flagged_pages:
        return {
            "status": "blocked",
            "reason": "adversarial_image_detected_in_document",
            "flagged_pages": flagged_pages,
            "total_pages": len(page_images),
            "action": "route_to_manual_review",
            "document": document_name,
        }

    # All pages passed — proceed with AI review
    # Build multimodal content for GPT-4o
    content = [{"type": "text", "text": review_prompt}]
    for page_bytes in page_images:
        encoded = base64.b64encode(page_bytes).decode()
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{encoded}", "detail": "high"},
        })

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": "You are a legal document review assistant. Analyse the provided document pages accurately and flag any unusual provisions.",
            },
            {"role": "user", "content": content},
        ],
        max_tokens=2048,
    )

    return {
        "status": "ok",
        "review": response.choices[0].message.content,
        "pages_scanned": len(page_images),
        "document": document_name,
    }

The pattern renders each PDF page as an image using PyMuPDF (fitz) and scans every page before passing any of them to GPT-4o. A single flagged page routes the entire document to manual review, which is the correct behaviour for legal documents, where one adversarial page can corrupt the entire review. The gate also fails closed: if the scanner is unreachable, the document is blocked rather than passed through. For high-volume due diligence pipelines, batch the scan calls using the Glyphward batch endpoint to reduce latency on large document sets. Log all flagged documents with their scan IDs and route them to a separate manual review queue tracked in your document management system. For Harvey AI integrations, add the scan gate in the Python code that calls the Harvey API, before document content is submitted; Harvey AI's platform controls do not include pixel-level prompt injection (PI) scanning.
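The logging and latency advice can be sketched as follows. The request shape of the batch endpoint is not documented on this page, so this example swaps in a different technique: a thread pool that issues per-page scans concurrently. The names `scan_pages_parallel`, `quarantine_record`, and `fake_scan` are hypothetical. The scanner is passed in as a callable, and its `score` and `scan_id` response fields are taken from the integration code above.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

THRESHOLD = 65


def scan_pages_parallel(
    pages: list[bytes],
    scan_fn: Callable[[bytes], dict],
    max_workers: int = 4,
) -> list[dict]:
    """Scan pages concurrently; a failed scan is recorded, never skipped."""

    def safe_scan(page: bytes) -> dict:
        try:
            return scan_fn(page)
        except Exception as exc:  # fail closed: an error counts as a flag
            return {"error": str(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_scan, pages))  # map preserves page order


def quarantine_record(
    document_name: str, pdf_bytes: bytes, results: list[dict]
) -> Optional[dict]:
    """Build an audit record if any page is flagged or unscanned, else None."""
    flagged = [
        {
            "page": n,
            "score": r.get("score"),
            "scan_id": r.get("scan_id"),
            "error": r.get("error"),
        }
        for n, r in enumerate(results, start=1)
        if "error" in r or r["score"] >= THRESHOLD
    ]
    if not flagged:
        return None
    return {
        "document": document_name,
        "sha256": hashlib.sha256(pdf_bytes).hexdigest(),  # ties the record to the exact file
        "total_pages": len(results),
        "flagged_pages": flagged,
        "action": "route_to_manual_review",
    }


def fake_scan(page: bytes) -> dict:
    # Stand-in for the real scanner call; the scores are made up for the demo.
    return {"score": 90 if b"inject" in page else 5, "scan_id": "demo"}


results = scan_pages_parallel([b"clean page", b"inject here"], fake_scan)
print(json.dumps(quarantine_record("msa.pdf", b"%PDF-demo", results), indent=2))
```

Hashing the file keeps the audit entry tied to the exact bytes that were blocked, which matters if a counterparty later supplies a corrected version under the same file name.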

Coverage matrix

| Defence layer | Scanned contract pages (external counterparty) | Due diligence data room (M&A volume processing) | Court exhibit processing (litigation AI) | Custom GPT-4o legal pipeline (API-direct) |
| --- | --- | --- | --- | --- |
| Platform-level document upload controls (format validation, AV scanning) | Partial: validates MIME type and malware signatures; does not detect adversarial pixel content in valid image files | Partial: data room platforms validate format and scan for malware; no PI detection | No: exhibit upload systems validate format; no PI detection | No: application code handles upload; developer must add validation |
| Legal AI platform text-layer analysis (Kira, Luminance clause extraction) | No: text-layer analysis operates on OCR output; adversarial payload in pixel layer bypasses OCR-then-LLM pipeline | No | No | N/A |
| GPT-4o / Claude content moderation (OpenAI / Anthropic safety filters) | Partial: detects harmful content in text output; not designed for adversarial business-document image injection | Partial | Partial | Partial |
| Human lawyer review of AI-generated output | Partial: catches grossly wrong AI output; may miss subtle injection (suppressed flag, misdescribed clause) | Low: high-volume AI review reduces per-document human scrutiny by design | Partial: attorney review of AI summaries; subtle injections in exhibit characterisation may persist | Variable: depends on review workflow |
| Glyphward pre-VLM scan on extracted page/document images | Yes: scan rendered PDF page images before AI clause extraction call | Yes: scan all document images in bulk before batch AI review | Yes: scan exhibit page images before AI summarisation | Yes: add scan gate in PDF parsing + GPT-4o call chain |

Related questions

Is this attack realistic? Would opposing counsel actually attempt prompt injection?

Prompt injection via adversarial images does not require the attacker to know which AI review platform the target uses. It only requires embedding content that would cause harm if processed by any VLM-based review system. The barrier to crafting such images has dropped significantly since the publication of FigStep (arXiv:2311.05608) and similar techniques, and these attacks can be automated. Counterparties in high-stakes negotiations, M&A transactions, or litigation have clear financial or strategic incentives to influence AI review outcomes. The attack does not need to override the AI review completely: a single suppressed risk flag or misdescribed clause term can have material legal consequences. The practical question is not whether the attack is realistic in the abstract, but whether an organisation that processes external documents with AI review is willing to find out in a live matter. The controls are straightforward to add now.

Does Kira Systems or Luminance provide PI protection for multimodal document processing?

As of this page's last update (June 2026), neither Kira Systems (now Litera's AI document review) nor Luminance publishes documentation of multimodal prompt injection scanning as a named platform feature. Both platforms' security documentation addresses data privacy (SOC 2, ISO 27001 certifications), access controls, and model trust frameworks. Their text-extraction and clause-matching pipelines are designed primarily around OCR + NLP, where text-layer PI is a different problem from pixel-level image injection. For teams using these platforms via their APIs or SDKs, adding Glyphward scan gates in the integration code, before documents are submitted to the platform API, provides defence-in-depth regardless of what the platform's internal processing does. For teams using Kira or Luminance as a no-code interface, the attack surface lies within the platform's own processing; consult the vendor's security team about their multimodal PI posture.

How should law firms handle GDPR and legal professional privilege when using PI scanning?

Glyphward's scan API processes only the image submitted, and no document content other than the image bytes is stored beyond the scan transaction. The scan result (a score and flagged region bounding boxes) does not contain any text extracted from the document or any information about the document's legal content. For law firms subject to GDPR, the scan constitutes processing of personal data only if the document images contain personal data (photographs of individuals, scanned ID documents). In that case the same data processing agreements and retention controls that apply to the AI review platform apply to the scan API. Legal professional privilege attaches to the document content communicated between lawyer and client; submitting an image hash or pixel sample for security scanning does not constitute disclosure of privileged content under most common law frameworks. Note, however, that the integration shown above submits full rendered page images, which do carry document content, so the scan API should be covered by the same confidentiality terms as the AI review platform. Consult your data protection counsel for jurisdiction-specific guidance before deploying in regulated contexts. The legal AI prompt injection page covers additional GDPR and privilege considerations.

Further reading