Adversarial image injection attacks on legal technology AI platforms
Legal technology AI operates at the intersection of the highest-stakes document workflows in the enterprise: court exhibits with Brady obligations attached, eDiscovery review queues where a misclassified document can constitute spoliation, M&A due-diligence runs where a missed liability clause exposes a client to uncapped indemnification, and regulatory compliance packages where a suppressed GDPR or FCPA flag can draw enforcement action. Platforms including iManage AI (RAVN), NetDocuments AI, Nuix Workstation AI, and Conduent Legal AI ingest scanned court exhibit photographs and evidence images through AI classification pipelines. eDiscovery platforms — Reveal AI (Brainspace), Relativity aiR for Review, Everlaw AI, and IPRO AI Eclipse — run technology-assisted review across millions of document images produced during litigation discovery. Contract analysis AI — Harvey AI, Kira Systems, Luminance AI, and ThoughtRiver — extracts representations and warranties, liability caps, termination provisions, and change-of-control clauses from scanned contract pages submitted by counterparties in M&A, lease portfolio review, and regulatory compliance. Compliance document review platforms — Consilio AI, Epiq AI, and Lighthouse AI — classify GDPR Data Protection Impact Assessment documents, HIPAA breach notification filings, SEC 10-K exhibit images, and FCPA audit materials for regulatory risk. Every one of these workflows accepts document image inputs from parties with an incentive to influence the AI’s classification outcome. 
An adversarially crafted document image — where imperceptible pixel perturbations cause the AI relevance classifier to score a known-relevant document as non-relevant, cause the TAR predictive coding model to misclassify privileged material, cause the contract AI to misread a liability cap figure, or cause the compliance AI to suppress a regulatory red-flag indicator — passes through optical character recognition and every text-based prompt injection scanner without detection. The payload reaches the legal AI model’s vision channel intact and corrupts the classification or extraction output that attorneys, litigation teams, and compliance officers will rely on. Glyphward scans document image bytes at the inference boundary, before the image reaches the legal AI model, returning a risk score and audit-trail scan_id in under 200 ms per page.
TL;DR
Before passing any scanned court exhibit, eDiscovery document image, contract page scan, or regulatory compliance filing image to your legal technology AI model, POST the image bytes to Glyphward’s /v1/scan endpoint. For legal AI workflows, set THRESHOLD_LEGAL_AI = 55, lower than the general commercial threshold of 70, because the consequence of a missed adversarial payload is litigation risk, privilege waiver, or regulatory enforcement, not merely a corrupted extraction. Treat a score ≥ 55 as a likely payload: reject the document from the AI pipeline and route it for human attorney review. Log the returned scan_id, document hash, matter identifier hash, and decision to your matter audit trail. Scans return in under 200 ms per page. The free tier allows 10 scans/day, no card required.
Why legal technology AI is the highest-consequence adversarial image target
Most enterprise AI workflows process documents from counterparties with commercial interests opposed to the AI’s operator — a supplier wants the AP AI to approve its invoice, a job applicant wants the screening AI to advance their résumé. Legal technology AI is different in degree of adversarial incentive and consequence of manipulation. In litigation, opposing counsel and their clients have a direct interest in how the opposing party’s eDiscovery AI classifies documents: a non-relevant classification removes a document from attorney review; a non-privileged classification causes inadvertent disclosure. In M&A due diligence, the target company’s management has a financial interest in controlling what the acquirer’s contract AI extracts from data-room materials. In regulatory enforcement, an entity under investigation has an obvious interest in suppressing the AI’s identification of FCPA or GDPR red-flag indicators in audit documents.
The attack surface is document images: scanned physical exhibits, photographed contract pages, and PDF renderings of filings. All of them are processed through vision models that are exposed to the same classes of image-borne indirect prompt injection documented against general vision LLMs, including FigStep-class typographic payloads that survive OCR extraction. Text-only prompt injection scanners, which examine the text content of documents for injection strings, do not inspect the image layer. A pixel-domain adversarial perturbation applied to a scanned document page carries no text that a text scanner can find. It reaches the legal AI model’s vision channel unchanged and can produce the outcome the attacker intended.
The professional responsibility and evidentiary consequences compound the commercial risk. Brady exposure arises when AI misclassification keeps exculpatory evidence from reaching the defense in a criminal matter. Inadvertent-disclosure analysis under Federal Rule of Evidence 502 applies when a privilege misclassification causes privileged material to reach opposing counsel before the error is caught. Sanctions exposure under FRCP Rule 26 arises when discovery misconduct, including AI-assisted discovery review that produces demonstrably wrong classifications, comes before a court. These are not hypothetical consequences: they are the established legal frameworks that apply to the workflows these platforms operate.
Four injection surfaces in legal technology AI
1. Court exhibit adversarial image injection (Nuix Workstation AI, iManage RAVN, NetDocuments AI, Conduent Legal AI)
Law firms and eDiscovery platforms process scanned court exhibit photographs, physical evidence images, and documentary exhibits submitted through evidence management systems every day at scale. Courts across U.S. federal and state jurisdictions increasingly accept digital exhibits through electronic evidence management systems; eDiscovery AI platforms run AI classification on ingested exhibit images to identify document relevance, privilege status, and responsiveness to specific discovery requests before human attorney review. The document image — not the extracted text — is the operative input to the vision model that performs this classification.
An adversarially crafted exhibit photograph or document image operates through pixel-domain perturbations: modifications to the image’s pixel values that are imperceptible to a human reviewer examining the exhibit at screen or print resolution but that alter the vision model’s internal representations of the document in ways that produce a specific classification outcome. In the court exhibit context, the adversarial perturbation is designed to cause the AI relevance classifier to score a highly relevant document as non-relevant — pushing it below the review threshold and removing it from the attorney review queue entirely. Alternatively, the perturbation is designed to cause the AI privilege classifier to misclassify an attorney-client privileged document as non-privileged, or to misclassify a non-privileged document as privileged, depending on which direction serves the adversary’s litigation strategy.
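To make the mechanism concrete, the sketch below applies a sign-based (FGSM-style) perturbation to a toy linear relevance scorer. The scorer, its weights, and the eight-value "image" are hypothetical stand-ins and do not represent any platform's model. Real document classifiers are deep, nonlinear vision models, but the principle is the same: a small, bounded change to each pixel, chosen in the direction that most lowers the score, can flip the classification while the page looks unchanged.

```python
# Toy illustration only: a hypothetical linear "relevance" scorer over
# 8 pixel intensities in [0, 1]. Not any vendor's actual model.
WEIGHTS = [0.9, -0.4, 0.7, -0.8, 0.5, -0.6, 0.3, -0.2]
BIAS = -0.15
EPSILON = 4 / 255  # max change per pixel: 4 grey levels out of 255


def relevance_score(pixels):
    """Positive score means "relevant"; negative means "non-relevant"."""
    return sum(w * p for w, p in zip(WEIGHTS, pixels)) + BIAS


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def perturb(pixels, epsilon):
    """Move each pixel by at most epsilon against its weight's sign.

    For a linear scorer this is the direction that lowers the score
    fastest. Values are clamped so the result is still a valid image.
    """
    return [
        min(1.0, max(0.0, p - epsilon * sign(w)))
        for w, p in zip(WEIGHTS, pixels)
    ]


original = [0.5] * 8
adversarial = perturb(original, EPSILON)

print("original score:   ", round(relevance_score(original), 4))
print("adversarial score:", round(relevance_score(adversarial), 4))
print("largest pixel change:",
      max(abs(a - b) for a, b in zip(adversarial, original)))
```

In this toy case the original page scores as relevant and the perturbed page scores as non-relevant, even though no pixel moved by more than four grey levels. An image with millions of pixels gives an attacker far more room to spread an even smaller change.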
iManage RAVN AI processes document collections for law firms, including Am Law 100 firms, for contract intelligence, matter management, and document classification. RAVN’s AI pipeline ingests document images as part of its multimodal document understanding stack. NetDocuments AI operates as a cloud-native document management and AI analysis platform used by thousands of law firms and legal departments globally; its AI classification pipeline processes uploaded document images to tag, classify, and route documents within matter workspaces. Nuix Workstation AI is deployed by law enforcement, government agencies, and litigation support teams to process forensic evidence image collections for classification, relevance determination, and privilege review. Conduent Legal AI processes large-scale document collections for government and corporate legal departments under managed legal services arrangements.
The legal consequence of a successful adversarial court exhibit injection depends on the direction of the manipulation. If the adversarial perturbation causes the AI to classify a relevant, potentially exculpatory exhibit as non-relevant in a criminal matter, the document may never reach defense counsel before trial. Under Brady v. Maryland, 373 U.S. 83 (1963), the prosecution’s obligation to disclose material exculpatory evidence to the defense is constitutional. If AI-assisted review on the government’s side suppresses a Brady-material exhibit because an adversarially crafted version of that exhibit scored below the relevance threshold, the Brady violation risk remains even though the AI workflow, rather than an attorney’s decision, caused the failure. The adversarial manipulation is invisible to the reviewing attorneys: they see a review queue from which the document has been removed by the AI, not a document that was misclassified because of a manipulated payload.
If the adversarial perturbation causes the AI privilege classifier to misclassify an attorney-client privileged document as non-privileged, and the document is produced to opposing counsel in discovery, Federal Rule of Evidence 502 governs the waiver consequences. Under FRE 502(b), a disclosure does not waive attorney-client privilege or work product protection if it was inadvertent, the holder took reasonable steps to prevent disclosure, and the holder promptly took reasonable steps to rectify the error. An AI classification that put a non-privileged label on a document that a reasonable privilege review would have flagged as privileged invites a challenge under the “reasonable steps to prevent disclosure” element, particularly if the AI pipeline did not include adversarial input validation at the image layer. Courts evaluating FRE 502 clawback requests will scrutinize the quality of the privilege review process; an AI privilege classifier operating on unvalidated document image inputs that could have been adversarially manipulated may not satisfy the reasonableness standard.
Glyphward’s inference-boundary scan intercepts the document image before it reaches the classification model. For court exhibit workflows, THRESHOLD_LEGAL_AI = 55 deliberately accepts a higher false-positive rate in exchange for greater detection sensitivity, which suits the consequences of a privilege or relevance misclassification. A score at or above 55 routes the document image for human attorney review; the scan_id and document hash are logged to the matter audit trail as evidence that the input validation control operated. This per-document evidence chain can serve as documentation of “reasonable steps” when a party seeks clawback under FRE 502.
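One way to make a per-document audit trail tamper-evident is to chain each record to the hash of the previous one. The sketch below is an assumption about how a firm might store these records on its own side; it is not a description of Glyphward's logging format. The field names follow the ones used in this article (scan_id, score, decision, document hash), and the helper names `append_record` and `verify_chain` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body):
    """Hash a record body deterministically (sorted keys, UTF-8 JSON)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_record(chain, scan_id, score, decision, document_sha256):
    """Append a record whose hash covers its fields and the previous hash."""
    body = {
        "scan_id": scan_id,
        "score": score,
        "decision": decision,
        "document_sha256": document_sha256,
        "prev_hash": chain[-1]["record_hash"] if chain else GENESIS,
    }
    chain.append({**body, "record_hash": _digest(body)})
    return chain[-1]


def verify_chain(chain):
    """Return True only if every record is intact and correctly linked."""
    prev_hash = GENESIS
    for record in chain:
        body = {k: v for k, v in record.items() if k != "record_hash"}
        if record["prev_hash"] != prev_hash or record["record_hash"] != _digest(body):
            return False
        prev_hash = record["record_hash"]
    return True


trail = []
append_record(trail, "scan-001", 12, "allow", hashlib.sha256(b"page-1").hexdigest())
append_record(trail, "scan-002", 71, "block", hashlib.sha256(b"page-2").hexdigest())
print("chain intact:", verify_chain(trail))
```

Editing any earlier record changes its hash and breaks every later link, so a verifier can detect after-the-fact changes to the trail.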
2. eDiscovery AI review queue adversarial injection (Reveal AI Brainspace, Relativity aiR for Review, Everlaw AI, IPRO AI Eclipse)
eDiscovery AI performs technology-assisted review — the discipline known in the litigation support industry as TAR — on document image collections produced during civil litigation discovery. TAR AI uses an initial seed set of documents reviewed and classified by attorneys as relevant or non-relevant to train a predictive coding model, which then classifies millions of additional documents in the production collection as relevant or non-relevant. The predictive coding model’s classifications determine which documents enter the attorney review queue for human attention and which are set aside as non-relevant, potentially without any human review at all. This is the architecture that makes TAR economically viable for large-scale litigation: attorney review of a small seed set, followed by AI classification of the remaining collection, with human review reserved for the AI-flagged relevant population.
The adversarial attack against eDiscovery TAR operates at the document image level. An adversarially crafted document image, in which pixel perturbations applied to a known-relevant document cause the TAR predictive coding model to classify it as non-relevant, removes that document from the attorney review queue. This is not probabilistic drift: a targeted adversarial perturbation is designed to produce a specific classification outcome for a specific document image. eDiscovery collections in large-scale litigation routinely exceed one million documents; antitrust class actions, securities class actions, multi-district litigation, and regulatory investigations by the DOJ, FTC, and SEC regularly involve collections of this scale. In such matters, seeding the review queue with adversarially crafted document images can cause entire categories of relevant documents to fall below the TAR review threshold.
The mechanism of category-level suppression is the TAR model’s generalization from seed to collection. If the seed set contains adversarially crafted versions of documents that belong to a particular category — say, internal communications discussing a specific pricing practice relevant to an antitrust inquiry — the TAR model learns that the image characteristics of that document category are associated with non-relevance. It then applies that learned association to the full collection, classifying the entire category of similar documents as non-relevant. The adversarial perturbation is applied to the seed-set documents; the downstream effect is category-level suppression across the collection. This is a realistic threat model for a sophisticated adversary who controls or has access to the documents being produced in discovery.
Reveal AI (Brainspace) deploys AI-assisted review technology to litigation support teams handling high-stakes commercial litigation, including antitrust, intellectual property, and securities matters. Relativity’s aiR for Review AI runs at data centres processing petabytes of litigation discovery for global law firms, investment banks, and pharmaceutical companies — Relativity reports that its platform processes more than half of all U.S. litigation discovery by volume. Everlaw AI deploys AI-assisted review to government agencies, investment banks, and pharmaceutical companies handling regulatory investigation document collections where document volumes routinely exceed ten million pages. IPRO AI Eclipse processes discovery document collections for corporations, law firms, and litigation support vendors on both civil and regulatory matters.
Privilege misclassification is the second TAR attack surface. A predictive coding model trained to identify attorney-client privileged documents and exclude them from production can be attacked by adversarially crafted privileged document images that score as non-privileged — causing privileged material to enter the produced set. Alternatively, non-privileged documents crafted to score as privileged can be used to inflate the privilege log and suppress responsive but non-privileged documents from production. Both directions of TAR privilege manipulation have consequences under FRCP Rule 26: the producing party has obligations of accuracy in both directions.
Because eDiscovery TAR review operates at high document volumes, the Glyphward integration for this surface must support batch processing with asyncio-based parallelism. Scanning one million document images sequentially is not operationally viable; the integration architecture described in the code section below shows the async batch pattern designed for TAR review queue pre-scan. The threshold for eDiscovery TAR remains THRESHOLD_LEGAL_AI = 55, matching the litigation privilege and attorney-client confidentiality stakes of the surface.
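A back-of-envelope estimate shows why sequential scanning does not work at this scale. The sketch assumes the 200 ms per-page figure quoted earlier as a flat per-request latency, one page per document, and ideal scaling with concurrency. Real throughput depends on rate limits and network conditions that this article does not specify, so treat the numbers as an upper-level planning estimate only.

```python
def scan_hours(pages: int, latency_s: float = 0.2, concurrency: int = 1) -> float:
    """Idealized wall-clock hours to scan `pages` at a flat per-page latency."""
    return pages * latency_s / concurrency / 3600


sequential = scan_hours(1_000_000)
parallel = scan_hours(1_000_000, concurrency=20)

print(f"sequential: {sequential:.1f} h, concurrency=20: {parallel:.1f} h")
# prints "sequential: 55.6 h, concurrency=20: 2.8 h"
```

Under these assumptions a one-million-page pre-scan drops from more than two days to under three hours at a concurrency of 20, which is the default used in the batch function in the integration code.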
3. Contract analysis AI injection (Harvey AI, Kira Systems, Luminance AI, ThoughtRiver)
M&A due diligence, lease portfolio review, and regulatory compliance contract workflows deploy contract analysis AI to extract key clauses from scanned contract documents at scale. The clauses that matter in M&A due diligence — representations and warranties, limitation of liability caps, termination provisions, change-of-control provisions, material adverse change definitions, intellectual property ownership representations, and non-compete restrictions — are commercially and legally operative terms whose accurate extraction directly determines the risk profile of a transaction. Contract analysis AI platforms process scanned counterparty contracts from data rooms, lease abstractions from commercial real estate portfolios, and supply agreement reviews from procurement teams. In each context, the counterparty who submitted the contract has an interest in controlling what the AI extracts from it.
An adversarially crafted contract page image operates by applying pixel perturbations to the scan in ways that cause the contract AI vision model to misread a specific clause or figure. The attack is not typographic — the contract text itself is not modified; the physical contract is unchanged. The adversarial perturbation is in the image domain: the pixel-level representation of the page that the vision model receives. The model’s internal processing produces an extraction that differs from what an unmodified scan of the same contract page would produce. The physical contract, the signed original, and the counterparty’s production version are identical — only the digital scan that the AI processed is adversarially crafted.
Harvey AI processes legal documents for law firms including Allen & Overy (A&O Shearman), Paul Weiss, and Linklaters, with a document extraction and analysis capability that spans contract review, M&A diligence, and regulatory analysis. Kira Systems is deployed at KPMG, Ernst & Young, and Deloitte for contract due diligence automation — the platform extracts structured data from thousands of contract pages in parallel during M&A transactions. Luminance AI is deployed at Clifford Chance, Freshfields Bruckhaus Deringer, and other Magic Circle and Silver Circle firms for contract review and due diligence. ThoughtRiver processes contract pages for structured clause extraction and risk scoring in commercial contracts, supply agreements, and technology licensing, with a focus on pre-signature contract risk assessment.
A missed liability cap clause in an M&A transaction document could expose the acquiring entity to uncapped indemnification in a post-closing claim. If the data-room materials processed by the acquirer’s contract AI contain an adversarially crafted version of a key acquisition agreement page, one where pixel perturbations cause Kira or Luminance to extract a liability cap figure that is higher than the actual cap, or to fail to extract the cap entirely, the acquirer’s due diligence report will reflect the manipulated extraction. The deal team, relying on the AI-extracted clause summary, may not review the underlying contract page independently. The closing then occurs on the basis of contract terms that differ from what the AI reported.
A missed change-of-control provision in a target company’s material contract could trigger an unreported covenant breach under the acquisition agreement’s representations and warranties. Change-of-control clauses in material contracts, such as bank facility agreements, key customer contracts, and software license agreements, typically require counterparty consent for an acquisition. If the contract AI fails to identify a change-of-control provision in a material contract because the scanned contract page was adversarially crafted to cause the AI to misclassify or miss the clause, the acquirer will close believing that all required third-party consents were obtained when they were not. Post-closing, the failure to obtain consent may constitute a breach of the material contract and trigger consequences under the acquisition agreement’s indemnification provisions.
The lease portfolio review context presents a different adversarial dynamic. In large commercial real estate transactions where a buyer acquires a portfolio of leased properties, the seller controls the lease document images submitted for due diligence AI review. An adversarially crafted lease page image that causes ThoughtRiver or Kira to misread a rent escalation clause, a landlord co-tenancy right, or a termination-for-convenience option has direct financial consequence on the portfolio valuation model. Lease abstraction AI is used specifically to process volumes of leases that are too large for attorney-by-attorney review; the AI extraction is the input to the financial model. Manipulating the AI extraction is manipulating the model.
The threshold for contract analysis AI remains THRESHOLD_LEGAL_AI = 55. For M&A transactions above a material transaction size threshold, legal operations teams should consider lowering to 50 — the same logic as lowering the SOX financial threshold for high-value transactions applies here. Document the threshold and rationale in the engagement’s AI tool risk management log.
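The threshold policy described here can be written down as a small, auditable function. This is a sketch under stated assumptions: the deal-size cutoff `MATERIAL_DEAL_VALUE_USD` is a hypothetical placeholder, since the article does not specify what counts as a material transaction size, and the function names are illustrative.

```python
THRESHOLD_LEGAL_AI = 55        # default legal AI threshold from this article
THRESHOLD_HIGH_VALUE_MA = 50   # lowered value suggested for large M&A deals
MATERIAL_DEAL_VALUE_USD = 500_000_000  # hypothetical cutoff; set per engagement


def select_threshold(is_ma_transaction: bool, deal_value_usd: float) -> int:
    """Pick the scan threshold for an engagement."""
    if is_ma_transaction and deal_value_usd >= MATERIAL_DEAL_VALUE_USD:
        return THRESHOLD_HIGH_VALUE_MA
    return THRESHOLD_LEGAL_AI


def decide(score: int, threshold: int) -> str:
    """Block when the score meets or exceeds the threshold."""
    return "block" if score >= threshold else "allow"


# A page scoring 52 passes at the default threshold but is blocked
# once the engagement qualifies for the lowered threshold.
print(decide(52, select_threshold(False, 0)))
print(decide(52, select_threshold(True, 1_000_000_000)))
```

Keeping the selection logic in one function makes it easier to record the threshold and its rationale in the engagement’s AI tool risk management log.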
4. Legal compliance document review AI injection (Consilio AI, Epiq AI, Lighthouse AI)
Compliance document review AI processes regulatory document collections that carry the highest regulatory consequence of any legal AI surface: GDPR Data Protection Impact Assessment (DPIA) documents under Regulation (EU) 2016/679 Article 35, HIPAA breach notification filings, SEC 10-K and 10-Q exhibit images, FCPA audit trail documents, and DOJ Civil Investigative Demand response packages. These document collections are submitted to compliance AI platforms for classification of regulatory risk, flagging of non-compliance indicators, identification of material adverse changes, and routing of documents to attorney or regulatory specialist review. The AI classification determines what human reviewers see; documents that score below the routing threshold may receive no human review at all.
Adversarially crafted regulatory compliance document images operate in this context by suppressing the AI compliance flag for specific risk categories. Consider a pixel perturbation applied to the scanned description of a processing activity that would require a DPIA under GDPR Article 35. If it causes the compliance AI to score the described activity as low-risk, the DPIA obligation flag never reaches the compliance team’s review queue. The DPIA is not conducted. The processing activity proceeds without the required assessment and, where that assessment would have shown high residual risk, without the prior consultation with the supervisory authority required by Article 36. Under GDPR Article 83(4), failure to conduct a required DPIA is subject to administrative fines of up to €10 million or 2% of total worldwide annual turnover, whichever is higher. The adversarial image manipulation creates a regulatory exposure that is invisible to the compliance team because the AI never surfaced the flag that would have triggered their review.
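The Article 83(4) ceiling is the higher of the two figures for an undertaking, so the exposure grows with company size. A minimal worked example, using integer euro amounts and a hypothetical function name:

```python
def article_83_4_fine_ceiling_eur(worldwide_annual_turnover_eur: int) -> int:
    """Upper bound of an Article 83(4) fine for an undertaking, in euros.

    The ceiling is the higher of EUR 10 million and 2% of total
    worldwide annual turnover for the preceding financial year.
    """
    two_percent = worldwide_annual_turnover_eur * 2 // 100
    return max(10_000_000, two_percent)


# EUR 200 million turnover: 2% is EUR 4 million, so the EUR 10 million floor applies.
print(article_83_4_fine_ceiling_eur(200_000_000))    # 10000000
# EUR 5 billion turnover: 2% is EUR 100 million, which exceeds the floor.
print(article_83_4_fine_ceiling_eur(5_000_000_000))  # 100000000
```

This is the statutory maximum only; the fine actually imposed depends on the factors a supervisory authority weighs in each case.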
Consilio AI processes regulatory compliance document collections for Fortune 500 legal departments facing DOJ, FTC, and SEC investigations — the platform handles managed review, privilege log preparation, and regulatory response workflow for some of the largest regulatory proceedings in U.S. corporate law. Epiq AI processes class action settlement document collections and regulatory response packages for corporate defendants and their counsel, with AI classification handling initial responsiveness and privilege determination at volumes that human review could not match economically. Lighthouse AI serves as a managed review and compliance AI platform for corporate legal departments and law firms engaged in regulatory response and internal investigation document review.
The FCPA red-flag scenario presents a particularly consequential adversarial target. FCPA audit documents — travel and entertainment expense records, agent payment authorizations, government official interaction logs — are processed by compliance AI to identify indicators of potential bribery or improper payment. An adversarially crafted audit document image that causes the compliance AI to score a payment authorization as routine rather than flagging it for FCPA red-flag review creates a gap in the anti-corruption compliance program. If that document surfaces in a subsequent DOJ investigation, the failure of the AI to flag it — when the document image was adversarially crafted to suppress the flag — is both an FCPA substantive risk and an evidence problem for the company’s cooperation credit claim. Companies cooperating with DOJ FCPA investigations receive cooperation credit for proactive disclosure; an AI compliance review that missed red flags due to adversarial manipulation is not proactive.
The SEC 10-K exhibit image attack surface is the material adverse change variant. SEC Regulation S-K Item 303 requires disclosure of known trends, events, or uncertainties that are reasonably likely to have a material effect on financial condition or results of operations. Compliance AI that reviews 10-K exhibit images — property photographs, equipment images, regulatory correspondence — for material adverse change indicators can be manipulated by adversarially crafted exhibit images that suppress the MAC flag. A scanned exhibit image adversarially crafted to score as routine causes the compliance AI to skip MAC review for that exhibit. If the exhibit reflects a material adverse development that the issuer was required to disclose, the AI-suppressed flag contributes to a disclosure failure under Section 10(b) of the Securities Exchange Act of 1934 and Rule 10b-5.
The threshold for compliance document review AI is THRESHOLD_LEGAL_AI = 55, consistent with the regulatory privilege, GDPR, FCPA, and SEC compliance stakes of this surface. Consilio AI, Epiq AI, and Lighthouse AI clients operating under regulatory privilege — where the compliance review is conducted under the direction of outside counsel for purposes of providing legal advice — have an additional interest in per-document scan evidence: the scan_id log demonstrates that the AI input validation control was operating, which is relevant to the integrity of the privilege assertion over the compliance review work product.
Integration: legal technology AI document ingestion with Glyphward pre-scan
"""
Glyphward adversarial document image scanning integration
for legal technology AI platforms.
Pre-scan every document image before passing to:
- Court exhibit / document management AI (iManage RAVN, NetDocuments AI, Nuix, Conduent)
- eDiscovery TAR AI (Reveal Brainspace, Relativity aiR, Everlaw AI, IPRO Eclipse)
- Contract analysis AI (Harvey AI, Kira Systems, Luminance AI, ThoughtRiver)
- Compliance document review AI (Consilio AI, Epiq AI, Lighthouse AI)
Threshold: 55 for all legal AI surfaces (legal privilege, Brady, FRE 502 stakes).
"""
import asyncio
import base64
import hashlib
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Optional
import httpx
# -----------------------------------------------------------------------
# Configuration
# -----------------------------------------------------------------------
GLYPHWARD_API_KEY: str = "gw-..." # store in secrets manager, not source
GLYPHWARD_SCAN_URL: str = "https://api.glyphward.com/v1/scan"
# Legal AI threshold — lower than general commercial (70) because the
# consequence of a missed adversarial payload includes Brady violation risk,
# FRE 502 privilege waiver, FRCP Rule 26 sanctions, or FCPA/GDPR enforcement.
THRESHOLD_LEGAL_AI: int = 55
# -----------------------------------------------------------------------
# Legal AI context enum
# -----------------------------------------------------------------------
class LegalAIContext(Enum):
    COURT_EXHIBIT = "court_exhibit"          # Nuix, iManage RAVN, NetDocuments, Conduent
    EDISCOVERY_TAR = "ediscovery_tar"        # Reveal Brainspace, Relativity aiR, Everlaw, IPRO
    CONTRACT_ANALYSIS = "contract_analysis"  # Harvey AI, Kira Systems, Luminance, ThoughtRiver
    COMPLIANCE_REVIEW = "compliance_review"  # Consilio AI, Epiq AI, Lighthouse AI
# -----------------------------------------------------------------------
# Exception class
# -----------------------------------------------------------------------
class AdversarialLegalDocumentError(Exception):
    """
    Raised when a legal technology AI document image is blocked by
    Glyphward because the adversarial injection score meets or exceeds
    the scan threshold (default THRESHOLD_LEGAL_AI).

    Attributes
    ----------
    scan_id           Glyphward scan identifier; log to matter audit trail.
    score             Raw adversarial injection risk score (0–100).
    context           LegalAIContext enum value for the blocked document.
    matter_id_hash    SHA-256 of the matter/case identifier (never the raw ID).
    document_id_hash  SHA-256 of the document identifier.
    page              Page number within the document (0-indexed).
    """

    def __init__(
        self,
        scan_id: str,
        score: int,
        context: LegalAIContext,
        matter_id_hash: str,
        document_id_hash: str,
        page: int = 0,
    ) -> None:
        self.scan_id = scan_id
        self.score = score
        self.context = context
        self.matter_id_hash = matter_id_hash
        self.document_id_hash = document_id_hash
        self.page = page
        # Only hash prefixes go into the message, to keep log lines short.
        super().__init__(
            f"[Glyphward] Adversarial legal document blocked: "
            f"context={context.value}, page={page}, score={score}, "
            f"scan_id={scan_id}, matter={matter_id_hash[:12]}..., "
            f"doc={document_id_hash[:12]}..."
        )
# -----------------------------------------------------------------------
# Audit logger
# -----------------------------------------------------------------------
_audit_log = logging.getLogger("glyphward.legal_ai.audit")
def _log_scan(
    *,
    scan_id: str,
    score: int,
    decision: str,
    context: LegalAIContext,
    matter_id_hash: str,
    document_id_hash: str,
    image_sha256: str,
    page: int,
) -> None:
    """Write one JSON scan record to the matter audit-trail logger.

    Whether the record is immutable depends on the configured log sink.
    """
    import json
    import time

    _audit_log.info(json.dumps({
        "event": "legal_document_image_scan",
        "scan_id": scan_id,
        "score": score,
        "decision": decision,
        "context": context.value,
        "matter_id_hash": matter_id_hash,
        "document_id_hash": document_id_hash,
        "image_sha256": image_sha256,
        "page": page,
        # Module default; a per-call threshold override is not recorded here.
        "threshold": THRESHOLD_LEGAL_AI,
        "timestamp": time.time(),
    }))
# -----------------------------------------------------------------------
# Single-page async scan
# -----------------------------------------------------------------------
async def scan_legal_document_image(
    client: httpx.AsyncClient,
    image_bytes: bytes,
    context: LegalAIContext,
    matter_id: str,
    document_id: str,
    page: int = 0,
    threshold: int = THRESHOLD_LEGAL_AI,
) -> dict:
    """
    Scan a single document image page for adversarial injection before
    passing it to a legal technology AI model.

    Parameters
    ----------
    client       Shared httpx.AsyncClient (caller manages lifecycle).
    image_bytes  Raw PNG/JPEG bytes of the rendered document page.
    context      LegalAIContext enum; identifies the legal AI surface.
    matter_id    Raw matter/case identifier (hashed before transmission).
    document_id  Raw document identifier (hashed before transmission).
    page         Page number within the document (0-indexed).
    threshold    Adversarial score threshold (default THRESHOLD_LEGAL_AI=55).

    Returns
    -------
    dict with keys: scan_id, score, decision, page. A returned decision is
    always "allow", because blocked pages raise instead of returning.

    Raises
    ------
    AdversarialLegalDocumentError if score >= threshold (document blocked).
    httpx.HTTPError on Glyphward API failure (fail closed:
    caller must route to human review).
    """
    # Hash identifiers locally so raw matter/document IDs never leave the firm.
    matter_id_hash = hashlib.sha256(matter_id.encode()).hexdigest()
    document_id_hash = hashlib.sha256(document_id.encode()).hexdigest()
    image_sha256 = hashlib.sha256(image_bytes).hexdigest()
    b64 = base64.b64encode(image_bytes).decode()

    resp = await client.post(
        GLYPHWARD_SCAN_URL,
        headers={"Authorization": f"Bearer {GLYPHWARD_API_KEY}"},
        json={
            "image": b64,
            "source": context.value,
            "metadata": {
                "matter_id_hash": matter_id_hash,      # SHA-256; never raw matter ID
                "document_id_hash": document_id_hash,  # SHA-256; never raw document ID
                "page": page,
                "threshold": threshold,
            },
        },
        timeout=10.0,
    )
    resp.raise_for_status()
    payload = resp.json()
    scan_id = payload["scan_id"]
    score = payload["score"]
    decision = "block" if score >= threshold else "allow"

    # Log every scan, allowed or blocked, before acting on the decision.
    _log_scan(
        scan_id=scan_id,
        score=score,
        decision=decision,
        context=context,
        matter_id_hash=matter_id_hash,
        document_id_hash=document_id_hash,
        image_sha256=image_sha256,
        page=page,
    )

    if score >= threshold:
        raise AdversarialLegalDocumentError(
            scan_id=scan_id,
            score=score,
            context=context,
            matter_id_hash=matter_id_hash,
            document_id_hash=document_id_hash,
            page=page,
        )
    return {"scan_id": scan_id, "score": score, "decision": decision, "page": page}
# -----------------------------------------------------------------------
# Batch scan for eDiscovery TAR review queue
# -----------------------------------------------------------------------
async def batch_scan_ediscovery_tar_queue(
    document_pages: list[tuple[bytes, str, int]],
    matter_id: str,
    concurrency: int = 20,
    threshold: int = THRESHOLD_LEGAL_AI,
) -> dict:
    """
    Batch-scan a list of eDiscovery document image pages for the TAR
    review queue pre-scan pattern.

    Parameters
    ----------
    document_pages  List of (image_bytes, document_id, page_number) tuples.
                    For a 1-million-document eDiscovery collection, this
                    function is called in batches; concurrency controls the
                    number of simultaneous Glyphward API calls.
    matter_id       Raw matter/case identifier (hashed before transmission).
    concurrency     Max simultaneous scan calls (default 20).
    threshold       Adversarial score threshold (default 55).

    Returns
    -------
    dict with:
        total_scanned  int: number of pages submitted (including pages
                       blocked because of an API error).
        blocked_pages  list: AdversarialLegalDocumentError instances for
                       blocked pages (caller must quarantine).
        allowed_pages  list: scan result dicts for pages that passed.
        passed         bool: True if zero pages were blocked.

    Notes
    -----
    - Fail closed: API errors are NOT treated as allow decisions. Any page
      that cannot be scanned due to an API error is added to blocked_pages.
    - Do not pass blocked pages to the TAR predictive coding model under
      any circumstances. Route to human attorney privilege review.
    - Log all scan results, including allowed pages, to the matter audit
      trail. Per-document scan evidence is the "reasonable steps"
      record for FRCP Rule 26 and FRE 502(b) clawback purposes.
    """
    semaphore = asyncio.Semaphore(concurrency)
    blocked: list[AdversarialLegalDocumentError] = []
    allowed: list[dict] = []

    async def _scan_one(
        client: httpx.AsyncClient,
        image_bytes: bytes,
        document_id: str,
        page: int,
    ) -> None:
        async with semaphore:
            try:
                result = await scan_legal_document_image(
                    client=client,
                    image_bytes=image_bytes,
                    context=LegalAIContext.EDISCOVERY_TAR,
                    matter_id=matter_id,
                    document_id=document_id,
                    page=page,
                    threshold=threshold,
                )
                allowed.append(result)
            except AdversarialLegalDocumentError as exc:
                blocked.append(exc)
            except (httpx.HTTPError, ValueError, KeyError) as exc:
                # Fail closed: transport errors, non-2xx responses, and malformed
                # payloads (invalid JSON, missing keys) all block the page.
                _audit_log.error(
                    f"Glyphward API error for doc={document_id} page={page}: {exc}. "
                    f"Routing to human review (fail-closed)."
                )
                blocked.append(
                    AdversarialLegalDocumentError(
                        scan_id="api_error",
                        score=-1,
                        context=LegalAIContext.EDISCOVERY_TAR,
                        matter_id_hash=hashlib.sha256(matter_id.encode()).hexdigest(),
                        document_id_hash=hashlib.sha256(document_id.encode()).hexdigest(),
                        page=page,
                    )
                )

    async with httpx.AsyncClient() as client:
        await asyncio.gather(*[
            _scan_one(client, img, doc_id, pg)
            for img, doc_id, pg in document_pages
        ])

    return {
        "total_scanned": len(document_pages),
        "blocked_pages": blocked,
        "allowed_pages": allowed,
        "passed": len(blocked) == 0,
    }
The LegalAIContext enum serialises to the source field of the Glyphward scan API call, allowing platform-specific detection heuristics to be applied on the server side. The matter_id_hash and document_id_hash fields transmit SHA-256 hashes of the matter and document identifiers — never the raw identifiers — so that scan metadata in the Glyphward API log does not constitute a disclosure of confidential matter information. The batch function for eDiscovery TAR includes a semaphore-bounded concurrency limit, defaulting to 20 simultaneous scan calls, which allows a one-million-document collection to be pre-scanned within an operationally acceptable window when called in batches. The fail-closed API-error handling is not optional for legal AI surfaces: a transient API error that silently allows an unscanned document image to reach the TAR model negates the control and removes the per-document audit evidence. Fail-closed means fail to human review, not fail to AI processing.
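The batching and throughput reasoning above can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not a benchmark. It assumes every scan takes the 200 ms per-page upper bound quoted for the scan API, that the API sustains the configured concurrency without rate limiting, and that each document is a single page. The helpers `iter_batches` and `estimate_prescan_seconds` are hypothetical names introduced for illustration; they are not part of any Glyphward SDK.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[list[T]]:
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(items)
    # islice pulls the next batch_size items; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch


def estimate_prescan_seconds(
    total_pages: int,
    concurrency: int = 20,
    seconds_per_scan: float = 0.2,  # assumption: the 200 ms per-page upper bound
) -> float:
    """Rough wall-clock estimate: pages are scanned in waves of `concurrency`."""
    waves = -(-total_pages // concurrency)  # ceiling division
    return waves * seconds_per_scan


# One million single-page documents at the default concurrency of 20.
estimate = estimate_prescan_seconds(1_000_000)
print(f"{estimate / 3600:.1f} hours")
```

Under these assumptions a one-million-page collection takes roughly 2.8 hours at the default concurrency. Each list yielded by `iter_batches` would be passed in turn to `batch_scan_ediscovery_tar_queue()`, which keeps memory bounded because only one batch of image bytes is held at a time.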
Coverage matrix: adversarial image injection detection across legal AI surfaces
| Detector | Court exhibit AI (iManage RAVN, Nuix, NetDocuments) | eDiscovery TAR AI (Relativity aiR, Reveal, Everlaw, IPRO) | Contract analysis AI (Harvey AI, Kira, Luminance, ThoughtRiver) | Compliance review AI (Consilio AI, Epiq AI, Lighthouse) |
|---|---|---|---|---|
| Text-only PI scanners (Lakera Guard, Azure Prompt Shields, LLM Guard) | No — inspects extracted text only; pixel-domain adversarial perturbations are invisible to text analysis | No — TAR document images processed by vision model before OCR; text scanner receives no input at the adversarial injection point | No — contract page image perturbations carry no text-layer signal; scanner sees only OCR output of the unmodified text layer | No — compliance document image perturbations operate in the pixel domain; text-only scanners do not inspect image bytes |
| Legal professional conduct rules (Model Rules 1.1 / 1.6) | Partial — competence (Rule 1.1) and confidentiality (Rule 1.6) obligations motivate control, but rules do not detect adversarial images; they define the professional responsibility consequence of a failure to detect | Partial — Rule 1.1 requires technology competence including understanding AI risks; does not prevent or detect the attack, only imposes duty to have done so | Partial — Rule 1.1 technology competence requires attorneys to understand limitations of AI contract review tools; does not constitute a technical detection control | Partial — Rule 1.6 confidentiality obligations extend to regulatory compliance matter materials; professional rules impose duty but do not provide detection capability |
| Human attorney review (without pre-scan) | No — adversarial pixel perturbations in exhibit images are imperceptible to human visual review; the attorney reviewing the document cannot see the perturbation that manipulated the AI | No — TAR review is specifically designed to reduce human review volume; attorneys do not review the document images that the TAR model classified as non-relevant, which is exactly the population an adversarial injection targets | No — human review of contract pages sees the visible text; the adversarial perturbation is in the pixel domain and is imperceptible at screen or print resolution | No — compliance reviewers reviewing the AI-classified document set do not see the documents that the AI classified as low-risk; adversarial suppression of the AI flag removes the document from human review scope |
| Glyphward (inference-boundary image scan) | Yes — scans document image bytes before classification; returns score and scan_id in under 200 ms; blocks adversarially crafted exhibit images at threshold 55 | Yes — async batch scan API supports TAR-scale document volumes; per-page scan_id logged to matter audit trail as FRCP Rule 26 / FRE 502(b) evidence | Yes — scans contract page images before Harvey AI / Kira / Luminance extraction; blocks adversarially crafted pages; logs scan_id to due-diligence audit trail | Yes — scans compliance document images before Consilio / Epiq / Lighthouse classification; blocks adversarial flag-suppression attacks; logs scan_id for regulatory privilege record |
Frequently asked questions
How does adversarial eDiscovery document injection relate to Federal Rule of Civil Procedure Rule 26 sanctions for discovery misconduct?
FRCP Rule 26 governs the parties’ obligations of complete and accurate disclosure in civil discovery, including the duty to make reasonable inquiry in preparing discovery responses and the obligation to supplement discovery as new information becomes available. Rule 26(g) requires that the attorney signing a discovery response certify that the response is complete and correct to the best of the attorney’s knowledge after reasonable inquiry. Courts have addressed the use of AI-assisted review (TAR predictive coding) to identify responsive documents, notably the Southern District of New York in Da Silva Moore v. Publicis Groupe and Rio Tinto v. Vale. Emerging case law generally holds that the use of TAR satisfies Rule 26’s reasonableness requirement when the AI methodology is properly validated and documented. The adversarial injection risk creates a direct challenge to the “reasonable inquiry” and “complete and correct” standards: if an adversarially crafted document image caused the TAR model to classify a responsive document as non-relevant, the producing party’s Rule 26 certification that the production is complete is false, not because of negligence in human review, but because of adversarial manipulation of the AI classifier that the party relied upon.
The sanctions exposure under Rule 26(g)(3) for an improper certification includes reasonable expenses and attorney’s fees. Courts can also impose sanctions under Rule 37 when a production is demonstrably incomplete, particularly when the inadequacy is systematic rather than random. An adversarial injection that causes category-level suppression of a class of responsive documents (for example, all documents in a specific topic category scoring as non-relevant because of adversarially crafted seed-set images) looks like systematic discovery incompleteness to a court examining the production. Per-document Glyphward scan_id records in the matter audit trail show that the adversarial input validation control was operating; they are the producing party’s evidence that its TAR methodology was not deficient in design, even if it was attacked. Without that scan evidence, the party has no mechanism to distinguish between a genuine TAR methodology failure (which courts have sanctioned) and an adversarial attack on a properly designed TAR process (which is a different legal and factual situation). The per-document scan record is the factual foundation for the distinction.
What professional responsibility obligations does an attorney have if the firm’s contract AI was manipulated during M&A due diligence?
The American Bar Association’s Model Rules of Professional Conduct provide the framework, and several rules are directly implicated. Model Rule 1.1 (Competence) requires lawyers to provide competent representation, which includes the legal knowledge, skill, thoroughness, and preparation reasonably necessary for the representation. Comment 8 to Rule 1.1, added in the 2012 ABA technology amendment, provides that to maintain the requisite knowledge and skill, a lawyer should keep abreast of changes in the law and its practice, including the benefits and risks associated with relevant technology. State bar ethics opinions applying Comment 8, including those from California, New York, Florida, and North Carolina, have consistently held that an attorney using AI tools for client work must understand the limitations of those tools well enough to supervise their output. An attorney who delegates contract clause extraction to Harvey AI or Kira Systems and does not understand that those platforms process scanned document images through vision models exposed to adversarial pixel perturbations cannot be said to understand the tool’s limitations in the sense Rule 1.1 requires.
Model Rule 5.3 (Responsibilities Regarding Nonlawyer Assistance) applies to AI tools used in legal practice under the ABA’s 2024 guidance on generative AI (Formal Opinion 512). A supervising attorney who uses AI contract review platforms is responsible for ensuring that the AI’s output is adequately supervised. If manipulated contract AI output, such as a missed liability cap or an undetected change-of-control provision, reaches a client advisory or deal document without attorney review sufficient to catch the manipulation, Rule 5.3 is implicated. Model Rule 1.4 (Communication) may require the attorney to disclose to the client that AI contract review was used and that a manipulation was discovered, particularly if the manipulation affected advice previously given. In a transactional context where the due diligence report has already been delivered to the client and relied upon in negotiation or pricing, the discovery of adversarial manipulation of the AI-generated clause extraction creates disclosure obligations that depend on the state of the deal and the nature of the manipulated terms. Documenting the Glyphward pre-scan, including the matter_id_hash and scan_id in the engagement file, demonstrates that the firm took reasonable steps to protect the AI input layer, which is relevant to the Rule 1.1 competence analysis and to any subsequent professional responsibility proceeding.
What is the correct protocol for handling a Glyphward-blocked document image during a live eDiscovery TAR review run?
When batch_scan_ediscovery_tar_queue() returns a non-empty blocked_pages list, the protocol has five steps. First, quarantine: do not pass any blocked document image to the TAR predictive coding model under any circumstances. The blocked image must not enter the TAR training seed set or the classification queue. Adding an adversarially crafted document image to the TAR seed set is the mechanism by which an adversary achieves category-level suppression across the collection; quarantining the blocked image prevents this. The quarantine must be applied to the specific page identified by the document_id_hash and page fields in the AdversarialLegalDocumentError exception. Second, log: record the scan_id, document_id_hash, matter_id_hash, score, and quarantine decision to the matter audit trail, and treat this record as immutable after write. Third, preserve: do not delete or modify the original document image. The document may be evidence in the litigation; destruction creates spoliation risk under Rule 37(e). The quarantined image is a potential forensic artifact demonstrating that an adversarial attack was attempted. Fourth, route: the blocked document must receive human attorney review. A supervising attorney or senior litigation support professional should review the original document and make an independent relevance and privilege determination without reliance on the AI classifier. The attorney review outcome and the scan_id reference are both logged to the matter audit trail. Fifth, escalate when the volume or pattern of blocked images is inconsistent with random false positives: if more than 0.5% of documents in a production set are blocked, or if blocked documents cluster in a specific document category or custodian, treat the pattern as evidence of a potential adversarial attack on the TAR process and escalate to the supervising partner and, if appropriate, to outside cybersecurity counsel. Document the escalation and the basis for it.
The existence of this protocol, documented and followed, is the “reasonable steps” evidence for both the FRE 502(b) inadvertent disclosure defense and the FRCP Rule 26 certification defense.
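The logging and escalation steps of the protocol can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a prescribed implementation: `BlockedPage`, `quarantine_record`, and `should_escalate` are hypothetical names, the `category` field (custodian or document category) is assumed to come from the review platform rather than from the scan response, and the clustering cut-offs are illustrative defaults. Only the 0.5% blocked-ratio figure is taken from the protocol itself.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockedPage:
    """Hypothetical summary of one blocked page; fields mirror the exception."""
    scan_id: str
    document_id_hash: str
    page: int
    score: int
    category: str  # assumption: custodian or document category from the review platform


def quarantine_record(blocked: BlockedPage, matter_id_hash: str) -> dict:
    """Build the audit-trail entry for the logging step of the protocol."""
    return {
        "scan_id": blocked.scan_id,
        "matter_id_hash": matter_id_hash,
        "document_id_hash": blocked.document_id_hash,
        "page": blocked.page,
        "score": blocked.score,
        "decision": "quarantine",
        "next_step": "human_attorney_review",
    }


def should_escalate(
    total_scanned: int,
    blocked: list[BlockedPage],
    max_blocked_ratio: float = 0.005,  # the 0.5% figure from the protocol
    max_cluster_share: float = 0.5,    # assumption: illustrative clustering cut-off
    min_cluster_size: int = 5,         # assumption: ignore very small samples
) -> bool:
    """Return True when blocked volume or clustering suggests a targeted attack."""
    if total_scanned <= 0 or not blocked:
        return False
    # Volume check: too many blocked pages relative to the production set.
    if len(blocked) / total_scanned > max_blocked_ratio:
        return True
    # Clustering check: most blocked pages share one category or custodian.
    if len(blocked) < min_cluster_size:
        return False
    _, top_count = Counter(b.category for b in blocked).most_common(1)[0]
    return top_count / len(blocked) > max_cluster_share
```

A caller would build one `BlockedPage` per entry in `blocked_pages`, write each `quarantine_record` to the matter audit trail, and notify the supervising partner when `should_escalate` returns True. The clustering cut-offs should be tuned per matter; they are placeholders here.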
Further reading
- Indirect prompt injection via image — foundational attack pattern: how adversarial pixel payloads reach vision model token streams through scanned document images.
- Fintech and payments AI prompt injection — financial services AI document integrity stakes with overlapping adversarial image attack patterns.
- SOC 2 AI security controls — SOC 2 CC6.6 and CC7.1 input validation evidence applicable to legal technology AI platforms holding SOC 2 Type II attestations.
- SOX compliance AI security — SOX Section 302 and 404 ICFR controls covering public company legal department AI document workflows.
- Free tier — 10 scans/day, no card required — start scanning legal document images in your pipeline today.