
Prompt injection in construction and engineering AI — drone photo, permit scan, BIM screenshot, and RFI submittal injection

Construction technology has entered a phase of rapid AI adoption across the project lifecycle, and the attack surface that comes with it is both new and high-stakes. Platforms including Procore AI and Procore Copilot, Autodesk Construction Cloud (ACC) with BIM 360 and Autodesk Construction IQ, Oracle Aconex AI, Trimble Construction, and Bentley iTwin all integrate vision-capable AI into workflows that ingest images from untrusted external sources: drone-captured site survey photos, scanned building permit documents, exported BIM model screenshots, and subcontractor-submitted RFI and shop drawing packages. Construction-specific AI progress-monitoring platforms — OpenSpace AI (360° site photo documentation), Doxel AI (construction progress monitoring against BIM baselines), Buildots (helmet-camera and 360° walk-through AI), Versatile AI (crane productivity monitoring), and Structure 3D — process high volumes of site imagery through vision models that track completion percentages, detect schedule deviations, and flag safety conditions. Each of these image streams is a multimodal prompt injection surface. A FigStep-class adversarial payload embedded in a drone photo at 40×40 pixels is invisible in the 4K image a human reviewer sees on screen and invisible to any text-only scanner. The vision model reads it as a direct instruction. The financial and safety consequences are not hypothetical: construction delay penalties on major infrastructure and commercial projects routinely run to hundreds of thousands of dollars per day, structural inspection fraud enables unsafe building occupancy, and false BIM takeoffs corrupt multi-million-dollar procurement decisions. Text-only prompt-injection scanners — Lakera Guard, LLM Guard, Azure Prompt Shields — operate exclusively on the text channel and cannot see the pixel stream. Glyphward scans the image bytes before the vision model does.

TL;DR

Construction AI pipelines on Procore, Autodesk ACC, OpenSpace AI, Doxel, Buildots, and Bentley iTwin ingest drone photos, scanned permits, BIM screenshots, and RFI images from external parties with direct financial and competitive incentive to manipulate AI outputs. Call Glyphward’s /v1/scan before passing any construction image to a vision LLM. Block at score ≥ 55 for permit and safety-critical documents and at score ≥ 60 for standard construction documents, then route the blocked document to human review with a full audit record including project_id, doc_type, scan_id, and image_sha256. Sub-200 ms scan latency. Free tier: 10 scans/day, no card.

Four multimodal injection surfaces in construction and engineering AI

1. Site survey drone photo injection corrupting AI progress monitoring. Construction AI platforms including OpenSpace AI, Doxel AI, and Buildots continuously ingest drone-captured site survey photos and 360° walk-through images to track construction progress against BIM schedule baselines. The AI compares current site photos with the expected completion state derived from the project BIM model, computes progress percentages for individual work packages, and surfaces delay signals for project-manager review. The drone photos are captured by contractor-operated drones or contractor-worn helmet cameras — the very parties who face financial penalties if the AI reports a schedule delay. An adversarial contractor can pre-process drone image files to embed a typographic prompt injection payload before uploading to the platform: a low-contrast instruction strip in the sky region of a rooftop photo, or a pixel-level adversarial perturbation across the image surface that is invisible at human inspection resolution. The vision model reads the instruction and returns a false progress percentage, suppresses a detected delay signal, or fabricates a milestone completion event. Construction delay penalties on major commercial and infrastructure projects run to hundreds of thousands of dollars per day; on mega-projects (data centres, hospitals, transit infrastructure), penalties reach seven figures per day. A contractor who avoids even a single reported delay week via AI progress manipulation recoups the cost of developing or purchasing the payload many times over. Procore’s integration with OpenSpace and Doxel means a falsified progress report propagates directly to the project owner’s pay-application approval workflow.

2. Building permit and inspection document image injection. AI-powered permit management and building inspection platforms — including Procore permit workflows, PermitFlow, OpenGov permit processing, and AI inspection tools used by municipal building departments — process scanned permit documents, inspection certificates, structural engineering sign-off sheets, and code compliance photographs. These scanned document images are passed to vision LLMs for field extraction: approval dates, inspector badge numbers, licence holder names, code section references, and compliance attestation text. A malicious party who controls the document being scanned — a contractor, a permit expediter, or a compromised inspection service — can embed an adversarial typographic injection payload in the scanned image that instructs the AI extraction model to return a false approval date, a fabricated inspector signature field, or a false code compliance certification. In Oracle Aconex AI document workflows and Trimble Construction document management, permit and inspection documents feed downstream approval gates. An AI-extracted false compliance record that passes the automated gate without human review of the original scan can enable building occupancy before a legitimate structural clearance has been issued. The safety risk is direct: a falsely AI-cleared structural inspection result can enable occupation of a building whose structural elements have not been certified safe. This is not a low-probability tail risk — construction fraud involving permit document falsification is a documented, recurring enforcement category for building authorities in the United States, United Kingdom, and Australia.

3. BIM model screenshot and plan sheet image injection. AI-powered quantity takeoff and clash detection tools — including Autodesk Takeoff AI, Togal.ai, PlanSwift AI, and Autodesk Construction Cloud’s Assemble Systems integration — accept uploaded BIM model screenshots and plan sheet image exports for automated quantity extraction and clash report generation. A project BIM coordinator exports a floor-plan image or structural model screenshot and uploads it to the AI takeoff tool; the vision model reads the image and returns material quantities, counts, and area measurements that feed procurement budgets and subcontractor bid packages. An adversarial actor with access to the BIM export step — a subcontractor submitting their own plan sheets, or a malicious insider in the BIM coordination team — can embed an injection payload in the exported image that causes the AI to return false material quantities (under-reporting steel tonnage, over-reporting concrete volume), suppress clash alerts between structural and MEP systems, or fabricate a structural clearance field that the AI takeoff report treats as confirmed. Bentley iTwin’s AI-powered digital twin analysis and Autodesk ACC’s clash-detection workflows both pass BIM-derived image data to vision models. A false quantity extraction that feeds a procurement decision can result in multi-million-dollar material shortfalls on site; a suppressed clash alert that reaches a construction IQ summary can cause a structural or MEP conflict to remain undetected until it is discovered during physical installation — at a remediation cost orders of magnitude higher than early detection.

4. Subcontractor-submitted RFI and submittal document image injection. Construction document management AI — including Procore Copilot, Autodesk Construction IQ, and PlanGrid (now part of Autodesk) — processes subcontractor-submitted request-for-information (RFI) documents and shop drawing submittals for automated review. RFI packages and shop drawing submittals routinely contain embedded images: manufacturer product data sheet photographs, fabrication detail drawings exported as images, and installation diagram screenshots. Construction IQ and Procore Copilot pass these documents — including their embedded image content — to vision LLMs that extract specification compliance information, flag non-conformances, and generate AI recommendation summaries for the project engineer’s review. A subcontractor with a financial interest in having a non-conforming product approved can embed an adversarial injection payload in a product data sheet image within the RFI package. The payload instructs the AI review model to suppress the specification non-conformance flag for a particular product attribute, return a false compliance attestation, or generate a favourable AI summary that recommends approval. A project engineer reviewing twenty RFIs in a morning session relies on the AI summary to identify which items require detailed attention; a falsely favourable AI summary on a structural or fire-rated assembly creates a direct path from adversarial injection to a non-conforming material being approved and installed in a building.

Integration: construction document intake with Glyphward pre-scan

Insert scan_construction_document() as a gate before any image is passed to a vision LLM in a Procore, Autodesk ACC, Oracle Aconex, or OpenSpace AI integration pipeline. The function uses a ConstructionDocType enum to set context-appropriate thresholds and audit metadata:

import base64, hashlib, os
from enum import Enum
from pathlib import Path
import httpx

GLYPHWARD_API_KEY = os.environ["GLYPHWARD_API_KEY"]
GLYPHWARD_SCAN_URL = "https://glyphward.com/v1/scan"

class ConstructionDocType(str, Enum):
    SITE_SURVEY_PHOTO  = "site_survey_photo"
    PERMIT_DOCUMENT    = "permit_document"
    BIM_SCREENSHOT     = "bim_screenshot"
    RFI_SUBMITTAL      = "rfi_submittal"

# Permit and safety-critical documents use a lower (more conservative) threshold
THRESHOLD_BY_DOCTYPE: dict[ConstructionDocType, int] = {
    ConstructionDocType.SITE_SURVEY_PHOTO: 60,
    ConstructionDocType.PERMIT_DOCUMENT:   55,  # safety-critical — fail closed earlier
    ConstructionDocType.BIM_SCREENSHOT:    60,
    ConstructionDocType.RFI_SUBMITTAL:     55,  # structural submittals treated as safety-critical
}

class ConstructionPIBlockedError(Exception):
    """Raised when a construction document image exceeds the PI risk threshold.
    Routes the document to human review rather than auto-rejection."""
    pass

def scan_construction_document(
    image_bytes: bytes,
    doc_type: ConstructionDocType,
    project_id: str,
    doc_ref: str,
) -> dict:
    """
    Scan a construction document image for multimodal prompt injection payloads.

    Args:
        image_bytes: Raw bytes of the image (PNG, JPEG, or PDF-page render).
        doc_type:    ConstructionDocType enum value — sets threshold and audit context.
        project_id:  Project identifier (Procore project ID, Autodesk ACC hub/project).
        doc_ref:     Document reference (permit number, RFI number, BIM export filename).

    Returns:
        Glyphward scan result dict: {score, scan_id, flagged_region, modality}.

    Raises:
        ConstructionPIBlockedError: if score >= threshold for this doc_type.
    """
    image_b64   = base64.b64encode(image_bytes).decode()
    image_sha256 = hashlib.sha256(image_bytes).hexdigest()
    threshold   = THRESHOLD_BY_DOCTYPE[doc_type]

    resp = httpx.post(
        GLYPHWARD_SCAN_URL,
        headers={"Authorization": f"Bearer {GLYPHWARD_API_KEY}"},
        json={
            "image": image_b64,
            "source": "construction_document_intake",
            "metadata": {
                "doc_type":   doc_type.value,
                "project_id": project_id,
                "doc_ref":    doc_ref,
            },
        },
        timeout=5.0,
    )
    resp.raise_for_status()
    result = resp.json()

    # Persist audit record — required for construction project audit trail
    _log_scan_audit(
        project_id=project_id,
        doc_type=doc_type.value,
        doc_ref=doc_ref,
        scan_id=result["scan_id"],
        image_sha256=image_sha256,
        score=result["score"],
        flagged=result["score"] >= threshold,
    )

    if result["score"] >= threshold:
        raise ConstructionPIBlockedError(
            f"[PI BLOCKED] project={project_id} doc_type={doc_type.value} "
            f"doc_ref={doc_ref} scan_id={result['scan_id']} "
            f"score={result['score']} threshold={threshold} "
            f"sha256={image_sha256}"
        )
    return result

def _log_scan_audit(
    project_id: str,
    doc_type: str,
    doc_ref: str,
    scan_id: str,
    image_sha256: str,
    score: int,
    flagged: bool,
) -> None:
    """Write immutable audit record to your project SIEM or append-only store."""
    import json
    from datetime import datetime, timezone
    record = {
        "event_type":    "construction_doc_pi_scan",
        "project_id":    project_id,
        "doc_type":      doc_type,
        "doc_ref":       doc_ref,
        "scan_id":       scan_id,
        "image_sha256":  image_sha256,
        "score":         score,
        "flagged":       flagged,
        # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12
        "ts":            datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    }
    # Replace with write to your SIEM, append-only S3 object, or audit DB table
    print(json.dumps(record))


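# --- Usage examples ---
# submit_to_progress_ai, extract_permit_fields_with_ai, run_ai_quantity_takeoff, and
# submit_to_ai_rfi_review are placeholders for your own downstream vision-LLM calls.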

# 1. OpenSpace / Doxel / Buildots site survey photo before progress AI
def safe_ingest_site_photo(photo_path: str, project_id: str, upload_ref: str):
    image_bytes = Path(photo_path).read_bytes()
    scan_construction_document(
        image_bytes, ConstructionDocType.SITE_SURVEY_PHOTO,
        project_id, upload_ref,
    )
    # Only reached if scan passes — then pass to vision progress AI
    return submit_to_progress_ai(image_bytes, upload_ref)

# 2. Permit document scan before AI field extraction
def safe_ingest_permit_scan(scan_bytes: bytes, project_id: str, permit_number: str):
    scan_construction_document(
        scan_bytes, ConstructionDocType.PERMIT_DOCUMENT,
        project_id, permit_number,
    )
    return extract_permit_fields_with_ai(scan_bytes, permit_number)

# 3. BIM screenshot before Autodesk Takeoff / Togal.ai quantity extraction
def safe_ingest_bim_screenshot(png_bytes: bytes, project_id: str, export_name: str):
    scan_construction_document(
        png_bytes, ConstructionDocType.BIM_SCREENSHOT,
        project_id, export_name,
    )
    return run_ai_quantity_takeoff(png_bytes, export_name)

# 4. RFI submittal image before Procore Copilot / Construction IQ review
def safe_ingest_rfi_image(img_bytes: bytes, project_id: str, rfi_number: str):
    scan_construction_document(
        img_bytes, ConstructionDocType.RFI_SUBMITTAL,
        project_id, rfi_number,
    )
    return submit_to_ai_rfi_review(img_bytes, rfi_number)

All four entry points share the same audit record schema (project_id, doc_type, scan_id, and image_sha256), so every scanned image is uniquely identified and its scan outcome is persisted before the AI call. The ConstructionPIBlockedError should route the document to a human reviewer queue rather than silently dropping it; a high PI score indicates adversarial content in the image, but the underlying document may be legitimate and require manual processing. Fail-closed: the AI model is never reached when the scan threshold is exceeded.
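One way to wire the blocked case into a reviewer queue is sketched below. It is a minimal, self-contained illustration rather than a prescribed integration: `ReviewQueue`, `ingest_or_escalate`, and the `scan` and `process` callables are hypothetical names, and the local `ConstructionPIBlockedError` class only mirrors the one defined in the integration code so that the snippet runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


class ConstructionPIBlockedError(Exception):
    """Stand-in for the exception raised by scan_construction_document()."""


@dataclass
class ReviewQueue:
    """Hypothetical in-memory queue; swap for a ticketing system or DB table."""
    items: list = field(default_factory=list)

    def enqueue(self, project_id: str, doc_ref: str, reason: str) -> None:
        self.items.append(
            {"project_id": project_id, "doc_ref": doc_ref, "reason": reason}
        )


def ingest_or_escalate(
    image_bytes: bytes,
    project_id: str,
    doc_ref: str,
    scan: Callable[[bytes], dict],
    process: Callable[[bytes], dict],
    queue: ReviewQueue,
) -> Optional[dict]:
    """Run the pre-scan; call the vision model only if the scan passes."""
    try:
        scan(image_bytes)
    except ConstructionPIBlockedError as exc:
        # Fail closed: hold the document for a human instead of dropping it
        # or letting it reach the vision model.
        queue.enqueue(project_id, doc_ref, str(exc))
        return None
    return process(image_bytes)
```

In this sketch a blocked document returns `None` to the caller and leaves a queue entry carrying the exception message, so the reviewer sees the same project, document reference, and score details that the audit record holds.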

Coverage matrix

| Mitigation layer | Site survey photo injection | Permit document injection | BIM screenshot injection | RFI/submittal injection |
| --- | --- | --- | --- | --- |
| Text-only scanner (Lakera Guard, LLM Guard, Azure Prompt Shields) | No: image bytes not inspected | No: image bytes not inspected | No: image bytes not inspected | No: image bytes not inspected |
| OCR text extraction (Tesseract, Azure Form Recognizer) | No: adversarial payload not in OCR character set | Partial: misses pixel-level overlays and typographic PI | No: dimension annotations only, misses PI payloads | Partial: extracts printed text, misses image-embedded payloads |
| Human reviewer | No: payload imperceptible at normal review resolution | No: designed to evade visual inspection | No: pixel-level perturbations invisible at screen resolution | No: product data sheet images not reviewed at pixel level |
| Content moderation classifier (Azure AI Content Safety, AWS Rekognition) | No: classifies harmful content categories, not PI payloads | No: not designed for adversarial text overlay detection | No: detects explicit content, not injected instructions | No: wrong threat model for PI |
| Glyphward (pixel-level + waveform scan) | Yes: pixel-level scan of full drone photo | Yes: scanned document page render | Yes: BIM export image scan | Yes: embedded image content in RFI packages |

Related questions

Can construction AI progress monitoring (Doxel, OpenSpace) really be fooled by adversarial drone photos?

Yes, and the mechanism is the same one demonstrated in published research on adversarial attacks against vision-language models. Platforms like Doxel AI and OpenSpace AI pass drone-captured site photos to vision models that compare the current site state against a BIM baseline. The model reads the full pixel stream of each image, including content that a human reviewer looking at the image on a laptop screen does not notice. A FigStep-class adversarial payload (a low-contrast text instruction rendered in a patch of roughly 40×40 pixels in a sky or concrete region of the drone photo) is designed to be readable by the vision encoder while going unnoticed at human inspection speeds and at the compressed thumbnail resolution used in most platform UIs. The vision model, operating at full encoder resolution, reads the instruction and follows it. The payload can be as simple as a white-on-light-grey text string positioned in the upper corner of a rooftop photo; it does not require sophisticated steganography. Published work on typographic attacks against CLIP-style models (Goh et al., 2021; Materzynska et al., 2022) shows that text rendered into an image can override what the model otherwise sees, and FigStep (Gong et al., 2023) shows that vision-language models will act on instructions delivered as rendered text. The construction-specific attack vector adds financial motivation that does not exist in the research lab: a general contractor on a project with a seven-figure weekly delay penalty has direct economic incentive to experiment with payload delivery via the image upload channel.
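To put a number on how faint a white-on-light-grey string is to a human eye, the WCAG 2.x contrast-ratio formula can be applied to the pair of colours. The sketch below is illustrative only: the grey value (230, 230, 230) is an assumed example rather than a measured payload, and the calculation says nothing about how any particular vision encoder or scanner responds.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """WCAG relative luminance of an sRGB colour, in the range 0.0 to 1.0."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """WCAG contrast ratio: 1.0 for identical colours, 21.0 for black on white."""
    lum_a, lum_b = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


# Assumed example colours: white text on a light-grey sky or concrete region.
payload_ratio = contrast_ratio((255, 255, 255), (230, 230, 230))
print(f"contrast ratio: {payload_ratio:.2f}:1")
```

WCAG AA asks for at least 4.5:1 for normal body text; the assumed pair lands far below that, which is consistent with a reviewer skimming thumbnails not noticing the string.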

What is the financial incentive for a contractor to adversarially inject AI progress reports?

Construction delay penalties — liquidated damages (LDs) — are a standard clause in major construction contracts. On commercial, infrastructure, and data-centre projects, LD rates typically range from $10,000 to $500,000 per day of delay beyond the substantial completion date. On mega-projects (airports, hospitals, transit systems), daily LD rates can exceed $1 million. The AI progress monitoring systems used by owners and project managers on these projects — Doxel, OpenSpace, Buildots integrated with Procore and Autodesk ACC — generate progress reports that feed directly into the milestone certification and pay-application approval workflows. If the AI reports that a work package is complete when it is not, the owner may not flag the delay before the LD clock starts. If the AI suppresses a detected delay signal, the project manager reviewing the daily progress dashboard does not trigger the schedule recovery conversation that would otherwise happen. The contractor who avoids a single week of LD exposure on a $100,000/day project saves $700,000 — an amount that would fund significant adversarial research, tooling, and operational security to avoid attribution. The asymmetry between the cost of the attack and the financial exposure it mitigates is what makes this a credible and high-priority threat for construction technology platforms.
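The exposure arithmetic above is simple enough to sketch directly. The helper name `ld_exposure` is hypothetical, a flat daily rate is assumed, and the figures are the illustrative ones from this section rather than data from any specific contract.

```python
def ld_exposure(daily_rate_usd: int, delay_days: int) -> int:
    """Liquidated damages accrued over a delay, assuming a flat daily rate."""
    if daily_rate_usd < 0 or delay_days < 0:
        raise ValueError("rate and days must be non-negative")
    return daily_rate_usd * delay_days


# One avoided week on a $100,000/day project.
one_week = ld_exposure(100_000, 7)
print(f"${one_week:,}")  # prints "$700,000"
```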

How does adversarial building permit scan injection create safety risk?

Building permits and structural inspection certificates are gatekeeping documents: they must be issued and recorded before a building can be legally occupied. AI-powered permit management systems — including those used by municipal building departments adopting OpenGov or similar platforms, and contractor-side permit workflow tools in Procore — process scanned permit documents by passing the scan image to a vision LLM that extracts key fields: permit number, issue date, inspector name and licence number, approved work description, and compliance certifications. In the typical workflow, the AI-extracted fields are stored in the project management system and can feed automated approval gates. If an adversarial payload in the scanned permit image causes the AI to return a false approval date (earlier than the actual approval), a fabricated inspector signature, or a false structural clearance certification, the downstream approval gate may pass on the basis of the false AI extraction without a human reviewer cross-checking the original scan against the building authority’s records. In the construction fraud literature, permit document falsification most commonly targets structural and fire-resistance certifications — the categories where a false clearance most directly enables unsafe occupancy. An AI system that has been manipulated into extracting a false structural inspection pass is a direct enabler of building occupancy before the structural elements have been independently certified as safe. This is the same safety-criticality logic that drives lower (more conservative) thresholds for permit document scans in the Glyphward integration above.
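A gate that declines to auto-approve on AI extraction alone might look like the following. This is a sketch under stated assumptions, not a description of how any named platform works: `PermitExtraction`, `gate_decision`, and the three outcome strings are hypothetical, and the value 55 simply mirrors the permit threshold used in the integration code above.

```python
from dataclasses import dataclass

PERMIT_THRESHOLD = 55  # assumed: mirrors the permit_document threshold


@dataclass(frozen=True)
class PermitExtraction:
    permit_number: str
    structural_clearance: bool  # value reported by the AI extraction step
    scan_score: int             # pre-scan risk score for the source image
    human_verified: bool        # reviewer checked the scan against authority records


def gate_decision(x: PermitExtraction) -> str:
    """Return 'approve', 'reject', or 'hold_for_review' for one permit record."""
    if x.scan_score >= PERMIT_THRESHOLD:
        # Suspected injection: the extracted fields are not trusted at all.
        return "hold_for_review"
    if not x.structural_clearance:
        return "reject"
    # A clean scan is necessary but not sufficient; a human still has to
    # confirm the clearance before the gate opens.
    return "approve" if x.human_verified else "hold_for_review"
```

The design choice here is that a low scan score never approves anything by itself; it only makes the record eligible for the human cross-check described above.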

Are adversarial attacks on BIM takeoff AI practical given the technical skill required?

The barrier is lower than it might appear, and it is falling. The techniques that enable BIM screenshot injection do not require a PhD in machine learning. The FigStep family of attacks requires only the ability to render low-contrast text onto an image using a standard image library (PIL, OpenCV), a skill that a technically proficient subcontractor BIM coordinator, a quantity surveyor with Python exposure, or a malicious insider at a construction software firm could acquire and deploy without specialised adversarial ML knowledge. More sophisticated pixel-level perturbation attacks (FGSM, PGD-class) are gradient-based, so they need either knowledge of the target model’s architecture or a similar surrogate model to craft perturbations against; where a construction AI platform names its underlying models in its documentation (e.g., Autodesk Takeoff AI’s vision pipeline), that narrows the search space for a determined attacker. The procurement stakes make the investment worthwhile: a successful BIM quantity injection that under-reports steel tonnage on a major structural package by 15% can shift millions of dollars from the material supplier’s invoice line to the subcontractor’s margin. As construction AI takeoff tools become more widely adopted (Togal.ai, Autodesk Takeoff, and PlanSwift AI collectively serve thousands of construction firms), the adversarial tooling developed for one platform becomes reusable against others running similar vision encoder architectures. The practical threshold is not “can a nation-state actor do this” but “can a financially motivated subcontractor with access to a capable developer do this”, and the answer is yes.

Further reading