OWASP LLM04:2025 Model Denial of Service — multimodal resource exhaustion

OWASP LLM04:2025 names a class of attacks where an adversary sends inputs that consume disproportionate compute, memory, or API quota, making the LLM slow, expensive, or unavailable. The text-input variant is well understood: unusually long prompts, deeply recursive context, or requests designed to maximize the number of output tokens. The multimodal variant is less studied but potentially more severe. A crafted image can inflate the model's attention computation by orders of magnitude relative to a benign image of the same file size. An attacker who can influence which images your application sends to the model (via image upload, URL parameter, retrieved document, or tool output) can cause a single request to cost 10–100× its normal compute budget. At scale, this is a direct denial-of-service attack on your inference budget. Glyphward detects anomalously complex images, including adversarial textures designed to maximize tokenization cost, and returns a complexity_score alongside the injection score, letting you block or throttle before the model call.

TL;DR

Before every multimodal model call, scan the input image with POST https://glyphward.com/v1/scan. Check both score (injection risk) and complexity_score (DoS risk). If complexity_score ≥ 80, downscale or reject the image before it reaches the model. Free tier — 10 scans/day, no card required.

How adversarial images cause multimodal model DoS

Attention cost grows quadratically with the visual token count. Multimodal models (GPT-4o, Claude 3, Gemini 1.5) tile images into patches before encoding. The number of visual tokens generated from an image is a function of both its resolution and its high-frequency content (detail, texture, noise). A 1024×1024 benign photograph generates roughly 1 000 visual tokens. A 1024×1024 image filled with adversarial noise, specifically crafted to maximize edge detection and texture variance, can generate 3–5× as many visual tokens, because the patch encoder assigns maximum information density to every tile. Self-attention in the transformer scales as O(n²) in the token count n, so tripling the token count from a single image increases the attention FLOPs for that image by roughly 9×, and a 5× increase raises them by roughly 25×.
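
The scaling argument above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes self-attention cost is proportional to n² in the visual token count n and ignores the terms that grow linearly with n (projections, MLP layers). The function name is illustrative and not part of any API.

```python
def attention_cost_multiplier(token_multiplier: float) -> float:
    """Relative self-attention cost when the token count grows by `token_multiplier`.

    Assumes cost is proportional to n^2 and ignores linear-in-n terms, so it
    estimates the attention part only, not the whole forward pass.
    """
    if token_multiplier <= 0:
        raise ValueError("token_multiplier must be positive")
    return token_multiplier ** 2


for m in (3, 5):
    print(f"{m}x tokens -> ~{attention_cost_multiplier(m):.0f}x attention cost")
# 3x tokens -> ~9x attention cost
# 5x tokens -> ~25x attention cost
```

Because the linear terms are ignored, the end-to-end slowdown of a real request will usually be smaller than these figures.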

Three known adversarial-image DoS vectors.

1. Adversarial texture injection. Images filled with fine-grained, high-entropy textures (visually similar to white noise or microscopy images) maximise patch-encoder output token counts. VRAM usage and compute time per image spike relative to benign images of identical file size. Attackers do not need to craft these images from scratch: adversarial noise generation is well documented in the adversarial ML literature (FGSM, PGD, and C&W attacks all produce high-frequency perturbations, which can incidentally raise tokenisation cost).

2. Oversized or extremely high-resolution images. Most multimodal APIs impose resolution limits (e.g. Anthropic Claude: max 1 568 × 1 568 px; OpenAI GPT-4o: max 2 048 × 2 048 px on the high-detail tier). Before these limits take effect, however, images must be validated and resized by the API gateway. A stream of near-limit images (e.g. 2 047 × 2 047 px) can saturate the gateway's preprocessing compute before any model call occurs. At the free or starter tier of most APIs, this is enough to exhaust the request quota and block legitimate users.

3. Repeated identical images in multimodal context. Some agentic frameworks pass the same image multiple times in a conversation context (once as a user upload, once as a retrieved document chunk, once as a tool result). Each appearance is tokenised independently; the model's context window fills with redundant visual tokens, and the per-token compute cost applies to each copy. An attacker who can cause a single image to appear N times in a multi-turn conversation multiplies the DoS impact by N.
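
The third vector can be blunted before any scan or model call by dropping repeated copies of the same image while the context is assembled. The sketch below is one possible approach, not a feature of any particular framework. It assumes images are available as raw bytes, and the helper name dedupe_images is hypothetical.

```python
import hashlib
from typing import Iterable, List


def dedupe_images(images: Iterable[bytes]) -> List[bytes]:
    """Return images in first-seen order, dropping byte-identical repeats."""
    seen = set()
    unique = []
    for data in images:
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            continue  # the same bytes are already queued for this context
        seen.add(digest)
        unique.append(data)
    return unique


context_images = [b"upload-bytes", b"tool-result-bytes", b"upload-bytes"]
print(len(dedupe_images(context_images)))  # 2
```

A cryptographic hash only catches byte-identical copies. An image that has been re-encoded or resized between appearances would need a perceptual hash or a similar comparison, which this sketch does not attempt.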

Detection and mitigation — Python example

import base64
import io
import os

import requests
from PIL import Image

GLYPHWARD_KEY = os.environ["GLYPHWARD_API_KEY"]
INJECTION_THRESHOLD = 65
COMPLEXITY_THRESHOLD = 80
MAX_PIXELS = 1_500_000  # 1.5 MP hard cap before scan

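def preprocess_image(image_bytes: bytes) -> bytes:
    """Downscale to at most MAX_PIXELS before the image is sent to the scan API."""
    img = Image.open(io.BytesIO(image_bytes))
    w, h = img.size
    if w * h <= MAX_PIXELS:
        return image_bytes  # already under the cap; keep the original encoding
    # Scale both sides by the same factor so the aspect ratio is preserved.
    scale = (MAX_PIXELS / (w * h)) ** 0.5
    # max(1, ...) guards against a zero-sized side on extreme aspect ratios.
    new_size = (max(1, int(w * scale)), max(1, int(h * scale)))
    img = img.resize(new_size, Image.LANCZOS)
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getvalue()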

def scan_for_injection_and_dos(image_bytes: bytes, source: str = "api") -> dict:
    preprocessed = preprocess_image(image_bytes)
    resp = requests.post(
        "https://glyphward.com/v1/scan",
        json={"image": base64.b64encode(preprocessed).decode(), "source": source},
        headers={"Authorization": f"Bearer {GLYPHWARD_KEY}"},
        timeout=8,
    )
    resp.raise_for_status()
    return resp.json()

def safe_multimodal_call(image_bytes: bytes, user_text: str) -> str:
    """
    Gate: scan for both injection (LLM01) and DoS complexity (LLM04).
    Hard-rejects on the injection threshold; downscales and recompresses
    on the complexity threshold instead of rejecting.
    """
    # Cap the resolution once so the model receives the same bytes that were scanned.
    image_bytes = preprocess_image(image_bytes)
    try:
        result = scan_for_injection_and_dos(image_bytes, source="user_upload")
    except Exception as exc:
        # Fail closed on scanner failure
        raise ValueError("Image could not be verified; request rejected") from exc

    if result["score"] >= INJECTION_THRESHOLD:
        raise ValueError(f"Image rejected: adversarial injection content (score={result['score']})")

    if result.get("complexity_score", 0) >= COMPLEXITY_THRESHOLD:
        # Soften rather than hard-reject (complexity may be benign, e.g. a detailed diagram):
        # halve each side, then re-encode as JPEG to reduce high-frequency content.
        img = Image.open(io.BytesIO(image_bytes)).convert("RGB")
        img = img.resize((max(1, img.width // 2), max(1, img.height // 2)), Image.LANCZOS)
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=70)
        image_bytes = buf.getvalue()
        # Optionally re-scan the softened bytes here before the model call.

    # Safe to pass to the model (call_your_vision_model is a placeholder for your own client)
    return call_your_vision_model(image_bytes, user_text)

# Rate-limit by complexity score for bulk endpoints
def rate_limit_by_complexity(scan_result: dict, user_id: str) -> bool:
    """Returns True if the request should proceed; False if rate-limited."""
    complexity = scan_result.get("complexity_score", 0)
    if complexity >= 60:
        # High-complexity image: count as 3× tokens against the user's quota
        token_weight = 3.0
    elif complexity >= 40:
        # Medium-complexity image: count as 1.5× tokens
        token_weight = 1.5
    else:
        token_weight = 1.0
    # your_rate_limiter is a placeholder for your own limiter; it must accept fractional weights
    return your_rate_limiter.consume(user_id, tokens=token_weight)
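
The your_rate_limiter object above is left undefined. One way to back it is a per-user token bucket that accepts fractional weights. The class below is a hypothetical in-memory sketch, not a Glyphward component: it keeps state in a single process and is not thread-safe, so a production deployment would more likely keep the counters in a shared store.

```python
import time


class WeightedTokenBucket:
    """Per-user token bucket that accepts fractional weights (hypothetical helper)."""

    def __init__(self, capacity: float, refill_per_second: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock  # injectable so the bucket can be tested without sleeping
        self._state = {}  # user_id -> (tokens_remaining, last_refill_time)

    def consume(self, user_id: str, tokens: float = 1) -> bool:
        """Deduct `tokens` from the user's bucket; return False if it would go negative."""
        now = self._clock()
        remaining, last = self._state.get(user_id, (self.capacity, now))
        # Refill for the elapsed time, never above capacity.
        remaining = min(self.capacity, remaining + (now - last) * self.refill_per_second)
        if tokens > remaining:
            self._state[user_id] = (remaining, now)
            return False
        self._state[user_id] = (remaining - tokens, now)
        return True


your_rate_limiter = WeightedTokenBucket(capacity=30, refill_per_second=0.5)
print(your_rate_limiter.consume("user-123", tokens=3))  # True on a fresh bucket
```

With a weight of 3 for high-complexity images, a user who uploads only such images drains the bucket three times faster than one who uploads ordinary photographs.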

The two-threshold pattern separates injection risk (block hard) from DoS complexity (downscale or rate-limit). Many benign images, such as microscopy photos, satellite imagery, and dense engineering diagrams, are legitimately complex, so the correct response to a high complexity_score on its own is downscaling and JPEG compression, not rejection. Reject whenever the injection score crosses its threshold (score ≥ 65), whatever the complexity. A high score combined with a high complexity_score is the signature of a crafted DoS-plus-injection image.
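
The same policy can be written as a small pure function, which makes it easy to unit-test separately from the network call. This is a sketch under stated assumptions: the score and complexity_score fields and both thresholds come from the example above, while the Action enum, the decide name, and the choice to treat a missing score as a rejection are illustrative.

```python
from enum import Enum

INJECTION_THRESHOLD = 65
COMPLEXITY_THRESHOLD = 80


class Action(Enum):
    REJECT = "reject"
    DOWNSCALE = "downscale"
    ALLOW = "allow"


def decide(scan_result: dict) -> Action:
    """Map a scan result onto the two-threshold policy."""
    score = scan_result.get("score")
    # Assumption: a result without an injection score is untrusted (fail closed).
    if score is None or score >= INJECTION_THRESHOLD:
        return Action.REJECT
    if scan_result.get("complexity_score", 0) >= COMPLEXITY_THRESHOLD:
        return Action.DOWNSCALE
    return Action.ALLOW


print(decide({"score": 12, "complexity_score": 91}))  # Action.DOWNSCALE
```

The injection check comes first, so an image that crosses both thresholds is rejected and never reaches the downscale path.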

LLM04 in the OWASP LLM Top 10 context

OWASP risk	Text-only LLM	Multimodal LLM — additional surface	Glyphward coverage
LLM01: Prompt Injection	Malicious instructions in text input	Malicious instructions hidden in image pixels (FigStep, AgentTypo, typographic PI)	Yes — injection score
LLM04: Model DoS	Excessively long prompts, deeply recursive context	Adversarial texture images, extreme-resolution images, repeated image tokens in context	Yes — complexity score
LLM02: Insecure Output Handling	Model output rendered without sanitisation	Image-origin injection that produces unsafe model output (XSS via generated HTML)	Partial — injection scan limits the source of dangerous output
LLM03: Training Data Poisoning	Training set contaminated with adversarial text	Multimodal training set contaminated with adversarial images	Partial — scan at inference time; training-time coverage requires separate corpus audit

Related questions

What exactly is OWASP LLM04:2025 Model Denial of Service?

OWASP LLM04 describes attacks where an adversary sends inputs that cause the LLM to consume an abnormally large amount of resources (compute time, memory, API tokens, or financial budget) without producing proportionally more useful output. Model Denial of Service is the LLM04 entry in the 2023 edition (v1.1) of the OWASP Top 10 for LLM Applications; the 2025 edition renumbers the list and covers this risk under LLM10:2025 Unbounded Consumption. The attack goal may be financial (driving up inference costs until the victim exhausts their budget and the service degrades) or availability (consuming all available compute so that legitimate requests time out). Unlike traditional web DoS, which exhausts network bandwidth or TCP connections, LLM DoS exploits the economic structure of LLM APIs: most providers charge per token, and complex or high-resolution inputs consume far more tokens than simple text.

How do I know if an image is genuinely complex vs adversarially crafted for DoS?

Glyphward's complexity_score is derived from several signals: Shannon entropy of the image, edge density (via Sobel operator), patch-encoder token estimate, and spectral analysis (power spectral density of high-frequency components). Adversarially crafted noise images score extremely high on entropy and edge density while scoring near-zero on semantic content indicators (no coherent objects, no text regions, no dominant colours). Benign complex images (microscopy, satellite) score high on entropy but also on semantic content. The complexity_score is a composite that discounts legitimate complexity. That said, some false positives are inevitable — downscaling rather than hard-rejecting high-complexity images is the recommended response for production deployments.
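
Two of the signals named above are simple enough to illustrate directly. The sketch below is not Glyphward's implementation. It assumes 8-bit grayscale pixel values supplied as plain Python sequences, and it replaces the Sobel operator with a cruder horizontal-difference measure to stay dependency-free.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(pixels: Sequence[int]) -> float:
    """Entropy in bits per pixel of 8-bit grayscale values (0.0 to 8.0)."""
    if not pixels:
        return 0.0
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def edge_density(rows: Sequence[Sequence[int]], threshold: int = 32) -> float:
    """Fraction of horizontally adjacent pixel pairs that differ by more than `threshold`.

    A crude stand-in for a Sobel-based edge measure.
    """
    pairs = 0
    edges = 0
    for row in rows:
        for a, b in zip(row, row[1:]):
            pairs += 1
            edges += abs(a - b) > threshold
    return edges / pairs if pairs else 0.0


flat = [[128] * 8 for _ in range(8)]       # uniform grey tile
checker = [[0, 255] * 4 for _ in range(8)]  # alternating black/white columns
print(shannon_entropy([p for r in flat for p in r]), edge_density(flat))
print(shannon_entropy([p for r in checker for p in r]), edge_density(checker))
```

The uniform tile scores zero on both measures, while the alternating tile scores 1 bit of entropy and an edge density of 1.0. Neither measure says anything about semantic content, which is why the text above describes a composite score instead of relying on one signal.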

Are there known real-world examples of multimodal DoS attacks?

There are no widely publicised confirmed incidents of multimodal LLM DoS in production as of 2026. However, the underlying mechanism — that adversarial noise images inflate visual token counts and attention FLOPS — is a documented property of vision transformers studied in the adversarial ML literature. The risk is particularly acute for free-tier or startup deployments that operate close to their inference budget ceiling. A competitor, disgruntled user, or bot operator targeting a public-facing image-scanning endpoint can cause a cost spike without any obvious visible impact on the product, making the attack difficult to detect without cost anomaly monitoring.
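
Cost anomaly monitoring can start very simply. The detector below is a minimal sketch that flags a request whose cost sits far above a rolling baseline. The class name, the window size, the threshold of 4 standard deviations, and the use of per-request visual token counts as the cost signal are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class CostAnomalyDetector:
    """Flag per-request costs that sit far above a rolling baseline (illustrative)."""

    def __init__(self, window: int = 50, z_threshold: float = 4.0, min_samples: int = 10):
        self.history = deque(maxlen=window)
        self.z_threshold = z_threshold
        self.min_samples = min_samples

    def observe(self, cost: float) -> bool:
        """Record a cost; return True if it is anomalous against recent history."""
        anomalous = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            # Floor sigma so a perfectly flat baseline does not divide by zero.
            sigma = max(pstdev(self.history), 1e-9, 0.05 * abs(mu))
            anomalous = (cost - mu) / sigma > self.z_threshold
        if not anomalous:
            self.history.append(cost)  # keep spikes out of the baseline
        return anomalous


detector = CostAnomalyDetector(window=20, min_samples=5)
for tokens in (1000, 1010, 990, 1005, 995, 1002):
    detector.observe(tokens)
print(detector.observe(9000))  # True: roughly 9x the recent baseline
```

Excluding flagged values from the baseline keeps a burst of expensive requests from normalising itself, at the price of continuing to alert if typical costs shift permanently. In that case the baseline has to be reset by hand.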

How does this interact with OWASP LLM01 (Prompt Injection)?

LLM01 and LLM04 are distinct but can be combined in a single adversarial image. A DoS-plus-injection image is crafted to both exhaust compute (bypassing rate limits by making each request expensive) and inject instructions into the model's context. Glyphward's scan returns both an injection score (the score field) and a complexity_score, and both thresholds should be checked independently. If an image triggers the injection threshold, reject it hard (do not downscale and retry). If it triggers only the complexity threshold, downscale it and scan the downscaled version again to check whether it triggers the injection threshold.
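
The reject, downscale, and re-scan sequence described above can be sketched with the scanner and the downscaler passed in as callables, so the control flow can be exercised without network access or an imaging library. The gate_image name and both callable parameters are hypothetical. Only the score and complexity_score fields and the two thresholds come from the example earlier on this page.

```python
from typing import Callable

INJECTION_THRESHOLD = 65
COMPLEXITY_THRESHOLD = 80


def gate_image(
    image: bytes,
    scan: Callable[[bytes], dict],
    soften: Callable[[bytes], bytes],
) -> bytes:
    """Return the bytes that may be sent to the model, or raise ValueError."""
    result = scan(image)
    if result["score"] >= INJECTION_THRESHOLD:
        raise ValueError("injection threshold exceeded")
    if result.get("complexity_score", 0) < COMPLEXITY_THRESHOLD:
        return image
    softened = soften(image)
    # The softened bytes are a new input, so they get their own injection check.
    if scan(softened)["score"] >= INJECTION_THRESHOLD:
        raise ValueError("injection threshold exceeded after downscaling")
    return softened
```

In practice scan would wrap the call to the scan endpoint and soften would wrap the downscale-and-recompress step. The second scan costs one extra request for high-complexity images only.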
