Prompt-injection scanner for Microsoft Semantic Kernel

Microsoft Semantic Kernel (SK) is an open-source AI SDK for .NET, Python, and Java. Its ChatHistory supports multimodal message content via ImageContent objects — a user can supply image bytes alongside a text prompt, and SK passes those bytes to the underlying vision model (Azure OpenAI GPT-4o, OpenAI GPT-4V, or another configured connector) without inspecting them for adversarial payloads. Azure Prompt Shields, if enabled on the Azure OpenAI endpoint, covers the text-content path. It does not inspect the ImageContent bytes for FigStep-class typographic jailbreak instructions or AgentTypo-class glyph distortions. Scan those bytes before they enter the chat history.

TL;DR

Before calling chatCompletionService.GetChatMessageContentsAsync() (C#) or chat_service.get_chat_message_contents() (Python), scan every ImageContent item in the user message. POST the image bytes to Glyphward's /v1/scan; if the score exceeds your threshold, reject the request before it reaches the model. One POST, under 200 ms, returns a 0–100 score and the flagged pixel region. Free tier: 10 scans/day, no card.

How Semantic Kernel handles multimodal inputs

Semantic Kernel's content model uses a ChatMessageContent class whose Items collection can hold any mix of TextContent and ImageContent objects. An ImageContent holds either raw bytes (Data as a ReadOnlyMemory<byte> in C# or a bytes object in Python) or a URI reference (Uri in C# / uri in Python). When GetChatMessageContentsAsync() is called, the SK connector serialises the ChatHistory into the model provider's request format — the image_url content blocks for OpenAI-compatible APIs, or the image source blocks for Anthropic.
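
To make the serialisation step concrete, the sketch below builds the kind of OpenAI-style image_url content block that a connector produces from raw image bytes. It illustrates the wire format only and is not SK's connector code; the function name to_image_url_block and the default MIME type are assumptions.

```python
import base64


def to_image_url_block(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes in an OpenAI-style image_url content block.

    Hypothetical helper for illustration; SK's connectors do this internally.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# The bytes are only re-encoded, never inspected, on their way into the request.
block = to_image_url_block(b"\x89PNG\r\n\x1a\n")
```

Nothing on this path looks at the pixels, which is why the scan has to run before the history is handed to the connector.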

The serialisation step passes the image bytes through to the model API without any analysis. SK's kernel middleware (filters in the IFunctionInvocationFilter / IPromptRenderFilter pipeline) operates on prompt templates and function calls, not on raw image bytes, and SK has no built-in vision-layer prompt-injection (PI) scan. The gap sits where it does in every other framework wrapper: the bytes leave your application and reach the vision encoder unscanned unless you add a scan explicitly.

C# intercept — before GetChatMessageContentsAsync

In a .NET application, add a scan helper that walks the ChatHistory items before the inference call:

using System.Net.Http.Json;
using System.Text.Json;
using Microsoft.SemanticKernel;                // ImageContent
using Microsoft.SemanticKernel.ChatCompletion; // ChatHistory, AuthorRole

public static class GlyphwardScanner
{
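    private static readonly HttpClient Http = new();
    // Read the key from the environment rather than hardcoding it in source.
    private static readonly string ApiKey =
        Environment.GetEnvironmentVariable("GLYPHWARD_API_KEY")
        ?? throw new InvalidOperationException("GLYPHWARD_API_KEY is not set");
    private const int ScoreThreshold = 70;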

    public static async Task ScanChatHistoryImagesAsync(ChatHistory chatHistory)
    {
        foreach (var message in chatHistory)
        {
            if (message.Role != AuthorRole.User) continue;

            foreach (var item in message.Items)
            {
                if (item is not ImageContent imageContent) continue;

                byte[] imageBytes;
                if (imageContent.Data is { IsEmpty: false })
                {
                    imageBytes = imageContent.Data.Value.ToArray();
                }
                else if (imageContent.Uri is not null)
                {
                    imageBytes = await Http.GetByteArrayAsync(imageContent.Uri);
                }
                else continue;

                var payload = new
                {
                    data = Convert.ToBase64String(imageBytes),
                    modality = "image",
                    source_trust = "low"
                };

                using var request = new HttpRequestMessage(HttpMethod.Post,
                    "https://api.glyphward.com/v1/scan");
                request.Headers.Add("Authorization", $"Bearer {ApiKey}");
                request.Content = JsonContent.Create(payload);

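                using var response = await Http.SendAsync(request);
                // Fail closed: a non-2xx scan response throws instead of letting the image through.
                response.EnsureSuccessStatusCode();
                var result = await response.Content.ReadFromJsonAsync<JsonElement>();
                int score = result.GetProperty("score").GetInt32();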

                if (score > ScoreThreshold)
                {
                    // ToString() handles both a string region and a structured (object/array) one.
                    string region = result.TryGetProperty("region", out var r)
                        ? r.ToString() : "";
                    throw new InvalidOperationException(
                        $"Image blocked: multimodal PI score {score} (region: {region})");
                }
            }
        }
    }
}

// Usage:
// await GlyphwardScanner.ScanChatHistoryImagesAsync(chatHistory);
// var reply = await chatCompletionService.GetChatMessageContentsAsync(chatHistory, kernel: kernel);

Python intercept — before get_chat_message_contents

The same pattern in Python with the semantic-kernel package:

import base64
import os

import httpx
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.contents.image_content import ImageContent

GLYPHWARD_API_KEY = os.environ["GLYPHWARD_API_KEY"]

async def scan_chat_history_images(chat_history: ChatHistory, threshold: int = 70) -> None:
    """Scan all ImageContent items in user messages. Raises if PI score exceeds threshold."""
    async with httpx.AsyncClient() as client:
        for message in chat_history.messages:
            if message.role.value != "user":
                continue
            for item in message.items:
                if not isinstance(item, ImageContent):
                    continue
                if item.data_uri:
                    # data URI: extract base64 part
                    b64 = item.data_uri.split(",", 1)[1] if "," in item.data_uri else item.data_uri
                    img_bytes = base64.b64decode(b64)
                elif item.uri:
                    resp = await client.get(str(item.uri), timeout=10)
                    resp.raise_for_status()  # do not scan an error page in place of the image
                    img_bytes = resp.content
                else:
                    continue

                scan_resp = await client.post(
                    "https://api.glyphward.com/v1/scan",
                    json={
                        "data": base64.b64encode(img_bytes).decode(),
                        "modality": "image",
                        "source_trust": "low",
                    },
                    headers={"Authorization": f"Bearer {GLYPHWARD_API_KEY}"},
                    timeout=5,
                )
                scan_resp.raise_for_status()  # fail closed if the scan request itself failed
                result = scan_resp.json()
                if result["score"] > threshold:
                    raise ValueError(
                        f"Image blocked: multimodal PI score {result['score']} "
                        f"(region: {result.get('region')})"
                    )

# Usage:
# await scan_chat_history_images(chat_history)
# response = await chat_service.get_chat_message_contents(chat_history, settings=settings)
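
The network-free parts of that helper can be pulled into pure functions, which makes the threshold logic unit-testable without calling the API. The sketch below assumes the request fields (data, modality, source_trust) and response fields (score, region) used above. The names build_scan_payload, is_blocked, and decode_data_uri are hypothetical, and treating a missing score as a block is a design assumption rather than documented Glyphward behaviour.

```python
import base64


def build_scan_payload(image_bytes: bytes, source_trust: str = "low") -> dict:
    """Build the JSON body for POST /v1/scan (field names as used in this article)."""
    return {
        "data": base64.b64encode(image_bytes).decode("ascii"),
        "modality": "image",
        "source_trust": source_trust,
    }


def is_blocked(result: dict, threshold: int = 70) -> bool:
    """Return True when the image should be rejected.

    Assumption: a response without a numeric score is treated as a block (fail closed).
    """
    score = result.get("score")
    if not isinstance(score, (int, float)) or isinstance(score, bool):
        return True
    return score > threshold


def decode_data_uri(data_uri: str) -> bytes:
    """Extract the bytes from a base64 data URI such as data:image/png;base64,..."""
    header, sep, body = data_uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    return base64.b64decode(body, validate=True)
```

Keeping the comparison strict (score > threshold) matches the helper above: a score equal to the threshold passes.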

SK Kernel filters: a cleaner integration point

Semantic Kernel's filter pipeline (IPromptRenderFilter and IFunctionInvocationFilter in C#; the prompt-rendering and function-invocation filter types in Python) runs before and after prompt rendering and function calls. For a cleaner integration, implement the scan as a prompt render filter that intercepts multimodal prompts before they are sent to the model:

// C#: implement IPromptRenderFilter
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

public class MultimodalPIScanFilter : IPromptRenderFilter
{
    public async Task OnPromptRenderAsync(
        PromptRenderContext context, Func<PromptRenderContext, Task> next)
    {
        // Scan chat history images before the prompt is rendered.
        // Assumes the caller passes the history as the "chat_history" kernel argument.
        if (context.Arguments.TryGetValue("chat_history", out var value)
            && value is ChatHistory history)
        {
            await GlyphwardScanner.ScanChatHistoryImagesAsync(history);
        }
        await next(context);
    }
}

// Register in the kernel builder:
// builder.Services.AddSingleton<IPromptRenderFilter, MultimodalPIScanFilter>();

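The filter pattern keeps the scan concern separate from your business logic: you register it once and it runs for every prompt function invoked through the kernel, without modifying call sites throughout the codebase. Prompt render filters fire only when the kernel renders a prompt, so a direct call to chatCompletionService.GetChatMessageContentsAsync() still needs the explicit scan shown earlier.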
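
The control flow of a filter is easier to see without the framework. The following is a framework-free sketch of the same next-delegate pattern in plain Python. It does not use SK's API: run_pipeline, scan_filter, call_model, and the dict-based context are all hypothetical, and the precomputed image_scores stand in for real scan results. The point it illustrates is that when the scan stage raises, the next stage is never awaited, so the model call never happens.

```python
import asyncio
from typing import Awaitable, Callable

Context = dict
Next = Callable[[Context], Awaitable[None]]
Filter = Callable[[Context, Next], Awaitable[None]]


async def run_pipeline(context: Context, filters: list[Filter], final: Next) -> None:
    """Run filters in order; each one decides whether to call the next stage."""
    async def call(index: int, ctx: Context) -> None:
        if index == len(filters):
            await final(ctx)
            return
        await filters[index](ctx, lambda c: call(index + 1, c))

    await call(0, context)


async def scan_filter(context: Context, next_stage: Next) -> None:
    """Hypothetical scan stage: block when any image score exceeds the threshold."""
    for score in context.get("image_scores", []):
        if score > context.get("threshold", 70):
            raise ValueError(f"Image blocked: multimodal PI score {score}")
    await next_stage(context)


async def call_model(context: Context) -> None:
    """Stand-in for the model call; only reached when every filter calls next."""
    context["model_called"] = True


async def demo(scores: list[int]) -> bool:
    """Return whether the stand-in model call was reached for the given scores."""
    context: Context = {"image_scores": scores, "model_called": False}
    try:
        await run_pipeline(context, [scan_filter], call_model)
    except ValueError:
        pass  # the request was rejected before reaching the model
    return context["model_called"]
```

Calling asyncio.run(demo([10, 95])) exercises the blocked path, and asyncio.run(demo([10, 20])) the pass-through path.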

SK memory and vector store: the indirect-PI surface

Semantic Kernel includes a memory / vector store abstraction for RAG use cases. If your pipeline indexes documents containing images (PDFs, presentations, scanned pages) into an SK vector store, the embedded images in those documents represent an indirect PI surface: a document with a FigStep payload on an image page enters the store, persists, and delivers its payload whenever that page is retrieved and passed to the model as context. Scan documents at ingestion time, before they are written to the vector store. The pre-ingestion scan pattern is the same as for LlamaIndex and other RAG pipelines — the framework wrapper changes; the scan call is identical.
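
A pre-ingestion gate can be sketched as a check over the images extracted from each document. The example below is an outline under stated assumptions: scan is any callable that returns a 0–100 score (for instance a wrapper around the /v1/scan call shown earlier), and ExtractedImage, ingest_documents, and the write callback are hypothetical names rather than SK or Glyphward APIs. In this sketch a document is rejected as a whole if any of its images exceeds the threshold.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ExtractedImage:
    """One image pulled out of a document during parsing (hypothetical type)."""
    document_id: str
    page: int
    data: bytes


def ingest_documents(
    images_by_document: dict[str, Iterable[ExtractedImage]],
    scan: Callable[[bytes], int],
    write: Callable[[str], None],
    threshold: int = 70,
) -> list[str]:
    """Write only documents whose images all score at or below the threshold.

    Returns the ids of rejected documents so they can be quarantined or reviewed.
    """
    rejected: list[str] = []
    for document_id, images in images_by_document.items():
        if any(scan(image.data) > threshold for image in images):
            rejected.append(document_id)  # never reaches the vector store
        else:
            write(document_id)
    return rejected
```

Rejecting the whole document is the conservative choice; a pipeline that indexes page by page could instead drop only the flagged pages.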