Prompt-injection scanner for Microsoft Semantic Kernel
Microsoft Semantic Kernel (SK) is an open-source AI SDK for .NET, Python, and Java. Its ChatHistory supports multimodal message content via ImageContent objects — a user can supply image bytes alongside a text prompt, and SK passes those bytes to the underlying vision model (Azure OpenAI GPT-4o, OpenAI GPT-4V, or another configured connector) without inspecting them for adversarial payloads. Azure Prompt Shields, if enabled on the Azure OpenAI endpoint, covers the text-content path. It does not inspect the ImageContent bytes for FigStep-class typographic jailbreak instructions or AgentTypo-class glyph distortions. Scan those bytes before they enter the chat history.
TL;DR
Before calling chatCompletionService.GetChatMessageContentsAsync() (C#) or chat_service.get_chat_message_contents() (Python), scan every ImageContent item in the user message. POST the image bytes to Glyphward's /v1/scan; if the score exceeds your threshold, reject the request before it reaches the model. One POST, under 200 ms, returns a 0–100 score and the flagged pixel region. Free tier: 10 scans/day, no card.
How Semantic Kernel handles multimodal inputs
Semantic Kernel's content model uses a ChatMessageContent class whose Items collection can hold any mix of TextContent and ImageContent objects. An ImageContent holds either raw bytes (Data as a ReadOnlyMemory<byte> in C# or a bytes object in Python) or a URI reference (Uri in C# / uri in Python). When GetChatMessageContentsAsync() is called, the SK connector serialises the ChatHistory into the model provider's request format — the image_url content blocks for OpenAI-compatible APIs, or the image source blocks for Anthropic.
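The traversal this implies can be sketched without SK installed. The classes below are hypothetical stand-ins for TextContent, ImageContent, and ChatMessageContent (not the real SK types), assuming only what the paragraph above states: a message has a role and a mixed list of items, and an image item carries either bytes or a URI.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TextItem:
    """Stand-in for SK's TextContent (hypothetical, not the SK class)."""
    text: str


@dataclass
class ImageItem:
    """Stand-in for SK's ImageContent: raw bytes or a URI reference."""
    data: Optional[bytes] = None
    uri: Optional[str] = None


@dataclass
class Message:
    """Stand-in for ChatMessageContent: a role plus a mixed list of items."""
    role: str
    items: list = field(default_factory=list)


def collect_user_images(messages):
    """Return (kind, value) pairs for every image item in user messages."""
    found = []
    for message in messages:
        if message.role != "user":
            continue
        for item in message.items:
            if not isinstance(item, ImageItem):
                continue  # text items are not part of the image path
            if item.data:
                found.append(("bytes", item.data))
            elif item.uri:
                found.append(("uri", item.uri))
    return found


history = [
    Message("system", [TextItem("You are a helpful assistant.")]),
    Message("user", [TextItem("What does this sign say?"),
                     ImageItem(data=b"\x89PNG...")]),
    Message("user", [ImageItem(uri="https://example.com/photo.png")]),
]
images = collect_user_images(history)
```

Both branches matter: an image supplied by URI is as much a part of the request as one supplied as bytes, so a scanner has to handle both.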
The serialisation step passes the image bytes through to the model API without any analysis. SK's filters (IFunctionInvocationFilter and IPromptRenderFilter) operate on function invocations and rendered prompt templates, not on raw image bytes, and SK has no built-in vision-layer prompt-injection (PI) scan. The gap sits in the same place as in other framework wrappers: the bytes leave your application and reach the vision encoder unscanned unless you add a scan explicitly.
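Adding a scan explicitly comes down to a gate between building the message and sending it. The sketch below covers only the decision step, assuming the response shape described in the TL;DR (a numeric score from 0 to 100). The function name should_block and the fail-closed handling of malformed responses are illustrative choices, not part of the Glyphward API.

```python
def should_block(scan_result, threshold=70):
    """Return True when the image must be rejected.

    Fails closed: a missing, non-numeric, or out-of-range score blocks
    the image instead of letting it through unscanned.
    """
    score = scan_result.get("score") if isinstance(scan_result, dict) else None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return True
    if not 0 <= score <= 100:
        return True
    return score > threshold
```

Failing closed trades availability for safety: if the scanner is unreachable or returns something unexpected, the image is rejected. Whether that trade is right depends on the application.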
C# intercept — before GetChatMessageContentsAsync
In a .NET application, add a scan helper that walks the ChatHistory items before the inference call:
using System.Net.Http.Json;
using System.Text.Json;
using Microsoft.SemanticKernel;                 // ImageContent
using Microsoft.SemanticKernel.ChatCompletion;  // ChatHistory, AuthorRole

public static class GlyphwardScanner
{
    private static readonly HttpClient Http = new();
    private const string ApiKey = "YOUR_GLYPHWARD_API_KEY"; // use env var in production
    private const int ScoreThreshold = 70;

    // Scans every image in user messages; throws if any score exceeds the threshold.
    public static async Task ScanChatHistoryImagesAsync(ChatHistory chatHistory)
    {
        foreach (var message in chatHistory)
        {
            if (message.Role != AuthorRole.User) continue;

            foreach (var item in message.Items)
            {
                if (item is not ImageContent imageContent) continue;

                // Resolve the image to raw bytes, whether inline or referenced by URI.
                byte[] imageBytes;
                if (imageContent.Data is { IsEmpty: false })
                {
                    imageBytes = imageContent.Data.Value.ToArray();
                }
                else if (imageContent.Uri is not null)
                {
                    imageBytes = await Http.GetByteArrayAsync(imageContent.Uri);
                }
                else continue;

                var payload = new
                {
                    data = Convert.ToBase64String(imageBytes),
                    modality = "image",
                    source_trust = "low"
                };

                using var request = new HttpRequestMessage(HttpMethod.Post,
                    "https://api.glyphward.com/v1/scan");
                request.Headers.Add("Authorization", $"Bearer {ApiKey}");
                request.Content = JsonContent.Create(payload);

                using var response = await Http.SendAsync(request);
                // Fail closed: a failed scan must not be treated as a clean image.
                response.EnsureSuccessStatusCode();

                var result = await response.Content.ReadFromJsonAsync<JsonElement>();
                int score = result.GetProperty("score").GetInt32();
                if (score > ScoreThreshold)
                {
                    // ToString() works whether "region" is a string or a JSON object.
                    string region = result.TryGetProperty("region", out var r)
                        ? r.ToString() : "";
                    throw new InvalidOperationException(
                        $"Image blocked: multimodal PI score {score} (region: {region})");
                }
            }
        }
    }
}

// Usage:
// await GlyphwardScanner.ScanChatHistoryImagesAsync(chatHistory);
// var reply = await chatCompletionService.GetChatMessageContentsAsync(chatHistory, kernel: kernel);
Python intercept — before get_chat_message_contents
The same pattern in Python with the semantic-kernel package:
import base64
import os

import httpx
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.contents.image_content import ImageContent

GLYPHWARD_API_KEY = os.environ["GLYPHWARD_API_KEY"]


async def scan_chat_history_images(chat_history: ChatHistory, threshold: int = 70) -> None:
    """Scan all ImageContent items in user messages. Raises if PI score exceeds threshold."""
    async with httpx.AsyncClient() as client:
        for message in chat_history.messages:
            if message.role.value != "user":
                continue
            for item in message.items:
                if not isinstance(item, ImageContent):
                    continue
                if item.data_uri:
                    # Data URI: decode the base64 part after the comma.
                    b64 = item.data_uri.split(",", 1)[1] if "," in item.data_uri else item.data_uri
                    img_bytes = base64.b64decode(b64)
                elif item.uri:
                    resp = await client.get(str(item.uri), timeout=10)
                    resp.raise_for_status()  # do not scan an error page in place of the image
                    img_bytes = resp.content
                else:
                    continue
                scan_resp = await client.post(
                    "https://api.glyphward.com/v1/scan",
                    json={
                        "data": base64.b64encode(img_bytes).decode(),
                        "modality": "image",
                        "source_trust": "low",
                    },
                    headers={"Authorization": f"Bearer {GLYPHWARD_API_KEY}"},
                    timeout=5,
                )
                scan_resp.raise_for_status()  # fail closed if the scan itself failed
                result = scan_resp.json()
                if result["score"] > threshold:
                    raise ValueError(
                        f"Image blocked: multimodal PI score {result['score']} "
                        f"(region: {result.get('region')})"
                    )

# Usage:
# await scan_chat_history_images(chat_history)
# response = await chat_service.get_chat_message_contents(chat_history, settings=settings)
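Because the helper above walks the whole chat history on every call, an image from an earlier turn is scanned again on each later turn. One way to avoid that, sketched below, is to key results by a SHA-256 digest of the bytes. ScanCache and fake_scan are hypothetical names; fake_scan stands in for the POST to /v1/scan so the sketch runs offline.

```python
import hashlib


class ScanCache:
    """Remember scores by SHA-256 digest so repeated images are scanned once."""

    def __init__(self, scan_fn):
        self._scan_fn = scan_fn  # callable: bytes -> int score (0-100)
        self._scores = {}
        self.calls = 0  # number of real scans performed

    def score(self, image_bytes):
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest not in self._scores:
            self._scores[digest] = self._scan_fn(image_bytes)
            self.calls += 1
        return self._scores[digest]


def fake_scan(image_bytes):
    # Hypothetical stand-in for the network call to the scanner.
    return 90 if b"attack" in image_bytes else 5


cache = ScanCache(fake_scan)
turn_1 = [cache.score(b"cat photo")]
turn_2 = [cache.score(b"cat photo"), cache.score(b"attack image")]
```

The cache here is unbounded and lives in process memory. A long-running service would likely want an eviction policy or a shared store, which this sketch leaves out.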
SK Kernel filters: a cleaner integration point
Semantic Kernel's filter pipeline (IPromptRenderFilter and IFunctionInvocationFilter in C#; in Python, filter functions registered for FilterTypes.PROMPT_RENDERING and FilterTypes.FUNCTION_INVOCATION) runs around prompt rendering and function calls made through the kernel. For a cleaner integration, implement the scan as a prompt render filter that inspects the chat history argument before the prompt is rendered and sent to the model:
// C#: implement IPromptRenderFilter (namespace Microsoft.SemanticKernel)
public class MultimodalPIScanFilter : IPromptRenderFilter
{
    public async Task OnPromptRenderAsync(
        PromptRenderContext context, Func<PromptRenderContext, Task> next)
    {
        // Scan chat history images before the prompt is rendered.
        // Assumes the caller passes the history as the "chat_history" argument;
        // the type check avoids an InvalidCastException for other argument types.
        if (context.Arguments.TryGetValue("chat_history", out var value)
            && value is ChatHistory history)
        {
            await GlyphwardScanner.ScanChatHistoryImagesAsync(history);
        }
        await next(context);
    }
}

// Register in the kernel builder:
// builder.Services.AddSingleton<IPromptRenderFilter, MultimodalPIScanFilter>();
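The filter pattern keeps the scan concern separate from your business logic: you register it once and it runs for every prompt function rendered through that kernel, without modifying those call sites. Filters run only for invocations that go through the kernel. Code that calls GetChatMessageContentsAsync() directly on the chat completion service does not trigger prompt render filters and still needs the explicit scan call from the intercept examples.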
SK memory and vector store: the indirect-PI surface
Semantic Kernel includes a memory / vector store abstraction for RAG use cases. If your pipeline indexes documents containing images (PDFs, presentations, scanned pages) into an SK vector store, the embedded images in those documents represent an indirect PI surface: a document with a FigStep payload on an image page enters the store, persists, and delivers its payload whenever that page is retrieved and passed to the model as context. Scan documents at ingestion time, before they are written to the vector store. The pre-ingestion scan pattern is the same as for LlamaIndex and other RAG pipelines — the framework wrapper changes; the scan call is identical.
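The ingestion-time gate can be sketched independently of any particular vector store. Everything named below (ingest, fake_scan, the list used as a store) is hypothetical, and the sketch assumes images have already been extracted from each document. The point is the ordering: scan first, write second.

```python
def ingest(documents, scan_fn, store, threshold=70):
    """Write documents to `store` only if every embedded image passes the scan.

    documents: iterable of (doc_id, [image_bytes, ...]) pairs
    scan_fn:   callable bytes -> int score (0-100)
    store:     any object with an append(doc_id) method (a list works)
    Returns the list of rejected doc_ids.
    """
    rejected = []
    for doc_id, images in documents:
        if any(scan_fn(img) > threshold for img in images):
            rejected.append(doc_id)  # quarantined: never reaches the store
        else:
            store.append(doc_id)
    return rejected


def fake_scan(image_bytes):
    # Hypothetical stand-in for the POST to /v1/scan.
    return 90 if b"attack" in image_bytes else 5


docs = [
    ("report.pdf", [b"chart", b"logo"]),
    ("slides.pptx", [b"chart", b"attack page"]),
    ("notes.txt", []),
]
store = []
rejected = ingest(docs, fake_scan, store)
```

Rejecting the whole document is the conservative choice. Dropping only the flagged page or image is an alternative if partial ingestion is acceptable.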
Related questions
Does Semantic Kernel have any built-in content safety for images?
SK delegates content safety to the underlying model provider and any Azure Content Safety integration you configure on the Azure OpenAI endpoint. Azure Prompt Shields (if enabled) covers the text-content path. It does not inspect ImageContent bytes. SK itself does not include a multimodal PI scanner. The filter pattern described above is the recommended way to add one without modifying call sites.
Does this apply to SK with local models (Ollama, LM Studio)?
Yes. If you configure SK to use a local OpenAI-compatible connector pointing at Ollama or LM Studio, and the local model supports vision inputs (LLaVA, Moondream, etc.), the same ImageContent bytes reach the local model's vision encoder. The Glyphward scan intercept runs in your application before the connector dispatches the request — the target model endpoint is irrelevant.
How does the filter pattern compare to wrapping every call site?
The filter is registered once and applies to prompt invocations made through that kernel instance. Wrapping each call site achieves the same result but is brittle, because every new call site has to remember to invoke the scan wrapper. For calls that go through the kernel, the filter is more maintainable: add one class, register it, and those multimodal prompts are scanned automatically. Direct calls on the chat completion service are not covered by the filter and still need the wrapper.
What if the image is passed via a URI rather than bytes?
Both the C# and Python examples handle URI-referenced images by fetching the bytes before scanning. Do not skip scanning for URI-referenced images — the attacker can serve an adversarial image from any URL they control, and a scan-if-bytes-present-only check creates a trivial bypass via URL reference.
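Fetching the URL for the scan while still handing the URL to the model leaves a window: the provider fetches it again, and an attacker-controlled server can return a different image the second time. One possible mitigation, which is an assumption here and not something the examples above do, is to replace the URI reference with the exact bytes that were scanned. The helper names below are hypothetical.

```python
import base64


def to_data_uri(image_bytes, mime_type="image/png"):
    """Encode scanned bytes as a data URI so the model sees exactly those bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(data_uri):
    """Recover the raw bytes from a base64 data URI."""
    _header, _sep, encoded = data_uri.partition(",")
    return base64.b64decode(encoded)


scanned = b"\x89PNG\r\n\x1a\n fetched once"
pinned = to_data_uri(scanned)
```

The round trip through from_data_uri shows that the pinned form carries the same bytes that were scanned. The cost is a larger request body than a bare URL.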
Does this apply to SK's planner and process framework?
Yes. SK's Handlebars Planner, Stepwise Planner, and Process Framework orchestrate function calls and can produce or consume multimodal content as intermediate results. Any step that introduces image data into the context — whether from user input, tool output, or retrieved memory — should be scanned before the image reaches a vision model call. The filter approach covers prompts rendered by the kernel planner; for process-framework steps that receive image data from tools, add the scan in the tool implementation.
Further reading
- Prompt-injection scanner for Azure OpenAI — the platform-level pattern for SK deployments that use Azure OpenAI as the backend.
- Prompt-injection scanner for LlamaIndex agents — the equivalent pattern in the Python LlamaIndex framework.
- Prompt-injection scanner for LangChain agents — the equivalent pattern in the Python LangChain framework.
- FigStep detection — the typographic attack class vision models are exposed to.
- Indirect prompt injection via images — the RAG / vector-store retrieval attack path.
- Azure Prompt Shields alternative — covers the text-layer gap Prompt Shields closes, distinct from the image-layer gap Glyphward covers.
- Multimodal LLM security API — the category-level overview.