Cryptography & AI Security · 2026-06-12

Post-quantum cryptography for AI orchestration: the harvest-now-decrypt-later threat to your system prompts

Q: Which NIST post-quantum standards apply to AI orchestration systems and what do they specify?

NIST finalized three post-quantum cryptography standards in August 2024: FIPS 203 (ML-KEM, based on CRYSTALS-Kyber), which provides quantum-resistant key encapsulation for establishing shared symmetric keys in protocols such as TLS; FIPS 204 (ML-DSA, based on CRYSTALS-Dilithium), which provides quantum-resistant digital signatures for authentication, JWT signing, and API request signing; and FIPS 205 (SLH-DSA, based on SPHINCS+), which provides a hash-based signature alternative when a lattice-based scheme is unacceptable for risk-diversification reasons. For AI orchestration, the most directly applicable standard is FIPS 203 (ML-KEM) because the primary exposure is in TLS key exchange: the handshake that establishes the session key for each HTTPS connection to your LLM provider, tool API, or agent-to-agent communication channel. ML-KEM replaces, or in hybrid deployments supplements, the Diffie-Hellman and ECDH key exchange that TLS connections currently rely on, closing the harvest-now-decrypt-later window for all traffic protected after migration.

Q: What specific data in an AI agent pipeline is most exposed to harvest-now-decrypt-later?

The highest-sensitivity data categories in a typical agentic pipeline, ranked by exposure value to a nation-state or sophisticated criminal adversary: (1) System prompts, which often contain competitive business logic, customer segmentation rules, proprietary scoring models, and tool credential patterns; once decrypted, they reveal not just what your AI does but how your business operates internally. (2) RAG retrieval queries: the queries your agent sends to a vector database or retrieval API reveal what your customers are asking about and what your internal knowledge base contains; this metadata is often more sensitive than the documents themselves. (3) Tool output responses: API responses from CRM, ERP, financial, and healthcare integrations contain the actual sensitive records your AI is processing. (4) Chain-of-thought scratchpads: extended thinking outputs from reasoning-capable models such as Claude or OpenAI's o-series include intermediate reasoning that exposes decision logic your system prompt does not. (5) Agent-to-agent delegation messages: in CrewAI, AutoGen, and A2A-protocol multi-agent systems, the task assignments passed between agents describe your internal workflow orchestration. All five travel over TLS connections whose classical RSA or elliptic-curve key exchange leaves them vulnerable to retrospective quantum decryption.

Q: What does NSM-10 require of federal agencies building AI systems?

National Security Memorandum 10 (NSM-10), signed May 2022, established the US federal timeline for post-quantum migration. Agencies were required to submit an inventory of quantum-vulnerable cryptographic systems within one year of the memorandum and annually thereafter (OMB memorandum M-23-02, issued in November 2022, set the first submission deadline at May 2023), prioritise migration plans for national security systems by 2025, and complete migration of all national security systems to post-quantum algorithms by 2035, with the most sensitive systems targeted for completion by 2030. A CISA/NSA/NIST joint advisory published in 2022 and updated since identifies TLS key exchange as the highest-priority migration target because it protects data in transit, which is exactly the harvest-now-decrypt-later exposure. Federal AI systems are not carved out or exempted: an AI orchestration platform deployed in a federal agency is a covered cryptographic system under NSM-10 inventory requirements. Agencies running LangChain, Semantic Kernel, AWS Bedrock agents, or Azure OpenAI deployments need a documented PQC migration path for every TLS-encrypted connection in the agentic pipeline: not just the connection to the LLM provider, but also to vector databases, tool APIs, agent-to-agent communication channels, and logging/observability infrastructure.

Q: How should an AI security team approach PQC migration for an existing agentic pipeline?

A practical migration sequence for agentic pipelines: (1) Inventory all TLS connections in the pipeline: enumerate every outbound HTTPS call the agents make, including LLM provider endpoints, tool APIs, vector database connections, and any agent-to-agent HTTP channels. Tools like Wireshark in a test environment or OpenTelemetry traces with connection-level instrumentation can surface connections that are not visible from the application code. (2) Prioritise by data sensitivity: not all TLS connections carry equally sensitive data. The LLM provider connection (which carries system prompts and completions) and the RAG retrieval connection (which carries sensitive queries) are typically the highest priority. (3) Enable hybrid TLS at the library/runtime layer: current Chrome and Firefox releases already default to hybrid key exchange for TLS 1.3 (X25519MLKEM768, which superseded the draft X25519Kyber768 group), and server-side runtimes follow their TLS library. Go's crypto/tls offers a hybrid group by default from Go 1.23, while Python and Node.js inherit support from the OpenSSL version they link against or bundle (ML-KEM groups are built in from OpenSSL 3.5). Most LLM SDK connections inherit from the underlying HTTP client, so upgrading to a PQC-capable runtime version is often sufficient for client-initiated connections. (4) Verify server-side support: the PQC TLS handshake requires the server (your LLM provider, tool API, etc.) to also negotiate the hybrid key. OpenAI and Anthropic have both announced PQC roadmaps; AWS Bedrock supports X25519MLKEM768 as of early 2026. (5) Generate audit evidence: for NSM-10 compliance or FedRAMP ATO, document which connections are migrated, the cipher suite in use, and the verification method (e.g., TLS handshake inspection confirming ML-KEM key exchange group).

Nation-state adversaries are archiving your LangChain, CrewAI, and AutoGen TLS traffic today. NIST FIPS 203/204/205 are final. Here is what harvest-now-decrypt-later means specifically for agentic AI pipelines — and why text-only prompt injection scanning is not the only gap in your AI security stack.

Every time your LangChain agent calls the OpenAI API, the session key protecting the message is negotiated with a classical key exchange: typically Elliptic Curve Diffie-Hellman, or RSA key transport on legacy TLS 1.2 configurations. Every time a CrewAI orchestrator delegates a task to a sub-agent over an HTTP connection, that conversation is wrapped in standard TLS 1.3. Every time your RAG pipeline fires a query to a Pinecone or Weaviate vector database, the query and result travel over HTTPS.

All of this is secure today. None of it will be secure once a cryptographically relevant quantum computer (CRQC) exists.

That gap, between "secure now" and "secure against a future quantum adversary", is the harvest-now-decrypt-later window. Sophisticated nation-state signals intelligence programmes have been archiving bulk internet traffic for years. The NSA's upstream collection and GCHQ's TEMPORA submarine cable interception operation, both disclosed in 2013, demonstrated collection at the scale needed to archive high-value TLS sessions, and comparable programmes are widely assumed to exist in China, Russia, and other states. The data is sitting in warehouses, waiting for the key.

For most enterprise applications, this is a long-term planning risk. For AI orchestration specifically, it is immediate and serious — because the data inside those TLS sessions is uniquely sensitive in ways that have no analogue in traditional web application traffic.

What travels inside your agent's TLS sessions

A traditional web application sends a user request over HTTPS and receives an HTTP response. Sensitive data (passwords, PII, financial records) is a small fraction of total traffic, and that data is also protected by application-layer controls — database encryption, access logs, audit trails — that survive independently of transport security.

AI orchestration traffic is different in three ways that make harvest-now-decrypt-later particularly consequential.

System prompts encode your business logic. A well-crafted system prompt for an enterprise AI assistant can run to several thousand tokens and contain: customer segmentation rules, proprietary scoring frameworks, internal workflow decisions, escalation thresholds, competitor positioning, and tool authentication patterns. This is not data your business has ever sent across the internet before. It is the accumulated institutional knowledge of your product and operations team, condensed into a machine-readable specification and sent with no application-layer encryption of its own inside every TLS session your agent initiates. TLS is the only layer protecting it, so once that session is decrypted (a decade from now, in a foreign intelligence service's data centre) the system prompt is trivially recovered.

RAG queries reveal intent before the answer does. When your agent retrieves from a vector database, the query encodes what the user is looking for in a form that is often more sensitive than the retrieved document. A query like "Q4 revenue forecast assumptions" or "regulatory approval timeline for [drug name]" or "breach notification requirements for [jurisdiction]" reveals decision-making context, strategic priorities, and legal exposure in a way that a static document does not. RAG retrieval queries travel over TLS to your vector database provider — and that traffic is indistinguishable to a network-layer harvester from any other HTTPS traffic.

Agent-to-agent communication exposes internal orchestration architecture. Multi-agent frameworks (CrewAI, AutoGen, Google A2A, Anthropic multi-agent networks) delegate tasks between agents over HTTP. Each delegation message describes your internal decision architecture: which agent is responsible for what, what constraints it operates under, what context it receives. An adversary who can decrypt a month of CrewAI orchestration logs has a detailed operational blueprint of how your AI-driven business actually makes decisions — more operationally useful than any org chart or strategy document.

See our technical overview of post-quantum cryptography threats to AI security pipelines for a deeper treatment of the attack surface taxonomy.

The regulatory landscape: NIST FIPS 203/204/205 and NSM-10

Post-quantum migration is no longer a theoretical planning exercise. The standards are final and the federal deadlines are set.

NIST FIPS 203 (ML-KEM, based on CRYSTALS-Kyber), finalized August 2024, specifies the quantum-resistant key encapsulation mechanism that replaces RSA and ECDH in TLS key exchange. This is the most directly relevant standard for AI orchestration: every HTTPS connection your agents make uses key exchange, and ML-KEM is the designated replacement, deployed today mostly in hybrid form alongside X25519. FIPS 203 defines three parameter sets (ML-KEM-512, ML-KEM-768, ML-KEM-1024) targeting NIST security categories 1, 3, and 5, which correspond to the difficulty of key search on AES-128, AES-192, and AES-256 respectively.
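One practical consequence of the parameter sets is message size: ML-KEM keys and ciphertexts are much larger than a 32-byte X25519 share, which matters when estimating handshake overhead for high-volume agent traffic. The sketch below uses the encapsulation key and ciphertext sizes given in FIPS 203 and assumes the X25519MLKEM768 hybrid construction, in which the client share carries an ML-KEM-768 encapsulation key plus an X25519 public key and the server share carries an ML-KEM-768 ciphertext plus an X25519 public key. The function name is illustrative, and the ML-KEM-512 and ML-KEM-1024 rows are included only for comparison.

```python
# Sizes in bytes as specified in FIPS 203 for each parameter set.
ML_KEM_SIZES = {
    "ML-KEM-512": {"encapsulation_key": 800, "ciphertext": 768},
    "ML-KEM-768": {"encapsulation_key": 1184, "ciphertext": 1088},
    "ML-KEM-1024": {"encapsulation_key": 1568, "ciphertext": 1568},
}
X25519_PUBLIC_KEY = 32  # bytes


def hybrid_key_share_sizes(parameter_set: str) -> dict:
    """Approximate key_share payload sizes when ML-KEM is paired with X25519.

    Client sends: ML-KEM encapsulation key + X25519 public key.
    Server sends: ML-KEM ciphertext + X25519 public key.
    """
    sizes = ML_KEM_SIZES[parameter_set]
    return {
        "client_key_share": sizes["encapsulation_key"] + X25519_PUBLIC_KEY,
        "server_key_share": sizes["ciphertext"] + X25519_PUBLIC_KEY,
    }


shares = hybrid_key_share_sizes("ML-KEM-768")
print(shares)  # {'client_key_share': 1216, 'server_key_share': 1120}
```

A client share above 1 KB can push the ClientHello across more than one packet, so it is worth testing middleboxes and proxies in the pipeline before treating the migration as transparent.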

NIST FIPS 204 (ML-DSA, based on CRYSTALS-Dilithium) — also August 2024 — specifies the digital signature algorithm that replaces ECDSA and RSA-PSS for code signing, JWT signing, and API authentication. For AI teams that sign their model artifacts, sign audit log entries, or use JWTs for agent authentication, FIPS 204 is the relevant migration target at the authentication layer.

NIST FIPS 205 (SLH-DSA, based on SPHINCS+) — August 2024 — provides a hash-based signature alternative that does not rely on lattice math, providing cryptographic risk diversification for teams that want independence from the lattice assumption family.

NSM-10 (National Security Memorandum 10, signed May 2022) established the federal migration timeline. The critical milestones for AI teams serving federal customers: the first cryptographic system inventory was due within one year of the memorandum, with annual updates thereafter (OMB memorandum M-23-02, issued in November 2022, set the first submission deadline at May 2023); prioritized migration plans for national security systems were due by 2025; full migration of national security systems must complete by 2035, with the most sensitive systems (including those processing classified AI workloads) targeted for completion by 2030. Federal AI deployments, including cloud AI services used under FedRAMP authorizations, are covered systems. An agency running LangChain agents against AWS Bedrock or Azure OpenAI under a FedRAMP ATO needs to document PQC migration status for those TLS connections as part of NSM-10 compliance. See our FedRAMP AI security guidance for the intersection of FedRAMP controls and AI security requirements.

CISA has separately published a "Post-Quantum Cryptography Initiative" with prioritized guidance for network perimeter devices and critical infrastructure, with AI systems explicitly called out in the 2025 updates as a priority migration category due to the sensitivity of system prompt content. The same CISA guidance that covers deploying AI systems securely now cross-references PQC migration as a complementary control.

The timeline problem: when does the threat become real?

The most common objection to PQC migration urgency is: "Quantum computers capable of breaking RSA are 10–15 years away. Why act now?"

The objection misunderstands the harvest-now-decrypt-later threat model. The threat is not about when quantum computers will exist. The threat is about when the data being encrypted today will still be sensitive.

A system prompt describing your enterprise AI's decision logic does not expire in 18 months. The competitive intelligence value of knowing how your AI makes pricing, hiring, or strategic decisions may be highest precisely 5–10 years from now, when your competitors are trying to understand how you got ahead. A RAG query about a pharmaceutical trial does not expire at approval — the regulatory strategy it reveals may be valuable for years after the drug is on the market. Agent delegation messages describing how your AI-driven supply chain makes sourcing decisions have strategic value on the timescale of years.

Expert consensus on CRQC timelines has shifted earlier in recent years. The 2022 NSA assessment cited 2030–2035 as the range of concern. IBM's quantum roadmap targets 100,000+ physical qubit systems by 2033. Google's 2024 Willow chip demonstrated error correction below threshold for the first time. The harvest window for sophisticated adversaries — the gap between "when to archive" and "when to decrypt" — may be as short as 7–10 years. Traffic archived in 2023 may be decryptable by 2033.

For agentic AI systems whose architecture today will still be in production in 2033, the harvest-now-decrypt-later window is already open.
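The argument in this section is commonly formalised as Mosca's inequality: if the number of years the data must stay confidential (x) plus the number of years the migration will take (y) exceeds the number of years until a CRQC exists (z), traffic encrypted today is already exposed. A minimal sketch follows; the input values are hypothetical planning figures, not estimates from this article.

```python
def hndl_exposed(shelf_life_years: float,
                 migration_years: float,
                 years_to_crqc: float) -> bool:
    """Mosca's inequality: exposed when x + y > z."""
    return shelf_life_years + migration_years > years_to_crqc


# Hypothetical: a system prompt that stays sensitive for 10 years,
# a 2-year migration, and a CRQC assumed to arrive in 10 years.
print(hndl_exposed(10, 2, 10))  # True

# Hypothetical: short-lived data and a fast migration.
print(hndl_exposed(1, 1, 10))   # False
```

Because z is the one variable a team cannot control, the useful levers are shortening y (migrating sooner) and reducing x (limiting how much long-lived logic is placed in transit at all).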

Where the migration surface is in a typical agentic pipeline

A representative agentic pipeline has more TLS connections than most teams have audited. Each is a distinct harvest-now-decrypt-later surface.

LLM provider connection. Every call to OpenAI, Anthropic, Google Gemini, AWS Bedrock, or Azure OpenAI uses HTTPS. This connection carries the system prompt on every request, the user query, and the model completion. It is typically the highest-sensitivity connection in the pipeline by data content. AWS Bedrock added X25519MLKEM768 hybrid key exchange support in early 2026; OpenAI and Anthropic have both publicly committed to PQC roadmaps but have not published deployment timelines. Check our Anthropic Claude API security and AWS Bedrock agents security pages for integration-specific guidance.

Vector database retrieval connection. Pinecone, Weaviate, Chroma, Qdrant, and pgvector all accept connections over TLS. The query embedding sent on retrieval contains a dense representation of the user's intent — and the nearest-neighbour results reveal what your knowledge base contains. Both sides of this connection are sensitive and both are exposed to HNDL.

Tool API connections. CRM (Salesforce, HubSpot), ERP (SAP, NetSuite), ticketing (Jira, ServiceNow), and calendar/email integrations all use HTTPS. When your agent reads or writes records over these connections, the data in transit is the actual sensitive business record — customer data, financial data, employee data. The tool connection is often the most directly sensitive segment in terms of regulatory data classification. See our guidance on MCP server security for the Model Context Protocol layer specifically.

Agent-to-agent communication. CrewAI, AutoGen, and multi-agent frameworks using A2A or HTTP for inter-agent communication create a network of HTTPS connections between agents. These connections are rarely audited specifically for cryptographic hygiene. The data in transit (task delegations, intermediate results, context handoffs) may seem less sensitive than LLM provider or tool connections, but in aggregate it describes the operational logic of your entire AI system. See our CrewAI security and agentic RAG pipeline pages for the specific integration points.

Observability and logging infrastructure. LLM observability platforms (LangSmith, Langfuse, Helicone, Arize) receive full prompt-completion pairs for every request your agent makes. The connection from your agent to your observability provider is an HTTPS connection that carries the complete history of everything your AI has said and done — often the highest-density summary of sensitive data in the entire pipeline. PQC migration for observability provider connections is frequently overlooked in PQC audits because observability is treated as internal tooling rather than a data-in-transit security concern.
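The five connection types above can be tracked as a simple inventory and ordered for migration. The sketch below is one possible shape for that record, not a prescribed schema: the `Connection` fields, the 1 to 5 sensitivity scale, and the example.com endpoints are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    name: str
    endpoint: str
    sensitivity: int   # 1 (low) to 5 (high), assigned by the team
    hybrid_pqc: bool   # True once hybrid key exchange has been verified


def migration_queue(connections):
    """Return unmigrated connections, most sensitive first.

    sorted() is stable, so ties keep their inventory order.
    """
    pending = [c for c in connections if not c.hybrid_pqc]
    return sorted(pending, key=lambda c: c.sensitivity, reverse=True)


# Hypothetical inventory covering the segments described above.
INVENTORY = [
    Connection("llm-provider", "llm.example.com:443", 5, False),
    Connection("vector-db", "vectors.example.com:443", 4, False),
    Connection("tool-crm", "crm.example.com:443", 5, True),
    Connection("agent-to-agent", "agents.internal.example:8443", 3, False),
    Connection("observability", "traces.example.com:443", 5, False),
]

for conn in migration_queue(INVENTORY):
    print(conn.sensitivity, conn.name)
```

Scoring observability at the same level as the LLM provider connection reflects the point made above: it carries the same prompt and completion content.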

What PQC migration actually looks like for an AI engineering team

PQC migration for an AI orchestration pipeline is not a single change. It is a connection-by-connection audit with a different migration path for each segment.

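Client-side runtime upgrade. For connections your agent initiates (to LLM providers, tool APIs, vector databases), the key exchange algorithm is negotiated during the TLS handshake, so support comes from the TLS library underneath your HTTP client rather than from the client itself. Go's crypto/tls enables a hybrid group by default from Go 1.23 (the draft X25519Kyber768Draft00) and the final X25519MLKEM768 from Go 1.24. Python HTTP clients such as httpx, and Node.js clients such as undici, use the OpenSSL build that the runtime links against or bundles; ML-KEM hybrid groups are built into OpenSSL from 3.5, so the runtime version number alone does not tell you whether hybrid key exchange is available. Where the runtime already ships a PQC-capable TLS library, upgrading it is often sufficient to enable hybrid PQC key exchange on all outbound connections, with no application code change required.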
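For Python agents, a quick first check is which OpenSSL build the interpreter's standard `ssl` module is linked against, since that library performs the handshake for most HTTP clients. The sketch below assumes that OpenSSL 3.5 is the first upstream release with built-in ML-KEM groups; builds that add them through a provider, or distributions that patch OpenSSL, will not be classified correctly by a version check alone, so treat the result as a hint rather than proof.

```python
import ssl

# Assumption: built-in ML-KEM hybrid groups arrive with OpenSSL 3.5.
MIN_OPENSSL_FOR_MLKEM = (3, 5)


def openssl_supports_mlkem(version_info=None) -> bool:
    """Compare (major, minor) of the linked OpenSSL against the minimum.

    version_info defaults to ssl.OPENSSL_VERSION_INFO, a 5-tuple whose
    first two fields are the major and minor version.
    """
    info = ssl.OPENSSL_VERSION_INFO if version_info is None else version_info
    return tuple(info[:2]) >= MIN_OPENSSL_FOR_MLKEM


print(ssl.OPENSSL_VERSION)
print("hybrid ML-KEM groups likely available:", openssl_supports_mlkem())
```

A negative result on an otherwise current Python version usually means the interpreter was built against an older system OpenSSL, which is fixed by changing the base image or build rather than the application.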

Server-side verification. Upgrading your client enables hybrid PQC only if the server also negotiates it. Audit your LLM providers, vector databases, and tool APIs for their PQC support status. Where PQC is not yet available server-side, document the gap as an open risk item and track provider announcements. For self-hosted infrastructure (including CI/CD pipelines that deploy your agents), upgrade nginx, Caddy, or your TLS termination layer — Caddy 2.8+ and nginx with BoringSSL support ML-KEM hybrid. See our CI/CD pipeline AI security guide for securing the deployment infrastructure.
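One way to test a provider endpoint is to offer only the hybrid group and see whether the handshake completes. The sketch below wraps the `openssl s_client` command line for that purpose. It assumes an OpenSSL 3.5 or later binary on the PATH (older builds will reject the group name), and it assumes that `s_client` exits with a non-zero status when the handshake fails; `api.example.com` is a placeholder host. Only the command construction is exercised here; the probe function makes a real network connection when called.

```python
import subprocess


def build_probe_command(host: str, port: int = 443,
                        group: str = "X25519MLKEM768") -> list:
    """Build an s_client invocation that offers a single key exchange group."""
    return [
        "openssl", "s_client",
        "-connect", f"{host}:{port}",
        "-servername", host,   # send SNI so virtual hosts answer correctly
        "-tls1_3",             # hybrid groups are defined for TLS 1.3
        "-groups", group,      # offer only this group
    ]


def server_negotiates_group(host: str, port: int = 443,
                            group: str = "X25519MLKEM768",
                            timeout: float = 10.0) -> bool:
    """Return True if the handshake succeeds with only `group` on offer.

    Raises FileNotFoundError if openssl is missing and
    subprocess.TimeoutExpired if the server does not answer in time.
    """
    result = subprocess.run(
        build_probe_command(host, port, group),
        stdin=subprocess.DEVNULL,   # close the session right after the handshake
        capture_output=True,
        timeout=timeout,
        check=False,
    )
    return result.returncode == 0


print(" ".join(build_probe_command("api.example.com")))
```

Running the probe from the same network segment as the agents matters, because a TLS-terminating proxy between the agent and the provider can change which groups are actually negotiated.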

JWT and API signing upgrade. If your agents authenticate to tool APIs or inter-agent channels using JWTs, upgrade the signing algorithm from ES256/RS256 to the FIPS 204 ML-DSA equivalent. Python's python-jose and joserfc libraries added ML-DSA support in 2025; the equivalent Node.js libraries are in active development.

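Key storage and rotation. Any long-lived API keys, system prompt encryption keys, or agent identity certificates that were generated using RSA or ECC should be rotated to PQC equivalents on the same schedule as the TLS migration. A PQC TLS connection that authenticates with an RSA certificate keeps recorded traffic confidential, but the connection remains open to impersonation once a CRQC can forge signatures under that certificate chain, so certificate migration should follow key exchange migration rather than be dropped from the plan.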

Audit evidence generation. For NSM-10 compliance, FedRAMP ATO maintenance, or internal risk management, document each migrated connection with: the connection endpoint, the cipher suite and key exchange group negotiated, the verification method (e.g., a Wireshark capture showing key exchange group 0x11EC for X25519MLKEM768; the earlier draft group X25519Kyber768Draft00 appears as 0x6399), and the date of migration. This evidence package supports NSM-10 inventory reporting and provides the cryptographic hygiene documentation that a FedRAMP 3PAO or auditor will request.
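An evidence entry of this kind can be generated from the group identifier observed in a handshake capture. The sketch below is an illustrative record format, not a format required by NSM-10 or FedRAMP: the field names, the endpoint, and the sample values are assumptions. The group code points are the TLS supported-groups values for the named groups (0x11EC for X25519MLKEM768, 0x6399 for the earlier draft group).

```python
import json
from datetime import date

# TLS supported-groups code points relevant to this audit.
NAMED_GROUPS = {
    0x0017: "secp256r1",
    0x001D: "x25519",
    0x11EC: "X25519MLKEM768",
    0x6399: "X25519Kyber768Draft00",
}
HYBRID_PQC_GROUPS = {0x11EC, 0x6399}
FIPS_203_GROUPS = {0x11EC}  # the draft Kyber group predates FIPS 203


def evidence_record(endpoint, cipher_suite, group_id, method, migrated_on):
    """Build one audit entry from values observed in a handshake capture."""
    return {
        "endpoint": endpoint,
        "cipher_suite": cipher_suite,
        "key_exchange_group": NAMED_GROUPS.get(
            group_id, f"unknown(0x{group_id:04X})"),
        "hybrid_pqc": group_id in HYBRID_PQC_GROUPS,
        "uses_fips_203_ml_kem": group_id in FIPS_203_GROUPS,
        "verification_method": method,
        "migration_date": migrated_on.isoformat(),
    }


# Hypothetical entry for an LLM provider connection.
record = evidence_record(
    "llm.example.com:443",
    "TLS_AES_256_GCM_SHA384",
    0x11EC,
    "packet capture of ServerHello key_share",
    date(2026, 6, 1),
)
print(json.dumps(record, indent=2))
```

Recording the draft group separately from the standardised one makes it easy to find connections that negotiated a hybrid exchange early and still need to move to the final group.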

The multimodal security stack is a complementary control, not a substitute

PQC migration addresses the confidentiality threat to data in transit — the risk that archived traffic is retrospectively decrypted. It does not address the integrity threat to data in processing — the risk that adversarial inputs manipulate your AI's outputs at inference time.

Both threats are active. A system prompt that is never decrypted by a quantum adversary can still be bypassed by an adversarial image uploaded through your product's interface. A RAG corpus that is never stolen can still be poisoned by a malicious document injected through an agentic retrieval workflow. These are the attack surfaces that multimodal prompt injection covers — pixel-domain and waveform-domain payloads that bypass your AI's instruction-following by operating in channels that text-only scanners never inspect.

A complete AI security stack for 2026 addresses both layers:

PQC migration — closes the harvest-now-decrypt-later window on your system prompts, RAG queries, tool outputs, and agent communication. NIST FIPS 203/204/205 are the applicable standards; NSM-10 sets the federal deadline.
Multimodal input scanning — closes the image and audio injection gap at inference time. Every image your agent processes and every audio segment your voice AI transcribes needs raw-bytes scanning before it reaches the model, because text-only PI scanners are structurally blind to pixel-domain and waveform-domain payloads.

The two controls protect different parts of the attack surface and operate at different points in the data lifecycle. A PQC-migrated AI system that skips multimodal scanning has a live injection vulnerability. A fully multimodal-scanned AI system that uses RSA TLS has its system prompts sitting in an adversary's harvest archive. A mature AI security posture requires both.

FAQ

What is harvest-now-decrypt-later and why does it matter for AI systems?

Harvest-now-decrypt-later (HNDL) is a threat model in which an adversary — typically a nation-state intelligence service — records and archives encrypted network traffic today, intending to decrypt it retrospectively once a cryptographically relevant quantum computer (CRQC) becomes available. For AI systems, HNDL is particularly serious because the data inside TLS sessions is uniquely sensitive: system prompts encode proprietary business logic, RAG queries reveal strategic intent, and agent delegation messages expose internal decision architecture. Unlike a leaked password that can be rotated, a retrospectively decrypted system prompt from 2026 cannot be un-revealed. See the full answer above for detail on the NSA/GCHQ programmes that make this a non-hypothetical threat.

Which NIST post-quantum standards apply to AI orchestration systems and what do they specify?

Three standards finalized in August 2024: FIPS 203 (ML-KEM/Kyber) for quantum-resistant TLS key exchange — the primary migration target for AI orchestration. FIPS 204 (ML-DSA/Dilithium) for digital signatures used in JWT authentication, code signing, and API request signing. FIPS 205 (SLH-DSA/SPHINCS+) for hash-based signatures as a risk-diversification alternative. ML-KEM (FIPS 203) is the highest-priority standard for most AI teams because TLS key exchange is the mechanism that harvest-now-decrypt-later attacks exploit.

What specific data in an AI agent pipeline is most exposed to harvest-now-decrypt-later?

Ranked by sensitivity: (1) System prompts — business logic in machine-readable form. (2) RAG retrieval queries — reveal what users are searching for, often more sensitive than the documents retrieved. (3) Tool API responses — the actual sensitive records (CRM, ERP, financial, healthcare) your AI processes. (4) Chain-of-thought scratchpads — intermediate reasoning from extended thinking models. (5) Agent-to-agent delegation messages — operational blueprint of your AI decision architecture. All five travel over TLS and are vulnerable to HNDL until ML-KEM migration is complete end-to-end.

What does NSM-10 require of federal agencies building AI systems?

NSM-10 (May 2022) requires federal cryptographic system inventory, prioritized migration plans for national security systems by 2025, and full migration by 2035 (most sensitive systems by 2030). Federal AI deployments under FedRAMP ATOs are covered systems. Agencies running LangChain, Semantic Kernel, AWS Bedrock, or Azure OpenAI agents must document PQC migration status for every TLS connection in the agentic pipeline — including vector database connections, tool APIs, agent-to-agent channels, and observability infrastructure — not just the LLM provider connection.

How should an AI security team approach PQC migration for an existing agentic pipeline?

Five steps: (1) Inventory all TLS connections in the pipeline using OpenTelemetry traces or Wireshark in a test environment. (2) Prioritise by data sensitivity, with LLM provider and RAG retrieval connections first. (3) Upgrade to a client runtime whose TLS library supports hybrid key exchange (Go 1.24+, or Python and Node.js builds backed by OpenSSL 3.5+) for hybrid X25519MLKEM768 support on outbound connections. (4) Verify server-side support at each provider and document gaps as tracked risk items. (5) Generate audit evidence (cipher suite, key exchange group, TLS handshake capture, migration date) for NSM-10/FedRAMP documentation. See the full section above for the complete migration sequence.