Pipeline integrity AI security · PHMSA 49 CFR 195.452 · API 1163 4th ed. · Baker Hughes FlexPIG · TDW SmartScan · ROSEN RoCorr · SCC detection AI

Pipeline integrity ILI AI: how adversarial MFL anomaly map injection suppresses SCC colony detections and blocks PHMSA 49 CFR 195.452 excavation orders — and why API 1163 does not require adversarial robustness testing

API 1163 4th edition is the qualification standard for all inline inspection AI systems operating on PHMSA-regulated pipelines in the United States. It requires statistical proof of detection performance across the defect types in the tool's claimed scope — POD curves, sizing accuracy within ±10 %WT, and a Software Quality Assurance programme for AI classifiers. It does not require any adversarial robustness testing. A structured ±8 DN pixel perturbation in a rendered MFL anomaly map image — within the JPEG quantisation noise floor — can suppress a stress corrosion cracking colony below the 35 %WT excavation threshold without triggering any API 1163 performance metric. The result: no defect in the Call Report, no PHMSA excavation order, a weakening pipe section in an HCA, and a consequence envelope last established by PG&E San Bruno 2010 (8 fatalities) and Colonial Pipeline 2016 (350,000 US-gallon gasoline release).

How inline inspection pipeline AI works

The modern hazardous liquid and natural gas pipeline network in the United States spans approximately 3.3 million kilometres of onshore and offshore pipe, carrying petroleum products, natural gas, and other hazardous commodities through urban corridors, across drinking water watersheds, and beneath high-consequence population centres. Managing the structural integrity of this network against metal loss corrosion, stress corrosion cracking, manufacturing seam defects, and mechanical damage is the central challenge of pipeline Integrity Management under PHMSA's 49 CFR Part 195 (hazardous liquids) and 49 CFR Part 192 (natural gas).

Inline inspection, the deployment of instrumented pig tools that travel through the pipe bore under normal operating pressure and record defect signatures from arrays of sensor elements, is the Tier 1 assessment method for IMP-required integrity assessments. A magnetic flux leakage (MFL) pig magnetises the pipe wall to near-saturation using permanent magnet arrays and measures the flux leakage at each axial and circumferential sample position. Ultrasonic testing (UT) pigs measure remaining wall thickness directly using pulse-echo or pitch-catch transducer arrays. EMAT (electromagnetic acoustic transducer) pigs are used for SCC detection in dry-gas pipelines where UT liquid couplant is unavailable. High-resolution camera pigs record the pipe bore surface in colour and near-infrared. Each tool generates hundreds of gigabytes of raw sensor data per 100 km pipeline segment.

This raw sensor data is not directly interpretable by a pipeline engineer. It is processed — rendered — into human-readable representations: the MFL anomaly map (a 2D colour raster encoding flux leakage amplitude as a function of axial position and circumferential clock position), the UT B-scan (a false-colour cross-sectional map encoding wall thickness at each measurement point), the EMAT C-scan (encoding crack opening signal amplitude). These rendered images are then submitted to AI classification engines — Baker Hughes FlexPIG AI, TDW SmartScan AI, ROSEN RoCorr UT AI, NDT Global Evo Series AI — whose outputs determine the Defect Call Report: the formal document that operators use to schedule excavations and repairs under their IMP.
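The rendering step described above can be sketched in a few lines. This is a minimal illustration only: the linear blue-to-red colour scale, the function names, and the normalised amplitude range are assumptions made for the example, and real vendor colour scales differ.

```python
def amplitude_to_rgb(amplitude, lo=0.0, hi=1.0):
    """Map one flux-leakage amplitude to an 8-bit (R, G, B) pixel.

    Hypothetical linear blue-to-red scale, not any vendor's actual palette.
    """
    # Clamp the amplitude into [lo, hi] and normalise it to [0, 1].
    t = min(max((amplitude - lo) / (hi - lo), 0.0), 1.0)
    red = round(255 * t)           # high amplitude renders toward red
    blue = round(255 * (1.0 - t))  # background renders toward blue
    return (red, 0, blue)


def render_anomaly_map(grid, lo=0.0, hi=1.0):
    """Render a 2D grid (rows: clock positions, columns: axial samples)."""
    return [[amplitude_to_rgb(a, lo, hi) for a in row] for row in grid]


# Two clock positions by three axial samples of normalised amplitude.
raster = render_anomaly_map([[0.0, 0.1, 0.9], [0.0, 0.5, 1.0]])
```

The point of the sketch is that the classifier only ever sees `raster`, not the amplitude grid it was derived from.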

The AI classification step is where the adversarial injection surface is introduced. The classifier receives a raster image — rendered from sensor data — and outputs a defect classification: metal loss vs. SCC vs. girth weld vs. background. It is this raster image that an adversary can modify. The raw sensor data may be unmodified; the modification occurs in the rendering pipeline or at the image ingestion boundary of the classification engine. For more detail on the specific adversarial injection surfaces across MFL, UT B-scan, above-ground drone inspection AI, and SCADA leak detection AI operating on this pipeline network, see our pipeline integrity inspection AI prompt injection overview.

The MFL anomaly map adversarial injection surface

An MFL anomaly map image has a characteristic colour distribution. Background pipe wall — undamaged steel at nominal wall thickness — appears as a uniform blue-to-purple background. Girth welds appear as narrow orange-to-red horizontal bands crossing the full circumference at regular axial intervals. Metal loss defects — corrosion pits, gouges, dents with metal loss — appear as compact clusters of high-amplitude red-to-orange pixels against the blue background, with the pixel amplitude encoding the flux leakage magnitude and therefore the local wall thickness reduction. Stress corrosion cracking colonies appear differently: as diffuse clusters of lower-amplitude orange-to-yellow pixels distributed across a broader axial extent, with a spatial pattern of closely spaced individual anomaly pixels whose aggregate morphology — the cluster shape, axial-to-circumferential aspect ratio, and amplitude envelope — identifies them as SCC rather than isolated metal loss.

This distinction, the morphological difference between the compact high-amplitude metal loss signature and the diffuse low-amplitude SCC signature, is exactly what an MFL AI classifier is trained to recognise. It is also exactly what an adversarial perturbation targets. An SCC colony at 35 %WT depth produces MFL pixel amplitudes that are already close to the background distribution, because a tight crack disturbs the magnetic flux far less than an open corrosion pit of the same depth and therefore leaks less of it. The adversarial perturbation needs to shift these already-marginal signals by only ±8 DN per pixel channel (within JPEG quantisation noise) toward the background blue-to-purple distribution to cause the AI classifier to reclassify the cluster as acceptable background rather than a reportable SCC signal.
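A bounded per-channel shift of this kind can be illustrated with a toy example. The pixel values, the background colour, and the red-channel cutoff standing in for a classifier are all hypothetical; an attack on a real CNN would typically be found by optimisation against the model rather than by a fixed rule like this one.

```python
EPSILON_DN = 8  # per-channel bound quoted in the text


def shift_toward(pixel, target, eps=EPSILON_DN):
    """Move each 8-bit channel toward `target` by at most `eps` DN."""
    shifted = []
    for value, goal in zip(pixel, target):
        delta = max(-eps, min(eps, goal - value))
        shifted.append(value + delta)
    return tuple(shifted)


def toy_is_anomalous(pixel, red_cutoff=64):
    """Stand-in for a classifier: flags pixels with a strong red channel."""
    return pixel[0] > red_cutoff


background = (20, 0, 235)   # hypothetical blue-to-purple background
scc_pixel = (70, 40, 200)   # hypothetical marginal SCC pixel
perturbed = shift_toward(scc_pixel, background)
max_change = max(abs(a - b) for a, b in zip(scc_pixel, perturbed))
```

Here the marginal pixel sits only 6 DN above the toy cutoff, so a shift bounded at 8 DN is enough to move it to the other side of the decision.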

The impact of this misclassification cascades through the operator's IMP. Under API 1160 3rd edition and PHMSA guidance, a 35 %WT SCC colony in a High Consequence Area segment is a scheduled condition requiring excavation within 6 months (the 180-day category of 49 CFR 195.452(h)(4)). A colony at 40 %WT approaches the immediate action threshold (50 %WT for liquid pipelines under API 579 Level 1): at a growth rate of 1–2 %WT/year, the 10 %WT gap closes in 5–10 years. The adversarially suppressed colony does not appear in the Defect Call Report. The operator has no knowledge of it. No excavation is scheduled. The IMP continues on its nominal 5–7 year reassessment interval with no flag that this segment requires accelerated assessment. The weakening pipe section remains in service for the duration of the interval.
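The time available before a suppressed colony reaches a threshold is simple linear arithmetic. The sketch below assumes a constant growth rate expressed in %WT per year, which is a simplification made for illustration; the function name is hypothetical.

```python
def years_to_threshold(depth_pct_wt, threshold_pct_wt, growth_pct_wt_per_year):
    """Years for a defect to grow from its current depth to a threshold.

    Assumes constant linear growth in %WT per year (an illustrative
    simplification, not a fitness-for-service method).
    """
    if growth_pct_wt_per_year <= 0:
        raise ValueError("growth rate must be positive")
    gap = max(threshold_pct_wt - depth_pct_wt, 0.0)
    return gap / growth_pct_wt_per_year


# A 40 %WT colony growing toward a 50 %WT threshold.
fast = years_to_threshold(40, 50, 2.0)  # 5.0 years
slow = years_to_threshold(40, 50, 1.0)  # 10.0 years
```

Both results fall inside one or two reassessment intervals, which is why a colony missing from a single Call Report matters.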

UT B-scan adversarial injection: the minimum-wall-thickness attack

Ultrasonic ILI tools measure remaining wall thickness directly by timing the echo from the pipe bore surface to the back wall. The rendered UT B-scan is a false-colour cross-sectional image where the colour encodes the measured wall thickness at each UT transducer position — red indicating minimum wall below the operator's alert threshold, blue-to-green indicating nominal or heavy wall. AI classifiers for UT B-scan images identify: internal corrosion (irregular rough surface that produces a characteristic signal profile distinguishable from smooth internal bore surface); external corrosion (broad wall thickness reduction in a profile consistent with soil-side corrosion); hydrogen-induced cracking (HIC) laminar flaws producing mid-wall delamination signatures; and erosion (internal bore roughness patterns associated with high-velocity abrasive flow).

The adversarial injection surface in UT B-scan AI targets the minimum-wall-thickness call. An internal corrosion pit at 40 %WT produces a characteristic red signature in the B-scan: a compact region of reduced thickness against the nominal-wall background. A Gaussian blur centred on this red region, constrained to a ±6 DN shift per pixel channel, spreads the minimum-thickness signal across the neighbouring transducer positions, reducing the peak pixel amplitude and causing the AI to classify the region as a mild wall reduction rather than a reportable corrosion pit. The fitness-for-service consequence of this misclassification is a missed corrosion pit that would otherwise trigger an API 579 Level 1 Remaining Strength Factor (RSF) calculation and potentially a Fitness for Service (FFS) assessment under API 579 Level 2 with a corrosion growth rate projection.
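The peak-spreading effect of a blur can be seen on a one-dimensional profile. This sketch works on a hypothetical wall-loss profile in %WT rather than on rendered pixels, and it does not enforce the ±6 DN bound mentioned above; the kernel parameters are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalised 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, sigma=1.0, radius=2):
    """Blur a profile, clamping indices at both ends."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp at the edges
            acc += w * values[j]
        out.append(acc)
    return out


# Hypothetical wall loss (%WT) across seven transducers, pit at the centre.
profile = [10, 10, 10, 40, 10, 10, 10]
blurred = blur_1d(profile)
```

The blurred peak is well below 40 while the neighbouring positions rise slightly, so a peak-based call sees a shallower, broader feature.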

HIC laminar flaw suppression is the highest-consequence UT B-scan attack. An HIC laminar delamination at mid-wall depth produces a distinctive B-scan signature (an abrupt amplitude change at the flaw depth interface, combined with a loss of back-wall echo at the flaw location) that UT AI classifiers are specifically trained to distinguish from legitimate wall thickness variation. Adversarial perturbation that suppresses the interface amplitude change and restores the back-wall echo amplitude at the flaw location can cause the AI to classify the delamination as acceptable wall variation, missing an HIC colony that is associated with elevated risk of brittle fracture in hydrogen sulphide (H₂S) service under NACE MR0175 / ISO 15156.

The API 1163 qualification gap

API 1163 4th edition (2021) is the governing qualification standard for all ILI tools and analysis systems operating under PHMSA's Integrity Management programme. Section 5 of API 1163 specifies the performance requirements that an ILI vendor must demonstrate: defect detection capability quantified as Probability of Detection (POD) and Probability of Identification (POI) statistics; defect sizing accuracy quantified as a depth tolerance at a stated certainty level (typically ±10 %WT for metal loss at 80 % certainty); and a minimum defect reporting threshold below which the tool is not required to report anomalies.

Section 6 of API 1163 addresses ILI analysis software, including AI-based classifiers, requiring: documented training data provenance and validation dataset; version control for all AI model releases; change management procedure requiring re-qualification when the AI model or analysis algorithm is materially changed; and a Software Quality Assurance programme meeting the requirements of ISO/IEC 90003 or an equivalent software development standard. These are meaningful requirements. They ensure that the AI classifier's training data is documented, that model updates are controlled, and that performance claims are substantiated by structured testing against representative reference specimens.

The gap is in what API 1163 defines as the test environment. The POD and sizing accuracy statistics are derived by testing the ILI system on reference specimens — pipe sections with real defects of known dimensions, or high-fidelity simulated defects — and comparing the AI's calls to the ground truth measurements. The test specimens are representative of the expected in-service defect population: corrosion pits, SCC colonies, mechanical gouges, seam weld anomalies. The test protocol does not include adversarially perturbed rendered images. There is no API 1163 provision requiring the vendor to generate a set of adversarial MFL anomaly map images — MFL renders with structured pixel perturbations designed to suppress known defects — and measure the classifier's POD against this adversarial test set.
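What an adversarial POD measurement might look like can be sketched with a toy harness. Everything here is hypothetical: the images are tiny grayscale lists, the classifier is a mean-intensity cutoff, and the perturbation is a fixed shift toward background. The only point is that POD is computed twice over the same reference defects, once clean and once perturbed.

```python
def probability_of_detection(classifier, defect_images):
    """Fraction of known-defect images that the classifier calls."""
    if not defect_images:
        raise ValueError("need at least one reference image")
    hits = sum(1 for image in defect_images if classifier(image))
    return hits / len(defect_images)


def perturb_toward_background(image, background=20, eps=8):
    """Shift each pixel toward `background` by at most `eps` DN."""
    return [p + max(-eps, min(eps, background - p)) for p in image]


def toy_classifier(image):
    """Stand-in classifier: calls a defect when mean intensity exceeds 60."""
    return sum(image) / len(image) > 60


# Four hypothetical reference defects, each a flat 4-pixel grayscale patch.
reference_set = [[66] * 4, [70] * 4, [90] * 4, [64] * 4]
adversarial_set = [perturb_toward_background(img) for img in reference_set]

clean_pod = probability_of_detection(toy_classifier, reference_set)
adversarial_pod = probability_of_detection(toy_classifier, adversarial_set)
```

In this toy case the clean POD is 1.0 and the adversarial POD is 0.5: only the two defects nearest the cutoff are lost, which mirrors the argument that marginal signatures are the exposed ones.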

This is structurally identical to the gap we identified in CENELEC EN 50129 SIL 4 certification for railway signalling AI: a rigorous qualification process against expected failure modes and benign noise sources, with no adversarial robustness criterion, because adversarial ML was not in scope when the standard was written. The difference in the pipeline context is that the consequence of a missed defect is not a train collision — whose probability CENELEC explicitly quantifies — but a rupture in a High Consequence Area affecting populations and sensitive environmental receptors, with consequences in the PG&E San Bruno range.

PHMSA has signalled awareness of AI integrity assessment risks in its Mega-Rule Phase 2 discussions (2023–2025) and in the PIPES Act 2020 Section 114 mandate for research into ILI technology performance. But as of 2026, no PHMSA rulemaking has added an adversarial robustness requirement to the ILI qualification framework, and API 1163 4th edition contains no such criterion.

Consequence profile: San Bruno, Colonial Pipeline, and the Carlsbad fire

The consequence envelope for an ILI AI miss on a High Consequence Area segment is established by three historical pipeline failures, none of which involved adversarial injection — but each of which resulted from an integrity assessment programme failing to detect a critical structural defect before it caused a rupture.

PG&E San Bruno, California, 9 September 2010. A 762 mm NPS 30 natural gas transmission pipeline (Line 132, Segment 180) ruptured at a longitudinal seam weld that had been in service since 1956. The failure was caused by the combination of an original manufacturing defect in the double submerged arc welded (DSAW) seam, which was only partially welded, and operating pressure cycling that progressively extended the defect through fatigue. Segment 180 had never been assessed with ILI: PG&E's records incorrectly listed it as seamless pipe, and the assessment method applied under the IMP (external corrosion direct assessment) could not detect seam defects. The rupture produced a crater roughly 22 metres long and a gas fire that destroyed 38 homes and killed 8 people. NTSB accident report NTSB/PAR-11/01 found PG&E's integrity management programme to be deficient. The California Public Utilities Commission later imposed a $1.6B penalty on PG&E. The lesson for adversarial injection is not the specific causal chain. It is the consequence profile of a pipeline integrity AI that misses a critical defect in an HCA segment and allows the segment to remain in service.

Colonial Pipeline, Shelby County, Alabama, 9 September 2016. A 914 mm NPS 36 petroleum products pipeline (Colonial Pipeline Line 1) leaked approximately 350,000 US gallons of refined gasoline, which pooled in a mine retention pond in the Cahaba River watershed, an environmentally sensitive waterway. PHMSA's investigation (INC201603081) attributed the release to corrosion under a disbonded coating section that was not identified by the preceding ILI run. The ILI tool's analysis classified the anomaly as a coating disbondment feature rather than active corrosion, and no excavation was scheduled. The adversarial analogy: an AI classifier that misidentifies an active corrosion pit as a benign anomaly produces the same operational outcome as this incorrect classification, regardless of whether the misidentification is caused by an ML generalisation error or an adversarially crafted pixel perturbation.

El Paso Natural Gas, Carlsbad, New Mexico, 19 August 2000. A 762 mm NPS 30 natural gas transmission pipeline ruptured due to severe internal corrosion, and the released gas ignited and killed 12 people at a campsite adjacent to the Pecos River. The NTSB investigation (NTSB/PAR-03/01) identified the rupture as the result of severe internal corrosion that the operator's corrosion control programme had failed to prevent, detect, or control. For adversarial injection, the masking mechanism is different (structured pixel manipulation rather than an undetected physical process), but the outcome is the same: a critical corrosion defect that the integrity assessment programme did not call.

Three attack vector classes

Adversarial injection into pipeline ILI AI can be delivered through three distinct vector classes, each requiring a different attacker capability level.

Vector 1: ILI data analysis system compromise. ILI vendors operate centralised data analysis centres where raw pig data is uploaded from field operations, processed, rendered into MFL anomaly maps and UT B-scans, and submitted to the AI classification engine. A compromise of the vendor's data analysis software — the rendering pipeline specifically — enables an attacker to intercept rendered images before they reach the AI classifier and apply structured perturbations to targeted anomalies. The attacker requires knowledge of the MFL anomaly map colour scale (vendor-specific, but standardised within each vendor's tool line) and the AI classifier's architecture (can be inferred from the POD specification in the API 1163 performance data sheet). Access requires vendor network penetration.

Vector 2: man-in-the-middle (MitM) interception at the operator boundary. ILI vendors deliver Defect Call Reports to pipeline operators via secure file transfer (SFTP, API EDR data exchange format, or vendor portal). Adversarial injection at this boundary modifies the rendered images in the vendor's delivery package before the operator's IMP team or independent ILI data analyst reviews them. This vector requires compromise of the file transfer channel or the operator's data receiving system. The adversarial perturbation targets the rendered MFL images included in the delivery package alongside the Defect Call Report, causing the review analyst to see a suppressed anomaly image that is consistent with the clean call in the report.

Vector 3: training data poisoning of the ILI AI model. ILI AI classifiers are trained on large annotated datasets of MFL anomaly map images, each labelled with the defect type and dimensions measured by in-the-ditch excavation. An attacker with access to the training data repository — or to the ILI vendor's ML development pipeline — can introduce poisoned training examples: annotated MFL images in which a real SCC colony is labelled as background. The poisoned model learns to classify a subset of SCC colony morphologies as acceptable background, creating a systematic blind spot that persists across all subsequent ILI runs using that model version. This vector requires the most sophisticated attacker capability — persistent access to the vendor's ML development environment — but produces the most durable effect.

Glyphward threshold 40 for pipeline integrity ILI AI

Glyphward's adversarial detection API operates as a pre-scan gate at the rendered image ingestion boundary of an ILI AI classifier. For each MFL anomaly map image or UT B-scan submitted to the defect classification engine, the gate submits the image to Glyphward's API, receives a risk score (0–100 scale), and compares to the configured threshold.

We configure this threshold at 40 for pipeline integrity ILI AI contexts. The threshold selection reflects two considerations. First, the consequence asymmetry: a false positive by the Glyphward gate (flagging a clean MFL map as adversarially perturbed) routes the image to a qualified ILI data analyst for manual review of the raw sensor data — a minor process delay well within the 6-month PHMSA excavation deadline. A false negative (passing an adversarially crafted MFL map) can allow a critical SCC colony to be missed, with the consequence profiles described above. The threshold of 40 is calibrated to minimise false negatives at the cost of an elevated false positive rate that is operationally manageable. Second, unlike fully autonomous safety-critical systems — where the consequence of a false positive is an immediate operational disruption (as in surgical robotics AI at threshold 35, where a false positive during a live procedure stops an operation) — pipeline integrity ILI operates on long planning cycles. A false positive on an MFL map adds analyst hours to the call report generation timeline, not an immediate safety intervention.
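The gate logic itself is a single comparison. The sketch below is an assumption-laden illustration: `score_image` is a caller-supplied stand-in for whatever returns the 0–100 risk score, not Glyphward's actual client interface, and the type and route names are hypothetical. Scores at or above the threshold are routed to manual review.

```python
from dataclasses import dataclass
from typing import Callable

RISK_THRESHOLD = 40  # threshold discussed in the text


@dataclass
class GateDecision:
    risk_score: float
    route: str  # "classifier" or "manual_review"


def pre_scan_gate(image_bytes: bytes,
                  score_image: Callable[[bytes], float],
                  threshold: float = RISK_THRESHOLD) -> GateDecision:
    """Route an image based on its risk score.

    `score_image` is a hypothetical scoring callable supplied by the
    caller; it stands in for the real scoring service.
    """
    score = score_image(image_bytes)
    if not 0 <= score <= 100:
        raise ValueError(f"risk score out of range: {score}")
    route = "manual_review" if score >= threshold else "classifier"
    return GateDecision(risk_score=score, route=route)


# Usage with fixed stand-in scorers.
flagged = pre_scan_gate(b"rendered-mfl-map", lambda _: 57.0)
passed = pre_scan_gate(b"rendered-mfl-map", lambda _: 12.0)
```

Injecting the scorer as a callable keeps the routing rule testable without any network dependency.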

The Glyphward pre-scan gate generates a timestamped detection log for each image — scan_id, risk score, ILI run identifier, pipe joint reference, anomaly axial location in feet from launcher, and perturbation class (MFL cluster suppression, UT B-scan blur, camera contrast wash) — that satisfies the PHMSA record-keeping requirement under 49 CFR 195.452(l) and the API 1163 SQA anomalous-input logging requirement. For images flagged above threshold 40, the log entry records ‘outside validated input distribution — routed to manual analyst assessment per API 1163 Section 6.4’, providing the documentation that a PHMSA inspector reviewing the operator's IMP under 49 CFR 195.452(k) would expect for any AI-assisted ILI analysis system used on HCA segments.
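A log entry carrying the fields listed above might be structured as follows. The field names, the JSON layout, and the disposition strings are assumptions made for illustration and do not describe the actual log schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class DetectionLogEntry:
    """Hypothetical record layout; field names are illustrative only."""
    scan_id: str
    risk_score: float
    ili_run_id: str
    pipe_joint_ref: str
    axial_location_ft: float
    perturbation_class: str
    timestamp_utc: str
    disposition: str


def make_log_entry(scan_id, risk_score, ili_run_id, pipe_joint_ref,
                   axial_location_ft, perturbation_class, threshold=40):
    """Build one timestamped entry and record how the image was routed."""
    if risk_score >= threshold:
        disposition = ("outside validated input distribution; "
                       "routed to manual analyst assessment")
    else:
        disposition = "passed to classifier"
    return DetectionLogEntry(
        scan_id=scan_id,
        risk_score=risk_score,
        ili_run_id=ili_run_id,
        pipe_joint_ref=pipe_joint_ref,
        axial_location_ft=axial_location_ft,
        perturbation_class=perturbation_class,
        timestamp_utc=datetime.now(timezone.utc).isoformat(),
        disposition=disposition,
    )


# Example values below are invented placeholders.
entry = make_log_entry("scan-0001", 57.0, "run-A", "joint-120",
                       15234.5, "MFL cluster suppression")
log_line = json.dumps(asdict(entry))
```

Serialising each entry as one JSON line keeps the record append-only and easy to retain alongside the ILI run data.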

The same architectural pattern — a pre-scan gate before a safety-critical AI classifier that has a formal qualification standard with no adversarial robustness criterion — applies across the physical inspection AI landscape: EASA AMC 20-16 for jet engine borescope inspection AI, CENELEC EN 50129 SIL 4 for railway signalling AI, and FERC Part 12 for dam safety monitoring AI. The regulatory gap is structural and common to all of them: standards developed before adversarial ML was a practical attack vector, applied to AI systems whose rendered-image input boundaries are adversarially manipulable within the noise floor of compressed image transmission.

Free tier — 10 scans/day, no card required. Submit a rendered MFL anomaly map or UT B-scan image to the Glyphward scanner to generate a baseline risk score for your ILI AI classification pipeline.

FAQ

What does PHMSA 49 CFR 195.452 require for ILI-based pipeline integrity management — and what happens when an ILI AI classifier fails?

PHMSA's Integrity Management rule for hazardous liquid pipelines, 49 CFR 195.452, requires operators of pipelines in or affecting High Consequence Areas to implement a written Integrity Management Programme (IMP) including periodic ILI assessment of pipe segments. Inline inspection using instrumented pig tools is the Tier 1 assessment method. When an ILI run completes, the vendor delivers a Defect Call Report and the operator must evaluate each defect against fitness-for-service criteria (ASME B31G, RSTRENG, API 579). 49 CFR 195.452(h)(4) sorts HCA anomalies into immediate repair, 60-day, and 180-day conditions. When an ILI AI classifier misses a critical SCC colony, the defect is absent from the Call Report, no excavation is scheduled, and the segment remains in service past the regulatory deadline. Neither the operator nor PHMSA has any indication of the miss until a later assessment calls the defect or the segment fails. The consequence envelope is established by PG&E San Bruno 2010 (8 fatalities, $1.6B penalty) and Colonial Pipeline Alabama 2016 (350,000-gallon release).

What is the adversarial injection surface in MFL anomaly map AI — and why is SCC colony suppression the highest-risk target?

MFL pig raw sensor data is rendered into a 2D colour raster (the MFL anomaly map) where pixel colour encodes flux leakage amplitude. AI classifiers take this raster as input and output defect classifications. SCC colonies produce low-amplitude diffuse clusters that are already close to the background distribution, making them the highest-risk adversarial suppression target. A ±8 DN per-channel pixel shift, within JPEG quantisation noise, moves an SCC cluster at 35 %WT depth below the background detection threshold. The AI reclassifies the colony as background. No defect is called. No PHMSA excavation order is generated. Growing at 1–2 %WT/year, the colony reaches the critical burst threshold in 15–25 years, and later ILI runs catch the progression only if their rendered images are not suppressed in the same way.

What is the API 1163 qualification standard — and why doesn't it close the adversarial robustness gap?

API 1163 4th edition (2021) requires ILI AI vendors to demonstrate POD/POI statistics, ±10 %WT sizing accuracy, and a Software Quality Assurance programme including training data provenance and model version control. These tests are conducted on reference specimens under controlled conditions, using real or simulated defects of known dimensions. API 1163 does not require testing against adversarially perturbed rendered images. There is no POD metric reported against adversarial test sets and no SQA provision treating pixel perturbation of rendered sensor images as an in-scope failure mode. An ILI AI system holding a full API 1163 qualification certificate can be defeated by a ±8 DN MFL map perturbation, with no qualification test result speaking to that scenario. This is the same structural gap as CENELEC EN 50129 SIL 4 for railway signalling AI: formal qualification against expected failure modes without adversarial robustness coverage.

Which ILI AI systems are in scope — Baker Hughes FlexPIG, TDW SmartScan, ROSEN RoCorr UT, NDT Global?

Any ILI analysis pipeline that renders raw sensor data (MFL, UT, EMAT, camera) into a raster image and passes it to a deep learning classifier is in scope. Baker Hughes FlexPIG MFL AI (CNN-based defect classifier for metal loss, SCC, and mechanical damage), TDW SmartScan AI (MFL, Caliper, Deformation, Combo tool classification), ROSEN RoCorr UT AI (UT B-scan wall-thickness AI for corrosion and laminar flaw detection, NPS 6 to NPS 60+), NDT Global Evo Series AI (EMAT and UT ILI, crack and metal loss detection), Eddyfi Technologies PipeWIZARD AI (guided wave and conventional UT), and Percepto Arc drone aerial patrol AI (above-ground corrosion and coating holiday detection) are all in scope. The shared adversarial surface is the rendered sensor image input to the classification CNN — regardless of the underlying sensor modality.

How does a Glyphward pre-scan gate integrate with pipeline ILI AI at threshold 40, and what documentation does it generate for PHMSA?

Glyphward operates at the rendered image ingestion boundary — before the MFL classifier, UT B-scan classifier, or drone patrol classifier. Each image is submitted to Glyphward's API (8–15 ms latency), receives a risk score, and is compared to threshold 40. At or above 40, the gate raises an error, suppresses AI output, and routes the image to manual analyst review per API 1163 Section 6.4. Below 40, normal classification proceeds. The scan generates a timestamped log — scan_id, score, ILI run ID, pipe joint reference, axial location, perturbation class — satisfying PHMSA IMP record-keeping under 49 CFR 195.452(l) and the API 1163 SQA anomalous-input logging requirement. Flagged images are documented as ‘outside validated input distribution — routed to manual assessment’, providing the audit trail PHMSA inspectors expect for AI-assisted ILI on HCA segments.