Healthcare · AWS Bedrock · Multi-Agent AI

Cloud Migration Analysis Platform
AI-Governed Intelligence

A multi-agent AI pipeline that analyzes enterprise application portfolios and produces auditable, provenance-tagged 6R cloud migration recommendations — with zero hallucinated numbers and built-in healthcare governance.

36 Applications Analyzed
408 VMs Assessed
7 AI Agents per App
22 Governance Rules
$24 Total AI Cost
4.9h Run Time (36 apps)

Cloud migration advisory has a data quality crisis

When a healthcare organization decides to migrate 36 clinical applications — PACS systems, oncology platforms, cardiology imaging — to the cloud, the analysis required is enormous. Traditional consulting approaches fail in predictable ways that cost time, money, and clinical trust.

🔢

Invented Numbers

Consultants produce ROI projections ($2.3M savings!) that cite no source, reference no formula, and can't be reproduced. When challenged, the analysis falls apart. Healthcare CIOs lose confidence in the entire recommendation.

⏱️

Months of Manual Work

A portfolio of 36 applications — each requiring vendor research, VM telemetry analysis, procurement intelligence, dependency mapping, and financial modeling — traditionally takes 3–6 months of analyst time at $150–$350/hr billing rates.

🏥

Healthcare Blind Spots

Generic cloud migration frameworks don't know that a radiation therapy linear accelerator interface is hardware-integrated and life-safety-classified — making cloud migration structurally impossible, regardless of whether a SaaS alternative exists.

📋

No Audit Trail

When a healthcare compliance officer asks "why did you recommend REPURCHASE for Sectra PACS with 124 VMs?", there is no traceable chain from evidence to recommendation. The analysis is a black box.

🔄

Inconsistent Results

Different analysts, different days, different answers. A portfolio analysis that takes three consultants three months has no reproducibility guarantee — critical when the same portfolio needs re-analysis after a vendor acquisition.

💸

Cost of Engagement

A mid-size healthcare portfolio advisory engagement runs $250K–$750K in consulting fees. Most of that spend goes to data gathering and formatting — work that is inherently automatable if the AI is governed properly.


Cloud Migration Analysis Platform

The platform is a multi-agent AI pipeline built on AWS Bedrock (Claude Sonnet 4.6) that processes an enterprise application portfolio through a chain of specialized agents — each governed by deterministic validation rules that run after every LLM call. The pipeline produces fully auditable recommendations where every number traces to a source.

Pipeline Execution Flow — 7 Agents per Application
Step 0 · Telemetry Enrichment: Joins RVTools vCenter export with app assignment sheet
Agent 1 · Telemetry: CPU/RAM utilization profile from RVTools + heuristics
Agent 2 · Dependency: Migration risk, hardware deps, life-safety classification
Agent 3 · Procurement: SaaS alternatives, vendor intelligence, contract signals
Agent 4 · Provisioning: LLM classifies workload; Python engine does all math
Agent 5 · Synthesizer: Final 6R recommendation with confidence gate
Agent 6 · Confidence Advisor: Prioritized backlog to close evidence gaps
Agent 7 · Portfolio Narrative: Executive-grade prose summary across all apps

The provisioning agent uses a deliberate split: the LLM call classifies workload type and migration complexity (small vocabulary, fast). Then a deterministic Python engine performs all financial calculations using the Sourced Values Registry.

This eliminates the primary source of ROI inflation — the LLM cannot invent a number because it never touches the math. The Python engine computes vm_count × SRC-TCO-001 × cloud_multiplier with a fully traceable formula in the output JSON.

Result: every ROI projection is reproducible to the cent, and can be re-verified by any Python developer with the source data.
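
A minimal sketch of the deterministic side of that split could look like the following. The `SOURCED_VALUES` dictionary and the function name `projected_cloud_cost` are hypothetical stand-ins, not the platform's actual code; the two constants are the registry values quoted on this page.

```python
# Hypothetical in-memory stand-in for the Sourced Values Registry.
SOURCED_VALUES = {
    "SRC-TCO-001": 6500.0,  # on-prem TCO per VM per year, USD
    "SRC-MULT-001": 0.80,   # AWS cloud cost multiplier
}


def projected_cloud_cost(vm_count: int,
                         tco_id: str = "SRC-TCO-001",
                         mult_id: str = "SRC-MULT-001") -> dict:
    """Compute vm_count x TCO x multiplier and return a provenance-tagged node."""
    tco = SOURCED_VALUES[tco_id]
    mult = SOURCED_VALUES[mult_id]
    value = round(vm_count * tco * mult, 2)  # rounded to the cent
    return {
        "value": value,
        "provenance": "COMPUTED",
        "formula": f"vm_count × {tco_id} × {mult_id}",
        "inputs": [
            {"field": "vm_count", "value": vm_count,
             "provenance": "CUSTOMER_PROVIDED"},
            {"field": tco_id, "value": tco, "provenance": "SOURCED"},
            {"field": mult_id, "value": mult, "provenance": "SOURCED"},
        ],
        "computation": f"{vm_count} × {tco} × {mult} = {value}",
    }


node = projected_cloud_cost(18)
print(node["value"])
```

Because the LLM only supplies the classification and never the figures, rerunning this function on the same inputs gives the same number every time.
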
Healthcare fields like phi_handled, baa_confirmed, hardware_dependency, and life_safety_classification are tri-state — not binary. The pipeline enforces this with a governance rule:

true / false — requires a citation or artifact reference
UNKNOWN — when the field is null or absent in source data
NEVER infer true/false from null — null equals UNKNOWN, full stop

This matters critically for healthcare: an agent that infers baa_confirmed=false from a missing field will cap confidence at 40 on a PHI-handling system — potentially blocking a valid migration recommendation. The tri-state model separates "we don't know" from "the answer is no."
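
One way to picture the tri-state rule is a small parser that refuses to turn a missing value into an answer. This is an illustrative sketch: the `TriState` enum and `to_tristate` helper are hypothetical names, and raising an error for an uncited true/false is an assumption about how the citation requirement might be enforced.

```python
from enum import Enum
from typing import Optional


class TriState(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "UNKNOWN"


def to_tristate(value: Optional[bool], citation: Optional[str] = None) -> TriState:
    """Map a raw source value to a tri-state, never inferring from null."""
    if value is None:
        # Null or absent in source data means UNKNOWN, never false.
        return TriState.UNKNOWN
    if not isinstance(value, bool):
        raise TypeError(f"expected bool or None, got {type(value).__name__}")
    if not citation:
        # Assumption: an uncited true/false is rejected outright.
        raise ValueError("true/false requires a citation or artifact reference")
    return TriState.TRUE if value else TriState.FALSE


record = {"phi_handled": True}  # baa_confirmed is absent from the source data
print(to_tristate(record.get("baa_confirmed")))  # TriState.UNKNOWN
```
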
The synthesizer agent has a pre-check that fires before all 6R rules:

IF hardware_dependency == true AND (latency_sensitive == true OR life_safety_classification == true) THEN strategic_recommendation = RETAIN, unconditionally.

This catches systems like Philips Perinatal, Philips Patient Monitoring, and Mosaiq (radiation therapy) — where a vendor SaaS alternative exists, but the physical hardware integration and life-safety classification make migration structurally impossible regardless of cloud economics. The system explicitly refuses Rule 1 (REPURCHASE) for hardware-integrated clinical devices: nurse call controllers, infant security RFID, radiation therapy interfaces, defibrillator integrations.
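
The pre-check above can be sketched as a plain function that runs before any 6R rule. The function name and return convention are assumptions for illustration; the condition itself is the one stated in the text. Using `is True` means an UNKNOWN value never triggers the rule by accident.

```python
from typing import Optional


def life_safety_precheck(hardware_dependency,
                         latency_sensitive,
                         life_safety_classification) -> Optional[str]:
    """Return "RETAIN" when the pre-check fires, else None (fall through to 6R rules)."""
    if hardware_dependency is True and (
        latency_sensitive is True or life_safety_classification is True
    ):
        return "RETAIN"
    return None


# A hardware-integrated, life-safety-classified system is retained
# even if a vendor SaaS alternative exists.
print(life_safety_precheck(True, False, True))       # RETAIN
print(life_safety_precheck(True, "UNKNOWN", False))  # None
```
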
Four hard confidence caps enforce data quality:

SRC-CAP-001: ceiling 40 — phi_handled=true AND baa_confirmed != true
SRC-CAP-002: ceiling 55 — majority of dimensions INFERRED or MISSING
SRC-CAP-003: ceiling 65 — zero CONFIRMED dimensions
SRC-CAP-004: ceiling 65 — snapshot_count=1 AND apm_telemetry_available=false

And an unconditional confidence gate: if final_confidence < 60 and the strategic recommendation is not RETAIN, the displayed recommendation becomes RETAIN. No exceptions. If an agent reasons that an exception applies, that reasoning is the signal to apply the gate.

This means the system surfaces evidence gaps honestly rather than masking them with confident-sounding recommendations.
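
As a rough sketch, the four caps and the gate can be applied in one deterministic pass. The `facts` dictionary layout and the function name are hypothetical, and "majority" is read here as strictly more than half of the dimensions, which is an assumption.

```python
def apply_caps_and_gate(raw_confidence: int, recommendation: str, facts: dict):
    """Return (final_confidence, displayed_recommendation, applied_cap_ids)."""
    statuses = facts.get("dimension_statuses", [])
    weak = sum(1 for s in statuses if s in ("INFERRED", "MISSING"))

    caps = []
    if facts.get("phi_handled") is True and facts.get("baa_confirmed") is not True:
        caps.append(("SRC-CAP-001", 40))
    if statuses and weak > len(statuses) / 2:
        caps.append(("SRC-CAP-002", 55))
    if "CONFIRMED" not in statuses:
        caps.append(("SRC-CAP-003", 65))
    if facts.get("snapshot_count") == 1 and facts.get("apm_telemetry_available") is False:
        caps.append(("SRC-CAP-004", 65))

    # The lowest applicable ceiling wins.
    final = min([raw_confidence] + [ceiling for _, ceiling in caps])

    # Unconditional gate: below 60, anything other than RETAIN becomes RETAIN.
    displayed = recommendation
    if final < 60 and recommendation != "RETAIN":
        displayed = "RETAIN"
    return final, displayed, [cap_id for cap_id, _ in caps]


facts = {
    "phi_handled": True,
    "baa_confirmed": "UNKNOWN",
    "dimension_statuses": ["CONFIRMED", "CONFIRMED", "INFERRED"],
    "snapshot_count": 3,
    "apm_telemetry_available": True,
}
print(apply_caps_and_gate(85, "REPURCHASE", facts))
```
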
After pipeline completion, scenario_bridge.py runs with zero LLM calls. It reads the completed recommendations JSON and produces a deterministic comparison of current state (on-prem, full TCO) versus target state (recommended disposition, projected cloud cost) for every application.

The scenario bridge is the layer that populates the portfolio-level financial summary: total current TCO, total projected savings, number of apps in each 6R bucket. Because it uses no LLM, it runs in seconds and is fully reproducible.
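
The internals of scenario_bridge.py are not shown on this page, but a portfolio roll-up of this kind can be sketched in a few lines. The function name, the flat record layout, and the field names below are assumptions for illustration only.

```python
from collections import Counter


def summarize_portfolio(recommendations: list) -> dict:
    """Deterministically roll per-app results up into portfolio totals."""
    buckets = Counter(r["strategic_recommendation"] for r in recommendations)
    total_current = sum(r["current_annual_tco_usd"] for r in recommendations)
    total_projected = sum(r["projected_annual_cost_usd"] for r in recommendations)
    return {
        "total_current_tco_usd": total_current,
        "total_projected_cost_usd": total_projected,
        "total_projected_savings_usd": total_current - total_projected,
        # Sorted keys keep the output identical from run to run.
        "apps_per_6r_bucket": dict(sorted(buckets.items())),
    }
```

Since the input is a finished JSON file and the function has no randomness or model calls, two runs over the same file produce the same summary.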

Provenance-by-Construction: every number is traceable

The core architectural principle is that hallucination must be structurally impossible to hide — not merely discouraged. This is achieved through a provenance model that requires every numeric value to carry one of four explicit tags. A deterministic Python validator (22 rules, no LLM) runs after every agent call and rejects non-compliant output.

The Four Provenance Categories — Required on Every Numeric Value

SOURCED

Value has a named industry source. References the Sourced Values Registry (Gartner, IDC, AWS MAP data). Validator rejects if citation doesn't resolve.

CUSTOMER_PROVIDED

Value came from customer-supplied data. Requires source artifact (e.g., rvtools_export.xlsx) and source field. Validator rejects if artifact is not in the input manifest.

COMPUTED

Derived from a formula whose inputs are themselves provenance-tagged. Requires formula, inputs, and computation. Validator recomputes and rejects on mismatch > 0.01.

DEFAULT_ASSUMPTION

Documented default used when data is missing. References the Default Assumptions Registry. Surfaced explicitly in the output so reviewers see every assumption made.

❌ Traditional (allows hallucination)
{
  "current_annual_tco_usd": 117000,
  "projected_cloud_cost_usd": 31200,
  "annual_savings_usd": 85800
}
// Where did $117,000 come from? 
// No source. No formula. Unverifiable.
✅ This platform (provenance-by-construction)
{
  "current_annual_tco_usd": {
    "value": 117000,
    "provenance": "COMPUTED",
    "formula": "vm_count × on_prem_tco_per_vm",
    "inputs": [
      {"field": "vm_count", "value": 18,
       "provenance": "CUSTOMER_PROVIDED",
       "source_artifact": "rvtools_export.xlsx"},
      {"field": "on_prem_tco_per_vm", "value": 6500,
       "provenance": "SOURCED",
       "source_citation": "SRC-TCO-001",
       "source_basis": "Gartner IaaS TCO 2022-24"}
    ],
    "computation": "18 × 6500 = 117000"
  }
}
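
To illustrate how a validator might recompute a COMPUTED value like the one above, here is a simplified sketch. The function name `check_computed` is hypothetical, and it only understands formulas that are a product of named inputs joined by "×"; the real validator presumably handles more than that.

```python
import math

TOLERANCE = 0.01  # mismatch threshold stated for COMPUTED values


def check_computed(node: dict) -> list:
    """Return a list of error strings; an empty list means the node passes."""
    if node.get("provenance") != "COMPUTED":
        return []
    errors = [f"missing required key: {k}"
              for k in ("formula", "inputs", "computation") if k not in node]
    if errors:
        return errors

    # Every input must itself carry a provenance tag.
    for item in node["inputs"]:
        if "provenance" not in item:
            errors.append(f"input {item.get('field')} has no provenance tag")

    values = {item["field"]: item["value"] for item in node["inputs"]}
    terms = [t.strip() for t in node["formula"].split("×")]
    unresolved = [t for t in terms if t not in values]
    if unresolved:
        return errors + [f"formula terms not found in inputs: {unresolved}"]

    recomputed = math.prod(values[t] for t in terms)
    if abs(recomputed - node["value"]) > TOLERANCE:
        errors.append(f"value {node['value']} != recomputed {recomputed}")
    return errors
```

Run against the example node above, the recomputed product is 18 × 6500 = 117000, which matches the stated value, so the list of errors is empty.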

Sourced Values Registry

A centralized table of every numeric constant the system uses — on-prem TCO per VM by type, cloud cost multipliers, migration factors — each with Gartner / IDC citation and variance band. Agents reference Source IDs; they cannot override values without a documented reason.

SRC-TCO-001: $6,500/VM/yr · SRC-MULT-001: AWS 0.80× · Gartner IaaS 2022–2024

22-Rule Deterministic Validator

A Python program (not an LLM) runs after every agent call. It mechanically checks provenance tags, resolves source citations, recomputes arithmetic, and gates downstream agents on failure. PASS / PASS_WITH_DEFAULTS / FAIL — explicit, logged, surfaced.

V1: Provenance required · V4: Recomputed arithmetic · V7: Confidence caps · V11: Gate enforcement

C2 JSON Parse Retry

When an agent returns malformed JSON, the pipeline automatically retries up to 2 times with an explicit correction instruction injected into the next call. Parse failures and retries are logged and surfaced in the token report — never silently swallowed.

Max 2 retries · Explicit correction context · Full retry audit log
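
A retry loop of this shape might be sketched as follows. The `call_agent` callable stands in for the actual Bedrock invocation, which is not shown on this page, and the wording of the correction instruction and the `AgentParseError` class are assumptions.

```python
import json
from typing import Callable

MAX_RETRIES = 2


class AgentParseError(RuntimeError):
    """Raised when every attempt returned malformed JSON; carries the audit log."""

    def __init__(self, audit: list):
        super().__init__(f"no valid JSON after {len(audit)} attempts")
        self.audit = audit


def call_with_parse_retry(call_agent: Callable[[str], str], prompt: str):
    """Return (parsed_output, audit_log); retry up to MAX_RETRIES times."""
    audit = []
    current_prompt = prompt
    for attempt in range(MAX_RETRIES + 1):
        raw = call_agent(current_prompt)
        try:
            return json.loads(raw), audit
        except json.JSONDecodeError as exc:
            # Log the failure, then inject an explicit correction instruction.
            audit.append({"attempt": attempt, "error": exc.msg})
            current_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({exc.msg}). Return only one valid JSON object."
            )
    raise AgentParseError(audit)
```

The audit list is returned on success and attached to the exception on failure, so a parse problem is always visible to whatever builds the token report.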

Default Assumptions Visibility

Every default value used (when customer data is missing) is surfaced in a top-level default_assumptions_used array in the output. Reviewers see exactly what the system assumed, why, and whether better data would resolve it.

DEF-MIG-001: migration factor · recoverable: true/false · Explicit in report
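
One plausible way to build that array is to walk the finished output and collect every node tagged DEFAULT_ASSUMPTION. The function name, the `default_id` key, and the path format below are assumptions made for illustration.

```python
def collect_default_assumptions(node, path: str = "") -> list:
    """Recursively gather every DEFAULT_ASSUMPTION node with its location."""
    found = []
    if isinstance(node, dict):
        if node.get("provenance") == "DEFAULT_ASSUMPTION":
            found.append({
                "path": path,
                "default_id": node.get("default_id"),
                "value": node.get("value"),
            })
        for key, child in node.items():
            child_path = f"{path}.{key}" if path else key
            found.extend(collect_default_assumptions(child, child_path))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            found.extend(collect_default_assumptions(child, f"{path}[{index}]"))
    return found
```

The result can then be written to the top-level default_assumptions_used field so reviewers do not have to hunt through nested JSON to find what was assumed.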

What this changes for a healthcare advisory practice

This platform doesn't just automate the analysis — it changes what's possible. Engagements that were previously gated by analyst headcount can now be run iteratively, updated when new telemetry arrives, and re-executed against a changed portfolio without starting over.

Before — Manual Advisory
Analysis timeline: 3–6 months
Consulting cost (36 apps): $250K – $750K
Cost per application: $7K – $21K
Reproducibility: None (analyst-dependent)
Audit trail: Informal / PowerPoint
Healthcare rules: Framework-agnostic
Re-run cost: Full engagement fee again
After — AI-Governed Pipeline
Analysis timeline: 4.9 hours (36 apps)
AI compute cost (36 apps): $24 total
Cost per application: $0.67
Reproducibility: Deterministic (bit-for-bit)
Audit trail: Full JSON provenance chain
Healthcare rules: Life-safety, PHI, BAA, hardware
Re-run cost: $24 again
99.97%
Cost Reduction vs. Manual
$24 AI compute vs. conservative $75K manual estimate for the same 36-app portfolio. The differential grows with portfolio size.
250×
Speed Improvement
4.9 hours vs. ~3 months of analyst work. Re-runs after new telemetry arrives take the same 4.9 hours — not another 3 months.
100%
Numeric Traceability
Every financial figure in the output JSON carries a provenance chain that resolves to a cited industry source, a customer artifact, or a documented formula.
0
Silent Hallucinations
22 validator rules run after every LLM call. Any numeric value without provenance causes a validation failure and a retry, which structurally prevents silent invention.

This v4 run analyzed the full client Imaging & Radiology portfolio: 36 clinical applications spanning PACS systems (Sectra, 124 VMs), oncology (Mosaiq/Elekta), cardiology imaging (Lumedx Apollo, GE Centricity), radiology AI (RapidAI, Viz.ai, HeartFlow), and enterprise imaging (Hyland OnBase, 55 VMs). The pipeline correctly applied life-safety RETAIN rules to Philips Perinatal, Philips Patient Monitoring, and Mosaiq — systems where hardware integration makes cloud migration structurally blocked regardless of SaaS availability. The final report was published to a live S3-hosted dashboard accessible to the engagement team within hours of data ingestion.


Measured cost at production scale

Full token accounting from the 2026-05-22 production run — 36 apps, 408 VMs, Claude Sonnet 4.6 on AWS Bedrock.

Agent | Calls | Total Tokens | Cost | % of Total
Telemetry | 52 | 648,078 | $6.06 | 25.3%
Provisioning | 52 | 562,091 | $5.32 | 22.2%
Confidence Advisor | 36 | 756,546 | $4.80 | 20.0%
Synthesizer | 36 | 471,012 | $3.51 | 14.6%
Dependency | 38 | 332,724 | $2.63 | 11.0%
Procurement | 36 | 280,521 | $1.65 | 6.9%
TOTAL (36 apps / 408 VMs) | 250 | 3,050,972 | $23.97 | 100%
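
The shares and the per-application figure can be rechecked from the cost column alone. The short script below is only a worked check of the arithmetic using the numbers reported above; the variable names are arbitrary.

```python
# Per-agent cost in USD, copied from the token accounting above.
costs_usd = {
    "Telemetry": 6.06,
    "Provisioning": 5.32,
    "Confidence Advisor": 4.80,
    "Synthesizer": 3.51,
    "Dependency": 2.63,
    "Procurement": 1.65,
}

total = sum(costs_usd.values())
shares = {agent: round(100 * cost / total, 1) for agent, cost in costs_usd.items()}
cost_per_app = round(total / 36, 2)  # 36 applications in the run

print(round(total, 2))  # 23.97
print(shares)
print(cost_per_app)     # 0.67
```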

See the Live Output

The full Imaging & Radiology portfolio report and executive summary are live-hosted on AWS S3. Both are fully interactive with sortable tables, disposition breakdowns, and per-application evidence trails.

Built on AWS Bedrock · Claude Sonnet 4.6 · Python 3.11 · ~3M tokens · $24 total compute