AI-Enabled AML Typology Discovery and Detection

Solution & Architecture

Eight agents, three skill layers, one governed lifecycle.

The system is a constellation of eight specialized agents, each doing one well-bounded job. Four are heavy users of generative AI — Parsing, Data Binding, Translation, and SAR Mining. Four are deterministic engines — Ingestion, Pattern Discovery, Runtime Detection, and Calibration. The output is a typology library expressed in three layers — institution-agnostic specification, institution-specific data binding, and executable detection — that can be reviewed, versioned, and audited independently.

The four learning sources

The system absorbs typology knowledge from four distinct sources, each requiring its own ingestion treatment but converging on a common skill format.

Source	What it provides	Ingestion treatment
Regulatory advisories	Authoritative typology narratives with red flags. FinCEN, OFAC, FATF, FFIEC, foreign FIUs. Low volume, high authority.	Scheduled crawl with change detection; RSS where available; subscription email parsing.
SAR narrative mining	Highest-signal source. Analyst reasoning encoded in prose. Highly institution-specific.	Continuous read from case management; structured extraction; cross-narrative clustering.
ML pattern discovery	Unsupervised clusters of anomalous behavior; novel patterns no existing typology describes.	Graph community detection & sequence clustering on transaction data; agent names & explains clusters.
Peer & enforcement intelligence	314(b) information sharing, DOJ DPAs, court filings, consortium publications, enforcement actions.	Scheduled crawl of free sources; manual ingestion for 314(b); commercial feed where available.

The eight agents — in detail

Each agent has a single responsibility. The separation matters: it concentrates model risk in the four agents that depend on generative AI (Parsing, Data Binding, Translation, SAR Mining) and keeps the four deterministic agents (Ingestion, Pattern Discovery, Runtime, Calibration) auditable as pure code. The runtime detection path itself contains no LLM.


AG-01

Ingestion Agent

Watches the world for change

● ACTIVE

Deterministic crawler · No LLM

Tasks · 24h

1,247

Throughput

147 source checks every 15 minutes; ~1,247 fetch operations / 24h

LLM cost · 24h

$0 (infrastructure only)

LLM usage

None

Inputs

147 monitored sources (FATF, FinCEN, Treasury, Egmont, OFAC, FCA, internal SAR queue)
RSS/Atom feeds (47 sources)
Scheduled HTTPS crawls with content-hash comparison (89 sources)
Email subscription parsing (11 sources)
Manual upload queue for documents emailed by compliance team

Outputs

Kafka topic: nexus.documents.new
Document metadata: source_id, hash, retrieved_at, type, size, URL
Raw text in object store (s3://nexus/documents/)

Internal pipeline — 5 stages

Fetch · Deterministic

Pull source content (HTTP GET with conditional headers; honor robots.txt and rate limits)

Hash & dedupe · Deterministic

SHA-256 of normalized content; compare against last-seen-hash table

Classify · Rule-based

Document type tag (advisory/trend-report/typology/SAR/internal-policy)

Extract · Deterministic

Pull PDF text via pdftotext + tabula for tables; HTML via readability

Enqueue · Deterministic

Publish to Kafka topic nexus.documents.new with metadata envelope
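
The hash-and-dedupe stage can be illustrated with a minimal sketch. It assumes that "normalized content" means whitespace-collapsed text and that the last-seen-hash table behaves like a dictionary keyed by source_id; the function names are hypothetical and not taken from the system itself.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Collapse whitespace runs so cosmetic reflows do not change the hash (assumed rule)."""
    return re.sub(r"\s+", " ", text).strip()


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the normalized content."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def is_new_version(source_id: str, text: str, last_seen: dict[str, str]) -> bool:
    """Return True and update the table when the content differs from the last-seen hash."""
    digest = content_hash(text)
    if last_seen.get(source_id) == digest:
        return False
    last_seen[source_id] = digest
    return True
```

Only documents for which is_new_version returns True would continue to the classify, extract, and enqueue stages.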

Dependencies

Source registry table (REF_SOURCES): URLs, schedules, parsing hints
Object store for raw document persistence
Kafka cluster for downstream notification

Failure modes

Source URL changed → flagged as STALE after 3 failed checks; ops alerted
PDF extraction fails → routed to OCR fallback queue (Tesseract)
Hash collision → logged but processing continues

AG-02

Parsing Agent

Reads regulatory documents into structured typology specs

● ACTIVE

LLM-heavy · Claude Sonnet with tool use

Tasks · 24h

Throughput

~3 documents per 24h (advisories are rare)

LLM cost · 24h

$2.18 (3 documents × ~$0.73 avg)

LLM usage

5-12 LLM calls per document. Avg 4,200 input tokens + 2,800 output tokens per call.

Inputs

Documents from AG-01 (nexus.documents.new topic)
Prior typology specs (for reference / deduplication)
Concept registry (existing C.* concepts) for vocabulary alignment

Outputs

Typology spec YAML written to nexus.typologies.draft topic
Confidence score per behavioral component
List of new concepts flagged for AG-03 to bind
Self-critique notes appended as audit metadata

Internal pipeline — 6 stages

Chunk · Deterministic

Split document into semantic sections (regex on headers + length cap)

Classify intent · LLM

LLM: is this a new typology, an update, or unrelated?

Extract spec draft · LLM

LLM with structured-output schema: produces YAML matching typology-spec.schema.json

Reuse concepts · LLM + retrieval

RAG over concept registry; LLM proposes existing C.* concepts where applicable, marks NEW where not

Self-critique · LLM

LLM reviews own output for missing components, vague metrics, conflicts with existing typologies

Validate schema · Deterministic

JSON schema check against typology-spec.schema.json

Dependencies

Claude API (model: claude-sonnet-4-20250514)
Concept registry vector store (Pinecone-equivalent)
typology-spec.schema.json definition

Failure modes

Schema violation → up to 3 retries with error feedback in prompt; then flag for human review
Hallucinated concept → caught by schema check against concept registry
Ambiguous narrative → low-confidence flag set; AG-03 review required
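
The schema-violation handling above (retry with error feedback, then human review) can be sketched as follows. This is an illustration under stated assumptions: the required-key check is a stand-in for the real JSON Schema validation, and the generate callable is a hypothetical wrapper around the LLM call that accepts the previous attempt's errors.

```python
from typing import Callable, Optional

# Assumed subset of typology-spec.schema.json, for illustration only.
REQUIRED_KEYS = {"typology_id", "name", "behavioral_concepts"}
MAX_RETRIES = 3  # "up to 3 retries" from the failure-mode list


def validate(spec: dict) -> list[str]:
    """Return validation errors; an empty list means the spec passes."""
    return [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(spec))]


def draft_with_retries(
    generate: Callable[[Optional[list[str]]], dict],
) -> tuple[Optional[dict], bool]:
    """Run one attempt plus up to MAX_RETRIES retries, feeding errors back each time.

    Returns (spec, needs_human_review).
    """
    errors: Optional[list[str]] = None
    for _ in range(1 + MAX_RETRIES):
        spec = generate(errors)
        errors = validate(spec)
        if not errors:
            return spec, False
    return None, True
```

Passing the error list back into generate is what lets the next prompt include the validator's feedback.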

AG-03

Data Binding Agent

Binds abstract concepts to bank data model expressions

⚠ REVIEW

LLM + Retrieval · concept resolution

Tasks · 24h

Throughput

~42 binding evaluations per 24h

LLM cost · 24h

$1.84 (LLM calls for novel bindings)

LLM usage

Used only for novel/composed bindings. Avg 2-4 calls per typology.

Inputs

Concept list from AG-02 (typology spec)
Binding registry (existing concept→data-expression mappings)
Data catalog (DIM_/FCT_ tables, column-level descriptions, sample values)
Composition rules: how to compose existing bindings into new ones

Outputs

Binding registry entries (REUSED, COMPOSED, NEW) per concept
Confidence score per binding (0-1)
Pending FCC review queue (for NEW + low-confidence)
Sample SQL snippets demonstrating each binding

Internal pipeline — 5 stages

Lookup existing · Retrieval

Vector search for concept in binding registry; if confidence > 0.85, mark as REUSED

Compose · LLM + retrieval

If close match (0.65-0.85), propose composition of 2-3 existing bindings

Author new · LLM

If no match, LLM authors a new binding using data catalog as grounding; marks NEW

Confidence score · LLM + deterministic

LLM rates own confidence; cross-checks: does the binding actually exist in data, does it return rows, sanity-check on sample query

Queue for review · Rule-based

NEW and low-confidence COMPOSED bindings flagged for FCC review
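
The threshold routing in this pipeline can be sketched as below. The 0.85 and 0.65 cut-offs come from the stage descriptions; treating a score of exactly 0.85 as a close match and using 0.85 as the review floor are assumptions made for the example.

```python
REUSE_THRESHOLD = 0.85    # above this, the existing binding is reused
COMPOSE_THRESHOLD = 0.65  # from here up to REUSE_THRESHOLD, compose existing bindings


def binding_status(similarity: float) -> str:
    """Map a registry similarity score to REUSED, COMPOSED, or NEW."""
    if similarity > REUSE_THRESHOLD:
        return "REUSED"
    if similarity >= COMPOSE_THRESHOLD:
        return "COMPOSED"
    return "NEW"


def needs_fcc_review(status: str, confidence: float, floor: float = 0.85) -> bool:
    """NEW bindings always go to review; others only below the (assumed) confidence floor."""
    return status == "NEW" or confidence < floor
```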

Dependencies

Binding registry database (PostgreSQL)
Data catalog (Alation or equivalent)
Read access to data warehouse for binding validation
Claude API for novel binding authoring

Failure modes

Concept resolves to multiple equally-likely bindings → presents both to human for disambiguation
Binding refers to non-existent column → caught by data catalog cross-check
Binding returns zero rows on test → flagged as suspicious, escalated to data engineering

AG-04

Translation Agent

Compiles typology spec + bindings into runnable detection code

● ACTIVE

LLM Codegen · with deterministic validation

Tasks · 24h

Throughput

~6 typology compiles per 24h (when active)

LLM cost · 24h

$3.42 (per typology compile + backtest validation)

LLM usage

3-6 LLM calls per typology compile. Avg 8,000 input + 4,500 output tokens.

Inputs

Approved typology spec (Layer 1)
Approved bindings (Layer 2)
Target runtime (Spark SQL on OCP / PySpark / Java)
Detection code style guide and prior templates

Outputs

Detection code in target language(s)
Backtest report: recall on historical SARs, FP estimate, runtime measurement
Code provenance metadata for MRM audit
Performance characterization

Internal pipeline — 6 stages

Plan · LLM

LLM produces query plan: tables to join, CTE structure, filter conditions, aggregations

Generate · LLM

LLM emits code in target language using bindings verbatim; references concept names in comments

Compile check · Deterministic

Run through a language-appropriate checker (sqlglot for SQL, Black for Python, javac for Java); the code must parse or compile

Backtest · Deterministic + analytical

Execute query against 90-day historical data; compare alerts produced to known SARs

Performance check · Deterministic

Estimate runtime; flag if > 30 min on full corpus

Self-critique · LLM

LLM reviews generated code for correctness, efficiency, edge cases

Dependencies

Claude API (model: claude-sonnet-4-20250514)
SQL/code linters (sqlglot, Black, javac)
Backtest dataset (90-day historical transaction warehouse)
Code style guide repository

Failure modes

Generated code doesn't compile → up to 3 retries with error feedback
Backtest recall < 70% → blocks promotion, alerts data science team
Runtime > 30 min → blocks promotion, returns to LLM with performance feedback
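
The promotion gates listed above can be expressed as a small deterministic check. The sketch below assumes a hypothetical BacktestReport record; the 70% recall floor and 30-minute runtime cap are taken from the failure modes, while the blocker labels are illustrative.

```python
from dataclasses import dataclass

MIN_RECALL = 0.70         # backtest recall below this blocks promotion
MAX_RUNTIME_MINUTES = 30  # runtime above this blocks promotion


@dataclass
class BacktestReport:
    compiles: bool
    recall: float
    runtime_minutes: float


def promotion_blockers(report: BacktestReport) -> list[str]:
    """Return the reasons a compile cannot be promoted; an empty list means it may proceed."""
    blockers = []
    if not report.compiles:
        blockers.append("compile_failed")
    if report.recall < MIN_RECALL:
        blockers.append("recall_below_70pct")
    if report.runtime_minutes > MAX_RUNTIME_MINUTES:
        blockers.append("runtime_over_30min")
    return blockers
```

Keeping the gate as plain code, separate from the LLM steps, is consistent with the document's emphasis on deterministic validation.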

AG-05

Pattern Discovery Agent

Unsupervised discovery of patterns not yet in the library

● ACTIVE

Graph + Clustering · No LLM in hot path

Tasks · 24h

Throughput

4 active cluster jobs; full graph rebuild every 6 hours

LLM cost · 24h

$0 (compute only, no LLM)

LLM usage

None in detection. LLM may be invoked at hand-off to AG-02 for spec drafting.

Inputs

Live transaction stream (90-day rolling window)
Customer dimension (account, demographics, profile)
Existing typology rules (to subtract known patterns)
Historical SAR corpus (positive examples)

Outputs

Cluster registry: cluster ID, confidence trend, evidence count, supporting features
Drift alerts when existing typology FP rate changes beyond baseline
Candidate typology surfacing when cluster confidence crosses threshold

Internal pipeline — 6 stages

Build graph · Deterministic

Account-counterparty graph with transaction edges weighted by amount, frequency

Subtract known · Deterministic

Remove subgraphs matching existing typology patterns

Cluster · ML

Run community detection (Louvain) + temporal pattern clustering (k-shape on transaction sequences)

Score novelty · Deterministic + stats

Compare cluster signature to existing typologies; high novelty = cluster of interest

Accumulate evidence · Deterministic

Track cluster over time; add supporting evidence as new transactions arrive

Confidence update · Stats

Recompute confidence as evidence accumulates; cross 0.75 threshold → surface as candidate
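
The deterministic ends of this pipeline (graph construction, subtraction of known patterns, and the threshold crossing) can be sketched in plain Python. The example simplifies heavily: "known patterns" are represented as a set of account-counterparty pairs rather than subgraph matches, and the clustering step itself (Louvain, k-shape) is omitted. All names are hypothetical.

```python
from collections import defaultdict


def build_graph(transactions):
    """Aggregate (account, counterparty, amount) rows into weighted edges."""
    edges = defaultdict(lambda: {"amount": 0.0, "count": 0})
    for account, counterparty, amount in transactions:
        edge = edges[(account, counterparty)]
        edge["amount"] += amount  # weight by total amount
        edge["count"] += 1        # weight by frequency
    return dict(edges)


def subtract_known(edges, known_pairs):
    """Drop edges already explained by existing typologies (simplified to pair matching)."""
    return {pair: weights for pair, weights in edges.items() if pair not in known_pairs}


def crossed_threshold(history, threshold=0.75):
    """True when the latest confidence reaches the threshold and the previous one was below it."""
    return len(history) >= 2 and history[-2] < threshold <= history[-1]
```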

Dependencies

Graph database (Neo4j or equivalent)
Spark cluster for batch clustering jobs
Time-series store for cluster confidence history

Failure modes

Spurious cluster from data quality issue → confidence drops over time as evidence fails to accumulate
Slow drift goes undetected → mitigated by 90-day rolling window comparison
Cluster captures legitimate business pattern → caught at AG-02 spec drafting step before production

AG-06

Runtime Detection Agent

Executes detection logic against live transactions

● LIVE

Deterministic · High-throughput executor

Tasks · 24h

2.41M txn

Throughput

2.41M transactions evaluated per 24h; 44 alerts emitted today

LLM cost · 24h

$0 (cluster cost amortized)

LLM usage

None — pure deterministic execution

Inputs

Approved detection code from AG-04
Live transaction stream from core banking (Kafka)
Customer dimension snapshots (refreshed daily)
Reference data (high-risk jurisdictions, crypto exchanges, etc.)

Outputs

Kafka topic nexus.alerts.typology (production)
Kafka topic nexus.alerts.shadow (shadow mode)
Audit log entries in S3 with retention policy
Daily runtime + alert volume metrics

Internal pipeline — 6 stages

Schedule · Deterministic

Cron triggers daily at 02:00 UTC; on-demand triggers for shadow mode

Load context · Deterministic

Pull latest customer + reference data into Spark

Execute · Spark SQL

Run typology query against last-N-day transaction window

Materialize evidence · Deterministic

Build evidence struct for each alert (BCs met, transaction list, narrative inputs)

Route · Deterministic

Production alerts → Actimize case queue. Shadow alerts → shadow comparison topic.

Audit log · Deterministic

Every alert + execution metadata → immutable audit store
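
The evidence and routing stages can be illustrated with a small sketch. The topic names come from the outputs listed above; the Alert fields and the lifecycle_state values are assumptions chosen for the example, not the system's actual evidence schema.

```python
import json
from dataclasses import asdict, dataclass

PRODUCTION_TOPIC = "nexus.alerts.typology"
SHADOW_TOPIC = "nexus.alerts.shadow"


@dataclass
class Alert:
    typology_id: str
    typology_version: str
    lifecycle_state: str        # assumed values: "PRODUCTION" or "SHADOW"
    components_met: list[str]   # behavioral components (BCs) satisfied
    transaction_ids: list[str]  # supporting transactions


def route(alert: Alert) -> tuple[str, str]:
    """Return (topic, JSON payload) for an alert based on its lifecycle state."""
    if alert.lifecycle_state == "PRODUCTION":
        topic = PRODUCTION_TOPIC
    elif alert.lifecycle_state == "SHADOW":
        topic = SHADOW_TOPIC
    else:
        raise ValueError(f"state {alert.lifecycle_state!r} does not emit alerts")
    return topic, json.dumps(asdict(alert), sort_keys=True)
```

The returned payload would be what a Kafka producer publishes; the producer call is left out because the client library in use is not specified.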

Dependencies

Spark cluster on OpenShift Container Platform
Kafka cluster
Actimize integration adapter
Data warehouse read access

Failure modes

Query timeout → typology marked degraded, on-call notified, runs at next slot
Reference data stale → flagged in metadata, alerts still emit but tagged
Actimize routing fails → alerts persist in retry queue; ops escalation

AG-07

SAR Mining Agent

Learns from internal SAR narratives to refine typologies

● ACTIVE

LLM-heavy · Narrative analyst

Tasks · 24h

Throughput

23 SARs processed in past 24h

LLM cost · 24h

$3.42 (23 SARs × ~$0.15 avg)

LLM usage

Avg 2 LLM calls per SAR (read + similarity ranking).

Inputs

New SARs from internal investigation queue (typically 20-50/day)
Existing typology library
Cluster registry from AG-05
Historical SAR corpus for similarity matching

Outputs

SAR → typology mapping with confidence
Novel pattern submissions to AG-05
Training set updates for AG-08
SAR processing report for FCC daily briefing

Internal pipeline — 6 stages

Read SAR · LLM

LLM reads full SAR narrative + structured fields

Extract signals · LLM

LLM extracts behavioral signals: customer profile, transaction pattern, red flags, outcome

Match to library · LLM + retrieval

Compute similarity to each existing typology; rank top 3 matches with confidence

Decide mapping · Rule-based

If top match > 90% confidence → mapped. If < 60% → novel. In between → clustered for review.

Update calibration training set · Deterministic

Mapped SARs feed AG-08 calibration training corpus

Surface novel patterns · Deterministic

Novel SARs surface to AG-05 for clustering with supporting metadata
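
The rule-based mapping decision can be written directly from the thresholds above. The sketch assumes similarity scores arrive as a dictionary of typology ID to confidence; the outcome labels are hypothetical.

```python
def top_matches(scores: dict[str, float], k: int = 3) -> list[tuple[str, float]]:
    """Rank typologies by similarity and keep the top k."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


def decide_mapping(top_confidence: float) -> str:
    """Apply the 90% / 60% rule to the best match."""
    if top_confidence > 0.90:
        return "MAPPED"   # feeds the AG-08 calibration training set
    if top_confidence < 0.60:
        return "NOVEL"    # surfaced to AG-05
    return "REVIEW"       # borderline, routed to an FCC analyst
```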

Dependencies

Claude API for SAR reading
SAR repository (internal)
Typology library + concept registry

Failure modes

SAR pattern doesn't fit any typology and isn't novel enough → flagged for human review
Confidence borderline (60-90%) → routed to FCC analyst for adjudication
Sensitive SAR (insider, terrorism) → handled in restricted-access pipeline

AG-08

Calibration Agent

Tunes thresholds based on observed alert outcomes

○ IDLE

Statistical · No LLM

Tasks · 24h

Throughput

Auto-tune queue checked daily; 3 queued, next scheduled May 18

LLM cost · 24h

$0 (no LLM)

LLM usage

None

Inputs

Alert outcomes from Actimize (closed as SAR, closed as false positive, escalated)
Calibration training set (from AG-07 SAR mapping)
Current typology thresholds (baseline)
Operating constraints (max FP rate, min recall)

Outputs

Calibration change proposals with impact projections
Auto-applied minor adjustments (< 5% impact)
Per-typology calibration history for MRM audit

Internal pipeline — 6 stages

Aggregate outcomes · Deterministic

Group last 30 days of alert outcomes by typology + behavioral component

Compute current performance · Stats

Recall, precision, FP rate per typology

Search threshold space · Stats

Grid search over threshold combinations; Pareto frontier of recall vs FP

Validate stability · Stats

Bootstrap resampling; reject changes that show high variance

Propose adjustment · Stats

Generate change set with projected impact (recall delta, FP delta, alert volume delta)

Human approval · Workflow

Changes > 5% impact require human approval; smaller auto-apply
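
The Pareto-frontier selection and the approval rule can be sketched as follows. Candidates are assumed to be (threshold, recall, fp_rate) tuples produced by the grid search; how a change of exactly 5% is treated is not stated in the text, so the sketch auto-applies it as an assumption.

```python
def pareto_frontier(candidates):
    """Keep candidates that no other candidate beats on both recall (higher) and FP rate (lower)."""
    frontier = []
    for cand in candidates:
        dominated = any(
            other[1] >= cand[1] and other[2] <= cand[2]
            and (other[1] > cand[1] or other[2] < cand[2])
            for other in candidates
        )
        if not dominated:
            frontier.append(cand)
    return frontier


def requires_human_approval(impact_fraction: float, limit: float = 0.05) -> bool:
    """Changes with more than 5% projected impact need sign-off; smaller ones auto-apply."""
    return abs(impact_fraction) > limit


def can_tune(n_outcomes: int, minimum: int = 100) -> bool:
    """Defer tuning until at least 100 outcomes have been observed."""
    return n_outcomes >= minimum
```

The bootstrap stability check would sit between frontier selection and the approval rule; it is omitted here to keep the example short.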

Dependencies

Alert outcome database
MRM-approved calibration constraints
Statistical computing environment (Python + scipy/statsmodels)

Failure modes

Insufficient outcome data → defers tuning until N >= 100 observed outcomes
Bimodal performance → flags for human investigation (likely two patterns merged)
Drift detected → triggers AG-05 to investigate; calibration freezes pending analysis

Functional architecture

The eight agents and skill artifacts compose into a layered architecture that separates intelligence sources, transformation, the skill library itself, runtime execution, and the continuous-learning loop. Crucially, the runtime detection path contains no generative model in the hot path — the AI work is upstream, in the construction and maintenance of the skill library, and downstream, in learning from outcomes.

Figure 2.1

Functional architecture — eight agents across five horizontal bands

Source Layer: FinCEN / FATF / Treasury · Internal SAR Queue · Live Transactions · Wolfsberg / Egmont / ACAMS

↓

Ingestion & Discovery: AG-01 Ingestion · AG-05 Pattern Discovery

↓

Transformation (LLM-heavy): AG-02 Parsing · AG-03 Data Binding · AG-04 Translation

↓

Skill Library: Typology Spec (L1) · Data Binding (L2) · Executable Detection (L3) · Binding Registry

↓

Governance Gates: Candidate → Shadow → Production → Deprecated · MRM Review · FCC Approval

↓

Runtime (Deterministic): AG-06 Runtime Detection · Evidence Payload · Alert → Actimize Case Management

↓

Continuous Learning: AG-07 SAR Mining · AG-08 Calibration · Outcomes → back to Skill Library

Legend: Heavy LLM (AG-02, AG-03, AG-04, AG-07) · Skill Artifact · Deterministic / Data Eng (AG-01, AG-05, AG-06, AG-08)

The three-layer skill artifact

A typology in the library is not a single object. It is three nested artifacts, each versioned independently, each reviewable by different stakeholders. This separation is what makes the system defensible under SR 11-7 and auditable for regulators.

Layer 01

Typology Specification

Institution-agnostic. Narrative, provenance, behavioral components, combination logic, default calibration. Portable across clients. The output of the parsing agent.

Layer 02

Data Binding

Institution-specific. Each abstract concept resolved against the bank's data dictionary, with confidence scores, edge cases, and human approvals. The institution's accumulated ontology.

Layer 03

Executable Detection

Generated artifact. Deterministic SQL or scenario code. Validation evidence from shadow runs. The runtime contract.
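
One way to picture the three independently versioned layers is as three small record types, with Layer 3 pinning the exact Layer 1 and Layer 2 versions it was generated from. The field names below are illustrative assumptions, not the library's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypologySpec:  # Layer 1: institution-agnostic
    typology_id: str
    version: str
    concepts: tuple[str, ...]


@dataclass(frozen=True)
class DataBinding:  # Layer 2: institution-specific
    concept_id: str
    version: str
    expression: str
    confidence: float


@dataclass(frozen=True)
class ExecutableDetection:  # Layer 3: generated artifact
    typology_id: str
    version: str
    spec_version: str                            # pinned Layer 1 version
    binding_versions: tuple[tuple[str, str], ...]  # pinned (concept_id, version) pairs
    code: str


def unbound_concepts(spec: TypologySpec, bindings: list[DataBinding]) -> list[str]:
    """Concepts in the spec that still lack a Layer 2 binding."""
    bound = {binding.concept_id for binding in bindings}
    return [concept for concept in spec.concepts if concept not in bound]
```

Pinning versions in the Layer 3 record is one way to let each layer be reviewed and revised on its own while keeping the executable artifact traceable.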

The compounding asset

The Layer 2 binding registry is where the system's institution-specific moat lives. Once "low-income occupation" or "high-risk corridor" has been bound and approved at the bank, every future typology that references those concepts reuses the binding. The 51st typology takes a fraction of the effort of the first.

III

Integration with the AML Estate

A new intelligence layer above Actimize — not a replacement.

The system is deliberately additive. It does not displace Actimize, the case management workflow, the SAR filing pipeline, or the existing scenario library. It sits above them as a learning layer that produces governed typologies and routes alerts through the bank's existing investigation infrastructure.

Where this system sits in the existing landscape

The bank's current AML estate is well-established: Actimize for detection scenarios and case management, the data lake or warehouse for transaction data, the SAR filing pipeline to FinCEN, and the model risk management governance that wraps it all. The proposed system threads through these without replacing them.

Figure 3.1

Three zones — existing estate, new typology system, integration points

Existing · Source Data

Bank Data Estate

Transaction Data: Core transactions, wires, P2P, cash

Customer & KYC: Customer master, occupation, risk rating

Reference Data: Geography, MCC/NAICS, sanctions

Filed SARs: Historical narratives, dispositions

New · Typology System

Nexus Typology Layer

Ingestion + Parsing + Binding + Translation: Eight agents producing skills

Skill Library: Versioned three-layer artifacts

Binding Registry: Institution ontology

Lifecycle Governance: Candidate → Shadow → Production

Runtime Detection: Deterministic execution

Existing · Investigation

Actimize & Beyond

Actimize Scenarios: Existing detection logic (preserved)

Actimize Case Management: Alert disposition workflow

Investigator Workbench: Analyst tooling, RFI, escalation

SAR Filing Pipeline: Narrative drafting, FinCEN BSA E-Filing

End-to-end alert flow

Transaction lands in data estate → Typology skill evaluates → Alert with evidence payload → Actimize case created via API → Analyst dispositions → Disposition feeds back to learning loop

Five integration points worth naming explicitly

The integration is not a single connection; it is a small set of well-defined contracts between the new system and the existing estate. Each is technically straightforward but operationally meaningful.

Integration point	Direction	Mechanism
Transaction data feed	Bank → Typology system	Read access to the analytics layer of the data estate (Iceberg, Hadoop, Snowflake). No new pipelines.
SAR narrative ingestion	Actimize → Typology system	Daily extract of filed SAR narratives plus disposition codes. API or scheduled file drop.
Alert publication	Typology system → Actimize	Alerts routed via Actimize Alert REST API with full evidence payload, typology reference, version. Appears as a new alert type alongside existing scenarios.
Disposition feedback	Actimize → Typology system	Analyst dispositions on typology-generated alerts feed back into shadow validation and calibration tuning.
MRM artifact handoff	Typology system → MRM	Generated development docs, validation reports, and ongoing monitoring dashboards exported to MRM document repository.

Coexistence with the existing Actimize scenario library

Existing Actimize scenarios continue to run unchanged. The new system contributes additional alerts through the same case-management front door. Over time, three patterns emerge: (a) new typology coverage that Actimize did not have; (b) refinements to existing scenarios where the new system suggests better calibration; (c) candidate retirements where the new system demonstrates that older scenarios are producing only redundant alerts. None of this is forced — the institution decides what to retire and when.

What is explicitly out of scope

This system does not replace Actimize, does not own SAR filing, does not own case management, and does not score transactions in real-time payment paths. It is a typology intelligence and detection layer, not an end-to-end AML platform.

Sequencing the implementation

A realistic deployment proceeds in three phases. The first delivers value within a quarter without touching production detection; the second introduces shadow alerts; the third graduates the most-validated skills into production alerting.

Phase 01 · Quarter 1

Foundation & first skills

Stand up ingestion of FinCEN advisories and 12 months of historical SARs. Build initial binding registry. Produce first 5–8 skills in candidate state. Validate against historical alerts. No production impact.

Phase 02 · Quarters 2–3

Shadow operation

Promote skills to shadow mode. Generate parallel alerts not visible to investigators but compared daily against Actimize output. Tune calibration. Validate recall against historical SARs. Begin pattern discovery agent operation.

Phase 03 · Quarter 4+

Production alerting

Graduate validated skills to production. Alerts flow to Actimize case management. Disposition feedback loop active. SAR mining continuous. New skills proposed monthly; retirement decisions on existing scenarios begin.
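
The three phases track the skill lifecycle (Candidate → Shadow → Production → Deprecated). A minimal sketch of that lifecycle as a state machine follows. It assumes strictly linear transitions and that both MRM and FCC sign-off are needed at every gate; the document names both gates but does not specify which approvals each transition requires.

```python
LIFECYCLE = ["CANDIDATE", "SHADOW", "PRODUCTION", "DEPRECATED"]
REQUIRED_APPROVALS = {"MRM", "FCC"}  # assumed to apply to every transition


def next_state(state: str) -> str:
    """Return the state that follows `state`; DEPRECATED is terminal."""
    index = LIFECYCLE.index(state)  # raises ValueError for unknown states
    if index == len(LIFECYCLE) - 1:
        raise ValueError("DEPRECATED is a terminal state")
    return LIFECYCLE[index + 1]


def promote(state: str, approvals: set[str]) -> str:
    """Advance one lifecycle step only when all required approvals are present."""
    missing = REQUIRED_APPROVALS - set(approvals)
    if missing:
        raise PermissionError(f"missing approvals: {sorted(missing)}")
    return next_state(state)
```

A real implementation would likely also need an edge for closing a candidate early, which this linear sketch leaves out.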

Three Worked Examples

Three sources, one library — how each becomes a typology.

The three examples below draw on three of the system's four learning sources, each with its own discovery rhythm: regulatory advisories arrive in bursts, transaction patterns surface gradually, and internal SAR narratives accumulate continuously. Each example traces one typology through the same governed pipeline, ending in the same versioned library. The mechanism is identical; only the entry point differs.

Example A

From a FinCEN advisory

→ TYP-PIG-001 · Pig-Butchering Investment Fraud

Example B

From transaction patterns

→ TYP-TRD-007 · Trade Invoice Manipulation (candidate)

Example C

From internal SAR narratives

→ TYP-EFM-001 · Elder-Fraud Mule (production)

Regulatory path

From a FinCEN advisory to a draft typology

FIN-2026-A002 · Pig-Butchering Investment Fraud · Issued 2026-04-22 · 12 hours from publication to draft

A.1

AG-01 Ingestion Agent

The advisory is detected and fetched

FinCEN publishes a new advisory on its public advisories page. AG-01 has FinCEN in its monitored source registry; its 15-minute crawl detects the new content via header diff and content hash. The advisory PDF is fetched, text is extracted with pdftotext, and the document is enqueued to Kafka topic nexus.documents.new. Total elapsed time from publication: 9 minutes.

FIN-2026-A002 · Detected 2026-04-22 14:39 UTC · 847 KB · Document type: ADVISORY

Excerpt from the advisory: "Pig-butchering schemes typically begin with relationship-building over social media or messaging platforms, transitioning to investment pitches that direct victims to fraudulent crypto trading platforms. Initial victim deposits are small; subsequent deposits escalate as victims are shown fabricated trading gains. Funds are typically wired to U.S. money services businesses registered as crypto exchanges, or directly to overseas exchanges. Red flags include: customer with no prior crypto activity initiating wires to a crypto MSB or VASP; wires to new beneficiaries with no prior payment history; escalating amount pattern over 30 to 90 days; elderly or recently bereaved customers; retirement account withdrawals preceding the wires."

A.2

AG-02 Parsing Agent

The advisory is read into a structured typology spec

AG-02 reads the full advisory, identifies it as a new typology (not an update or restatement), and emits a Layer 1 specification. The six behavioral concepts in the red-flag list become first-class objects in the spec, each with confidence scores and provenance back to specific paragraphs in the source document.

# Layer 1 typology spec drafted from FIN-2026-A002
typology_id: TYP-PIG-001
name: "Pig-Butchering Investment Fraud"
source: FinCEN FIN-2026-A002
issued: 2026-04-22
draft_confidence: 0.91

behavioral_concepts:
  - id: C.WIRE_OUTBOUND
    description: "Outbound wire transfer initiated by customer"
  - id: C.NEW_BENEFICIARY
    description: "Counterparty with no prior payment history to customer"
  - id: C.CRYPTO_OR_NEW_MSB
    description: "Beneficiary classified as crypto exchange, VASP, or newly-registered MSB"
  - id: C.AMOUNT_TREND
    description: "Escalating transaction amounts over 30 to 90 days"
  - id: C.AGE_AND_MARITAL_STATUS
    description: "Customer age 60+ or recently bereaved"
  - id: C.RETIREMENT_WITHDRAWAL
    description: "Retirement account withdrawal preceding outbound wire by <14 days"

combination_logic:
  primary: "C.WIRE_OUTBOUND AND C.NEW_BENEFICIARY AND C.CRYPTO_OR_NEW_MSB"
  strengthening: "PLUS any 2 of: C.AMOUNT_TREND, C.AGE_AND_MARITAL_STATUS, C.RETIREMENT_WITHDRAWAL"
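
The combination logic in this spec can be evaluated with plain set operations. The sketch below reads "PLUS any 2 of" as an upgrade on top of the primary condition rather than a requirement for any alert at all; that reading, and the three result labels, are assumptions.

```python
PRIMARY = {"C.WIRE_OUTBOUND", "C.NEW_BENEFICIARY", "C.CRYPTO_OR_NEW_MSB"}
STRENGTHENING = {"C.AMOUNT_TREND", "C.AGE_AND_MARITAL_STATUS", "C.RETIREMENT_WITHDRAWAL"}


def evaluate(concepts_met: set[str]) -> str:
    """Classify a customer's met concepts against the TYP-PIG-001 combination logic."""
    if not PRIMARY <= concepts_met:
        return "NO_MATCH"       # at least one primary concept is missing
    if len(STRENGTHENING & concepts_met) >= 2:
        return "STRENGTHENED"   # primary plus any two strengthening concepts
    return "PRIMARY"
```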

A.3

AG-03 Data Binding Agent

Concepts are resolved against the bank's data model

AG-03 takes the six concepts and resolves each against the bank's data catalog. Three concepts find existing bindings in the registry (REUSED), one is composed from existing bindings (COMPOSED), and two are novel (NEW) and routed for FCC review. Four of the six concepts therefore draw on bindings authored for earlier typologies, which is the compounding-asset effect in action.

Concept	Status	Binding Expression	Confidence
`C.WIRE_OUTBOUND`	REUSED	`FCT_TRANSACTIONS WHERE txn_type IN ('WIRE_OUT_FED', 'WIRE_OUT_SWIFT')`	0.98
`C.NEW_BENEFICIARY`	REUSED	`NOT EXISTS prior txn to counterparty in 30+ days (BIND-NEW-BENEF v1.2)`	0.96
`C.CRYPTO_OR_NEW_MSB`	NEW	`beneficiary.entity_classification IN ('CRYPTO_EXCHANGE','VASP') OR (classification='MSB' AND registered_within_days < 365)`	0.84 · FCC review
`C.AMOUNT_TREND`	REUSED	`linear regression slope > 0 with min 4 wires over 30 days (BIND-AMT-TREND v1.0)`	0.92
`C.AGE_AND_MARITAL_STATUS`	COMPOSED	`DIM_CUSTOMER.age >= 60 OR marital_status_changed_within_days < 365`	0.88
`C.RETIREMENT_WITHDRAWAL`	NEW	`account_type IN ('IRA','401K') AND withdrawal_date < wire_date AND date_diff < 14 days`	0.81 · FCC review
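
The C.AMOUNT_TREND binding (positive regression slope over at least four wires) can be illustrated with an ordinary least-squares slope. The sketch regresses amount on wire sequence number, which is an assumption; the actual BIND-AMT-TREND binding may regress on dates instead, and the 30-day window filter is assumed to have been applied upstream.

```python
def ols_slope(values: list[float]) -> float:
    """Least-squares slope of values against their index 0..n-1 (needs n >= 2)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def amount_trend(amounts: list[float], min_wires: int = 4) -> bool:
    """True when there are at least min_wires wires and amounts trend upward."""
    return len(amounts) >= min_wires and ols_slope(amounts) > 0
```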

A.4

AG-04 Translation + Governance Gate

Detection code is generated; the typology enters Shadow

AG-04 compiles the spec plus bindings into Spark SQL, and the query compiles cleanly. AG-04's deterministic validation pass then runs the query against 90 days of historical transactions and returns 12 candidate matches. Four of those candidates correspond to SARs already filed by FCC investigators for similar patterns, a strong out-of-sample signal. The typology enters Shadow mode for 60 days of parallel running before any production decision.
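
The backtest comparison amounts to a set intersection between candidate matches and accounts with filed SARs. The sketch below is illustrative and uses hypothetical account-level matching; it also separates two figures that are easy to conflate: the share of candidates that align with SARs, and recall against the SAR set, which is what AG-04's 70% promotion gate refers to.

```python
def backtest_alignment(candidates: set[str], sar_accounts: set[str]) -> dict:
    """Compare backtest candidates with accounts that already have filed SARs."""
    aligned = candidates & sar_accounts
    return {
        "candidates": len(candidates),
        "aligned": len(aligned),
        # Fraction of candidates that correspond to a filed SAR.
        "share_aligned": len(aligned) / len(candidates) if candidates else 0.0,
        # Fraction of filed SARs that the query recovered.
        "recall": len(aligned) / len(sar_accounts) if sar_accounts else 0.0,
    }
```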

Spec status

Drafted in 4 min

Binding reuse

4 of 6 reused

Backtest match

12 candidates · 4 align with filed SARs

Next gate

Shadow 60 days

Total elapsed: 12 hours from FinCEN publication to draft typology entering Shadow validation. The same workstream done manually by FCC and Compliance Engineering would historically have taken 8 to 12 weeks.

ML discovery path

From transaction patterns to a candidate typology

ML-2026-Q1-44 · Trade Invoice Manipulation, Electronics Corridor · Confidence crossed threshold on 2026-05-09

B.1

AG-05 Pattern Discovery Agent

A cluster surfaces from unsupervised graph analysis

AG-05 runs continuous community detection and temporal pattern clustering on the 90-day rolling transaction graph, subtracting subgraphs that match existing typology patterns. One residual cluster — persistent for 11 weeks — surfaces a coherent payment pattern that no current typology covers: 34 small commercial customers (electronics importers and resellers) sending wires to overseas suppliers for invoices systematically above market price for the declared goods.

ML-2026-Q1-44 · First seen 2026-02-21 · Confidence 0.82 (crossed 0.75 threshold on 2026-05-09)

Pattern signature: wire payments to small electronics importers in Hong Kong, Singapore, and the UAE for invoices systematically 30 to 45 percent above corridor median price for declared HS code 8517 (telephone and communications equipment). 34 accounts in the cluster, $47.2M total volume over 11 weeks. The graph community is tight: the same 8 beneficiaries appear across 28 of the 34 originating accounts. No current typology flags this pattern because trade-based money laundering (TBML) detection at the bank relies on documentary review at wire initiation, not retrospective price analysis against external corridor data.
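
The price signal in this signature can be computed as a percentage deviation from the corridor median. The sketch assumes per-unit prices are available for both the invoice and a corridor reference sample, which the bank does not currently maintain; the numbers in the usage note are made up for illustration.

```python
from statistics import median


def pct_above_median(unit_price: float, corridor_prices: list[float]) -> float:
    """Percentage by which unit_price exceeds the corridor median (negative if below)."""
    mid = median(corridor_prices)
    return (unit_price - mid) * 100.0 / mid


def in_signature_band(unit_price: float, corridor_prices: list[float],
                      low: float = 30.0, high: float = 45.0) -> bool:
    """True when the deviation falls in the 30 to 45 percent band of the cluster signature."""
    return low <= pct_above_median(unit_price, corridor_prices) <= high
```

With a hypothetical corridor sample of 90, 100, and 110 per unit, an invoice at 135 per unit sits 35 percent above the median and falls inside the band.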

B.2

AG-05 → AG-02 handoff

The cluster becomes the input to spec drafting

When cluster confidence crosses 0.75 with corroborating external signals (in this case, two recent FATF and Wolfsberg publications on TBML in electronics corridors), AG-05 packages the cluster evidence and hands it to AG-02. The handoff includes the cluster signature, supporting transactions, the eight shared beneficiaries, FATF/Wolfsberg corroboration documents, and a confidence trend chart showing how the cluster has stabilized.

# Cluster handoff payload to AG-02
cluster_id: ML-2026-Q1-44
signature:
  edge_pattern: "WIRE_OUT_SWIFT to entity_classification='IMPORTER' (small commercial)"
  price_signal: "declared invoice 30-45% above corridor median for HS 8517"
  graph_density: "8 beneficiaries shared across 28 of 34 originating accounts"
  persistence: "11 weeks, confidence trend monotonically increasing"

external_corroboration:
  - source: FATF
    document: "Trade-Based Money Laundering: Trends and Developments (2020)"
    relevance: 0.89
  - source: Wolfsberg Group
    document: "Statement on Effective Monitoring for Suspicious Activity (2024)"
    relevance: 0.71

recommended_action: "AG-02 draft spec from cluster evidence; FCC reviews novel bindings; proceed to shadow"

B.3

AG-02 Parsing Agent (cluster mode)

A spec is drafted from cluster evidence + external corroboration

AG-02 in cluster mode reads the cluster signature, the eight shared beneficiaries' transaction histories, and the two corroborating external documents. It drafts a Layer 1 spec with four behavioral concepts, one of which (C.INVOICE_OVER_CORRIDOR_MEDIAN) depends on an external pricing reference the bank does not currently maintain and on HS code enrichment that is only partially available. These data dependencies are surfaced explicitly in the spec, so FCC and Compliance Engineering can decide whether to commission the external data subscription before the typology advances.

# Layer 1 typology spec drafted from ML-2026-Q1-44
typology_id: TYP-TRD-007 # candidate
name: "Trade Invoice Manipulation — Electronics Corridor"
source: ML-2026-Q1-44 cluster + FATF TBML 2020 + Wolfsberg MSA 2024
draft_confidence: 0.78

behavioral_concepts:
  - id: C.WIRE_OUTBOUND_TRADE
    description: "Outbound SWIFT wire categorized as trade settlement"
  - id: C.BENEFICIARY_SMALL_IMPORTER
    description: "Beneficiary classified as importer/reseller with annual revenue < $50M"
  - id: C.INVOICE_OVER_CORRIDOR_MEDIAN  # NEW · requires external data
    description: "Declared per-unit invoice amount above corridor median market price for the HS code (30-45% above median in the source cluster)"
  - id: C.SHARED_BENEFICIARY_NETWORK
    description: "Beneficiary appears in network with 5+ unrelated originating accounts"

data_dependencies:
  - "External corridor pricing reference (e.g., S&P Global Trade Tariff data) — not currently maintained"
  - "HS code enrichment on outbound wires — partially available (FX wires only)"
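To make C.SHARED_BENEFICIARY_NETWORK concrete, the sketch below counts distinct originating accounts per beneficiary and keeps those with five or more. It assumes wires arrive as (originator, beneficiary) pairs; the function name and sample identifiers are hypothetical, and the "unrelated" test on originators is deliberately left out because the source does not define it.

```python
from collections import defaultdict

def shared_beneficiaries(wires, min_originators=5):
    """Map each beneficiary to its distinct originators; keep those at or above the cutoff."""
    originators = defaultdict(set)
    for originator, beneficiary in wires:
        originators[beneficiary].add(originator)
    return {b: o for b, o in originators.items() if len(o) >= min_originators}

# Hypothetical wires: BEN-A is paid by six accounts, BEN-B by only two.
wires = [(f"ACCT-{i}", "BEN-A") for i in range(6)]
wires += [("ACCT-0", "BEN-B"), ("ACCT-1", "BEN-B")]

flagged = shared_beneficiaries(wires)
```

Here only BEN-A is flagged. The production form of this concept would be expressed in the Layer 2 binding and compiled detection code, not in Python.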

B.4

FCC review · Governance gate

The candidate enters Compliance review before any binding work

Because this candidate depends on external data the bank does not currently maintain, governance places it in a different queue from FinCEN-derived typologies. FCC reviews the spec, the cluster evidence, and the proposed data dependencies. The decision to be made before AG-03 binding work begins is a business one: commission the external pricing data feed (a new data investment), or close the candidate with a documented rationale. The system has done its job by surfacing a real pattern with provenance, and it now hands the decision to humans with the right context.

Cluster persistence

11 weeks

External corroboration

2 documents

New data needed

Corridor pricing feed

Decision

FCC + Compliance Eng

The shape of ML discovery: Unlike the FinCEN path, the ML path often surfaces patterns that require investment decisions before they can be implemented. The system's job is to find the pattern and present the trade-off honestly — not to assume that every interesting cluster should become a typology.

Internal narrative path

From SAR narratives to a production typology

47 SARs · Aug 2024 through Apr 2025 · TYP-EFM-001 in production since Jan 2026 · 94% historical recall

C.1

AG-07 SAR Mining Agent

A SAR narrative enters the analyst-narrative pipeline

One of 47 SARs filed over an eight-month period. Each describes a variant of the same underlying pattern, but no formal typology existed for it at the time of filing. AG-07 reads each narrative, extracts a structured pattern signature, and accumulates the signatures into a similarity graph.

SAR-2025-04-7732 · Filed 2025-04-25 · Subject: Maria L., CIF 4471829

Subject opened a personal checking account on 2024-11-03 declaring her occupation as 'rideshare driver' with declared annual income of $42,000. Over the period 2025-02-10 through 2025-04-15, the account received 23 Zelle transfers totaling $87,400, originating from 19 distinct senders, none of whom had any prior payment history with the subject. Memo lines included 'rent', 'roommate', 'utilities', and 'loan repayment' but no consistent pattern. Within 24 to 72 hours of each receipt, funds were aggregated and transferred via outbound Zelle in amounts of $4,500 to $4,900 to three counterparties. The subject's debit card showed no point-of-sale activity consistent with a rideshare driver's expense profile. Subject was unresponsive to RFI dated 2025-04-22. SAR filed based on suspected money mule activity feeding into a layering network.

Average age of inbound senders: 67.
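One way to picture the similarity graph that AG-07 accumulates is to treat each extracted signature as a set of features, link SARs whose sets overlap strongly, and read clusters off as connected components. The source does not say which similarity measure AG-07 uses; Jaccard similarity, the 0.6 cutoff, and the feature labels below are assumptions chosen only for illustration.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Overlap of two feature sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_signatures(signatures: dict, min_similarity: float = 0.6) -> list:
    """Link SARs whose feature sets overlap enough, then return connected components."""
    neighbors = {sar_id: set() for sar_id in signatures}
    for a, b in combinations(signatures, 2):
        if jaccard(signatures[a], signatures[b]) >= min_similarity:
            neighbors[a].add(b)
            neighbors[b].add(a)
    seen, components = set(), []
    for start in signatures:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical signatures; identifiers and feature labels are illustrative.
signatures = {
    "SAR-A": {"rail:zelle", "inbound:many_senders", "senders:elderly", "outbound:24-72h"},
    "SAR-B": {"rail:zelle", "inbound:many_senders", "senders:elderly", "outbound:24-72h", "tenure:<12m"},
    "SAR-C": {"rail:wire", "cash:structuring"},
}
components = cluster_signatures(signatures)
```

SAR-A and SAR-B share four of five features and land in the same component, while SAR-C stays on its own.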

C.2

AG-07 + AG-05 · Clustering

The pattern emerges from 47 narratives

After 47 SARs accumulate over eight months, AG-07's similarity graph reveals a tight cluster: young account holders with low-income declared occupations who receive Zelle transfers from elderly counterparties, aggregate the funds, and send them out again via Zelle within 24 to 72 hours. AG-05 confirms that the cluster signature is statistically distinct from existing typology coverage. Once the cluster confidence crosses the threshold, AG-07 hands the cluster to AG-02 with all 47 SAR narratives attached.

# Cluster signature derived from 47 SARs
cluster_id: SAR-CLUSTER-2025-04
narratives: 47
date_range: 2024-08 through 2025-04
pattern:
  subject_profile:
    age_range: "18-32 (median 24)"
    declared_occupation: "low-income service categories: rideshare, food delivery, dog walking"
    account_tenure: "< 12 months"
  inbound_pattern:
    rail: Zelle
    sender_count: "15-30 distinct senders over 30-60 days"
    sender_age: "median age 65+, no prior relationship to subject"
  outbound_pattern:
    timing: "24-72 hours after aggregation"
    amount: "$4,500-$4,900 (just below the $5,000 SAR reporting threshold, with a buffer)"
    counterparties: "3-5 recurring beneficiaries across the cluster"
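The cluster signature above reads naturally as a set of range checks on an account profile. The sketch below shows that reading in Python for illustration only; the profile fields and sample values are hypothetical, and the production typology is compiled to Spark SQL from the Layer 1 spec and Layer 2 bindings, not hand-written like this.

```python
from dataclasses import dataclass

@dataclass
class AccountProfile:
    subject_age: int
    account_tenure_months: int
    distinct_senders: int
    median_sender_age: float
    outbound_amounts: list[float]
    hours_to_outbound: list[float]

def matches_signature(p: AccountProfile) -> bool:
    """Check a profile against the ranges listed in the cluster signature."""
    return (
        18 <= p.subject_age <= 32
        and p.account_tenure_months < 12
        and 15 <= p.distinct_senders <= 30
        and p.median_sender_age >= 65
        and bool(p.outbound_amounts)  # require at least one outbound transfer
        and all(4500 <= amt <= 4900 for amt in p.outbound_amounts)
        and all(24 <= h <= 72 for h in p.hours_to_outbound)
    )

# Hypothetical profiles (not taken from any SAR in the text).
candidate = AccountProfile(24, 5, 19, 67.0, [4500.0, 4750.0, 4900.0], [24.0, 48.0, 72.0])
long_tenure = AccountProfile(24, 36, 19, 67.0, [4500.0, 4750.0, 4900.0], [24.0, 48.0, 72.0])

candidate_matches = matches_signature(candidate)
long_tenure_matches = matches_signature(long_tenure)
```

The first profile satisfies every range; the second fails only on account tenure, which shows how a single concept can exclude an otherwise similar account.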

C.3

AG-02 → AG-03 → AG-04

Spec, binding, and translation in one governed pass

AG-02 drafts the Layer 1 spec from the cluster signature plus all 47 narratives. AG-03 resolves bindings — reusing five existing bindings (occupation classification, age tier, Zelle rail filter, low-income proxy, time-since-account-opening) and authoring three new ones (sender-age-distribution, post-aggregation-velocity, recurrent-beneficiary-network). AG-04 compiles to Spark SQL and runs the backtest against the source SARs themselves: 44 of 47 SARs are recalled (94 percent), with three near-misses caused by Zelle-rail metadata gaps that AG-04 surfaces as a known limitation.

Spec drafted

TYP-EFM-001 v0.1

Bindings

5 reused, 3 new

Backtest recall

44 of 47 SARs (94%)

Code lines

187 Spark SQL
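The recall figure is simple arithmetic on the backtest counts. A minimal sketch, with a hypothetical helper name:

```python
def recall(recalled: int, total: int) -> float:
    """Fraction of known positive cases that the detection logic recovers."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recalled / total

backtest_recall = recall(44, 47)  # counts from the backtest described in the text
near_misses = 47 - 44
print(f"recall {backtest_recall:.1%}, near-misses {near_misses}")
```

This evaluates to 93.6 percent, which the text rounds to 94 percent. Because the backtest runs against the same SARs the typology was derived from, it is an in-sample measure; the shadow period supplies the out-of-sample evidence.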

C.4

Lifecycle gates · Shadow → Production

Sixty days of shadow before the typology goes live

The typology runs in Shadow mode for 60 days alongside existing scenarios, and FCC reviews every alert. Shadow produces 38 candidate alerts: 23 are filed as SARs (61 percent precision) and 15 are closed. AG-08 Calibration adjusts the post-aggregation-velocity threshold from 72 to 96 hours after Shadow analysis shows several legitimate-pattern false positives at the original cutoff. MRM reviews the validation package (Layer 1 spec, Layer 2 bindings with audit trail, Layer 3 code with provenance, backtest evidence, FP analysis) and approves it on January 15, 2026.

Shadow window

60 days

Shadow alerts

38

Shadow precision

23 SARs filed (61%)

Calibration adjustments

1 threshold (AG-08)
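Shadow precision follows directly from the analyst dispositions. The sketch below tallies them; the disposition labels are hypothetical, while the counts come from the shadow results above.

```python
from collections import Counter

# Disposition labels are hypothetical; counts come from the shadow results in the text.
dispositions = ["SAR_FILED"] * 23 + ["CLOSED"] * 15

tally = Counter(dispositions)
total_alerts = sum(tally.values())
precision = tally["SAR_FILED"] / total_alerts
print(f"{total_alerts} alerts, precision {precision:.1%}")
```

The tally gives 38 alerts and a precision of 60.5 percent, reported in the text as 61 percent.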

C.5

AG-06 Runtime Detection · Live alert

A new live alert is produced — with full lineage to its 47 source SARs

AG-06 runs the production query nightly. On 2026-05-15, it produces alert ALT-2026-05-1144 against Tyler B., a 23-year-old account holder showing the exact pattern. The alert is routed to Actimize with the full evidence payload — typology reference, version, contributing transactions, behavioral concepts matched, and lineage to the 47 SARs that produced the typology in the first place. The analyst sees not just an alert, but the case-law of patterns that justified the detection.

Alert ID

ALT-2026-05-1144

Subject

Tyler B. · CIF 8847291

Concepts matched

8 of 8

Avg sender age

Typology lineage: Derived from 47 SARs filed 2024-08 through 2025-04. Backtested at 94% historical recall against those SARs. Shadow validated Oct–Dec 2025. Approved for production by FCC and MRM 2026-01-15. Source-of-truth: TYP-EFM-001 v1.0.0 · BIND-BANK-TYP-EFM-001 v1.0.0.
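An evidence payload carrying this lineage might be shaped as below. This is a sketch only: the class and field names are hypothetical, the contributing transactions are omitted because the text does not list them, and the actual interface to Actimize is not described in the source.

```python
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class Lineage:
    typology: str
    binding: str
    source_sar_count: int
    approved_on: str

@dataclass(frozen=True)
class AlertPayload:
    alert_id: str
    subject_cif: str
    concepts_matched: int
    concepts_total: int
    lineage: Lineage

payload = AlertPayload(
    alert_id="ALT-2026-05-1144",
    subject_cif="8847291",
    concepts_matched=8,
    concepts_total=8,
    lineage=Lineage(
        typology="TYP-EFM-001 v1.0.0",
        binding="BIND-BANK-TYP-EFM-001 v1.0.0",
        source_sar_count=47,
        approved_on="2026-01-15",
    ),
)

# asdict() recurses into the nested Lineage dataclass before serialization.
serialized = json.dumps(asdict(payload), indent=2)
```

Freezing the dataclasses keeps the payload immutable once built, which suits an audit trail where the evidence attached to an alert should not change after routing.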

The system is a closed loop. Regulators publish, analysts file, machines discover. Each source contributes typologies through the same governed pipeline, into the same versioned library, executed by the same deterministic runtime — and every disposition from every analyst makes the library a little better. — Closing observation
