Real-time fraud detection at 18,000 TPS for a digital bank
A digital-only bank was losing money to card-present fraud, mule accounts, and UPI scams faster than its rules engine could adapt. We architected a streaming fraud platform on Apache Flink and ScyllaDB that scores transactions in under 80 milliseconds at P99, with a graph-based mule-account detector and a SHAP-based explanation for every decline.
The problem
The bank’s legacy fraud engine was a 2,000-rule decision tree maintained by a 6-person team. It routinely missed its 400ms latency budget, and rule changes took 3 weeks to deploy. Mule-account networks were detected only after the money had moved 2-3 hops out of reach. Annual fraud loss was running at 0.18% of GMV, twice the industry benchmark.
The solution
The platform has three parts: a streaming feature store on Apache Flink that materialises 240+ behavioural and graph features per customer in real time, a gradient-boosted scoring model with sub-50ms inference, and a live transaction graph in ScyllaDB used to detect mule clusters as they form. Every decline is paired with a SHAP-based explanation served to the customer-care console.
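To make "behavioural features in real time" concrete, here is a minimal in-memory sketch of one kind of feature such a store might maintain: per-customer transaction velocity over a sliding window. It is plain Python, not Flink code, and the class name, the 300-second window, and the feature names are all assumptions chosen for illustration. It also assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque


class VelocityFeatures:
    """Per-customer sliding-window counters (illustrative, in-memory only)."""

    def __init__(self, window_seconds=300):
        # Hypothetical window length; the real feature definitions are not public.
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # customer_id -> deque of (ts, amount)

    def update(self, customer_id, ts, amount):
        """Record a transaction and return the refreshed features for this customer."""
        events = self._events[customer_id]
        events.append((ts, amount))
        # Evict events that have fallen out of the window.
        cutoff = ts - self.window_seconds
        while events and events[0][0] <= cutoff:
            events.popleft()
        return {
            "txn_count": len(events),
            "amount_sum": sum(a for _, a in events),
            "amount_max": max(a for _, a in events),
        }


store = VelocityFeatures(window_seconds=300)
first = store.update("cust-1", ts=0, amount=100.0)
second = store.update("cust-1", ts=120, amount=50.0)
third = store.update("cust-1", ts=400, amount=20.0)  # the ts=0 event has expired
```

In a streaming engine the same idea would typically be expressed as keyed state with a windowed aggregation, so that each customer's counters live on the worker that owns that key.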
What we built
Sub-80ms decisions at 18K TPS peak
Apache Flink streaming feature store plus a LightGBM model served in ONNX Runtime. P99 round-trip latency is 78ms across the entire pipeline.
Graph-based mule detection
Live transaction graph in ScyllaDB. Detects mule clusters as soon as the second hop happens, not after the seventh. Cut mule losses by 71%.
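One way to picture second-hop detection is as a pass-through check: an account receives money and forwards most of it to a third party shortly afterwards. The sketch below is an in-memory simplification under that assumption. The `HopTracker` class, the one-hour window, and the 90% pass-through ratio are hypothetical and do not describe the production ScyllaDB schema or detection logic.

```python
from collections import defaultdict


class HopTracker:
    """Flags two-hop pass-through chains as the second hop occurs (illustrative)."""

    def __init__(self, window_seconds=3600, pass_through_ratio=0.9):
        # Both thresholds are hypothetical values chosen for the example.
        self.window_seconds = window_seconds
        self.pass_through_ratio = pass_through_ratio
        self._inbound = defaultdict(list)  # account -> [(ts, sender, amount)]

    def observe(self, ts, sender, receiver, amount):
        """Record a transfer and return any (origin, mule, destination) chains it completes."""
        chains = []
        for in_ts, origin, in_amount in self._inbound[sender]:
            recent = ts - in_ts <= self.window_seconds
            passes_through = amount >= self.pass_through_ratio * in_amount
            if recent and passes_through and origin != receiver:
                chains.append((origin, sender, receiver))
        self._inbound[receiver].append((ts, sender, amount))
        return chains


tracker = HopTracker()
hop1 = tracker.observe(ts=0, sender="A", receiver="B", amount=1000.0)
hop2 = tracker.observe(ts=600, sender="B", receiver="C", amount=950.0)  # B forwards 95%
hop3 = tracker.observe(ts=700, sender="C", receiver="D", amount=100.0)  # too small to flag
```

A real detector would likely combine a signal like this with graph features such as fan-in, fan-out, and account age before acting on it.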
Every decline explainable
Per-decision SHAP values surfaced to the care team. Customer service can explain a decline in under 30 seconds, which cut fraud-decline complaints by 54%.
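As a rough illustration of how per-decision SHAP values could become something a care agent can read, the sketch below ranks the features that pushed a score towards "fraud". It assumes the contributions have already been computed upstream. The function name, the feature names, and the numbers are invented for the example.

```python
def top_reasons(contributions, k=3):
    """Return the k features with the largest positive contribution to the fraud score."""
    positive = [(name, value) for name, value in contributions.items() if value > 0]
    positive.sort(key=lambda item: item[1], reverse=True)
    return positive[:k]


# Hypothetical per-feature contributions for a single declined transaction.
example = {
    "new_device": 0.42,
    "amount_vs_usual": 0.31,
    "tenure_days": -0.20,  # negative values lowered the score, so they are skipped
    "night_time": 0.05,
    "geo_distance": 0.18,
}
reasons = top_reasons(example, k=3)
```

Mapping each feature name to a plain-language sentence is a separate step, and it is the part that lets an agent explain the decline without reading model output.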
Adaptive thresholds per cohort
Risk thresholds vary by customer tenure, channel, geography, and transaction history. Less friction for trusted customers, tighter guardrails for new ones.
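A cohort-aware threshold can be sketched as a base value adjusted by customer attributes. Every number below (the 0.80 base, the adjustments, the clamping bounds) is an assumption made for illustration, as are the parameter names. The actual cohort definitions and values are not given in this case study.

```python
BASE_THRESHOLD = 0.80  # hypothetical score at or above which a transaction is declined


def decline_threshold(tenure_days, channel, high_risk_geo):
    """Return the decline threshold for a cohort; lower means stricter."""
    threshold = BASE_THRESHOLD
    if tenure_days < 30:
        threshold -= 0.15  # new customers get tighter guardrails
    elif tenure_days > 365:
        threshold += 0.05  # long-tenured customers get less friction
    if channel == "upi":
        threshold -= 0.05
    if high_risk_geo:
        threshold -= 0.10
    # Clamp so no cohort is ever declined always or never.
    return round(min(max(threshold, 0.05), 0.95), 2)


new_upi_customer = decline_threshold(tenure_days=10, channel="upi", high_risk_geo=True)
trusted_customer = decline_threshold(tenure_days=1000, channel="card", high_risk_geo=False)
```

Keeping thresholds in a table like this, separate from the model, is one way policy could be tuned without retraining.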
Daily model refresh
Production model retrained nightly on the previous day’s adjudicated outcomes. A champion-challenger comparison runs every Monday. No more 3-week rule deploys.
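A champion-challenger comparison can be sketched as scoring the same adjudicated transactions with both models and promoting the challenger only if it is strictly better. The promotion rule below (at least as much fraud caught, no more false positives, and not a tie) is an assumed criterion for illustration, not the bank's actual policy. The function names and the 0.8 threshold are likewise hypothetical.

```python
def confusion(scores, labels, threshold):
    """Count true positives and false positives at a given decline threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp, fp


def should_promote(champion_scores, challenger_scores, labels, threshold=0.8):
    """Promote only if the challenger catches at least as much fraud with no extra false positives."""
    champ_tp, champ_fp = confusion(champion_scores, labels, threshold)
    chall_tp, chall_fp = confusion(challenger_scores, labels, threshold)
    no_worse = chall_tp >= champ_tp and chall_fp <= champ_fp
    strictly_better = (chall_tp, chall_fp) != (champ_tp, champ_fp)
    return no_worse and strictly_better


labels = [1, 1, 0, 0]                  # 1 = adjudicated fraud, 0 = legitimate
champion = [0.90, 0.50, 0.85, 0.10]    # catches 1 fraud, 1 false positive
challenger = [0.90, 0.82, 0.30, 0.10]  # catches 2 frauds, 0 false positives
promote = should_promote(champion, challenger, labels)
```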
Counterfactual explainer for analysts
Fraud analysts can ask “what would have changed if amount were 30% lower?” and see model behaviour shift live. Helps tune policy without deploying.
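The "what if the amount were 30% lower" question amounts to re-scoring a copy of the transaction with one feature changed. The sketch below shows that mechanic with a toy logistic scorer standing in for the production model. `toy_score`, its coefficients, and the feature names are all made up for the example.

```python
import math


def toy_score(features):
    """Hypothetical stand-in for the production model: a simple logistic score."""
    z = 0.004 * features["amount"] + 1.5 * features["new_device"] - 3.0
    return 1.0 / (1.0 + math.exp(-z))


def counterfactual(score_fn, features, feature, scale):
    """Re-score with one feature scaled; return (before, after, change)."""
    altered = dict(features)  # copy so the original transaction is untouched
    altered[feature] = features[feature] * scale
    before = score_fn(features)
    after = score_fn(altered)
    return before, after, after - before


txn = {"amount": 1000.0, "new_device": 1}
before, after, change = counterfactual(toy_score, txn, "amount", scale=0.7)
```

Because the altered transaction is only scored and never written back, an analyst can explore policy changes like this without touching the deployed model.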
How it’s built
The numbers
“Going from a 2,000-rule engine to a streaming model was the single most impactful technical change we made in 18 months. Loss is down, false positives are down, and the fraud team finally sleeps.”
— CTO, Digital Bank (28M customers)