RC RANDOM CHAOS

Six SQL Patterns for Catching Transaction Fraud Without ML or Graph Databases

via Hacker News

Original source

SQL patterns I use to catch transaction fraud

A data analyst on a program-integrity team argues that most transaction fraud detection is best handled in plain SQL against the right tables, not with machine learning or graph databases. The patterns generalize across credit cards, healthcare claims, e-commerce, and government benefits — anywhere money moves and gets logged.

The patterns build on each other; five of the six are summarized here. Velocity queries flag cardholders exceeding thresholds in 1-minute, 5-minute, and 1-hour windows, since card-testing rings and benefits-trafficking rings operate at different speeds. Impossible-travel checks use LAG window functions and haversine distance to surface cards that appear to move faster than a commercial jet between swipes. Amount-anomaly queries catch round small-dollar values like $1 or $5 (classic card tests on dumped numbers) and amounts just below ID-check or ATM-cap thresholds, like $99.99 or $499.99. Merchant-side detection looks for skimmer compromise by comparing each merchant’s unique-card count per hour against its own rolling seven-day baseline rather than a static threshold, since a Costco and a used bookshop can’t share a cutoff. Off-hours detection profiles each cardholder’s habitual spending hours over 90 days and flags transactions outside that envelope.
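As an illustration of the impossible-travel check, here is a minimal sketch, not the original article's query. It runs the SQL through Python's built-in sqlite3 module so it can be executed as-is. The table name txns, its columns (card_id, ts as epoch seconds, lat, lon), the 900 km/h jet-speed cutoff, and the helper names are all assumptions made for the example. SQLite builds do not always ship trigonometric functions, so the haversine distance is registered as a user-defined function; on a warehouse you would likely write it inline or use a built-in geography function.

```python
import math
import sqlite3

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; returns None if any input is NULL."""
    if None in (lat1, lon1, lat2, lon2):
        return None
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

MAX_KMH = 900.0  # assumed cruising speed of a commercial jet

# LAG pulls the previous swipe for the same card onto the current row,
# so distance and elapsed time can be compared row by row.
IMPOSSIBLE_TRAVEL_SQL = """
WITH hops AS (
    SELECT card_id, ts, lat, lon,
           LAG(ts)  OVER w AS prev_ts,
           LAG(lat) OVER w AS prev_lat,
           LAG(lon) OVER w AS prev_lon
    FROM txns
    WINDOW w AS (PARTITION BY card_id ORDER BY ts)
)
SELECT card_id, prev_ts, ts,
       haversine_km(prev_lat, prev_lon, lat, lon) AS km
FROM hops
WHERE prev_ts IS NOT NULL
  AND ts > prev_ts
  AND haversine_km(prev_lat, prev_lon, lat, lon)
      / ((ts - prev_ts) / 3600.0) > :max_kmh
ORDER BY card_id, ts
"""

def find_impossible_travel(rows, max_kmh=MAX_KMH):
    """rows: iterable of (card_id, epoch_seconds, lat, lon) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("haversine_km", 4, haversine_km)
    conn.execute("CREATE TABLE txns (card_id TEXT, ts INTEGER, lat REAL, lon REAL)")
    conn.executemany("INSERT INTO txns VALUES (?, ?, ?, ?)", rows)
    hits = conn.execute(IMPOSSIBLE_TRAVEL_SQL, {"max_kmh": max_kmh}).fetchall()
    conn.close()
    return hits
```

Feeding it a card swiped in New York and then in London an hour later should return that pair, while a card that moves from New York to Newark in the same hour should not. The speed cutoff is a knob like any other: a lower value catches more, at the cost of flagging legitimate travellers.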

The broader point is practical: tune two knobs per query (window and threshold), maintain whitelists for legitimate outliers like vending-machine route operators, and prefer self-relative baselines over global ones. QUALIFY simplifies sliding windows in Snowflake, BigQuery, Databricks, and Teradata; Postgres users wrap the same logic in a CTE.
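The two-knob idea can be sketched with a velocity query. This is a hedged example under assumed names, not the article's own code: the tables txns (card_id, ts as epoch seconds) and whitelist (card_id), the sample data, and the function names are hypothetical. It uses the CTE form, since SQLite, like Postgres, has no QUALIFY; on engines that do, the outer filter could presumably be folded into a QUALIFY clause on the windowed count. It needs SQLite 3.28 or later for RANGE frames with a numeric offset.

```python
import sqlite3

def build_demo_db():
    """Create a small in-memory dataset (all values are made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE txns (card_id TEXT, ts INTEGER)")
    conn.execute("CREATE TABLE whitelist (card_id TEXT PRIMARY KEY)")
    rows = [("C1", t) for t in range(0, 30, 5)]    # 6 swipes within 25 s
    rows += [("C2", t) for t in (0, 1000, 2000)]   # slow, ordinary usage
    rows += [("VEND", t) for t in range(10)]       # legitimate high-volume card
    conn.executemany("INSERT INTO txns VALUES (?, ?)", rows)
    conn.execute("INSERT INTO whitelist VALUES ('VEND')")
    return conn

def flag_velocity(conn, window_s, threshold):
    """Return (card_id, peak_count) for cards exceeding `threshold`
    transactions within any trailing `window_s`-second window."""
    # The two knobs are coerced to int before being placed in the SQL text.
    sql = f"""
    WITH counted AS (
        SELECT card_id, ts,
               COUNT(*) OVER (
                   PARTITION BY card_id
                   ORDER BY ts
                   RANGE BETWEEN {int(window_s)} PRECEDING AND CURRENT ROW
               ) AS txns_in_window
        FROM txns
        WHERE card_id NOT IN (SELECT card_id FROM whitelist)
    )
    SELECT card_id, MAX(txns_in_window) AS peak
    FROM counted
    WHERE txns_in_window > {int(threshold)}
    GROUP BY card_id
    ORDER BY card_id
    """
    return conn.execute(sql).fetchall()
```

Running the same function once per window, for example with (60, 5), (300, 10), and (3600, 20) as assumed window and threshold pairs, mirrors the 1-minute, 5-minute, and 1-hour tiers described earlier. The whitelist keeps the high-volume VEND card out of the results without loosening the threshold for everyone else.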

This is an AI-generated summary. Read the original for the full story.