Return-fraud rings: detecting customers who don't think they're working together

A small apparel store owner in Austin emailed us last summer. She had been doing returns for an online shop for six years, knew her customers well enough to recognize most of them by name, and had noticed something she could not put into a sentence.

There were three customer accounts on her store. Different names. Different emails. Different phone numbers. They placed orders within a week of each other. They returned items within a week of each other. The shipping addresses were two distinct apartments in the same building. The customer-name strings shared a last name with one letter changed. The dates of birth, where her checkout collected them, all fell in the same month.

She did not think the three accounts were one customer. She thought they were three customers who knew each other and had figured out something about her return policy together. She wanted to know if there was a tool that could help her see the pattern.

That email is the merchant problem behind RefundSentry's fraud-ring detection. The mental model that unlocks it is graphs, not lists.

Why per-customer signals miss this

Fraud signals built around a single customer's behavior have a structural blind spot. Every signal looks at one entity (a customer, an order, an address) and asks "is anything about this entity unusual?" That works when the fraud is one bad actor placing many orders, because the actor's individual customer record accumulates suspicious patterns over time.

It does not work when the fraud is several actors who each look reasonable individually. Customer A places three orders over six months and returns one of them. That is a normal return rate. Customer B does the same. Customer C does the same. Each customer's record is unremarkable. The pattern is in the relationships between them, not in any individual record.

To see the pattern you have to model the relationships. That means joining customers to addresses, to payment methods, to customer-name fragments, to dates of birth, and asking which of these attributes are shared across more than one customer. The shared-attribute graph is the structure that makes a ring visible.
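The join described above can be sketched as an inverted index from attribute to customers. This is a minimal illustration, not RefundSentry's actual code: the `CustomerAttributes` shape, the `findSharedAttributes` name, and the assumption that attribute values arrive already normalized or hashed are all hypothetical.

```typescript
// Hypothetical shapes for illustration; not the product's real schema.
type AttributeKind = "address" | "payment" | "name" | "dob";

interface CustomerAttributes {
  customerGid: string;
  // Values are assumed to be normalized or hashed before they get here.
  attributes: Array<{ kind: AttributeKind; value: string }>;
}

// Invert customer -> attributes into attribute -> customers,
// then keep only the attributes held by more than one customer.
function findSharedAttributes(
  customers: CustomerAttributes[],
): Map<string, string[]> {
  const holders = new Map<string, Set<string>>();
  for (const customer of customers) {
    for (const attr of customer.attributes) {
      const key = `${attr.kind}:${attr.value}`;
      const gids = holders.get(key) ?? new Set<string>();
      gids.add(customer.customerGid);
      holders.set(key, gids);
    }
  }
  const shared = new Map<string, string[]>();
  holders.forEach((gids, key) => {
    if (gids.size > 1) shared.set(key, Array.from(gids).sort());
  });
  return shared;
}
```

Each key left in the returned map is a shared attribute, and its value is the list of customers it connects. No single customer record contains that information.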

The data model

RefundSentry maintains a FraudRingAlert table. Each row in the table represents a cluster of customer accounts that share enough attributes to be plausibly coordinated. The fields are simple: a list of Shopify customer GIDs, a confidence score, the attributes that triggered the cluster, the cluster's status (open, confirmed, dismissed), and timestamps.
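As a rough picture of that row, here is a TypeScript sketch. Only `customerIds` and the three status values come from the description above. The other field names, the 0 to 1 confidence range, and the `createFraudRingAlert` helper are assumptions made for illustration.

```typescript
type FraudRingStatus = "open" | "confirmed" | "dismissed";

interface FraudRingAlert {
  customerIds: string[]; // Shopify customer GIDs
  confidence: number; // assumed to be in [0, 1]
  triggeringAttributes: string[]; // hypothetical name: the attributes that formed the cluster
  status: FraudRingStatus;
  createdAt: Date; // hypothetical timestamp names
  updatedAt: Date;
}

// Hypothetical constructor: new alerts start as "open".
function createFraudRingAlert(
  customerIds: string[],
  confidence: number,
  triggeringAttributes: string[],
  now: Date = new Date(),
): FraudRingAlert {
  const uniqueIds = Array.from(new Set(customerIds));
  if (uniqueIds.length < 2) {
    throw new Error("a ring needs at least two distinct customers");
  }
  if (confidence < 0 || confidence > 1) {
    throw new Error("confidence is assumed to be in [0, 1]");
  }
  return {
    customerIds: uniqueIds,
    confidence,
    triggeringAttributes,
    status: "open",
    createdAt: now,
    updatedAt: now,
  };
}
```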

The cluster construction runs as a periodic batch, not in the webhook path. The reason is that the queries that find these clusters are fundamentally cross-customer queries. They join the customer table to itself on shared attributes. A new order's webhook does not give us the right input shape to run this query, and trying to run it inline would block the webhook handler far past the 5-second ACK budget.

The batch runs nightly, looks at customers active in the last 90 days, and emits a FraudRingAlert row for each newly-discovered cluster. The clustering logic is intentionally simple: shared address fingerprint, shared payment method (card hash + expiry hash), shared customer name with normalized comparison, shared date of birth (when collected). A cluster requires at least two shared attributes across at least two customers. The simplicity is the point. We tried more sophisticated clustering during early development and it produced more false positives than the simple version.
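One way to read the "two shared attributes across two customers" rule is as a pairwise check. The sketch below is only that reading. The `CustomerRecord` shape and function names are hypothetical, and the quadratic pair loop is for clarity rather than something you would run over a large customer table.

```typescript
// Hypothetical input: each customer with its normalized attribute keys,
// for example "address:fp_a" or "payment:card_x".
interface CustomerRecord {
  customerGid: string;
  attributeKeys: string[];
}

// Count the distinct attribute keys two customers have in common.
function countSharedAttributes(a: CustomerRecord, b: CustomerRecord): number {
  const keysOfA = new Set(a.attributeKeys);
  let sharedCount = 0;
  for (const key of Array.from(new Set(b.attributeKeys))) {
    if (keysOfA.has(key)) sharedCount += 1;
  }
  return sharedCount;
}

// Return every customer pair that meets the minimum overlap (two by default).
function qualifyingPairs(
  customers: CustomerRecord[],
  minShared: number = 2,
): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  for (let i = 0; i < customers.length; i++) {
    for (let j = i + 1; j < customers.length; j++) {
      if (countSharedAttributes(customers[i], customers[j]) >= minShared) {
        pairs.push([customers[i].customerGid, customers[j].customerGid]);
      }
    }
  }
  return pairs;
}
```

Under this reading, two customers who share only an address do not qualify, but two who share an address and a payment method do.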

When a FraudRingAlert row is created, the merchant gets a notification. The notification links to an interactive graph view of the cluster. The view shows the customers as nodes and the shared attributes as edges. A merchant looking at the graph sees the ring shape immediately: three customer nodes, two shared addresses, one shared payment method, one shared name fragment. The visual is what does the explanatory work that a list of customer IDs does not do.

What happens when one ring member is confirmed

The most useful piece of structure in this system is what happens when the merchant marks one of the ring's customers as fraud. The CONFIRMED label cascades. Every other customer in the same cluster gets marked as elevated risk. New orders from any cluster member immediately trigger a "fraud-ring CONFIRMED" signal that contributes hard evidence to the score.

The cascade is automatic but bounded. If the cluster has 10 or more members, the cascade does not run inline. It enqueues an operator-review job because a 10-member cluster is large enough that mistakenly marking all 10 as fraud is more damaging than missing the cascade for a few hours. Spec 119 introduced this threshold after we saw a case during early use in which a single mistaken confirmation cascaded to many customers.
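The bounded cascade reduces to a single size check. In the sketch below the 10-member threshold is the one described above. The `CascadeOutcome` type and `cascadeConfirmation` function are hypothetical names, and the real system would enqueue a job rather than return a value.

```typescript
// Threshold from the text: clusters this large go to operator review.
const OPERATOR_REVIEW_THRESHOLD = 10;

// Hypothetical result type: either the cascade ran, or it was deferred.
type CascadeOutcome =
  | { kind: "cascaded"; elevated: string[] }
  | { kind: "queued_for_review"; clusterSize: number };

function cascadeConfirmation(
  customerIds: string[],
  confirmedGid: string,
): CascadeOutcome {
  if (customerIds.indexOf(confirmedGid) === -1) {
    throw new Error("confirmed customer is not in this cluster");
  }
  // Large clusters are not cascaded inline; a human reviews them first.
  if (customerIds.length >= OPERATOR_REVIEW_THRESHOLD) {
    return { kind: "queued_for_review", clusterSize: customerIds.length };
  }
  // Every other member of the cluster is marked as elevated risk.
  return {
    kind: "cascaded",
    elevated: customerIds.filter((gid) => gid !== confirmedGid),
  };
}
```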

Confirmed-fraud cascades respect the GDPR redaction pipeline. When a customer in a cluster is redacted via customers/redact, that customer's GID is removed from the cluster's customerIds list and from the address-fingerprint join. The cluster's confidence is recomputed without that customer's contribution. If the redaction takes the cluster below the two-customer threshold, the cluster is marked dismissed.
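The redaction steps can be sketched as a pure function over a cluster. Everything here is illustrative: the `RedactableCluster` shape, the `attributeHolders` map, the caller-supplied `rescore` function, and the choice to zero the confidence of a dismissed cluster are all assumptions, not the product's schema.

```typescript
type ClusterStatus = "open" | "confirmed" | "dismissed";

// Hypothetical in-memory view of a cluster.
interface RedactableCluster {
  customerIds: string[];
  // attribute key -> customer GIDs that hold it
  attributeHolders: Record<string, string[]>;
  confidence: number;
  status: ClusterStatus;
}

type Rescore = (
  customerIds: string[],
  attributeHolders: Record<string, string[]>,
) => number;

function redactCustomer(
  cluster: RedactableCluster,
  redactedGid: string,
  rescore: Rescore,
): RedactableCluster {
  const customerIds = cluster.customerIds.filter((gid) => gid !== redactedGid);
  const attributeHolders: Record<string, string[]> = {};
  for (const key of Object.keys(cluster.attributeHolders)) {
    const remaining = cluster.attributeHolders[key].filter(
      (gid) => gid !== redactedGid,
    );
    // An attribute held by fewer than two customers is no longer shared.
    if (remaining.length >= 2) attributeHolders[key] = remaining;
  }
  const belowThreshold = customerIds.length < 2;
  return {
    customerIds,
    attributeHolders,
    // Assumption: a dismissed cluster carries no confidence.
    confidence: belowThreshold ? 0 : rescore(customerIds, attributeHolders),
    status: belowThreshold ? "dismissed" : cluster.status,
  };
}
```

The function returns a new cluster rather than mutating the old one, which keeps the before and after states easy to compare in a test.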

The interactive graph view

The graph view is the part of this feature that took the most engineering work and the part that produces the most "I had no idea this was happening" moments for merchants.

The view renders the cluster as a force-directed graph: customer nodes in one shape, address fingerprints in another, payment methods in a third, name-fragments in a fourth. Edges connect customers to the attributes they share. A pure-list rendering of the same data ("customer 12345 and customer 67890 share address fingerprint abc...") does not do the work the graph does. The graph makes the topology of the ring legible. A merchant looking at the graph can tell at a glance that the ring is shaped like a star (one address shared by everyone) versus shaped like a chain (A and B share address x, B and C share address y, A and C share name fragment z) versus shaped like a clique (everyone shares everything).

Shape matters because it tells the merchant what kind of operator they are dealing with. Star-shaped rings tend to be one operator using throwaway customer accounts. Clique-shaped rings tend to be coordinated groups. Chain-shaped rings tend to be operators who know they are being watched and are trying to avoid the obvious overlaps.
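The three shapes can also be told apart mechanically with a rough heuristic, sketched below. This is an illustration of the star, chain, and clique definitions given above, not how the graph view works. The `classifyRingShape` name and its inputs are hypothetical, and real rings will often be mixtures that this rule flattens into one label.

```typescript
type RingShape = "clique" | "star" | "chain";

// attributeHolders maps each shared attribute to the customer GIDs that
// hold it. All holders are assumed to appear in customerIds.
function classifyRingShape(
  customerIds: string[],
  attributeHolders: Record<string, string[]>,
): RingShape {
  const totalCustomers = new Set(customerIds).size;
  const keys = Object.keys(attributeHolders);
  if (totalCustomers < 2 || keys.length === 0) {
    throw new Error("a ring needs two customers and one shared attribute");
  }
  let heldByEveryone = 0;
  for (const key of keys) {
    if (new Set(attributeHolders[key]).size === totalCustomers) {
      heldByEveryone += 1;
    }
  }
  // Everyone shares everything.
  if (heldByEveryone === keys.length) return "clique";
  // At least one attribute ties every customer together.
  if (heldByEveryone > 0) return "star";
  // Only partial overlaps link the customers.
  return "chain";
}
```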

Engineer detail. The FraudRingAlert.customerIds column is an array of Shopify customer GIDs (the gid://shopify/Customer/12345 format), not internal database IDs. The reason is that customer rows can be redacted; the GID is the durable handle that lets us look up the current state of the customer (or its absence) without depending on a row that may have been deleted by the GDPR pipeline. The tradeoff is that lookups against FraudRingAlert.customerIds go through a join on Customer.shopifyGid, which has its own index. Both reads are fast.

The clustering batch runs in app/lib/fraud-rings/cluster.server.ts. The first version used a sophisticated graph-embedding approach with node2vec and learned similarity. We threw it out. The signal-to-noise ratio was bad on the kind of data Shopify merchants have (sparse customer attributes, lots of single-purchase customers, no labeled training data for embedding quality). The current version is rule-based: build the bipartite graph of customers to attributes, find connected components, emit components containing at least two customers as candidate clusters, and score each cluster by attribute-overlap density. Sophisticated approaches go on the roadmap for the day we have two things: cross-shop network data (spec 197 has the cross-shop indexes; we are not yet running cross-shop clustering) and address fingerprints from the libpostal-backed parser (specs 205 and 206; the parser is shipping but the address-cluster signal does not yet read from it).
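For readers who want to see the rule-based approach in code, here is a minimal sketch using a union-find structure over the bipartite graph. It is not the contents of cluster.server.ts. The `Edge` and `CandidateCluster` shapes are hypothetical, and "density" is assumed here to mean the number of customer-attribute edges divided by the number of possible edges in the component.

```typescript
interface Edge { customerGid: string; attributeKey: string }

interface CandidateCluster {
  customerIds: string[];
  attributeKeys: string[];
  density: number; // assumed: edges / (customers * attributes)
}

class DisjointSet {
  private parent = new Map<string, string>();
  find(x: string): string {
    if (!this.parent.has(x)) this.parent.set(x, x);
    let root = x;
    while (this.parent.get(root) !== root) root = this.parent.get(root) as string;
    // Path compression: point every node on the walk at the root.
    let current = x;
    while (current !== root) {
      const next = this.parent.get(current) as string;
      this.parent.set(current, root);
      current = next;
    }
    return root;
  }
  union(a: string, b: string): void {
    const rootA = this.find(a);
    const rootB = this.find(b);
    if (rootA !== rootB) this.parent.set(rootA, rootB);
  }
}

function findCandidateClusters(edges: Edge[]): CandidateCluster[] {
  // Drop duplicate edges so they cannot inflate the density.
  const seen = new Set<string>();
  const unique = edges.filter((e) => {
    const id = `${e.customerGid}\n${e.attributeKey}`;
    if (seen.has(id)) return false;
    seen.add(id);
    return true;
  });
  // Prefixes keep customer nodes and attribute nodes from colliding.
  const sets = new DisjointSet();
  for (const e of unique) sets.union(`c|${e.customerGid}`, `a|${e.attributeKey}`);

  type Group = { customers: Set<string>; attributes: Set<string>; edgeCount: number };
  const byRoot = new Map<string, Group>();
  for (const e of unique) {
    const root = sets.find(`c|${e.customerGid}`);
    const group: Group = byRoot.get(root) ??
      { customers: new Set<string>(), attributes: new Set<string>(), edgeCount: 0 };
    group.customers.add(e.customerGid);
    group.attributes.add(e.attributeKey);
    group.edgeCount += 1;
    byRoot.set(root, group);
  }

  const clusters: CandidateCluster[] = [];
  byRoot.forEach((group) => {
    if (group.customers.size < 2) return; // a lone customer is not a ring
    clusters.push({
      customerIds: Array.from(group.customers).sort(),
      attributeKeys: Array.from(group.attributes).sort(),
      density: group.edgeCount / (group.customers.size * group.attributes.size),
    });
  });
  return clusters;
}
```

A density of 1 means every customer in the component holds every attribute in it. Lower values indicate a looser component held together by partial overlaps.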

Where rings come from

The same six attribute combinations show up across most of the rings we see:

Shared address with different names is the most common. Two or three accounts shipping to one apartment, ordering and returning rotating items. Often the attribute that ties them together is the apartment number rather than the street, which is why address fingerprinting matters more than raw address-string matching.

Shared payment method with different addresses is the next most common. The same card retried across multiple shipping addresses. Usually a single operator using throwaway shipping addresses to defeat the velocity signals.

Shared name fragments with same date of birth and different addresses is the third pattern. Customer-typed name strings with one letter different ("Anna Mendez" versus "Ana Mendez"), customer-typed birth dates that are identical. Tends to be name-collision evasion rather than literal coordination.

The remaining patterns are rarer: shared phone numbers (people fat-finger their own number), shared device fingerprints (less reliable than the others, lots of legitimate household sharing), shared shipping carriers (almost always coincidence on small stores).
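The one-letter name difference in the third pattern ("Anna Mendez" versus "Ana Mendez") is the kind of thing a normalized comparison with a small edit-distance tolerance catches. The sketch below is one plausible way to do that, not the comparison RefundSentry ships. The normalization steps, the function names, and the tolerance of one edit are all assumptions.

```typescript
// Lowercase, strip combining diacritics, and collapse whitespace.
function normalizeName(raw: string): string {
  return raw
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .replace(/\s+/g, " ")
    .trim();
}

// Classic Levenshtein distance, keeping only two rows of the table.
function editDistance(a: string, b: string): number {
  let previous: number[] = [];
  for (let j = 0; j <= b.length; j++) previous.push(j);
  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const substitution = a[i - 1] === b[j - 1] ? 0 : 1;
      current.push(
        Math.min(
          previous[j] + 1, // deletion
          current[j - 1] + 1, // insertion
          previous[j - 1] + substitution,
        ),
      );
    }
    previous = current;
  }
  return previous[b.length];
}

// Two names match if they are within maxDistance edits after normalization.
function namesLikelyMatch(a: string, b: string, maxDistance: number = 1): boolean {
  const left = normalizeName(a);
  const right = normalizeName(b);
  if (left === "" || right === "") return false;
  return editDistance(left, right) <= maxDistance;
}
```

A tolerance this tight will also match unrelated people with near-identical names, which is one reason a name match is only meaningful when it is combined with a second shared attribute such as date of birth.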

What the merchant does with the alert

A FraudRingAlert arrives in the merchant's notification feed with a one-line description and a link to the graph view. The merchant looks at the graph, decides whether the cluster looks coordinated, and either confirms or dismisses. The confirmation cascades to score adjustments and customer tagging. The dismissal records the merchant's judgment and the engine learns to weight the same attribute combination less aggressively for that merchant.

A merchant who never gets a FraudRingAlert is fine. The feature is a high-precision, low-recall tool. It only fires when the attribute overlap is unambiguous. We deliberately tuned for precision because the failure mode of low precision is that the merchant dismisses every alert and then stops reading them.

Take-away

A fraud ring is a structural pattern, not a per-customer pattern. It is invisible to fraud tools that score one customer at a time. The detection has to live at the graph layer (customer-to-attribute joins, connected-component clustering, cascade on confirmation) and the alerting has to render the topology so a human can see it.

The merchant problem this solves is real. Most small DTC operators have seen the pattern at least once. They did not have a tool that named it for them.

RefundSentry is an intelligence layer for Shopify return fraud. See pricing for plans during the private beta.

RefundSentry Engineering
