The anatomy of a repeat return fraudster
A small apparel merchant we work with ran into a pattern that felt obvious in hindsight and was completely invisible at the time. Five customer accounts, two shipping addresses, twelve orders over three months, eleven returns, and two chargebacks. Total losses, including refund payouts and chargeback fees, came to around $1,840.
This post walks through what the data actually showed, in the order it landed in the merchant's database, and which of the new identity-pivot signals would have surfaced it earlier than the existing engine did. Names and addresses are anonymized; the rest is real.
The setup
The merchant runs a women's apparel store on Shopify. Average order value around $90, typical return rate around 18%, mostly fit and sizing reasons. Most chargebacks they had ever seen were honest disputes (the package never arrived, the package was damaged), and the chargeback rate was well below 1% of orders.
The store had been on RefundSentry for about four months when the pattern started. The risk engine was running with the spec 105 signal set, which includes shared-address-cluster, multi-account address, account-age vs order value mismatch, and the existing customer-history signals. Confidence threshold for HIGH zones was set at the engine default.
What landed in the data
Week 1, day 3: Customer A is created. Address: 4271 Maple Lane, Apt B. Order placed for $156. Standard checkout, no anomalies. The order ships, customer A receives it, files a return for "wrong size" three days later. Refund issued. Return scored at 22, low zone, not flagged.
Week 1, day 9: Customer B is created. Same address, 4271 Maple Lane, Apt B. Order placed for $103. Multi-account-address signal fires (2 customers at the address now). Account-age vs value mismatch fires marginally. Return arrives 11 days later, scored at 38, medium zone, on the merchant's review queue.
Week 2, day 14: Customer A places a second order, $174. No new signal contributions. Returns it 8 days later, scored at 31, low zone.
Week 3, day 21: Customer C is created. Different address, 89 Birchwood Drive, in a different city. Order placed for $194. No address-cluster signals fire because the address is brand new. Return scored at 27, low zone.
Week 4, day 28: Customer A files a chargeback on their first order, the $156 one. Reason code: "fraudulent, unauthorized." This is the first explicit fraud signal in the data.
Here is the moment where everything starts to look different in retrospect. The chargeback is on the address that already had two customers running returns. The chargeback gets recorded as a ChargebackEvent with customerId = A and (in the spec 147 schema) addressFingerprintId = F1. Customer profile A's chargebackCount ticks to 1. The return record gets cascaded to outcome=CONFIRMED_FRAUD by the order-cascade pipeline (spec 143). The original order's RiskScore row gets the CONFIRMED_FRAUD outcome.
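The write path in that paragraph can be sketched roughly like this. The record names (ChargebackEvent, RiskScore, addressFingerprintId, chargebackCount) come from the post; the in-memory Store class is a hypothetical stand-in for the actual spec 143/147 storage layer, not the real implementation:

```python
from dataclasses import dataclass

@dataclass
class RiskScore:
    order_id: str
    score: int
    outcome: str = "UNLABELED"

@dataclass
class ChargebackEvent:
    customer_id: str
    order_id: str
    reason_code: str
    address_fingerprint_id: str  # denormalized onto the row at upsert time

class Store:
    """Hypothetical in-memory stand-in for the merchant database."""
    def __init__(self) -> None:
        self.risk_scores: dict[str, RiskScore] = {}
        self.chargebacks: list[ChargebackEvent] = []
        self.chargeback_count: dict[str, int] = {}  # per customer profile

    def record_chargeback(self, customer_id: str, order_id: str,
                          reason: str, fingerprint: str) -> None:
        # 1. Write the event with the address fingerprint denormalized in.
        self.chargebacks.append(
            ChargebackEvent(customer_id, order_id, reason, fingerprint))
        # 2. Tick the customer profile's chargebackCount.
        self.chargeback_count[customer_id] = \
            self.chargeback_count.get(customer_id, 0) + 1
        # 3. Order-cascade: the order's RiskScore row is labeled fraud.
        self.risk_scores[order_id].outcome = "CONFIRMED_FRAUD"

store = Store()
store.risk_scores["order-1"] = RiskScore("order-1", score=22)
store.record_chargeback("A", "order-1", "fraudulent, unauthorized", "F1")
```

The design point worth noticing is the denormalization: the address fingerprint is written onto the chargeback row at upsert time, so later signal evaluations can read it directly instead of joining back through the orders table.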
In the pre-spec-147 engine, that confirmed-fraud label is trapped on customer A's record. It does not propagate to customer B at the same address. It does not affect customer C at the other address.
Week 5, day 33: Customer D is created. Address: 89 Birchwood Drive (same as customer C). Order placed for $221. The multi-account-address signal fires marginally (2 distinct customers at the address now, C and D), just as it did at the first address, and nothing else elevates the order.
Week 6, day 39: Customer B places a second order, $112, returns it 7 days later. Score 35.
Week 7, day 47: Customer E is created. Address: 4271 Maple Lane, Apt B. The third customer at that address. Multi-account-address fires harder (now 3 distinct customers). Score elevates into medium zone. The return that arrives 9 days later scores at 51, clearly elevated, but not high zone, not blocked.
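A plausible shape for that tiering, as a sketch: the post shows the signal firing at 2 distinct customers (the 1.0x tier) and firing harder at 3, so the elevated multiplier below is an assumed value, not a documented one.

```python
def multi_account_tier(customer_ids: set[str]) -> float:
    """Tier multiplier for the multi-account-address signal.

    Thresholds are read off the post: the signal fires at 2 distinct
    customers (the 1.0x tier) and fires harder at 3. The elevated
    multiplier (1.5x) is an assumption, not a documented value.
    """
    n = len(customer_ids)
    if n < 2:
        return 0.0   # below threshold: signal does not fire
    if n == 2:
        return 1.0   # base tier (customer B's first return)
    return 1.5       # assumed elevated tier (customer E, 3 customers)
```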
Week 9, day 61: Customer C files a chargeback on their first order. Same reason code: "fraudulent, unauthorized." Now there is a chargeback at the second address.
Weeks 10-12: customers B, D, and E each return more orders. Returns scored in the 40-50 range.
Total losses by week 12: 11 refunded returns plus the fees and disputed amounts on 2 chargebacks. Around $1,840.
What the new signals would have caught
Replay the same sequence with the spec 147 identity-pivot signals enabled. The shop's chargeback backfill had already completed (the backfill takes at most a few hours after install, and the shop had been live for months), so all four address-pivot signals are live.
Week 1, days 3 and 9: same as before. No prior chargebacks at the address, no prior fraud labels, no prior cohort activity. Returns score in the low and low-medium zones.
Week 4, day 28 (first chargeback): customer A's chargeback gets recorded. The new chargeback writes addressFingerprintId = F1 on the ChargebackEvent row. The order-cascade pipeline sets outcome=CONFIRMED_FRAUD on the order's RiskScore row. The denormalization happens at upsert time.
Week 4, day 30 (the next return at F1): customer B requests a return. The risk engine evaluates. Now priorChargebackAtAddress reads the lifetime count at F1 and finds 1 chargeback. TRIGGERED at the 1.0x tier, contributes 18 base points. priorFraudAtAddress reads the confirmed-fraud-customer count at F1 and finds 1 (customer A). TRIGGERED, contributes 20. sharedWithFraudConfirmed flips true (customer A is in the customerIds array AND has a CONFIRMED_FRAUD outcome). TRIGGERED, contributes 15. The existing multi-account-address signal continues to fire at the 1.0x tier.
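Pulling those three address-pivot contributions into one place, here is a minimal sketch of the evaluation on customer B's week-4 return. The base-point values (18, 20, 15) are the ones quoted above; the function shape and threshold logic are illustrative, and tiers above 1.0x are omitted:

```python
def address_pivot_contributions(chargebacks_at_address: int,
                                fraud_customers_at_address: int,
                                cohort_has_confirmed_fraud: bool) -> dict[str, int]:
    """Evaluate the three spec-147 address-pivot signals at one fingerprint.

    Base points (18, 20, 15) are quoted in the post; everything else
    is an illustrative sketch, not the production evaluation code.
    """
    fired: dict[str, int] = {}
    if chargebacks_at_address >= 1:        # lifetime chargeback count at F1
        fired["priorChargebackAtAddress"] = 18
    if fraud_customers_at_address >= 1:    # confirmed-fraud customers at F1
        fired["priorFraudAtAddress"] = 20
    if cohort_has_confirmed_fraud:         # a cohort member is labeled fraud
        fired["sharedWithFraudConfirmed"] = 15
    return fired

# Customer B's week-4 return at F1: one chargeback, one fraud-labeled
# customer (A), and A is in the address cohort.
week4 = address_pivot_contributions(1, 1, True)
```

With one chargeback, one confirmed-fraud customer, and a fraud-confirmed cohort member at F1, the pivots alone add 53 points before the existing signals contribute anything.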
The combined score on customer B's return at week 4: well into the high zone. The merchant's review automation flags it. The merchant looks at the customer record, sees the connection to customer A's chargeback, and either denies the refund, requests verification, or labels it.
Whatever the merchant decides, customer B's return becomes a labeled training example. If labeled fraud, customer B's CustomerProfile gets the CONFIRMED_FRAUD outcome, which propagates further at the address fingerprint to anyone else who ships there.
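That propagation loop can be sketched as a set of confirmed-fraud customers keyed by address fingerprint; the structure and function names here are hypothetical:

```python
# fingerprint -> set of customers with a CONFIRMED_FRAUD outcome there.
# Hypothetical structure; the real engine reads this from the database.
confirmed_fraud_customers: dict[str, set[str]] = {}

def label_confirmed_fraud(fingerprint: str, customer_id: str) -> None:
    """A fraud label on one customer enriches the whole address cohort."""
    confirmed_fraud_customers.setdefault(fingerprint, set()).add(customer_id)

def prior_fraud_at_address(fingerprint: str) -> int:
    """What a priorFraudAtAddress-style lookup reads at evaluation time."""
    return len(confirmed_fraud_customers.get(fingerprint, set()))

label_confirmed_fraud("F1", "A")  # week 4: the chargeback cascade
label_confirmed_fraud("F1", "B")  # the merchant labels B's return fraud
# Any later return shipped to F1 (customer E's, say) now sees a count of 2.
```

The point is that the merchant's review-queue label on customer B is not just record-keeping; it immediately raises the count that every later return shipped to F1 will see.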
Week 5, day 33: customer D is created at the second address (89 Birchwood Drive). At this point the second address has not yet had a chargeback (customer C's chargeback comes at week 9), so the address-pivot signals do not fire on D. However, if customer A or B's hashed email or phone matches anything else in the cohort, the email/phone signals could fire here. In the actual case, the fraudster used different emails, so the cohort signals do not catch the cross-address connection.
Week 9, day 61 (second chargeback at second address): customer C's chargeback lands. addressFingerprintId = F2. From this point forward, customer D and any future customer at F2 will fire the address-pivot signals.
Week 10+ (subsequent returns from D, E): all elevated. The fraud is contained at week 4 in the new model versus week 12 in the old.
The estimated losses with the new signals enabled: around $580, a 68% reduction.
What surprised the merchant
Three things stood out when we walked the merchant through the analysis:
The chargeback on customer A was the inflection point in both the old and new models, but the old model only acted on it for customer A. The new model uses the chargeback as a signal about the address, which is the correct framing.
Customer C's first order was placed a week before customer A's chargeback, at an address with no history. Nothing in the data at that time connected customer C to the others. The new signals do not catch customer C's first order; they catch the activity at that address after customer C's own chargeback. That is fine: no signal can travel back in time to flag a totally fresh address with no prior chargebacks. The signals work forward in time.
The multi-account-address signal fired on customer B's first return (week 1, day 9), but at that time the merchant's review queue had no automation acting on medium-zone scores. The merchant has since added a review-queue automation that triggers on any return where multi-account-address fires, regardless of zone, on the grounds that the multi-account pattern is itself worth investigating.
For more on how the customer-history pivot signals work alongside the address-pivot signals, see email and phone recycling in return fraud.
What the merchant changed afterward
Three operational shifts:
The merchant added a "hold for review" automation on returns scoring above 60 with HIGH confidence. Previously they had been blocking only on scores above 80. The new automation captures the elevated-score-but-not-maxed cases that the address pivot signals produce.
The merchant adopted the Confirm Not Fraud workflow systematically. Previously they had been labeling about 30% of medium-zone returns. They now label every return that goes through the review queue, regardless of decision. This generates training data the engine uses to adjust signal contributions per shop.
The merchant tuned the email and phone cohort signals, which had been running at the registry default weights. They had used the address-pivot signals during the beta but had left email/phone at default. The cohort signals catch a different subset of fraud and contributed an estimated 15% additional flag rate on the returns the merchant reviewed.
For the deep dive into how the chargeback dispute is the inflection point and why the address-pivot signals matter most, see why your Shopify chargeback data is not enough. The fraud-ring layer of the system is in fraud ring detection beyond the obvious signals. The walkthrough of address-keyed signals specifically is in how fraudsters reuse addresses after a chargeback.
Common questions
Are these numbers typical?
The losses ($1,840) and the loss reduction estimate (68%) are specific to this merchant. They depend on average order value, return policy, and how aggressive the merchant's automation rules are. We have seen reductions in the 40-80% range across the merchants we have analyzed. A merchant with stricter blocking automation will see a higher reduction; a merchant with more lenient policies will see less.
How does the merchant know the labels are correct?
They do not, with certainty. Confirmed-fraud labels are merchant judgments, and sometimes they are wrong. The engine treats labels as input rather than ground truth, and a merchant who consistently mislabels will train the engine in the wrong direction. We coach merchants to label only the cases they are confident about and to use the Not Sure outcome on ambiguous ones; the engine treats Not Sure as a non-signal, contributing nothing either way.
What if the fraudster had used different addresses every time?
Then the address pivot would not catch them. The email and phone pivots might, depending on whether the fraudster reused those identifiers. A fraudster who rotates addresses, emails, AND phones across every order is genuinely hard to catch with the current signal set, and would only be caught by the per-customer signals if their order patterns were anomalous enough to fire those.
Did the merchant's chargeback win rate change?
Indirectly. The new signals do not fight chargebacks; they prevent the returns that lead to chargebacks. Catching a return at the request stage and either holding or denying the refund prevents the chargeback from ever happening. Over the six months following the spec 147 release, this merchant's chargeback count dropped from a steady 2-3 per month to about 1 every 2 months.
How long after install does the new merchant start seeing these signals?
The address-pivot signals require the chargeback backfill to complete. For a new merchant with a year of historical data, that takes anywhere from minutes to a few hours. The email and phone pivots are live immediately (no backfill required), as is the post-pass newAccountAtKnownRiskAddress signal.
Closing thought
The pattern in this post would have been catchable without RefundSentry, by a human looking carefully at the data. The merchant could have spotted it in week 4 when customer A's chargeback landed if they had thought to query other customers at the same address. They did not, because thinking to do that on every chargeback is not a habit a merchant has time to maintain.
The signals automate exactly that habit. Every chargeback, on every order, every time, the engine asks the question "what else is happening at this address?" and elevates the score on related returns accordingly. Same question for emails. Same for phones.
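As a sketch, the question the engine automates is a single pivot query. The schema below is a hypothetical simplification (two tables, none of the real RefundSentry column names), built on an in-memory SQLite database:

```python
import sqlite3

# Hypothetical, simplified schema; the real RefundSentry tables differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT, customer_id TEXT, address_fp TEXT);
    CREATE TABLE chargebacks (order_id TEXT, customer_id TEXT, address_fp TEXT);
    INSERT INTO orders VALUES
        ('o1', 'A', 'F1'), ('o2', 'B', 'F1'), ('o3', 'C', 'F2');
    INSERT INTO chargebacks VALUES ('o1', 'A', 'F1');
""")

# For a new chargeback, ask: which other customers ship to its address?
others = conn.execute("""
    SELECT DISTINCT o.customer_id
    FROM chargebacks cb
    JOIN orders o ON o.address_fp = cb.address_fp
    WHERE cb.order_id = 'o1'
      AND o.customer_id != cb.customer_id
""").fetchall()
# others now holds the rest of the address cohort (customer B here).
```

The same query shape works for the email and phone pivots; only the join key changes.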
If you want to see the same kind of analysis on your own store's data, RefundSentry is free during the private beta. The full identity-pivot signal set is live for every shop on the platform. See pricing for plan details and docs for the signal reference and tuning guide.