Why your Shopify chargeback data is not enough
Shopify gives merchants a chargeback record for every dispute. The order GID, the dispute amount, the reason code, the timestamp. You can look up any dispute, see the details, and decide whether to fight it or eat it. The data is accurate and timely.
What Shopify does not give you is the connective tissue between disputes. There is no native UI that says "this dispute is the third one at this address," or "the customer who placed this order shares an email hash with two prior chargeback customers." Those joins do not exist in the Shopify admin. They have to be computed somewhere else.
This post is about what that gap costs your store, what kinds of patterns slip through because of it, and what RefundSentry's spec 147 release does to close it.
The shape of the gap
Pull up any dispute in the Shopify admin. You will see the order, the customer, the amount, the reason, and the dispute status. You can click through to the order, click through to the customer, and read their order history. That is it.
What you cannot see, without leaving Shopify and reconciling externally:
- Whether this customer's address fingerprint has received chargebacks under other customer accounts
- Whether this customer's email or phone hash matches any other customer in your store with chargeback history
- Whether this customer is part of an active fraud-ring alert with other customers at the same address
- Whether the chargeback is the first in an emerging cluster (2 chargebacks in 90 days at the same address) or part of a long-tail history
For a single chargeback investigation, you can manually build these joins. Open the order, copy the address, search the customer list for other orders to the same address, look at each customer for prior disputes. Twenty minutes per dispute. For a store doing two or three disputes a week, manageable. For a store doing five or ten a day, impossible.
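The manual join described above can be sketched in a few lines. This is a minimal illustration that assumes you have already exported disputes into rows of (dispute id, customer id, normalized address); the row shape, the sample data, and the function names are hypothetical and not part of Shopify or RefundSentry.

```python
from collections import defaultdict

# Hypothetical export rows: (dispute_id, customer_id, normalized_address).
disputes = [
    ("d1", "cust_a", "12 elm st, springfield"),
    ("d2", "cust_b", "12 elm st, springfield"),
    ("d3", "cust_c", "9 oak ave, shelbyville"),
    ("d4", "cust_d", "12 elm st, springfield"),
]


def customers_by_address(rows):
    """Group the customers who filed disputes by shipping address."""
    grouped = defaultdict(set)
    for _dispute_id, customer_id, address in rows:
        grouped[address].add(customer_id)
    return grouped


def shared_addresses(rows, min_customers=2):
    """Return addresses where disputes came from several distinct accounts."""
    return {
        address: sorted(customers)
        for address, customers in customers_by_address(rows).items()
        if len(customers) >= min_customers
    }


flagged = shared_addresses(disputes)
```

With the sample rows, only the Elm Street address is returned, listing three distinct customer accounts. The hard part in practice is not the grouping but producing a reliable normalized address for every order, which is why the manual version takes so long.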
Why this gap exists
Shopify's data model is order-centric. Customers, orders, returns, and disputes all relate back to the order. The platform's job is to record what happened, not to infer why or whether it will happen again. This is the right architectural choice for a commerce platform: adding pattern-detection logic to the core data layer would slow it down and complicate it. It is also why every Shopify merchant who wants pattern detection ends up bolting an external app on top of the native data, exporting data to a spreadsheet, or paying a third-party fraud platform that has its own opinionated view of the customer.
The gap is intentional, and it is genuinely fine for the simpler 80% of fraud cases. A customer with three chargebacks on a single Shopify customer account will be visible to anyone looking at the customer record. In those cases the native data is sufficient.
The gap becomes a problem in the harder 20% of fraud cases. Specifically, the cases where:
- A fraudster has rotated customer accounts to reset their visible history
- The fraud is distributed across multiple customer accounts that share an address, email, phone, or other pivot
- The pattern is recent enough that the per-customer counters have not yet crossed the threshold
These are the cases that RefundSentry's identity-pivot signals target.
What "good enough" detection looks like in practice
Several patterns show up consistently in merchant investigations:
A customer files a chargeback and abandons the account. Two weeks later, a brand-new customer places an order shipping to the same address. Without an address-keyed signal, this is invisible. With one, the new customer's first return scores in the medium or high zone the moment it lands. See how fraudsters reuse addresses after a chargeback for a full walkthrough.
A merchant labels a return as fraudulent. The fraudster creates a new customer account and ships to the same address. Without fraud-ring or confirmed-fraud propagation, the new return scores clean. With it, the new return inherits the merchant's fraud confirmation through the address pivot.
A fraud team operates four customer accounts simultaneously, with different addresses but a shared email hash. Three of the accounts have one chargeback each. The fourth is fresh and is about to file a return. Without an email cohort signal, the fourth return scores clean. With it, the cohort sum of 3 chargebacks elevates the fourth return to the high-risk zone. See email and phone recycling in return fraud for the cohort mechanics.
A warehouse address has received two chargebacks in the last 60 days. The lifetime count at that address is only two, below the strong-tier threshold for the lifetime signal. The velocity-window signal catches it because the cluster is recent.
What the new release adds
RefundSentry's spec 147 ships seven new identity-pivot signals plus one post-pass combination signal. The four most directly relevant to "Shopify chargeback data alone is not enough":
priorChargebackAtAddress reads the lifetime count of CHARGEBACK-type dispute events linked to the current return's shipping address fingerprint, scoped to your shop. Tier-graduated points: 1, 2, or 3+ chargebacks contribute progressively more.
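As a rough sketch of what tier-graduated points mean, the lookup below maps a lifetime chargeback count to a point value. The specific point values and the names `TIERS` and `tiered_points` are assumptions made for illustration; the post does not publish the real weights.

```python
# Hypothetical tier table: (minimum chargeback count, points awarded).
# The point values are illustrative only, not RefundSentry's real weights.
TIERS = [(3, 40), (2, 25), (1, 10)]


def tiered_points(chargeback_count: int) -> int:
    """Return the points for the highest tier the count reaches."""
    for minimum, points in TIERS:  # ordered from strongest tier to weakest
        if chargeback_count >= minimum:
            return points
    return 0  # no chargebacks at this fingerprint: signal contributes nothing


points_by_count = {count: tiered_points(count) for count in range(5)}
```

Listing tiers from strongest to weakest keeps the function a simple first-match scan, and the 3+ tier naturally caps the contribution no matter how many chargebacks accumulate.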
recentChargebackVelocityAtAddress reads the same data, time-bounded to the trailing 90 days (merchant-tunable). Triggers at 2 or more recent chargebacks. Catches in-flight clusters before they cross the lifetime threshold.
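A minimal sketch of the trailing-window check, assuming the chargeback timestamps for one address fingerprint are already in hand. The function names and sample dates are hypothetical; only the 90-day default and the threshold of 2 come from the description above.

```python
from datetime import datetime, timedelta, timezone


def recent_chargeback_count(event_times, now, window_days=90):
    """Count chargebacks that fall inside the trailing window ending at `now`."""
    cutoff = now - timedelta(days=window_days)
    return sum(1 for t in event_times if cutoff <= t <= now)


def velocity_triggered(event_times, now, window_days=90, threshold=2):
    """True when the recent count reaches the trigger threshold."""
    return recent_chargeback_count(event_times, now, window_days) >= threshold


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
events = [
    now - timedelta(days=10),   # recent
    now - timedelta(days=60),   # recent
    now - timedelta(days=200),  # outside the 90-day window
]
recent = recent_chargeback_count(events, now)
triggered = velocity_triggered(events, now)
```

Here two of the three chargebacks fall inside the window, so the velocity check triggers. Shrinking `window_days` to 30 would leave a single recent chargeback, which is below the threshold.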
priorFraudAtAddress reads RiskScore rows where outcome is CONFIRMED_FRAUD on RETURN or ORDER subjects. Counts distinct customer GIDs at the fingerprint with that outcome. Triggers on 1 or more.
sharedWithFraudConfirmed is a boolean signal that flips when any customer at the address fingerprint either has a confirmed-fraud outcome OR is a member of a CONFIRMED FraudRingAlert. It carries lower base points than priorFraudAtAddress because it is fire-once: it contributes the same amount no matter how many customers match. The two signals run in parallel.
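To make the difference between the counting signal and the boolean signal concrete, here is a small sketch over an in-memory list of customers at one fingerprint. The `CustomerAtAddress` record and both function names are hypothetical stand-ins, not RefundSentry's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerAtAddress:
    customer_gid: str
    confirmed_fraud: bool    # has a CONFIRMED_FRAUD outcome on a return or order
    in_confirmed_ring: bool  # member of a CONFIRMED FraudRingAlert


def prior_fraud_at_address(customers):
    """Count distinct customers at the fingerprint with a confirmed-fraud outcome."""
    return len({c.customer_gid for c in customers if c.confirmed_fraud})


def shared_with_fraud_confirmed(customers):
    """Boolean: any customer is confirmed fraud or sits in a confirmed ring."""
    return any(c.confirmed_fraud or c.in_confirmed_ring for c in customers)


ring_only = [
    CustomerAtAddress("gid://shopify/Customer/1", False, True),
    CustomerAtAddress("gid://shopify/Customer/2", False, False),
]
mixed = ring_only + [CustomerAtAddress("gid://shopify/Customer/3", True, False)]
```

For `ring_only`, the count is 0 while the boolean is true, which is the case where running both signals in parallel matters: ring membership alone never shows up in the count.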
The remaining three pivot signals (priorChargebackEmail, priorChargebackPhone, priorChargebackSameCard) use the customer-history pivot rather than the address pivot. The card signal is registered as a NOT_AVAILABLE stub pending Shopify Admin API support for stable card fingerprints. The email and phone signals are live.
How the data gets there
Two pieces of denormalization make this practical to compute on every return without slowing the engine:
The first is an addressFingerprintId foreign key on every ChargebackEvent row. Without this, the signal would need to join through the order to the shipping address on every score, which is too slow for a hot path. With it, counting chargebacks at an address is a single indexed query.
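The effect of the denormalized key can be sketched with SQLite from the Python standard library. The table and column names below are assumptions chosen to mirror the field names in this post, and SQLite is only a stand-in for whatever database the engine actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chargeback_event (
        id INTEGER PRIMARY KEY,
        shop_id TEXT NOT NULL,
        order_id TEXT NOT NULL,
        address_fingerprint_id TEXT  -- denormalized key; NULL until backfilled
    );
    -- Composite index so the count never has to join through the order.
    CREATE INDEX idx_cb_shop_fingerprint
        ON chargeback_event (shop_id, address_fingerprint_id);
    """
)
conn.executemany(
    "INSERT INTO chargeback_event (shop_id, order_id, address_fingerprint_id) "
    "VALUES (?, ?, ?)",
    [
        ("shop_1", "order_10", "fp_a"),
        ("shop_1", "order_11", "fp_a"),
        ("shop_1", "order_12", "fp_b"),
        ("shop_2", "order_20", "fp_a"),  # same fingerprint id, different shop
    ],
)


def chargebacks_at_address(conn, shop_id, fingerprint_id):
    """Count chargebacks at one fingerprint, scoped to one shop."""
    row = conn.execute(
        "SELECT COUNT(*) FROM chargeback_event "
        "WHERE shop_id = ? AND address_fingerprint_id = ?",
        (shop_id, fingerprint_id),
    ).fetchone()
    return row[0]


count = chargebacks_at_address(conn, "shop_1", "fp_a")
```

Because both filter columns are covered by the composite index, the count is a single index lookup, and the shop scoping keeps the other shop's row at the same fingerprint id out of the result.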
The second is the chargebackCount aggregate on every CustomerProfile. This was already populated by spec 075's chargeback-analytics work. The new cohort signals just sum across rows sharing an emailHash or phoneHash, scoped by shopId.
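The cohort sum is simple once the per-customer aggregate exists. The sketch below uses an in-memory list and a hypothetical `CustomerProfile` record whose fields mirror the names above; it reproduces the four-account example from earlier in the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerProfile:
    shop_id: str
    customer_gid: str
    email_hash: str
    chargeback_count: int


def email_cohort_chargebacks(profiles, shop_id, email_hash):
    """Sum chargeback counts across profiles sharing an email hash in one shop."""
    return sum(
        p.chargeback_count
        for p in profiles
        if p.shop_id == shop_id and p.email_hash == email_hash
    )


profiles = [
    CustomerProfile("shop_1", "cust_1", "hash_x", 1),
    CustomerProfile("shop_1", "cust_2", "hash_x", 1),
    CustomerProfile("shop_1", "cust_3", "hash_x", 1),
    CustomerProfile("shop_1", "cust_4", "hash_x", 0),  # the fresh account
    CustomerProfile("shop_2", "cust_9", "hash_x", 5),  # other shop: excluded
]
cohort_total = email_cohort_chargebacks(profiles, "shop_1", "hash_x")
```

The fresh account has no chargebacks of its own, but its cohort total is 3. The same shape works for the phone cohort by swapping the hash field.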
A per-shop backfill job populates the chargeback foreign key for historical rows. Backfill takes anywhere from a few minutes to a few hours depending on the shop's chargeback volume. While the backfill is running, the four address-pivot signals return NOT_AVAILABLE rather than NOT_TRIGGERED for that shop. We would rather miss a few hours of detection than emit a wrong score from a partially populated dataset.
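The distinction between the two non-firing states can be sketched as a three-value status. NOT_AVAILABLE and NOT_TRIGGERED come from the description above; the TRIGGERED name, the `backfill_complete` flag, and the function itself are assumptions made for illustration.

```python
from enum import Enum


class SignalStatus(Enum):
    TRIGGERED = "TRIGGERED"          # assumed name for the firing state
    NOT_TRIGGERED = "NOT_TRIGGERED"  # data is complete and shows nothing
    NOT_AVAILABLE = "NOT_AVAILABLE"  # data is incomplete, so no claim is made


def evaluate_address_signal(backfill_complete, chargeback_count, threshold=1):
    """Refuse to score from a partially populated dataset."""
    if not backfill_complete:
        return SignalStatus.NOT_AVAILABLE
    if chargeback_count >= threshold:
        return SignalStatus.TRIGGERED
    return SignalStatus.NOT_TRIGGERED


during_backfill = evaluate_address_signal(False, chargeback_count=0)
after_backfill = evaluate_address_signal(True, chargeback_count=0)
```

A zero count during the backfill reports NOT_AVAILABLE, while the same count afterwards reports NOT_TRIGGERED. Keeping the states separate lets downstream scoring tell "no evidence" apart from "evidence not loaded yet".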
The new queries add roughly 2 ms of latency per return on a shop with under 100k chargebacks. The composite indexes on [shopId, addressFingerprintId], [shopId, emailHash], and [shopId, phoneHash] keep the lookups in single-digit milliseconds even on much larger shops.
What changes for merchants who turn this on
The most visible change is that more returns now score in the medium-and-above zone. The base rate of high-zone returns goes up because the pre-feature engine was systematically missing the identity-pivot fraud subset. Some merchants will see a 5-10% increase in flagged returns; others will see more or less depending on their fraud mix.
The second change is that some confirmed-fraud labels now propagate where they did not before. A merchant who labeled a return as fraudulent six months ago will start seeing new returns from new customers at the same address inherit the elevated score. This is the longest-overdue gap in the engine and the one most directly addressed by spec 147.
The third change is that emerging clusters are visible earlier. The 90-day velocity signal flags a 2-chargeback cluster that the lifetime signal would not score at its strong tier until the third chargeback. This shifts detection from reactive ("we already had three chargebacks at this address") to proactive ("we have two, and a third is likely on the way").
For deeper reading on how the new signals interact with broader fraud-ring detection, see fraud ring detection beyond the obvious signals. For a real-world walkthrough, see the anatomy of a repeat return fraudster.
Common questions
Does Shopify Plus include any of this natively?
Shopify Plus has Shopify Protect for orders, but it is focused on payment fraud at the order level, not return fraud or pattern detection across customer accounts. It catches stolen-card scenarios, not address-recycling scenarios.
Can I build this with the Admin API and a custom job?
In principle, yes. The Admin API exposes orders, customers, and disputes. You would need to build the AddressFingerprint table, populate it from order shipping addresses, denormalize chargeback IDs onto the fingerprint table, run cohort counts on email and phone hashes (which you would need to compute and store yourself, since the Admin API does not expose them), and run the engine on every return webhook. RefundSentry is the result of doing exactly that across many merchants.
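For a sense of the first step in a do-it-yourself build, here is a minimal sketch of address fingerprinting and email hashing. The normalization rules, the choice of SHA-256, and the decision to mix the shop id into the fingerprint are all assumptions made for illustration; this is not RefundSentry's actual algorithm, and real address matching needs far more careful normalization.

```python
import hashlib
import re


def normalize_address(line1, city, postal_code, country):
    """Lowercase each part and collapse punctuation and whitespace runs."""
    parts = [line1, city, postal_code, country]
    cleaned = [re.sub(r"[^a-z0-9]+", " ", p.lower()).strip() for p in parts]
    return "|".join(cleaned)


def address_fingerprint(shop_id, line1, city, postal_code, country):
    """Hash the normalized address, scoped to one shop (an assumed design)."""
    payload = f"{shop_id}|{normalize_address(line1, city, postal_code, country)}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def email_hash(email):
    """Hash a trimmed, lowercased email so raw addresses are not stored."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


fp_1 = address_fingerprint("shop_1", "12 Elm St.", "Springfield", "62704", "US")
fp_2 = address_fingerprint("shop_1", " 12  elm st ", "SPRINGFIELD", "62704", "us")
```

The two spellings of the same address produce the same fingerprint, which is what lets a later chargeback count key on it. This sketch does not handle abbreviations ("St" versus "Street"), unit numbers, or plus-addressed emails, and those cases are where most of the real effort goes.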
How does this affect my chargeback win rate?
Indirectly. The new signals do not fight chargebacks for you; they flag returns that are likely to produce future chargebacks. Catching a return at the request stage and either holding or denying the refund prevents the chargeback from happening at all. Merchants who use the score to gate refund decisions tend to see chargeback rates drop within a few months.
What if a chargeback is filed in error (not fraud) and gets resolved in the merchant's favor?
The dispute event still gets recorded as a ChargebackEvent, and the signal still counts it. This is a known limitation: the signal cannot distinguish between a won and a lost chargeback at the address-pivot level. We address this at the merchant level via the Confirm Not Fraud action, which records a positive label for that customer and address combination, reducing the signal's contribution on future returns at the same fingerprint.
Are these signals available in the free Shopify app store version?
RefundSentry is currently in private beta. Pricing and availability will be set at launch. See pricing for the current state and signup info.
Closing thought
The gap between "Shopify gives you the dispute" and "your store catches the next dispute before it happens" is exactly the gap RefundSentry exists to close. The new identity-pivot signals are the most concrete step toward that goal we have ever shipped. They use data Shopify already gives you, joined by RefundSentry, scored by the unified risk engine, surfaced in the existing dashboards.
If you want to see the signals fire on your own store's data, RefundSentry is free during the private beta. The address pivot, email cohort, phone cohort, and velocity signals are live for every shop on the platform. See the docs for setup or sign up for the beta to start scoring returns on your store today.