- Exposing logged-out users and analyzing logged-in metrics such as revenue, or a funnel that runs from a logged-out marketing page landing to a logged-in subscription purchase.
- Utilizing one:many relationships, e.g. a single user owns multiple accounts. ID resolution lets you aggregate metrics from the user’s mapped accounts. Note that you lose power when using this approach, but it is statistically sound.
The Challenge: Connecting User Identifiers
A common challenge in experimentation is linking user identifiers before and after an event boundary, most often signup. Experimenters usually have a logged-out ID (e.g., a cookie or Statsig stableID) and, for users who sign up, a userID created afterward. Since business metrics are typically computed at the userID level, teams often want to randomize on logged-out identifiers but measure outcomes on logged-in metrics like revenue or LTV. Most platforms require manual joins or preprocessing to connect these identifiers, leading to complex, error-prone queries that must reconcile exposures across time and mapping tables. Statsig Warehouse Native eliminates this overhead with an automatic, no-code way to connect identifiers across these boundaries that is centralized, consistent, and reproducible.
Mapping Modes
When using ID resolution, you can choose one of three modes:
- Strict 1:1 mapping enforces that each identity has a single mapping. Use this mode when two IDs should always map 1:1; Statsig warns you if there’s data where that’s not the case. Users with a single identity can use downstream metrics from the secondary identity, and multi-mapped users are considered corrupted and discarded from the analysis.
- First-touch mapping attributes the activity of the secondary ID(s) to one primary ID, on the basis that the treatment effect comes from the first time the user is exposed to the experiment.
- Last-touch mapping attributes the activity of the secondary ID(s) to one primary ID, on the basis that the treatment effect comes from the most recent time the user is exposed to the experiment.
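The three modes can be illustrated with a minimal sketch. This is not Statsig's implementation: the function name `resolve_owners`, the tuple layout, and the mode strings are hypothetical, and the strict rule used here (keep a pair only if it is 1:1 in both directions) is an assumption about how "multi-mapped" is defined.

```python
from collections import defaultdict

def resolve_owners(exposures, mode):
    """Pick the primary ID that owns each secondary ID's metrics.

    exposures: iterable of (primary_id, secondary_id, timestamp) tuples;
    secondary_id may be None for units that never identified.
    Returns a dict {secondary_id: primary_id}.
    """
    rows_by_secondary = defaultdict(list)
    secondaries_by_primary = defaultdict(set)
    for primary_id, secondary_id, ts in exposures:
        if secondary_id is None:
            continue
        rows_by_secondary[secondary_id].append((ts, primary_id))
        secondaries_by_primary[primary_id].add(secondary_id)

    owners = {}
    for secondary_id, rows in rows_by_secondary.items():
        if mode == "strict":
            primaries = {p for _, p in rows}
            # Assumption: keep a pair only if it is 1:1 in both directions;
            # anything multi-mapped is dropped from the result.
            if len(primaries) == 1:
                (primary_id,) = primaries
                if len(secondaries_by_primary[primary_id]) == 1:
                    owners[secondary_id] = primary_id
        elif mode == "first_touch":
            owners[secondary_id] = min(rows)[1]  # earliest exposure wins
        elif mode == "last_touch":
            owners[secondary_id] = max(rows)[1]  # latest exposure wins
        else:
            raise ValueError(f"unknown mode: {mode}")
    return owners

# Hypothetical data: user_1 was seen on two cookies, user_2 on one.
example_exposures = [
    ("cookie_a", "user_1", 1),
    ("cookie_b", "user_1", 5),
    ("cookie_c", "user_2", 2),
    ("cookie_d", None, 3),
]
strict_owners = resolve_owners(example_exposures, "strict")
first_owners = resolve_owners(example_exposures, "first_touch")
last_owners = resolve_owners(example_exposures, "last_touch")
```

In this sketch, strict mode keeps only `user_2`, while first-touch and last-touch both keep `user_1` but assign it to `cookie_a` and `cookie_b` respectively.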
Strict 1:1 Mapping

First Touch Mapping (Mixed Population)

Last Touch Mapping (Mixed Population)

What does Mixed Population mean?

Explanation of Methodology
- Primary IDs are preferred over secondary IDs if present in the data.
- Secondary IDs are only used to join metrics to exposures, but the unit of analysis is still the primary ID.
- Many (primary) to one (secondary) mapping is handled through attributing the secondary ID to ONE primary ID.
- One (primary) to many (secondary) mapping is implicitly handled by treating all secondary IDs as the same unit.
- Statsig supports a mixture of primary and secondary IDs in the same experiment.
How to Enable ID Resolution in a Statsig Experiment
Setting up identity resolution in Statsig is very simple. You can either log or join data to provide both IDs on your assignment source, or provide one ID in the assignment source along with a mapping table between the IDs in the form of an Entity Property Source.
Using Property Source
To use Identity Resolution across experiments in your project, you will need a lookup table that has both the ID you are exposing on and the target ID you want to map to. This table can be configured by setting up an Entity Property Source with both IDs present. Once that’s done, you can simply select this source when configuring your secondary ID type, and Statsig handles the join for you.
Using Assignment Source
When creating an assignment source, provide a column for both ID types. It is assumed that your primary ID will be non-null for exposure records. Your secondary ID can be null. If your secondary ID is sparse (some records are null and some are not, due to logging), Statsig will back-attribute any identified secondary ID to other records from the same primary ID.
- For metric sources with the primary ID, metrics will be joined to exposures based on that primary ID.
- For metric sources with only the secondary ID, metrics will be joined to exposures based on that secondary ID.
- In strict mode, users with a duplicate mapping are dropped from analysis. In first-touch mode, units use their first exposure record and merge data from all mapped secondary IDs.
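Back-attribution of a sparse secondary ID can be sketched as follows. The function name `back_attribute` and the record keys are hypothetical, and choosing the first secondary ID seen for a primary ID is an assumption made for illustration only; the source does not say how conflicts are resolved.

```python
def back_attribute(records):
    """Fill null secondary IDs using any secondary ID seen for the same primary ID.

    records: list of dicts with keys 'primary_id' and 'secondary_id'
    (secondary_id may be None). Returns a new list; the input is not mutated.
    """
    known = {}
    for record in records:
        if record["secondary_id"] is not None:
            # Assumption: the first secondary ID seen for a primary ID is used.
            known.setdefault(record["primary_id"], record["secondary_id"])

    filled = []
    for record in records:
        secondary_id = record["secondary_id"]
        if secondary_id is None:
            secondary_id = known.get(record["primary_id"])  # stays None if never identified
        filled.append({**record, "secondary_id": secondary_id})
    return filled

# Hypothetical exposure rows: s1 is identified on its second record only.
sparse_rows = [
    {"primary_id": "s1", "secondary_id": None},
    {"primary_id": "s1", "secondary_id": "u1"},
    {"primary_id": "s2", "secondary_id": None},
]
filled_rows = back_attribute(sparse_rows)
```

Here both `s1` rows end up with `u1`, and `s2` remains unidentified.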
Mapping Changes
If a change is made to the entity property source or assignment source’s definition or underlying data, that will be reflected on the next reload. This is why a full reload is required, since otherwise historical changes to the mapping can lead to inconsistent data on incremental reloads or explore queries.
Best Practices
We strongly recommend using an Entity Property Source to provide a cleaned unit mapping from your warehouse. However, you can also provide mappings on your exposure source by logging multiple identifiers in the exposure data; Statsig will greedily use this to match across identifiers.
For both approaches, an experiment can currently only have one mapped ID type, e.g. secondary_id->user_id, or secondary_id->account_id, but not both. All modes require a full reload, so that there is no data inconsistency due to historical mappings being changed or new mappings being introduced.
The property source or assignment source used to provide mappings will be filtered to records within the experiment’s date range. If a mapping is “evergreen”, or not scoped to a specific time period, you can omit the timestamp on the entity property source.
Example of a supported schema
If your assignment source data contains:
{stableID: 'unknown_123', exp_id: 'PDP Test', test_group: 'Control'}
and your metric sources contain data that represents a metric as:
{userID: 'known_abc', event: 'page_load'}
then your Entity Property Source or assignment source must contain the secondary identity (in this case, userID) that will enable Statsig to join your assignment data with your metric data:
{stableID: 'unknown_123', userID: 'known_abc', country: 'USA'}
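As a rough illustration of the join this schema enables, the sketch below uses the three example records above as Python dicts. The function name `join_metrics_to_exposures` is hypothetical, and the logic is a simplified stand-in for the join Statsig runs in your warehouse.

```python
assignments = [{"stableID": "unknown_123", "exp_id": "PDP Test", "test_group": "Control"}]
metrics = [{"userID": "known_abc", "event": "page_load"}]
mapping = [{"stableID": "unknown_123", "userID": "known_abc", "country": "USA"}]

def join_metrics_to_exposures(assignments, mapping, metrics):
    """Attach each metric event to the exposed stableID via the mapping table."""
    stable_for_user = {row["userID"]: row["stableID"] for row in mapping}
    group_for_stable = {row["stableID"]: row["test_group"] for row in assignments}

    joined = []
    for event in metrics:
        stable_id = stable_for_user.get(event["userID"])
        if stable_id in group_for_stable:  # skip events from unexposed or unmapped users
            joined.append({
                "stableID": stable_id,  # the unit of analysis stays the primary ID
                "test_group": group_for_stable[stable_id],
                "event": event["event"],
            })
    return joined

joined_rows = join_metrics_to_exposures(assignments, mapping, metrics)
```

The `page_load` event logged under `known_abc` is credited to `unknown_123` in the Control group; without the mapping row, the event would have no exposure to join to.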
Considerations
Deduplicating records can lead to biased results, so Statsig performs two extra health checks on this kind of experiment.
- Statsig will check your deduplication rate and warn you if it is unusually high. It’s expected that some secondary IDs will have multiple logged-out IDs due to users using different devices or clearing browser history.
- Statsig will perform a chi-squared test evaluating whether the deduplication rate is identical across arms of the experiment. In some cases, an experiment may cause more users to come back (for example, an email resurrection campaign), in which case duplicates are expected to be more frequent in that arm and can be a positive outcome. In this case, you can use first-touch attribution to maintain a common identifier.
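To make the second check concrete, here is a minimal sketch of a Pearson chi-squared test of whether the deduplication rate is the same across arms. The source does not describe how Statsig constructs its test, so this is a standard textbook version under that assumption; the function names and the counts are hypothetical.

```python
import math

def dedup_chi_squared(arms):
    """Pearson chi-squared statistic for equal deduplication rates across arms.

    arms: dict {arm_name: (deduplicated_units, total_units)}.
    """
    total_dup = sum(dup for dup, _ in arms.values())
    total_units = sum(n for _, n in arms.values())
    pooled_rate = total_dup / total_units

    chi2 = 0.0
    for dup, n in arms.values():
        cells = (
            (dup, n * pooled_rate),            # deduplicated units
            (n - dup, n * (1 - pooled_rate)),  # non-deduplicated units
        )
        for observed, expected in cells:
            if expected > 0:  # avoid dividing by zero when the pooled rate is 0 or 1
                chi2 += (observed - expected) ** 2 / expected
    return chi2

def p_value_two_arms(chi2):
    """Upper-tail p-value for 1 degree of freedom (valid for two arms only)."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts: the test arm deduplicates twice as often as control.
chi2_stat = dedup_chi_squared({"control": (50, 1000), "test": (100, 1000)})
p_value = p_value_two_arms(chi2_stat)
```

A small p-value indicates that the arms deduplicate at different rates, which is the situation the health check is meant to flag.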