Sequential testing ab testing

7/3/2023

When a lot of teams experiment in the same proximity, there’s a risk of interaction effects. For this reason many experiments need to run in an exclusive manner, where a user can only be in one of a set of experiments that can potentially impact each other. Currently only experiments in a single domain can be exclusive to each other. We’re planning to decouple exclusivity from the domain concept, to allow for experiments across domains to also be exclusive to each other.

We implement holdbacks (the practice of exempting a set of users from experiments and new features, in order to see long-term effects and combined evaluation) in domains. Users in these holdbacks are exempt from the general experimentation that happens in the domain.

At Spotify we have established a pattern where at the start of a quarter, we create a new holdback. Experiments that run throughout the quarter will never be assigned to any of the users subject to the holdback. When the quarter ends, a single test is run on these users where the combined experience of all (successful) experiments is given to the treatment group. This way we can get a read for the compound effect of everything the team decided to ship during the quarter. Once this test is done, the holdback is released and these users will go into new experiments.

The Salt Machine

At Spotify, autonomous teams are free to move at schedules that fit them best. This means that they need to be able to start and stop experiments at any time. With requirements of exclusivity and holdbacks, assigning users to experiments gets quite complex if we do not want to compromise on randomization (and we do not want to). We have developed something we call the “salt machine” that automatically reshuffles users without the need to stop all experiments. This is done by hashing users into buckets using a tree of “salts” (it’s worth noting that if two experiments are disjoint because of targeting, we do not have to use the same salt tree for them).

For this article, imagine that we split users into 8 buckets. A user ends up in bucket 1 if HASH(user id, SALT) % 8 = 1, and so forth. In the image below, experiment E1 has been allocated buckets 0 and 1. Note that we also have a per-experiment salt to spread users from the allocated buckets over treatments, but for simplicity we omit that from the images in this article.

Since we’re experimenting a lot, most buckets are always allocated to an experiment. So what happens when two experiments (E1 and E2) end, releasing some space that can be allocated to a new experiment?

Now 50% of the buckets are free and we want to start E4, which needs 25% of the population. If we were to pick only buckets 0 and 1, we would have a 100% overlap with experiment E1 that just ended, which might lead to biased results due to carryover effects. How can we allocate buckets safely without jeopardizing randomization? What if we shuffled the free users into new buckets using a new salt?

Now we have eight free buckets we can use for experiments. But because of the dilution of bucket size (those new eight buckets only get 50% of the traffic), we need to allocate four of them to E4 to get 25% of the population. We call the amount of required overallocation of buckets the “compensation factor” - and in this case it’s 1/0.50 = 2.

The remaining four buckets can be allocated to some other experiment. When experiment E3 ends we can completely get rid of salt 1 - but because we diluted the buckets, the released space cannot be used until E4 finishes. In effect, we’re wasting 50% of the users. The compensation factor changes all the time as experiments start and end.
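The bucketing rule HASH(user id, SALT) % 8 and the compensation factor described in this article can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Spotify's implementation: the use of SHA-256, the salt strings "salt-0" and "salt-1", the synthetic user ids, and the choice of buckets 0 to 3 as the free ones are all hypothetical.

```python
import hashlib
import math

NUM_BUCKETS = 8


def bucket(user_id: str, salt: str, num_buckets: int = NUM_BUCKETS) -> int:
    """HASH(user id, SALT) % num_buckets, with SHA-256 as an assumed hash."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def compensation_factor(free_fraction: float) -> float:
    """Overallocation needed when reshuffled buckets only see part of the traffic."""
    if not 0 < free_fraction <= 1:
        raise ValueError("free_fraction must be in (0, 1]")
    return 1 / free_fraction


def buckets_needed(target_fraction: float, free_fraction: float,
                   num_buckets: int = NUM_BUCKETS) -> int:
    """Number of reshuffled buckets that yields target_fraction of all users."""
    raw = target_fraction * compensation_factor(free_fraction) * num_buckets
    return math.ceil(round(raw, 9))  # round first to absorb float noise


# Assumption: buckets 0-3 under the first salt were released (50% of users).
FREE_BUCKETS = {0, 1, 2, 3}
# E4 wants 25% of the population, so it takes 4 of the 8 reshuffled buckets.
E4_BUCKETS = set(range(buckets_needed(0.25, len(FREE_BUCKETS) / NUM_BUCKETS)))


def in_e4(user_id: str) -> bool:
    """A user is in E4 if free under the first salt and in an E4 bucket under the second."""
    if bucket(user_id, "salt-0") not in FREE_BUCKETS:
        return False  # still allocated to a running experiment
    return bucket(user_id, "salt-1") in E4_BUCKETS


def e4_share(n_users: int = 100_000) -> float:
    """Fraction of a synthetic population that ends up in E4."""
    return sum(in_e4(f"user-{i}") for i in range(n_users)) / n_users
```

Calling `e4_share()` should return a value close to 0.25. Because the second salt is independent of the first, E4's users are drawn from all the released buckets rather than only from those E1 used.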
