How to pick a stable 5% of users, requests, or conversations without storing assignments

Jun 28, 2026

Tags: hashing, deterministic-bucketing, feature-flags, sampling

A beginner-friendly explanation of deterministic bucketing: a common pattern behind gradual rollouts, feature flags, A/B tests, and sampling.

1. The problem: choosing a small slice consistently

In product and infrastructure work, you often need to choose only a small part of a larger population. For example:

Show a new feature to a small percentage of users first
Put 10% of users into an experiment group
Run expensive logging on only 1% of requests
Send a small slice of conversations through a new AI model in the background

These are all versions of the same problem:

From many entities, choose about 5% or 10% in a stable, repeatable way.

This article uses customer-support conversations as the concrete example. The same idea works for users, accounts, workspaces, requests, and other IDs.

Imagine a customer-support system. A new AI model reads support conversations and predicts things like category, urgency, or suggested next action.

Before trusting that model in production, you may want to run it carefully:

Start in shadow mode: the model observes conversations and produces labels, but its output does not change the actual user experience yet
Send only about 5% of conversations to it at first
If the results look good, increase that to 25%, then 50%, then 100%

This is a gradual rollout. The same idea appears in feature flag systems, experiments, and A/B testing.

At first, the sampling part sounds trivial. Pick 5% of conversations at random and call it done.

But the details matter.

2. What we actually need

The goal is not just "return true with 5% probability." We need a few stronger properties.

The selected share should be about 5%.

If there are 100 conversations, about 5 should be selected. If there are 100,000, about 5,000 should be selected.

The same conversation should get the same answer every time.

A conversation may be processed more than once: when it is created, when the customer replies, when an agent updates it, and so on.

If the conversation is selected once and not selected the next time, the collected data becomes incomplete.

Increasing the percentage should only add conversations.

If you move from 5% to 25%, the conversations that were already in the first 5% should stay in. The rollout should add more conversations, not reshuffle the whole population.

We should not have to store one assignment per conversation.

You could store a database flag saying "this conversation was selected." That works, but it adds storage and write paths for something that should be cheap.

The pattern that satisfies these requirements is often called deterministic bucketing, sticky bucketing, or percentage rollout bucketing.

3. Why `Math.random()` is not enough

The first version many people reach for looks like this:


    if (Math.random() < 0.05) {
      // sample this conversation
    }

This is correct only in a narrow sense: each time this line runs, it has a 5% chance of entering the block.

The problem is that Math.random() gives a new value every time it is called.

For one conversation, you might get:


initial handling : selected

reply #1         : not selected

reply #2         : selected

That is not what we want. We want the whole conversation to be either in or out.

You could fix this by calling Math.random() once and saving the result. But then you need to store that assignment for every conversation.

We want the same stability without storing the assignment.
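For contrast, the "call it once and save the result" approach might look like the sketch below. The in-memory `Map` is only a stand-in for a database table, and `isSampledStored` is a hypothetical name. The point is that stability here comes entirely from the stored row.

```typescript
// Hypothetical stand-in for a database table of saved assignments.
const assignments = new Map<string, boolean>();

function isSampledStored(conversationId: string, rate: number): boolean {
  const saved = assignments.get(conversationId);
  if (saved !== undefined) {
    return saved; // Reuse the earlier decision so the conversation stays in or out.
  }
  const decision = Math.random() < rate; // Rolled only once per conversation.
  assignments.set(conversationId, decision); // One stored entry per conversation.
  return decision;
}

const first = isSampledStored("conversation_12345", 0.05);
const again = isSampledStored("conversation_12345", 0.05);
console.log(first === again); // true: stable, but only because the result was stored
console.log(assignments.size); // 1: storage grows with the number of conversations
```

Every process that handles the conversation would also need access to the same store, which is the extra write path the rest of this article avoids.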

4. The idea: give every ID a stable bucket

If we want the same answer every time without storage, we need a function that turns the same input into the same output every time.

For example:


conversation_12345  ->  42

conversation_12346  ->  7

conversation_12347  ->  81

That number is called a bucket. You can think of it as a numbered slot.

To keep the explanation simple, imagine there are 100 buckets, numbered 0 through 99.


0  1  2  3  4  5  6  ...  97  98  99

To sample about 5%, select conversations whose bucket is less than 5.


selected

|-------------|

0  1  2  3  4  5  6  ...  97  98  99

|---- < 5 ----|

To roll out to 25%, move the threshold to 25.


selected

|---------------------------------------------|

0  1  2  3  4  5  6  ...  23  24  25  ...  99

|------------------- < 25 --------------------|

The bucket for each conversation never changes. Only the threshold changes.

| Conversation bucket | at 5% (< 5) | at 25% (< 25) | at 50% (< 50) |
| --- | --- | --- | --- |
| 3 | in | in | in |
| 7 | out | in | in |
| 42 | out | out | in |
| 81 | out | out | out |

This gives us three important properties:

The same conversation ID always maps to the same bucket
Raising the rollout percentage only adds conversations
The bucket can be recomputed from the ID, so there is no assignment to store
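A minimal sketch of the threshold idea, before any hashing is involved. The bucket numbers are hard-coded and purely illustrative (the first three reuse the example mapping above, the fourth ID is made up), and `selectedAt` is a hypothetical helper name.

```typescript
// Hypothetical, hard-coded buckets (0..99) for illustration; real buckets come from a hash.
const exampleBuckets: Array<[string, number]> = [
  ["conversation_12345", 42],
  ["conversation_12346", 7],
  ["conversation_12347", 81],
  ["conversation_12348", 3],
];

function selectedAt(percent: number): string[] {
  // With 100 buckets, the threshold equals the percentage.
  return exampleBuckets
    .filter(([, bucket]) => bucket < percent)
    .map(([id]) => id);
}

console.log(selectedAt(5));  // ["conversation_12348"]
console.log(selectedAt(25)); // ["conversation_12346", "conversation_12348"]
console.log(selectedAt(50)); // ["conversation_12345", "conversation_12346", "conversation_12348"]
```

Each larger threshold returns a superset of the previous result, because the buckets themselves never move.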

Now we need a good way to turn IDs into buckets.

That is where hashing comes in.

5. What a hash function does

A hash function turns an input into a fixed-size number.


"conversation_12345"  ->  3287581120

"conversation_12346"  ->  1529084417

"conversation_12347"  ->  409817263

Two properties matter for this use case:

The same input always produces the same output.
Similar inputs should spread out across the output range.

conversation_12345 and conversation_12346 look almost identical. Only the last character differs.

A useful hash function should not put those two values next to each other just because the inputs look similar. It should mix the input enough that the outputs are spread around.

That is exactly what we need for rollout sampling:

Same conversation ID -> same hash
Sequential or patterned IDs -> values that look spread out
Hash value modulo the bucket count -> a stable bucket

6. A practical implementation

Using only 100 buckets is easy to explain, but it is coarse. It works for 5% or 25%, but not for 0.1% or 12.34%.

In practice, it is common to use many more buckets. This example uses 100_000.

5_000 out of 100_000 buckets is 5%
25_000 out of 100_000 buckets is 25%
12_340 out of 100_000 buckets is 12.34%

Here is one TypeScript implementation using FNV-1a, a small non-cryptographic hash:


    const BUCKET_COUNT = 100_000;
    const encoder = new TextEncoder();

    // 32-bit FNV-1a over the UTF-8 bytes of the input.
    function fnv1a32(input: string): number {
      let hash = 0x811c9dc5; // offset basis, 2166136261
      for (const byte of encoder.encode(input)) {
        hash ^= byte;
        hash = Math.imul(hash, 0x01000193); // FNV prime, 16777619
      }
      return hash >>> 0; // reinterpret the 32 bits as unsigned
    }

    // Stable bucket in 0..BUCKET_COUNT-1 for this entity within this rollout.
    function bucketForRollout(rolloutKey: string, entityId: string): number {
      return fnv1a32(`${rolloutKey}:${entityId}`) % BUCKET_COUNT;
    }

    function isInRollout(
      rolloutKey: string,
      entityId: string,
      percent: number,
    ): boolean {
      // Math.round rather than Math.floor: floating-point error can leave the
      // product a hair below the intended integer.
      const threshold = Math.round((percent / 100) * BUCKET_COUNT);
      return bucketForRollout(rolloutKey, entityId) < threshold;
    }

Usage:


    const enabled = isInRollout(
      "support-model-v2-shadow-mode",
      conversation.id,
      5,
    );

This checks whether conversation.id is in the first 5% of buckets for the rollout named "support-model-v2-shadow-mode".
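One way to convince yourself that the requirements from section 2 hold is to run the function over a batch of IDs. The sketch below repeats the implementation so it runs on its own. The 100,000 synthetic IDs are an assumption made only for this check, and the exact share printed will depend on the IDs you feed in.

```typescript
const BUCKET_COUNT = 100_000;
const encoder = new TextEncoder();

function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (const byte of encoder.encode(input)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

function isInRollout(rolloutKey: string, entityId: string, percent: number): boolean {
  const threshold = Math.round((percent / 100) * BUCKET_COUNT);
  return fnv1a32(`${rolloutKey}:${entityId}`) % BUCKET_COUNT < threshold;
}

// Synthetic IDs, used only to exercise the function.
const ids = Array.from({ length: 100_000 }, (_, i) => `conversation_${i}`);
const key = "support-model-v2-shadow-mode";

const at5 = ids.filter((id) => isInRollout(key, id, 5));
const at25 = new Set(ids.filter((id) => isInRollout(key, id, 25)));

// Every conversation selected at 5% must still be selected at 25%.
const dropped = at5.filter((id) => !at25.has(id));
console.log(dropped.length); // 0

// The observed share should be close to 5%, not exactly 5%.
console.log(`${((at5.length / ids.length) * 100).toFixed(2)}% selected at 5%`);
```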

The rolloutKey is important.

7. Why include a `rolloutKey`?

If you hash only the conversation ID, every feature will pick exactly the same 5% of conversations.


feature A's 5%: conversation_1, conversation_9, conversation_20, ...

feature B's 5%: conversation_1, conversation_9, conversation_20, ...

Sometimes that may be what you want. For example, you may intentionally want a fixed internal test group.

But usually, different rollouts should have independent assignments. The first 5% for one feature does not need to be the same first 5% for another feature.

That is why the rollout name, flag key, or experiment key should be part of the hash input.


"support-model-v2-shadow-mode:conversation_12345"

"new-routing-rule:conversation_12345"

The conversation ID is the same, but the full input string is different, so the bucket can be different.

This plays a similar role to a seed or salt.
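The effect of the key can be measured with a rough sketch like the one below. The synthetic IDs and the helper name `inFirst5Percent` are assumptions for illustration. The overlap count is printed rather than predicted, because it depends on the hash and the IDs.

```typescript
const BUCKET_COUNT = 100_000;
const encoder = new TextEncoder();

function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (const byte of encoder.encode(input)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Hypothetical helper: is this exact hash input in the first 5% of buckets?
function inFirst5Percent(hashInput: string): boolean {
  return fnv1a32(hashInput) % BUCKET_COUNT < 5_000;
}

const ids = Array.from({ length: 100_000 }, (_, i) => `conversation_${i}`);

// Without a key, every feature hashes the same string and gets the same slice.
const unkeyed = ids.filter((id) => inFirst5Percent(id));

// With a key in the hash input, each rollout gets its own slice.
const featureA = new Set(ids.filter((id) => inFirst5Percent(`support-model-v2-shadow-mode:${id}`)));
const featureB = ids.filter((id) => inFirst5Percent(`new-routing-rule:${id}`));
const overlap = featureB.filter((id) => featureA.has(id)).length;

console.log(`unkeyed: ${unkeyed.length} selected, identical for every feature`);
console.log(`keyed: A=${featureA.size}, B=${featureB.length}, in both=${overlap}`);
```

If the two keyed slices behaved independently, you would expect roughly 5% of 5% of all IDs (about 0.25%) to land in both.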

8. Which ID should you hash?

The choice of ID determines what "stable" means.

In the example above, we hash conversation.id. That means the assignment is stable per conversation.

The same user may have one conversation selected and another conversation not selected:


user_123's conversation_A: selected

user_123's conversation_B: not selected

That is fine if the thing you are measuring is conversation-level behavior.

For a user-facing feature, you may want to hash user.id instead. That keeps the experience stable for the user.


isInRollout("new-inbox-ui", user.id, 10);

For B2B products, you may even want account-level or workspace-level stability:


isInRollout("new-admin-dashboard", workspace.id, 10);

This is not just an implementation detail. It is a product decision.

Use conversation.id when each conversation should be independently assigned
Use user.id when the user's experience should be consistent
Use workspace.id or account.id when a whole organization should move together

If you pick the wrong unit, the code can be technically correct but operationally awkward.
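One way to make that decision explicit in code is to name the unit and derive the hashed ID from it. This is a sketch under assumptions: the `Conversation` shape, the `RolloutUnit` type, and `entityIdFor` are all hypothetical names, not part of any particular flag system.

```typescript
const BUCKET_COUNT = 100_000;
const encoder = new TextEncoder();

function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (const byte of encoder.encode(input)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

function isInRollout(rolloutKey: string, entityId: string, percent: number): boolean {
  const threshold = Math.round((percent / 100) * BUCKET_COUNT);
  return fnv1a32(`${rolloutKey}:${entityId}`) % BUCKET_COUNT < threshold;
}

// Hypothetical shape; the field names are assumptions for this sketch.
interface Conversation {
  id: string;
  userId: string;
  workspaceId: string;
}

type RolloutUnit = "conversation" | "user" | "workspace";

// Pick which ID gets hashed, so the unit of stability is a visible choice.
function entityIdFor(conversation: Conversation, unit: RolloutUnit): string {
  switch (unit) {
    case "conversation":
      return conversation.id;
    case "user":
      return conversation.userId;
    case "workspace":
      return conversation.workspaceId;
  }
}

const convA: Conversation = { id: "conversation_A", userId: "user_123", workspaceId: "workspace_9" };
const convB: Conversation = { id: "conversation_B", userId: "user_123", workspaceId: "workspace_9" };

// Hashing the user ID gives both conversations of user_123 the same decision.
const sameForUser =
  isInRollout("new-inbox-ui", entityIdFor(convA, "user"), 10) ===
  isInRollout("new-inbox-ui", entityIdFor(convB, "user"), 10);
console.log(sameForUser); // true
```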

9. Why not use a simpler calculation?

At this point, it is fair to ask:

Why use a hash at all? Why not just use the last two digits of an ID?

For a numeric ID, that might look like:


const bucket = Number(conversationId) % 100;

If IDs are perfectly random, this may be fine.

But production IDs are often not random. They may be sequential, time-based, or shaped by database internals.

With sequential IDs, a simple modulo can keep the original pattern visible:


ID     bucket

1000   0

1001   1

1002   2

1003   3

...

Over a long enough period, this may look balanced. But over short windows, it can correlate with creation time. That may matter if traffic changes during the day, an incident creates a burst of conversations, or a particular customer cohort arrives together.
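The short-window effect is easy to reproduce. The sketch below assumes 200 sequential numeric IDs starting at 1000, as an auto-increment column might produce, and applies the "last two digits" rule with a 5% threshold.

```typescript
// 200 sequential numeric IDs (an assumption for this illustration).
const sequentialIds = Array.from({ length: 200 }, (_, i) => 1000 + i);

// "Last two digits" bucketing with a 5% threshold.
const naiveSelected = sequentialIds.filter((id) => id % 100 < 5);

console.log(naiveSelected);
// [1000, 1001, 1002, 1003, 1004, 1100, 1101, 1102, 1103, 1104]
```

The overall share is exactly 5%, but the selection arrives as a burst of five consecutive conversations followed by 95 that are all skipped. Whatever was happening while those five were created is over-represented.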

Naive transformations can be worse. For example, adding the digits of an ID keeps a lot of structure from the input:


conversation ID       digit sum

900000000000000        9

900000000000001       10

900000000000002       11

900000000000003       12

900000000000004       13

The output still follows the input pattern.

A hash is used to break that structure.


conversation ID       bucket after hashing

900000000000000        32

900000000000001        51

900000000000002        70

900000000000003        89

900000000000004        56

The goal is not cryptographic secrecy. The goal is to avoid obvious clustering and correlations in the buckets.

10. What FNV-1a is doing

The fnv1a32 function above is an implementation of FNV-1a, a well-known non-cryptographic hash.

The core loop is small:


hash ^= byte;

hash = Math.imul(hash, 0x01000193);

For each byte of the input:

XOR the byte into the current hash value
Multiply the hash by a fixed constant

Repeated over the whole input, this mixes each byte into the final 32-bit value.

One way to think about it:


start with a fixed value

-> mix in byte 1 -> stir

-> mix in byte 2 -> stir

-> mix in byte 3 -> stir

-> ...

After all bytes are processed, the final 32-bit value is mapped into a bucket:


bucket = hash % BUCKET_COUNT;

FNV-1a is not a cryptographic hash. Do not use it for passwords, signatures, or tamper detection.

For stable traffic splitting, though, it is often a reasonable choice: simple, fast, deterministic, and well understood.
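To watch the running state change, here is a sketch of the same loop with logging added. `fnv1a32Trace` and `hex32` are hypothetical helper names. The final value for the input "a" can be checked by hand from the two steps above.

```typescript
const encoder = new TextEncoder();

// Format a 32-bit value as 8 hex digits.
function hex32(value: number): string {
  return ("00000000" + (value >>> 0).toString(16)).slice(-8);
}

function fnv1a32Trace(input: string): number {
  let hash = 0x811c9dc5; // start with a fixed value
  console.log(`start      ${hex32(hash)}`);
  for (const byte of encoder.encode(input)) {
    hash ^= byte; // mix in the byte
    hash = Math.imul(hash, 0x01000193); // stir
    console.log(`byte 0x${byte.toString(16)}  ${hex32(hash)}`);
  }
  return hash >>> 0;
}

const traced = fnv1a32Trace("a");
console.log(hex32(traced)); // e40c292c
```

An empty input never enters the loop, so it returns the starting value unchanged.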

11. What the XOR is for

It is worth looking at hash ^= byte with actual numbers.

First, hash is not "the original ID as a number." It is the running state built from the characters seen so far.


initial value

-> state after reading byte 1

-> state after reading byte 2

-> state after reading byte 3

-> ...

FNV-1a uses XOR to put the next byte into that running state.

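The ^ operator is XOR, a bitwise operation. For one bit, the rule is:

| a | b | a ^ b |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

In words: the result is 1 when the two bits differ, and 0 when they match.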

That is hard to feel from the table alone, so here is the same operation with decimal numbers and binary side by side.

Suppose the low 8 bits of the current state are 177, and the next character is "A". In ASCII, "A" is 65.


hash = 177 = 10110001

byte =  65 = 01000001

--------

XOR  = 240 = 11110000

177 ^ 65 is 240.

In terms of powers-of-two places, this means:

The 64 place was 0 in hash and 1 in byte, so it became 1
The 1 place was 1 in both, so it became 0
The other places stayed the same because the byte had 0 there

So XOR is not adding 65.


addition: 177 + 65 = 242

XOR     : 177 ^ 65 = 240

The same "A" byte can make the value go up or down depending on the current state.


177 ^ 65 = 240  (goes up)

241 ^ 65 = 176  (goes down)

The point of "flipping" is not that we want flipped bits as the final result. The point is simpler: the next byte should change the running state.

With addition, "A" is treated as the number 65 and added. With XOR, "A" is treated as the bit pattern 01000001, which switches parts of the current state.


addition: use the byte as a number to add

XOR     : use the byte as a pattern of switches

But XOR alone would be weak.

Suppose the starting state is 177, and we feed "A" (65) and "B" (66) using only XOR.


"A" then "B": 177 ^ 65 ^ 66 = 178

"B" then "A": 177 ^ 66 ^ 65 = 178

The order does not matter. XOR alone can tell that the same ingredients were used, but it ignores their order completely.

That is why FNV-1a multiplies immediately after the XOR. Here is a tiny 8-bit toy version using multiplier 5 and wrapping into the range 0..255.


    start = 177

    "A" then "B":
      177 ^ 65 = 240
      240 * 5  = 176  (after wrapping at 256)
      176 ^ 66 = 242
      242 * 5  = 186  (after wrapping at 256)

    "B" then "A":
      177 ^ 66 = 243
      243 * 5  = 191  (after wrapping at 256)
      191 ^ 65 = 254
      254 * 5  = 246  (after wrapping at 256)

With XOR alone, both orders ended at 178. With multiplication between bytes, "A" then "B" ends at 186, while "B" then "A" ends at 246.

That is the basic shape of FNV-1a:


change the running state with the next byte -> XOR

move the state before the next byte arrives -> multiply

Real FNV-1a uses 32 bits instead of 8, and the multiplier is 16,777,619 instead of 5. The shape is the same.
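The 8-bit toy can be written out directly. This sketch uses the same made-up parameters as the walkthrough (start value 177, multiplier 5, wrap at 256). It is not a real hash, and the function names are hypothetical.

```typescript
// XOR only: each byte flips bits in the state, nothing else happens.
function xorOnly8(input: string, start: number = 177): number {
  let state = start;
  for (let i = 0; i < input.length; i++) {
    state ^= input.charCodeAt(i); // ASCII input only in this toy
  }
  return state;
}

// XOR, then multiply by 5 and wrap into 0..255 before the next byte.
function toyHash8(input: string, start: number = 177): number {
  let state = start;
  for (let i = 0; i < input.length; i++) {
    state ^= input.charCodeAt(i);
    state = (state * 5) % 256;
  }
  return state;
}

console.log(xorOnly8("AB"), xorOnly8("BA")); // 178 178
console.log(toyHash8("AB"), toyHash8("BA")); // 186 246
```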

Could we use addition or subtraction instead of XOR?


hash = Math.imul(hash, 0x01000193) + byte;

Yes. If the only goal is "change the running state using the next byte," addition and subtraction can do that too.


hash = hash + byte;

hash = hash - byte;

hash = hash ^ byte;

All three change hash based on the next byte.

The difference is how they change it.


addition   : use the byte as a number to add

subtraction: use the byte as a number to subtract

XOR        : use the byte as a 0/1 switch pattern

You can build a hash in the form hash = hash * K + byte. Some hash functions use that kind of idea.

But that is a different hash design, not FNV-1a.

FNV-1a does not add the next byte as a number. It uses the byte as a 0/1 switch pattern to change the running state, then multiplies immediately to move that state around.

This does not mean XOR is always better than addition or subtraction. XOR is not magic by itself.

The important point is the division of roles in FNV-1a:

XOR: use the next byte to change the running state
Multiplication: move the changed state before the next byte arrives

If you replace XOR with addition, you have created a different hash. That may be fine, but it is no longer FNV-1a, and you would need to evaluate that design on its own: distribution, collisions, and behavior on sequential or patterned IDs.

What about multiplication alone?


hash = Math.imul(hash, 0x01000193);

That does not work by itself because the input byte never enters the state. With the same initial value, multiplying the same number of times gives the same result, so the hash would depend only on the input's length, not on its content.

What if the ID is numeric and we multiply the original ID directly?


const bucket = (Math.imul(Number(id), 0x01000193) >>> 0) % BUCKET_COUNT;

That is still not a very good general hash.

First, IDs are often strings. They may look like conversation_12345, include prefixes, or be UUIDs. Even if an ID is numeric, it may be too large to represent safely as a JavaScript Number.

Second, multiplying sequential IDs by a constant keeps a lot of structure.


id      bucket for id * K

1000    some position

1001    that position plus K

1002    that position plus K again

1003    that position plus K again

The order may look more shuffled, but neighboring IDs still have a fixed relationship.
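That fixed relationship can be checked directly. The sketch below looks at the 32-bit products before they are reduced to buckets, using a handful of example IDs chosen only for illustration.

```typescript
const K = 0x01000193; // 16,777,619

const steps: number[] = [];
for (let id = 1000; id < 1005; id++) {
  const here = Math.imul(id, K) >>> 0;
  const next = Math.imul(id + 1, K) >>> 0;
  steps.push((next - here) >>> 0); // difference, wrapped to 32 bits
}

console.log(steps);
// [16777619, 16777619, 16777619, 16777619, 16777619]
```

Every neighboring pair is exactly K apart, modulo 2^32. The sequence is stretched out, but it is still an evenly spaced sequence.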

A hash reads the whole ID as bytes, puts each byte into the running state, and updates the state after each byte. Prefixes, suffixes, separators, and digits all influence the final value.

12. The multiplication and the clock analogy

This line can also look mysterious:


hash = Math.imul(hash, 0x01000193);

0x01000193 is hexadecimal. In decimal, it is 16,777,619. So the line means: multiply the current hash by 16,777,619.

But the goal is not to make the number larger and larger.

The goal is to move the running state to another position inside a 32-bit range, wrapping around when it overflows.

A clock is a useful analogy.

On a 12-hour clock:


10 o'clock + 5 hours = 3 o'clock

It does not become 15 o'clock. It wraps around after 12.

A 32-bit integer works the same way, except the clock has 2^32 positions. If the result goes past the end, it wraps around.

Here is a tiny 8-bit example. An 8-bit value has the range 0..255, so it wraps at 256.


240 * 5 = 1200

1200 % 256 = 176

So in an 8-bit world, 240 * 5 becomes 176, not 1200.


240 -> 176

It is better to think of this as moving to another clock position, not growing the number.

Real FNV-1a does the same thing in 32 bits:


hash * 16,777,619

but wrapped modulo 2^32

That is why the code uses Math.imul.

JavaScript's normal numbers are Number values, implemented as floating-point values. If you use normal *, very large integer products can lose exact low-bit information.

This multiplication can get very large:


roughly at most:

4,294,967,295 * 16,777,619

≈ 72,000,000,000,000,000

That is beyond 2^53 (about 9,007,000,000,000,000), the limit up to which a JavaScript Number can represent every integer exactly.

But FNV-1a does not need the full huge product. It needs the low 32 bits after 32-bit wraparound.

Math.imul(a, b) is made for that:

Treat a and b as 32-bit integers
Multiply them as 32-bit integers
Return the low 32 bits of the result
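A small comparison makes the difference visible. The operands below are the worst case from the estimate above, and the variable names are only for this sketch.

```typescript
const state = 0xffffffff; // largest possible 32-bit state
const prime = 0x01000193; // 16,777,619

// Plain multiplication goes through a 64-bit float, which cannot hold this product exactly.
const viaFloat = (state * prime) % 0x100000000;

// Math.imul keeps exactly the low 32 bits; >>> 0 shows them as an unsigned value.
const viaImul = Math.imul(state, prime) >>> 0;

console.log(Number.isSafeInteger(state * prime)); // false
console.log(viaImul); // 4278189677, which is 2^32 - 16,777,619
console.log(viaFloat === viaImul); // false: the float product lost its low bits
```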

So this line is the JavaScript way to write the 32-bit wrapping multiplication that FNV-1a expects:


hash = Math.imul(hash, 0x01000193);

Conceptually, it means:


hash = (hash * 16,777,619) wrapped modulo 2^32

The final line is related:


return hash >>> 0;

Math.imul returns a value that JavaScript views as a signed 32-bit integer. That means the same 32 bits may appear as a negative number.


Math.imul(0xffffffff, 5);        // -5

Math.imul(0xffffffff, 5) >>> 0;  // 4294967291

The bits are the same. >>> 0 reinterprets those 32 bits as an unsigned integer, which is convenient before applying % BUCKET_COUNT.
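Skipping that step is more than cosmetic. In JavaScript, % keeps the sign of the left operand, so a negative hash would produce a negative bucket. A short sketch:

```typescript
const BUCKET_COUNT = 100_000;

const signedHash = Math.imul(0xffffffff, 5); // -5
const signedBucket = signedHash % BUCKET_COUNT;
const unsignedBucket = (signedHash >>> 0) % BUCKET_COUNT;

console.log(signedBucket);   // -5
console.log(unsignedBucket); // 67291

// A negative bucket is below every threshold, even a threshold of 0.
console.log(signedBucket < 0); // true
```

With signed values, such an entity would count as selected at every rollout percentage, including 0%.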

FNV-1a uses the multiplier 16777619, which is odd. The 32-bit wraparound size, 2^32, is made only of factors of 2. An odd multiplier shares no factor with it.

That matters because multiplication by an odd number modulo 2^32 is a permutation of the 32-bit values. It does not immediately collapse everything into a smaller subset of values.
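The same property can be checked exhaustively in the 8-bit toy world, where there are only 256 values to try. `distinctOutputs` is a hypothetical helper for this sketch.

```typescript
// Count how many different results x * multiplier (mod 256) produces over all 8-bit x.
function distinctOutputs(multiplier: number): number {
  const seen = new Set<number>();
  for (let x = 0; x < 256; x++) {
    seen.add((x * multiplier) % 256);
  }
  return seen.size;
}

console.log(distinctOutputs(5)); // 256: odd multiplier, every value is still reachable
console.log(distinctOutputs(4)); // 64: even multiplier, three quarters of the values are lost
```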

That does not mean "odd multiplier equals perfect distribution." The useful spread comes from the combination of XORing each byte, multiplying by the FNV prime, and repeating that process across the input.

For this article, the practical takeaway is enough:

IDs often contain patterns
Hashing mixes those patterns into a wider numeric range
The bucket threshold then gives a stable rollout decision

13. "About 5%" is not "exactly 5%"

The wording matters.

This method selects about 5%, not exactly 5% in every possible set.

If there are only 17 conversations, 5% is 0.85 conversations. You cannot select 0.85 of a conversation. The actual result might be 0, 1, or 2 conversations.

The smaller the population, the more the observed percentage can move around.

As the population grows, the observed percentage tends to get closer to the configured percentage. With 100,000 conversations, 5% should be around 5,000 conversations.

There is also a tiny modulo bias in hash % BUCKET_COUNT, because 2^32 is not evenly divisible by 100_000. Some buckets correspond to one more raw hash value than others.

For rollout sampling, that bias is usually far too small to matter.
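The size of the bias can be worked out exactly, assuming every 32-bit hash value is equally likely (an idealization). `exactShare` is a hypothetical helper that counts how many raw hash values fall under a given threshold.

```typescript
const HASH_SPACE = 0x100000000; // 2^32 possible hash values
const BUCKET_COUNT = 100_000;

const base = Math.floor(HASH_SPACE / BUCKET_COUNT); // raw values every bucket gets
const extra = HASH_SPACE % BUCKET_COUNT; // buckets 0..extra-1 get one more

// Fraction of all raw hash values that land in buckets below the threshold.
function exactShare(threshold: number): number {
  const withExtra = Math.min(threshold, extra);
  return (threshold * base + withExtra) / HASH_SPACE;
}

console.log(base, extra); // 42949 67296
console.log(exactShare(5_000)); // about 0.0500004 instead of 0.05
```

The lower-numbered buckets each cover 42,950 raw values instead of 42,949, a difference of roughly 0.002% per bucket.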

If you need exactly 5% of a known population, you need a different process: collect the whole population, sort or rank it, and take exactly the top 5%.
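For comparison, a rough sketch of that exact-count process. Ranking by the hash of the ID is an assumption made here so the result is repeatable for a given population; any stable ranking would do. `exactSample` is a hypothetical name.

```typescript
const encoder = new TextEncoder();

function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (const byte of encoder.encode(input)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Returns exactly floor(n * percent / 100) IDs from a known population.
function exactSample(ids: string[], percent: number): string[] {
  const count = Math.floor((ids.length * percent) / 100);
  const ranked = ids
    .map((id) => ({ id, rank: fnv1a32(id) }))
    // Sort by hash; break ties by ID so the order is fully determined.
    .sort((x, y) => x.rank - y.rank || (x.id < y.id ? -1 : x.id > y.id ? 1 : 0));
  return ranked.slice(0, count).map((entry) => entry.id);
}

const population = Array.from({ length: 200 }, (_, i) => `conversation_${i}`);
console.log(exactSample(population, 5).length); // 10, exactly 5% of 200
console.log(exactSample(population.slice(0, 17), 5).length); // 0, because floor(0.85) is 0
```

This needs the whole population up front, and adding a single conversation can change who is selected. That is the trade-off against the threshold check.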

The method in this article optimizes for different properties:

It works locally for each request
It requires no stored assignment
It gives the same answer for the same ID
It lets the rollout percentage increase monotonically

Those are usually the properties you want for gradual rollout infrastructure.

14. Recap

This is not a new trick. Deterministic bucketing is a common pattern in feature flags, A/B tests, and gradual rollouts.

But the first time you see it, the code can look surprisingly dense:


hash(id) % 100_000 < threshold

That small expression gives you:

About 5% selection
Stable decisions for the same conversation
Monotonic rollout from 5% to 25% to 50%
No per-conversation assignment stored in a database

The key idea is to stop thinking of sampling as "roll a new random number now."

Instead, give each entity a stable position by hashing its ID. Then treat the rollout percentage as a threshold. Moving the threshold forward adds more entities without changing the ones that were already in.

That is the core of storage-free, stable rollout sampling.