Consistent Hashing
Consistent hashing is a way to spread keys across a changing set of servers so that when a server is added or removed, only a small fraction of keys move.
That is the whole reason it exists.
Consistent hashing (one line): When the set of nodes changes, only about $1/n$ of keys need to move on average (for many ring setups), instead of nearly all keys as in plain
hash(key) % N.
Modulo Hashing
Section titled “Modulo Hashing”A normal hash scheme often looks like this:
server = hash(key) % Nwhere N is the number of servers.
This works fine until N changes. If you had 4 servers and then add a 5th, almost every key gets remapped, because % 4 and % 5 give very different results. That causes massive cache misses, data movement, rebalancing, and latency spikes.
Consistent hashing: the intuition
Section titled “Consistent hashing: the intuition”Consistent hashing fixes that.
Imagine all possible hash values arranged in a circle. This circle is usually called the hash ring. Both servers and keys are hashed onto the same ring.
To place a key, you move clockwise on the ring until you find the first server. That server owns the key.
So instead of “key goes to server number hash(key) % N”, the rule becomes:
The key belongs to the next server clockwise on the ring.
That small change gives the big benefit.
Hash Ring Example (hash space 0–99)
Section titled “Hash Ring Example (hash space 0–99)”Suppose the hash space is from 0 to 99, wrapped in a circle.
We hash three servers:
A -> 20B -> 50C -> 80So the ring looks like this:
0 ----20(A)------50(B)------80(C)------99 -> wraps to 0Now hash some keys:
apple -> 12banana -> 26cat -> 72dog -> 91Assign each key to the first server clockwise from the key’s position.
- apple → 12: clockwise from 12, first server is A at 20 → apple → A
- banana → 26: first server is B at 50 → banana → B
- cat → 72: first server is C at 80 → cat → C
- dog → 91: clockwise from 91, we wrap and hit A at 20 → dog → A
So ownership is:
- A: keys in (80, 20] (after wraparound)
- B: keys in (20, 50]
- C: keys in (50, 80]
Add a server
Section titled “Add a server”Now add server D → 65.
0 ----20(A)------50(B)--65(D)---80(C)------99 -> 0Check the sample keys:
- apple, banana, cat, dog: unchanged in this example.
The important part: only keys in the interval newly claimed by D move (for example, keys that used to fall to C but now fall before D in the clockwise walk). Not all keys. Not most keys—just that slice.
With traditional modulo hashing, adding one server would remap nearly everything. With consistent hashing, adding one server typically remaps about 1 / (number of servers) of the keys—not the whole keyspace.
When a node joins, only keys in the new slice move—not the full ring.
Remove a server
Section titled “Remove a server”Start again with A=20, B=50, C=80. Suppose server B fails.
0 ----20(A)---------------------80(C)------99 -> 0Now only the keys that used to go to B are reassigned, and they go to the next clockwise server (C). Keys owned by A stay on A; keys owned by C stay on C.
Again: minimal disruption compared to modulo.
Same ring idea as a join—only the vacated range’s keys migrate to the successor.
Why this matters in real systems
Section titled “Why this matters in real systems”Consistent hashing is used when you need stable key placement while nodes change.
Typical use cases:
- Distributed caches (Memcached, Redis-style clusters)
- Distributed key-value stores
- Sharding user data across machines
- Load balancing (e.g. sticky-by-key)
- Peer-to-peer systems
- Object storage systems
If a cache cluster grows from 10 to 11 nodes, you do not want 90%+ of cached items to disappear from the nodes clients expect. Consistent hashing avoids that.
Visual way to think about ownership
Section titled “Visual way to think about ownership”Each server owns the arc of the ring immediately before it when you walk counterclockwise: the server at a point is responsible for everything behind it until the previous server.
For A=20, B=50, C=80, ownership is:
- A owns (80, 20]
- B owns (20, 50]
- C owns (50, 80]
(No second ring diagram here—the earlier figure plus this interval notation is enough.)
Modulo vs consistent hashing
Section titled “Modulo vs consistent hashing”| Modulo hashing | Consistent hashing | |
|---|---|---|
| Rule | server = hash(key) % N | First server clockwise from hash(key) on the ring |
| When $N$ changes | Most keys change destination | Insert/remove one point; only keys in the affected region move |
Modulo pain (numeric):
hash("apple") = 1414 % 4 = 214 % 5 = 4 <- moved
hash("banana") = 2727 % 4 = 327 % 5 = 2 <- moved againThis pattern repeats for most keys whenever $N$ changes.
Consistent hashing: you keep the same ring positions for old servers and insert the new server at one point on the ring. Only keys in that neighborhood move. That is the main advantage.
The problem with the basic version
Section titled “The problem with the basic version”The simple version has a weakness: servers may not be evenly spaced on the ring. If you hash only one position per server, you can get imbalance.
Example:
A -> 5B -> 10C -> 90Then C might own a huge part of the ring, while A and B own tiny parts.
That means:
- Uneven load
- Hotspots
- Bad resource utilization
This is why practical consistent hashing usually uses virtual nodes.
Virtual nodes
Section titled “Virtual nodes”Instead of putting each physical server on the ring once, put it there many times.
Example:
Server A gets positions: 5, 35, 70Server B gets positions: 10, 45, 85Server C gets positions: 20, 60, 95Each physical server appears as multiple virtual nodes (often called vnodes). Keys are assigned to the nearest vnode clockwise, and that vnode maps back to its physical server.
This gives a much more even distribution.
Many tokens per machine smooth out hot and cold arcs.
Why virtual nodes help
Section titled “Why virtual nodes help”Because many small slices per server average out better than one big random slice.
- Without vnodes: one unlucky hash can give a server too much data.
- With vnodes: ownership is split into many smaller chunks, load becomes smoother, and adding/removing a server redistributes many small pieces instead of one giant chunk.
Small vnode example
Section titled “Small vnode example”Suppose the ring is:
A1=10, B1=20, C1=30, A2=40, B2=50, C2=60, A3=70, B3=80, C3=90- If a key hashes to 47, the first clockwise vnode is B2=50, so the key goes to physical server B.
- If a key hashes to 76, the first clockwise vnode is B3=80, again B.
Even though the decision is made using vnodes, the real owner is still the physical server.
Weighted consistent hashing
Section titled “Weighted consistent hashing”Sometimes servers do not have equal capacity. For example:
- One machine: 64 GB RAM, 16 CPUs
- Another: 16 GB RAM, 4 CPUs
You may want the bigger server to receive more keys. A common way is to give the bigger server more virtual nodes:
small server gets 100 vnodeslarge server gets 400 vnodesThen the large server owns about 4× as much of the hash ring.
Weighted placement: vnode count tracks capacity.
Replication in consistent hashing
Section titled “Replication in consistent hashing”In real distributed systems, you usually want copies of data.
A common rule:
- Primary = first server clockwise from the key
- Replica 1 = second distinct server clockwise
- Replica 2 = third distinct server clockwise
So a key may be stored on multiple nodes around the ring.
Example ring:
A at 20B at 50C at 80D at 90If a key hashes to 72:
- Primary = C (first clockwise server at or after 72)
- Replica 1 = D
- Replica 2 = A (wrap around)
This helps with fault tolerance.
Primary plus replicas follow successive servers on the ring.
Tip: Use a fast, non-cryptographic hash like Murmur3 or xxHash for consistent hashing so hashing stays cheap at high throughput.