Skip to content
Dev Dump

Database Replication

In modern distributed systems, relying on a single database server is a recipe for disaster. Database replication is the process of keeping multiple copies of the same data across different servers or geographical locations.

Database Replication Overview A typical setup where a Primary database handles all writes and replicates data to Read-Only replicas.

Replication solves several core engineering challenges:

  • High Availability (HA): If one server crashes or a data center loses power, other nodes contain a full copy of the data and can take over instantly.
  • Read Scalability: By routing read-heavy operations (like loading user profiles) to replicas, you significantly reduce the load on the primary database.
  • Reduced Latency: You can place replicas geographically closer to your users (e.g., an EU replica for European users) to speed up read queries.
  • Analytics & Backups: Complex, long-running analytics queries can be run on a dedicated replica without slowing down the primary transactional database.

Before diving into architectures, it’s crucial to understand the roles within a replication cluster:

Leader (Primary / Master): The designated node that accepts all write operations from the application. It dictates the order of operations and propagates them to followers.

Follower (Replica / Secondary / Slave): Nodes that receive data changes from the Leader. They apply these changes to their local storage and usually only accept read queries.

Replication Lag: The time delay between a write committing on the Leader and that same write being applied on a Follower. In a healthy system, this is usually milliseconds, but it can spike under heavy load.


When a user writes data, how does the leader ensure the followers get it? This introduces the trade-off between Synchronous and Asynchronous replication.

Synchronous vs Asynchronous Replication Timeline comparing the round-trips required for synchronous versus asynchronous replication.

StrategyCommit TimingProsCons
SynchronousLeader waits for replicas to acknowledge the write before responding to the client.Strong consistency. Replicas are guaranteed to have the latest data.Higher latency. The leader blocks if a replica is slow or network fails.
AsynchronousLeader commits locally, replies to client immediately, and updates replicas in the background.Low latency. Extremely fast writes, resilient to slow followers.Data loss risk. If the leader crashes before replicating, recent writes are permanently lost.

Replication mode comparison Synchronous, semi-synchronous, and asynchronous replication at a glance.

ModeConsistency / durabilityWrite latencyData loss risk (leader crash)
SynchronousHighestHighestLowest
Semi-synchronousMedium–highMediumLow–medium
AsynchronousEventual across replicasLowestHighest

The way nodes communicate defines the replication architecture. There are three primary patterns used in the industry today.

Single-leader replication flow One leader accepts writes and ships an ordered replication log; followers apply changes and may serve reads.

This is the most common and easiest to understand. All writes go through one central node. The leader produces an ordered replication log (for example binlog or WAL); followers receive or pull that log and apply updates in order—serving read traffic when your consistency requirements allow.

  • Use Case: Read-heavy applications like blogs, news sites, or standard web platforms.
  • Failover: If the leader crashes, the system must perform a failover. It selects the most up-to-date follower and promotes it to be the new leader. Clients are then re-routed to the new leader.

Single-Leader Replication and Failover In single-leader setups, when the leader dies, a follower is promoted to take its place.

  1. Replica reconnects to the leader.
  2. It requests missing log entries from its last known position.
  3. It catches up from checkpoint or binlog/WAL offset.
  1. Detect loss of leader (heartbeats, health checks, timeouts).
  2. Promote the most up-to-date replica (avoiding a stale promotion that loses acknowledged writes).
  3. Re-point other replicas and application writers to the new leader.
  4. Fence the old leader (STONITH, revoke credentials, load balancer drain) to prevent split-brain and dual writers.

Risks to plan for: promoting a lagging replica (data loss), split-brain if two nodes believe they are primary, and long outages if failover is manual or poorly automated.

  1. Take a consistent snapshot from the leader (or a replica caught up to a known log position).
  2. Restore that snapshot on the new node.
  3. Start replication from the exact log position matching the snapshot.
  4. Serve reads from the new replica only once lag is within your SLO.

Track continuously: replication lag (time and bytes), apply throughput, failover duration, leader write latency, and breaches of replica freshness SLOs.

  • Send strongly consistent reads to the leader when the business logic cannot tolerate stale data.
  • Send stale-tolerant reads to replicas to offload the primary.
  • Automate failover with fencing or lease checks; rehearse planned and unplanned failovers.
  • Treat backups and PITR as separate from replication—they solve different failure classes.

Multi-Leader Replication Multiple leaders handle traffic independently, typically across different regions, and sync with each other.

In a multi-leader setup, multiple nodes accept writes simultaneously and continuously replicate changes among themselves.

  • Use Case: Global applications where users in the US write to a US leader, and users in the EU write to an EU leader, ensuring low write latency globally. It also excels for offline operation support (like collaborative document editing).

Leaderless Replication Leaderless systems rely on reading and writing to multiple nodes (a quorum) to ensure consistency.

Pioneered by Amazon’s Dynamo system (and used in Cassandra and Riak), leaderless architectures abandon the concept of a primary node. Clients send write and read requests to multiple nodes in parallel.

In a leaderless system, three parameters govern consistency:

ParameterDescription
NTotal number of replicas storing each piece of data
WWrite quorum — minimum nodes that must acknowledge a write
RRead quorum — minimum nodes queried for a read

The Key Rule: If W + R > N, you guarantee strong consistency because at least one node in the read set must have the latest write. This is called quorum-based replication.

Write Flow:

  1. Client sends a write request to a coordinator node
  2. Coordinator forwards the write to all 3 replicas
  3. As soon as 2 replicas acknowledge → write succeeds
  4. The third replica catches up asynchronously (anti-entropy)

Read Flow:

  1. Client sends a read request to the coordinator
  2. Coordinator queries 2 replicas in parallel
  3. If values differ → resolve conflict (e.g., pick latest timestamp)
  4. Return the most recent version

Since W + R = 2 + 2 = 4 > 3, at least one overlapping node is guaranteed to have the latest data.

ConfigurationTrade-off
W=N, R=1Writes are slow (all nodes), reads are fast
W=1, R=NWrites are fast, reads are slow (query all)
W=1, R=1Maximum speed, but no consistency guarantee

Use Case: Extremely high-availability systems where downtime is unacceptable, such as shopping carts, session stores, or massive event logging.


ArchitectureBest ForTrade-off
Single-LeaderRead-heavy workloads (blogs, web apps)Simple, but requires failover mechanisms
Multi-LeaderGlobal write performance, offline supportConflict resolution complexity
LeaderlessMaximum availability (carts, sessions)Eventual consistency, tunable via quorum

Key Takeaway: Choose your replication strategy based on your consistency vs. availability requirements. Use W + R > N in leaderless systems to guarantee overlap between reads and writes.