Distributed Transaction Handling

In a distributed system, ensuring data consistency across multiple, independent services is one of the hardest engineering challenges.

A transaction is a group of operations executed as a single, indivisible unit. The golden rule of transactions is “All or Nothing”. If transferring money involves debiting Account A and crediting Account B, both must succeed, or both must fail.

Monolithic vs Microservice Transactions

The complexity of transaction management directly correlates with your system’s architecture.

Monolithic Architecture

Monolithic Transaction In a monolith, all modules share a single database, allowing the use of standard ACID transactions.

In a monolith, you rely on the database’s native transaction capabilities. The database guarantees ACID properties (Atomicity, Consistency, Isolation, Durability). If a step fails, you simply issue a ROLLBACK command, and the database magically undoes everything.

Microservices Architecture

Microservice Transaction In a microservices architecture, operations span multiple independent databases.

In microservices, the “Order,” “Payment,” and “Inventory” logic live in different applications, each with its own database. A single business transaction now spans across the network.

A Distributed Transaction is a transaction that spans multiple physical systems or services, requiring coordination to ensure that all participating databases either commit their changes or roll them back together.

Since you cannot wrap a single START TRANSACTION and COMMIT block around three different REST APIs, you need distributed transaction patterns.

Pattern 1: Two-Phase Commit (2PC)

Two-Phase Commit (2PC) is an algorithm that ensures all services agree to commit or roll back changes simultaneously. It relies on a central Transaction Coordinator.

The Two Phases

2PC Implementation The Coordinator ensures all services are ready in Phase 1 before telling them to commit in Phase 2.

Phase 1: Prepare Phase
- The coordinator asks all participating services: “Are you ready to commit this transaction?”
- Each service checks its constraints, locks the necessary rows in its database, and replies “Ready” or “Failed”.
Phase 2: Commit Phase
- If all services replied “Ready,” the coordinator sends a “Commit” command.
- If any service replied “Failed” (or timed out), the coordinator sends a “Rollback” command.

Rollback Scenario

2PC Rollback If any service fails during the Prepare phase, the Coordinator aborts the entire transaction.

Pattern 2: The SAGA Pattern

Because 2PC scales poorly in cloud environments, modern microservices use the SAGA Pattern.

Instead of one massive, distributed transaction, a SAGA breaks the work into a series of local transactions.

SAGA Pattern A sequence of local transactions. If a later step fails, explicit compensating transactions are triggered.

How it works:

Service A does its work, commits to its database, and triggers Service B.
Service B does its work, commits, and triggers Service C.
If Service C fails, the system cannot issue a standard database ROLLBACK for A and B (they already committed). Instead, it must execute Compensating Transactions (e.g., Service B issues a refund, Service A marks the order as cancelled).

This relies on Eventual Consistency—the system might be in a partially completed state for a few seconds before the SAGA finishes or compensates.

SAGA Implementations

There are two primary ways to coordinate a SAGA.

1. Choreography (Event-Driven)

SAGA Choreography Services react to events on a message broker without a central controller.

In choreography, there is no central brain. Services publish domain events to a Message Broker (like Kafka or RabbitMQ), and other services listen and react.

Pros	Cons
No single point of failure.	Very hard to track the state of a transaction.
Highly decoupled and scalable.	Complex debugging; requires robust distributed tracing.

2. Orchestration (Command-Driven)

SAGA Orchestration *An orchestrator explicitly commands services and manages the transaction state machine.*x

In orchestration, a central service (the Orchestrator) manages the workflow. It sends commands to services, waits for their replies, and decides the next step (or triggers compensations).

Pros	Cons
Clear visibility into the transaction state.	The orchestrator is a single point of failure/bottleneck.
Easy to understand and debug the workflow.	Couples services slightly more to the orchestrator.

2PC vs SAGA Comparison

Aspect	2PC	SAGA
Consistency	Strong (ACID across all nodes)	Eventual (Base)
Performance	Slower (Locks resources, blocking)	Faster (Non-blocking local commits)
Failure Handling	Standard Database Rollback	Application-level Compensating Transactions
Best For	Short transactions, internal networks, single-vendor databases.	Long-running workflows, public cloud, diverse microservices.

Summary (TL;DR)

Monoliths rely on standard ACID database transactions.
Microservices require distributed transaction patterns because data is split across multiple databases.
2PC guarantees strong consistency but blocks resources and scales poorly.
SAGA breaks work into local transactions and uses compensating actions to undo work, favoring high availability and eventual consistency.