Multiversion Concurrency Control (MVCC) – A Deep Dive Part (1/2)

TL;DR

Ever built something where two users tried to update the same record at the same time, and everything broke? Or maybe you ran into those weird bugs where a user sees stale data or, worse, corrupted data after multiple updates. If you've worked with databases in a real-world app, you've probably hit some kind of concurrency issue. That’s where Multiversion Concurrency Control (MVCC) comes in.

MVCC is how databases like PostgreSQL, MySQL (InnoDB), Oracle, and even some distributed systems handle concurrent reads and writes without locking everything up. Instead of making one transaction wait for another, MVCC gives each transaction its consistent view of the data and lets the database handle the magic behind the scenes.

Before we jump in, here is what we will cover:

What is MVCC
How MVCC works internally
The impact of MVCC on transaction isolation levels

What is MVCC?

Multiversion Concurrency Control (MVCC) is a database concurrency control technique that enables multiple transactions to access the database simultaneously without blocking each other. Instead of using locks for concurrency control, MVCC allows different transactions to work with different versions of data based on when they started.

Traditional Lock-Based Concurrency Control (LBCC) creates contention by forcing transactions to wait for each other. MVCC solves this issue by ensuring the following:

Non-blocking reads: Transactions can read without waiting for writes.
Optimistic concurrency control: No locking overhead, making reads efficient.
Snapshot isolation: Transactions see consistent snapshots of the database, reducing conflicts.

How MVCC Works Internally: Versioned Rows and Visibility Rules

MVCC avoids read-write conflicts by maintaining multiple versions of each row in the database. Instead of locking rows during a transaction, MVCC allows each transaction to operate on a snapshot of the data (A view that doesn’t change until the transaction ends).

In MVCC-enabled databases, each row carries metadata that enables version tracking and snapshot-based visibility. This metadata typically includes a transaction ID (XID) that created or last modified the row, timestamps or system change numbers (SCNs) used for visibility comparison, and a reference to a previous version of the row either stored in undo logs or as part of a tuple version chain. When a row is updated, the system doesn't overwrite the existing data. Instead, it creates a new version of the row with the updated data and preserves the old version to support concurrent readers. This retained version is either stored inline (e.g., PostgreSQL's heap tuples) or in separate undo logs (e.g., MySQL’s InnoDB). As a result, read-only transactions can continue accessing the older version of the row until they are complete, ensuring that readers never block writers and writers never block readers.

To determine which version of a row is visible to a transaction, the database applies a set of visibility rules. These rules allow a transaction to see only the rows that were committed before the transaction began while excluding rows that were either deleted before the transaction started or are part of uncommitted changes from other transactions. This mechanism ensures consistent reads without requiring locks. The specific implementation details vary between databases. For instance, PostgreSQL uses xmin and xmax fields in its tuple headers to manage version visibility, while MySQL's InnoDB relies on undo logs and transaction IDs to reconstruct the correct version at runtime.

Isolation Levels

Isolation Level	Prevents	Behavior
Read Committed	Dirty Reads	Each statement sees the latest committed data
Repeatable Read	Dirty Reads, Non-Repeatable Reads	Each transaction sees a stable snapshot
Serializable	All anomalies	Uses strict ordering, reducing concurrency

PostgreSQL Default: Read Committed
MySQL InnoDB Default: Repeatable Read

Repeatable Read in PostgreSQL

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
SELECT * FROM employees;
-- Another transaction updates data
SELECT * FROM employees;
-- Still sees the old snapshot
COMMIT;

In Part 2, we’ll dive into real-world examples to see MVCC in action across different databases with a little code.

Multiversion Concurrency Control (MVCC) – A Deep Dive Part (1/2)

TL;DR

What is MVCC?

How MVCC Works Internally: Versioned Rows and Visibility Rules

Isolation Levels

Repeatable Read in PostgreSQL

Comments (0)

Read More

#reading

#popular

Multiversion Concurrency Control (MVCC) – A Deep Dive Part (1/2)

TL;DR

What is MVCC?

How MVCC Works Internally: Versioned Rows and Visibility Rules

Isolation Levels

Repeatable Read in PostgreSQL

Comments (0)

Read More

Backend Development Demystified: From Structure to Real-Life APIs

Streaming Postgres Changes with Debezium and Kafka Connect: A Hands-On Tutorial

How fsync and synchronous_commit Affect PostgreSQL Performance

We Listened: Pgai Vectorizer Now Works With Any Postgres Database

#reading

#popular