"Should I use SQL or NoSQL? B-Trees or LSM Trees?" If you've ever felt overwhelmed choosing the right database for your app, you're not alone. Underneath every database is a rich ecosystem of storage engines and transaction protocols — and choosing right can mean the difference between blazing-fast performance or painful bottlenecks.

In this blog, we step into the world of database internals — through the lens of story — and explore how systems like MySQL, MongoDB, Cassandra, and PostgreSQL really work under the hood.


Let me tell you a story.

It all began when I was designing a backend system for a fast-growing e-commerce app. It needed to handle thousands of concurrent users, real-time product inventory updates, personalized recommendations, and a dashboard that updated faster than you could say “out of stock.” And like many developers before me, I hit the question:

“Which database should I use?”

What followed was a deep dive down the rabbit hole of database internals — a world where storage engines clash, transactions dance in delicate sync, and choices aren’t always black and white.

This post is that journey — and maybe, it’ll help you find your own answer.


📁 Behind the Scenes of a Database

At first glance, databases look simple. You insert data. You query it. Maybe update or delete a few rows.

But under the hood, it’s a wild machine — made up of layers like:

  • Transport: How your queries travel to the server
  • Query Parser & Optimizer: What your SQL actually becomes
  • Execution Engine: Where it all gets done
  • Storage Engine: The core, the vault, the thing that makes it all possible

That’s where our journey truly begins.


🍊 Storage Engine Showdown: B-Tree vs LSM Tree

🎓 The Classic Hero: B-Trees

Imagine a grand old library with neatly sorted sections and labeled drawers. That’s a B-Tree.

Efficient, organized, and time-tested. Every insert knows where to go. Every query finds what it needs quickly.

It works like this:

  • Your data is stored in sorted blocks
  • Every read is fast (like O(log n) fast)
  • Updates happen in-place — meaning some random disk I/O, but that's fine for OLTP systems

Databases like MySQL (InnoDB) and PostgreSQL love B-Trees. They're rock solid when you need strong consistency, fast lookups, and ACID transactions.

But…

🔥 The Young Disruptor: LSM Trees

Then you meet LSM Trees — Log-Structured Merge Trees.

This one doesn’t bother with in-place updates. It writes everything to memory first, and flushes it to disk in sorted chunks called SSTables. Every now and then, it cleans up by merging — a process called compaction.

It’s like writing notes on sticky pads, then compiling a clean notebook later.

The result? Blazing-fast write performance. Perfect for logs, metrics, IoT streams, and other write-heavy systems.

LSM Trees power Cassandra, RocksDB, HBase, and even parts of MongoDB.


⚖️ When You Have to Choose

It felt like I was in a Western showdown:

B-Tree Wins If… LSM Tree Wins If…
Reads are frequent Writes are frequent
You need ACID compliance Eventual consistency is okay
OLTP-style transactions Streaming or time-series data

But that wasn’t the end. A good database isn’t just about reading or writing.


🔐 The Transaction Tapestry

Remember that moment in every heist movie where timing is everything?

That’s what transactions are like. You need your operations to be Atomic, Consistent, Isolated, and Durable — aka ACID.

✪️ SQL Databases (Relational)

In systems like MySQL or PostgreSQL, this is handled with:

  • Undo logs
  • WAL (Write-Ahead Logs)
  • MVCC (Multi-Version Concurrency Control)

Everything is locked, tracked, and reversible.

🌐 NoSQL Databases

In contrast, systems like Cassandra and DynamoDB favor eventual consistency — they go for BASE (Basically Available, Soft state, Eventual consistency).

They work like eventually synced notebooks:

Updates hit one node, and others catch up in the background. Fast, distributed, but less strict.


🧵 A Thread on Concurrency

Concurrency is where things got trickier.

With B-Trees, concurrency is controlled through careful locking:

  • Shared locks, exclusive locks, even update locks
  • B-Link trees (a clever enhancement) let reads flow even during writes

With LSM Trees, it’s much more lock-free:

  • MemTables take writes concurrently
  • SSTables are immutable
  • Compaction is background work

It was like comparing a bank vault with a revolving door system.


🧬 The Hybrid Age

In the real world, no one-size-fits-all database exists.

Some systems started combining the best of both:

  • MySQL has plugins for RocksDB
  • MongoDB switched to an LSM-like engine (WiredTiger)
  • Aurora blends SQL compatibility with NoSQL performance

🧠 My Takeaway

Choosing the right database isn’t about trends — it’s about trade-offs.

Ask yourself:

  • Is your workload read-heavy or write-heavy?
  • Do you need strict transactions, or is speed more important?
  • Are you handling structured business data, or millions of streaming events?

Once you answer these, your storage engine almost picks itself.


✍️ Final Thoughts

That little "insert into users" line in your code? It kicks off a cascade of logic and engineering brilliance that spans decades.

Understanding database internals made me a better backend engineer — and hopefully, now it’ll do the same for you.