Cristiano Ronaldo vs. You with 10 Followers - X / Twitter

Welcome back! In my last post, we explored the three pillars of data systems: reliability, scalability, and maintainability.
You can visit the last chapter from here :
Chapter 1: Foundations of data systems

Today, we'll zoom in on one of the trickiest challenges modern social networks face—fan-out—using Twitter as our case study.

"When Cristiano Ronaldo tweets, it needs to reach 100 million in seconds"

What Is Fan-Out?

Fan-out is the process of delivering one message (a tweet) to all of a user's followers.

If you have 50 followers, fan-out is easy
If you have 100 million followers, you need special magic

There are two main approaches:

Fan-Out on Write
Fan-Out on Read

Approach 1: Fan-Out on Write

(Ideal for "regular" users)

Fan-Out on Read Diagram

User posts a tweet
The tweet is stored in a durable Tweet Store
A background job pushes that tweet into each follower's Timeline Cache

Read path

When a follower opens Twitter, their timeline is already pre-filled
The app just fetches cached tweets—instant display

Why it works

Fast reads for followers
Simple ranking and pagination

The Scaling Challenge

Let's break down why this fails for celebrities:

# For 31M followers with 90s delivery SLA
31,000,000 followers / 90 seconds = 344,444 writes/sec

# If 1% of followers check immediately
31,000,000 × 1% = 310,000 reads/sec

And there are more than 4.6k writes/sec system-wide. Now imagine if Ronaldo just dropped a 'hello' on X

Trade-offs

Write amplification: N writes per tweet (where N = number of followers)
Doesn't scale to celebrity levels

Approach 2: Fan-Out on Read

(Necessary for mega-stars like Ronaldo)

Fan-Out on Write Diagram

Celebrity posts a tweet
Store the tweet once in the tweets table
When each follower loads their timeline:
- Fetch their personal cache (small fan-out of regular users)
- Dynamically merge the celebrity's tweet on the fly

SELECT t.id,
       u.screen_name,
       t.text,
       t.timestamp
FROM follows f
JOIN tweets t
  ON t.sender_id = f.followee_id
JOIN users u
  ON u.id = t.sender_id
WHERE f.follower_id = :current_user
ORDER BY t.timestamp DESC
LIMIT 50;

Why it's needed

1 write per tweet—constant cost
Handles millions of followers without crashing

Trade-offs

Slower reads (joins and lookups)
Harder to cache consistently

Twitter's Hybrid Solution

Twitter combines both strategies:

Regular users → Fan-Out on Write
Celebrities → Fan-Out on Read

The Nuanced Reality

Even celebrity tweets use strategic writes for critical followers:

First 1M Followers
- Immediate fan-out-on-write to timeline caches
- Targets super-fans most likely to engage immediately
Remaining Followers
- True fan-out-on-read via dynamic query merging
- Served through geographically distributed edge caches

# Why this hybrid works
100M followers = 
1M (1%) active immediately + 99M (99%) eventual consistency

Core components under the hood:

Tweet Store: Manhattan (Twitter’s distributed database)
Timeline Cache: Redis with active replication
Stream Processing: Heron for batched background jobs
Rate Limiters: Raven for traffic shaping

Key Lessons for Your Own Systems

1. Segment Users by Follower Count

Why:

<10k followers: Full fan-out-on-write is safe and fast
10k-1M followers: Hybrid approach with write batching
>1M followers: Primarily fan-out-on-read with edge caching

2. Precompute vs Compute-on-Demand

Balance:

Precompute: Timeline caches, trending lists, follower graphs
Compute live: Celebrity tweets, search results, archived content

Rule of Thumb:

"Cache what 80% of users will see immediately. Compute what 20% might request later."

3. 99th-Percentile Latency Focus

Why It Matters:

Average latency hides worst-case scenarios
1% slow requests = 3.6M unhappy users/hour (at 100k RPS)

Prioritization:

Ensure 95% requests complete in <100ms
Optimize until 99% <200ms
Let 1% (edge cases) degrade gracefully

4. Hot/Warm/Cold Data:

Hot: In-memory cache (last 3h tweets)
Warm: SSD storage (3h-7d tweets)
Cold: HDD/tape (archival)

"There are no solutions, only trade-offs. The art is balancing write costs vs read complexity."

What's Next?

In the next post, I'll dive into data modeling and query languages—because fast fan-out needs a solid home for your data.

Got fan-out horror stories? Or clever caching tricks? Share them in the comments below! 🚀

Cristiano Ronaldo vs. You with 10 Followers - X / Twitter

What Is Fan-Out?

Approach 1: Fan-Out on Write

Approach 2: Fan-Out on Read

Twitter's Hybrid Solution

The Nuanced Reality

Core components under the hood:

Key Lessons for Your Own Systems

1. Segment Users by Follower Count

2. Precompute vs Compute-on-Demand

3. 99th-Percentile Latency Focus

4. Hot/Warm/Cold Data:

What's Next?

Comments (0)

Read More

#reading

#popular

Cristiano Ronaldo vs. You with 10 Followers - X / Twitter

What Is Fan-Out?

Approach 1: Fan-Out on Write

Approach 2: Fan-Out on Read

Twitter's Hybrid Solution

The Nuanced Reality

Core components under the hood:

Key Lessons for Your Own Systems

1. Segment Users by Follower Count

2. Precompute vs Compute-on-Demand

3. 99th-Percentile Latency Focus

4. Hot/Warm/Cold Data:

What's Next?

Comments (0)

Read More

TWCT-T-D55: The Iron Throne of Current Sensing

Definition of Ready (DoR) vs. Definition of Done (DoD): Crafting Clear Criteria for Dev Teams

첫 글을 시작하면서...

Study.taxi — a Duolingo clone for everything and anything: an AI WebApp Experiment

#reading

#popular