HLD: Design a Twitter/X Feed

Design the Twitter timeline โ€” fanout-on-write vs fanout-on-read, hybrid strategy for celebrities, sharding, and real-time delivery.

must hard โฑ 35 min hldtwitternews-feedfanoutshardingtimelinesocial-graph
Mastery:
Why interviewers ask this
Twitter-style feed is the most popular HLD question. It forces a deep decision on fanout strategy, which has non-obvious trade-offs at scale.

Step 1: Requirements

Functional:

  • Post a tweet (text, images, videos)
  • Follow/unfollow a user
  • View home timeline (posts from people I follow, reverse chronological)
  • Like, retweet, reply
  • Search tweets

Non-functional:

  • 500M users, 300M DAU
  • Timeline load < 200ms p99
  • Tweets delivered to followers within 5 seconds
  • Reads >> writes (read-heavy)
  • Eventual consistency on timeline is OK

Estimation:

Writes: 500K tweets/day รท 86,400 โ‰ˆ 6 tweets/sec
Reads:  300M DAU ร— 5 timeline loads/day รท 86,400 โ‰ˆ 17,000 reads/sec
Storage: 500K tweets/day ร— 500 bytes ร— 365 ร— 10 years โ‰ˆ 900 GB (text only)
Fan-out: average 200 followers/user โ†’ 500K ร— 200 = 100M timeline writes/day

Step 2: Core data model

-- Users
users(id BIGINT PK, username, email, follower_count, following_count, ...)

-- Tweets
tweets(id BIGINT PK, author_id BIGINT FK, content TEXT, media_urls TEXT[],
       like_count INT, retweet_count INT, created_at TIMESTAMPTZ)

-- Follow graph (write-heavy, separate from tweet storage)
follows(follower_id BIGINT, followee_id BIGINT, created_at TIMESTAMPTZ)
  PRIMARY KEY (follower_id, followee_id)
  INDEX (followee_id, follower_id)  โ† for "get all followers of user X"

-- Home Timeline (precomputed โ€” see fanout below)
home_timeline(user_id BIGINT, tweet_id BIGINT, tweet_at TIMESTAMPTZ)
  PRIMARY KEY (user_id, tweet_at DESC, tweet_id)

Step 3: The core decision โ€” fanout strategy

This is the heart of the design. Two approaches:

Option A: Fanout-on-write (push model)

When Alice tweets, immediately write Aliceโ€™s tweet_id into every followerโ€™s home timeline:

Alice posts tweet #12345
    โ†“
Fan-out worker looks up Alice's followers (potentially millions)
    โ†“
Writes tweet_id into home_timeline for each follower
    โ†“
Bob loads his timeline โ†’ just reads pre-built home_timeline list

Pros:

  • Read is O(1) โ€” timeline is pre-built, just fetch and display
  • Very fast timeline load

Cons:

  • Write amplification: Kylie Jenner tweets once โ†’ 150M row inserts
  • Wasted writes for offline users
  • Hot-shard problem: celebrityโ€™s followers may all hash to same partition

Option B: Fanout-on-read (pull model)

When Bob loads his timeline, look up everyone he follows and merge their recent tweets:

Bob loads timeline
    โ†“
Fetch IDs of all users Bob follows (from follows table)
    โ†“
For each followee: fetch their last N tweets
    โ†“
Merge-sort all tweets by timestamp โ†’ show top 20
    โ†“
Cache result in Redis

Pros:

  • No write amplification; posting a tweet is O(1)
  • Naturally handles celebrities

Cons:

  • Timeline load is slow: if Bob follows 1,000 people, thatโ€™s 1,000 DB reads per load
  • Cache invalidation on new tweets is complex

Option C: Hybrid (what Twitter actually does)

Normal users (< 10K followers): fanout-on-write
Celebrity users (> 10K followers): fanout-on-read

Bob's timeline = prebuilt_timeline_from_normal_users
              + real-time merge of celebrity tweets on load
When Alice (1K followers) tweets:
  โ†’ Write to each follower's prebuilt home_timeline (fast, bounded)

When Elon (100M followers) tweets:
  โ†’ Store tweet only in tweets table

When Bob (follows both Alice and Elon) loads timeline:
  โ†’ Read prebuilt timeline (includes Alice's tweet)
  โ†’ Additionally query tweets table for Bob's celebrity follows
  โ†’ Merge by timestamp
  โ†’ Cache result for 30s

Step 4: Architecture

Client
  โ”‚ POST /tweets
  โ–ผ
[ Tweet Service ]
  โ”œโ”€ Store tweet in tweets table (Cassandra or Postgres)
  โ”œโ”€ Publish to Kafka topic "new_tweets"
  โ””โ”€ Return to client immediately

[ Kafka "new_tweets" ]
  โ†“
[ Fanout Workers (consumer group) ]
  โ”œโ”€ Fetch follower list from Social Graph Service
  โ”‚     (cached in Redis โ€” follower list rarely changes)
  โ”œโ”€ For normal users: batch write to home_timeline (Cassandra)
  โ””โ”€ For celebrities: skip (read-time fanout instead)

[ Timeline Service ]
  โ”œโ”€ Read home_timeline from Cassandra
  โ”œโ”€ Merge celebrity tweets from cache
  โ””โ”€ Return top 20 tweets

[ Cache Layer ]
  โ”œโ”€ home_timeline:{user_id} โ†’ cached rendered timeline
  โ”œโ”€ celebrity_tweets:{user_id} โ†’ recent tweets for celebrity accounts
  โ””โ”€ follower_list:{user_id} โ†’ fan-out list

Step 5: Storage choices

Tweets: Cassandra (wide column)

  • Partition key: author_id
  • Clustering key: tweet_id DESC (most recent first)
  • Write-optimized, handles high throughput

Home timeline: Cassandra

  • Partition key: user_id
  • Clustering key: tweet_at DESC, tweet_id
  • LIMIT 20 per query

Follow graph: PostgreSQL

  • Fast lookup of follower lists
  • Indexed on both follower_id and followee_id
  • Much smaller data volume

Cache: Redis

  • Rendered timeline JSON: TTL 30 seconds
  • Celebrity tweet IDs: TTL 10 seconds
  • Follower counts: TTL 60 seconds

Step 6: Real-time delivery

When Alice posts, Bob should see it within 5 seconds:

Alice tweets
  โ†’ Kafka โ†’ Fanout worker โ†’ writes to Bob's home_timeline
                          โ†’ publishes to Redis pub/sub "timeline:{bob_id}"

Bob's client:
  โ†’ Maintains WebSocket connection to Notification Service
  โ†’ Notification Service subscribes to Redis pub/sub for Bob's channel
  โ†’ On new message: pushes "new tweet" event to Bob's WebSocket
  โ†’ Client fetches new tweets from Timeline API

Simpler alternative: Client polls every 10s. Works fine for most apps; WebSocket for premium/pro users.


Step 7: Media (images/video)

POST /tweets with media:
  1. Client uploads media to pre-signed S3 URL directly
  2. After upload, client submits tweet with media S3 keys
  3. Tweet service stores media keys in tweets table
  4. Async transcoder (Lambda) generates thumbnails + video variants
  5. CDN serves media; tweet stores CDN URLs

Never stream media through app servers.


Tweets are written to Elasticsearch asynchronously via Kafka consumer. Search API queries Elasticsearch.

Kafka "new_tweets" โ†’ Search Indexer โ†’ Elasticsearch index
                                         โ†‘
Client โ†’ Search API โ†’ Elasticsearch โ”€โ”€โ”€โ”€โ”˜

Say it out loud
โ€œThe core decision is fanout strategy. For normal users I use fanout-on-write โ€” when they post, a Kafka consumer writes their tweet ID into each followerโ€™s home_timeline in Cassandra. Timeline read is O(1). But for users with more than ~10K followers, write amplification makes fanout-on-write impractical โ€” Kylie Jenner tweeting would trigger 150M writes. So celebrity tweets are stored and read on-demand at timeline load time, merged with the prebuilt timeline. This is the hybrid model Twitter actually uses.โ€

Likely follow-up questions
  • What is the difference between fanout-on-write and fanout-on-read?
  • How do you handle celebrities with 100M followers?
  • How do you shard the timeline table?
  • How do you push real-time tweets to online followers?
  • How do you rank the feed?

References