Step 1: Requirements
Functional:
- Post a tweet (text, images, videos)
- Follow/unfollow a user
- View home timeline (posts from people I follow, reverse chronological)
- Like, retweet, reply
- Search tweets
Non-functional:
- 500M users, 300M DAU
- Timeline load < 200ms p99
- Tweets delivered to followers within 5 seconds
- Reads >> writes (read-heavy)
- Eventual consistency on timeline is OK
Estimation:
Writes: 500K tweets/day รท 86,400 โ 6 tweets/sec
Reads: 300M DAU ร 5 timeline loads/day รท 86,400 โ 17,000 reads/sec
Storage: 500K tweets/day ร 500 bytes ร 365 ร 10 years โ 900 GB (text only)
Fan-out: average 200 followers/user โ 500K ร 200 = 100M timeline writes/day
Step 2: Core data model
-- Users
users(id BIGINT PK, username, email, follower_count, following_count, ...)
-- Tweets
tweets(id BIGINT PK, author_id BIGINT FK, content TEXT, media_urls TEXT[],
like_count INT, retweet_count INT, created_at TIMESTAMPTZ)
-- Follow graph (write-heavy, separate from tweet storage)
follows(follower_id BIGINT, followee_id BIGINT, created_at TIMESTAMPTZ)
PRIMARY KEY (follower_id, followee_id)
INDEX (followee_id, follower_id) โ for "get all followers of user X"
-- Home Timeline (precomputed โ see fanout below)
home_timeline(user_id BIGINT, tweet_id BIGINT, tweet_at TIMESTAMPTZ)
PRIMARY KEY (user_id, tweet_at DESC, tweet_id)
Step 3: The core decision โ fanout strategy
This is the heart of the design. Two approaches:
Option A: Fanout-on-write (push model)
When Alice tweets, immediately write Aliceโs tweet_id into every followerโs home timeline:
Alice posts tweet #12345
โ
Fan-out worker looks up Alice's followers (potentially millions)
โ
Writes tweet_id into home_timeline for each follower
โ
Bob loads his timeline โ just reads pre-built home_timeline list
Pros:
- Read is O(1) โ timeline is pre-built, just fetch and display
- Very fast timeline load
Cons:
- Write amplification: Kylie Jenner tweets once โ 150M row inserts
- Wasted writes for offline users
- Hot-shard problem: celebrityโs followers may all hash to same partition
Option B: Fanout-on-read (pull model)
When Bob loads his timeline, look up everyone he follows and merge their recent tweets:
Bob loads timeline
โ
Fetch IDs of all users Bob follows (from follows table)
โ
For each followee: fetch their last N tweets
โ
Merge-sort all tweets by timestamp โ show top 20
โ
Cache result in Redis
Pros:
- No write amplification; posting a tweet is O(1)
- Naturally handles celebrities
Cons:
- Timeline load is slow: if Bob follows 1,000 people, thatโs 1,000 DB reads per load
- Cache invalidation on new tweets is complex
Option C: Hybrid (what Twitter actually does)
Normal users (< 10K followers): fanout-on-write
Celebrity users (> 10K followers): fanout-on-read
Bob's timeline = prebuilt_timeline_from_normal_users
+ real-time merge of celebrity tweets on load
When Alice (1K followers) tweets:
โ Write to each follower's prebuilt home_timeline (fast, bounded)
When Elon (100M followers) tweets:
โ Store tweet only in tweets table
When Bob (follows both Alice and Elon) loads timeline:
โ Read prebuilt timeline (includes Alice's tweet)
โ Additionally query tweets table for Bob's celebrity follows
โ Merge by timestamp
โ Cache result for 30s
Step 4: Architecture
Client
โ POST /tweets
โผ
[ Tweet Service ]
โโ Store tweet in tweets table (Cassandra or Postgres)
โโ Publish to Kafka topic "new_tweets"
โโ Return to client immediately
[ Kafka "new_tweets" ]
โ
[ Fanout Workers (consumer group) ]
โโ Fetch follower list from Social Graph Service
โ (cached in Redis โ follower list rarely changes)
โโ For normal users: batch write to home_timeline (Cassandra)
โโ For celebrities: skip (read-time fanout instead)
[ Timeline Service ]
โโ Read home_timeline from Cassandra
โโ Merge celebrity tweets from cache
โโ Return top 20 tweets
[ Cache Layer ]
โโ home_timeline:{user_id} โ cached rendered timeline
โโ celebrity_tweets:{user_id} โ recent tweets for celebrity accounts
โโ follower_list:{user_id} โ fan-out list
Step 5: Storage choices
Tweets: Cassandra (wide column)
- Partition key:
author_id - Clustering key:
tweet_id DESC(most recent first) - Write-optimized, handles high throughput
Home timeline: Cassandra
- Partition key:
user_id - Clustering key:
tweet_at DESC, tweet_id LIMIT 20per query
Follow graph: PostgreSQL
- Fast lookup of follower lists
- Indexed on both
follower_idandfollowee_id - Much smaller data volume
Cache: Redis
- Rendered timeline JSON: TTL 30 seconds
- Celebrity tweet IDs: TTL 10 seconds
- Follower counts: TTL 60 seconds
Step 6: Real-time delivery
When Alice posts, Bob should see it within 5 seconds:
Alice tweets
โ Kafka โ Fanout worker โ writes to Bob's home_timeline
โ publishes to Redis pub/sub "timeline:{bob_id}"
Bob's client:
โ Maintains WebSocket connection to Notification Service
โ Notification Service subscribes to Redis pub/sub for Bob's channel
โ On new message: pushes "new tweet" event to Bob's WebSocket
โ Client fetches new tweets from Timeline API
Simpler alternative: Client polls every 10s. Works fine for most apps; WebSocket for premium/pro users.
Step 7: Media (images/video)
POST /tweets with media:
1. Client uploads media to pre-signed S3 URL directly
2. After upload, client submits tweet with media S3 keys
3. Tweet service stores media keys in tweets table
4. Async transcoder (Lambda) generates thumbnails + video variants
5. CDN serves media; tweet stores CDN URLs
Never stream media through app servers.
Step 8: Search
Tweets are written to Elasticsearch asynchronously via Kafka consumer. Search API queries Elasticsearch.
Kafka "new_tweets" โ Search Indexer โ Elasticsearch index
โ
Client โ Search API โ Elasticsearch โโโโโ