Databases: SQL vs NoSQL, indexing, transactions

The NoSQL families and when to use each, B-tree vs LSM indexing, ACID vs BASE and isolation levels, sharding & rebalancing, replication topologies, and access-patterns-drive-the-schema.

must hard โฑ 34 min databasessqlnosqlindexingtransactionsisolationshardingreplication
Mastery:
Why interviewers ask this
Backend rounds always probe data decisions. Confident, tradeoff-aware DB answers signal you can build real systems, not just pass algorithms.

The decisions: which store, how to index, how to keep data correct under concurrency, and how to scale it. The unifying principle: access patterns drive the schema, not the other way around.

SQL vs NoSQL โ€” and the NoSQL families

Pick by access pattern and consistency needs, not hype.

Relational (SQL โ€” Postgres, MySQL) โ€” structured schema, joins, and ACID transactions. Default choice when data is relational and correctness matters (payments, orders, anything with invariants across rows). Scales further than people assume (replicas, partitioning, modern Postgres handles enormous load).

NoSQL is not one thing โ€” itโ€™s four families, each tuned for an access pattern:

FamilyExamplesData modelBest forWeakness
Key-valueRedis, DynamoDB, Memcachedkey โ†’ blobMassive-scale get/put by key, sessions, cachesNo queries beyond the key
DocumentMongoDB, CouchbaseJSON-like docsFlexible/nested schema, self-contained entitiesJoins weak; cross-doc consistency hard
Wide-columnCassandra, HBase, Bigtablerows with dynamic columns, partition+clustering keysWrite-heavy, time-series, huge volume, predictable queriesQuery must match the key design; no ad-hoc joins
GraphNeo4j, Neptunenodes + edgesDeeply connected data, traversals (social graph, fraud)Doesnโ€™t scale horizontally as easily
Time-seriesInfluxDB, TimescaleDBtimestamped pointsMetrics, IoT, monitoring (append + range scan)Niche; specialized query model

The honest interview framing: โ€œSQL by default for relational, transactional data. I reach for a specific NoSQL family when an access pattern demands it: KV for raw-scale get/put, document for flexible nested shapes, wide-column for write-heavy time-series at huge volume, graph for traversals โ€” accepting weaker joins or consistency in exchange.โ€

Indexing: B-tree vs hash vs LSM

An index is a data structure that lets the DB find rows without scanning all of them. The cost is always the same: indexes use space and slow writes (every insert/update maintains them), so index the columns you filter/sort/join on โ€” not everything.

  • B-tree (default in Postgres/MySQL): a balanced sorted tree โ†’ O(log n) point lookups and range scans (WHERE age BETWEEN, ORDER BY). The workhorse.
  • Hash index: O(1) exact-match lookups, but no range queries and no ordering. Use only when you exclusively do equality lookups.
  • LSM-tree (Cassandra, RocksDB, LevelDB): buffers writes in memory (memtable), flushes sorted runs to disk (SSTables), compacts in the background. Write-optimized โ€” sequential disk writes, great throughput โ€” at the cost of read amplification (a read may check several SSTables; mitigated by Bloom filters) and compaction overhead.
B-tree:  fast reads + ranges,  slower random writes  โ†’ read-heavy, OLTP
LSM-tree: very fast writes,    reads need merging     โ†’ write-heavy, time-series

A composite index on (a, b) serves queries filtering on a, or a then b, but not b alone (left-prefix rule). A covering index that includes all selected columns lets the DB answer from the index without touching the table.

Rule of thumb
B-tree for read-heavy/range queries; LSM for write-heavy ingestion. Order composite-index columns most-selective-and-equality first, range column last. And remember: every index is a tax on writes โ€” profile before adding.

Normalization vs denormalization

  • Normalized (3NF): each fact stored once, related by foreign keys. No update anomalies, less storage โ€” but reads need joins, which get expensive at scale.
  • Denormalized: duplicate data so a read hits one place (e.g. embed the author name in each post). Faster reads, but writes must update every copy and you risk inconsistency.

Normalize for write-heavy, integrity-critical data; denormalize for read-heavy paths where join cost dominates. NoSQL stores push you toward denormalization because they lack good joins โ€” you model the data around the query.

ACID vs BASE

ACID (relational): Atomic (all-or-nothing), Consistent (invariants hold), Isolated (concurrent txns donโ€™t corrupt each other), Durable (committed survives a crash). Use it when multiple writes must succeed together โ€” a money transfer debits one account and credits another; partial completion is a bug.

BASE (many NoSQL): Basically Available, Soft state, Eventually consistent. Trade strict consistency for availability and scale. Right for feeds, counters, catalogs where a brief stale read is harmless.

Isolation levels & their anomalies

Isolation is the โ€œIโ€ in ACID โ€” how concurrent transactions interleave. Stronger isolation prevents more anomalies but costs concurrency/throughput.

LevelPreventsStill allows
Read Uncommittedโ€”Dirty reads (see uncommitted data)
Read CommittedDirty readsNon-repeatable reads, phantoms
Repeatable Read+ Non-repeatable readsPhantoms (in standard SQL; MySQL/Postgres add protections)
Serializable+ Phantoms โ€” behaves as if serialLowest concurrency

The anomalies, concretely:

  • Dirty read โ€” you read another txnโ€™s uncommitted change that may roll back.
  • Non-repeatable read โ€” you read a row twice in one txn and get different values (someone committed an update between).
  • Phantom read โ€” you run the same range query twice and new rows appear (someone inserted matching rows).
  • Lost update โ€” two txns read-modify-write the same row; one overwrites the other. (Fix: SELECT ... FOR UPDATE or atomic increments.)

Default in Postgres is Read Committed; MySQL InnoDB defaults to Repeatable Read. Name the anomaly youโ€™re guarding against and pick the weakest level that prevents it โ€” Serializable is correct but throttles throughput.

Transactions caveat: the N+1 trap

Fetching a list, then firing one query per item, is N+1 queries โ€” 1 query for 100 posts, then 100 for each author = 101 round trips, a silent killer. Fix with eager loading: a join, or a single WHERE author_id IN (...) batch.

Sharding / partitioning & rebalancing

When one machine canโ€™t hold the writes/storage, split data across nodes by a shard key:

StrategyProCon
Range (by key range)Efficient range scansHot ranges (recent data on one shard)
Hash (hash the key)Even distributionNo range queries
Directory/lookupFlexible rebalancingLookup service is a dependency/SPOF

Choosing the key: derive it from your access patterns and pick one that distributes load evenly and keeps related data co-located (so common queries hit one shard). A bad key creates hot shards and forces scatter-gather across all shards.

Rebalancing is the hard part. hash % N reshuffles everything when N changes; instead use consistent hashing or pre-split into many small partitions and move whole partitions to new nodes. Either way you avoid moving most of the data.

Replication topologies

TopologyWritesReadsTradeoff
Leader-follower (single leader)Leader onlyFollowers (scale reads)Replication lag โ†’ stale follower reads; failover needed
Multi-leaderAny leaderAnyMulti-region low-latency writes, but write conflicts to resolve
Leaderless / quorum (Dynamo)Any N nodesAnyTunable consistency via W + R > N; no single failover point
  • Sync vs async (leader-follower): sync waits for a follower ack โ†’ no data loss on failover, higher latency; async โ†’ fast writes but a crashed leader loses the un-replicated tail and followers lag.
  • Quorum math: with N replicas, if writes go to W and reads from R, then W + R > N guarantees a read overlaps a write โ†’ strong-ish consistency. Tune (W=N,R=1) for read-fast or (W=1,R=N) for write-fast.

Access patterns drive the schema โ€” worked example

Design storage for a chat app. Donโ€™t start with entities โ€” start with the queries:

Access patterns:

  1. Load the most recent 50 messages in a conversation (by far the hottest).
  2. Page older messages on scroll-up.
  3. Show a userโ€™s list of conversations, most-recently-active first.

Decision: This is write-heavy (every message is a write), read by conversation in time order, with no need for cross-conversation joins. A wide-column store (Cassandra) fits:

Table messages
  partition key:  conversation_id      -- co-locates a conversation on one node
  clustering key: message_ts DESC       -- stored sorted by time โ†’ pattern 1 & 2 are a single seek

Table conversations_by_user
  partition key:  user_id
  clustering key: last_active_ts DESC   -- pattern 3

Pattern 1 becomes a single partition read of the first 50 rows โ€” no scan, no join. Pattern 3 is a denormalized second table maintained on write (write amplification we accept for read speed โ€” classic CQRS-flavored modeling). Had I started from a normalized ER diagram, Iโ€™d have a messages table needing an index on (conversation_id, ts) and a join for the conversation list โ€” fine at small scale, but the partition-key design is what scales. The queries chose the store, the keys, and the duplication.

Interview questions & model answers

Q: SQL or NoSQL for a new service โ€” how do you decide? โ€œBy access pattern and consistency needs. If the data is relational with cross-row invariants and I need transactions โ€” SQL. If I need a specific pattern at scale โ€” raw get/put (KV), flexible nested docs (document), or write-heavy time-series (wide-column) โ€” and can give up joins or strong consistency, the matching NoSQL family. I default to Postgres and justify any move away from it.โ€

Q: B-tree vs LSM-tree? โ€œB-tree gives balanced read/write with fast point and range queries โ€” ideal for read-heavy OLTP. LSM buffers writes in memory and flushes sorted runs, so itโ€™s write-optimized with sequential I/O โ€” great for ingestion and time-series โ€” but reads may merge several SSTables (Bloom filters help) and compaction adds background work. So: read-heavy โ†’ B-tree, write-heavy โ†’ LSM.โ€

Q: Walk me through isolation levels. โ€œRead Uncommitted allows dirty reads. Read Committed stops those but allows non-repeatable and phantom reads. Repeatable Read stops non-repeatable reads. Serializable behaves as if transactions ran one at a time, preventing phantoms too โ€” but with the least concurrency. I pick the weakest level that prevents the specific anomaly I care about, and use SELECT FOR UPDATE or atomic increments to avoid lost updates.โ€

Q: How do you choose a shard key? โ€œFrom the access patterns: it must distribute load evenly to avoid hot shards, and co-locate data thatโ€™s queried together so common queries hit one shard instead of scatter-gather. For a chat app, conversation_id co-locates a conversation. I avoid keys that monotonically increase (timestamps) because recent traffic piles onto one shard.โ€

Q: How do you rebalance shards without moving all the data? โ€œAvoid hash % N, which reshuffles everything when N changes. Use consistent hashing โ€” adding a node only remaps one arc โ€” or pre-split into many small partitions and migrate whole partitions to new nodes. Either way, only a fraction of the data moves.โ€

Q: Explain quorum consistency. โ€œWith N replicas, write to W and read from R. If W + R > N, every read set overlaps every write set, so a read sees the latest write โ€” strong-ish consistency without a single leader. You tune W and R for read- or write-optimized behavior, trading latency against consistency.โ€

Q: What is the N+1 problem and how do you fix it? โ€œLoading a list then querying once per item โ€” 1 + N round trips. I fix it by eager-loading: a join or a single batched WHERE id IN (...). ORMs hide it, so I watch query logs.โ€

Common mistakes / what weak candidates do

  • โ€œNoSQL is faster/more scalableโ€ as a blanket claim, without naming a family or an access pattern.
  • Indexing everything (or nothing) โ€” not connecting indexes to the actual filter/sort/join columns, or ignoring their write cost.
  • Not knowing isolation levels beyond โ€œACID,โ€ and missing lost-update/phantom anomalies.
  • Choosing a monotonic shard key (auto-increment id, timestamp) that creates a hot shard, or ignoring rebalancing entirely.
  • Designing entities before access patterns, then unable to justify the keys or indexes.
  • Assuming dual writes to two stores are consistent (use the outbox pattern instead).
  • Forgetting replication lag โ€” reading your own write from a lagging replica and calling it a bug in the app.

Say it out loud
โ€œI let access patterns drive the schema. Default to relational for transactional, relational data; reach for a specific NoSQL family when a pattern needs horizontal scale or flexible schema, accepting weaker joins/consistency. B-tree for reads and ranges, LSM for write-heavy ingestion. I index the columns I filter and join on, pick the weakest isolation level that prevents the anomaly I care about, choose a shard key that distributes load and co-locates related data, and scale reads with replicas (knowing the lag) and writes by sharding โ€” rebalancing with consistent hashing so I donโ€™t move all the data.โ€

Likely follow-up questions
  • When would you pick a wide-column store over a document store?
  • B-tree vs LSM-tree โ€” what's the read/write tradeoff?
  • Walk me through the isolation levels and the anomalies they prevent.
  • How do you choose a shard key, and how do you rebalance?
  • Leader-follower vs multi-leader vs quorum replication?

References