APIs & communication: REST, gRPC, GraphQL, WebSockets

API styles and when to choose each, REST done right (status codes, idempotency, pagination, versioning), real-time options, sync vs async, and resilience: retries, timeouts, circuit breakers, rate limiting.

must · medium · 30 min · api, rest, grpc, graphql, websockets, idempotency, resilience
Why interviewers ask this
Every backend design includes choosing how services and clients talk. Knowing the styles, their tradeoffs, and resilience patterns (retries, timeouts, circuit breakers) signals real engineering judgment.

How clients and services talk is a design decision with real tradeoffs. Pick the style by the communication pattern, design the API well, and make the calls resilient: those are the three things interviewers probe.

API styles and when to choose each

Style | Transport / format | Strengths | Weaknesses | Reach for it when
REST | HTTP + JSON | Ubiquitous, cacheable, simple, tooling everywhere | Over/under-fetching; many round trips for nested data | Public APIs, CRUD, broad client compatibility
gRPC | HTTP/2 + Protobuf (binary) | Fast, compact, strong typing/codegen, streaming, bidirectional | Not browser-native (needs proxy), binary = harder to debug | Internal service-to-service, low-latency, polyglot microservices
GraphQL | HTTP + query language | Client picks exactly the fields → no over/under-fetch; one round trip for nested data; strong schema | Caching is harder (POST queries); server complexity; risk of expensive queries | Many varied clients, deeply nested data, mobile (bandwidth)

Over-fetching = the endpoint returns more than the view needs (wasted bytes). Under-fetching = the view needs several endpoints / nested calls (waterfall round trips). REST is prone to both; GraphQL solves them by letting the client specify the shape, at the cost of caching and query-cost control (mitigated with persisted queries, depth/complexity limits, and DataLoader to batch and avoid N+1).
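The batching idea behind DataLoader can be sketched in a few lines. This is a simplified, synchronous stand-in, not the DataLoader library itself: `BatchLoader`, `fetch_users`, and the sample data are hypothetical names invented for illustration, and a real loader would defer dispatch to the end of an event-loop tick instead of requiring an explicit `dispatch()` call.

```python
from typing import Callable, Dict, Hashable, List


class BatchLoader:
    """Queues keys and resolves them in one backend call (synchronous simplification)."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], Dict[Hashable, dict]]):
        self._batch_fn = batch_fn
        self._pending: List[Hashable] = []
        self._cache: Dict[Hashable, dict] = {}

    def load(self, key: Hashable) -> None:
        # Deduplicate: a key already cached or queued is not requested again.
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)

    def dispatch(self) -> None:
        # One call for every queued key instead of one call per key.
        if self._pending:
            self._cache.update(self._batch_fn(list(self._pending)))
            self._pending.clear()

    def get(self, key: Hashable) -> dict:
        return self._cache[key]


# Hypothetical data: three posts, two distinct authors.
POSTS = [{"id": 1, "author_id": 10}, {"id": 2, "author_id": 11}, {"id": 3, "author_id": 10}]
USERS = {10: {"name": "Ada"}, 11: {"name": "Lin"}}
backend_calls: List[List[Hashable]] = []


def fetch_users(ids: List[Hashable]) -> Dict[Hashable, dict]:
    backend_calls.append(ids)  # stands in for one SELECT ... WHERE id IN (...)
    return {user_id: USERS[user_id] for user_id in ids}


def resolve_post_authors() -> List[str]:
    loader = BatchLoader(fetch_users)
    for post in POSTS:
        loader.load(post["author_id"])
    loader.dispatch()
    return [loader.get(post["author_id"])["name"] for post in POSTS]
```

A naive resolver would issue one user lookup per post (the N+1 pattern); here the three posts trigger a single backend call for the two distinct author ids.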

Rule of thumb
REST for public/CRUD APIs, gRPC for internal high-throughput service-to-service, GraphQL when diverse clients need flexible, nested data and you’d otherwise build a dozen bespoke endpoints. They’re not mutually exclusive: gRPC internally, REST/GraphQL at the edge is common.

REST done right

REST is “use HTTP as intended.” Senior signals:

Resource design. Nouns, not verbs: GET /users/123/orders, not /getUserOrders. Use HTTP methods for semantics: GET (read, safe), POST (create, not idempotent), PUT (replace, idempotent), PATCH (partial update, not guaranteed idempotent), DELETE (remove, idempotent).

Status codes (use the real ones, not 200 for everything):

  • 2xx: 200 OK, 201 Created (+ Location), 202 Accepted (async), 204 No Content.
  • 4xx: 400 bad request, 401 unauthenticated, 403 forbidden, 404 not found, 409 conflict, 422 validation, 429 rate-limited.
  • 5xx: 500 server error, 503 unavailable (with Retry-After).

Idempotency. GET/PUT/DELETE are idempotent by definition; POST is not. For unsafe-to-retry operations (payments, orders) accept an Idempotency-Key header: the server records the key + result, so a retried request returns the original outcome instead of acting twice. (See classics.)
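The idempotency-key flow can be sketched as follows. This is a minimal, single-process illustration under stated assumptions: `IdempotentHandler`, `charge_card`, and the in-memory dictionary are hypothetical, and rejecting a reused key with a different body (here with 422) is one possible policy, not something the text above prescribes. A production version would need a durable store with expiry and an atomic insert so that two concurrent requests with the same key cannot both execute.

```python
from typing import Callable, Dict, List, Tuple


class IdempotentHandler:
    """Replays the stored outcome when the same idempotency key is seen again."""

    def __init__(self, operation: Callable[[dict], dict]):
        self._operation = operation
        # Hypothetical in-memory store: key -> (request body, status, response).
        self._results: Dict[str, Tuple[dict, int, dict]] = {}

    def handle(self, idempotency_key: str, body: dict) -> Tuple[int, dict]:
        if idempotency_key in self._results:
            stored_body, status, response = self._results[idempotency_key]
            if stored_body != body:
                # Same key, different payload: reject rather than guess (assumed policy).
                return 422, {"error": "idempotency key reused with a different body"}
            return status, response  # replay the original outcome, do not act again
        response = self._operation(body)
        self._results[idempotency_key] = (body, 201, response)
        return 201, response


charges: List[int] = []


def charge_card(body: dict) -> dict:
    # Stand-in for a side effect that must not happen twice.
    charges.append(body["amount"])
    return {"charge_id": len(charges), "amount": body["amount"]}


handler = IdempotentHandler(charge_card)
```

Calling `handler.handle("key-1", {"amount": 50})` twice charges once; the second call returns the stored status and response.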

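Pagination. Cursor-based for large/changing lists (?cursor=&limit=): results stay stable as the head changes, and resuming is an index seek whose cost does not grow with page depth (unlike OFFSET, which scans and discards the skipped rows). Offset/page only for small, stable datasets.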
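A small sketch of cursor pagination, assuming rows sorted by an ascending integer `id` and an opaque base64 cursor that wraps the last id seen. The function names and cursor layout are illustrative choices; the in-memory list stands in for an indexed table.

```python
import base64
import json
from typing import List, Optional, Tuple


def encode_cursor(last_id: int) -> str:
    # Opaque to clients; the server is free to change the layout later.
    return base64.urlsafe_b64encode(json.dumps({"after_id": last_id}).encode()).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["after_id"]


def list_page(rows: List[dict], cursor: Optional[str], limit: int) -> Tuple[List[dict], Optional[str]]:
    """Return one page and the cursor for the next page (None on the last page)."""
    after_id = decode_cursor(cursor) if cursor else None
    # In SQL terms: WHERE id > :after_id ORDER BY id LIMIT :limit + 1
    matching = [r for r in rows if after_id is None or r["id"] > after_id][: limit + 1]
    page = matching[:limit]
    # Fetching one extra row tells us whether another page exists.
    next_cursor = encode_cursor(page[-1]["id"]) if len(matching) > limit else None
    return page, next_cursor
```

Because the cursor names a position ("after id 2") and not a count of rows to skip, deleting or inserting rows at the head between requests does not shift the next page.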

Versioning. URL (/v1/), header (e.g. Accept: application/vnd.example+json; version=1), or query param. URL versioning is the most visible/cacheable. Version when you make breaking changes; prefer additive, backward-compatible changes to avoid versioning at all.

Other niceties: ETag/If-None-Match for caching/conditional requests, consistent error envelopes, filtering/sorting via query params, HATEOAS (rarely required but worth naming).
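As an illustration of the ETag/If-None-Match exchange, here is a minimal sketch. It assumes a content-hash ETag and an exact single-value match; `make_etag` and `conditional_get` are hypothetical helpers, and a real server would also handle comma-separated lists, `*`, and weak validators.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    # Validator derived from the content; the double quotes are part of the ETag syntax.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET with an optional If-None-Match value."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's cached copy is still current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

The client stores the ETag from the first 200 response and sends it back on the next request; an unchanged resource then costs only headers.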

Real-time: WebSockets vs SSE vs long-polling

When the server must push to the client (chat, live scores, notifications):

Technique | Direction | How | Best for | Cost
Short polling | client→server | Client requests on a timer | Simple, infrequent updates | Wasteful, laggy
Long polling | client→server | Request held open until data or timeout, then re-issued | Fallback when WS unavailable | Many held connections, hacky
SSE (Server-Sent Events) | server→client only | One long-lived HTTP stream, auto-reconnect | Server push: feeds, notifications, dashboards | Unidirectional; HTTP/1.1 connection limits
WebSocket | bidirectional | Persistent full-duplex TCP after HTTP upgrade | Chat, collaboration, games, anything two-way | Stateful connections complicate scaling/LB

Choose by direction + frequency. Server-only push at moderate rate → SSE (simpler, plain HTTP, auto-reconnect). Two-way / high-frequency → WebSocket. No real-time need → don’t; just poll or refetch. Scaling WebSockets means handling sticky/stateful connections, a pub/sub backplane (Redis) to broadcast across server instances, and connection limits. Call that out.
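To make "plain HTTP, auto-reconnect" concrete, the sketch below formats events in the text/event-stream wire format and replays missed events on reconnect. It assumes integer event ids and an in-memory event log; `format_sse` and `replay_after` are hypothetical helpers, and the HTTP server around them is omitted.

```python
from typing import List, Optional, Tuple


def format_sse(data: str, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Serialize one event; a blank line terminates it."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads become one data: line per line of text.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    return "\n".join(lines) + "\n\n"


def replay_after(log: List[Tuple[int, str]], last_event_id: Optional[str]) -> str:
    """Events newer than the id the browser reports in Last-Event-ID on reconnect."""
    last_seen = int(last_event_id) if last_event_id else 0
    return "".join(format_sse(data, event_id=str(eid)) for eid, data in log if eid > last_seen)
```

On reconnect the browser's EventSource sends the last id it received in the `Last-Event-ID` header, so the server can resume the stream without the client writing any retry logic.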

Synchronous vs asynchronous (request/response vs event-driven)

  • Synchronous (request/response): caller waits for the result. Simple, immediate, but couples caller to callee’s availability and latency, and chains of sync calls compound latency and failure (one slow service stalls the whole request).
  • Asynchronous (event-driven): caller emits an event / enqueues a job and moves on; the work happens later. Decouples services, smooths spikes, and improves resilience, at the cost of eventual consistency and harder debugging/tracing.

SYNC:   client → A → B → C   (waits; A's latency = A+B+C; C down → whole call fails)
ASYNC:  client → A → [queue] → workers → B, C   (A returns now; B/C catch up; retries safe)

Use sync when the caller genuinely needs the answer now (read a balance). Use async for slow/spiky/fan-out work (send email, process upload, update search index), and pair it with the outbox pattern so the event and the state change are atomic. “I’d return 202 and process asynchronously” is the right answer for anything the user doesn’t need to block on.
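The outbox pattern mentioned above can be sketched with SQLite from the standard library. The table names, the `order_placed` event shape, and the `publish` callback are assumptions for illustration; a real relay would run continuously and publish to a broker.

```python
import json
import sqlite3
from typing import Callable


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def place_order(conn: sqlite3.Connection, item: str) -> int:
    # One transaction: the order row and its event commit together or not at all.
    with conn:
        cur = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        order_id = cur.lastrowid
        event = {"type": "order_placed", "order_id": order_id}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))
    return order_id


def relay_once(conn: sqlite3.Connection, publish: Callable[[dict], None]) -> int:
    """Publish pending events in order; returns how many were pending."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # if this raises, the row stays pending and is retried
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because a crash between `publish` and the UPDATE causes the event to be sent again, delivery is at-least-once and consumers should be idempotent.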

API gateway responsibilities

The single entry point in front of a microservices fleet, centralizing cross-cutting concerns so services stay focused:

  • AuthN/AuthZ: validate tokens once at the edge.
  • Rate limiting & quotas: shed abuse before it reaches services.
  • Routing & aggregation: path-based routing; combine several backend calls into one client response (BFF-style).
  • Protocol translation: REST at the edge ↔ gRPC internally.
  • TLS termination, logging, metrics, caching.

Keep it thin (no business logic, because that recreates a monolith) and replicated (it’s a choke point and potential SPOF).

Resilience: timeouts, retries, circuit breakers

Networks fail; calls hang. These three patterns turn brittle calls into resilient ones, and they interact, so know the order.

Timeouts. Never make an unbounded network call. A missing timeout means one slow dependency exhausts your thread/connection pool and cascades. Set aggressive, per-call timeouts (and budget them across a request chain).
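Budgeting timeouts across a request chain can be sketched as a deadline object that each downstream call consults. `Deadline` and its method names are hypothetical; the clock is injectable only so the behavior can be checked without waiting.

```python
import time
from typing import Callable


class Deadline:
    """Tracks the time left in a request's overall budget."""

    def __init__(self, budget_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._expires_at = clock() + budget_seconds

    def remaining(self) -> float:
        return max(0.0, self._expires_at - self._clock())

    def timeout_for_call(self, per_call_cap: float) -> float:
        """Timeout for the next call: the per-call cap, or less if the budget is nearly spent."""
        remaining = self.remaining()
        if remaining <= 0:
            # No point starting a call whose result nobody will wait for.
            raise TimeoutError("request deadline exceeded")
        return min(per_call_cap, remaining)
```

The returned value would be passed as the timeout argument of whatever client makes the call, so a chain of calls can never exceed the budget the caller was promised.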

Retries. Retry transient failures (timeouts, 503, connection resets), but only idempotent operations, with exponential backoff + jitter (so retries don’t synchronize into a thundering herd), and a cap (a few attempts, not infinite). Retrying a non-idempotent POST without an idempotency key performs the action twice.
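A minimal sketch of capped retries with exponential backoff and jitter. The "full jitter" variant (sleep a random amount between 0 and the exponential ceiling) is one common choice and an assumption here; `TransientError` is a hypothetical marker exception, and `sleep` and `rng` are injectable so the logic can be exercised without real delays.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeout, 503, connection reset)."""


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # cap reached: surface the failure to the caller
            # Ceiling doubles each attempt, bounded by max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * ceiling)  # full jitter: uniform in [0, ceiling)
    raise AssertionError("unreachable")
```

Only exceptions marked transient are retried; anything else propagates immediately, and the wrapped call is assumed to be idempotent.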

Circuit breaker. If a dependency keeps failing, stop calling it. The breaker tracks the failure rate; past a threshold it trips (open) and fails fast for a cooldown (no waiting on a dead service), then half-opens to test recovery with a trial request, and closes when healthy. This prevents retries from hammering a struggling service and gives it room to recover, and it lets you degrade gracefully (serve a cached/default response while open).

CLOSED ──(failures > threshold)──► OPEN ──(cooldown)──► HALF-OPEN
   ▲                                                        │
   └──────────────(trial succeeds)──────────────────────────┘
              (trial fails → back to OPEN)
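The state machine above can be sketched as a small class. Two simplifications are assumed and worth stating: it counts consecutive failures where the text describes tracking a failure rate, and it trips when the count reaches the threshold. `CircuitBreaker` and `CircuitOpenError` are hypothetical names, the sketch is not thread-safe, and the clock is injectable so the cooldown can be tested without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised instead of calling the dependency while the breaker is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self.state = "CLOSED"
        self._failures = 0
        self._opened_at = 0.0

    def call(self, fn: Callable[[], T]) -> T:
        if self.state == "OPEN":
            if self._clock() - self._opened_at < self.cooldown_seconds:
                raise CircuitOpenError("failing fast while the breaker is open")
            self.state = "HALF_OPEN"  # cooldown elapsed: let one trial call through
        try:
            result = fn()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self) -> None:
        self._failures += 1
        # A failed trial reopens immediately; otherwise trip at the threshold.
        if self.state == "HALF_OPEN" or self._failures >= self.failure_threshold:
            self.state = "OPEN"
            self._opened_at = self._clock()

    def _on_success(self) -> None:
        self._failures = 0
        self.state = "CLOSED"
```

A caller would catch `CircuitOpenError` and serve the cached or default response, which is the graceful-degradation path described above.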

Together: timeout bounds each call, retry w/ backoff handles transient blips, circuit breaker stops retries from worsening a sustained outage, and bulkheads (isolated pools per dependency) keep one failing dependency from sinking the rest. Add rate limiting (token bucket) at the edge to protect against overload. (See building blocks and classics.)
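The token bucket named above can be sketched in a few lines. The class and parameter names are illustrative, the sketch is single-threaded, and a real edge limiter would keep one bucket per client or API key, usually in shared storage.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained rate of `rate_per_second`."""

    def __init__(
        self,
        rate_per_second: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.rate = rate_per_second
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full so an initial burst is allowed
        self._last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill lazily based on elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last_refill) * self.rate)
        self._last_refill = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False  # caller would respond 429, ideally with Retry-After
```

Capacity sets how large a burst is tolerated; the refill rate sets the long-run average.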

Interview questions & model answers

Q: REST vs gRPC vs GraphQL? “REST for public/CRUD APIs: ubiquitous, cacheable, simple. gRPC for internal service-to-service: binary Protobuf over HTTP/2 is fast and strongly typed with codegen and streaming, but not browser-native. GraphQL when many varied clients need flexible nested data and REST would over/under-fetch: the client picks the fields, at the cost of harder caching and query-cost control. Often gRPC internally, REST or GraphQL at the edge.”

Q: Over-fetching vs under-fetching? “Over-fetching: the endpoint returns more than the view needs, which wastes bandwidth. Under-fetching: the view needs multiple calls, causing waterfall round trips. REST suffers both; GraphQL fixes them by letting the client specify the exact shape, which I’d protect with depth/complexity limits and DataLoader batching to avoid N+1.”

Q: WebSockets vs SSE vs long-polling? “By direction and frequency. Server-to-client push at moderate rate: SSE, one long-lived HTTP stream with auto-reconnect, simple. Two-way or high-frequency: WebSocket, persistent full-duplex, but stateful, so scaling needs sticky connections and a Redis pub/sub backplane to broadcast across instances. Long-polling only as a fallback. No real-time need: just refetch.”

Q: How do you make a POST safe to retry? “An idempotency key: the client sends a unique key per logical operation; the server records the key with its result, so a retry returns the original outcome instead of acting twice. Combined with timeouts and bounded retries, that makes the call safe under network failures.”

Q: Sync vs async: when each? “Sync when the caller needs the answer now and the chain is short, such as reading data. Async for slow, spiky, or fan-out work: return 202, enqueue, and process in workers, which decouples services and smooths load at the cost of eventual consistency. I’d pair async event publishing with the outbox pattern so the event and DB write are atomic.”

Q: How do retries, timeouts, and circuit breakers fit together? “Timeouts bound every call so a hung dependency can’t exhaust my pool. Retries with exponential backoff and jitter handle transient failures, but only for idempotent calls, with a cap. A circuit breaker trips after sustained failures so I fail fast instead of retrying a dead service, then half-opens to probe recovery. Plus bulkheads to isolate pools and rate limiting at the edge.”

Q: What does an API gateway do, and what should it not do? “It centralizes cross-cutting concerns: auth, rate limiting, routing, request aggregation, protocol translation, TLS, observability, so services stay focused. It should NOT hold business logic (that recreates a monolith at the edge) and must be replicated since it’s a choke point and SPOF.”

Q: How do you version a REST API? “Prefer additive, backward-compatible changes so I rarely need to version. When a breaking change is unavoidable, URL versioning (/v1/) is the most visible and cacheable; header-based is cleaner but less obvious. I deprecate old versions on a published timeline rather than breaking clients.”

Common mistakes / what weak candidates do

  • Returning 200 for everything (including errors), defeating clients and caches.
  • Treating POST as idempotent: retrying without an idempotency key double-charges.
  • Using verbs in REST paths (/getUser) and ignoring HTTP method semantics.
  • Defaulting to WebSockets when SSE (or plain refetch) suffices, then ignoring the stateful-scaling cost.
  • Making everything synchronous, so one slow service stalls the whole request chain.
  • Calling without timeouts, or retrying non-idempotent ops / retrying without backoff+jitter (thundering herd).
  • No circuit breaker, so retries pile onto a failing dependency and cascade.
  • Putting business logic in the gateway or forgetting it’s a SPOF that must be replicated.
  • Choosing GraphQL without addressing caching and query-cost/N+1 control.

Say it out loud
“Pick the style by the pattern: REST for public/CRUD, gRPC for internal high-throughput, GraphQL for flexible nested data across many clients. Do REST right: proper status codes, idempotency keys on unsafe retries, cursor pagination, additive versioning. SSE for server push, WebSockets for two-way. Default to async (202 + queue + outbox) for slow/spiky work to decouple services. And make every call resilient: timeout, retry with backoff+jitter on idempotent ops, circuit breaker, bulkheads, and edge rate limiting.”

Likely follow-up questions
  • REST vs gRPC vs GraphQL: when each?
  • WebSockets vs SSE vs long-polling for real-time?
  • How do you make a POST safe to retry?
  • What does an API gateway do?
  • Retries, timeouts, circuit breakers: how do they fit together?
