API Design: REST, gRPC & GraphQL

Design APIs interviewers respect: resource modeling, the right status codes, idempotency, pagination, versioning, error envelopes, and when to reach for gRPC or GraphQL over REST. Also covers contract-first thinking and the resilience patterns that wrap every call.

must | medium | 30 min | tags: api, rest, grpc, graphql, idempotency, versioning, pagination
Why interviewers ask this
API design is the surface every backend service exposes. It shows whether you think in contracts, model resources cleanly, and handle the hard parts (idempotency, pagination, versioning, errors) that separate a hobby endpoint from a production API.

An API is a contract. The job is to model resources cleanly, choose the right protocol for the communication pattern, and design the hard parts (idempotency, pagination, versioning, errors) so clients can depend on you for years. This lesson is the backend-dev view; the system-design version covers the same ground from a whiteboard angle.

Pick the style by the communication pattern

| Style | Transport / format | Strengths | Weaknesses | Reach for it when |
|---|---|---|---|---|
| REST | HTTP + JSON | Ubiquitous, cacheable, simple, tooling everywhere | Over/under-fetching; many round trips for nested data | Public APIs, CRUD, broad client reach |
| gRPC | HTTP/2 + Protobuf (binary) | Fast, compact, codegen, streaming, strong typing | Not browser-native (needs a proxy), binary is harder to debug | Internal service-to-service, low latency, polyglot |
| GraphQL | HTTP + query language | Client picks exact fields, one round trip for nested data, strong schema | Caching is hard, server complexity, query-cost risk (N+1) | Many varied clients, deeply nested data, mobile |

They are not mutually exclusive: gRPC internally, with REST or GraphQL at the edge, is the common production shape.

Rule of thumb
REST for public/CRUD APIs, gRPC for internal high-throughput service-to-service, GraphQL when diverse clients need flexible nested data and you'd otherwise ship a dozen bespoke endpoints.

REST done right

REST is "use HTTP as intended." The senior signals:

Resource modeling. Nouns, not verbs: GET /users/123/orders, never /getUserOrders. Methods carry the semantics: GET (read, safe), POST (create), PUT (replace, idempotent), PATCH (partial), DELETE (idempotent).

Status codes. Use the real ones, not 200 for everything:

  • 2xx: 200 OK, 201 Created (+ Location header), 202 Accepted (async), 204 No Content.
  • 4xx: 400 bad request, 401 unauthenticated, 403 forbidden, 404 not found, 409 conflict, 422 validation, 429 rate-limited.
  • 5xx: 500 server error, 503 unavailable (+ Retry-After).
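
One way to keep status codes honest is to map failures to codes in a single place rather than deciding per handler. A minimal sketch using Python's standard `http.HTTPStatus`; the exception classes and `status_for` are hypothetical names for illustration, not part of any framework:

```python
from http import HTTPStatus


# Hypothetical domain exceptions, named for illustration only.
class NotAuthenticated(Exception): ...
class Forbidden(Exception): ...
class NotFound(Exception): ...
class Conflict(Exception): ...
class ValidationFailed(Exception): ...
class RateLimited(Exception): ...


# One table that maps each failure to a status code from the list above.
STATUS_FOR_ERROR = {
    NotAuthenticated: HTTPStatus.UNAUTHORIZED,          # 401
    Forbidden: HTTPStatus.FORBIDDEN,                    # 403
    NotFound: HTTPStatus.NOT_FOUND,                     # 404
    Conflict: HTTPStatus.CONFLICT,                      # 409
    ValidationFailed: HTTPStatus.UNPROCESSABLE_ENTITY,  # 422
    RateLimited: HTTPStatus.TOO_MANY_REQUESTS,          # 429
}


def status_for(exc: Exception) -> HTTPStatus:
    """Return the mapped status, or 500 for anything unexpected."""
    return STATUS_FOR_ERROR.get(type(exc), HTTPStatus.INTERNAL_SERVER_ERROR)
```

Anything not in the table falls through to 500, so an unmapped failure is never reported as a success.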

Error envelope. Return a consistent, machine-readable shape so clients can branch on a code, not parse prose:

{
  "error": {
    "code": "INSUFFICIENT_FUNDS",
    "message": "Balance 12.00 is below the 50.00 charge",
    "requestId": "req_9f2a...",
    "details": [{ "field": "amount", "issue": "exceeds_balance" }]
  }
}
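
A small helper keeps that shape identical across endpoints. The following is a sketch only: `error_envelope` is a hypothetical function, and the generated `req_` identifier format is an assumption, since a real service would normally reuse the request ID from its tracing layer.

```python
import json
import uuid


def error_envelope(code, message, details=None, request_id=None):
    """Build the error body in the shape shown above."""
    return {
        "error": {
            "code": code,        # stable and machine-readable; clients branch on this
            "message": message,  # human-readable; wording may change over time
            # Assumed ID format, used only when the caller supplies none.
            "requestId": request_id or f"req_{uuid.uuid4().hex[:12]}",
            "details": details or [],
        }
    }


body = error_envelope(
    "INSUFFICIENT_FUNDS",
    "Balance 12.00 is below the 50.00 charge",
    details=[{"field": "amount", "issue": "exceeds_balance"}],
)
print(json.dumps(body, indent=2))

# A client branches on the code, never on the message text.
needs_top_up = body["error"]["code"] == "INSUFFICIENT_FUNDS"
```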

Idempotency. GET/PUT/DELETE are idempotent by definition; POST is not. For unsafe-to-retry operations (payments, orders) accept an Idempotency-Key header. The server stores the key with its result, so a retried request returns the original outcome instead of charging twice. This is the single most-probed API-design detail.
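
The store-the-key-with-its-result idea can be sketched in a few lines. Everything here is an assumption for illustration: `charge` is a hypothetical handler, and the dictionary and list stand in for a durable store and a payment backend.

```python
import uuid

# In-memory stand-ins for a durable key store and a payment backend (assumptions).
_results_by_key = {}
charges_made = []


def charge(idempotency_key: str, account: str, amount_cents: int) -> dict:
    """Charge once per key; a retry with the same key replays the stored result."""
    if idempotency_key in _results_by_key:
        return _results_by_key[idempotency_key]

    charges_made.append((account, amount_cents))  # the side effect happens once
    result = {"charge_id": f"ch_{len(charges_made)}", "amount_cents": amount_cents}
    _results_by_key[idempotency_key] = result     # store the key with its result
    return result


key = str(uuid.uuid4())               # the client generates one key per logical operation
first = charge(key, "acct_1", 5000)
retry = charge(key, "acct_1", 5000)   # e.g. the client timed out and retried
```

This sketch ignores concurrency. In a real service the check and the store have to be atomic (for example a unique constraint on the key inside the same transaction as the write), and keys are usually expired after a retention window.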

Pagination. Cursor-based for large or changing lists (?cursor=&limit=): stable as the head moves, and resuming is an index seek whose cost does not grow with page depth. Offset/page only for small, stable datasets (the database still scans and discards the skipped rows, and items can be skipped or duplicated when the list mutates mid-paging).
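
A minimal sketch of cursor (keyset) pagination over an in-memory list, assuming rows are sorted by a unique, increasing `id`. `list_orders`, the `next_cursor` field, and the base64-encoded JSON cursor are illustrative choices, not a standard:

```python
import base64
import json
from typing import Optional

# Hypothetical dataset: rows sorted by a unique, increasing id.
ROWS = [{"id": i, "name": f"order-{i}"} for i in range(1, 11)]


def encode_cursor(last_id: int) -> str:
    """Opaque cursor: clients pass it back without interpreting it."""
    return base64.urlsafe_b64encode(json.dumps({"after": last_id}).encode()).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["after"]


def list_orders(cursor: Optional[str] = None, limit: int = 3) -> dict:
    """Return up to `limit` rows whose id is greater than the cursor position."""
    after = decode_cursor(cursor) if cursor else 0
    # In SQL this would be: WHERE id > :after ORDER BY id LIMIT :limit + 1
    matching = [r for r in ROWS if r["id"] > after][: limit + 1]
    page, has_more = matching[:limit], len(matching) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"items": page, "next_cursor": next_cursor}


page1 = list_orders(limit=3)
ROWS.insert(0, {"id": 0, "name": "order-0"})  # the list mutates mid-paging
page2 = list_orders(cursor=page1["next_cursor"], limit=3)
```

Here `page2` continues from id 4 even though a row was inserted at the head. An offset of 3 over the mutated list would have returned id 3 a second time.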

Versioning. Prefer additive, backward-compatible changes so you rarely version at all (add fields, never remove or repurpose them). When a breaking change is unavoidable: URL versioning (/v1/) is the most visible and cacheable; header versioning is cleaner but less obvious. Deprecate on a published timeline.
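
Additive changes are only safe if clients ignore fields they do not know. A sketch of such a tolerant reader, assuming a hypothetical `UserV1` resource and an `avatar_url` field added later:

```python
from dataclasses import dataclass, fields


@dataclass
class UserV1:
    """Fields an older client was built against (hypothetical resource)."""
    id: int
    name: str


def parse_user(payload: dict) -> UserV1:
    """Tolerant reader: keep known fields, ignore anything added later."""
    known = {f.name for f in fields(UserV1)}
    return UserV1(**{k: v for k, v in payload.items() if k in known})


old_response = {"id": 123, "name": "Ada"}
# Additive change: the server starts returning one more field.
new_response = {"id": 123, "name": "Ada", "avatar_url": "https://example.com/a.png"}

user_before = parse_user(old_response)
user_after = parse_user(new_response)  # same result; the old client keeps working
```

Removing or renaming `name` would make `parse_user` raise `TypeError` for this client, which is the kind of change that needs a new version.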

Caching & conditional requests. ETag + If-None-Match to return 304 Not Modified; Cache-Control for freshness. Ties into caching strategies.
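
The conditional-request flow can be sketched without a web framework. `make_etag` and `get_resource` are hypothetical helpers; the hash-based ETag and the `max-age=60` value are assumptions, and the sketch handles only a single ETag in If-None-Match (the header may also carry a list or `*`).

```python
import hashlib
import json
from typing import Optional


def make_etag(resource: dict) -> str:
    """Derive a quoted ETag from the serialized representation."""
    body = json.dumps(resource, sort_keys=True).encode()
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def get_resource(resource: dict, if_none_match: Optional[str] = None):
    """Return (status, headers, body); 304 with no body when the ETag matches."""
    etag = make_etag(resource)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        return 304, headers, None
    return 200, headers, resource


user = {"id": 123, "name": "Ada"}
status, headers, body = get_resource(user)                       # first fetch: 200 + ETag
revalidated = get_resource(user, if_none_match=headers["ETag"])  # unchanged: 304
changed = get_resource({**user, "name": "Grace"}, headers["ETag"])  # changed: 200
```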

gRPC: the internal workhorse

gRPC is contract-first: you write a .proto, generate typed clients/servers in every language, and get binary Protobuf over HTTP/2.

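syntax = "proto3";
service OrderService {
  rpc GetOrder(GetOrderRequest) returns (Order);                  // unary
  rpc StreamOrderEvents(OrderQuery) returns (stream OrderEvent);  // server streaming
}
message GetOrderRequest { string order_id = 1; }
// Placeholder messages so the file compiles; their fields are illustrative assumptions.
message OrderQuery { string order_id = 1; }
message Order { string order_id = 1; }
message OrderEvent { string order_id = 1; string kind = 2; }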

Wins: payloads often 5-10x smaller than JSON (the ratio depends on the data), codegen kills hand-written client bugs, four call types (unary, server-streaming, client-streaming, bidirectional streaming), and HTTP/2 multiplexing. Costs: not browser-native (needs gRPC-Web + a proxy), binary frames are harder to eyeball, and you must manage .proto evolution (field numbers are forever: only add, never reuse a number).
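
To see where the compactness comes from, and why field numbers cannot be reused, it helps to encode one message by hand. The following is a from-scratch sketch of the Protobuf wire format for a single string field. It is not the protobuf library, and `encode_varint` and `encode_string_field` are hypothetical helper names.

```python
import json


def encode_varint(n: int) -> bytes:
    """Base-128 varint: 7 bits per byte, high bit set on every byte but the last."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, value: str) -> bytes:
    """Tag is (field_number << 3) | wire_type; wire type 2 means length-delimited."""
    data = value.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# GetOrderRequest with order_id (field number 1) set to "ord_123".
wire = encode_string_field(1, "ord_123")
as_json = json.dumps({"order_id": "ord_123"}, separators=(",", ":")).encode()
print(len(wire), len(as_json))  # prints: 9 22
```

The field name never appears on the wire, only its number. That is why a number, once shipped, has to keep its meaning for as long as old readers and writers exist.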

GraphQL: client-shaped responses

One endpoint, the client sends a query describing exactly the fields it wants, solving REST's over-fetching (endpoint returns more than the view needs) and under-fetching (view needs several calls, creating a waterfall). The price: caching is harder (queries are POSTs), and a naive resolver triggers the N+1 problem (one query per nested item). Tame it with DataLoader batching, and cap blast radius with depth/complexity limits and persisted queries.
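
The batching idea behind DataLoader can be shown with a simplified, synchronous stand-in. This is not the DataLoader library's API (which is promise-based); `AuthorLoader`, `fetch_authors`, and the sample data are hypothetical.

```python
# Hypothetical data source that counts how many queries it receives.
AUTHORS = {1: "Ada", 2: "Grace", 3: "Linus"}
POSTS = [{"id": 10, "author_id": 1}, {"id": 11, "author_id": 2}, {"id": 12, "author_id": 1}]
query_count = 0


def fetch_authors(ids):
    """One round trip for many ids, like SELECT ... WHERE id IN (...)."""
    global query_count
    query_count += 1
    return {i: AUTHORS[i] for i in ids}


def resolve_naive(posts):
    """One query per post: the N+1 pattern."""
    return [fetch_authors([p["author_id"]])[p["author_id"]] for p in posts]


class AuthorLoader:
    """Collects keys first, then loads them in one batch and caches the results."""

    def __init__(self):
        self._pending = set()
        self._cache = {}

    def want(self, author_id):
        if author_id not in self._cache:
            self._pending.add(author_id)

    def dispatch(self):
        if self._pending:
            self._cache.update(fetch_authors(sorted(self._pending)))
            self._pending.clear()

    def get(self, author_id):
        return self._cache[author_id]


def resolve_batched(posts):
    loader = AuthorLoader()
    for p in posts:
        loader.want(p["author_id"])  # register every key before touching the source
    loader.dispatch()                # a single batched query
    return [loader.get(p["author_id"]) for p in posts]


naive = resolve_naive(POSTS)
naive_queries = query_count      # one per post
query_count = 0
batched = resolve_batched(POSTS)
batched_queries = query_count    # one for the whole list
```

Both resolvers return the same authors. The batched one issues one query for the list instead of one per post.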

Contract-first and the resilience wrapper

Whatever the style, design the contract before the code: an OpenAPI/Protobuf/SDL schema reviewed up front catches breaking changes, generates clients and mocks, and lets frontend and backend build in parallel. And every remote call needs the resilience trio (covered deeply in distributed systems):

  • Timeout: never an unbounded call; one hung dependency exhausts your pool.
  • Retry with exponential backoff + jitter: only on idempotent ops, capped.
  • Circuit breaker: stop hammering a failing dependency; fail fast and recover.
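
A compact sketch of the trio, under stated assumptions: `call_with_retry`, `backoff_delays`, and `CircuitBreaker` are hypothetical helpers, the thresholds are arbitrary, and the timeout itself is expected to live inside the operation (for example `urllib.request.urlopen(url, timeout=2)`), surfacing as `TimeoutError`.

```python
import random
import time


def backoff_delays(retries, base=0.1, cap=2.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return [rng() * min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retry(op, retries=3, sleep=time.sleep):
    """Call op(); on a timeout or connection error, back off and try again.

    Only wrap idempotent operations, and keep `retries` small.
    """
    delays = backoff_delays(retries)
    for attempt in range(retries + 1):
        try:
            return op()
        except (TimeoutError, ConnectionError):
            if attempt == retries:
                raise  # retry budget exhausted
            sleep(delays[attempt])


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; allows a trial call after `cooldown` seconds."""

    def __init__(self, threshold=3, cooldown=30.0, clock=time.monotonic):
        self.threshold, self.cooldown, self.clock = threshold, cooldown, clock
        self.failures = 0
        self.opened_at = None

    def call(self, op):
        if self.opened_at is not None and self.clock() - self.opened_at < self.cooldown:
            raise RuntimeError("circuit open: failing fast")
        # Closed, or the cooldown has elapsed (half-open): let the call through.
        try:
            result = op()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        self.failures = 0
        self.opened_at = None
        return result
```

A typical composition is `breaker.call(lambda: call_with_retry(fetch))`, so the breaker counts an operation as failed only after its retries are spent. Whether the breaker wraps the retries or the other way around is a design choice that depends on how quickly you want to stop calling a failing dependency.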

Interview questions & model answers

Q: REST vs gRPC vs GraphQL: when each? "REST for public/CRUD APIs: ubiquitous, cacheable, simple. gRPC for internal service-to-service: binary Protobuf over HTTP/2 is fast and strongly typed with codegen and streaming, but not browser-native. GraphQL when many varied clients need flexible nested data and REST would over/under-fetch, at the cost of harder caching and query-cost control. Often gRPC internally, REST or GraphQL at the edge."

Q: How do you make a POST safe to retry? "An idempotency key. The client sends a unique key per logical operation; the server stores the key with its result, so a retry returns the original outcome instead of acting twice. Combined with bounded retries and timeouts, that makes the call safe under network failures."

Q: Offset vs cursor pagination? "Cursor for large or changing lists: you page relative to a stable marker, so inserts and deletes at the head don't skip or duplicate rows, and resuming is an index seek whose cost doesn't grow with depth. Offset re-scans OFFSET rows each page (slow at depth) and can skip/duplicate when the list mutates. Offset is fine only for small, stable data."

Q: How do you evolve an API without breaking clients? "Additive, backward-compatible changes: add optional fields, never remove or repurpose existing ones. In Protobuf that means new field numbers only. When a breaking change is unavoidable, version it (URL /v1/ for visibility) and deprecate the old version on a published timeline with telemetry on who still uses it."

Q: What makes a good error response? "A real status code (422 for validation, 409 for conflict, not 200-with-an-error-body), plus a stable machine-readable code clients can branch on, a human message for logs, a request ID for tracing, and field-level details for validation. Consistent envelope across every endpoint."

Common mistakes / what weak candidates do

  • 200 for everything, including errors: defeats clients and caches.
  • Verbs in REST paths (/getUser, /createOrder) and ignoring method semantics.
  • Treating POST as idempotent: retries double-charge without an idempotency key.
  • Offset pagination on large/changing lists: slow and lossy.
  • Breaking changes without versioning: removing or repurposing a field silently breaks clients.
  • Prose-only errors with no stable code or request ID.
  • Reaching for GraphQL without addressing caching, N+1, and query-cost limits.
  • Designing code-first and reverse-engineering the contract, so clients can't build in parallel.

Say it out loud
"An API is a contract. REST for public/CRUD, gRPC for internal high-throughput, GraphQL for flexible nested data. Do REST right: real status codes, a stable error envelope, idempotency keys on unsafe retries, cursor pagination, and additive versioning so I rarely break clients. Design contract-first, and wrap every call in timeout + retry-with-backoff + circuit breaker."

Likely follow-up questions
  • REST vs gRPC vs GraphQL: when each?
  • How do you make a POST safe to retry?
  • How do you paginate a large, changing list?
  • How do you evolve an API without breaking clients?
  • What goes in a good error response?
