The edge between clients and your services — what belongs there, what doesn't, and how to keep it from becoming a monolith in disguise.

API Gateway Pattern

TL;DR (human)

An API gateway is the single ingress point: TLS termination, auth verification, rate limiting, routing, observability. It should be thin — cross-cutting concerns only; never business logic. Fat gateways become bottlenecks and deploy risks. Common patterns: BFF (Backend-for-Frontend), GraphQL federation, reverse-proxy router.

For agents

What belongs in the gateway

Concern	At gateway
TLS termination	✓
Request routing (host / path)	✓
Auth token verification (cheap path)	✓ (deeper auth = service-level)
Rate limiting (per IP / per identity)	✓
Request / response logging	✓ (sampled)
Trace propagation (request-id)	✓
Compression (gzip / brotli)	✓
Static asset serving	✓ (or CDN)
Header normalisation	✓
CORS	✓
Bot detection	✓
Geo restrictions	✓
Request transformation (rare)	Maybe
Response caching (rare)	Maybe
Business logic	✗
Database queries	✗
Cross-service orchestration	✗
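As one concrete instance of the rate-limiting row above, a per-key limiter is often built as a token bucket. The sketch below is a minimal in-memory version under assumptions that are not in the original: the class name `TokenBucketLimiter`, the key format, and the capacity and refill numbers are all illustrative, and a gateway running several replicas would usually need shared state instead.

```typescript
// Hypothetical per-key token bucket; keys might be "ip:203.0.113.7" or "user:42".
interface Bucket {
  tokens: number;
  lastRefillMs: number;
}

class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();
  private capacity: number;
  private refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if the request for `key` may proceed at time `nowMs`.
  allow(key: string, nowMs: number): boolean {
    const bucket = this.buckets.get(key) ?? { tokens: this.capacity, lastRefillMs: nowMs };
    // Refill in proportion to elapsed time, never above capacity.
    const elapsedSeconds = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.lastRefillMs = nowMs;
    const allowed = bucket.tokens >= 1;
    if (allowed) {
      bucket.tokens -= 1;
    }
    this.buckets.set(key, bucket);
    return allowed;
  }
}
```

Passing the clock in as `nowMs` keeps the limiter deterministic, which makes burst and refill behaviour easy to check without waiting on real time.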

What does NOT belong

Business validation: each service validates its own inputs (per contracts-zod-pattern.md).
Service-specific transforms: that's the service's job; gateway should be generic.
Cross-service orchestration: separate orchestration service; not gateway.
Data fetching: gateway passes through; services own data.

Symptoms of a fat gateway:

Gateway codebase larger than any backend service.
Gateway requires expert team to modify.
Gateway deploys gate other deploys.
Gateway is a single point of failure with no quick replacement path.

If your gateway shows these symptoms, it has absorbed responsibilities that belong to services. Trim it.

Architectures

Reverse-proxy router (thinnest): pure routing + cross-cutting:

client → ALB / NGINX / Envoy → service A
                              → service B
                              → service C

Services own their own auth, validation, and business logic. The gateway adds little beyond routing and cross-cutting concerns.
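The routing step in this layout can be sketched as a longest-prefix match over a small route table. Everything named here (`RouteRule`, `resolveUpstream`, the hosts and upstream URLs) is hypothetical; real proxies such as NGINX or Envoy express the same idea in configuration rather than code.

```typescript
// Hypothetical route rule: optional host match plus a path prefix.
interface RouteRule {
  host?: string;
  pathPrefix: string;
  upstream: string;
}

// Picks the rule with the longest matching path prefix, or undefined if none match.
function resolveUpstream(rules: RouteRule[], host: string, path: string): string | undefined {
  let best: RouteRule | undefined;
  for (const rule of rules) {
    if (rule.host !== undefined && rule.host !== host) {
      continue;
    }
    const prefix = rule.pathPrefix;
    // Match whole path segments only, so "/users" does not match "/usersx".
    const segmentPrefix = prefix.endsWith("/") ? prefix : prefix + "/";
    const matches = path === prefix || path.startsWith(segmentPrefix);
    if (!matches) {
      continue;
    }
    if (best === undefined || prefix.length > best.pathPrefix.length) {
      best = rule;
    }
  }
  return best === undefined ? undefined : best.upstream;
}

// Illustrative table only; service names and URLs are assumptions.
const exampleRules: RouteRule[] = [
  { pathPrefix: "/users", upstream: "http://service-a.internal" },
  { pathPrefix: "/users/reports", upstream: "http://service-b.internal" },
  { host: "admin.example.com", pathPrefix: "/", upstream: "http://service-c.internal" },
];
```

Note that the table contains no per-service logic: adding a service means adding a row, which is what keeps this variant thin.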

BFF (Backend-for-Frontend): one gateway-ish service per client type:

web client → BFF-web → (services)
mobile     → BFF-mob → (services)
admin      → BFF-adm → (services)

BFF tailors API shape per client; reduces per-client over-fetch. Avoids "the API tries to please everyone".
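To illustrate what "tailoring the API shape" can mean, the sketch below projects one backend record into two client-specific views. The record fields and the function names `toMobileUserView` and `toAdminUserView` are assumptions made for the example, not shapes defined by this document.

```typescript
// Hypothetical full record as returned by a backend user service.
interface UserRecord {
  id: string;
  displayName: string;
  email: string;
  avatarUrl: string;
  createdAt: string;
  roles: string[];
}

// Mobile gets a small payload: only what a list screen needs.
interface MobileUserView {
  id: string;
  displayName: string;
  avatarUrl: string;
}

// Admin gets audit-oriented fields instead of presentation fields.
interface AdminUserView {
  id: string;
  displayName: string;
  email: string;
  roles: string[];
  createdAt: string;
}

function toMobileUserView(user: UserRecord): MobileUserView {
  return { id: user.id, displayName: user.displayName, avatarUrl: user.avatarUrl };
}

function toAdminUserView(user: UserRecord): AdminUserView {
  return {
    id: user.id,
    displayName: user.displayName,
    email: user.email,
    roles: user.roles.slice(), // copy so callers cannot mutate the source record
    createdAt: user.createdAt,
  };
}
```

Each BFF would own only its own projection, so a change to the mobile view does not force a change in the admin one.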

GraphQL gateway / federation: GraphQL endpoint composes subgraphs:

client → GraphQL gateway → user-service (User subgraph)
                          → flow-service (Flow subgraph)
                          → audit-service (Audit subgraph)

Federation (Apollo Federation, Hot Chocolate Federation): subgraphs declare their types; gateway stitches.
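The idea that subgraphs declare types and the gateway works out where to fetch them can be sketched as a toy ownership registry. This is a deliberately simplified illustration, not Apollo's composition algorithm: real federation lets several subgraphs contribute fields to one entity, while this sketch assumes one owner per type. `SubgraphDecl`, `composeOwnership`, and `planFetches` are hypothetical names.

```typescript
// Hypothetical declaration: a subgraph name and the types it owns.
interface SubgraphDecl {
  name: string;
  types: string[];
}

// Builds a type -> subgraph map; fails if two subgraphs claim the same type.
function composeOwnership(subgraphs: SubgraphDecl[]): Map<string, string> {
  const owners = new Map<string, string>();
  for (const subgraph of subgraphs) {
    for (const typeName of subgraph.types) {
      const existing = owners.get(typeName);
      if (existing !== undefined && existing !== subgraph.name) {
        throw new Error("Type " + typeName + " declared by both " + existing + " and " + subgraph.name);
      }
      owners.set(typeName, subgraph.name);
    }
  }
  return owners;
}

// Groups the types a query touches by the subgraph that must be called.
function planFetches(owners: Map<string, string>, requestedTypes: string[]): Map<string, string[]> {
  const plan = new Map<string, string[]>();
  for (const typeName of requestedTypes) {
    const owner = owners.get(typeName);
    if (owner === undefined) {
      throw new Error("No subgraph declares type " + typeName);
    }
    const list = plan.get(owner) ?? [];
    list.push(typeName);
    plan.set(owner, list);
  }
  return plan;
}
```

Composition failing loudly on a conflicting declaration mirrors the general practice of validating the combined schema before the gateway serves it.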

API mesh (less common): pure proxy with declarative composition rules; no code.

Choosing

Pattern	Use when
Reverse-proxy	< 10 services; simple shape
BFF	Multiple client types with distinct API needs
GraphQL	Rich data graph; many clients; query flexibility valued
Mesh	Lots of services + standard composition; less common

Authentication at the gateway

The gateway verifies tokens (cheap path): signature check, expiry, basic shape.

Deeper auth (per-resource access, capability checks) happens at the service layer (per ../security/rbac-pattern.md). Gateway sends the verified principal-id; service makes finer decisions.

Why split:

Gateway can stay generic + fast.
Services know their own resources + permissions; centralising those rules would couple the gateway to every service.
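The split described above can be sketched as two small functions: a generic gateway check (signature, expiry, basic shape) and a resource-specific service check. The claim names `sub` and `exp` follow common JWT usage; everything else, including `gatewayVerify`, `canReadFlow`, and the flow shape, is an assumption for illustration. Signature verification is passed in as a boolean because the actual crypto depends on your token format and library.

```typescript
// Minimal claims the gateway cares about (exp is seconds since the epoch).
interface TokenClaims {
  sub: string;
  exp: number;
}

interface GatewayAuthResult {
  status: "ok" | "bad-signature" | "bad-shape" | "expired";
  principalId?: string;
}

function hasClaimsShape(value: unknown): value is TokenClaims {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as { sub?: unknown; exp?: unknown };
  return typeof candidate.sub === "string" && candidate.sub.length > 0 && typeof candidate.exp === "number";
}

// Gateway side: cheap, generic checks only. No knowledge of resources.
function gatewayVerify(claims: unknown, signatureValid: boolean, nowSeconds: number): GatewayAuthResult {
  if (!signatureValid) {
    return { status: "bad-signature" };
  }
  if (!hasClaimsShape(claims)) {
    return { status: "bad-shape" };
  }
  if (claims.exp <= nowSeconds) {
    return { status: "expired" };
  }
  return { status: "ok", principalId: claims.sub };
}

// Service side: a finer decision using data only the service has (hypothetical flow shape).
function canReadFlow(principalId: string, flow: { ownerId: string; sharedWith: string[] }): boolean {
  return flow.ownerId === principalId || flow.sharedWith.indexOf(principalId) !== -1;
}
```

The gateway function never sees a flow, and the service function never sees a token, which is the decoupling the two bullet points argue for.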

Request-id + tracing

Gateway:

Generates requestId (UUIDv7 or ULID) if missing.
Adds X-Request-Id header on outgoing request to services.
Creates the root trace span.
Logs the request with id.

Services propagate. Observability correlates (per ../quality/observability-pattern.md).
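The first two gateway steps above (reuse or generate the id, then attach it to the outgoing request) can be sketched as follows. The helper name `ensureRequestId` and the `HeaderMap` type are hypothetical, and the id generator is injected so the sketch does not commit to a particular UUIDv7 or ULID library.

```typescript
type HeaderMap = Record<string, string>;

// Reuses an incoming X-Request-Id (any casing) or generates one, and returns
// the headers to send upstream with a single canonical X-Request-Id entry.
function ensureRequestId(
  incoming: HeaderMap,
  generateId: () => string
): { requestId: string; outgoing: HeaderMap } {
  let existing: string | undefined;
  const outgoing: HeaderMap = {};
  for (const headerName of Object.keys(incoming)) {
    const value = incoming[headerName];
    if (value === undefined) {
      continue;
    }
    if (headerName.toLowerCase() === "x-request-id") {
      // Treat a blank header as missing.
      if (value.trim() !== "") {
        existing = value.trim();
      }
    } else {
      outgoing[headerName] = value;
    }
  }
  const requestId = existing !== undefined ? existing : generateId();
  outgoing["X-Request-Id"] = requestId;
  return { requestId, outgoing };
}
```

A service that receives the request would read the same header and include the id in its own logs and downstream calls.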

Versioning at the gateway

If versioned URLs (/v1/..., /v2/...):

Gateway routes by version.
Both versions live during deprecation.
Gateway can run a transformation if v1 → v2 is mechanical (rare; usually the service version owns it).
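Routing by version prefix can be sketched as a lookup keyed on the first path segment, with both versions registered while the deprecation window is open. The function name `routeByVersion` and the upstream URLs are hypothetical.

```typescript
// Splits "/v1/users/42" into a version and the remaining path, then looks up
// the upstream registered for that version. Returns undefined if the path is
// unversioned or the version is not (or no longer) registered.
function routeByVersion(
  path: string,
  upstreams: Map<string, string>
): { upstream: string; rest: string } | undefined {
  const match = /^\/(v\d+)(\/.*)?$/.exec(path);
  if (match === null) {
    return undefined;
  }
  const version = match[1];
  if (version === undefined) {
    return undefined;
  }
  const upstream = upstreams.get(version);
  if (upstream === undefined) {
    return undefined;
  }
  const rest = match[2];
  return { upstream, rest: rest === undefined ? "/" : rest };
}

// Both versions stay registered during the deprecation window (illustrative URLs).
const versionedUpstreams = new Map<string, string>([
  ["v1", "http://users-v1.internal"],
  ["v2", "http://users-v2.internal"],
]);
```

Retiring v1 then becomes a one-line change: remove its entry and requests to `/v1/...` stop resolving.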

Caching at the gateway

For cacheable responses:

Honor Cache-Control from services.
Tag-based purge.
Per-user cache requires care (key must include user id).

Most caching is best at CDN (closer to users); gateway-level caching is for shared backend responses.
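Two of the points above, honoring Cache-Control and keying per-user entries by user id, can be sketched as below. This is a simplified reading of the header under stated assumptions: the gateway is treated as a shared cache, so `private`, `no-store`, and `no-cache` responses are not stored, and `s-maxage` takes precedence over `max-age`. The function names and key format are hypothetical, and directives with arguments or quoting are not handled.

```typescript
// Returns how long a shared cache may store the response, or 0 if it must not.
function sharedCacheTtlSeconds(cacheControl: string | undefined): number {
  if (cacheControl === undefined) {
    return 0;
  }
  let maxAge: number | undefined;
  let sharedMaxAge: number | undefined;
  for (const raw of cacheControl.split(",")) {
    const directive = raw.trim().toLowerCase();
    if (directive === "no-store" || directive === "no-cache" || directive === "private") {
      return 0;
    }
    if (directive.indexOf("s-maxage=") === 0) {
      const seconds = parseInt(directive.slice("s-maxage=".length), 10);
      if (!isNaN(seconds)) {
        sharedMaxAge = seconds;
      }
    } else if (directive.indexOf("max-age=") === 0) {
      const seconds = parseInt(directive.slice("max-age=".length), 10);
      if (!isNaN(seconds)) {
        maxAge = seconds;
      }
    }
  }
  const ttl = sharedMaxAge !== undefined ? sharedMaxAge : maxAge !== undefined ? maxAge : 0;
  return Math.max(0, ttl);
}

// Builds a cache key; pass userId only for routes configured as per-user,
// so one user's response can never be served to another.
function cacheKey(method: string, pathAndQuery: string, userId?: string): string {
  const base = method.toUpperCase() + " " + pathAndQuery;
  return userId === undefined ? base : base + " user=" + userId;
}
```

Defaulting to a TTL of 0 when the header is absent keeps the sketch on the conservative side: nothing is cached unless the service opts in.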

Cost concerns

The gateway sits on every request, so cost discipline matters:

Latency budget: under 10 ms of added latency at the gateway in steady state.
Memory + CPU profile under load.
Auto-scale per traffic.
Per-environment sized (staging smaller than prod).

A slow gateway impacts every endpoint. Profile + tune.

Failure mode: gateway as bottleneck

When the gateway can't be skipped, it's a single point of failure. Mitigations:

Multi-region: per-region gateway (per multi-region-pattern.md).
Multi-AZ within region.
Health checks + auto-replacement.
Direct service access for internal callers: services call each other directly when feasible, bypassing gateway.
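The health-check mitigation can be sketched as a pool that only hands out instances currently marked healthy. This is an illustrative in-process model with hypothetical names (`GatewayInstance`, `HealthAwarePool`); in practice a cloud load balancer or orchestrator usually performs this selection and the replacement of failed instances.

```typescript
interface GatewayInstance {
  id: string;
  zone: string;
  healthy: boolean;
}

class HealthAwarePool {
  private instances: GatewayInstance[];
  private cursor = 0;

  constructor(instances: GatewayInstance[]) {
    // Copy so health updates do not mutate the caller's objects.
    this.instances = instances.map((i) => ({ id: i.id, zone: i.zone, healthy: i.healthy }));
  }

  // Called by whatever runs the health checks.
  markHealth(id: string, healthy: boolean): void {
    for (const instance of this.instances) {
      if (instance.id === id) {
        instance.healthy = healthy;
      }
    }
  }

  // Round-robin over healthy instances; undefined means nothing can serve.
  pick(): GatewayInstance | undefined {
    const healthy = this.instances.filter((i) => i.healthy);
    if (healthy.length === 0) {
      return undefined;
    }
    const chosen = healthy[this.cursor % healthy.length];
    this.cursor += 1;
    return chosen;
  }
}
```

Spreading the instances across zones (the `zone` field here) is what turns the health check into protection against an AZ outage rather than only against a single crashed process.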

Service-to-service

Inside the cluster, services often call each other directly (mesh) rather than through the gateway:

Gateway is for external traffic.
Internal: service-mesh (per service-mesh-pattern.md) handles cross-cutting (mTLS, observability, retries).

Don't route internal traffic through the external gateway. Doing so adds latency and couples the internal architecture to the external entry point.

Deployment risk

A bad gateway deploy affects every service behind it. Mitigations:

Canary deploys.
Blue-green for the gateway specifically.
Roll-forward only (per ../quality/ci-cd-pipeline-pattern.md).
Practice gateway rollback anyway (drill): it is the escape hatch if a roll-forward fix cannot ship quickly.
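A canary deploy of the gateway needs some rule for which requests reach the new version. One common approach, sketched here as an assumption rather than a prescribed method, is to hash a stable key (for example a client or session id) into a bucket from 0 to 99 and send the lowest buckets to the canary. The hash is an FNV-1a style function over UTF-16 code units; `bucketOf` and `chooseDeployment` are hypothetical names.

```typescript
// Maps a key to a stable bucket in [0, 99].
function bucketOf(key: string): number {
  let hash = 2166136261; // FNV-1a 32-bit offset basis
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    // Multiply by the FNV prime and keep the result as an unsigned 32-bit value.
    hash = Math.imul(hash, 16777619) >>> 0;
  }
  return hash % 100;
}

// Sends roughly `canaryPercent` percent of keys to the canary, deterministically.
function chooseDeployment(key: string, canaryPercent: number): "canary" | "stable" {
  return bucketOf(key) < canaryPercent ? "canary" : "stable";
}
```

Because the choice is a pure function of the key, the same client keeps hitting the same version while the percentage is held constant, and raising the percentage only moves clients from stable to canary, never back.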

Common failure modes

Fat gateway: business logic crept in. → Refactor; move logic to services.
Gateway as a deploy gatekeeper: services can't deploy without gateway change. → Stable contract; service-side changes don't require gateway changes.
Gateway as cache that lies: stale data; users confused. → Conservative caching; service-driven invalidation.
No request-id propagation: cannot trace requests. → Make propagation mandatory.
Inconsistent CORS: each service handles it differently. → Centralise at gateway.
TLS termination at gateway only: internal traffic plaintext. → mTLS internal (service mesh).

Tooling stack (typical)

Concern	Tool
Cloud-native	AWS API Gateway, Google Cloud API Gateway, Azure API Management
Self-hosted	Kong, Tyk, KrakenD, Apache APISIX
Reverse-proxy	NGINX, Caddy, Envoy, Traefik, HAProxy
GraphQL federation	Apollo Router, Cosmo, Mercurius
BFF framework	Whatever your stack uses (Next.js, Nest.js, Rails, etc.)

Adoption path

Few services: an ALB or other load balancer is enough; no "gateway" per se.
~10 services: reverse-proxy gateway with cross-cutting concerns.
Multiple client types: BFFs.
Rich data graph: GraphQL federation.
Mature mesh: gateway + service mesh; internal traffic doesn't traverse gateway.
