Design a scalable API Gateway with throttling, caching, and Lambda integration.

The Scenario

You need to build a production API:

Requirements:
├── REST API for mobile and web clients
├── Expected traffic: 10,000 requests/second peak
├── Latency: p99 under 200ms
├── Authentication: API keys + JWT
├── Rate limiting per customer tier
├── Caching for GET requests
├── Backend: Lambda functions
└── Multi-stage: dev, staging, production
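
As a rough sizing check on these numbers, Little's law (concurrency = arrival rate × average duration) estimates how many Lambda executions would be in flight at peak. This is only a sketch: the 50 ms and 150 ms average durations are assumed values for illustration, not figures from the requirements, and `required_concurrency` is a hypothetical helper.

```python
import math

def required_concurrency(requests_per_second: int, avg_duration_ms: int) -> int:
    """Estimate concurrent executions via Little's law, rounded up."""
    return math.ceil(requests_per_second * avg_duration_ms / 1000)

PEAK_RPS = 10_000  # peak traffic from the requirements above

# Assumed average backend durations (hypothetical values)
for duration_ms in (50, 150):
    concurrency = required_concurrency(PEAK_RPS, duration_ms)
    print(f"{duration_ms} ms average -> ~{concurrency} concurrent executions")
```

Requests answered from the cache or rejected by throttling never reach Lambda, so the real figure is likely lower; the estimate is mainly useful for comparing against the account's Lambda concurrency quota.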

The Challenge

Design an API Gateway architecture with proper throttling, caching strategies, authorization, and Lambda integration patterns.

Wrong Approach

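A junior engineer might create one API for all environments, skip throttling configuration, use proxy integration for every endpoint, or ignore caching. The consequences: environments share one configuration and blast radius, a single client can consume the whole request budget, every request invokes Lambda and adds cost, and repeated GET requests pay full backend latency.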

Right Approach

A senior engineer designs with separate stages, implements usage plans with API keys, configures method-level caching, uses request validation, and chooses the right integration type for each endpoint.

Step 1: API Gateway Architecture

API Gateway Architecture:
┌─────────────────────────────────────────────────────────────────────┐
│                        API Gateway                                   │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                    Custom Domain                             │   │
│  │               api.example.com                                │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                              │                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                    Stages                                    │   │
│  │   /prod    /staging    /dev                                  │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                              │                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │              Request Flow                                    │   │
│  │  Request → Auth → Throttle → Validate → Cache → Backend     │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                              │                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                    Integrations                              │   │
│  │   Lambda    HTTP Proxy    AWS Services    Mock               │   │
│  └─────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘

Step 2: Create REST API with Terraform

# REST API
resource "aws_api_gateway_rest_api" "main" {
  name        = "orders-api"
  description = "Orders REST API"

  endpoint_configuration {
    types = ["REGIONAL"]  # or EDGE for global
  }

  # Binary media types (for file uploads)
  binary_media_types = [
    "image/*",
    "application/pdf"
  ]

  tags = {
    Environment = "production"
  }
}

# API Resources
resource "aws_api_gateway_resource" "orders" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  parent_id   = aws_api_gateway_rest_api.main.root_resource_id
  path_part   = "orders"
}

resource "aws_api_gateway_resource" "order_id" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  parent_id   = aws_api_gateway_resource.orders.id
  path_part   = "{orderId}"
}

# GET /orders - List orders
resource "aws_api_gateway_method" "list_orders" {
  rest_api_id   = aws_api_gateway_rest_api.main.id
  resource_id   = aws_api_gateway_resource.orders.id
  http_method   = "GET"
  authorization = "CUSTOM"
  authorizer_id = aws_api_gateway_authorizer.jwt.id

  # Require API key
  api_key_required = true

  # Request parameters
  request_parameters = {
    "method.request.querystring.limit"  = false
    "method.request.querystring.cursor" = false
  }
}

# Lambda integration
resource "aws_api_gateway_integration" "list_orders" {
  rest_api_id             = aws_api_gateway_rest_api.main.id
  resource_id             = aws_api_gateway_resource.orders.id
  http_method             = aws_api_gateway_method.list_orders.http_method
  integration_http_method = "POST"
  type                    = "AWS_PROXY"
  uri                     = aws_lambda_function.list_orders.invoke_arn

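  # Include the pagination parameters in the cache key; otherwise every
  # limit/cursor combination shares one cached response
  cache_key_parameters = [
    "method.request.querystring.limit",
    "method.request.querystring.cursor",
  ]

  # Timeout
  timeout_milliseconds = 29000  # 29 seconds is the default maximum
}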

# POST /orders - Create order
resource "aws_api_gateway_method" "create_order" {
  rest_api_id   = aws_api_gateway_rest_api.main.id
  resource_id   = aws_api_gateway_resource.orders.id
  http_method   = "POST"
  authorization = "CUSTOM"
  authorizer_id = aws_api_gateway_authorizer.jwt.id

  api_key_required = true

  # Request validation
  request_validator_id = aws_api_gateway_request_validator.body.id
  request_models = {
    "application/json" = aws_api_gateway_model.create_order.name
  }
}

# Request validator
resource "aws_api_gateway_request_validator" "body" {
  name                        = "validate-body"
  rest_api_id                 = aws_api_gateway_rest_api.main.id
  validate_request_body       = true
  validate_request_parameters = true
}

# Request model for validation
resource "aws_api_gateway_model" "create_order" {
  rest_api_id  = aws_api_gateway_rest_api.main.id
  name         = "CreateOrderRequest"
  content_type = "application/json"

  schema = jsonencode({
    "$schema" = "http://json-schema.org/draft-04/schema#"
    type      = "object"
    required  = ["customerId", "items"]
    properties = {
      customerId = {
        type      = "string"
        minLength = 1
      }
      items = {
        type     = "array"
        minItems = 1
        items = {
          type     = "object"
          required = ["productId", "quantity"]
          properties = {
            productId = { type = "string" }
            quantity  = { type = "integer", minimum = 1 }
          }
        }
      }
    }
  })
}
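
To make the AWS_PROXY integration above concrete, here is a minimal sketch of what the `list_orders` Lambda might look like. With a proxy integration the function receives the whole request as `event` and must return an object with `statusCode` and a string `body`. The `DEFAULT_LIMIT` and `MAX_LIMIT` values, the `_response` helper, and the placeholder order data are assumptions for illustration only.

```python
import json

DEFAULT_LIMIT = 20   # hypothetical default page size
MAX_LIMIT = 100      # hypothetical upper bound on page size

def _response(status_code, body):
    # Proxy integrations expect statusCode plus a string body
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }

def handler(event, context):
    """Handle GET /orders behind an AWS_PROXY integration."""
    # queryStringParameters is None when the request has no query string
    params = event.get("queryStringParameters") or {}
    try:
        limit = int(params.get("limit", DEFAULT_LIMIT))
    except ValueError:
        return _response(400, {"message": "limit must be an integer"})
    limit = max(1, min(limit, MAX_LIMIT))  # clamp to the allowed range
    cursor = params.get("cursor")

    # Placeholder data; a real handler would query a data store here
    orders = [{"orderId": f"order-{i}"} for i in range(limit)]
    return _response(200, {"orders": orders, "cursor": cursor})
```

Because the query parameters are declared as optional (`false`) on the method, the handler still has to supply defaults and bounds itself.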

Step 3: JWT Authorizer

# Lambda authorizer for JWT validation
resource "aws_api_gateway_authorizer" "jwt" {
  name                   = "jwt-authorizer"
  rest_api_id            = aws_api_gateway_rest_api.main.id
  type                   = "TOKEN"
  authorizer_uri         = aws_lambda_function.authorizer.invoke_arn
  authorizer_credentials = aws_iam_role.authorizer.arn

  # Cache authorization results
  authorizer_result_ttl_in_seconds = 300

  identity_source = "method.request.header.Authorization"
}

# Authorizer Lambda
resource "aws_lambda_function" "authorizer" {
  function_name = "api-authorizer"
  runtime       = "python3.11"
  handler       = "authorizer.handler"
  role          = aws_iam_role.authorizer_lambda.arn

  filename         = "authorizer.zip"
  source_code_hash = filebase64sha256("authorizer.zip")

  environment {
    variables = {
      JWT_SECRET_ARN = aws_secretsmanager_secret.jwt_secret.arn
      ISSUER         = "https://auth.example.com"
    }
  }
}

# authorizer.py
import json
import jwt
import boto3
import os
from functools import lru_cache

secrets = boto3.client('secretsmanager')

@lru_cache(maxsize=1)
def get_jwt_secret():
    response = secrets.get_secret_value(SecretId=os.environ['JWT_SECRET_ARN'])
    return response['SecretString']

def handler(event, context):
    token = event['authorizationToken'].replace('Bearer ', '')

    try:
        payload = jwt.decode(
            token,
            get_jwt_secret(),
            algorithms=['HS256'],
            issuer=os.environ['ISSUER']
        )

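        # Authorizer results are cached (TTL 300s) and the cached policy is
        # reused for every method, so allow the whole stage rather than only
        # the method that triggered this invocation.
        stage_arn = '/'.join(event['methodArn'].split('/')[:2])

        return generate_policy(
            payload['sub'],
            'Allow',
            f'{stage_arn}/*',
            context={
                'userId': payload['sub'],
                'email': payload.get('email'),
                'tier': payload.get('tier', 'free')
            }
        )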

    except jwt.ExpiredSignatureError:
        raise Exception('Unauthorized')
    except jwt.InvalidTokenError:
        raise Exception('Unauthorized')

def generate_policy(principal_id, effect, resource, context=None):
    policy = {
        'principalId': principal_id,
        'policyDocument': {
            'Version': '2012-10-17',
            'Statement': [{
                'Action': 'execute-api:Invoke',
                'Effect': effect,
                'Resource': resource
            }]
        }
    }

    if context:
        policy['context'] = context

    return policy
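
The `context` map returned by the authorizer is passed on to the backend. As a sketch, assuming a Lambda proxy integration, a backend function could read those values from `event["requestContext"]["authorizer"]`; the handler below and its response body are illustrative, not part of the original design.

```python
import json

def handler(event, context):
    """Read the identity that the Lambda authorizer attached to the request."""
    # For proxy integrations, the authorizer's context map appears here
    auth = (event.get("requestContext") or {}).get("authorizer") or {}
    user_id = auth.get("userId")
    tier = auth.get("tier", "free")

    if user_id is None:
        # Not expected when the method requires the authorizer
        return {"statusCode": 401, "body": json.dumps({"message": "Unauthorized"})}

    return {
        "statusCode": 200,
        "body": json.dumps({"userId": user_id, "tier": tier}),
    }
```

This keeps token parsing in one place: backend functions trust the values the authorizer already verified instead of decoding the JWT again.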

Step 4: Usage Plans and Throttling

# API Key
resource "aws_api_gateway_api_key" "customer" {
  name    = "customer-api-key"
  enabled = true
}

# Usage plan - Free tier
resource "aws_api_gateway_usage_plan" "free" {
  name = "free-tier"

  api_stages {
    api_id = aws_api_gateway_rest_api.main.id
    stage  = aws_api_gateway_stage.prod.stage_name

    throttle {
      path        = "/orders/GET"
      burst_limit = 10
      rate_limit  = 5
    }
  }

  throttle_settings {
    burst_limit = 50
    rate_limit  = 20
  }

  quota_settings {
    limit  = 10000
    period = "MONTH"
  }
}

# Usage plan - Pro tier
resource "aws_api_gateway_usage_plan" "pro" {
  name = "pro-tier"

  api_stages {
    api_id = aws_api_gateway_rest_api.main.id
    stage  = aws_api_gateway_stage.prod.stage_name
  }

  throttle_settings {
    burst_limit = 500
    rate_limit  = 200
  }

  quota_settings {
    limit  = 1000000
    period = "MONTH"
  }
}

# Usage plan - Enterprise (no quota)
resource "aws_api_gateway_usage_plan" "enterprise" {
  name = "enterprise-tier"

  api_stages {
    api_id = aws_api_gateway_rest_api.main.id
    stage  = aws_api_gateway_stage.prod.stage_name
  }

  throttle_settings {
    burst_limit = 5000
    rate_limit  = 2000
  }
}

# Associate API key with usage plan
resource "aws_api_gateway_usage_plan_key" "customer" {
  key_id        = aws_api_gateway_api_key.customer.id
  key_type      = "API_KEY"
  usage_plan_id = aws_api_gateway_usage_plan.pro.id
}
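
API Gateway throttling follows a token bucket model: `rate_limit` is the steady refill rate in requests per second and `burst_limit` is the bucket capacity. The sketch below is a simplified, single-process illustration of that model using the free-tier `/orders/GET` numbers; the `TokenBucket` class is hypothetical and not how the managed service is implemented.

```python
from dataclasses import dataclass

@dataclass
class TokenBucket:
    rate_limit: float          # tokens added per second (steady-state rate)
    burst_limit: int           # bucket capacity (largest instantaneous burst)
    tokens: float = 0.0
    last_refill: float = 0.0   # timestamp of the previous call, in seconds

    def __post_init__(self):
        self.tokens = float(self.burst_limit)  # start with a full bucket

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is admitted."""
        elapsed = now - self.last_refill
        self.tokens = min(self.burst_limit, self.tokens + elapsed * self.rate_limit)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the gateway would answer 429 Too Many Requests

# Free-tier GET /orders limits from the usage plan above
bucket = TokenBucket(rate_limit=5, burst_limit=10)
burst = [bucket.allow(now=0.0) for _ in range(12)]
print(burst.count(True))      # 10 admitted, 2 rejected
print(bucket.allow(now=1.0))  # True: 5 tokens refilled after one second
```

The burst limit absorbs short spikes, while the rate limit bounds sustained traffic, which is why each tier sets both.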

Step 5: Caching Configuration

# Stage with caching
resource "aws_api_gateway_stage" "prod" {
  stage_name    = "prod"
  rest_api_id   = aws_api_gateway_rest_api.main.id
  deployment_id = aws_api_gateway_deployment.main.id

  cache_cluster_enabled = true
  cache_cluster_size    = "0.5"  # GB

  variables = {
    environment = "production"
  }

  access_log_settings {
    destination_arn = aws_cloudwatch_log_group.api_access.arn
    format = jsonencode({
      requestId          = "$context.requestId"
      ip                 = "$context.identity.sourceIp"
      requestTime        = "$context.requestTime"
      httpMethod         = "$context.httpMethod"
      path               = "$context.path"
      status             = "$context.status"
      latency            = "$context.responseLatency"
      integrationLatency = "$context.integrationLatency"
    })
  }
}

# Method settings for caching
resource "aws_api_gateway_method_settings" "list_orders" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  stage_name  = aws_api_gateway_stage.prod.stage_name
  method_path = "orders/GET"

  settings {
    caching_enabled      = true
    cache_ttl_in_seconds = 60

    logging_level      = "INFO"
    data_trace_enabled = false
    metrics_enabled    = true

    throttling_burst_limit = 500
    throttling_rate_limit  = 200
  }
}

# Disable caching for mutations
resource "aws_api_gateway_method_settings" "create_order" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  stage_name  = aws_api_gateway_stage.prod.stage_name
  method_path = "orders/POST"

  settings {
    caching_enabled = false
    logging_level   = "INFO"
    metrics_enabled = true
  }
}
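
Which request parameters form the cache key matters as much as the TTL: API Gateway only distinguishes cached entries by parameters marked as cache key parameters. The toy cache below is a sketch of that behaviour, not the service's implementation; `TtlCache`, `key_params`, and the fake clock are all hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Dict, Tuple

class TtlCache:
    """Toy response cache keyed by path plus selected request parameters."""

    def __init__(self, ttl_seconds: float, key_params: Tuple[str, ...],
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.key_params = key_params
        self.clock = clock
        self._entries: Dict[tuple, Tuple[float, str]] = {}

    def _key(self, path: str, params: dict) -> tuple:
        # Only the listed parameters distinguish one cache entry from another
        selected = tuple((name, params.get(name, "")) for name in self.key_params)
        return (path, selected)

    def get_or_fetch(self, path: str, params: dict, fetch) -> str:
        key = self._key(path, params)
        entry = self._entries.get(key)
        now = self.clock()
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]            # cache hit: backend is not called
        value = fetch(path, params)    # cache miss or expired entry
        self._entries[key] = (now, value)
        return value

calls = []
def backend(path, params):
    calls.append(params)
    return f"page for cursor={params.get('cursor')}"

fake_now = [0.0]  # controllable clock so the example is deterministic
cache = TtlCache(ttl_seconds=60, key_params=("limit", "cursor"),
                 clock=lambda: fake_now[0])
cache.get_or_fetch("/orders", {"cursor": "a"}, backend)  # miss
cache.get_or_fetch("/orders", {"cursor": "a"}, backend)  # hit
cache.get_or_fetch("/orders", {"cursor": "b"}, backend)  # different key: miss
fake_now[0] = 61.0
cache.get_or_fetch("/orders", {"cursor": "a"}, backend)  # expired: miss
print(len(calls))  # 3 backend calls for 4 requests
```

With an empty `key_params`, the second cursor would receive the first cursor's page. The same reasoning applies to user-specific responses: if GET /orders returns per-user data, something identifying the caller would also need to be part of the key.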

Step 6: Custom Domain and WAF

# Custom domain
resource "aws_api_gateway_domain_name" "main" {
  domain_name              = "api.example.com"
  regional_certificate_arn = aws_acm_certificate.api.arn

  endpoint_configuration {
    types = ["REGIONAL"]
  }
}

# Base path mapping
resource "aws_api_gateway_base_path_mapping" "main" {
  api_id      = aws_api_gateway_rest_api.main.id
  stage_name  = aws_api_gateway_stage.prod.stage_name
  domain_name = aws_api_gateway_domain_name.main.domain_name
  base_path   = "v1"
}

# WAF Web ACL
resource "aws_wafv2_web_acl" "api" {
  name  = "api-gateway-waf"
  scope = "REGIONAL"

  default_action {
    allow {}
  }

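  # Rate limiting rule: block an IP that exceeds the limit
  rule {
    name     = "RateLimitRule"
    priority = 1

    # Rate-based rules take an action; override_action applies only to rule groups
    action {
      block {}
    }

    statement {
      rate_based_statement {
        limit              = 10000  # requests per 5-minute window, per IP
        aggregate_key_type = "IP"
      }
    }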

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitRule"
      sampled_requests_enabled   = true
    }
  }

  # AWS managed rules
  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 2

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSet"
      sampled_requests_enabled   = true
    }
  }

  # SQL injection protection
  rule {
    name     = "AWSManagedRulesSQLiRuleSet"
    priority = 3

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesSQLiRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "SQLiRuleSet"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "ApiGatewayWAF"
    sampled_requests_enabled   = true
  }
}

# Associate WAF with API Gateway
resource "aws_wafv2_web_acl_association" "api" {
  resource_arn = aws_api_gateway_stage.prod.arn
  web_acl_arn  = aws_wafv2_web_acl.api.arn
}
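
The WAF rate-based rule differs from usage-plan throttling: it counts requests per source IP over a trailing window instead of refilling a token bucket per API key. The sketch below models that idea with an exact sliding window. It is a simplification under stated assumptions: `RateBasedRule` is a hypothetical class, and the managed service evaluates rates periodically and approximately rather than on every request.

```python
from collections import defaultdict, deque

class RateBasedRule:
    """Simplified per-IP sliding-window counter."""

    def __init__(self, limit: int, window_seconds: float = 300.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps inside the window

    def allow(self, ip: str, now: float) -> bool:
        """Record a request and report whether the IP is within its limit."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the trailing window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)  # blocked requests still count toward the rate
        return len(hits) <= self.limit

# Small limit so the behaviour is easy to follow
rule = RateBasedRule(limit=3, window_seconds=300)
print([rule.allow("203.0.113.7", t) for t in (0, 1, 2, 3)])  # [True, True, True, False]
print(rule.allow("198.51.100.9", 3))   # True: other IPs are counted separately
print(rule.allow("203.0.113.7", 301))  # True: the hits at t=0 and t=1 have expired
```

Because the key is the client IP, this layer protects against floods from unauthenticated sources before API keys or usage plans are even evaluated.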

REST API vs HTTP API

REST API (API Gateway):
├── Full features (caching, WAF, request validation)
├── Usage plans and API keys
├── Request/response transformation
├── Private integrations
├── Higher latency (~15ms overhead)
└── Higher cost

HTTP API:
├── Simpler, faster (~5ms overhead)
├── JWT authorizer built-in
├── OIDC and OAuth 2.0 support
├── Auto-deploy
├── 70% cheaper than REST API
└── Missing: caching, request validation, WAF (use CloudFront)

API Gateway Best Practices

Feature      Configuration                             Purpose
Caching      60s TTL for GET, disabled for mutations   Reduce backend calls
Throttling   Per method, based on tier                 Prevent abuse
Validation   Request body and parameters               Fail fast, protect backend
Logging      Access + execution logs                   Debugging and compliance
WAF          Rate limit + managed rules                Security

Practice Question

Why should you use request validation in API Gateway instead of validating in Lambda?
