
DynamoDB is throttling requests and costs are high. Optimize the table design.


The Scenario

Your DynamoDB table is having issues:

Current Problems:
├── ProvisionedThroughputExceededException: 500+ per hour
├── Monthly cost: $2,400 (10x expected)
├── Hot partition detected: "USER#premium"
├── Table size: 50GB
├── Read capacity: 5,000 RCU (provisioned)
├── Write capacity: 1,000 WCU (provisioned)
└── GSI count: 5 (all with same partition key)

The Challenge

Optimize the DynamoDB table design to eliminate throttling, reduce costs, and handle traffic spikes without over-provisioning.

Wrong Approach

A junior engineer might just increase provisioned capacity, enable auto-scaling with high maximums, or add more GSIs. These approaches raise costs without fixing the root cause: extra capacity does nothing for a hot partition (per-partition limits still apply), and each additional GSI duplicates storage and write costs.

Right Approach

A senior engineer analyzes access patterns, fixes partition key design for even distribution, uses write sharding for hot keys, implements efficient GSIs, and chooses the right capacity mode based on traffic patterns.

Step 1: Analyze Current Access Patterns

# Check table metrics
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ConsumedReadCapacityUnits \
  --dimensions Name=TableName,Value=orders \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-02T00:00:00Z \
  --period 3600 \
  --statistics Sum

# Check for throttling
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ThrottledRequests \
  --dimensions Name=TableName,Value=orders \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-02T00:00:00Z \
  --period 300 \
  --statistics Sum

# Describe table
aws dynamodb describe-table --table-name orders \
  --query '{PartitionKey: Table.KeySchema, ItemCount: Table.ItemCount, Size: Table.TableSizeBytes}'
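The same CloudWatch metrics can be pulled programmatically and reduced to a rough throttle signal. A sketch with boto3 (the client is passed in so it can be stubbed in tests; comparing throttled requests to consumed capacity units is only an approximation, since the units differ):

```python
from datetime import datetime, timedelta, timezone

def throttle_ratio(cloudwatch, table_name: str, hours: int = 24) -> float:
    """Rough ratio of throttled requests to consumed capacity units."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)

    def metric_sum(name: str) -> float:
        resp = cloudwatch.get_metric_statistics(
            Namespace='AWS/DynamoDB',
            MetricName=name,
            Dimensions=[{'Name': 'TableName', 'Value': table_name}],
            StartTime=start,
            EndTime=end,
            Period=3600,
            Statistics=['Sum'],
        )
        return sum(point['Sum'] for point in resp['Datapoints'])

    throttled = metric_sum('ThrottledRequests')
    consumed = (metric_sum('ConsumedReadCapacityUnits')
                + metric_sum('ConsumedWriteCapacityUnits'))
    return throttled / consumed if consumed else 0.0

# usage: throttle_ratio(boto3.client('cloudwatch'), 'orders')
```

A sustained non-zero ratio is the cue to look at key design rather than just raising capacity.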

Step 2: Fix Hot Partition with Write Sharding

# BEFORE: Hot partition key
# PK: "USER#premium" receives 80% of writes

# AFTER: Write sharding
import random
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('orders')

SHARD_COUNT = 10

def get_sharded_pk(user_type: str) -> str:
    """Add random shard suffix to distribute writes."""
    shard = random.randint(0, SHARD_COUNT - 1)
    return f"{user_type}#{shard}"

def put_order(order: dict):
    """Write order with sharded partition key."""
    item = {
        'PK': get_sharded_pk(order['user_type']),
        'SK': f"ORDER#{order['order_id']}",
        'user_id': order['user_id'],
        'amount': order['amount'],
        'created_at': order['created_at'],
        # Keep the unsharded user type as a plain attribute
        'user_type': order['user_type']
    }
    table.put_item(Item=item)

def query_orders_by_type(user_type: str) -> list:
    """Query all shards in parallel, paginating each one."""
    import concurrent.futures

    def query_shard(shard: int) -> list:
        # A single Query returns at most 1 MB, so follow
        # LastEvaluatedKey until the shard is exhausted
        items = []
        kwargs = {'KeyConditionExpression': Key('PK').eq(f"{user_type}#{shard}")}
        while True:
            response = table.query(**kwargs)
            items.extend(response['Items'])
            last_key = response.get('LastEvaluatedKey')
            if not last_key:
                return items
            kwargs['ExclusiveStartKey'] = last_key

    all_items = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=SHARD_COUNT) as executor:
        futures = [executor.submit(query_shard, i) for i in range(SHARD_COUNT)]
        for future in concurrent.futures.as_completed(futures):
            all_items.extend(future.result())

    return all_items
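A random suffix spreads writes evenly but forces a fan-out on every read. When the dominant read pattern is "all items for one user", a deterministic variant is often the better trade-off: hash a stable attribute so the same user always lands on the same shard, keeping single-user reads a single Query. A sketch (the helper name is illustrative):

```python
import hashlib

SHARD_COUNT = 10

def deterministic_shard_pk(user_type: str, user_id: str) -> str:
    """Derive the shard from a hash of the user ID, so one user's
    items always live in exactly one shard."""
    digest = hashlib.md5(user_id.encode('utf-8')).hexdigest()
    shard = int(digest, 16) % SHARD_COUNT
    return f"{user_type}#{shard}"
```

The caveat: one user's writes no longer spread across shards, so this only helps when no single user dominates traffic.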

Step 3: Optimized Table Design

resource "aws_dynamodb_table" "orders" {
  name         = "orders"
  billing_mode = "PAY_PER_REQUEST"  # On-demand for variable traffic
  hash_key     = "PK"
  range_key    = "SK"

  attribute {
    name = "PK"
    type = "S"
  }

  attribute {
    name = "SK"
    type = "S"
  }

  attribute {
    name = "GSI1PK"
    type = "S"
  }

  attribute {
    name = "GSI1SK"
    type = "S"
  }

  # Single GSI with overloaded keys (instead of 5 GSIs)
  global_secondary_index {
    name            = "GSI1"
    hash_key        = "GSI1PK"
    range_key       = "GSI1SK"
    projection_type = "ALL"
  }

  # TTL for automatic cleanup
  ttl {
    attribute_name = "expires_at"
    enabled        = true
  }

  # Point-in-time recovery
  point_in_time_recovery {
    enabled = true
  }

  tags = {
    Environment = "production"
  }
}

Step 4: Single Table Design Pattern

"""
Single Table Design - Multiple entity types in one table

Entity Types:
- USER: User profile data
- ORDER: Order records
- PRODUCT: Product catalog

Access Patterns:
1. Get user by ID
2. Get all orders for a user
3. Get order by ID
4. Get orders by status (GSI)
5. Get products by category (GSI)
"""

# Item structure examples
items = [
    # User entity
    {
        'PK': 'USER#user123',
        'SK': 'PROFILE',
        'entity_type': 'USER',
        'email': 'user@example.com',
        'name': 'John Doe',
        'GSI1PK': 'USER',  # For listing all users
        'GSI1SK': 'user123'
    },
    # Order entity
    {
        'PK': 'USER#user123',
        'SK': 'ORDER#2024-01-15#order456',
        'entity_type': 'ORDER',
        'order_id': 'order456',
        'amount': 99.99,
        'status': 'SHIPPED',
        'GSI1PK': 'ORDER#SHIPPED',  # For querying by status
        'GSI1SK': '2024-01-15#order456'
    },
    # Product entity
    {
        'PK': 'PRODUCT#prod789',
        'SK': 'DETAILS',
        'entity_type': 'PRODUCT',
        'name': 'Widget',
        'price': 29.99,
        'GSI1PK': 'CATEGORY#electronics',  # For querying by category
        'GSI1SK': 'PRODUCT#prod789'
    }
]

# Access pattern implementations
def get_user(user_id: str):
    """Get user profile."""
    response = table.get_item(
        Key={'PK': f'USER#{user_id}', 'SK': 'PROFILE'}
    )
    return response.get('Item')

def get_user_orders(user_id: str, limit: int = 20):
    """Get all orders for a user, sorted by date."""
    response = table.query(
        KeyConditionExpression=Key('PK').eq(f'USER#{user_id}') & Key('SK').begins_with('ORDER#'),
        ScanIndexForward=False,  # Descending order
        Limit=limit
    )
    return response['Items']

def get_orders_by_status(status: str, limit: int = 100):
    """Get orders by status using GSI."""
    response = table.query(
        IndexName='GSI1',
        KeyConditionExpression=Key('GSI1PK').eq(f'ORDER#{status}'),
        Limit=limit
    )
    return response['Items']

def get_products_by_category(category: str):
    """Get products by category using GSI."""
    response = table.query(
        IndexName='GSI1',
        KeyConditionExpression=Key('GSI1PK').eq(f'CATEGORY#{category}')
    )
    return response['Items']
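One detail worth calling out: the ORDER sort key embeds an ISO-8601 date before the order ID, which is exactly what makes `ScanIndexForward=False` return newest-first, since lexicographic order on the string equals chronological order. Centralizing key construction in small helpers (names here are illustrative) keeps that format consistent:

```python
def user_pk(user_id: str) -> str:
    """Partition key for a user and all items hanging off that user."""
    return f"USER#{user_id}"

def order_sk(created_date: str, order_id: str) -> str:
    """Sort key for an order. created_date must be ISO-8601
    (YYYY-MM-DD) so that string order matches date order."""
    return f"ORDER#{created_date}#{order_id}"
```

Every put and query then goes through the same two functions, so a formatting typo can't silently split one user's data across keys.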

Step 5: Batch Operations for Efficiency

from boto3.dynamodb.types import TypeSerializer
import boto3

dynamodb = boto3.client('dynamodb')
serializer = TypeSerializer()

def batch_write_orders(orders: list):
    """Batch write up to 25 items at a time."""
    # DynamoDB limit: 25 items per batch
    batch_size = 25

    for i in range(0, len(orders), batch_size):
        batch = orders[i:i + batch_size]

        request_items = {
            'orders': [
                {
                    'PutRequest': {
                        'Item': {k: serializer.serialize(v) for k, v in order.items()}
                    }
                }
                for order in batch
            ]
        }

        response = dynamodb.batch_write_item(RequestItems=request_items)

        # Retry unprocessed items (returned when writes are throttled),
        # with capped exponential backoff
        import time
        unprocessed = response.get('UnprocessedItems', {})
        attempt = 0
        while unprocessed:
            time.sleep(min(5.0, 0.1 * (2 ** attempt)))
            response = dynamodb.batch_write_item(RequestItems=unprocessed)
            unprocessed = response.get('UnprocessedItems', {})
            attempt += 1

def batch_get_orders(order_keys: list):
    """Batch get up to 100 items at a time."""
    batch_size = 100
    all_items = []

    for i in range(0, len(order_keys), batch_size):
        batch = order_keys[i:i + batch_size]

        request_items = {
            'orders': {
                'Keys': [
                    {
                        'PK': {'S': key['PK']},
                        'SK': {'S': key['SK']}
                    }
                    for key in batch
                ]
            }
        }

        response = dynamodb.batch_get_item(RequestItems=request_items)
        all_items.extend(response['Responses']['orders'])

        # Retry keys DynamoDB could not process in this call
        unprocessed = response.get('UnprocessedKeys', {})
        while unprocessed:
            response = dynamodb.batch_get_item(RequestItems=unprocessed)
            all_items.extend(response['Responses'].get('orders', []))
            unprocessed = response.get('UnprocessedKeys', {})

    return all_items
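Production retry loops for unprocessed items usually add jitter to exponential backoff so that many throttled clients don't retry in lockstep (the AWS SDKs do this internally). A full-jitter sketch, with illustrative parameters:

```python
import random

def backoff_delay(attempt: int, base: float = 0.05, cap: float = 5.0) -> float:
    """Full-jitter backoff: pick a uniform random delay up to the
    capped exponential for this attempt number."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

Between retries, sleep for `backoff_delay(attempt)` instead of a fixed interval; the jitter desynchronizes competing writers so retried batches don't hammer the same partition simultaneously.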

Step 6: Cost Optimization Strategies

# On-demand for unpredictable traffic
resource "aws_dynamodb_table" "orders_ondemand" {
  name         = "orders"
  billing_mode = "PAY_PER_REQUEST"
  # Automatically scales; pay only for what you use
  # Best for: variable traffic, new applications
  # (key schema and attribute blocks omitted for brevity)
}

# Provisioned with auto-scaling for predictable traffic
resource "aws_dynamodb_table" "orders_provisioned" {
  name         = "orders"
  billing_mode = "PROVISIONED"
  read_capacity  = 100
  write_capacity = 50
  # (key schema and attribute blocks omitted for brevity)
}

resource "aws_appautoscaling_target" "read" {
  max_capacity       = 1000
  min_capacity       = 100
  resource_id        = "table/${aws_dynamodb_table.orders_provisioned.name}"
  scalable_dimension = "dynamodb:table:ReadCapacityUnits"
  service_namespace  = "dynamodb"
}

resource "aws_appautoscaling_policy" "read" {
  name               = "DynamoDBReadAutoScaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.read.resource_id
  scalable_dimension = aws_appautoscaling_target.read.scalable_dimension
  service_namespace  = aws_appautoscaling_target.read.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBReadCapacityUtilization"
    }
    target_value       = 70.0
    scale_in_cooldown  = 60
    scale_out_cooldown = 60
  }
}

# Reserved capacity for consistent workloads (save up to 76%)
# Must be purchased through AWS Console or CLI
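A back-of-envelope comparison helps pick the mode. The default prices below are illustrative only (roughly us-east-1 list prices; always check current regional pricing before deciding):

```python
def on_demand_monthly(reads: int, writes: int,
                      read_price_per_m: float = 0.25,
                      write_price_per_m: float = 1.25) -> float:
    """Estimated monthly request cost in on-demand mode.
    Prices are per million request units (illustrative defaults)."""
    return reads / 1e6 * read_price_per_m + writes / 1e6 * write_price_per_m

def provisioned_monthly(rcu: int, wcu: int,
                        rcu_hour: float = 0.00013,
                        wcu_hour: float = 0.00065,
                        hours: int = 730) -> float:
    """Estimated monthly capacity cost in provisioned mode.
    Prices are per capacity-unit-hour (illustrative defaults)."""
    return (rcu * rcu_hour + wcu * wcu_hour) * hours
```

Running both functions against your actual request counts and provisioned levels shows the break-even point: steady traffic that keeps provisioned capacity well utilized favors provisioned (and reserved) pricing, while spiky or idle-heavy traffic favors on-demand.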

Step 7: Query Optimization

# Use ProjectionExpression to reduce read costs
def get_order_summary(user_id: str, order_id: str):
    """Get only needed attributes."""
    response = table.get_item(
        Key={'PK': f'USER#{user_id}', 'SK': f'ORDER#{order_id}'},
        ProjectionExpression='order_id, amount, #s, created_at',
        ExpressionAttributeNames={'#s': 'status'}  # 'status' is reserved
    )
    return response.get('Item')

# Use FilterExpression sparingly (still consumes full read capacity)
def get_recent_high_value_orders(user_id: str, min_amount: float):
    """Filter after query - use with caution."""
    response = table.query(
        KeyConditionExpression=Key('PK').eq(f'USER#{user_id}') & Key('SK').begins_with('ORDER#'),
        FilterExpression='amount >= :min_amount',
        ExpressionAttributeValues={':min_amount': min_amount}
    )
    return response['Items']

# Better: move the filter into a key. This assumes an additional GSI
# ('GSI-ByAmount', not defined in the table above) with entity_type as
# its partition key and amount as its sort key.
def get_high_value_orders_optimized(min_amount: float):
    """Query a GSI where amount is part of the key."""
    response = table.query(
        IndexName='GSI-ByAmount',
        KeyConditionExpression=Key('entity_type').eq('ORDER') & Key('amount').gte(min_amount)
    )
    return response['Items']

DynamoDB Cost Optimization Summary

Strategy               Savings          When to Use
─────────────────────  ───────────────  ──────────────────────────────────
On-demand billing      Variable         Unpredictable traffic
Auto-scaling           30-50%           Predictable patterns
Reserved capacity      Up to 76%        Consistent baseline
Single table design    50%+ storage     Multiple entity types
ProjectionExpression   Per-query        Large items, few attributes needed
TTL                    Storage costs    Temporary data

Practice Question

Why does a hot partition cause throttling even when you have unused capacity in other partitions?