
How to Build an Enterprise-Grade RAG Chatbot on AWS with Terraform

Sivaram Raju
Founder
Saturday, Nov 8, 2025

The Problem: Project Knowledge is Scattered and Lost

In any large project, knowledge is a beast to manage. You have design docs, best practice guides, API specifications, and code examples spread across wikis, PDFs, and shared drives. When a new team member asks, “What’s the right way to provision an S3 bucket for this project?” the answer is buried in a 50-page document they’ll never find.

This isn’t just inefficient—it’s a recipe for costly mistakes. Bugs and bad architecture happen when critical information is hard to find.

This post will show you how to solve that problem. We will build an enterprise-grade, Retrieval-Augmented Generation (RAG) chatbot using Terraform to deploy a scalable, multi-service architecture on AWS. This “Knowledge Bot” will ingest all of your project’s documents and provide instant, accurate answers to your team’s questions, complete with source citations.

This is how you build a “single source of truth” that people will actually use.

Architecture: The Serverless RAG Blueprint on AWS

We’re not using the AWS console. We’re not building a toy. This is a production-ready architecture defined entirely with Terraform. It’s repeatable, scalable, and secure.

Figure 1: The secure, end-to-end request flow for the Knowledge Bot.

The diagram above shows how a user’s request travels securely from their browser to the private backend and back. Notice the key security boundaries: the API Gateway acts as the public entry point, while the core logic runs in an isolated VPC with no internet access.

RAG Pipeline: Ingestion vs. Query

At the heart of our RAG bot are two distinct processes, as shown below: data ingestion and query processing.

Figure 2: The two core phases of the RAG pipeline.

Here’s the breakdown of the two core workflows shown above:

  1. Phase 1: Data Ingestion (The “R” in RAG):

    • A user uploads source documents (PDF, Markdown, etc.) to a secure S3 bucket.
    • The S3 upload triggers a Sync Lambda function (the Terraform wiring for this trigger is sketched right after this list).
    • This function tells the Bedrock Knowledge Base to start an “ingestion job.”
    • The Knowledge Base automatically pulls the new document, splits it into chunks, creates vector embeddings using Amazon Titan, and stores them in its managed vector store.
  2. Phase 2: Query & Generation (The “G” in RAG):

    • A user asks a question in the HTML/JS frontend.
    • The frontend calls our API Gateway endpoint.
    • API Gateway triggers the Chat Lambda function.
    • The Lambda sends the query to the Bedrock Knowledge Base using the RetrieveAndGenerate API.
    • Bedrock finds the most relevant document chunks (retrieval), passes them to the Claude 3 Sonnet model along with the user’s question (augmentation), and generates a human-readable answer (generation).
    • The response, including source citations from the original documents, is returned to the user.
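Before moving on, it's worth seeing how the Phase 1 trigger is wired up. Here is a minimal Terraform sketch of the S3-to-Lambda notification (the resource names aws_lambda_function.sync and aws_s3_bucket.source_docs are assumptions matching our module layout, not code lifted from the repo):

# Allow S3 to invoke the sync function
resource "aws_lambda_permission" "allow_s3" {
  statement_id  = "AllowExecutionFromS3"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.sync.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.source_docs.arn
}

# Invoke the sync function whenever a new document lands in the bucket
resource "aws_s3_bucket_notification" "docs_upload" {
  bucket = aws_s3_bucket.source_docs.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.sync.arn
    events              = ["s3:ObjectCreated:*"]
  }

  depends_on = [aws_lambda_permission.allow_s3]
}

Without the aws_lambda_permission resource, S3 silently fails to deliver events, so the two resources always travel together.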

We intentionally use two separate Lambda functions. The sync-function is responsible only for starting the data ingestion pipeline. It’s triggered by S3 events and has a short lifecycle. The chat-function is the user-facing API, optimized for low latency and high concurrency. This separation of concerns is a critical best practice for building robust, scalable serverless applications. It allows us to apply different IAM permissions, memory settings, and timeouts for each function, preventing a slow data sync from impacting the user’s chat experience.
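To make that separation concrete, here is a minimal sketch of the two function definitions with different tuning (the handler packaging, role names, and specific values are illustrative assumptions, not the module's actual code):

# User-facing chat API: generous memory, capped at API Gateway's 29s integration limit
resource "aws_lambda_function" "chat" {
  function_name = "${var.project_name}-chat-${var.environment}"
  runtime       = "nodejs18.x"
  handler       = "index.handler"
  filename      = data.archive_file.chat.output_path
  role          = aws_iam_role.chat_lambda_role.arn
  memory_size   = 1024
  timeout       = 29
}

# Event-driven sync trigger: small, but allowed to run longer
resource "aws_lambda_function" "sync" {
  function_name = "${var.project_name}-sync-${var.environment}"
  runtime       = "nodejs18.x"
  handler       = "index.handler"
  filename      = data.archive_file.sync.output_path
  role          = aws_iam_role.sync_lambda_role.arn
  memory_size   = 256
  timeout       = 60
}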


A Word on Security: Building a Fortress Around Your Bot

A chatbot that handles internal documents must be secure. This architecture was designed with a security-first mindset, implementing multiple layers of defense. This is non-negotiable for any real-world application.

Here’s how we protect our application from end to end:

  1. Private Networking by Default: Our Lambda functions operate within a private VPC. They have no direct internet access, which drastically reduces the attack surface. They communicate with other AWS services like Bedrock through a VPC Endpoint, ensuring traffic never leaves the AWS network (a Terraform sketch of this endpoint follows this list).

  2. Principle of Least Privilege (IAM): We don’t use a generic, over-privileged role. Each Lambda function has a specific IAM Execution Role with only the permissions it absolutely needs. For example, the chat function is only allowed to call bedrock:RetrieveAndGenerate on our specific Knowledge Base.

  3. API Gateway Controls: The public-facing API Gateway acts as a secure front door. We enforce API Key authentication, so only authorized clients can make requests. We also configure a strict CORS policy to ensure requests can only originate from our allowed frontend domains.

  4. Encryption Everywhere: All data is encrypted, both in transit (with TLS) and at rest. The S3 bucket for source documents and the Bedrock Knowledge Base’s underlying vector store are both encrypted using AWS KMS (Key Management Service) with a customer-managed key.

  5. Content Safety with Guardrails: We use Bedrock Guardrails to automatically filter both user inputs and the AI’s generated responses. This helps block harmful content, prevent the model from going off-topic, and adds a critical layer of safety to the user-facing application.
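For point 1, keeping Bedrock traffic inside the AWS network comes down to a single interface endpoint in the VPC. A minimal sketch, assuming the VPC module and security group expose the IDs shown here:

# Interface endpoint so Lambdas in private subnets can reach the
# Bedrock agent runtime (RetrieveAndGenerate) without internet or NAT
resource "aws_vpc_endpoint" "bedrock_agent_runtime" {
  vpc_id              = module.vpc.vpc_id
  service_name        = "com.amazonaws.${var.aws_region}.bedrock-agent-runtime"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = module.vpc.private_subnet_ids
  security_group_ids  = [aws_security_group.lambda.id]
  private_dns_enabled = true
}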


Code Deep Dive: Infrastructure as Code with Terraform

Our entire architecture is defined in Terraform. This is how real teams work—no manual clicking in the console. This ensures we can deploy the exact same architecture across Dev, Staging, and Prod environments with confidence.

Repository Structure

.
├── terraform/
│   ├── environments/
│   │   └── prod/
│   │       ├── main.tf
│   │       ├── variables.tf
│   │       └── backend.tf
│   └── modules/
│       ├── api-gateway/
│       ├── bedrock/
│       ├── lambda/
│       ├── s3/
│       └── vpc/
├── src/
│   ├── frontend/
│   │   └── knowledge-bot.html
│   └── lambda/
│       ├── chat-function/
│       │   └── index.js
│       └── sync-function/
│           └── index.js
└── README.md

Terraform Configuration

Here are the most important files that define our infrastructure.

environments/prod/main.tf
provider "aws" {
region = var.aws_region
default_tags {
  tags = var.tags
}
}

data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

resource "random_id" "bucket_suffix" {
byte_length = 4
}

# S3 Bucket for source documents
resource "aws_s3_bucket" "source_docs" {
bucket = "${var.project_name}-source-docs-${random_id.bucket_suffix.hex}"
}

resource "aws_s3_bucket_server_side_encryption_configuration" "source_docs" {
bucket = aws_s3_bucket.source_docs.id
rule {
  apply_server_side_encryption_by_default {
    kms_master_key_id = var.cmk_key_id
    sse_algorithm     = "aws:kms"
  }
}
}

# Bedrock Knowledge Base Module
module "bedrock" {
source = "../../modules/bedrock"

project_name        = var.project_name
environment         = var.environment
source_docs_bucket  = aws_s3_bucket.source_docs.bucket
cmk_key_id          = var.cmk_key_id
}

# API Gateway with Lambda Module
module "api_gateway" {
source = "../../modules/api-gateway"

project_name               = var.project_name
environment                = var.environment
knowledge_base_id          = module.bedrock.knowledge_base_id
data_source_id             = module.bedrock.data_source_id
cmk_key_id                 = var.cmk_key_id
allowed_origins            = var.allowed_origins
# ... other variables
}

variable "project_name" {
description = "Name of the project"
type        = string
default     = "aws-knowledge-bot"
}

variable "environment" {
description = "Environment (e.g., 'dev', 'staging', 'prod')"
type        = string
default     = "dev"
}

variable "aws_region" {
description = "The AWS region to deploy resources into."
type        = string
default     = "us-east-1"
}

variable "cmk_key_id" {
description = "KMS Customer Managed Key ID for encryption."
type        = string
}

variable "allowed_origins" {
description = "A list of allowed origins for CORS requests to the API."
type        = list(string)
default     = ["http://localhost:3000", "https://*.your-domain.com"]
}

variable "tags" {
description = "Common tags to apply to all resources."
type        = map(string)
default = {
  Project     = "Knowledge-Bot"
  Owner       = "YourName"
  ManagedBy   = "Terraform"
}
}
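The repository tree also lists backend.tf, which keeps Terraform state out of your laptop and in shared, locked storage. A minimal sketch of what it typically contains (bucket and table names are placeholders, not values from this project):

# environments/prod/backend.tf
terraform {
  backend "s3" {
    bucket         = "your-terraform-state-bucket"
    key            = "knowledge-bot/prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-locks" # optional: enables state locking
  }
}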

The Backend: Lambda Functions

Our logic lives in two Node.js Lambda functions: the user-facing chat handler and the S3-triggered sync handler.

src/lambda/chat-function/index.js
const { BedrockAgentRuntimeClient, RetrieveAndGenerateCommand } = require("@aws-sdk/client-bedrock-agent-runtime");

const client = new BedrockAgentRuntimeClient({ region: process.env.AWS_REGION });

// Standard success/error responses.
// Note: a wildcard CORS origin keeps this example simple; in production,
// echo back only the origins configured in the Terraform allowed_origins list.
const createSuccessResponse = (data) => ({
  statusCode: 200,
  headers: { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' },
  body: JSON.stringify({ success: true, data })
});

const createErrorResponse = (statusCode, error) => ({
  statusCode,
  headers: { 'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*' },
  body: JSON.stringify({ success: false, error })
});

exports.handler = async (event) => {
  // Handle CORS preflight requests
  if (event.httpMethod === 'OPTIONS') {
      return {
          statusCode: 200,
          headers: {
              'Access-Control-Allow-Origin': '*',
              'Access-Control-Allow-Headers': 'Content-Type,X-Amz-Date,Authorization,X-Api-Key',
              'Access-Control-Allow-Methods': 'POST,OPTIONS'
          },
          body: ''
      };
  }

  try {
      const body = JSON.parse(event.body);
      const { question, sessionId } = body;

      if (!question) {
          return createErrorResponse(400, 'Question is required.');
      }

      const command = new RetrieveAndGenerateCommand({
          input: { text: question },
          retrieveAndGenerateConfiguration: {
              type: 'KNOWLEDGE_BASE',
              knowledgeBaseConfiguration: {
                  knowledgeBaseId: process.env.KNOWLEDGE_BASE_ID,
                  modelArn: `arn:aws:bedrock:${process.env.AWS_REGION}::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0`,
                  // Guardrails hook into the generation half of RetrieveAndGenerate,
                  // so they live under generationConfiguration
                  generationConfiguration: {
                      guardrailConfiguration: {
                          guardrailId: process.env.GUARDRAIL_ID,
                          guardrailVersion: process.env.GUARDRAIL_VERSION
                      }
                  }
              }
          },
          sessionId: sessionId || undefined
      });

      const response = await client.send(command);

      const result = {
          answer: response.output?.text,
          sessionId: response.sessionId,
          citations: response.citations
      };

      return createSuccessResponse(result);

  } catch (error) {
      console.error('Error processing chat request:', error);
      return createErrorResponse(500, 'Internal server error: ' + error.message);
  }
};

src/lambda/sync-function/index.js

const { BedrockAgentClient, StartIngestionJobCommand } = require('@aws-sdk/client-bedrock-agent');

const bedrockClient = new BedrockAgentClient({ region: process.env.AWS_REGION });

exports.handler = async (event) => {
  console.log('Sync function invoked:', JSON.stringify(event, null, 2));

  const { KNOWLEDGE_BASE_ID, DATA_SOURCE_ID } = process.env;

  // Ensure the event is from S3 and has records
  if (!event.Records || event.Records.length === 0) {
      console.log('No records found in event, skipping.');
      return { statusCode: 200, body: 'No records to process.' };
  }

  try {
      const command = new StartIngestionJobCommand({
          knowledgeBaseId: KNOWLEDGE_BASE_ID,
          dataSourceId: DATA_SOURCE_ID,
          description: `Automated sync triggered by S3 upload at ${new Date().toISOString()}`
      });

      const response = await bedrockClient.send(command);
      const jobId = response.ingestionJob?.ingestionJobId;

      console.log('Successfully started ingestion job:', jobId);

      return {
          statusCode: 200,
          body: JSON.stringify({
              message: 'Ingestion job started successfully',
              ingestionJobId: jobId
          })
      };
  } catch (error) {
      console.error('Error starting ingestion job:', error);
      return {
          statusCode: 500,
          body: JSON.stringify({
              error: 'Failed to start ingestion job',
              message: error.message
          })
      };
  }
};



The Frontend: A Simple & Embeddable UI

The frontend is a single HTML file with vanilla JavaScript. This self-contained approach makes it incredibly easy to embed the chatbot into any existing website or internal portal using an iframe.

src/frontend/knowledge-bot.html (JavaScript Snippet)
// Configuration
const CONFIG = {
    API_URL: 'https://YOUR-API-ID.execute-api.us-east-1.amazonaws.com/dev/chat', // <-- Make sure to use your actual API Gateway URL
    API_KEY: 'YOUR-API-KEY', // <-- Required: the API enforces API Key authentication
    SESSION_ID: 'session-' + Date.now()
};

// Send message to the API
async function sendMessage() {
    const message = chatInput.value.trim();
    if (!message || isLoading) return;

    addMessage('user', message);
    setLoading(true);

    try {
        const response = await fetch(CONFIG.API_URL, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'X-Api-Key': CONFIG.API_KEY // API Gateway rejects requests without a valid key
            },
            body: JSON.stringify({
                question: message,
                sessionId: CONFIG.SESSION_ID
            })
        });

        if (!response.ok) throw new Error(`HTTP error! status: ${response.status}`);

        const data = await response.json();
        const { answer, citations } = data.data;

        addMessage('bot', answer, citations);

    } catch (error) {
        console.error('Error sending message:', error);
        addMessage('bot', 'Sorry, I encountered an error while trying to connect to the service.');
    } finally {
        setLoading(false);
    }
}



Why This Breaks: Common Pitfalls & How to Fix Them

Building a multi-service cloud application is never a straight line. Here are the real-world problems you will hit and how to solve them.

Pitfall #1: 'Access Denied' - The IAM Nightmare

The Error: Your Lambda function logs show AccessDeniedException when trying to call Bedrock or S3.

Why it happens: This is the #1 most common problem in cloud development. Your Lambda function’s IAM Execution Role does not have explicit permission to talk to the Bedrock service or the S3 bucket. By default, a Lambda function can’t access anything.

The Fix: You must attach policies to the Lambda role that grant specific permissions. In our Terraform module for the API Gateway (which creates the Lambda), we define the role and attach the necessary policies.

# From modules/api-gateway/main.tf

data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

resource "aws_iam_role" "chat_lambda_role" {
  name = "${var.project_name}-chat-lambda-role-${var.environment}"
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Action = "sts:AssumeRole",
      Effect = "Allow",
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

resource "aws_iam_policy" "bedrock_policy" {
  name = "${var.project_name}-bedrock-policy-${var.environment}"
  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        # Knowledge Base ARNs are account-scoped, unlike foundation-model ARNs
        Action   = ["bedrock:RetrieveAndGenerate", "bedrock:Retrieve"],
        Effect   = "Allow",
        Resource = "arn:aws:bedrock:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:knowledge-base/${var.knowledge_base_id}"
      },
      {
        # RetrieveAndGenerate also invokes the foundation model on our behalf
        Action   = "bedrock:InvokeModel",
        Effect   = "Allow",
        Resource = "arn:aws:bedrock:${data.aws_region.current.name}::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "bedrock" {
  role       = aws_iam_role.chat_lambda_role.name
  policy_arn = aws_iam_policy.bedrock_policy.arn
}

# Also attach AWSLambdaVPCAccessExecutionRole for VPC access
resource "aws_iam_role_policy_attachment" "vpc_access" {
  role       = aws_iam_role.chat_lambda_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
}

This code creates a role that the Lambda function “assumes,” and attaches a policy granting it permission to retrieve and generate only against the specific Knowledge Base we created, plus InvokeModel only on the one foundation model it uses. This is the principle of least privilege in action.

Pitfall #2: The Sync Job Fails Silently

The Problem: You upload a new document to S3, but your chatbot’s knowledge isn’t updated. You check the sync-function logs and see no errors.

Why it happens: The Bedrock Knowledge Base ingestion job failed, but your Lambda function doesn’t know that. The StartIngestionJob API is asynchronous. It kicks off the job and immediately returns a 200 OK response. The actual processing happens in the background and can take minutes. If it fails, it fails silently from your Lambda’s perspective.

The Fix: For a production system, you need a monitoring loop. A more robust solution would use AWS Step Functions to orchestrate the process:

  1. S3 Trigger starts a Step Function workflow.
  2. Step 1: Call StartIngestionJob (our current sync-function logic).
  3. Step 2: Enter a Wait state (e.g., for 30 seconds).
  4. Step 3: Call GetIngestionJob to check the job’s status.
  5. Step 4 (Choice State):
    • If IN_PROGRESS, loop back to Step 2.
    • If COMPLETE, end the workflow successfully.
    • If FAILED, trigger an SNS notification to alert an administrator.

This creates a resilient, observable pipeline for data ingestion, which is critical for an enterprise-grade RAG system.
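As a sketch of that fix, here is what the monitoring loop could look like as a Terraform-defined Step Functions state machine. This assumes the Step Functions AWS SDK integrations for bedrockagent, plus an IAM role and SNS topic that are not part of this project's actual code:

resource "aws_sfn_state_machine" "ingestion_monitor" {
  name     = "${var.project_name}-ingestion-monitor"
  role_arn = aws_iam_role.sfn.arn # assumed role with bedrock + sns:Publish access

  definition = jsonencode({
    StartAt = "StartIngestionJob"
    States = {
      StartIngestionJob = {
        Type     = "Task"
        Resource = "arn:aws:states:::aws-sdk:bedrockagent:startIngestionJob"
        Parameters = {
          KnowledgeBaseId = var.knowledge_base_id
          DataSourceId    = var.data_source_id
        }
        Next = "Wait30s"
      }
      Wait30s = { Type = "Wait", Seconds = 30, Next = "CheckStatus" }
      CheckStatus = {
        Type     = "Task"
        Resource = "arn:aws:states:::aws-sdk:bedrockagent:getIngestionJob"
        Parameters = {
          KnowledgeBaseId    = var.knowledge_base_id
          DataSourceId       = var.data_source_id
          "IngestionJobId.$" = "$.IngestionJob.IngestionJobId"
        }
        Next = "IsJobDone"
      }
      IsJobDone = {
        Type = "Choice"
        Choices = [
          { Variable = "$.IngestionJob.Status", StringEquals = "COMPLETE", Next = "Done" },
          { Variable = "$.IngestionJob.Status", StringEquals = "FAILED", Next = "AlertAdmin" }
        ]
        Default = "Wait30s" # still IN_PROGRESS: keep polling
      }
      AlertAdmin = {
        Type     = "Task"
        Resource = "arn:aws:states:::sns:publish"
        Parameters = {
          TopicArn = aws_sns_topic.alerts.arn # assumed alerting topic
          Message  = "Bedrock Knowledge Base ingestion job failed"
        }
        End = true
      }
      Done = { Type = "Succeed" }
    }
  })
}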

Conclusion: You’ve Built a Real AI Solution

This isn’t just a chatbot. It’s a scalable, secure, and automated solution to a real business problem. By building this project, you’ve demonstrated skills that are in high demand:

  • Infrastructure as Code: You know how to define and deploy cloud resources repeatably with Terraform.
  • Serverless Architecture: You can build event-driven systems using Lambda, API Gateway, and S3.
  • Applied AI: You understand how to implement a RAG pipeline using AWS Bedrock, a cutting-edge AI service.
  • Security Best Practices: You’ve implemented the principle of least privilege with IAM roles and encrypted data at rest.

Putting this project on your resume and talking about the pitfalls and design choices is what separates you from candidates who have only done tutorials. You’ve built, broken, and fixed a real-world system.

Ready to Deploy?

This was a complex, multi-service build. You’ve seen the architecture, learned the pitfalls, and understand what production AI infrastructure looks like.

Stop reading tutorials and watching videos. Start shipping real solutions. Sign up at DeployU and build the portfolio that gets you hired.
