Migrate from static VMs to Kubernetes-based dynamic agents for cloud-native CI/CD.

The Scenario

Your organization runs Jenkins on 20 static EC2 instances costing $15,000/month. You observe:

Current Infrastructure:
- 20 x m5.xlarge instances ($0.192/hr each)
- Average utilization: 35%
- Peak utilization: 100% (queue builds up)
- Off-hours utilization: 5%
- Monthly cost: ~$15,000

Leadership wants to reduce costs while improving scalability. The platform team has a production Kubernetes cluster available.

The Challenge

Migrate Jenkins to Kubernetes with dynamic pod-based agents that scale to zero during off-hours and handle peak loads without queuing.

Wrong Approach

A junior engineer might simply deploy the Jenkins master on Kubernetes and keep the VM agents, run agents as StatefulSets with fixed replicas, or skip testing and do a big-bang migration. None of these leverages Kubernetes' elasticity, so the cost inefficiency remains, and a big-bang cutover risks a production outage.

Right Approach

A senior engineer deploys the Jenkins master on Kubernetes with proper persistence, implements dynamic pod agents using the Kubernetes plugin, creates optimized pod templates for different workloads, sets up proper networking and security, and migrates incrementally, running the old VM agents and the new pod agents in parallel until every job has been cut over (a routing sketch follows below).
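
One way to run both fleets in parallel, assuming the existing VM agents stay attached under a vm-agent label while the Kubernetes templates expose the labels defined in Step 3, is to cut jobs over one at a time by changing nothing but the agent label:

// Hypothetical cut-over sketch: 'vm-agent' is an assumed label on the static EC2 agents;
// 'nodejs' matches the pod template label configured in Step 3.
pipeline {
    agent { label 'vm-agent' }      // before: build runs on a static VM agent
    // agent { label 'nodejs' }     // after: same stages, scheduled onto a dynamic pod
    stages {
        stage('Build') {
            steps {
                sh 'npm ci && npm run build'
            }
        }
    }
}

Once a job is verified on Kubernetes, the old label is removed; when no jobs reference vm-agent, the EC2 instances can be drained and terminated.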

Step 1: Deploy Jenkins Master on Kubernetes

# jenkins-master-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
  namespace: jenkins
spec:
  replicas: 1
  strategy:
    type: Recreate  # jenkins-home is a ReadWriteOnce PVC; avoid two masters mounting it during a rollout
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      serviceAccountName: jenkins
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts-jdk17
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 50000
              name: jnlp
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
            limits:
              memory: "4Gi"
              cpu: "2"
          env:
            - name: JAVA_OPTS
              value: >-
                -Xmx2g
                -XX:+UseG1GC
                -Djenkins.install.runSetupWizard=false
            - name: CASC_JENKINS_CONFIG
              value: /var/jenkins_home/casc_configs
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
            - name: casc-config
              mountPath: /var/jenkins_home/casc_configs
          livenessProbe:
            httpGet:
              path: /login
              port: 8080
            initialDelaySeconds: 120
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /login
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
      volumes:
        - name: jenkins-home
          persistentVolumeClaim:
            claimName: jenkins-home-pvc
        - name: casc-config
          configMap:
            name: jenkins-casc-config
---
apiVersion: v1
kind: Service
metadata:
  name: jenkins
  namespace: jenkins
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: 8080
      name: http
    - port: 50000
      targetPort: 50000
      name: jnlp
  selector:
    app: jenkins
---
apiVersion: v1
kind: Service
metadata:
  name: jenkins-agent
  namespace: jenkins
spec:
  type: ClusterIP
  ports:
    - port: 50000
      targetPort: 50000
  selector:
    app: jenkins
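
The Deployment references a jenkins-home-pvc claim that must exist before the pod can start. A minimal sketch, assuming an EBS-backed StorageClass named gp3 (substitute whatever the platform team provides):

# jenkins-home-pvc.yaml - persistent storage for the Jenkins master (referenced above)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-home-pvc
  namespace: jenkins
spec:
  accessModes:
    - ReadWriteOnce        # a single master pod mounts the home directory
  storageClassName: gp3    # assumption: adjust to the cluster's StorageClass
  resources:
    requests:
      storage: 100Gi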

Step 2: Configure RBAC for Dynamic Agents

# jenkins-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins
  namespace: jenkins
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins-agent-role
  namespace: jenkins
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "watch", "patch"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create", "get"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["create", "delete", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jenkins-agent-binding
  namespace: jenkins
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: jenkins-agent-role
subjects:
  - kind: ServiceAccount
    name: jenkins
    namespace: jenkins
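
Before pointing Jenkins at the cluster, the binding can be sanity-checked by impersonating the service account with kubectl auth can-i (run with a kubeconfig that has impersonation rights):

kubectl auth can-i create pods --as=system:serviceaccount:jenkins:jenkins -n jenkins
# yes

kubectl auth can-i delete nodes --as=system:serviceaccount:jenkins:jenkins -n jenkins
# no - the Role is namespace-scoped and limited to agent resources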

Step 3: Configure Kubernetes Cloud in JCasC

# jenkins-casc-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: jenkins-casc-config
  namespace: jenkins
data:
  jenkins.yaml: |
    jenkins:
      numExecutors: 0
      mode: EXCLUSIVE
      clouds:
        - kubernetes:
            name: "kubernetes"
            serverUrl: "https://kubernetes.default"
            namespace: "jenkins"
            jenkinsUrl: "http://jenkins.jenkins.svc.cluster.local:8080"
            jenkinsTunnel: "jenkins-agent.jenkins.svc.cluster.local:50000"
            containerCapStr: "50"
            maxRequestsPerHostStr: "32"
            retentionTimeout: 5
            waitForPodSec: 600

            templates:
              # Default lightweight agent
              - name: "default"
                label: "kubernetes default"
                nodeUsageMode: "NORMAL"
                idleMinutes: 10
                containers:
                  - name: "jnlp"
                    image: "jenkins/inbound-agent:latest"
                    workingDir: "/home/jenkins/agent"
                    resourceRequestCpu: "200m"
                    resourceRequestMemory: "256Mi"
                    resourceLimitCpu: "500m"
                    resourceLimitMemory: "512Mi"
                yamlMergeStrategy: "override"

              # Node.js build agent
              - name: "nodejs"
                label: "nodejs npm"
                containers:
                  - name: "jnlp"
                    image: "jenkins/inbound-agent:latest"
                    resourceRequestMemory: "256Mi"
                    resourceLimitMemory: "512Mi"
                  - name: "node"
                    image: "node:18-alpine"
                    command: "sleep"
                    args: "infinity"
                    resourceRequestMemory: "1Gi"
                    resourceLimitMemory: "2Gi"
                volumes:
                  - hostPathVolume:
                      hostPath: "/var/run/docker.sock"
                      mountPath: "/var/run/docker.sock"

              # Docker build agent
              - name: "docker"
                label: "docker"
                containers:
                  - name: "jnlp"
                    image: "jenkins/inbound-agent:latest"
                  - name: "docker"
                    image: "docker:24-dind"
                    privileged: true
                    resourceRequestMemory: "1Gi"
                    resourceLimitMemory: "4Gi"
                volumes:
                  - emptyDirVolume:
                      mountPath: "/var/lib/docker"
                      memory: false

              # Heavy workload agent
              - name: "heavy"
                label: "heavy build"
                nodeSelector: "node-type=compute-optimized"
                containers:
                  - name: "jnlp"
                    image: "jenkins/inbound-agent:latest"
                    resourceRequestCpu: "2"
                    resourceRequestMemory: "4Gi"
                    resourceLimitCpu: "4"
                    resourceLimitMemory: "8Gi"

Step 4: Create Optimized Pipeline with Kubernetes Agents

// Jenkinsfile using Kubernetes agents
pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: jenkins-agent
spec:
  containers:
  - name: jnlp
    image: jenkins/inbound-agent:latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "100m"
  - name: node
    image: node:18-alpine
    command: ["sleep", "infinity"]
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1"
    volumeMounts:
    - name: npm-cache
      mountPath: /root/.npm
  - name: docker
    image: docker:24-cli
    command: ["sleep", "infinity"]
    env:
    - name: DOCKER_HOST
      value: tcp://localhost:2375
  - name: dind
    image: docker:24-dind
    securityContext:
      privileged: true
    env:
    - name: DOCKER_TLS_CERTDIR
      value: ""
  volumes:
  - name: npm-cache
    persistentVolumeClaim:
      claimName: npm-cache-pvc
'''
        }
    }

    stages {
        stage('Build') {
            steps {
                container('node') {
                    sh '''
                        npm ci --cache /root/.npm
                        npm run build
                    '''
                }
            }
        }

        stage('Test') {
            steps {
                container('node') {
                    sh 'npm test'
                }
            }
        }

        stage('Docker Build') {
            steps {
                container('docker') {
                    sh '''
                        docker build -t registry.company.com/myapp:${BUILD_NUMBER} .
                        docker push registry.company.com/myapp:${BUILD_NUMBER}
                    '''
                }
            }
        }
    }
}
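
The push to registry.company.com will fail unless the Docker daemon is authenticated against the registry. One common pattern, assuming a Jenkins username/password credential with the hypothetical ID registry-creds, is to log in inside the docker container before pushing:

stage('Docker Build') {
    steps {
        container('docker') {
            // 'registry-creds' is an assumed credential ID; create it in Jenkins first
            withCredentials([usernamePassword(credentialsId: 'registry-creds',
                                              usernameVariable: 'REG_USER',
                                              passwordVariable: 'REG_PASS')]) {
                sh '''
                    echo "$REG_PASS" | docker login registry.company.com -u "$REG_USER" --password-stdin
                    docker build -t registry.company.com/myapp:${BUILD_NUMBER} .
                    docker push registry.company.com/myapp:${BUILD_NUMBER}
                '''
            }
        }
    }
}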

Step 5: Implement Cache for Faster Builds

# npm-cache-pvc.yaml - Shared cache for faster builds
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: npm-cache-pvc
  namespace: jenkins
spec:
  accessModes:
    - ReadWriteMany  # Multiple pods can share
  storageClassName: efs-sc  # Use EFS for shared storage
  resources:
    requests:
      storage: 50Gi
---
# maven-cache-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: maven-cache-pvc
  namespace: jenkins
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 100Gi
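
For the Maven cache to pay off, builds have to point Maven's local repository at the mounted volume. A sketch of a pipeline pod that does this (the Maven image tag is illustrative):

# Pod YAML for a Maven build agent reusing the shared cache
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: jnlp
    image: jenkins/inbound-agent:latest
  - name: maven
    image: maven:3.9-eclipse-temurin-17   # assumption: any recent Maven image works
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: maven-cache
      mountPath: /root/.m2/repository     # Maven's default local repository path
  volumes:
  - name: maven-cache
    persistentVolumeClaim:
      claimName: maven-cache-pvc

With this mount, container('maven') { sh 'mvn -B package' } reuses previously downloaded dependencies instead of fetching them on every build.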

Step 6: Implement Pod Template Selection Logic

// vars/selectAgent.groovy - Shared library for smart agent selection
def call(Map config = [:]) {
    def workloadType = config.type ?: 'default'
    def resources = config.resources ?: [:]

    // Closures so only the requested template is built; each builder returns pod YAML.
    def podTemplates = [
        'default': { defaultPod() },
        'nodejs': { nodejsPod(resources) },
        'java': { javaPod(resources) },      // javaPod/dockerPod/heavyPod follow the
        'docker': { dockerPod(resources) },  // same pattern as nodejsPod (not shown)
        'heavy': { heavyPod(resources) }
    ]

    // Return the YAML string; the caller wraps it in a kubernetes {} agent block.
    return (podTemplates[workloadType] ?: podTemplates['default'])()
}

def defaultPod() {
    return '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: jnlp
    image: jenkins/inbound-agent:latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "100m"
'''
}

def nodejsPod(Map resources) {
    def memory = resources.memory ?: '1Gi'
    def cpu = resources.cpu ?: '500m'

    return """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: jnlp
    image: jenkins/inbound-agent:latest
  - name: node
    image: node:18
    command: ["sleep", "infinity"]
    resources:
      requests:
        memory: "${memory}"
        cpu: "${cpu}"
      limits:
        memory: "${memory}"
        cpu: "${cpu}"
    volumeMounts:
    - name: npm-cache
      mountPath: /root/.npm
  volumes:
  - name: npm-cache
    persistentVolumeClaim:
      claimName: npm-cache-pvc
"""
}

// Usage in Jenkinsfile
pipeline {
    agent {
        kubernetes {
            yaml selectAgent(
                type: 'nodejs',
                resources: [memory: '2Gi', cpu: '1']
            )
        }
    }
    stages {
        // ...
    }
}

Migration Cost Comparison

Metric            Before (VMs)   After (K8s)   Improvement
Monthly Cost      $15,000        $3,500        77%
Peak Capacity     20 agents      50+ pods      150%+
Off-hours Cost    $5,000         $500          90%
Queue Wait Time   15 min         2 min         87%
Setup Time        Hours          Seconds       99%

Practice Question

Why should you use a separate container for build tools (like Node.js or Maven) instead of installing them in the jnlp container?