Canary Deployment

What is Canary Deployment?

Canary Deployment gradually rolls out changes to a small subset of users before deploying to everyone, reducing risk.

Architecture

┌─────────────┐
│Load Balancer│
└──────┬──────┘

   ┌───┴────────────┐
   │                │
┌──▼──────┐    ┌───▼────┐
│ Stable  │    │ Canary │
│ (90%)   │    │ (10%)  │
│  v1.0   │    │  v2.0  │
└─────────┘    └────────┘

Gradually increase canary traffic
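With plain Kubernetes, the split is controlled by replica counts: the canary's share of traffic is roughly canary / (canary + stable). A minimal sketch of that arithmetic (a hypothetical helper, not part of any tool):

```javascript
// Compute replica counts for a desired canary percentage.
// Assumes the Service selects both tracks, so traffic splits
// roughly in proportion to replica counts.
function replicaSplit(totalReplicas, canaryPercent) {
  const canary = Math.round(totalReplicas * canaryPercent / 100);
  return { canary, stable: totalReplicas - canary };
}

console.log(replicaSplit(10, 10)); // { canary: 1, stable: 9 }
console.log(replicaSplit(8, 25));  // { canary: 2, stable: 6 }
```

Note the granularity limit: with 10 total replicas you cannot express a 5% canary, which is one reason service-mesh traffic splitting (below in the document) is often preferred.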

Kubernetes Implementation

# Stable version (90% traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-stable
  labels:
    app: myapp
    track: stable
spec:
  replicas: 9
  selector:
    matchLabels:
      app: myapp
      track: stable
  template:
    metadata:
      labels:
        app: myapp
        track: stable
    spec:
      containers:
        - name: myapp
          image: myapp:v1.0.0
          ports:
            - containerPort: 3000

---
# Canary version (10% traffic)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-canary
  labels:
    app: myapp
    track: canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
      track: canary
  template:
    metadata:
      labels:
        app: myapp
        track: canary
    spec:
      containers:
        - name: myapp
          image: myapp:v2.0.0
          ports:
            - containerPort: 3000

---
# Service (load balances across both)
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 3000
  type: LoadBalancer

Gradual Rollout Script

#!/bin/bash

# Phase 1: 10% canary
kubectl scale deployment myapp-stable --replicas=9
kubectl scale deployment myapp-canary --replicas=1
echo "Canary at 10%"
sleep 300  # Monitor for 5 minutes

# Check metrics (the Prometheus API returns JSON, so extract the scalar with jq)
ERROR_RATE=$(curl -s 'http://prometheus/api/v1/query?query=error_rate' \
  | jq -r '.data.result[0].value[1]')
# [ -gt ] compares integers only; use awk for the float comparison
if awk -v r="$ERROR_RATE" 'BEGIN { exit !(r > 0.01) }'; then
  echo "High error rate detected. Rolling back."
  kubectl scale deployment myapp-canary --replicas=0
  exit 1
fi

# Phase 2: 25% canary
kubectl scale deployment myapp-stable --replicas=6
kubectl scale deployment myapp-canary --replicas=2
echo "Canary at 25%"
sleep 300

# Phase 3: 50% canary
kubectl scale deployment myapp-stable --replicas=5
kubectl scale deployment myapp-canary --replicas=5
echo "Canary at 50%"
sleep 300

# Phase 4: 100% canary
kubectl scale deployment myapp-stable --replicas=0
kubectl scale deployment myapp-canary --replicas=10
echo "Canary at 100%"

# Promote canary to stable
kubectl delete deployment myapp-stable
# Note: this relabels the Deployment object only; pod template labels keep track=canary
kubectl label deployment myapp-canary track=stable --overwrite
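The curl call in the script returns a JSON document, not a bare number, so the scalar has to be extracted before any comparison. A sketch of that extraction (assumes the standard Prometheus instant-query response shape):

```javascript
// Extract the scalar value from a Prometheus instant-query response.
// Instant queries return { data: { result: [ { value: [ts, "0.003"] } ] } };
// the value itself is a string, and an empty result array means "no data".
function extractScalar(response) {
  const result = response?.data?.result;
  if (!result || result.length === 0) return null;
  return parseFloat(result[0].value[1]);
}

const sample = {
  status: 'success',
  data: { resultType: 'vector', result: [{ metric: {}, value: [1700000000, '0.003'] }] }
};
console.log(extractScalar(sample));               // 0.003
console.log(extractScalar({ data: { result: [] } })); // null
```

Treat the null case as a rollback trigger too: a canary that reports no data at all is not a canary you can trust.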

Istio Service Mesh

# VirtualService for traffic splitting
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
    - myapp
  http:
    - match:
        - headers:
            user-type:
              exact: beta
      route:
        - destination:
            host: myapp
            subset: canary
    - route:
        - destination:
            host: myapp
            subset: stable
          weight: 90
        - destination:
            host: myapp
            subset: canary
          weight: 10

---
# DestinationRule
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: myapp
spec:
  host: myapp
  subsets:
    - name: stable
      labels:
        track: stable
    - name: canary
      labels:
        track: canary

GitHub Actions Canary

name: Canary Deployment

on:
  push:
    branches: [main]

jobs:
  deploy-canary:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Deploy canary (10%)
        run: |
          kubectl set image deployment/myapp-canary myapp=myapp:${{ github.sha }}
          kubectl scale deployment myapp-canary --replicas=1
      
      - name: Wait and monitor
        run: |
          sleep 300
          ./check-metrics.sh
      
      - name: Increase to 25%
        if: success()
        run: |
          kubectl scale deployment myapp-stable --replicas=6
          kubectl scale deployment myapp-canary --replicas=2
          sleep 300
          ./check-metrics.sh
      
      - name: Increase to 50%
        if: success()
        run: |
          kubectl scale deployment myapp-stable --replicas=5
          kubectl scale deployment myapp-canary --replicas=5
          sleep 300
          ./check-metrics.sh
      
      - name: Full rollout
        if: success()
        run: |
          kubectl set image deployment/myapp-stable myapp=myapp:${{ github.sha }}
          kubectl scale deployment myapp-canary --replicas=0
      
      - name: Rollback
        if: failure()
        run: |
          kubectl scale deployment myapp-canary --replicas=0

Monitoring

// Monitor canary metrics (assumes a Prometheus client whose query()
// resolves to a numeric scalar; real clients return a result array
// that must be unwrapped first)
async function monitorCanary() {
  // Error ratio per track: 5xx rate divided by total request rate,
  // so unequal traffic volumes between tracks cancel out
  const canaryTotal = await prometheus.query({
    query: 'sum(rate(http_requests_total{track="canary"}[5m]))'
  });
  const canaryErrors = await prometheus.query({
    query: 'sum(rate(http_requests_total{track="canary",status=~"5.."}[5m]))'
  });
  const errorRate = canaryErrors / canaryTotal;

  // p95 latency in seconds; histogram_quantile needs the per-bucket rate
  const latency = await prometheus.query({
    query: 'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{track="canary"}[5m])) by (le))'
  });

  // Compare with stable
  const stableTotal = await prometheus.query({
    query: 'sum(rate(http_requests_total{track="stable"}[5m]))'
  });
  const stableErrors = await prometheus.query({
    query: 'sum(rate(http_requests_total{track="stable",status=~"5.."}[5m]))'
  });
  const stableErrorRate = stableErrors / stableTotal;

  if (errorRate > stableErrorRate * 1.5) {
    console.log('Canary error rate too high. Rolling back.');
    return false;
  }

  if (latency > 1) {  // the metric is in seconds, so 1 = 1000ms
    console.log('Canary latency too high. Rolling back.');
    return false;
  }

  return true;
}

Feature Flags

// Combine with feature flags
class CanaryFeatureFlag {
  constructor(canaryPercentage, betaUsers) {
    this.canaryPercentage = canaryPercentage;  // 0-100
    this.betaUsers = new Set(betaUsers);
  }

  isEnabled(userId) {
    // Route beta users to canary
    if (this.isBetaUser(userId)) {
      return true;
    }

    // Route a deterministic percentage of users to canary
    const hash = this.hash(userId);
    return (hash % 100) < this.canaryPercentage;
  }

  isBetaUser(userId) {
    return this.betaUsers.has(userId);
  }

  hash(str) {
    let hash = 0;
    for (let i = 0; i < str.length; i++) {
      hash = ((hash << 5) - hash + str.charCodeAt(i)) | 0;  // keep 32-bit
    }
    return Math.abs(hash);
  }
}

Automated Rollback

# Flagger for automated canary
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: myapp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m

Benefits

  1. Reduced risk: Gradual rollout
  2. Early detection: Find issues quickly
  3. Easy rollback: Affect fewer users
  4. A/B testing: Compare versions

Challenges

  1. Complexity: More infrastructure
  2. Monitoring: Need good metrics
  3. Duration: Slower than blue-green
  4. Stateful apps: Session handling
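For the stateful-apps challenge, a common mitigation is to pin each session to a track by hashing its id, so a user never bounces between versions mid-session. A self-contained sketch (hypothetical helper, using the same hashing idea as the feature-flag example):

```javascript
// Deterministically assign a session to a track so repeated requests
// from the same session always land on the same version.
function trackForSession(sessionId, canaryPercent) {
  let h = 0;
  for (let i = 0; i < sessionId.length; i++) {
    h = ((h << 5) - h + sessionId.charCodeAt(i)) | 0;  // 32-bit string hash
  }
  return (Math.abs(h) % 100) < canaryPercent ? 'canary' : 'stable';
}

// Same session id always maps to the same track
console.log(trackForSession('sess-42', 10) === trackForSession('sess-42', 10)); // true
```

Meshes offer the same idea natively (e.g. hash-based consistent routing on a header or cookie), but the principle is identical: the routing key must be stable across the session.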

Interview Tips

  • Explain pattern: Gradual rollout to subset
  • Show implementation: Kubernetes scaling
  • Demonstrate monitoring: Metrics comparison
  • Discuss Istio: Traffic splitting
  • Mention rollback: Automated based on metrics
  • Show benefits: Risk reduction

Summary

Canary Deployment gradually rolls out changes to a small percentage of users. Start with 10%, monitor metrics, then increase to 25%, 50%, and finally 100%. Use Kubernetes replica scaling or Istio traffic splitting, monitor error rates and latency, and roll back automatically when metrics degrade. This reduces risk compared to big-bang deployments.
