
Darwin Martin
Principal Software Engineer

Beyond The Basics: Canary Software Deployment Strategy Part 1
Canaries are sensitive to toxic gases and show visible distress when they detect them. In the past, coal miners used canaries as an early warning system for harmful gases such as carbon monoxide (CO) and methane (CH4). The birds alerted the miners to danger and prompted them to take corrective action.
A canary deployment serves the same purpose. It allows an organization to expose a new feature to an early sub-segment of users without a big bang deployment. The goal is to test new functionality on a subset of customers, such as early adopters, or to dogfood features internally before releasing them to the entire user base. If a showstopper bug is discovered, a rollback can be executed with minimal, if any, customer impact. Just as miners evacuated the coal mine when the canary showed distress, you can roll back the code changes when the canary deployment alerts you. If everything works as intended, you can gradually add more users while monitoring logs, errors, and software health.
If Software Fails in Production, Then QA Didn’t Do Their Job 🤔
I hear this falsehood constantly: if a bug is discovered, then QA didn’t do their job. This is not a fair assessment. Continuous integration has changed the way we develop and release software. Always keep in mind that the continuous integration environment is different from production, and synthetic tests are not always enough to reveal problems. Some issues only appear when they hit production, and by that time, the damage is already done. Canary deployments allow us to dip a toe in the water to test before jumping in, thus protecting the coffers and reputation of the business.
How It Differs from Other Software Deployment Strategies
Blue-Green Deployment
Definition: Blue-green deployment involves maintaining two identical environments (Blue and Green). One environment is live, while the other is idle and ready for the new version.
Comparison with Canary:
- Risk Management: Both canary and blue-green deployments aim to minimize risk. However, canary deployment does so gradually by exposing a small subset of users to the new version first, while blue-green switches all traffic at once after testing.
- Resource Usage: Blue-green deployment requires maintaining two full environments, which can be resource-intensive. Canary deployment typically requires fewer additional resources since it only involves a subset of instances.
- Rollback: Rolling back in blue-green deployment is straightforward since you can simply switch back to the previous environment. In canary deployment, rollback involves reverting traffic routing and possibly rolling back the new version from the subset of instances.
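To make the cut-over concrete: in Kubernetes, a blue-green switch is often nothing more than flipping a Service selector from the old environment to the new one. A minimal sketch, assuming the two environments are distinguished by a hypothetical version label and the container listens on port 8080:
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-service
    version: blue   # change to "green" to send 100% of traffic to the new environment
  ports:
  - port: 80
    targetPort: 8080
Rolling back is the same edit in reverse, which is why blue-green rollbacks are so quick.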
Rolling Deployment
Definition: Rolling deployment gradually replaces instances of the old version with the new version, updating one instance at a time or in small batches.
Comparison with Canary:
- Granularity: Both strategies involve gradual rollout, but canary deployments start with a small user base and incrementally increase traffic to the new version. Rolling deployments continuously replace old instances until all are updated.
- Monitoring: Canary deployment places a stronger emphasis on monitoring user feedback and performance metrics for a small subset before broader rollout. Rolling deployment monitors the update process continuously across all instances.
- Risk: Canary deployment allows for early detection of issues on a small scale, while rolling deployment can catch issues as they spread but may expose more users to potential problems if not detected early.
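For reference, the batch size of a rolling deployment is controlled directly on a standard Kubernetes Deployment. A minimal sketch, with illustrative replica and surge values:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # create at most one extra pod during the update
      maxUnavailable: 0  # never take an old pod down before its replacement is ready
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
      - name: my-service
        image: my-service:latest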
A/B Testing
Definition: A/B testing (split testing) runs two or more versions of the application simultaneously to compare performance and gather user feedback.
Comparison with Canary:
- Objective: A/B testing focuses on comparing user experiences and performance between different versions, typically for feature validation or UI/UX changes. Canary deployment focuses on ensuring the new version is stable and performs well before full rollout.
- User Segmentation: A/B testing targets different user segments with different versions to gather comparative data. Canary deployment targets a small subset with the new version to validate it before a wider release.
- Implementation Complexity: A/B testing can be more complex to implement due to the need for detailed analytics and user segmentation. Canary deployment is primarily about phased rollout and monitoring.
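A/B segmentation is usually implemented at the routing layer. As one hedged example, the same Nginx ingress canary annotations used later in this post can route by header instead of by weight, so that only users carrying a hypothetical X-Variant: b header reach the alternate version:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-service-variant-b
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "X-Variant"   # header to inspect
    nginx.ingress.kubernetes.io/canary-by-header-value: "b"     # only this value is routed to variant B
spec:
  ingressClassName: nginx
  rules:
  - host: my-service.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service-variant-b
            port:
              number: 80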
Feature Toggles (Feature Flags)
Definition: Feature toggles allow specific features to be turned on or off in the production environment without deploying new code.
Comparison with Canary:
- Scope: Feature toggles manage the exposure of individual features within a single codebase, allowing for granular control. Canary deployment manages the rollout of entire versions of an application.
- Flexibility: Feature toggles offer more flexibility in enabling/disabling features quickly. Canary deployment provides a phased rollout of new versions, but toggling features within those versions requires additional mechanisms.
- Use Case: Feature toggles are ideal for testing specific features and can be used in conjunction with canary deployments to control feature rollout within a canary release.
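A minimal Kubernetes-flavored sketch of a feature toggle, assuming a hypothetical flag name and an application that re-reads its configuration (for example from a mounted ConfigMap file or a feature-flag service) rather than only at startup:
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-service-flags
data:
  enable-new-checkout: "false"   # flip to "true" to expose the feature without shipping new code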
Big Bang Deployment
Definition: Big bang deployment involves deploying the new version to all users at once, replacing the old version entirely.
Comparison with Canary:
- Risk: Big bang deployment carries the highest risk since all users are exposed to the new version simultaneously. Canary deployment mitigates this risk by limiting initial exposure and gradually increasing it.
- Rollback: Rolling back in a big bang deployment can be complex and disruptive, as it involves reverting the entire deployment. Canary deployment allows for easier rollback since only a subset of instances and users are initially affected.
- Downtime: Big bang deployment often requires scheduled downtime, which can be disruptive. Canary deployment aims to achieve minimal or no downtime by gradually introducing the new version.
Three Ways to Add Canary Deployments
Canary deployments can be implemented in various ways, each offering different tools and methodologies to achieve a gradual rollout. Here are three common approaches to executing a canary deployment in a Kubernetes environment:
Using Nginx Canary Annotations
Nginx, a popular web server and reverse proxy, can be configured to support canary deployments through annotations. This method involves using Nginx's ability to route a small percentage of traffic to a canary version of your service, while the majority of traffic continues to be served by the stable version.
How It Works:
- Step 1: Deploy the stable version of your service and configure Nginx as the ingress controller.
- Step 2: Deploy the canary version of your service alongside the stable version.
- Step 3: Create a second Ingress for the canary Service and apply the Nginx canary annotations to it to define the traffic split between the stable and canary versions.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-service-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10" # Routes 10% of the traffic to the canary
spec:
  ingressClassName: nginx
  rules:
  - host: my-service.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service-canary
            port:
              number: 80
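The annotated Ingress above only describes the canary. The remaining 90% of traffic is still served through a separate, primary Ingress for the same host that carries no canary annotations; a minimal sketch:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-service
spec:
  ingressClassName: nginx
  rules:
  - host: my-service.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service   # the stable Service; the canary Ingress above points at my-service-canary
            port:
              number: 80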
Advantages:
- Simple to set up with existing Nginx ingress configurations.
- Fine-grained control over traffic distribution using annotations.
Disadvantages:
- Requires familiarity with Nginx and Kubernetes ingress configurations.
- Limited to environments using Nginx as the ingress controller.
Use Case:
Ideal for teams already using Nginx as their ingress controller who need a straightforward way to implement canary deployments.
Using Argo Rollouts
Argo Rollouts is a Kubernetes controller and set of CRDs (Custom Resource Definitions) that provide advanced deployment capabilities, including blue-green and canary deployments. Argo Rollouts allows you to define detailed rollout strategies, including progressive delivery and automated analysis.
How It Works:
- Step 1: Install Argo Rollouts in your Kubernetes cluster.
- Step 2: Define a Rollout resource that specifies the desired canary strategy.
- Step 3: Deploy the Rollout resource and monitor the progress.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-service
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-service
  strategy:
    canary:
      steps:
      - setWeight: 10
      - pause: {duration: 2m}
      - setWeight: 30
      - pause: {duration: 2m}
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
      - name: my-service
        image: my-service:latest
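The steps above only pause for a fixed duration. Argo Rollouts can also gate each promotion on real metrics by referencing an AnalysisTemplate from the canary steps. A hedged sketch, assuming a Prometheus instance at an illustrative in-cluster address and an http_requests_total metric labeled by service:
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name
  metrics:
  - name: success-rate
    interval: 1m
    failureLimit: 3
    successCondition: result[0] >= 0.99   # abort the rollout if success rate drops below 99%
    provider:
      prometheus:
        address: http://prometheus.monitoring.svc.cluster.local:9090   # illustrative address
        query: |
          sum(rate(http_requests_total{service="{{args.service-name}}", status!~"5.."}[2m]))
          /
          sum(rate(http_requests_total{service="{{args.service-name}}"}[2m]))
The template is then referenced from the Rollout's canary steps with an analysis step that names the template and supplies the service-name argument; Argo Rollouts aborts the rollout automatically when the condition fails.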
Advantages:
- Provides advanced deployment strategies and detailed control over rollouts.
- Integrates well with other tools in the Kubernetes ecosystem.
Disadvantages:
- More complex setup and configuration compared to simpler tools.
- Requires understanding of Argo Rollouts' CRDs and deployment strategies.
Use Case:
Suitable for teams looking for a robust and feature-rich deployment tool that supports advanced canary deployment strategies.
Using Flagger & Istio
Flagger is a Kubernetes operator that automates the promotion of canary deployments using a service mesh. When combined with Istio, Flagger can route traffic, perform analysis, and roll back changes automatically based on metrics and thresholds.
How It Works:
- Step 1: Install Flagger and Istio in your Kubernetes cluster.
- Step 2: Define a Canary resource that specifies the traffic routing and analysis metrics.
- Step 3: Deploy the Canary resource and let Flagger manage the rollout process.
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-service
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 10
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      threshold: 99
      interval: 1m
    - name: request-duration
      threshold: 500
      interval: 1m
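Flagger can also generate traffic against the canary during each analysis interval through webhooks. A hedged sketch, assuming Flagger's optional load-tester add-on is installed in a test namespace and the workload runs in the default namespace, would extend the analysis block like this:
  analysis:
    # interval, thresholds, and metrics as above
    webhooks:
    - name: load-test
      url: http://flagger-loadtester.test/
      timeout: 5s
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://my-service-canary.default/"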
Advantages:
- Automated traffic management and rollback based on real-time metrics.
- Integration with Istio for advanced traffic routing and observability.
Disadvantages:
- Requires installation and configuration of both Flagger and Istio.
- Steeper learning curve due to the complexity of service mesh and automated analysis.
Use Case:
Best for teams that need automated and metric-driven canary deployments, particularly those already using or planning to use a service mesh like Istio.
Conclusion
There is still so much to cover for these software deployment strategies! Even with a narrow focus on canary deployments, I want to do this topic justice and continue my philosophy of “Beyond The Basics,” where I provide more than just high-level coverage. Each method of executing a canary deployment offers unique benefits and trade-offs. Whether you choose Nginx annotations, Argo Rollouts, or Flagger with Istio, understanding the differences can help you select the best approach for your needs. In future blog posts, I'll dive deeper into each of these methods, providing step-by-step guides and best practices. Stay tuned!