Scalable Continuous Delivery Pipelines

Back when I first started building web apps, we’d just “do it in production” by vi’ing Perl & PHP files on the server. That was fine because the risks and expectations were low. No big deal if I broke the app for a few hours. Good thing I made an app.php-bak copy!

As software became more critical to businesses, the risks of changing production systems increased. To cope with those risks we slowed delivery down with process. Today many enterprises are so bogged down by risk aversion that they deploy to production only once a year or less. Meanwhile the rate of change in business and software keeps increasing, and expectations are higher than ever: downtime is not an option, but that change also needs to go out now!

We need a way to reduce risk and increase the rate of delivery at the same time. The two seem to be at odds, but Continuous Delivery provides a way to deliver more often AND reduce risk.

As the name implies, the idea of Continuous Delivery is to continuously deliver changes. There are many ways to do that, but the process should scale with the amount of risk you can accept. For some apps a little potential downtime is worth the tradeoff of essentially being able to “do it in production”.

Continuous Delivery can be thought of as a pipeline: there is an input / source that is moved and transformed in a variety of ways until it reaches its output, the production system. By adding or removing the steps in between we can adjust the risk profile.

In order to have a Continuous Delivery Pipeline, a few things are necessary no matter the size of the process:

  • A Source Control System (SCM) that enables developers to collaborate but also enables a direct correlation between a point-in-time in the source and a deployment of that source.
  • Unidirectional flow from source to deployment. No more “do it in production”, because that breaks the source-to-deployment correlation.
  • A repeatable and pure method of transforming source into something that can run. Repeatability is usually accomplished with a build tool. However, builds often specify dependencies with version ranges, sacrificing purity. Don’t do that. (See the sketch after this list.)
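As a concrete sketch of that source-to-deployment correlation: with Git and pinned dependency versions, any deployment can be reproduced from the exact commit that produced it (the SHA and the Gradle build here are hypothetical examples):

    # check out the exact commit that was deployed
    git checkout 1a2b3c4
    # run the build; with pinned dependency versions this should
    # produce the same runnable artifact every time
    ./gradlew build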

With that infrastructure the simplest form of Continuous Delivery can be achieved. Systems like Heroku provide automated tooling for this kind of pipeline. On Heroku you can kick off deployment by pushing a git repo directly to Heroku, which then runs the build, stores the generated artifacts, and deploys them. This is the infamous git push heroku master. A newer method (which I prefer) is to push the changes to GitHub and have Heroku auto-deploy them. Here is a demo of that:
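And the classic direct-push flow is just a couple of commands. A minimal sketch, assuming a hypothetical existing Heroku app named my-app:

    # point a git remote at the Heroku app
    heroku git:remote -a my-app
    # push the source; Heroku runs the build and deploys the result
    git push heroku master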

For apps that can tolerate more risk this simple pipeline is great! But many apps need a process that better supports collaboration and risk mitigation. The next layer that may be added to the pipeline is a Continuous Integration system that runs the build and tests. If there is a failure, the changes should not be deployed. Here is a demo of doing CI-verified auto-deployment on Heroku:

To reduce risk further we could add a staging system that auto-deploys the changes (with or without CI validation, depending on your needs). After manual validation the changes could then be manually promoted to production.
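With Heroku Pipelines that promotion is a single command (or a button in the dashboard). A sketch, assuming a hypothetical staging app named my-app-staging that is already in a pipeline:

    # promote the exact slug running on staging to production;
    # nothing is rebuilt, so what you validated is what ships
    heroku pipelines:promote -a my-app-staging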

Taking things a step further we can hook into GitHub’s Pull Request process to deploy and validate changes before they are merged into a production branch (e.g. master). On Heroku this is called “Review Apps”: Heroku automatically creates a fresh environment with the changes for every Pull Request. This enables actual app testing as part of the review cycle.
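Review Apps are driven by an app.json file in the repo that tells Heroku how to stand up each disposable environment. A minimal, hedged sketch (the postdeploy script is a made-up example):

    # commit a minimal app.json so Heroku can build Review Apps
    cat > app.json <<'EOF'
    {
      "name": "my-app",
      "scripts": {
        "postdeploy": "./bin/seed-review-data"
      }
    }
    EOF
    git add app.json && git commit -m "enable Review Apps"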

This full pipeline with Pull Request apps / Review Apps, CI validation, staging auto-deployment, and manual production promotion significantly reduces the risk of doing frequent deployments. Many organizations that use this kind of process are able to do hundreds of deployments every day! This also helps disperse risk over many deployments instead of accumulating risk for those big once-a-year deploys. Check out a demo of this entire flow on Heroku:

Sometimes a feature may not be ready for end-user testing or launch, but that shouldn’t prevent you from actually deploying it! For this you can use “Feature Flags”, a.k.a. Feature Toggles.
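On Heroku, one simple way to implement a feature flag is a config var that the app checks at runtime (the flag name here is made up):

    # deploy the feature dark, with the flag switched off
    heroku config:set FEATURE_NEW_CHECKOUT=false -a my-app
    # later, turn the feature on without redeploying anything
    heroku config:set FEATURE_NEW_CHECKOUT=true -a my-app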

Another technique that can reduce risk with Continuous Delivery is “Canary Deploys”, where only a portion of users run on the new version of the app. Once a period of time validates that the new version is “safe” for everyone, it can be rolled out to the rest of the users.

Of course Continuous Delivery isn’t a silver bullet; there are always tradeoffs. One of the challenges is database schemas. For instance, what if you were to do a schema migration as part of a Canary Deploy? With two versions of the app running simultaneously, the schema changes for one version may break the other. NoSQL / schema-less databases are one way to address the issue. Another option is to decouple code deployments from schema migrations, using testing / staging environments to validate the schema changes.
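One way to do that decoupling is the expand / contract pattern: first make a purely additive, backwards-compatible schema change, then deploy the new code, and only later remove what the old code needed. A hedged sketch against a Postgres database (table and column names are hypothetical):

    # 1. expand: an additive change both old and new code can live with
    psql $DATABASE_URL -c 'ALTER TABLE users ADD COLUMN email_verified boolean'
    # 2. deploy the new code through the normal pipeline
    # 3. contract: only after no running version needs the old column
    psql $DATABASE_URL -c 'ALTER TABLE users DROP COLUMN legacy_flags'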

Implementing Continuous Delivery with large and complex systems can be pretty tough. But this is one of those things you have to figure out how to do, because otherwise your business will likely fade as it is overtaken by startups that deliver software as the business needs it. If you need some more practical advice on how to get there, check out my Comparing Application Deployment: 2005 vs. 2015 blog post. Let me know how it goes.

Comparing Application Deployment: 2005 vs. 2015

Note: Check out the Latvian Translation.

Over the past 10 years the way we build and deliver applications has changed significantly. It seems like much of this change has happened overnight, but don’t worry, it is perfectly normal to look up and feel disoriented in the 2015 deployment landscape.

This article compares deployment in 2005 with “modern” deployment so that all the new terms and techniques will make sense. Forewarning: my background is primarily Java / JVM, so I will use that terminology but try to keep the ideas polyglot.

2005 = Multi-App Containers / App Servers / Monolithic Apps
2015 = Microservices / Docker Containers / Containerless Apps

Back in 2005 many of us worked on projects that resulted in a WAR file – a zip file containing a Java web application and its library dependencies. That web application would be deployed alongside other web applications into a single app server sometimes called a “container” because it contained and ran one or more applications. The app server provided a bunch of common services to the web apps like an HTTP server, a service directory, and shared libraries. Unfortunately deploying multiple apps in a single container created high friction for scaling, deployment, and resource usage. App servers were supposed to isolate an app from its underlying system dependencies in order to avoid “it works on my machine” problems but things often didn’t work that smoothly due to differing system dependencies and configuration that lived outside of the app server / container.

In 2015 apps are being deployed as self-contained units, meaning the app includes everything it needs to run on top of a standard set of system dependencies. The granularity of the self-contained unit differs depending on the deployment paradigm. In the Java / JVM world a “containerless” app is a zip file that includes everything the app needs on top of the JVM. Most modern JVM frameworks have switched to this containerless approach including Play Framework, Dropwizard, and Spring Boot. A few years ago I wrote in more detail about how app servers are fading away in the move from monolithic middleware to microservices and cloud services.
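Running a containerless app is nothing more than launching the artifact on a standard JVM. A hypothetical Spring Boot example:

    # build the self-contained jar (app, libraries, and embedded HTTP server)
    ./gradlew build
    # run it on nothing but a standard JVM
    java -jar build/libs/my-app.jar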

For a more complete and portable self-contained unit, system-level container technologies like Docker and LXC bundle the app with its system dependencies. Instead of deploying a bunch of apps into a single container, a single app is added to a Docker image and deployed on one or more servers. On Heroku a “Slug” file is similar to a Docker image.
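A minimal sketch of the one-app-per-image approach for a JVM app (this Dockerfile is an illustrative example, not a production recipe):

    # create a Dockerfile that bundles the app with its system dependencies
    cat > Dockerfile <<'EOF'
    FROM java:8
    COPY build/libs/my-app.jar /app/my-app.jar
    CMD ["java", "-jar", "/app/my-app.jar"]
    EOF
    # build the image and run the single app it contains
    docker build -t my-app .
    docker run -p 8080:8080 my-app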

Microservices play a role in this new landscape because deployment across microservices is independent, whereas with traditional app servers individual app deployment often involved restarting the whole server. This was one reason for the snail’s pace of deployment in enterprises – deployments were incredibly risky and had to be coordinated months in advance across numerous teams. Hot deployment was a promise that was never realized for production apps. Microservices enable individual teams to deploy at will and as often as they want. Microservices require the ability to quickly provision, deploy, and scale services which may have only a single responsibility. These requirements fit well with the infrastructure provided by containerless apps running on Docker(ish) Containers.

2005 = Manual Deployment
2015 = Continuous Delivery / Continuous Deployment

The app servers of 2005 that ran multiple monolithic apps, combined with manual load balancer configurations, made application upgrades risky and painful, so deployments were usually done sparingly, in designated maintenance windows. Back then it was pretty much unheard of to have a deployment pipeline that fully automated delivery from SCM to production.

Today Continuous Delivery and Continuous Deployment enable developers to get code to staging and production sometimes as often as tens or even hundreds of times a day. Scalable deployment pipelines range from the simple “git push heroku master” to a more risk-averse pipeline that includes pull requests, Continuous Integration, staging auto-deployment, manual promotion to production, and possibly Canary Releases & Feature Flags. These pipelines enable organizations to move fast and distribute risk across many small releases.

In order for Continuous Delivery to work well there are a few ancillary requirements:

  • Release rollbacks must be instant and easy, because sometimes things are going to break and getting back to a working state must be quick and painless (see the sketch after this list).
  • Patch releases must be able to make it from SCM to production (through a continuous delivery pipeline) in minutes.
  • Load balancers must be able to handle automatic switching between releases.
  • Database schema changes should be decoupled from app releases otherwise releases and rollbacks can be blocked.
  • App-tier servers should be stateless with state living in external data stores otherwise state will be frequently lost and/or inconsistent.
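On Heroku, for example, a rollback just redeploys a previous release’s slug, so it takes seconds (the release number and app name are hypothetical):

    # list recent releases, then roll back to a known-good one
    heroku releases -a my-app
    heroku rollback v42 -a my-app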

2005 = Persistent Servers / “Pray it never goes down”
2015 = Immutable Infrastructure / Ephemeral Servers

When a server crashed in 2005, stuff usually broke. Some used session replication and server affinity, but sessions were still lost and bringing up new instances usually took quite a bit of manual work. Often changes were made to production systems via SSH, making it difficult to accurately reproduce a production environment. Logging was usually done to local disk, making it hard to see what was going on across servers and load balancers.

Servers in 2015 are disposable, immutable, and ephemeral, forcing us to plan for them to go down. Tools like Netflix’s Chaos Monkey randomly shut down servers to make sure we are prepared for crashes. Load balancers and management backplanes work together to start and stop new instances in an instant, enabling rapid scaling both up and down. Because servers are immutable we can no longer fix production issues by SSHing into them, but environments are now easily reproducible. Logging services route STDOUT to an external service, enabling us to see the log stream in real time, across the whole system.
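On a platform like Heroku that aggregated stream is one command away (app name hypothetical):

    # tail the merged log stream from every dyno, router, and platform event
    heroku logs --tail -a my-app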

2005 = Ops Team
2015 = DevOps

In 2005 there was a team that would take your WAR file (or other deployable artifact) and be responsible for deploying it, managing it, and monitoring it. This was nice because developers didn’t have to wear pagers, but ultimately the Ops team often couldn’t do much if there was a production issue at 3am. The biggest downside was that Ops became all about risk mitigation, causing a tremendous slowdown in software delivery.

Modern technical organizations of all sizes are ditching the Ops velocity killer and making developers responsible for the stuff they put into production. Services like New Relic, VictorOps, and Slack help developers stay on top of their new operational responsibilities. The DevOps culture also directly incentivizes devs not to deploy things that will end up waking them or a team member up at 3am. A core indicator of a DevOps culture is whether a new team member can get code to production on their first day. Doing that one thing right means doing so many other things right, like:

  • 3-Step Dev Setup: Provision the system, Checkout the code, and Run the App (sketched after this list)
  • SCM / Team Review (e.g. GitHub Flow)
  • Continuous Integration & Continuous Deployment / Delivery
  • Monitoring and Notifications
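As a hedged sketch, that first-day, 3-step setup might look like this (the URLs and scripts are hypothetical):

    # 1. provision the system (a script that installs the JVM, database, etc.)
    curl -fsSL https://example.com/my-app/setup.sh | bash
    # 2. checkout the code
    git clone https://github.com/example/my-app.git && cd my-app
    # 3. run the app
    ./bin/run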

DevOps can sound very scary to traditional enterprise developers like myself. But from experience I can attest that wearing a pager (metaphorically) and assuming the direct risk of my deployments has made me a much better developer. The quality of my code and my feelings of fulfillment have increased with my new level of ownership over what is in production.

Learn More

I’ve only scratched the surface of the deployment changes over the past 10 years, but hopefully you now have a better understanding of some of the terminology you might be hearing at conferences and on blogs. For more details on these and related topics, check out The Twelve-Factor App and my blog Java Doesn’t Suck – You’re Just Using it Wrong. Let me know what you think!

Huge thanks to Jason Hand and Joe Kutner for reviewing this blog post.

Auto-Deploy GitHub Repos to Heroku

My favorite new feature on Heroku is the GitHub Integration which enables auto-deployment of GitHub repos. Whenever a change is made on GitHub the app can be automatically redeployed on Heroku. You can even tell Heroku to wait until the CI tests pass before doing the deployment. I now use this on almost all of my Heroku apps because it allows me to move faster and do less thinking (which I’m fond of).

For apps like jamesward.com I just enable deployment straight to production. But for apps that need a less risky setup I have a full Continuous Delivery pipeline that looks like this (a rough CLI sketch follows the list):

  1. Push to GitHub
  2. CI Validates the build
  3. Heroku deploys changes to staging
  4. Manual testing / validation of staging
  5. Using Heroku Pipelines, promote staging to production
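All of that can be configured in the Heroku dashboard without touching a terminal, but if you prefer the CLI, the plumbing for steps 3 through 5 might look roughly like this (app and pipeline names are hypothetical):

    # create a pipeline with a staging app, then add the production app
    heroku pipelines:create my-pipeline -a my-app-staging -s staging
    heroku pipelines:add my-pipeline -a my-app -s production
    # after manually validating staging, promote it to production
    heroku pipelines:promote -a my-app-staging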

I’m loving the flexibility and simplicity of this new feature! Check out a quick screencast to see how to set up and use Heroku GitHub auto-deployment:

Notice that none of this required a command line! How cool is that?!?

Atlanta Presentation: Practicing Continuous Delivery

Tomorrow I’ll be presenting Practicing Continuous Delivery on the Cloud at the Atlanta No Fluff Just Stuff conference. Here is the session description:

This session will teach you best practices and patterns for doing Continuous Delivery / Continuous Deployment in Cloud environments. You will learn how to handle schema migrations, maintain dev/prod parity, and manage configuration and scaling. This session will use Heroku as an example platform, but the patterns could be implemented anywhere.

This has become my favorite session to present. So if you are going to be at Atlanta NFJS, I hope to see you there!