When a few of us joined Enova in the early 2010s there was a scheduled (or “standard”) release day.
As in, we’d only schedule releases on a particular day of the week. This was usually a Thursday, and it was an all-day affair.
We’d all pile into a conference room at 9 AM, expect to be there until 5 PM (maybe later if things didn’t go “well”), and get lunch on the house for the effort.
Because 8+ hours of deployments required a bit of sustenance.
Perforce commands were run one at a time (we didn’t use Git yet), our batch jobs had to be stopped for the entirety of the deployment, and the release to our monolith application took 60-90 minutes to cycle through the cluster.
Rinse and repeat through each of our brands (we had 5 apps at the time), add in time to do manual post-release checks, and you’d have finally completed a full day of deployments.
But if something went wrong (as it so often did at the beginning), you’d add time and add time and add time. Sometimes this included pushing out a few “emergency” releases which could feel a little like…
Eventually commands were chained together into the first iteration of our current deployment solution (we affectionately called it deploy-a-tron in those early days), “standard” release days were joined by “off-cycle” releases, and things evolved.
But we knew we could do better.
And so we did.
5-10 releases a week in 2010 turned into 30 releases a week by 2014, which turned into 80 releases a week by 2018, and as many as 150 releases a week in 2019. As we progressed, emergency releases became few and far between and every workday was a “release day”.
As release volume increased we also automated key compliance controls like segregation of duties, which is a significant requirement since we are part of the heavily regulated financial services industry. By automating our controls we ensured reliability and repeatability as we scaled.
And although we deployed 150+ times in just 5 work days earlier this year, there was still a lot of overhead in the process and we knew we could do even better.
So, we in the Deployment Engineering team recently tackled two projects to streamline the deployment process and put self-service solutions into the hands of developers.
Self-Service Deployment Pipelines
First, we built self-service deployment pipelines. This provided developers with the ability to create their own build & deployment pipelines as code in a matter of minutes.
Before, setting up a new deployment pipeline meant:
- Creating a ticket requesting pipelines for a new service or lambda, often missing details and requiring a lot of back-and-forth
- Manually copying existing staging and production Jenkins jobs
- Manually re-configuring the Jenkins jobs for the new application

The process was not reliably reproducible, made communication and collaboration on changes unnecessarily difficult, and required a Deployment Engineer’s time & attention.
Under the old process, the average total time from start to finish was anywhere from a few hours to entire weeks 🙁
Now, all we need is a simple ~6-10 line pull request which auto-generates staging and production pipeline jobs already configured, source-controlled, and ready to go.
Under the new solution, the total time from start to finish is about 25 minutes 🙂
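The exact format of our pipeline definition isn’t shown here, but to give a sense of scale, a ~6-10 line pull request of that kind might look something like this sketch (every field name and value below is hypothetical, not our actual schema):

```yaml
# Hypothetical pipeline definition -- field names are illustrative only
name: loan-calculator
type: ruby            # one of: lambda | ruby | go | vue
repo: github.com/enova/loan-calculator
environments:
  - staging
  - production
notify: "#team-loans"
```

From a small declarative file like this, staging and production Jenkins jobs can be generated with all the standard configuration baked in, and the file itself lives in source control alongside the application.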
This positively impacts time to delivery for our Software Engineering teams and reduces workload for our team, all while maintaining compliance controls.
Support is already live for Lambdas, Ruby, Go, and Vue (with Ember close behind) – and although this was a huge win, we didn’t stop there!
Utilizing our new pipeline solution, we took things even further and automated releases down to a single self-service push of a button.
When a developer is ready to push their code to prod they have to do nothing more than hit a button on their deployment ticket, confirm the automatically-generated release parameters are correct, and voila!
No more asking someone in Deployment Engineering to release for them, no waiting in a queue until someone’s available to deploy – just release when you’re ready.
Behind the scenes, checks review the same requirements we did previously – things like ensuring all pull request checks are successful, stories are accepted by stakeholders, and segregation of duties checks pass (among other validations).
If everything passes, the release goes out. In the event there are any issues, we added robust error messaging so troubleshooting and resolution are self-service, too.
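The gating described above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration – the check names, ticket fields, and failure reporting are stand-ins, not our actual implementation:

```python
# Hypothetical sketch of pre-release gating. Each gate maps to a named
# check so failures can be reported back to the developer for
# self-service troubleshooting.

def release_checks(ticket):
    """Evaluate every gate and collect the names of any that failed."""
    checks = {
        "pull request checks successful": all(ticket["pr_checks"]),
        "stories accepted by stakeholders": all(
            story["accepted"] for story in ticket["stories"]
        ),
        # Segregation of duties: the releaser must not approve their own work
        "segregation of duties": ticket["author"] != ticket["approver"],
    }
    failures = [name for name, passed in checks.items() if not passed]
    return (not failures, failures)

ok, failures = release_checks({
    "pr_checks": [True, True],
    "stories": [{"accepted": True}],
    "author": "dev_a",
    "approver": "dev_a",   # same person -> segregation-of-duties failure
})
# ok is False; failures == ["segregation of duties"]
```

Collecting all failed checks (rather than stopping at the first) is what makes the error messaging useful: the developer sees everything blocking the release in one pass.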
Where To Now?
We’ve certainly come a long way from those early days of all-day releases, but there is always more to do. In the upcoming months our team will be focusing on a variety of enhancements including more self-service solutions, blue-green deployments, canary releases, and establishing a container pipeline. If this kind of work sounds interesting to you, check out our openings across technology and join us in building the next generation of Enova.