The Battle of Automated Service Deployment Tools
At Grabyo, we are massive advocates of infrastructure as code (IaC) and have been using Terraform since the early days. There are plenty of other blogs about IaC, so I won’t labour the point here, but there are many advantages that make it desirable for businesses of all shapes and sizes. But one thing we noticed pretty quickly is that IaC isn’t suitable for automating the creation and destruction of service environments, a problem that hadn’t been solved. Until recently.
The Battle of the Solutions
In the blue corner…
At the tail end of 2020, HashiCorp (the same incredible people who make Terraform) released a tool to solve precisely this problem: HashiCorp Waypoint. Waypoint is a plugin-driven tool to help build, deploy and release any service, and it supports a bunch of deployment targets straight out of the box, including AWS ECS, AWS EC2, Kubernetes and Azure. This makes it super easy to get started: with a few lines of code and the standard examples, it was straightforward to get a simple service up and running. In addition, it comes with a simple CLI, which means anybody can use it, from operations teams to development teams.
HashiCorp Waypoint Architecture
But ECS is a complex service and supports many additional features. One such feature that we needed at Grabyo was an integration with AWS App Mesh. At launch, Waypoint was pretty basic in functionality, and some essential features were missing (such as adjusting the exposed ports of a container). This made the integration with App Mesh almost impossible, and Waypoint would have had to do something exceptional for us to adjust our service mesh plans (our road to microservices was still very new at this time, so we were prepared to change course slightly if the benefits were worth it). It’s worth noting that we could have built our own plugin for Waypoint to support any custom functionality. Unfortunately, this was not feasible: plugins are written in Go, which would have made our development teams too dependent on our platform team for updates. We were looking for something that enabled our development teams to move at their own pace.
As we looked deeper into ways we could leverage Waypoint to make life easy, we unearthed another issue: Waypoint did not support the creation of additional infrastructure. Now I hear what you’re saying: “But why would you need that? Just do that with Terraform!” True, you could do this with Terraform, but it gets tricky when each deployment requires its own SQS queue or SNS topic, because I don’t want to “pre-provision” infrastructure for an arbitrary number of service environments. So core service infrastructure is OK to manage with Terraform, and the deployments can be managed with Waypoint, but “deployment-specific infrastructure” must be controlled by a third tool. Ew…
And in the red corner…
About the same time, AWS released Proton, and as an AWS Partner, we were immediately interested. Proton has a slightly different approach to Waypoint and focuses on helping operations teams manage hundreds of service deployments with ease. In Proton, operations teams create templates and environments that service teams can use to deploy source code packages. Updates to these templates can be rolled out seamlessly to all services with limited developer involvement, which is excellent from a security standpoint but is not without its drawbacks.
AWS Proton Architecture
Here at Grabyo, we like our teams to have responsibility for their entire service, which means everything, including infrastructure, security, code and deployments. Unfortunately, the use of Proton would bind our developers to specific templates defined by our platform team, which can be a huge inconvenience for both parties. Specifically, the development teams would need to request updates from the platform team, and the platform team would have to schedule those updates ASAP, at the expense of their own work, to avoid becoming a blocker.
Not only this, but Proton’s tight integration with Version Control Systems (VCS) such as Git required all changes to be made available on a remote branch before they could be deployed. This slows the development feedback loop and forces developers to commit and push changes still under heavy development. We wanted our teams to request deployments straight from their terminal without checking their changes into Git.
Although we didn’t have time to wait for an implementation, we were so passionate about solving these problems we even volunteered feedback to AWS and had numerous meetings with them about the features that we needed. As a result, some of these features are being worked on today.
With neither HashiCorp Waypoint nor AWS Proton fully solving our needs, and nothing else available on the market, we turned to an in-house solution, acutely aware that such an approach would come with an additional maintenance overhead that needed to be minimized.
Our platform team was not deterred and set about designing and developing DeployTools (DT). From our previous research, the essential requirements included:
- The ability to support deployment-specific infrastructure, such as SQS queues. Service owners should be free to architect their services with whatever components they need, and DT should not be a limiting factor.
- AWS App Mesh support.
- Minimal maintenance effort; this had to be comparable to a managed solution such as Waypoint or Proton.
- An automated cleanup process. Preview environments can get expensive quickly, and unfortunately, cleanup did not come out of the box with either Waypoint or Proton.
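To make the cleanup requirement concrete, here is a minimal sketch of how expired preview environments could be selected for deletion. It is an illustration only, not how DT is implemented: the `PreviewEnvironment` record, its `ttl` field and the `expired_environments` helper are all hypothetical names, and the actual teardown step is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class PreviewEnvironment:
    """Hypothetical record describing one preview environment."""
    name: str
    created_at: datetime
    ttl: timedelta  # assumed: how long the environment may live


def expired_environments(environments, now):
    """Return the environments whose time-to-live has elapsed at `now`."""
    return [env for env in environments if env.created_at + env.ttl <= now]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    environments = [
        PreviewEnvironment("feature-a", now - timedelta(hours=30), timedelta(hours=24)),
        PreviewEnvironment("feature-b", now - timedelta(hours=2), timedelta(hours=24)),
    ]
    for env in expired_environments(environments, now):
        # A real cleanup job would tear the environment down here.
        print(f"would delete {env.name}")
```

A scheduled job could run a check like this periodically, so that forgotten environments do not keep accruing cost.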
Today, DT is such an integral part of our developer workflows that it deserves its own blog post, but in short, we settled on the following:
- A simple serverless REST API for DT itself. This allows us to expand quickly in the future with more features (such as preview environments on PRs) and keep maintenance costs low.
- Deployment infrastructure described with CloudFormation templates. This allows templates to be reused for multiple deployments and is easy to integrate with App Mesh and other AWS services.
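As a rough illustration of the second point, the sketch below builds a reusable CloudFormation template that gives each deployment its own SQS queue. The function name, the `EnvironmentName` parameter and the `JobsQueue` resource are assumptions made for this example and are not taken from DT itself.

```python
import json


def deployment_template(service: str) -> dict:
    """Build a CloudFormation template with one SQS queue per deployment.

    `service`, `EnvironmentName` and `JobsQueue` are hypothetical names.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Deployment-specific infrastructure for {service}",
        "Parameters": {
            "EnvironmentName": {
                "Type": "String",
                "Description": "Name of the environment being deployed",
            }
        },
        "Resources": {
            "JobsQueue": {
                "Type": "AWS::SQS::Queue",
                "Properties": {
                    # Fn::Sub fills in the parameter at stack creation time,
                    # so every environment gets a distinctly named queue.
                    "QueueName": {"Fn::Sub": f"{service}-${{EnvironmentName}}-jobs"},
                },
            }
        },
        "Outputs": {
            # Ref on an SQS queue resolves to the queue URL.
            "JobsQueueUrl": {"Value": {"Ref": "JobsQueue"}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(deployment_template("example-service"), indent=2))
```

Because the environment name is a stack parameter, the same template can be used to create one stack per environment, and deleting a stack removes that environment's queue with it.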
HashiCorp Waypoint and AWS Proton are much-needed technologies in the industry today. Both solve specific problems exceptionally well, despite not being entirely suited to our use case at Grabyo. We therefore opted to build our own tooling to solve our requirements precisely and efficiently. Today, our solution supports ephemeral environments from localhost and pull requests, environment-ready notifications in Slack, automated environment cleanup for cost-saving and a multi-account structure. We also have an ambitious roadmap to support even more features, such as complex traffic-shaping strategies during deployments.