Environmentless Development
Alex Pearson
Platter CTO

This is the second of a five-part series on Branchstack, the approach behind Platter's latest Branching Postgres product. Read the series introduction here.
Software is not just code. Assumptions are made about the runtime or system that executes the code itself. State is shared across databases, third-party APIs, and other cloud-hosted services. And in most development workflows, arrangements of all of these parts are packaged in the form of "environments". Working on a new feature happens in a local "development" environment. Acceptance testing might happen in a "staging" environment. Users interact with a "production" environment. These environments are hard to manage, with drift and misconfigurations between them frequently causing outages.
But it doesn't have to be this way. Instead of accepting the errors and complexity that come from trying to manage environments, we can get rid of the idea of environments entirely.
# Environments Today

To understand the problem that environmentless development might solve, let's first take a look at how we got to today's environment-based deployment pipelines.
The environmental approach did not appear out of nowhere. Code under development should behave differently than the end product shipped to users. Feature work needs quick iteration and verbose debugging more than peak performance or small asset size. And as long as software was mostly code, the separation between the code on a developer's computer and the asset installed by users was a natural one.
But that distinction became less clear as programs began to depend on external services. As applications grew to include databases, queues, third-party APIs, and other cloud-native components, strategies emerged to integrate those services with the same local development loop used for code:
- Twelve-Factor methodology helped make code context-independent, interfacing with "backing services" in each environment through URLs exposed as environment variables. This means that apps can interact with a database through a `DATABASE_URL`, whether that database is a locally-running Postgres process or a hosted cluster accessed through an array of connection pools and load balancers. While not strictly part of twelve-factor methodology, many of these applications used `.env` files and a loader like `dotenv` to group environment variables into discrete environments.
- Configuration-as-Code and Infrastructure-as-Code introduced environment components (app configuration and backing service provisioning, respectively) to a version-controlled codebase, enabling a record of changes for easier rollout and rollback of new environment setups.
- GitOps used that version-controlled configuration-in-code to automate and reconcile the state of different environments and the transitions between them. As the name implies, this automation is driven by a `git`-based workflow.
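The twelve-factor pattern described above can be sketched in a few lines. This is a hypothetical example (the variable value and database names are made up): the same connection code runs unchanged whether `DATABASE_URL` points at a local Postgres process or a hosted cluster, because only the environment variable differs.

```python
import os
from urllib.parse import urlparse

# In a real app this would come from the environment (or a .env file
# loaded by dotenv); the default here is an illustrative local URL.
os.environ.setdefault("DATABASE_URL", "postgres://app:secret@localhost:5432/app_dev")

url = urlparse(os.environ["DATABASE_URL"])

# The application never hardcodes a host: swapping environments means
# swapping DATABASE_URL, not changing code.
conn_params = {
    "host": url.hostname,          # "localhost" in development, a cluster DNS name in production
    "port": url.port or 5432,      # Postgres default port when none is given
    "dbname": url.path.lstrip("/"),
    "user": url.username,
    "password": url.password,
}
```

These parameters could then be handed to any Postgres client; the point is that the code path is identical in every environment.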
The end result of many of these pipelines is separate development environments on local machines, separate staging environments for each pull request made through a `git` repository host like GitHub, and a single production environment for end users.
But the existing state of the art for transitioning between development and production is still incomplete in two important ways:
- Most of that automation and isolation tends to happen at the end of feature work instead of the beginning, meaning that development work needs to be rigorously tested in precious "staging" environments outside of the more immediate feature-development feedback loop.
- While the principles behind GitOps are clear, the implementation of the automation required to adhere to those principles is not. Getting to the "ideal" GitOps workflow described above requires committing to a top-to-bottom orchestration framework like Kubernetes to spin up copies of a staging environment for each pull request. That added complexity has its own costs.
# Environmental Disasters

As the complexity of an application grows, it becomes harder and harder to manage these independent environments. As the number and kind of environments to manage increases, so too does the need to automate transitions between those environments. And that automation requires its own maintenance, expertise, and specialized platform teams.
For successful-enough software products, managing these environments can quickly become more complex than writing the software itself, as faults in this pipeline to production manifest as the most dangerous types of bugs and service outages. And as each environment strays farther from production, the likelihood of mishaps occurring increases.
Consider the simplest case of a "development" environment on a local machine that includes application code and a single Postgres database. A common development setup would include running the database as a daemonized process locally and running some hot-reloading file-watching command for iterating on the code itself. Considering just the database in this example, here are a few categories of environmental bugs you're bound to encounter:
- Stale data: SQL queries are state-dependent! The only surefire way to know that migrations or queries work as expected at runtime is to run them against the same data as production. Where this is impossible for compliance, security, or practical reasons, development data should be as close to production data as possible.
- Configuration mismatches: configuration differences between databases can lead to wildly different behaviors. Are you certain that your local Postgres instance is configured identically to the instance your users use?
- Hardware mismatches: software behaves differently when run on different machines. Have you accounted for these differences in your software, too? Have you dutifully checked the performance of your database when different numbers of threads are available? Are you developing features using hardware under the same kind of workloads seen in production? Is your virtualization or containerization strategy the same as your database host's?
- Networking differences: in this example, the database and application are communicating with one another over a network. Does that connection use the same protocols, addresses, and encryption across environments? Are you accurately emulating the delays, timeouts, and retries that need to happen between these services when they're not running on a single machine? In the case of databases, are your connection pooling strategies the same?
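Configuration mismatches in particular are easy to check for and easy to forget. The sketch below shows one way to surface drift between two Postgres instances; in practice you would fetch each side's settings with `SELECT name, setting FROM pg_settings` over a client library, while the dictionaries here are illustrative stand-ins.

```python
def find_config_drift(local: dict, production: dict) -> dict:
    """Return the settings whose values differ between two instances,
    mapped to a (local_value, production_value) pair."""
    keys = set(local) | set(production)
    return {
        name: (local.get(name), production.get(name))
        for name in keys
        if local.get(name) != production.get(name)
    }

# Stand-in settings; real values would come from pg_settings on each host.
local_settings = {"max_connections": "100", "work_mem": "4MB", "shared_buffers": "128MB"}
prod_settings = {"max_connections": "500", "work_mem": "4MB", "shared_buffers": "8GB"}

drift = find_config_drift(local_settings, prod_settings)
# drift now holds only the mismatched settings:
# max_connections and shared_buffers, but not work_mem.
```

A check like this only catches configuration drift, of course; the hardware and networking mismatches above have no such easy diff.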
Some of these might seem like edge cases. But here at Platter, we've run into variations of all of these errors in our previous efforts to smooth over the differences between development and production in setups just like this one.
We learned that, while careful separation of context and GitOps-style automation might smooth the transition between disparate development and production environments, there is a fundamental mismatch here that no amount of automation can solve: it is simply impossible to adequately replicate a production "environment" through the entire development lifecycle, no matter how many containers and automations you throw at the problem.
Given that truth, the solution seems obvious: stop trying to replicate production.
# There Is Only One Environment

While local iteration on code will probably not go away any time soon, that iteration can happen against resources running in the same cloud-based environment as production. This makes feature work both simpler and more reliable, as long as those resources can be quickly copied (Branching) and respond automatically to increases in demand from feature work and end users alike (Serverless).
Stay tuned for more thorough deep-dives into the concepts of Branching and Serverless in future posts in this Branchstack series.
There are a few popular services that are beginning to leverage the idea of the "environmentless" workflow for local development. For example, Stripe uses a test API key to split customers into "test" and "live" buckets while retaining the exact same API endpoints for both. As a developer writing a feature against Stripe's service, you can be confident that Stripe's API will give you the same experience as your users precisely because you're using the same service as your users!
While "test" and "live" buckets are not as flexible as custom branching, it's still a powerful example of environmentless development in action. Just think of how many bugs have been prevented thanks to Stripe's decision to allow feature work against their "production" API.
Platter's branching Postgres database was built from the start to be truly environmentless: there is one database instance with many branches hosted in the cloud. Whether you use direct connections or a generated client, Platter Postgres branches use the exact same hardware, protocols, authentication, configuration, and data as their parent branches. That way, you can be confident that interactions with your database will behave the same on a local machine, in a serverless function, or inside a Docker container.
# Building Environmentless

Beyond Postgres, there are many applications of the environmentless principle more generally. We'll be bringing you more in-depth articles and case studies through this blog in future posts.
But getting rid of "environments" is only a piece of the Branchstack puzzle. Stay tuned for the next part of this series on Serverless architecture!
If you have any questions about moving to a single environment or using Platter Postgres to implement this development pattern, please send us an email at [email protected] or join us on the `#kitchen` Slack channel.