Designing Scalable Development Workflows

Posted on August 16, 2021

When architecting a new software system, one of the decisions that has to be made fairly early on is how developers will build and make changes to the system. The traditional approach revolves around a central source code repository. Developers retrieve the code from this repository, edit the code using standard text editors or IDEs, and then send their changes back to the central source code repository. Of course, since the repository only stores code and does not execute it, there is a separate deployment step that must occur before users or testers can use the updated software.

This workflow might vary slightly depending on the version control system being used to host the central code repository. For example, for a project with only a single developer, the central code repository may just be a single set of files on the developer’s computer. For projects that need to keep track of older versions of code as the system grows, or projects that involve multiple developers working simultaneously, more advanced version control systems like SVN, Git, or Mercurial might be used. Some of these version control systems (the “distributed” ones, such as Git and Mercurial) do not technically recognize the existence of a “central” repository, but in practice, whichever repository stores the authoritative code used to produce the final product acts as the central repository, and the workflow is essentially unchanged.

This approach is well established and is the standard approach for virtually all major public software projects, as is apparent from GitHub’s popularity. However, many projects developed internally at companies (even large ones) choose to eliminate the central code repository from the workflow, and developers instead make changes directly to a shared environment that immediately reflects their changes. Sometimes this is justified, but it often slows progress and makes it impossible to scale efficiently beyond one or two developers.

One of the reasons companies use a workflow where developers make changes directly in a shared execution environment is that the tools they are using do not easily support the traditional workflow. For example, while there are data modeling tools that are designed to help define a database schema without building it directly in the database management system, these tools are often costly and difficult to work with. Companies often end up having developers make changes directly in a “development” or non-production database and ask them to keep track by hand of the changes that they make. Then, when it is time to release the software or promote the new version to another environment, someone works through the manually compiled list and tries to repeat the same changes in the target environment. This approach is extremely error-prone and frustrating for developers, and it accounts for a significant amount of the risk that companies experience when upgrading software. While a test deployment may be performed in a test environment before the production deployment, it is extremely difficult (or, as is usually the case, impossible) to design tests that will catch all of the missing changes.
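To make the contrast with a hand-kept list concrete, here is a minimal sketch of what the traditional workflow can look like for a database schema: each change is a numbered SQL script stored in the repository, and a small runner applies whatever a given environment has not yet seen. This is only an illustration under stated assumptions. It uses SQLite from the Python standard library, and the function name apply_migrations, the bookkeeping table schema_migrations, and the one-script-per-change file layout are all hypothetical choices, not something prescribed above.

```python
import sqlite3
from pathlib import Path


def apply_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> list[str]:
    """Apply not-yet-applied .sql scripts in filename order; return their names."""
    # Hypothetical bookkeeping table recording which scripts already ran here.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}

    newly_applied = []
    # Sorting by filename means a numeric prefix (001_, 002_, ...) sets the order.
    for script in sorted(migrations_dir.glob("*.sql")):
        if script.name in applied:
            continue
        conn.executescript(script.read_text())
        conn.execute(
            "INSERT INTO schema_migrations (name) VALUES (?)", (script.name,)
        )
        conn.commit()
        newly_applied.append(script.name)
    return newly_applied
```

Because the scripts live in the repository alongside the code, promoting a release to another environment means running the same function against that environment's database, rather than replaying a list someone compiled by hand.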

Another reason companies use this suboptimal workflow is that the tools they are using simply do not support the traditional workflow. Informatica PowerCenter, Oracle APEX, and virtually any other client-server development tool fit in this category. Oracle APEX requires a database and application server to run the web application that developers use to build their own application. This directly conflicts with the typical workflow where developers use standard tools with minimal dependencies. APEX allows developers to export their applications to a text-based file, but it is not at all practical for developers to edit this file with a text editor. Instead, a developer needs a full Oracle Database, Oracle APEX, and a Java EE application server to perform any development. To use the traditional workflow, each developer would need a separate environment with these components, which is difficult and costly to manage unless tools like virtual machines are used so that new environments can be turned on and off with the push of a button. Even then, the exported text files describing the application can contain information specific to the environment that was used for development, so developers need to be trained to normalize this information before sending changes back to the central repository. This leads many companies to avoid the complexity and just have developers make changes directly in a shared APEX environment. But this can be especially difficult for developers, as the global state in an APEX application can make it hard to avoid stepping on each other’s toes.
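The normalization step mentioned above can be automated rather than left to training. The sketch below shows the general idea only: replace environment-specific values in an exported text file with fixed placeholders, so that two exports of the same application from different environments produce the same file. The field names workspace_id and exported_on and their syntax are made up for illustration and are not taken from the actual APEX export format; a real script would need patterns written against real export files.

```python
import re

# Hypothetical patterns for environment-specific values. These field names
# are illustrative assumptions, not the real APEX export format.
ENVIRONMENT_PATTERNS = {
    r"(workspace_id\s*=>\s*)\d+": r"\g<1>0",
    r"(exported_on\s*=>\s*)'[^']*'": r"\g<1>''",
}


def normalize_export(text: str, patterns: dict[str, str] = ENVIRONMENT_PATTERNS) -> str:
    """Replace environment-specific values with fixed placeholders."""
    for pattern, replacement in patterns.items():
        text = re.sub(pattern, replacement, text)
    return text
```

Running a function like this before every commit keeps the differences between versions limited to real application changes, which makes reviews and merges far less noisy.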

It is best to choose development tools that support traditional development workflows, but when this is not possible, there are techniques that companies can use to mitigate the challenges described above without abandoning the traditional development workflow. These techniques will be covered in future posts.