The term DevOps has been around for many years. Small and big companies adopt DevOps concepts for different purposes, e.g. to increase the quality of software. In this blog post, we define DevOps, present its pros and cons, highlight a few concepts and see how these can impact the entire organization.

What is DevOps?

At a high level, DevOps is understood as a technical, organizational and cultural shift in a company to run software more efficiently, reliably, and securely. From this first definition, we can see that DevOps is much more than "use tool X" or "move to the cloud". DevOps starts with the understanding that development (Dev), operations (Ops) and quality assurance (QA) are not treated as siloed disciplines anymore. Instead, they all come together in shared processes and responsibilities across collaborating teams. DevOps achieves this through various techniques. In the section "Implementation", we present a few of these concepts.

Benefits

Benefits of applying DevOps include:

Let's have a look how these benefits can be achieved by applying DevOps ideas:

How to implement DevOps

Automation and Continuous Integration (CI) / Continuous Delivery (CD)

Automation refers to a key aspect of the engineering-driven part of DevOps. With automation, we aim to reduce the need for human action, and thus the possibility of human error, as far as possible by sending your software through an automated and well-understood pipeline of actions. These automated actions can build your software, run unit tests, integrate it with existing systems, run system tests, deploy it, and provide feedback on each step. What we are describing here is usually referred to as Continuous Integration (CI) and Continuous Delivery (CD). Adopting CI/CD invests in a low-risk and low-cost way of crossing the chasm between "software that is working on an engineer's laptop" and "software that running securely and reliably on production servers".

CI/CD is usually tied to a platform on top of which the automated actions are run, e.g., Gitlab. The platform accepts software that should be passed through the pipeline, executes the automated actions on servers which are usually abstracted away, and provides feedback to the engineering team. These actions can be highly customized and tied together in different ways. For example, one action only compiles the source code and provides the build artifacts to subsequent actions. Another action can be responsible for running a test-suite, another one can deploy software. Such actions can be defined for different types of software: A website can be automatically deployed to a server, or a Desktop application can be made available to your customers without human interaction.

Besides the fact that CI/CD can be used for all kinds of software, there are other advantages to consider:

  1. The CI/CD pipeline is well-understood and maintained by the teams: the actions that are run in a pipeline can be flexibly updated, extended, etc. Infrastructure as Code can be a powerful concept here.
  2. Run in standardized environments: Version conflicts between tools and configuration or dependency mismatches only have to be fixed once when the pipeline is built. Once a pipeline is working, it will continue to work as the underlying servers and their software versions don't change. No more conflicts between operating systems, tools, versions of tools across different engineers. Pipelines are highly reproducible. Containerization can be a game-changer here.
  3. Feedback: Actions sometimes fail, e.g. because a unit test does not pass. The CI/CD platform usually allows different reporting mechanisms: E-mail someone, update the project status on your repository overview page, block subsequent actions or cancel other pipelines.

The next sections cover more DevOps concepts that benefit from automation.

Multiple Environments

The CI/CD can be extended by deploying software to different environments. These deployments can happen in individual actions defined in your pipeline. Besides the production environment, which runs user-facing software, staging and testing environments can be defined where software is deployed to. For example, a testing environment can be used by the engineering team for peer-reviewing and validating software changes. Once the team agreed on new software, it can be deployed to a staging environment. A usual purpose of the staging environment is to mimic the production environment as closely as possible. Further tests can be run in a staging environment to make sure the software is ready to be used by real users. Finally, the software reaches production-readiness and is deployed to a production environment. Such a production deployment can be designed using a gradual rollout, i.e. canary deployments.

Different environments not only realize different semantics and confidence levels of running software, e.g. as described in the previous paragraph, but also serve as an agreed-upon view on software in the entire organization. Multi-environment deployments make your software and quality thereof easier to understand. This is because of the gained insights when running software, in particular on infrastructure that is close to a production setting. Generally, running software gives much more insights into the performance, reliability, security, production-readiness and overall quality. Different teams, e.g. security experts or a dedicated QA-team (if your organization follows this practice) can be consulted at different software quality stages, i.e. different environments in which software runs. Additionally, non-technical staff can use environments, e.g. specialized ones for demo purposes.

Ultimately, integrating multiple environments structures QA and smoothens the interactions between different teams.

Fail early

No matter how well things are working in an organization that builds software, bugs happen and bugs are expensive. The cost of bugs can be projected to the manpower invested into fixing the bug, the loss of reputation due to angry customers, and generally negative business impact. Since we can't fully avoid bugs, there exist concepts to reduce both the frequency and impact of bugs. "Fail early" is one of these concepts.

The basic idea is to catch bugs and other flaws in your software as early in the development process as possible. When software is developed, unit tests, compiler errors and peer reviews count towards the early and cheap mechanisms to detect and fix flaws. Ideally, a unit test tells the developer that the software is not correct, or, a second pair of eyes reveals a potential performance issue during a code review. In both cases, not much time and effort is lost and the flaw can be easily fixed. However, other bugs might make it through these initial checks and land in testing or staging environments. Other types of tests and QA should be in place to check the software quality. Worst case, the bug outlives all checks and is in production. There, bugs have much higher impact and require more effort by many stakeholders, e.g. the bug fix by the engineering team and the apology to the customers.

To save costs, cheap checks such as running a test suite in an automated pipeline should be executed early. This will save costs as flaws discovered later in the process result in higher costs. Thus, failing early increases cost efficiency.

Rollbacks

DevOps can also help to react quickly to changes. One example of a sudden change is a bug, as described in the last section, which is discovered in the production environment. Rollbacks, for example as manually triggered pipelines, can recover the well-functioning of a production service in a timely manner. This can be useful when the bug is a hard one and needs hours to be identified and fixed. These hours of degraded customer experience or even downtime makes paying customers unhappy. A faster mechanism is desired, which minimizes the gap between a faulty system and a recovered system. A rollback can be a fast and effective way to recover system state without exposing customers to company failure much.

Policies

DevOps concepts impose a challenge to security and permission management as these span the entire organization. Policies can help to formulate authorizations and rules during operations. For example, implementing the following security requirements may be required:

The authentication and authorization tools provided by CI/CD providers or cloud vendors can help to design such policies according to your organizational needs.

Observability

As software is running and users are interacting with your applications, insights such as error rates, performance statistics, resource usages, etc. can help to identify bottlenecks, mitigate future issues, and drive business decisions through data. There exist two major ways to establish different forms of observability:

Logging and metrics can help to define goals and to align a development team with a QA team for example.

Disadvantages

So far, we only looked at the benefits and characteristics of DevOps. Let's have a brief look at the other side of the coin by commenting on the possible negative side effects and disadvantages of adopting DevOps concepts.

Conclusion

In this blog post, we explored the definition of DevOps and presented several DevOps concepts and use-cases. Furthermore, we evaluated benefits and disadvantages. Adopting DevOps is an investment into a low-friction and automated way of developing, testing, and running software. Technical improvements, e.g. automation, as well as increased collaboration between teams of different disciplines ultimately improve the efficiency in your organization long-term.

However, DevOps represents not only technical effort but also impacts the entire company, e.g. how teams communicate with each other, how issues are resolved, and what teams feel responsible for. Finding the right balance and choosing the best concepts and tools for your teams represents a challenge. We can help you with identifying and solving the DevOps transformation in your organization.

Subscribe to our blog via email
Email subscriptions come from our Atom feed and are handled by Blogtrottr. You will only receive notifications of blog posts, and can unsubscribe any time.

Do you like this blog post and need help with Next Generation Software Engineering, Platform Engineering or Blockchain & Smart Contracts? Contact us.