January 29, 2011
My team supports a legacy system. Any change to this system requires at least fifteen hours of work before it can be in production and delivering value. This work includes merging change lists down to trunk, manually testing against production-like environments, waiting for eight hours of automatic testing to complete, and finally overseeing a midnight deployment. Small incremental changes to this system are prohibitively expensive; most changes go out in multi-month releases. This approach to releasing software is undesirable because:
- It delays delivering value to customers. You spent an hour on a bug fix, but customers will not see it for months.
- It encourages late merges, which delay code integration. Integration problems are easier to address early in the software development process.
- It involves a very long feedback cycle. Developers need to wait hours to find out if their changes caused any builds to break or tests to fail.
- It discourages teams from releasing small, incremental changes. Releasing lots of changes together has been shown to result in higher numbers of defects.
- It’s just not fun.
Continuously Delivering Value
Developers spend time fixing bugs and implementing features because we think that work will be valuable to our customers. If we deliver value to our customers frequently, they will be happy. Happy customers are friendly: they will buy more of your stuff and convince their friends to buy more of your stuff. This means that you (rather than your competitors) will have more friendly customers paying more money.
Valuable software must meet customer demands and be as free of bugs as possible. There is no way to release zero-defect code. However, the number of defects can be minimized with good engineering practices: code reviews; automated unit, component, and acceptance tests (both functional and non-functional); integrating early; delivering small changes frequently; ensuring that automated builds deliver fast feedback; and performing regular, manual exploratory testing.
We want valuable changes to be running in production as early as possible. This requires that every change become a candidate for a production release and that the build, test, and deployment process be completely automated and run on every check-in. We accomplished this by chaining together several TeamCity builds (via snapshot dependencies) to create a build pipeline. Each TeamCity build represents a step in the pipeline:
1. Build the distribution (via Maven). This involves running unit and component tests. This build produces a unique version and labels the SCM changelist.
2. Stage the distribution, deployment scripts, and environment-specific configuration to the QA environment.
3. Deploy using the staged artifacts from step 2.
4. Run automated acceptance tests.
Repeat steps 2-4 for all production-like environments, always using the same artifact from step 1. If any step fails (a single test or script failure is enough to fail a step), the pipeline stops, automatically rolls the software back to the last known working version, and reports a failed build. A developer then needs to quickly determine whether they can fix forward in a reasonable amount of time or need to roll back.
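To make the control flow concrete, here is a minimal Python sketch of the pipeline logic described above. It is not how TeamCity implements snapshot dependencies, and it is not our actual configuration. The `Pipeline` class, the `StepFailed` exception, and the step callables are all hypothetical names invented for illustration. The sketch assumes each step signals failure by raising an exception and that rolling back simply means redeploying the last version that passed in that environment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class StepFailed(Exception):
    """Raised by a step when any single test or script in it fails."""


@dataclass
class Pipeline:
    """Illustrative model of the build pipeline; all names are hypothetical."""

    build: Callable[[], str]            # step 1: returns a unique version
    stage: Callable[[str, str], None]   # step 2: (version, environment)
    deploy: Callable[[str, str], None]  # step 3: (version, environment)
    accept: Callable[[str, str], None]  # step 4: (version, environment)
    environments: List[str]             # production-like environments, in order
    last_good: Dict[str, str] = field(default_factory=dict)  # env -> version
    report: List[str] = field(default_factory=list)

    def run(self) -> bool:
        """Run the pipeline for one check-in. Returns True if every step passed."""
        try:
            version = self.build()
        except StepFailed as err:
            self.report.append(f"build failed: {err}")
            return False

        # Steps 2-4 repeat per environment, always with the artifact from step 1.
        for env in self.environments:
            try:
                self.stage(version, env)
                self.deploy(version, env)
                self.accept(version, env)
            except StepFailed as err:
                self.report.append(f"{env}: {version} failed: {err}")
                previous = self.last_good.get(env)
                if previous is not None:
                    # Assumed rollback strategy: redeploy the last good version.
                    # A failure during rollback is not handled in this sketch.
                    self.deploy(previous, env)
                    self.report.append(f"{env}: rolled back to {previous}")
                return False  # the pipeline stops; later environments are untouched
            self.last_good[env] = version
        return True
```

In this sketch a failure in one environment leaves every later environment on its previous version, which is the property that keeps a bad change from reaching production.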
Automating the entire build/test/deploy process was a huge benefit to our team. We no longer need to go through a tedious manual verification and deployment exercise. The time and energy that used to go into manually verifying, preparing, and deploying releases can now be spent developing new features and fixing bugs. The otherwise painful integration tasks are performed automatically with every check-in, which means that important errors and bad assumptions are found early in the process, when they are easiest to fix.