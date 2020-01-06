If you write software, chances are you’ve come across Continuous Integration, or CI. You might never have heard of it – but you wonder what all the ticks, badges and mysterious status icons are on open-source repositories you find online. You might hear friends waxing lyrical about the merits of CI, or grumbling about how their pipeline has broken again.
Want to know what all the fuss is about? This article will explain the basic concepts of CI, but will focus on an example, since that’s the best way to understand it. Let’s dive in.
What is CI anyway?
The precise definition of Continuous Integration refers to the practice of software developers frequently checking in their code, usually multiple times a day in a commercial setting, to a central repository. When the code is checked in, automated tests and builds are run, to verify the small changes which have been made. This is in preference to working on a ginormous slab of code for a week, checking it in, and finding out it fails a large number of tests, and breaks other people’s code.
Whilst this is a valid definition, colloquially CI has become synonymous with the automation part of this process; when people refer to CI, they are often referring to the tests, builds, and code coverage reports which run automatically on check-in.
Additionally, CI is often lumped together with its sister, Continuous Deployment (CD). CD is the practice of deploying your application automatically: as soon as your code has been pushed to the correct branch and tests have passed. We’ll talk more about this soon.
Case study – a simple API
I’m going to save any more explanation or discussion of the merits of CI until after we’ve seen an example, because this will make it easier to picture what’s going on.
The aim of this example is to make a very simple Python application, then use CI to automatically test it, and CD to automatically deploy it. We’re going to use GitLab CI, because it’s a neat, integrated solution that is easy to set up. You can view the finished repository containing all the files here.
Let’s start by creating a Python file containing our main application logic. In this case, it’s some string processing functions.

    """ web/logic.py. Contains main application code. """

    def capitalise(input_str):
        """Return upper case version of string."""
        return input_str.upper()

    def reverse(input_str):
        """Return reversed string."""
        return input_str[::-1]
Let’s also add some extremely basic tests for this code:

    """ test_logic.py. Tests for main application code. """

    from web import logic

    def test_capitalise():
        """Test the `capitalise` function logic."""
        assert logic.capitalise("hackaday") == "HACKADAY"

    def test_reverse():
        """Test the `reverse` function logic."""
        assert logic.reverse("fresh hacks") == "skcah hserf"
        assert logic.reverse("racecar") == "racecar"
OK, now that we’ve made our main application code, let’s expose it over an API. We’ll use Flask for this. Don’t worry about reading this meticulously; it’s only shown here for context.

    """ web/api.py. Expose logic functions as API using Flask. """

    from flask import Flask, jsonify
    import web.logic as logic

    app = Flask(__name__)

    @app.route('/api/capitalise/<string:input_str>', methods=['GET'])
    def capitalise(input_str):
        """ Return capitalised version of string. """
        return jsonify({'result': logic.capitalise(input_str)})

    @app.route('/api/reverse/<string:input_str>', methods=['GET'])
    def reverse(input_str):
        """ Return reversed string. """
        return jsonify({'result': logic.reverse(input_str)})

    if __name__ == '__main__':
        app.run()
Note that we should test the API as well (and Flask has some nice ways to do this), but for conciseness, we won’t do this here.
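For the curious, here is a minimal sketch of what such an API test might look like, using Flask’s built-in test client. To keep the example runnable on its own, it defines a small stand-in app with only the reverse route; in the real repository you would presumably import `app` from `web/api.py` instead. The test name is just an illustration, not something from the finished repository.

```python
"""Sketch of an API test using Flask's built-in test client."""
from flask import Flask, jsonify

# Stand-in for web/api.py so this example runs on its own (an assumption
# for illustration); in the real project: `from web.api import app`.
app = Flask(__name__)


@app.route('/api/reverse/<string:input_str>', methods=['GET'])
def reverse(input_str):
    """Return reversed string."""
    return jsonify({'result': input_str[::-1]})


def test_reverse_endpoint():
    """Request the endpoint in-process and check the JSON payload."""
    client = app.test_client()  # no real server is started
    response = client.get('/api/reverse/hackaday')
    assert response.status_code == 200
    assert response.get_json() == {'result': 'yadakcah'}
```

Because the function name starts with `test_`, pytest would pick it up alongside the logic tests without any extra configuration.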
Now that we have an example application set up, let’s do the part we’re all here for and add a CI/CD pipeline to GitLab. We do this by simply adding a .gitlab-ci.yml file to the repository.
In this explanation we’re going to walk through the file section by section, but you can view the full file here. Here are the first few lines:
    image: python:3

    stages:
      - analyse
      - test
      - deploy
This sets the default Docker image to run jobs in (Python 3 in this case), and defines the three stages of our pipeline. By default, each stage will only run once the previous stage has passed.
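If the gating behaviour is hard to picture, the following toy model may help. It is not how GitLab is implemented, just a sketch of the rule described above: stages run in order, and once one fails, the later ones are skipped. The function name `run_pipeline` and the status strings are made up for illustration.

```python
"""Toy model of stage gating: a stage runs only if every earlier stage passed."""


def run_pipeline(stages):
    """Run (name, job) pairs in order and return a {name: status} dict.

    `job` is any zero-argument callable returning True on success.
    """
    results = {}
    failed = False
    for name, job in stages:
        if failed:
            # An earlier stage failed, so this one never runs.
            results[name] = "skipped"
            continue
        passed = job()
        results[name] = "passed" if passed else "failed"
        failed = not passed
    return results


if __name__ == "__main__":
    outcome = run_pipeline([
        ("analyse", lambda: True),
        ("test", lambda: False),   # simulate a failing test stage
        ("deploy", lambda: True),  # never reached
    ])
    print(outcome)
```

Running it prints `{'analyse': 'passed', 'test': 'failed', 'deploy': 'skipped'}`: the simulated test failure stops the deploy stage from ever running.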
    pylint:
      stage: analyse
      script:
        - pip install -r requirements.txt
        - pylint web/ test_logic.py
This is the job for our first stage. We run pylint as an initial static analyser on the code to ensure correct code formatting and style. This is a useful way to enforce a style guide and statically check for errors.
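To give a feel for what “static” means here, below is a toy checker written with Python’s standard `ast` module. It is emphatically not pylint, and only mimics one kind of check (pylint has a similar missing-docstring warning): it inspects the source text without ever running it. The function name and the sample source are assumptions for illustration.

```python
"""Toy static check: report functions that have no docstring."""
import ast


def functions_missing_docstrings(source):
    """Return names of functions in `source` that lack a docstring."""
    tree = ast.parse(source)  # parse only; the code is never executed
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and ast.get_docstring(node) is None
    ]


# Sample source: `reverse` has deliberately lost its docstring.
SAMPLE = '''
def capitalise(input_str):
    """Return upper case version of string."""
    return input_str.upper()

def reverse(input_str):
    return input_str[::-1]
'''

if __name__ == "__main__":
    print(functions_missing_docstrings(SAMPLE))
```

Running it prints `['reverse']`. A real linter bundles hundreds of checks like this, which is why it earns its own pipeline stage.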
    pytest:
      stage: test
      script:
        - pip install -r requirements.txt
        - pytest
This is our second stage, where we run the tests we wrote earlier, using pytest. If they pass, we continue to our final stage: deployment.
    staging:
      stage: deploy
      script:
        - apt-get update -qy && apt-get install -y ruby-dev
        - gem install dpl
        - dpl --provider=heroku --app=hackaday-ci-staging --api-key=$HEROKU_API_KEY

    production:
      stage: deploy
      only:
        - master
      script:
        - apt-get update -qy && apt-get install -y ruby-dev
        - gem install dpl
        - dpl --provider=heroku --app=hackaday-ci-prod --api-key=$HEROKU_API_KEY
Our aim here is to deploy the API onto some kind of server, so I’ve used Heroku as the platform, authorised with an API key.
This last stage is slightly different from the others because it contains two jobs that deploy to two places: staging and production. Note that we deploy to staging on any commit, but we only deploy to production when we push to or merge into master. This means that we can check, test and use our live app in staging after any code change, but the production app isn’t affected until our code is merged into master. (In a larger project, it often makes more sense to deploy to staging on master and only deploy to production when a commit is tagged.)
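The deployment rule can be summarised in a few lines of Python. This is only a simplified model of the behaviour described above, not GitLab’s actual logic, and the function name `deploy_targets` is made up for illustration.

```python
"""Simplified model of which environments a commit gets deployed to."""


def deploy_targets(branch, tests_passed):
    """Return the environments a commit on `branch` would be deployed to."""
    if not tests_passed:
        # The deploy stage never runs if an earlier stage failed.
        return []
    targets = ["staging"]             # staging: deployed on any commit
    if branch == "master":
        targets.append("production")  # production: master only
    return targets


if __name__ == "__main__":
    print(deploy_targets("new-feature", tests_passed=True))
    print(deploy_targets("master", tests_passed=True))
    print(deploy_targets("master", tests_passed=False))
```

The three calls print `['staging']`, `['staging', 'production']` and `[]` respectively.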
And that’s it! In less than 40 lines we’ve defined a completely automated system to check and deploy our code. We are rewarded by our pipeline showing up in GitLab as below:
Additionally, the .gitlab-ci.yml configuration file which specifies what to automate is usually also version-controlled, so that if the CI pipeline evolves, it evolves alongside the relevant version of your code.
Why it’s useful
All manner of tasks can be automated using CI, which lets you catch errors early and fix them before they accumulate as technical debt in the codebase.
Common tasks for larger Python projects might be to test our code for compatibility with different Python versions, build a Python module as a wheel, and/or push it to PyPI. For projects using compiled languages, you could automatically build your binaries for all your target platforms. For web development, it’s easy to see the benefit of automatically deploying new code on a server once certain conditions have been met.
Furthermore, part of the reason that CI is so powerful is its close relation to version control. Every time that code is pushed to any branch in a repository, tests and analysis can run, which means that people who control master or protected branches can easily see if code is safe to merge in.
Indeed, whilst CI is most satisfying when the pipeline is full of ticks, it is most useful when it looks like this:
This means that the tests failed, and as a result, the broken code was not deployed. People can clearly see not to merge this code into important branches.
Conclusions: do you need CI?
CI/CD is definitely more useful in some cases than others. But if you’re writing any code at all, you can save yourself time by writing tests for it. And if you have tests, why not run them automatically on every commit?
I can personally say that every time I’ve set up a CI pipeline, not only has it saved me time, but at some point or another it got me out of a scrape by catching broken code. I’d wager it will do the same for you.
3 thoughts on “Continuous Integration: What It Is And Why You Need It”
Like any software development methodology, CI/CD works well when applied thoughtfully to suitable problems (high-level APIs and mature code bases where changes are likely to be incremental and other teams will legitimately be consuming intermediate versions).
I did, however, work somewhere where management decided one day to enforce a CI/CD workflow on all teams regardless of how well it fit the problem space. I was working on a team building drivers and low-level data plane code which embodied the software half of a task that was split between a gang of FPGAs and a gang of threads on the x86 CPUs. The functionality implemented was effectively a 5-or-so stage pipeline that wove back and forth between the FPGA and the CPU, where the last stage is a handoff to an external process, and then another similar process in reverse on the other side.
Applying CI/CD techniques to ensure continuity and stability of the API between the end stages and the external consumer process makes sense because A) it’s a public API and B) its operations are somewhat high level (i.e. here are some buffers: fill them with packets or here are some packets, let me know when they’re all sent so I can reclaim the buffers).
The earlier stages, however, interlocked with the FPGA bitstream and needed to be developed in lock-step with the FPGA bitstream (and can’t be meaningfully tested without the actual target system (chassis, backplane, custom motherboard, and live connectivity via the 100G interfaces on the front panel) and a reasonably convincing and representative environment for it to operate in). Yes, with some extra effort we could make module-level tests with a simulated stand-in for the FPGA (wrap all MMIO access in an opaque call that replaces it with socket messaging, and replace all DMA with memcpy()), and in fact we did that for a while. But since it needed reworking with every successive build of the bitstream, it took as much work to maintain as the code it was meant to test. Worse, it turns out that the majority of bugs arose from the “language barrier” between the software folks and the FPGA/ASIC folks, so more than once the emulator/test jig ended up emulating what the software folks thought the FPGA folks meant, not what the FPGA actually did.
Between that and the frequent incremental updates as the FPGA went from bare-bones functionality with a lot of stubs/placeholders to more and more functional features (and, over time, the design of the later features was revised to address lessons learned and improve integration with software and work around the occasional erratum in supporting chips such that some placeholders got replaced by real functionality as envisioned and some got ripped out and replaced with entirely dissimilar interfaces to the same functionality) it became clear that software emulation was not cutting it and our efforts were better applied to detailed instrumentation while running subsystem tests and full integration tests on the real hardware.
It was _not_ easy to convince upper management (from whence the decree came that all teams must apply their chosen CI/CD workflow and Agile/Scrum development cycle down to the length of “sprints” regardless of whether it lined up with the FPGA development cycle or not) that the methodology and implementation they imposed organization-wide (which was working well for the UI and control plane as you’d expect) was doubling or tripling our workload while adding little or no value.
The bottom line is this: No methodology is a magic bullet across all layers and applications; there are good ideas to be borrowed from many such approaches but in my 20+ years in the business I’ve found that the best results are to be had from stopping and thinking carefully about the tricky bits in your specific application, where errors creep in, and how best to catch them early and limit the blast radius of the disruption they cause, and go from there to synthesize a workflow and development method suited to the problem at hand from the bottom up.
The appeal of rigid top-down methodology is often the data and metrics it puts at the fingertips of project managers, etc. because they feel that the more data they have the fewer unseen risks can creep in. This is, of course, true but only to the extent that the data they have are meaningful… When the workflow is not well suited to the task at hand it is less likely that the metrics generated by the automation will be meaningful and indicative of stability/completeness/correctness/performance/whatever they’re trying to measure. It is far better to generate the right metrics from the right workflow from the bottom up and teach management to interpret them than generate uniform metrics across all modules at all levels whether they’re meaningful or not.
I’m sorry but you are wrong. CI/CD is just as useful for your project. I just think you guys may have handled it wrong.
So for one, I think your biggest beef is automated testing, as that is usually the most important stage in your CI. But at the very least, you can do static code analysis and style checking. That works for any project.
Now unit testing/application testing as part of the CI is indeed complex and, depending on how much hardware you have to emulate, very complex. But the solution here usually is to do inline testing as part of the pipeline, which looks more like CD. I do assume you guys actually perform testing on the real hardware; now if you can automate this, Bob’s your uncle.
Now this last step of course can be tricky, but a Raspberry Pi that you have a CI runner on, probing and toggling your pins, should get you quite far. For example, 1 GPIO to power up the FPGA, SPI bus to program it, GPIO to power up your board/Linux, 1 GPIO that triggers your fail/success. And then, it is just a matter of writing proper inline tests, which IMHO you need anyway. Some (time) investment there can’t be seen as bad.
Now if you don’t need or have automated testing, I do wonder how the production line would work without some tests.
Of course this post is all based on assumptions and experience from my end, so do pardon me if I am wrong, but in the case that I am, your product would be so exclusive, that the generalisation still holds.
Now that I am a bald, old, and cranky independent, I never take a job where the boss says bad things about CI. Back when Grady Booch was first attempting to push programmers into software engineering, I used to adore all of the cute programming teams that were supposedly implementing CI along with XP. They thought they were cool. And they never released any production code.
But CI is essential – and should be used for *all* sizes of projects, whether hobby or professional. And my one-man shop eats its own dogfood. My CI ‘system’ is based on SCons. You can have my SCons-based check-in scripts when you pry my cold dead hands from my keyboard.