
Concourse CI: A new way to approach CI-CD


DevOps is essential to software delivery. Applying DevOps principles to software development and delivery has become a necessity as more and more enterprises adopt this philosophy: a philosophy focused on faster delivery of software, with reduced risk and more collaboration among teams. And who doesn't want faster delivery of software?

One of the first places to start when adopting DevOps is implementing CI-CD to automate the process of integration, build, test and delivery. Continuous integration is a process where developers continuously integrate their code into a shared source code repository. When implemented as part of DevOps, this also includes build, unit testing and integration testing, which provides faster feedback to the team about build errors or defects.

In this blog, I will explain what Concourse CI is, how it is different from other CI tools, what its core concepts are and how teams can use it to implement an effective CI-CD pipeline for their applications.

Concourse CI is a CI-CD tool developed as part of the Cloud Foundry project. It was born out of the need for a CI tool better suited to modern application development processes and culture. It is an open source tool distributed under the Apache 2.0 license. Concourse is built for scalability. For a production-grade installation, it is recommended to install it using the BOSH release. BOSH is a release management tool similar to Puppet or Chef, but it has built-in CPIs (Cloud Provider Interfaces) for all popular cloud platforms like AWS, GCP, Azure, OpenStack, vSphere etc. By running Concourse using BOSH on any private or public cloud, you can create a highly scalable, self-healing CI platform which can be used across the organization.

Concourse is different in many ways compared to existing CI tools such as Jenkins, Travis CI etc. Jenkins is probably the oldest and most commonly used CI tool, but automating a Jenkins server is not easy. It requires a lot of manual intervention through the Jenkins UI after installation. It is heavily dependent on plug-ins, and those plug-ins depend on other plug-ins. This makes it difficult to create reproducible Jenkins environments.

With Concourse, pipeline as code is not optional; it is the only way to build pipelines. A pipeline is nothing but a YAML file which can be stored in an SCM, with no further dependency on the infrastructure. Concourse runs everything inside a container, which takes away all the environment dependencies and gives developers full control over the jobs they run. Containers also help with scalability, as multiple containers can run at the same time across multiple Concourse workers. I will explain this in more detail when I cover the Concourse architecture later.

Another aspect is persistence. Concourse uses PostgreSQL to store all of its data. This makes recreating a Concourse environment very easy: connect a new Concourse web server to the same PostgreSQL database and you get the same state back.

Concourse has three core concepts: Tasks, Resources and Jobs.

Task: A task is just a script meant to run in an isolated environment, and it can use resources. If the script exits with 0, it is a success; otherwise it is considered a failure. A task can be executed by a job or from the command-line utility called Fly. In both cases, its execution is identical, which guarantees that a task you run locally with Fly will behave the same way as part of a pipeline.
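To make this concrete, here is a minimal sketch of a task definition; the file name hello-task.yml and the busybox image are just illustrative choices:

---
platform: linux
image_resource:
  type: docker-image
  source:
    repository: busybox
run:
  path: echo
  args: ["Hello from Concourse"]

You can run it directly from your machine with Fly, without any pipeline at all:

$ fly -t <target> execute --config hello-task.yml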

Resources: A resource is an entity that can be checked for new versions, pulled down at a specific version and pushed to idempotently create new versions. A good example is a Git repository. Every resource implements the same generic interface used by Concourse, so Concourse itself does not need to know anything about a specific resource. This creates another layer of isolation. Concourse comes with some of the most popular resources built in, like git, mercurial, time, s3 etc. There are lots of community resources available as well, and you can create your own resource by following the resource implementation guidelines at https://concourse.ci/implementing-resources.html. A full list of resources is available at https://concourse.ci/resource-types.html.
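As a small sketch, this is what declaring two built-in resources looks like in a pipeline; the repository URI is a placeholder:

resources:
- name: source
  type: git
  source:
    uri: https://github.com/<org>/<repo>.git   # placeholder repository
    branch: master
- name: every-30m
  type: time
  source:
    interval: 30m   # a new version appears every 30 minutes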

Jobs: A job describes a set of actions to be performed whenever a newer version of a resource is available, or when it is manually triggered. A job can execute multiple tasks, take multiple resources as input and produce output for multiple resources. A basic example would be running a build job whenever there is a change in source control.

A pipeline is made up of multiple jobs and the resources that connect them; a minimal sketch follows below.
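Assuming the source resource declared above, a minimal job that runs a test task on every new commit could look like this (the task file path is hypothetical):

jobs:
- name: unit-tests
  plan:
  - get: source
    trigger: true   # run automatically when a new version of source appears
  - task: run-tests
    file: source/ci/unit-tests.yml   # task definition kept alongside the code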

Now let's take a look at the Concourse architecture. Understanding the architecture is not required if you are just starting out and planning to create simple pipelines for your applications. Concourse has three major components: ATC, TSA and Workers.

ATC (Air Traffic Controller) provides the web UI and the build scheduler. It connects to PostgreSQL and stores pipeline data, including build logs. You can run multiple instances of ATC behind a load balancer for high availability; in that case, all ATC instances need to connect to the same PostgreSQL database.

TSA is a custom-built SSH server responsible for securely registering workers with the ATC. The TSA is generally colocated with the ATC and sits behind the load balancer.

Workers provide the container runtime and cache management. Workers hold no important state and require no additional packages beyond what is needed to run the worker process itself.
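To make these components concrete, here is a sketch of how a standalone (non-BOSH) deployment wires them together using the concourse binary. The exact flags vary between Concourse versions, and the hostnames, credentials and key paths below are placeholders:

# web = ATC + TSA in one process
$ concourse web \
    --basic-auth-username admin \
    --basic-auth-password <password> \
    --session-signing-key session_signing_key \
    --tsa-host-key tsa_host_key \
    --tsa-authorized-keys authorized_worker_keys \
    --postgres-data-source postgres://user:pass@db-host/concourse \
    --external-url http://ci.example.com:8080

# worker registers with the TSA over SSH and provides the container runtime
$ concourse worker \
    --work-dir /opt/concourse/worker \
    --tsa-host ci.example.com \
    --tsa-public-key tsa_host_key.pub \
    --tsa-worker-private-key worker_key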

I will not go any deeper into the architecture; instead, let me show a very simple pipeline that puts together the three concepts I mentioned previously: Tasks, Resources and Jobs.

Below is a sample pipeline which takes a sample app from GitHub, builds it and stores the produced jar file in an Artifactory server. If it looks a bit overwhelming at first glance, please bear with me as I explain what exactly is happening.

---
resource_types:
- name: artifactory
  type: docker-image
  source:
    repository: pivotalservices/artifactory-resource

resources:
- name: source
  type: git
  source:
    uri: https://github.com/<org>/<repo>.git   # URL of the repository to build
    branch: git-hub-branch
    username: github_username
    password: github_password
- name: version
  type: semver
  source:
    driver: git
    uri: https://github.com/<org>/<repo>.git   # repository holding the version file
    branch: version
    file: ci/concourse/version
    username: github_username
    password: github_password
- name: build
  type: artifactory
  source:
    endpoint: https://artifactory.example.com/artifactory   # base URL of the Artifactory server
    repository: "/app-name/jarfiles"
    regex: "app-name-(?<version>.*).jar"
    username: artifactory_user
    password: artifactory_password
    skip_ssl_verification: true

jobs:
- name: build-and-upload
  plan:
  - aggregate:
    - get: source
      trigger: true
    - get: version
      params: { bump: minor }
  - task: build-app
    config:
      platform: linux
      inputs:
      - name: source
      - name: version
      outputs:
      - name: build-output
      image_resource:
        type: docker-image
        source:
          repository: maven
          tag: "3.5.2-jdk-8"
      run:
        path: sh
        args:
          - -exc
          - |
            # read the bumped semver produced by the version resource
            version=$(head -n 1 ./version/number)
            cd source
            mvn package
            cd ..
            # rename the jar with the version and drop it in the task's output folder
            mv ./source/target/app-name-*.jar "./build-output/app-name-${version}.jar"
  - put: build
    params:
      file: ./build-output/app-name-*.jar
  - put: version
    params: { file: version/number }      

There are three main sections in the above pipeline.

resource_types: This section is only required if you are using a resource that is not built in. In this case, I am declaring the Artifactory resource; the declaration basically tells Concourse which Docker image implements the resource.

resources: This section provides the configuration for each resource we intend to use in our pipeline. Here I am configuring a git resource to fetch our source code, a semver resource to track the version number and an artifactory resource to store jar files.
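A quick aside on the credentials in that section: rather than hard-coding them, you can use Concourse's {{...}} template variables in the pipeline file and supply the real values from a separate file when you set the pipeline (the file name params.yml is just a convention):

    username: {{github-username}}
    password: {{github-password}}

$ fly -t <target> set-pipeline -p sample -c pipeline.yml -l params.yml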

jobs: This section contains all the jobs that make up our pipeline. In this case, we only have one job, named "build-and-upload". A job's plan is in turn made up of three kinds of steps: get, put and task. Get steps are used for input resources, put steps for output resources, and the task step contains the actual actions we want to perform. In this case, I have two inputs, source and version. I also have two outputs: build to store the jar file and version to store the updated version number. The task step contains the script we want to run and the Docker image to run it in; I am using a Maven image. At run time, Concourse automatically mounts each input resource into the container under the folder name given, and similarly creates a folder for each declared output.
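For the job above, the task's working directory inside the container would therefore look roughly like this:

.
├── source/         # input: the git resource, i.e. the application code
├── version/        # input: the semver resource (contains the file "number")
└── build-output/   # output: created empty, filled by the task script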

There is still one important question begging for an answer: how do you run this pipeline, given that Concourse does not provide any UI to create pipelines? This is where one more component, the Fly CLI, comes to the rescue. Fly is used to manage your Concourse environment, define pipelines, trigger jobs etc.

$ fly --target <give-target-name> login --concourse-url http://localhost:8080 --username <username>    # sets a target name for this environment and logs in
$ fly -t <given-target-name> set-pipeline --pipeline sample --config pipeline.yml    # creates the pipeline; the same command updates it if you change the code
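Two more commands you will almost certainly need, since a newly set pipeline starts out paused (the job name matches the pipeline above):

$ fly -t <given-target-name> unpause-pipeline --pipeline sample    # activate the pipeline
$ fly -t <given-target-name> trigger-job --job sample/build-and-upload    # kick off the job manually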

Once the pipeline is defined, you can navigate to the Concourse URL and see the pipeline rendered in the web UI, as shown in the image below, where I have annotated the basic elements of the Concourse UI.



This is all from my side on Concourse. Please provide your feedback on this blog. 

You can write to me at kpsingh.chouhan@gmail.com. My Twitter handle is @ChouhanKP84. This is my first blog post. I intend to write more on DevOps, DevOps-related tools and Cloud Foundry. Until next time, goodbye.


