Midpath | CI at nano and massive scale

Why and how we built Jaypore CI, and how we pushed it to 50k jobs a month on mid-range hardware.

About

One of the most fundamental abilities in software development is automating parts of your work. This area usually specializes into pre-commit hooks, DevOps, IDE extensions, and, most often, CI/CD. In this article we focus on CI/CD: we explore the available options, compare them thoroughly, and explain why we eventually built and adopted Jaypore CI at our company.

Existing CI/CD

Overview

CI/CD has come a long way, from early, simple systems that were barely recognizable as CI/CD to heavily specialized systems that run workloads across thousands of machines.

Overall, three types of systems seem to have emerged:

Server-Runner : A central server coordinates tasks among a fleet of workers. Example: Jenkins.
Container-Step : Treats every step of the pipeline as an isolated container execution. Example: GitHub Actions.
Git-Centered : Shifts the focus from "pushing" code to "reconciling" the state of a cluster with a declarative configuration stored in Git. Example: Argo CD.

Comparisons

Today a large number of CI/CD systems are available, and each one has its own set of things that it does well.

GitHub Actions | SaaS | YAML | Container Runners | GitHub Native

Native CI/CD workflows tightly integrated with GitHub repositories. Provides Linux, Windows, and macOS runners (cloud or self-hosted) and uses YAML-based pipeline definitions.

GitLab CI/CD | SaaS / On-Prem | YAML | Kubernetes Runners | DevSecOps

Built-in CI/CD for GitLab (hosted or self-hosted). Supports YAML pipelines, parallel jobs with Docker/Kubernetes runners, and integrates DevSecOps features like built-in security scanning.

Jenkins | Self-Hosted | Plugin Ecosystem | Agent-Based | Open Source

Open-source automation server with an extensible plugin ecosystem and “pipeline-as-code” support. Runs on self-managed servers or distributed agents, allowing flexible build/test/deploy pipelines.

Jenkins X | Cloud-Native | Kubernetes | GitOps | Tekton-based

Kubernetes-native CI/CD system built on Jenkins and Tekton. Facilitates GitOps-style workflows and multi-cluster pipelines for cloud-native deployments.

CircleCI | SaaS / Hybrid | YAML | Parallel Builds | Docker First

Cloud-native CI platform (cloud or self-hosted) with YAML config. Focuses on fast containerized builds with built-in parallelism and caching for high-performance pipelines.

Travis CI | SaaS | Multi-OS | YAML | Broad Language

Hosted CI service supporting Linux, Windows, and macOS builds. Uses YAML for pipeline definitions and offers broad programming language support, including build matrices across languages and environments.

Bitbucket Pipelines | SaaS | YAML | Docker Containers | Atlassian Native

CI/CD integrated into Bitbucket Cloud, defined in YAML. Runs your builds in Docker containers on Atlassian’s cloud (or hybrid runners) and is tightly integrated with Jira/Bitbucket workflows.

Azure Pipelines | SaaS / On-Prem | YAML | Multi-Platform | Cloud Agents

Microsoft’s CI/CD service (cloud-hosted agents) supporting any language/platform (Windows, Linux, macOS). Integrates deeply with GitHub and Azure DevOps, allowing parallel pipelines and containerized jobs.

AWS CodePipeline | SaaS | Declarative | AWS Ecosystem | CD Service

AWS’s cloud-hosted CD service that models release pipelines with stages (via console/CLI or JSON). Automates build, test, and deploy steps on code changes, integrating with AWS services or custom plugins.

AWS CodeBuild | SaaS | Docker Containers | Managed Build | Elastic Scaling

AWS’s fully-managed build service. Each build runs in a fresh container (based on curated or custom Docker images), with pay-as-you-go pricing. Eliminates the need to maintain self-hosted build servers.

TeamCity | On-Prem / SaaS | Kotlin DSL | Parallel Builds | Commercial

JetBrains’ CI server (on-prem or cloud). Supports Kotlin-based pipeline definitions (DSL), parallel build agents, and tight integration with IntelliJ and other JetBrains tools.

Bamboo | Self-Hosted | Jira Integration | Docker Agents | Enterprise

Atlassian’s CI/CD server (Data Center edition for teams). Deeply integrates with Jira and Bitbucket, supports multi-stage build/deployment plans, Docker build agents, and rich deployment projects.

Buddy | SaaS | Visual UI | Container Support | Integrations

Cloud-based CI/CD with an intuitive GUI and pipelines-as-a-flowchart. Automates all development stages, integrates with GitHub/Bitbucket/Docker/AWS/Azure/etc., and has first-class Docker/Kubernetes support.

Codeship | SaaS | Docker | Auto-Scaling | CloudBees

Cloud CI/CD by CloudBees. Uses Docker (or SSH) for build steps and auto-scales workers on demand. Suitable for teams that want managed Docker-based pipelines (with commercial support).

Semaphore | SaaS | YAML | Parallelism | Caching

Hosted CI/CD service focused on speed. Uses YAML pipeline definitions, native Docker layer caching, and intelligent parallelism to accelerate builds and tests.

Harness | CD Platform | ML Verified | Feature Flags | Enterprise

Enterprise-grade Continuous Delivery platform. Provides automated canary analysis and ML-based verification of deployments, along with integrated feature flag management (and has its own CI module).

Spinnaker | Open Source | Multi-Cloud | Pipeline Orchestration | GitOps

Open-source CD platform (originally by Netflix) supporting deployments to many clouds. Has a powerful pipeline system with built-in deployment strategies (blue/green, canary, etc.).

Argo Workflows | Open Source | Kubernetes | Container-Native | DAG Workflows

Kubernetes-native open-source workflow engine. Allows defining parallel containerized jobs as a DAG or linear steps. No separate servers needed, runs entirely on Kubernetes.

Argo CD | GitOps | Declarative | Kubernetes | Open Source

Kubernetes GitOps deployment tool. Watches Git repos and ensures Kubernetes clusters’ live state matches declarative manifests (supports YAML, Helm, Kustomize).

Flux | GitOps | Kubernetes | Operator | Open Source

Open-source GitOps operator for Kubernetes. Continuously deploys applications by reconciling cluster state to Git configurations (e.g. YAML/Helm charts) and supports custom resources.

Tekton Pipelines | Cloud Native | Kubernetes CRDs | Declarative | Open Source

Open-source Kubernetes-based CI/CD framework (CD Foundation project). Provides pipeline, task, and trigger CRDs to define build/test/deploy workflows natively on K8s (often used by Tekton-based platforms).

OpenShift Pipelines | Cloud Native | Tekton-based | Kubernetes | Red Hat

Red Hat’s distribution of Tekton for CI/CD on OpenShift. Provides the same K8s-native pipelines and tasks as Tekton, but integrated into the OpenShift ecosystem (with Red Hat support).

Codefresh | Cloud Native | Kubernetes | Argo-based | GitOps

Kubernetes-native CI/CD platform built on Argo. Focuses on microservices delivery with integrated Docker registry, supports GitOps workflows (can be triggered from GitHub Actions, Jenkins, etc.).

GoCD | Continuous Delivery | Open Source | Value Stream | Pipeline Graph

Open-source continuous delivery server. Emphasizes visualization of pipelines and value stream mapping (shows whole pipeline topology) to help optimize delivery processes.

Concourse CI | CI/CD Platform | Container-based | YAML | Open Source

Open-source CI system with a “concourse pipeline” model (resources, jobs, tasks). Each job runs in its own container, with strictly declarative YAML configs for reproducibility.

Drone CI | Docker-Native | Pipeline as Code | Any SCM | Open Source

Self-service CI (now part of Harness). Uses a simple YAML file for pipelines. Every step runs in an isolated Docker container (supporting any language), and it integrates with GitHub, GitLab, Bitbucket, etc.

Woodpecker CI | Docker-Native | Open Source Fork | Self-Hosted | YAML

Community-driven open-source fork of Drone CI (v0.8). Maintains the same Docker-based pipeline model (each step in a container) and uses a `.woodpecker.yml` for configuration.

Buildkite | Hybrid | Self-Hosted Agents | Cloud Orchestration | Pipeline as Code

Hybrid CI/CD service: orchestration in cloud (SaaS) with builds running on your own agents. Provides YAML pipelines, full control over build environments, and can scale via cloud or Kubernetes runners.

Bitrise | Mobile CI/CD | iOS/Android | Visual Pipelines | Fully Managed

CI/CD platform for mobile app development. Automates builds/tests/deploy for iOS, Android, React Native, etc., with a visual editor and pre-configured steps. Provides managed cloud build infrastructure optimized for mobile.

AppVeyor | SaaS | Windows/Linux/macOS | YAML | Free Tier

Continuous Integration service targeting Windows (including .NET), but also supports Linux and macOS. Configurable via YAML or UI, it provides fast build VMs with admin access and integrates with GitHub, Bitbucket, etc.

DeployBot | Deployment | SaaS | Docker Support | Multi-Server

Automated deployment service. Pulls code from Git (GitHub/Bitbucket/GitLab), builds or runs code in Docker or on its servers, and deploys to remote servers via SSH/SCP. Supports multi-server and multi-branch deployments concurrently.

Kapstan | Internal Platform | One-Click | GitOps | Enterprise

Internal developer platform that automates end-to-end CI/CD with a one-click deployment model (often used in enterprises to encapsulate pipeline complexity). Integrates code to deployment with minimal manual steps.

Act (nektos/act) | Local Runner | CLI Tool | Docker | Open Source

Local execution tool for GitHub Actions. Allows running GitHub Actions workflows locally in Docker containers (emulates GitHub Actions runners on your machine).

DSCI (Dead Simple CI) | CI Framework | Code-Based | YAML Alternative | GitOps Friendly

Developer-centric CI framework (open source). Uses code (Go/Python) to define pipelines instead of YAML. Emphasizes simplicity and programmability in CI configuration.

Terraform | Infrastructure as Code | Declarative | Self-Hosted | Provisioning

Open-source IaC tool for provisioning infrastructure. Often used in CI/CD pipelines to define and create cloud resources (AWS, Azure, GCP, etc.) as part of deployment workflows.

Pulumi | Infrastructure as Code | General Languages | Self-Hosted | Cloud Native

Cloud-native IaC platform using real programming languages (TypeScript, Python, Go, etc.). Integrates with CI pipelines to provision and manage infrastructure in code-centric workflows.

Ansible | Automation/Config | Agentless | YAML | Provisioning

Agentless automation tool (configuration management). Uses YAML “playbooks” to provision servers, configure environments, and deploy applications as part of CI/CD processes.

AWS CloudFormation | Infrastructure as Code | Declarative | JSON/YAML | AWS Native

AWS’s service for IaC. Manages AWS resources via declarative JSON or YAML templates. Commonly used in CI pipelines to set up infrastructure (VPCs, databases, etc.) before or during deployments.

Crossplane | Infrastructure as Code | Kubernetes | Declarative | Cross-Cloud

Kubernetes add-on to provision and manage cloud infrastructure via K8s manifests. Treats cloud services as custom resources (e.g., databases, S3, etc.) that can be included in Kubernetes-native CI/CD workflows.

Docker | Containerization | Runtime | OCI | DevOps

Container platform (runtime and tools). Widely used in CI/CD to create consistent, reproducible build and test environments across machines.

Kubernetes | Orchestration | Scaling | Cloud-Native | Auto-Deployment

Container orchestration system. Frequently used as the execution environment for CI/CD (running build agents or deploying releases), enabling automated scaling and management of container workloads.

Helm | Package Manager | Kubernetes | YAML | Templating

Kubernetes package manager. Used in CI/CD to template and deploy Kubernetes applications (charts) as part of automated release pipelines.

Kustomize | Config Management | Native | Kubernetes | Overlay

Built-in Kubernetes tool for customizing YAML manifests. Often used in CI pipelines to apply environment-specific patches/overlays without templates.

Skaffold | Development Tool | Automation | Kubernetes | Hot Reload

CLI tool that automates the build/test/deploy loop for Kubernetes applications (often used in local development). Detects code changes, rebuilds images, and redeploys to local or remote clusters.

Buildah | Containerization | CLI Tool | OCI | Daemonless

Tool to build OCI-compliant container images. Unlike Docker, Buildah can run rootless and without a daemon. Used in CI pipelines for lightweight image builds.

Capistrano | Deployment | Ruby-based | SSH | Manual

Remote deployment tool (written in Ruby). Automates executing commands on servers via SSH (common in Rails projects). Can be used within CI to coordinate deployments to web servers.

Gradle | Build Tool | JVM | Groovy/Kotlin | Config as Code

Build automation tool (JVM-based). Commonly used in CI for compiling and packaging Java/Groovy projects. Supports custom tasks in Groovy or Kotlin DSL, and integrates with many CI systems.

Maven | Build Tool | Java | XML | Dependency Mgmt

Java build and dependency management tool. Used in CI pipelines to compile, test, and package Java applications, with pom.xml as the declarative config.

Dagger | Container DAG | Code-Based | Local Runner | Multi-Language

Programmable CI/CD engine. Pipelines are defined in code (Go, Python, JS) and executed as container DAGs. Dagger runs each step in isolated containers on your local machine or CI, bridging local and remote environments.

Earthly | Container-Native | Makefile-Like | Parallel Builds | Caching

CI/CD framework that runs all build targets inside containers. Uses an “Earthfile” (Dockerfile+Makefile style syntax) and supports any language/toolchain. Features automatic parallel execution and caching for fast, repeatable builds across machines.

GNU Make | Build Tool | Makefile | Local Runner | Open Source

Classic build automation tool using Makefiles. While simple, Make can orchestrate CI tasks in scripts and is often used as a lightweight local CI solution for compiling and testing code.

Task (Taskfile) | Task Runner | YAML | Local Runner | Go-Based

Go-based task runner using a `Taskfile.yml`. Simplifies build and automation scripts with straightforward YAML syntax, acting as a modern replacement for complex shell scripts.

Just | Command Runner | Makefile | Rust | Local Runner

Task runner written in Rust (“Justfile”). Behaves like a Makefile for humans – define recipes to automate commands. Often used by developers for simple CI tasks on local machines.

Mage | Build Tool | Code-Based | Go | Local Runner

Make-like build tool for Go projects. Build scripts are written as Go code (plain functions), providing strongly-typed tasks and easier logic than shell-based Makefiles.

Bazel | Open Source | Multi-Language | Monorepo | Caching

Open-source build system by Google. Fast incremental builds via advanced caching and parallelism, supporting many languages on large codebases. Scales to huge monorepos for reliable CI builds.

What we need in a CI system

A single system cannot meet every need. Hence we lay down a few requirements that any CI system must satisfy before we can adopt it.

Minimal repo footprint

Our workflow involves managing 100-120 active git repositories at any given time, a roster that shifts regularly. Many of these codebases are outside our direct control and come pre-configured with the original developers' CI systems, such as Travis CI on GitHub or GitLab’s built-in CI. Often, we cannot access these original platforms due to expired customer licenses, platform drift, or lack of maintenance.

Consequently, we require a CI/CD solution that can be inserted, utilized, and removed from customer codebases with negligible friction, allowing future developers to easily incorporate our logic into their own systems if they deem necessary.

Scale to Zero Cost

Efficiency is paramount. We need the ability to drop the cost of a project to zero once it is complete. With most CI systems this is not possible. Most of the time, the nearest option is a free tier that forces compromises such as making the code public or running on public infrastructure.

With free tiers, the cost is also never actually zero. A few common culprits are time spent handling backward-incompatible changes, changing code because a new free tier no longer supports your volume of usage, and rotating secrets after an incident on the platform.

The operational overhead for the CI/CD system itself should ideally not exceed $5. Beyond that, cost should be driven by the usage of individual projects: while specific jobs may occasionally require high-end hardware like GPUs, the core CI system should remain inexpensive. This mandate extends to long-term storage of logs and artifacts, handling high concurrency, and the cost of transfer when we share projects with other people.

Local, Offline Development

To accommodate team members in remote areas and prevent divergence between local testing/integration and CI, the system must be capable of running entirely offline. We want to avoid behemoth server requirements and complex setup steps unless the codebase itself mandates them. It is a hard requirement that the CI/CD logic remains debuggable and REPL-capable directly on a developer’s local machine.

Seamless Sharing

Collaboration should be simple. We require the ability to share CI/CD results regardless of the git host, be it GitHub, GitLab, Bitbucket, or SourceHut. Access should be tied simply to the repository itself, and viewing results locally must be a straightforward, painless experience for all users.

It should also be possible for people to organize and present CI results in whatever fashion they want. For example, a web dev project might want to present UI snapshots, a kernel patch might want to present a rigorous matrix of hardware tests, and an ML project might want to show tensorlog outputs.

No Vendor Lock-in

We must protect our long-term automation investments from platform volatility, in line with our philosophy of building software that will last a few decades. Having previously been forced to migrate away from Travis, GitHub, GitLab CI, and Drone CI, we have felt this pain firsthand.

The reason can be anything from licensing and free-tier changes to the project shutting down. We therefore require a system that keeps our automations usable for a decade or more, independent of any single provider's business model.

Powerfully Extensible

Since we want to run Jaypore CI (JCI) workloads for the next 50 years, it must be easy to extend without forcing changes on the core JCI setup.

For instance, running jobs on bare machines, via Docker, via Podman, in lightweight VMs, on cloud spot instances, on remote machines, on Windows machines from Linux machines, on Mac Mini devices, or on old mobile phones is all trivial today with JCI.

Tomorrow, if we want to run jobs on, say, WASM runtimes running in a customer's browser (for whatever reason), the core of JCI would not need any change. This is in direct contrast with most systems, which would have to add support themselves before we could use it.
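One way to picture this kind of extensibility: if the core only needs to launch a command, then supporting a new execution target amounts to wrapping that command differently. The sketch below is a hypothetical illustration and not part of Jaypore CI. The function name `wrap_command`, the backend names, and the `target` parameter are all assumptions; only the well-known `docker run`, `podman run`, and `ssh` invocations are used.

```python
import shlex


def wrap_command(backend: str, command: str, target: str = "") -> list[str]:
    """Return the argv that would run `command` on the chosen backend.

    `backend` and `target` are hypothetical names used for illustration.
    `target` is an image name for container backends and a host for ssh.
    """
    if backend == "local":
        return ["sh", "-c", command]
    if backend in ("docker", "podman"):
        # Both CLIs accept the same `run --rm IMAGE CMD...` shape.
        return [backend, "run", "--rm", target, "sh", "-c", command]
    if backend == "ssh":
        # ssh joins its remaining arguments into one remote command line,
        # so the inner command is quoted to survive the remote shell.
        return ["ssh", target, "sh -c " + shlex.quote(command)]
    raise ValueError(f"unknown backend: {backend}")
```

Adding a new target in this style means adding one more branch (or one more wrapper script) next to the job, while whatever launches the job stays unchanged.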

Secret management

Secret management ties together the code, the environments that the code runs in, and the security surface of the product. Improving the security posture of applications is no easy task, especially when we don't have access to secrets management in the customer's choice of CI platform.

For example, if a customer uses a token to deploy to Netlify / Cloudflare, and we want to rotate that token for the prod/test/ci/dev/staging environments at the same time, then the customer needs to do it for us (if they know how), or we need to do it on a call with them.

Hence, secret management in whatever CI system we choose must be linked to the repo instead of living in a parallel system that follows its own rules. With a parallel system, it is quite common to see cases where the one person who managed it left the customer's company and, six months later, some secret X in Drone CI can no longer be accessed.

Secrets must be versioned, rotatable, and accessible just in time (JIT), and should not be exposed on disk anywhere if possible. Especially in a CI system where the runners might be untrusted and the logs might be viewable by anyone, secrets must be protected at all costs.
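To make two of these requirements concrete (no secrets on disk, and logs that may be viewable by anyone), here is a minimal sketch. It is not how Jaypore CI handles secrets; `run_with_secrets` and the `DEPLOY_TOKEN` name are hypothetical. The secret is handed to the job only through its process environment, and its value is replaced in the captured output before that output is stored or shown.

```python
import os
import subprocess
import sys


def run_with_secrets(argv, secrets):
    """Run argv with `secrets` injected as environment variables.

    Returns (exit_code, output) where every secret value found in the
    combined stdout/stderr has been replaced with "***".
    """
    env = {**os.environ, **secrets}  # secrets live in memory only
    proc = subprocess.run(
        argv,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    output = proc.stdout
    for value in secrets.values():
        if value:  # replacing an empty string would corrupt the log
            output = output.replace(value, "***")
    return proc.returncode, output


# Demo job that (carelessly) prints its token.
exit_code, log = run_with_secrets(
    [sys.executable, "-c", "import os; print('token=' + os.environ['DEPLOY_TOKEN'])"],
    {"DEPLOY_TOKEN": "s3cr3t-value"},
)
print(log)
```

Plain substring replacement is best-effort only: a job that prints the secret base64-encoded or split across lines would still leak it, so redaction complements untrusted-runner isolation rather than replacing it.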

Why not X?

We tried a large number of things over the years, and found something missing in each of them. Here are a few that we remember clearly:

CI/CD Platform	Key Issues & Pain Points
GitHub Actions	Frequent downtime; high maintenance requirements due to insecure Docker images in codebases.
CircleCI / Travis CI	Prohibitively high costs relative to the volume of jobs required.
Jenkins	High runtime costs and significant friction during integration.
GitLab CI	High runtime costs; runners frequently stall/choke when scaling to 100-200 containers on a single machine.
Gitea Actions	System was flaky during testing; license changes reduced long-term confidence in the product.
Drone CI	Ballooning storage costs and unexpected licensing changes.
Woodpecker CI	Failed setup phase due to gRPC protocol issues between the server and runners.

Jaypore CI

v0.x.x

The first version of Jaypore CI was built in Python, had a configuration system that allowed us to write pipeline configs in Python, and posted CI results to the PR on Gitea in neat little text-status blocks like this one:

Reviewed-on: https://gitea.midpathsoftware.com/midpath/jaypore_ci/pulls/78
╔ 🔴 : JayporeCI       [sha 15fd3e2]
┏━ build-and-test
┃
┃ 🟢 : JciEnv          [7589e605]   0:12
┃ 🟢 : Jci             [742ecb75]   0:18            ❮-- ['JciEnv']
┃ 🟢 : black           [68992f99]   0: 0            ❮-- ['JciEnv']
┃ 🟢 : install-test    [717173b8]   0: 0            ❮-- ['JciEnv']
┃ 🟢 : pylint          [642eb24b]   0:10            ❮-- ['JciEnv']
┃ 🟢 : pytest          [6a8e4f34]   0:28 Cov: 89%   ❮-- ['JciEnv']
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
┏━ Publish
┃
┃ 🟢 : DockerHubJci    [eaae4d95]   0:52
┃ 🟢 : DockerHubJcienv [faa653f9]   0:52
┃ 🟢 : PublishDocs     [0fbaeb6e]   0:39
┃ 🔴 : PublishPypi     [0a9dae01]   0: 4 v0.2.30
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

This kind of pipeline output was created by a config written in Python like this one:

from jaypore_ci import jci

with jci.Pipeline() as p:
    jcienv = f"jcienv:{p.repo.sha}"
    with p.stage("build_and_test"):
        p.job("JciEnv", f"docker build  --target jcienv -t jcienv:{p.repo.sha} .")
        p.job(
            "Jci",
            f"docker build  --target jci -t jci:{p.repo.sha} .",
            depends_on=["JciEnv"],
        )
        kwargs = dict(image=jcienv, depends_on=["JciEnv"])
        p.job("black", "python3 -m black --check .", **kwargs)
        p.job("pylint", "python3 -m pylint jaypore_ci/ tests/", **kwargs)
        p.job("pytest", "bash cicd/run_tests.sh", image=jcienv, depends_on=["JciEnv"])
        p.job(
            "install_test",
            "bash cicd/test_installation.sh",
            image=jcienv,
            depends_on=["JciEnv"],
        )

    with p.stage("Publish", image=jcienv):
        p.job("DockerHubJcienv", "bash cicd/build_and_push_docker.sh jcienv")
        p.job("DockerHubJci", "bash cicd/build_and_push_docker.sh jci")
        p.job(
            "PublishDocs",
            f"bash cicd/build_and_publish_docs.sh {p.remote.branch}",
        )
        p.job("PublishPypi", "bash cicd/build_and_push_pypi.sh")
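
The `depends_on` arguments in the config above describe a dependency graph between jobs. How Jaypore CI scheduled that graph internally is not covered here; the following is only a sketch, using Python's standard `graphlib` module, of how such a graph can be turned into batches of jobs that are safe to run in parallel. The `deps` mapping is copied by hand from the first stage of the config, and `parallel_batches` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# job name -> jobs it depends on (from the "build_and_test" stage above)
deps = {
    "JciEnv": [],
    "Jci": ["JciEnv"],
    "black": ["JciEnv"],
    "pylint": ["JciEnv"],
    "pytest": ["JciEnv"],
    "install_test": ["JciEnv"],
}


def parallel_batches(deps):
    """Group jobs into batches; every job in a batch has all of its
    dependencies satisfied by earlier batches."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(deps)
print(batches)
```

For this config the result is two batches: `JciEnv` on its own, followed by the five jobs that depend on it.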

This first version only ran things inside Docker containers, which was great for avoiding "works on my machine" problems. Eventually, the stress that this system underwent exposed two major pain points.

Python : Configuring Rust/Go/PHP/.NET projects in Python was not intuitive for most people. The hard dependency on having Python installed on the machine was not easily satisfied in the Windows world.
Docker : Not all projects need Docker; some need Podman and others need full VMs for testing. Some pipelines need entire clusters to be set up before running tests.

v1.x.x

Eventually we came up with a new version of Jaypore CI, which brought quite a few changes and some backward-incompatible breaks.

Built in Go, not Python. This gave us flexibility of deployment since we only had to ship a single static binary to get things running.

Executables instead of DSL config for configuring the pipeline. This means that we can program the pipeline in any language we want. Frequently it is done in bash, but we also use PowerShell, Python, Go, and whatever else is being used in the respective project.

Overall, the new Jaypore CI is rugged, stable, extensible, powerful, and minimal, all in a tiny executable.

We retained the idea of storing CI results directly in git and in fact extended it so that the CI browser that ships with the binary can present the results however the repo wants. The repo can export a tiny static webapp and it will be rendered as the CI results, in the CI browser.
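As an illustration of what "a tiny static webapp" could look like at its simplest, the sketch below builds a single self-contained HTML page from a list of job results. Everything here is hypothetical: the `render_results` name, the tuple layout, and the markup are assumptions, and where the page must be written so that the CI browser picks it up is not specified in this article, so the sketch stops at producing the HTML string.

```python
from html import escape


def render_results(title, jobs):
    """Build one HTML page from (job_name, passed, seconds) tuples."""
    rows = []
    for name, passed, seconds in jobs:
        status = "pass" if passed else "fail"
        rows.append(
            f'<tr class="{status}"><td>{escape(name)}</td>'
            f"<td>{status}</td><td>{seconds:.0f}s</td></tr>"
        )
    return (
        '<!doctype html><html><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head><body>"
        f"<h1>{escape(title)}</h1>"
        f"<table>{''.join(rows)}</table></body></html>"
    )


page = render_results(
    "build-and-test",
    [("pytest", True, 28), ("PublishPypi", False, 4)],
)
```

A project with richer needs (UI snapshots, hardware matrices) would replace the table with its own markup; the point is only that the presentation is produced by the repo rather than dictated by the CI system.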

Pipeline status can also be posted to Gitea / GitLab / GitHub etc. without too much fanfare.

With this kind of a setup we can now write our pipelines in any language we want! For example here's one in simple bash:

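#!/bin/bash
# Pipeline: build the site image, then publish the site from inside it.

set -o errexit   # stop at the first failing command
set -o nounset   # treat unset variables as errors
set -o pipefail  # a pipe fails if any command in it fails

main(){
    # Build the image from the PWA Dockerfile, with the repo root as context.
    docker build -t mp -f pwa/Dockerfile .
    # Mount the repo at /app. An absolute host path is used because older
    # Docker versions reject relative bind-mount sources such as "./".
    docker run --rm -v "$PWD:/app" mp bash scripts/publish_site.sh
}
(main)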
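
Since the pipeline is just an executable, the same idea carries over to other languages. The Python sketch below relies on an assumption the article does not spell out: that a non-zero exit code is what marks a run as failed (which is what `set -o errexit` implies for the bash version). The `run_steps` helper is hypothetical and not a Jaypore CI API, and the demo uses harmless stand-in commands instead of the `docker` calls so that it can be run anywhere.

```python
#!/usr/bin/env python3
import subprocess
import sys


def run_steps(steps):
    """Run each argv in order; stop at the first failure and return its
    exit code, or 0 if every step succeeded."""
    for argv in steps:
        print("+", " ".join(argv), flush=True)  # echo the step, like `set -x`
        result = subprocess.run(argv)
        if result.returncode != 0:
            return result.returncode
    return 0


# Stand-in steps: the second one fails, so the third never runs.
demo_steps = [
    [sys.executable, "-c", "print('build ok')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "print('never runs')"],
]
exit_code = run_steps(demo_steps)
print("pipeline exit code:", exit_code)
```

A real pipeline script in this style would list the two `docker` commands from the bash example as its steps and finish with `sys.exit(run_steps(steps))` so that the failure propagates to whatever launched it.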

Conclusion

Go try it out at JayporeCI.in! If you have questions, let's talk on the Help Desk.
