Shipped CI from 19 to 4 minutes.
DevEx for a 40-engineer team. We rebuilt the pipeline on Buildkite + cached layers and ergonomic local-to-prod parity.
What we walked into.
Engineers waited 19 minutes for CI to pass before merging. With 40 engineers and ~80 PRs/day, that was hours of compounded waiting. Worse, slow CI led to large PRs, which led to more conflicts, which led to more reruns.
Existing CI was GitHub Actions with no caching strategy. Every PR built the world from scratch. Docker layer caches were not configured. Test parallelism was minimal.
How we shipped it.
Measure first, then change
Two weeks of CI metric collection. Identified the three slowest jobs (dependency install, type-check, integration tests) eating 70% of pipeline time. Optimised those first.
Migration to Buildkite for parallelism
Moved to Buildkite with self-hosted agents in EKS. Same code, same test commands. But with proper agent pools and parallelism per workload. Cost neutral to GitHub Actions.
Aggressive cache layering
Restructured Dockerfiles for cache-friendly layer ordering. Added S3-backed bazel cache for Go services. Cached node_modules and pip wheels at the agent level. Cache hit rate went from 22% to 78%.
Local-to-prod parity
Engineers can now run the exact same CI pipeline locally via `bk-local-run`. Catches 80% of CI failures before they hit the queue.
AWS services in this engagement
What shipped.
p50 CI run from 19 to 4 minutes. p99 from 41 to 9 minutes. PR review cycle time dropped from 18 hours to 4 hours.
Engineering team now deploys 12+ times per day, up from 1–2 before the engagement. No incidents traced to the increased deploy cadence.
Hardened 47 IAM roles in 11 days.
Series B fintech needed audit-ready IAM before SOC 2. We refactored every role into least-privilege Terraform modules.
PlatformCut cloud spend 38% in 6 weeks.
Multi-region platform was burning compute. We re-sized the fleet, moved to Graviton, and added a savings plan model that paid back in 21 days.
ArchitectureMigrated from Heroku to a cloud platform in 9 weeks.
B2B platform hit Heroku ceilings. We landed them on ECS with managed Postgres, zero downtime, and a runbook the team actually uses.
Tell us what you're trying to ship.
A 30-minute scoping call with the engineers who would do the work.