Mohammad Gufran Jahangir · February 22, 2026

CI/CD sounds simple until you’re responsible for a production system and realize:

  • A “successful build” doesn’t mean the app is safe to ship
  • “Deploy” isn’t the end — verification and rollback readiness matter more
  • One bad release can burn hours of on-call time if rollback isn’t engineered in

This guide gives you a reference CI/CD pipeline you can adopt for most cloud apps (containers + Kubernetes or similar). It’s beginner-friendly, but it’s also how mature teams structure reliable delivery.

By the end, you’ll have a pipeline that:

  • Builds reproducible artifacts
  • Scans code + dependencies + container + IaC
  • Deploys safely (staging → production)
  • Verifies with smoke tests and health checks
  • Rolls back fast with minimal panic

No fluff. Just steps, patterns, and examples.



The reference pipeline in one view

Here’s the shape we’re building (you’ll implement it step-by-step):

PR Pipeline (fast feedback)

  1. Lint + unit tests
  2. SAST + dependency scan + secret scan
  3. Build container (optional) + quick container scan
  4. Report results back to PR

Main/Release Pipeline (shipping)

  1. Build → version → tag
  2. Run full test suite
  3. Scan (code, deps, container, IaC)
  4. Create SBOM + sign artifacts
  5. Push image to registry
  6. Deploy to staging
  7. Smoke tests + optional integration tests
  8. Promote to production (progressive)
  9. Post-deploy verification + monitoring gates
  10. Rollback plan ready at every step

The “golden rules” (these prevent 90% of CI/CD pain)

1) Build once, deploy many

Never rebuild the same version in staging and production.
You build an immutable artifact (example: a container image with a unique tag), then promote it across environments.
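In practice, "promote" means re-tagging the already-built image, not rebuilding it. A minimal sketch, assuming Docker is available; the registry and image names are placeholders:

```shell
# promote.sh -- sketch of promoting one immutable artifact across environments.
# Assumes docker is installed and you are logged in to the registry.
promote_image() {
  src="$1"   # e.g. registry.example.com/app:sha-abc1234 (built once by CI)
  dst="$2"   # e.g. registry.example.com/app:1.6.0 (the release alias)
  docker pull "$src"         # fetch the exact artifact that passed earlier stages
  docker tag  "$src" "$dst"  # add the release tag -- no rebuild happens
  docker push "$dst"
}
```

The key point: `docker tag` just adds a name to an existing image, so the bytes deployed to production are the same bytes that ran in staging.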

2) Everything is versioned

  • App version
  • Container image tag
  • Helm chart / manifest version
  • Database migration version (if applicable)

If you can’t name what’s running, you can’t roll it back confidently.

3) Security gates are part of the pipeline, not a meeting

Scanning must run automatically. If it’s manual, it won’t happen consistently.

4) Rollback is a feature, not a reaction

A mature pipeline always assumes: some release will fail.
So it makes rollback fast, safe, and boring.


What you need before building the pipeline

You can follow this guide with any CI system. The concepts stay the same.

Minimal prerequisites

  • A Git repo for your app
  • A Dockerfile (or build definition)
  • A container registry (private is fine)
  • A deployment target (Kubernetes recommended, but any environment works)
  • Secrets storage (CI secret store or cloud secret manager)
  • A way to run tests

Suggested repo structure (simple and scalable)

repo/
  app/                 # source code
  Dockerfile
  .ci/                 # pipeline scripts (optional)
  deploy/
    k8s/               # manifests or helm chart
      base/
      overlays/
        dev/
        stage/
        prod/
  scripts/
    smoke-test.sh
    migrate.sh

Step 1 — BUILD (fast, reproducible, traceable)

Your build stage should answer 3 questions:

  1. Can we reproduce this build later?
  2. Can we trace this build to a commit?
  3. Can we trust the artifact is the same in every environment?

Build checklist (what “good” looks like)

  • ✅ Deterministic dependencies (lock files pinned)
  • ✅ Version injected from Git commit or tag
  • ✅ Unit tests run (at least)
  • ✅ Container image built with a unique tag
  • ✅ Artifact stored (image pushed OR saved as build output)

A practical versioning scheme

Use something humans and machines can read:

  • 1.6.0 for releases (tags)
  • 1.6.0+sha-abc1234 for traceability
  • Container tag examples:
    • app:1.6.0
    • app:sha-abc1234
    • app:1.6.0-sha-abc1234

Rule: Production should deploy release tags or commit tags, not “latest”.

Example: container build commands (conceptual)

  • Build: docker build -t app:sha-abc1234 .
  • Test: run unit tests in CI
  • Push: docker push app:sha-abc1234
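Those conceptual commands can be wrapped in a small script so the tag is always derived the same way. A sketch; the image name `app` and the `sha-` prefix follow the examples above, and `git` and `docker` are assumed to be available in CI:

```shell
# build.sh -- build an immutable, traceable image tag from the current commit.
image_tag() {
  # image_tag <image> <short-sha> -> an "app:sha-abc1234"-style unique tag
  printf '%s:sha-%s\n' "$1" "$2"
}

build_and_push() {
  sha="$(git rev-parse --short HEAD)"   # trace the build to a commit
  tag="$(image_tag "app" "$sha")"
  docker build -t "$tag" .              # build once...
  docker push "$tag"                    # ...store the artifact
}
```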

Step 2 — SCAN (catch issues before they become incidents)

Scanning isn’t one tool. It’s coverage.

A strong pipeline scans 5 areas:

  1. Secrets scan (accidental keys in code)
  2. SAST (static code vulnerabilities)
  3. SCA / dependency scan (libraries you import)
  4. Container image scan (OS packages + known CVEs)
  5. IaC scan (Terraform/Kubernetes misconfigurations)
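As one concrete (and hedged) example, open-source tools such as gitleaks and trivy cover most of these areas. Exact flags vary by tool version, so treat this as a sketch, not a drop-in script:

```shell
# scan.sh -- sketch covering the scan areas with common open-source tools.
# Tool availability and exact flags are assumptions; check each tool's docs.
scan_all() {
  img="$1"                     # the image built earlier, e.g. app:sha-abc1234
  gitleaks detect --source .   # 1. secrets accidentally committed to the repo
  trivy fs .                   # 3. dependency (SCA) scan of the source tree
  trivy image "$img"           # 4. OS packages + known CVEs in the image
  trivy config deploy/         # 5. IaC / Kubernetes misconfigurations
  # 2. SAST is language-specific (e.g. semgrep, CodeQL) and is omitted here.
}
```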

The right way to gate scans (so dev velocity stays high)

PR gating (fast)

Block PRs only for:

  • leaked secrets
  • critical vulnerabilities with known exploitability
  • clearly unsafe IaC patterns (public buckets, open security groups, privileged pods)

Everything else becomes:

  • warnings
  • tickets
  • backlog items

Release gating (strict)

Before production, you enforce:

  • no critical vulnerabilities without exception approval
  • no embedded secrets
  • baseline security policies satisfied

Real example: a sensible vulnerability policy

  • Critical: block release (unless exception approved)
  • High: block if internet-facing + reachable, otherwise ticket
  • Medium/Low: ticket + fix in next sprint
  • Accepted risk: record with expiry date (don’t accept forever)

This keeps you secure and shipping.
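The policy above is simple enough to encode directly as a pipeline gate. A minimal sketch; the severity labels and the internet-facing flag are assumptions about what your scanner reports, and "reachable" is simplified away:

```shell
# gate.sh -- turn the vulnerability policy above into a pipeline decision.
gate_decision() {
  severity="$1"          # critical | high | medium | low
  internet_facing="$2"   # yes | no
  case "$severity" in
    critical) echo block ;;   # block unless an exception is approved
    high)
      if [ "$internet_facing" = yes ]; then echo block; else echo ticket; fi ;;
    *) echo ticket ;;         # medium/low: ticket, fix in a later sprint
  esac
}
```

For example, `gate_decision high yes` prints `block`, while `gate_decision high no` prints `ticket`.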


Step 3 — PACKAGE (SBOM + signing + provenance)

This is where many pipelines level up.

Why SBOM matters (engineer explanation)

An SBOM is simply: “What exactly is inside this artifact?”
If a library vulnerability drops tomorrow, you can answer:
“Are we affected? Which services? Which versions?”

What to generate and store

  • SBOM file (for the image/build)
  • Build metadata (commit, build ID, dependency lock hash)
  • Optional: signature for the image

Result: your pipeline produces artifacts you can audit and trust.
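Tools like syft (SBOM generation) and cosign (image signing) are commonly used for this stage. A sketch; both tools are assumed to be installed, flags may differ across versions, and the output path is a placeholder:

```shell
# package.sh -- sketch: generate an SBOM and optionally sign the image.
package_artifacts() {
  img="$1"
  syft "$img" -o spdx-json > sbom.json   # "what exactly is inside this artifact?"
  cosign sign "$img"                     # optional: attach a signature
}
```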


Step 4 — DEPLOY (staging first, then production)

Deploy should be boring and repeatable.

Environment strategy that works for most teams

  • dev: fast, flexible, may use mocks
  • stage: production-like, used for final verification
  • prod: controlled, progressive releases, strict gates

Promotion rule

Only deploy to production from:

  • a tagged release, or
  • an approved commit from main

No “random branch deploys to prod.”


The reference deployment flow (staging → prod)

Stage deployment

  1. Apply manifests / chart to staging
  2. Wait for rollout complete
  3. Run smoke tests
  4. Optional: integration tests
  5. Capture deployment report (versions, rollout time, test results)
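With the repo structure from earlier, the first three staging steps map to a few commands. A sketch assuming Kubernetes with kustomize overlays, a Deployment named `app`, and a staging URL placeholder:

```shell
# deploy-stage.sh -- sketch of steps 1-3 of the staging flow above.
deploy_stage() {
  kubectl apply -k deploy/k8s/overlays/stage           # 1. apply manifests
  kubectl rollout status deployment/app --timeout=5m   # 2. wait for rollout
  BASE_URL="https://stage.example.com" scripts/smoke-test.sh   # 3. smoke tests
}
```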

Prod deployment (progressive)

  1. Deploy canary (small percentage)
  2. Verify (metrics + logs + error rates)
  3. Gradually increase traffic
  4. Full rollout
  5. Post-deploy verification window

Step 5 — VERIFY (the step teams skip… and regret)

Deploying is not the same as shipping.

You need automated checks that answer:

  • Is the service responding?
  • Is latency acceptable?
  • Are error rates normal?
  • Are key workflows working?

Smoke tests (simple and powerful)

A smoke test is a short script that:

  • hits /health and core endpoints
  • checks auth works (if relevant)
  • validates one “golden path” transaction

Example smoke-test script behavior

  • Call health endpoint
  • Call one API endpoint with a test token
  • Validate response schema
  • Exit non-zero if any check fails
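Here is what `scripts/smoke-test.sh` might look like. The endpoints, the `BASE_URL` default, and the `TEST_TOKEN` variable are assumptions; adapt them to your service:

```shell
#!/usr/bin/env sh
# scripts/smoke-test.sh -- sketch: run every check, report each, exit non-zero if any fail.
BASE_URL="${BASE_URL:-http://localhost:8080}"
FAILED=0

check() {
  # check <description> <command...> -- keep going even if one check fails
  desc="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc"; FAILED=1
  fi
}

run_smoke() {
  check "health endpoint"    curl -fsS "$BASE_URL/health"
  check "authenticated ping" curl -fsS -H "Authorization: Bearer ${TEST_TOKEN:-}" "$BASE_URL/api/ping"
  return "$FAILED"
}
# In CI you would finish with: run_smoke || exit 1
```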

You run smoke tests:

  • after staging deploy
  • after production canary
  • after full rollout

Step 6 — ROLLBACK (fast, safe, and predictable)

Rollback isn’t one button. It’s a design.

Two rollback types you must plan for

A) Application rollback (easy)

When the app code is bad:

  • Roll back to the previous image tag / chart version
  • Re-route traffic back to stable version

This should be automated and fast.

B) Data rollback (hard)

When migrations or data changes are involved:

  • You often cannot “undo” safely

So you use the expand/contract pattern:

  1. Expand: add new columns/fields in a backward-compatible way
  2. Deploy app that writes both old and new (if needed)
  3. Migrate data safely
  4. Contract: remove old fields later (after stability)

Rule: If a release includes a breaking DB change, rollback becomes risky.
So engineer DB changes to be backward-compatible.


Rollback strategies you should know (choose based on risk)

1) Rolling rollback (basic)

  • Re-deploy previous version
  • Works when traffic can tolerate brief disruption

2) Blue/Green (clean rollback)

  • Two environments: Blue (current), Green (new)
  • Switch traffic to Green
  • If bad: switch back to Blue

Rollback is almost instant (traffic switch).
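On Kubernetes, the "traffic switch" is often just updating a Service selector between two label values. A sketch; the Service name `app` and the `version` label are assumptions about how your Deployments are labeled:

```shell
# blue-green.sh -- sketch: flip a Service between blue and green Deployments.
switch_traffic() {
  target="$1"   # "blue" or "green"
  kubectl patch service app -p \
    "{\"spec\":{\"selector\":{\"version\":\"$target\"}}}"
}
# Rollback is the same command with the other color: switch_traffic blue
```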

3) Canary (best balance)

  • Send small traffic to new version
  • If metrics degrade: stop canary and revert

This reduces blast radius dramatically.


What should trigger an automatic rollback?

Pick a short verification window (example: 10–20 minutes after deployment).
If any of these break thresholds, rollback automatically:

  • Error rate over X%
  • Latency p95 over Y ms
  • CrashLoopBackOff / unhealthy pods
  • Failed smoke tests

Important: Auto-rollback should be conservative.
You don’t want flapping. Use sensible thresholds.
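The decision itself can be a small pure function fed by your monitoring API. The thresholds below are illustrative assumptions, not recommendations:

```shell
# verify.sh -- sketch of a conservative auto-rollback decision.
# Metric values come from your monitoring system; thresholds are examples.
should_rollback() {
  error_rate_pct="$1"   # integer percent
  latency_p95_ms="$2"
  smoke_ok="$3"         # yes | no
  MAX_ERROR_PCT=2
  MAX_P95_MS=800
  if [ "$smoke_ok" != yes ]; then echo yes; return; fi
  if [ "$error_rate_pct" -gt "$MAX_ERROR_PCT" ]; then echo yes; return; fi
  if [ "$latency_p95_ms" -gt "$MAX_P95_MS" ]; then echo yes; return; fi
  echo no
}
# If it says yes: kubectl rollout undo deployment/app (or abort the canary).
```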


The Reference Pipeline (copyable blueprint)

Below is a pipeline blueprint written in a generic CI style so you can adapt it to any CI tool.

PR Pipeline (fast feedback)

Goal: prevent unsafe code from merging.

Stages

  1. lint
  2. unit_test
  3. secret_scan
  4. sast_scan
  5. dependency_scan
  6. iac_scan (if deploy files changed)
  7. build_check (optional container build)

Outputs

  • PR status checks (pass/fail)
  • Security report summary
  • Artifact only if needed (not always)

Main Pipeline (ship)

Goal: produce a trusted artifact and deploy progressively.

Stages

  1. build
    • set version from git
    • run unit tests
    • build image app:sha
  2. scan
    • run secret scan
    • SAST + dependency scan
    • image scan
    • IaC scan
  3. package
    • generate SBOM
    • sign artifact (optional)
  4. push
    • push image to registry
    • publish SBOM + metadata
  5. deploy_stage
    • deploy image tag to staging
  6. test_stage
    • smoke tests + optional integration tests
  7. promote_prod
    • manual approval or policy gate (depending on org)
  8. deploy_prod_canary
    • canary release
  9. verify_canary
    • metrics checks + smoke tests
  10. deploy_prod_full
  11. verify_prod
  12. rollback_if_needed
    • automated rollback logic on failure
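In a GitHub-Actions-flavoured syntax, the first few stages of this blueprint might look like the sketch below. The job names, registry login, and tool steps are all placeholders to adapt to your CI system:

```yaml
# Sketch only -- adapt names, secrets, and tools to your CI system.
name: main-pipeline
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t "app:sha-${GITHUB_SHA::7}" .
      - run: docker push "app:sha-${GITHUB_SHA::7}"   # assumes prior registry login
  scan:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: gitleaks detect --source .               # secret scan
      - run: trivy image "app:sha-${GITHUB_SHA::7}"   # image CVEs
  deploy_stage:
    needs: scan
    runs-on: ubuntu-latest
    steps:
      - run: kubectl apply -k deploy/k8s/overlays/stage
      - run: scripts/smoke-test.sh
```

The later stages (canary, verification, rollback) follow the same pattern: each is a job gated on the one before it.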

Real example walkthrough (from commit to production)

Let’s play out a real release:

Day 1: Developer opens PR

  • Lint fails → fixed quickly
  • Dependency scan flags a vulnerable library (high severity)
  • Policy says: ticket created, but PR allowed because it’s not exploitable in this path
    Result: dev velocity stays high, risk is tracked.

Day 2: Merge to main

Main pipeline runs:

  • Build creates app:sha-abc1234
  • Scans pass
  • SBOM generated
  • Image pushed

Day 2: Deploy to staging

  • Staging deploy succeeds
  • Smoke test fails because a config value is missing
    Pipeline stops. Nothing reaches prod.
    Fix is made, pipeline re-runs.

Day 3: Production canary

  • 5% traffic routed to new version
  • Error rate rises above threshold within 3 minutes
    Auto-rollback triggers:
  • Traffic goes back to stable version
  • Incident avoided
  • Pipeline marks release as failed with logs + metrics snapshot

That is a mature pipeline: fast failure, tiny blast radius, safe rollback.


Common mistakes (and how to avoid them)

Mistake 1: “We’ll add scanning later”

Later never comes. Add scanning early with soft gates, then tighten.

Mistake 2: Rebuilding per environment

This destroys traceability. Build once, promote.

Mistake 3: “Rollback = redeploy previous”

That’s only true if data changes are backward-compatible.

Mistake 4: No post-deploy verification

Deploying without verification is gambling.

Mistake 5: Secrets in CI variables forever

Rotate secrets, use short-lived credentials where possible, and audit access.


CI/CD maturity levels (so you know what to aim for)

Level 1: Basic

  • Build + unit tests + deploy
    (works until your first serious incident)

Level 2: Safe

  • Add scanning + staging + smoke tests
    (now you block common disasters)

Level 3: Reliable

  • Progressive delivery + automated rollback
    (blast radius becomes small)

Level 4: Trusted

  • SBOM + signing + policy gates + audit trails
    (you can prove what you shipped and why)

Final “reference checklist” (use this as your implementation guide)

Build

  • deterministic dependencies
  • versioning from Git
  • immutable artifacts (image tags)
  • tests in CI

Scan

  • secret scan
  • SAST
  • dependency scan
  • container scan
  • IaC scan
  • clear gating policy

Deploy

  • staging first
  • promotion-only to prod
  • progressive delivery for prod

Verify

  • smoke tests automated
  • metrics-based gates
  • post-deploy verification window

Rollback

  • rollback tested regularly
  • backward-compatible DB strategy
  • canary abort or traffic switch available
