Mohammad Gufran Jahangir January 21, 2026

Kubernetes RBAC is one of those things that feels annoying until the day it saves your cluster from “oops, I deleted prod.”

The tricky part isn’t what RBAC is (you already know “Role/ClusterRole + Binding”).
The tricky part is building roles that are:

  • Useful (engineers can actually do their work)
  • Safe (no accidental privilege escalation)
  • Maintainable (works for 5 namespaces or 500)

This post is a cookbook: copy-ready YAML, plus the “why” behind each rule.

By the end, you’ll have three practical personas:

  1. Read-only: can view workloads + logs/events (but not secrets)
  2. Developer (dev): can deploy and debug in a namespace safely
  3. SRE: can operate workloads and handle incidents (without handing them the keys to the kingdom)

Let’s build it the way mature teams do: least privilege + clear boundaries + easy testing.


The RBAC mental model (in 60 seconds)

RBAC answers: “WHO can do WHAT to WHICH resources WHERE?”

  • WHO: a user, group, or ServiceAccount
  • WHAT: verbs (get, list, watch, create, update, patch, delete)
  • WHICH: resources (pods, deployments, configmaps, …)
  • WHERE: namespace-scoped (Role) vs cluster-scoped (ClusterRole)

Role = permissions in one namespace
ClusterRole = permissions across the cluster or reusable permissions you bind inside namespaces

RoleBinding binds (User/Group/SA) → (Role/ClusterRole) in a namespace
ClusterRoleBinding binds (User/Group/SA) → (ClusterRole) cluster-wide
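
As a concrete sketch of how the four questions map onto YAML (the namespace, role name, and user below are hypothetical):

```yaml
# WHERE: a Role is namespace-scoped
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader           # hypothetical role name
  namespace: demo            # hypothetical namespace
rules:
  - apiGroups: [""]          # WHICH: core API group...
    resources: ["pods"]      # ...resource "pods"
    verbs: ["get", "list", "watch"]  # WHAT
---
# WHO: a RoleBinding attaches subjects to the Role
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: demo
subjects:
  - kind: User
    name: alice              # hypothetical user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```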


The #1 safety rule (don’t skip this)

Never give RBAC self-management to regular roles

If a person can create or update:

  • roles, rolebindings, clusterroles, clusterrolebindings

…they can almost always grant themselves more power.

So in this cookbook:

  • Dev and SRE roles do not include RBAC administration.
  • RBAC admin is a separate “platform admin” concern.

This single decision prevents the most common privilege escalation path.
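
A quick way to verify the boundary actually holds (user and group names are hypothetical; run against a live cluster):

```shell
# Expect "no" on every line; any "yes" means the role can self-escalate.
kubectl auth can-i create rolebindings -n team-a-prod --as=alice --as-group=devs-team-a
kubectl auth can-i update roles -n team-a-prod --as=alice --as-group=devs-team-a
kubectl auth can-i create clusterrolebindings --as=alice --as-group=devs-team-a
```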


What we’re building (personas + boundaries)

✅ Read-only (safe viewer)

Can:

  • See workloads and their status
  • Read logs and events
  • Inspect services/ingress/configmaps

Cannot:

  • Modify anything
  • Read secrets

✅ Developer (namespace power user)

Can:

  • Deploy/update workloads (Deployments, Jobs, etc.)
  • View pods, logs, events
  • exec into pods and port-forward (optional but very useful)
  • Create/update configmaps and services, ingress, HPA (optional)

Cannot:

  • Touch cluster-wide resources (nodes, namespaces)
  • Manage RBAC
  • Read secrets (default)

✅ SRE (operate during incidents)

Can:

  • Everything dev can do plus: scale, restart, delete/recreate, manage disruption budgets, etc.
  • View some cluster-scoped read-only resources (optional)

Cannot (by default):

  • Manage RBAC
  • Become cluster-admin silently

Step 1: Create a reusable “Read-only (with logs)” ClusterRole

Why ClusterRole?
Because you want one definition you can bind into many namespaces.

✅ This role is safe for auditors, managers, support engineers, and “I just need to see what’s happening” access.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: readonly-with-logs
rules:
  # Core read-only resources
  - apiGroups: [""]
    resources:
      - pods
      - services
      - endpoints
      - configmaps
      - persistentvolumeclaims
      - serviceaccounts
      - events
    verbs: ["get", "list", "watch"]

  # Read pod logs (subresource)
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]

  # Apps workloads
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "statefulsets", "daemonsets"]
    verbs: ["get", "list", "watch"]

  # Batch workloads
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list", "watch"]

  # Networking resources
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses", "networkpolicies"]
    verbs: ["get", "list", "watch"]

  # Autoscaling read-only (optional but handy)
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["get", "list", "watch"]

Why no Secrets?

Because secrets access is often the fastest path to “access everything.”
If someone can read secrets, they can usually:

  • grab database passwords
  • grab API keys
  • grab service tokens

Keep secrets access separate and rare.
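
When someone genuinely needs secrets access, grant it through a separate, narrowly bound role instead of folding it into the everyday personas. A minimal sketch (the role name is hypothetical; bind it sparingly, per namespace):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secrets-reader       # hypothetical; bind only where justified
  namespace: team-a-prod
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list"]   # read-only; no watch, no write
```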


Step 2: Bind read-only to a namespace

Let’s say your app runs in team-a-prod.

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: viewers
  namespace: team-a-prod
subjects:
  - kind: Group
    name: viewers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: readonly-with-logs
  apiGroup: rbac.authorization.k8s.io

Pattern: ClusterRole (reusable) + RoleBinding per namespace (scope control)

This is how you avoid accidentally giving someone visibility into every namespace.


Step 3: Create the Developer role (namespace deploy + debug)

This is the role most teams get wrong.

They either:

  • make it too weak (“devs can’t debug anything”), or
  • make it way too strong (“just give edit/admin”), and then secrets leak everywhere.

Here’s a balanced dev role.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dev-namespace-editor
rules:
  # Read common stuff + watch for kubectl get -w
  - apiGroups: [""]
    resources:
      - pods
      - services
      - endpoints
      - configmaps
      - persistentvolumeclaims
      - events
    verbs: ["get", "list", "watch"]

  # Devs typically need to restart pods (delete) during debugging
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["delete"]

  # Logs are essential
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]

  # Optional but very practical: exec + port-forward
  # NOTE: exec and portforward are subresources and use "create"
  - apiGroups: [""]
    resources: ["pods/exec", "pods/portforward"]
    verbs: ["create"]

  # Workloads: create/update/patch/delete inside namespace
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "statefulsets", "daemonsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Allow scaling (subresource)
  - apiGroups: ["apps"]
    resources: ["deployments/scale", "statefulsets/scale", "replicasets/scale"]
    verbs: ["get", "update", "patch"]

  # Batch jobs
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Services & Ingress management for app teams (choose based on your org)
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["create", "update", "patch", "delete"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Optional: HPA if teams own scaling policies
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

What this dev role intentionally does NOT allow

  • secrets (read or write)
  • RBAC resources
  • cluster-scoped resources (nodes, namespaces, PVs, CRDs)

That’s how you keep it safe.


Bind dev access to a namespace

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: devs
  namespace: team-a-prod
subjects:
  - kind: Group
    name: devs-team-a
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: dev-namespace-editor
  apiGroup: rbac.authorization.k8s.io

Step 4: Create the SRE role (operate safely)

SREs need to fix incidents fast:

  • roll back
  • scale up
  • delete broken things
  • check policies
  • manage disruption budgets

But you still don’t want to hand them RBAC admin by default.

Here’s a practical SRE operator role:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: sre-namespace-operator
rules:
  # Everything dev can read
  - apiGroups: [""]
    resources:
      - pods
      - services
      - endpoints
      - configmaps
      - persistentvolumeclaims
      - serviceaccounts
      - events
    verbs: ["get", "list", "watch"]

  # Logs + debug
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec", "pods/portforward"]
    verbs: ["create"]

  # Full workload lifecycle control in namespace
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "statefulsets", "daemonsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["apps"]
    resources: ["deployments/scale", "statefulsets/scale", "replicasets/scale"]
    verbs: ["get", "update", "patch"]

  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Networking operations
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses", "networkpolicies"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # SREs often manage disruption budgets
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Autoscaling
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Allow deleting pods during incident recovery
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["delete"]

Bind it just like dev:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sres
  namespace: team-a-prod
subjects:
  - kind: Group
    name: sres
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: sre-namespace-operator
  apiGroup: rbac.authorization.k8s.io

Optional (recommended): SRE cluster read-only visibility

Many SREs need to see cluster-wide context (without changing it):

  • nodes list
  • namespaces list
  • storageclasses list

Create a strict read-only cluster role:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: sre-cluster-observer
rules:
  - apiGroups: [""]
    resources: ["nodes", "namespaces", "persistentvolumes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]

Bind it cluster-wide (read-only):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: sres-cluster-observer
subjects:
  - kind: Group
    name: sres
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: sre-cluster-observer
  apiGroup: rbac.authorization.k8s.io

Notice: still no RBAC admin. Still no write access to cluster-scoped objects.


Step 5: Test RBAC like a pro (before humans complain)

RBAC testing is where teams become confident.

The fastest command you’ll ever love

Use kubectl auth can-i.

Examples (namespace scoped):

kubectl auth can-i get pods -n team-a-prod --as=alice --as-group=devs-team-a
kubectl auth can-i create deployments -n team-a-prod --as=alice --as-group=devs-team-a
kubectl auth can-i get secrets -n team-a-prod --as=alice --as-group=devs-team-a

You want:

  • ✅ pods: yes
  • ✅ deployments create: yes
  • ❌ secrets: no

Test debug powers:

kubectl auth can-i create pods --subresource=exec -n team-a-prod --as=alice --as-group=devs-team-a
kubectl auth can-i get pods --subresource=log -n team-a-prod --as=alice --as-group=viewers

(Use --subresource here: kubectl auth can-i parses pods/exec as “a pod named exec”, which silently checks the wrong permission.)
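
To see everything a subject can do in a namespace in one shot, kubectl also supports a list mode:

```shell
# Dumps every (verb, resource) pair the impersonated subject is allowed.
kubectl auth can-i --list -n team-a-prod --as=alice --as-group=devs-team-a
```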

Real-world “why is my kubectl failing?” fixes

1) “I can see pods but can’t view logs”

You forgot pods/log.

✅ Add:

resources: ["pods/log"]
verbs: ["get", "list", "watch"]

2) “I can’t exec into pods”

Exec is a subresource and uses create.

✅ Add:

resources: ["pods/exec"]
verbs: ["create"]

3) “kubectl port-forward fails”

Port-forward is also a subresource and uses create.

✅ Add:

resources: ["pods/portforward"]
verbs: ["create"]

4) “Scaling doesn’t work”

Scaling uses the /scale subresource.

✅ Add:

resources: ["deployments/scale"]
verbs: ["update", "patch"]

The dangerous shortcuts (and what to do instead)

Shortcut A: “Just give them edit”

Problem: many default/broad roles accidentally include powers you didn’t intend (and in some setups can expose secrets).

Instead: use the custom roles above so you know exactly what’s allowed.

Shortcut B: Wildcards everywhere

apiGroups: ["*"]
resources: ["*"]
verbs: ["*"]

This is basically “you’re an admin” with extra steps.

Instead: start minimal, expand only when a real use-case demands it.

Shortcut C: Give devs RBAC permissions

If devs can edit RoleBindings, they can often give themselves more power.

Instead: separate “RBAC admin” to a platform group.


Optional add-ons (when your cluster uses CRDs)

If your teams manage custom resources (examples: VirtualService, Certificate, ServiceMonitor, etc.), you must add them explicitly.

Template:

- apiGroups: ["<your.api.group>"]
  resources: ["<resourcePluralName>"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
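
For example, if your teams manage cert-manager Certificates (assuming cert-manager is installed in your cluster), the filled-in rule would look like:

```yaml
- apiGroups: ["cert-manager.io"]
  resources: ["certificates"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```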

Keep CRD permissions namespace-scoped whenever possible.


A clean RBAC structure that scales to many teams

If you want this to stay sane as you grow:

Per namespace, you bind:

  • viewers → readonly-with-logs
  • devs-team-x → dev-namespace-editor
  • sres → sre-namespace-operator

Cluster-wide you bind (optional):

  • sres → sre-cluster-observer (read-only)

That’s it. Simple. Predictable. Auditable.
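
One way to keep the per-namespace bindings consistent as teams multiply is to generate them from a small script instead of hand-editing YAML. A sketch (namespaces and team names are hypothetical; pipe the output to kubectl apply -f - when you're ready):

```shell
#!/usr/bin/env bash
# Stamp out the three standard RoleBindings for a team namespace.
set -euo pipefail

render_bindings() {
  local ns="$1" team="$2"
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: viewers
  namespace: ${ns}
subjects:
  - kind: Group
    name: viewers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: readonly-with-logs
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: devs
  namespace: ${ns}
subjects:
  - kind: Group
    name: devs-${team}
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: dev-namespace-editor
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sres
  namespace: ${ns}
subjects:
  - kind: Group
    name: sres
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: sre-namespace-operator
  apiGroup: rbac.authorization.k8s.io
EOF
}

# Example: render bindings for two teams (hypothetical names).
render_bindings "team-a-prod" "team-a"
render_bindings "team-b-prod" "team-b"
```

Because every namespace gets the same three bindings by construction, audits become a diff against generated output instead of a manual review.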


Quick “copy checklist” for production safety

✅ Do:

  • Prefer reusable ClusterRoles bound with a per-namespace RoleBinding (scoped access)
  • Grant pods/log, pods/exec, pods/portforward intentionally (debug without admin)
  • Keep secrets access separate and rare
  • Test with kubectl auth can-i before rollout
  • Keep RBAC admin separate (platform)

❌ Don’t:

  • Give cluster-admin “temporarily”
  • Allow devs to manage RoleBindings
  • Use wildcards unless you truly mean “admin”
