Kubernetes Cheat Sheet

Core Concepts

Object	What it is
Pod	Smallest deployable unit — one or more containers sharing network + storage
Deployment	Manages a ReplicaSet; handles rolling updates + rollbacks
ReplicaSet	Ensures N pod replicas are running at all times
StatefulSet	Like Deployment but with stable pod identity + ordered updates (databases)
DaemonSet	One pod per node — monitoring agents, log collectors
Job / CronJob	Run-to-completion tasks; CronJob adds scheduling
Service	Stable DNS + IP in front of a set of pods (label selector)
Ingress	L7 routing — paths/hostnames → Services
ConfigMap	Non-sensitive config as key-value or files
Secret	Sensitive data (passwords, tokens, image pull credentials); base64-encoded, not encrypted by default
Namespace	Soft isolation within a cluster
PersistentVolume (PV)	Cluster-level storage resource
PersistentVolumeClaim (PVC)	Pod's request for PV storage
ServiceAccount	Identity for pods to talk to the API server

kubectl — Essential Commands

Context & Cluster

kubectl config get-contexts               # list contexts
kubectl config use-context <name>         # switch cluster
kubectl config current-context
kubectl cluster-info
kubectl api-resources                     # all resource types + short names

Get / Inspect

kubectl get pods                          # -n <ns> for namespace, -A for all
kubectl get pods -o wide                  # + node, IP
kubectl get all -n <namespace>            # common workload types only; omits ConfigMaps, Secrets, Ingresses
kubectl describe pod <name>               # events, conditions, resource limits
kubectl logs <pod>                        # -f to follow, --previous for crashed
kubectl logs <pod> -c <container>         # multi-container pod
kubectl get events --sort-by=.lastTimestamp

Create / Apply

kubectl apply -f manifest.yaml            # create or update (idempotent)
kubectl apply -f ./dir/                   # whole directory
kubectl create deployment nginx --image=nginx --replicas=3
kubectl expose deployment nginx --port=80 --type=ClusterIP

Edit / Patch

kubectl edit deployment <name>            # opens $EDITOR
kubectl set image deployment/<name> app=nginx:1.25   # <container-name>=<image>
kubectl scale deployment <name> --replicas=5
kubectl rollout restart deployment <name>

Rollouts

kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name>           # roll back to the previous revision
kubectl rollout undo deployment/<name> --to-revision=2

Delete

kubectl delete pod <name>
kubectl delete -f manifest.yaml
kubectl delete deployment,svc <name>
kubectl delete pod <name> --grace-period=0 --force   # force-remove from the API; the container may still be running on the node

Exec / Copy

kubectl exec -it <pod> -- /bin/sh         # shell into pod
kubectl exec -it <pod> -c <container> -- bash
kubectl cp <pod>:/path/to/file ./local    # copy from pod (needs tar in the container)
kubectl cp ./local <pod>:/path/           # copy to pod

Port Forward

kubectl port-forward pod/<name> 8080:80
kubectl port-forward svc/<name> 8080:80

Dry Run / Diff

kubectl apply -f manifest.yaml --dry-run=client   # use --dry-run=server to validate against the API server
kubectl diff -f manifest.yaml             # show what would change

Output Formats

kubectl get pods -o yaml
kubectl get pods -o json
kubectl get pods -o jsonpath='{.items[*].metadata.name}'
kubectl get pods -o custom-columns=NAME:.metadata.name,STATUS:.status.phase

YAML Templates

Pod

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: my-app
spec:
  containers:
  - name: app
    image: nginx:1.25
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "500m"
        memory: "256Mi"
    env:
    - name: ENV_VAR
      valueFrom:
        configMapKeyRef:
          name: my-config
          key: my-key

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: myregistry/my-app:v1.0
        ports:
        - containerPort: 8080
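
A rough sketch of what maxSurge and maxUnavailable mean for the Deployment above, assuming both are absolute numbers (percentages are rounded by Kubernetes and are not handled here). The function name rollout_bounds is made up for illustration:

```shell
# Pod-count bounds during a RollingUpdate.
# $1 = replicas, $2 = maxSurge, $3 = maxUnavailable (absolute numbers only)
rollout_bounds() {
  echo "max_pods=$(( $1 + $2 )) min_available=$(( $1 - $3 ))"
}

rollout_bounds 3 1 0   # max_pods=4 min_available=3
```

With maxUnavailable: 0 the rollout never dips below the desired replica count, at the cost of needing room for one extra pod.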

Service

apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP        # ClusterIP | NodePort | LoadBalancer

ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config
data:
  DATABASE_URL: "postgres://db:5432/mydb"
  config.yaml: |
    key: value
    other: value

Secret

apiVersion: v1
kind: Secret
metadata:
  name: my-secret
type: Opaque
stringData:                 # stringData auto-encodes; data expects base64
  DB_PASSWORD: "hunter2"
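
stringData takes plain text; data takes base64. A small sketch of doing the encoding by hand, assuming a base64 binary that accepts --decode (GNU coreutils and macOS both do). The helper names are illustrative only:

```shell
# Encode a value for a Secret's data: field.
# printf avoids the trailing newline that echo would add.
encode_secret_value() {
  printf '%s' "$1" | base64
}

# Decode a value read back from a Secret.
decode_secret_value() {
  printf '%s' "$1" | base64 --decode
}

encode_secret_value 'hunter2'        # aHVudGVyMg==
decode_secret_value 'aHVudGVyMg=='   # hunter2
```

Base64 is an encoding, not encryption: anyone who can read the Secret can recover the value.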

Ingress (nginx)

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /   # sends every matched request to "/"; use regex capture groups to keep the sub-path
spec:
  ingressClassName: nginx
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-svc
            port:
              number: 80

PersistentVolumeClaim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 10Gi

Namespaces

kubectl get namespaces
kubectl create namespace staging
kubectl get pods -n staging
kubectl config set-context --current --namespace=staging   # set default ns

# Resource quota per namespace
kubectl create quota my-quota --hard=cpu=2,memory=4Gi,pods=10 -n staging

Labels & Selectors

kubectl get pods -l app=my-app
kubectl get pods -l 'env in (prod, staging)'
kubectl label pod <name> tier=backend
kubectl label pod <name> tier-                   # remove label

Health Probes

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

startupProbe:                  # gives slow-starting apps more time
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10

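Liveness: is the container still healthy? The kubelet restarts it after failureThreshold consecutive failures.
Readiness: can it serve traffic? While failing, the pod is removed from Service endpoints but not restarted.
Startup: has the app finished starting? Liveness and readiness checks are held off until it succeeds; the container is restarted if it never does.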
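
The startupProbe above gives the app roughly failureThreshold × periodSeconds to come up. A quick sketch of that arithmetic; probe_budget_seconds is a hypothetical helper, and the result is approximate because initialDelaySeconds and timeoutSeconds are ignored:

```shell
# Approximate seconds of consecutive failures a probe tolerates.
# $1 = failureThreshold, $2 = periodSeconds
probe_budget_seconds() {
  echo $(( $1 * $2 ))
}

probe_budget_seconds 30 10   # startupProbe above: 300
probe_budget_seconds 3 5     # livenessProbe above: 15
```

If the app can take longer than the startup budget to boot, raise failureThreshold on the startupProbe rather than loosening the livenessProbe.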

Resource Requests & Limits

resources:
  requests:       # scheduler uses this to find a node
    cpu: "250m"   # 250 millicores = 0.25 CPU
    memory: "64Mi"
  limits:         # enforced at runtime; CPU throttled, memory OOMKilled
    cpu: "1"
    memory: "256Mi"

CPU is compressible — throttled when over limit, not killed.
Memory is incompressible — container is OOMKilled when over limit.
Always set requests; limits are optional but recommended.
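
CPU quantities can be written as whole or fractional cores ("1", "0.5") or as millicores ("250m"). A minimal sketch that normalizes both forms to millicores, assuming awk is available; cpu_to_millicores is an illustrative name, not a kubectl feature:

```shell
# Convert a Kubernetes CPU quantity to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;                                           # already millicores: strip the suffix
    *)  awk -v c="$1" 'BEGIN { printf "%.0f\n", c * 1000 }' ;;     # cores -> millicores
  esac
}

cpu_to_millicores 250m   # 250
cpu_to_millicores 1      # 1000
cpu_to_millicores 0.5    # 500
```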

Scheduling

# Node selector (simple)
nodeSelector:
  kubernetes.io/arch: amd64

# Node affinity (flexible)
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-type
          operator: In
          values: [gpu]

# Pod anti-affinity (spread replicas across nodes); if combined with nodeAffinity, both go under one affinity: key
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: my-app
        topologyKey: kubernetes.io/hostname

# Taints & tolerations
# Node: kubectl taint nodes <node> key=value:NoSchedule
tolerations:
- key: "key"
  operator: "Equal"
  value: "value"
  effect: "NoSchedule"

RBAC

# ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-sa
  namespace: default

---
# Role (namespace-scoped)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]

---
# RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: ServiceAccount
  name: my-sa
  namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

Use ClusterRole + ClusterRoleBinding for cluster-wide permissions.

kubectl auth can-i list pods --as=system:serviceaccount:default:my-sa
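
The --as value above follows the pattern system:serviceaccount:<namespace>:<name>. A tiny sketch that builds it; sa_username is a hypothetical helper, not part of kubectl:

```shell
# Build the username the API server uses for a ServiceAccount.
# $1 = namespace, $2 = ServiceAccount name
sa_username() {
  echo "system:serviceaccount:$1:$2"
}

sa_username default my-sa   # system:serviceaccount:default:my-sa

# Usage against a live cluster (not run here):
#   kubectl auth can-i list pods --as="$(sa_username default my-sa)"
```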

Networking

Service Type	Use case
`ClusterIP`	Internal only — default
`NodePort`	Exposes on each node's IP at a static port (30000–32767)
`LoadBalancer`	Cloud LB provisioned automatically
`ExternalName`	DNS alias to external hostname

DNS within cluster: <service>.<namespace>.svc.cluster.local
Shorthand within same namespace: just <service>
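
A small sketch of assembling the full DNS name, assuming the default cluster domain cluster.local (clusters can be configured with a different one). service_fqdn is an illustrative helper:

```shell
# Build the in-cluster DNS name for a Service.
# $1 = service name, $2 = namespace
service_fqdn() {
  echo "$1.$2.svc.cluster.local"
}

service_fqdn my-svc staging   # my-svc.staging.svc.cluster.local
```

Use the full name when calling a Service in another namespace; the short name only resolves from pods in the same namespace.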

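Network policies: by default, all pods can talk to all pods. Apply a NetworkPolicy to restrict traffic (enforced only if the cluster's CNI plugin supports NetworkPolicy). Once a policy selects a pod, only the traffic it explicitly allows gets through: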

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend

Troubleshooting Playbook

# Pod stuck in Pending
kubectl describe pod <name>       # look at Events — usually: no nodes fit, PVC not bound

# Pod CrashLoopBackOff
kubectl logs <pod> --previous     # logs from the crashed container

# Pod stuck Terminating
kubectl delete pod <name> --grace-period=0 --force   # last resort; check for stuck finalizers first

# OOMKilled
kubectl describe pod <name>       # last state: OOMKilled
# → increase memory limit or find the leak

# ImagePullBackOff
kubectl describe pod <name>       # check image name, pull secret
kubectl get secret regcred -o yaml

# Service not routing
kubectl get endpoints <svc>       # should list pod IPs; empty = label mismatch
kubectl run tmp --image=busybox -it --rm --restart=Never -- wget -qO- http://<svc>:<port>

# Node NotReady
kubectl describe node <name>      # Events, Conditions
kubectl get pods -A --field-selector spec.nodeName=<name>

# Check resource usage (requires metrics-server)
kubectl top pods
kubectl top nodes

Helm (package manager)

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx   # the old "stable" repo is archived
helm repo update
helm search repo nginx

helm install my-release ingress-nginx/ingress-nginx -f values.yaml
helm upgrade my-release ingress-nginx/ingress-nginx --reuse-values --set controller.replicaCount=3
helm rollback my-release 1                # roll back to revision 1
helm uninstall my-release

helm list
helm status my-release
helm get values my-release

Quick Reference — Common Flags

Flag	Effect
`-n <namespace>`	Target namespace
`-A` / `--all-namespaces`	All namespaces
`-o yaml` / `-o json`	Full object as YAML / JSON
`-o wide`	Extra columns
`--dry-run=client`	Preview locally without sending to the API server (`--dry-run=server` validates server-side)
`-l <selector>`	Filter by label
`-w`	Watch for changes
`--field-selector`	Filter by field (e.g. `status.phase=Running`)
`--show-labels`	Include labels column