Kubernetes Cheat Sheet
Core Concepts
| Object | What it is |
|---|---|
| Pod | Smallest deployable unit — one or more containers sharing network + storage |
| Deployment | Manages a ReplicaSet; handles rolling updates + rollbacks |
| ReplicaSet | Ensures N pod replicas are running at all times |
| StatefulSet | Like Deployment but with stable pod identity + ordered updates (e.g. databases) |
| DaemonSet | One pod per node — monitoring agents, log collectors |
| Job / CronJob | Run-to-completion tasks; CronJob adds scheduling |
| Service | Stable DNS + IP in front of a set of pods (label selector) |
| Ingress | L7 routing — paths/hostnames → Services |
| ConfigMap | Non-sensitive config as key-value or files |
| Secret | Sensitive data, base64-encoded but not encrypted by default (env vars, image pull credentials) |
| Namespace | Soft isolation within a cluster |
| PersistentVolume (PV) | Cluster-level storage resource |
| PersistentVolumeClaim (PVC) | Pod's request for PV storage |
| ServiceAccount | Identity for pods to talk to the API server |
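The table lists Job / CronJob as run-to-completion tasks with scheduling. A minimal CronJob sketch follows; the name `nightly-report`, the schedule, and the busybox command are illustrative assumptions, not part of the original sheet.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report               # hypothetical name
spec:
  schedule: "0 2 * * *"              # standard cron syntax: every day at 02:00
  concurrencyPolicy: Forbid          # skip a run while the previous one is still active
  jobTemplate:
    spec:
      backoffLimit: 2                # retry a failing Job at most twice
      template:
        spec:
          restartPolicy: OnFailure   # Job pods must use OnFailure or Never
          containers:
          - name: report
            image: busybox:1.36
            command: ["sh", "-c", "date; echo report generated"]
```

Dropping the `schedule` and `jobTemplate` wrapper and using `kind: Job` with the inner `spec` gives the one-off equivalent.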
kubectl — Essential Commands
Context & Cluster
kubectl config get-contexts # list contexts
kubectl config use-context <name> # switch cluster
kubectl config current-context
kubectl cluster-info
kubectl api-resources # all resource types + short names
Get / Inspect
kubectl get pods # -n <ns> for namespace, -A for all
kubectl get pods -o wide # + node, IP
kubectl get all -n <namespace> # common workload types only; omits ConfigMaps, Secrets, Ingresses, PVCs
kubectl describe pod <name> # events, conditions, resource limits
kubectl logs <pod> # -f to follow, --previous for crashed
kubectl logs <pod> -c <container> # multi-container pod
kubectl get events --sort-by=.lastTimestamp
Create / Apply
kubectl apply -f manifest.yaml # create or update (idempotent)
kubectl apply -f ./dir/ # whole directory
kubectl create deployment nginx --image=nginx --replicas=3
kubectl expose deployment nginx --port=80 --type=ClusterIP
Edit / Patch
kubectl edit deployment <name> # opens $EDITOR
kubectl set image deployment/<name> app=nginx:1.25
kubectl scale deployment <name> --replicas=5
kubectl rollout restart deployment <name>
Rollouts
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name> # roll back to the previous revision
kubectl rollout undo deployment/<name> --to-revision=2
Delete
kubectl delete pod <name>
kubectl delete -f manifest.yaml
kubectl delete deployment,svc <name>
kubectl delete pod <name> --grace-period=0 --force # remove from the API immediately, skipping graceful shutdown
Exec / Copy
kubectl exec -it <pod> -- /bin/sh # shell into pod
kubectl exec -it <pod> -c <container> -- bash
kubectl cp <pod>:/path/to/file ./local # copy from pod
kubectl cp ./local <pod>:/path/ # copy to pod
Port Forward
kubectl port-forward pod/<name> 8080:80
kubectl port-forward svc/<name> 8080:80
Dry Run / Diff
kubectl apply -f manifest.yaml --dry-run=client
kubectl diff -f manifest.yaml # show what would change
Output Formats
kubectl get pods -o yaml
kubectl get pods -o json
kubectl get pods -o jsonpath='{.items[*].metadata.name}'
kubectl get pods -o custom-columns=NAME:.metadata.name,STATUS:.status.phase
YAML Templates
Pod
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: my-app
spec:
  containers:
  - name: app
    image: nginx:1.25
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "500m"
        memory: "256Mi"
    env:
    - name: ENV_VAR
      valueFrom:
        configMapKeyRef:
          name: my-config
          key: my-key
Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: myregistry/my-app:v1.0
        ports:
        - containerPort: 8080
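Core Concepts describes a StatefulSet as a Deployment with stable pod identity. A rough sketch of how that differs in a manifest is shown here, assuming a PostgreSQL image, a headless Service named `db`, and the `my-secret` / `standard` storage class names used elsewhere in this sheet. It starts three independent database pods and does not configure replication between them.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: db                      # hypothetical name
spec:
  clusterIP: None               # headless: each pod gets its own DNS record
  selector:
    app: db
  ports:
  - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db               # must match the headless Service above
  replicas: 3                   # pods are named db-0, db-1, db-2
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: postgres
        image: postgres:16
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: my-secret
              key: DB_PASSWORD
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:         # one PVC per pod, kept across rescheduling
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard
      resources:
        requests:
          storage: 10Gi
```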
Service
apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP   # ClusterIP | NodePort | LoadBalancer
ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config
data:
  DATABASE_URL: "postgres://db:5432/mydb"
  config.yaml: |
    key: value
    other: value
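A ConfigMap key can also be surfaced as a file. The sketch below mounts the `config.yaml` key from `my-config` into a container; the Pod name and the mount path `/etc/app` are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: config-demo             # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25
    volumeMounts:
    - name: config
      mountPath: /etc/app       # assumed path; the file appears as /etc/app/config.yaml
      readOnly: true
  volumes:
  - name: config
    configMap:
      name: my-config
      items:                    # project only this key; omit items to mount every key
      - key: config.yaml
        path: config.yaml
```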
Secret
apiVersion: v1
kind: Secret
metadata:
  name: my-secret
type: Opaque
stringData:   # stringData auto-encodes; data expects base64
  DB_PASSWORD: "hunter2"
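For comparison, here is a sketch of the same Secret written with `data` (the value is the base64 encoding of `hunter2`), followed by a Pod that reads it as an environment variable. The Pod name `secret-demo` is an assumption.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-secret
type: Opaque
data:
  DB_PASSWORD: aHVudGVyMg==     # base64 of "hunter2"; encoding, not encryption
---
apiVersion: v1
kind: Pod
metadata:
  name: secret-demo             # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: my-secret
          key: DB_PASSWORD
```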
Ingress (nginx)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-svc
            port:
              number: 80
PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 10Gi
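A claim does nothing until a Pod references it. A minimal sketch of a Pod mounting `my-pvc` follows; the Pod name and mount path are illustrative assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo                # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html   # assumed path for this example
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc         # must be in the same namespace as the Pod
```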
Namespaces
kubectl get namespaces
kubectl create namespace staging
kubectl get pods -n staging
kubectl config set-context --current --namespace=staging # set default ns
# Resource quota per namespace
kubectl create quota my-quota --hard=cpu=2,memory=4Gi,pods=10 -n staging
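The quota command above corresponds roughly to the following manifest, which can be kept in version control and applied with `kubectl apply -f`. Once a quota covers cpu or memory, pods in that namespace generally need to declare those requests or they may be rejected.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: my-quota
  namespace: staging
spec:
  hard:
    cpu: "2"          # total CPU requests across the namespace
    memory: 4Gi       # total memory requests across the namespace
    pods: "10"        # maximum number of non-terminal pods
```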
Labels & Selectors
kubectl get pods -l app=my-app
kubectl get pods -l 'env in (prod, staging)'
kubectl label pod <name> tier=backend
kubectl label pod <name> tier- # remove label
Health Probes
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
startupProbe:   # gives slow-starting apps more time
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
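Liveness: is the container alive? The kubelet restarts it after failureThreshold consecutive failures.
Readiness: is it ready to receive traffic? The pod is removed from Service endpoints while the probe fails.
Startup: has the app finished starting? Liveness and readiness probes are held off until it passes; with failureThreshold 30 and periodSeconds 10, the app gets up to 300 s.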
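Probes are not limited to HTTP. As a sketch for a container without an HTTP endpoint, assuming a PostgreSQL image that ships `pg_isready` and listens on 5432, the same checks can use `tcpSocket` and `exec` handlers:

```yaml
livenessProbe:
  tcpSocket:
    port: 5432                  # passes if the port accepts a TCP connection
  periodSeconds: 10
readinessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]   # exit code 0 means ready
  periodSeconds: 5
  timeoutSeconds: 3
```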
Resource Requests & Limits
resources:
  requests:          # scheduler uses this to find a node
    cpu: "250m"      # 250 millicores = 0.25 CPU
    memory: "64Mi"
  limits:            # enforced at runtime; CPU throttled, memory OOMKilled
    cpu: "1"
    memory: "256Mi"
- CPU is compressible — throttled when over limit, not killed.
- Memory is incompressible — container is OOMKilled when over limit.
- Always set requests; limits are optional but recommended.
Scheduling
# Node selector (simple)
nodeSelector:
  kubernetes.io/arch: amd64
# Node affinity (flexible)
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-type
          operator: In
          values: [gpu]
# Pod anti-affinity (spread replicas across nodes)
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: my-app
        topologyKey: kubernetes.io/hostname
# Taints & tolerations
# Node: kubectl taint nodes <node> key=value:NoSchedule
tolerations:
- key: "key"
  operator: "Equal"
  value: "value"
  effect: "NoSchedule"
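Tolerations matter most for DaemonSets, which are meant to run on every node, tainted ones included. The sketch below is a log-collector style DaemonSet that tolerates the standard control-plane taint; the name `log-agent`, the busybox placeholder command, and the `/var/log` host path are assumptions for illustration, not a real agent configuration.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent               # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists        # match the taint regardless of its value
        effect: NoSchedule
      containers:
      - name: agent
        image: busybox:1.36     # placeholder; swap in a real log collector
        command: ["sh", "-c", "while true; do ls /var/log | wc -l; sleep 60; done"]
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
```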
RBAC
# ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-sa
  namespace: default
---
# Role (namespace-scoped)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: ServiceAccount
  name: my-sa
  namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
Use ClusterRole + ClusterRoleBinding for cluster-wide permissions.
kubectl auth can-i list pods --as=system:serviceaccount:default:my-sa
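The cluster-wide variant has the same shape minus the namespace on the role and binding. A minimal sketch, assuming the ServiceAccount should be able to read nodes (a cluster-scoped resource that a Role cannot grant); the names `node-reader` and `read-nodes` are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader             # hypothetical name; ClusterRoles have no namespace
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: read-nodes              # hypothetical name
subjects:
- kind: ServiceAccount
  name: my-sa
  namespace: default            # subjects still name their own namespace
roleRef:
  kind: ClusterRole
  name: node-reader
  apiGroup: rbac.authorization.k8s.io
```

With this applied, `kubectl auth can-i list nodes --as=system:serviceaccount:default:my-sa` should answer yes.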
Networking
| Service Type | Use case |
|---|---|
| ClusterIP | Internal only (default) |
| NodePort | Exposes on each node's IP at a static port (30000–32767) |
| LoadBalancer | Cloud LB provisioned automatically |
| ExternalName | DNS alias to external hostname |
DNS within cluster: <service>.<namespace>.svc.cluster.local
Shorthand within same namespace: just <service>
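To make the service types concrete, here is a sketch of a NodePort variant of a Service selecting `app: my-app`. The name `my-svc-nodeport` and the port 30080 are assumptions; omitting `nodePort` lets the cluster pick a free port from the range.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc-nodeport         # hypothetical name
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
  - port: 80                    # cluster-internal port
    targetPort: 8080            # container port
    nodePort: 30080             # assumed; must fall in 30000-32767 by default
```

Inside the cluster the Service still resolves as `my-svc-nodeport.default.svc.cluster.local` on port 80; from outside it is reachable at any node's IP on port 30080.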
Network policies — default: all pods can talk to all pods. Apply a NetworkPolicy to restrict:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
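Allow rules like the one above are usually paired with a default-deny policy, so that anything not explicitly allowed is blocked. A minimal sketch for the `default` namespace follows; it only takes effect when the cluster's network plugin enforces NetworkPolicy.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress    # hypothetical name
  namespace: default
spec:
  podSelector: {}               # empty selector: applies to every pod in the namespace
  policyTypes:
  - Ingress                     # no ingress rules listed, so all inbound traffic is denied
```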
Troubleshooting Playbook
# Pod stuck in Pending
kubectl describe pod <name> # look at Events — usually: no nodes fit, PVC not bound
# Pod CrashLoopBackOff
kubectl logs <pod> --previous # logs from the crashed container
# Pod stuck Terminating
kubectl delete pod <name> --grace-period=0 --force # last resort: skips graceful shutdown
# OOMKilled
kubectl describe pod <name> # last state: OOMKilled
# → increase memory limit or find the leak
# ImagePullBackOff
kubectl describe pod <name> # check image name, pull secret
kubectl get secret regcred -o yaml
# Service not routing
kubectl get endpoints <svc> # should list pod IPs; empty = label mismatch
kubectl run tmp --image=busybox --restart=Never -it --rm -- wget -qO- http://<svc>:<port>
# Node NotReady
kubectl describe node <name> # Events, Conditions
kubectl get pods -A --field-selector spec.nodeName=<name>
# Check resource usage
kubectl top pods
kubectl top nodes
Helm (package manager)
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx # the old "stable" repo is archived
helm repo update
helm search repo nginx
helm install my-release ingress-nginx/ingress-nginx -f values.yaml
helm upgrade my-release ingress-nginx/ingress-nginx --set controller.replicaCount=3
helm rollback my-release 1
helm uninstall my-release
helm list
helm status my-release
helm get values my-release
Quick Reference — Common Flags
| Flag | Effect |
|---|---|
| -n <namespace> | Target namespace |
| -A / --all-namespaces | All namespaces |
| -o yaml / -o json | Full output |
| -o wide | Extra columns |
| --dry-run=client | Validate without applying |
| -l <selector> | Filter by label |
| -w | Watch for changes |
| --field-selector | Filter by field (e.g. status.phase=Running) |
| --show-labels | Include labels column |