# Kubernetes Cheat Sheet

## Core Concepts

| Object | What it is |
|---|---|
| **Pod** | Smallest deployable unit — one or more containers sharing network + storage |
| **Deployment** | Manages a ReplicaSet; handles rolling updates + rollbacks |
| **ReplicaSet** | Ensures N pod replicas are running at all times |
| **StatefulSet** | Like Deployment but with stable pod identity + ordered updates (databases) |
| **DaemonSet** | One pod per node — monitoring agents, log collectors |
| **Job / CronJob** | Run-to-completion tasks; CronJob adds scheduling |
| **Service** | Stable DNS + IP in front of a set of pods (label selector) |
| **Ingress** | L7 routing — paths/hostnames → Services |
| **ConfigMap** | Non-sensitive config as key-value or files |
| **Secret** | Sensitive data (passwords, tokens, image-pull credentials); base64-encoded, not encrypted by default |
| **Namespace** | Soft isolation within a cluster |
| **PersistentVolume (PV)** | Cluster-level storage resource |
| **PersistentVolumeClaim (PVC)** | Pod's request for PV storage |
| **ServiceAccount** | Identity for pods to talk to the API server |

---

## kubectl — Essential Commands

### Context & Cluster
```bash
kubectl config get-contexts               # list contexts
kubectl config use-context <name>         # switch cluster
kubectl config current-context
kubectl cluster-info
kubectl api-resources                     # all resource types + short names
```

### Get / Inspect
```bash
kubectl get pods                          # -n <ns> for namespace, -A for all
kubectl get pods -o wide                  # + node, IP
kubectl get all -n <namespace>            # common workload types only; not ConfigMaps, Secrets, Ingress
kubectl describe pod <name>               # events, conditions, resource limits
kubectl logs <pod>                        # -f to follow, --previous for crashed
kubectl logs <pod> -c <container>         # multi-container pod
kubectl get events --sort-by=.lastTimestamp
```

### Create / Apply
```bash
kubectl apply -f manifest.yaml            # create or update (idempotent)
kubectl apply -f ./dir/                   # whole directory
kubectl create deployment nginx --image=nginx --replicas=3
kubectl expose deployment nginx --port=80 --type=ClusterIP
```

### Edit / Patch
```bash
kubectl edit deployment <name>            # opens $EDITOR
kubectl set image deployment/<name> app=nginx:1.25   # "app" is the container name
kubectl scale deployment <name> --replicas=5
kubectl rollout restart deployment <name>
```

### Rollouts
```bash
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name>           # revert one version
kubectl rollout undo deployment/<name> --to-revision=2
```

### Delete
```bash
kubectl delete pod <name>
kubectl delete -f manifest.yaml
kubectl delete deployment,svc <name>
kubectl delete pod <name> --grace-period=0 --force   # skip graceful shutdown (last resort)
```

### Exec / Copy
```bash
kubectl exec -it <pod> -- /bin/sh         # shell into pod
kubectl exec -it <pod> -c <container> -- bash
kubectl cp <pod>:/path/to/file ./local    # copy from pod
kubectl cp ./local <pod>:/path/           # copy to pod
```

### Port Forward
```bash
kubectl port-forward pod/<name> 8080:80
kubectl port-forward svc/<name> 8080:80
```

### Dry Run / Diff
```bash
kubectl apply -f manifest.yaml --dry-run=client   # local only; --dry-run=server validates against the API server
kubectl diff -f manifest.yaml             # show what would change
```

### Output Formats
```bash
kubectl get pods -o yaml
kubectl get pods -o json
kubectl get pods -o jsonpath='{.items[*].metadata.name}'
kubectl get pods -o custom-columns=NAME:.metadata.name,STATUS:.status.phase
```

---

## YAML Templates

### Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: my-app
spec:
  containers:
  - name: app
    image: nginx:1.25
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "500m"
        memory: "256Mi"
    env:
    - name: ENV_VAR
      valueFrom:
        configMapKeyRef:
          name: my-config
          key: my-key
```

### Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: myregistry/my-app:v1.0
        ports:
        - containerPort: 8080
```
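
The `strategy` fields above bound how many pods exist during a rollout. A minimal arithmetic sketch using the values from the manifest (the variable names are illustrative only):

```bash
# Values copied from the Deployment manifest above.
replicas=3
max_surge=1          # extra pods allowed above the desired count
max_unavailable=0    # pods allowed to be missing below the desired count

# Upper bound on pods that exist at once during the rollout.
max_pods=$((replicas + max_surge))
# Lower bound on pods that must stay available during the rollout.
min_available=$((replicas - max_unavailable))

echo "max pods during rollout: $max_pods"
echo "min available during rollout: $min_available"
```

With these settings a new pod has to become ready before an old one is removed, so capacity should not drop below 3, but the cluster needs room for a fourth pod.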

### Service
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP        # ClusterIP | NodePort | LoadBalancer
```

### ConfigMap
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config
data:
  DATABASE_URL: "postgres://db:5432/mydb"
  config.yaml: |
    key: value
    other: value
```

### Secret
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-secret
type: Opaque
stringData:                 # stringData auto-encodes; data expects base64
  DB_PASSWORD: "hunter2"
```
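
Values under `data` are only base64-encoded, which is an encoding and not encryption. A small sketch of the round trip for the example password, assuming GNU coreutils `base64` (other implementations may spell the decode flag differently):

```bash
# Encode the way a value under `data:` must be written.
# printf avoids the trailing newline that echo would add.
encoded=$(printf '%s' 'hunter2' | base64)
echo "$encoded"

# Anyone who can read the Secret can reverse it just as easily.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```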

### Ingress (nginx)
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-svc
            port:
              number: 80
```

### PersistentVolumeClaim
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 10Gi
```

---

## Namespaces

```bash
kubectl get namespaces
kubectl create namespace staging
kubectl get pods -n staging
kubectl config set-context --current --namespace=staging   # set default ns

# Resource quota per namespace
kubectl create quota my-quota --hard=cpu=2,memory=4Gi,pods=10 -n staging
```
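
To get a feel for what the quota above allows, the arithmetic can be sketched as follows. In a quota, `cpu` and `memory` cap the sum of pod requests. The per-pod requests used here (250m CPU, 64Mi memory) are assumptions for illustration; only the quota values come from the command above.

```bash
# Quota from the command above, converted to base units.
quota_cpu_m=2000        # cpu=2      -> 2000 millicores
quota_mem_mi=4096       # memory=4Gi -> 4096 Mi
quota_pods=10           # pods=10

# Assumed per-pod requests (illustrative, not part of the quota).
pod_cpu_m=250
pod_mem_mi=64

by_cpu=$((quota_cpu_m / pod_cpu_m))     # pods that fit by CPU requests
by_mem=$((quota_mem_mi / pod_mem_mi))   # pods that fit by memory requests

# The tightest of the three limits decides.
max=$quota_pods
if [ "$by_cpu" -lt "$max" ]; then max=$by_cpu; fi
if [ "$by_mem" -lt "$max" ]; then max=$by_mem; fi
echo "pods that fit: $max"
```

Under these assumptions CPU is the binding constraint, so the namespace fills up before the `pods=10` cap is reached.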

---

## Labels & Selectors

```bash
kubectl get pods -l app=my-app
kubectl get pods -l 'env in (prod, staging)'
kubectl label pod <name> tier=backend
kubectl label pod <name> tier-                   # remove label
```

---

## Health Probes

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

startupProbe:                  # gives slow-starting apps more time
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
```

**Liveness**: is the container still healthy? The kubelet restarts it if the probe fails.  
**Readiness**: is it ready to receive traffic? The pod is removed from Service endpoints while the probe fails.  
**Startup**: has the app finished starting? Liveness and readiness probes are held off until it succeeds.
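
The probe settings translate into concrete time budgets. A rough sketch using the values from the YAML above; it ignores probe timeouts and timing jitter, so treat the results as approximations:

```bash
# Startup probe: the app gets up to failureThreshold * periodSeconds to start.
startup_failure_threshold=30
startup_period=10
startup_budget=$((startup_failure_threshold * startup_period))
echo "startup budget: ${startup_budget}s"

# Liveness probe: consecutive failures needed before a restart.
liveness_failure_threshold=3
liveness_period=5
liveness_window=$((liveness_failure_threshold * liveness_period))
echo "unhealthy for roughly ${liveness_window}s before restart"
```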

---

## Resource Requests & Limits

```yaml
resources:
  requests:       # scheduler uses this to find a node
    cpu: "250m"   # 250 millicores = 0.25 CPU
    memory: "64Mi"
  limits:         # enforced at runtime; CPU throttled, memory OOMKilled
    cpu: "1"
    memory: "256Mi"
```

- CPU is **compressible** — throttled when over limit, not killed.  
- Memory is **incompressible** — container is OOMKilled when over limit.  
- Always set requests; limits are optional but recommended.
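
The units can be made concrete with two small converters. This is a sketch that handles only the suffixes used in this sheet (`m` for CPU; `Ki`, `Mi`, `Gi` for memory); the function names are made up for illustration, and the real quantity format supports more suffixes.

```bash
# Convert a CPU quantity ("250m", "1", "0.5") to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;                                        # already millicores
    *)  awk -v c="$1" 'BEGIN { printf "%.0f\n", c * 1000 }' ;;  # whole or fractional cores
  esac
}

# Convert a binary-suffixed memory quantity ("64Mi", "1Gi") to bytes.
mem_to_bytes() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *)   echo "$1" ;;                                           # assume plain bytes
  esac
}

cpu_to_millicores 250m    # 250
cpu_to_millicores 1       # 1000
mem_to_bytes 64Mi         # 67108864
mem_to_bytes 256Mi        # 268435456
```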

---

## Scheduling

```yaml
# Node selector (simple)
nodeSelector:
  kubernetes.io/arch: amd64

# Node affinity (flexible)
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-type
          operator: In
          values: [gpu]

# Pod anti-affinity (spread replicas across nodes)
# In a real pod spec, nest this under the same affinity: key as nodeAffinity
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: my-app
        topologyKey: kubernetes.io/hostname

# Taints & tolerations
# Node: kubectl taint nodes <node> key=value:NoSchedule
tolerations:
- key: "key"
  operator: "Equal"
  value: "value"
  effect: "NoSchedule"
```

---

## RBAC

```yaml
# ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-sa
  namespace: default
---
# Role (namespace-scoped)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: ServiceAccount
  name: my-sa
  namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Use `ClusterRole` + `ClusterRoleBinding` for cluster-wide permissions.

```bash
kubectl auth can-i list pods --as=system:serviceaccount:default:my-sa
```
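
The `--as` value follows the pattern `system:serviceaccount:<namespace>:<name>`. A tiny helper can build it and avoid typos; the function name `sa_user` is hypothetical and used only in this sketch:

```bash
# Build the API username of a ServiceAccount from namespace and name.
sa_user() {
  echo "system:serviceaccount:$1:$2"
}

sa_user default my-sa

# Example use against a cluster:
#   kubectl auth can-i list pods --as="$(sa_user default my-sa)" -n default
```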

---

## Networking

| Service Type | Use case |
|---|---|
| `ClusterIP` | Internal only — default |
| `NodePort` | Exposes on each node's IP at a static port (default range 30000–32767) |
| `LoadBalancer` | Cloud LB provisioned automatically |
| `ExternalName` | DNS alias to external hostname |

**DNS within cluster:** `<service>.<namespace>.svc.cluster.local`  
Shorthand within same namespace: just `<service>`
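
The naming rule can be sketched as a small helper. The function name `svc_fqdn` is hypothetical, and `cluster.local` is assumed as the cluster domain, as in the pattern above (a cluster can be configured with a different one):

```bash
# Build the in-cluster DNS name of a Service.
# The namespace falls back to "default" when omitted.
svc_fqdn() {
  echo "$1.${2:-default}.svc.cluster.local"
}

svc_fqdn my-svc
svc_fqdn api-svc staging
```

Pods in other namespaces need at least `<service>.<namespace>`; the full name is the safest choice in configuration shared across namespaces.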

**Network policies**: by default, every pod can talk to every other pod. A NetworkPolicy restricts traffic to the pods it selects, provided the cluster's network plugin enforces policies:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
```

---

## Troubleshooting Playbook

```bash
# Pod stuck in Pending
kubectl describe pod <name>       # look at Events — usually: no nodes fit, PVC not bound

# Pod CrashLoopBackOff
kubectl logs <pod> --previous     # logs from the crashed container

# Pod stuck Terminating (force delete is a last resort; the container may still be running on the node)
kubectl delete pod <name> --grace-period=0 --force

# OOMKilled
kubectl describe pod <name>       # last state: OOMKilled
# → increase memory limit or find the leak

# ImagePullBackOff
kubectl describe pod <name>       # check image name, pull secret
kubectl get secret regcred -o yaml

# Service not routing
kubectl get endpoints <svc>       # should list pod IPs; empty = label mismatch
kubectl run tmp --image=busybox --restart=Never -it --rm -- wget -qO- http://<svc>:<port>

# Node NotReady
kubectl describe node <name>      # Events, Conditions
kubectl get pods -A --field-selector spec.nodeName=<name>

# Check resource usage
kubectl top pods
kubectl top nodes
```

---

## Helm (package manager)

```bash
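# the legacy "stable" repo is archived; charts now live in per-project repos
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm search repo nginx

helm install my-release ingress-nginx/ingress-nginx -f values.yaml
helm upgrade my-release ingress-nginx/ingress-nginx --set controller.replicaCount=3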
helm rollback my-release 1
helm uninstall my-release

helm list
helm status my-release
helm get values my-release
```

---

## Quick Reference — Common Flags

| Flag | Effect |
|---|---|
| `-n <namespace>` | Target namespace |
| `-A` / `--all-namespaces` | All namespaces |
| `-o yaml` / `-o json` | Full object as YAML / JSON |
| `-o wide` | Extra columns |
| `--dry-run=client` | Print what would be sent without applying it (`server` validates against the API server) |
| `-l <selector>` | Filter by label |
| `-w` | Watch for changes |
| `--field-selector` | Filter by field (e.g. `status.phase=Running`) |
| `--show-labels` | Include labels column |