Comprehensive Kubernetes Guide

Table of Contents

  1. Introduction to Kubernetes
  2. Kubernetes Architecture
  3. Core Components
  4. Kubernetes Object Hierarchy
  5. Kubernetes Objects and Resources
  6. Identity and Access Management
  7. Essential kubectl Commands
  8. Networking in Kubernetes
  9. Best Practices

Introduction to Kubernetes

Kubernetes (K8s) is an open-source container orchestration platform for automating deployment, scaling, and management of containerized applications. It was originally developed by Google and is now maintained by the Cloud Native Computing Foundation (CNCF).

Key Features of Kubernetes

  • Automated deployment and rollbacks
  • Service discovery and load balancing
  • Self-healing capabilities
  • Horizontal scaling
  • Configuration management
  • Storage orchestration
  • Secret and configuration management

Kubernetes Architecture

Kubernetes follows a client-server architecture composed of control plane components and worker node components.

Control Plane Components

The control plane manages the worker nodes and the Pods in the cluster. Components include:

  1. kube-apiserver: Exposes the Kubernetes API, the front-end to the Kubernetes control plane
  2. etcd: Consistent and highly-available key-value store used as Kubernetes' backing store
  3. kube-scheduler: Watches for newly created Pods with no assigned node, and selects a node for them
  4. kube-controller-manager: Runs controller processes like node controller, endpoints controller, etc.
  5. cloud-controller-manager: Interfaces with the underlying cloud providers (if applicable)

Node Components

These components run on every node, maintaining running Pods and providing the Kubernetes runtime environment:

  1. kubelet: Ensures containers are running in a Pod
  2. kube-proxy: Maintains network rules on nodes for forwarding network traffic to Pods
  3. Container Runtime: Software responsible for running containers, such as containerd or CRI-O (Docker Engine requires the cri-dockerd adapter in recent Kubernetes versions)

Core Components

kube-apiserver

The kube-apiserver is the front end for the Kubernetes control plane that exposes the Kubernetes API. It's designed to scale horizontally by deploying more instances.

graph TD
  A[Clients] -->|RESTful API requests| B(kube-apiserver)
  B -->|Validation & Authentication| C(etcd)
  B -->|Resource Creation| D(kubelet on nodes)
  B -->|Scheduling| E(kube-scheduler)
  B -->|Controller Operations| F(kube-controller-manager)
  B -->|Cloud Provider Operations| G(cloud-controller-manager)

Responsibilities:

  • Serves REST operations and validates API requests
  • Handles API registration and discovery
  • Implements authentication and authorization
  • Acts as the central hub for all cluster operations

etcd

etcd is a distributed key-value store that provides reliable storage for all cluster data. It serves as the single source of truth for the entire Kubernetes cluster.

graph TD
  A[kube-apiserver] -->|Read/Write| B[(etcd)]
  B -->|Distributed consensus| C[(etcd replica 1)]
  B -->|Distributed consensus| D[(etcd replica 2)]
  B -->|Distributed consensus| E[(etcd replica n)]

Key features:

  • Strong consistency guarantees
  • Watch mechanism for changes
  • TTL-based key expiration
  • Distributed consensus using Raft algorithm
  • Secure communication via TLS

kube-scheduler

The kube-scheduler is responsible for assigning newly created Pods to nodes based on resource requirements, constraints, and other factors.

graph TD
  A[kube-apiserver] -->|Watch for unscheduled pods| B(kube-scheduler)
  B -->|Filtering| C[Filter Nodes]
  C -->|Scoring| D[Score Nodes]
  D -->|Binding| E[Select Best Node]
  E -->|Update Pod| A

Scheduling process:

  1. Filter nodes that meet the Pod's requirements
  2. Rank and score the remaining nodes
  3. Select the best node and bind the Pod to it

Scheduling decisions take into account resource requests, node selectors, affinity/anti-affinity rules, and taints/tolerations.
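
As a rough sketch of how these constraints appear in a Pod spec (the nodeSelector label, taint key, and values below are illustrative, not required by Kubernetes):

apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo
spec:
  nodeSelector:
    disktype: ssd                # filtering: only nodes labeled disktype=ssd are candidates
  tolerations:
  - key: "dedicated"             # allows scheduling onto nodes tainted dedicated=batch:NoSchedule
    operator: "Equal"
    value: "batch"
    effect: "NoSchedule"
  affinity:
    podAntiAffinity:             # scoring: prefer nodes not already running app=web Pods
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: nginx:1.19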

kube-controller-manager

The kube-controller-manager runs controller processes that regulate the state of the system. Each controller is a separate process, but they are compiled into a single binary and run in a single process for simplicity.

graph TD
  A[kube-apiserver] <-->|Watch/Update| B(kube-controller-manager)
  B -->|Manage nodes| C[Node Controller]
  B -->|Manage replicasets| D[ReplicaSet Controller]
  B -->|Manage deployments| E[Deployment Controller]
  B -->|Manage services| F[Service Controller]
  B -->|Other controllers| G[...]

Some important controllers:

  • Node Controller: Monitors node health
  • ReplicaSet Controller: Ensures correct number of Pods
  • Deployment Controller: Manages deployment lifecycle
  • Service Controller: Creates load balancers for Services
  • Endpoint Controller: Populates Endpoints objects

cloud-controller-manager

The cloud-controller-manager interfaces with the underlying cloud provider's API to manage cloud-specific resources.

graph TD
  A[kube-apiserver] <-->|Watch/Update| B(cloud-controller-manager)
  B <-->|API calls| C[Cloud Provider API]
  B -->|Manage nodes| D[Node Controller]
  B -->|Manage load balancers| E[Service Controller]
  B -->|Manage routes| F[Route Controller]

Responsibilities:

  • Node Controller: Updates node information from cloud provider
  • Route Controller: Sets up routes in cloud infrastructure
  • Service Controller: Creates, updates, and deletes cloud provider load balancers

Kubernetes Object Hierarchy

Kubernetes has a clear hierarchy of resources and objects.

graph TD
  A[Cluster] -->|Contains| B[Namespaces]
  B -->|Group| C[Resources]
  C -->|Include| D[Pods]
  C -->|Include| E[Deployments]
  C -->|Include| F[Services]
  C -->|Include| G[ConfigMaps/Secrets]
  C -->|Include| H[PVs/PVCs]
  D -->|Contains| I[Containers]
  I -->|Uses| J[Images]

A more detailed view of container-pod-node hierarchy:

graph TD
  A[Cluster] -->|Contains| B[Nodes]
  B -->|Run| C[Pods]
  C -->|Contains| D[Containers]
  D -->|Based on| E[Images]

Kubernetes Objects and Resources

Pods

Pods are the smallest deployable units in Kubernetes that can be created and managed. A Pod represents a single instance of a running process in your cluster and contains one or more containers.

graph TD
  A[Pod] -->|Contains| B[Container 1]
  A -->|Contains| C[Container 2]
  A -->|Has| D[Shared Network Namespace]
  A -->|Has| E[Shared Storage Volumes]
  A -->|Has| F[Pod Specification]

Key features:

  • Shared network namespace (containers can communicate via localhost)
  • Shared storage volumes
  • Always scheduled on the same node
  • Ephemeral - not designed to survive failures

Example Pod YAML:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.19
    ports:
    - containerPort: 80

ReplicaSets

ReplicaSets ensure that a specified number of Pod replicas are running at any given time. They are one of the key mechanisms behind self-healing in Kubernetes.

graph TD
  A[ReplicaSet] -->|Controls| B[Pod 1]
  A -->|Controls| C[Pod 2]
  A -->|Controls| D[Pod 3]
  A -->|Defined by| E[Pod Template]
  A -->|Managed by| F[Deployment]
  A -->|Uses| G[Selector]

Key features:

  • Ensures a specified number of replicas are running
  • Uses a selector to identify which Pods to manage
  • Creates new Pods if existing ones are deleted or terminated
  • Usually managed by Deployments

Example ReplicaSet YAML:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19

Deployments

Deployments provide declarative updates for Pods and ReplicaSets. They allow you to describe an application's lifecycle, including upgrades, rollbacks, scaling, and more.

graph TD
  A[Deployment] -->|Manages| B[ReplicaSet 1]
  A -->|Manages| C[ReplicaSet 2 Old Version]
  B -->|Controls| D[Pod 1]
  B -->|Controls| E[Pod 2]
  B -->|Controls| F[Pod 3]
  A -->|Provides| G[Rolling Updates]
  A -->|Provides| H[Rollbacks]

Key features:

  • Declarative updates
  • Rolling updates and rollbacks
  • Scaling
  • Pause and resume
  • Revision history

Example Deployment YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80

Services

Services define a logical set of Pods and a policy for accessing them. They enable network access to a set of Pods, providing stable endpoints.

graph TD
  A[Service] -->|Selects| B[Pod 1]
  A -->|Selects| C[Pod 2]
  A -->|Selects| D[Pod 3]
  A -->|Types| E[ClusterIP]
  A -->|Types| F[NodePort]
  A -->|Types| G[LoadBalancer]
  A -->|Types| H[ExternalName]

Service types:

  • ClusterIP: Internal only, accessible within the cluster
  • NodePort: Exposes the service on each Node's IP at a static port
  • LoadBalancer: Exposes service externally using cloud provider's load balancer
  • ExternalName: Maps service to an external name via DNS

Example Service YAML:

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
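
The same selector can instead be exposed as a NodePort, for example when no cloud load balancer is available. A minimal sketch; the nodePort value is illustrative and must fall in the cluster's NodePort range (30000-32767 by default):

apiVersion: v1
kind: Service
metadata:
  name: nginx-nodeport
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080    # illustrative; omit to let Kubernetes pick a port
  type: NodePort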

ConfigMaps and Secrets

ConfigMaps and Secrets store configuration data separately from application code.

ConfigMaps

ConfigMaps store non-sensitive configuration data in key-value pairs.

graph TD
  A[ConfigMap] -->|Referenced by| B[Pod 1]
  A -->|Referenced by| C[Pod 2]
  A -->|Used as| D[Environment Variables]
  A -->|Used as| E[Configuration Files]
  A -->|Used as| F[Command-line Arguments]

Example ConfigMap YAML:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  database_url: "mysql://db:3306/mydb"
  app_mode: "production"
  config.json: |
    {
      "key": "value",
      "logging": true
    }
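
A minimal sketch of a Pod consuming this ConfigMap, both as an environment variable and as a mounted file (the Pod name and mount path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
  - name: app
    image: nginx:1.19
    env:
    - name: APP_MODE                 # injected from the app_mode key
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: app_mode
    volumeMounts:
    - name: config-volume            # config.json appears as /etc/config/config.json
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config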

Secrets

Secrets store sensitive information like passwords, tokens, and keys.

graph TD
  A[Secret] -->|Referenced by| B[Pod 1]
  A -->|Referenced by| C[Pod 2]
  A -->|Types| D[Opaque]
  A -->|Types| E[TLS]
  A -->|Types| F[Docker Registry]
  A -->|Used as| G[Environment Variables]
  A -->|Used as| H[Files in a Volume]

Example Secret YAML:

apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: YWRtaW4=  # base64 encoded "admin"
  password: cGFzc3dvcmQxMjM=  # base64 encoded "password123"
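
A minimal sketch of a Pod reading this Secret through environment variables (the Pod and variable names are illustrative; the same Secret could instead be mounted as files in a volume):

apiVersion: v1
kind: Pod
metadata:
  name: db-client
spec:
  containers:
  - name: app
    image: nginx:1.19
    env:
    - name: DB_USERNAME
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: username
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: password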

Volumes, Persistent Volumes and Persistent Volume Claims

Volumes

Volumes in Kubernetes provide a way for containers in a Pod to share files and persist data beyond the container lifecycle.

graph TD
  A[Pod] -->|Has| B[Container 1]
  A -->|Has| C[Container 2]
  A -->|Mounts| D[Volume]
  B -->|Accesses| D
  C -->|Accesses| D

Volume types:

  • emptyDir: Empty directory created when a Pod is assigned to a node
  • hostPath: Mounts a file or directory from the host node's filesystem
  • configMap: Provides a way to inject configuration data
  • secret: Used to pass sensitive information to Pods
  • persistentVolumeClaim: Claims a PersistentVolume for Pod use
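
A minimal sketch of two containers sharing an emptyDir volume (the images and commands are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod
spec:
  containers:
  - name: writer
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  - name: reader
    image: busybox:1.36
    command: ["sh", "-c", "touch /data/out.log; tail -f /data/out.log"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  volumes:
  - name: shared-data
    emptyDir: {}                     # created empty when the Pod is scheduled, removed with the Pod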

Persistent Volumes and Persistent Volume Claims

PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs) provide storage resources in a cluster that have a lifecycle independent of any Pod.

graph TD
  A[PersistentVolume] -->|Claimed by| B[PersistentVolumeClaim]
  B -->|Used by| C[Pod 1]
  B -->|Used by| D[Pod 2]
  A -->|Types| E[NFS]
  A -->|Types| F[AWS EBS]
  A -->|Types| G[GCE PD]
  A -->|Types| H[Azure Disk]

Example PV YAML:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-storage
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard
  hostPath:
    path: "/mnt/data"

Example PVC YAML:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: standard
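
A minimal sketch of a Pod mounting the claim above (the Pod name and mount path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: pvc-consumer
spec:
  containers:
  - name: app
    image: nginx:1.19
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pvc-claim           # binds the Pod to the storage backing pvc-claim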

Horizontal Pod Autoscaler (HPA)

The Horizontal Pod Autoscaler automatically scales the number of Pods in a Deployment, ReplicaSet, or StatefulSet based on observed CPU utilization or other metrics.

graph TD
  A[HPA] -->|Monitors| B[CPU Utilization]
  A -->|Monitors| C[Memory Usage]
  A -->|Monitors| D[Custom Metrics]
  A -->|Scales| E[Deployment/ReplicaSet]
  E -->|Creates/Removes| F[Pods]

Key features:

  • Automatically scales Pods based on metrics
  • Supports CPU, memory, and custom metrics
  • Configurable scaling behavior (min, max replicas)
  • Customizable scaling algorithms

Example HPA YAML:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80

Identity and Access Management

Kubernetes uses a combination of authentication, authorization, and admission control to secure the API server.

graph TD
  A[API Request] -->|Step 1| B[Authentication]
  B -->|Step 2| C[Authorization]
  C -->|Step 3| D[Admission Control]
  D -->|Approved| E[API Server Processing]

Role-Based Access Control (RBAC)

RBAC is a method of regulating access based on the roles of users within the organization.

graph TD
  A[User/ServiceAccount] -->|Bound to| B[Role/ClusterRole]
  A -->|via| C[RoleBinding/ClusterRoleBinding]
  B -->|Has| D[Rules]
  D -->|Define| E[Resources]
  D -->|Define| F[Verbs]

Key components:

  • Role: Permissions within a namespace
  • ClusterRole: Cluster-wide permissions
  • RoleBinding: Assigns Role to users in a namespace
  • ClusterRoleBinding: Assigns ClusterRole to users cluster-wide

Example Role and RoleBinding YAML:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
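
For cluster-wide permissions, the same pattern uses ClusterRole and ClusterRoleBinding. A minimal sketch; the role name and ServiceAccount are illustrative:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: read-nodes
subjects:
- kind: ServiceAccount
  name: monitoring-agent             # hypothetical ServiceAccount
  namespace: monitoring
roleRef:
  kind: ClusterRole
  name: node-reader
  apiGroup: rbac.authorization.k8s.io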

Essential kubectl Commands

kubectl is the command-line tool for interacting with the Kubernetes API.

Cluster Information

# Get cluster info
kubectl cluster-info

# Check component status (the componentstatuses API is deprecated in recent Kubernetes versions)
kubectl get componentstatuses

# View nodes
kubectl get nodes
kubectl describe node <node-name>

Working with Pods

# List pods
kubectl get pods
kubectl get pods -o wide
kubectl get pods --all-namespaces

# Create a pod from YAML
kubectl apply -f pod.yaml

# Describe pod details
kubectl describe pod <pod-name>

# Execute command in pod
kubectl exec -it <pod-name> -- /bin/bash

# Get pod logs
kubectl logs <pod-name>
kubectl logs <pod-name> -c <container-name>  # For multi-container pods

# Delete pod
kubectl delete pod <pod-name>

Deployments

# Create a deployment
kubectl create deployment nginx --image=nginx

# List deployments
kubectl get deployments

# Scale a deployment
kubectl scale deployment nginx --replicas=3

# Update a deployment
kubectl set image deployment/nginx nginx=nginx:1.19

# Rollback a deployment
kubectl rollout undo deployment/nginx

# Check rollout status
kubectl rollout status deployment/nginx

# View rollout history
kubectl rollout history deployment/nginx

Services

# Create a service
kubectl expose deployment nginx --port=80 --target-port=80

# List services
kubectl get services

# Describe a service
kubectl describe service nginx

# Delete a service
kubectl delete service nginx

ConfigMaps and Secrets

# Create configmap
kubectl create configmap app-config --from-file=config.properties

# Create secret
kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password=password123

# List configmaps and secrets
kubectl get configmaps
kubectl get secrets

Namespaces

# Create namespace
kubectl create namespace development

# Set default namespace
kubectl config set-context --current --namespace=development

# List resources in all namespaces
kubectl get pods --all-namespaces

Context and Configuration

# View kubeconfig
kubectl config view

# Get current context
kubectl config current-context

# Switch context
kubectl config use-context <context-name>

Resource Management

# Apply resources from directory
kubectl apply -f ./dir

# Delete resources
kubectl delete -f ./dir

# Explain resource
kubectl explain pods
kubectl explain pods.spec.containers

Networking in Kubernetes

Kubernetes networking addresses four primary concerns:

  1. Pod-to-Pod Communication: Pods on the same node communicate via local interfaces; Pods on different nodes communicate across the cluster network set up by the CNI plugin (often an overlay network)
  2. Pod-to-Service Communication: Services provide stable endpoints for pods
  3. External-to-Service Communication: External traffic reaches services through NodePort, LoadBalancer, or Ingress
  4. Node-to-Node Communication: Nodes communicate with each other for pod networking and cluster operations

graph TD
  A[Internet] -->|Ingress| B[Ingress Controller]
  A -->|LoadBalancer| C[Service LoadBalancer]
  A -->|NodePort| D[Service NodePort]
  B -->|Routes to| E[Service ClusterIP]
  C -->|Routes to| E
  D -->|Routes to| E
  E -->|Selects| F[Pod 1]
  E -->|Selects| G[Pod 2]
  F -->|Network| G
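
The Ingress path shown above routes HTTP traffic to Services by host and path. A minimal sketch, assuming an ingress controller (such as NGINX) is installed; the hostname is illustrative:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  ingressClassName: nginx            # assumes an NGINX ingress controller is deployed
  rules:
  - host: example.com                # illustrative hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-service      # routes to the ClusterIP Service defined earlier
            port:
              number: 80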

Network Policies

Network Policies specify how groups of pods are allowed to communicate with each other and other network endpoints.

graph TD
  A[NetworkPolicy] -->|Selects| B[Pod 1]
  A -->|Selects| C[Pod 2]
  A -->|Rules| D[Ingress Rules]
  A -->|Rules| E[Egress Rules]
  D -->|Allow/Deny| F[Traffic In]
  E -->|Allow/Deny| G[Traffic Out]

Example NetworkPolicy YAML:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80

Best Practices

Resource Management

  • Set resource requests and limits for containers (see the sketch after this list)
  • Implement Horizontal Pod Autoscaling
  • Use namespaces to organize resources
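
A minimal sketch of requests and limits in a Pod template's container spec (the values are illustrative and should be tuned per workload):

containers:
- name: nginx
  image: nginx:1.19
  resources:
    requests:            # the scheduler reserves this much when placing the Pod
      cpu: "100m"
      memory: "128Mi"
    limits:              # the kubelet enforces this ceiling at runtime
      cpu: "500m"
      memory: "256Mi"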

Security

  • Follow the principle of least privilege using RBAC
  • Use network policies to restrict pod communication
  • Secure etcd with TLS
  • Use Pod Security Policies or Pod Security Standards
  • Regularly update and scan your images for vulnerabilities

High Availability

  • Deploy multiple replicas of applications
  • Use anti-affinity rules to distribute pods across nodes
  • Implement readiness and liveness probes (see the sketch after this list)
  • Design for failure and recovery
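
A minimal sketch of readiness and liveness probes in a container spec (the paths, ports, and timings are illustrative):

containers:
- name: nginx
  image: nginx:1.19
  readinessProbe:        # traffic is routed to the Pod only while this succeeds
    httpGet:
      path: /
      port: 80
    initialDelaySeconds: 5
    periodSeconds: 10
  livenessProbe:         # the container is restarted if this keeps failing
    httpGet:
      path: /
      port: 80
    initialDelaySeconds: 15
    periodSeconds: 20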

Monitoring and Logging

  • Implement centralized logging with tools like EFK or ELK
  • Set up monitoring with Prometheus and Grafana
  • Use distributed tracing for microservices

CI/CD Integration

  • Implement GitOps workflow with tools like Flux or ArgoCD
  • Automate deployments and rollbacks
  • Implement canary deployments or blue-green deployments