Kubernetes is an orchestrator for applications deployed in containers.

Kubernetes Architecture

Master Nodes

Manage the Kubernetes platform and administer the worker nodes running the containers (workloads).

Kube API

Exposes the API so other components can communicate.

ETCD

Distributed key-value store that tracks the state of the whole cluster.

Controller Manager

Runs the controllers that maintain the desired state of the system.

Scheduler

Assigns Pods to nodes based on different criteria.

Worker Nodes

Run the Pods (workloads).

Kubelet

Communicates with the Master Node and makes sure the Pods are running.

Container Engine

Software that runs the containers on the Node - containerd, Docker, …

Kubernetes Manifests

Get skeleton yaml files for Kubernetes objects by using:

kubectl run my-pod --image=nginx --dry-run=client -o yaml

kubectl create deployment my-deployment --image=nginx --dry-run=client -o yaml
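
For reference, the first command produces a Pod skeleton roughly like the following (exact fields depend on your kubectl version):

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: my-pod
  name: my-pod
spec:
  containers:
  - image: nginx
    name: my-pod
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}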

Services

Services are Kubernetes objects used to create network connectivity between different Kubernetes objects. They expose ports on the Pods to other workloads or to the outside world.

NodePort

Exposes the ports on the Pods as ports on the Node where the Pods are running, using ports from 30000 up (the default range is 30000-32767).

ClusterIP

A new virtual IP created at the cluster level; ports are exposed on this IP. Used for internal communication between microservices.

LoadBalancer

Used in cloud environments to create a load balancer provided by the specific cloud.
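
As a minimal sketch, a NodePort Service manifest could look like this (the Service name, selector label and port numbers are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: my-service              # hypothetical name
spec:
  type: NodePort
  selector:
    app: my-app                 # must match the label on the target Pods
  ports:
  - port: 80                    # Service (ClusterIP) port
    targetPort: 80              # container port on the Pod
    nodePort: 30080             # must fall in the 30000-32767 range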

Taints and Tolerations

A taint is a mechanism a Node uses to repel Pods: Pods cannot be scheduled on a tainted Node unless they have a toleration set for that taint. A toleration, on the other hand, gives a Pod the ability to run on a Node with a specific taint, i.e. to be scheduled on the tainted Node.
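
For illustration, tainting a Node and tolerating that taint could look like this (the node name, key and value are assumptions):

# Taint the node so only tolerating Pods can be scheduled on it
kubectl taint nodes node1 app=critical:NoSchedule

# Matching toleration in the Pod spec
apiVersion: v1
kind: Pod
metadata:
  name: critical-pod
spec:
  containers:
  - name: app
    image: nginx
  tolerations:
  - key: "app"
    operator: "Equal"
    value: "critical"
    effect: "NoSchedule"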

Node Selectors and Affinity

Node selectors instruct the scheduler where a Pod may run based on specific labels. Matching labels must be set on the Pod (nodeSelector) and on the Node; if they match, the Pod can run there.

Node affinity is more flexible and expresses the Pod's preference for where to run based on matching labels. It supports two modes, requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution, so you can fine-tune the Pod's placement preferences.
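
A minimal sketch of both approaches, assuming a node labeled disktype=ssd (node name and label are assumptions):

# Label a node, then target it with a nodeSelector
kubectl label nodes node1 disktype=ssd

apiVersion: v1
kind: Pod
metadata:
  name: ssd-pod
spec:
  nodeSelector:
    disktype: ssd               # hard requirement: schedule only on matching nodes
  containers:
  - name: app
    image: nginx

# The same preference as a soft node affinity rule (replaces nodeSelector in the Pod spec)
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd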

DaemonSet

A DaemonSet is a Kubernetes object similar to a Deployment, but it makes sure exactly one Pod instance runs on each Node in the cluster. Classic use cases are monitoring agents and log collectors.
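
A minimal DaemonSet sketch (the name, labels and image are assumptions):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: collector
        image: fluentd          # any log-collector or monitoring-agent image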

Drain, Cordon and Uncordon

Sometimes you need to bring a Node down for maintenance or an upgrade. There are a few commands that can help.

Cordon the Node

This will prevent new Pods from being scheduled on the cordoned Node.

kubectl cordon <node>

Drain the Node

This will evict all the Pods on this Node and schedule them on other Nodes (if available).


kubectl drain <node>

Uncordon the Node

This will enable the Node to accept newly scheduled Pods again.


kubectl uncordon <node>

KubeConfig

This is the file you need to access a cluster using kubectl or tools like Lens or k9s. You can update this file to keep certificates, users and cluster info for multiple clusters, so you can simply switch the context and work on the cluster you want.

Clusters

Names of the clusters, their API server addresses and certificates.

Contexts

A context pairs a cluster, a user and a namespace, setting the complete working environment for that user in the cluster.

Users

Users and their client certificates, keys or tokens.
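
For example, you can inspect and switch contexts with kubectl:

# List all contexts defined in the kubeconfig
kubectl config get-contexts

# Switch the active context and verify
kubectl config use-context <context-name>
kubectl config current-context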

In microk8s this file can be created by:

microk8s kubectl config view --raw

In k3s this file is created at /etc/rancher/k3s/k3s.yaml and provides access to the cluster on localhost.

RBAC

Create User with RBAC

A Role is created to allow a user limited access to namespaced objects like Pods, Services, ConfigMaps, etc… A RoleBinding is used to pair the user with the Role.

# Create Role Template
kubectl create role test --verb=list --resource=pod --dry-run=client -o yaml

You can now adjust it:


# Allow to list PODs in namespace

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  creationTimestamp: null
  name: test
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - list

Now, create a RoleBinding to bind the user to this Role:

kubectl create rolebinding reader --role=test --user=reader-user --dry-run=client -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  creationTimestamp: null
  name: reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: test
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: reader-user
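
After applying both manifests, you can verify the permissions by impersonating the user:

# Should return "yes" for the allowed verb/resource and "no" otherwise
kubectl auth can-i list pods --as reader-user
kubectl auth can-i delete pods --as reader-user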

Cluster Roles

ClusterRoles are used to allow a user access to cluster-wide objects in Kubernetes like namespaces, nodes, APIs, etc… A ClusterRoleBinding is used to bind the user to a specific ClusterRole.

kubectl create clusterrole test --verb=list,get --resource=namespaces --dry-run=client -o yaml

and get the template:


# Allow user to list all namespaces in the cluster
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: null
  name: test
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - list
  - get

Now, create a ClusterRoleBinding to bind the user to the specific ClusterRole. It works the same way as the RoleBinding above.
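
For example (the binding name ns-reader is an assumption):

kubectl create clusterrolebinding ns-reader --clusterrole=test --user=reader-user --dry-run=client -o yaml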

Service Accounts

Just like Roles and ClusterRoles provide limited access for users, ServiceAccounts provide the same for applications.

Use a RoleBinding or ClusterRoleBinding to bind a ServiceAccount to a Role or ClusterRole.
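
A minimal sketch reusing the "test" Role from above (the ServiceAccount name and namespace are assumptions):

# Create the ServiceAccount and bind it to the Role in the default namespace
kubectl create serviceaccount app-sa
kubectl create rolebinding app-sa-reader --role=test --serviceaccount=default:app-sa

# Reference it in the Pod spec so the app uses this identity
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  serviceAccountName: app-sa
  containers:
  - name: app
    image: nginx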

Image Security

Kubernetes runs your apps from container images inside Pods. Images, like any other software, should be as secure as possible: protected from malware, scanned for vulnerabilities and downloaded from trusted sources.

You can use various 3rd-party tools (Trivy, etc…) to scan and patch your images. Pods should be instructed in their manifests to use Kubernetes Secrets holding registry credentials and to pull images from trusted registry URLs.
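
A sketch of pulling from a private registry via a Secret (the registry URL, credentials and image are assumptions):

# Store registry credentials as a docker-registry Secret
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=myuser \
  --docker-password=mypassword

# Reference the Secret in the Pod manifest
apiVersion: v1
kind: Pod
metadata:
  name: private-app
spec:
  containers:
  - name: app
    image: registry.example.com/private-app:1.0
  imagePullSecrets:
  - name: regcred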

Network Policy

Network policies are Pod-level network firewalls that control ingress (incoming traffic to the Pod) and egress (outgoing traffic from the Pod) on a per-Pod basis.

If no NetworkPolicy selects a Pod, all ingress and egress traffic for that Pod is allowed.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-networkpolicy
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  - Egress

  # Ingress: allow traffic from frontend Pods only
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 80

  # Egress: allow traffic only to database Pods in the "database" namespace
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: database
      podSelector:
        matchLabels:
          app: database        # assumption: label set on the database Pods

Persistent Volumes

A Persistent Volume (PV) is a piece of disk on a node (or on external storage) assigned as storage to the Kubernetes cluster. Pods can request a portion of that storage and persist data to it. PVs can be provisioned by an admin or dynamically by a StorageClass. A PV exists independently of the Pods, so data is not lost when Pods start or stop.

Persistent Volume Claims

A Pod makes a Persistent Volume Claim (PVC) when it needs persistent storage. Once the claim is made, the Kubernetes cluster binds an existing PV or creates a new one for the Pod.

Storage Class

A StorageClass tells Kubernetes which storage driver (provisioner) to use to provision storage for Pods following a PVC. With cloud providers, some common options are AWS EBS, other CSI drivers, Ceph, etc…
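
A minimal sketch of a claim and a Pod using it (the storage class, sizes and names are assumptions):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: standard     # a StorageClass that exists in your cluster
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-storage
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data           # where the volume is mounted in the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc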

CNI

The CNI (Container Network Interface) plugin handles network setup within the cluster and enables communication between internal and external components. Kubernetes supports various network plugins like Calico or Flannel. These plugins may set up a number of iptables rules on the node to handle the networking.

Service Networking

Kubernetes Services are objects used to create network connectivity for your Pods. They run in a namespace and get a DNS name in the format:

<service-name>.<namespace>.svc.cluster.local

which is visible and resolves to the correct IP from anywhere in the cluster, so Pods from different namespaces can communicate over the network. Services provide port mapping to the ports exposed on the Pod.

There are three main types of Services in Kubernetes.

NodePort

This will assign a port above 30000 on the node to your application, so your app will be reachable at:

<node-IP>:<assigned port above 30000>

A NodePort only exposes the port on the Nodes themselves, so there is no load balancing across Nodes if the Pods are running on different Nodes.

ClusterIP

ClusterIP assigns an IP from the Service subnet of the cluster and exposes the port on this IP. This is the default Service type.

Load Balancer

Used if you are running in a commercial cloud; it brings up a load balancer supported by your cloud provider.

Ingress

Ingress is used to expose your apps to the Internet over HTTP(S) on ports 80 or 443. The most common Ingress Controller is based on NGINX, so you can configure it in much the same way: SSL termination, rate limiting, paths/routes, etc… Ingress is implemented with an Ingress Controller (a cluster-wide component) and Ingress resources (namespaced resources that handle traffic in a specific namespace and point to the relevant Kubernetes Service behind them).

A limitation of Ingress is the lack of support for arbitrary ports or non-HTTP network services.
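
A minimal Ingress sketch, assuming an NGINX Ingress Controller is installed (the hostname and Service name are placeholders):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: backend-ingress
spec:
  ingressClassName: nginx        # class of the installed Ingress Controller
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: backend        # existing Service in the same namespace
            port:
              number: 80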

Gateway API

Gateway API is the newer concept for exposing Kubernetes apps to the Internet. Traditional Ingress exposes only HTTP/HTTPS on ports 80/443, so if your apps need WebSockets or an arbitrary TCP port, Ingress will not work. Gateway API solves this by introducing routes: you can define HTTP(S) or TCP routes exposing whatever port your app needs. Gateway API is still maturing and may not be fully available or implemented everywhere. However, it is possible to find a working implementation and use it instead of Ingress. Kourier looks like a promising solution for small Kubernetes setups.
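
A minimal sketch of a Gateway plus an HTTPRoute, assuming a Gateway API implementation is installed (the GatewayClass and Service names are placeholders):

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-gateway
spec:
  gatewayClassName: example-class   # provided by the installed implementation
  listeners:
  - name: http
    protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example-route
spec:
  parentRefs:
  - name: example-gateway           # attach the route to the Gateway above
  rules:
  - backendRefs:
    - name: backend                 # existing Service to send traffic to
      port: 80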