Container orchestration with Kubernetes

Orchestration grew out of the need to automate, coordinate, and manage systems. Today, we’ll take a look at how we used to handle automation and delivery in the past, and at the main orchestration tool used in the dev scene today.

Why orchestration?#

Over time, we evolved from delivering services manually through FTP, to standardizing processes by writing complex shell scripts, to delivering assets through rsync + ssh, or even using Git hooks and checking out the latest branch on remote servers.

More than a decade has passed since LAMP developer was a job title, and traditional tools faded away in favour of better tooling for configuration management and provisioning - and, eventually, container orchestration.

Configuration Management - Chef, Puppet, Ansible, and SaltStack were designed to help install and manage applications or software on servers.

Provisioning - AWS CloudFormation and Terraform were designed to provision the server infrastructure (load balancers, databases, network topology, etc.). CloudFormation is proprietary to AWS, while Terraform is provider-agnostic and talks to each cloud provider’s API to accomplish the same.

These terms are not mutually exclusive, as some of the configuration management tools can offer some degree of provisioning and vice-versa.

What’s important to understand so far is that these tools formed the base for what is now known as “Infrastructure as Code” (IaC). The purpose is to move away from non-standardized shell scripts and manual labour, simplifying management and configuration.

Container orchestration - the process of automating the management of container-based applications across clusters. To understand it, we need to look at what containers are and why they exist.

A container guarantees that an application runs the same everywhere: it’s a distributable unit that includes the libraries, binaries, dependencies, and configuration on top of a Linux operating system that can be stripped to its bare minimum. In the past, to run an application we had to make sure the environment was set up correctly, which caused the classic “it works on my computer” problem (tools like Vagrant also addressed this, using VMs to provide a portable development environment).

We often had to spend a lot of time troubleshooting and figuring out dependencies so that our staging and production servers were set up correctly for our application to run.

Containers solved this by packaging every single requirement into a self-contained, distributable package.

A single container in a development environment can easily be run through Docker and Dockerfiles. For multi-container applications you can work with Docker Compose - I’m using Docker here as an example because its tooling is easy to use through the CLI in the comfort of your own machine.
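As a rough sketch (the image names reuse the hypothetical songs/* examples from the use-case list further down, and the ports are placeholders), a two-service docker-compose.yml could look like this:

version: "3.8"
services:
  api:
    image: songs/api:v1.0       # hypothetical backend image
    ports:
      - "8080:8080"
  client:
    image: songs/client:v1.0    # hypothetical frontend image
    ports:
      - "3000:3000"
    depends_on:
      - api

Running docker-compose up then brings both services up on your machine.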

So, while you can run multiple container applications this way, in production you wouldn’t want to run them all on one machine with limited resources. This is one of the main reasons we’d want to distribute our container applications across multiple machines - and, with all the complexity that comes with this sort of setup, we’d need a container orchestration system!

A container orchestration system is a tool that helps manage how container instances are created, run, scaled, placed on the underlying infrastructure (one or more servers, which we call the cluster), how they communicate with each other, and so on - beyond the “development” environment.

We’ll be looking at Kubernetes as a container orchestrator that delivers these capabilities.

Kubernetes, the container orchestrator#

Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized applications.

There are different distributions of Kubernetes, such as the “vanilla upstream” (not a distro as such, but the GitHub repositories with the pure Kubernetes project source code), managed offerings like EKS, GKE, and DOKS, “vanilla upstream” installers (kubeadm, kops, kubicorn), kind (Kubernetes in Docker), Rancher k3s, etc.

Some basic use cases can be listed as:

  • Run 2 container applications using the Docker image songs/api:v1.0
  • Run 2 container applications using the Docker image songs/client:v1.0
  • Add load balancers for internal and public services
  • Basic autoscaling
  • Update the containers with the latest images, songs/xxx:v2.0
  • Keep services running while upgrading
  • Long-running services, batch (one-time) jobs, or CRON-like jobs
  • Access control (who can access which resource)
  • And much more...

A basic K8s architecture has a logical part and a physical part: the infrastructure and the applications that are required for K8s to work:

[Image: basic Kubernetes architecture]

The master, also referred to as the control plane, functions as the brain of the cluster and is composed of the following components:

  • The API server, used to interact with the cluster
  • Scheduler
  • Controller manager
  • etcd, a key/value store - the database of the cluster. Most clusters store their state in an etcd service. etcd needs at least one node to function, and three to provide high availability.

The control plane usually runs on a dedicated node, except on single-node development clusters, such as when running minikube. In AKS, GKE, and EKS, the control plane is invisible; we only have a Kubernetes API endpoint.

Nodes (formerly called minions) are where our applications actually run. Each node runs:

  • Container runtime (typically Docker)
  • “Node agent” (kubelet), the agent that connects to the API server, reports the node status, and obtains the list of containers to run
  • Network proxy (kube-proxy)

The Kubernetes API is mostly a RESTful API that allows us to CRUD resources (create, read, update, delete). Some of these resources are:

  • Node, a machine that runs in the cluster
  • Pod, group of containers running in a node
  • Service, a network endpoint to connect to one or multiple containers

A Pod is an abstraction over one or more containers - the smallest deployable unit in Kubernetes (a minimal manifest is sketched after this list).

  • It is a concept that only exists in K8s, not in the container runtime (Docker)
  • A pod must have at least one container, and can have multiple containers if necessary (we generally just have a single container in a Pod)
  • Kubernetes can NOT manage containers directly
  • IP addresses are assigned to Pods not containers
  • Containers inside a Pod share the same hostname and volumes
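As a reference, a minimal Pod manifest might look like the sketch below (in practice we rarely create Pods directly; we’ll use Deployments further down):

apiVersion: v1
kind: Pod
metadata:
  name: alpine-pod
spec:
  containers:
  - name: alpine        # a single container; any extra containers would share this Pod's network and volumes
    image: alpine
    command: ["ping"]
    args: ["1.1.1.1"]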

Installing Kubernetes for learning#

I’m a macOS user (visit the Kubernetes docs for instructions for other operating systems), and this is what’s required to have Kubernetes running locally:

  • Docker desktop

Assuming that you have Docker Desktop already installed, open Preferences > Kubernetes > Enable Kubernetes, then apply and restart!

You may want to disable the Kubernetes engine when not using it, since this takes about 5-10% of your CPU power.

Note: When enabling it on my machine (macOS Catalina), “Kubernetes is starting…” hung indefinitely, but uninstalling and re-installing Docker fixed it. More details here.

Once Docker displays that Kubernetes is running in your macOS menu bar (top right), run the help command for the manual:

kubectl help

You can also find more information at: https://kubernetes.io/docs/reference/kubectl/overview/

Since we’ll use different tools besides kubectl, to make sure we have a consistent experience we can use shpod, which provides a container with a shell inside the cluster, plus the standard Linux tools (jq, helm, stern, curl, shell auto-completion, etc.) that we’ll use throughout the article series.

To set up the container (it waits for attachment):

kubectl apply -f https://k8smastery.com/shpod.yaml

Attach to that container, giving a shell inside the cluster:

kubectl attach --namespace=shpod -ti shpod

To delete:

kubectl delete -f https://k8smastery.com/shpod.yaml

Alternatively, the shpod.sh script will:

  • apply the shpod.yaml manifest to your cluster
  • wait for the pod shpod to be ready
  • attach to that pod
  • delete resources created by the manifest when you exit the pod

Kubectl#

kubectl (“cube control”) is a rich CLI tool around the Kubernetes API, which means that whatever we do with the CLI, we can do directly against the API. kubectl uses a configuration file located at ~/.kube/config. The configuration can also be passed as a file with --kubeconfig, or as individual flags such as --server, --user, etc.
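For example (the file path and server address below are placeholders), we can inspect the merged configuration or point kubectl somewhere else:

kubectl config view                                    # show the merged kubeconfig
kubectl --kubeconfig ~/other-cluster.config get nodes  # use an alternative config file
kubectl --server https://127.0.0.1:6443 get nodes      # override the API server address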

Let’s get started and check the composition of our cluster:

kubectl get node

Equivalently, we could use no, node, or nodes.

This returns the hostname of the machine - in our single-node setup so far, the local machine.

NAME             STATUS   ROLES    AGE   VERSION
docker-desktop   Ready    master   12h   v1.19.3

The get command is important, as it’s used to list resources of a given type, such as node.

We can pass flags such as -o wide, -o yaml, or -o json. Or, pipe the output to jq:

kubectl get node -o json | jq ".items[] | {name: .metadata.name} + .status.capacity"

To learn more about the jq CLI tool, check the documentation here. If you don’t have jq on your local machine, feel free to use the shpod container mentioned above.

We can get extended, human-readable info by running kubectl describe <resource-type-name>/<resource-name>:

kubectl get no
NAME             STATUS   ROLES    AGE   VERSION
docker-desktop   Ready    master   12h   v1.19.3
kubectl describe node/docker-desktop
Name:               docker-desktop
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=docker-desktop
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
...

We can list all available resource types - as kubectl help describes it, “Print the supported API resources on the server”:

kubectl api-resources

Similarly, we can get a description about each type:

kubectl explain <resource-type-name>

kubectl explain pods
kubectl explain pods.spec
kubectl explain pods.spec.volumes

To list all sub fields:

kubectl explain <resource-type-name> --recursive

The Kubernetes documentation can be found here. Bear in mind that vendor K8s distributions extend the list of options accessible through the CLI beyond what is available in the standard API docs.

We can get other resource types as well, so let’s look at Services.

kubectl get services

There is a shorter version for services: svc.

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   12h

A service is explained as:

kubectl explain services

Service is a named abstraction of software service (for example, mysql)
consisting of local port (for example 3306) that the proxy listens on, and
the selector that determines which pods will answer requests sent through
the proxy.

If we try pods, we get the message “No resources found in default namespace”.

kubectl get pods

No resources found in default namespace.

At this point you can probably guess what comes next: run kubectl get namespaces, or the shorter namespace or ns:

kubectl get namespace
kubectl get ns

NAME              STATUS   AGE
default           Active   12h
kube-node-lease   Active   12h
kube-public       Active   12h
kube-system       Active   12h
shpod             Active   60m

Pick the shpod namespace we started earlier (we can also use the short version of the flag, -n):

kubectl get pods --namespace=shpod
kubectl get pods -n shpod

NAME    READY   STATUS    RESTARTS   AGE
shpod   1/1     Running   0          62m

The explainer for namespaces says:

Namespace provides a scope for Names.
Use of multiple namespaces is optional.

You can list pods across all namespaces with --all-namespaces, or the shorter flag -A:

kubectl get pods --all-namespaces
kubectl get pods -A

NAMESPACE     NAME                                     READY   STATUS    RESTARTS   AGE
kube-system   coredns-f9fd979d6-4v6z6                  1/1     Running   0          13h
kube-system   coredns-f9fd979d6-gcbx6                  1/1     Running   0          13h
kube-system   etcd-docker-desktop                      1/1     Running   0          12h
kube-system   kube-apiserver-docker-desktop            1/1     Running   0          12h
kube-system   kube-controller-manager-docker-desktop   1/1     Running   0          12h
kube-system   kube-proxy-f5q94                         1/1     Running   0          13h
kube-system   kube-scheduler-docker-desktop            1/1     Running   0          12h
kube-system   storage-provisioner                      1/1     Running   0          12h
kube-system   vpnkit-controller                        1/1     Running   0          12h
shpod         shpod                                    1/1     Running   0          63m

Although our concern is usually a single namespace, since we only want to see what’s specific to our applications, this can come in handy. In kube-system you’ll find the essentials mentioned above: etcd, kube-scheduler, kube-apiserver, etc.

There are other namespaces apart from default and kube-system: kube-node-lease and kube-public.

kube-public, as explained in the answer here, contains a single ConfigMap object, cluster-info, that aids discovery and security bootstrap and is readable without authentication. It is mainly used during installation.

We’ll look into ConfigMaps a bit later.

kubectl -n kube-public get configmaps

NAME           DATA   AGE
cluster-info   2      13h

kubectl -n kube-public get configmap cluster-info -o yaml

apiVersion: v1
data:
  jws-kubeconfig-abcdef: eyJhbGciOiJIUzI1NiIsImtpZCI6ImFiY2RlZiJ9..3--vDb1wbI7Lh-Lc_Q4kHTRUZTsiXDsBdJPLqtRW5C4
  kubeconfig: |
    apiVersion: v1
    clusters:
...

kube-node-lease works as a keepalive/healthcheck ping system toward the control plane. You can read more about it here.
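Each node publishes a Lease object in that namespace, which the kubelet renews periodically; on our single-node cluster we should see a single lease named docker-desktop:

kubectl -n kube-node-lease get leases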

Running our first containers#

First, we can’t create a container directly; we need a Pod.

Similar to a docker run command, we can create a pod named pingpong from the Docker image alpine and execute a command, such as ping.

kubectl run pingpong --image alpine ping 1.1.1.1
kubectl get all

NAME                            READY   STATUS             RESTARTS   AGE
pod/pingpong                    1/1     Running            0          56m

Alternatively, we can create a Deployment, which also creates a ReplicaSet for us. But to override the command as we did above, we need to pass a YAML file.

kubectl create deployment pingpong --image=alpine --replicas=1

kubectl get all

NAME                            READY   STATUS             RESTARTS   AGE
pod/pingpong                    1/1     Running            0          56m
pod/pingpong-85f7749846-hgf69   0/1     CrashLoopBackOff   12         38m

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   14h

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/pingpong   0/1     1            0           38m

NAME                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/pingpong-85f7749846   1         1         0       38m

In the list provided by get all, we see all the resources in the current namespace. The service/kubernetes already existed, as discussed earlier: it’s the API endpoint that anything in our cluster can use to communicate with the API. Also note that the Deployment’s pod is in CrashLoopBackOff: the plain alpine image has no long-running command, so its container exits immediately and Kubernetes keeps restarting it.

A Deployment resource provides a declarative interface for a Pod resource and a ReplicaSet resource. You describe a desired state in a Deployment, and the Deployment controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments. Each Deployment requires a unique name: Kubernetes resources are identified by their names, so the name must be unique in the target namespace. More detailed information here.

  • allows scaling, rolling updates, and rollbacks (see the example commands after this list)
  • canary deployments, details here
  • delegates Pod management to ReplicaSets
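For instance, a rolling update and a rollback of the pingpong Deployment could be driven like this (a sketch: the image tag is arbitrary, and the container is assumed to be named alpine, which is what kubectl create deployment derives from the image name):

kubectl set image deployment/pingpong alpine=alpine:3.12   # rolling update to a new image tag
kubectl rollout status deployment/pingpong                 # watch the rollout progress
kubectl rollout undo deployment/pingpong                   # roll back to the previous revision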

A ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods. A ReplicaSet resource monitors the Pod resources to ensure that the required number of instances is running. More details here.

  • ensures that a given number of identical Pods are running
  • allows scaling
  • rarely used directly

Pods are the smallest deployable units of computing that you can create and manage in Kubernetes. A Pod resource configures one or more Container resources. Container resources reference a Docker container image and provide all the additional configuration required for Kubernetes to deploy, run, expose, monitor, and secure the Docker container. More details here.

Deployment > ReplicaSet > Pod: these are abstractions - layers of functionality split by purpose - that give us flexibility in how we use Kubernetes.

[Image: Deployment > ReplicaSet > Pod]

It’s more important to understand the concepts at this point, since you can always consult the kubectl reference for the commands, here.

Not every command is covered here; for instance, to clean up the pingpong Deployment before moving on:

kubectl delete deployment pingpong

As with kubectl run, we can override the Dockerfile CMD in a Deployment by passing a YAML file whose container spec includes the command and args fields:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: alpine-deployment
  labels:
    app: alpine
spec:
  replicas: 3
  selector:
    matchLabels:
      app: alpine
  template:
    metadata:
      labels:
        app: alpine
    spec:
      containers:
      - name: alpine
        image: alpine
        command: ["ping"]    # overrides the image's ENTRYPOINT
        args: ["1.1.1.1"]    # overrides the image's CMD
        ports:
        - containerPort: 80

Then run the command to generate the Deployment, ReplicaSet, and Pods:

kubectl apply -f alpine-deployment.yaml

We can then get all the resources in the cluster and confirm:

kubectl get all

NAME                                     READY   STATUS    RESTARTS   AGE
pod/alpine-deployment-556cbc76fb-hx54q   1/1     Running   0          2m23s
pod/alpine-deployment-556cbc76fb-mnjds   1/1     Running   0          2m23s
pod/alpine-deployment-556cbc76fb-zcngw   1/1     Running   0          2m23s

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   15h

NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/alpine-deployment   3/3     3            3           2m23s

NAME                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/alpine-deployment-556cbc76fb   3         3         3       2m23s

Similarly, we can check the logs, where deploy is short for deployment and alpine-deployment is the name:

kubectl logs deploy/alpine-deployment

PING 1.1.1.1 (1.1.1.1): 56 data bytes
64 bytes from 1.1.1.1: seq=0 ttl=37 time=25.266 ms
64 bytes from 1.1.1.1: seq=1 ttl=37 time=14.619 ms
64 bytes from 1.1.1.1: seq=2 ttl=37 time=11.260 ms

This will only return the log output of one Pod, but we can also check the logs of a specific Pod:

kubectl logs alpine-deployment-556cbc76fb-hx54q

To summarize, for kubectl logs we can pass either a pod name or a type/name. In the deploy/alpine-deployment example above, it picks one of the pods by default.

Other options are --tail <number> or --tail <number> --follow.
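If a Pod runs more than one container, you can also target a specific one with -c/--container:

kubectl logs <pod-name> -c <container-name>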

Scaling our application#

I set the number of replicas to 3 in the YAML file shared previously, but we can scale this up:

kubectl scale deploy/alpine-deployment --replicas 10

This changes the desired state in the Deployment’s spec: deployment alpine-deployment is scaled to the given number, regardless of how many replicas existed before.
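The declarative alternative is to change the replicas field in the YAML file and re-apply it; Kubernetes reconciles the cluster to match:

# edit alpine-deployment.yaml, change replicas: 3 to replicas: 10, then:
kubectl apply -f alpine-deployment.yaml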

ReplicaSet in action#

Let’s see the ReplicaSet in action by watching the stream of logs from one of our Pods, then deleting that Pod to see the effect it causes.

kubectl logs deploy/alpine-deployment --tail 1 --follow

Found 3 pods, using pod/alpine-deployment-556cbc76fb-zcngw
64 bytes from 1.1.1.1: seq=3301 ttl=37 time=13.381 ms
64 bytes from 1.1.1.1: seq=3302 ttl=37 time=12.077 ms
64 bytes from 1.1.1.1: seq=3303 ttl=37 time=11.508 ms
...

In a separate terminal window, while the log stream is running, we execute:

kubectl delete pod/alpine-deployment-556cbc76fb-zcngw

We should get a response stating that the pod is deleted. Kubernetes does this gracefully, which means it takes some time to terminate the process. For a short period (Kubernetes by default gives a 30-second grace period for Docker to stop the container), you should see a 4th Pod being created while the target one is Terminating:

kubectl get all

NAME                                     READY   STATUS        RESTARTS   AGE
pod/alpine-deployment-556cbc76fb-598kj   1/1     Running       0          5m21s
pod/alpine-deployment-556cbc76fb-hx54q   1/1     Running       0          57m
pod/alpine-deployment-556cbc76fb-rfwsf   1/1     Running       0          7s
pod/alpine-deployment-556cbc76fb-zcngw   1/1     Terminating   0          57m

This is because the ReplicaSet resource monitors the Pod resources to ensure that the required number of instances is running.
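Another way to observe this, besides the logs, is to watch the pod list update live in a separate terminal:

kubectl get pods --watch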

Single run containers#

In this section we’ll look at Pods that run a single time and do not restart; for these we create Jobs or plain Pods instead of Deployments. We’ll also look into CronJobs.

Prior to version 1.18, the flags to achieve this were:

kubectl run --restart=OnFailure

kubectl run --restart=Never

Similarly, we can create Cronjobs by using the flag:

kubectl run --schedule=...

Under the hood, these commands invoked “generators” to create resource descriptions that we could write ourselves in YAML (there are other formats, but YAML is the most typical); we saw an example of this in the previous topic.

For the current 1.18+ versions, we use a resource description to achieve this; you can also check the original documentation here:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

Apply it with:

kubectl apply -f https://kubernetes.io/examples/controllers/job.yaml

kubectl describe jobs/pi

Name:           pi
Namespace:      default
Selector:       controller-uid=c9948307-e56d-4b5d-8302-ae2d7b7da67c
Labels:         controller-uid=c9948307-e56d-4b5d-8302-ae2d7b7da67c
                job-name=pi
...

We can delete it using either of the delete command formats:

kubectl delete jobs/pi

kubectl delete -f ./job.yaml
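We mentioned CronJobs earlier. In the same declarative style, a CronJob wraps a Job template in a schedule. A minimal sketch (the name and schedule are just examples; on the 1.19 cluster used in this article the API group is batch/v1beta1, while Kubernetes 1.21+ uses batch/v1):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pi-cron
spec:
  schedule: "*/5 * * * *"        # every 5 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: pi
            image: perl
            command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
          restartPolicy: Never

It can be applied, listed (kubectl get cronjobs), and deleted the same way as the Job above.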

Better logs#

Stern allows you to tail multiple pods on Kubernetes, and multiple containers within a pod. Each result is color-coded for quicker debugging.

To install it on macOS with Homebrew, run:

brew install stern

We then can run:

stern --tail 1 <deployment-name>

It’ll output log messages prefixed with the pod and container they come from, with better formatting. Check the documentation for details.

Be careful when using stern: if not used properly, you might stream the logs of all the pods in the current namespace, opening one connection for each container. If thousands of containers are running, this can put some stress on the API server.
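To narrow the scope, you can pass stern a namespace and a label selector; for example, to follow only the pods of the alpine Deployment from earlier (they carry the app=alpine label from the manifest):

stern --namespace default --selector app=alpine --tail 5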

WIP#

References:

https://k8smastery.com

https://www.digitalocean.com/community/curriculums/kubernetes-for-full-stack-developers

https://www.ibm.com/cloud/architecture/content/course/kubernetes-101/deployments-replica-sets-and-pods/
