Demystifying Kubernetes: the tool to manage Google-scale workloads in the cloud

We are witnessing a new technology wave in the form of immutable infrastructure and micro-services

Once every five years, the IT industry witnesses a major technology shift. In the past two decades, we have seen the client-server paradigm evolve into web-based architecture, which matured into service orientation before finally moving to the cloud. Today, it is containers.

When it launched in 2006, Amazon EC2 was nothing short of a revolution. A self-service portal that launched virtual servers at the click of a button, it fundamentally changed the lives of developers and IT administrators.

Elasticity and automation introduced the powerful paradigm of programmable infrastructure. Applications became self-aware and could scale out and scale in dynamically. This concept was exciting five years ago, but not any more.

Now, we are witnessing a new technology wave in the form of immutable infrastructure and micro-services. Leading this change is a Linux-based technology called containers, in which a single kernel runs multiple isolated instances on one operating system.

Docker resurrecting container technology

The concept of containers is not new – FreeBSD, Solaris, Linux and even Microsoft Windows have had some form of isolation for running self-contained applications. When an application runs within a container, it gets the illusion that it has exclusive access to the operating system. This is reminiscent of virtualisation, where the guest operating system (OS) is under the illusion that it has exclusive access to the underlying hardware.

Containers and virtual machines (VMs) share many similarities but are fundamentally different in architecture. Containers run as lightweight processes within a host OS and share its kernel, whereas each VM runs a full guest OS on hardware virtualised by a hypervisor. With no hypervisor or guest OS in the way, containers generally start faster, use fewer resources and are easier to manage.

One company that democratised the use of Linux containers is Docker. Though it did not create container technology, it deserves credit for building a set of tools and an application programming interface (API) that made containers more manageable.

Like most successful product launches, Docker was well-timed and came at a point when the industry was looking for better ways to exploit the cloud to run web-scale workloads.

Docker is much more than just the tools and API. It created a vibrant ecosystem that started to contribute a variety of tools to manage the lifecycle of containers.

DockerCon, the first-ever container conference, held in June 2014, saw tech giants such as Google, IBM, Microsoft, Amazon, Facebook, Twitter and Rackspace flaunt their use of container technology.

Containers vs VMs

How Google runs critical workloads in containers

Though Docker hogs the limelight in the cloud world, there is another company that has mastered the art of running scalable, production workloads in containers. That company is Google, which starts more than two billion containers per week. That’s a lot of containers to manage. Popular Google services such as Gmail, Search, Apps and Maps run inside containers.

Ask any administrator about their experience of managing a few hundred VMs and they will admit that it’s a nightmare. But managing containers is different. Over the past decade, Google has built many tools to deal with its massive number of containers.

With Google entering the cloud business through App Engine, Compute Engine and other services, it is opening up its container management technology to developers. This will not only benefit the community, but also act as a differentiating factor for Google Cloud Platform.

New era of containers with Kubernetes

One of the first tools that Google decided to make open source is called Kubernetes, which means “pilot” or “helmsman” in Greek.

Kubernetes works in conjunction with Docker. While Docker provides the lifecycle management of containers, Kubernetes takes it to the next level by providing orchestration and managing clusters of containers.

To understand this better, let’s revisit the way infrastructure as a service (IaaS) runs today. Customers provisioning VMs on Amazon Web Services (AWS), Microsoft Azure and other public clouds don’t care about the brand of the underlying hardware. As long as they get the desired performance and uptime, customers will keep spinning up VMs and porting their applications to the cloud. They are hardly interested in knowing whether the servers in Amazon’s datacentre run on HP, Dell or IBM kit.

Users choose the instance type, OS and the geographic location to run the VM, and a piece of software called the orchestrator or fabric controller co-ordinates provisioning of the VMs on one of the available physical servers within the datacentre. The IaaS layer fully abstracts the physical hardware.

When we provision containers, the same process repeats. But, instead of provisioning a new VM, the container orchestration engine might decide to boot the container in one of the running VMs. Based on the current availability of VMs, the orchestrator may also decide to launch a new VM and run the container within it.

So, container orchestration does to the VMs what a fabric controller does to the physical hardware.

Kubernetes is capable of launching containers in existing VMs or even provisioning new VMs and placing the containers in them. It goes beyond booting containers to monitoring and managing them. With Kubernetes, administrators can create Pods, which are logical collections of containers that belong to an application. These Pods are provisioned within VMs or on bare-metal servers.
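The placement decision described above can be sketched in a few lines of Python. This is an illustrative model only, not Kubernetes' actual scheduler: the Pod and VM classes, the single CPU figure used as capacity, the first-fit rule and the VM naming scheme are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pod:
    """A logical collection of containers that belong to one application."""
    name: str
    containers: List[str]
    cpu: int  # assumed: CPU units the whole pod needs


@dataclass
class VM:
    """A running VM (or bare-metal server) with an assumed CPU capacity."""
    name: str
    capacity: int
    pods: List[Pod] = field(default_factory=list)

    def free(self) -> int:
        """CPU units not yet claimed by the pods placed on this VM."""
        return self.capacity - sum(pod.cpu for pod in self.pods)


def place(pod: Pod, vms: List[VM], new_vm_capacity: int = 4) -> VM:
    """First-fit placement: reuse a running VM, otherwise provision a new one."""
    for vm in vms:
        if vm.free() >= pod.cpu:
            vm.pods.append(pod)
            return vm
    if pod.cpu > new_vm_capacity:
        raise ValueError(f"pod {pod.name!r} does not fit on a new VM")
    new_vm = VM(name=f"vm-{len(vms) + 1}", capacity=new_vm_capacity, pods=[pod])
    vms.append(new_vm)  # stands in for a call to the IaaS provider
    return new_vm


# Hypothetical workload: two pods and one VM that is already running.
vms = [VM("vm-1", capacity=4)]
web = place(Pod("web", ["nginx", "app"], cpu=3), vms)
db = place(Pod("db", ["mysql"], cpu=2), vms)
print(web.name, db.name, len(vms))
```

In this run the web pod fits on the VM that is already running, while the db pod does not and causes a second VM to be provisioned. A real orchestrator weighs far more than one resource, but the shape of the decision is the same.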

Kubernetes and Docker deliver the promise of PaaS through a simplified mechanism

Traditionally, platform as a service (PaaS) offerings such as Azure, App Engine, Cloud Foundry, OpenShift, Heroku and Engine Yard have let developers run code while abstracting away the infrastructure. Developers dealing with PaaS push the code along with metadata that describes the requirements and dependencies of the application.

The PaaS engine looks up the metadata configuration to provision the code. Once the application moves into production, PaaS will scale, monitor and manage the application lifecycle.

Kubernetes and Docker deliver the promise of PaaS through a simplified mechanism. Once the system administrators configure and deploy Kubernetes on a specific infrastructure, developers can start pushing code into the clusters. This hides the complexity of dealing with the command-line tools, APIs and dashboards of specific IaaS providers.

Developers can define the application stack in a declarative form and Kubernetes will use that information to provision and manage the Pods. If the code, the container or the VM experiences a disruption, Kubernetes will replace that entity with a healthy one.
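As a rough illustration of the declarative model, the sketch below compares a declared state with what is actually running and replaces anything that is unhealthy or missing. It is a simplified reconciliation loop written to match the description above, not Kubernetes' own code or API; the desired mapping, the Instance class and the naming scheme are hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, List


@dataclass
class Instance:
    """One running copy of an application (a stand-in for a pod)."""
    app: str
    name: str
    healthy: bool = True


def reconcile(desired: Dict[str, int], running: List[Instance]) -> List[Instance]:
    """Return the set of instances that satisfies the declared replica counts."""
    used = {inst.name for inst in running}  # names that must not be reused
    result: List[Instance] = []
    for app, replicas in desired.items():
        # Keep healthy instances of this app, up to the declared count.
        keep = [i for i in running if i.app == app and i.healthy][:replicas]
        # Start fresh instances to replace failed or missing ones.
        while len(keep) < replicas:
            name = next(n for n in (f"{app}-{k}" for k in count(1)) if n not in used)
            used.add(name)
            keep.append(Instance(app, name))
        result.extend(keep)
    return result


# Hypothetical declaration: three web instances and one database.
desired = {"web": 3, "db": 1}
running = [
    Instance("web", "web-1"),
    Instance("web", "web-2", healthy=False),  # simulated failure
    Instance("db", "db-1"),
]
state = reconcile(desired, running)
print([inst.name for inst in state])
```

The developer only states how many instances of each application should exist. The loop works out what to start or discard, which is why the failed web instance is dropped and two new ones appear alongside the surviving one.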

Currently, Kubernetes is supported on Google Compute Engine, Rackspace, Microsoft Azure and vSphere environments. Red Hat and Pivotal are working towards integrating Docker and Kubernetes with Cloud Foundry and OpenShift PaaS. 

The evolution of lightweight operating systems such as CoreOS and cluster managers such as Mesos will change the way applications are deployed and managed in the cloud.

About the author
Janakiram MSV is a Gigaom Research analyst and the principal analyst at Janakiram & Associates. He is a regular contributor to Computer Weekly and can be followed on Twitter at @janakiramm.
