Understanding Docker, Kubernetes and Container-Based Systems
by Jeremy H • October 21, 2020
The rise of microservices and container-based systems has allowed global enterprises – like Amazon.com, Netflix, Uber, and Airbnb – to achieve unprecedented market dominance. Central to making these microservices-based applications possible is the concept of containerization, and at the core of containerization are Docker and Kubernetes – the two most widespread solutions for building and managing container-based systems.
However, as much as people talk about Docker and Kubernetes, these platforms are often confused. That’s why we wrote this guide: to give you a clear understanding of what Docker and Kubernetes are all about. We’ll start by defining the concepts of containerization and container-based systems. Then we’ll teach you about Docker and Kubernetes, how they work, and how they fit into the containerization puzzle.
Did you know you can generate a full-featured, documented, and secure REST API in minutes using DreamFactory? Sign up for our free 14-day hosted trial to learn how! Our guided tour will show you how to create an API using an example MySQL database provided to you as part of the trial!
Please use these links to navigate the guide:
- What Is Containerization?
- What Is Docker?
- Docker Suite of Tools
- Docker Functional Components
- Docker Container Orchestration for Container-Based Systems
- What Is Kubernetes?
- Summary of Docker, Kubernetes, and Container-Based Systems
- DreamFactory: Simplifying API Integrations for Container-Based Systems
What Is Containerization?
Containerization is a server virtualization strategy that allows you to launch and run multiple applications or microservices within “containers” on the same operating system instance. As an isolated runtime environment, a container lets you “contain” an application as if it had its very own operating system. From the containerized application’s perspective, it is alone – unaware of any other applications running on the server – even when it is sharing server resources with multiple apps or microservices.
Compared to the traditional server virtualization method (virtual machines), containers are lighter weight, so you can host more of them on the same server more economically. This is because a virtual machine must contain and run a replica of an entire OS instance, while a container holds only the bare minimum of code, libraries, tools, and dependencies that a microservice requires to operate.
In recent years, service-based application architectures consisting of many containerized microservices have become increasingly popular because they offer distinct advantages over traditional monolithic applications. For example, monolithic applications intertwine all of the programming for the application within the same codebase. This makes monoliths complicated to change or upgrade without coding conflicts that can negatively impact other parts of the system. In contrast, container-based systems (consisting of multiple microservices running inside containers) have a modular, pluggable architecture that is faster to upgrade, more efficient to operate, and easier to scale than monolithic applications.
In a microservices-based app, developers break the monolith into its individual features and services – then they run each service or feature as an independent application, i.e., a microservice. By hosting each microservice in its own container, developers can loosely connect them via APIs to form a more flexible, component-based application architecture. The “pluggability” of this kind of system makes scaling and updating easier, faster, and more affordable.
Lastly, compared to using virtual machines as runtime environments for microservices, containers use fewer system resources and offer faster performance (more on this in the next section).
Containerization Benefits
Here are the benefits that containers bring to the table:
- They don’t need their own OS instance. Since you can host multiple containers and multiple containerized apps on the same OS instance, you pay for only one OS license to host multiple microservices. This is in contrast to using virtual machines (VMs) to host microservices: each VM requires its own OS instance, so every time you spin up a VM you’re paying for – and maintaining – another full copy of an operating system.
- They use fewer system resources. VMs consume as many system resources as they can, and their need for a separate OS instance demands extra resources as well. A container, by comparison, is lighter weight – often taking up as little as 10 megabytes of space – and you can restrict its access to CPU and memory (see the docker run sketch after this list), so containers require dramatically fewer resources to run.
- They offer better performance. A VM must boot a full copy of an operating system before its application can run, which takes time. Containers often boot in milliseconds, so creating, replicating, and destroying application instances is markedly faster than doing the same with VMs. This translates into better, faster system performance.
- They support a pluggable architecture. The containerized services that form a microservices-based architecture have fewer dependencies between them, which makes it easy to add, remove, or update individual application components with less chance of impacting the rest of the system.
- They make scaling easier. A container-based architecture facilitates the scaling of individual components because you can deploy and destroy containers so quickly. You can also replicate containers, or divert system resources to only the containers that need them – which makes scaling an app more economical.
- They offer deployment flexibility. With Docker container images, you don’t need a dedicated operating system for testing or for distributing your container files. Docker lets you download container images and deploy containers (and their apps/microservices) on nearly any operating system.
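To make those resource limits concrete, here is a minimal sketch using the standard docker run flags; the nginx:alpine image, the container name, and the port mapping are illustrative placeholders:

```bash
# Run a container in the background, capped at half a CPU core and 256 MB of RAM.
# nginx:alpine is just an example image; any image works here.
docker run -d \
  --name web \
  --cpus="0.5" \
  --memory="256m" \
  -p 8080:80 \
  nginx:alpine
```

The kernel enforces these caps, so the container cannot consume more CPU time or memory than you grant it.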
What Is Docker?
Docker is a free, open-source platform for building and deploying containerized apps and for allocating resources across a container-based architecture. By allowing you to create, deploy, and orchestrate a multi-container system, Docker helps you realize all of the container benefits we referenced above.
Before we describe Docker further, it’s important to note that the word “Docker” refers to two separate concepts: (1) the Docker container file format (also known as a Docker image), which holds all of the components, code, tools, libraries, and dependencies that a containerized application needs to run; and (2) the free, open-source Docker platform, which includes the tools you need for creating, deploying, and managing containers and container-based systems.
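To make the image concept concrete, here is a minimal example Dockerfile – a sketch only, assuming a hypothetical Python script named app.py:

```dockerfile
# Start from an official base image that supplies the runtime and OS libraries.
FROM python:3.9-slim

# Copy the application code into the image.
WORKDIR /app
COPY app.py .

# Declare the command the container runs when it starts.
CMD ["python", "app.py"]
```

Building this file produces a Docker image; running that image produces a container.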
Now let’s get familiar with the Docker platform, its tools, and components.
Docker Suite of Tools
Getting familiar with the Docker platform starts with Docker Desktop. Docker Desktop runs on Windows or Mac and features a dashboard where you can access different Docker tools for creating, managing, and automating the deployment of containers. From Docker Desktop, you can access the rest of the tools in this section.
The main tools included in the Docker platform are:
- Docker Engine: This is the runtime that builds and runs containers and automates containerized application deployment on different types of operating systems.
- Docker Hub: This is a hosted library of Docker images where you can publish, share, and access a wide range of container images. Docker Registry is the self-hosted, open-source counterpart to Docker Hub.
- Docker CLI (Command Line Interface): This text-based tool is the “control interface” that allows you to send requests to the Docker daemon REST API – commands that build new images, run containers, save and tag images, or push and pull images via Docker Hub or a private Docker Registry (a few everyday commands are sketched after this list).
- Docker Compose: Docker Compose is a simple container orchestration tool that lets you manage resources between containers running on the same Docker node (i.e., the same server instance). Managing containers running under more than one Docker node requires more advanced container orchestration tools like Docker Swarm or Kubernetes.
- Docker Swarm: This is a container orchestration tool that allows you to manage containers running on different Docker nodes (i.e., different hosts, each with its own OS instance). Docker Swarm is different enough from Kubernetes that the two are not exactly in competition with each other (more on this below).
- Kubernetes: Docker also includes native access/integration for the container orchestration solution Kubernetes.
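As noted in the Docker CLI entry above, here is a quick sketch of the everyday commands; the myorg/myservice:1.0 image name is a hypothetical placeholder:

```bash
# Build an image from the Dockerfile in the current directory and tag it.
docker build -t myorg/myservice:1.0 .

# Run the image as a background container.
docker run -d --name myservice myorg/myservice:1.0

# Push the image to a registry (Docker Hub by default).
docker push myorg/myservice:1.0

# List the containers currently running on this Docker node.
docker ps
```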
Docker Functional Components
Now that you understand the Docker toolset, let’s review Docker’s fundamental components that you’ll be interacting with:
- Docker Host and Docker Daemon: The Docker Host is the machine that runs the Docker Engine server, which is also called the Docker daemon. The daemon exposes a REST API that you interact with by sending requests through the Docker CLI. The daemon’s primary role is to listen for API requests and to manage the Docker node and its Docker objects. The Docker daemon can also communicate with other Docker daemons when needed.
- Docker CLI: This text-based command-line tool is your Docker “control interface.” It allows you to send requests to create, run, save, or share docker files. These requests are sent to the Docker daemon REST API.
- Docker Objects: Docker objects are the building blocks you manipulate through the platform. These include (1) images – the container files that you can share and deploy, (2) containers, (3) networks, (4) volumes, and (5) plugins. You can read detailed descriptions of these objects in the official Docker documentation.
- Docker Registry: Docker Registry is a self-hosted tool that lets you save, share, access, and distribute container files. Docker Hub is the hosted version of this (see the description above) and offers additional perks such as webhooks, teams, organizations, and more.
If you want to start creating and running docker images, watch this video for an awesome introduction to the basics!
Docker Container Orchestration for Container-Based Systems
When your microservices-based application includes multiple containerized microservices, it requires a container orchestration tool to automate container deployment, the distribution of resources, and the routing of requests across the architecture. Docker Compose can automate container deployment and resource distribution for a container-based system as long as all of the containers are operating under the same Docker node and operating system.
When container-based systems include multiple Docker nodes (and containers running on different server instances), you will need a more advanced container orchestration solution such as Docker Swarm or Kubernetes.
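To give a feel for single-node orchestration in practice, here is a minimal, hypothetical docker-compose.yml describing a web service and its cache (the service and image names are assumptions, not from any real project):

```yaml
# docker-compose.yml – two containers managed together on one Docker node
version: "3.8"
services:
  web:
    image: myorg/web:1.0     # hypothetical application image
    ports:
      - "8080:80"
    depends_on:
      - cache                # start the cache before the web service
  cache:
    image: redis:6-alpine    # off-the-shelf cache image
```

Running docker-compose up -d starts both containers together; Docker Swarm or Kubernetes takes over when the same services need to span multiple nodes.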
Docker User Reviews
Here are some excellent user reviews for Docker from Capterra:
- "Absolutely amazing, and super reliable. I am able to pull down basically any image/machine at any version with almost anything preinstalled (and if I can't find it, I could make it)."
- "This makes it extremely easy to package a piece of software almost any way you want. On top of that, there is a free version of the product which is more than enough for most people’s needs."
- "If you are a beginner with Docker, you need to follow some learning materials… if not you will get a bit confused without knowing the platform."
- "Maybe more documentation should be provided for insecure docker registries. We prefer code to stay with our organization and hence struggled a lot to put docker registry internally."
- "Performance on macOS is really bad compared to other Operating Systems."
What Is Kubernetes?
Like the conductor of an orchestra, Kubernetes can direct the actions of thousands (even hundreds of thousands) of containers and groups of containers, all running on different servers under different Docker nodes. With Kubernetes, you can manage a group of containerized apps so they work in concert – forming a larger container-based system.
As the most widely used solution for orchestrating container-based systems, Kubernetes is not only free and open source – it’s also used by the U.S. Department of Defense in F-16s and battleships.
Perhaps the easiest way to understand Kubernetes is to view it as a series of deployment instructions coded in YAML (a human-readable data-serialization language). Developers use a single interface (be it a command line or a dashboard) to interact with the Kubernetes API and define these deployment instructions. Once the instructions are coded, Kubernetes has all of the rules and limits it needs to manage the lifecycle of the containerized microservices that comprise your system. Kubernetes will follow these rules/limits while doing the following:
- Controlling and supervising requests across a container-based system
- Load balancing requests across the system
- Monitoring the performance of a container-based system
- Allocating resources
- Booting, rebooting, replicating, scaling, and destroying container-based processes (and groups of containers)
- Maintaining optimum performance and high availability in response to changing demands
Ultimately, Kubernetes carries out all of these actions on autopilot, as dictated by its YAML-coded deployment instructions.
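To make the idea of YAML-coded deployment instructions concrete, here is a minimal sketch of a Kubernetes Deployment manifest; the names and the image are illustrative assumptions:

```yaml
# deployment.yaml – instructs Kubernetes to keep three replicas of a service running
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                    # Kubernetes maintains this count automatically
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myorg/web:1.0   # hypothetical image
          ports:
            - containerPort: 80
```

After you apply this file (kubectl apply -f deployment.yaml), Kubernetes keeps three replicas of the service running – rescheduling, rebooting, or replacing containers as needed – with no further intervention.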
Kubernetes Benefits
Container orchestration through Kubernetes brings the following advantages:
- Highly available: When you distribute the microservices that make up an application across different servers and different Docker nodes, you can use Kubernetes to manage the replication of Docker nodes, microservices, and groups of microservices. This achieves redundancy and higher availability because essential services and their containers are duplicated across different operating systems and machines.
- Scalable: Kubernetes allows you to set rules that automatically deploy and replicate containerized services according to the needs of the system at any given time. If the workload spikes on a particular server and Docker node, Kubernetes can replicate the node on another server or direct traffic to an already-existing replica. This lets you scale your system at any time to handle any amount of traffic (see the autoscaling sketch after this list).
- Declarative: The rules/limits of a Kubernetes-based system are coded in YAML, a human-readable language that lets you use version control and track updates, which also facilitates collaboration.
- Compatible/portable: Most public cloud systems can host Kubernetes. You can also run Kubernetes on-premises or on a bare-metal system.
- Suited to complex, large-scale microservices-based architectures: Kubernetes allows you to develop systems consisting of thousands (or hundreds of thousands) of microservices.
- Resource optimization: A well-designed Kubernetes cluster optimizes the use of server resources to achieve a system that’s faster, more reliable, and less costly to run.
- Self-healing: Kubernetes can boot, reboot, and destroy containers, pods, and nodes – or divert requests to online containers – when one part of the system fails. This allows Kubernetes to contain failures and self-heal to keep the entire system available.
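As a sketch of the automatic scaling rules mentioned above, a HorizontalPodAutoscaler resource tells Kubernetes when to add or remove replicas; the target name and thresholds here are assumptions:

```yaml
# hpa.yaml – scale the hypothetical "web" Deployment between 3 and 10 replicas
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                        # the Deployment to scale
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70 # add replicas when average CPU passes 70%
```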
Kubernetes Functional Components
“Kubernetes clusters” are the container-based systems that you can create with Kubernetes. In other words, a Kubernetes cluster is the group of containerized microservices that a Kubernetes instance controls. These containerized processes will be running on multiple servers under multiple Docker nodes. Essentially, it represents your entire container-based system or application. A Kubernetes cluster consists of three primary units of deployment:
Kubernetes Pods: Pods are the smallest element of a Kubernetes cluster. A pod can be one or more containerized microservices that depend on one another for their operation. In other words, you can’t spin up one microservice within a pod without spinning up the others. Let’s say a containerized web service needs a containerized caching server to function. Both containers will belong to the same Kubernetes pod, and they will deploy/replicate in unison as required. The capacity of Kubernetes to orchestrate multi-container pods like this is a distinct advantage.
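The web-service-plus-cache example above might look like the following as a bare Pod manifest – a hedged sketch with placeholder image names (in practice, pods are usually created indirectly through a Deployment):

```yaml
# pod.yaml – two containers that deploy, replicate, and fail together
apiVersion: v1
kind: Pod
metadata:
  name: web-with-cache
spec:
  containers:
    - name: web
      image: myorg/web:1.0   # hypothetical web service image
      ports:
        - containerPort: 80
    - name: cache
      image: redis:6-alpine  # caching container in the same pod
```

Because both containers live in the same pod, they share a network namespace and can reach each other over localhost.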
Kubernetes Worker Nodes: A single Docker node – which runs on its own server and manages the different containerized services on that server – is referred to as a Worker Node in the Kubernetes cluster. A Worker Node could be managing one or multiple containerized services. Three fundamental parts make up a Worker Node: (1) the kubelet, (2) the kube-proxy, and (3) the Docker Engine:
- The kubelet runs the containers scheduled to its node and is responsible for sending current node status data to the Kubernetes Master Node (see the Master Node definition below).
- The kube-proxy is a network proxy. It allows the containerized services managed by the Worker Node to communicate with each other, with other pods in the Kubernetes cluster, and with external systems.
- The Docker Engine refers to the instance of Docker Engine that the Worker Node is running. This engine manages the containers that are running on the same server instance as the Worker Node.
Kubernetes Master Node: The Kubernetes Master Node is the brain that runs the entire Kubernetes cluster. It runs on an OS instance of its own, but for redundancy, replicas of the Master Node usually exist on multiple OS instances. As the brain of the cluster, the Master Node automates the scheduling of pod deployments and the allocation of resources to Worker Nodes across the network as required. Four fundamental parts make up each Master Node: (1) the kube-apiserver, (2) the kube-controller-manager, (3) the kube-scheduler, and (4) etcd:
- The kube-apiserver is the control panel for the cluster. It offers an API that developers use to interact with Kubernetes and set the rules/limits that Kubernetes uses to manage the cluster. Developers send requests to this API with either a command-line client (like kubectl) or a web-based dashboard (like the Kubernetes Web UI Dashboard).
- The kube-controller-manager is the monitoring element. It tracks the number of pods in operation, request traffic, available resources, and other metrics, watching the cluster’s shared state through the kube-apiserver.
- The kube-scheduler executes the rules that govern the system. It uses the rules and limits that developers set – and the data from the kube-controller-manager – to assign pods to Worker Nodes and trigger actions across the cluster.
- etcd is the key-value store where the Kubernetes Master Node saves the rules that the kube-scheduler follows when controlling the cluster. Etcd holds the policies, system state, and other data that apply to the cluster.
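A quick way to see these components in a running cluster is with kubectl; the output varies by cluster, and <node-name> is a placeholder:

```bash
# List the Worker Nodes the Master Node knows about.
kubectl get nodes

# The control-plane components themselves run as pods in the kube-system namespace.
kubectl get pods -n kube-system

# Inspect a node's status – the data its kubelet reports back to the Master Node.
kubectl describe node <node-name>
```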
[Image: the Kubernetes Web UI Dashboard interface]
How Docker Swarm Compares to Kubernetes
Docker Swarm is another container-orchestration option that Docker Desktop offers alongside Kubernetes in its toolset. Since it was developed by Docker, you might think that Swarm competes head-on with Kubernetes for the container orchestration use case. However, the two solutions are not directly competing with each other.
Like Kubernetes, Docker Swarm can orchestrate a large-scale system consisting of thousands of microservices. However, Swarm has less of a learning curve, and setup takes a lot less time. Therefore, Swarm is great when you need to stand up a container-based system as quickly as possible and your team doesn’t have advanced Kubernetes engineering experience. That said, the Swarm API does not offer as many features and capabilities as Kubernetes, and there’s no native monitoring.
Kubernetes includes many features that Swarm doesn’t – such as more extensive automatic scaling capabilities for large, nuanced, highly available systems – along with a range of native monitoring tools. That said, given the steep learning curve of Kubernetes and the time it takes to set up, it’s more appropriate for teams that already have Kubernetes engineering experience – and for container orchestration jobs that require more sophisticated configurations.
Kubernetes User Reviews
Here are some excellent user reviews for Kubernetes from TrustRadius:
- “Kubernetes is a great tool for managing Docker images. It has great features for managing your containers.”
- “Like any other cloud migration projects, using Kubernetes can be a hard thing to bring to the team. I would not see that as a con of Kubernetes', just a fact.”
- “Kubernetes is very easy to deploy in the cloud but not easy for platforms other than AWS, GCP, Azure.”
Summary of Docker, Kubernetes, and Container-Based Systems
In summary, here are the key takeaways from this guide (a.k.a. the TL;DR version):
Containerization is a strategy that creates a virtual runtime environment without needing to replicate an OS instance. This provides a lighter-weight, more efficient, and higher-performance method for running multiple independent microservices on the same server. Containerized microservices allow you to build modular, container-based systems and/or microservices-based application architectures. Compared to traditional monolithic architectures, container-based systems consisting of independently running microservices are faster, easier, and more cost-efficient to operate, manage, upgrade, change, and scale.
Docker is an open-source suite of tools that allows you to create, deploy, and manage containers, containerized apps/microservices, and container-based systems. In terms of container orchestration, Docker Compose on its own is limited to managing containers running on a single Docker node; orchestrating across multiple nodes requires Docker Swarm or Kubernetes.
Kubernetes is an open-source container orchestration platform that allows you to manage complex and large-scale container-based systems made up of thousands (even hundreds of thousands) of containers hosted on different OS kernels. Kubernetes allows developers to automate deployment, scaling, replication, load balancing, and resource allocation across a massive network of containers (and groups of containers) running on many different servers.
Did you know you can generate a full-featured, documented, and secure REST API in minutes using DreamFactory? Sign up for our free 14-day hosted trial to learn how! Our guided tour will show you how to create an API using an example MySQL database provided to you as part of the trial!
DreamFactory: Simplifying API Integrations for Container-Based Systems
Docker and Kubernetes are powerful tools that help you build container-based systems, but there are many more pieces to the microservices puzzle. In addition to using Docker/Kubernetes to create, deploy, and orchestrate the containerized microservices that make up your architecture, you also need a tool that ensures the individual microservices can interact with each other through efficiently developed API connections.
This is where DreamFactory can help. DreamFactory is an API gateway that allows you to quickly integrate new microservices, applications, and other services into a larger system. By offering the unique ability to automatically generate REST APIs for nearly any database or service in minutes, DreamFactory bypasses weeks of coding time, decreases your labor costs, and achieves a dramatically faster development cycle.
Want to learn more about DreamFactory or try the platform for yourself? Click this link and schedule a free hosted trial of DreamFactory now!
Fascinated by emerging technologies, Jeremy Hillpot uses his backgrounds in legal writing and technology to provide a unique perspective on a vast array of topics including enterprise technology, SQL, data science, SaaS applications, investment fraud, and the law.