Containers vs. Pods - Taking a Deeper Look

Containers could have become a lightweight VM replacement. However, the most widely used form of containers, standardized by Docker/OCI, encourages you to have just one process service per container. Such an approach has a bunch of pros - increased isolation, simplified horizontal scaling, higher reusability, etc. However, there is a big con - in the wild, virtual (or physical) machines rarely run just one service.

While Docker tries to offer some workarounds to create multi-service containers, Kubernetes makes a bolder step and chooses a group of cohesive containers, called a Pod, as the smallest deployable unit.

When I stumbled upon Kubernetes a few years ago, my prior VM and bare-metal experience allowed me to get the idea of Pods pretty quickly. Or so thought I... 🙈

Starting working with Kubernetes, one of the first things you learn is that every pod gets a unique IP and hostname and that within a pod, containers can talk to each other via localhost. So, it's kinda obvious - a pod is like a tiny little server.

After a while, though, you realize that every container in a pod gets an isolated filesystem and that from inside one container, you don't see processes running in other containers of the same pod. Ok, fine! Maybe a pod is not a tiny little server but just a group of containers with a shared network stack.

But then you learn that containers in one pod can communicate via shared memory! So, probably the network namespace is not the only shared thing...

This last finding was the final straw for me. So, I decided to have a deep dive and see with my own eyes:

  • How Pods are implemented under the hood
  • What is the actual difference between a Pod and a Container
  • How one can create Pods using Docker.

And on the way, I hope it'll help me to solidify my Linux, Docker, and Kubernetes skills.

Read more

Service Discovery in Kubernetes - Combining the Best of Two Worlds

Before jumping to any Kubernetes specifics, let's talk about the service discovery problem in general.

What is Service Discovery

In the world of web service development, it's a common practice to run multiple copies of a service at the same time. Every such copy is a separate instance of the service represented by a network endpoint (i.e. some IP and port) exposing the service API. Traditionally, virtual or physical machines have been used to host such endpoints, with the shift towards containers in more recent times. Having multiple instances of the service running simultaneously increases its availability and helps to adjust the service capacity to meet the traffic demand. On the other hand, it also complicates the overall setup - before accessing the service, a client (the term client is intentionally used loosely here; oftentimes a client of some service is another service) needs to figure out the actual IP address and the port it should use. The situation becomes even more tricky if we add the ephemeral nature of instances to the equation. New instances come and existing instances go because of the non-zero failure rate, up- and downscaling, or maintenance. That's how a so-called service discovery problem arises.

Service discovery problem

Service discovery problem.

Read more

Service Proxy, Pod, Sidecar, oh my!

How services talk to each other?

Imagine you're developing a service... For certainty, let's call it A. It's going to provide some public HTTP API to its clients. However, to serve requests it needs to call another service. Let's call this upstream service - B.

Service A talks to Service B directly.

Obviously, neither network nor service B is ideal. If service A wants to decrease the impact of the failing upstream requests on its public API success rate, it has to do something about errors. For instance, it could start retrying failed requests.

Read more