GoogleContainerTools' distroless base images are often mentioned as one of the ways to produce small(er), fast(er), and secure(r) containers. But what are these distroless images, really? Why are they needed? What's the difference between a container built from a distroless base and a container built from scratch? Let's take a deeper look.
Many of us these days seem to be in pursuit of better container images. And this is for good reason! Bloated images with many (potentially unneeded) moving parts slow down development and give more space for a CVE to sneak in. Luckily, there are a number of ways to produce slim and secure images, and everyone just needs to pick a suitable one. But before doing so, it's good to become aware of a potential dissonance between what we say is important for us (securing our software supply chains) and what may actually drive our decisions (keeping our dev loops fast).
Containers could have become a lightweight VM replacement. However, the most widely used form of containers, standardized by Docker/OCI, encourages you to have just one service per container. Such an approach has a bunch of pros - increased isolation, simplified horizontal scaling, higher reusability, etc. However, there is a big con - in the wild, virtual (or physical) machines rarely run just one service.
While Docker tries to offer some workarounds to create multi-service containers, Kubernetes takes a bolder step and chooses a group of cohesive containers, called a Pod, as the smallest deployable unit.
When I stumbled upon Kubernetes a few years ago, my prior VM and bare-metal experience allowed me to get the idea of Pods pretty quickly. Or so thought I... 🙈
When you start working with Kubernetes, one of the first things you learn is that every pod gets a unique IP and hostname and that within a pod, containers can talk to each other via localhost. So, it's kinda obvious - a pod is like a tiny little server.
After a while, though, you realize that every container in a pod gets an isolated filesystem and that from inside one container, you don't see processes running in other containers of the same pod. Ok, fine! Maybe a pod is not a tiny little server but just a group of containers with a shared network stack.
But then you learn that containers in one pod can communicate via shared memory! So, probably the network namespace is not the only shared thing...
This last finding was the final straw for me. So, I decided to take a deep dive and see with my own eyes:
- How Pods are implemented under the hood
- What is the actual difference between a Pod and a Container
- How one can create Pods using Docker.
And on the way, I hope it'll help me to solidify my Linux, Docker, and Kubernetes skills.
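The observations above (a shared network stack and shared memory, but separate filesystems and process lists) map onto Linux namespaces. Here is a minimal sketch, assuming a Linux host where `/proc/<pid>/ns` is readable, that reports which namespaces two processes have in common. The helper names `namespace_ids` and `shared_namespaces` are made up for this illustration, and the `proc_root` parameter exists only so the lookup can be pointed at another directory.

```python
import os

# Namespace types to compare; each one is exposed as /proc/<pid>/ns/<name> on Linux.
NAMESPACES = ("net", "ipc", "uts", "pid", "mnt")


def namespace_ids(pid, proc_root="/proc"):
    """Return {namespace: identifier} for a process, e.g. {"net": "net:[4026531992]"}."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    ids = {}
    for name in NAMESPACES:
        try:
            # Each entry is a symlink whose target identifies the namespace.
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Missing entry or insufficient permissions: leave it out.
            continue
    return ids


def shared_namespaces(pid_a, pid_b, proc_root="/proc"):
    """Return the sorted namespace types that both processes are members of."""
    a = namespace_ids(pid_a, proc_root)
    b = namespace_ids(pid_b, proc_root)
    return sorted(name for name in a if a[name] == b.get(name))
```

Calling `shared_namespaces(pid_a, pid_b)` with the host PIDs of two containers from the same pod (typically as root) should, with default pod settings, list `net` and `ipc` but not `mnt` or `pid`, which matches the behavior described above.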
There are many resources for people who want to learn Linux, Containers, or Kubernetes. However, most of these resources don't come with an interactive, hands-on learning experience. You can read tens of fine blog articles and watch hundreds of engaging YouTube videos, maybe even take some courses with theoretical quizzes at the end, but it's doubtful you'll master any of the above technologies without actively practicing them.
Theoretical-only knowledge of, say, Kubernetes doesn't really count. Hands-on exercises should be a must-have learning element. Some resources, including this blog, strive to provide reproducible instructions so that students can try out the new skills. However, for that, a running system is needed. Setting up such a system can make the learning curve substantially steeper or even make the task fully unbearable for inexperienced students.
So, where can a student practice the new skills?
One option is to experiment on a real staging (or production 🙈) environment. But it can be quite harmful. Luckily, there is an alternative. Some learning platforms offer interactive playgrounds mimicking real-world setups. On these platforms, students can SSH into disposable Linux servers, or even access multi-server stages right from their browsers!
Experimenting with the new skills in such sandboxes makes the learning hands-on. At the same time, these platforms free students from the need for provisioning playgrounds. It brings students closer to real-world environments while keeping the learning process safe - playgrounds can always be destroyed and recreated without damaging any real production systems.
Ivan Velichko (@iximiuz), September 25, 2021: "Cloud-Native Learn-by-Doing platforms. A list of sites with interactive playgrounds on ... learning.oreilly.com"
I got so fascinated by the idea of interactive playgrounds recently that I spent a week researching platforms that provide an in-browser learn-by-doing experience. Below are my findings, alphabetically ordered:
Disclaimer: In 2021, there is still a place for simple setups with just one machine serving all traffic. So, no Kubernetes and no cloud load balancers in this post. Just good old Docker and Podman.
Even when you have just one physical or virtual server, it's often a good idea to run multiple instances of your application on it. Luckily, when the application is containerized, it's actually relatively simple. With multiple application containers, you get horizontal scaling and much-needed redundancy at very little cost. Thus, if there is a sudden need to handle more requests, you can adjust the number of containers accordingly. And if one of the containers dies, there are others to handle its share of the traffic, so your app isn't a single point of failure (SPOF) anymore.
The tricky part here is how to expose such a multi-container application to the clients. Multiple containers mean multiple listening sockets. But most of the time, clients just want to have a single point of entry.
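To make the "single point of entry" idea concrete, here is a minimal sketch of the selection logic such an entry point might use: round-robin over the container addresses, skipping containers that are known to be down. This is an illustration only, not how Docker or Podman do it; the `RoundRobinBalancer` class and the addresses are hypothetical, and a real setup would put a reverse proxy in front of the containers rather than hand-rolled code.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backend addresses in turn, skipping ones marked as down."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._down = set()
        self._cycle = itertools.cycle(self._backends)

    def mark_down(self, backend):
        # Called when a container stops answering.
        self._down.add(backend)

    def mark_up(self, backend):
        # Called when a container is healthy again.
        self._down.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate not in self._down:
                return candidate
        raise RuntimeError("no healthy backends available")


# Hypothetical addresses of three application containers on one host.
balancer = RoundRobinBalancer(["127.0.0.1:8081", "127.0.0.1:8082", "127.0.0.1:8083"])
first = balancer.next_backend()      # clients only ever talk to the balancer
balancer.mark_down("127.0.0.1:8082")  # a dead container is simply skipped
second = balancer.next_backend()
```

Clients see one address, while the balancer spreads requests over however many containers are currently running, which is what removes the single container as a point of failure.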