Learn-by-Doing Platforms for Dev, DevOps, and SRE Folks

There are many resources for people who want to learn Linux, Containers, or Kubernetes. However, most of these resources don't come with an interactive, hands-on learning experience. You can read dozens of fine blog articles, watch hundreds of engaging YouTube videos, and maybe even take some courses with theoretical quizzes at the end, but it's doubtful you'll master any of these technologies without actively practicing them.

Theoretical-only knowledge of, say, Kubernetes doesn't really count. Hands-on exercises should be a must-have learning element. Some resources, including this blog, strive to provide reproducible instructions so that students can try out their new skills. However, that requires a running system, and setting one up can make the learning curve substantially steeper or even render the task outright overwhelming for inexperienced students.

So, where can a student practice these new skills?

One option is to experiment on a real staging (or production 🙈) environment, but that can be quite harmful. Luckily, there is an alternative. Some learning platforms offer interactive playgrounds mimicking real-world setups. On these platforms, students can SSH into disposable Linux servers or even access multi-server environments right from their browsers!

Experimenting with new skills in such sandboxes makes the learning hands-on. At the same time, these platforms free students from having to provision the playgrounds themselves. This brings students closer to real-world environments while keeping the learning process safe - playgrounds can always be destroyed and recreated without damaging any real production systems.

I got so fascinated by the idea of interactive playgrounds recently that I spent a week researching platforms that provide an in-browser, learn-by-doing experience. Below are my findings, in alphabetical order:

Read more

How to learn PromQL with Prometheus Playground

Working with real metrics is hard. Metrics exist to give you an understanding of how your service behaves. That is, by definition, you have some uncertainty about that behavior. Therefore, you have to be dead certain about your observability setup. Otherwise, all sorts of metric misinterpretations and false conclusions will follow.

Here are the things I'm always trying to get confident about as soon as possible:

  • How metric collection works - push vs. pull model, aggregation on the client- or server-side?
  • How metrics are stored - raw samples or aggregated data, rollup and retention strategies?
  • How to query metrics - is my mental model aligned with the actual query execution model?
  • How to plot query results - what approximation errors may be induced by the graphing tools?

And even if I have a solid understanding of all of the above, there will be one thing I'm never entirely sure about - the correctness of my query logic. But this one becomes testable once the other parts are known.
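
As a tiny illustration of that last point, here is a minimal sketch of such a test against the Prometheus HTTP API. The /api/v1/query endpoint is real, but the metric name, the port, and the expected rate are my own assumptions - they presume a playground instance scraping a synthetic counter that grows by ~4 increments per second (a generator for exactly such a counter is sketched further down):

    import requests

    # Hypothetical playground setup: a local Prometheus at :9090 scrapes a
    # synthetic counter known to grow by ~4 increments per second, so the
    # query below has a predictable answer.
    PROM_URL = "http://localhost:9090/api/v1/query"

    resp = requests.get(PROM_URL, params={"query": "rate(demo_requests_total[1m])"})
    resp.raise_for_status()

    # An instant query returns a vector of series, each carrying a
    # [timestamp, "value"] pair.
    for series in resp.json()["data"]["result"]:
        observed = float(series["value"][1])
        print(f"{series['metric']} -> {observed:.2f}/s (expected ~4.0)")

If the observed rate lands far from the known input rate, it's either the query or my mental model of it that is off.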

Recently, I've been through another round of this journey - I was getting acquainted with Prometheus. Since it was already the third or fourth monitoring system I had to work with, at first, I thought I could skip all of the above steps and jump right into writing queries against production metrics and reading graphs... The hope was that my existing knowledge would extrapolate. But nope, it didn't work out well. So, I quickly gave up on the idea of cutting corners. That's how I found myself setting up a Prometheus playground, feeding it with some known inputs, observing the outputs, and trying to draw some meaningful conclusions.
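
For a taste of what "known inputs" can look like, here is a minimal sketch using the official prometheus_client Python library; the metric name, port, and increment rate are made up for illustration:

    import time
    from prometheus_client import Counter, start_http_server

    # A fully predictable input: one increment every 250 ms, i.e. ~4/s.
    REQUESTS = Counter(
        "demo_requests_total",
        "Synthetic requests for playground experiments",
    )

    if __name__ == "__main__":
        start_http_server(8000)  # exposes /metrics for Prometheus to scrape
        while True:
            REQUESTS.inc()
            time.sleep(0.25)

Point a playground Prometheus at localhost:8000, and rate(demo_requests_total[1m]) should hover around 4 - any significant deviation signals a flaw in the query or in my understanding of it.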

Read more