Prometheus Is Not a TSDB

Misconception is the right word to describe my early Prometheus experience. I came to Prometheus with vast Graphite and moderate InfluxDB experience. In my eyes, Graphite was a highly performant but fairly limited system. Metrics in Graphite are just dotted strings, and the values are always stored aggregated, with a resolution no finer than 1 second. But thanks to these limitations, Graphite is fast. In contrast, InfluxDB adopts the Metrics 2.0 format with multiple tags and fields per metric. It also allows the storage of non-aggregated data points with impressive nanosecond precision. But this power needs to be used carefully. Otherwise, you'll get all sorts of performance issues.

For some reason, I expected Prometheus to reside somewhere in between these two systems. A kinda-sorta system that takes the best of both worlds: rich labeled metrics, non-aggregated values, and high query performance.

And at first, it indeed felt as such! But then I started noticing that I cannot really explain some of the query results. Like at all. Or sometimes, I couldn't find evidence in metrics that just had to be there. Like the metrics were showing me a different picture than I was observing with my eyes while analyzing raw data such as web server access logs.

So, I started looking for more details. I wanted to understand precisely how metrics are collected, how they are stored, what the query execution model is, et cetera, et cetera. And at first, I was shocked by my findings! Oftentimes, the Prometheus behavior didn't make any sense, especially compared to Graphite or InfluxDB! But then it occurred to me that I was missing one important detail...

Both Graphite and InfluxDB are pure time-series databases (TSDB). Yes, they are often used as metric storage for monitoring purposes. But every particular setup of these systems comes with certain trade-offs and bolt-on additions addressing performance or reliability concerns. For instance, there is often a statsd-like daemon in front of your Graphite doing preaggregation; you use different rollup strategies for older data points, etc. But normally, you are aware of that. So when you query the last couple of days of metrics, you expect them to have per-second precision. But when you query something a week or a month old, you already know that each data point represents a minute of aggregated data, not a second.
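To make the rollup idea concrete, here is a minimal sketch of how per-second points could be collapsed into per-minute ones. The `rollup` helper and the sample data are hypothetical, and averaging is just one possible aggregation; this is not how Graphite implements it internally.

```python
from collections import defaultdict
from statistics import mean

def rollup(points, bucket_seconds=60, agg=mean):
    """Aggregate (timestamp, value) points into fixed-size time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, agg(values)) for start, values in sorted(buckets.items())]

# Two minutes of per-second data: value 1 in the first minute, 3 in the second.
per_second = [(t, 1 if t < 60 else 3) for t in range(120)]

# After the rollup, only one data point per minute is left.
per_minute = rollup(per_second)
print(per_minute)
```

Once the data is rolled up, anything that happened within a single minute (a short spike, for example) is no longer visible in the stored series.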

However, Prometheus is not a TSDB.



How to learn PromQL with Prometheus Playground

Working with real metrics is hard. Metrics are needed to give you an understanding of how your service behaves. That is, by definition, you have some uncertainty about the said behavior. Therefore, you have to be hell certain about your observability part. Otherwise, all sorts of metric misinterpretations and false conclusions will follow.

Here are the things I'm always trying to get confident about as soon as possible:

  • How metric collection works - push vs. pull model, aggregation on the client- or server-side?
  • How metrics are stored - raw samples or aggregated data, rollup and retention strategies?
  • How to query metrics - is my mental model aligned with the actual query execution model?
  • How to plot query results - what approximation errors may be induced by the graphing tools?

And even if I have a solid understanding of all of the above stuff, there will be one thing I'm never entirely sure about - the correctness of my query logic. But this one becomes testable once other parts are known.

Recently, I've been through another round of this journey - I was making an acquaintance with Prometheus. Since it was already the third or fourth monitoring system I had to work with, at first, I thought I could skip all the said steps and jump into writing queries against production metrics and reading graphs... The hope was on the knowledge extrapolation. But nope, it didn't work out well. So, I quickly gave up on the idea of cutting corners. That's how I found myself setting up a Prometheus playground, feeding it with some known inputs, observing the outputs, and trying to draw some meaningful conclusions.


Prometheus Cheat Sheet - Basics (Metrics, Labels, Time Series, Scraping)

Here we focus on the most basic Prometheus concepts - metrics, labels, scrapes, and time series.

What is a metric?

In Prometheus, everything revolves around metrics. A metric is a feature (i.e., a characteristic) of a system that is being measured. Typical examples of metrics are:

  • http_requests_total
  • http_request_size_bytes
  • system_memory_used_bytes
  • node_network_receive_bytes_total


Prometheus Cheat Sheet - Moving Average, Max, Min, etc (Aggregation Over Time)

When you have a long series of numbers, such as server memory consumption scraped every 10 seconds, it's a natural desire to derive another, probably more meaningful series from it, by applying a moving window function. For instance, a moving average or a moving quantile can give you much more readable results by smoothing out some spikes.
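As a quick illustration of the smoothing effect, here is a minimal moving average sketch in plain Python. The `moving_average` helper and the memory samples are made up for the example and have nothing to do with the Prometheus internals.

```python
from collections import deque

def moving_average(values, window):
    """Yield the mean of the last `window` values for each input value."""
    recent = deque(maxlen=window)  # Old values fall out automatically.
    for value in values:
        recent.append(value)
        yield sum(recent) / len(recent)

# Hypothetical memory usage samples (MiB), one per scrape, with a single spike.
memory_mib = [100, 102, 101, 180, 103, 99]

smoothed = list(moving_average(memory_mib, window=3))
print(smoothed)
```

The spike is still there in the derived series, but it is spread over three points and is much lower than in the raw data.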

Prometheus has a bunch of functions called <smth>_over_time(). They can be applied only to range vectors, which essentially makes them window aggregation functions. Every such function takes in a range vector and produces an instant vector whose elements are per-series aggregations.

For people like me who normally grasp code faster than text, here is some pseudocode of the aggregation logic:

# Input range vector example: one entry per series,
# each series identified by a unique label set.
# Samples are (value, timestamp) pairs.
range_vector = [
    ({"lab1": "val1", "lab2": "val1"}, [(12, 1624722138), (11, 1624722148), (17, 1624722158)]),
    ({"lab1": "val1", "lab2": "val2"}, [(14, 1624722138), (10, 1624722148), (13, 1624722158)]),
    ({"lab1": "val2", "lab2": "val1"}, [(16, 1624722138), (12, 1624722148), (15, 1624722158)]),
    ({"lab1": "val2", "lab2": "val2"}, [(12, 1624722138), (17, 1624722148), (18, 1624722158)]),
]

# agg_func examples: `sum`, `min`, `max`, `statistics.mean`,
# or any other callable that takes a list of values.

def agg_over_time(range_vector, agg_func, timestamp):
    # The future instant vector.
    instant_vector = {"timestamp": timestamp, "elements": []}

    for (labels, samples) in range_vector:
        # Samples are (value, timestamp) pairs;
        # only the values take part in the aggregation.
        values = [value for (value, _) in samples]

        # Every instant vector element is
        # an aggregation of multiple samples.
        sample = agg_func(values)
        instant_vector["elements"].append((labels, sample))

    # Notice that the timestamp of the resulting instant vector
    # is the timestamp of the query execution. I.e., it may not
    # match any of the timestamps in the input range vector.
    return instant_vector
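The "moving window" part becomes visible when the same aggregation is evaluated at several consecutive timestamps. Below is a rough, self-contained sketch of that idea for a single series. The `select_range` and `over_time` helpers, the sample data, and the left-open, right-closed window are all assumptions made for illustration; the exact boundary handling in a real Prometheus may differ.

```python
# Hypothetical raw samples of one series as (value, timestamp) pairs,
# scraped every 10 seconds.
samples = [(12, 0), (11, 10), (17, 20), (14, 30), (10, 40)]

def select_range(samples, eval_ts, range_seconds):
    """Pick the samples that fall into the window (eval_ts - range, eval_ts]."""
    return [(v, ts) for (v, ts) in samples if eval_ts - range_seconds < ts <= eval_ts]

def over_time(samples, agg_func, eval_ts, range_seconds):
    """Aggregate the values inside the window that ends at eval_ts."""
    window = select_range(samples, eval_ts, range_seconds)
    if not window:
        return None  # No samples in the window, so no result element.
    # The result carries the evaluation timestamp, not a sample timestamp.
    return (agg_func([v for (v, _) in window]), eval_ts)

# Roughly what a max over a 20s range would see at different evaluation times.
results = [over_time(samples, max, ts, 20) for ts in (15, 25, 35, 45)]
print(results)
```

Each evaluation sees a different slice of the raw samples, and none of the resulting timestamps matches a scrape timestamp.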


Prometheus Cheat Sheet - How to Join Multiple Metrics (Vector Matching)

PromQL looks neat and powerful. And at first sight, simple. But when you start using it for real, you'll quickly notice that it's far from being trivial. Searching the Internet for query explanations rarely helps - most articles focus on pretty high-level overviews of the language's most basic capabilities. For example, when I needed to match multiple metrics using the common labels, I quickly found myself reading the code implementing binary operations on vectors. Without a solid understanding of the matching rules, I constantly stumbled upon various query execution errors, such as complaints about a missing group_left or group_right modifier. Reading the code, feeding my local Prometheus playground with artificial metrics, running test queries, and validating assumptions finally helped me understand how multiple metrics can be joined together. Below are my findings.
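To give a feeling for what "matching on common labels" means, here is a deliberately simplified model of one-to-one matching between two instant vectors. It is a sketch under my own assumptions, not the actual Prometheus implementation: `match_one_to_one`, the metric data, and the error messages are all hypothetical.

```python
def match_one_to_one(left, right, on):
    """Pair up elements of two instant vectors whose `on` labels are equal.

    Each vector is a list of (labels, value) tuples.
    """
    def signature(labels):
        return tuple(labels.get(name) for name in on)

    right_by_sig = {}
    for labels, value in right:
        sig = signature(labels)
        if sig in right_by_sig:
            # Roughly the situation where PromQL asks for group_left/group_right.
            raise ValueError(f"multiple right-side elements match {sig}")
        right_by_sig[sig] = value

    pairs, seen = [], set()
    for labels, value in left:
        sig = signature(labels)
        if sig in seen:
            raise ValueError(f"multiple left-side elements match {sig}")
        seen.add(sig)
        if sig in right_by_sig:  # Elements without a partner are dropped.
            pairs.append((dict(zip(on, sig)), value, right_by_sig[sig]))
    return pairs

# Hypothetical vectors: errors and total requests per HTTP method.
errors = [({"method": "GET", "code": "500"}, 5), ({"method": "POST", "code": "500"}, 2)]
requests = [({"method": "GET"}, 100), ({"method": "POST"}, 40), ({"method": "PUT"}, 10)]

pairs = match_one_to_one(errors, requests, on=["method"])
ratios = {labels["method"]: err / total for labels, err, total in pairs}
print(ratios)
```

In this model, the PUT element has no partner and silently disappears from the result, while a second GET element on either side would make the match ambiguous and raise an error.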
