Prometheus is a great monitoring tool. It can easily scrape all the services in your cluster dynamically, without any static configuration. For me, the move from manual metrics shipping to Prometheus was magical. But, like any other technology we’re using, Prometheus need special care an love. If not handled properly, it can easily get out of shape. Why does it happen? And how can we keep it in shape? Let’s first do a quick recap of how Prometheus works.
Prometheus Monitoring Model
Prometheus works differently from other monitoring systems – it uses pull over push model. The push model is simple: Just push metrics from your code directly to the monitoring system, for example – Graphite.
Pull model is fundamentally different – the service exposes metrics on a specific endpoint, and Prometheus scrapes them once in a while (the scrape interval – see here how to configure it). While there are reasons to prefer push over the pull model, it has its own challenges: Each metric scrape operation can take time; what happens if it the scrape take longer then the scrape interval?
For example, let’s say Prometheus is configured to scrape its targets (that’s how services are called in Prometheus language) once in 20 seconds; what will happen if one scrape takes more then 20 seconds? The result is out of order metrics: instead of having a data point every 20 seconds, it will be every time the scrape completed. What can we do?
Today, I looked at our production Kubernetes cluster dashboard and I noticed something weird:
Well, this looks pretty bad. This is the average disk usage of the nodes running in the cluster. On average, only 20% percent of the disk in each node is available. This is probably not a good sign.