Prometheus
Power your metrics and alerting with a leading open-source monitoring solution.
Most Prometheus components are written in Go, making them easy to build and deploy as static binaries.
Use it in:
Prometheus works well for recording any purely numeric time series. It fits both machine-centric monitoring and monitoring of highly dynamic service-oriented architectures.
It is designed for reliability: each server is standalone and does not depend on network storage or other remote services.
There is no need to set up extensive infrastructure to use it.
Doesn't fit when:
You need 100% accuracy, such as for per-request billing, because the collected data will likely not be detailed and complete enough.
Features
Multi-dimensional data model with time series data identified by metric name and key/value pairs, making it a powerful tool for data collection.
PromQL, a flexible query language that leverages this dimensionality to build powerful queries.
Efficient storage, with no reliance on distributed storage; single server nodes are autonomous.
Easy integration with Grafana and other clients. (Despite being a powerful tool, Prometheus itself offers only a basic expression browser, not full dashboards for its data.)
Intelligent alert system.
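The multi-dimensional data model from the list above can be illustrated with a few lines of Python. This is only a sketch, not Prometheus code: `series_id` is a hypothetical helper, and the metric and label names are invented for the example. It shows how a metric name plus a set of key/value labels identifies one time series.

```python
def series_id(name: str, labels: dict[str, str]) -> str:
    """Render a metric name and its labels as one time series identifier."""
    if not labels:
        return name
    # Sort labels so the same label set always yields the same identifier.
    pairs = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{name}{{{pairs}}}"


# Hypothetical samples: each distinct label set is its own time series.
samples = {
    series_id("http_requests_total", {"method": "GET", "status": "200"}): 1027,
    series_id("http_requests_total", {"method": "POST", "status": "500"}): 3,
}
for sid, value in samples.items():
    print(sid, value)
```

Changing any label value produces a different identifier, which is why one metric name can fan out into many series.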
Components
The Prometheus ecosystem consists of multiple components:
The main Prometheus server, which scrapes and stores time series data.
Client libraries for instrumenting application code.
A push gateway for supporting short-lived jobs.
Special-purpose exporters for services.
An Alertmanager to handle alerts.
Other support tools.
Architecture
Prometheus scrapes metrics from instrumented jobs, either directly or via an intermediary push gateway for short-lived jobs.
It stores all scraped samples locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts.
Grafana or other API consumers can be used to visualize the collected data.
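The idea of running rules over stored samples can be sketched as follows. This is a simplified illustration in plain Python, not how Prometheus evaluates rules: `sum_by` and `firing` are hypothetical helpers, and the sample data and threshold are made up. The first function aggregates existing series into a new one (the role of a recording rule), the second flags values above a threshold (the role of an alerting rule).

```python
from collections import defaultdict

# Hypothetical latest samples, keyed by (metric name, sorted label pairs).
latest = {
    ("http_requests_total", (("instance", "a:8080"), ("job", "api"))): 120.0,
    ("http_requests_total", (("instance", "b:8080"), ("job", "api"))): 30.0,
    ("http_requests_total", (("instance", "c:8080"), ("job", "web"))): 7.0,
}


def sum_by(samples, metric, label):
    """Aggregate one metric by a single label, like a tiny recording rule."""
    totals = defaultdict(float)
    for (name, labels), value in samples.items():
        if name == metric:
            totals[dict(labels).get(label, "")] += value
    return dict(totals)


def firing(totals, threshold):
    """Return the label values whose aggregated value exceeds the threshold."""
    return sorted(key for key, total in totals.items() if total > threshold)


per_job = sum_by(latest, "http_requests_total", "job")
print(per_job)
print(firing(per_job, 100.0))
```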
Retrieval
The component that pulls (scrapes) metrics from the monitored targets.
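What arrives during a scrape is plain text, one sample per line. The sketch below parses a small subset of that text format under stated assumptions: it handles only lines of the form `name{labels} value`, skips comments and anything it does not recognize (including lines with timestamps), and does not handle escaped quotes in label values. `parse_exposition` is a hypothetical helper, not part of any Prometheus library.

```python
import re

# name, optional {labels}, then a single value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from scraped text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no samples
        match = LINE_RE.match(line)
        if match is None:
            continue  # outside the subset this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


scraped = """\
# TYPE http_requests_total counter
http_requests_total{method="GET",code="200"} 1027
up 1
"""
for sample in parse_exposition(scraped):
    print(sample)
```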
TSDB (Time Series Database)
The component that stores all information in a time series format.
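How the data is stored changes over time.
Recent samples are held in an in-memory block backed by a write-ahead log, while older samples are compacted into larger on-disk blocks and eventually deleted once they exceed the retention period.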
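To make the time series format concrete, here is a toy in-memory store. It is a teaching sketch only and says nothing about how the real TSDB is implemented: `TinyTSDB` and its methods are hypothetical names. Each series is an append-only list of (timestamp, value) pairs, which can be queried by time range and trimmed in a retention-style cleanup.

```python
from bisect import bisect_left, bisect_right


class TinyTSDB:
    """Toy store: series key -> list of (timestamp, value), sorted by time."""

    def __init__(self):
        self.series = {}

    def append(self, key, timestamp, value):
        points = self.series.setdefault(key, [])
        if points and timestamp <= points[-1][0]:
            raise ValueError("samples must arrive in increasing time order")
        points.append((timestamp, value))

    def query_range(self, key, start, end):
        """Return all samples with start <= timestamp <= end."""
        points = self.series.get(key, [])
        low = bisect_left(points, (start, float("-inf")))
        high = bisect_right(points, (end, float("inf")))
        return points[low:high]

    def drop_before(self, cutoff):
        """Delete samples older than cutoff, as a retention policy would."""
        for key, points in self.series.items():
            keep_from = bisect_left(points, (cutoff, float("-inf")))
            self.series[key] = points[keep_from:]


db = TinyTSDB()
for ts, value in [(10, 1.0), (20, 2.0), (30, 4.0), (40, 8.0)]:
    db.append("demo_metric", ts, value)
print(db.query_range("demo_metric", 15, 30))
```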
HTTP server
An internal HTTP server through which Prometheus exposes its own metrics (so it can monitor itself) and makes all stored data available to the outside.
You can also open its web UI to see the scrape targets and other useful Prometheus configuration.
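Serving metrics over HTTP can be sketched with the Python standard library alone. This is an illustration of the general pattern of serving a text page at `/metrics`, not Prometheus's own server: `METRICS`, `render_metrics`, `MetricsHandler`, and `make_server` are hypothetical names, and the metric shown is made up.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric values to expose.
METRICS = {"demo_requests_total": 42.0}


def render_metrics(metrics):
    """Render metrics as one 'name value' line each, sorted by name."""
    return "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Create (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

Calling `make_server(8000).serve_forever()` would then serve the page at `http://127.0.0.1:8000/metrics`; the port is an arbitrary choice for the example.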
Service Discovery
Prometheus can communicate with service discovery mechanisms to find new targets to monitor (for instance, in dynamic architectures where nodes are added and removed by auto-scaling).
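One simple discovery mechanism is file-based: Prometheus reads target groups from a JSON file made of objects with `targets` and `labels` keys. The sketch below generates such a structure from an inventory. The inventory format, the `role` label, and the `to_file_sd` helper are assumptions made for this example; only the `targets`/`labels` shape of the output follows the file-based discovery format.

```python
import json


def to_file_sd(instances):
    """Group instances by role into file-based discovery target groups."""
    groups = {}
    for inst in instances:
        address = f'{inst["host"]}:{inst["port"]}'
        groups.setdefault(inst["role"], []).append(address)
    return [
        {"targets": sorted(targets), "labels": {"role": role}}
        for role, targets in sorted(groups.items())
    ]


# Hypothetical inventory, e.g. as reported by an auto-scaling group.
inventory = [
    {"host": "10.0.0.5", "port": 9100, "role": "web"},
    {"host": "10.0.0.6", "port": 9100, "role": "web"},
    {"host": "10.0.0.9", "port": 9100, "role": "db"},
]
print(json.dumps(to_file_sd(inventory), indent=2))
```

Regenerating the file whenever the inventory changes is what lets new nodes show up as targets without editing the main configuration.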
Pushgateway
For services or applications that do not run all the time (e.g., an application that runs only once a day at a specific time).
So that Prometheus does not keep trying to scrape such a job directly, the job pushes its metrics to the Pushgateway, which holds them until Prometheus scrapes the Pushgateway.
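A push from a short-lived job is an ordinary HTTP request carrying text-format metrics to a `/metrics/job/<job name>` path on the Pushgateway. The sketch below only builds that request and does not send it. The address `http://localhost:9091`, the job name, and the metric are assumptions for the example, and `build_push_request` is a hypothetical helper.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(base_url, job, metrics):
    """Build a PUT request that would replace the metrics for one job."""
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    url = f"{base_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain"},
    )


# Hypothetical batch job reporting when it last succeeded.
req = build_push_request(
    "http://localhost:9091",
    "nightly_backup",
    {"backup_last_success_timestamp_seconds": 1700000000},
)
print(req.get_method(), req.full_url)
# Sending is left out on purpose; urllib.request.urlopen(req) would perform the push.
```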
Exporters
Useful for exposing metrics from third-party software that you cannot instrument yourself.
Alertmanager
The Prometheus server evaluates the alerting rules and sends firing alerts to the Alertmanager, which deduplicates, groups, and routes them to receivers such as email or chat.
Metrics
The Prometheus client libraries offer four core metric types.
Counter
A metric with an incremental (cumulative) value.
It can only increase or be reset to zero on restarts.
Do not use a counter to expose a value that can decrease.
Ex.: The number of currently running processes can go down, so it should be a gauge, not a counter.
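The counter semantics can be sketched like this. Neither piece is the client library's implementation: `Counter` here is a minimal stand-in, and `increase` is a simplified, hypothetical version of reset handling that treats any drop between consecutive samples as a restart from zero (it does no extrapolation).

```python
class Counter:
    """Minimal counter: the value can only go up."""

    def __init__(self):
        self._value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        self._value += amount

    @property
    def value(self):
        return self._value


def increase(samples):
    """Total increase across samples, treating any drop as a reset to zero."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            total += cur  # the process restarted and counted up from zero
    return total


requests = Counter()
requests.inc()
requests.inc(4)
print(requests.value)

# Scraped values with one restart between 10 and 2.
print(increase([5, 8, 10, 2, 4]))
```

Because resets are detectable as drops, a consumer can still compute how much the counter grew overall, which is why a counter must never be decreased on purpose.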
Gauge
A metric that represents a single numerical value that can arbitrarily go up and down.
Ex.: Measurements of memory usage, temperature, etc.
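A gauge, by contrast, can move in both directions. The sketch below is a minimal stand-in, not the client library's class; `Gauge` and `track_inprogress` are hypothetical names used here to show a typical use, counting work that is currently in progress.

```python
from contextlib import contextmanager


class Gauge:
    """Minimal gauge: the value can be set, raised, or lowered freely."""

    def __init__(self, value=0.0):
        self.value = value

    def set(self, value):
        self.value = value

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


@contextmanager
def track_inprogress(gauge):
    """Raise the gauge while the block runs and lower it afterwards."""
    gauge.inc()
    try:
        yield
    finally:
        gauge.dec()


in_flight = Gauge()
with track_inprogress(in_flight):
    print(in_flight.value)  # one unit of work is running here
print(in_flight.value)
```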
Histogram
A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.
Useful for analyzing frequency distributions.
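The bucket counting described above can be sketched as follows. This is a simplified model, not the client library's code: the `Histogram` class and the bucket bounds are chosen for the example. Each observation lands in the first bucket whose upper bound it does not exceed, and the exposed counts are cumulative, so every bucket also includes everything below it.

```python
import math


class Histogram:
    """Minimal histogram with configurable upper bounds plus +Inf."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.bounds)  # per-bucket, not cumulative
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        self.sum += value
        self.count += 1
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                self.bucket_counts[i] += 1
                break

    def cumulative(self):
        """Return (upper bound, observations <= bound) pairs."""
        result, running = [], 0
        for bound, n in zip(self.bounds, self.bucket_counts):
            running += n
            result.append((bound, running))
        return result


latency = Histogram([0.1, 0.5, 1.0])
for seconds in [0.05, 0.3, 0.3, 0.7, 2.0]:
    latency.observe(seconds)
print(latency.cumulative())
print(latency.count, latency.sum)
```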
Summary
Similar to a histogram, a summary samples observations (usually things like request durations and response sizes).
While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
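A summary's sliding-window quantile can be sketched like this. It is a deliberately naive model: real client libraries use streaming estimation rather than keeping and sorting every observation, and the `Summary` class, the nearest-rank quantile method, and the explicit `now` argument (used to keep the example deterministic) are assumptions of this sketch.

```python
import math
from collections import deque


class Summary:
    """Minimal summary: total count and sum, plus quantiles over a time window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.count = 0
        self.sum = 0.0
        self._recent = deque()  # (timestamp, value), oldest first

    def _expire(self, now):
        # Drop observations that have fallen out of the sliding window.
        while self._recent and self._recent[0][0] <= now - self.window:
            self._recent.popleft()

    def observe(self, value, now):
        self.count += 1
        self.sum += value
        self._recent.append((now, value))
        self._expire(now)

    def quantile(self, q, now):
        """Nearest-rank quantile of the observations still in the window."""
        self._expire(now)
        values = sorted(v for _, v in self._recent)
        if not values:
            return math.nan
        rank = max(0, math.ceil(q * len(values)) - 1)
        return values[rank]


durations = Summary(window_seconds=60)
for t, value in enumerate([1.0, 2.0, 3.0, 4.0, 5.0]):
    durations.observe(value, now=t)
print(durations.quantile(0.5, now=4))
```

The count and sum keep growing for the lifetime of the process, while the quantiles only reflect observations that are still inside the window.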