Prometheus metrics and configuration

Itential Platform exposes system metrics for scraping via the /prometheus_metrics route. A Prometheus server can be configured to scrape this route and visualize the data in charts and graphs.

Prometheus collects numeric time series data, which is useful for testing and for diagnostics, for example during an Itential Platform outage. Metrics are returned in the text exposition format understood by Prometheus servers of any version:

# HELP iap_unique_sessions Number of unique Itential Platform sessions
# TYPE iap_unique_sessions gauge
iap_unique_sessions 1
# HELP iap_active_sessions Number of active Itential Platform sessions
# TYPE iap_active_sessions gauge
iap_active_sessions 1
# HELP iap_total_api_calls Total count of api calls since Itential Platform start
# TYPE iap_total_api_calls gauge
iap_total_api_calls 1
# HELP iap_active_jobs Number of active jobs
# TYPE iap_active_jobs gauge
iap_active_jobs 17882
# HELP iap_memory_heap_usage_percentage percent of heap used
# TYPE iap_memory_heap_usage_percentage gauge
iap_memory_heap_usage_percentage 92.39992269241
# HELP iap_total_memory_heap_size total number of bytes allocated for this node process's heap
# TYPE iap_total_memory_heap_size gauge
iap_total_memory_heap_size 46194688
# HELP iap_memory_heap_used_size number bytes currently in use in this node process's heap
# TYPE iap_memory_heap_used_size gauge
iap_memory_heap_used_size 42683856
# HELP iap_cpu_user_usage user CPU time usage of Itential Platform process
# TYPE iap_cpu_user_usage gauge
iap_cpu_user_usage 9479497
# HELP iap_cpu_system_usage system CPU time usage of Itential Platform process
# TYPE iap_cpu_system_usage gauge
iap_cpu_system_usage 1884960
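For quick diagnostics outside a Prometheus server, the output above can be read with a few lines of Python. The following is a minimal sketch, not part of Itential Platform: the function name parse_metrics is hypothetical, and it assumes every sample line has the simple "name value" form shown above, with no labels.

```python
def parse_metrics(text):
    """Parse label-free text exposition lines into a {name: value} dict.

    Assumption: each sample line looks like "metric_name value", as in the
    Itential Platform output above. Labeled samples are not handled.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        name, value = line.split()[:2]
        metrics[name] = float(value)
    return metrics


# A shortened copy of the sample output, used only for illustration.
SAMPLE = """\
# HELP iap_active_jobs Number of active jobs
# TYPE iap_active_jobs gauge
iap_active_jobs 17882
# HELP iap_memory_heap_usage_percentage percent of heap used
# TYPE iap_memory_heap_usage_percentage gauge
iap_memory_heap_usage_percentage 92.39992269241
"""

parsed = parse_metrics(SAMPLE)
print(parsed["iap_active_jobs"])
```

For anything beyond ad hoc checks, a maintained Prometheus client library is likely a better choice than hand parsing.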

Configuration file

Prometheus servers are configured using a .yml file (default: prometheus.yml). The metrics_path option must be changed to /prometheus_metrics since Prometheus scrapes /metrics by default. See the Prometheus configuration docs for more information.

Itential recommends securing your endpoint by:

  • Adding basic authentication to requests for the /prometheus_metrics endpoint.
  • Enabling TLS client authentication and requiring all Prometheus clients to present valid certificates.

See Prometheus Security for further reading.

Example prometheus.yml:

# global config
global:
  scrape_interval: 2s
  evaluation_interval: 5s

scrape_configs:
  - job_name: 'pronghorn'

    static_configs:
      - targets: ['host:port']

    metrics_path: /prometheus_metrics

    basic_auth:
      username: 'admin@pronghorn'
      password: 'admin'

    scheme: https

    tls_config:
      cert_file: path/to/crt
      key_file: path/to/key
      insecure_skip_verify: false
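Before pointing a Prometheus server at the route, it can help to confirm by hand that the endpoint answers with the same credentials and certificates. The sketch below uses only the Python standard library and mirrors the basic_auth and tls_config settings in the example. The function names, the base URL, and the file paths are placeholders chosen for illustration, and nothing here is an Itential-provided client.

```python
import base64
import ssl
import urllib.request

METRICS_PATH = "/prometheus_metrics"


def build_request(base_url, username, password):
    """Build a GET request for the metrics route with a basic auth header."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(base_url.rstrip("/") + METRICS_PATH)
    request.add_header("Authorization", f"Basic {token}")
    return request


def build_ssl_context(cert_file, key_file):
    """Create a TLS context that presents a client certificate.

    Server certificate verification stays enabled, which matches
    insecure_skip_verify: false in the example configuration.
    """
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def fetch_metrics(base_url, username, password, cert_file, key_file):
    """Return the raw metrics text. Requires a reachable server."""
    request = build_request(base_url, username, password)
    context = build_ssl_context(cert_file, key_file)
    with urllib.request.urlopen(request, context=context, timeout=10) as resp:
        return resp.read().decode("utf-8")
```

Calling fetch_metrics("https://host:port", "admin@pronghorn", "admin", "path/to/crt", "path/to/key") with real values in place of the placeholders should return text in the exposition format, assuming the endpoint is secured as recommended above.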

Metrics collected

Metric                            Description
iap_unique_sessions               Number of unique unexpired sessions.
iap_active_sessions               Total number of unexpired sessions.
iap_total_api_calls               Total number of Itential Platform calls made by unexpired sessions.
iap_active_jobs                   Number of active jobs (status running or error).
iap_memory_heap_used_size         Itential Platform V8 heap memory used, in bytes.
iap_total_memory_heap_size        Itential Platform V8 heap memory allocated, in bytes.
iap_memory_heap_usage_percentage  Percent of allocated V8 heap memory in use (iap_memory_heap_used_size / iap_total_memory_heap_size).
iap_cpu_user_usage                User CPU time usage of the Itential Platform process.
iap_cpu_system_usage              System CPU time usage of the Itential Platform process.
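The relationship between the three heap metrics can be checked with the values from the sample output earlier on this page. This is a small sketch with a hypothetical helper name. It assumes the ratio is multiplied by 100 to give a percentage, which is consistent with the sample value of 92.39992269241.

```python
def heap_usage_percentage(used_bytes, total_bytes):
    """Return used heap as a percentage of allocated heap."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes * 100


# Values taken from the sample output:
# iap_memory_heap_used_size and iap_total_memory_heap_size.
pct = heap_usage_percentage(42683856, 46194688)
print(round(pct, 4))  # 92.3999
```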