Prometheus Metrics

Metrics retrieved from Sense servers can be exposed on a Prometheus-compatible endpoint. You don't have to be a Prometheus expert to use Butler SOS, but understanding a few basic concepts helps.

Overview

Storing metrics in Prometheus is not mandatory, but some kind of metrics storage (Prometheus, InfluxDB, or New Relic) is needed to take full advantage of Butler SOS' features.

Prometheus gathers metrics by "scraping" HTTP endpoints that expose metrics in a well-specified text format. Most, but not all, metrics from Sense servers are exposed on Butler SOS' Prometheus-compatible endpoint; string-valued metrics, for example, are omitted.

Key Difference from InfluxDB

InfluxDB is more flexible for some types of data (especially strings), while Prometheus makes it easier to aggregate data for display in Grafana.

Prometheus Endpoint

Prometheus support is enabled or disabled in the Butler-SOS.prometheus section of the config file. When enabled, metrics are available at the /metrics path on the host and port specified in the config file.

For example, if the host is 0.0.0.0 and the port is 9842, Butler SOS will listen on port 9842 on all available network interfaces. If the Butler SOS server's IP address is 192.168.1.168, you can view metrics in a web browser at:

```text
http://192.168.1.168:9842/metrics
```

This is the web page Prometheus will scrape and ingest into its time-series database.
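To check the endpoint from a script rather than a browser, something like the following may work. It is a minimal sketch using only the Python standard library; the function names `metrics_url` and `fetch_metrics` are hypothetical helpers and not part of Butler SOS.

```python
from urllib.request import urlopen


def metrics_url(host: str, port: int) -> str:
    """Build the URL of the /metrics endpoint for a given host and port."""
    return f"http://{host}:{port}/metrics"


def fetch_metrics(host: str, port: int, timeout: float = 5.0) -> str:
    """Download the raw metrics text that Prometheus would scrape."""
    with urlopen(metrics_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8")


def butler_sos_lines(metrics_text: str) -> list:
    """Keep only sample lines for metrics whose name starts with butlersos_."""
    return [
        line
        for line in metrics_text.splitlines()
        if line.startswith("butlersos_")
    ]
```

With the example address above, `butler_sos_lines(fetch_metrics("192.168.1.168", 9842))` would return the Butler SOS sample lines and skip comment lines and the Node.js metrics.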

Prometheus Concepts

In contrast to InfluxDB, to which Butler SOS pushes data, Prometheus works the other way around. The Prometheus server is responsible for gathering data exposed by the systems being monitored (in this case, Butler SOS).

The basic concepts are:

  • Metrics represent the measurements of interest (similar to "fields" in InfluxDB)
  • Labels are used to categorize metrics (similar to "tags" in InfluxDB)
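The relationship between a metric, its labels, and its value can be seen by taking one line of scraped text apart. The sketch below is a simplified parser for illustration only: the sample line and its label values are made up, the helper `parse_sample` is hypothetical, and it does not handle optional timestamps or every escaping rule of the Prometheus text format.

```python
import re

# One sample line in the Prometheus text format.
# The label values and the number are invented for illustration.
SAMPLE = 'butlersos_cpu_total{host="192.168.1.168",server_name="Production Server"} 12.5'

# metric name, optional {labels} block, then the numeric value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
# label_name="label value" pairs inside the braces
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str):
    """Split one sample line into (metric name, labels dict, float value)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))
```

Here the metric name is the measurement, the labels say which server it belongs to, and the value is the number Prometheus stores for that scrape.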

Labels

The labels available for all Prometheus metrics are:

| Label Name | Source | Description |
| --- | --- | --- |
| host | Butler-SOS.serversToMonitor.servers[].host | Host IP or FQDN of the server from which the metric comes |
| server_name | Butler-SOS.serversToMonitor.servers[].serverName | Human friendly server name |
| server_description | Butler-SOS.serversToMonitor.servers[].serverDescription | Human friendly server description |
| custom labels | Butler-SOS.serversToMonitor.servers[].serverTags.* | All tags defined in the config file will be added as Prometheus labels |
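The mapping from a server entry in the config file to its label set can be sketched as below. This is an illustration of the table, not Butler SOS' actual code: the function `labels_for_server`, the example server values, and the tag names `env` and `server_group` are all hypothetical. How a tag that shares a name with a built-in label is handled is not covered by this page; the sketch simply lets the tag win.

```python
def labels_for_server(server: dict) -> dict:
    """Map one serversToMonitor.servers[] entry to Prometheus label names."""
    labels = {
        "host": server["host"],
        "server_name": server["serverName"],
        "server_description": server["serverDescription"],
    }
    # Every key under serverTags becomes a label with the same name.
    # Assumption: on a name clash the tag overrides the built-in label.
    labels.update(server.get("serverTags", {}))
    return labels


# Hypothetical server entry, shaped like the config keys in the table.
EXAMPLE_SERVER = {
    "host": "sense1.example.com",
    "serverName": "Production Server",
    "serverDescription": "Central node",
    "serverTags": {"env": "prod", "server_group": "central"},
}
```

For `EXAMPLE_SERVER`, every metric from that server would carry five labels: `host`, `server_name`, `server_description`, `env`, and `server_group`.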

Qlik Sense Metrics

String Data Limitation

Prometheus is designed for storing numeric measurements and doesn't offer a good way to store strings. For that reason, Butler SOS metrics involving strings (for example, list of apps loaded in memory) are not available on the Prometheus endpoint.

If you need string data, use InfluxDB instead.

Most of the metrics come from Qlik Sense's health check API.

Available Metrics

| Metric | Type | Description |
| --- | --- | --- |
| butlersos_apps_calls | Gauge | Total number of requests made to the Qlik Sense engine |
| butlersos_apps_selections | Gauge | Total number of selections made to the Qlik Sense engine |
| butlersos_apps_activedocs_total | Gauge | Number of active apps. An app is active when a user is currently performing some action on it |
| butlersos_apps_inmemorydocs_total | Gauge | Number of apps currently loaded into memory, even if they do not have any open sessions or connections |
| butlersos_apps_loadeddocs_total | Gauge | Number of apps currently loaded into memory that also have open sessions or connections |
| butlersos_cache_added | Gauge | Number of cache objects added |
| butlersos_cache_hits | Gauge | Number of cache hits |
| butlersos_cache_lookups | Gauge | Number of cache lookups |
| butlersos_cache_replaced | Gauge | Number of replaced cache objects |
| butlersos_cache_saturated | Gauge | When the value is 1, the engine is running with high resource usage; otherwise 0 |
| butlersos_cpu_total | Gauge | Percentage of the CPU used by the engine, averaged over 30 seconds |
| butlersos_mem_committed | Gauge | Total amount of committed memory for the engine process in MB |
| butlersos_mem_allocated | Gauge | Total amount of allocated memory (committed + reserved) from the OS in MB |
| butlersos_mem_free | Gauge | Total amount of free memory (minimum of free virtual and physical memory) in MB |
| butlersos_session_active | Gauge | Number of active engine sessions. A session is active when a user is currently performing some action on an app |
| butlersos_session_total | Gauge | Total number of engine sessions |
| butlersos_users_active | Gauge | Number of distinct active users. An active user is currently performing an action on an app |
| butlersos_users_total | Gauge | Total number of distinct users within the current engine sessions |
| butlersos_engine_metadata | Gauge | Metadata about the Qlik Sense engine |
| butlersos_user_session_total | Gauge | Number of sessions (as reported by the proxy service) |
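Some useful figures are derived from two metrics rather than read directly. As one example, a cache hit ratio can be computed from `butlersos_cache_hits` and `butlersos_cache_lookups`. The sketch below assumes both values were read at the same moment for the same server; the function name `cache_hit_ratio` is hypothetical, and whether this ratio is meaningful for your environment is a judgment call.

```python
def cache_hit_ratio(hits: float, lookups: float) -> float:
    """Return hits / lookups, or 0.0 when no lookups have been recorded."""
    if lookups <= 0:
        # Avoid dividing by zero right after an engine start.
        return 0.0
    return hits / lookups
```

For instance, 75 hits out of 100 lookups gives a ratio of 0.75.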

Node.js Metrics

A set of Node.js-specific metrics is also available on Butler SOS' Prometheus endpoint. These provide insight into the Butler SOS process itself and are described in the "Default metrics" section of the prom-client documentation.

Common Node.js metrics include:

| Metric | Description |
| --- | --- |
| nodejs_eventloop_lag_seconds | Lag of event loop in seconds |
| nodejs_gc_duration_seconds | Garbage collection duration |
| nodejs_active_handles_total | Number of active handles |
| nodejs_active_requests_total | Number of active requests |
| nodejs_heap_size_total_bytes | Process heap size in bytes |
| nodejs_heap_size_used_bytes | Process heap size used in bytes |
| nodejs_external_memory_bytes | External memory size in bytes |
| process_cpu_user_seconds_total | Total user CPU time |
| process_cpu_system_seconds_total | Total system CPU time |
| process_start_time_seconds | Process start time |
| process_resident_memory_bytes | Resident memory size in bytes |

Example Prometheus Configuration

Add this to your prometheus.yml to scrape metrics from Butler SOS:

```yaml
scrape_configs:
  - job_name: "butler-sos"
    static_configs:
      - targets: ["butler-sos-server:9842"]
    scrape_interval: 30s
```

Example Queries

Here are some useful PromQL queries for Grafana dashboards:

CPU Usage

```promql
butlersos_cpu_total{server_name="Production Server"}
```

Memory Usage

```promql
butlersos_mem_committed{server_name=~".*"}
```

Active Users Over Time

```promql
avg_over_time(butlersos_users_active[5m])
```

Session Count by Server

```promql
sum by (server_name) (butlersos_session_total)
```

Engine Saturation Alerts

```promql
butlersos_cache_saturated == 1
```

Comparison with InfluxDB

| Feature | Prometheus | InfluxDB |
| --- | --- | --- |
| Data collection | Pull (scraping) | Push |
| String data | Not supported | Supported |
| Built-in alerting | Yes (via Alertmanager) | Via Kapacitor |
| Query language | PromQL | InfluxQL / Flux |
| Best for | Kubernetes, cloud-native | Detailed time-series data |

Released under the MIT License.