Getting started

Taking your first steps.

Butler SOS is written in Node.js, which is a cross-platform programming environment. This means most kinds of computers and servers can be used to run Butler SOS, including Windows, Linux and Mac OS.

Setting up Butler SOS is pretty straightforward, but you do need a working understanding of Qlik Sense admin tasks.
For example, you need to export certificates from the QMC and install Butler SOS itself.

1 - Overview

SenseOps monitoring - what’s that?

This page provides the general steps to get started with Butler SOS.
It also explains how Butler SOS relates to other tools and services that collectively make up the SenseOps concept.

Butler SOS

Qlik Sense + DevOps = SenseOps

Butler SenseOps Stats (“Butler SOS”) is a monitoring tool for Qlik Sense, built with DevOps workflows in mind.

It publishes operational, close to real-time Qlik Sense Enterprise metrics to InfluxDB, Prometheus, New Relic and MQTT. From there the metrics can be visualised using tools like Grafana or New Relic, or acted on by downstream systems that listen to the MQTT topics used by Butler SOS.

Butler SOS gathers operational metrics from several sources, including the Sense healthcheck API, the Sense session API and the Sense logs.

Do I really need a tool like this?

Let’s say you are somehow involved in (or maybe even responsible for) your company’s client-managed Qlik Sense Enterprise on Windows (QSEoW) environment.

Let’s also assume you have more than 5-10 users in your Sense environment. Maybe you even have business critical data in your Sense apps.

Given the above, the answer is almost certainly “yes”: You can simplify your workday and provide a better analytics experience to your end users by using a tool like Butler SOS.

Companies using Butler SOS range from small companies with a single Sense server to large enterprises with dozens of Sense servers and (many!) thousands of users.

Why a separate tool for this?

Good question.

While Qlik Sense ships with a great Operations Monitor application, it is not useful or intended for real-time operational monitoring.
The Ops Monitor app is great for retrospective analysis of what happened in a Qlik Sense environment, but for a real-time understanding of what’s going on in a Sense environment something else is needed - enter Butler SOS.

The most common way of using Butler SOS is for creating real-time dashboards based on the data in the InfluxDB or Prometheus database, showing operational metrics for a Qlik Sense Enterprise environment.

Sample screen shots of some basic Grafana dashboards created using data extracted by Butler SOS:

[Screenshots: three sample Grafana dashboards]

As mentioned above, Butler SOS can also send data to MQTT for use in any MQTT enabled tool or system.

Known limitations & improvement ideas

Things can always be improved, of course. Here are some ideas on things for future versions:

  • The MQTT messages are kind of basic, at least when it comes to data from the Sense logs and for detailed user sessions. In both those cases a single text string is sent to MQTT. That’s fine, but assumes the downstream consumer of the MQTT message can parse the string and extract the information of interest.
    A better approach would be to send more detailed MQTT messages. Those would be easier to consume and act upon for downstream systems, but it would on the other hand mean lots more MQTT messages being sent.
  • Send data as Kafka messages. Same basic idea as for MQTT messages, but having the Sense operational data in Kafka would make it easier to process/use it in (big) data pipelines.

If you have ideas or suggestions on new features, please feel free to add them in the Butler SOS GitHub project.

Where should I go next?

Ready to move on?

Great! Here are some good starting points:

  • Examples: Check out some Grafana dashboards to get inspiration for what can be done!
  • Installation & setup: Learn how to install Butler SOS, then set it up according to your needs.

I have a question or want to report an issue

Feel free to reach out via GitHub discussions for general questions, GitHub issues for bugs, or by email to info@ptarmiganlabs.com.

Security / Disclosure

If you discover any important bug with Butler SOS that may pose a security problem, please disclose it confidentially to security@ptarmiganlabs.com first, so that it can be assessed and hopefully fixed prior to being exploited. Please do not raise GitHub issues for security-related doubts or problems.

Who’s behind Butler SOS?

Butler SOS is an open source project sponsored by Ptarmigan Labs, an IT consulting company in Stockholm, Sweden.
The project lead is Göran Sander from the same company.

Please refer to the Contribution guidelines page for details on how to contribute, suggest features etc to the tool.

2 - Install

The steps needed for installing and configuring vary slightly depending on what platform you use. The details are found here.

Warning

Butler SOS can store data in InfluxDB 1.x or 2.x databases.

InfluxDB 3.x is currently not supported.

Tip

There is a Tips & Tricks to get started with Butler SOS document on the Butler SOS forums.

It contains descriptions of issues people have faced when installing Butler SOS, as well as solutions to them.

If in doubt on how to install Butler SOS, please consider Docker (or Kubernetes if available) as the first alternative.
Why? Several reasons:

  • Very quick to get started. Usually it takes just a few minutes to set up a Butler SOS instance in Docker.
  • Using Docker is a great way to test new tools without having to install the tool on one of your actual servers. If you decide the tool in question is not for you - just delete the Docker container. Your servers remain 100% the same as before the test.
  • The previous point is true not only for Butler SOS, but also for its companion tools InfluxDB, Prometheus, Grafana and MQTT (via for example the Mosquitto MQTT broker). You can run all of these tools in their own Docker containers, without installing a single new native application during your evaluation of Butler SOS.
  • Make use of your existing Docker runtime environments, or use those offered by Amazon, Google, Microsoft etc.
  • Benefit from the comprehensive tools ecosystem (monitoring, deployment etc) that is available for Docker.
  • Updating Butler SOS to the latest version (assuming no config file changes are needed for that particular upgrade) is as easy as stopping the container, doing a “docker pull ptarmiganlabs/butler-sos:latest”, and finally starting the container again.
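
The update procedure in the last bullet can be sketched as a small shell function. This is a minimal sketch, not an official script: it assumes Butler SOS was started from a docker-compose.yml that defines a service named butler-sos, and that the function is run from the directory holding that file. The function name is made up for illustration.

```shell
# Sketch: update a docker-compose managed Butler SOS to the latest image.
# Assumes the current directory holds a docker-compose.yml with a service
# named "butler-sos". The function name is hypothetical.
update_butler_sos() {
    # Stop the running container
    docker-compose stop butler-sos || return 1
    # Fetch the newest image from Docker Hub
    docker pull ptarmiganlabs/butler-sos:latest || return 1
    # Start again in the background; docker-compose recreates the
    # container when the image has changed
    docker-compose up -d butler-sos
}
```

Run update_butler_sos only after checking that the new version does not need config file changes.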

If Docker is not an option, the pre-built, stand-alone binaries for Windows, Linux and macOS are good options.
They offer a download-configure-execute approach to running Butler SOS.
This will be the easiest way to use Butler SOS if you are not familiar with Docker.

But even with the above recommendations, Butler SOS can be deployed in lots of different configurations.
It is therefore difficult to give precise instructions that will work everywhere, for everyone. The fact that Butler SOS uses certificates to authenticate with Sense is a particularly complicating factor. Certificates are (when correctly used) great for securing systems, but they can also cause headaches.

First we must recognize that Sense uses self-signed certificates. This is fine, and as long as you work on a server where Sense Enterprise is installed, that server will have the Sense-provided certificates and Certificate Authority (CA) installed.

This means that the easiest option for getting Butler SOS up and running is usually to install it on one of your Sense servers.

That said, it is probably better system design to run Butler SOS (and maybe other members of the Butler family) on their own server, maybe using some flavour of Linux (lower cost compared to Windows). Windows servers work equally well though.

In this case you might want to consider exporting the Sense CA certificate from one of your Sense servers and then installing it on the Linux server. This should technically not be needed for Butler SOS to work correctly: as long as you specify the correct root.pem file in the Butler SOS config file, you should be ok.

If you specify an incorrect root CA certificate file in the clientCertCA config option, you will get an error like this:

2018-05-23T20:36:44.393Z - error: Error: Error: unable to verify the first certificate
    at TLSSocket.<anonymous> (_tls_wrap.js:1105:38)
    at emitNone (events.js:106:13)
    at TLSSocket.emit (events.js:208:7)
    at TLSSocket._finishInit (_tls_wrap.js:639:8)
    at TLSWrap.ssl.onhandshakedone (_tls_wrap.js:469:38)
2018-05-23T20:36:49.164Z - verbose: Event started: Query log db
2018-05-23T20:36:49.180Z - verbose: Event started: Statistics collection
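
The error above is raised during the TLS handshake, when the certificate presented by Sense cannot be verified against the configured CA file. One offline sanity check is to confirm that the exported client.pem chains to the exported root.pem, which suggests both files come from the same Sense export. The sketch below assumes the openssl command line tool is installed and that both files sit in one directory; the function name is hypothetical, and the check does not contact the Sense server.

```shell
# Sketch: check that client.pem was issued by the CA in root.pem.
# Assumes the openssl command line tool is installed. The function name
# is hypothetical; the file names are those used in this document.
check_sense_cert_chain() {
    cert_dir="$1"
    # Exit code is 0 only if client.pem chains to root.pem
    openssl verify -CAfile "$cert_dir/root.pem" "$cert_dir/client.pem"
}
```

For example, check_sense_cert_chain d:/secrets/sensecert (or ./config/certificate in a Docker setup). A failure here suggests that root.pem is not the CA that issued the other exported certificates.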

A general note on host names is also relevant.
If you specify a server name of “myserver.company.com” while exporting certificates from the QMC, you should use that same server name in the Butler SOS config file. Failing to do so will (most likely) result in an error:

2018-05-23T19:51:03.087Z - error: Error: Error: Hostname/IP doesn't match certificate's altnames: "Host: serveralias.company.net. is not in the cert's altnames: DNS:myserver.company.com"
    at Object.checkServerIdentity (tls.js:223:17)
    at TLSSocket.<anonymous> (_tls_wrap.js:1111:29)
    at emitNone (events.js:106:13)
    at TLSSocket.emit (events.js:208:7)
    at TLSSocket._finishInit (_tls_wrap.js:639:8)
    at TLSWrap.ssl.onhandshakedone (_tls_wrap.js:469:38)
2018-05-23T19:51:07.701Z - verbose: Event started: Statistics collection
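
To see which host names a certificate accepts, its subject alternative names can be printed with openssl. This is a minimal sketch that assumes the openssl command line tool is installed and that you have a PEM copy of the certificate presented by the Sense server; the function name and the file name server.pem in the usage note are illustrative assumptions.

```shell
# Sketch: print the subject alternative names stored in a PEM certificate.
# Assumes the openssl command line tool is installed. The function name
# is hypothetical.
list_cert_altnames() {
    cert_file="$1"
    # Dump the certificate as text and keep the altnames header plus
    # the line that follows it (the actual DNS/IP entries)
    openssl x509 -in "$cert_file" -noout -text \
        | grep -A1 "Subject Alternative Name"
}
```

Calling list_cert_altnames server.pem prints entries of the form DNS:myserver.company.com. The host name used in the Butler SOS config file should be one of them.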

2.1 - Choosing a platform - what are the options?

You can run Butler SOS on several platforms, each with their own pros and cons. This section should help you decide which platform is right for you.

As Butler SOS is written in Node.js, the tool in theory runs on all platforms where Node.js is available. It is also available as a Docker image.

Docker is by far the preferred way of running Butler SOS, mainly because it gives you a very nice, production grade (stable, scalable, monitorable etc) execution environment. If you are really serious about scalability and stability you could even run Butler SOS in Kubernetes.

Other platforms can be used too, of course - let’s look at the pros and cons of some of the more commonly used platforms:

Platform | Pros | Cons
Docker | Easy to set up Butler SOS in Docker; Easy to test new versions of Butler SOS; Use existing Docker infrastructure; Monitoring, restarts etc built into Docker; Runs on low cost hardware and OSs | Docker environment needed (if not already available). Setting up and running Docker is not hard, but does require somewhat other skills than those needed to run a Sense environment
Kubernetes | Enterprise grade; Fault tolerant; Deployed alongside other enterprise applications | More difficult to set up than Docker
Windows server | Butler SOS can run on same server as Qlik Sense, saving hardware/server costs; Pre-built, standalone binaries available | Running Butler SOS natively on the Sense server is a potential risk (usually a good idea to isolate systems/services to their own servers/environments whenever possible); More difficult (compared to Docker) to achieve a production grade setup (auto restarts etc)
Linux | No cost for operating system (at least not for most Linux versions); Runs on low cost hardware; Pre-built, standalone binaries available | More difficult (compared to Docker) to achieve a production grade setup (auto restarts etc)
Mac OS | For development, if you want to extend or modify Butler SOS; Signed, pre-built, standalone binaries available | Not a server grade operating system, i.e. not for production use
Windows (desktop) | For development, if you want to extend or modify Butler SOS | Not a server grade operating system, i.e. not for production use

2.2 - Native app

How to install Butler SOS as a Node.js application.

Selecting an OS

While Qlik Sense Enterprise is a Windows only system, Butler SOS should be able to run on any OS where Node.js is available.
Butler SOS has been successfully used as a native Node.js app - during development and production - on Windows, Linux (Debian and Ubuntu tested) and macOS.

Prerequisites

What | Comment
Qlik Sense Enterprise on Windows | Mandatory. Butler SOS is developed with Qlik Sense Enterprise on Windows (QSEoW) in mind. Butler SOS is simply not intended to work with Sense Desktop or Sense cloud.
Node.js | Mandatory. Butler SOS is written in Node - which is thus a firm requirement.
MQTT broker | Optional. MQTT is used for outbound pub-sub messaging. Butler SOS assumes a working MQTT broker is available, the IP of which is defined in the Butler SOS config file. Mosquitto is a great open source broker. It requires very little hardware to run, even the smallest (usually free) Amazon/Google/Microsoft/… instance is enough, if you want a dedicated MQTT server. If you don’t care about the pubsub features of Butler SOS, you don’t need a MQTT broker. In this case you can disable the MQTT features in the config YAML file.
InfluxDB | Use at least one of InfluxDB and Prometheus. An open source database for realtime information, used to store metrics around Butler’s own memory usage over time (if this feature is enabled). At this point more metrics and events are sent to InfluxDB, compared to Prometheus.
Prometheus | Use at least one of InfluxDB and Prometheus. The de-facto standard open source tool for metrics gathering in large-scale systems, including Kubernetes. A bit more complex to set up and configure compared to InfluxDB, but also more focused on providing observability features.
Grafana | Optional. The de-facto open source standard for showing real-time metrics. In order to visualise Sense realtime metrics in Grafana you must enable at least one of InfluxDB or Prometheus.

2.2.1 - Windows

Running Butler SOS in Windows.

Installation

There are two options: Run Butler SOS as a standalone binary or as a Node.js app. The first is by far easier to set up and maintain and thus recommended.

Using the pre-built, standalone app

The pre-built binaries are available from the releases page.

  1. Download the Windows binary
  2. “Unblock” the downloaded zip file
    1. Right-click the zip file
    2. Select “properties”
    3. Mark the “Unblock” check box in the lower right part of the properties window
    4. Click the “Apply” button, then “Ok” to close the properties window
  3. Unzip the zip file
  4. Move the extracted butler-sos.exe file to desired location, for example d:\tools\butler-sos
  5. Use nssm or similar tool to install Butler SOS as a Windows service

Unblocking the Butler SOS zip file on Windows Server

Using Node.js

In this scenario you will use the Butler SOS source code together with the standard Node.js runtime libraries.

The result is the same as with the stand-alone binaries, you just have to do more of the work yourself.
This is usually not preferred, but if you want to add new features to Butler SOS (or modify existing ones), this option is for you.

1. Install Node.js

The latest LTS version is usually a good choice.

2. Select a directory from which Butler SOS will be run

This can be pretty much anywhere, in this example d:\tools\butler-sos will be used.

3. Get Butler SOS

Get the desired Butler SOS version and extract it into the directory above.

Get the latest available version unless you have a really good reason to use an older version.
New features are added, bugs fixed and security updates are applied in each version - it’s simply a good idea to use the latest version.

Do not just clone the Butler SOS repository as that will give you the latest development version, which may not yet be fully tested and packaged.
The exception is of course if you want to contribute to Butler SOS development - then forking and cloning the repository is the right thing to do.

4. Install Node.js dependencies

From d:\tools\butler-sos\src, run npm i to install the various Node.js modules used by Butler SOS. Depending on your server configuration you may get some warnings about (for example) Python not being installed. These can usually be ignored.

Configuration

The configuration file is used the same way as when Butler SOS runs on Docker, with one exception:

The path to the certificates used to authenticate with Sense must be specified in the config file. With Docker the certificate path is always the same, but with Windows you need to specify where the certificate files are located.

For example, if the certificate files exported from Sense are stored in d:\secrets\sensecert, the config file would look like this when used on Windows:

  ...
  # Certificates to use when querying Sense for healthcheck data. Get these from the Certificate Export in QMC.
  cert:
    clientCert: d:\secrets\sensecert\client.pem
    clientCertKey: d:\secrets\sensecert\client_key.pem
    clientCertCA: d:\secrets\sensecert\root.pem

Stayin’ alive

A tool like Butler SOS should of course start automatically when the server it runs on is restarted. This can be achieved in at least a couple of ways:

  1. By far the best option is to turn Butler SOS into a Windows service. That way it will be started on server boot, restarted if it fails etc. There are various tools for doing this, with NSSM being a very good one. Butler SOS has been installed in lots of Sense environments this way.

  2. You can also use a Node process monitor such as PM2 to monitor the Butler SOS process, and restart it if it for some reason crashes. PM2 is not entirely easy to use on Windows though.

2.2.2 - Linux and Mac OS

Running Butler SOS in Linux and Mac OS. Installation and configuration.

Installation

There are two options: Run Butler SOS as a standalone binary or as a Node.js app. The first is by far easier to set up and maintain and thus recommended.

Using the pre-built, standalone app

The pre-built binaries are available from the releases page.

  1. Download the Linux/macOS binary
  2. Move the extracted butler-sos file to desired location.
  3. Use the process monitor of choice (see below) to make sure Butler SOS is always running

Using Node.js

This scenario is identical to the Windows scenario, please refer to that page for details. Keep in mind that the format of file system paths differs between Windows and Linux/Mac OS.

Configuration

Once again, same thing as on Windows.

Stayin’ alive

A Node process monitor can be used on Linux or Mac OS too.
Tools like PM2 in fact usually work better on Linux/Mac OS than on Windows.

You can probably also use Linux’ standard service layer to start Butler SOS, but that has not been tested.
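
On most current Linux distributions the standard service layer is systemd. The function below prints a unit file that could serve as a starting point. It is an untested sketch under several assumptions: the standalone binary is named butler-sos and lives in the given directory, it reads its config relative to that directory, and NODE_ENV selects the config file as in the Docker setup. The install path /opt/butler-sos and the function name are hypothetical.

```shell
# Sketch: print a systemd unit file for the standalone Butler SOS binary.
# Untested assumption: the binary lives in the given directory and picks
# its config file via NODE_ENV, as in the Docker setup.
print_butler_sos_unit() {
    install_dir="${1:-/opt/butler-sos}"   # hypothetical install location
    cat <<EOF
[Unit]
Description=Butler SOS
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
WorkingDirectory=${install_dir}
ExecStart=${install_dir}/butler-sos
Environment=NODE_ENV=production
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF
}
```

The output could be saved as /etc/systemd/system/butler-sos.service, followed by systemctl daemon-reload and systemctl enable --now butler-sos. Whether the binary needs extra command line options depends on the Butler SOS version; add them to ExecStart as needed.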

2.3 - Docker

Running Butler SOS in Docker. Installation and configuration.

Tip

Butler SOS Docker images are automatically built for several architectures:

  • amd64: This is by far the most common platform - your typical Intel based server uses amd64.
  • arm64: Arm servers are now available from most cloud providers and offer very competitive price/performance. Apple’s new M1 CPUs also use arm64, as do the newer Raspberry Pi models.
  • arm/v7: An older Arm architecture found in previous-gen Raspberry Pis, for example.

All images are available on Docker Hub.

Docker is great in that it runs on many different platforms.
This means that as long as the Docker runtime environment is installed, you can run Butler SOS on your Mac laptop, on a Linux server or on a Windows server.
Or in a Kubernetes cluster to get enterprise grade process monitoring of Butler SOS.

Installation

Docker runtime

Installing Docker is beyond the scope of this document, but there are plenty of online tutorials covering this.

Butler SOS installation and configuration

When using Docker there is no installation in the traditional sense.
Instead we (in this case) use a docker-compose file to define how Butler SOS should be executed within a Docker container. There are also other ways to start Docker containers, but docker-compose is usually a good and robust starting point.

Configuration of Butler SOS specific settings is then done using Butler SOS’ own JSON/YAML config file.

Install & configure - walkthrough

Create a directory for Butler SOS. Config files and logs will be stored here.
This example uses macOS but the commands will be very similar on Linux.
Docker on Windows is another story - it’s there and works great, but not always identical to Linux/macOS.

➜  butler-sos-demo mkdir -p butler-sos-docker/config/certificate
➜  butler-sos-demo mkdir -p butler-sos-docker/log
➜  butler-sos-demo cd butler-sos-docker
➜  butler-sos-docker

  1. Copy docker-compose.yml from the GitHub repository to the Butler SOS directory that was created above. The directory where the docker-compose file is stored is the ‘root’ directory of Butler SOS - everything else is relative to this directory.

  2. Adapt the docker-compose file as needed (usually no changes are needed if you want to run the latest version of Butler SOS).

  3. Copy the YAML config file from the GitHub repository into the ./config directory, rename it to production.yaml (or something else, as long as it matches the NODE_ENV environment variable set in the docker-compose.yml file) and edit it as needed. Note that for the Docker setup the path to certificates (as specified in the YAML config file) should be /nodeapp/config/certificate/ (this is the Docker container’s internal path to the certificate directory).

  4. Edit the config file above as needed, with respect to your local Sense environment, folder paths etc. The provided template file has reasonable default settings where possible, but there are also a number of paths, passwords etc that must be configured.

  5. Export certificates from the QMC in Qlik Sense Enterprise, place them in the ./config/certificate directory (i.e. in a subdirectory of the directory where the docker-compose file is stored). The certificates can in theory be placed anywhere, as long as they are made available to the Docker container via a volume mount in the docker-compose.yml file (e.g. - "./config:/nodeapp/config").

Let’s do this one step at a time.
Here we will bring up a single container with Butler SOS in it.
The Butler SOS config file is called production.yaml.

First, what files are there?

➜  butler-sos-docker ls -la
total 8
drwxr-xr-x  5 goran  staff   160 Aug 21 19:08 .
drwxr-xr-x  3 goran  staff    96 Aug 21 18:49 ..
drwxr-xr-x  4 goran  staff   128 Aug 21 19:08 config
-rw-r--r--  1 goran  staff  1505 Aug 21 19:01 docker-compose.yml
drwxr-xr-x  2 goran  staff    64 Aug 21 18:49 log
➜  butler-sos-docker
➜  butler-sos-docker ls -la config
total 48
drwxr-xr-x  4 goran  staff    128 Aug 21 19:08 .
drwxr-xr-x  5 goran  staff    160 Aug 21 19:08 ..
drwxr-xr-x  5 goran  staff    160 Aug 21 19:08 certificate
-rw-r--r--  1 goran  staff  21903 Aug 21 19:08 production.yaml
➜  butler-sos-docker
➜  butler-sos-docker ls -la config/certificate
total 24
drwxr-xr-x  5 goran  staff   160 Aug 21 19:08 .
drwxr-xr-x  4 goran  staff   128 Aug 21 19:08 ..
-rw-r--r--@ 1 goran  staff  1170 Aug 21 19:06 client.pem
-rw-r--r--@ 1 goran  staff  1704 Aug 21 19:06 client_key.pem
-rw-r--r--@ 1 goran  staff  1224 Aug 21 19:06 root.pem
➜  butler-sos-docker
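
The manual inspection above can also be scripted. The sketch below checks for the files shown in the listings and reports any that are missing. The function name is hypothetical, and the file names assume the config file is called production.yaml as in this walkthrough.

```shell
# Sketch: verify the Butler SOS directory layout before starting Docker.
# The function name is hypothetical; file names follow this walkthrough.
check_butler_sos_layout() {
    root_dir="${1:-.}"
    missing=0
    for f in docker-compose.yml \
             config/production.yaml \
             config/certificate/client.pem \
             config/certificate/client_key.pem \
             config/certificate/root.pem; do
        if [ ! -f "$root_dir/$f" ]; then
            # Report each missing file, but keep checking the rest
            echo "missing: $f"
            missing=1
        fi
    done
    # 0 if everything is in place, 1 otherwise
    return "$missing"
}
```

Running check_butler_sos_layout from the Butler SOS root directory prints nothing and returns 0 when all five files are present.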

What does the docker-compose.yml file look like?

➜  butler-sos-docker cat docker-compose.yml
# docker-compose.yml
version: "3.3"
services:
    butler-sos:
        image: ptarmiganlabs/butler-sos:latest
        container_name: butler-sos
        restart: always
        volumes:
            # Make config file and log files accessible outside of container
            - "./config:/nodeapp/config"
            - "./log:/nodeapp/log"
        environment:
            - "NODE_ENV=production" # Means that Butler SOS will read config data from production.yaml
        logging:
            driver: "json-file"
            options:
                max-file: "5"
                max-size: "5m"
        networks:
            - senseops

networks:
    senseops:
        driver: bridge

➜  butler-sos-docker

Ok, all good. Let’s start the container using docker-compose (the exact output will depend on what version of Butler SOS you are using and what features you have enabled in its YAML config file).

➜  butler-sos-docker docker-compose up
Creating network "butler-sos-docker_senseops" with driver "bridge"
Creating butler-sos ... done
Attaching to butler-sos
butler-sos    | 2022-08-23T03:45:28.754Z info: CONFIG: Influxdb enabled: true
butler-sos    | 2022-08-23T03:45:28.757Z info: CONFIG: Influxdb host IP: 192.168.100.20
butler-sos    | 2022-08-23T03:45:28.757Z info: CONFIG: Influxdb host port: 8086
butler-sos    | 2022-08-23T03:45:28.758Z info: CONFIG: Influxdb db name: senseops
butler-sos    | 2022-08-23T03:45:29.003Z info: CONFIG: Found InfluxDB database: senseops
butler-sos    | 2022-08-23T03:45:29.219Z info: --------------------------------------
butler-sos    | 2022-08-23T03:45:29.220Z info: Starting Butler SOS
butler-sos    | 2022-08-23T03:45:29.220Z info: Log level: verbose
butler-sos    | 2022-08-23T03:45:29.221Z info: App version: 9.2.0
butler-sos    | 2022-08-23T03:45:29.221Z info: Instance ID    : 87b978019ae........
butler-sos    | 2022-08-23T03:45:29.222Z info:
butler-sos    | 2022-08-23T03:45:29.223Z info: Node version   : v18.7.0
butler-sos    | 2022-08-23T03:45:29.223Z info: Architecture   : x64
butler-sos    | 2022-08-23T03:45:29.224Z info: Platform       : linux
butler-sos    | 2022-08-23T03:45:29.224Z info: Release        : 11
butler-sos    | 2022-08-23T03:45:29.224Z info: Distro         : Debian GNU/Linux
butler-sos    | 2022-08-23T03:45:29.224Z info: Codename       : bullseye
butler-sos    | 2022-08-23T03:45:29.224Z info: Virtual        : false
butler-sos    | 2022-08-23T03:45:29.225Z info: Processors     : 4
butler-sos    | 2022-08-23T03:45:29.225Z info: Physical cores : 4
butler-sos    | 2022-08-23T03:45:29.225Z info: Cores          : 4
butler-sos    | 2022-08-23T03:45:29.226Z info: Docker arch.   : undefined
butler-sos    | 2022-08-23T03:45:29.226Z info: Total memory   : 6233055232
butler-sos    | 2022-08-23T03:45:29.226Z info: Standalone app : false
butler-sos    | 2022-08-23T03:45:29.226Z info: --------------------------------------
butler-sos    | 2022-08-23T03:45:29.226Z info: Client cert: /nodeapp/config/certificate/client.pem
butler-sos    | 2022-08-23T03:45:29.227Z info: Client cert key: /nodeapp/config/certificate/client_key.pem
butler-sos    | 2022-08-23T03:45:29.227Z info: CA cert: /nodeapp/config/certificate/root.pem
butler-sos    | 2022-08-23T03:45:29.250Z verbose: MAIN: Anonymous telemetry reporting has been set up.
butler-sos    | 2022-08-23T03:45:29.252Z verbose: MAIN: Starting Docker healthcheck server...
butler-sos    | 2022-08-23T03:45:29.257Z info: USER EVENT: UDP server listening on 0.0.0.0:9997
butler-sos    | 2022-08-23T03:45:29.257Z info: LOG EVENT: UDP server listening on 0.0.0.0:9996
butler-sos    | 2022-08-23T03:45:29.290Z info: MAIN: Started Docker healthcheck server on port 12398.
butler-sos    | 2022-08-23T03:45:29.290Z info: MAIN: Starting Prometheus Butler SOS endpoint on 0.0.0.0:9842.
butler-sos    | 2022-08-23T03:45:29.291Z verbose: PROM: Setting up Prometheus client for server: sense1
butler-sos    | 2022-08-23T03:45:29.292Z verbose: PROM: Setting up Prometheus client for server: sense2
butler-sos    | 2022-08-23T03:45:29.310Z info: PROM: Prometheus Butler SOS metrics server now listening on port 9842
butler-sos    | 2022-08-23T03:45:29.311Z info: PROM: Prometheus Node.js metrics server now listening on port 0.0.0.0:9001
butler-sos    | 2022-08-23T03:45:30.911Z verbose: --------------------------------
butler-sos    | 2022-08-23T03:45:30.911Z verbose: Iteration # 1, Uptime: 0 months, 0 days, 0 hours, 0 minutes, 2.005 seconds, Heap used 33.26 MB of total heap 58.39 MB. External (off-heap): 3.57 MB. Memory allocated to process: 92.45 MB.
butler-sos    | 2022-08-23T03:45:31.051Z verbose: UPTIME NEW RELIC: Sent Butler SOS memory usage data to New Relic account 123456789 ("Ptarmigan Labs NR account")
butler-sos    | 2022-08-23T03:45:31.269Z verbose: MEMORY USAGE INFLUXDB: Sent Butler SOS memory usage data to InfluxDB
...

Once everything looks good you can stop the containers (ctrl-C), then start them again in daemon mode (i.e. running unattended in the background):

➜  butler-sos-docker docker-compose up -d
Starting butler-sos ... done
➜  butler-sos-docker

Setting the log level to info in the config file will reduce log output.

The Docker container implements Docker healthchecks, which means you can run docker ps to see whether the container is healthy or not (assuming Docker healthchecks are enabled in the config file, of course):

➜  butler-sos-docker docker ps
CONTAINER ID   IMAGE                             COMMAND                  CREATED              STATUS                    PORTS     NAMES
9d2253511a24   ptarmiganlabs/butler-sos:latest   "docker-entrypoint.s…"   About a minute ago   Up 17 seconds (healthy)             butler-sos
➜  butler-sos-docker
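
For scripts, docker inspect can print just the health status instead of the full docker ps table. The following is a minimal sketch, assuming the container is named butler-sos as in the docker-compose file and that Docker healthchecks are enabled; the function names and default values are made up for illustration.

```shell
# Sketch: print the Docker health status of a container
# (starting, healthy or unhealthy). Function names are hypothetical.
container_health() {
    name="${1:-butler-sos}"
    docker inspect --format '{{.State.Health.Status}}' "$name"
}

# Sketch: poll until the container reports "healthy" or attempts run out.
wait_until_healthy() {
    name="${1:-butler-sos}"
    attempts="${2:-12}"   # number of polls before giving up
    delay="${3:-5}"       # seconds between polls
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ "$(container_health "$name")" = "healthy" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

A deployment script could call wait_until_healthy butler-sos after docker-compose up -d and treat a non-zero exit code as a failed start.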

2.4 - InfluxDB & Grafana

How to use Butler SOS with InfluxDB and Grafana using Docker.

Warning

Butler SOS supports InfluxDB version 1.x and 2.x.

InfluxDB v3.x is not yet supported.

If you already have InfluxDB and/or Grafana running you can skip this section.

Running in Docker using docker-compose

The easiest way to get started is to run these tools in Docker containers, controlled by docker-compose files.
Running them under Kubernetes will give you a whole other level of fault tolerance, scalability etc - but this also requires much more in terms of Kubernetes skills. Use the setup that’s relevant to your use case.

You can use a single docker-compose file for Butler SOS, InfluxDB and Grafana - or separate docker-compose files for each tool.

The advantage of using a single docker-compose file is that the entire stack of tools will be launched in unison. You can create dependencies between the tools if needed etc - very convenient. On the other hand, having separate docker-compose files makes it easier to restart (or upgrade or in other ways change) a single service without affecting other services.

Full stack docker-compose file

Let’s start Butler SOS, InfluxDB and Grafana from a single docker-compose_fullstack_influxdb.yml file:

➜  butler-sos-docker cat docker-compose_fullstack_influxdb.yml
# docker-compose_fullstack_influxdb.yml
version: "3.3"
services:
    butler-sos:
        image: ptarmiganlabs/butler-sos:latest
        container_name: butler-sos
        restart: always
        volumes:
            # Make config file and log files accessible outside of container
            - "./config:/nodeapp/config"
            - "./log:/nodeapp/log"
        environment:
            - "NODE_ENV=production_influxdb" # Means that Butler SOS will read config data from production_influxdb.yaml
        logging:
            driver: "json-file"
            options:
                max-file: "5"
                max-size: "5m"
        networks:
            - senseops

    influxdb:
        image: influxdb:1.8.10
        container_name: influxdb
        restart: always
        volumes:
            - ./influxdb/data:/var/lib/influxdb # Mount for influxdb data directory
            - ./influxdb/config/:/etc/influxdb/ # Mount for influxdb configuration
        ports:
            # The API for InfluxDB is served on port 8086
            - "8086:8086"
            - "8082:8082"
        environment:
            # Disable usage reporting
            - "INFLUXDB_REPORTING_DISABLED=true"
        networks:
            - senseops

    grafana:
        image: grafana/grafana:latest
        container_name: grafana
        restart: always
        ports:
            - "3000:3000"
        volumes:
            - ./grafana/data:/var/lib/grafana
        networks:
            - senseops

networks:
    senseops:
        driver: bridge

➜  butler-sos-docker

Assuming you’ve already completed the setup of Butler SOS, the result of running the docker-compose_fullstack_influxdb.yml file above is something like this:

➜  butler-sos-docker docker-compose -f docker-compose_fullstack_influxdb.yml up
Creating network "butler-sos-docker_senseops" with driver "bridge"
Creating influxdb   ... done
Creating butler-sos ... done
Creating grafana    ... done
Attaching to butler-sos, grafana, influxdb
...
...
grafana       | logger=grafanaStorageLogger t=2022-08-21T18:13:42.76538465Z level=info msg="storage starting"
grafana       | logger=ngalert t=2022-08-21T18:13:42.780004463Z level=info msg="warming cache for startup"
grafana       | logger=http.server t=2022-08-21T18:13:42.796364325Z level=info msg="HTTP Server Listen" address=[::]:3000 protocol=http subUrl= socket=
grafana       | logger=ngalert.multiorg.alertmanager t=2022-08-21T18:13:42.807894344Z level=info msg="starting MultiOrg Alertmanager"
butler-sos    | 2022-08-21T18:13:42.908Z info: CONFIG: Influxdb enabled: true
butler-sos    | 2022-08-21T18:13:42.911Z info: CONFIG: Influxdb host IP: influxdb
butler-sos    | 2022-08-21T18:13:42.912Z info: CONFIG: Influxdb host port: 8086
butler-sos    | 2022-08-21T18:13:42.912Z info: CONFIG: Influxdb db name: senseops
influxdb      | ts=2022-08-21T18:13:43.139047Z lvl=info msg="Executing query" log_id=0cSPbmJG000 service=query query="SHOW DATABASES"
influxdb      | [httpd] 172.24.0.2 - - [21/Aug/2022:18:13:43 +0000] "GET /query?p=&q=show+databases&u= HTTP/1.1" 200 84 "-" "-" fd854ac5-217c-11ed-8001-0242ac180003 1084
influxdb      | ts=2022-08-21T18:13:43.169398Z lvl=info msg="Executing query" log_id=0cSPbmJG000 service=query query="CREATE DATABASE senseops"
influxdb      | [httpd] 172.24.0.2 - - [21/Aug/2022:18:13:43 +0000] "POST /query?p=&q=create+database+%22senseops%22&u= HTTP/1.1 " 200 33 "-" "-" fd89e529-217c-11ed-8002-0242ac180003 2940
butler-sos    | 2022-08-21T18:13:43.177Z info: CONFIG: Created new InfluxDB database: senseops
influxdb      | ts=2022-08-21T18:13:43.219945Z lvl=info msg="Executing query" log_id=0cSPbmJG000 service=query query="CREATE RETENTION POLICY \"10d\" ON senseops DURATION 10d REPLICATION 1 DEFAULT"
influxdb      | [httpd] 172.24.0.2 - - [21/Aug/2022:18:13:43 +0000] "POST /query?p=&q=create+retention+policy+%2210d%22+on+%22senseops%22+duration+10d+replication+1+default&u= HTTP/1.1 " 200 33 "-" "-" fd91ac84-217c-11ed-8003-0242ac180003 2299
butler-sos    | 2022-08-21T18:13:43.242Z info: CONFIG: Created new InfluxDB retention policy: 10d
butler-sos    | 2022-08-21T18:13:43.391Z info: --------------------------------------
butler-sos    | 2022-08-21T18:13:43.391Z info: Starting Butler SOS
butler-sos    | 2022-08-21T18:13:43.392Z info: Log level: verbose
butler-sos    | 2022-08-21T18:13:43.393Z info: App version: 9.2.0
butler-sos    | 2022-08-21T18:13:43.394Z info: Instance ID    : 964cbd0a36bc....
butler-sos    | 2022-08-21T18:13:43.394Z info:
butler-sos    | 2022-08-21T18:13:43.395Z info: Node version   : v18.7.0
butler-sos    | 2022-08-21T18:13:43.396Z info: Architecture   : x64
butler-sos    | 2022-08-21T18:13:43.396Z info: Platform       : linux
butler-sos    | 2022-08-21T18:13:43.396Z info: Release        : 11
butler-sos    | 2022-08-21T18:13:43.397Z info: Distro         : Debian GNU/Linux
butler-sos    | 2022-08-21T18:13:43.397Z info: Codename       : bullseye
butler-sos    | 2022-08-21T18:13:43.398Z info: Virtual        : false
butler-sos    | 2022-08-21T18:13:43.398Z info: Processors     : 4
butler-sos    | 2022-08-21T18:13:43.399Z info: Physical cores : 4
butler-sos    | 2022-08-21T18:13:43.399Z info: Cores          : 4
butler-sos    | 2022-08-21T18:13:43.400Z info: Docker arch.   : undefined
butler-sos    | 2022-08-21T18:13:43.400Z info: Total memory   : 6233055232
butler-sos    | 2022-08-21T18:13:43.401Z info: Standalone app : false
butler-sos    | 2022-08-21T18:13:43.401Z info: --------------------------------------
butler-sos    | 2022-08-21T18:13:43.402Z info: Client cert: /nodeapp/config/certificate/client.pem
butler-sos    | 2022-08-21T18:13:43.402Z info: Client cert key: /nodeapp/config/certificate/client_key.pem
butler-sos    | 2022-08-21T18:13:43.402Z info: CA cert: /nodeapp/config/certificate/root.pem
butler-sos    | 2022-08-21T18:13:43.421Z verbose: MAIN: Anonymous telemetry reporting has been set up.
butler-sos    | 2022-08-21T18:13:43.423Z verbose: MAIN: Starting Docker healthcheck server...
butler-sos    | 2022-08-21T18:13:43.428Z info: USER EVENT: UDP server listening on 0.0.0.0:9997
butler-sos    | 2022-08-21T18:13:43.429Z info: LOG EVENT: UDP server listening on 0.0.0.0:9996
butler-sos    | 2022-08-21T18:13:43.461Z info: MAIN: Started Docker healthcheck server on port 12398.
butler-sos    | 2022-08-21T18:13:43.462Z info: MAIN: Starting Prometheus Butler SOS endpoint on 0.0.0.0:9842.
butler-sos    | 2022-08-21T18:13:43.464Z verbose: PROM: Setting up Prometheus client for server: sense1
butler-sos    | 2022-08-21T18:13:43.465Z verbose: PROM: Setting up Prometheus client for server: sense2
butler-sos    | 2022-08-21T18:13:43.482Z info: PROM: Prometheus Butler SOS metrics server now listening on port 9842
butler-sos    | 2022-08-21T18:13:43.483Z info: PROM: Prometheus Node.js metrics server now listening on port 0.0.0.0:9001
butler-sos    | 2022-08-21T18:13:45.080Z verbose: --------------------------------
butler-sos    | 2022-08-21T18:13:45.081Z verbose: Iteration # 1, Uptime: 0 months, 0 days, 0 hours, 0 minutes, 2.007 seconds, Heap used 31.56 MB of total heap 60.81 MB. External (off-heap): 2.98 MB. Memory allocated to process: 102.28 MB.
influxdb      | [httpd] 172.24.0.2 - - [21/Aug/2022:18:13:45 +0000] "POST /write?db=senseops&p=&precision=n&rp=&u= HTTP/1.1 " 204 0 "-" "-" feaf181f-217c-11ed-8004-0242ac180003 44267
butler-sos    | 2022-08-21T18:13:45.137Z verbose: MEMORY USAGE INFLUXDB: Sent Butler SOS memory usage data to InfluxDB
butler-sos    | 2022-08-21T18:13:45.198Z verbose: UPTIME NEW RELIC: Sent Butler SOS memory usage data to New Relic account 123456789 ("Ptarmigan Labs NR account")
...
...

From a separate shell we can then ensure that the expected Docker containers are running:

➜  ~ docker ps
CONTAINER ID   IMAGE                             COMMAND                  CREATED              STATUS                        PORTS                                            NAMES
2311d17d1285   ptarmiganlabs/butler-sos:latest   "docker-entrypoint.s…"   About a minute ago   Up About a minute (healthy)                                                    butler-sos
a22307d12263   influxdb:1.8.10                   "/entrypoint.sh infl…"   About a minute ago   Up About a minute             0.0.0.0:8082->8082/tcp, 0.0.0.0:8086->8086/tcp   influxdb
81df665545d0   grafana/grafana:latest            "/run.sh"                About a minute ago   Up About a minute             0.0.0.0:3000->3000/tcp                           grafana
➜  ~

That’s great, we now have a single command (docker-compose -f docker-compose_fullstack_influxdb.yml up -d for background/daemon mode) to bring up all the tools needed to monitor a Qlik Sense cluster!

Now, let’s see if any data has arrived in InfluxDB.
Let’s check this by going into Grafana, which is available on port 3000.

The first time you log in to a new Grafana instance you can use the default admin account (username=admin, password=admin).
You will be asked to change that password during first login.

First add a data source in Grafana, pointing it to the local InfluxDB server.

Adding an InfluxDB datasource in Grafana
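
When adding the data source, keep in mind that Grafana and InfluxDB run in the same Docker network in the compose file above. From Grafana's point of view the InfluxDB server is therefore reached via its service name, i.e. http://influxdb:8086, and the database is senseops (as seen in the startup log).

As an alternative to clicking through the UI, Grafana can also read data sources from provisioning files. The following is a minimal sketch of such a file. The file name, the data source name and the extra volume mount it would need (to /etc/grafana/provisioning/datasources inside the Grafana container) are assumptions and not part of the compose file above:

```yaml
# ./grafana/provisioning/datasources/influxdb.yml (hypothetical file name and location)
apiVersion: 1

datasources:
    - name: Butler SOS InfluxDB      # Any name works, this one is made up
      type: influxdb
      access: proxy                  # Grafana's backend talks to InfluxDB
      url: http://influxdb:8086      # Service name and port from the compose file above
      jsonData:
          dbName: senseops           # Database created by Butler SOS at startup
      isDefault: true
```

If authentication is enabled in InfluxDB, a user name and password would have to be added as well. Older Grafana versions may expect the database name in a top-level database field instead of jsonData.dbName.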

Now we can create a basic chart in Grafana, showing for example Butler SOS’ own memory usage.
After a while we should see some data in the chart:

Butler SOS' own memory usage, stored in InfluxDB and visualised in Grafana

Need to stop the entire stack of tools?
Easy - just run docker-compose -f docker-compose_fullstack_influxdb.yml down:

➜  butler-sos-docker docker-compose -f docker-compose_fullstack_influxdb.yml down
Stopping butler-sos ... done
Stopping influxdb   ... done
Stopping grafana    ... done
Removing butler-sos ... done
Removing influxdb   ... done
Removing grafana    ... done
Removing network butler-sos-docker_senseops
➜  butler-sos-docker

2.5 - Prometheus & Grafana

How to use Butler SOS with Prometheus and Grafana using Docker.

Warning

Work in progress

While Butler SOS’ Prometheus support is functional and works well, this documentation page is not yet complete.

Info

Prometheus is extremely powerful and flexible.
In fact, it’s probably the closest thing there is to a de facto standard for monitoring large scale software systems today.
No matter if you run a Kubernetes cluster spanning multiple data centers and continents, or just a single Butler SOS instance - Prometheus is an excellent choice for monitoring of operational metrics.

That power and flexibility also means it can be challenging to set up Prometheus.
Usually it’s not that difficult, but if you’re new to Docker and have no previous experience with monitoring tools, using InfluxDB is usually a bit easier.

Or view it as a chance to learn more about one of the absolute stars of open source software - Prometheus is awesome!

This page assumes you don’t already have Prometheus and Grafana running.
If you already have access to those tools you can of course instead configure them to work with Butler SOS.

Running in Docker using docker-compose

The easiest way to get started is to run these tools in Docker containers, controlled by docker-compose files.
Running them under Kubernetes will give you a whole other level of fault tolerance, scalability etc - but this also requires much more when it comes to Kubernetes skills. Use the setup that’s relevant to your use case.

You can use a single docker-compose file for Butler SOS, Prometheus and Grafana - or separate docker-compose files for each tool.

The advantage of using a single docker-compose file is that the entire stack of tools will be launched in unison. You can create dependencies between the tools if needed etc - very convenient. On the other hand, having separate docker-compose files makes it easier to restart (or upgrade or in other ways change) a single service without affecting other services.

Full stack docker-compose file

Let’s start Butler SOS, Prometheus and Grafana from a single docker-compose_fullstack_prometheus.yml file:

# docker-compose_fullstack_prometheus.yml
version: "3.3"
services:
    butler-sos:
        image: ptarmiganlabs/butler-sos:latest
        container_name: butler-sos
        restart: always
        volumes:
            # Make config file and log files accessible outside of container
            - "./config:/nodeapp/config"
            - "./log:/nodeapp/log"
        environment:
            - "NODE_ENV=production_prometheus" # Means that Butler SOS will read config data from production_prometheus.yaml
        logging:
            driver: "json-file"
            options:
                max-file: "5"
                max-size: "5m"
        networks:
            - senseops

    prometheus:
        image: prom/prometheus:latest
        container_name: prometheus
        volumes:
            - ./prometheus/:/etc/prometheus/
            - prometheus_data:/prometheus
        command:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus"
            - "--web.console.libraries=/usr/share/prometheus/console_libraries"
            - "--web.console.templates=/usr/share/prometheus/consoles"
            # - "--log.level=debug"
        ports:
            - 9090:9090
        links:
            - alertmanager:alertmanager
        networks:
            - senseops
        restart: always

    alertmanager:
        image: prom/alertmanager
        container_name: alertmanager
        ports:
            - 9093:9093
        volumes:
            - ./alertmanager/:/etc/alertmanager/
        networks:
            - senseops
        restart: always
        command:
            - "--config.file=/etc/alertmanager/config.yml"
            - "--storage.path=/alertmanager"

    grafana:
        image: grafana/grafana:latest
        container_name: grafana
        restart: always
        ports:
            - "3000:3000"
        volumes:
            - ./grafana/data:/var/lib/grafana
        networks:
            - senseops

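networks:
    senseops:
        driver: bridge

volumes:
    # Named volume used by the prometheus service for its time series data
    prometheus_data: {}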
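
The prometheus service above expects a configuration file at ./prometheus/prometheus.yml on the host. A minimal sketch of such a file could look like the one below. It assumes that the Butler SOS Prometheus endpoint listens on port 9842 and the Node.js metrics endpoint on port 9001 (the ports shown in the startup log in the InfluxDB section; the actual values come from your Butler SOS config file) and that both serve metrics on Prometheus' default /metrics path. The job names and the scrape interval are made up for illustration:

```yaml
# ./prometheus/prometheus.yml (sketch, adapt to your environment)
global:
    scrape_interval: 15s            # How often Prometheus scrapes its targets

scrape_configs:
    # Qlik Sense metrics exposed by Butler SOS (port is an assumption, see above)
    - job_name: butler-sos
      static_configs:
          - targets: ["butler-sos:9842"]

    # Node.js runtime metrics for the Butler SOS process itself (port is an assumption)
    - job_name: butler-sos-nodejs
      static_configs:
          - targets: ["butler-sos:9001"]

# Send alerts to the alertmanager service defined in the compose file
alerting:
    alertmanagers:
        - static_configs:
              - targets: ["alertmanager:9093"]
```

The targets use the Docker service names, which resolve inside the senseops network, so the Butler SOS ports do not have to be published on the host for scraping to work. In the same way, the alertmanager service expects a ./alertmanager/config.yml file to exist.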

Assuming you’ve already completed the setup of Butler SOS, the result of running the docker-compose_fullstack_prometheus.yml file above is something like this:

...
...

From a separate shell we can then ensure that the expected Docker containers are running:

➜ docker ps
CONTAINER ID   IMAGE                             COMMAND                  CREATED         STATUS                            PORTS                                                                                  NAMES

That’s great, you now have a single command (docker-compose -f docker-compose_fullstack_prometheus.yml up -d for background/daemon mode) to bring up all the tools needed to monitor a Qlik Sense cluster!

Need to stop the entire stack of tools?
Easy - just run docker-compose -f docker-compose_fullstack_prometheus.yml down:

➜ docker-compose -f docker-compose_fullstack_prometheus.yml down

3 - Setup

Everything you wanted to know about Butler SOS configuration but never dared to ask.

3.1 - Which config file to use

Butler SOS can use multiple config files. Here you learn to control which one is used by Butler SOS.

A description of the config file format is available here.

Select which config file to use

Butler SOS uses configuration files in YAML format.

Butler SOS comes with a default config file called production_template.yaml.
Make a copy of it, then rename the copy to butler-sos-config-prod.yaml, production.yaml, staging.yaml or something else suitable to your specific use case.

Update the config file as needed (see the config file reference page for details).

Trying to run Butler SOS with the default config file (the one included in the files downloaded from GitHub) will not work - you must adapt it to your server environment. For example, you need to enter the IP or host name of your Sense server(s), the IP or host name where Butler SOS is running, where the Sense certificates are stored etc.

The name of the config file matters. Unless you specifically name which config to use when starting Butler SOS, it will look for an environment variable called “NODE_ENV” and then try to load a config file named with the value found in NODE_ENV.

Example 1:

  • Environment variable NODE_ENV = production
  • Butler SOS is started without specifying a config file: butler.exe --loglevel info
  • Butler SOS will look for a config file config/production.yaml.

Example 2:

  • Butler SOS is started with a command line option specifying a config file: butler.exe --configfile d:\some\path\butler-sos-config-prod.yaml --loglevel info
  • Butler SOS will not look at the NODE_ENV environment variable. Settings will be loaded from the butler-sos-config-prod.yaml instead.

Running several Butler SOS instances in parallel

If you have several Sense clusters (for example DEV, TEST and PROD environments) you can either monitor them all from a single Butler SOS instance, or set up separate instances for each Sense cluster.
The second case is implemented by creating several config files: butler-sos-dev.yaml, butler-sos-test.yaml and butler-sos-prod.yaml.

In this scenario three instances of Butler SOS should be started, each given a different config file by setting the NODE_ENV variable as needed when starting Butler SOS.
Or (this option is usually much easier!) use the --configfile command line option when starting Butler SOS.

Note: If running several Butler SOS instances in parallel, you must also ensure that each one uses unique port numbers for their respective UDP servers etc.

Setting environment variables

The method for setting environment variables varies between operating systems:

On Windows:

set NODE_ENV=production

On Mac OS or Linux:

export NODE_ENV=production

If using Docker, the NODE_ENV environment variable is set in the docker-compose.yml file (as already done in the template docker-compose files).
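
When running in Docker, the parallel-instance setup described above can be expressed as one service per Sense cluster, each with its own NODE_ENV value. The following is a minimal sketch, not a tested setup. The compose file name, service names, per-instance log directories and published host ports are assumptions chosen for illustration, and the container-side ports 9996/9997 are simply the UDP ports shown in the startup log in the InfluxDB section:

```yaml
# docker-compose_multi_instance.yml (hypothetical file name)
version: "3.3"
services:
    butler-sos-dev:
        image: ptarmiganlabs/butler-sos:latest
        container_name: butler-sos-dev
        restart: always
        volumes:
            - "./config:/nodeapp/config"
            - "./log/dev:/nodeapp/log"      # Separate log directory per instance (assumption)
        environment:
            - "NODE_ENV=butler-sos-dev"     # Butler SOS will read config/butler-sos-dev.yaml
        ports:
            # Host ports must be unique across instances
            - "9996:9996/udp"
            - "9997:9997/udp"

    butler-sos-prod:
        image: ptarmiganlabs/butler-sos:latest
        container_name: butler-sos-prod
        restart: always
        volumes:
            - "./config:/nodeapp/config"
            - "./log/prod:/nodeapp/log"
        environment:
            - "NODE_ENV=butler-sos-prod"    # Butler SOS will read config/butler-sos-prod.yaml
        ports:
            - "9986:9996/udp"
            - "9987:9997/udp"
```

Each container has its own network namespace, so the ports inside the containers can be identical. Only the host ports that the Sense servers send their UDP messages to have to differ between instances.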

3.2 - Config file verification

How to verify that the Butler SOS config file is valid.

A description of the config file format is available here.

Getting the config file correctly set up is usually the most challenging part of setting up Butler SOS.
The config file is written in an easy to read YAML format, but given the number of settings that can be configured, it can be a bit daunting to get it right.

Verify the config file

Config file verification is enabled by default.

Verification is done when Butler SOS is started, and if the config file is not valid, Butler SOS will not start.

Info

All settings in the config file are mandatory.

If you don’t want to use a specific Butler SOS feature, you must still include its settings in the config file, but you are free to disable the feature and set its setting to empty strings/values/arrays, or just leave the default values in place.

Skipping config file verification

If you want to skip config file verification, you can do so by starting Butler SOS with the --skip-config-verification command line option.

This will bypass all checks of the config file’s validity, and Butler SOS will try to start with the provided config file.
This can be useful if you are in the process of setting up Butler SOS and want to start it before the config file is complete, but for production scenarios it is recommended to leave config file verification enabled.

3.3 - Visualise the config file

Butler SOS can visualise its config file on a web page, using an internal web server.
This can be useful for troubleshooting and understanding how Butler SOS is configured.

The configuration can optionally be obfuscated to hide sensitive information.

What’s this?

Butler SOS can visualise its config file on a web page, using an internal web server. This can be useful for troubleshooting and understanding how Butler SOS is configured.

If enabled, the web server will serve a web page on the IP address and port specified in the config file.
The default IP address is localhost and the default port is 3100.

By clicking the “Download

JSON and YAML

The web page will show the config file in both JSON and YAML format.

The JSON format is useful if you want to copy the config file and paste it into a JSON validator, for example.
The YAML format is easier to read and understand for humans, and is also the format used in the config file.

Examples:

Butler SOS config file visualisation - JSON view

Butler SOS config file visualisation - YAML view

Obfuscation

The configuration can optionally be obfuscated to hide sensitive information.
This is useful if you need to share the config file with someone else, but don’t want to share sensitive information like IP addresses, user names or passwords.
Obfuscation is enabled/disabled in the config file.

For example, if asking for support on the Butler SOS forum, you can share the obfuscated config file without revealing sensitive information.

Disclaimer: Obfuscation is not foolproof, but it should be good enough for most use cases.
Always check the obfuscated config file before sharing it.

Settings in config file

Butler-SOS:
  ...
  ...
  # Should Butler SOS start a web server that serves an obfuscated view of the Butler SOS config file?
  configVisualisation:
    enable: true  
    host: localhost       # Hostname or IP address where the web server will listen. Should be localhost in most cases.
    port: 3100            # Port where the web server will listen. Change if port 3100 is already in use.
    obfuscate: true        # Should the config file shown in the web UI be obfuscated?
  ...
  ...

3.4 - Configuring Butler SOS logging

Butler SOS can log its activities to console and disk files.
Log files can be useful for retrospective troubleshooting of Butler SOS.

What’s this?

Butler SOS continuously logs what it’s doing.
Logging is always done to console and optionally also to disk files.

The top level section Butler-SOS in the config file has a set of settings that control logging.

Log level (verbosity) can be set, logging to disk can be enabled/disabled and the directory where log files are stored can be set.
Log level can also be set on the command line when starting Butler SOS, using the --loglevel option.

Log files are kept for 30 days, after which they are automatically deleted.

Settings in main config file

Butler-SOS:
  ...
  ...
  # Logging configuration
  logLevel: info          # Log level. Possible log levels are silly, debug, verbose, info, warn, error
  fileLogging: true       # true/false to enable/disable logging to disk file
  logDirectory: log       # Subdirectory where log files are stored
  ...
  ...

3.5 - Configuring Butler SOS heartbeats

Heartbeats provide a way to monitor that Butler SOS is running and working as intended.
Butler SOS can send periodic heartbeat messages to a monitoring tool, which can then alert if Butler SOS hasn’t checked in as expected.

What’s this?

A tool like Butler SOS should be viewed as mission critical, at least if it is used to monitor mission critical Sense apps.

But how can you know whether Butler SOS itself is working?
Somehow Butler SOS should be monitored.

Butler SOS (and most other tools in the Butler family) has a heartbeat feature.
It sends periodic messages to a monitoring tool, which can then alert if Butler SOS hasn’t checked in as expected.

Healthchecks.io is an example of such a tool. It’s open source and can be self-hosted, but also has a SaaS option if so preferred.

Uptime Kuma is another great tool that can be used. It has a somewhat slicker UI than Healthchecks.io, but it’s really a matter of personal preference which one to use.

More info on using Healthchecks.io with Butler (Butler SOS works the same way) can be found in this blog post.

Settings in main config file

Butler-SOS:
  ...
  ...
  # Heartbeats can be used to send "I'm alive" messages to some other tool, e.g. an infrastructure monitoring tool
  # The concept is simple: The remoteURL will be called at the specified frequency. The receiving tool will then know 
  # that Butler SOS is alive.
  heartbeat:
    enable: true
    remoteURL: http://my.monitoring.server/some/path/
    frequency: every 1 hour         # https://bunkat.github.io/later/parsers.html#text
  ...
  ...

3.6 - Docker healthcheck

Docker has a concept of “health checks”, which is a way for Docker containers to tell the Docker runtime engine that the container is alive and well. Butler SOS can be configured to send such health check messages to Docker.

Note: Sending health check messages is only meaningful when running Butler SOS as a Docker container.

Settings in main config file

Butler-SOS:
  ...
  ...
  # Docker health checks are used when running Butler SOS as a Docker container. 
  # The Docker engine will call the container's health check REST endpoint with a set interval to determine
  # whether the container is alive/well or not.
  # If you are not running Butler SOS in Docker you can safely disable this feature. 
  dockerHealthCheck:
    enable: true                    # Control whether a REST endpoint will be set up to serve Docker health check messages
    port: 12398                     # Port the Docker health check service runs on (if enabled)
  ...
  ...

3.7 - Configuring Butler SOS uptime monitor

Butler SOS can optionally log how long it’s been running and how much memory it uses.
Optionally the memory usage can also be stored to an InfluxDB database or sent to New Relic, for later viewing/alerting in for example a Grafana dashboard or within New Relic.

What’s this?

In some cases - especially when investigating issues or bugs - it can be useful to get log messages telling how long Butler SOS has been running and how much memory it uses.

This feature is called “uptime monitoring” and can be enabled in the main config file. The feature is being added to more and more tools in the Butler family of tools for Qlik Sense.

The logging interval is configurable, as is the log level required for uptime messages to be shown in the console/file log.

Select a reasonable retention policy and logging frequency!
You will rarely if ever need to know how much memory Butler SOS used a month ago… A retention policy of 1-2 weeks is usually a good start, logging uptime metrics every few minutes.

InfluxDB

The memory usage data can optionally be written to InfluxDB, from where it can later be viewed in Grafana.
The metrics will be stored in the database specified in the Butler-SOS.influxdbConfig section of the config file.

Note that Butler-SOS.influxdbConfig.enable must also be set to true for any data to be sent to InfluxDB.

New Relic

Uptime metrics can be sent to zero or more New Relic accounts.

New Relic attributes (a concept where each data point sent to New Relic is tagged with a set of attributes) can be added to the metrics.
Attributes come in two forms: Static and dynamic.

  • Static attributes are hard-coded strings that don’t change over time. Could be used to distinguish metrics from DEV, TEST and PROD Sense environments.
  • Dynamic attributes may change each time Butler SOS is started, or even more often in the future if/when more dynamic attributes are added.
    An example is the Butler SOS version, which will change when Butler SOS is upgraded to a new version.

Log level

The log level does not affect storing uptime metrics in InfluxDB or New Relic.

Settings in main config file

Butler-SOS:
  ...
  ...
  # Uptime monitor
  uptimeMonitor:
    enable: true                    # Should uptime messages be written to the console and log files?
    frequency: every 15 minutes     # https://bunkat.github.io/later/parsers.html#text
    logLevel: verbose               # Starting at what log level should uptime messages be shown in console log and log files?
    storeInInfluxdb: 
      butlerSOSMemoryUsage: true    # Should data on Butler SOS' own memory use be stored in InfluxDB?
      instanceTag: PROD             # Tag that can be used to differentiate data from multiple Butler SOS instances
    storeNewRelic:
      enable: true
      destinationAccount:
        - First NR account
        - Second NR account
      metric:
        dynamic:
          butlerMemoryUsage:
            enable: true            # Should Butler SOS' memory/RAM usage be sent to New Relic?
          butlerUptime:
            enable: true            # Should Butler SOS' uptime (how long since it was started) be sent to New Relic?
      attribute: 
        static:                     # Static attributes/dimensions to attach to the data sent to New Relic.
          - name: metricType
            value: butler-sos-uptime
          - name: qs_service
            value: butler-sos
          - name: qs_environment
            value: prod
        dynamic:
          butlerVersion: 
            enable: true            # Should the Butler SOS version be included in the data sent to New Relic?
  ...
  ...

3.8 - Credentials to third party services

What’s this?

Butler SOS can interact with certain third party services, such as New Relic.
These services typically require some kind of authentication with associated credentials (username, password etc).
Those credentials are stored in the Butler-SOS.thirdPartyToolsCredentials section of the config file.

New Relic

Zero, one or more New Relic accounts with their respective credentials can be specified.
These accounts can then be used by Butler SOS’ various features.

Note that different Butler SOS features can send their data to different New Relic accounts.
This is specified in each feature’s section in the YAML config file.

Example:

  • Sense user events are sent to First NR account
  • Sense log events are sent to Second NR account
  • Sense RAM usage is sent to both First NR account and Second NR account

Note that the accountName setting is only used within Butler SOS to reference the different New Relic accounts.
Specifically, it is not used by or within New Relic itself.

Settings in main config file

---
Butler-SOS:
  ...
  ...
  # Credentials for third party systems that Butler SOS integrates with.
  # These can also be specified via command line parameters when starting Butler SOS. 
  # Command line options take precedence over settings in this config file.
  thirdPartyToolsCredentials:
    newRelic:         # Array of New Relic accounts/insert keys.
      - accountName: First NR account
        insertApiKey: <API key 1 (with insert permissions) from New Relic> 
        accountId: <New Relic account ID 1>
      - accountName: Second NR account
        insertApiKey: <API key 2 (with insert permissions) from New Relic> 
        accountId: <New Relic account ID 2>
  ...
  ...

3.9 - General Sense event settings

Butler SOS can act as a receiver of Qlik Sense events, sent as UDP messages from Qlik Sense Enterprise.
This section of the config file contains general settings for how Butler SOS should handle these events.

More specific settings for each event type (user, log, …) can be found in the respective sections of the config file.

What’s this?

Butler SOS can receive events from Qlik Sense Enterprise, sent as UDP messages.
Two kinds of events are supported: User events and log events.

  • User events are events that are generated when a user interacts with Qlik Sense, for example logging into Sense or opening an app.
  • Log events originate from the Sense logging framework itself, which is also responsible for logging things to Qlik Sense’s own log files.

Some aspects of these events are general in nature, i.e. shared between the different event types, and are configured in this section of the config file.

Counters for user and log events

If log and/or user events are enabled there is a risk that the number of events generated by a QSEoW cluster can be overwhelming.
To make it easier to understand the volume of events generated, Butler SOS can be configured to count the number of events generated by the Sense cluster.

The counters are stored in InfluxDB and can be used to create dashboards in Grafana.

Each InfluxDB datapoint has tags and fields as described in the reference section.

Rejected events

Butler SOS can be configured to reject certain events even though the event is correctly formatted and contains valid data.
The need to do this can arise if the Sense cluster generates a large number of events, and not all of them are relevant for the current monitoring use case.

For example, performance log events (event name qseow-qix-perf) will be emitted by Sense every 2 seconds during scheduled reloads. There is rarely a need to store all these events in InfluxDB, so they can be filtered out (=rejected) by Butler SOS.
But they can also be seen as valid and stored in InfluxDB, depending on the use case.

Rejected events are counted and the counters stored in InfluxDB.
They can be used to understand how many events are rejected by Butler SOS, versus how many are received in total (see the section above).

The InfluxDB measurement name is defined in the config file, Butler-SOS.qlikSenseEvents.rejectedEventCount.influxdb.measurementName.

Each event type and name may have its own rejection settings, defined in the respective sections of the config file.
They may also store different tags and fields in InfluxDB.

The currently defined rejection settings are:

Performance log events

These events come from the Qlik associative engine (the “QIX engine”) and contain very detailed performance data about apps, app objects, charts, user selections in apps etc.

Counters for rejected performance log events are enabled via the Butler-SOS.logEvents.enginePerformanceMonitor.trackRejectedEvents.enable setting in the config file.

Once this data is in InfluxDB it can be used in Grafana dashboards, for example showing how long each app takes to open:

Average time to open Sense apps

The data stored in InfluxDB for performance log events is described here.

Settings in main config file

---
Butler-SOS:
  ...
  ...
  # Shared settings for user and log events (see below)
  qlikSenseEvents:                  # Shared settings for user and log events (see below)
    influxdb:
      enable: false                 # Should summary (counter) of user/log events, and rejected events be stored in InfluxDB?
      writeFrequency: 20000         # How often (milliseconds) should rejected event count be written to InfluxDB?  
    eventCount:                     # Track how many events are received from Sense.
                                    # Some events are valid, some are not. Of the valid events, some are rejected by Butler SOS
                                    # based on the configuration in this file. 
      enable: false                 # Should event count be stored in InfluxDB?
      influxdb:
        measurementName: event_count # Name of the InfluxDB measurement where event count is stored
        tags:                       # Tags are added to the data before it's stored in InfluxDB
          - name: env
            value: DEV
          - name: foo
            value: bar
    rejectedEventCount:             # Rejected events are events that are received from Sense, that are correctly formatted, 
                                    # but that are rejected by Butler SOS based on the configuration in this file. 
                                    # An example of a rejected event is a performance log event that is filtered out by Butler SOS.
      enable: false                 # Should rejected events be counted and stored in InfluxDB?
      influxdb:
        measurementName: rejected_event_count # Name of the InfluxDB measurement where rejected event count is stored
  ...
  ...

Log appender XML files

Sample log appender files are included in the ZIP file on the download page. They are found in the engine, proxy, repository and scheduler subfolders of the config/log_appender_xml/ folder.

Note that the log appender files contain slightly different information for each Sense service (engine/proxy/repository/scheduler)!
Also keep in mind that the log appender files must be called LocalLogConfig.xml and placed in these directories on all Sense servers (assuming the default installation path of Qlik Sense):

  • C:\ProgramData\Qlik\Sense\Engine
  • C:\ProgramData\Qlik\Sense\Proxy
  • C:\ProgramData\Qlik\Sense\Repository
  • C:\ProgramData\Qlik\Sense\Scheduler
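The target locations above can be expressed as a small helper. This is only a sketch: the function and constant names are made up for illustration, and the base path assumes the default Qlik Sense installation path mentioned above.

```python
from pathlib import PureWindowsPath

# Default Qlik Sense data folder; adjust if Sense is installed elsewhere.
SENSE_DATA_DIR = PureWindowsPath(r"C:\ProgramData\Qlik\Sense")
SERVICES = ("Engine", "Proxy", "Repository", "Scheduler")


def appender_targets(base_dir: PureWindowsPath = SENSE_DATA_DIR) -> dict:
    """Map each Sense service to the path its log appender file must have."""
    return {
        service: base_dir / service / "LocalLogConfig.xml"
        for service in SERVICES
    }


for service, target in appender_targets().items():
    print(f"{service}: {target}")
```

PureWindowsPath is used so the paths are built with Windows separators even when the script runs on another operating system.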

Tip

If you have more than one Sense server you strictly speaking don’t have to deploy log appenders to all servers.

If you are only interested in receiving log events from some servers and/or services (engine, proxy, repository, scheduler) - deploy the log appender files there.

3.9.1 - Configuring user events

What’s this?

User events are among the most detailed bits of information retrieved from Sense by Butler SOS.
They capture session start/stop events (=users logging in/out) and connection open/close events (apps opened/closed in browser tabs).

These events rely on two things to be correctly configured:

  1. Settings in Butler SOS’ config file.
  2. Log appender XML file(s) being deployed on the Sense server(s) where user activity events should be captured.

Both are described below.

Tech deep-dive

The user events are created by hooking into Sense’s logging framework, which is called Log4Net.

By placing a carefully crafted XML file in the Qlik Sense proxy service’s configuration directory, we can instruct Log4Net to forward certain Sense log events that we are interested in to Butler SOS.
In this case we are interested in session start/stop and connection open/close events.

The XML file is also known as a “log appender file”.
It contains instructions that tell Log4Net to do various things when the specified filter matches the actual log data created by Sense. Examples include sending emails, writing log entries to disk (i.e. regular file logging!), sending the log row as a UDP message and more.
Here we’re interested in the UDP message feature.

So, by means of a log appender file we tell Log4Net to send certain log rows to Butler SOS as UDP messages.

We also have to specify in the log appender file what host/IP address and port Butler SOS listens to, i.e. where the UDP messages should be sent.
Finally we have to make sure firewalls are open and allow UDP traffic from the Sense server(s) to Butler SOS.

If everything is set up correctly, UDP messages will arrive at Butler SOS within seconds of the actual event taking place in Qlik Sense, i.e. close to real-time.
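To make the mechanism concrete, here is a minimal sketch of a UDP listener and sender using Python's standard socket module. It is not how Butler SOS itself is implemented (Butler SOS is a Node.js application). The function names are hypothetical and the code only shows the general idea of receiving UTF-8 encoded text over UDP.

```python
import socket


def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind a UDP socket. Port 0 lets the operating system pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(5.0)  # Avoid waiting forever if nothing arrives
    return sock


def receive_one(sock: socket.socket) -> str:
    """Wait for a single UDP datagram and return it as text."""
    data, _sender = sock.recvfrom(65535)
    return data.decode("utf-8")


def send_test_message(host: str, port: int, text: str) -> None:
    """Send one UDP datagram, for example to verify that a firewall is open."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(text.encode("utf-8"), (host, port))
```

A sender and listener pair like this can be handy when troubleshooting: run the listener on the machine where Butler SOS will be installed and send a test message from a Sense server. If nothing arrives, a firewall between the two is a likely cause.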

Tagging of data

InfluxDB

The tags added to InfluxDB are described in the reference documentation for log events.

New Relic

The following attributes (which is New Relic lingo for tags) are added:

  1. A core set of attributes are added to all user events
    1. qs_host: Host name of the Sense server the event originated at.
    2. qs_event_action: What kind of user event that took place. Examples are “Start session”, “Stop session”, “Open connection”, “Close connection”.
    3. qs_userFull: Full directory/user ID of the user the event is about. Will be scrambled if scrambling is enabled in the config file.
    4. qs_userDirectory: User directory of the user the event is about. Will be scrambled if scrambling is enabled in the config file.
    5. qs_userId: User ID of the user the event is about. Will be scrambled if scrambling is enabled in the config file.
    6. qs_origin: What kind of activity caused the event, for example “AppAccess”. May be empty for some user events.
    7. qs_appId: App ID of the app the event is about. May be empty for some user events.
    8. qs_appName: App name of the app the event is about. May be empty for some user events.
    9. qs_uaBrowserName: Browser name of the user agent that caused the event.
    10. qs_uaBrowserMajorVersion: Browser major version of the user agent that caused the event.
    11. qs_uaOsName: OS name of the user agent that caused the event.
    12. qs_uaOsVersion: OS version of the user agent that caused the event.
  2. Custom attributes defined in the Butler-SOS.userEvents.tags section of the Butler SOS config file.

Note: Attributes defined further down in the list above will overwrite already defined attributes if their names match.
To avoid problems, make sure your custom attributes do not reuse the names of attributes that are already defined.
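The overwrite behaviour can be pictured as merging dictionaries in order, where later layers win. The sketch below is an illustration only: merge_attributes is a hypothetical helper, not part of Butler SOS, and it also reports which names were overwritten so that collisions are easy to spot.

```python
def merge_attributes(*layers: dict) -> tuple:
    """Merge attribute layers in order. Later layers overwrite earlier ones.

    Returns the merged attributes and a list of names that were overwritten.
    """
    merged = {}
    overwritten = []
    for layer in layers:
        for name, value in layer.items():
            if name in merged:
                overwritten.append(name)
            merged[name] = value
    return merged, overwritten


# Made-up values: a custom tag accidentally reuses the core name qs_host.
core = {"qs_host": "sense1", "qs_userId": "anna"}
custom = {"env": "DEV", "qs_host": "my-own-value"}

merged, overwritten = merge_attributes(core, custom)
print(merged["qs_host"])  # The custom value has replaced the core value
print(overwritten)        # Names to rename in the config file
```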

Settings in config file

---
Butler-SOS:
  ...
  ...
  # Track individual users opening/closing apps and starting/stopping sessions. 
  # Requires log appender XML file(s) to be added to Sense server(s).
  userEvents:                       
    enable: false
    excludeUser:                    # Optional blacklist of users that should be disregarded when it comes to user events
      - directory: LAB
        userId: testuser1
      - directory: LAB
        userId: testuser2
    udpServerConfig:
      serverHost: <IP or FQDN>      # Host/IP where user event server will listen for events from Sense
      portUserActivityEvents: 9997  # Port on which user event server will listen for events from Sense
    tags:                           # Tags are added to the data before it's stored in InfluxDB
      - tag: env
        value: DEV
      - tag: foo
        value: bar
    sendToMQTT: 
      enable: false                 # Set to true if user events should be forwarded as MQTT messages
      postTo:                       # Control when and to which MQTT topics messages are sent 
        everythingTopic:            # Topic to which all user events are sent
          enable: true
          topic: qliksense/userevent
        sessionStartTopic:          # Topic to which "session start" events are sent
          enable: true
          topic: qliksense/userevent/session/start
        sessionStopTopic:           # Topic to which "session stop" events are sent
          enable: true
          topic: qliksense/userevent/session/stop
        connectionOpenTopic:        # Topic to which "connection open" events are sent
          enable: true
          topic: qliksense/userevent/connection/open
        connectionCloseTopic:       # Topic to which "connection close" events are sent
          enable: true
          topic: qliksense/userevent/connection/close
    sendToInfluxdb:
      enable: true                  # Set to true if user events should be stored in InfluxDB
    sendToNewRelic:
      enable: false                 # Should user events be sent to New Relic?
      destinationAccount:
        - First NR account
        - Second NR account
      scramble: true                # Should user info (user directory and user ID) be scrambled before sent to NR?
  ...
  ...
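The excludeUser setting can be read as a simple lookup against directory/user ID pairs. The sketch below shows one way such a check could work. It assumes exact, case-sensitive matching on both values, which is an assumption and not something the config file states, and is_excluded is a hypothetical name.

```python
# Mirrors the excludeUser entries in the config snippet above.
EXCLUDE_USERS = [
    {"directory": "LAB", "userId": "testuser1"},
    {"directory": "LAB", "userId": "testuser2"},
]


def is_excluded(user_directory: str, user_id: str, exclude_list=EXCLUDE_USERS) -> bool:
    """Return True if the user matches an entry in the exclude list.

    Assumption: both directory and user ID must match exactly.
    """
    return any(
        entry["directory"] == user_directory and entry["userId"] == user_id
        for entry in exclude_list
    )


print(is_excluded("LAB", "testuser1"))  # Listed in excludeUser
print(is_excluded("LAB", "anna"))       # Not listed
```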

Log appender XML files

A sample log appender file LocalLogConfig.xml is included in the ZIP file on the download page, at config/log_appender_xml/proxy/LocalLogConfig.xml.

That file includes log appenders for both user and log events.
It looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <!-- Log appender finding user session events -->
    <appender name="EventSession" type="log4net.Appender.UdpAppender">
        <filter type="log4net.Filter.StringMatchFilter">
            <param name="stringToMatch" value="Start session for user" />
        </filter>
        <filter type="log4net.Filter.StringMatchFilter">
            <param name="stringToMatch" value="Stop session for user" />
        </filter>
        <filter type="log4net.Filter.DenyAllFilter" />
        <param name="remoteAddress" value="FQDN or IP of server where Butler SOS is running" />
        <param name="remotePort" value="9997" />
        <param name="encoding" value="utf-8" />
        <layout type="log4net.Layout.PatternLayout">
            <converter>
                <param name="name" value="hostname" />
                <param name="type" value="Qlik.Sense.Logging.log4net.Layout.Pattern.HostNamePatternConverter" />
            </converter>
            <param name="conversionpattern" value="/qseow-proxy-session/;%hostname;%property{Command};%property{UserDirectory};%property{UserId};%property{Origin};%property{Context};%message" />
        </layout>
    </appender>

    <!-- Log appender finding user connection events -->
    <appender name="EventConnection" type="log4net.Appender.UdpAppender">
        <filter type="log4net.Filter.StringMatchFilter">
            <param name="stringToMatch" value="connection Opened for session" />
        </filter>
        <filter type="log4net.Filter.StringMatchFilter">
            <param name="stringToMatch" value="connection Closed for session" />
        </filter>
        <filter type="log4net.Filter.DenyAllFilter" />
        <param name="remoteAddress" value="FQDN or IP of server where Butler SOS is running" />
        <param name="remotePort" value="9997" />
        <param name="encoding" value="utf-8" />
        <layout type="log4net.Layout.PatternLayout">
            <converter>
                <param name="name" value="hostname" />
                <param name="type" value="Qlik.Sense.Logging.log4net.Layout.Pattern.HostNamePatternConverter" />
            </converter>
            <param name="conversionpattern" value="/qseow-proxy-connection/;%hostname;%property{Command};%property{UserDirectory};%property{UserId};%property{Origin};%property{Context};%message" />
        </layout>
    </appender>

    <!-- Generic appender for detecting warnings and errors -->
    <appender name="LogEvent" type="log4net.Appender.UdpAppender">
        <param name="threshold" value="warn" />
        <param name="remoteAddress" value="FQDN or IP of server where Butler SOS is running" />
        <param name="remotePort" value="9996" />
        <param name="encoding" value="utf-8" />
        <layout type="log4net.Layout.PatternLayout">
            <converter>
                <param name="name" value="rownum" />
                <param name="type" value="Qlik.Sense.Logging.log4net.Layout.Pattern.CounterPatternConverter" />
            </converter>
            <converter>
                <param name="name" value="hostname" />
                <param name="type" value="Qlik.Sense.Logging.log4net.Layout.Pattern.HostNamePatternConverter" />
            </converter>
            <converter>
                <param name="name" value="longIso8601date" />
                <param name="type" value="Qlik.Sense.Logging.log4net.Layout.Pattern.Iso8601TimeOffsetPatternConverter" />
            </converter>
            <converter>
                <param name="name" value="user" />
                <param name="type" value="Qlik.Sense.Logging.log4net.Layout.Pattern.ServiceUserNameCachedPatternConverter" />
            </converter>
            <converter>
                <param name="name" value="encodedmessage" />
                <param name="type" value="Qlik.Sense.Logging.log4net.Layout.Pattern.EncodedMessagePatternConverter" />
            </converter>
            <converter>
                <param name="name" value="encodedexception" />
                <param name="type" value="Qlik.Sense.Logging.log4net.Layout.Pattern.EncodedExceptionPatternConverter" />
            </converter>
            <param name="conversionpattern" value="/qseow-proxy/;%rownum{9999};%longIso8601date;%date;%level;%hostname;%logger;%user;%encodedmessage;%encodedexception;%property{UserDirectory};%property{UserId};%property{Command};%property{Result};%property{Origin};%property{Context}" />
        </layout>
    </appender>

    <!-- Send UDP message to Butler SOS on user activity -->
    <logger name="AuditActivity.Proxy">
        <appender-ref ref="EventSession" />
        <appender-ref ref="EventConnection" />
    </logger>

    <!-- Send UDP message to Butler SOS on warnings and errors -->
    <logger name="Audit.Proxy">
        <appender-ref ref="LogEvent" />
    </logger>
    <logger name="AuditSecurity">
        <appender-ref ref="LogEvent" />
    </logger>
    <logger name="Security.Proxy">
        <appender-ref ref="LogEvent" />
    </logger>
    <logger name="System.Proxy">
        <appender-ref ref="LogEvent" />
    </logger>
</configuration>
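The conversionpattern values in the session and connection appenders above produce semicolon-separated messages with eight fields. As a sketch of how a receiver could split such a message, consider the Python below. The class and function names are hypothetical, and the code assumes that only the last field (the log message) may itself contain semicolons.

```python
from typing import NamedTuple


class UserEvent(NamedTuple):
    source: str          # "/qseow-proxy-session/" or "/qseow-proxy-connection/"
    host: str            # %hostname
    command: str         # %property{Command}
    user_directory: str  # %property{UserDirectory}
    user_id: str         # %property{UserId}
    origin: str          # %property{Origin}
    context: str         # %property{Context}
    message: str         # %message


def parse_user_event(raw: str) -> UserEvent:
    """Split a raw UDP message into the eight fields of the conversion pattern."""
    # Split at most 7 times so semicolons inside the message are preserved.
    parts = raw.split(";", 7)
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}")
    return UserEvent(*parts)
```

Limiting the number of splits keeps the free-text message intact even if Sense writes a semicolon into it.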

Tip

If you have several servers in your Sense cluster you probably need several log appender files too.

More specifically, you should put a log appender file on each server where the Qlik Sense proxy service is running, i.e. on all servers via which end users access the Sense cluster.

Note the places where you need to fill in the IP/host where Butler SOS is running, as well as the port numbers to use (the sample uses 9997 for user events and 9996 for log events, both of which can be changed if needed).

Make necessary changes so the file matches your environment, then deploy to C:\ProgramData\Qlik\Sense\Proxy\LocalLogConfig.xml (adapt path if you have a different installation path).
Note that the file must be called LocalLogConfig.xml!
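If the same file has to be prepared for several environments, filling in the host can be scripted. The sketch below uses Python's standard xml.etree.ElementTree module to set every remoteAddress parameter. The function name and host name are hypothetical. Note that with the default parser, XML comments are dropped and the XML declaration is not written back, so treat this as a starting point and not as a finished deployment tool.

```python
import xml.etree.ElementTree as ET


def set_butler_sos_target(xml_text: str, host: str) -> str:
    """Return the log appender XML with all remoteAddress params set to host."""
    root = ET.fromstring(xml_text)
    for param in root.iter("param"):
        if param.get("name") == "remoteAddress":
            param.set("value", host)
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(root, encoding="unicode")


# Shortened stand-in for the sample file, for illustration only.
sample = (
    '<configuration><appender name="EventSession">'
    '<param name="remoteAddress" value="placeholder" />'
    '<param name="remotePort" value="9997" />'
    "</appender></configuration>"
)
print(set_butler_sos_target(sample, "butler-sos.example.com"))
```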

Sense will usually detect and use the file without any restarts needed, but it can take a while. You can always restart the Sense proxy service to make sure the XML file is applied and used.

Once in place you should see events in the Butler SOS console/file logs if you set logging level to verbose, debug or silly.

3.9.2 - Configuring log events

What’s this?

Butler SOS log events are designed to be a replacement for the most important/useful aspects of Qlik Sense’s log database, which was removed from Qlik Sense Enterprise on Windows in mid-2021.

The log events capture warnings, errors and fatals from the various QSEoW subsystems.
These events used to be sent to the PostgreSQL logging database. Most (but not all) of them are also written to QSEoW’s log files.

Using Butler SOS’ log events is arguably even better than getting the same information from log db:
Log db had to be polled to detect new log events and this polling could realistically only be done every few minutes. It also put additional load on an often already struggling part of many QSEoW clusters.

With Butler SOS’ log events concept the notifications are almost instantaneous.
Errors and warnings show up in the Grafana or New Relic dashboards within seconds of occurring in QSEoW.

Log events rely on two things to work:

  1. Settings in the Butler SOS config file.
  2. Log appender XML files being deployed on the Sense servers where log events should be captured.

Both are described below.

Info

As of Butler SOS version 9.2, log events are captured in these QSEoW services:

  • Engine
  • Proxy
  • Repository
  • Scheduler

Support for additional services is reasonably easy to add. Please create a ticket if you believe some service should be added to the list above.

Tech deep-dive

The underlying mechanism is the same as described on the user events page.

Tagging of data

Categorising log events

Log events can optionally be categorised by Butler SOS.
The reason for categorising log events is to make it easier to make sense of the potentially large number of log events that can be generated by a QSEoW cluster.

For example, if a QSEoW cluster creates 1000 warnings and errors per hour, it’s difficult to know which warnings are important and which are not.
By categorising the messages, for example by the subsystem that generated them and/or what they refer to, it’s easier to understand what’s going on.

Possible ways of categorising log events could be “access denied” issues, “user directory” issues, “app reload failed” issues, “general engine issues” etc.

Log events can be categorised in any number of ways, but the template config file contains a few examples.
Specifically:

  1. Access denied issues
    1. Captures log events of severity WARN and ERROR.
    2. Match log events against two filters.
      1. Log message starts with “Access was denied for User:”.
      2. Log message contains “was denied for User”.
    3. Categorises them by setting category tag qs_log_category to access-denied.
  2. AD issues
    1. Captures log events of severity WARN and ERROR.
    2. Match log events against one filter.
      1. Log message starts with “Duplicate entity with userId”.
    3. Categorises them by setting category tag qs_log_category to user-directory.
  3. Qlik Sense service down
    1. Captures log events of severity WARN.
    2. Match log events against two filters.
      1. Log message starts with “Failed to request service alive response from”.
      2. Log message contains “Unable to connect to the remote server”.
    3. Categorises them by setting category tag qs_log_category to qs-service.
  4. Reload task failed
    1. Captures log events of severity WARN and ERROR.
    2. Match log events against four filters.
      1. Log message starts with “Task finished with state FinishedFail”.
      2. Log message starts with “Task finished with state Error”.
      3. Log message ends with “Reload failed in Engine. Check engine or script logs.”.
      4. Log message starts with “Reload sequence was not successful (Result=False, Finished=True, Aborted=False) for engine connection with handle”.
    3. Categorises them by setting category tag qs_log_category to reload-failed.
  5. If no rules match the log event, it will be categorised as unknown.
    1. The default rule can be enabled/disabled in the config file via the ruleDefault.enable parameter.

Note 1: It is possible to assign one or more categories to a log event. This provides flexibility in how you later create dashboards in for example Grafana or New Relic.

Note 2: The sample/template config file may be updated with more examples in the future.

Tip

It is also possible to drop log events that match a certain pattern, for example if you are never interested in seeing them in Grafana or New Relic.
This is done by setting the action parameter to drop in the config file, for that particular rule.
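The rule matching described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Butler SOS's actual implementation: it assumes a rule matches when any one of its filters matches (which the sample rules suggest, since two "starts with" filters with different values cannot both match), that a matching drop rule discards the event immediately, and that categories from all matching rules are collected. The function names are hypothetical.

```python
# One rule taken from the examples above, written as Python data.
RULES = [
    {
        "logLevel": ["WARN", "ERROR"],
        "action": "categorise",
        "category": [{"name": "qs_log_category", "value": "access-denied"}],
        "filter": [
            {"type": "sw", "value": "Access was denied for User:"},
            {"type": "so", "value": "was denied for User"},
        ],
    },
]
DEFAULT_CATEGORY = [{"name": "qs_log_category", "value": "unknown"}]


def filter_matches(flt: dict, message: str) -> bool:
    """Case-sensitive match: sw = starts with, ew = ends with, so = substring."""
    kind, value = flt["type"], flt["value"]
    if kind == "sw":
        return message.startswith(value)
    if kind == "ew":
        return message.endswith(value)
    if kind == "so":
        return value in message
    raise ValueError(f"unknown filter type: {kind}")


def categorise(level: str, message: str, rules=RULES, default=DEFAULT_CATEGORY):
    """Return a list of category tags, or None if the event should be dropped."""
    categories = []
    for rule in rules:
        # Log levels are compared case-insensitively.
        if level.upper() not in [lvl.upper() for lvl in rule["logLevel"]]:
            continue
        if not any(filter_matches(f, message) for f in rule["filter"]):
            continue
        if rule["action"] == "drop":
            return None
        categories.extend(rule["category"])
    # Fall back to the default category when no rule matched.
    return categories or list(default)
```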

InfluxDB

The tags added to InfluxDB are described in the reference documentation for log events.

Log categories are added as tags to InfluxDB datapoints.

New Relic

The following attributes (which is New Relic lingo for tags) are added:

  1. A core set of attributes is added to all log events. Note that some attributes will be empty for some/many log events.
    1. qs_ts_iso: Event timestamp in ISO format.
    2. qs_ts_local: Event timestamp in local (server) time zone.
    3. qs_log_source: Which Sense service the event originated in, for example “qseow-proxy”, “qseow-repository”.
    4. qs_log_level: Log level of the event. “WARN”, “ERROR”, or “FATAL”.
    5. qs_host: Host name of the Sense server the event originated at.
    6. qs_subsystem: Which part of each Sense service the event originated in, for example “System.Proxy.Proxy.Core.RequestListener”, “System.Engine.Engine”.
    7. qs_windows_user: Name of the Windows user that’s used to run the Windows service where the event originated.
    8. qs_message: Event message.
    9. qs_exception_message: Additional information about the event.
    10. qs_user_full: Full directory/user ID of the user the event is about. Will be scrambled if scrambling is enabled in the config file.
    11. qs_user_directory: User directory of the user the event is about. Will be scrambled if scrambling is enabled in the config file.
    12. qs_user_id: User ID of the user the event is about. Will be scrambled if scrambling is enabled in the config file.
    13. qs_command: What command (if any) caused the event. Example: “Doc::DoSave”, “Doc::CreateObject”.
    14. qs_result_code: Result code as reported by Sense. Usually empty.
    15. qs_origin: What kind of activity caused the event, for example “AppAccess”.
    16. qs_context: Additional information about the event.
    17. qs_task_name: Task name (if any) causing the event.
    18. qs_app_name: App name (if any) causing the event.
    19. qs_task_id: Task ID (if any) causing the event.
    20. qs_app_id: App ID (if any) causing the event.
    21. qs_execution_id: Execution ID as reported by Sense.
    22. qs_proxy_session_id: Proxy session ID as reported by Sense.
    23. qs_engine_ts: Engine timestamp (if any) associated with the event.
    24. qs_process_id: Process ID of engine service.
    25. qs_engine_exe_version: Version of engine service’s EXE file.
    26. qs_server_started: Timestamp when Sense engine service was started.
    27. qs_entry_type: Entry type as reported by Sense. Usually empty.
    28. qs_session_id: Session ID as reported by Sense.
  2. Custom attributes defined in the Butler SOS config file’s Butler-SOS.logEvents.tags section.
  3. Custom attributes defined in the Butler SOS config file’s Butler-SOS.newRelic.event.attribute.static section.
  4. Dynamic attributes
    1. butlerSosVersion: Butler SOS version. Enabled by setting Butler-SOS.newRelic.event.attribute.dynamic.butlerSosVersion.enable to true in config file.

Note: Attributes defined further down in the list above will overwrite already defined attributes if their names match.
To avoid problems, make sure your custom attributes do not reuse the names of attributes that are already defined.

Log categories are currently NOT included in the data sent to New Relic.

Settings in main config file

Tip

The config snippet below comes from the production_template.yaml file.

Being a template, it contains examples on how configuration may be done - not necessarily how it should be done.
For example, the env/DEV and foo/bar tags are optional and can be changed to something else, or removed altogether if not used.

Butler-SOS:
  ...
  ...
  # Log events are used to capture Sense warnings, errors and fatals in real time
  logEvents:
    udpServerConfig:
      serverHost: <IP or FQDN>         # Host/IP where log event server will listen for events from Sense
      portLogEvents: 9996              # Port on which log event server will listen for events from Sense
    tags:
      # - name: env
      #   value: DEV
      # - name: foo
      #   value: bar
    source:
      engine:
        enable: false                  # Should log events from the engine service be handled?
      proxy:
        enable: false                  # Should log events from the proxy service be handled?
      repository:
        enable: false                  # Should log events from the repository service be handled?
      scheduler:
        enable: false                  # Should log events from the scheduler service be handled?
    categorise:                        # Take actions on log events based on their content
      enable: false
      rules:                           # Rules are used to match log events to filters
        # - description: Find access denied errors
        #   logLevel:                    # Log events of this Log level will be matched. WARN, ERROR, FATAL. Case insensitive.
        #     - WARN
        #     - ERROR
        #   action: categorise           # Action to take on matched log events. Possible values are categorise, drop
        #   category:                    # Category to assign to matched log events. Name/value pairs. 
        #                                # Will be added to InfluxDB datapoints as tags.
        #     - name: qs_log_category
        #       value: access-denied
        #   filter:                      # Filter used to match log events. Case sensitive.
        #     - type: sw                 # Type of filter. sw = starts with, ew = ends with, so = substring of
        #       value: "Access was denied for User:"
        #     - type: so
        #       value: was denied for User
        # - description: Find AD issues
        #   logLevel:                    # Log events of this Log level will be matched. WARN, ERROR, FATAL. Case insensitive.
        #     - ERROR
        #     - WARN
        #   action: categorise           # Action to take on matched log events. Possible values are categorise, drop
        #   category:                    # Category to assign to matched log events. Name/value pairs. 
        #                                # Will be added to InfluxDB datapoints as tags.
        #     - name: qs_log_category
        #       value: user-directory
        #   filter:                      # Filter used to match log events. Case sensitive.
        #     - type: sw                 # Type of filter. sw = starts with, ew = ends with, so = substring of
        #       value: Duplicate entity with userId
        # - description: Qlik Sense service down
        #   logLevel:                    # Log events of this Log level will be matched. WARN, ERROR, FATAL. Case insensitive.
        #     - WARN
        #   action: categorise           # Action to take on matched log events. Possible values are categorise, drop
        #   category:                    # Category to assign to matched log events. Name/value pairs. 
        #                                # Will be added to InfluxDB datapoints as tags.
        #     - name: qs_log_category
        #       value: qs-service
        #   filter:                      # Filter used to match log events. Case sensitive.
        #     - type: sw                 # Type of filter. sw = starts with, ew = ends with, so = substring of
        #       value: Failed to request service alive response from
        #     - type: so                 # Type of filter. sw = starts with, ew = ends with, so = substring of
        #       value: Unable to connect to the remote server
        # - description: Reload task failed
        #   logLevel:                    # Log events of this Log level will be matched. WARN, ERROR, FATAL. Case insensitive.
        #     - WARN
        #     - ERROR
        #   action: categorise           # Action to take on matched log events. Possible values are categorise, drop
        #   category:                    # Category to assign to matched log events. Name/value pairs. 
        #                                # Will be added to InfluxDB datapoints as tags.
        #     - name: qs_log_category
        #       value: reload-failed
        #   filter:                      # Filter used to match log events. Case sensitive.
        #     - type: sw                 # Type of filter. sw = starts with, ew = ends with, so = substring of
        #       value: Task finished with state FinishedFail
        #     - type: sw                 # Type of filter. sw = starts with, ew = ends with, so = substring of
        #       value: Task finished with state Error
        #     - type: ew                 # Type of filter. sw = starts with, ew = ends with, so = substring of
        #       value: Reload failed in Engine. Check engine or script logs.
        #     - type: sw                 # Type of filter. sw = starts with, ew = ends with, so = substring of
        #       value: Reload sequence was not successful (Result=False, Finished=True, Aborted=False) for engine connection with handle
      ruleDefault:                     # Default rule to use if no other rules match the log event
        enable: true
        category:
          - name: qs_log_category
            value: unknown
    enginePerformanceMonitor:           # Detailed app performance data extraction from log events
      enable: false                     # Should app performance data be extracted from log events?
      appNameLookup:                    # Should app names be looked up based on app IDs?
        enable: false
      trackRejectedEvents: 
        enable: false                   # Should events that are rejected by the app performance monitor be tracked?
        tags:                           # Tags are added to the data before it's stored in InfluxDB
          # - name: env
          #   value: DEV
          # - name: foo
          #   value: bar
      monitorFilter:                    # What objects should be monitored? Entire apps or just specific object(s) within some specific app(s)?
                                        # Two kinds of monitoring can be done:
                                        # 1) Monitor all apps, except those listed for exclusion. This is defined in the allApps section.
                                        # 2) Monitor only specific apps. This is defined in the appSpecific section.
                                        # An event will be accepted if it matches any of the rules in the allApps section OR any of the rules in the appSpecific section.
        allApps:
          enable: false                 # Should all apps be monitored?
          appExclude:                   # What apps should be excluded from monitoring?
                                        # If both appId and appName are specified, both must match the event's data for it to be considered a match.
            # - appId: 5b817efe-472d-43ce-8a31-6cce34af7de9
            # - appName: Sales forecast
            # - appId: f42d6b16-8faf-45ca-a783-59f9da47db6e
            #   appName: Inventory analysis
          objectType:
            allObjectTypes: true        # Should all object types be monitored?
            allObjectTypesExclude:      # If allObjectTypes is set to true, the object types in this array are excluded from monitoring. 
                                        # someObjectTypesInclude (below) is ignored in that case.
              # - LoadModelList
              # - <Unknown>
              # - linechart
              # - map
            someObjectTypesInclude:     # What object types should be included in monitoring?
                                        # Only applicable if allObjectTypes is set to false.
              # - LoadModelList
              # - sheet
              # - barchart
          method:
            allMethods: true            # Should all methods be monitored?
            allMethodsExclude:          # If allMethods is set to true, the methods in this array are excluded from monitoring.
                                        # someMethodsInclude (below) is ignored in that case.
              # - Global::OpenApp
              # - Doc::GetAppLayout
              # - Doc::CreateSessionObject
            someMethodsInclude:         # What methods should be included in monitoring?
                                        # Only applicable if allMethods is set to false.
              # - GenericObject::GetLayout
              # - GenericObject::GetHyperCubeContinuousData
        appSpecific:
          enable: false                  # Should app specific monitoring be done?
          app:
            - include:                  # What apps should be monitored?
                                        # If both appId and appName are specified, both must match the event's data for it to be considered a match.
                # - appId: d7cf16f9-6a95-462a-9ff1-a6d413326de4
                # - appName: Budget 2025
                # - appId: 6931136d-c234-4358-a40c-e37153aba7c9
                #   appName: Sales basket analysis
              objectType:
                allObjectTypes: true   # Should all object types be monitored?
                allObjectTypesExclude:  # If allObjectTypes is set to true, the object types in this array are excluded from monitoring. 
                                        # someObjectTypesInclude (below) is ignored in that case.
                  # - table
                  # - map
                someObjectTypesInclude: # What object types should be included in monitoring?
                                        # Only applicable if allObjectTypes is set to false.
                  # - sheet
                  # - barchart
                  # - linechart
                  # - map
              appObject:
                allAppObjects: true     # Should all app objects be monitored?
                allAppObjectsExclude:   # If allAppObjects is set to true, the app objects in this array are excluded from monitoring.
                                        # someAppObjectsInclude (below) is ignored in that case.
                  # - objectId: AaBbCc
                  # - objectId: DdEeFf
                someAppObjectsInclude:  # What app objects should be included in monitoring?
                                        # Only applicable if allAppObjects is set to false.
                  # - objectId: YJEpPT    
              method: 
                allMethods: true        # Should all methods be monitored?
                allMethodsExclude:      # If allMethods is set to true, the methods in this array are excluded from monitoring.
                                        # someMethodsInclude (below) is ignored in that case.
                  # - Global::OpenApp
                  # - Doc::GetAppLayout
                  # - Doc::CreateSessionObject
                someMethodsInclude:     # What methods should be included in monitoring?
                                        # Only applicable if allMethods is set to false.
                  # - GenericObject::GetLayout
                  # - GenericObject::GetHyperCubeContinuousData
    sendToMQTT:
      enable: false                    # Should log events be sent as MQTT messages?
      baseTopic: qliksense/logevent    # What topic should log events be forwarded to? 
      postTo:
        baseTopic: true
        subsystemTopics: true          # Should log events be sent to subtopics corresponding to the QSEoW subsystems where the events originated?
    sendToInfluxdb:
      enable: false                    # Should log events be stored in InfluxDB?
    sendToNewRelic:
      enable: false                    # Should log events be sent to New Relic?
      destinationAccount:
        # - First NR account
        # - Second NR account
      source:
        engine:
          enable: true                 # Should log events from the engine service be handled?
          logLevel: 
            error: true                # Should error level log events be handled by Butler SOS?
            warn: true                 # Should warning level log events be handled by Butler SOS?
        proxy:
          enable: true                 # Should log events from the proxy service be handled?
          logLevel: 
            error: true                # Should error level log events be handled by Butler SOS?
            warn: true                 # Should warning level log events be handled by Butler SOS?
        repository:
          enable: true                 # Should log events from the repository service be handled?
          logLevel: 
            error: true                # Should error level log events be handled by Butler SOS?
            warn: true                 # Should warning level log events be handled by Butler SOS?
        scheduler:
          enable: true                 # Should log events from the scheduler service be handled?
          logLevel: 
            error: true                # Should error level log events be handled by Butler SOS?
            warn: true                 # Should warning level log events be handled by Butler SOS?
  ...
  ...

3.9.2.1 - Engine performance log events

Butler SOS can capture detailed performance data from the Qlik Sense engine service, based on log events generated by that service.

Due to the risk of generating a large number of log events, this feature comes with a comprehensive set of settings to control what data is captured and stored in InfluxDB, and what is rejected.

Warning

Due to the risk of overwhelming the InfluxDB database with a large number of performance log events, this feature is disabled by default in the sample config file.
If you want to use this feature, you must enable it first.

It’s also recommended to start small and gradually increase the number of performance log events captured, to make sure the InfluxDB database can handle the load.

For example, start by capturing only a few specific object types, methods etc. for a limited set of apps, and then gradually increase/customise from there if needed.

What’s this?

“Performance log events” are regular log events generated by the Qlik Sense engine service.
They get a special name in Butler SOS because they contain detailed performance data about the engine’s operations, something that can be very useful when monitoring the performance of the Qlik Sense engine.

These events make it possible to monitor how the engine is performing in close to real time.

Butler SOS can capture all or some of these events and store them in InfluxDB, where they can be used to create dashboards in Grafana - just like any other events or metrics captured by Butler SOS.

How it works

Once a log event has been identified as a performance log event (by looking at the event name, which is qseow-qix-perf in this case), Butler SOS will extract the data of interest from the event.

The data is then checked against a set of rules/filters to determine if the event should be stored in full detail in InfluxDB (=accepted) or not (=rejected).

For rejected events, a counter is incremented to keep track of how many events have been rejected.
The counters are written to InfluxDB at regular intervals, controlled by the Butler-SOS.qlikSenseEvents.influxdb.writeFrequency setting in the config file.

Once the data has been accepted, it is immediately stored in InfluxDB with a set of tags and fields, as described below.
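
As a rough illustration of the settings mentioned above, the fragment below turns on the feature together with tracking of rejected events. The key names are taken from the config template and the setting path referenced in this section; the tag and the writeFrequency value (assumed here to be in milliseconds) are placeholders, not recommendations. It is a fragment only, and the other keys from the template are still needed.

```yaml
Butler-SOS:
  qlikSenseEvents:
    influxdb:
      writeFrequency: 20000           # Assumed value and unit (milliseconds); verify against your own config file
  logEvents:
    enginePerformanceMonitor:
      enable: true                    # The feature is disabled by default in the sample config file
      trackRejectedEvents:
        enable: true                  # Keep counters for events that are rejected by the filters
        tags:                         # Optional tags added to the counters before they are stored in InfluxDB
          - name: env
            value: DEV
```

With trackRejectedEvents enabled, the rejected-event counters make it easier to see how much data the filters are holding back before you loosen them.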

Filtering of log events

There are two kinds of filters that can be defined in the config file:

  • Filters that apply to all apps
  • App specific filters that apply to specific apps

An event will be accepted if it matches either or both of the above filter types.
If an event does not match either of the two filter types, it will be rejected.

Filters applying to all apps

Global filters are defined in the Butler-SOS.logEvents.enginePerformanceMonitor.monitorFilter.allApps section of the config file.
All references to config file settings below are relative to this section.

The all-app filter consists of several parts/subfilters:

  • Apps for which events should be excluded, based on app ID and/or app name
  • Object types to include/exclude
  • Methods to include/exclude

An event must match all of the subfilters to be accepted by the all-app filter.

Enabling/disabling

Monitoring of all apps can be enabled or disabled via the enable setting. If disabled, no events will be accepted via the all-apps filter.

Excluding apps

The appExclude[] array can be used to exclude specific apps from monitoring.
The effect is that events from all apps, except the apps listed in appExclude[], will be accepted (unless the event is rejected by some of the other all-app filters).

appExclude[] can contain zero or more objects, each with an appId property, an appName property, or both.
If both properties are present, both must match the event’s data for it to be considered a match.

Filtering by object type

The objectType section can be used to filter events based on the object type of the event.
“Object types” are things like barchart, sheet, map, but also internal Sense objects like AppPropsList and appprops.

  • If objectType.allObjectTypes is set to true, all object types are monitored, except those listed in allObjectTypesExclude[].
    • someObjectTypesInclude[] is not used in this case.
  • If objectType.allObjectTypes is set to false, only the object types listed in someObjectTypesInclude[] will result in events being accepted.
    • allObjectTypesExclude[] is not used in this case.

Put differently: You can either start by including all object types and then specify which ones to exclude, or start with an empty list and add the object types you want to include.

Filtering by method

“Methods” refer to the various operations that the Sense engine performs.
Examples include Global::OpenApp, Global::GetProgress, Doc::GetAppLayout, Doc::CreateSessionObject.

The concepts for filtering by method are the same as for filtering by object type (see above).
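
A minimal sketch of an all-app filter that combines the three subfilters is shown below. It assumes the same key layout as the config template further down; the app name, object types and method are the template's own examples, used here purely for illustration. Unused arrays are left empty, as in the template.

```yaml
allApps:
  enable: true                        # Accept events via the all-app filter
  appExclude:
    - appName: Sales forecast         # Events from this app are not accepted by this filter
  objectType:
    allObjectTypes: false             # Start from an empty list...
    allObjectTypesExclude:            # Not used when allObjectTypes is false
    someObjectTypesInclude:           # ...and accept only these object types
      - sheet
      - barchart
  method:
    allMethods: true                  # Start from all methods...
    allMethodsExclude:                # ...and drop the ones listed here
      - Global::OpenApp
    someMethodsInclude:               # Not used when allMethods is true
```

With this configuration, an event from any app other than "Sales forecast" would be accepted only if its object type is sheet or barchart and its method is anything except Global::OpenApp.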

Filters applying to specific apps

If the all-app filters are not useful (they may generate too much data, for example), it is possible to define app-specific filters in the Butler-SOS.logEvents.enginePerformanceMonitor.monitorFilter.appSpecific section of the config file.
All references to config file settings below are relative to this section.

App-specific filters are defined in the array app[].
It includes zero or more objects, each of which contains the following subfilters:

  • Apps to include, based on app ID and/or app name.
  • Object types to include/exclude
  • App objects to include/exclude
  • Methods to include/exclude

The concepts for including all (and then excluding some) or starting with an empty list and adding what you want to include are the same as for the all-app filters.

Enabling/disabling

App-specific monitoring can be enabled or disabled via the enable setting. If disabled, no events will be accepted via the app-specific filter.

Including apps

The app[].include[] array contains the app IDs and/or app names of the apps to include in the filter.

Example:

appSpecific:
  enable: true                  # Should app specific monitoring be done?
  app:
    - include:                  # What apps should be monitored?
                                # If both appId and appName are specified, both must match the event's data for it to be considered a match.
        - appId: d7cf16f9-6a95-462a-9ff1-a6d413326de4
        - appName: Budget 2025
        - appId: 6931136d-c234-4358-a40c-e37153aba7c9
          appName: Sales basket analysis

Filtering by object type

The objectType section can be used to filter events based on the object type of the event.

The concepts for filtering by object type are the same as for the all-app filters.

Filtering by app object

The appObject section can be used to filter events based on the object ID of the app object that the event is related to.
This is useful if you want to monitor only specific objects within an app, for example a certain chart or table.

Getting the ID of an app object is a bit tricky, but for charts, tables and other UI elements it can be done in a few reasonably easy ways.
The below works for Qlik Sense 2024-May - your mileage may vary for other versions.

Share object from within app
  1. Open the app, then move to the sheet the chart is on.
  2. Right click the chart and select “Share”. Click “Embed”.
  3. The object ID is shown under the preview image of the chart. It is also available in the Iframe URL as the obj parameter.

Get the object ID from the "Share" dialog within the Sense app itself.

Use a Chrome extension

This works if you are using Chrome and have the Add Sense extension installed.

  1. Open the app, then move to the sheet the chart is on.
  2. Click the Add Sense icon in the Chrome toolbar, then “Show” in the menu that appears.
  3. Popup windows will appear with information - including the object ID - about all UI objects on the sheet.

Get the object ID from the Add Sense extension in Chrome.

Use the “Single configurator” in the Qlik Sense Dev Hub

While it is possible to get the object ID from the Dev Hub, it is not recommended as the Dev Hub will be removed in a future version of Qlik Sense.

Still, if you are using a version of Sense that has the Dev Hub, you can get the object ID like this:

  1. Open the Dev Hub, then the “Single configurator” tool (usually found at https://mysense.some.domain/dev-hub/single-configurator).
  2. Select the app and the object you want to get the ID for in the dropdown to the left.
  3. Select the chart or table you are interested in. The object ID is found in the URL.

Get the object ID from the URL in the Single configurator tool in the Dev Hub.

The concepts for filtering by app object are the same as for other filter types.
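
As an illustration, an app-specific filter that only accepts events for one object in one app could look roughly like the sketch below. The app ID and object ID are the placeholder values from the config template, not real identifiers; replace them with values from your own environment.

```yaml
appSpecific:
  enable: true
  app:
    - include:
        - appId: d7cf16f9-6a95-462a-9ff1-a6d413326de4   # Placeholder app ID from the template
      objectType:
        allObjectTypes: true          # Do not restrict on object type
        allObjectTypesExclude:
        someObjectTypesInclude:
      appObject:
        allAppObjects: false          # Only the objects listed below are monitored
        allAppObjectsExclude:         # Not used when allAppObjects is false
        someAppObjectsInclude:
          - objectId: YJEpPT          # Placeholder object ID, found e.g. via the "Share" dialog
      method:
        allMethods: true              # Do not restrict on method
        allMethodsExclude:
        someMethodsInclude:
```

A narrow filter like this is a reasonable way to start small, in line with the earlier recommendation, before widening the set of monitored objects.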

Filtering by method

The method section can be used to filter events based on the method that generated the event.

The concepts for filtering by method are the same as for other filter types.

Metrics for accepted performance log events

The accepted performance log events stored in InfluxDB are described here.

Metrics for rejected performance log events

The rejected performance log events stored in InfluxDB are described here.

Settings in config file

Tip

The config snippet below comes from the production_template.yaml file.

Being a template, it contains examples on how configuration may be done - not necessarily how it should be done.
For example, the env/DEV and foo/bar tags are optional and can be changed to something else, or removed altogether if not used.

Butler-SOS:
  ...
  ...
  # Log events are used to capture Sense warnings, errors and fatals in real time
  logEvents:
    ...
    ...
    enginePerformanceMonitor:           # Detailed app performance data extraction from log events
      enable: false                     # Should app performance data be extracted from log events?
      appNameLookup:                    # Should app names be looked up based on app IDs?
        enable: false
      trackRejectedEvents: 
        enable: false                   # Should events that are rejected by the app performance monitor be tracked?
        tags:                           # Tags are added to the data before it's stored in InfluxDB
          # - name: env
          #   value: DEV
          # - name: foo
          #   value: bar
      monitorFilter:                    # What objects should be monitored? Entire apps or just specific object(s) within some specific app(s)?
                                        # Two kinds of monitoring can be done:
                                        # 1) Monitor all apps, except those listed for exclusion. This is defined in the allApps section.
                                        # 2) Monitor only specific apps. This is defined in the appSpecific section.
                                        # An event will be accepted if it matches any of the rules in the allApps section OR any of the rules in the appSpecific section.
        allApps:
          enable: false                 # Should all apps be monitored?
          appExclude:                   # What apps should be excluded from monitoring?
                                        # If both appId and appName are specified, both must match the event's data for it to be considered a match.
            # - appId: 5b817efe-472d-43ce-8a31-6cce34af7de9
            # - appName: Sales forecast
            # - appId: f42d6b16-8faf-45ca-a783-59f9da47db6e
            #   appName: Inventory analysis
          objectType:
            allObjectTypes: true        # Should all object types be monitored?
            allObjectTypesExclude:      # If allObjectTypes is set to true, the object types in this array are excluded from monitoring. 
                                        # someObjectTypesInclude (below) is ignored in that case.
              # - LoadModelList
              # - <Unknown>
              # - linechart
              # - map
            someObjectTypesInclude:     # What object types should be included in monitoring?
                                        # Only applicable if allObjectTypes is set to false.
              # - LoadModelList
              # - sheet
              # - barchart
          method:
            allMethods: true            # Should all methods be monitored?
            allMethodsExclude:          # If allMethods is set to true, the methods in this array are excluded from monitoring.
                                        # someMethodsInclude (below) is ignored in that case.
              # - Global::OpenApp
              # - Doc::GetAppLayout
              # - Doc::CreateSessionObject
            someMethodsInclude:         # What methods should be included in monitoring?
                                        # Only applicable if allMethods is set to false.
              # - GenericObject::GetLayout
              # - GenericObject::GetHyperCubeContinuousData
        appSpecific:
          enable: false                  # Should app specific monitoring be done?
          app:
            - include:                  # What apps should be monitored?
                                        # If both appId and appName are specified, both must match the event's data for it to be considered a match.
                # - appId: d7cf16f9-6a95-462a-9ff1-a6d413326de4
                # - appName: Budget 2025
                # - appId: 6931136d-c234-4358-a40c-e37153aba7c9
                #   appName: Sales basket analysis
              objectType:
                allObjectTypes: true   # Should all object types be monitored?
                allObjectTypesExclude:  # If allObjectTypes is set to true, the object types in this array are excluded from monitoring. 
                                        # someObjectTypesInclude (below) is ignored in that case.
                  # - table
                  # - map
                someObjectTypesInclude: # What object types should be included in monitoring?
                                        # Only applicable if allObjectTypes is set to false.
                  # - sheet
                  # - barchart
                  # - linechart
                  # - map
              appObject:
                allAppObjects: true     # Should all app objects be monitored?
                allAppObjectsExclude:   # If allAppObjects is set to true, the app objects in this array are excluded from monitoring.
                                        # someAppObjectsInclude (below) is ignored in that case.
                  # - objectId: AaBbCc
                  # - objectId: DdEeFf
                someAppObjectsInclude:  # What app objects should be included in monitoring?
                                        # Only applicable if allAppObjects is set to false.
                  # - objectId: YJEpPT    
              method: 
                allMethods: true        # Should all methods be monitored?
                allMethodsExclude:      # If allMethods is set to true, the methods in this array are excluded from monitoring.
                                        # someMethodsInclude (below) is ignored in that case.
                  # - Global::OpenApp
                  # - Doc::GetAppLayout
                  # - Doc::CreateSessionObject
                someMethodsInclude:     # What methods should be included in monitoring?
                                        # Only applicable if allMethods is set to false.
                  # - GenericObject::GetLayout
                  # - GenericObject::GetHyperCubeContinuousData
  ...
  ...

Log appender XML files

Sample log appender files are included in the ZIP file available from the download page, in the engine/proxy/repository/scheduler subfolders of the config/log_appender_xml/ folder.

Note that the log appender files contain slightly different information for each Sense service (engine/proxy/repository/scheduler)!
Also keep in mind that the log appender files must be called LocalLogConfig.xml and placed in these directories on all Sense servers:

  • C:\ProgramData\Qlik\Sense\Engine
  • C:\ProgramData\Qlik\Sense\Proxy
  • C:\ProgramData\Qlik\Sense\Repository
  • C:\ProgramData\Qlik\Sense\Scheduler

Tip

If you have more than one Sense server you strictly speaking don’t have to deploy log appenders to all servers.

If you are only interested in receiving log events from some servers and/or services (engine, proxy, repository, scheduler) - deploy the log appender files there.

3.10 - Configuring the log database

Warning

Support for Qlik Sense log db was removed in Butler SOS version 11.0.0.

It’s recommended to use log events instead, as they provide a more flexible and scalable way to capture log data from Qlik Sense.

What’s this?

Up until mid 2021 Qlik Sense Enterprise on Windows included a logging database to which log events were sent. It was removed from the product mainly for performance reasons - it could be difficult to scale properly for large Sense clusters.

Butler SOS offers a replacement for log db, in the form of log events.

3.11 - Connecting to a Qlik Sense server

Details on how to configure the connection from Butler SOS to Qlik Sense Enterprise on Windows.

What’s this?

In order to interact with a Qlik Sense Enterprise on Windows (QSEoW) environment, Butler SOS needs to know a few things about that environment. This is true no matter if the Sense cluster consists of a single Sense server or many.

Settings in main config file

---
Butler-SOS:
  ...
  ...
  # Certificates to use when connecting to Sense. Get these from the Certificate Export in QMC.
  cert:
    clientCert: <path/to/cert/client.pem>
    clientCertKey: <path/to/cert/client_key.pem>
    clientCertCA: <path/to/cert/root.pem>
    clientCertPassphrase: <certificate key password, if one was specified when exporting certificates from Sense QMC >
    # If running Butler in a Docker container, the cert paths MUST be the following
    # clientCert: /nodeapp/config/certificate/client.pem
    # clientCertKey: /nodeapp/config/certificate/client_key.pem
    # clientCertCA: /nodeapp/config/certificate/root.pem
    # clientCertPassphrase: 
  ...
  ...

Qlik Sense certificates

Butler SOS uses certificates to authenticate with Qlik Sense.
These certificates must be exported from the Qlik Management Console (QMC).

Qlik Sense certificate export

To export certificates you need to provide a few pieces of information:

  1. The IP or full host name that Butler SOS will use when calling the Qlik Sense APIs. For example, if Butler SOS gets data from server1.my.domain (i.e. the config setting Butler.serverToMonitor.server.host is set to server1.my.domain), the value server1.my.domain should be entered as “machine name” when exporting the certificates from the QMC.
  2. You only need to export certificates from one server in multi-server Sense clusters. The exported certificates can be used to access and get data from any server in the cluster.
  3. Butler SOS can handle certificates with or without password protection. If you choose to use a password, you must enter that password in the Butler SOS config file.
  4. Check the “Include secret key” check box.
  5. Export certificates in PEM format.

Qlik Sense certificate export

Then click the “Export certificates” button. If all goes well the certificates are now exported to a folder on the Sense server to which you are connected (i.e. the server hosting the virtual proxy you are connected to):

Qlik Sense certificate export

The exported certificate files will be used when configuring Butler SOS.
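
Once the files have been copied to the computer where Butler SOS runs, the cert section of the config file might be filled in along the lines of the sketch below. The directory is hypothetical; the three file names are the ones used in the template above. Leave the passphrase empty if no password was set during the export.

```yaml
Butler-SOS:
  cert:
    clientCert: /opt/butler-sos/config/certificate/client.pem         # Hypothetical path
    clientCertKey: /opt/butler-sos/config/certificate/client_key.pem  # Hypothetical path
    clientCertCA: /opt/butler-sos/config/certificate/root.pem         # Hypothetical path
    clientCertPassphrase:                                             # Only needed if a password was set in the QMC
```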

3.12 - Setting up MQTT messaging

Butler SOS can use MQTT as a channel for pub-sub style M2M (machine to machine) messages. This page describes how to configure MQTT in Butler SOS.

What’s this?

MQTT is a lightweight messaging protocol based on a publish-subscribe metaphor. It is widely used in the Internet of Things and telecom sectors.

MQTT has features such as guaranteed delivery of messages, which makes it very useful for communicating between Sense and both up- and downstream source/destination systems.

Butler SOS can be configured to forward various metrics and events from Sense as MQTT messages. In order to do so, some shared configuration needs to be in place first. This section covers that configuration.

Specifically, an MQTT broker/gateway has to be configured. All MQTT messages from Butler SOS will be sent to this broker.

Settings in main config file

Butler-SOS:
  ...
  ...
  # MQTT config parameters
  mqttConfig:
    enable: false
    # Items below are mandatory if mqttConfig.enable=true
    brokerHost: <IP of MQTT broker/server>
    brokerPort: 1883
    baseTopic: butler-sos/          # Default topic used if not otherwise specified elsewhere. Should end with /
  ...
  ...
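
For illustration, the fragment below shows the shared MQTT settings enabled together with one feature that relies on them, namely forwarding of log events (the sendToMQTT keys are taken from the log events section of the config template). The broker address is a made-up example; everything else uses the template's default values.

```yaml
Butler-SOS:
  mqttConfig:
    enable: true
    brokerHost: 192.168.1.50          # Hypothetical broker address, replace with your own
    brokerPort: 1883
    baseTopic: butler-sos/            # Should end with /
  logEvents:
    sendToMQTT:
      enable: true                    # Forward log events as MQTT messages
      baseTopic: qliksense/logevent
      postTo:
        baseTopic: true
        subsystemTopics: true         # Also post to subtopics per originating QSEoW subsystem
```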

3.13 - Setting up the New Relic integration

Butler SOS can send metrics and events to New Relic.
This way it’s possible to use their SaaS solution for storing and visualising Butler SOS data.

What’s this?

New Relic offers a suite of online/SaaS products that collectively form a very complete observability stack.

From a Butler SOS perspective the interesting parts are metrics, event and log handling.
By forwarding such data to New Relic it’s not necessary to run local InfluxDB and Grafana instances.

That said, New Relic is a commercial service and while their free tier is very generous, there will be a trade-off between a local/lower cost InfluxDB/Grafana setup and using New Relic.
With InfluxDB/Grafana you also get more fine-grained control over both data storage and visualisations, while New Relic offers ease of setup and no need to host InfluxDB/Grafana yourself.

The settings for sending Qlik Sense health metrics to New Relic are described below.

Tagging of data

The following attributes (which is New Relic lingo for tags) are added.

Note: Attributes defined further down in the list will overwrite already defined attributes if their names match.
To avoid problems you should make sure not to use already defined attributes.

Tags for Qlik Sense health metrics

  1. Static attributes defined in the config file’s Butler-SOS.newRelic.metric.attribute.static section.
  2. Dynamic attributes
    1. butlerSosVersion: Butler SOS version
  3. Dynamic attributes based on whether each item in Butler-SOS.newRelic.metric.dynamic section is enabled/disabled.
    1. If Butler-SOS.newRelic.metric.dynamic.engine.memory section is enabled
      1. qs_memCommited
      2. qs_memAllocated
      3. qs_memFree
    2. If Butler-SOS.newRelic.metric.dynamic.engine.cpu section is enabled
      1. qs_cpuTotal
    3. If Butler-SOS.newRelic.metric.dynamic.engine.calls section is enabled
      1. qs_engineCalls
    4. If Butler-SOS.newRelic.metric.dynamic.engine.selections section is enabled
      1. qs_engineSelections
    5. If Butler-SOS.newRelic.metric.dynamic.engine.sessions section is enabled
      1. qs_engineSessionsActive
      2. qs_engineSessionsTotal
    6. If Butler-SOS.newRelic.metric.dynamic.engine.users section is enabled
      1. qs_engineUsersActive
      2. qs_engineUsersTotal
    7. If Butler-SOS.newRelic.metric.dynamic.engine.saturated section is enabled
      1. qs_engineSaturated
    8. If Butler-SOS.newRelic.metric.dynamic.apps.docCount section is enabled
      1. qs_docsActiveCount
      2. qs_docsLoadedCount
      3. qs_docsInMemoryCount
    9. If Butler-SOS.newRelic.metric.dynamic.cache.cache section is enabled
      1. qs_cacheHits
      2. qs_cacheLookups
      3. qs_cacheaAdded
      4. qs_cacheReplaced
      5. qs_cacheBytesAdded

Tags for Qlik Sense proxy session metrics

If and how Butler SOS should extract proxy session metrics is controlled in the Butler-SOS.userSessions section of the config file.

  1. Static attributes defined in the config file’s Butler-SOS.newRelic.metric.attribute.static section.
  2. Dynamic attributes
    1. butlerSosVersion: Butler SOS version

Settings in main config file

Butler-SOS:
  ...
  ...
  # New Relic config
  # If enabled, select Butler SOS metrics will be sent to New Relic.
  newRelic:
    enable: false
    event:
      # There are different URLs depending on whether you have an EU or US region New Relic account.
      # The available URLs are listed here: https://docs.newrelic.com/docs/accounts/accounts-billing/account-setup/choose-your-data-center/
      #
      # Note that the URL path should *not* be included in the url setting below!
      # As of this writing the valid options are
      # https://insights-collector.eu01.nr-data.net
      # https://insights-collector.newrelic.com 
      url: https://insights-collector.eu01.nr-data.net
      header:                   # Custom http headers
        - name: X-My-Header
          value: Header value
      attribute: 
        static:                 # Static attributes/dimensions to attach to the events sent to New Relic.
          - name: service
            value: butler-sos
          - name: environment
            value: prod
        dynamic:
          butlerSosVersion: 
            enable: true       # Should the Butler SOS version be included in the events sent to New Relic?
    metric:
      destinationAccount:
        - First NR account
        - Second NR account
      # There are different URLs depending on whether you have an EU or US region New Relic account.
      # The available URLs are listed here: https://docs.newrelic.com/docs/accounts/accounts-billing/account-setup/choose-your-data-center/
      # As of this writing the options for the New Relic metrics API are
      # https://insights-collector.eu01.nr-data.net/metric/v1
      # https://metric-api.newrelic.com/metric/v1 
      url: https://insights-collector.eu01.nr-data.net/metric/v1   # Where should uptime data be sent?
      header:                   # Custom http headers
        - name: X-My-Header
          value: Header value
      dynamic:
        engine:
          memory:               # Engine RAM (free/committed/allocated).
            enable: true
          cpu:                  # Engine CPU.
            enable: true        
          calls:                # Total number of requests made to the engine.
            enable: true        
          selections:           # Total number of selections made to the engine.
            enable: true        
          sessions:             # Engine session metrics (active and total number of engine sessions).
            enable: true        
          users:                # Engine user metrics (active and total number of users in engine).
            enable: true        
          saturated:            # Engine saturation status (tracks whether engine has high or low load).
            enable: true
        apps:
          docCount:
            enable: true
        cache:
          cache:                # Cache metrics.
            enable: true
        proxy:
          sessions:             # Session metrics as reported by the Sense proxy service
            enable: true
      attribute: 
        static:                 # Static attributes/dimensions to attach to the data sent to New Relic.
          - name: service
            value: butler-sos
          - name: environment
            value: prod
        dynamic:
          butlerSosVersion: 
            enable: true       # Should the Butler SOS version be included in the data sent to New Relic?
  ...
  ...

3.14 - Setting up Prometheus

Butler SOS can expose metrics for Prometheus to scrape.

What’s this?

Prometheus is the de-facto standard, open source tool for achieving observability of IT systems of any size, from small to huge.

At its heart Prometheus contains a time-series database optimized for storing various kinds of measurements. It has strong support for doing dimensional queries, great integrations with incident management tools and more.

Looking at the visualisation side of things, Prometheus is Grafana’s preferred source for time-series data. Put differently, Prometheus has some query features that InfluxDB lacks, which makes some Grafana diagrams easier to create with Prometheus than with InfluxDB. The difference is minor though.

Settings in main config file

Butler-SOS:
  ...
  ...
  # Prometheus config
  # If enabled, select Butler SOS metrics will be exposed on a Prometheus compatible URL from where they can be scraped.
  prometheus:
    enable: false                                    # Default false
    host: <IP or FQDN where Butler SOS is running>  # On what IP/FQDN should the Prometheus metrics be exposed? Default 0.0.0.0, i.e. all available IPs
    port: 9842      
  ...
  ...
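
Once the endpoint is enabled, a quick way to confirm that it responds is to fetch it by hand. The sketch below is only an illustration under stated assumptions: the `/metrics` path and plain `http` are assumptions (`/metrics` is the conventional Prometheus path, but verify it against your Butler SOS version), and the host name in the usage comment is hypothetical.

```python
from urllib.request import urlopen


def metrics_url(host: str, port: int = 9842, path: str = "/metrics") -> str:
    """Build the URL of the metrics endpoint (scheme and path are assumptions)."""
    return f"http://{host}:{port}{path}"


def fetch_metric_lines(url: str, timeout: float = 5.0) -> list[str]:
    """Fetch a Prometheus text response and return its sample lines.

    Lines starting with '#' (HELP/TYPE comments) and empty lines are skipped.
    """
    with urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return [line for line in text.splitlines() if line and not line.startswith("#")]


# Usage (hypothetical host name, requires a running Butler SOS instance):
#   for line in fetch_metric_lines(metrics_url("butler-sos.mycompany.com"))[:10]:
#       print(line)
```

If the request times out, check that the `host` setting is an address the calling machine can actually reach.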

3.15 - Setting up InfluxDB time series database

Butler SOS can store metrics in InfluxDB.

Warning

Butler SOS supports InfluxDB 1.x and 2.x.
There are reports that InfluxDB’s cloud product also works with Butler SOS, but this has not been tested by the Butler SOS team.

Version 3 (in beta at the time of this writing) is not supported.

What’s this?

InfluxDB is a time series database. This means it is optimised for storing data that’s somehow linked to a timestamp.
Measurements and metrics are some of the most obvious kinds of data for which InfluxDB was created.

Butler SOS stores data in InfluxDB in full detail, i.e. Butler SOS doesn’t do any aggregation of older data points.
This has a few consequences:

  • If you are monitoring many Sense servers and/or query Sense for health metrics very frequently and/or have a long InfluxDB retention policy (many months or even years) you will eventually end up with lots of data.
  • You should ask yourself how far back you need to look at operational data such as that collected by Butler SOS. In most cases 30 or 45 days of history will be more than enough. 10-14 days is usually a good starting point. Use the Butler-SOS.influxdbConfig.retentionPolicy section of the config file to create a retention policy for Butler SOS.
  • If you need longer history you should consider using InfluxDB’s excellent aggregation features. These can assist in aggregating older data points, with the effect that you can then keep virtually unlimited history. The older data will not be as detailed (fewer samples per time period) - but you will still have an averaged view of what the history looked like.

Note 1: Instructions for how to aggregate old data are beyond the scope of this documentation.

Note 2: The retention policy specified in the config file will only be created if the InfluxDB database specified in the config file does NOT exist when Butler SOS is started.
I.e. if you store data to an existing InfluxDB database, the retention policy will not be created.
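
To get a feeling for how polling interval, retention and server count interact, a back-of-the-envelope calculation can help. The function below is a rough sketch, not something Butler SOS provides: it only counts healthcheck polls kept in the retention window. Each poll is written as several data points, and how many depends on which metrics are enabled, so treat the result as a lower bound on relative growth rather than an exact size.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def polls_retained(polling_interval_ms: int, retention_days: int, servers: int = 1) -> int:
    """Healthcheck polls kept in InfluxDB once the retention window is full."""
    if polling_interval_ms <= 0:
        raise ValueError("polling_interval_ms must be positive")
    return servers * retention_days * MS_PER_DAY // polling_interval_ms


# One server polled every 30 seconds with a 10 day retention policy
print(polls_retained(30000, 10))             # 28800
# Four servers, same polling interval, 45 days of history
print(polls_retained(30000, 45, servers=4))  # 518400
```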

Settings in main config file

Butler-SOS:
  ...
  ...
  # Influx db config parameters
  influxdbConfig:
    enable: true
    # Items below are mandatory if influxdbConfig.enable=true
    host: influxdb.mycompany.com    # InfluxDB host, hostname, FQDN or IP address
    port: 8086                      # Port where InfluxDB is listening, usually 8086
    version: 1                      # Is the InfluxDB instance version 1.x or 2.x? Valid values are 1 or 2
    v2Config:                       # Settings for InfluxDB v2.x only, i.e. Butler-SOS.influxdbConfig.version=2
      org: myorg
      bucket: mybucket
      description: Butler SOS metrics
      token: mytoken
      retentionDuration: 10d
    v1Config:                       # Settings below are for InfluxDB v1.x only, i.e. Butler-SOS.influxdbConfig.version=1
      auth:
        enable: false               # Does influxdb instance require authentication (true/false)?
        username: <username>        # Username for Influxdb authentication. Mandatory if auth.enable=true
        password: <password>        # Password for Influxdb authentication. Mandatory if auth.enable=true
      dbName: SenseOps
      # Default retention policy that should be created in InfluxDB when Butler SOS creates a new database there. 
      # Any data older than retention policy threshold will be purged from InfluxDB.
      retentionPolicy:
        name: 10d
        duration: 10d                 # Possible duration units here: https://docs.influxdata.com/influxdb/v1.8/query_language/spec/#durations
    # Control whether certain metrics are stored in InfluxDB or not
    # Use with caution! Enabling activeDocs, loadedDocs or inMemoryDocs may result in lots of data sent to InfluxDB.
    includeFields:
      activeDocs: false              # Should data on what docs are active be stored in Influxdb (true/false)? 
      loadedDocs: false              # Should data on what docs are loaded be stored in Influxdb (true/false)?
      inMemoryDocs: false            # Should data on what docs are in memory be stored in Influxdb (true/false)?

  ...
  ...

3.16 - Configuring extraction of app names from Qlik Sense

What’s this?

Qlik Sense’s APIs return all its app metrics relative to an app ID, not the app name.
This is fine as the ID is guaranteed to be unique, but the downside is that the ID doesn’t tell us humans much.

Butler SOS therefore provides an app ID-to-app name mapping with a configurable update interval.
The Butler-SOS.appNames section of the config file controls if this mapping should be done at all, how often and which Sense server should be used to get it.

If an app ID for some reason can’t be mapped to an app name, Butler SOS will use the app ID as the app name.
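
The fallback rule can be pictured as a dictionary lookup with the ID itself as the default. This is a minimal sketch of the idea, not Butler SOS’ actual implementation. The function name, app IDs and app name are made up for illustration.

```python
def display_name(app_id: str, id_to_name: dict[str, str]) -> str:
    """Return the app name for an app ID, falling back to the ID itself."""
    return id_to_name.get(app_id, app_id)


# Hypothetical mapping, as it might look after one extraction round
names = {"a1b2c3": "Sales dashboard"}
print(display_name("a1b2c3", names))  # Sales dashboard
print(display_name("ffff00", names))  # ffff00
```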

A more comprehensive description of Butler SOS’ strategy for getting correct names of Sense apps is available in the Concepts section.

Settings in main config file

Butler-SOS:
  ...
  ...
  # Extract app names
  appNames: 
    enableAppNameExtract: true    # Extract app names in addition to app IDs (true/false)?
    extractInterval: 60000        # How often (milliseconds) should app names be extracted?
    hostIP: <IP or FQDN>          # What Sense server should be queried for app names?
  ...
  ...

3.17 - Configuring user sessions

What’s this?

A description of what user sessions are is available in the Concepts section.

Detailed user session metrics are retrieved for all virtual proxies specified in the Butler-SOS.serversToMonitor.servers[].userSessions.virtualProxies[] array.

In other words: For each monitored Sense server it is possible to specify which of the server’s proxy service’s virtual proxies should be monitored with respect to per-user session metrics.
Right, that’s a long sentence…

Let’s try again: For each monitored Sense server, decide which virtual proxies should be monitored.
Enter those virtual proxies in the Butler-SOS.serversToMonitor.servers[].userSessions.virtualProxies[] array for the server in question.

Note

In order to get detailed, per-user and virtual proxy session info you need to

  1. Configure the Butler-SOS.userSessions section of the config file with general parameters about how often sessions should be polled, user blacklist etc.
    Don’t forget to set Butler-SOS.userSessions.enableSessionExtract to true.
  2. For each server then set Butler-SOS.serversToMonitor.servers[].userSessions.enable to true and specify which virtual proxies should be monitored.

You will only get user session info if you configure both the points above.
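
The two conditions can be checked mechanically against a parsed config. The sketch below assumes the config file has already been loaded into nested Python dictionaries (for example with a YAML parser, which is not shown here). The function name is hypothetical, and treating an empty virtual proxy list as "nothing to monitor" is an assumption.

```python
def servers_with_session_metrics(config: dict) -> list[str]:
    """Names of servers for which both session settings are in place."""
    root = config["Butler-SOS"]
    # Condition 1: session extraction is enabled globally
    if not root.get("userSessions", {}).get("enableSessionExtract", False):
        return []
    result = []
    for server in root.get("serversToMonitor", {}).get("servers", []):
        sessions = server.get("userSessions", {})
        # Condition 2: enabled for this server, with at least one virtual proxy
        if sessions.get("enable") and sessions.get("virtualProxies"):
            result.append(server["serverName"])
    return result


config = {
    "Butler-SOS": {
        "userSessions": {"enableSessionExtract": True},
        "serversToMonitor": {
            "servers": [
                {"serverName": "sense1",
                 "userSessions": {"enable": True,
                                  "virtualProxies": [{"virtualProxy": "/"}]}},
                {"serverName": "sense2", "userSessions": {"enable": False}},
            ]
        },
    }
}
print(servers_with_session_metrics(config))  # ['sense1']
```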

Settings in main config file

Tip

The config snippet below comes from the production_template.yaml file.

Being a template, it contains examples on how configuration may be done - not necessarily how it should be done.
For example, the LAB/testuser1 and LAB/testuser2 users are optional and can be changed to something else, or removed altogether if not used.

Butler-SOS:
  ...
  ...
  # Sessions per virtual proxy
  userSessions:
    enableSessionExtract: true      # Query unique user IDs of what users have sessions open (true/false)?
    # Items below are mandatory if enableSessionExtract=true    
    pollingInterval: 30000        # How often (milliseconds) should session data be polled?
    excludeUser:                  # Optional blacklist of users that should be disregarded when it comes to session monitoring.
                                  # Blacklist is only applied to data in InfluxDB. All session data will be sent to MQTT.
      - directory: LAB
        userId: testuser1
      - directory: LAB
        userId: testuser2
  ...
  ...
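
Conceptually, the excludeUser list acts as a filter on directory/user ID pairs. The following sketch illustrates that idea only. The shape of the session records (dictionaries with `directory` and `userId` keys) and the exact, case-sensitive matching are assumptions, not a description of Butler SOS’ internals.

```python
def drop_excluded(sessions: list[dict], exclude_user: list[dict]) -> list[dict]:
    """Remove sessions whose directory/userId pair is in the exclude list."""
    excluded = {(user["directory"], user["userId"]) for user in exclude_user}
    return [s for s in sessions if (s["directory"], s["userId"]) not in excluded]


exclude_user = [
    {"directory": "LAB", "userId": "testuser1"},
    {"directory": "LAB", "userId": "testuser2"},
]
sessions = [
    {"directory": "LAB", "userId": "testuser1"},
    {"directory": "LAB", "userId": "anna"},
]
print(drop_excluded(sessions, exclude_user))
# [{'directory': 'LAB', 'userId': 'anna'}]
```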

3.18 - Configure which Sense servers to monitor

What’s this?

This part of the config file contains information on what Sense servers should be monitored and details about those servers.

Note that there is a dependency on the Butler-SOS.userSessions section. Please see the Configuring user sessions page for more info.

Tags

It’s possible to define server specific tags that will be stored together with the Sense metrics in InfluxDB.
The tags can then be used when creating Grafana dashboards, for example to distinguish between DEV, TEST and PROD servers, where servers are located physically etc.

You can define zero or more tags in the Butler-SOS.serversToMonitor.serverTagsDefinition section.
Those tags are then given values for each server.

If a tag is defined it must in fact be given a value for each server!
If nothing else just set it to an empty string or a hyphen, ‘-’.
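
A small pre-flight check can catch servers that lack a value for a defined tag before Butler SOS is started. This is a sketch that assumes the serversToMonitor section has been loaded into a Python dictionary. The function name and the sample data are hypothetical.

```python
def servers_missing_tags(servers_to_monitor: dict) -> dict[str, list[str]]:
    """Map server name -> defined tags that the server has no value for."""
    defined = servers_to_monitor.get("serverTagsDefinition") or []
    problems = {}
    for server in servers_to_monitor.get("servers", []):
        tags = server.get("serverTags") or {}
        missing = [tag for tag in defined if tag not in tags]
        if missing:
            problems[server["serverName"]] = missing
    return problems


section = {
    "serverTagsDefinition": ["server_group", "serverLocation"],
    "servers": [
        {"serverName": "sense1",
         "serverTags": {"server_group": "DEV", "serverLocation": "Asia"}},
        {"serverName": "sense2", "serverTags": {"server_group": "PROD"}},
    ],
}
print(servers_missing_tags(section))  # {'sense2': ['serverLocation']}
```

Note that an empty string or a hyphen counts as a value here, matching the advice above.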

Settings in main config file

Tip

The config snippet below comes from the production_template.yaml file.

Being a template, it contains examples on how configuration may be done - not necessarily how it should be done.
For example, the list serverTagsDefinition is optional and can be changed to something else, or removed altogether if not used.

Same thing for the list of servers and virtual proxies. Update them so they match your own Sense environment.

Butler-SOS:
  ...
  ...
  serversToMonitor:
    pollingInterval: 30000          # How often (milliseconds) should the healthcheck API be polled?

    # If false, Butler SOS will accept TLS certificates on the server without verifying them with the CA. 
    # If true, data will only be retrieved from the Sense server if that server's TLS cert verifies 
    # successfully against the list of CAs available on the computer where Butler SOS is running.
    rejectUnauthorized: true 

    # List of extra tags for each server. Useful for creating more advanced Grafana dashboards.
    # Each server below MUST include these tags in its serverTags property.
    # The tags below are just examples - define your own as needed
    # These tags will also be used to label data exposed on the Prometheus endpoint (if it is enabled)
    # NOTE: Prometheus only allows label names consisting of ASCII letters, numbers, as well as underscores. They must match the regex [a-zA-Z_][a-zA-Z0-9_]*. 
    # I.e. if the Prometheus endpoint is enabled, the tag names below must follow the label naming standard of Prometheus. 
    serverTagsDefinition: 
      - server_group
      - serverLocation
      - server_type
      - serverBrand

    # Sense Servers that should be queried for healthcheck data 
    servers:
      - host: <server1.my.domain>:4747        # Example: 10.34.3.45:4747
        serverName: <server1>
        serverDescription: <description>
        logDbHost: <host name as used in QLogs db>
        userSessions:
          enable: true
          # Items below are mandatory if userSessions.enable=true
          host: <server1.my.domain>:4243      # Example: 10.34.3.45:4243
          virtualProxies:
            - virtualProxy: /                 # Default virtual proxy
            - virtualProxy: /hdr              # "hdr" virtual proxy
            - virtualProxy: /sales            # "sales" virtual proxy
        serverTags:
          server_group: DEV
          serverLocation: Asia
          server_type: virtual
          serverBrand: Dell
        headers: 
          X-My-Header-1: Header value 1
          X-My-Header-2: Header value 2
      - host: <server2.my.domain>:4747        # Example: 10.34.3.46:4747
        serverName: <server2>
        serverDescription: <description>
        logDbHost: <host name as used in QLogs db>
        userSessions:
          enable: true
          # Items below are mandatory if userSessions.enable=true
          host: <server2.my.domain>:4243      # Example: 10.34.3.46:4243
          virtualProxies:
            - virtualProxy: /finance          # "finance" virtual proxy
        serverTags:
          server_group: PROD
          serverLocation: Europe
          server_type: physical
          serverBrand: HP
        headers: 
          X-My-Header-1: Header value 3
          X-My-Header-2: Header value 4
  ...
  ...
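
The label naming rule quoted in the config comments is easy to verify up front. The sketch below applies the regex from that comment to a list of tag names. The function name and the invalid sample names are made up for illustration.

```python
import re

# Regex taken from the config comment on Prometheus label names
LABEL_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def invalid_label_names(tag_names: list[str]) -> list[str]:
    """Return the tag names that are not valid Prometheus label names."""
    return [name for name in tag_names if not LABEL_NAME.fullmatch(name)]


tags = ["server_group", "serverLocation", "server-type", "2nd_site"]
print(invalid_label_names(tags))  # ['server-type', '2nd_site']
```

`fullmatch` is used so that the whole name has to match, not just a prefix of it.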

3.19 - Configuring telemetry

What’s this?

A description of Butler SOS’ telemetry feature is available here.

Settings in main config file

---
Butler-SOS:
  # Logging configuration
  ...
  ...
  anonTelemetry: true     # Can Butler SOS send anonymous data about what computer it is running on? 
                          # More info on what data is collected: https://butler-sos.ptarmiganlabs.com/docs/about/telemetry/
                          # Please consider leaving this at true - it really helps future development of Butler SOS!
  ...
  ...

Setting anonTelemetry to true enables telemetry, setting it to false disables telemetry.

4 - Day 2 operations

Options for running Butler SOS.

Running Butler SOS

How to start and keep Butler SOS running varies depending on whether you are using Docker or a native Node.js approach.

Docker

Starting Butler SOS using Docker is easy.

First configure the docker-compose.yml file as needed, then start the Docker container in interactive mode (=with output sent to the screen).
This is useful to ensure everything works as intended when first setting up Butler SOS.

docker-compose up

Once Butler SOS has been verified to work as intended, hit ctrl-c to stop it.
Then start it again in daemon (background) mode:

docker-compose up -d

From here on the Docker environment will make sure Butler SOS is always running, including restarting it if it for some reason stops, when the server reboots etc.

Pre-built, standalone binaries

Starting Butler SOS using the pre-built binaries could look like this on Windows:

d:
cd \node\butler-sos
butler-sos.exe --configfile butler-sos-prod.yaml --loglevel info

It is of course also possible to put those commands in a command file (.bat on Windows, .sh etc on other platforms) and execute that file instead.

As Butler SOS is the kind of service that (probably) should always be running on a server, it makes sense to run it as a Windows service (or a similar mechanism on Linux).

On Windows you can use the excellent NSSM tool (https://nssm.cc) to achieve this, with all the benefits that follow (the service can be monitored using operations tools, automatic restarts etc).

A step-by-step tutorial for running Butler SOS as a Windows service using NSSM is available over at ptarmiganlabs.com.

On Linux both PM2 (https://github.com/Unitech/pm2) and Forever (https://github.com/foreverjs/forever) have been successfully tested with Butler SOS.

Native Node.js

Starting Butler SOS as a Node.js app on Windows could look like this:

d:
cd \node\butler-sos\src
node butler-sos.js

It is of course also possible to put those commands in a command file (.bat on Windows, .sh etc on other platforms) and execute that file instead.

Windows services & process monitors

As Butler SOS is the kind of service that (probably) should always be running on a server, it makes sense to use a Node.js process monitor to keep it alive (if running Butler SOS as a Docker container you get this for free).

On Windows you can use the excellent NSSM tool (https://nssm.cc) to make Butler SOS run as a Windows Service, with all the benefits that follow (can be monitored using operations tools, automatic restarts etc).

If running Butler SOS as a Node.js app on Linux, PM2 (https://github.com/Unitech/pm2) and Forever (https://github.com/foreverjs/forever) are two process monitors that both have been successfully tested with Butler SOS.

One caveat with these is that it can be hard to start them (and thus Butler SOS) when a Windows server is rebooted. PM2 can be used to solve this challenge in a nice way, more info in this blog post: https://ptarmiganlabs.com/blog/2017/07/12/monitoring-auto-starting-node-js-services-windows-server.
On the other hand, just using NSSM is probably the easiest and best option for Windows.

4.1 - Monitoring Butler SOS

Options for monitoring Butler SOS itself.

Monitoring Butler SOS

Once Butler SOS is running it’s a good idea to also monitor it. Otherwise you run the risk of not getting notified if Butler SOS for some reason misbehaves.

Butler SOS will log data on its memory usage to InfluxDB if

  1. The config file’s Butler-SOS.uptimeMonitor.enable and Butler-SOS.uptimeMonitor.storeInInfluxdb.butlerSOSMemoryUsage properties are both set to true.
  2. The remaining InfluxDB properties of the config file are correctly configured.

Assuming everything is correctly set up, you can then create a Grafana dashboard showing Butler SOS’ memory use over time.
You can also set up alerts in Grafana with notifications going to most popular IM tools and email.

A Grafana dashboard can look like this. This particular chart is for the Butler tool, but the concept for Butler SOS is the same.

There is a sample Grafana dashboard in Butler SOS’ GitHub repo.

5 - Upgrade

Upgrading Butler SOS to a new version?
Here’s how to do it.

First: Don’t panic

Upgrading Butler SOS is usually a smooth process:

  • Get the new version from the assets section on the download page. Extract the ZIP file.
  • Back up your existing Butler SOS configuration file.
  • Edit the configuration file to match the new version’s requirements.
  • Stop the Butler SOS process/service.
  • Replace the Butler SOS binary with the new version.
  • Start the process/service.
  • 🥳 Celebrate!

Then: The details

Version number hints

Different kinds of upgrades (usually) call for different levels of modification to the main config file.

  • “Small” upgrades move from one patch version to another, without changing the minor version.
    Example: Upgrading from 7.3.0 > 7.3.4.
  • “Medium” upgrades involve moving from one minor version to another, without changing the major version. Example: Upgrading from 7.2.3 > 7.3.0
  • “Major” upgrades are when you move to a new major version. Example: 7.4.2 > 8.0.0
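
The three categories follow directly from comparing version numbers part by part. The function below is an illustrative sketch (its name is hypothetical) that assumes plain `major.minor.patch` version strings without pre-release suffixes.

```python
def upgrade_kind(current: str, target: str) -> str:
    """Classify an upgrade as 'small', 'medium' or 'major'."""
    cur = tuple(int(part) for part in current.split("."))
    new = tuple(int(part) for part in target.split("."))
    if len(cur) != 3 or len(new) != 3:
        raise ValueError("versions must look like major.minor.patch")
    if new <= cur:
        raise ValueError("target version must be newer than current version")
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "medium"
    return "small"


print(upgrade_kind("7.3.0", "7.3.4"))  # small
print(upgrade_kind("7.2.3", "7.3.0"))  # medium
print(upgrade_kind("7.4.2", "8.0.0"))  # major
```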

Warning

You should always upgrade to the latest available version.
That version has the latest features, bug fixes and security patches.

Small upgrades

The new release includes bug fixes, security patches, minor updates to documentation etc - but no new features.

In theory there should never be any changes to the config files when doing a small upgrade.

Medium upgrades

This scenario means that new features are added to Butler SOS.
Usually there are also various bug fixes included.

Most new features need to be configured somehow, meaning that medium upgrades usually require modification to the config files.
By far the most common case is that only the main config file needs to be modified, but a new scheduler-related feature could, for example, mean that the scheduler config file must be modified too.

The changes needed to the config files are usually additive in nature, i.e. some settings must be added to the config file, but the existing settings and general structure of the file remain the same.

Major upgrades

This scenario involves breaking changes of some kind.

These almost certainly require changes to the config files, sometimes even significant ones in the sense that the structure of the config file may have changed.

If very major rework has been done to Butler SOS, this may also result in a major version bump.

Know your config file

Butler SOS is entirely driven by its YAML-formatted configuration file, with an example file serving as a good starting point.

InfluxDB considerations

Some versions include changes to the InfluxDB schema, meaning that you need to do some manual work in order to upgrade to the new schema.

The easiest way to do this is to delete the InfluxDB database used by Butler SOS, then let Butler SOS re-create it using the new schema.
If the InfluxDB database specified in the Butler SOS config file does not exist, Butler SOS will automatically create it for you.

Deleting the InfluxDB database “senseops” can be done with a command similar to this:

influx --host <ip-where-influxdb-is-running> --port <influxdb-port-usually-8086>
drop database senseops
exit

Upgrade checklist

Info

Butler SOS always checks that the config file has the correct format when starting.

This means that if you forget to add or change some setting in the main YAML config file, Butler SOS will tell you what’s missing and refuse to start.
A consequence of this is that all settings are now mandatory, even if you don’t use them.

  1. Make a backup of your YAML configuration file before upgrading. Just… do it.
  2. Look at the release notes to get a general feeling for what is new and what has changed.
    Those are the areas that may require changes in the config file.
  3. Compare your existing main config file with the template config file available on GitHub.
    This comparison is a manual process and can be a bit tedious, but knowing your config file is really needed in order to make full and correct use of Butler SOS.
    1. That file is also included in the Butler SOS ZIP file available on the download page.
    2. A more in-depth description of the config file is available in the Reference docs > Config file format section of the documentation.
  4. The result of the comparison will show you what parts of the config file are new (for medium-sized upgrades) and which parts have changed in a significant way (for major upgrades).
  5. Get the binaries for the new Butler SOS version from the download page.
  6. Start the new Butler SOS version and let it run for a few minutes.
    1. Review the console logs (or the log files) to make sure there are no warnings or errors.
    2. If there are warnings or errors it can be helpful to start Butler SOS with more verbose logging.
      Adding --log-level verbose or even --log-level debug will give you more details on what Butler SOS is doing and what might be causing the problems you are experiencing.
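
The manual comparison in step 3 can be partly automated. The sketch below assumes that both the template and your own config file have been parsed into nested Python dictionaries (the YAML parsing itself is not shown and would need a third-party parser). It lists every setting that exists in the template but not in your config. The function name is hypothetical, lists are not descended into, and the sample keys are only there for illustration.

```python
def missing_keys(template: dict, config: dict, prefix: str = "") -> list[str]:
    """Dotted paths of settings present in template but absent from config."""
    missing = []
    for key, value in template.items():
        path = f"{prefix}.{key}" if prefix else key
        present = isinstance(config, dict) and key in config
        if isinstance(value, dict) and value:
            # Descend into sections; a missing section reports all its leaves
            missing.extend(missing_keys(value, config[key] if present else {}, path))
        elif not present:
            missing.append(path)
    return missing


template = {"Butler-SOS": {"logLevel": "info", "configVisualisation": {"enable": True}}}
mine = {"Butler-SOS": {"logLevel": "debug"}}
print(missing_keys(template, mine))  # ['Butler-SOS.configVisualisation.enable']
```

Only the presence of keys is compared, not their values, so settings you have deliberately changed are not flagged.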

Finally: When things aren’t working - check the logs

By far the most common problem when upgrading to a new Butler SOS version (or doing a fresh install) is an incorrect config file.

All config entries are mandatory, even if you don’t use them.
This may seem a bit harsh, but this way Butler SOS can tell you exactly what is missing in the config file.

Missing entries are shown in the startup log, like this:

2024-09-02T12:17:33.919Z error: VERIFY CONFIG FILE: Errors found in config file. Exiting.
2024-09-02T12:17:33.920Z error: Tip: Start Butler SOS with --no-config-file-verify option to skip this check and start with provided config file. 
2024-09-02T12:17:33.920Z error: /home/goran/code/butler-sos/src/config/production.yaml is not following the correct structure, missing:,Butler-SOS.configVisualisation.enable

In the example above the Butler-SOS.configVisualisation.enable entry is missing.
Adding that entry to the config file should make Butler SOS start.
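
If many entries are missing, it can be handy to pull them out of the log line programmatically. The sketch below parses the line format shown above. That several missing entries would be separated by commas is an assumption based on that single example, and the function name is hypothetical.

```python
def missing_entries(log_line: str) -> list[str]:
    """Extract the config entries listed after 'missing:' in a log line."""
    _, found, tail = log_line.partition("missing:")
    if not found:
        return []
    return [entry.strip() for entry in tail.split(",") if entry.strip()]


line = (
    "2024-09-02T12:17:33.920Z error: /home/goran/code/butler-sos/src/config/production.yaml "
    "is not following the correct structure, missing:,Butler-SOS.configVisualisation.enable"
)
print(missing_entries(line))  # ['Butler-SOS.configVisualisation.enable']
```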

Butler SOS is pretty good at figuring out what is wrong with the config file, but there may be cases where it’s not obvious what is wrong.

Thus, double check your config file, then triple check it.

Then start Butler SOS and read the logs carefully.
If you need more details on what’s going on, start Butler SOS with the --log-level verbose or even --log-level debug option.

If things still don’t work you can post a question in the Butler SOS forums.

By sharing your installation and upgrade challenges/issues you enable future improvements, which will benefit both yourself and others.