Oh data where art thou?

Oh data where art thou?
Photo by NASA / Unsplash

When implementing Qlik Sense solutions for enterprise clients they usually have various requirements regarding alerts for failed reloads, support SLAs etc.
Those are all interesting challenges to implement, but the most common request is probably:

“The system should alert me if data doesn’t arrive on time”

It’s a reasonable request. If some source system is delayed and doesn’t delivery data to Sense on time, they as a system or application owner should be notified.

Don’t bother me unless it’s broken

The second thing almost all clients say is:

“Oh, I only want alerts when data is delayed. No messages when data is on time.”

Again, this is very reasonable.

Let’s say the extract app in Sense sends a notification email once data has been loaded from the source system. Great – we now know that data has arrived, and when it did.
Except that we will drown in such notifications from all our dozens (or hundreds..) of extract apps.

It’s kind of hard to implement this in a good way though, at least I have never found a really good, generic solution for this request.
Sure, you can have a Sense app who’s only job is to execute every 15 minutes and check whether data has arrived, and alert if data is delayed. That app will however put load on your Sense environment and use one of the reload slots. Even if the app reloads quickly it’s still pretty bad system design, IMHO.

I have a suggestion for a better option:

Set up a monitoring tool that views the Sense app (or rather its load script) as a black box with unknown function. The only thing the monitoring tool cares about is whether that black box has checked in within some configured interval.

I stumbled upon this while looking at ways to monitor servers in general, but quickly realised it could be used also with Sense.
By the way: while I have only used the described concept on Qlik Sense, it should work equally well with QlikView.

Don’t care for reading? This video outlines the concept, otherwise keep reading below.

The solution: Don’t over-engineer things

Health check dashboard at Ptarmigan Labs

Tl;dr
Configure Sense apps to ping healthchecks.io when data has arrived. If there are delays, healthchecks.io will handle alerting across email, instant messaging, dedicated alerting tools and more.
Do as little as possible in Sense, offload as much as possible to the monitoring tool.

The heart of the solution: healthchecks.io

healthchecks.io is a tool that exists both as a very reasonably priced SaaS offering, and as a fully featured open source variant. There is even a free tier which is perfect for trying out the ideas in this text.

The tool is written in Python, so in theory it runs everywhere where Python runs. That said, the quickest way to get started is probably to run it as a Docker container (image available here) or by simply using the trial/getting-started online version.
The following demos use the Docker option, running on my laptop.

Solution overview

A few things must be present fot the concept to work:

  1. Access to a healthchecks.io instance, configured with checks for one or more Sense apps.
  2. A Sense data connection created for the purpose of pinging healthchecks.io.
  3. A call to the Sense REST connector will ping healthchecks.io, either directly or via a wrapper function.

Configuring healthchecks.io

Creating and configuring a new check in healthcheck.io is pretty straight forward. Give it a name, a good description, maybe a well chosen tag or two, and last but not least a schedule.

The schedule is important: If a check is not triggered by that schedule (e.g. weekdays at 9.00 am) an alert will be sent. It is also possible to set a grace time for the schedule, this helps avoiding alerts due to things being just a few minutes delayed.

Scroll down a bit, then set the check’s schedule:

Take note of the check’s unique ID, which is part of it’s URL:

Instrumenting Sense apps

Some changes are needed to the Sense apps too:

  1. First create a data connection that can be used for pinging healthchecks.io. This connection can be shared across all apps as the actual ping URL (which is unique for each health check) used will be injected when the script executes.

Let’s simply call the connection “Healthcheck”:

  1. We then need a way to tell healthchecks.io that a script reload has started or completed. A couple of Subs will do the trick:
Sub Healthcheck(healthcheckId)
    LIB CONNECT TO 'Healthcheck';

    let remoteURL = 'https://healthcheck.ptarmiganlabs.net/ping/$(healthcheckId)';

    RestConnectorMasterTable:
    SQL SELECT 
        "col_1"
    FROM CSV (header off, delimiter ",", quote """") "CSV_source"
    WITH CONNECTION(Url "$(remoteURL)");
    ;

    // Uncomment if you are interested in the result that came back from the healthcheck service
    //   [CSV_source]:
    //   LOAD    
    //       [col_1]
    //   RESIDENT RestConnectorMasterTable;

    DROP TABLE RestConnectorMasterTable;
End Sub


Sub HealthcheckStart(healthcheckId)
    LIB CONNECT TO 'Healthcheck';

    let remoteURL = 'https://healthcheck.ptarmiganlabs.net/ping/$(healthcheckId)/start';

    RestConnectorMasterTable:
    SQL SELECT 
        "col_1"
    FROM CSV (header off, delimiter ",", quote """") "CSV_source"
    WITH CONNECTION(Url "$(remoteURL)");
    ;

    // Uncomment if you are interested in the result that came back from the healthcheck service
    //   [CSV_source]:
    //   LOAD    
    //       [col_1]
    //   RESIDENT RestConnectorMasterTable;

    DROP TABLE RestConnectorMasterTable;
End Sub

2. As one of the first things in your app’s load script you call HealthcheckStart(‘<check-ID>’).

This tells healthcheck.io that the check with check-ID has started.
Replace check-ID with the proper ID, of course.

3. At the very end of your app’s load script you then call Healthcheck(‘<check-ID>’) to signal that your app has finished reloading.

Demo time

Let’s look at a basic demo:

  • We have a Sense app called “Inventory database extract (hourly)”.
  • It reloads every hour on weekdays. If any such reload is more than 15 minutes delayed, we want to get notified over email and via Slack.
  • Thera are also some other apps that are monitored.

Below follow screen shots of what the setup and results could look like. The Qlik Sense app used is available on Github.

Definition of check “Inventory data (hourly)”. Looks like the Sense app hasn’t reloaded since June 23, 12.15
Manually triggering the app to reload clears the alerts. We see when the reload started and how long it took.
Our app now lists as green, but the Meetup.com app is still alerting. Let’s reload it and see what happens.
Better. The Meetup.com check now lists as running. The badges have their own URLs, which means they can be embedded on web pages, in blog posts etc.
Slack messages tell us what happened, and when it happened.
Email notifications are sent when a check changes state.

The Sense script needed to achieve the above is minimal. In addition to the two subs shown above, only two lines were added to the Sense app:

// Tell Healthchecks service that we are about to start loading data into the app. 
call HealthcheckStart('0bfd20d6-92ed-4235-8ba3-7043fe09d157'); 

// Load data from source system... 

// Tell Healthchecks service that we are done extracting data from the source system. 
call Healthcheck('0bfd20d6-92ed-4235-8ba3-7043fe09d157');

Instrumenting a Sense app to work with Healthchecks.io is easy.

Why stop here? Keep monitoring!

We’ve only scratched the surface of how Qlik Sense (and QlikView) can be combined with healthchecks.io. Some ideas for enhancements:

  • Integrate with your existing incident management tool (PagerDuty etc). healthchecks.io comes with built-in support for several such tools.
  • Monitor the Windows servers which Sense runs on. This is pretty simple, just have Windows Task Scheduler trigger a Powershell script that pings healthchecks.io.
  • Monitor databases or systems that feed data to Sense. In general any system that can hit a URL can be monitored.
  • Create more complex schedules using cron syntax (e.g. an app should reload every 3 hours on Tuesdays and Thursday, alert if this does not happen).

Stay in touch

Don’t be a stranger – feel free to get in touch through the Qlik community (user “mountaindude” or via LinkedIn (linkedin.com/in/gorsan).