At the beginning of my career as a data analyst, I had to rely on other team members when something went wrong in our data pipeline, often only finding out after the fact. That experience was one of the driving factors behind my decision to join Kensu. When I spoke with the team for the first time, I had that “lightbulb moment”: data observability is a way to make the lives of data team members, including data analysts, more productive and less painful.
What do I mean by this?
As a data analyst, I always had many demands on my time and was constantly switching from one task to another. One constant was my primary communication tool, Slack. It allowed me to keep up in real time with what was happening and to communicate with the broader teams across the parts of the business I was responsible for.
One of the most convenient things about Kensu is its tight integration with Slack, where data analysts and other team members can receive alert messages. Of course, these alerts are not limited to Slack; tools such as Jira and PagerDuty are also supported.
But what really sold me on Kensu was that these alert messages are contextual and delivered at run time. In my previous role, I discovered data health issues only after the event, essentially when I or someone else spotted that the metrics in a report did not add up.
When Kensu is running and observing the metrics of a data pipeline, it applies specific rules that have either been created manually or recommended by the platform and validated by a user, such as an analyst. These rules check for things like schema changes, data timeliness, threshold variations, and values outside known-good ranges.
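To make these rule types concrete, here is a minimal Python sketch of what such checks can look like. This is purely illustrative: the function names, signatures, and the 10% tolerance are my own assumptions, not Kensu's actual implementation or API.

```python
from datetime import datetime, timedelta, timezone

def check_schema(expected_columns, observed_columns):
    """Schema-change rule: fail if columns were added or removed."""
    added = set(observed_columns) - set(expected_columns)
    removed = set(expected_columns) - set(observed_columns)
    return not added and not removed

def check_freshness(last_update, max_age):
    """Timeliness rule: fail if the data has not been refreshed recently enough."""
    return datetime.now(timezone.utc) - last_update <= max_age

def check_threshold(observed, expected, tolerance=0.10):
    """Threshold rule: fail if a metric drifts from its expected value
    by more than the given relative tolerance (hypothetical default: 10%)."""
    return abs(observed - expected) <= tolerance * expected
```

In practice a platform would evaluate many such rules against the metrics it collects on each pipeline run, but the pass/fail shape of each check is the essential idea.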
If a rule is triggered, a contextual notification is created in Slack to alert the analyst and others that an incident has occurred. Within this alert, they can see:
- The rule that triggered the alert
- The name of the Kensu project
- The name of the “application” from within the pipeline
- The data source, where applicable
- The expected value
- The observed value
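As an illustration of how the fields above might be assembled into a notification, here is a hedged sketch in plain Python. The `format_alert` helper and the message layout are hypothetical, not Kensu's actual Slack integration.

```python
def format_alert(rule, project, application, data_source, expected, observed):
    """Render an incident message carrying the same fields an analyst
    would see in the alert: rule, project, application, data source,
    and the expected vs. observed values."""
    return (
        f":rotating_light: Rule '{rule}' triggered\n"
        f"Project: {project}\n"
        f"Application: {application}\n"
        f"Data source: {data_source}\n"
        f"Expected: {expected} | Observed: {observed}"
    )
```

A real integration would post a payload like this through a Slack webhook or a ticketing API rather than building a bare string, but the contextual fields are what make the alert actionable.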
As you can see, this is a great starting point for working out what the issue is and where it occurred. My favorite feature is that many items in the alert are clickable links that take the analyst to the relevant place in Kensu to help them better understand the problem.
If they click on the project, they are taken to a page showing the project's details, the applications associated with it, and any interdependencies.
Next, if they click on Application, it takes them to the specific application that was running when the rule was triggered and shows the associated data sources.
If they click on the data source, it will take them to the data source page. On this page, they can see all of the rules and observations.
Finally, if they click on the explorer tab, they can see the full lineage, both upstream and downstream.
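Conceptually, lineage is a directed graph of data sets, and the upstream/downstream views are graph traversals. The sketch below illustrates the idea using the data set names from this walkthrough; the `EDGES` mapping and `downstream` helper are my own illustration of the concept, not how Kensu stores or queries lineage.

```python
from collections import deque

# Hypothetical lineage edges (producer -> consumers), using the data set
# names from this example pipeline.
EDGES = {
    "cust_database": ["customers_extract"],
    "customers_extract": ["email_list", "Orders_Forecasts"],
    "Orders_Forecasts": ["Business_KPI"],
}

def downstream(source):
    """Return every data set reachable downstream of `source`,
    found with a breadth-first traversal of the lineage graph."""
    seen, queue = set(), deque([source])
    while queue:
        node = queue.popleft()
        for child in EDGES.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

This is exactly the question an analyst asks during an incident: "which data sets could this bad data have reached?"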
Here, they can analyze this information and see that the error occurred somewhere between cust_database and customers_extract. Thankfully, the error has not propagated to the next step in the pipeline, which includes the email_list, Orders_Forecasts, and Business_KPI data sets. As the lineage view shows, these data sets are white and carry no tickets. If they wanted to be 100% sure, they could also check each data source to confirm that the rule is present.
This truly is a game-changer for analytics. To summarize what happened here: we added a step called a circuit breaker to the application “Forecasts”. That is, Kensu stopped the data pipeline as soon as the issue was spotted, preventing the propagation of bad data.
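The circuit-breaker pattern itself is simple to illustrate: before handing data to the next step, evaluate the rule, and abort the pipeline if it fails. This is a generic sketch of the pattern, not Kensu's implementation; the exception and function names are hypothetical.

```python
class CircuitBreakerTripped(Exception):
    """Raised when a rule fails, halting the pipeline before bad data spreads."""

def run_step(step_name, data, rule):
    """Run one pipeline step guarded by a circuit breaker: if `rule`
    rejects the data, stop here so downstream steps never see it."""
    if not rule(data):
        raise CircuitBreakerTripped(
            f"Rule failed at step '{step_name}'; downstream steps skipped"
        )
    return data  # a real step would transform the data here
```

The value of the pattern is in what does not happen: downstream data sets such as email_list or Business_KPI are never written from data that already failed a check.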
The biggest benefit is the time saved: what an analyst can achieve with Kensu in a matter of a few clicks could often take hours of work involving many team members, including DBAs, data governance, and data engineers. This is why I think of Kensu as a data analyst’s best friend.