Packt | Data Observability for Data Engineering
Tired of broken data pipelines?
Make sure to keep them healthy and promote Data Observability in your teams with this essential hands-on guide. Be the first to get your hands on this book and learn how to:
- Ensure and monitor data accuracy in a scalable way
- Prevent and resolve broken data pipelines
- Build trust between data producers and consumers
What you will learn in this book
- Monitor data pipelines proactively in a scalable way
- Implement a data observability approach in the pipelines
- Collect and analyze key metrics through coding examples
- Apply monkey patching in a Python module
- Manage the costs and risks of your data pipeline
- Understand the main techniques to collect observability metrics
- Implement analytics pipeline monitoring techniques in production
- Build a statistic engine continuously
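To give a flavor of what monkey patching for metric collection can look like, here is a minimal sketch. It is not taken from the book: the `observe` helper and the `observations` list are hypothetical names, and the standard-library `json` module merely stands in for whatever data-loading module a pipeline might use.

```python
import functools
import json  # stand-in for any module whose functions you want to observe

# Hypothetical in-memory store for the collected metrics.
observations = []


def observe(module, func_name):
    """Monkey patch module.func_name with a wrapper that records a metric."""
    original = getattr(module, func_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        result = original(*args, **kwargs)
        # Record a simple metric: the number of records, if the result is sized.
        observations.append({
            "function": f"{module.__name__}.{func_name}",
            "row_count": len(result) if hasattr(result, "__len__") else None,
        })
        return result

    setattr(module, func_name, wrapper)
    return original  # returned so the patch can be undone later


original_loads = observe(json, "loads")

# Pipeline code keeps calling json.loads as usual and is observed transparently.
rows = json.loads('[{"id": 1}, {"id": 2}, {"id": 3}]')
print(observations)

# Restore the original function once observation is no longer needed.
json.loads = original_loads
```

The pipeline code itself does not change: the wrapper replaces the function on the module, so every existing call site is observed. The trade-off is that patches are global and implicit, so restoring the original function, as in the last line, is usually worth the extra step.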
Who this book is for
This book is for data engineers, data architects, data analysts, and data scientists who have experienced broken data pipelines or dashboards. It is also useful for organizations that want to adopt the practice of data observability, and for managers, such as a Head of Data or Head of Data Platforms, who are responsible for data quality and processes and want to increase consumers' confidence in, and producers' awareness of, their data pipelines.
Hear from the Authors
Michele Pinto
Former Head of Engineering @Kensu
Sammy El Khammal
Product Manager @Kensu
"In the information age, data is critically important. Every organization needs to manage its data effectively to ensure accuracy and to prevent its data pipelines from breaking. In these fast-moving times of data engineering, how can you stay on top of this?
Data Observability for Data Engineering has the answer. Data observability is a union of techniques and methods that allow you to monitor and validate the health of your data, and this practical guide will show you how to implement it successfully in your organization.
We begin by explaining what data observability is, how it builds on data quality monitoring, and why it is essential from a data engineering perspective. Once you're familiar with the techniques and elements of data observability, you'll get hands-on with a practical Python project to reinforce what you've learned.
At the end of the book, we provide some use cases and projects for you to experiment with. By then, you will be well placed to implement Data Observability in your organization and to stop worrying about the quality of your data pipelines, giving your data engineers peace of mind."