Streaming Data Mesh: A Model for Optimizing Real-Time Data Services
by Hubert Dulay (Author), Stephen Mooney (Author)
Publisher: O’Reilly Media (June 20, 2023)
Language: English
Print Length: 223 pages
ISBN-10: 1098130723
ISBN-13: 9781098130725
Book Description
Data lakes and warehouses have become increasingly fragile, costly, and difficult to maintain as data gets bigger and moves faster. Data meshes can help your organization decentralize data, giving ownership back to the engineers who produced it. This book provides a concise yet comprehensive overview of data mesh patterns for streaming and real-time data services.
Authors Hubert Dulay and Stephen Mooney examine the vast differences between streaming and batch data meshes. Data engineers, architects, data product owners, and those in DevOps and MLOps roles will learn steps for implementing a streaming data mesh, from defining a data domain to building a good data product. Through the course of the book, you’ll create a complete self-service data platform and devise a data governance system that enables your mesh to work seamlessly.
With this book, you will:
Design a streaming data mesh using Kafka
Learn how to identify a domain
Build your first data product using self-service tools
Apply data governance to the data products you create
Learn the differences between synchronous and asynchronous data services
Implement self-services that support decentralized data
From the Preface
Welcome to this first edition of Streaming Data Mesh! This is your guide to understanding and building a streaming data mesh that meets all of the pillars of a data mesh.
Data mesh is one of the most popular architectures for data platforms that many are exploring today. This book will help you get a full understanding of this self-servicing data platform in a streaming context. Today, batch processing dominates all extract, transform, and load (ETL) processes in most businesses. This book offers a different perspective on data pipelines, applying the concepts you already understand from batch ETL to streaming ETL in the context of a data mesh.
This book is designed to help you understand the essential concepts around streaming data mesh—the concepts, architectures, and technologies at its core. The book covers all the essential topics related to streaming mesh, from the basics of data architecture, to the use of big data tools for data warehousing, to business-oriented approaches for streaming data mesh architectures. Additionally, we will look at a stack of services involved in a successful streaming data mesh project.
This book does not require you to have prior knowledge of the pillars that make up a data mesh. We will briefly introduce the pillars at a very high level and define them with streaming specifically in mind. If you feel you need to understand data mesh in more detail, please refer to Zhamak Dehghani’s book, Data Mesh (O’Reilly).
Who Should Read This Book
This book is written for anyone who is interested in learning more about streaming data mesh, combining the exciting work done in data mesh with real-time streaming for data transformation, data product definition, and data governance. This book is also useful for data engineers, data analysts, data scientists, software architects, and product owners who want to implement a streaming data architecture for their projects. This book is useful for those who wish to become familiar with streaming data technologies and best practices for integrating them, at scale, into their projects.
Why We Wrote This Book
We wrote a book on streaming data mesh because we believe it has the potential to revolutionize the way companies manage and process their data. Streaming data mesh provides a platform that unites messaging, storage, and processing capabilities into one comprehensive solution. By increasing data reliability and coverage while reducing costs, this platform enables companies to significantly accelerate their digital transformation and become data-driven organizations. With this book, we want to make sure our readers understand the key principles, the latest approaches, and the dos and don’ts of streaming data mesh. We also want to provide step-by-step guidance for setting up and operating a streaming data mesh, taking into account best practices.
Review
“Streaming Data Mesh daringly puts event streaming at the heart of an original, vividly described Data Mesh architecture – and wins! A must-read.”
— Ralph Matthias Debusmann,
CTO@Forecasty.AI
“Streaming Data Mesh is a comprehensive guide that masterfully explores the transformative potential of streaming data mesh architectures. With its practical, actionable insights and step-by-step guidance, this book is an essential read for data professionals seeking to revolutionize data management and processing in real time.”
— Yingjun Wu
Founder and CEO @RisingWave Labs
“Streaming Data Mesh is for anyone facing modern data challenges. Hubert Dulay extends the work Zhamak Dehghani started with straightforward, real-world strategies to build a data mesh with streaming data systems. This book covers architectures, software, and organizational patterns to help enterprises tackle the next generation of data challenges and does so in a way that’s both approachable and deep.”
— Chris Matta
Field CTO @ Confluent
About the Authors
Hubert Dulay is a systems & data engineer at Confluent. A veteran engineer with over 20 years of experience in big & fast data and MLOps, Hubert has consulted for many financial institutions, healthcare organizations, and telecommunications companies, providing simple solutions that solved many data problems.
Stephen Mooney is an independent data scientist and data engineer serving multiple clients. With over 20 years of experience in big data, MLOps, and data science, he has worked at many major companies across healthcare, retail, and the public sector. Through this experience, Stephen has delivered many technical and functional projects throughout the entire product lifecycle.