Apache Airflow Best Practices: A practical guide to orchestrating data workflow with Apache Airflow

Authors: Dylan Intorf, Dylan Storey, Kendrick van Doorn

Publisher: Packt Publishing

Edition: N/A

Publication Date: 2024-10-31

Language: English

Print length: 188 pages

ISBN-10: 1805123750

ISBN-13: 9781805123750

Book Description

Confidently orchestrate your data pipelines with Apache Airflow by applying industry best practices and scalable strategies

Key Features

Understand the steps for migrating from Airflow 1.x to 2.x and explore the new features and improvements in version 2.x
Learn Apache Airflow workflow authoring through real-world use cases
Uncover strategies to operationalize your Airflow instance and pipelines for resilient operations and high throughput
Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Data professionals face the monumental task of managing complex data pipelines, orchestrating workflows across diverse systems, and ensuring scalable, reliable data processing. This definitive guide to mastering Apache Airflow, written by experts in engineering, data strategy, and problem-solving across tech, financial, and life sciences industries, is your key to overcoming these challenges. It covers everything from the basics of Airflow and its core components to advanced topics such as custom plugin development, multi-tenancy, and cloud deployment.

Starting with an introduction to data orchestration and the significant updates in Apache Airflow 2.0, this book takes you through the essentials of DAG authoring, managing Airflow components, and connecting to external data sources. Through real-world use cases, you’ll gain practical insights into implementing ETL pipelines and machine learning workflows in your environment. You’ll also learn how to deploy Airflow in cloud environments, tackle operational considerations for scaling, and apply best practices for CI/CD and monitoring.

By the end of this book, you’ll be proficient in operating and using Apache Airflow, authoring high-quality workflows in Python for your specific use cases, and making informed decisions crucial for production-ready implementation.

What you will learn

Explore the new features and improvements in Apache Airflow 2.0
Design and build data pipelines using DAGs
Implement ETL pipelines, ML workflows, and other advanced use cases
Develop and deploy custom plugins and UI extensions
Deploy and manage Apache Airflow in cloud environments such as AWS, GCP, and Azure
Describe a path for the scaling of your environment over time
Apply best practices for monitoring and maintaining Airflow

Who this book is for

This book is for data engineers, developers, IT professionals, and data scientists who want to optimize workflow orchestration with Apache Airflow. It’s perfect for those who recognize Airflow’s potential and want to avoid common implementation pitfalls. Whether you’re new to data, an experienced professional, or a manager seeking insights, this guide will support you. A functional understanding of Python, some business experience, and basic DevOps skills are helpful. While prior experience with Airflow is not required, it is beneficial.

Table of Contents

Getting Started with Airflow 2.0
Core Airflow Concepts
Components of Airflow
Basics of Airflow and DAG Authoring
Connecting to External Sources
Extending Functionality with UI Plugins
Writing and Distributing Custom Providers
Orchestrating a Machine Learning Workflow
Using Airflow as a Driving Service
Airflow Ops: Development and Deployment
Airflow Ops Best Practices: Observation and Monitoring
Multi-Tenancy in Airflow
Migrating Airflow

About the Authors

Dylan Intorf is a seasoned technology leader with a B.Sc. in computer science from Arizona State University. With over a decade of experience in software and data engineering, he has delivered custom, tailored solutions to the technology, financial, and insurance sectors. Dylan’s expertise in data and infrastructure management has been instrumental in optimizing Airflow deployments and operations for several Fortune 25 companies.

Dylan Storey holds a B.Sc. and M.Sc. in biology from California State University, Fresno, and a Ph.D. in life sciences from the University of Tennessee, Knoxville, where he specialized in leveraging computational methods to study complex biological systems. With over 15 years of experience, Dylan has successfully built, grown, and led teams to drive the development and operation of data products across various scales and industries, including many of the top Fortune-recognized organizations. He is also an expert in leveraging AI and machine learning to automate processes and decisions, enabling businesses to achieve their strategic goals.

Kendrick van Doorn is an accomplished engineering and business leader with a strong foundation in software development, honed through impactful work with federal agencies and consulting technology firms. With over a decade of experience in crafting technology and data strategies for leading brands, he has consistently driven innovation and efficiency. Kendrick holds a B.Sc. in computer engineering from Villanova University, an M.Sc. in systems engineering from George Mason University, and an MBA from Columbia University.
