Modern Data Pipelines Testing Techniques


A Visual Guide
60-day guarantee · English · PDF

Book Description


A visual guide for understanding Modern Data Pipelines Testing Techniques
Any software product deteriorates rapidly without disciplined testing. Testing data pipelines, however, can be a hellish experience for new data developers, and existing training on the subject gives only a scattered view of the available techniques. This book offers a full view of modern data pipeline testing techniques in a highly visual and coherent body of work. I hope it helps you in your career.
Why bother testing data pipelines? The main reason is that Data Engineering, Data Science, and Machine Learning Data Pipelines are the cornerstones of any productive data product. Billions of budget dollars regularly rely on the excellence of the data scientists, data engineers, and machine learning engineers behind the countless software data pipelines that inform critical business decisions.
This is a work in progress, released Continuous Delivery style. After purchasing this work, you will be notified each time a new version is ready.
Table of Contents
Chapter 1: Testing Your Patience
Data Pipeline Transitive Failure Modes: The Reality Check
Bad Data Devs Lifestyle
TDD + CICD to the rescue?
Objections to TDD for Data Work
Sources of Data Validation Complexity
The Data Product Promise No One Can Keep
Fighting Against The Manual Auto-Pilot
Observability vs. Testing vs. Monitoring
Test-Driven Theater vs Continuous Delivery Theater
Chapter 2: Core Types of Data Pipeline Tests
Discovering Holistic Testing
Types of Tests: Test Boundaries
Types of Tests: Test Sizes
Types of Tests: Data Product Testing Quadrant
Types of Tests: Write-Audit-Publish
Types of Tests: Testing Grid
Types of Tests: Code Scale vs Data Scale Testing Grid
Types of Tests: Structuring Data Quality Tests
Types of Tests: Pointwise vs Pairwise vs Composite
Types of Tests: Testing SQL Queries
Types of Tests: Assembling The Testing Parts + Bug Tests
Feedback Levels vs. Testing Scales
Test Pyramids and Test Summits
Chapter 3: Supporting Components for Data Pipelines Tests
Supporting Pattern: Static vs Dynamic Test Data Generation
Supporting Pattern: Data Copies, Clones, and Snapshots
Supporting Pattern: Reverse Data Plane to Support Testing
Supporting Pattern: Parallel Dev-Test Data Streams
Chapter 4: Testing Legacy Data Pipelines
Legacy Testing Pattern I: Before Touching Anything — End to End Characterization Tests
Legacy Testing Pattern III: Semantic Monitoring
Legacy Testing Pattern IV: Data Processing Platform Alerts
Legacy Testing Pattern V: Co-Control Data Contracts
Legacy Testing Pattern VI: Legacy Pipelines Golden Rule
Chapter 5: Design for Testability
Designing Hidden Data Pipelines
Designing Temporally Decoupled Data Pipelines
Designing Debuggable Data Pipelines
Designing Encapsulated Data Pipelines
Designing Right-Tool-For-The-Job Data Pipelines
Designing Feature Engineering Data Pipelines
Designing Iceberg Data Pipelines
Chapter 6: Data-oriented Development Environments
What Can You Do From Your Laptop?
Optimal Data Development Environment
Fundamental Data Dev Repo Components
Coding Timeline vs Data Job Timeline
Chapter 7: Deploying Data Pipelines
Useful CICD Workflows for Data Pipelines
Data Pipeline Release Lifecycle
Testable Scheduled Jobs CICD Workflow
Database Schema Versioning Rationale
Database Schema Versioning Golden Rule
Database Schema Migrations – Fields Strategies
Database Schema Migrations – Hidden Things To Test
Chapter 8: Tips for Data Organizations
Data Organization Testability Score Cards vs Your Average Data Dev
When To Give Up On Testing Data Pipelines
Actors In A Data Product
Organizational Friction To Disable Data Pipelines Testability
Organizational Changes To Enable Data Pipelines Testability
Chapter 9: Is This It?
With Great Responsibility Comes Great Capped Autonomy
Data Dev Autonomy Destruction Cookbook
The Fear Of Obsolescence
Outro
References
