Learning Apache Apex: Real-time streaming applications with Apex
Authors: Thomas Weise, Munagala V. Ramanath, David Yan, Kenneth Knowles
ISBN-10: 1788296400
ISBN-13: 9781788296403
Release date: 2017-11-30
Pages: 290
Book Description
Apache Apex is a next-generation stream processing framework designed to operate on data at large scale, with minimum latency, maximum reliability, and strict correctness guarantees.
Half of the book consists of Apex applications that show key aspects of data processing pipelines, such as connectors for sources and sinks and common data transformations. The other half is split evenly between explaining the Apex framework and covering how to tune, test, and scale Apex applications.
Much of our economic world depends on growing streams of data, such as social media feeds, financial records, data from mobile devices, sensors and machines (the Internet of Things – IoT). The projects in the book show how to process such streams to gain valuable, timely, and actionable insights. Traditional use cases, such as ETL, that currently consume a significant chunk of data engineering resources are also covered.
The final chapter shows you future possibilities emerging in the streaming space, and how Apache Apex can contribute to them.
1: INTRODUCTION TO APEX
2: GETTING STARTED WITH APPLICATION DEVELOPMENT
3: THE APEX LIBRARY
4: SCALABILITY, LOW LATENCY, AND PERFORMANCE
5: FAULT TOLERANCE AND RELIABILITY
6: EXAMPLE PROJECT – REAL-TIME AGGREGATION AND VISUALIZATION
7: EXAMPLE PROJECT – REAL-TIME RIDE SERVICE DATA PROCESSING
8: EXAMPLE PROJECT – ETL USING SQL
9: INTRODUCTION TO APACHE BEAM
10: THE FUTURE OF STREAM PROCESSING
What You Will Learn
Put together a functioning Apex application from scratch
Scale an Apex application and configure it for optimal performance
Understand how to deal with failures via the fault tolerance features of the platform
Use Apex via other frameworks such as Beam
Understand the DevOps implications of deploying Apex
Thomas Weise
Thomas Weise is the Apache Apex PMC Chair and cofounder at Atrato. Earlier, he worked at a number of other technology companies in the San Francisco Bay Area, including DataTorrent, where he was a cofounder of the Apex project. Thomas is also a committer to Apache Beam and has contributed to several other ecosystem projects. He has been working on distributed systems for 20 years and has been a speaker at international big data conferences. Thomas received the degree of Diplom-Informatiker (MSc in computer science) from TU Dresden, Germany. He can be reached on Twitter at @thweise.
Munagala V. Ramanath
Dr. Munagala V. Ramanath received his PhD in Computer Science from the University of Wisconsin, USA, and an MSc in Mathematics from Carleton University, Ottawa, Canada. After that, he taught Computer Science courses as Assistant/Associate Professor at the University of Western Ontario in Canada for a few years, before transitioning to the corporate sphere. Since then, he has worked as a senior software engineer at a number of technology companies in California, including SeeBeyond, EMC, Sun Microsystems, DataTorrent, and Cloudera. He has published papers in peer-reviewed journals in several areas, including code optimization, graph theory, and image processing.