Spark in Action: Covers Apache Spark 3 with Examples in Java, Python, and Scala, 2nd Edition

Spark in Action, Second Edition: Covers Apache Spark 3 with Examples in Java, Python, and Scala
Author: Jean-Georges Perrin
Publisher: Manning
Edition: 2nd
Publication Date: 2020-06-02
Language: English
Print Length: 576 pages
ISBN-10: 1617295523
ISBN-13: 9781617295522

Book Description


Summary
The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, Second Edition, you’ll learn to take advantage of Spark’s core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in enterprises worldwide, and with Spark’s powerful and flexible Java APIs, you can reap all the benefits without first learning Scala or Hadoop.

Foreword by Rob Thomas.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology
Analyzing enterprise data starts by reading, filtering, and merging files and streams from many sources. The Spark data processing engine handles this varied volume like a champ, delivering speeds up to 100 times faster than Hadoop systems. Thanks to SQL support, an intuitive interface, and a straightforward multilanguage API, you can use Spark without learning a complex new ecosystem.

About the book
Spark in Action, Second Edition, teaches you to create end-to-end analytics applications. In this entirely new book, you’ll learn from interesting Java-based examples, including a complete data pipeline for processing NASA satellite data. And you’ll discover Java, Python, and Scala code samples hosted on GitHub that you can explore and adapt, plus appendixes that give you a cheat sheet for installing tools and understanding Spark-specific terms.

What’s inside

Writing Spark applications in Java
Spark application architecture
Ingestion through files, databases, streaming, and Elasticsearch
Querying distributed datasets with Spark SQL

About the reader
This book does not assume previous experience with Spark, Scala, or Hadoop.

About the author
Jean-Georges Perrin is an experienced data and software architect. He is France’s first IBM Champion and has been honored for 12 consecutive years.

Table of Contents

PART 1 – THE THEORY CRIPPLED BY AWESOME EXAMPLES

1 So, what is Spark, anyway?

2 Architecture and flow

3 The majestic role of the dataframe

4 Fundamentally lazy

5 Building a simple app for deployment

6 Deploying your simple app

PART 2 – INGESTION

7 Ingestion from files

8 Ingestion from databases

9 Advanced ingestion: finding data sources and building your own

10 Ingestion through structured streaming

PART 3 – TRANSFORMING YOUR DATA

11 Working with SQL

12 Transforming your data

13 Transforming entire documents

14 Extending transformations with user-defined functions

15 Aggregating your data

PART 4 – GOING FURTHER

16 Cache and checkpoint: Enhancing Spark’s performances

17 Exporting data and building full data pipelines

18 Exploring deployment

Review

“This book reveals the tools and secrets you need to drive innovation in your company or community.”
–Rob Thomas, IBM

“An indispensable, well-paced, and in-depth guide. A must-have for anyone into big data and real-time stream processing.”
–Anupam Sengupta, GuardHat Inc.

“This book will help spark a love affair with distributed processing.”
–Conor Redmond, InComm Product Control

“Currently the best book on the subject!”
–Markus Breuer, Materna IPS

“I am a big fan of your approach to Data Engineering, your book on Spark, and loved your talks… I am training Data Engineers at my company and your Spark in Action, 2e book is a mandatory material for it!”
–Thiago de Faria, LINKIT

From the Author

This project has been an effort spread over more than three years, with lots of energy spent on research, on finding the right examples and use cases, and, finally, on enhancing the book based on feedback from editors and reviewers. It has been intense. Spark in Action, Second Edition, is overflowing with illustrations, with over 180 diagrams, as well as labs and real-life datasets, including NASA data. I deliberately focused on examples that are very close to real-life datasets, to make sure the reader is exposed to edge cases. The choice of Java as the programming language allows the book to reach more developers interested in the fields of big data and analytics.
