Spark in Action: Covers Apache Spark 3 with Examples in Java, Python, and Scala, 2nd Edition
Author: Jean-Georges Perrin
Pages: 576
Publisher: Manning Publications; 2nd edition (June 2, 2020)
The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, Second Edition, you’ll learn to take advantage of Spark’s core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in enterprises worldwide, and with Spark’s powerful and flexible Java APIs, you can reap all the benefits without first learning Scala or Hadoop.
Analyzing enterprise data starts by reading, filtering, and merging files and streams from many sources. The Spark data processing engine handles this varied volume like a champ, delivering speeds up to 100 times faster than Hadoop systems. Thanks to SQL support, an intuitive interface, and a straightforward multilanguage API, you can use Spark without learning a complex new ecosystem.
Spark in Action, Second Edition, teaches you to create end-to-end analytics applications. In this entirely new book, you’ll learn from interesting Java-based examples, including a complete data pipeline for processing NASA satellite data. And you’ll discover Java, Python, and Scala code samples hosted on GitHub that you can explore and adapt, plus appendixes that give you a cheat sheet for installing tools and understanding Spark-specific terms.
Writing Spark applications in Java
Spark application architecture
Ingestion through files, databases, streaming, and Elasticsearch
Querying distributed datasets with Spark SQL