Big Data Analytics with Hadoop and Spark: A hands-on guide to big data engineering and scalable analytics

Big Data Analytics with Hadoop and Spark: A hands-on guide to big data engineering and scalable analytics (English Edition)

Author: Shikha Mehta

  • Publisher: BPB Publications
  • Publication Date: May 13, 2026
  • Language: English
  • Print length: 384 pages
  • ISBN-10: 9365894743
  • ISBN-13: 9789365894745

Book Description

In today’s data-driven world, technologies like Hadoop and Spark, available through the Cloudera platform, have become essential for storing, processing, and analyzing big data across industries including finance, healthcare, e-commerce, and research.

This book works systematically through the ecosystem. It starts with big data fundamentals, security, and HDFS architecture, then covers MapReduce through weather and stock data case studies. Readers gain hands-on experience with the Cloudera platform, learning high-level scripting with Pig Latin and structured data warehousing with Hive, including HiveQL, the Metastore, and partitions. It then explores NoSQL storage with HBase and MongoDB, along with the CAP theorem, followed by Scala programming and Spark’s in-memory engine. You will learn to optimize queries with the Catalyst optimizer and process Parquet or JSON files using Spark SQL DataFrames. The book also covers machine learning pipelines with spark.ml for classification and clustering applications.

By the end of this book, readers will have strong conceptual clarity and practical expertise in big data analytics, enabling them to design, implement, and manage scalable data processing solutions, solve real-world data challenges, and take on professional roles in big data engineering and analytics.

What you will learn

● Understand big data concepts, architecture, ethics, and applications.

● Build scalable storage and processing using HDFS and MapReduce.

● Perform data analysis using Pig and Hive.

● Develop NoSQL solutions using HBase and MongoDB.

● Process large datasets using Apache Spark.

● Analyze data using Spark SQL and DataFrames.

● Implement machine learning using PySpark.

Who this book is for

This book is for students, researchers, and academicians, as well as aspiring big data engineers, data scientists, and software engineers. Readers should have basic programming knowledge and database fundamentals; with these, the book prepares them to use Hadoop and Spark for professional data science work and for teaching at the faculty level.

Table of Contents

1. Exploring Big Data

2. Introduction to Hadoop

3. Hadoop Distributed File System and MapReduce

4. Big Data Analysis with Cloudera

5. Stock Data Analysis with Cloudera

6. Understanding Pig for Big Data Processing

7. Operators in Pig Latin

8. Functions in Apache Pig

9. Hive-data Warehousing and SQL-like Queries

10. Data Analysis Using Hive

11. Data Storage and Processing Using HBase

12. MongoDB

13. Introduction to Spark for Big Data Processing

14. Getting Started with Scala Programming

15. Data Analysis with Spark SQL

16. Machine Learning Application Using PySpark
