Applied Data Science Using PySpark: Learn the End-to-End Predictive Model-Building Cycle
Authors: Ramcharan Kakarla, Sundar Krishnan, Balaji Dhamodharan, Venkata Gunnu
ASIN: B0DBBXKL4X
Publisher: Apress
Edition: Second edition
Publication Date: 2024-12-2
Language: English
Print Length: 467 pages
ISBN-13: 9798868808197
Book Description
From the Back Cover
This comprehensive guide with hand-picked examples of daily use cases will walk you through the end-to-end predictive model-building cycle with the latest techniques and tricks of the trade.
In Chapters 1, 2 & 3, we will get started with setting up the environment and the basics of PySpark, focusing on data manipulation. In Chapter 4, we will dive into the art of variable selection, demonstrating the various selection techniques available in PySpark. In Chapters 5, 6 & 7, we take you through machine learning algorithms, their implementations, and fine-tuning techniques. Chapters 8 and 9 will walk you through machine learning pipelines and the various methods available to operationalize a model and serve it through Docker or an API. Chapter 10 will demonstrate how you can unlock the power of predictive models when they are used together to create a meaningful impact on your business. Chapter 11 will introduce you to some of the most widely used and powerful modelling frameworks for unlocking real value from data.
In this new edition, you will learn predictive modelling frameworks that can quantify customer lifetime value and estimate the return on your predictive modelling investments. This edition also contains methods to measure engagement and effectively identify actionable populations for churn treatments. In addition, a dedicated chapter on experimentation design has been added, covering steps to efficiently design, conduct, test and measure the results of your models. All the code will be refreshed as needed to reflect the latest stable version of Spark.
You will:
- Get an overview of end-to-end predictive model building
- Understand multiple variable selection techniques and their implementations
- Work with operationalizing models
- Learn data science experimentation and tips
About the Author
Ramcharan Kakarla is currently Principal ML at Altice USA. He is a passionate data science and artificial intelligence advocate with 10 years of experience. He holds a master’s degree from Oklahoma State University with a specialization in data mining. He is currently pursuing a master’s in management at the University of California, Los Angeles. Prior to UCLA and OSU, he received his bachelor’s in electrical and electronics engineering from Sastra University in India. He was born and raised in the coastal town of Kakinada, India. He started his career working as a performance engineer with several Fortune 500 clients including State Farm, British Airways, Comcast and JP Morgan Chase. In his current role, he is focused on building data science solutions and frameworks leveraging big data. He has published several papers and posters in the field of predictive analytics. He served as SAS Global Ambassador for the year 2015.
Sundar Krishnan is a Senior Data Science Manager at CVS Health. He has 12+ years of extensive experience leading cross-functional data science teams and is an AI, ML, and cloud platform expert. He has a proven track record of building high-performing teams and implementing innovative AI strategies to optimize operational costs and generate substantial revenue. An expert in 0-to-1 product development, he has successfully led teams from conception to market-ready products in Gen AI and data science. Sundar was born and raised in Tamil Nadu, India, and has a bachelor’s degree from the Government College of Technology, Coimbatore. He completed his master’s at Oklahoma State University, Stillwater. He blogs about his data science work on Medium in his spare time.
Balaji Dhamodharan is an award-winning global data science leader, guiding teams to develop and implement innovative, scalable ML solutions. He currently leads the AI/ML and MLOps strategy initiatives at NXP Semiconductors. He has over a decade of experience delivering large-scale technology solutions across diverse industries. His expertise spans software engineering, enterprise AI platforms, AutoML, MLOps, and generative AI technologies. Balaji holds master’s degrees in Management Information Systems and Data Science from Oklahoma State University and Indiana University. Originally from Chennai, India, Balaji currently resides in Austin, TX, USA.
Venkata Gunnu is a Senior Executive Director of Knowledge Management and Innovation at JPM Chase. He is an executive with a successful background crafting enterprise-wide data and data science solutions, GenAI, process improvements, and data and data science-centric products. He is a concept-to-implementation strategist with demonstrated success controlling multiple projects that elevate organizational efficiency while optimizing resources. He is data-focused and analytical, with a track record of automating functions, standardizing data management protocol, and introducing new business intelligence solutions.