Data pre-processing for machine learning in Python
28 Jun. 2022
by Gianluca Malato (Author)
ASIN: B0B4HX2M44
Publisher: Independently published (28 Jun. 2022)
Language: English
Print Length: 84 pages
ISBN-13: 9798837800696
Book Description
In this book, the author shows how to use the Python programming language to perform pre-processing tasks in machine learning projects. Pre-processing is the set of transformations applied to a dataset before it can be used to train a machine learning model. It is a very important phase of a data science pipeline: poor pre-processing leads to poor model performance, while good pre-processing lets the model learn properly.
The pre-processing transformations shown in this book are:
Data cleaning
Encoding of the categorical variables (one-hot encoding and ordinal encoding)
Principal Component Analysis
Scaling (normalization, standardization, robust scaling)
Binarizing
Binning
Power transformations
Filter-based feature selection
Oversampling using SMOTE
All the transformations are described both in theory and in practice using the Python programming language and its powerful scikit-learn library.
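To give a flavour of what a few of the listed transformations look like in scikit-learn, here is a minimal sketch. It is not taken from the book: the toy dataset, the column names and the choice of steps (median imputation as a simple form of data cleaning, standardization, one-hot encoding) are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy dataset: a numeric column with one missing value
# and a categorical column.
df = pd.DataFrame({
    "age": [25.0, 32.0, np.nan, 41.0],
    "city": ["Rome", "Milan", "Rome", "Turin"],
})

# Numeric columns: fill missing values with the median, then standardize
# (zero mean, unit variance).
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Apply the numeric steps to "age" and one-hot encode "city".
preprocess = ColumnTransformer(
    [
        ("num", numeric, ["age"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ],
    sparse_threshold=0.0,  # always return a dense NumPy array
)

X = preprocess.fit_transform(df)
print(X.shape)  # one standardized column plus one indicator column per city
```

Wrapping the steps in a single ColumnTransformer means the same fitted transformations can later be applied to new data with `preprocess.transform(...)`, which helps keep training and prediction consistent.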
About the author
Gianluca Malato, born in 1986, is an Italian data scientist, teacher and author. In 2010, he received his Master’s Degree cum laude in Theoretical Physics of disordered systems from “La Sapienza” University of Rome (thesis advisors: Giorgio Parisi and Tommaso Rizzo). He has worked for years as a data architect, project manager, data analyst and data scientist for a large Italian company.
He is the founder of yourdatateacher.com, an online school where he teaches Data Science, Machine Learning, R, Python and SQL through online courses and individual online training programs.
He has published several articles about Data Science on his blog at yourdatateacher.com and in the online publication Towards Data Science (towardsdatascience.com). He received the “Top Writer” mention on Medium.com in the “Artificial Intelligence” category for his articles.
He has written several fiction books in Italian, focusing on the horror, thriller and fantasy genres.