Learning Data Mining with Python: Second Edition
by Robert Layton
Print Length: 358 pages
Publisher: Packt Publishing; 2nd Revised edition (28 April 2017)
Language: English
ISBN-10: 1787126781
ISBN-13: 9781787126787
Key Features
Use a wide variety of Python libraries for practical data mining purposes.
Learn how to find, manipulate, analyze, and visualize data using Python.
Step-by-step instructions on data mining techniques with Python that have real-world applications.
Book Description
This book teaches you to design and develop data mining applications using a variety of datasets, starting with basic classification and affinity analysis. It covers a large number of libraries and tools available in Python, including the Jupyter Notebook, pandas, scikit-learn, and NLTK.
You will gain hands-on experience with complex data types including text, images, and graphs. You will also discover object detection using Deep Neural Networks, which is one of the big, difficult areas of machine learning right now.
With restructured examples and code samples updated for the latest version of Python, each chapter of this book introduces you to new algorithms and techniques. By the end of the book, you will have gained insight into using Python for data mining and an understanding of both the algorithms and their implementations.
What you will learn
Apply data mining concepts to real-world problems
Predict the outcome of sports matches based on past results
Determine the author of a document based on their writing style
Use APIs to download datasets from social media and other online services
Find and extract good features from difficult datasets
Create models that solve real-world problems
Design and develop data mining applications using a variety of datasets
Perform object detection in images using Deep Neural Networks
Find meaningful insights from your data through intuitive visualizations
Compute on big data, including real-time data from the internet
About the Author
Robert Layton is a data scientist working mainly on text mining problems for industries including the finance, information security, and transport sectors. He runs dataPipeline to build algorithms for practical use, and Eurekative, which helps bring start-ups to life in regional Australia. He has presented at the last four PyCon AU conferences and at multiple international research conferences, and has been training in some capacity for five years. He has a PhD in cybercrime analytics from the Internet Commerce Security Laboratory at Federation University Australia, where he was the Inaugural Young Alumni of the Year in 2014 and is currently an Honorary Research Fellow.
You can find him on LinkedIn at https://www.linkedin.com/in/drrobertlayton and on Twitter at @robertlayton.
Robert writes regularly on data mining and cybercrime in private, consultancy, and research capacities. Robert is an Official Member of the Ballarat Hackerspace, where he helps grow the future-tech sector in regional Victoria.
Contents
Chapter 1. Getting Started with Data Mining
Chapter 2. Using Python and the Jupyter Notebook
Chapter 3. A simple affinity analysis example
Chapter 4. Product recommendations
Chapter 5. A simple classification example
Chapter 6. What is classification?
Chapter 7. Summary
Chapter 8. Summary
Chapter 9. Authorship Attribution
Chapter 10. Clustering News Articles
Chapter 11. Object Detection in Images using Deep Neural Networks
Chapter 12. Working with Big Data
Chapter 13. Next Steps…