Text Mining in Practice with R
by: Ted Kwartler
ISBN-10: 1119282012
ISBN-13: 9781119282013
Edition: 1
Released: 2017-07-31
Pages: 312
Book Description
A reliable, cost-effective approach to extracting priceless business information from all sources of text
Excavating actionable business insights from data is a complex undertaking, and that complexity is magnified by an order of magnitude when the focus is on documents and other text information. This book takes a practical, hands-on approach to teaching you a reliable, cost-effective way to mine the vast, untold riches buried within all forms of text using R.
Author Ted Kwartler clearly describes all of the tools needed to perform text mining and shows you how to use them to identify practical business applications, so you can get your own text mining efforts started right away. With the help of numerous real-world examples and case studies from industries ranging from healthcare to entertainment to telecommunications, he demonstrates how to execute an array of text mining processes and functions, including sentiment scoring, topic modelling, predictive modelling, extracting clickbait from headlines, and more. You’ll learn how to:
Identify actionable social media posts to improve customer service
Use text mining in HR to identify candidate perceptions of an organisation, match job descriptions with resumes, and more
Extract priceless information from virtually all digital and print sources, including the news media, social media sites, PDFs, and even JPEG and GIF image files
Make text mining an integral component of marketing in order to identify brand evangelists, impact customer propensity modelling, and much more
Most companies’ data mining efforts focus almost exclusively on numerical and categorical data, while text remains a largely untapped resource. Especially in a global marketplace where being first to identify and respond to customer needs and expectations imparts an unbeatable competitive advantage, text represents a source of immense potential value. Unfortunately, there has been no reliable, cost-effective technology for extracting analytical insights from the huge and ever-growing volume of text available online and in other digital sources, as well as from paper documents, until now.
Contents
Chapter 1 What Is Text Mining?
Chapter 2 Basics Of Text Mining
Chapter 3 Common Text Mining Visualizations
Chapter 4 Sentiment Scoring
Chapter 5 Hidden Structures: Clustering, String Distance, Text Vectors And Topic Modeling
Chapter 6 Document Classification: Finding Clickbait From Headlines
Chapter 7 Predictive Modeling: Using Text For Classifying And Predicting Outcomes
Chapter 8 The OpenNLP Project
Chapter 9 Text Sources