Big Data Visualization
by: James D. Miller
ISBN-10: 1785281941
ISBN-13: 9781785281945
Released: 2017-02-28
Pages: 304
Learn effective tools and techniques to separate big data into manageable and logical components for efficient data visualization
About This Book
This unique guide teaches you how to visualize your cluttered, huge amounts of big data with ease
It is rich with ample options and solid use cases for big data visualization, and is a must-have book for your shelf
Improve your decision-making by visualizing your big data the right way
Who This Book Is For
This book is for data analysts or those with a basic knowledge of big data analysis who want to learn big data visualization in order to make their analysis more useful. You need sufficient knowledge of big data platform tools such as Hadoop and also some experience with programming languages such as R. This book will be great for those who are familiar with conventional data visualizations and now want to widen their horizons by exploring big data visualizations.
What You Will Learn
Understand how basic analytics is affected by big data
Deep dive into effective and efficient ways of visualizing big data
Get to know various approaches (using various technologies) to address the challenges of visualizing big data
Comprehend the concepts and models used to visualize big data
Know how to visualize big data in real time and for different use cases
Understand how to integrate popular dashboard visualization tools such as Splunk and Tableau
Get to know the value and process of integrating visual big data with BI tools such as Tableau
Make sense of the visualization options for big data, based on the best-suited visualization techniques
In Detail
When it comes to big data, regular data visualization tools with basic features become insufficient. This book covers the concepts and models used to visualize big data, with a focus on efficient visualizations.
This book covers big data visualization and addresses its characteristic challenges, such as speed of access, understanding and adding context to the data, improving the quality of the data, displaying results, handling outliers, and so on. We focus on the most popular libraries for big data visualization and explore “big data oriented” tools such as Hadoop and Tableau. We will show you how data changes with different variables and for different use cases, with step-through topics such as importing data into something like Hadoop and basic analytics.
The choice of visualizations depends on the techniques best suited to big data, and we will show you the various options for big data visualizations based upon industry-proven techniques. You will then learn how to integrate popular visualization tools with graphing databases to see how huge amounts of certain data. Finally, you will find out how to display the integration of visual big data with BI using Cognos BI.
Style and approach
With the help of insightful real-world use cases, we'll tackle data in the world of big data. The scalability and sheer size of the data make big data visualizations different from normal data visualizations, and this book addresses the difficulties professionals encounter while visualizing their big data.
Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.
Contents
Big Data Visualization
Credits
About the Author
About the Reviewer
http://www.PacktPub.com
Why subscribe?
Customer Feedback
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Introduction to Big Data Visualization
An explanation of data visualization
Conventional data visualization concepts
Training options
Challenges of big data visualization
Big data
Using Excel to gauge your data
Pushing big data higher
The 3Vs
Volume
Velocity
Variety
Categorization
Such are the 3Vs
Data quality
Dealing with outliers
Meaningful displays
Adding a fourth V
Visualization philosophies
More on variety
Velocity
Volume
All is not lost
Approaches to big data visualization
Access, speed, and storage
Entering Hadoop
Context
Quality
Displaying results
Not a new concept
Instant gratifications
Data-driven documents
Dashboards
Outliers
Investigation and adjudication
Operational intelligence
Summary
2. Access, Speed, and Storage with Hadoop
About Hadoop
What else but Hadoop?
IBM too!
Log files and Excel
An R scripting example
Points to consider
Hadoop and big data
Entering Hadoop
AWS for Hadoop projects
Example 1
Defining the environment
Getting started
Uploading the data
Manipulating the data
A specific example
Conclusion
Example 2
Sorting
Parsing the IP
Summary
3. Understanding Your Data Using R
Definitions and explanations
Comparisons
Contrasts
Tendencies
Dispersion
Adding context
About R
R and big data
Example 1
Digging in with R
Example 2
Definitions and explanations
No looping
Comparisons
Contrasts
Tendencies
Dispersion
Summary
4. Addressing Big Data Quality
Data quality categorized
DataManager
DataManager and big data
Some examples
Some reformatting
A little setup
Selecting nodes
Connecting the nodes
The work node
Adding the script code
Executing the scene
Other data quality exercises
What else is missing?
Status and relevance
Naming your nodes
More examples
Consistency
Reliability
Appropriateness
Accessibility
Other Output nodes
Summary
5. Displaying Results Using D3
About D3
D3 and big data
Some basic examples
Getting started with D3
A little down time
Visual transitions
Multiple donuts
More examples
Another twist on bar chart visualizations
One more example
Adopting the sample
Summary
6. Dashboards for Big Data – Tableau
About Tableau
Tableau and big data
Example 1 – Sales transactions
Adding more context
Wrangling the data
Moving on
A Tableau dashboard
Saving the workbook
Presenting our work
More tools
Example 2
What’s the goal? – purpose and audience
Sales and spend
Sales v Spend and Spend as % of Sales Trend
Tables and indicators
All together now
Summary
7. Dealing with Outliers Using Python
About Python
Python and big data
Outliers
Options for outliers
Delete
Transform
Outliers identified
Some basic examples
Testing slot machines for profitability
Into the outliers
Handling excessive values
Establishing the value
Big data note
Setting outliers
Removing Specific Records
Redundancy and risk
Another point
If Type
Reused
Changing specific values
Setting the Age
Another note
Dropping fields entirely
More to drop
More examples
A themed population
A focused philosophy
Summary
8. Big Data Operational Intelligence with Splunk
About Splunk
Splunk and big data
Splunk visualization – real-time log analysis
IBM Cognos
Pointing Splunk
Setting rows and columns
Finishing with errors
Splunk and processing errors
Splunk visualization – deeper into the logs
New fields
Editing the dashboard
More about dashboards
Summary