Data professionals are confronting the most disruptive change since relational databases came into widespread use in the 1980s. SQL is still a major tool for data analytics, but conventional relational database management systems can’t handle the increasing size and complexity of today’s datasets. This updated edition teaches you best practices for Greenplum Database, the open source massively parallel processing (MPP) database that accommodates large sets of nonrelational and relational data.
Marshall Presser, field CTO at Pivotal, introduces Greenplum’s approach to data analytics and data-driven decisions, beginning with its shared-nothing architecture. IT managers, developers, data analysts, system architects, and data scientists will all gain from exploring data organization and storage, data loading, query execution, and in-database analytics. Discover how MPP and Greenplum will help you go beyond the traditional data warehouse.
This ebook covers:
Greenplum features, use case examples, and techniques for optimizing its use
Four Greenplum deployment options to help you balance security, cost, and time to usability
Why each networked node in Greenplum’s architecture includes an independent operating system, memory, and storage
Additional tools for monitoring, managing, securing, and optimizing query responses in the Pivotal Greenplum commercial database