White Paper

EMC SOLUTIONS GROUP

Abstract

This white paper discusses the features, benefits, and use of Aginity Workbench for EMC® Greenplum® – a comprehensive management and development tool, specially tailored for the features and architecture of the EMC Greenplum Database.

August 2011

EMC GREENPLUM MANAGEMENT ENABLED BY AGINITY WORKBENCH
A Detailed Review

Copyright © 2011 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

All trademarks used herein are the property of their respective owners.

Part Number: H8762

Table of contents

Executive summary
    Business case
    Solution overview
    Key benefits

Introduction
    Purpose
    Scope
    Audience
    Terminology

Technology overview
    Overview
    Aginity Workbench
    EMC Greenplum Database

Configuration
    Overview
    Environment diagram
    Greenplum environment description
        EMC Greenplum Master Server
        EMC Greenplum Segment Servers

Operational scenarios
    Overview
    List of scenarios
    Scenario 1: Browse objects in the Greenplum Database
    Scenario 2: Examine data distribution in the Greenplum Database
    Scenario 3: Identify poorly performing queries and optimize performance
    Scenario 4: Examine the status of Greenplum segments
    Scenario 5: Optimize space usage in a Greenplum Database
    Scenario 6: Examine roles and resource queues
    Scenario 7: Import or export data into or out of a database

Conclusion
    Summary
    Findings

References
    White papers
    Product documentation
    Other information

Executive summary

Business case

The EMC® Greenplum® Database is a high-performance data warehouse system that employs a massively parallel processing (MPP) architecture – many servers working in parallel on database tasks.

While the details of the architecture and operation are largely hidden from database users, database administrators (DBAs) and developers often need access to these details to check system health, ensure optimal performance, and develop business analytics quickly and easily to derive value from the data in the warehouse.

Standard query and DBA tools fall short of providing visibility into the features of parallel-processing architecture in general, and the unique features of the Greenplum Database in particular.

Solution overview

Aginity Workbench for EMC Greenplum (Aginity Workbench) offers a simple and efficient method of managing a Greenplum Database.

Aginity Workbench gives you a single point of access to manage, monitor, and develop a Greenplum Database, by offering a range of tools and functions that look deep into the Greenplum architecture.

With Aginity Workbench, you can:

• Examine the operational status of all segments

• Browse all objects in the Greenplum Database and make modifications

• Run multiple queries and export results to common file formats including Microsoft Excel

• Generate SQL and DDL with drag-and-drop ease

• Analyze query plans

• Quickly find tables that should be vacuumed to free up database resources

• See how primary and mirror Segment Instances are distributed across the Segment Servers

• Graphically view table distribution and easily spot distribution skew

• Easily redistribute data

Aginity Workbench brings a new level of insight into the Greenplum Database that no other graphical user interface (GUI) tool can provide.

Key benefits

Benefits of using Aginity Workbench include:

• Ease of use - With a single access point from a user-friendly GUI, you require less time and effort to accomplish daily tasks with the Greenplum Database.

• Access to individual components allows for detailed diagnostics - You can analyze, test, and reset the database servers more quickly, which reduces downtime.

• Optimization of database performance - You can adjust the database settings to maximize its performance.

• Reduction of user errors - Developers can use the built-in functions instead of user-written scripts, which reduces errors and time spent on scripting.

Introduction

Purpose

The purpose of this white paper is to examine the functionality of the Aginity Workbench and demonstrate the benefits of using it to access, manipulate, and monitor a Greenplum Database.

Scope

This white paper describes the features and benefits of using Aginity Workbench in a Greenplum Database environment and describes the functionality of the main features of the product.

This white paper does not provide configuration information for installing Aginity Workbench into a Greenplum environment.

Audience

This white paper is intended for EMC employees, partners, customers, and anyone interested in using Aginity Workbench to manage a Greenplum Database.

Terminology

This white paper includes the following terminology.

Table 1. Terminology

Term | Definition

Analytics | Analytics is the study of operational data using statistical analysis with a goal of identifying and using patterns to optimize business performance.

Business intelligence | Business intelligence is the effective use of information assets to improve the profitability, productivity, or efficiency of a business. Frequently, IT professionals use this term to refer to the business applications and tools that enable such information usage.

DDL | Data Definition Language is the syntax that is used to define and create objects in a relational database.

Master Server | In an EMC Greenplum Database, the Master Server or Host controls the operation of the entire system and is the main connection point for external clients accessing the database. The Master Server distributes incoming queries to the Segment Servers, gathers the results, and returns them to the client.

Massively parallel processing (MPP) | MPP is the coordinated processing of data by multiple machines that work together on a task. In a shared-nothing MPP architecture, such as EMC Greenplum, each machine has its own memory and storage and is not choked by negotiation of shared resources.

Segment Server | In an EMC Greenplum Database, a Segment Server is one of the worker nodes/servers that is used to do the work in the MPP deployment.

Shared-nothing architecture | Shared-nothing is a distributed computing architecture made up of a collection of independent, self-sufficient servers. This is in contrast to a traditional central computer that hosts all information and processing in a single location.

SQL | Structured Query Language is the syntax that is used to access data from a relational database.

Technology overview

Overview

The primary components used in this environment are:

• Aginity Workbench

• EMC Greenplum Database

Aginity Workbench

Aginity Workbench makes developers and DBAs more productive by using tools that give new access and insight into the Greenplum Database and Greenplum Data Computing Appliance. Created by and for Aginity’s own developers, Aginity Workbench is a client-based application that communicates with the Greenplum Database and has a deep understanding of the Greenplum internal architecture.

For developers, Aginity Workbench has an intuitive interface for creating, managing, and tracking both individual SQL queries and entire databases. Sophisticated tools help developers analyze and tune queries for maximum performance. Results can be easily viewed or exported to other formats, such as Microsoft Excel, for further use.

For DBAs, Aginity Workbench provides graphical information on important properties such as node status, database size and bloat, and table distribution and skew. Built-in functions assist with generating the commands used to maintain and optimize the database operation and health.

EMC Greenplum Database

EMC Greenplum Database is a shared-nothing, MPP architecture that has been designed for business intelligence and analytical processing. In this architecture, each server node acts as a self-contained database management system that owns and manages a distinct portion of the overall data. The system automatically distributes data and parallelizes query workloads across all available hardware.

The core shared-nothing MPP architecture enables massive data storage, loading, and processing with linear scalability. Adaptive services provide worldwide businesses with high availability, workload management, and online expansion of capacity. Key product features enable petabyte-scale loading, hybrid storage (row or column) to best fit the unique needs of each analytical use case, and embedded support for SQL, MapReduce, and programmable analytics. In addition, all major third-party analytic and administration tools are supported through standard client interfaces.

The core principle of the EMC Greenplum Database is to move the processing dramatically closer to the data and its users. This effectively enables the computational resources to process every query in a fully parallel manner, use all storage connections simultaneously, and flow data efficiently between resources as the query plan dictates. The result is that complex processing can be pushed down in close proximity to the data for maximum efficiency and incredible performance.
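
How a shared-nothing system might assign each row to one node by hashing a distribution key can be sketched in a few lines of Python. This is an illustration only: the CRC32 hash, the function names, and the segment count used in the usage note are assumptions made for the example, and the hashing that the Greenplum Database performs internally is not described in this paper.

```python
import zlib
from collections import Counter


def segment_for(key, num_segments):
    """Map a distribution-key value to a segment number (illustrative hash only)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_segments


def rows_per_segment(keys, num_segments):
    """Count how many rows would land on each segment for the given key values."""
    counts = Counter(segment_for(key, num_segments) for key in keys)
    return [counts.get(segment, 0) for segment in range(num_segments)]
```

Calling rows_per_segment(range(1000), 4) spreads 1,000 distinct key values over four hypothetical segments, whereas a list of identical key values places every row on a single segment, which is the skew examined in Scenario 2.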

Configuration

Overview

Aginity Workbench is a Microsoft Windows-based tool and can attach to any Greenplum Database. Aginity Workbench uses a native EMC Greenplum connection from the Microsoft Windows client to the Greenplum Database.

Aginity Workbench is a .NET application and is currently supported on the following platforms:

• Windows XP (32-bit)

• Windows 7 (32-bit and 64-bit)

• Windows Server 2003 (32-bit and 64-bit)

• Windows Server 2008 (32-bit and 64-bit)

In this white paper, several operational scenarios are described to show how the Aginity Workbench integrates with the Greenplum Database and makes it easier for you to manage the system.

Environment diagram

Figure 1 shows a generic Greenplum environment being managed by Aginity Workbench.

Figure 1. Aginity Workbench in a generic Greenplum environment

Aginity Workbench runs on a Windows client that has a connection to the Greenplum Master Server through the data center network. You can use Aginity Workbench to develop and analyze queries, as well as maintain and optimize the database.

Greenplum environment description

EMC Greenplum Master Server

The Greenplum Master Server is the access point for all user requests to the Greenplum Database and it also handles all coordination of the Segment Servers.

EMC Greenplum Segment Servers

The Greenplum Segment Servers are the workers of the Greenplum Database and perform all MPP tasks.

Operational scenarios

Overview

This section details some common operational scenarios of the Aginity Workbench that you can use to manage the Greenplum Database.

List of scenarios

Aginity Workbench was exercised in the following scenarios:

• Scenario 1: Browse objects in the Greenplum Database

• Scenario 2: Examine data distribution in the Greenplum Database

• Scenario 3: Identify poorly performing queries and optimize performance

• Scenario 4: Examine the status of Greenplum segments

• Scenario 5: Optimize space usage in a Greenplum Database

• Scenario 6: Examine roles and resource queues

• Scenario 7: Import or export data into or out of a database

Scenario 1: Browse objects in the Greenplum Database

The purpose of this scenario is to expand schemas to view tables, columns, views, stored procedures, and other database objects.

A key function of any database tool is to simply allow browsing and examination of database objects. Aginity Workbench has a familiar tree structure to “walk” into the hierarchy of the database.

Figure 2 shows the top-level view of a Greenplum Database showing the databases - and their sizes - in the system.

Figure 2. Aginity Workbench tree structure

Figure 3 shows a database expanded to display database objects. The view displays Greenplum-specific objects and information such as Partitions and the Distributed By clause in a table definition. This information is typically missed by tools that do not understand the Greenplum architecture.

Figure 3. Expanded database showing database objects

Each of the objects has a robust context menu that provides many useful functions that DBAs and developers can use to work more efficiently. Figure 4 shows the ability to quickly construct a Select statement for a particular table.

Figure 4. Select statement script

The resulting Select statement can be edited as desired and then executed. Additional menu selections will build Insert, Update, and Delete statements as well as the DDL commands to create the table. These commands can be sent to the workbench query window as well as to the clipboard for pasting into other programs. These shortcut functions are handy for both initial design as well as reverse engineering of existing designs.

Note: Commands are shown in the menu only if they are relevant to the object.

Scenario 2: Examine data distribution in the Greenplum Database

The purpose of this scenario is to:

• Check the data distribution of tables to determine how well the data is balanced across all the Segment Servers

• Identify a poorly distributed table and redistribute the data for better query performance

Figure 5 shows a poor table distribution.

Figure 5. Query results showing poor table distribution
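
One way to put a number on skew of the kind Figure 5 illustrates is to compare the fullest segment with the average segment. The following is a minimal sketch and not the calculation Aginity Workbench performs; the function name and the sample row counts in the usage note are hypothetical. In Greenplum, per-segment row counts for a table can be obtained by grouping on the gp_segment_id system column.

```python
def skew_ratio(rows_per_segment):
    """Return max/mean row count; 1.0 means a perfectly even distribution."""
    if not rows_per_segment:
        raise ValueError("need at least one segment")
    total = sum(rows_per_segment)
    if total == 0:
        return 1.0  # an empty table is trivially balanced
    mean = total / len(rows_per_segment)
    return max(rows_per_segment) / mean
```

For four segments holding 250 rows each the ratio is 1.0; for hypothetical counts of 970, 10, 10, and 10 it is 3.88, meaning the busiest segment holds almost four times its fair share of the table.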

To change the table distribution, choose the Change distribution option under Advanced, as shown in Figure 6.

Figure 6. Select Change distribution menu option

As shown in Figure 7, you can choose one or more of the Available Columns by which to redistribute the table. In this example, proc_id was selected.

While Aginity Workbench makes it easy to change the distribution key, it is up to you to choose the column (or columns) that will actually result in a better distribution of the data. Selecting multiple columns for a distribution key makes a composite key from those columns.

Figure 7. Select redistribution criteria and execute command

After clicking OK, Aginity Workbench provides you with the commands that perform the redistribution. As redistribution is a significant activity on all the data in a table, you must manually verify and start the execution of the command.
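
This paper does not reproduce the generated commands. As a rough sketch, Greenplum changes the distribution key of an existing table with ALTER TABLE ... SET DISTRIBUTED BY, and a statement of that form can be assembled as follows. The helper names and the table name process_log are hypothetical; proc_id is the column selected in the example above. The commands that Aginity Workbench actually generates may differ.

```python
import re

# Only simple unquoted identifiers are accepted in this sketch
# (no schema-qualified or quoted names).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name):
    """Reject anything that is not a plain identifier before it reaches the SQL text."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def redistribute_sql(table, columns):
    """Build an ALTER TABLE ... SET DISTRIBUTED BY statement for one or more columns."""
    if not columns:
        raise ValueError("at least one distribution column is required")
    column_list = ", ".join(_check_identifier(column) for column in columns)
    return f"ALTER TABLE {_check_identifier(table)} SET DISTRIBUTED BY ({column_list});"
```

Passing more than one column produces a composite distribution key, matching the multi-column selection described for Figure 7.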

Choosing Show Distribution again now shows the results of this redistribution activity.

Figure 8 shows the successful completion of the table redistribution.

Figure 8. Successful completion of redistribution showing good table distribution

Scenario 3: Identify poorly performing queries and optimize performance

The purpose of this scenario is to:

• Identify poorly performing queries

• Examine the Explain Plan for the query and determine the reason for the poor performance

• Optimize the query and verify that it performs better

To identify poorly performing queries, open the Object menu and, under Database, choose Show Query History. Figure 9 shows the Query History window, which provides several filters to narrow down the list. The Duration column visualizes query duration for ease of interpretation.

Figure 9. Query History
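
The same kind of narrowing can be reproduced outside the GUI on any exported list of queries. The sketch below assumes that each history entry is a pair of query text and duration in seconds; that layout, the function name, and the default limit are hypothetical and are not the format Aginity Workbench uses.

```python
def slowest_queries(history, min_seconds=0.0, limit=10):
    """Return up to `limit` entries that ran at least `min_seconds`, slowest first."""
    candidates = [entry for entry in history if entry[1] >= min_seconds]
    return sorted(candidates, key=lambda entry: entry[1], reverse=True)[:limit]
```

Sorting by duration before truncating keeps the longest-running statements at the top, which is where tuning effort usually pays off first.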

After a query is selected, the context menu enables you to choose Explain SQL Statement, which shows the full query and the query plan. It also provides the output of an Explain Analysis of the query.

Figure 10 shows the Explain Plan for the selected query. However, for larger and more complex Explain Plans, it may be difficult to read through all the output.

Figure 10. Explain Plan for the selected query

As shown in Figure 11, Aginity Workbench supports you by providing iterator output of the query. This option is available in the Context menu of the query.

Figure 11. Explain Plan

The iterators give much more detailed information for the steps of the Explain Plan. Iterators are available for queries that have been executed and captured in the Greenplum Performance Monitor Database.

Figure 12 shows the Query Plan window with the query plan as a navigation tree in the left pane, and summary and detail information in the right panes. You can immediately see the steps that are color-highlighted, which indicates that these are possible causes of slow performance.

Figure 12. Query Plan showing iterator details

Without such easy-to-navigate, interactive support, it would be much more difficult to narrow down the pain points in problematic queries this quickly and efficiently.

Scenario 4: Examine the status of Greenplum segments

The purpose of this scenario is to:

• Determine the operational status of Greenplum segments

• Determine the location of primary segments and their corresponding mirror segments

• Identify primary segments that have failed over to their mirror segments

• Observe the failback of mirror segments to the primary server when the Segment Server is restored to operation

Managing a Greenplum Database means managing multiple database instances on multiple servers. Aginity Workbench supports you by providing Server Explorer. This gives a detailed view of the inner workings of the Greenplum architecture, which allows DBAs to easily visualize the system status.

Server Explorer can be accessed from the Server Node in the navigation tree, as shown in Figure 13.

Figure 13. Server explorer

Figure 14 shows the server in a healthy running state.

Figure 14. Server explorer showing a healthy status

The left pane shows the Segment Servers in the cluster. The right pane shows the configuration of each Segment Instance on each Segment Server. Columns can easily be sorted by clicking on the title of a column.

Color-highlighting is used to visualize the placement of the primary-mirror pairs. For each primary-mirror pair, there is one row that shows all the configuration details, for example, role, mode, status, host, and so on. The colors show how the primary Segment Instances of a server are spread over different Segment Servers.

This overview immediately informs you that there are no failed segments and that each Segment Server has six primary and six mirror Segment Instances.

If any Segment Instances are in a mode or status other than Synchronized or Up, this is highlighted as shown in Figure 15 and Figure 16.

Figure 15. Server Explorer showing a failover

Figure 16. Server Explorer showing resynchronization
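
The check behind this highlighting can be approximated in a few lines. This is a sketch under stated assumptions: each Segment Instance is represented as a dictionary with host, role, mode, and status keys, using the display values Synchronized and Up named above. The key names, the role values, and the function names are hypothetical and do not come from Aginity Workbench.

```python
from collections import Counter


def unhealthy_segments(segments):
    """Return the instances whose mode or status is not the healthy value."""
    return [
        segment for segment in segments
        if segment["mode"] != "Synchronized" or segment["status"] != "Up"
    ]


def instances_per_host(segments):
    """Count instances per (host, role) to confirm primaries and mirrors are evenly spread."""
    return Counter((segment["host"], segment["role"]) for segment in segments)
```

An empty result from unhealthy_segments corresponds to the healthy view in Figure 14, and instances_per_host gives the per-server primary and mirror counts that the overview lets you read off at a glance.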

When you want to focus on a particular Segment Server, click its node name in the left pane to filter the segment list to that server only.

Scenario 5: Optimize space usage in a Greenplum Database

The purpose of this scenario is to:

• Determine space utilization of tables in the database

• Find tables that have bloat caused by deletes that have not been vacuumed

• Reduce system resource usage by easily executing vacuum statements on the database

Periodic vacuuming of database tables helps ensure that the space occupied by deleted items is reclaimed and available for use for new data in the database.

Aginity Workbench makes it easy to find the space used by the tables in a database. Right-click the database and choose the Database Maintenance option, as shown in Figure 17.

Figure 17. Database Maintenance

This brings up a display of all the tables in that database and includes columns that show the Expected Bytes used, Actual Bytes used, Expired Bytes, and the Percent Unused.

As shown in Figure 18, the Diagnostics Message column gives an indication of the amount of bloat in the table. Tables with high bloat (deleted objects whose space can be reclaimed) can be easily vacuumed right from the menu.

Figure 18. Diagnostics Message showing bloat
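
How a percentage of unused space can relate to the expected and actual byte counts is simple arithmetic. The formula below is an assumption made for illustration (unused space taken as the share of actual bytes beyond the expected size); this paper does not state how Aginity Workbench derives its Percent Unused column, and the 20 percent threshold and the function names are arbitrary.

```python
def percent_unused(expected_bytes, actual_bytes):
    """Share of the actual on-disk size that exceeds the expected size, in percent."""
    if actual_bytes <= 0:
        return 0.0
    unused = max(actual_bytes - expected_bytes, 0)
    return 100.0 * unused / actual_bytes


def vacuum_candidates(tables, threshold=20.0):
    """Pick table names whose unused share meets the threshold.

    `tables` is an iterable of (name, expected_bytes, actual_bytes) tuples.
    """
    return [
        name for name, expected, actual in tables
        if percent_unused(expected, actual) >= threshold
    ]
```

Under this assumed formula, a table expected to need 600 bytes but occupying 1,000 bytes is 40 percent unused and would be listed as a candidate for vacuuming.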

Scenario 6: Examine roles and resource queues

The purpose of this scenario is to:

• Examine the properties of resource queues

• Identify the resource queues to which roles are assigned

An important aspect of Greenplum performance management is the notion of roles and resource queues. Roles roughly correspond to database users, and each user or role is assigned to a particular resource queue.

Resource queues have associated properties that determine how much of the Greenplum system resources are applied to queries that run in those queues.

Aginity Workbench can display the properties of resource queues as shown in Figure 19.

Figure 19. Resource queues and user roles

Aginity Workbench understands the difference between resource queues with active statement limits and resource queues that have maximum query cost limits. It also understands the different priorities that resource queues can have.

Aginity Workbench also displays properties of user roles, and can show the resource queue to which each role or user is assigned, as shown in Figure 19.

This easy access to workload management information helps DBAs properly allocate system resources so that database jobs are executed with the greatest efficiency.
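
The assignment of roles to resource queues can also be read from the system catalog. The sketch below assumes a Greenplum 4.x catalog in which pg_roles.rolresqueue matches the queueid column of gp_toolkit.gp_resqueue_status; the query that Aginity Workbench itself runs is not described in this paper, and the sample rows are hypothetical.

```python
from collections import defaultdict

# Catalog query (assumed column names; verify against your Greenplum version).
ROLE_QUEUE_SQL = """
SELECT r.rolname, q.rsqname
FROM pg_roles r
JOIN gp_toolkit.gp_resqueue_status q ON q.queueid = r.rolresqueue
ORDER BY q.rsqname, r.rolname;
"""


def roles_by_queue(rows):
    """Group (rolname, rsqname) result rows by resource queue name."""
    grouped = defaultdict(list)
    for rolname, rsqname in rows:
        grouped[rsqname].append(rolname)
    return dict(grouped)


# Hypothetical result rows, as a database driver might return them.
sample_rows = [
    ("etl_user", "batch"),
    ("analyst1", "adhoc"),
    ("analyst2", "adhoc"),
]
print(roles_by_queue(sample_rows))
```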

Scenario 7: Import or export data into or out of a database

The purpose of this scenario is to:

• Import data from a disk file to the database

• Export data from the database to a disk file

Moving data into a database from a flat file (TXT or CSV), and exporting data from a table into a flat file, are common actions for developers as well as DBAs.

Greenplum provides the SQL COPY command, which loads an entire file into the database. It is considerably more efficient than executing individual INSERT statements and much easier than writing a script to load the data. However, the syntax of the SQL COPY command is a little tricky and, unless you use it every day, easy to forget or enter incorrectly.

Aginity Workbench provides an easy way of importing data into the database from flat files and also exporting data from a table back to a disk file.

To import data from a CSV file, you right-click the table into which you want to load the data and choose Import Data.

In Import Data, as shown in Figure 20, you can specify the location of the file and the format. You can also specify the encoding, the delimiters, the escape characters, whether the input file has a header row, and the Segment reject limit. The reject limit sets the number of errors in the input file that you are willing to accept before aborting the load.

Figure 20. Import Data

As shown in Figure 21, the SQL tab displays the generated SQL COPY command, which you can edit further.

Figure 21. SQL tab in Import Data window
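
To give a sense of what such a command looks like, the following Python sketch assembles a Greenplum COPY statement from the same kinds of settings: CSV format, header row, delimiter, and segment reject limit. It is an illustration under stated assumptions, not the exact text that Aginity Workbench generates; the function, table name, and file path are hypothetical.

```python
def sql_literal(value):
    """Quote a value as a SQL string literal, doubling embedded quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def build_copy_from(table, path, delimiter=",", header=True,
                    reject_limit=None):
    """Build a COPY ... FROM statement that loads a CSV file into a table.

    A file path given to COPY is read on the database server side,
    not on the client machine.
    """
    options = ["CSV"]
    if header:
        options.append("HEADER")
    options.append(f"DELIMITER {sql_literal(delimiter)}")
    sql = f"COPY {table} FROM {sql_literal(path)} WITH {' '.join(options)}"
    if reject_limit is not None:
        # The load aborts if this many rows fail to parse.
        sql += f" SEGMENT REJECT LIMIT {int(reject_limit)} ROWS"
    return sql + ";"


# Hypothetical table and file, for illustration only.
print(build_copy_from("public.sales", "/data/sales.csv", reject_limit=10))
```

Without a reject limit, the clause is simply omitted and the load stops at the first malformed row.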

Getting data out of the database and into flat files is just as easy; you right-click the table and choose Export Data.

In Export Data, as shown in Figure 22, the Parameters tab lets you specify many of the same kinds of properties as for importing data. The Selection tab lets you choose the columns to export and specify an ORDER BY clause to set the sort order.

Figure 22. Export Data
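
An export with a column selection and a sort order can be expressed as a COPY command that wraps a query. The Python sketch below assumes that the COPY (query) TO form is available in the target Greenplum version; it does not reproduce the command that Aginity Workbench generates, and the function, table, column, and file names are hypothetical.

```python
def build_copy_to(table, path, columns=None, order_by=None, header=True):
    """Build a COPY (query) TO statement that exports rows to a CSV file."""
    column_list = ", ".join(columns) if columns else "*"
    query = f"SELECT {column_list} FROM {table}"
    if order_by:
        query += f" ORDER BY {order_by}"
    options = "CSV HEADER" if header else "CSV"
    # Quote the path as a SQL string literal, doubling embedded quotes.
    quoted_path = "'" + path.replace("'", "''") + "'"
    return f"COPY ({query}) TO {quoted_path} WITH {options};"


# Hypothetical export of two columns, sorted by the first.
print(build_copy_to("public.sales", "/data/sales_out.csv",
                    columns=["sale_id", "amount"], order_by="sale_id"))
```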

While the import and export functions do not use the Greenplum gpload/gpfdist programs for parallel bulk loading of extremely large amounts of data, these functions are very handy for quickly getting smaller amounts of data into and out of the database.

Conclusion

Summary

Aginity Workbench integrates easily with EMC Greenplum Database and allows you to quickly and efficiently manage, monitor, and access large-scale enterprise data warehouses.

Findings

Aginity Workbench features and functionality provide many benefits, including:

• Ease of use, reduction of overhead, and improved return on investment

• Access to individual components in the database, which allows for detailed diagnostics and fine tuning

• Optimization of database performance

• Reduction of errors and downtime

Aginity Workbench is unmatched in its ability to expose the internals of the Greenplum Database and optimize the database with ease.

References

White papers

For additional information, see the white papers listed below.

• EMC Greenplum Data Computing Appliance: High Performance for Data Warehousing and Business Intelligence — An Architectural Overview

• EMC Greenplum Database 4.0 — Critical Mass Innovation

Product documentation

For additional information, see the product document listed below.

• Greenplum Database 4.1 Administrator Guide

Other information

For additional information and to download the software, see the websites listed below.

• Aginity.com

• Greenplum.com