Pivotal HAWQ – High Availability (2014)
TRANSCRIPT
A NEW PLATFORM FOR A NEW ERA
SK Krishnamurthy
2© Copyright 2013 Pivotal. All rights reserved.
Agenda
• HAWQ failover and HA now
• HAWQ HA upcoming release
• What’s new in PHD 1.1
• Pivotal Command Center new features
• Discuss roadmap in conjunction with AMEX requirements
• Open discussion: SAW, PHD 1.1 upgrade, …
HAWQ - Availability
Nov 25, 2013
Deployment Model – Sample HAWQ Cluster
[Diagram: HAWQ primary master (HAWQPM) with standby master (HAWQSM), HDFS primary NameNode (PNN) with secondary NameNode (SNN), and three DataNode (DN) hosts, each running multiple HAWQ segment servers (SS).]
HAWQ Master Fails

Action | Availability | Notes
HAWQ Cluster | Yes (with downtime) | HAWQ cluster remains available. How do clients connect to the SM? There is a manual process to connect to the standby master, similar to GPDB.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & transaction | Yes | The SM will continue to process queries.
HAWQ Master Fails
• The execution coordinator resides on the master
• The distributed transaction master resides on the master
• The log is copied up to the last committed transaction
• Run gpactivatestandby on the standby master
• Use either a VIP or a DNS hostname change to re-route client connections
HAWQ Master & Standby Master Fail

Action | Availability | Notes
HAWQ Cluster | Unavailable | Cluster is considered to be down.
Current “SELECT” queries | Aborted | Can’t restart the query.
Current transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & transaction | Not possible |
HAWQ Master & Standby Master Fail
• Configure RAID 10 for the HAWQ master so the primary segment data directory is never lost
PNN Fails

Action | Availability | Notes
HAWQ Cluster | Yes (with downtime) | Metadata queries can be carried out, but no other queries. No DDL or DML.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current transaction | Aborted | After the PNN is up, dirty data & temp files will be removed.
New “SELECT” & transaction | Not possible |

• PHD 1.1:
  – (Option 1) Manually bring up the PNN; HAWQ cannot switch to the secondary NameNode.
  – (Option 2) The HDFS admin changes the FQDN or IP address of the secondary NN to that of the PNN.
  – The HAWQ master keeps trying to connect to the PNN; when it finds one, the cluster becomes operational.
• PHD 1.1.1 (Dec ’13): QA-verified testing of the above two options.
PNN Fails
• Normal HDFS failover process
• Change the DNS name of the secondary NN to that of the current NN
• The NameNode service will be supported in PHD 1.2 (February)
PNN & Secondary NN Fail

Action | Availability | Notes
HAWQ Cluster | No | Metadata queries can be carried out, but no other queries. No DDL or DML.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current transaction | Aborted | After the PNN is up, dirty data & temp files will be removed.
New “SELECT” & transaction | Not possible |
PNN & Secondary NN Fail
• No split information
• No transactions
Secondary NN Fails

Action | Availability | Notes
HAWQ Cluster | Yes | Fully available.
Current “SELECT” queries | Yes |
Current transaction | Yes |
New “SELECT” & transaction | Yes |
A Segment Fails

Action | Availability | Notes
HAWQ Cluster | Yes | HAWQ cluster available.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & transaction | Yes | Remaining segments will handle the query.
A Segment Fails
• Segment QEs (Query Executors) are killed
• HAWQ does not materialize intermediate results
• Local actions by a QE are not committed
• Segment QEs are started by other segments in subsequent queries
• QE substitution is random
• A future release will add an option to materialize work files
Multiple Segments Fail

Action | Availability | Notes
HAWQ Cluster | Yes | HAWQ cluster available.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & transaction | Yes | Remaining segments will handle the query.
DN Fails

Action | Availability | Notes
HAWQ Cluster | Yes | HAWQ cluster available.
Current “SELECT” queries | Yes | The SS will automatically connect to a remote DN in the middle of the currently executing query.
Current transaction | Yes | The transaction will finish successfully.
New “SELECT” & transaction | Yes |

• PHD 1.1: No impact. The SS will continue to work with a remote DN.
• Loss of data locality might introduce a slight performance impact. On a 10G network the impact is measured to be around 10% for large queries; simple queries might experience a 50% performance impact.
DN Fails
• libhdfs falls back to reading from an HDFS replica
• Short-term performance loss until the NN marks the DN as dead
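The replica-fallback behavior described above can be sketched as a simple retry loop (a minimal illustration only, not HAWQ's actual libhdfs code; the node names and `fetch_demo` function are hypothetical):

```python
# Minimal sketch of reading an HDFS block with replica fallback: if the
# DataNode holding the preferred (local) replica is down, try the next
# replica in the block's location list.
def read_block(replica_locations, dead_nodes, fetch):
    """Try each replica in order; return data from the first live one."""
    for node in replica_locations:
        if node in dead_nodes:
            continue  # skip DataNodes already known to be down
        try:
            return fetch(node)
        except IOError:
            dead_nodes.add(node)  # remember the failure for later reads
    raise IOError("all replicas unavailable")

# Hypothetical 3-way replicated block; the local DataNode "dn1" has died.
def fetch_demo(node):
    if node == "dn1":
        raise IOError("connection refused")
    return "block-data-from-" + node

print(read_block(["dn1", "dn2", "dn3"], set(), fetch_demo))
```

The short-term performance loss on the slide corresponds to the failed attempts against the dead node before it is recorded as unavailable.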
Segment Host Dies

Action | Availability | Notes
HAWQ Cluster | Yes | HAWQ cluster available.
Current “SELECT” queries | Aborted | Users need to restart the query.
Current transaction | Aborted | Dirty data & temp files will be removed.
New “SELECT” & transaction | Yes | Remaining segments will handle the query.
Single Disk Failure in DN JBOD
• JBOD
  – If tempdata is not on the failed disk, there is no impact on the cluster or the query.
  – If tempdata is configured to be on the failed disk:
    ▪ Small queries will run, but large queries with too much temporary data will be impacted.
    ▪ Transactions will be aborted; new transactions will continue if multiple disks are configured to contain tempdata.
• RAID 5: no impact; possible performance loss.
• RAID 10: no impact & no performance loss.
HAWQ HA on Roadmap
• Automatic NameNode HA supported on PHD now
• Automatic NameNode HA (name service) supported by HAWQ in the February release
• PXF to also support the NN service
• No interruption in query execution during NN failure
• HAWQ HA unchanged
What’s New in Pivotal HD 1.1
November 7th, 2013
Key Themes of the Pivotal HD 1.1 Release
• Leverage more data, in real time, more easily, to gain competitive advantage
• Richer services and tools to create a broader set of applications
• Deeper, streamlined administrative capabilities for enterprise deployments
Pivotal HD Architecture
[Architecture diagram. Apache components: HDFS, HBase, Pig, Hive, Mahout, MapReduce, Sqoop, Flume, YARN, Zookeeper, Oozie, Vaidya, plus resource management & workflow. Pivotal components: Command Center (configure, deploy, monitor, manage), Data Loader, Spring, Unified Storage Service, Hadoop Virtualization Extension, HAWQ – Advanced Database Services (Xtension Framework, Catalog Services, Query Optimizer, Dynamic Pipelining, ANSI SQL + analytics), GemFire XD – Real-Time Database Services (distributed in-memory store, query transactions, ingestion processing, Hadoop driver – parallel with compaction, ANSI SQL + in-memory; beta), and MADlib algorithms. Together these make up Pivotal HD Enterprise.]
GemFire XD Delivers
An enterprise real-time data processing platform for SLA-critical applications; enables users to rapidly and reliably analyze & react to high volumes of events while leveraging 10s of TBs of in-memory reference data.

Cloud-Scale Real-Time Platform
• Very low & predictable latencies at high & variable loads
• 10s of TBs in-memory (Memscale)
• Multi-tiered caching

Optimized for Real-Time Analytics
• Efficient in-memory M-R
• Real-time event processing
• Continuous querying
• SQL-based queries
• Support for structured and semi-structured* data
• Java stored procedures
• Deep Spring Data integration
• Native support for JSON and objects (Java, C++, C#)*

Seamless Pivotal HD Integration
• Scale to HDFS with policy-driven in-memory data retention
• Online and offline querying of HDFS data
• ETL-less bi-directional integration with other Pivotal HD services

Enterprise-Class Reliability
• JTA distributed transactions
• HA through in-memory redundancy
• Reliable event propagation
• Active-active deployments across WAN

* EA / Not in 1.0
What’s New in Pivotal HD 1.1

Feature | Benefit
Command Center:
Install Wizard | Faster, easier setup and configuration of an HD cluster
Start/Stop Services | Point-and-click control of multiple services through a central interface
HAWQ:
UDF (partial): C, PL/pgSQL; pgcrypto, orafce | Enable richer data processing and analytics functionality leveraging existing SQL skill sets
Kerberos Support | Tightly integrated security with HDFS
PXF: Writable HDFS Table Support | Easily export HAWQ data to HDFS for external consumption
HAWQ InputFormat Reader | Directly leverage HAWQ data in MapReduce, Pig and Hive
Diagnostic Tools | Lower administration costs
Improved Query Planner | “Orca” enabled to provide more efficient query plans
What’s New in Pivotal HD 1.1

Feature | Benefit
Install/Config (ICM) CLI:
Add/Remove Services | Faster, easier setup and administration of services (e.g. HBase, GemFire XD, etc.)
Upgrade | Streamlined, low-risk upgrade from 1.0.1 to 1.1
Apache Hadoop Components:
Hadoop to 2.0.5 and select 2.0.6 patches | Greater stability and lower risk based on critical defect fixes incorporated
Oozie 3.3.2 | Orchestrate data processing (e.g. MR, Pig) job pipelines with dependencies
Hive 11 (incl. HCatalog and HiveServer2) | Significant improvements in functionality, scalability and security
HBase 0.94.8 | Enables snapshots of tables without overhead to the region servers
RHEL 6.4 Certification | Enhanced performance optimizations and security improvements
What’s New in Pivotal HD 1.1

Feature | Benefit
Platform and Security:
Kerberos Support: HDFS, HAWQ, Unified Storage Service (PXF to be supported in Dec 2013) | Tighter governance, risk and compliance
JRE 1.7.0.15 support | Supported platform; JRE 1.6 is end of life
RHEL 6.4 (FIPS) certification | Federal standard for cryptography modules
pgcrypto for HAWQ | Flexible and robust encryption of sensitive data
Tools:
Unified Storage Service: CDH4 as a data source | Stream data from CDH4
Data Loader: Push Stream API; Spring XD front end for Twitter | Integration support for a wider variety of data sources
Command Center Cluster Deployment Wizard
• Performs “Host Verification” to determine a host’s eligibility to be added to the cluster
Command Center Cluster Deployment Wizard
• Easily add eligible nodes to roles
• Basic validation of layout
• Checkbox add/remove of services
• Ability to download configuration locally
Orca – Improved Optimizer
• Pluggable architecture, allowing faster innovation and quicker iteration on quality improvements
• Subset of improved functionality:
  – Parity with the planner
  – Improved join ordering
  – Join-aggregate reordering
  – Sub-query de-correlation
  – Optimal sort orders
  – Full integration of data (re-)distribution
  – Contradiction detection
  – Elimination of redundant joins
  – Smarter partition scan
  – Star-join optimization
  – Skew aware
What’s New in PXF
• Profiles
• Writable external tables
• Hive partition pruning, HBase filtering
• Additional connectors & CSV support
• Complete extensibility
• Roadmap:
  – Security & authentication
  – Multi-FS support & other distributions via OS
  – Stand-alone service
Why Pivotal HD? Big Data + Fast Data
• The first enterprise-grade platform that provides OLAP and OLTP with HDFS as the common data substrate
• Enables closed-loop analytics, real-time event processing and high-speed data ingest
HAWQ Format Reader
[Diagram: a Java program (e.g. a MapReduce job) uses the HAWQ Reader (a jar file): (1) a request is made for where the files for a specific table exist; (2) the location of those files is returned; (3) the HDFS files in HAWQ format are streamed to the reader.]
Oozie Now Included and Supported with PHD
• Oozie is a workflow scheduler system to manage Apache Hadoop jobs.
• Oozie workflow jobs are Directed Acyclic Graphs (DAGs) of actions.
• Oozie coordinator jobs are recurrent Oozie workflow jobs triggered by time (frequency) and data availability.
• Oozie is integrated with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as Java map-reduce, streaming map-reduce, Pig, Hive, Sqoop and DistCp) as well as system-specific jobs (such as Java programs and shell scripts).
• Oozie is a scalable, reliable and extensible system.
Matrix of what is supported via Install method
Security Dashboard (items in bold tested; the rest are scheduled)

Component | Supports secure cluster | Kerberos for authentication | LDAP for authentication
HDFS | Yes | Yes | Linux OS supports
MapReduce/Pig | Yes | N/A |
Hive | Yes (standalone mode) | N/A |
Hiveserver | No | No |
HiveServer2 | Yes | Yes | Yes
HBase | Yes | Yes | Yes
HAWQ* | Yes | Yes | Yes
GemFire XD | Yes | Yes | Yes

* Except PXF; scheduled for Dec (PHD 1.1.1 release)
Vaidya
Roadmap – Open Discussion
Roadmap – Action Items
• Error tables released in PHD 1.2 (February)
  – Current workaround
• PCC new features?
• SAW integration
• PHD 1.1 upgrade planning
Appendix
HAWQ
History
• HAWQ 1.0 (March release)
  – True SQL engine in Hadoop
    ▪ SQL 92, 99 & 2003 OLAP extensions
    ▪ JDBC/ODBC
  – Basic SQL functionality
    ▪ DDL and DML
  – High availability feature
  – Transaction support
• HAWQ 1.1 (June release)
  – JBOD support
• HAWQ 1.1.1 (August release)
  – HDFS access layer read fault tolerance
  – HAWQ diagnosis tool
  – Orca enabled
• HAWQ 1.1.2 (September release)
  – HAWQ MR InputFormat for AO tables
  – HDFS access layer write fault tolerance
  – HDFS 2.0.5 support
• HAWQ 1.1.3 (Oct release)
  – HAWQ Kerberos support
  – HAWQ on secure HDFS
  – UDF
• HAWQ 1.1.4 (Dec release)
  – gptoolkit
  – UDF enhancements
  – Manual failover for HDFS HA
• HAWQ 1.2 (Feb release)
  – Parquet storage support
  – HAWQ MR InputFormat
  – Automatic failover for HDFS HA
  – …
[Diagram: HAWQ & HDFS master servers (planning & dispatch) connect over the network interconnect to segment servers (query execution), backed by the storage layer (HDFS, HBase, …).]
[Diagram: the master host performs metadata operations against the HDFS NameNode; segment hosts each run several segments co-located with a DataNode; segments read/write blocks, which HDFS replicates across racks (Rack1, Rack2); segments communicate over the GPDB interconnect.]
Query execution flow
Parallel Query Optimizer
• Converts SQL into a physical execution plan
  – Cost-based optimization looks for the most efficient plan
  – Physical plan contains scans, joins, sorts, aggregations, etc.
  – Global planning avoids sub-optimal ‘SQL pushing’ to segments
  – Directly inserts ‘motion’ nodes for inter-segment communication
• ‘Motion’ nodes for efficient non-local join processing (assume table A is distributed across all segments, i.e. each has AK)
  – Broadcast Motion (N:N): every segment sends AK to all other segments
  – Redistribute Motion (N:N): every segment rehashes AK (by join column) and redistributes each row
  – Gather Motion (N:1): every segment sends its AK to a single node (usually the master)
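The three motion types can be illustrated with a toy model (a sketch only; the segment count, row layout, and hash function are arbitrary choices for illustration, not HAWQ internals):

```python
# Toy model of the three motion types over 4 segments. Each segment
# holds a local list of rows; a motion decides where rows travel.
NUM_SEGMENTS = 4

def broadcast_motion(segments):
    """N:N — every segment ends up with a copy of every row."""
    all_rows = [row for seg in segments for row in seg]
    return [list(all_rows) for _ in segments]

def redistribute_motion(segments, key):
    """N:N — rows are rehashed by the join column and sent to the owner."""
    out = [[] for _ in range(NUM_SEGMENTS)]
    for seg in segments:
        for row in seg:
            out[hash(row[key]) % NUM_SEGMENTS].append(row)
    return out

def gather_motion(segments):
    """N:1 — every segment sends its rows to a single node (the master)."""
    return [row for seg in segments for row in seg]
```

After a redistribute motion, rows sharing a join-column value are guaranteed to land on the same segment, which is what makes a subsequent local hash join correct.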
Example of Parallel Query Optimization

select c_custkey, c_name,
       sum(l_extendedprice * (1 - l_discount)) as revenue,
       c_acctbal, n_name, c_address, c_phone, c_comment
from customer, orders, lineitem, nation
where c_custkey = o_custkey
  and l_orderkey = o_orderkey
  and o_orderdate >= date '1994-08-01'
  and o_orderdate < date '1994-08-01' + interval '3 month'
  and l_returnflag = 'R'
  and c_nationkey = n_nationkey
group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment
order by revenue desc

Gather Motion 4:1 (slice 3)
  Sort
    HashAggregate
      HashJoin
        Redistribute Motion 4:4 (slice 1)
          HashJoin
            Seq Scan on lineitem
            Hash
              Seq Scan on orders
        Hash
          HashJoin
            Seq Scan on customer
            Hash
              Broadcast Motion 4:4 (slice 2)
                Seq Scan on nation
Interconnect
• UDP based
• Flow control
Metadata dispatch
• Metadata dispatch
• Stateless segments
  – Read-only metadata on segments
Transaction
• Full transaction support for tables on HDFS
  – When a load transaction is aborted, some garbage data is left at the end of the file. For HDFS-like systems, data cannot be truncated or overwritten.
• Methods to process the partial data to support transactions:
  – Option 1: Load data into a separate HDFS file. Unlimited number of files.
  – Option 2: Use metadata to record the boundary of garbage data, and implement a kind of vacuum mechanism.
  – Option 3: Implement HDFS truncation.
• HDFS truncate is added to support transactions
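Option 2 above (metadata recording the boundary of garbage data) can be sketched with a toy in-memory "file" (an illustration of the idea only, not HAWQ's actual catalog machinery; the class and method names are hypothetical):

```python
# Toy model of transaction support via a logical EOF stored in metadata:
# appends past the last committed boundary are invisible until commit,
# so an aborted load leaves garbage bytes on "disk" that readers never see.
class AppendOnlyFile:
    def __init__(self):
        self.data = b""          # physical bytes (cannot be truncated)
        self.committed_eof = 0   # metadata: last committed boundary

    def append(self, payload):
        self.data += payload     # write past the logical EOF

    def commit(self):
        self.committed_eof = len(self.data)

    def abort(self):
        pass  # garbage stays on disk; the logical EOF is unchanged

    def read(self):
        return self.data[:self.committed_eof]  # ignore the garbage tail

f = AppendOnlyFile()
f.append(b"good")
f.commit()
f.append(b"garbage")
f.abort()
print(f.read())  # only the committed prefix is visible
```

A vacuum mechanism would later rewrite the file to reclaim the space occupied by the garbage tail; HDFS truncate makes that unnecessary.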
Transaction
• Snapshot isolation
• Simplified transaction model support
  – Simplified two-phase commit
Pluggable storage
• Read-optimized / append-only storage
• Column store
  – Compressions: quicklz, zlib, RLE
  – Partitioned tables hit an HDFS limitation
• Parquet
  – Open source format
  – PAX-like column store
  – Snappy, gzip
• MR Input/Output format
HDFS C client: why
• libhdfs (the current HDFS C client) is based on JNI, which makes it difficult for HAWQ to support a large number of concurrent queries.
• Example:
  – 4 segments on each segment host
  – 50 concurrent queries
  – each query has 16 QE processes that do scans
  – there will be about 800 processes that start 800 JVMs to access HDFS
  – if each JVM uses 500 MB of memory, the JVMs will consume 800 * 500 MB = 400 GB of memory
• Thus naïve usage of libhdfs is not suitable for HAWQ. Currently there are three options to solve this problem.
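The arithmetic in the example above is easy to check (the figures are the slide's own illustrative assumptions, not measurements):

```python
# Back-of-the-envelope check of the slide's JVM memory estimate for
# JNI-based libhdfs under high concurrency.
concurrent_queries = 50
qe_processes_per_query = 16   # QE processes doing scans, per query
jvm_mb = 500                  # assumed memory per JVM

processes = concurrent_queries * qe_processes_per_query
total_gb = processes * jvm_mb / 1000

print(processes, "JVMs")   # 800 JVMs
print(total_gb, "GB")      # 400.0 GB
```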
HDFS client: three options
• Option 1: Use HDFS FUSE. HDFS FUSE introduces some performance overhead, and its scalability is not yet verified.
• Option 2 (libhdfs2): Implement a webhdfs-based C client. webhdfs is based on HTTP, which also introduces some cost; performance should be benchmarked. The webhdfs-based method has several benefits, such as ease of implementation and low maintenance cost.
• Option 3 (libhdfs3): Implement a C RPC client that communicates directly with the NameNode and DataNodes. Requires many changes whenever the RPC protocol changes.
PXF
PXF is...
A fast, extensible framework connecting HAWQ to a data store of choice that exposes a parallel API.
HAWQ External Tables
• gpfdist – remote delimited text (or CSV) files
• file – text files on the segment filesystem
• execute – script execution and produced data
• pxf – text and binary data from available PXF connectors (mostly HD based)
Steps
• Step 1: GRANT ON PROTOCOL pxf
• Step 2: Define a PXF table
  – Pick the built-in plugins right for the job
  – Specify the data source of choice
  – Map remote data fields to HAWQ db attributes (plugin dependent)
• Step 3: Query the PXF table
  – Directly
  – Or copy to a HAWQ table first

CREATE EXTERNAL TABLE foo (<col list>)
LOCATION (‘pxf://<host:port>/<data source>?<plugin options>’)
FORMAT ‘<type>’ (<params>)
New Features
Main additions since PHD 1.0
User Experience
User Experience
• Improved, more informative error messages
• Profiles: instead of
LOCATION(‘pxf://<host:port>/sales?fragmenter=HiveFragmenter&accessor=HiveAccessor&resolver=HiveResolver’)
you can write
LOCATION(‘pxf://<host:port>/sales?profile=Hive’)
profiles.xml

<profile>
  <name>HBase</name>
  <description>Used for connecting to an HBase data store engine</description>
  <plugins>
    <fragmenter>HBaseDataFragmenter</fragmenter>
    <accessor>HBaseAccessor</accessor>
    <resolver>HBaseResolver</resolver>
    <myidentifier>MyValue</myidentifier>
  </plugins>
</profile>
profiles.xml

<profile>
  <name>HdfsTextSimple</name>
  <description>Used when reading delimited single line records from plain text files on HDFS</description>
  <plugins>
    <fragmenter>HdfsDataFragmenter</fragmenter>
    <accessor>LineBreakAccessor</accessor>
    <resolver>StringPassResolver</resolver>
    <analyzer>HdfsAnalyzer</analyzer> <!-- (soon to be added) -->
  </plugins>
</profile>
profiles.xml

<profile>
  <name>MyCustomProfile</name>
  <description>Used with a new set of plugins I wrote</description>
  <plugins>
    <fragmenter>MyFragmenter</fragmenter>
    <accessor>MyAccessor</accessor>
    <resolver>MyResolver</resolver>
    <analyzer>MyAnalyzer</analyzer>
  </plugins>
</profile>

Add your own profiles
Export to HDFS
Writable PXF
• gphdfs-like functionality
  – but extensible…
  – currently supports text, CSV, SequenceFile
  – supports various Hadoop compression codecs

CREATE WRITABLE EXTERNAL TABLE ...
LOCATION (‘pxf://<host:port>/sales?profile=HdfsTextSimple&COMPRESSION_CODEC=org.apache.hadoop.io.compress.GzipCodec')
FORMAT ‘text’ (delimiter ‘,’);

You can create a new profile “HdfsTextSimpleGZipped” that includes the compression_codec:
LOCATION (‘pxf://<host:port>/sales?profile=HdfsTextSimpleGZipped')
New Connectors
New Connectors
• GemFire XD (released; GA February)
• JSON (on GitHub; GA February (r+w))
• Accumulo (on GitHub; GA version being coded by ClearEdge; GA February)
• Cassandra (on GitHub; alpha)

None of them was written by the PXF dev team… a testament to extensibility.
Feature Summary
★ HBase (w/ filter pushdown)
★ Hive (w/ partition exclusion; various storage file types)
★ HDFS files: read (delimited text, CSV, Sequence, Avro)
★ HDFS files: write (delimited text, CSV, Sequence; various compression codecs and options)
★ GemFire XD, JSON format, Cassandra, Accumulo (currently beta)
★ Stats collection
★ Automatic data locality optimizations
★ Extensibility!
Coming Up Very Soon...
★ Isilon integration
★ Kerberized HDFS support
★ NameNode high availability
Limitations
• Local metadata of external data
  – Will be made more transparent when UCS exists
• Authentication and authorization of external systems
  – Will be made simpler when centralized user mgmt exists
• Currently supporting local PHD only
• Error tables not yet supported
• Sharing space with Name/DataNode
Writing a Plugin
Steps and guidelines
Main Steps
1. Verify PHD is running and PXF is installed
   a. SingleCluster, AllInAll, SingleNode VM
2. Implement the PXF plugin API for your connector (Java)
   a. Use the PXF API doc as a reference
3. Compile your connector classes and add them to the Hadoop classpath on all nodes
4. Restart PHD (won’t be necessary in the future)
5. Add a profile (optional)
Plugins
• Fragmenter – returns a list of source data fragments and their locations
• Accessor – accesses a given list of fragments, reads them and returns records
• Resolver – deserializes each record according to a given schema or technique
• Analyzer – returns statistics about the source data
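The four plugin roles compose roughly as follows (a Python sketch of the control flow only; the real PXF plugin API is Java, and these class names, the CSV format, and the `scan` driver are all illustrative, not the actual interfaces):

```python
# Sketch of how PXF's plugin roles cooperate: the Fragmenter lists
# fragments with locality hints, the Accessor reads raw records from
# each fragment, and the Resolver deserializes records into typed
# fields. (The Analyzer, which returns source statistics, is omitted.)
class CsvFragmenter:
    def __init__(self, files):
        self.files = files
    def get_fragments(self):
        # each fragment: (source, hosts holding it) — locality hints
        return [(f, ["host1"]) for f in self.files]

class CsvAccessor:
    def read_records(self, fragment):
        source, _hosts = fragment
        yield from source["lines"]   # raw records, one per line

class CsvResolver:
    def get_fields(self, record):
        name, value = record.split(",")
        return {"name": name, "value": int(value)}

def scan(fragmenter, accessor, resolver):
    """Drive a full external-table scan through the three plugins."""
    for fragment in fragmenter.get_fragments():
        for record in accessor.read_records(fragment):
            yield resolver.get_fields(record)

files = [{"lines": ["a,1", "b,2"]}, {"lines": ["c,3"]}]
print(list(scan(CsvFragmenter(files), CsvAccessor(), CsvResolver())))
```

In a real deployment each HAWQ segment would scan only the fragments local to it, which is where the locality hints from the Fragmenter come in.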
Thanks!