Ingres Plus X100 Equals Ingres Vectorwise

Upload: timothy-pierce

Post on 11-Jan-2016

Page 1: Ingres Plus X100 Equals Ingres Vectorwise. Agenda  Why?  Introduction to Vectorwise  Groundwork  Vectorwise and OPF  Vectorwise and QEF

Ingres Plus X100 Equals Ingres Vectorwise

Agenda

Why?
Introduction to Vectorwise
Groundwork
Vectorwise and OPF
Vectorwise and QEF

Names

X100 was the research project name: “faster times 100”
VectorWise (or Ingres VectorWise) is the product name
X100, IVW, and VW are abbreviations used internally
It's all pretty much interchangeable

Confidential — © 2009 Ingres Corporation Slide 4

Why Ingres?

Ingres – enterprise RDBMS
– Full functioned database server
– User interfaces: SQL, embedded SQL, API, ODBC, JDBC, etc.
– Application interfaces: OpenROAD, ABF, etc.
– Utilities: backup, restore, rollforward, etc.
– …
VectorWise – experimental RDBMS
– Very (very, very, …) fast
– But only QEF/DMF equivalent

Why Ingres?

Developing “required” components would take many years
Established sales force, customer base
Right sized company
Agreeable business arrangements

What’s in VectorWise?

Column store database
– Hybrid row store capability is coming
Internal catalog
– Stores definitions of tables, columns, indexes, etc.
Relational algebra language interface
– Used for DDL and DML operations, transaction management
– Basis of Ingres interface
Various tools & utilities
– iivwinfo, iivwfastload, iivwstats, x100pp, x100profgraph, x100_client

VectorWise Algebra

Actual queries that go to VectorWise are recorded in the VectorWise log
– DML can be seen by using trace point op207 (don’t forget to use x100pp)
Simple queries:
– CreateTable, MinMaxIndex, Savepoint, Commit
DML:
– Append, Update, Delete

VectorWise Algebra

Retrievals – queries consist of nested operators, built by the OPF cross compiler
– Project, Select, TopN, Window, Sort, Aggr, OrdAggr, Mscan, MergeJoin1, HashJoin01, HashJoinN, CartProd
– “select sno, city from s where status > 50” generates:

Project(
  Select(
    Mscan(_s = '_s', ['_status', '_city', '_sno']
    ), ['est_card' = '5'], >(_s._status, sint('50'))
  ), ['est_card' = '1'], [_s._sno, _s._city]
)

VectorWise Algebra

One more:

“select r_name, n_name from region, nation where r_regionkey = n_regionkey order by r_name” generates:

Sort (
  Project (
    HashJoin01 (
      MScan ( _nation = '_nation', [ '_n_regionkey', '_n_name']
      ) [ 'est_card' = '25' ] , [ _nation._n_regionkey ],
      MScan (_region = '_region', [ '_r_regionkey', '_r_name']
      ) [ 'est_card' = '5' ] , [ _region._r_regionkey ], 0
    ) [ 'est_card' = '25' ] , [_region._r_name, _nation._n_name]
  ),[_region._r_name]
)

VectorWise – Why is it so Fast?

Primarily a column store – much less data read from disk
Compression techniques highly tuned to modern computer architectures (multi-level caching, etc.)
Lightweight indexing technique
Other column stores operate on re-constituted rows
VectorWise uses new and novel execution architecture that retains column structure and processes vectors of data at a time
– SIMD, CPU cache, …
Research is ongoing and there’s more to come!
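To make the "vectors of data at a time" point concrete, here is a minimal Python sketch. It is not VectorWise code, only an illustration under stated assumptions: each column is held as its own list, and the operator handles a fixed-size slice (a vector) per step instead of one re-constituted row. The names VECTOR_SIZE, scan_vectors and select_gt are hypothetical.

```python
from typing import Iterator, List

VECTOR_SIZE = 4  # hypothetical; kept small so the example is easy to trace


def scan_vectors(column: List[int], size: int = VECTOR_SIZE) -> Iterator[List[int]]:
    """Yield consecutive slices (vectors) of a single column."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def select_gt(status: List[int], sno: List[int], threshold: int) -> List[int]:
    """Return sno values whose status exceeds threshold, one vector at a time."""
    result: List[int] = []
    for status_vec, sno_vec in zip(scan_vectors(status), scan_vectors(sno)):
        # The predicate runs over the whole vector in one tight loop before
        # any output is assembled; only the two needed columns are touched.
        keep = [s > threshold for s in status_vec]
        result.extend(v for v, k in zip(sno_vec, keep) if k)
    return result
```

In a real engine the per-vector loops are where SIMD and CPU cache locality pay off; plain Python lists only show the control flow.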

Implementation Goals

Minimize special X100-only syntax
Minimize effect on server facilities not directly involved (e.g. SCF, DMF)
Localize changes to affected facilities as much as possible (Optimizer in particular)
Write new algorithms in such a way that Ingres tables could eventually take advantage

Groundwork

Initial thought was to add X100 as a new built-in Gateway, similar to IMA etc.
– Presumably would cause minimal changes to QEF
– Would probably be too slow (every result row filtered one at a time from GWF to DMF then QEF)
Better idea: add Vectorwise as a new table storage type
– OPF will generate special plans for IVW tables
– Easy to do, minimal new syntax required
Ingres catalogs carry Vectorwise table definitions

Groundwork

Parser changes:
– WITH STRUCTURE=VECTORWISE
  • Config default for IVW installations
  • SET RESULT_STRUCTURE …
– Various checks to disallow VW tables in contexts where they aren't supported (e.g. DB procedure)
– New “CALL VECTORWISE” statement to send non-SQL-related requests (e.g. combine)
– New statement types for VW DDL statements
  • Especially for constraint operations
– New query-uses-VW-tables flag(s)

Groundwork

Front-end utilities changed to recognize VW table types
– Some additional work required by VW restrictions, such as create index only allowed on empty tables
Essentially no sequencer changes
– Minor change to recognize X100 interface facility
DMF changes to permit tables with no underlying table file
DMF changes for backup/restore
RDF changes to support (hidden) VW join indexes

VectorWise & OPF

Parsing is “easy”, but why did we think OPF could compile VectorWise queries?
Optimization is all about row sizes, cardinalities and the comparison of different plans
optimizedb works on VectorWise tables and produces good cardinality estimates
The optimizer architecture is designed to work for any target database engine

VectorWise & OPF - Changes

Functional dependencies
Dependence relationships can be derived from unique constraints, referential relationships, joins, group by
Allows identification of DERIVED columns not needed for duplicate elimination & grouping operations – very important to VectorWise

VectorWise & OPF - Functional Dependencies

select c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue,
  c_address, c_phone, c_comment
from ...
group by c_custkey, c_name, c_phone, c_address, c_comment

generates:

Project (
  As (
    Aggr (
      Project (
        ...
        ) [ 'est_card' = '56357' ] , [_TRSDM_1 = *(_lineitem._l_extendedprice,+( -(_lineitem._l_discount), decimal('1'))), _customer._c_comment, _customer._c_address, _customer._c_phone, _customer._c_name, _customer._c_custkey]
      ), [_customer._c_comment DERIVED, _customer._c_address DERIVED, _customer._c_phone DERIVED, _customer._c_name DERIVED, _customer._c_custkey] , [_revenue_3 = sum(_TRSDM_1)] , 5636
    ), __VT_3_0_2_1
  ), [_c_custkey_1 = __VT_3_0_2_1._c_custkey, _c_name_2 = __VT_3_0_2_1._c_name, __VT_3_0_2_1._revenue_3, _c_address_4 = __VT_3_0_2_1._c_address, _c_phone_5 = __VT_3_0_2_1._c_phone, _c_comment_6 = __VT_3_0_2_1._c_comment] )
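The DERIVED markers in the Aggr() grouping list above can be illustrated with a small Python sketch. It is not OPF code. It assumes the only dependency known is that c_custkey is a unique key that determines the other customer columns, and the function name mark_derived and its input shape are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def mark_derived(group_by: List[str],
                 unique_keys: Dict[FrozenSet[str], Set[str]]) -> List[str]:
    """Tag GROUP BY columns that are determined by a key also being grouped on.

    unique_keys maps a key (a set of columns) to the columns it determines.
    Simplification: keys that determine each other are not handled.
    """
    grouped = set(group_by)
    derived: Set[str] = set()
    for key, dependents in unique_keys.items():
        if key <= grouped:  # the whole key appears in the GROUP BY list
            derived |= (dependents & grouped) - key
    return [c + " DERIVED" if c in derived else c for c in group_by]
```

With the grouping list from the example and c_custkey as the key, every column except c_custkey comes back tagged DERIVED, so the aggregation only has to compare c_custkey when forming groups.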

VectorWise & OPF - Changes

Clustering
Aggregation doesn’t need input sorted on GROUP BY, just clustered
Indexing, joins, other aggregations have clustering properties
OrdAggr() is much faster than Aggr()
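The difference between the two aggregation styles can be sketched in Python. This is an illustration of the general technique, not the VectorWise operators themselves: hash_aggr stands in for Aggr() and ordered_aggr for OrdAggr(), and both names are hypothetical.

```python
from itertools import groupby
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, int]


def hash_aggr(rows: Iterable[Row]) -> Dict[str, int]:
    """Hash aggregation: accepts any input order, keeps every group in memory."""
    totals: Dict[str, int] = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


def ordered_aggr(rows: Iterable[Row]) -> List[Tuple[str, int]]:
    """Streaming aggregation: needs equal keys to be adjacent (clustered),
    not sorted; a group is finished as soon as its run of rows ends."""
    return [(key, sum(v for _, v in run))
            for key, run in groupby(rows, key=lambda r: r[0])]
```

The streaming form needs no hash table and holds only one group at a time, which is why knowing that an input is clustered on the GROUP BY columns is worth tracking in the optimizer.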

VectorWise & OPF - Changes

Referential relationships
OPF historically ignored referential relationships
Joins across referential relationships have additional properties
– Cardinalities, clustering/functional dependencies
VectorWise join indexes enable very fast MergeJoin
New iirefrel catalog to track referential relationships

VectorWise & OPF – Referential Relationships

“select r_name, n_name from region, nation where r_regionkey = n_regionkey order by r_name” generates:

Sort (
  Project (
    MergeJoin1 (
      MScan (_nation = '_nation', [ '_n_regionkey', '_n_name', '__jnation']
      ) [ 'clusterid' = '1' , 'est_card' = '25' ] , [ _nation.__jnation ],
      MScan (_region = '_region', [ '_r_regionkey', '_r_name', '__tid__']
      ) [ 'clusterid' = '1' , 'est_card' = '5' ] , [ _region.__tid__ ]
    ) [ 'est_card' = '25' ] , [_region._r_name, _nation._n_name]
  ),[_region._r_name, _nation._n_name]
)

VectorWise & OPF – Reuse Segments

VectorWise can cache partial results for later reuse in the same query
OPF searches for common table subexpressions
VectorWise materializes them once and caches them for later reuse
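Searching a plan for common subexpressions can be sketched as follows. This is a simplified illustration, not the OPF algorithm: plans are modelled as nested tuples of the form (operator, child, ...), and the names subtrees and reusable are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# A plan node is (operator, child, child, ...); leaves are plain strings.
Plan = Tuple


def subtrees(plan) -> List[Plan]:
    """Return every operator subtree of the plan, including the plan itself."""
    if not isinstance(plan, tuple):
        return []
    found = [plan]
    for child in plan[1:]:
        found.extend(subtrees(child))
    return found


def reusable(plan) -> List[Plan]:
    """Subtrees that occur more than once and so are candidates for caching."""
    counts = Counter(subtrees(plan))
    return [tree for tree, n in counts.items() if n > 1]
```

The sketch reports every repeated subtree, including ones nested inside a larger repeated subtree. A real optimizer would presumably keep only the largest candidates and weigh the cost of materializing them.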

VectorWise & OPF – Reuse Segments

select s_acctbal, s_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from part, supplier, partsupp
where p_partkey = ps_partkey and s_suppkey = ps_suppkey and ...
  and ps_supplycost = (
    select min(ps_supplycost) from partsupp, supplier
    where p_partkey = ps_partkey and s_suppkey = ps_suppkey)

generates:

Project (
  HashJoin01 (
    As (
      IIREUSESQ6 =
      Project (
        HashJoin01 (
          MergeJoin1 (
            MScan (
              _partsupp000 = '_partsupp', [ '_ps_suppkey', '_ps_partkey', '_ps_supplycost', '__jpartsupp'] ...
      ), __VT_6_1_3_1
    ), [ __VT_6_1_3_1._p_partkey, __VT_6_1_3_1._ps_supplycost ],
    As (
      Aggr (
        As (
          IIREUSESQ6, __VT_6_0_3_2 ...

VectorWise & OPF – Cross Compiler

Optimizer compiles QEP
Code generator converts it to executable (by QEF) code form
Native Ingres query plans are fairly simple transformations from QEP
VectorWise query plans are algebra syntax built from QEP
QEF sees trivial query plan with single action – the VectorWise syntax

VectorWise & OPF – Cross Compiler

Cross compiler analyzes QEP much like code generation does for native Ingres queries
QEP nodes result in VectorWise operators
– ORIG nodes produce Mscan() operators
– Join nodes produce Merge/HashJoin() operators
– Etc.
Generates functions supported by VectorWise
– “not like” becomes “!(like(...”
Challenging issues of scope in name management
– Lots of “invented” table, column, partial result names
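The node-to-operator mapping described above can be sketched in Python. This is a toy under stated assumptions, not the OPF cross compiler: QepNode is a hypothetical, much-reduced stand-in for a real QEP node, only ORIG and a selection node are handled, and the output imitates the algebra shown on the earlier slides while leaving out the est_card annotations.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QepNode:
    """Hypothetical, much-reduced stand-in for an optimizer QEP node."""
    kind: str                          # "ORIG" or "SELECT"
    table: str = ""
    columns: Optional[List[str]] = None
    predicate: str = ""
    child: Optional["QepNode"] = None


def compile_predicate(op: str, left: str, right: str) -> str:
    """Emit prefix notation, rewriting operators the target does not offer."""
    if op == "not like":
        return f"!(like({left}, {right}))"
    return f"{op}({left}, {right})"


def cross_compile(node: QepNode) -> str:
    """Walk the QEP from the root and emit nested operator text."""
    if node.kind == "ORIG":
        cols = ", ".join(f"'_{c}'" for c in node.columns or [])
        return f"MScan(_{node.table} = '_{node.table}', [{cols}])"
    if node.kind == "SELECT":
        return f"Select({cross_compile(node.child)}, {node.predicate})"
    raise ValueError(f"unsupported QEP node kind: {node.kind}")
```

The sketch sidesteps the name-management problem entirely by deriving each name from the table name with an underscore prefix. A real compiler also has to invent and scope names for intermediate results.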

Vectorwise and QEF

select t2.str,count(t1.i) from t1 join t2
on t1.i=t2.i group by t2.str

The full QP tree in brief:

              GET
             /
          | 0|
           QP
          /
       HAGF
       /
    | 1|
     HJN
    /   \
 | 2|   | 3|
 ORIG   ORIG

vs

The full QP tree in brief:

X100Q

Vectorwise and QEF

New QEA_X100_QRY QP action header
Handles select, update, delete
X100 algebra text is sent to X100 server, reply rows (if any) returned to the user in the usual QEF manner
– QEF arranges for X100 interface to materialize rows directly into SCF buffers
REPEAT query parameters substituted as text into the X100 algebra

Vectorwise and QEF

New action types for X100 create table, create/drop constraint
New X100 control blocks attached to existing QEF cb's for CREATE TABLE, COPY, INSERT
Little effect on existing QP nodes
– QEN_QP extended to understand QEA_X100_QRY for VW → Ingres CTAS, VW scrollable cursors, future mixed query support
Transaction interface (start, abort, commit)
– X100 done first as it's more likely to fail
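The "X100 done first" ordering can be sketched for the commit case. The slide states only the ordering and its reason; aborting the Ingres side when the X100 commit fails is an assumption made for this illustration, and commit_all and its callback parameters are hypothetical names.

```python
from typing import Callable


def commit_all(commit_x100: Callable[[], None],
               commit_ingres: Callable[[], None],
               abort_ingres: Callable[[], None]) -> bool:
    """Commit the side that is more likely to fail first.

    If the X100 commit raises, the Ingres side has not committed yet
    and can still be aborted (assumed behaviour, not from the slides).
    """
    try:
        commit_x100()
    except RuntimeError:
        abort_ingres()
        return False
    commit_ingres()
    return True
```

Doing the riskier commit first keeps the window small in which one side has committed and the other has not.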

COPY and INSERT

COPY uses Ingres COPY client side code but sends rows to X100 rather than DMF
– No worker threads used
– VW COPY obeys constraints, may fail at end
Bulkload (APPEND) vs single row (INSERT)
– Both use X100 BinaryScan operator to read rows coming from Ingres
– Append is used for COPY, INSERT/SELECT
– Insert is used for INSERT VALUES

Vectorwise DDL

DDL does Ingres catalog DDL first, then VW DDL
– Allows Ingres-style name checking, locking (sort of)
– DMF knows that VW tables have no disk file
Constraints implemented directly in X100 rather than as system generated rules/DB procedures
– New iirefrel catalog updated for referential constraints
– Will someday be useful for Ingres constraints too
VW CREATE INDEX is really a MODIFY

X100 Interface

back/x100/x100 contains low level Vectorwise server interface:
(X100) Server and session control
NetBuffer Ingres ↔ X100 protocol
Ingres ↔ X100 data type and null translation
X100 → E_VWxxxx error code translation
Generation of simple canned queries for some operations
– e.g. generates Append(BinaryScan(...)) for COPY

X100 Interface

X100 Server runs as a separate process, one per database
– Ingres session does not connect until a VW query is issued
– X100 Server doesn't start until it's needed
– Once started, server persists until shutdown, terminate request, or destroydb
– Active servers are tracked globally (in lock-log shared memory) so dmfjsp can access them (this is new)
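The server lifecycle above (one per database, started on first use, tracked globally) can be sketched as a small registry. This is an in-process illustration only: the real tracking lives in lock-log shared memory, and ServerRegistry, its methods, and the integer server handle are all hypothetical.

```python
from typing import Callable, Dict


class ServerRegistry:
    """Hypothetical in-process stand-in for the shared list of active servers."""

    def __init__(self, start_server: Callable[[str], int]) -> None:
        self._start_server = start_server
        self._active: Dict[str, int] = {}   # database name -> server handle

    def connect(self, database: str) -> int:
        """Called on the first VW query; starts the server only if needed."""
        if database not in self._active:
            self._active[database] = self._start_server(database)
        return self._active[database]

    def shutdown(self, database: str) -> None:
        """Forget the server, e.g. on a terminate request or destroydb."""
        self._active.pop(database, None)

    def active(self) -> Dict[str, int]:
        """Snapshot that another tool could consult."""
        return dict(self._active)
```

A second connect() for the same database reuses the running server, which matches the "persists until shutdown" behaviour on the slide.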

X100 Interface

Simple “NetBuffer” protocol talks to x100 server
– Send (text) X100 Algebra query to X100
  • “password” trailer validates the sender
– Receive ack (after query successfully parsed)
– Receive or send rows (if select or insert)
  • Receive-with-timeout to poll for FE interrupts
– Receive end-of-query
– Error message packet terminates query
Interrupt to X100 handled with VW syscall from a transient connection
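The receive side of the exchange above, for a select, can be sketched in Python. This is not the NetBuffer wire format: a queue.Queue stands in for the connection, packets are modelled as hypothetical (kind, payload) tuples, and sending the query text with its “password” trailer is left out.

```python
import queue
from typing import Callable, List, Tuple

Packet = Tuple[str, object]   # hypothetical packet shape: (kind, payload)


def run_query(inbox: "queue.Queue[Packet]",
              interrupted: Callable[[], bool],
              poll_seconds: float = 0.05) -> List[object]:
    """Collect rows until end-of-query, polling for interrupts between receives."""
    kind, payload = inbox.get()            # first reply: ack or error
    if kind == "error":
        raise RuntimeError(payload)
    rows: List[object] = []
    while True:
        try:
            kind, payload = inbox.get(timeout=poll_seconds)
        except queue.Empty:
            if interrupted():              # the front end asked us to stop
                raise InterruptedError("query interrupted")
            continue
        if kind == "row":
            rows.append(payload)
        elif kind == "end":
            return rows
        elif kind == "error":              # an error packet terminates the query
            raise RuntimeError(payload)
```

The timeout on each receive is what lets the session notice a front-end interrupt while the server is still busy. In the sketch the interrupt only abandons the wait, whereas the slide describes a separate transient connection that tells the server to stop.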

Futures

New iivwfastload
– Direct pipe from client to X100 server
– Special COPY variant will be created to handle X100 server checks, table permit validation, etc.
– Maybe hook to regular COPY too???
Mixed Ingres/VW queries
– Mostly an optimizer problem
– Partition query into Ingres and VW parts
– Subquery results pushed into Ingres or VW temp tables as needed

Futures

QEF bypass (blue sky?)
– Feed results directly to client or GCD
Apply new optimizer algorithms to Ingres queries
– Reuse, iirefrel, functional dependency analysis, etc.
– Antijoins in QEF to reduce the use of Ingres Sejoins
Merge Ingres main and IVW (codev) code lines!

Questions?