Page 1: Building a Terabyte Data Warehouse, Using Linux and RAC George Lumpkin Director Product Management Oracle Corporation Session id: 40177
Building a Terabyte Data Warehouse, Using Linux and RAC

George Lumpkin

Director Product Management

Oracle Corporation

Session id: 40177

Page 3:

Do More with Less

• More performance
• More scalability
• More users
• Less capital cost
• Less administration cost

Page 4:

RAC for Scalability, Availability, and Flexibility

Page 5:

Linux and RAC for DW Scalability

Data Warehouse DB

Linux ‘Starter’ Cluster:
– Two nodes
– One shared database

Page 6:

Linux and RAC for DW Scalability

As the Business Grows …

Data Warehouse DB

Page 7:

Linux and RAC for DW Scalability

As the Business Grows …

… so does your Environment:
– Three Nodes
– One Database

Data Warehouse DB

Page 8:

Linux and RAC for DW Scalability

As the Business Grows …

Data Warehouse DB

… and again:
– Four Nodes
– One Database

Page 9:

Linux and RAC for DW Availability

When one node fails …

Data Warehouse DB

Page 10:

Linux and RAC for DW Availability

When one node fails …

… the load is rebalanced and three-quarters of the cluster continues the work

Data Warehouse DB
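
The three-quarters figure is the surviving share of a four-node cluster after one node fails. A minimal sketch of that arithmetic, assuming identical nodes and perfectly even rebalancing (both are assumptions, not statements from the slides); `surviving_capacity` is a hypothetical helper name:

```python
from fractions import Fraction


def surviving_capacity(total_nodes: int, failed_nodes: int = 1) -> Fraction:
    """Fraction of cluster capacity left after `failed_nodes` nodes fail.

    Illustrative only: assumes identical nodes and even rebalancing.
    """
    if total_nodes <= 0 or not 0 <= failed_nodes <= total_nodes:
        raise ValueError("need total_nodes > 0 and 0 <= failed_nodes <= total_nodes")
    return Fraction(total_nodes - failed_nodes, total_nodes)


# The four-node cluster from the slides, with one node down.
print(surviving_capacity(4))  # 3/4
```

The same function shows why a larger cluster loses less on a single failure: a two-node starter cluster drops to one half of its capacity, while a four-node cluster keeps three-quarters.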

Page 11:

Linux and RAC for DW Flexibility

The cluster can share all workloads across every node …

[Diagram labels: Query ×4, ETL ×4]

Data Warehouse DB

Page 12:

Linux and RAC for DW Flexibility

… or do workload partitioning

[Diagram labels: Query ×4, ETL ×4]

Data Warehouse DB

Page 13:

Linux and RAC for DW Flexibility

Workload Management and Provisioning made easy

Christmas – “Data Season” for Retail

[Diagram labels: Query ×4, ETL ×6]

Data Warehouse DB

Page 14:

Linux and RAC for DW Flexibility

Workload Management and Provisioning made easy

January – “Analysis Season”

[Diagram labels: Query ×6, ETL ×4]

Data Warehouse DB

Page 15:

RAC and Parallel Execution

Page 16:

RAC and Parallel Execution

• Very large queries utilize all resources on the cluster

Large Query

Page 17:

RAC and Parallel Execution

• Many large-scale DWs have many concurrent jobs
– Multiple “small-to-medium” size queries
– Degree of parallelism < CPUs-per-node

• With Oracle, queries will automatically run on a single node, eliminating traffic over the interconnect

[Diagram: queries Q1 to Q12, shown in three groups of four]
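
The idea on this slide can be illustrated with a toy placement rule. This is a sketch for intuition only, not Oracle's actual scheduling logic: `nodes_spanned` and its rule (keep a query on one node when its degree of parallelism fits within that node's CPUs, otherwise spread it over as many nodes as needed) are hypothetical.

```python
def nodes_spanned(dop: int, cpus_per_node: int, node_count: int) -> int:
    """Number of nodes a query spans under a toy placement rule.

    Hypothetical rule for illustration: one parallel process per CPU,
    so a query needs ceil(dop / cpus_per_node) nodes, capped at the
    size of the cluster.
    """
    if dop < 1 or cpus_per_node < 1 or node_count < 1:
        raise ValueError("all arguments must be positive")
    nodes_needed = -(-dop // cpus_per_node)  # ceiling division
    return min(nodes_needed, node_count)


# A small-to-medium query (degree of parallelism 2) on 4-CPU nodes
# stays on one node, so it generates no inter-node traffic in this model.
small = nodes_spanned(dop=2, cpus_per_node=4, node_count=4)

# A very large query (degree of parallelism 16) spans the whole cluster.
large = nodes_spanned(dop=16, cpus_per_node=4, node_count=4)

print(small, large)  # 1 4
```

In this model, only queries that span more than one node would place load on the interconnect.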

Page 18:

Recipe for a RAC Linux DW

• Processors
• I/O
• Interconnect

Page 19:

Recipe for a RAC Linux DW: Processors

• Data warehouse workload determines total number of CPUs
– Same sizing considerations as non-clustered DW
• How many processors per node?
– Enough CPUs so that a single node can handle most database operations
– Often, 4 CPUs is a good balance

Page 20:

Recipe for a RAC Linux DW: I/O

• I/O is typically the primary determinant of data warehouse performance
– Storage configurations for a data warehouse should always be chosen based on I/O bandwidth, not storage capacity
• Rule of thumb: at least 100 MBytes/sec of I/O bandwidth per gigahertz of processing power
• Every component of the I/O system should provide enough bandwidth: disks, I/O channels, I/O adapters
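
A minimal sketch of how the rule of thumb and the "every component" requirement combine: the bandwidth a node can actually use is limited by its slowest I/O component. The function names and the component figures below are hypothetical, chosen only to illustrate the check; the 100 MB/sec per GHz constant is the rule of thumb from the slide.

```python
from typing import Mapping

MB_PER_SEC_PER_GHZ = 100  # rule of thumb from the slide


def required_io_mb_per_sec(cpu_count: int, ghz_per_cpu: float) -> float:
    """I/O bandwidth a node should have, per the rule of thumb."""
    return cpu_count * ghz_per_cpu * MB_PER_SEC_PER_GHZ


def effective_io_mb_per_sec(component_mb_per_sec: Mapping[str, float]) -> float:
    """Usable bandwidth is capped by the slowest component in the path."""
    return min(component_mb_per_sec.values())


# Hypothetical per-node component bandwidths, in MB/sec.
components = {"disks": 1200.0, "io_channels": 900.0, "io_adapters": 600.0}

needed = required_io_mb_per_sec(cpu_count=4, ghz_per_cpu=2.0)
available = effective_io_mb_per_sec(components)
bottleneck = min(components, key=components.get)

if available < needed:
    print(f"{bottleneck} limit the node to {available} MB/sec; {needed} MB/sec needed")
```

With these made-up figures the adapters are the bottleneck, even though the disks alone would exceed the target.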

Page 21:

Recipe for a RAC Linux DW: I/O

• CPU power and I/O bandwidth should be balanced within a server
– Example: each node has 4 x 2 GHz processors
– Each node can utilize at least 800 MB/sec
– Each node should have enough slots to accommodate the necessary I/O throughput
– If one host bus adapter drives 150 MB/sec, then 6 HBAs should accommodate the needed I/O bandwidth
– Note that at least one slot is required for the interconnect
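
The arithmetic behind this example can be checked in a few lines. This is a sketch under two assumptions that the slide does not state: each HBA occupies exactly one slot, and the interconnect needs exactly one slot. `hbas_needed` is a hypothetical helper name.

```python
import math


def hbas_needed(required_mb_per_sec: float, mb_per_sec_per_hba: float) -> int:
    """Smallest whole number of HBAs that covers the required bandwidth."""
    return math.ceil(required_mb_per_sec / mb_per_sec_per_hba)


# 4 CPUs x 2 GHz x 100 MB/sec per GHz (rule of thumb) = 800 MB/sec per node.
required = 4 * 2 * 100

hbas = hbas_needed(required, mb_per_sec_per_hba=150)  # 800 / 150 = 5.33, rounded up
slots = hbas + 1  # assumption: one slot per HBA plus one for the interconnect

print(hbas, slots)  # 6 7
```

Five HBAs would deliver only 750 MB/sec, which is why the example rounds up to six.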

Page 22:

Recipe for a RAC Linux DW: Interconnect

• Gigabit Ethernet is generally sufficient for data-warehouse workloads
– Oracle minimizes interconnect traffic for multi-user workloads
• Workloads requiring inter-node parallel query will utilize more interconnect bandwidth
– 10Gb Ethernet, Fibre Channel, InfiniBand

Page 23:

‘Typical’ Cluster configuration

[Diagram labels]
• 4 nodes, each with 4 x 2 GHz CPUs, 5 PCI slots
• 1 Gigabit Ethernet
• Two 16-port switches
• 16 storage arrays, each with 10-20 disks
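
Applying the 100 MB/sec per GHz rule of thumb from the I/O slides to this configuration gives a rough bandwidth target per storage array and per disk. This is a sketch that assumes the load is spread evenly over all arrays and disks, which the slide does not claim.

```python
NODES = 4
CPUS_PER_NODE = 4
GHZ_PER_CPU = 2
STORAGE_ARRAYS = 16
MB_PER_SEC_PER_GHZ = 100  # rule of thumb from the I/O slides

# Total bandwidth the cluster's CPUs can consume under the rule of thumb.
cluster_mb_per_sec = NODES * CPUS_PER_NODE * GHZ_PER_CPU * MB_PER_SEC_PER_GHZ

# Even spread across arrays (assumption), then across 10 or 20 disks per array.
per_array_mb_per_sec = cluster_mb_per_sec / STORAGE_ARRAYS
per_disk_mb_per_sec = {disks: per_array_mb_per_sec / disks for disks in (10, 20)}

print(cluster_mb_per_sec)    # 3200
print(per_array_mb_per_sec)  # 200.0
print(per_disk_mb_per_sec)   # {10: 20.0, 20: 10.0}
```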

Page 24:

Oracle Linux/RAC DW Customers

Page 25:

RAC/Linux DW Customers

• Euronext
– Database size: 1.5 TB
– Hardware: 2 x HP DL580 (4 CPUs)
– Storage: HP MSA 1000
– Interconnect: 1 Gb ethernet
– OS: Red Hat

• AOK Berlin
– Database size: 780 GB
– Hardware: 2 x HP DL580 (4 CPUs)
– Storage: EMC Symmetrix
– Interconnect: 2 x 1Gb ethernet
– OS: SuSE

• Vanderbilt University
– Database size: 50 GB
– Hardware: 3 x HP DL580 (4 CPUs)
– Storage: EMC Symmetrix
– Interconnect: 1 Gb ethernet
– OS: Red Hat

• National Bank AG
– Database size: 75 GB
– Hardware: 3 x IBM Express5800 (2 CPUs)
– Interconnect: 100 Mb ethernet
– OS: SuSE

• Ellis Island Foundation
– Database size: 60 GB
– Hardware: 2 x HP DL580 (4 CPUs)
– Storage: NetApp
– Interconnect: 1Gb ethernet
– OS: Red Hat

Page 26:

Linux-RAC and the Grid

Page 27:

An increasingly common customer theme these days is “provisioning”

Customers want more value out of their hardware expenditures – they want to take advantage of unused capacity

Oracle’s architecture is unique in being able to truly support flexible provisioning of processing power across multiple databases

Oracle will be widely deployed in large commercial computing “grids” in the future

Evolution of Business Intelligence with Oracle

Page 28:

ETL processing, Query & Reporting, Data Mining and Scoring, Cube Creation and OLAP Analysis

Order Entry, Shipments, Procurement, Inventory, …

Real Application Clusters

Page 29:

Resource Provisioning

December: Order Processing Heavy – Analytics Light

ETL processing, Query & Reporting, Data Mining, …

Order Entry, Shipments, Procurement, Inventory, …

Page 30:

Order Entry, Shipments, Procurement, Inventory, …

ETL processing, Query & Reporting, Data Mining and Scoring, Cube Creation and OLAP Analysis

Resource Provisioning

January: Order Processing Light – Heavy Analytics

Page 31:

Oracle RAC Brings Flexible Processing Power to Databases on the Grid

Page 32:

Next Steps … Data Warehousing DB Sessions

Monday and Tuesday
• 11:00 AM, #40153, Room 304: Oracle Warehouse Builder: New Oracle Database 10g Release
• 3:30 PM, #40176, Room 303: Security and the Data Warehouse
• 4:00 PM, #40166, Room 130: Oracle Database 10g SQL Model Clause
• 8:30 AM, #40125, Room 130: Oracle Database 10g: A Spatial VLDB Case Study
• 3:30 PM, #40177, Room 303: Building a Terabyte Data Warehouse, Using Linux and RAC
• 5:00 PM, #40043, Room 104: Data Pump in Oracle Database 10g: Foundation for Ultrahigh-Speed Data Movement

For More Info On Oracle BI/DW Go To http://otn.oracle.com/products/bi/db/dbbi.html

Page 33:

Next Steps … Data Warehousing DB Sessions

Thursday
• 8:30 AM, #40179, Room 304: Oracle Database 10g Data Warehouse Backup and Recovery
• 11:00 AM, #36782, Room 304: Experiences with Real-Time Data Warehousing using Oracle 10g
• 1:00 PM, #40150, Room 102: Turbocharge your Database, Using the Oracle Database 10g SQLAccess Advisor

Business Intelligence and Data Warehousing Demos, All Four Days in the Oracle Demo Campground
• Oracle Database 10g
• Oracle OLAP
• Oracle Data Mining
• Oracle Warehouse Builder
• Oracle Application Server 10g

For More Info On Oracle BI/DW Go To http://otn.oracle.com/products/bi/db/dbbi.html

Page 34:

Reminder – please complete the OracleWorld online session survey

Thank you.
