South Florida Oracle User Group
HP Oracle Database Platform / Exadata Appliance – Extreme Data Warehousing
March 26, 2009
Shyam Varan Nath
President, Oracle BIWA SIG, and Founder, Exadata SIG (http://OracleExadata.org)
Agenda
The Problem – Storage Bottleneck for Large Databases
Introduction to Data Warehouse Appliances
Market Landscape
The Solution - Oracle Database Platform and Exadata Storage
Technical Details
Summary
Questions
About Myself
• Certified DBA (OCP) on 4 different database versions, since 1998
• Former member of Oracle Corporation, BI Consulting Practice
• Experience in Oracle Data Warehousing, Business Intelligence (OBIEE) and Data Mining
• Founder and President of Oracle BIWA SIG (http://OracleBIWA.org) and Exadata SIG
• Received the IOUG Oracle Contribution Award in 2007
• Frequent speaker at Oracle OpenWorld (2003, 06, 07, 08), NYOUG (June 06, Sep 06, Sep 08, Mar 09), IOUG/Collaborate (2005, 06, 08), NOUG (2006), SFOUG (2007) and ODTUG (2008) on topics ranging from Database to BI
• Bachelor's from Indian Institute of Technology (IIT); MBA and MS from Florida Atlantic University
• Based in South Florida since 1995
A word of thanks to SFOUG for this talk today
The Problem – Storage Bottleneck for Large Databases
• Today most databases run on computers with one or many powerful CPUs
• Most large databases are I/O bound rather than CPU bound
• Large storage systems are not able to feed data to the database server at a fast enough rate
• How can we make the storage more intelligent?
Business imperative: What is the choking point for large databases? The database engine, the storage, or the interconnect?
- 5 -
Exadata Storage: The Next Step in VLDW Technology
Over the past 12+ years, Oracle has steadily introduced major architectural advances for large database support. Data warehouses have grown exponentially with these new technologies.
[Figure: timeline with axis years 1995, 1997, 1999, 2001, 2003, 2005, 2008]
Oracle releases shown: Release 7.3, Oracle8, Oracle8i, Oracle9i, Oracle9iR2, Oracle10g, Oracle11g
Features shown: Parallel Execution, Range Partitioning, Composite Partitioning, Real Application Clusters, Compression, Automatic Storage Management, Exadata
Milestones shown:
• First 1TB database built in lab
• First 1TB customer: Acxiom
• First 10TB customer: Amazon.com
• First 30TB customer: France Telecom
• First 100TB customer: Yahoo!
• Over 100 Terabyte customers
- 6 -
How Big is the Data Warehouse Storage Problem?
Business imperative
ABC Inc.'s data warehouse is approaching 12 terabytes in size and growing by 100% every year. Storage and backup of data alone cost 24% of the IT budget.
Today / Tomorrow
How much are we spending on storage?
• Annual storage cost is $1.2m
• Total IT budget is $5m, and the cost is expected to double next year at the given rate
What are the other impacts of huge storage needs?
• Information retrieval is slow
• Not only is the data warehouse growing unmanageable in size, but queries are also slowing down, leading to lost orders
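The figures on this slide can be checked with a little arithmetic. The sketch below assumes, as the slide implies, that storage cost grows in step with data volume. The function name `project_storage` and the three-year horizon are illustrative choices, not part of the original talk.

```python
# Projection of the ABC Inc. example: 12 TB today, growing 100% per year,
# with storage and backup at 24% of a $5m IT budget.
# Assumption (not from the slide): storage cost scales linearly with data volume.

IT_BUDGET_USD = 5_000_000
STORAGE_SHARE_PCT = 24
SIZE_TB_TODAY = 12
GROWTH_PCT_PER_YEAR = 100


def project_storage(size_tb, annual_cost_usd, growth_pct, years):
    """Return (year, size_tb, annual_cost_usd) tuples for year 0..years."""
    rows = []
    for year in range(years + 1):
        factor = (1 + growth_pct / 100) ** year
        rows.append((year, size_tb * factor, annual_cost_usd * factor))
    return rows


# 24% of $5m, computed with integers to avoid rounding noise.
annual_storage_cost = IT_BUDGET_USD * STORAGE_SHARE_PCT // 100

projection = project_storage(SIZE_TB_TODAY, annual_storage_cost,
                             GROWTH_PCT_PER_YEAR, years=3)

for year, size_tb, cost in projection:
    print(f"year {year}: {size_tb:.0f} TB, ${cost:,.0f} per year")
```

Under that assumption the $1.2m annual cost becomes $2.4m after one year, which is the doubling the slide expects.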
- 7 -
What is causing the explosion of data in most enterprises?
Landscape
• Regulatory compliance: government regulations such as SOX and HIPAA mandate storing historical data for a certain number of years
• Multimedia content: bandwidth has become cheap, and increasing amounts of multimedia content are being generated and stored
• Web 2.0: a new kind of data source, such as social networks and blogs, leading to various forms of semi-structured and unstructured data. Some of this data is stored in the database, some in ECM
• Migration of legacy applications: as legacy applications from mainframes and other file-based databases are migrated to RDBMS, increasing volumes of data are stored inside the database
• Click-stream: click-stream and personalization data continues to explode for online sites
- 8 -
Some Large Databases in Use Today
• Yahoo's data needs are substantial.
• According to Hasan, VP of Data, the travel industry's Sabre system handles 50m events/day, credit card company Visa handles 120m events/day, and the New York Stock Exchange has handled over 225m events/day.
• Yahoo, he said, handles 24 billion events/day, fully two orders of magnitude more than these non-Internet companies.
- 9 -
Building on Oracle's Leading Position
Number 1 in Data Warehousing!
Market size is $6.7 billion with 14.6% growth YoY
Vendor        Share
Oracle        39.3%
IBM           21.7%
Microsoft     14.8%
Other         12.5%
Teradata      11.7%
Source: IDC, Aug 2008, "Worldwide Data Warehouse Management Tools 2007 Vendor Shares"
- 10 -
Market Landscape
Business imperative
• What does the market landscape of data warehouse appliances look like?
DW appliances shown: Teradata, Netezza, Oracle Database Platform, Exadata Storage
Dimensions shown: Data Storage, Data Process & Organization, Cost Benefit, User Experience, Competitive Advantage
• Use of data compression reduces storage need by up to 5 times, reducing storage cost by up to 60%
• Users are able to retrieve information faster because query response time improves by up to 3 times
• The cost of the additional license for data compression is $1 million. Total expected cost benefit is about $2 million per year
• The ability to get results 3 times faster from the data warehouse will enhance the decision support process and result in 20% more customer orders, adding $4 million to annual revenue
- 11 -
HP Oracle Database Machine: The Next Step in DW Hardware Solutions
Custom
• Complete flexibility
• Any OS, any platform
• Easy fit into a company's IT standards
Reference Configurations
• Documented best-practice configurations for data warehousing
Optimized Warehouse
• Scalable systems pre-installed and pre-configured: ready to run out of the box
HP Oracle Database Machine
• Highest performance
• Pre-installed and pre-configured
• Sold by Oracle
- 12 -
Quote from TDWI
"In any BI application, it's always disk I/O that slows performance."
• Data warehouses are mainly I/O bound rather than CPU bound
• Other VLDB techniques, such as partitioning and compression, work with Exadata
• Exadata is good for index scans as well, improving index read efficiency
- 13 -
Three-Pronged Approach to Solve the Problem
• Faster pipe: Infiniband
• More pipes
• More efficient use of the data pipe by division of work between the DB Grid and the Exadata Storage Server
- 14 -
HP Oracle Database Machine: Extreme Performance
10-100X faster than conventional DW systems
High bandwidth: 14GB/sec of raw I/O throughput
• >50GB/sec of raw business data can be processed with compression
• High-bandwidth Infiniband network between Database Servers and Storage Servers
• Efficient block access in Storage Servers
"Smart scan" processing
• Data-intensive processing in the storage server
• Compute-intensive processing in the database server
• Less data transfer over the network
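To put the bandwidth figures in perspective, here is a rough back-of-the-envelope sketch. It assumes a decimal terabyte (1000 GB) and a sustained, ideal transfer rate, neither of which is stated on the slide. The helper `scan_seconds` is a hypothetical name used only for this illustration.

```python
# Rough scan-time arithmetic from the quoted throughput figures.
# Assumptions: 1 TB = 1000 GB, and the quoted rate is sustained for the whole scan.

RAW_GB_PER_SEC = 14.0         # raw I/O throughput quoted on the slide
COMPRESSED_GB_PER_SEC = 50.0  # lower bound quoted for compressed business data


def scan_seconds(table_gb, gb_per_sec):
    """Seconds needed to stream table_gb at a sustained gb_per_sec."""
    return table_gb / gb_per_sec


one_tb_gb = 1000.0
raw_scan = scan_seconds(one_tb_gb, RAW_GB_PER_SEC)

# The two quoted rates together imply a minimum compression ratio.
implied_ratio = COMPRESSED_GB_PER_SEC / RAW_GB_PER_SEC

print(f"1 TB full scan at 14 GB/s: about {raw_scan:.0f} s")
print(f"implied compression ratio: at least {implied_ratio:.1f}x")
```

A full scan of a 1 TB table therefore takes a little over a minute at the raw rate, and the ">50GB/sec" claim corresponds to data compressing by roughly 3.6x or better.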
- 15 -
HP Oracle Database Machine: Key Components
Database Server Grid: 8 servers, each consisting of:
• One HP DL360-G5 with
– 2 Intel quad-core processors
– 32 GB RAM
– 4 x 146GB SAS disks
– Dual-port Infiniband Host Channel Adapter (HCA)
• Oracle Enterprise Linux
• Oracle Database 11g Enterprise Edition with Real Application Clusters and Partitioning
Exadata Storage Server Grid: 14 servers, each consisting of:
• One HP DL180-G5 with
– 2 Intel quad-core processors
– 8GB RAM
– 12 x 450GB SAS or 1TB SATA disks
– Dual-port Infiniband Host Channel Adapter (HCA)
• Oracle Enterprise Linux
• Oracle Exadata Storage Server Software
4 Infiniband switches, each with 24 ports
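The per-server figures above add up to the rack-level totals quoted later in the technology comparison (64 DB cores, 112 storage cores, 176 total cores, 368 GB memory, 168 disks). A minimal sketch of that arithmetic follows. The `ServerSpec` class and its field names are invented for this illustration, and the disk count ignores the 4 internal disks in each database server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Per-server hardware for one grid (illustrative type, not an Oracle API)."""
    count: int             # servers in the grid
    sockets: int           # processors per server
    cores_per_socket: int  # quad-core = 4
    ram_gb: int            # memory per server
    data_disks: int        # data disks per server

    @property
    def total_cores(self):
        return self.count * self.sockets * self.cores_per_socket

    @property
    def total_ram_gb(self):
        return self.count * self.ram_gb

    @property
    def total_data_disks(self):
        return self.count * self.data_disks


# Values taken from the Key Components slide.
DB_GRID = ServerSpec(count=8, sockets=2, cores_per_socket=4, ram_gb=32, data_disks=0)
STORAGE_GRID = ServerSpec(count=14, sockets=2, cores_per_socket=4, ram_gb=8, data_disks=12)

total_cores = DB_GRID.total_cores + STORAGE_GRID.total_cores
total_ram_gb = DB_GRID.total_ram_gb + STORAGE_GRID.total_ram_gb

print(f"DB cores: {DB_GRID.total_cores}, storage cores: {STORAGE_GRID.total_cores}")
print(f"total cores: {total_cores}, total memory: {total_ram_gb} GB")
print(f"storage disks: {STORAGE_GRID.total_data_disks}")
```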
- 16 -
Division of Work
Exadata Storage Server
• Implements data-intensive processing directly in storage
– Scans tables and indexes, filtering out data that is not relevant to a query
Compute-intensive data processing remains in the database servers
• Joins, aggregation, statistics, data conversions, etc.
An Exadata cell is smart storage, not a database node
[Figure labels: Data-Intensive Processing | Compute-Intensive Processing]
- 17 -
How Does Query Processing Change with Exadata?
- 18 -
Smart Scans
• Exadata cells implement smart scans to greatly reduce the data that needs to be processed by the database
– Only return relevant rows and columns to the database
– Offload predicate evaluation
• Data reduction is usually very large
– Column and row reduction often decrease the data to be returned to the database by 10x
- 19 -
Traditional Scan Processing
• Smart Scan example: a telco wants to identify customers that spend more than $200 on a single phone call
• With traditional storage, all database intelligence resides in the database hosts
• Most data returned from storage is discarded by the database
• Discarded data consumes valuable resources and impacts the performance of other workloads
Processing steps:
1. Query issued: SELECT customer_id FROM calls WHERE amount > 200;
2. Table extents identified
3. I/Os issued
4. I/Os executed: 1 terabyte of data returned to hosts
5. DB host reduces the terabyte of data to 1000 customer names that are returned to the client
6. Rows returned
- 20 -
Exadata Smart Scan Processing
• Only the relevant columns (customer_id) and required rows (where amount > 200) are returned to the database
• CPU consumed by predicate evaluation is offloaded
• Moving scan processing off the database frees CPU cycles and eliminates lots of unproductive messaging
• Returns the needle, not the entire haystack
Processing steps:
1. Query issued: SELECT customer_id FROM calls WHERE amount > 200;
2. Smart scan constructed and sent to cells
3. Smart scan identifies rows and columns within the terabyte table that match the request
4. 2MB of data returned to server
5. Consolidated result set built from all cells
6. Rows returned
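The difference between the two scan styles can be illustrated with a toy model. This is only a conceptual sketch, not how Exadata is implemented: the `Call` record, the extra `duration_sec` column, the three in-memory "cells", and the idea of counting shipped values as a stand-in for bytes on the wire are all assumptions made for the example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Call:
    customer_id: int
    amount: float
    duration_sec: int  # hypothetical extra column that the query does not need


COLUMNS_PER_ROW = len(fields(Call))


def traditional_scan(cells):
    """Storage ships every row and column; the database host filters and projects."""
    shipped_rows = [row for cell in cells for row in cell]
    result = [row.customer_id for row in shipped_rows if row.amount > 200]
    values_shipped = COLUMNS_PER_ROW * len(shipped_rows)
    return result, values_shipped


def smart_scan(cells):
    """Each cell filters and projects locally; the host only merges partial results."""
    partials = [[row.customer_id for row in cell if row.amount > 200]
                for cell in cells]
    result = [customer_id for part in partials for customer_id in part]
    values_shipped = len(result)
    return result, values_shipped


# Three toy storage cells, each holding a slice of the calls table.
cells = [
    [Call(1, 12.5, 60), Call(2, 250.0, 5400)],
    [Call(3, 3.0, 30), Call(4, 199.99, 4000)],
    [Call(5, 310.0, 7200), Call(6, 45.0, 600)],
]

traditional_result, traditional_shipped = traditional_scan(cells)
smart_result, smart_shipped = smart_scan(cells)

print("traditional:", traditional_result, "values shipped:", traditional_shipped)
print("smart scan: ", smart_result, "values shipped:", smart_shipped)
```

Both functions return the same customer IDs; only the amount of data that crosses from storage to the database host differs, which is the point the slide makes with its 1 terabyte versus 2MB comparison.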
- 21 -
Smart Scan Transparency
• Smart scans correctly handle complex cases, including
– Uncommitted data and locked rows
– Chained rows
– Compressed tables
– National Language Processing
– Date arithmetic
– Regular expression searches
– Partitioned tables
• Smart scans are transparent to the application
– No application or SQL changes required
– Returned data is fully consistent and transactional
– If a cell dies during a smart scan, the uncompleted portions of the smart scan are transparently routed to another cell
- 22 -
Data Flow Concepts
• Concept of data flow and producer-consumer relationships
• Three kinds of data exchanges take place
– Exchange 1
– Exchange 2
– Exchange 3
• Exchange 1 is the flow of data within an Exadata cell using the iDB protocol; throughput is 60-80MB/sec per disk
• Exchange 2 is between a single cell and the Database grid (1Gb/sec)
• Exchange 3 is between the Database grid and the Storage grid (1.6 GB/sec)
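The per-disk figure for Exchange 1 can be related to a whole cell with simple multiplication. The sketch below assumes all 12 disks in a cell stream concurrently at the quoted rate, which is an idealization, and `cell_scan_rate_mb` is a made-up helper name.

```python
# Aggregate disk throughput of one Exadata cell.
# Assumption: all disks in the cell stream in parallel at the quoted per-disk rate.

DISKS_PER_CELL = 12                   # from the Key Components slide
PER_DISK_MB_PER_SEC_RANGE = (60, 80)  # Exchange 1 range from this slide


def cell_scan_rate_mb(per_disk_mb_per_sec, disks=DISKS_PER_CELL):
    """Combined MB/sec delivered by all disks in one cell."""
    return per_disk_mb_per_sec * disks


low_mb, high_mb = (cell_scan_rate_mb(rate) for rate in PER_DISK_MB_PER_SEC_RANGE)

print(f"one cell: {low_mb} to {high_mb} MB/sec from {DISKS_PER_CELL} disks")
```

The resulting 720 to 960 MB/sec is roughly in line with the 1 GB/s per-cell bandwidth that the data capacity table lists for the SAS storage server.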
- 23 -
Visual of Data Flow Exchanges
- 24 -
Targeted Messages: to DW Managers / Architects vs. to DBAs / System Admins
Key Messages for DW Managers / Architects
• 10x - 100x performance gains for end-user queries
• Zero changes to existing BIDW tools and applications
• Supports large numbers of Decision Support users and applications
• Fast deployment: no configuration needed
Key Messages for DBAs / Sys Admins
• Built on Oracle Database 11g (11.1.0.6 and higher), consistent with corporate standards
• Based on standard hardware components from HP: no proprietary hardware
• Oracle provides a single point of purchase and support
• Hardware repair is provided by HP worldwide
- 25 -
HP Oracle Database Machine: Data Capacity
Hardware                                    Raw Storage   User Data   Data Bandwidth
HP Exadata Storage Server Hardware SAS      5.4 TB        1.5 TB      1 GB/s
HP Exadata Storage Server Hardware SATA     12 TB         3.3 TB      0.75 GB/s
HP Oracle Database Machine Hardware SAS     97 TB         30 TB       14 GB/s
HP Oracle Database Machine Hardware SATA    168 TB        46 TB       10.5 GB/s
Raw Storage: total raw disk capacity, computed as (# disks x disk capacity)
User Data: space for end-user data, computed after mirroring and after allowing space for database structures such as temp, logs, undo, and indexes. User data capacity is uncompressed; with compression, 2x to 4x more data can often be stored. Actual user data capacity varies by application.
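The raw storage definition (# disks x disk capacity) can be checked against the component counts from the Key Components slide. The sketch below works in whole GB and MB/sec to avoid rounding noise, and it assumes the machine-level bandwidth is simply the per-cell bandwidth times 14 cells, which the table's numbers are consistent with. The function name `raw_storage_gb` is illustrative.

```python
# Raw storage and bandwidth arithmetic behind the data capacity table.
# Assumption: machine bandwidth = per-cell bandwidth x number of cells.

CELLS_PER_MACHINE = 14
DISKS_PER_CELL = 12
SAS_DISK_GB = 450
SATA_DISK_GB = 1000


def raw_storage_gb(disks, disk_gb):
    """Raw capacity as defined on the slide: number of disks times disk capacity."""
    return disks * disk_gb


sas_cell_gb = raw_storage_gb(DISKS_PER_CELL, SAS_DISK_GB)    # table: 5.4 TB
sata_cell_gb = raw_storage_gb(DISKS_PER_CELL, SATA_DISK_GB)  # table: 12 TB
sata_machine_gb = CELLS_PER_MACHINE * sata_cell_gb           # table: 168 TB

sas_machine_mb_per_sec = CELLS_PER_MACHINE * 1000   # table: 14 GB/s
sata_machine_mb_per_sec = CELLS_PER_MACHINE * 750   # table: 10.5 GB/s

print(f"SAS cell raw: {sas_cell_gb / 1000} TB, SATA cell raw: {sata_cell_gb / 1000} TB")
print(f"SATA machine raw: {sata_machine_gb / 1000} TB")
print(f"machine bandwidth: SAS {sas_machine_mb_per_sec / 1000} GB/s, "
      f"SATA {sata_machine_mb_per_sec / 1000} GB/s")
```

Applied to the SAS machine with 450GB disks, the same formula gives 75.6 TB rather than the 97 TB shown in the table, so that row is not checked here.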
- 26 -
HP Oracle Database Machine: High Availability
Problem                    Database Machine Solution
Power failure              Redundant power supplies for all servers
Switch failure             Redundant switches; dual-port HCAs in all servers
Disk failure               Oracle Automatic Storage Management: all disks are mirrored
Database Server failure    Oracle Real Application Clusters
Storage Server failure     Oracle Exadata Storage Servers
- 27 -
HP Oracle Database Machine: Installation
Goal: Deliver to the customer a completely functioning database system
• All servers properly configured and networked
• All software configured (CRS, RAC, DB, Exadata)
• Default database created
• Performance and functionality validated
Installation is included in the price of the HP Oracle Database Machine
• Onsite HP Installation Services
• Onsite Oracle ACS Services
- 28 -
HP Oracle Database Machine: Support
Single point of contact for support (Oracle) for the entire HP Oracle Database Machine
• Hardware
• Software
– Oracle Enterprise Linux
– Database
– Exadata Storage Software
Software issues are resolved by Oracle support
Hardware support
• Hardware issues are passed to HP
• HP contacts the customer to resolve the issues
• HP Support is available 24x7
– For on-site support, HP has to respond (not repair) within defined times
• Customer can buy additional support (HP Care Packs)
- 29 -
DB Machine Technology Comparison
                   Teradata 2550       Netezza 10100        HP Oracle Database Machine
Footprint          1 rack              1 rack               1 rack
HW Architecture    Proprietary**       Proprietary          Open
User data          12.6 TB             12.5 TB              21 TB
Total cores        32                  112*                 176
Storage cores      0                   108*                 112
Database cores     32                  4 (?)                64
Disks              144 x 300GB         108 x 400GB          168 x 450GB
Interconnect       1 Gb/sec BYNET      1Gb/sec Ethernet     20Gb/sec Infiniband
Memory             128 GB              108 GB               368 GB
* Netezza 10100 uses PowerPC CPUs (less powerful than Intel Xeon cores)
** Teradata BYNET interconnect is proprietary
- 30 -
Retailer Exadata Speedup – 3x to 50x
[Figure: bar chart of speedup (x) per workload, axis from 0 to 50; average speedup 16x]
Workloads shown:
• Recall Query
• Gift Card Activations
• Sales and Customer Counts
• Prompt04 Clone for ACL audit
• Date to Date Movement Comparison - 53 weeks
• Materialized Views Rebuild
• Merchandising Level 1 Detail by Week
• Supply Chain Vendor - Year - Item Movement
• Merchandising Level 1 Detail: Current - 52 weeks
• Merchandising Level 1 Detail: Period Ago
- 31 -
Exadata's Value Proposition
• Ability to stay on Oracle Database for extreme BIDW performance
• Compatibility with DB features like Partitioning, DB Compression, etc.
• Horizontal scalability for the Database Grid and Storage Grid
• Pre-built solutions from Oracle for BIDW, like BI-Apps using OBIEE, and industry extensions like Oracle Data Warehouse for Retail (Accelerators)
• Single point of support for hardware and software
[Figure: Oracle HP Database Machine, with labels: Scalable DB; Scalable Storage; Ready Configuration; BI/DW Technical Infrastructure; Existing DB features compatibility (Partitioning); Pre-built BI Accelerators; Industry Vertical Solutions; Reference Customers; Single Point of Contact]
- 32 -
Exadata Benefits
Extreme Performance
• 10X to 100X speedup for data warehousing
Database-Aware Storage
• Smart Scans
Massively Parallel Architecture
• Dynamically scalable to hundreds of cells
• Linear scaling of data bandwidth
• Transaction/job-level Quality of Service
Mission-Critical Availability and Protection
• Disaster recovery, backup, point-in-time recovery, data validation, encryption
- 33 -
Oracle Exadata
What can the Oracle Exadata platform do for you?
Let us look at why Oracle Exadata needs to be in the BIDW roadmap of companies to address common issues.
Issue: Explosion of data volumes
Opportunity: High performance even with exponential growth of data
Issue: Cost of licensing new H/W and S/W
Opportunity: Total cost of ownership is reduced in the long run
Issue: Reduced query performance due to large database size
Opportunity: Tremendous business productivity boost
Issue: Fear of adoption and learning curve of data compression
Opportunity: No impact to app developers/end-users, minimal impact for DBAs
Issue: Compatibility with other 11g features like Compression or Partitioning
Opportunity: Compression/Partitioning can be used with Exadata storage
Issue: DB is on Exadata, what about backup?
Opportunity: Standby DB does not have to be Exadata
- 34 -
Questions
Reminder: join the IOUG Exadata SIG for more info
Contact Info: [email protected]
(954) 609 – 2402 cell
http://OracleExadata.org
- 35 -
Other Resources
• http://OracleExadata.org
• http://www.oracle.com/exadata
• www.oracle.com/technology/products/bi
• www.oracle.com/solutions/business_intelligence
OTN:
• http://www.oracle.com/technology/products/bi/db/dbmachine
• http://www.oracle.com/technology/products/bi/db/exadata
Forums:
• http://structureddata.org/
• http://kevinclosson.wordpress.com/
• http://techspectator.blogspot.com/
Subject: Oracle Exadata Setup/Configuration Best Practices; Doc ID: 757553.1; Type: BULLETIN; Modified Date: 18-MAR-2009; Status: PUBLISHED
Subject: Oracle Exadata Best Practices; Doc ID: 757552.1; Type: BULLETIN; Modified Date: 02-MAR-2009