© 2011 IBM Corporation
DB2 LUW technologies for data distribution
An overview
Francesco Airoldi
Executive Architect
eTS Team - IBM Italia
Michele Benedetti
Senior IT Specialist
Software Group - IBM Italia
Mariangela Fumagalli
Senior IT Specialist
Software Group - IBM Italia
Agenda
Introduction
Distributed Access
Data Federation
Data Replication
Distributed data: summary
[Diagram: six configurations, each drawn as applications (Appl) connected to DBMS/DB pairs: Basic (single db), Distributed Access, Federation (through a Fed Srv), Replication (through a ReplSrv), Event Publishing (through an EPSrv), and Extract Transform & Load (through an ETL srv into a DW). The configurations are grouped under two labels: "Data move: no" and "Data move: yes".]
Basic (single db)
[Diagram: Appl connects to a DBMS that manages a DB; Load and Unload utilities work on the DB directly]
DBMS engine based – SQL is used
• Insert (Appl → DB)
• Select (DB → Appl)
• Logging yes, concurrency control yes
Utilities – DBMS engine bypassed – SQL not used
• Load (external data → DB)
• Unload (DB → external data)
• For large and very large data volumes
• Logging may be disabled
• DB objects (e.g. tables) may be locked while the utility runs
• Various external data formats supported
• Special techniques to handle anomalous data (e.g. duplicate keys)
Distributed Access
[Diagram: Appl, two DBMS/DB pairs, two connect links]
Based on the DRDA standard
DRDA = Distributed Relational Database Architecture
• Proposed by IBM, now adopted as a database interoperability standard from The Open Group
• Implemented in all IBM products belonging to the DB2 Family, and by several non-IBM products
• SQL based
Key points:
• How many SQL statements per Unit of Work (UOW)?
• How many databases per UOW?
• How many databases referenced in a single SQL statement?
• Read-only access or not?
• DBMS belonging to the same family (homogeneous) or not (heterogeneous scenarios)?
Federation
[Diagram: Appl connects to a Fed Srv, which sits in front of two DBMS/DB pairs]
Extension of the Distributed Access model to heterogeneous environments
Applications connect to a single "virtual database"
Can be extended to allow access to non-relational data sources
Key points:
• Query optimization
• Differences in DBMS (SQL dialects, data types, semantics…)
• Two-phase commit required?
• How to handle the non-relational data sources?
• Performance
Replication
[Diagram: two Appl/DBMS/DB stacks linked by a ReplSrv]
Data physically copied from a source system to a target system
Often between heterogeneous DBMS
Many topological variants:
• One-to-one
• One-to-many
• Many-to-one
• Fan-out
One-way or bidirectional
Key points:
• Total replica (bulk) or delta replicas (change capture)
• Batch (e.g. once a day) or real-time (continuous replication)
• Table-based or transaction-based
• Change capture: triggers or log-based
• Performance impact on running applications
• Conflict detection and resolution (bidirectional, many-to-one)
• Data replicated as-is or transformed "in flight"
• Transport mechanism between source and target systems
Event Publishing
[Diagram: an Appl/DBMS/DB stack feeding an EPSrv, which delivers to another Appl]
Event Publishing is a variant of Replication
Data that change in a DBMS when certain "events" occur are sent to external applications
• New rows inserted
• Existing rows deleted
• Existing rows updated
• …
Key points:
• Data format (e.g. XML)
• Events published in real time or not
• May be a component of EAI (Enterprise Application Integration) scenarios (e.g. when the "target application" is an ESB system)
Extract Transform & Load (ETL)
[Diagram: two Appl/DBMS/DB stacks feeding an ETL srv, which loads the DW DBMS]
Theoretically, ETL is an extension of the Replication model
In practice, ETL is the key technology for feeding data into a Data Warehouse system:
• Extract data from operational data sources
• Transform data into a format suitable for the DW environment
• Load data into the DW DBMS
Occasionally, some variants may be used:
• ELT (Extract – Load – Transform)
• TEL (Transform – Extract – Load)
Key points:
• Availability of connectors for different data sources (also non-relational)
• Data volumes to be handled per unit of time (performance)
• Batch or near real-time
• Complexity of data transformation required
• Further data transformations within the DW are possible
• Data quality
• System management and overall governance
Agenda
Introduction
Distributed Access
Data Federation
Data Replication
DRDA
Distributed Relational Database Architecture (DRDA) is a database interoperability standard from The Open Group. DRDA describes the architecture for distributed data. It defines the rules for accessing the distributed data, but it does not provide the actual application programming interfaces (APIs) to perform the access. It was first used in DB2 2.3.
http://en.wikipedia.org/wiki/DRDA
High level architecture
[Diagram: Appl sends SQL to the AR; the AR connects to the AS over the Application Support Protocol; the AS reaches two DS over the Database Support Protocol]
Application Requester (AR)
The AR accepts SQL requests from an application and sends them to the appropriate application servers for processing. Using this function, application programs can access remote data.
Application Server (AS)
The AS receives requests from application requesters and processes them. The AS acts upon the portions that can be processed and forwards the remainder to database servers for subsequent processing. The AR and the AS communicate through a protocol called the Application Support Protocol, which handles data representation conversion.
Database Server (DS)
The DS receives requests from an AS or from other DS. The DS supports distributed requests and will forward parts of the request to collaborating DS in order to fulfill the request. The AS and the DS communicate with each other through a protocol called the Database Support Protocol.
DRDA: Implementation levels and capabilities
DRDA Level 0: Remote Request
[Diagram: Appl → AR → (connect, SQL) → AS]
• One DBMS
• One Unit Of Work (UOW)
• One SQL request
SQL example
connect to REM_DB
insert into REM_TAB1...
commit
Implementation example
[Diagram: Appl on a DB2 client connects to REM_DB on a DB2 LUW server]
DRDA Level 1: Remote Unit Of Work (RUOW)
[Diagram: Appl → AR → (connect, SQL) → AS]
• One DBMS
• One Unit Of Work (UOW)
• Multiple SQL requests
Switching from one DBMS to another is possible, but you need to close the UOW and disconnect from the first DBMS, then connect to the second DBMS and open another UOW
SQL example
connect to REM_DB
insert into REM_TAB1...
update REM_TAB2…
select … from REM_TAB3……..
commit
DRDA: Implementation levels and capabilities (continued)
DRDA Level 2: Distributed Unit Of Work (DUOW)
[Diagram: Appl → AR, which connects to two AS]
• Multiple DBMS
• One Unit Of Work (UOW)
  • Two-Phase commit (2PC) required
  • Transaction Manager required
• Multiple SQL requests
  • Each SQL request limited to a single DBMS
SQL example
connect to REM1_DB
connect to REM2_DB
select …. from REM1_TAB1
insert into REM2_TAB2
delete from REM1_TAB1
…..
commit
• The 2PC functionality is compliant with the Open Group XA specification for distributed transaction processing, where the two basic roles are
  • Transaction Manager (TM)
  • Resource Manager (RM)
• The Transaction Manager role may be fulfilled by a DRDA AR, a DRDA AS, or even by an external component.
• The Resource Manager role may be fulfilled by a DRDA AS or a DRDA DS
http://en.wikipedia.org/wiki/X/Open_XA
Implementation example
[Diagram: Appl on a DB2 client connects to REM1_DB on a DB2 LUW server and to REM2_DB on a DB2 z/OS server; one component is labeled "Acts as Trx Mgr"]
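The two-phase commit required by a Distributed Unit Of Work can be sketched in a few lines. This is a minimal illustration, not DB2's or XA's actual interface: the `ResourceManager` class, its `can_commit` flag and the `two_phase_commit` function are hypothetical names, and failure handling (timeouts, recovery logs) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Hypothetical stand-in for a DBMS taking part in a distributed UOW."""
    name: str
    can_commit: bool = True              # vote this RM will cast in phase 1
    state: str = "active"
    log: list = field(default_factory=list)

    def prepare(self) -> bool:
        # Phase 1: the RM votes; a yes vote is a promise to commit later.
        self.state = "prepared" if self.can_commit else "aborted"
        self.log.append("prepare")
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"
        self.log.append("commit")

    def rollback(self) -> None:
        self.state = "rolled back"
        self.log.append("rollback")


def two_phase_commit(resource_managers) -> bool:
    """Transaction Manager logic: commit everywhere or nowhere."""
    votes = [rm.prepare() for rm in resource_managers]   # phase 1: collect votes
    if all(votes):
        for rm in resource_managers:                      # phase 2: commit all
            rm.commit()
        return True
    for rm in resource_managers:                          # phase 2: roll back all
        rm.rollback()
    return False


rem1, rem2 = ResourceManager("REM1_DB"), ResourceManager("REM2_DB")
print(two_phase_commit([rem1, rem2]), rem1.state, rem2.state)

rem3 = ResourceManager("REM1_DB")
rem4 = ResourceManager("REM2_DB", can_commit=False)
print(two_phase_commit([rem3, rem4]), rem3.state, rem4.state)
```

The second run shows the point of the protocol: a single "no" vote in the prepare phase rolls back the work on every participant, including the one that was ready to commit.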
DRDA: Implementation levels and capabilities (continued)
DRDA Level 3: Distributed Request (DR)
[Diagram: Appl → AR → (connect, SQL) → AS → DS]
• Multiple DBMS
• One Unit Of Work (UOW)
  • Two-Phase commit (2PC) required
  • Transaction Manager required
• Multiple SQL requests
  • One SQL request may refer to objects managed by different DBMS (e.g. distributed join)
SQL example
connect to FED_DB
select ....
from NICK_LOC,
NICK_REM...
update NICK_REM.....
insert into NICK_LOC….
commit
Implementation example
[Diagram: Appl on a DB2 client connects to LOC_DB on a DB2 LUW server, which acts as Federation Server and Trx Mgr and reaches REM_DB on a second DB2 LUW server]
• The AR connects to an AS that owns local data and forwards part of the SQL request to a DS
• There is no explicit connection from the AR to the DS: the latter is connected "under the hood" by the AS
• The AR "sees" only the AS, which acts as "federator" over its local data and the data managed by the second DBMS (the DS)
• Database objects are referenced through "nicknames"
• A natural evolution of this capability is the Data Federation scenario, where the AS does not own local data and the underlying DBMS may be heterogeneous
DRDA: Mapping DRDA roles on some IBM software products
[Diagram: Appl → AR → AS → two DS]

Product                        DRDA Appl Req   DRDA Appl Server   DRDA Database Server   XA Trx Mgr
DB2 Client                     Y               N                  N                      N
DB2 Connect                    Y               N                  N                      Y
DB2 LUW                        Y               Y                  Y                      Y
InfoSphere Federation Server   Y               Y                  (N)                    Y
DB2 z/OS                       Y               Y                  Y                      Y
Agenda
Introduction
Distributed Access
Data Federation
Data Replication
Different Integration Techniques Meet Different Requirements
[Diagram: four integration techniques side by side.
Federation: analytical & reporting tools and web applications query real-time product performance and inventory level.
Consolidation: Region 1 and Region 2 product performance feed a data warehouse.
Replication: headquarters to stores, and primary data center to backup data center.
Event Publishing: a database with Capture & Publish feeding EAI, Repl, ETL and RYO consumers.]
Federation - How does it work?
[Diagram: the same overview of Federation, Consolidation, Replication and Event Publishing shown under "Different Integration Techniques Meet Different Requirements"]
Federation Data Sources
InfoSphere Federation Server (accessed through SQL)
Relational databases:
• DB2 for iSeries
• DB2 for z/OS
• DB2 for LUW
• Informix
• Oracle
• Sybase
• Teradata
• Microsoft SQL Server
• ODBC
• JDBC
Web / Other:
• MQ UDF
• Excel
• Table-structured files
• Web services
• OLE DB
• Scripts
• Custom-built
[The diagram also carries the labels "Read-Write" and "Read only"]
FS Basic Concepts
Federated server: a DB2 database enabled for federation.
Wrapper: a library allowing access to a particular class of data sources or protocols (Net8, DRDA, CTLIB...). Contains information about data source characteristics.
Server: represents a specific data source.
Nickname: a local alias to data on a remote server (mapped to rows and columns); appears as a DB2 table.
The DB2 Catalog of the Federated Server stores information about:
• Wrappers, servers, nicknames
• Server attributes
• Nickname attributes
• Remote functions
[Diagram: a federated server whose DB2 catalog holds several wrappers, each with its servers and nicknames; the wrappers reach data such as Orders and Customers through a data source client]
Query Optimizer Flow for Federated Queries
• Push Down Analysis (PDA) is a component of query compilation
• PDA determines whether or not an operation can be pushed down to the data source
• Just because processing can be pushed down does not mean it will be
• If an operation can be pushed down, the optimizer still has the final say on whether or not the operation is pushed down.
Agenda
Introduction
Distributed Access
Data Federation
Data Replication
Many Topologies Possible
[Diagram: replication topologies.
Data Distribution (1:many): one source, several targets.
Data Consolidation (many:1): several sources, one target.
Multi-Tier Staging: a source feeds staging tables, which feed the targets.
Bi-directional: a primary and a secondary.
Peer-to-Peer: several sources replicating to each other.
A "Conflict Detection/Resolution" label accompanies the diagram.]
Changed Data Replication
• Applications make changes to a database (the source)
• Changes are then:
  • Read, 'captured', from the database log
  • Copied to other systems (the targets)
  • 'Applied' to tables
• Diagram shows one-way, or unidirectional, replication
[Diagram: Source tables (SOURCE1, SOURCE2) → Log → Capture → Apply → Target tables (TARGET 1, TARGET 2, … TARGET N). Options noted on the flow: subsets, transformations, history tables]
Q Capture Process Flow
[Diagram: CAPTURE reads the DB2 Log for the source tables (SOURCE1, SOURCE2) defined in Q-SUBS / Q-PUBS, keeps changes as in-memory transactions, and writes to a Restart Queue and a Send Queue]
DB2 Log records shown: TX1: INSERT S1, TX2: INSERT S2, TX3: ROLLBACK, TX1: COMMIT, TX1: UPDATE S1, TX3: DELETE S1
In-Memory-Transactions:
• TX1 (INSERT S1, UPDATE S1, COMMIT): MQ Put to the Send Queue when the Commit record is found
• TX2 (INSERT S2): transaction is still "in-flight". Nothing inserted yet.
• TX3 (DELETE S1, ROLLBACK): "zapped" at Abort. Never makes it to the send queue.
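The capture behaviour on this slide (buffer changes per transaction, publish at commit, discard at rollback) can be sketched as follows. This is an illustration only: the `capture` function and the tuple format of the log records are hypothetical, and the chronological order of the records is an assumption, since the slide does not state it.

```python
def capture(log_records):
    """Return (send_queue, in_memory) after scanning the log records in order."""
    in_memory = {}      # transaction id -> changes seen so far
    send_queue = []     # committed transactions, ready for the apply side
    for tx, action in log_records:
        if action == "COMMIT":
            # Publish the whole transaction only once its commit record is found.
            send_queue.append((tx, in_memory.pop(tx, [])))
        elif action == "ROLLBACK":
            # Aborted work is discarded and never reaches the send queue.
            in_memory.pop(tx, None)
        else:
            in_memory.setdefault(tx, []).append(action)
    return send_queue, in_memory


# Assumed chronological order of the log records named on the slide.
log = [
    ("TX1", "INSERT S1"),
    ("TX2", "INSERT S2"),
    ("TX1", "UPDATE S1"),
    ("TX3", "DELETE S1"),
    ("TX1", "COMMIT"),
    ("TX3", "ROLLBACK"),
]
sent, in_flight = capture(log)
print(sent)       # [('TX1', ['INSERT S1', 'UPDATE S1'])]
print(in_flight)  # {'TX2': ['INSERT S2']}
```

Only TX1 reaches the send queue; TX2 stays in memory because no commit record has been seen, and TX3 disappears at its rollback record.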
Q Replication - The BIG Picture
• Applications make changes to a database source
• Changes are then:
  • Read, 'captured', from the database log
  • Copied to other systems (the targets)
  • 'Applied' to tables
[Diagram: on the SOURCE side, Q Capture reads the DB2 Log for SOURCE1 and SOURCE2 and keeps its own METADATA; on the TARGET side, Q Apply (a Browser plus several Apply Agents) writes to TGT1, TGT2 and TGT3 and keeps its own METADATA. Options noted on the flow: subsets, transformations, history tables. ADMINISTRATION: Replication Center and Replication Monitor]
Data Event Publishing
• Conceptually, data replication without the apply
• Change data is made available to consuming applications
  • Examples: InfoSphere DataStage or a message broker
• One common delivery mechanism is WebSphere MQ
[Diagram: Source tables (SOURCE1, SOURCE2) → DB Log → Capture → targets: InfoSphere DataStage, SOA/User Application, User Application, WBI Event Broker]
Essential links
Distributed Access
• DB2 for Linux Unix and Windows home page: http://www-01.ibm.com/software/data/db2/linux-unix-windows/
• DB2 for LUW and DB2 Connect 9.7 InfoCenter (search for DRDA): http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp
Data Federation and Data Replication
• InfoSphere Federation Server home page: http://www-01.ibm.com/software/data/infosphere/federation-server/
• InfoSphere Replication Server home page: http://www-01.ibm.com/software/data/infosphere/replication-server/
• InfoSphere Event Publisher home page: http://www-01.ibm.com/software/data/infosphere/data-event-publisher/
• InfoSphere Information Server V8.7 InfoCenter (Overview > Introduction to InfoSphere Information Server > Companion products): http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r7/index.jsp
Questions?
Thank you
Additional slides on Data Federation
InfoSphere Federation Server
• Customer Requirements and Scenarios
• How it works
• Performance considerations
Problems Delivering Data to the End User
• Multiple sources for the same entity
• Heterogeneous data sources:
DB2, Oracle, Microsoft SQL Server, XML files, spreadsheets, etc.
• Employees spend a significant amount of their time (70%) searching for information
• Lack of an integrated view of information
• Time-consuming and costly aggregation
InfoSphere Federation Server
Access and integrate heterogeneous information across multiple sources as if they were a single source
Extend value of existing analytical applications by providing real-time access to integrated information
Transparent
• Appears to be one source
• Independent of how and where data is stored
• Applications continue to work despite any change in how data is stored
Heterogeneous
• Accesses data from diverse sources: relational, structured, messages…
Extensible
• Bring together almost any data source
• Wrapper Development Toolkit
High Function
• Full query support against all data
• Capabilities of sources as well
Autonomous
• Non-disruptive to data sources, existing applications, systems
High Performance
• Optimization of distributed queries
[Diagram: InfoSphere Federation Server in front of sources such as Web Services, Excel, SQL Server and Oracle]
Customer-Data Integration
Customer Challenge:
• Providing a holistic view of information for customer-facing or customer-supporting applications
• High development and maintenance costs to access diverse data sources
• Maximizing value of customer data for customer satisfaction/retention and increasing sales
Customer value:
• Reduce coding and skills requirements when integrating two or more sources
• Reduce redundant data by consolidating only frequently accessed data
• Reduce application maintenance costs
• Extend customer data with document and other content data
Development effort to handle:
• Unique interfaces for each data type
• Joining data from varied sources
• Transformations
• Correlating data
[Diagram: an application developer's application reaching RDBMS, non-relational data and non-traditional data through InfoSphere Federation Server]
Access to Regionally Distributed Data
Requirements
• Several regional databases with similar logical data models, but unique data
• Application needs to see the data as one large database with a single schema
• Impractical to physically consolidate data
Solution
• Access relevant remote tables via FS nicknames
• Connect matching nicknames from different sources via a UNION ALL view
• Can optionally cache common data at the FS or create local aggregates
[Diagram: a client connects to InfoSphere Federation Server, which reaches regional sources in Seattle, Phoenix and San Jose (Oracle on Linux, SQL Server on Windows)]
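The UNION ALL view over matching nicknames can be illustrated with a small runnable sketch. It uses Python's built-in sqlite3 module, with three local tables standing in for the nicknames of the regional databases; the table names, columns and data are invented for the example, and a real federated server would define the view over nicknames instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Three local tables stand in for nicknames over the regional databases.
for region in ("seattle", "phoenix", "san_jose"):
    conn.execute(f"CREATE TABLE orders_{region} (order_id INTEGER, amount REAL)")

conn.executemany("INSERT INTO orders_seattle VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
conn.executemany("INSERT INTO orders_phoenix VALUES (?, ?)", [(3, 5.0)])
conn.executemany("INSERT INTO orders_san_jose VALUES (?, ?)", [(4, 7.5)])

# One view gives the application a single schema over all regions.
conn.execute("""
    CREATE VIEW all_orders AS
    SELECT 'Seattle' AS region, order_id, amount FROM orders_seattle
    UNION ALL
    SELECT 'Phoenix', order_id, amount FROM orders_phoenix
    UNION ALL
    SELECT 'San Jose', order_id, amount FROM orders_san_jose
""")

total_rows, total_amount = conn.execute(
    "SELECT count(*), sum(amount) FROM all_orders").fetchone()
print(total_rows, total_amount)   # 4 42.5
```

The application queries `all_orders` as if it were one table; adding a region means adding one branch to the view, with no change to the application.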
Speeding Portal Application Development
Customer Challenge:
• Integrating multiple data sources in a single application is complex and costly
• Accessing non-traditional sources is too impractical to leverage their benefit
• Time pressure to deploy new applications
• Scarcity of people skilled in legacy, non-traditional data sources
• Extending built-in search to new domains
Customer value:
• Reduce amount of integration coding by 40-65%
• Use existing SQL tools to access all data
• Give applications access to all the relevant data sources
• Reduce application maintenance costs
• Deploy existing skills over a wider range of integration projects
Development effort to handle:
• Multiple portlets, one for each source
• Unique interfaces for each data type
• Joining data from varied sources
• Transformations
• Correlating data
[Diagram: Non-Federated Application: the portal application holds a connection to each individual data source (legacy data, non-relational data, non-traditional data). Federated Application: the portal application holds one connection to the federated server, which reaches Oracle, Excel/ODBC and DB2; InfoSphere Federation Server has a local DB2 for "scratch" temp tables]
Example: SELECT
What the application sees:
SELECT cust_nation, sum(o_totalprice)
FROM Customer, Orders
WHERE c_custkey = o_custkey
and o_orderstatus = 'OPEN'
and c_mktsegment = 'BUILDING'
GROUP BY cust_nation
What FS does:
Sent to the Orders source (o_totalprice is included here because the final sum needs it):
SELECT o_custkey, o_totalprice
FROM Orders
WHERE o_orderstatus = 'OPEN'
Sent to the Customer source:
SELECT c_custkey, cust_nation
FROM Customer
WHERE c_mktsegment = 'BUILDING'
Then, at the federated server:
1. Join rows from both sources
2. Sort them by cust_nation
3. Sum up total order price for each nation
4. Return result to application
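The decomposition on this slide can be tried out with a small sketch. Two separate in-memory sqlite3 databases stand in for the Customer and Orders sources; the sample rows are invented, and the Orders fragment is assumed to return o_totalprice as well as o_custkey, because the final sum cannot be computed without it. The join and aggregation are done in plain Python to represent the work left to the federated server.

```python
import sqlite3
from collections import defaultdict

# Two separate in-memory databases stand in for the two remote sources.
customer_src = sqlite3.connect(":memory:")
customer_src.execute(
    "CREATE TABLE Customer (c_custkey INTEGER, cust_nation TEXT, c_mktsegment TEXT)")
customer_src.executemany("INSERT INTO Customer VALUES (?, ?, ?)", [
    (1, "ITALY", "BUILDING"),
    (2, "FRANCE", "BUILDING"),
    (3, "ITALY", "MACHINERY"),
])

orders_src = sqlite3.connect(":memory:")
orders_src.execute(
    "CREATE TABLE Orders (o_custkey INTEGER, o_totalprice REAL, o_orderstatus TEXT)")
orders_src.executemany("INSERT INTO Orders VALUES (?, ?, ?)", [
    (1, 100.0, "OPEN"), (1, 50.0, "CLOSED"), (2, 30.0, "OPEN"),
    (2, 20.0, "OPEN"), (3, 99.0, "OPEN"),
])

# Fragments pushed down to each source: single-table filters run remotely.
customers = customer_src.execute(
    "SELECT c_custkey, cust_nation FROM Customer "
    "WHERE c_mktsegment = 'BUILDING'").fetchall()
orders = orders_src.execute(
    "SELECT o_custkey, o_totalprice FROM Orders "
    "WHERE o_orderstatus = 'OPEN'").fetchall()

# Work left to the federated server: join, group, sum, sort.
nation_of = dict(customers)
totals = defaultdict(float)
for custkey, price in orders:
    if custkey in nation_of:                 # join on c_custkey = o_custkey
        totals[nation_of[custkey]] += price
result = sorted(totals.items())
print(result)   # [('FRANCE', 50.0), ('ITALY', 100.0)]
```

Each source only sees a single-table query with its own filter; the cross-source join never leaves the federated server.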
Federated Cost-Based Query Optimization
Inputs to the optimizer:
• Physical Properties: federated system configuration
• Query Properties: optimization class, data distribution, operators used, query type, cost models, FIRST N ROWS?
• Statistics: table statistics, column statistics, index statistics
• SERVER characteristics: CPU/IO ratio, comm rate
• SERVER capabilities: type/version
• Non-Relational Wrapper: wrapper plans, cost models
Federated Database System (Global) Catalog
• Contains information both about local objects and remote objects
• Global because it contains information about all the objects in the federated database
• Table information is found in the following SYSCAT tables:
  • SYSCAT.TABLES
  • SYSCAT.NICKNAMES
  • SYSCAT.TABOPTIONS
  • SYSCAT.INDEXES
  • SYSCAT.INDEXOPTIONS
  • SYSCAT.COLUMNS
  • SYSCAT.COLOPTIONS
• Global catalog also contains other information about remote sources, including connection, authorizations, etc.
Global Catalog information
• Data type mappings describe the relationship between the data source data type and the FS data type
  • Can override defaults by altering local nickname column types if appropriate
• Function mappings tell FS that a remote function is semantically equivalent to a local function (need: compatible arguments + types)
  • Increases the opportunity for pushing down the function to the data source
  • Without a valid mapping, data has to be retrieved and the function applied at FS
  • Can also tell FS about remote functions that have no local equivalent using function templates
• Statistics are used by DBMSs to describe the logical and physical structure of the data
  • Helps the optimizer generate optimal access strategies
  • FS retrieves statistics from the remote-source catalog and populates the DB2 catalog at CREATE NICKNAME time
• There are no actual "local indexes" on nicknames. Information on remote indexes is kept in the FS catalog
• Normally, information about remote indexes is picked up during nickname creation (including index specification and statistics)
Actual pushdown is cost-based
• Just because processing can be pushed down doesn't mean it will be.
  • Decision influenced by estimates of rows processed/returned.
• Consider a join of two nicknames ORA.T1 and ORA.T2 on a single remote source that is "nearly" a Cartesian product.
  • May be better to do the join at the InfoSphere Federation Server to avoid retrieval of many rows.
• Retrieving (10,000 + 25) rows to do a local join is probably faster than retrieving a remote join result of (10,000 * 25) = 250,000 rows
SELECT .... from ORA.T1, ORA.T2 where T1.a = T2.b
[Diagram: single remote Oracle source holding ORA.T1 (25 rows) and ORA.T2 (10,000 rows)]
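The row arithmetic behind this slide can be written down directly. The sketch below compares only the number of rows shipped to the federated server, which is a deliberate simplification: the function names are hypothetical, and a real optimizer also weighs CPU, I/O and communication costs.

```python
def rows_transferred(rows_t1: int, rows_t2: int, join_result_rows: int) -> dict:
    """Rows shipped to the federated server under each strategy."""
    return {
        "local_join": rows_t1 + rows_t2,    # fetch both tables, join at FS
        "remote_join": join_result_rows,    # source joins, ships the result
    }


def cheaper_strategy(rows_t1: int, rows_t2: int, join_result_rows: int) -> str:
    costs = rows_transferred(rows_t1, rows_t2, join_result_rows)
    return min(costs, key=costs.get)


# Slide figures: 25 and 10,000 rows, near-Cartesian result of 250,000 rows.
print(rows_transferred(25, 10_000, 25 * 10_000))
print(cheaper_strategy(25, 10_000, 25 * 10_000))   # local_join
# A selective join that returns few rows favors pushdown instead.
print(cheaper_strategy(25, 10_000, 40))            # remote_join
```

With the slide's numbers the local join moves 10,025 rows against 250,000 for the pushed-down join, which is why pushdown is not automatically the better plan.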
'Pushdown' of Query Operations
• FS decides whether some or all parts of a query can be "pushed down", i.e. processed at the remote data source(s). Pushdown-ability depends on:
  • Availability of needed functionality at remote source
  • Server options (example: is collating sequence at FS and remote source the same?)
• Pushdown is typically faster than processing the query at FS because of less data movement from the data source to FS
• Example: a remote source that can handle an equality predicate, but not count(*)
  Application to Federation Server: SELECT count(*) FROM t1 WHERE col = 27
  Federation Server to the non-DB2 data source: SELECT '1' FROM t1
  Compensation at Federation Server: SELECT count(*) FROM...
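Compensation can be shown with a short runnable sketch. An in-memory sqlite3 database stands in for the non-DB2 data source, and the `federated_count` function is a hypothetical name for the federated server's side of the work. The sketch assumes the equality predicate is pushed down with the SELECT '1' fragment, which is what the example's premise implies.

```python
import sqlite3

remote = sqlite3.connect(":memory:")   # stand-in for the non-DB2 data source
remote.execute("CREATE TABLE t1 (col INTEGER)")
remote.executemany("INSERT INTO t1 VALUES (?)", [(27,), (5,), (27,), (27,), (8,)])


def federated_count(remote_conn, value):
    """count(*) with compensation: the source filters, the server counts."""
    # Pushed down: only the part the source is assumed to support.
    rows = remote_conn.execute(
        "SELECT '1' FROM t1 WHERE col = ?", (value,)).fetchall()
    # Compensation: the aggregate is computed locally on the returned rows.
    return len(rows)


print(federated_count(remote, 27))   # 3
```

The source ships one constant per qualifying row instead of whole rows, so data movement stays small even though the aggregate itself runs at the federated server.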
Sort Order
• Varies in some cases for different collating sequences
  • Data consists of combinations of letters and numeric characters
  • Data contains both uppercase and lowercase letters
  • Data contains special characters, e.g. #
• Affects how data is sorted in a query with an ORDER BY
• Affects how character comparisons are made
  • E.g., SELECT … WHERE Column3 > 'Aa3@'
• Two data sources of the same type (wrapper) can use different collating sequences
  • E.g., in DB2 the collating sequence is specified when the database is created
  • Different databases can use different collating sequences
• The collating sequence of the data source can be specified when the Server is defined
Server options: Collating Sequence Differences
• EBCDIC Sequence: ... ab yz ... AB YZ ... 0 9 ...
• ASCII Sequence: ... 0 9 ... AB YZ ... ab yz ...
• LEXICAL Sequence: ... 0 9 ... AaBb YyZz ...
Server options: Collating Sequence Differences
• ORDER BY COLM2: different order
  EBCDIC: V1G, Y2W, 7AB
  ASCII/LEXICAL: 7AB, V1G, Y2W
• WHERE COLM2 > 'TT3': different results
  EBCDIC: TW4, X72, 39G
  ASCII/LEXICAL: TW4, X72
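Both differences on this slide can be reproduced by comparing strings on their encoded bytes. The sketch below uses Python's `cp037` codec as one representative EBCDIC code page (an assumption; actual systems may use other EBCDIC code pages) and plain ASCII for the other side.

```python
def ebcdic_key(value: str) -> bytes:
    # cp037 is one EBCDIC code page shipped with Python's codecs.
    return value.encode("cp037")


def ascii_key(value: str) -> bytes:
    return value.encode("ascii")


# ORDER BY COLM2: digits sort after letters in EBCDIC, before them in ASCII.
colm2 = ["7AB", "V1G", "Y2W"]
print(sorted(colm2, key=ebcdic_key))   # ['V1G', 'Y2W', '7AB']
print(sorted(colm2, key=ascii_key))    # ['7AB', 'V1G', 'Y2W']

# WHERE COLM2 > 'TT3': '39G' qualifies only under the EBCDIC ordering.
rows = ["TW4", "X72", "39G"]
print([v for v in rows if ebcdic_key(v) > ebcdic_key("TT3")])  # ['TW4', 'X72', '39G']
print([v for v in rows if ascii_key(v) > ascii_key("TT3")])    # ['TW4', 'X72']
```

The same query therefore returns a different row order and a different row set depending on where the comparison runs, which is why the federated server needs to know the collating sequence of each source.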
Server Options: Case Insensitive Collating Sequences
...WHERE NAME = 'MARIANGELA'
Assume that the data source column contains: 'MariAngela'
• TRUE: databases using the Insensitive Collate option (an optional parameter for MS SQL Server, Sybase and Informix)
• FALSE: databases not using the Insensitive Collate option
Server Options: COLLATING_SEQUENCE option
COLLATING_SEQUENCE = 'Y'
• indicates that FS and the remote data source sort the same
• all char sort and comparison operations can be pushed down
COLLATING_SEQUENCE = 'N' (e.g. DB2/390)
• indicates that FS and the remote data source sort differently
• char sort and most char comparison operations cannot be pushed down
• only char = comparisons can be pushed down
COLLATING_SEQUENCE = 'I' (e.g. SQL Server)
• indicates that the remote data source uses an insensitive collating sequence
• no char sort or char comparison operations can be pushed down
Set COLLATING_SEQUENCE as a Server Option or, on a nickname, use the NUMERIC_STRING Column Option
You must inform Federation about Data Source collating
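The three settings amount to a small decision table, sketched below. The `can_push_down` function and its operation labels ('sort', 'equality', 'range') are hypothetical names that encode only the rules stated on this slide; they are not part of any product API.

```python
def can_push_down(collating_sequence: str, operation: str) -> bool:
    """operation is one of 'sort', 'equality', 'range' on character data."""
    if collating_sequence == "Y":     # same collation: everything qualifies
        return True
    if collating_sequence == "N":     # different collation: only '=' qualifies
        return operation == "equality"
    if collating_sequence == "I":     # case-insensitive source: nothing qualifies
        return False
    raise ValueError(f"unknown COLLATING_SEQUENCE value: {collating_sequence!r}")


for setting in ("Y", "N", "I"):
    allowed = [op for op in ("sort", "equality", "range")
               if can_push_down(setting, op)]
    print(setting, allowed)
```

Setting the option to 'Y' for a source that actually sorts differently would allow pushdown of operations whose results then differ from what the federated server would compute, so the value has to match the real behaviour of the source.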
Server Options: VARCHAR comparison semantics
• Column FIRSTNAME is VARCHAR(25)
• Actual contents are 'MARYb' (b stands for a trailing blank)
DB2 (and all other major RDBMSs)
SELECT * ... WHERE FIRSTNAME = 'MARY'
TESTS TRUE
Oracle
SELECT * ... WHERE FIRSTNAME = 'MARY'
TESTS FALSE
• Forces COL1 = 'MARY' to be pushed down as RTRIM(COL1) = 'MARY', resulting in a relational scan by ORACLE
• Mitigated with VARCHAR_NO_TRAILING_BLANKS = 'Y' (but know the data!)
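The two comparison semantics can be mimicked in a few lines. The function names are hypothetical; the sketch only contrasts a comparison that ignores trailing blanks with one that treats them as significant, using the slide's value with one trailing blank.

```python
def equals_blank_padded(stored: str, literal: str) -> bool:
    """Comparison that ignores trailing blanks (the 'TESTS TRUE' case)."""
    return stored.rstrip(" ") == literal.rstrip(" ")


def equals_exact(stored: str, literal: str) -> bool:
    """Comparison where trailing blanks are significant (the 'TESTS FALSE' case)."""
    return stored == literal


firstname = "MARY "          # 'MARYb' on the slide: MARY plus one trailing blank
print(equals_blank_padded(firstname, "MARY"))   # True
print(equals_exact(firstname, "MARY"))          # False
```

Wrapping the column in a trim, as the RTRIM rewrite does, makes the two semantics agree but typically prevents an index on the column from being used, which is the cost the slide warns about.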
More Federation Server Features
• Ability to define informational constraints over nicknames
• Ability to refer to and execute remote stored procedures for DB2, Oracle, Sybase, and MSSQL data sources
• Error Tolerant Nested Table Expression
[Diagram: a UNION ALL over Remote 1, Remote 2 and Remote 3; with a connection error on one source, the result is Remote 1 + Remote 3]
WebSphere Federation Server w/ F2PC Update
Money Transfer Example
[Diagram: the client works directly against SQL Server and Oracle, which hold CHECKING_ACCOUNT and SAVING_ACCOUNT, with a separate connection and commit for each:]
1) Connect
2) Withdraw
3) Commit
4) Connect
5) Deposit
6) Commit
WebSphere Federation Server w/ F2PC Update
Money Transfer Example
[Diagram: the client connects once to InfoSphere Federation Server ("I am the TM_DATABASE"), which sits in front of SQL Server and Oracle holding CHECKING_ACCOUNT and SAVING_ACCOUNT:]
1) Connect
2) Withdraw
3) Deposit
4) PREPARE (sent to both data sources)
5) COMMIT (sent to both data sources)