1
Application Integration in an E-Commerce World
Leslie M. Tierstein, STR LLC
2
Application Integration
Overview
– Acquire data from one or more sources
– Transform its meaning and/or format
– Deliver it to one or more targets
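The three steps above can be sketched as a small pipeline. This is a minimal illustration only: the record layout (`cust_name`, `amount_cents`), the sample sources, and the in-memory lists standing in for targets are all assumptions, not part of the original slides.

```python
from typing import Iterable

Record = dict  # hypothetical record shape: one dict per source row


def acquire(sources: Iterable[Iterable[Record]]) -> Iterable[Record]:
    """Acquire: read records from one or more sources."""
    for source in sources:
        yield from source


def transform(records: Iterable[Record]) -> Iterable[Record]:
    """Transform: change format (key names, casing) and meaning (cents to dollars)."""
    for rec in records:
        yield {
            "customer": rec["cust_name"].strip().title(),
            "amount_usd": rec["amount_cents"] / 100,
        }


def deliver(records: Iterable[Record], targets: list) -> None:
    """Deliver: write every transformed record to one or more targets."""
    for rec in records:
        for target in targets:
            target.append(rec)


# Two hypothetical sources and two hypothetical targets (plain lists).
legacy_orders = [{"cust_name": " acme corp ", "amount_cents": 1250}]
web_orders = [{"cust_name": "GLOBEX", "amount_cents": 300}]
warehouse, billing = [], []

deliver(transform(acquire([legacy_orders, web_orders])), [warehouse, billing])
```

In a real integration the lists would be replaced by database connections, files, or queues; the shape of the three stages stays the same.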
3
Application Integration
Processing scenarios:
– Data warehouse loads
– Conversions from legacy systems or application interfaces performed in a “batch window”
– Ongoing “real-time” interfaces
What are the properties of each scenario?
How are the scenarios different/the same?
4
Warehouse Loads (1)
Repeatable, regularly scheduled
– Data is initially loaded (see Conversions)
– Then it is “refreshed”, typically via Change Data Propagation
Load must ensure consistent user views
– “Checkpoint” OLAP environment
5
Warehouse Loads (2)
Data must be “transformed”
– From third-normal form OLTP source systems to star-schema OLAP target systems
– Possibly to an Operational Data Store (ODS)
– Usually from multiple, heterogeneous sources
– Summarization may also be required to the desired level of detail
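As a rough sketch of the summarization step, the following rolls normalized order lines up to one fact row per day and product. The tuple layout, the sample data, and the chosen grain are hypothetical; a real load would do this in SQL or a warehouse tool.

```python
from collections import defaultdict

# Hypothetical normalized OLTP order lines:
# (order_date, product_id, quantity, unit_price)
order_lines = [
    ("2001-03-01", "P1", 2, 10.0),
    ("2001-03-01", "P1", 1, 10.0),
    ("2001-03-01", "P2", 5, 4.0),
    ("2001-03-02", "P1", 3, 10.0),
]


def summarize_daily_sales(lines):
    """Roll order lines up to one fact row per (day, product)."""
    facts = defaultdict(lambda: {"qty": 0, "revenue": 0.0})
    for order_date, product_id, qty, unit_price in lines:
        cell = facts[(order_date, product_id)]
        cell["qty"] += qty
        cell["revenue"] += qty * unit_price
    return dict(facts)


daily_sales = summarize_daily_sales(order_lines)
```

The key `(order_date, product_id)` plays the role of the dimension keys in a star schema, and `qty` and `revenue` are the measures.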
6
Warehouse Loads (3)
Vast amounts of operational data
Importance of metadata
– Oracle’s Common Warehouse Metadata (CWM)
7
Legacy Conversions
“One-time” task
– In “big-bang” implementation
– Some phased implementations need conversions repeated numerous times
– Scheduled “cut over” to the new system
Data in the source system is expendable after it is converted -- “quick and dirty” is an option
8
Application Interfaces (1)
Repeatable
– Regularly scheduled (“batch”)
– Event-driven (“near” real-time)
Small to large volumes of data
Operational data at both ends
– Source and target
– Custom, COTS, external applications (owned by another entity/business)
9
Old Terminology
E(T)TL
– Extract, (Transport,) Transform, Load
Extract source data
(Transport data to new platform)
Transform data to new format
Load data into new database
– Typically applied to batch application integration or warehouse loads
10
Newer Terminology
EAI
– Enterprise Application Integration
Acquire data from source application(s)
Transform data
Deliver data to target application(s)
– Exchange of data between two or more applications
11
Newest Terminology (1)
A2A: Application to Application Integration
– Exchange of data between two or more applications, typically without a web interface
– May be “real-time” or batch
– “Interfaces” between systems/applications (cf: Oracle Applications Interface tables)
12
Newest Terminology (2)
B2C: Business to Consumer Integration
– A consumer, via a web site, interacts with software owned by one business
– The business’s corporate database(s) is (are) queried in the transaction
– The business’s corporate database(s) is (are) updated as a result of the transaction
13
Newest Terminology (3)
B2B: Business to Business Integration
– “I’ll have my computer call your computer”
– A transaction in one business’s computer automatically triggers a transaction in another business’s computer
– B2B integration may be under the covers in B2C scenarios or performed independent of B2C transactions
14
Newest Terminology (4)
B2B:
“I took my notepad from my shirt pocket and displayed a standard contract … She glanced at it, then had her own computer scrutinize the document. Conversing in modulated infrared, the machines rapidly negotiated the fine details. My notepad signed the agreement on my behalf, and Lansing’s did the same, and they both chimed happily in unison to let us know that the deal had been concluded.”
Greg Egan, “Cocoon”, ©1994
15
Extract/Acquire (1)
Online, real-time database access
– Native Oracle access
– ODBC/JDBC
– Oracle gateways ($$$)
– Heterogeneous replication packages (such as DataBridge)
– APIs (COTS packages such as SAP)
16
Extract/Acquire (2)
Alternate character sets
– EBCDIC, ASCII, Unicode
– 7-bit, 8-bit, 16-bit
Change Data Propagation (CDP)
– Triggers
– Event Logs
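One way to picture trigger-based Change Data Propagation: triggers on the source table write every insert and update to a change table, and the extract reads only the changes it has not yet seen. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the source database; the table, trigger, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for the source database
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_changes (
        change_id INTEGER PRIMARY KEY AUTOINCREMENT,
        op TEXT, id INTEGER, name TEXT
    );
    -- Triggers capture each change as it happens.
    CREATE TRIGGER customer_ins AFTER INSERT ON customer
    BEGIN
        INSERT INTO customer_changes (op, id, name)
        VALUES ('I', NEW.id, NEW.name);
    END;
    CREATE TRIGGER customer_upd AFTER UPDATE ON customer
    BEGIN
        INSERT INTO customer_changes (op, id, name)
        VALUES ('U', NEW.id, NEW.name);
    END;
""")

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
conn.execute("UPDATE customer SET name = 'Acme Corp' WHERE id = 1")


def changes_since(last_seen: int):
    """Return captured changes newer than the last change already propagated."""
    return conn.execute(
        "SELECT change_id, op, id, name FROM customer_changes "
        "WHERE change_id > ? ORDER BY change_id",
        (last_seen,),
    ).fetchall()


pending = changes_since(0)
```

The extract remembers the highest `change_id` it has delivered and passes it in on the next run, so each refresh moves only the rows that changed.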
17
Load/Deliver
Same access issues as Extract/Acquire
18
Transport (1)
Files
– Connectivity
LAN/WAN, Internet, Sneaker-Net
– Transfer protocols
ftp, proprietary, http, https
WAP
19
Transport (2)
Messages
– via queues
IBM MQ Series, Oracle AQ, Microsoft MSMQ, Java JMS
– via email
POP3, attachments
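Queue-based transport decouples sender and receiver: the source application puts a serialized message on a queue and the target picks it up later. The sketch below uses Python's in-process `queue.Queue` purely as a stand-in for a message broker; it does not use the API of any of the products named above, and the message layout is an assumption.

```python
import json
import queue

message_queue = queue.Queue()  # in-process stand-in for a message broker


def send(q: queue.Queue, payload: dict) -> None:
    """Source side: serialize the record and enqueue it."""
    q.put(json.dumps(payload))


def receive_all(q: queue.Queue) -> list:
    """Target side: drain whatever messages are currently waiting."""
    messages = []
    while True:
        try:
            messages.append(json.loads(q.get_nowait()))
        except queue.Empty:
            return messages


send(message_queue, {"type": "order", "id": 1})
send(message_queue, {"type": "order", "id": 2})
received = receive_all(message_queue)
```

Messages arrive in the order they were sent, and the sender never needs to know whether the receiver is running at that moment.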
20
Transform – Data Mapping (1)
Potential many-to-many mapping between sources and targets
– “Point-to-point” mappings
– vs. hub-and-spoke transformation engines
Algorithms to change the format and semantics of the data
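A quick count shows why the hub-and-spoke approach is attractive as the number of applications grows. The sketch assumes the worst case, in which every application exchanges data with every other application in both directions; real landscapes are usually sparser.

```python
def point_to_point_mappings(n_apps: int) -> int:
    """One directed mapping for every ordered pair of applications."""
    return n_apps * (n_apps - 1)


def hub_and_spoke_mappings(n_apps: int) -> int:
    """Each application maps once into the hub format and once out of it."""
    return 2 * n_apps


for n in (3, 5, 10):
    print(n, point_to_point_mappings(n), hub_and_spoke_mappings(n))
```

Under this assumption, ten applications need 90 point-to-point mappings but only 20 mappings through a hub. With three applications the two counts are equal, and below that the hub costs more than it saves.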
21
Transform - Data Mapping (2)
Relationships
– 1:1, many:many - Facts of life
– 1:many
normalization - conversions, semantically overloaded attributes
– Many:1
mergers/acquisitions; multi-line text to LOB
Repository for impact analysis
22
Data Transformation (1)
Data type translation
– Should be transparent (a la Oracle)
– Except for rare types (e.g., bit maps)
Mutually intelligible data
– XML
– Emerging XML standards
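As an illustration of XML as a mutually intelligible format, the sketch below writes a record as XML on the source side and reads it back on the target side using Python's standard `xml.etree.ElementTree`. The `order` element and its child names are made up for the example and do not follow any particular XML standard.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record: dict) -> str:
    """Source side: one child element per field of the record."""
    root = ET.Element("order")  # hypothetical element name
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text: str) -> dict:
    """Target side: rebuild a flat record from the child elements."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


xml_text = record_to_xml({"id": 42, "customer": "Acme"})
round_trip = xml_to_record(xml_text)
```

Note that every value comes back as text (`"42"`, not `42`): XML carries the structure, but data type translation is still the job of the transform step.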
23
Data Transformation (2)
Algorithms are often referred to as “business rules”
– Rules may range from simple assignments
– To complex lookups/translations on multiple columns, with referential integrity checks, data cleansing, functions, etc.
Rules and/or their components should be re-usable
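One possible way to make rules reusable is to treat each rule as a small function and build a mapping by listing rules in order. The sketch below shows a simple assignment rule and a lookup/translation rule; the rule names, column names, and lookup table are all hypothetical.

```python
from typing import Callable

Rule = Callable[[dict], dict]  # a rule takes a row and returns a new row


def assign(target: str, source: str) -> Rule:
    """Simple assignment: copy one source column to a target column."""
    return lambda row: {**row, target: row[source]}


def lookup(target: str, source: str, table: dict) -> Rule:
    """Translation via lookup table; an unknown value is an integrity failure."""
    def rule(row: dict) -> dict:
        if row[source] not in table:
            raise KeyError(f"no translation for {source}={row[source]!r}")
        return {**row, target: table[row[source]]}
    return rule


def apply_rules(row: dict, rules: list) -> dict:
    """Apply the rules in order; each rule sees the output of the previous one."""
    for rule in rules:
        row = rule(row)
    return row


STATE_CODES = {"Virginia": "VA", "Maryland": "MD"}  # illustrative lookup table
rules = [
    assign("customer_name", "CUSTNAME"),
    lookup("state_code", "STATE", STATE_CODES),
]
result = apply_rules({"CUSTNAME": "Acme", "STATE": "Virginia"}, rules)
```

Because `assign` and `lookup` are parameterized, the same components can be reused across mappings with different column names and lookup tables.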
24
Data Transformation (3)
Algorithms/business rules
– Ability to INSERT/UPDATE/DELETE
– Ability to produce multiple target records per one source, or one target per multiple sources
– Ability to track (and potentially reprocess) exceptions (not part of transform per se)
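The sketch below illustrates two of these abilities together: producing several target records from one source record, and collecting failed source records so they can be corrected and reprocessed. The order layout and function names are assumptions for the example.

```python
def split_order(source: dict) -> list:
    """One source order produces one target row per line item."""
    if not source["items"]:
        raise ValueError("order has no line items")
    return [
        {"order_id": source["order_id"], "sku": sku, "qty": qty}
        for sku, qty in source["items"]
    ]


def run(sources: list, transform) -> tuple:
    """Transform every source record; set failures aside instead of aborting."""
    targets, exceptions = [], []
    for src in sources:
        try:
            targets.extend(transform(src))
        except (KeyError, ValueError) as err:
            exceptions.append({"source": src, "error": str(err)})
    return targets, exceptions


orders = [
    {"order_id": 1, "items": [("A", 2), ("B", 1)]},
    {"order_id": 2, "items": []},  # will be rejected
]
targets, exceptions = run(orders, split_order)

# Reprocessing: fix the rejected source record, then run only the failures again.
exceptions[0]["source"]["items"] = [("C", 4)]
retried, still_failing = run([e["source"] for e in exceptions], split_order)
```

Keeping the original source record alongside the error message is what makes resubmission possible without re-extracting from the source system.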
25
Data Transformation (4)
Data Cleansing - specialized transform process, applied to “dirty” legacy data
– Report on fixes, exceptions
– Ability to resubmit failed rows
– Third-party products
Merge-purge software (typically for addresses)
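A toy version of merge-purge for addresses: normalize each address to a comparison key, keep the first row for each key, and report the rest as duplicates. Commercial products use far more sophisticated matching; the abbreviation table and sample rows here are illustrative only.

```python
import re

# Illustrative only: a tiny abbreviation table for address words.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "suite": "ste"}


def normalize_address(raw: str) -> str:
    """Lowercase, strip punctuation, and abbreviate common words."""
    words = re.sub(r"[^\w\s]", " ", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def merge_purge(rows: list) -> tuple:
    """Keep the first row per normalized address; report the rest as duplicates."""
    survivors, duplicates = {}, []
    for row in rows:
        key = normalize_address(row["address"])
        if key in survivors:
            duplicates.append(row)
        else:
            survivors[key] = row
    return list(survivors.values()), duplicates


rows = [
    {"id": 1, "address": "12 Main Street, Suite 4"},
    {"id": 2, "address": "12 MAIN ST. STE 4"},
    {"id": 3, "address": "99 Oak Avenue"},
]
kept, purged = merge_purge(rows)
```

The `purged` list doubles as the report on fixes: it records exactly which rows were dropped as duplicates so they can be reviewed.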
26
Technology (1)
Selection criteria
– Runs on your hardware and software
– Support for physical data types
Files, databases, message queues
Internet and wireless protocols
– Support for logical data types
Adapters, connectors, pre-built interfaces
Especially for COTS packages (Oracle Apps, PeopleSoft, SAP, Siebel CRM)
27
Technology (2)
Selection criteria
– Ability to write business rules
Language, point-and-click, combination
Ability to use external code (custom or bought)
Ability to reuse components
– Maintainability
Cost-benefit over the system development life cycle
28
Technology (3)
Selection criteria
– Real-time, “near real-time”, batch
– Metadata
Operational
Programmer-oriented (business rules)
– Scalability (maintenance windows?)
– Integration with other tools, skill sets
– Support for corporate standards
– Infrastructure required (middleware)
29
Oracle Technology (1)
PL/SQL and SQL*Loader
Data Mart Suite (RIP)
Oracle Warehouse Builder
Oracle Integration Server (MIA)
XML services
30
Oracle Technology (2)
Database services
– Scheduling (DBMS_JOB)
– Advanced Queuing (AQ)
– Replication
31
Third-Party Tools
Specialized for a processing scenario
– Conversions
– Warehouse loads
– Data Integration (A2A, B2B, B2C)
General purpose
– Mainframe gateways
– Heterogeneous replication
32
Summary
Select a methodology and tool to fit your processing scenario -- more than one tool if necessary
Integrate the tool(s) into your development and maintenance methodology
33
About the Author
Leslie Tierstein is a Technical Project Manager at STR LLC in Fairfax, VA.
She can be reached at: [email protected]