Wrapper-Based Evolution of Legacy Information Systems
From the article by P. Thiran, J. Hainaut, G. Houben and D. Benslimane
Presentation by: Alex Saville, Espen Skogen and Sharon Stanley
Introduction
• Overview
– Why do we need wrappers?
• Meeting The Challenge
– How do they work?
• The Solution
– Schemas and Mapping
• The Methodology
• Conclusion
Overview
Why do we need wrappers?
[Diagram: two separate systems, the RBS DBMS holding RBS Data and the NatWest DBMS holding NatWest Data]
The Tale of Mr Ludd and Mr Moddy (An Analogy for Wrappers)
Mr Ludd
Oldie = manual transmission Newby = automatic transmission
Magic Sunglasses
Mr Moddy
Oldie = manual transmission Newby = automatic transmission
The Challenge
We start with a legacy system reading from and writing to a legacy database management system (DBMS).
• Few explicit constraints (i.e. built into the DBMS setup)
• Many implicit constraints (i.e. not built into the DBMS setup)
[Diagram: Legacy System. The legacy application (application logic plus data integrity checks, validation etc.) reads and writes Legacy Data in the legacy DBMS (limited data integrity, limited validation).]
The Challenge
We will try to put the new application into a modern system, accessing the legacy DBMS.
• No implicit constraints
The new application is designed using modern techniques, but it is unsuitable for the legacy DBMS because it does not include the data integrity checks and validation (i.e. the implicit constraints) that accessing the legacy DBMS calls for.
[Diagram: Legacy System with the legacy application (application logic plus data integrity checks, validation etc.) on Legacy Data in the legacy DBMS (limited data integrity, limited validation). Modern System with the new application (application logic only) and the existing modern application (application logic) on Modern Data in the modern DBMS (includes data integrity, includes validation).]
The Challenge
KABOOM!!!
So we put data integrity checks and validation into the new application too.
This will work fine as long as the new application accesses the legacy DBMS, although it is still using legacy-style techniques.
[Diagram: Legacy System and Modern System. The legacy application and each copy of the new application now combine application logic with data integrity checks, validation etc. when accessing the legacy DBMS (limited data integrity, limited validation); the existing modern application uses the modern DBMS (includes data integrity, includes validation).]
The Challenge
If we later wanted to migrate the new application to the modern DBMS, its own checks would be redundant processing at best. At worst, the data structures would be incompatible (bringing the risk of data corruption), which would mean extra work to ensure compatibility.
[Diagram: the same Legacy System and Modern System components as on the previous slide, illustrating the new application being moved onto the modern DBMS.]
The Challenge
The Proposed Solution: The Wrapper
We will build our new application with a separate wrapper. It is used when reading from and writing to the legacy DBMS. Because it is used for both reading and writing, it is termed an R/W (read/write) wrapper.
[Diagram: Legacy System. The new application (application logic) goes through R/W Wrapper 1 (data integrity checks, validation etc., model conversion) to reach Legacy Data in the legacy DBMS (limited data integrity, limited validation). The legacy application keeps its own application logic and checks.]
If we want to read from or write to the legacy DBMS from the new application in our modern system as well, we can reuse the wrapper from our legacy system.
[Diagram: Legacy System and Modern System. In each, the new application (application logic) reaches the legacy DBMS through the same R/W Wrapper 1 (data integrity checks, validation etc., model conversion); the existing modern application uses Modern Data in the modern DBMS (includes data integrity, includes validation).]
[Diagram: Modern System and Legacy System, this time with R/W Wrapper 2, labelled with model conversion. The other components are the legacy application (application logic plus data integrity checks, validation etc.), the new application in each system, the existing modern application, the legacy DBMS (limited data integrity, limited validation) and the modern DBMS (includes data integrity, includes validation).]
The Wrapper Interface
• How does an application communicate with the wrapper?
• How does the wrapper communicate with the legacy DBMS?
Wrapper queries / updates: the communication protocol for communicating with the wrapper, expressed against the wrapper schema.
Database queries / updates: the communication protocol for communicating with the legacy DBMS, expressed against the legacy schema.
[Diagram: wrapper queries / updates enter the R/W Wrapper (Wrapper Schema), which issues database queries / updates to the legacy DBMS (Legacy Schema; limited data integrity, limited validation).]
R/W Wrapper for Accessing Legacy DBMS
Schemas (diagrams taken from article)
Physical Schema
Customer: Reference, Name, Address, Account (acc: Reference)
Order: Number, Customer, Account, Date, Amount, Product (acc: Customer)
The Physical Schema can be determined by analysing the SQL DDL. Note that the Physical Schema only contains explicit constraints (e.g. uniqueness due to indexes).
In this database, there are two tables: Customer and Order. Customer is indexed on the column Reference. Order is indexed on the column Customer.
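To make the point about explicit constraints concrete, here is a minimal sketch of the physical schema using Python's built-in sqlite3 module. The column types, the index names and the use of plain (non-unique) indexes are assumptions for illustration; the slides only name the tables, columns and indexed columns. Because the DBMS knows nothing about the implicit constraints, it happily accepts an order for a customer that does not exist.

```python
import sqlite3


def build_physical_schema(conn: sqlite3.Connection) -> None:
    """Create the two legacy tables and their indexes, and nothing else."""
    conn.executescript(
        """
        CREATE TABLE Customer (Reference TEXT, Name TEXT, Address TEXT, Account TEXT);
        -- "Order" is quoted because ORDER is a reserved word in SQL.
        CREATE TABLE "Order" (Number INTEGER, Customer TEXT, Account TEXT,
                              Date TEXT, Amount REAL, Product TEXT);
        -- The access keys (acc) of the physical schema; index names are hypothetical.
        CREATE INDEX acc_customer_reference ON Customer (Reference);
        CREATE INDEX acc_order_customer ON "Order" (Customer);
        """
    )


conn = sqlite3.connect(":memory:")
build_physical_schema(conn)

# No customer "C999" exists, yet the insert succeeds: the foreign key is
# implicit, so the DBMS has no way to reject the row.
conn.execute(
    'INSERT INTO "Order" VALUES (?, ?, ?, ?, ?, ?)',
    (1, "C999", "A9", "2007-10-01", 10.0, "Widget"),
)

# Count orders whose Customer value has no matching Customer.Reference.
orphans = conn.execute(
    'SELECT COUNT(*) FROM "Order" o '
    "WHERE NOT EXISTS (SELECT 1 FROM Customer c WHERE c.Reference = o.Customer)"
).fetchone()[0]
print(orphans)
```

In the legacy system it is the application code, not the database, that would normally prevent such a row from being written.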
Schemas (diagrams and examples taken from article)
Logical Schema
Customer: Reference, Name[0-1], Address (Street, Zip, City), Account (id: Reference)
Order: Number, Customer, Account, Date, Amount, Product (id: Number; ref: Customer; fd: Customer → Account)
The next stage is to determine the logical schema. This is derived through analysis of the code and data. For example, the code might check that the Customer value of a new Order exists as a Reference in Customer before inserting the Order. This tells us that Order.Customer is an implicit foreign key.
Data analysis might hint at Account being functionally dependent on Customer, because the Account attribute in Order otherwise seems redundant.
[Diagram: the Physical Schema from the previous slide is shown alongside for comparison.]
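The data analysis step described for the logical schema can be sketched as two small checks over extracted rows. This is an illustrative sketch only: the function names and the sample rows are hypothetical, and a dependency that holds in the current data is a hint for the analyst, not proof of a constraint.

```python
from collections import defaultdict


def fd_holds(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    return all(len(values) == 1 for values in seen.values())


def dangling_references(child_rows, child_col, parent_rows, parent_col):
    """Return the child values that have no matching parent value."""
    parents = {row[parent_col] for row in parent_rows}
    return sorted({row[child_col] for row in child_rows} - parents)


# Hypothetical sample data in the shape of the physical schema.
customers = [
    {"Reference": "C1", "Account": "A1"},
    {"Reference": "C2", "Account": "A2"},
]
orders = [
    {"Number": 1, "Customer": "C1", "Account": "A1"},
    {"Number": 2, "Customer": "C1", "Account": "A1"},
    {"Number": 3, "Customer": "C2", "Account": "A2"},
]

# Does Customer determine Account within Order (fd: Customer -> Account)?
print(fd_holds(orders, "Customer", "Account"))

# Does every Order.Customer value appear as a Customer.Reference?
print(dangling_references(orders, "Customer", customers, "Reference"))
```

An empty list from the second check is consistent with Order.Customer being an implicit foreign key; code analysis is still needed to confirm that the applications actually enforce it.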
Schemas (diagrams taken from article)
Wrapper Schema
Customer: Reference, Name[0-1], Address (Street, Zip, City), Account (id: Reference)
Order: Number, Customer, Date, Amount, Product (id: Number; ref: Customer)
[Diagram: the Logical Schema from the previous slide is shown alongside for comparison.]
Finally, we must create the wrapper schema from the logical schema.
• The logical schema must be made to comply with how the wrapper needs to see its data
• Any redundant data must be hidden from the wrapper schema, while still being managed, e.g. Account occurred in both tables, so one occurrence is redundant. Therefore it has been kept with Customer, not Order.
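For the read direction, hiding the redundant Account column from the wrapper schema can be sketched with a database view, again using sqlite3. This is one possible way to illustrate the idea, not the mechanism described in the article: the view name WrapperOrder and the column types are assumptions, and the decomposition of Address into Street, Zip and City is not attempted because the slides do not give its storage format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Physical table, still carrying the redundant Account column.
    CREATE TABLE "Order" (Number INTEGER, Customer TEXT, Account TEXT,
                          Date TEXT, Amount REAL, Product TEXT);
    -- Hypothetical view exposing Order as the wrapper schema presents it.
    CREATE VIEW WrapperOrder AS
        SELECT Number, Customer, Date, Amount, Product FROM "Order";
    """
)
conn.execute(
    'INSERT INTO "Order" VALUES (?, ?, ?, ?, ?, ?)',
    (1, "C1", "A1", "2007-10-01", 10.0, "Widget"),
)

# A client reading through the wrapper schema never sees Account.
cursor = conn.execute("SELECT * FROM WrapperOrder")
wrapper_columns = [column[0] for column in cursor.description]
wrapper_rows = cursor.fetchall()
print(wrapper_columns)
print(wrapper_rows)
```

The redundant column is still stored and must still be kept consistent on writes, which is the wrapper's job on the update path.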
Mapping (diagrams taken from article)
Wrapper Update (against the Wrapper Schema)
    INSERT INTO Order (Number, Customer, Date, Amount, Product)
    VALUES (:Number, :Customer, :Date, :Amount, :Product);
Database Query/Update (against the Physical Schema)
    if exists (SELECT * FROM Order WHERE Number = :Number)
       or not exists (SELECT * FROM Customer WHERE Reference = :Customer)
    then
      return error;
    else
      SELECT Account INTO :Account FROM Customer WHERE Reference = :Customer;
      INSERT INTO Order (Number, Customer, Account, Date, Amount, Product)
      VALUES (:Number, :Customer, :Account, :Date, :Amount, :Product);
    endif;
[Diagram: the Wrapper Schema and the Physical Schema are repeated on this slide.]
Mapping (diagrams taken from article)
We will now examine the database query/update that has resulted.
1. Implicit Constraint Management (the wrapper checks the constraints implied by the implicit identifier (exists) and the implicit foreign key (not exists)).
2. Data Error Management (the wrapper reports violations of implicit constraints).
3. Redundancy Management (the wrapper assigns the value of the redundant data (Account) from the source data).
4. Query Translation (the wrapper translates the update against the wrapper schema into updates on the physical schema).
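The four responsibilities seen in the database query/update can be sketched as a small Python class over sqlite3. This is a hedged illustration, not the wrapper generated in the article: the names OrderWrapper, WrapperError and insert_order, the column types and the sample data are all hypothetical. The check-then-insert sequence is not made atomic here, since concurrency and transaction management are out of scope for this presentation.

```python
import sqlite3


class WrapperError(Exception):
    """Raised when an update would violate an implicit constraint."""


class OrderWrapper:
    """Hypothetical R/W wrapper handling Order inserts against the physical schema."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def insert_order(self, number, customer, date, amount, product) -> None:
        cur = self.conn.cursor()
        # 1. Implicit constraint management: Number is an implicit identifier.
        if cur.execute('SELECT 1 FROM "Order" WHERE Number = ?', (number,)).fetchone():
            # 2. Data error management: report the violation to the caller.
            raise WrapperError(f"Order {number} already exists")
        # 1. Implicit foreign key: the customer must exist.
        # 3. Redundancy management: the same lookup yields the Account value.
        row = cur.execute(
            "SELECT Account FROM Customer WHERE Reference = ?", (customer,)
        ).fetchone()
        if row is None:
            raise WrapperError(f"Customer {customer} does not exist")
        account = row[0]
        # 4. Query translation: the wrapper-schema insert (no Account) becomes
        #    a physical-schema insert that includes Account.
        cur.execute(
            'INSERT INTO "Order" (Number, Customer, Account, Date, Amount, Product) '
            "VALUES (?, ?, ?, ?, ?, ?)",
            (number, customer, account, date, amount, product),
        )
        self.conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (Reference TEXT, Name TEXT, Address TEXT, Account TEXT);
    CREATE TABLE "Order" (Number INTEGER, Customer TEXT, Account TEXT,
                          Date TEXT, Amount REAL, Product TEXT);
    """
)
conn.execute("INSERT INTO Customer VALUES (?, ?, ?, ?)", ("C1", "Ann", "1 High St", "A1"))

wrapper = OrderWrapper(conn)
wrapper.insert_order(1, "C1", "2007-10-01", 10.0, "Widget")
stored_account = conn.execute('SELECT Account FROM "Order" WHERE Number = 1').fetchone()[0]
print(stored_account)
```

The client only ever supplies the five wrapper-schema attributes; the Account value stored in Order comes from the Customer row, so the two copies cannot drift apart through this path.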
Wrapping Up: A Summary of the Basic Architecture of an R/W Wrapper
• Query / update analysis
• Error reporting
• Query / update and data translation
• Implicit constraint control
• Security*
• Concurrency and transaction management*
* Out of scope for the article, and therefore for this presentation.
Generic Methodology
• Database reverse engineering
– No up-to-date documentation
• Wrapper generation
– Mapping to functions
Conclusion
• Relatively smooth transition
• Only one area of coding
• Allows any program to read and update
• Controls the integrity of data
• Relatively time and cost effective
References
• P. Thiran, J. Hainaut, G. Houben and D. Benslimane (2006), ‘Wrapper-Based Evolution of Legacy Information Systems’, ACM Transactions on Software Engineering and Methodology, Vol 15, No 4, Pages 329-359
• Dr. David Corman, The Boeing Company (2001), The IULS Approach to Software Wrapper Technology for Upgrading Legacy Systems, http://www.stsc.hill.af.mil/crosstalk/2001/12/corman.html , accessed October 2007
• Dean, J., Li, L. (2002), Issues in Developing Security Wrapper Technology for COTS Software Products, http://iit-iti.nrc-cnrc.gc.ca/publications/nrc-44924_e.html , accessed October 2007
• Kelly, R., Fritsch, M. (2003), Improved effectiveness with information consolidation - Creating information transparency and consistency, http://download.oracle.com/owparis_2003/40266.doc , accessed October 2007
Questions?