130214 wei wu - extracting business rules and removing duplication with iris
DESCRIPTION
Business rules, extraction, matching, cleaning.
TRANSCRIPT
Extracting Business
Rules and Removing
Duplication With IRIS
Wei Wu, École Polytechnique de Montréal
Stéphane Vaucher, Benchmark Consulting
Outline
Business Rule (BR)
Problem
Previous Works
BR Extraction
Detect Duplication
Examples
TODO
Business Rule
Business rules (BRs) are statements that prevent,
cause, or suggest business activities to happen.
- The GUIDE Business Rules Project
Example:
If the driver age is under 25, the car rent rate is $80 per
day; otherwise, it is $60 per day.
Business Rule
OMG Semantics of Business Vocabulary and
Business Rules (SBVR)
Conceptual Model
Business Rule
Rule: If the driver age is under 25, the car rent rate is $80 per day; otherwise, it is $60 per day.
Term:
Driver age
Car rent rate
Fact:
The driver age is under 25
Set car rent rate to $80
Set car rent rate to $60
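The Rule/Term/Fact breakdown above can be sketched as plain data. This is a minimal illustration only: the class names Term, Fact and Rule and the function car_rent_rate are hypothetical and are not the SBVR or KDM types.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    name: str


@dataclass(frozen=True)
class Fact:
    text: str


@dataclass
class Rule:
    text: str
    terms: list = field(default_factory=list)
    facts: list = field(default_factory=list)


def car_rent_rate(driver_age: int) -> int:
    """Daily rate in dollars for the example rule on the slide."""
    return 80 if driver_age < 25 else 60


# The example rule, decomposed into the terms and facts listed on the slide.
rule = Rule(
    text="If the driver age is under 25, the car rent rate is $80 per day; "
         "otherwise, it is $60 per day.",
    terms=[Term("Driver age"), Term("Car rent rate")],
    facts=[
        Fact("The driver age is under 25"),
        Fact("Set car rent rate to $80"),
        Fact("Set car rent rate to $60"),
    ],
)
```

One condition fact and two action facts: the same shape the extraction slides later build from source code.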
Problem
Written by business analysts
Implemented in business systems
Documents and implementations are not well
synchronized
Actual BRs are only in the source code
No reliable way to extract them
Previous Works
Erik Putrycz and Anatol W. Kark, Recovering Business Rules from Legacy Source Code, Proceedings of the International RuleML Symposium on Rule Interchange and Applications (RuleML-2007), 2007
Harry M. Sneed, Extracting Business Logic from Existing COBOL Programs as a Basis for Redevelopment, Proceedings of the 9th International Workshop on Program Comprehension (IWPC'01), 2001
Huang et al., Business Rule Extraction from Legacy Code, Proceedings of the 20th Conference on Computer Software and Applications (COMPSAC '96), 1996
Harry M. Sneed and Katalin Erdos, Extracting Business Rules from Source Code, Proceedings of the 4th International Workshop on Program Comprehension (WPC '96), 1996
SoftwareMining's BRE Toolkit, http://www.softwaremining.com/services/Business_Rule_Extraction.jsp
Limitations
Not KDM-based: language-specific
Do not distinguish business variables from
non-business variables
Need manually identified variables of interest
Do not handle duplication
Term Unit Extraction
DB Related DataElements
ColumnSet – ItemUnit
ColumnSet – InDataRelations – DataModel –
DataElements
DataModel – DataAction - DataElement
DB Related DataElements - TermUnit
Identify DB-related Actions
ActionElements InReads and InWrites of DB Related DataElements
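As a rough sketch of this step, an action can be treated as DB-related when it reads or writes at least one DB-related DataElement. The Action class, the field names and the sample data below are hypothetical stand-ins, not the KDM API.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def db_related_actions(actions, db_elements):
    """Names of actions that read or write any DB-related data element."""
    return [a.name for a in actions if (a.reads | a.writes) & db_elements]


# Hypothetical data: DRIVER-AGE is assumed to come from a database column.
db_elements = {"DRIVER-AGE"}
actions = [
    Action("if-age-under-25", reads={"DRIVER-AGE"}),
    Action("set-rate", writes={"RENT-RATE"}),
    Action("load-age", reads={"SQL-ROW"}, writes={"DRIVER-AGE"}),
]

selected = db_related_actions(actions, db_elements)
```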
Top-level Conditions
if(no-db-related-Data){
    if(db-related-data1){
        if(db-related-data2){
        }
    }
    if(db-related-data3){
    }
}
Identify Top-level Condition
A condition that is not dominated by any condition that
accesses a DB-related DataElement
To detect not-top-level conditions
Get basic-block (BB) of each condition
Get all the dominators of the BB
Get the InFlow of the first ActionElement of each dominator
If the InFlow is a TrueFlow or FalseFlow and the condition of
the flow accesses some DB-related DataElement, the condition is not top-level
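The detection steps can be sketched on a toy control-flow graph for a nested-if shape with one non-DB outer condition (c0) and three DB-related conditions (c1, c2 nested in c1, and c3). Everything here is an assumption made for illustration: the block names, the edge labels, the choice to count a block among its own dominators, and the treatment of join blocks as having no conditional InFlow.

```python
# Hypothetical CFG: block -> list of (successor, flow kind).
SUCCESSORS = {
    "B0": [("B1", "TrueFlow"), ("EXIT", "FalseFlow")],  # holds c0
    "B1": [("B2", "TrueFlow"), ("B4", "FalseFlow")],    # holds c1
    "B2": [("B3", "TrueFlow"), ("B4", "FalseFlow")],    # holds c2
    "B3": [("B4", "Flow")],                             # body of c2
    "B4": [("B5", "TrueFlow"), ("EXIT", "FalseFlow")],  # holds c3
    "B5": [("EXIT", "Flow")],                           # body of c3
    "EXIT": [],
}
CONDITION_OF = {"B0": "c0", "B1": "c1", "B2": "c2", "B4": "c3"}
DB_RELATED = {"c0": False, "c1": True, "c2": True, "c3": True}


def predecessors(succ):
    preds = {b: [] for b in succ}
    for src, edges in succ.items():
        for dst, kind in edges:
            preds[dst].append((src, kind))
    return preds


def dominators(succ, entry):
    """Classic iterative dominator computation; each block dominates itself."""
    blocks = set(succ)
    preds = predecessors(succ)
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p, _ in preds[b]]
            new = {b} | (set.intersection(*incoming) if incoming else set())
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def conditional_in_flow(block, preds):
    """Condition guarding the block if it is entered by a single True/False flow."""
    incoming = preds[block]
    if len(incoming) == 1 and incoming[0][1] in ("TrueFlow", "FalseFlow"):
        return CONDITION_OF[incoming[0][0]]
    return None


def top_level_conditions(succ, entry):
    preds = predecessors(succ)
    dom = dominators(succ, entry)
    result = []
    for block, cond in CONDITION_OF.items():
        if not DB_RELATED[cond]:
            continue
        guards = (conditional_in_flow(d, preds) for d in dom[block])
        if not any(g is not None and DB_RELATED[g] for g in guards):
            result.append(cond)
    return sorted(result)


top_level = top_level_conditions(SUCCESSORS, "B0")
```

In this toy graph c2 is rejected because its block is entered through the TrueFlow of c1, while c1 and c3 are guarded only by the non-DB condition c0.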
RuleUnit Extraction 2-1
Create a RuleUnit for the TrueFlow (or FalseFlow)
of each top-level condition
Put the condition to the implementation list of
the RuleUnit
Create a FactUnit for the condition
Create a ConceptualRole for the FactUnit and
put it to the ConceptualElement list of the
RuleUnit
RuleUnit Extraction 2-2
Create a FactUnit for each ActionElement in the
branch
Create a ConceptualRole for the FactUnit and put it to
the ConceptualElement list of the RuleUnit
Create a normalized ID for the RuleUnit based on all
its ConceptualRoles
If there’s another RuleUnit with the same ID, just add
the implementation list to that RuleUnit.
Otherwise, add the RuleUnit to the Conceptual Model
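The merge-by-ID step can be sketched with a dictionary keyed by the normalized ID. The slides do not say how the ID is built, so the sketch assumes it is the whitespace- and case-normalized role texts joined in order. RuleUnit, normalized_id and add_rule are hypothetical names, and the location strings are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class RuleUnit:
    roles: tuple
    implementations: list = field(default_factory=list)


def normalized_id(roles):
    """Assumed ID scheme: upper-cased, whitespace-collapsed roles joined in order."""
    return "|".join(" ".join(r.upper().split()) for r in roles)


def add_rule(model, roles, location):
    """Add a RuleUnit, or attach the location to an existing unit with the same ID."""
    rid = normalized_id(roles)
    if rid in model:
        model[rid].implementations.append(location)
    else:
        model[rid] = RuleUnit(roles=tuple(roles), implementations=[location])
    return rid


model = {}
# The same rule found at two places in the code, with different spacing.
add_rule(model, ["TAXL-QTY != ZERO", "PERFORM FETCH-ROW-TXL-OPEN-FIFO"], "first copy")
add_rule(model, ["TAXL-QTY  !=  ZERO", "perform FETCH-ROW-TXL-OPEN-FIFO"], "second copy")
```

Duplicates collapse into a single RuleUnit whose implementation list records every place the rule occurs.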
ActionElement Normalization
Intermediate Representation
Left - OutReads
Operator - Type
Right – OutReads
Conditions
NNF
DNF
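A small sketch of the two condition normal forms, assuming conditions are held as nested tuples ("var", name), ("not", e), ("and", a, b) and ("or", a, b). This encoding is made up for the example; the slides do not describe the tool's representation in this detail.

```python
def to_nnf(e):
    """Negation normal form: push NOT inward until it sits only on variables."""
    op = e[0]
    if op == "var":
        return e
    if op in ("and", "or"):
        return (op, to_nnf(e[1]), to_nnf(e[2]))
    inner = e[1]  # op == "not"
    if inner[0] == "var":
        return e
    if inner[0] == "not":
        return to_nnf(inner[1])  # double negation
    flipped = "or" if inner[0] == "and" else "and"  # De Morgan
    return (flipped, to_nnf(("not", inner[1])), to_nnf(("not", inner[2])))


def to_dnf(e):
    """Disjunctive normal form of an NNF expression, as a list of literal lists."""
    op = e[0]
    if op in ("var", "not"):
        return [[e]]
    left, right = to_dnf(e[1]), to_dnf(e[2])
    if op == "or":
        return left + right
    return [l + r for l in left for r in right]  # distribute AND over OR


x, y, z = ("var", "x"), ("var", "y"), ("var", "z")

# x AND (y OR z) becomes (x AND y) OR (x AND z)
dnf_1 = to_dnf(to_nnf(("and", x, ("or", y, z))))

# NOT (x AND (y OR z)) becomes (NOT x) OR (NOT y AND NOT z)
dnf_2 = to_dnf(to_nnf(("not", ("and", x, ("or", y, z)))))
```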
ActionElement Normalization
Alphabetical Normalization
b == a -> a == b
b < a -> a > b
d>c && b<a -> a>b && c<d
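The three rewrites above can be reproduced with a short sketch: put the alphabetically smaller operand on the left, mirror the operator when the operands are swapped, then sort the conjuncts. The function names and the tuple encoding are assumptions made for illustration.

```python
# Operator to use after swapping the left and right operands.
MIRROR = {"==": "==", "!=": "!=", "<": ">", ">": "<", "<=": ">=", ">=": "<="}


def normalize_comparison(left, op, right):
    """Order the operands alphabetically, mirroring the operator on a swap."""
    if left > right:
        left, right, op = right, left, MIRROR[op]
    return (left, op, right)


def normalize_conjunction(comparisons):
    """Normalize each comparison, then sort the conjuncts."""
    return sorted(normalize_comparison(*c) for c in comparisons)


eq = normalize_comparison("b", "==", "a")
lt = normalize_comparison("b", "<", "a")
conj = normalize_conjunction([("d", ">", "c"), ("b", "<", "a")])
```

Two conditions that differ only in operand or conjunct order then get the same normalized form, which is what makes ID-based duplicate detection possible.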
Examples
BR:
EVALUATE SQLSTATE
    WHEN SQL-OK
        IF WORK-HV-TXL-CLS-WASH-SALE-AMT >= 0
            IF WORK-HV-TXL-CLS-RECORD-DEL-SW = 'N' AND
               TAXL-TRANS-TYPE-NO = 930
                PERFORM RECALCULATE-AVG-WASH-QTY
                SET PROCESS-AVG-WASH
                    TO TRUE
            ELSE
                IF WORK-HV-TXL-CLS-RECORD-DEL-SW = 'Y' AND
                   (TAXL-TRANS-TYPE-NO = 500 OR 600)
                    PERFORM RECALCULATE-AVG-WASH-QTY
                    SET PROCESS-AVG-WASH
                        TO TRUE
                END-IF
            END-IF
        END-IF
    WHEN OTHER
        PERFORM ERROR-AVG-WASH
END-EVALUATE.
Examples
Duplicated
IF HV-TXL-MST-OPEN-DATE IS LESS THAN
   WORK-TAXL-UPDT-DIV-EX-DT-DB2 OR
   HV-TXL-MST-OPEN-DATE IS EQUAL TO DF-NINES-DATE
    PERFORM UPDATE-LIQUIDATION-PAYMENT
END-IF
IF TAXL-QTY IS NOT EQUAL TO ZERO
    PERFORM FETCH-ROW-TXL-OPEN-FIFO
END-IF

IF HV-TXL-MST-OPEN-DATE IS LESS THAN
   WORK-TAXL-UPDT-DIV-EX-DT-DB2 OR
   HV-TXL-MST-OPEN-DATE IS EQUAL TO DF-NINES-DATE
    PERFORM UPDATE-LIQUIDATION-PAYMENT
END-IF
IF TAXL-QTY IS NOT EQUAL TO ZERO
    PERFORM FETCH-ROW-TXL-OPEN-FIFO
END-IF
Examples
Data ActionElements: 2,084
Code ActionElements: 47,699
IF conditions: 2,724
DB-related ActionElements: 12,195
DB-related IF conditions: 621
Top-level IF conditions: 591
ActionElements in BRs: 2,038
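For orientation, two ratios follow directly from the IF-condition counts above. Only the slide's own numbers are used; the variable names are made up, and no ratio is taken for ActionElements because the slide does not say which total is the right denominator.

```python
if_conditions = 2724        # all IF conditions
db_if_conditions = 621      # IF conditions that access DB-related data
top_level_conditions = 591  # DB-related IF conditions that are top-level

db_share = db_if_conditions / if_conditions
top_level_share = top_level_conditions / db_if_conditions

print(f"DB-related share of IF conditions: {db_share:.1%}")          # 22.8%
print(f"Top-level share of DB-related IFs: {top_level_share:.1%}")   # 95.2%
```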
Ongoing Works
Compute the percentage of ActionElements
involved in BRs
Compare with graph-matching techniques
for identifying duplicated or similar BRs
Future Works
Get feedback from business people
Multi-objective way to detect duplicated or
similar BRs
Peephole Optimization
Remove irrelevant ActionElements
(use-def/def-use chains)
Inter-procedural analysis
Thank You!