Formal concept analysis based normal forms for class hierarchy design in object-oriented software development
http://www.info.uqam.ca/~godin/ICFCA2003.ppt
Robert Godin, UQAM
First Conference on "Formal Concept Analysis", Darmstadt, Feb. 2003
Collaborators
Rokia Missaoui, Hafedh Mili (UQAM)
Guy Mineau (ULaval)
Petko Valtchev, Houari Sahraoui (UdeM)
Marianne Huchard (LIRMM)
Plan
The problem of class hierarchy design
Quality criteria
Work on FCA for class hierarchies
FCA framework
Normal forms (design patterns) for class hierarchies
– Attributes
– Methods: body/signature/redefinition
– Associations
Problem: designing & maintaining good class hierarchies
Difficult problem (Booch, 1994; Rumbaugh, Blaha, Premerlani, Eddy & Lorensen, 1991)
– Large solution space
– Evolution
– Conflicting criteria
Large body of work
– (Godin, Huchard, Roume & Valtchev, 2002)
Development scenarios
Building the hierarchy from scratch using
– objects (Lieberherr, Bergstein & Silva-Lepe, 1991)
– class specifications (Dvorak, 1994; Godin & Mili, 1993)
Evolution of the class hierarchy to accommodate new requirements
– unconstrained class addition (Dvorak, 1994; Godin & Mili, 1993)
– addition constrained by backward compatibility with
  a previous hierarchy (Rapicault & Napoli, 2001)
  existing objects (Huchard, 1999)
Reengineering of an existing class hierarchy
– from the relation between classes and their attributes/methods (Casais, 1991; Cook, 1992)
– using code analysis tools (Dicky, Dony, Huchard & Libourel, 1996; Godin, Mili, Mineau, Missaoui, Arfi & Chau, 1998)
– by applying refactorings (Fowler, 2002; Moore, 1996)
– from UML models including associations (Huchard, Roume, Valtchev…)
– from access patterns in applications (Snelting & Tip, 2000)
– prompted by detecting defects using software metrics (Sahraoui, Godin & Miceli, 2000)
Reengineering procedural code (Sahraoui, Lounis, Melo & Mili, 1999; Tonella, 2001)
Merging existing partial hierarchies (Snelting & Tip, 2002)
Formal quality criteria
Minimize redundancy (code duplication)
– Maximal factorization
Subclass as specialization
Multiple inheritance only if necessary
Limit number of classes
Guaranteed by FCA framework
Using FCA for class hierarchy
Initial proposal, OOPSLA’93 (Godin & Mili, 1993)
– Object Oriented Reorganization using GAlois lattices SysteM
LIRMM: ARES incremental alg. (Dicky, Dony, Huchard & Libourel, 1994)
– Galois subhierarchy
ISGOOD incremental alg. (Godin, Mineau, Missaoui, 95; Godin & Chau, 2000)
ARES++: overloading/extracting specs. from code, OOPSLA’96 (Dicky et al., 1996)
Combined with method refactoring: GURU, OOPSLA’96 (Moore, 96)
Object database design (Yahia, Lakhal, Bordat & Cicchetti, 1996)
Smalltalk tool, exp., metrics (Godin, Mili, Mineau, Missaoui, Arfi & Chau, 1998)
Access patterns from applications, TOPLAS (Snelting & Tip, 2000)
CERES: Java tool, batch alg. (Huchard, Dicky, Leblanc, 2000)
Factoring associations (Huchard, Roume, Valtchev, 2002)
FCA framework
Context K := (G, M, I)
G (objects of context)
– classes
– example objects
– associations
M (attributes of context)
– instance variables (OO attributes)
  values
– methods body/signature
– association role
– association properties
Relation gIm
– From analyst
– From code
– From access patterns in applications
Example : class attributes
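Context (objects 1 to 4, attributes a to f):

    a  b  c  d  e  f
1   X              X
2   X  X  X
3   X  X     X
4      X     X  X

Corresponding classes:
Class1: a, f
Class2: a, b, c
Class3: a, b, d
Class4: b, d, e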
Reduced labelling of concept lattice
[Figure: concept lattice of the example context with reduced labelling. Attribute labels: a, b, c, d, e, f. Object labels: 1, 2, 3, 4.]
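The lattice and its reduced labelling can be recomputed from the example context. The following is a minimal sketch, not the algorithm used by the tools cited in these slides: it closes every subset of objects, which is exponential and only suitable for tiny contexts. The function names (`intent`, `extent`, `concepts`, `reduced_labelling`) are illustrative assumptions.

```python
from itertools import combinations

# Example context from the slide: each object (1 to 4) with its attributes.
CONTEXT = {
    "1": {"a", "f"},
    "2": {"a", "b", "c"},
    "3": {"a", "b", "d"},
    "4": {"b", "d", "e"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def intent(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return frozenset(shared)


def extent(attributes):
    """Objects that own every attribute in `attributes`."""
    return frozenset(g for g, owned in CONTEXT.items() if attributes <= owned)


def concepts():
    """Naive enumeration: close every subset of objects."""
    found = set()
    for size in range(len(CONTEXT) + 1):
        for subset in combinations(sorted(CONTEXT), size):
            shared = intent(subset)
            found.add((extent(shared), shared))
    return found


def reduced_labelling():
    """Map each concept to (object labels, attribute labels).

    An object labels the concept whose intent is exactly its attribute set;
    an attribute labels the concept whose extent is exactly its owner set.
    """
    labels = {}
    for ext, shared in concepts():
        own_objects = {g for g in ext if intent({g}) == shared}
        own_attributes = {m for m in shared if extent({m}) == ext}
        labels[(ext, shared)] = (own_objects, own_attributes)
    return labels
```

Calling `reduced_labelling()` yields one entry per concept, including the top and bottom concepts. Concepts with an empty label pair are the candidates for empty classes.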
Interpretation as class hierarchy : attribute factored lattice form
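[Figure: the labelled concept lattice and the class hierarchy read from it.]
Class1: f
Class2: c
Class3: (no attribute declared)
Class4: e
Class5: (no attribute declared)
Class6: d
Class7: a
Class8: b
Class9: (no attribute declared)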
Minimize number of classes while preserving other quality criteria
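[Figure: three alternative hierarchies for the same two classes.]
Hierarchy 1
– Class1: a, b, c
– Class2: a, b, d
Hierarchy 2
– Class1: c
– Class2: d
– Class3: a, b
Hierarchy 3
– Class1: c
– Class2: d
– Class3: a
– Class4: b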
Prune empty classes : FCA object/attribute concepts (Galois subhierarchy, PICH)
[Figure: the labelled concept lattice before and after pruning (attribute labels a to f, object labels 1 to 4).]
Attribute factored subhierarchy form
[Figure: the pruned lattice and the class hierarchy read from it.]
Class1: f
Class2: c
Class3: (no attribute declared)
Class4: e
Class6: d
Class7: a
Class8: b
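The pruning step keeps only object concepts and attribute concepts, so the subhierarchy can be built without enumerating the full lattice. The sketch below assumes the example context with objects 1 to 4. `galois_subhierarchy` and its helpers are hypothetical names, and the result maps each kept concept to the attributes its class declares.

```python
# Example context from the slides: object -> attributes.
CONTEXT = {
    "1": {"a", "f"},
    "2": {"a", "b", "c"},
    "3": {"a", "b", "d"},
    "4": {"b", "d", "e"},
}


def extent(attributes):
    """Objects that own every attribute in `attributes`."""
    return frozenset(g for g, owned in CONTEXT.items() if set(attributes) <= owned)


def intent(objects):
    """Attributes shared by all objects in a non-empty set of objects."""
    return frozenset(set.intersection(*(CONTEXT[g] for g in objects)))


def galois_subhierarchy():
    """Return {(extent, intent): declared attributes} for object and attribute concepts only."""
    nodes = {}
    # Object concepts: the intent is the object's own attribute set.
    for owned in CONTEXT.values():
        shared = frozenset(owned)
        nodes.setdefault((extent(shared), shared), set())
    # Attribute concepts: the extent is the set of owners of the attribute.
    for m in set().union(*CONTEXT.values()):
        owners = extent({m})
        nodes.setdefault((owners, intent(owners)), set()).add(m)
    return nodes
```

On the example this gives seven nodes. The concept with intent {a, b} and the top concept are dropped because no object or attribute introduces them, while the concept of object 3 is kept even though it declares nothing.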
The case for empty classes?
Method body factored lattice/subhierarchy form
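Source classes (objects 1 to 5 of the context):
Class1: a1()
Class2: a2()
Class3: b1(), c1()
Class4: b1(), c2()
Class5: b2(), c1()

Context:

    a1  a2  b1  b2  c1  c2
1   X
2       X
3           X       X
4           X           X
5               X   X

[Figure: concept lattice with reduced labelling. Attribute labels: a1, a2, b1, b2, c1, c2. Object labels: 1, 2, 3, 4, 5.]

Factored hierarchy:
Class1: a1()
Class2: a2()
Class3: (no method declared)
Class4: c2()
Class5: b2()
Class6: b1()
Class7: c1()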
Declaration of method signatures
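Use a many-valued attribute for each signature
+ Scaling
[Figure: one scale per signature. In each scale the signature lies above its bodies: a above a1 and a2, b above b1 and b2, c above c1 and c2.]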
Derived one-valued context
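Many-valued context (one column per signature, values are method bodies):

    a    b    c
1   a1
2   a2
3        b1   c1
4        b1   c2
5        b2   c1

Derived one-valued context after scaling:

    a  a1  a2  b  b1  b2  c  c1  c2
1   X  X
2   X      X
3              X  X       X  X
4              X  X       X      X
5              X      X   X  X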
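The scaling step for signatures can be sketched as follows, assuming the five example classes and a scale in which each body value contributes both its signature and the body itself. The names `MANY_VALUED`, `scale`, `derive` and `owners` are illustrative, not taken from the slides.

```python
# Many-valued context: object -> {signature: method body}.
MANY_VALUED = {
    "1": {"a": "a1"},
    "2": {"a": "a2"},
    "3": {"b": "b1", "c": "c1"},
    "4": {"b": "b1", "c": "c2"},
    "5": {"b": "b2", "c": "c1"},
}


def scale(signature, body):
    """Scale one value: the object gets the signature and the body as attributes."""
    return {signature, body}


def derive(many_valued):
    """Build the derived one-valued context: object -> set of attributes."""
    derived = {}
    for obj, row in many_valued.items():
        attributes = set()
        for signature, body in row.items():
            attributes |= scale(signature, body)
        derived[obj] = attributes
    return derived


def owners(context, attribute):
    """Objects of a one-valued context that own `attribute`."""
    return {g for g, owned in context.items() if attribute in owned}
```

In the derived context the signatures b and c have the same owners (objects 3, 4 and 5), which is why they label a single concept and end up declared together in one class.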
Concept lattice
[Figure: concept lattice of the derived context with reduced labelling. Attribute labels: a; a1; a2; b, c (same concept); b1; b2; c1; c2. Object labels: 1, 2, 3, 4, 5.]
Method declaration and body factored lattice/subhierarchy
[Figure: concept lattice of the derived context and the class hierarchy read from it.]
Class1: a1()
Class2: a2()
Class3: (no method declared)
Class4: c2()
Class5: b2()
Class6: b1()
Class7: c1()
Class8: a()
Class9: b(), c()
Method redefinition
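By scaling
[Figure: scales used for redefinition. For a, the bodies a1 and a2 lie side by side below a. For b and c, the bodies form a chain: b above b1 above b2, and c above c1 above c2.]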
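Redefinition can be encoded by changing the scale so that a redefining body also carries the body it redefines. The sketch below assumes that b2 redefines b1 and c2 redefines c1. That direction is read off the resulting hierarchy in these slides (b2() and c2() sit below the class holding b1() and c1()) and is an assumption, as are the names `REDEFINES`, `scale` and `derive`.

```python
# Many-valued context: object -> {signature: method body}.
MANY_VALUED = {
    "1": {"a": "a1"},
    "2": {"a": "a2"},
    "3": {"b": "b1", "c": "c1"},
    "4": {"b": "b1", "c": "c2"},
    "5": {"b": "b2", "c": "c1"},
}

# Assumption: body -> the body it redefines (a1 and a2 are unrelated).
REDEFINES = {"b2": "b1", "c2": "c1"}


def scale(signature, body):
    """Scale one value: signature, body, and every body reached through REDEFINES."""
    attributes = {signature, body}
    while body in REDEFINES:
        body = REDEFINES[body]
        attributes.add(body)
    return attributes


def derive(many_valued):
    """Build the derived one-valued context: object -> set of attributes."""
    derived = {}
    for obj, row in many_valued.items():
        attributes = set()
        for signature, body in row.items():
            attributes |= scale(signature, body)
        derived[obj] = attributes
    return derived
```

With this scale the attribute set of object 3 is contained in those of objects 4 and 5, so the class of object 3 becomes their common superclass and declares b(), b1(), c() and c1().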
Method redefinition factored lattice/subhierarchy
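[Figure: concept lattice with reduced labelling. Attribute labels: a; a1; a2; b, c, b1, c1 (same concept); b2; c2. Object labels: 1, 2, 3, 4, 5.]
Class1: a1()
Class2: a2()
Class3: b(), b1(), c(), c1()
Class4: c2()
Class5: b2()
Class6: a()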
Maximal factorization
H is maximally factorized if, whenever
– x1 is in Class1
– x2 is in Class2
– x3 is the least upper bound of x1 and x2
then
– x3 is in a superclass of Class1 and Class2
Maximal factorization
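Redefinition factored hierarchy:
Class1: a1()
Class2: a2()
Class3: b(), b1(), c(), c1()
Class4: c2()
Class5: b2()
Class6: a()

Declaration and body factored hierarchy:
Class1: a1()
Class2: a2()
Class3: (no method declared)
Class4: c2()
Class5: b2()
Class6: b1()
Class7: c1()
Class8: a()
Class9: b(), c()

[Figure: scales. a above a1 and a2; b above b1 above b2.]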
Associations : value of many-valued attribute is an FCA object
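Many-valued context (the value of ass is an object of the context):

    a  b  c  d  e  ass
1   X              3
2      X           4
3         X  X
4         X     X

After scaling, ass becomes the two columns ass=3 and ass=4.

Classes:
C1: a, ass
C2: b, ass
C3: c, d
C4: c, e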
First concept lattice : B0
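[Figure: concept lattice B0 of the scaled context with reduced labelling. Labels: a, ass:3 (object 1); b, ass:4 (object 2); c; d (object 3); e (object 4).]
C1: a, ass
C2: b, ass
C3: d
C4: e
C5: c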
Relational extension K+ with B0
Extended context K+ (ass:5 refers to concept C5 of B0):

    a  b  c  d  e  ass:3  ass:4  ass:5
1   X              X             X
2      X                  X      X
3         X  X
4         X     X

C1: a, ass
C2: b, ass
C3: d
C4: e
C5: c
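One step of the relational extension can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of the cited papers: concepts are enumerated naively, the top and bottom concepts are skipped, and a relational attribute is named after the extent of the concept it points to, so `ass:34` plays the role of ass:5 (concept C5) in the slides. All function names are hypothetical.

```python
from itertools import combinations

# Plain attributes of the four example classes, and the association links.
BASE = {"1": {"a"}, "2": {"b"}, "3": {"c", "d"}, "4": {"c", "e"}}
ASS = {"1": "3", "2": "4"}  # source object -> target object


def intent(context, objects):
    """Attributes shared by all `objects` (all attributes if empty)."""
    shared = set().union(*context.values())
    for g in objects:
        shared &= context[g]
    return frozenset(shared)


def extent(context, attributes):
    """Objects that own every attribute in `attributes`."""
    return frozenset(g for g, owned in context.items() if attributes <= owned)


def concepts(context):
    """Naive enumeration of all concepts (exponential, fine for toy contexts)."""
    found = set()
    for size in range(len(context) + 1):
        for subset in combinations(sorted(context), size):
            shared = intent(context, subset)
            found.add((extent(context, shared), shared))
    return found


def relational_extension(base, links, lattice):
    """Add 'ass:<extent>' to each source whose target lies in a concept's extent."""
    extended = {g: set(owned) for g, owned in base.items()}
    for ext, _ in lattice:
        if not ext or len(ext) == len(base):
            continue  # simplification: ignore bottom and top concepts
        for source, target in links.items():
            if target in ext:
                extended[source].add("ass:" + "".join(sorted(ext)))
    return extended
```

Building the lattice of the extended context then produces a new concept with extent {1, 2} and intent {ass:34}, which corresponds to the class C6 that factors the association.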
Concept lattice B1 : association factored lattice/subhierarchy forms
[Figure: concept lattice B1 of K+ with reduced labelling. Labels: a, ass:3 (object 1); b, ass:4 (object 2); c; d (object 3); e (object 4); ass:5.]
C1: a, ass
C2: b, ass
C3: d
C4: e
C5: c
C6: ass
Relational context family (RCF)
General framework for a set of related contexts
– (Huchard, Roume & Valtchev, 2002)
UML
– Take into account UML properties (multiplicity, association classes, …)
– One context for classes: K1
– One context for associations: K2
– Mutual enrichment
– Iterate until a fixed point is reached
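Definition of Relational Context Family (RCF)
Definition: a relational context family is a pair $\mathcal{R}_s = (K_R, A_R)$ where
– $K_R$ is a set of $s$ multi-valued contexts $K_i = (A_i, O_i, V_i, J_i)$
– $A_R = \{\alpha_j\}$ is a set of relations $\alpha_j : O_r \to 2^{O_q}$
Auxiliary functions:
i. $\mathrm{dom}, \mathrm{cod} : A_R \to \{O_i\}_{0 \le i \le s}$ such that for all $\alpha_j : O_r \to 2^{O_q}$:
   a) $\mathrm{dom}(\alpha_j) = O_r$,
   b) $\mathrm{cod}(\alpha_j) = O_q$.
ii. $\mathrm{rel} : K_R \to 2^{A_R}$ with $\mathrm{rel}(K_i) = \{\alpha_j \mid \mathrm{dom}(\alpha_j) = O_i\}$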
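The definition can be mirrored by a small data structure. The sketch below is one possible encoding, not a reference implementation: only the object sets of the contexts are stored, the class and field names are assumptions, and the example instance uses the classes and associations of the UML example from these slides with three relations from associations to classes (to, td, AC).

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    """A relation from the objects of `source` to sets of objects of `target`."""
    name: str
    source: str  # name of the context whose object set is the domain
    target: str  # name of the context whose object set is the codomain
    links: dict = field(default_factory=dict)  # source object -> set of target objects


@dataclass
class RelationalContextFamily:
    objects: dict  # context name -> set of objects (attributes omitted in this sketch)
    relations: list = field(default_factory=list)

    def dom(self, relation):
        """Object set on which the relation is defined."""
        return self.objects[relation.source]

    def cod(self, relation):
        """Object set in which the relation takes its values."""
        return self.objects[relation.target]

    def rel(self, context_name):
        """Relations whose domain is the object set of the given context."""
        return [r for r in self.relations if r.source == context_name]

    def is_well_formed(self):
        """Every link starts in dom(relation) and ends inside cod(relation)."""
        return all(
            src in self.dom(r) and targets <= self.cod(r)
            for r in self.relations
            for src, targets in r.links.items()
        )


FAMILY = RelationalContextFamily(
    objects={
        "K1": {"Student", "Tenant", "Landlord", "House",
               "Transaction", "TransVal", "Renting", "Purchase"},
        "K2": {"Rent", "Buy"},
    },
    relations=[
        Relation("to", "K2", "K1", {"Rent": {"Tenant"}, "Buy": {"Landlord"}}),
        Relation("td", "K2", "K1", {"Rent": {"House"}, "Buy": {"House"}}),
        Relation("AC", "K2", "K1", {"Rent": {"Renting"}, "Buy": {"Purchase"}}),
    ],
)
```

Here `FAMILY.rel("K2")` returns the three relations, while `FAMILY.rel("K1")` is empty until relations from classes to associations are added.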
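UML example
Classes
– Student: name, address, College
– Tenant: name, address
– Landlord: name, address
– House: type
– Transaction: dD
– TransVal: tV
– Renting: sP, mR
– Purchase: pC, pD
Associations
– Rent > (multiplicities 1…*, *)
– Buy > (multiplicities 1…*, *)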
Context for classes : K1
             GF1   GF2   GF3   GFDate            GFValue           GF4
             name  adrs  col   sP    pD    dD    mR    pC    tV    type
Student      X     X     X
Tenant       X     X
Landlord     X     X
House                                                              X
Transaction                                X
TransVal                                                     X
Renting                        X           X     X           X
Purchase                             X     X           X     X
Build concept lattice for K1
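[Figure: class diagram read from the concept lattice of K1.]
– Tenant-Landlord: name, address
– Student: College
– House: type
– Transaction: dD
– TransVal: tV
– Renting: sP, mR
– Purchase: pD, pC
– Associations: Rent > (1…*, *), Buy > (1…*, *)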
Context for associations : K2
        name        mo        md   Nav
        Rent  Buy   1…*  *    *    O-D  D-O
Rent    X           X    X    X    X    X
Buy           X     X    X    X    X    X
K2 → K2+
        name        mo        md   Nav       to                td     AC
        Rent  Buy   1…*  *    *    O-D  D-O  Tenant  Landlord  House  Renting  Purchase
Rent    X           X    X    X    X    X    X                 X      X
Buy           X     X    X    X    X    X            X         X               X
Discover AssG1
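[Figure: associations read from the lattice of K2+.]
– Rent >: multiplicities 1…*, *; to = Tenant; td = House; association class Renting (sP, mR)
– Buy >: multiplicities 1…*, *; to = Landlord; td = House; association class Purchase (pD, pC)
– AssG1 > (new): multiplicities 1…*, *; td = House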
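K1 → K1+
K1+ keeps the columns of K1 (name, adrs, col, sP, pD, dD, mR, pC, tV, type) and adds three relational column groups, OrigOf, DestOf and ClOf, each with the values Rent, Buy and AssG1.
Student: name, adrs, col
Tenant: name, adrs, OrigOf=Rent, OrigOf=AssG1
Landlord: name, adrs, OrigOf=Buy, OrigOf=AssG1
House: type, DestOf=Rent, DestOf=Buy, DestOf=AssG1
Transaction: dD
TransVal: tV
Renting: sP, dD, mR, tV, ClOf=Rent, ClOf=AssG1
Purchase: pD, dD, pC, tV, ClOf=Buy, ClOf=AssG1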
Discover new classes
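[Figure: class diagram after the new classes are discovered.]
– NC1 (Person): name, address
– NC2
– NC4
– Tenant
– Landlord
– Student: College
– House: type
– Transaction: dD
– TransVal: tV
– Renting: sP, mR
– Purchase: pD, pC
– Associations: Rent > (1…*, *), Buy > (1…*, *), AssG1 > (1…*, *)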
Conclusion
FCA is a natural framework for the design of class hierarchies
FCA based normal forms for hierarchy design
– Attribute, method body, method signature, redefinition, association factored lattice/hierarchy
Future work
– Theoretical: RCF
– Tool support
References
Booch, G. (1994). Object-Oriented Analysis and Design (2nd ed.). Reading, MA: Benjamin Cummings.
Casais, E. (1991). Managing Evolution in Object Oriented Environments: An Algorithmic Approach. Doctoral thesis, Geneva.
Cook, W. R. (1992). Interfaces and Specifications for the Smalltalk-80 Collection Classes. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications, A. Paepcke (Ed.), Vancouver, B.C., Canada: ACM Press, pp. 1-15.
Dicky, H., Dony, C., Huchard, M. & Libourel, T. (1996). On Automatic Class Insertion with Overloading. In Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'96), CA, USA: ACM SIGPLAN Notices, pp. 251-267.
Dvorak, J. (1994). Conceptual Entropy and Its Effect on Class Hierarchies. IEEE Computer, 27(6), 59-63.
Fowler, M. (2002). Refactoring: Improving the Design of Existing Code. Reading, MA: Addison-Wesley.
Ganter, B. & Wille, R. (1999). Formal Concept Analysis: Mathematical Foundations. Springer-Verlag.
Godin, R., Huchard, M., Roume, C. & Valtchev, P. (2002). Inheritance and Automation: Where Are We Now? In Proceedings of the Inheritance Workshop, ECOOP 2002 Workshop Reader, Malaga, Spain: Lecture Notes in Computer Science, Springer-Verlag.
Godin, R. & Mili, H. (1993). Building and Maintaining Analysis-Level Class Hierarchies Using Galois Lattices. In Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'93), A. Paepcke (Ed.), Washington, DC: ACM Press, pp. 394-410.
Godin, R., Mili, H., Mineau, G. W., Missaoui, R., Arfi, A. & Chau, T.-T. (1998). Design of Class Hierarchies based on Concept (Galois) Lattices. Theory and Practice of Object Systems (TAPOS), 4(2), 117-134.
Huchard, M. (1999). Classification des classes contre classification d'instances. Évolution incrémentale dans les systèmes à objets basés sur des treillis de Galois. In Proceedings of the Langages et Modèles à Objets (LMO'99), Hermès, pp. 179-196.
Huchard, M., Dicky, H. & Leblanc, H. (2000). Galois lattice as a framework to specify building class hierarchies algorithms. Theoretical Informatics and Applications, n. 34, 521-548.
Huchard, M., Roume, C. & Valtchev, P. (2002). When Concepts Point at Other Concepts: the Case of UML Diagram Reconstruction. In Proceedings of the Workshop FCAKDD, pp. 32-43.
Johnson, R. & Foote, B. (1988). Designing Reusable Classes. Journal of Object-Oriented Programming, June/July, 22-35.
Korson, T. & McGregor, J. D. (1992). Technical Criteria for the Specification and Evaluation of Object-Oriented Libraries. Software Engineering Journal, March, 85-94.
Lieberherr, K. J., Bergstein, P. & Silva-Lepe, I. (1991). From Objects to Classes: Algorithms for Optimal Object-Oriented Design. Journal of Software Engineering, 6(4), 205-228.
Moore, I. (1996). Automatic Inheritance Hierarchy Restructuring and Method Refactoring. In Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'96), CA, USA: ACM SIGPLAN Notices, pp. 235-250.
Rapicault, P. & Napoli, A. (2001). Évolution d'une hiérarchie de classes par interclassement. L'Objet, 7(1-2).
Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F. & Lorensen, W. (1991). Object-Oriented Modeling and Design. Prentice Hall.
Sahraoui, H., Lounis, H., Melo, W. & Mili, H. (1999). A Concept Formation Based Approach to Object Identification in Procedural Code. Journal of Automated Software Engineering, 10(4).
Sahraoui, H. A., Godin, R. & Miceli, T. (2000). Can Metrics Help to Bridge the Gap Between the Improvement of OO Design Quality and Its Automation? In Proceedings of the International Conference on Software Maintenance (ICSM) 2000, San Jose, CA.
Snelting, G. & Tip, F. (2000). Understanding Class Hierarchies Using Concept Analysis. ACM Transactions on Programming Languages and Systems, 22(3), 540-582.
Snelting, G. & Tip, F. (2002). Semantics-Based Composition of Class Hierarchies. In Proceedings of the [...], pp. 562-584.
Tonella, P. (2001). Concept Analysis for Module Restructuring. IEEE Transactions on Software Engineering, 27(4), 351-363.
Yahia, A., Lakhal, L., Bordat, J. P. & Cicchetti, R. (1996). An algorithmic method for building inheritance graphs in object oriented design. In Proceedings of the 15th International Conference on Conceptual Modeling, ER'96, B. Thalheim (Ed.), Cottbus, Germany: pp. 422-437.