Data Abstractions EECE 310: Software Engineering

Upload: felicity-weaver

Post on 13-Dec-2015




Learning Objectives

• Define data abstractions and list their elements
• Write the abstraction function (AF) and representation invariant (RI) of a data abstraction
• Prove that the RI is maintained and that the implementation matches the abstraction (i.e., AF)
• Enumerate common mistakes in data abstractions and learn how to avoid them
• Design equality methods for mutable and immutable data types


Data Abstraction

• Introduction of a new type in the language
  – Type can be abstract or concrete
  – Has one or more constructors and operations
  – Type can be used like a language type

• Both the code and the data associated with the type are encapsulated in the type definition
  – No need to expose the representation to clients
  – Prevents clients from depending on the implementation


Isn’t this OOP ?

• NO, though OOP is one way to implement ADTs
  – OOP is a way of organizing programs into classes and objects. Data abstraction is a way of introducing new types (ADTs) with meanings.
  – Encapsulation is a goal shared by both, but data abstraction is more than just creating classes.
  – In Java, every data abstraction can be implemented by a class declaration, but not every class declaration is a data abstraction.


Elements of a Data Abstraction

• The abstraction specification should:
  – Name the data type
  – List its operations
  – Describe the data abstraction in English
  – Specify a procedural abstraction for each operation

• Public vs. Private
  – The abstraction lists only the public operations
  – There may be other private procedures inside…


Example: IntSet

• Consider an IntSet data type that we wish to introduce in the language. It needs to have:
  – Constructors to create the data type from scratch or from other data types (e.g., lists, IntSets)
  – Operations, including insert, remove, size and isIn
  – A specification of what the data type represents
  – An internal representation of the data type


IntSet Abstraction

public class IntSet {
    // OVERVIEW: IntSets are mutable, unbounded sets of integers.
    // A typical IntSet is {x1, …, xn}, where the xi are all integers.

    // Constructors
    public IntSet();
    // EFFECTS: Initializes this to be the empty set

    // Mutators
    public void insert(int x);
    // MODIFIES: this
    // EFFECTS: Adds x to the set this, i.e., this_post = this ∪ {x}

    public void remove(int x);
    // MODIFIES: this
    // EFFECTS: this_post = this - {x}

    // Observers
    public boolean isIn(int x);
    // EFFECTS: Returns true if x ∈ this, false otherwise

    public int size();
    // EFFECTS: Returns the cardinality of this
}


Group Activity

• Consider the Polynomial data-type below. Write the specifications for its methods.

public class Poly {
    public Poly(int c, int n) throws NegException;
    public Poly add(Poly p) throws NPException;
    public Poly mul(Poly p) throws NPException;
    public Poly minus();
    public int degree();
}



Abstraction Versus Representation

• Abstraction: External view of a data type

• Representation: Internal variables to represent the data within a type (e.g., arrays, vectors, lists)

[Figure: the abstract object { 1, 2, 3 } and two rep objects, 1 2 3 and 1 3 2, that both represent it]


Example: Representation

Vector<Integer> ‘elems’ of size N to represent an IntSet

• Vector directly holds the set elements
  – If integer e is in the set, there exists 0 <= i < N such that elems[i] = e

• Vector is a bitmap for denoting set elements
  – If integer i is in the set, then elems[i] = True, else elems[i] = False

Can you tell how the representation maps to the abstraction ?


Abstraction Function

• Mathematical function that maps the representation to the abstraction

• Captures the designer’s intent in choosing the rep
  – How do the instance variables relate to the abstract object that they represent ?
  – Makes this mapping explicit in the code
  – Advantages: code maintenance, debugging


IntSet: Abstraction Function

Unsorted Array
AF( c ) = { c.elems[i].intValue | 0 <= i < c.elems.size }

Boolean Vector
AF( c ) = { j | 0 <= j < 100 && c.elems[j] }

• The abstraction function is defined for concrete instances ‘c’ of the class, and refers only to the instance variables of the class. It maps the elements of the representation to the abstraction.
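
The two abstraction functions can also be written as ordinary code. The following is a minimal sketch, not part of the course code: the class and method names (AbstractionFunctions, afUnsorted, afBitmap) are assumptions made for illustration, and the bitmap is modelled as a Vector<Boolean>.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.Vector;

// Hypothetical helper class, for illustration only.
class AbstractionFunctions {
    // AF for the unsorted-array rep: the set of all values stored in elems.
    static Set<Integer> afUnsorted(Vector<Integer> elems) {
        Set<Integer> abstractSet = new TreeSet<>();
        for (int i = 0; i < elems.size(); i++) {
            abstractSet.add(elems.get(i));
        }
        return abstractSet;
    }

    // AF for the bitmap rep: the set of indices j whose flag is true.
    static Set<Integer> afBitmap(Vector<Boolean> elems) {
        Set<Integer> abstractSet = new TreeSet<>();
        for (int j = 0; j < elems.size(); j++) {
            if (elems.get(j)) {
                abstractSet.add(j);
            }
        }
        return abstractSet;
    }
}
```

With this sketch, the vectors [1, 2, 3] and [1, 3, 2] are mapped to the same abstract set, which illustrates that the AF is in general many-to-one.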


Abstraction Function: Valid Rep

The abstraction function implicitly assumes that the representation is valid for the class
  – What happens if the vector contains duplicate entries in the first scenario ?
  – What happens in the second scenario if the bitmap contains values other than 0 or 1 ?

The AF holds only for valid representations. How do we know whether a representation is valid ?


Representation Invariant

• Captures formally the assumptions on which the abstraction function is based

• Representation must satisfy this at all times (except when executing the ADT’s methods)

• Defines whether a particular representation is valid – invariant satisfied only by valid reps.


IntSet: Representation Invariant

Unsorted Array
1. c.elems != null &&
2. c.elems has no null elements &&
3. there are no duplicates in c.elems, i.e., for 0 <= i, j < N, c.elems[i].intValue = c.elems[j].intValue => i = j.

Boolean Vector
1. c.elems != null &&
2. c.elems.size = maxValue

NOTE: The types of the instance variables are NOT part of the rep invariant, so there is no need to repeat what is already in the type signature.


Rep Invariant: Important Points

• The rep invariant always holds before and after the execution of the ADT’s operations
  – Can be violated while executing the ADT’s operations
  – Can be violated by private methods of the ADT

• How much should the rep invariant constrain ?
  – Just enough for different developers to implement different operations without having to talk to each other
  – Enough so that the AF makes sense for the representation


AF and RI: How to implement ?

RI: repOK
Public method to check if the rep invariant holds
Useful for testing/debugging

public boolean repOK() {
    // EFFECTS: Returns true if the rep invariant holds,
    // returns false otherwise
}

AF: toString
Public method to convert a valid rep to a String form
Useful for debugging/printing

public String toString() {
    // EFFECTS: Returns a string containing the abstraction
    // represented by the rep.
}
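
For the unsorted-vector rep of IntSet, the two methods could be filled in roughly as follows. This is a sketch under the assumption that the RI is "elems is non-null, has no null elements and has no duplicates"; the "{a, b}" output format of toString is a choice made here, not something the slides prescribe.

```java
import java.util.Vector;

// Sketch of IntSet with the unsorted-vector rep; only the parts needed here.
class IntSet {
    private Vector<Integer> elems = new Vector<>();

    public void insert(int x) {
        if (!elems.contains(x)) elems.add(x);
    }

    // RI: elems != null, no null elements, no duplicates.
    public boolean repOK() {
        if (elems == null) return false;
        for (int i = 0; i < elems.size(); i++) {
            Integer e = elems.get(i);
            if (e == null) return false;
            for (int j = i + 1; j < elems.size(); j++) {
                if (e.equals(elems.get(j))) return false;
            }
        }
        return true;
    }

    // AF: renders the abstract set {x1, ..., xn} denoted by the rep.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < elems.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(elems.get(i));
        }
        return sb.append("}").toString();
    }
}
```

After insert(3), insert(1), insert(3), repOK() returns true and toString() returns "{3, 1}".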


Uses of RI and AF

• Documentation of the programmer’s thinking

• The repOK method can be called before and after every public method invocation in the ADT
  – Typically during debugging only

• The toString method can be used both during debugging and in production

• Both the RI and AF can be used to formally prove the correctness of the ADT


Group Activity

• Assume that the Polynomial data type is represented as an array trms and a variable deg. The coefficient of the term x^i is stored in the i-th element of the trms array, and the variable deg represents the degree of the polynomial (i.e., its highest exponent).

1. Write its abstraction function
2. Write its rep invariant



Reasoning about ADTs - 1

• ADTs have state in the form of the representation
  – Need to consider what happens over a sequence of operations on the abstraction
  – Correctness of one operation depends on the correctness of previous operations
  – We need to reason inductively over the operations of the ADT
    • Show that the constructor is correct
    • Show that each operation is correct


Reasoning about ADTs - 2

• First, show that the rep invariant is maintained by the constructor and the operations

• Then, show that the implementation of the abstraction matches the specification
  – Assume that the rep invariant is maintained
  – Use the abstraction function to map the representation to the abstraction


Why show that Rep Invariant is maintained ?

• Consider the implementation of the IntSet using the unsorted vector representation. We wish to compute the size of the set (i.e., its cardinality).

public int size() {
    return elems.size();
}

Is the above implementation correct ?


Why show that Rep Invariant is maintained ?

Yes, but only if the rep invariant holds !

RI: c.elems != null && c.elems has no null elements && c.elems has no duplicates

Otherwise, size can return a value larger than the cardinality (e.g., when elems contains duplicates).

public int size() {
    return elems.size();
}
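
A small experiment makes the dependence on the RI visible. The sketch below is illustrative only: SizeNeedsRI and its static helpers are hypothetical names, and the rep is passed in directly so that an invalid rep can be constructed on purpose.

```java
import java.util.HashSet;
import java.util.Vector;

// Hypothetical demo class, for illustration only.
class SizeNeedsRI {
    // Same body as size() above: counts the slots in the rep.
    static int size(Vector<Integer> elems) {
        return elems.size();
    }

    // Cardinality of the abstract set denoted by elems, computed independently.
    static int cardinality(Vector<Integer> elems) {
        return new HashSet<>(elems).size();
    }

    public static void main(String[] args) {
        Vector<Integer> valid = new Vector<>();
        valid.add(1);
        valid.add(2);

        Vector<Integer> invalid = new Vector<>(valid);
        invalid.add(2); // duplicate: violates clause 3 of the RI

        System.out.println(size(valid) + " vs " + cardinality(valid));     // 2 vs 2
        System.out.println(size(invalid) + " vs " + cardinality(invalid)); // 3 vs 2
    }
}
```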


Showing Rep Invariant is maintained: Data Type Induction

• Show that the constructor establishes the rep invariant

• For all other operations:
  – Assume that at the time of the call the invariant holds for
    1. this
    2. all argument objects of the type
  – Demonstrate that the invariant holds on return for
    1. this
    2. all argument objects of the type
    3. returned objects of the type

[Figure: A Valid Rep → Function Body → Another Valid Rep]


IntSet : getIndex

Assume that IntSet has the following private method. Note that private methods do not need to preserve the RI.

private int getIndex(int x) {
    // EFFECTS: If x is in this, returns the index where x appears
    // in the Vector elems, else returns -1 (does NOT throw an exception)
    for (int i = 0; i < elems.size(); i++)
        if (x == elems.get(i).intValue())
            return i;
    return -1;
}


IntSet: Constructor

public IntSet() {
    // EFFECTS: Initializes this to be empty
    elems = new Vector<Integer>();
}

RI: c.elems != null && c.elems has no null elements && c.elems has no duplicates

Show that the RI is true at the end of the constructor

Proof: When the constructor terminates,
  – Clause 1 is satisfied because the elems vector is initialized by the constructor
  – Clause 2 is satisfied because elems has no elements (and hence no null elements)
  – Clause 3 is satisfied because elems has no elements (and hence no duplicates)


IntSet: Insert

public void insert(int x) {
    // MODIFIES: this
    // EFFECTS: Adds x to the set such that this_post = this ∪ {x}
    if (getIndex(x) < 0)
        elems.add(new Integer(x));
}

Show that if the RI holds at the beginning, it holds at the end.

RI: c.elems != null && c.elems has no null elements && c.elems has no duplicates

Proof:
  – If clause 1 holds at the beginning, it holds at the end of the procedure
    - Because the reference c.elems is not reassigned by the procedure
  – If clause 2 holds at the beginning, it holds at the end of the procedure
    - Because the procedure only adds a non-null reference to c.elems
  – If clause 3 holds at the beginning, it holds at the end of the procedure
    - Because the getIndex() check prevents duplicate elements from being added to the vector


IntSet:Remove

public void remove(int x) {
    // MODIFIES: this
    // EFFECTS: this_post = this - {x}
    int i = getIndex(x);
    if (i < 0) return; // Not found
    // Overwrite slot i with the last element, then drop the last slot
    elems.set(i, elems.lastElement());
    elems.remove(elems.size() - 1);
}

RI: c.elems != null && c.elems has no null elements && c.elems has no duplicates

Show that if RI holds at the beginning, it holds at the end.


IntSet: Observers

public int size() {
    return elems.size();
}

public boolean isIn(int x) {
    return getIndex(x) >= 0;
}

RI: c.elems != null && c.elems has no null elements && c.elems has no duplicates

Show that if the RI holds at the beginning, it holds at the end. (Observers do not modify the rep, so the RI is preserved trivially.)

This completes the proof that the RI holds in the ADT. In other words, given any sequence of operations on the ADT, the RI always holds at the beginning and end of this sequence.
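
The inductive argument can be complemented (not replaced) by a runtime check that calls repOK after the constructor and after every operation. The sketch below assumes the unsorted-vector IntSet from the preceding slides; repOK is written out here, and RepInvariantCheck and riHoldsOverRandomRun are hypothetical names. Passing such a check is evidence, not a proof.

```java
import java.util.Random;
import java.util.Vector;

// The IntSet from the slides (unsorted-vector rep) plus a repOK() check.
class IntSet {
    private Vector<Integer> elems = new Vector<>();

    private int getIndex(int x) {
        for (int i = 0; i < elems.size(); i++)
            if (x == elems.get(i).intValue()) return i;
        return -1;
    }

    public void insert(int x) { if (getIndex(x) < 0) elems.add(x); }

    public void remove(int x) {
        int i = getIndex(x);
        if (i < 0) return;
        elems.set(i, elems.lastElement());
        elems.remove(elems.size() - 1);
    }

    public int size() { return elems.size(); }

    public boolean isIn(int x) { return getIndex(x) >= 0; }

    public boolean repOK() {
        if (elems == null) return false;
        for (int i = 0; i < elems.size(); i++) {
            if (elems.get(i) == null) return false;
            for (int j = i + 1; j < elems.size(); j++)
                if (elems.get(i).equals(elems.get(j))) return false;
        }
        return true;
    }
}

// Hypothetical test driver mirroring the structure of data type induction.
class RepInvariantCheck {
    // Runs `steps` random operations; returns false as soon as the RI breaks.
    static boolean riHoldsOverRandomRun(long seed, int steps) {
        Random rnd = new Random(seed);
        IntSet s = new IntSet();
        if (!s.repOK()) return false;          // base case: constructor
        for (int k = 0; k < steps; k++) {      // inductive step: each operation
            int x = rnd.nextInt(10);
            switch (rnd.nextInt(4)) {
                case 0: s.insert(x); break;
                case 1: s.remove(x); break;
                case 2: s.isIn(x); break;
                default: s.size(); break;
            }
            if (!s.repOK()) return false;
        }
        return true;
    }
}
```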


Group Activity

• Consider the implementation of the Polynomial Datatype described earlier (also on the code handout sheet)

• Show using data-type induction that the Rep Invariant is preserved


Are we done ?

• We have shown that the RI is established by the constructor and preserved by each operation (i.e., if the RI is true at the beginning, it is true at the end). Can we stop here ?

No. To see why not, consider an implementation in which every operation does nothing. Such an implementation satisfies the rep invariant, but is clearly wrong !

To complete the proof, we need to show that the abstraction provided by the ADT is correct. For this, we use the (now proven) fact that the RI holds, and use the AF to show that the abstract value of the rep after each operation matches that operation’s specification.
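
A "do nothing" implementation of this kind is easy to write down. The class below is a deliberately wrong, hypothetical example (DoNothingIntSet is not part of the course code): it keeps the RI true forever but ignores the specification of insert and remove.

```java
import java.util.Vector;

// Hypothetical "do nothing" implementation: every RI clause holds, yet it is wrong.
class DoNothingIntSet {
    private Vector<Integer> elems = new Vector<>();

    public void insert(int x) { /* deliberately does nothing */ }

    public void remove(int x) { /* deliberately does nothing */ }

    public boolean isIn(int x) { return elems.contains(x); }

    public int size() { return elems.size(); }

    // RI: elems != null, no null elements, no duplicates.
    public boolean repOK() {
        if (elems == null) return false;
        for (int i = 0; i < elems.size(); i++) {
            if (elems.get(i) == null) return false;
            if (elems.indexOf(elems.get(i)) != i) return false; // earlier duplicate
        }
        return true;
    }
}
```

After insert(5), repOK() still returns true, but isIn(5) returns false, which contradicts the spec this_post = this ∪ {5}.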


Abstraction Function: IntSet

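Show that the implementation matches the ADT’s specification (i.e., its abstraction)

[Figure: commutative diagram with labels “Given” and “Prove that”. The function implementation takes the Pre-Rep to the Post-Rep; the abstraction function maps the Pre-Rep to the Pre-Abstraction and the Post-Rep to the Post-Abstraction; the function spec relates the Pre-Abstraction to the Post-Abstraction.]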


Abstraction Function: Constructor

AF ( c ) = { c.elems[i].intValue | 0 <= i < c.elems.size }

public IntSet() {
    // EFFECTS: Initializes this to be empty
    elems = new Vector<Integer>();
}

[Figure: the AF maps the empty vector to the empty set]

Proof: The constructor creates an empty vector, which the AF maps to the empty set, so it is correct.


Abstraction Function: Size

AF ( c ) = { c.elems[i].intValue | 0 <= i < c.elems.size }

public int size() {
    // EFFECTS: Returns the cardinality of this
    return elems.size();
}

[Figure: the AF maps the number of elements in the vector to the cardinality of the set (Why ?)]

Proof: Because the rep invariant guarantees that there are no duplicates in the vector, the number of elements in the vector equals the cardinality of the set.


Abstraction Function: Insert

AF ( c ) = { c.elems[i].intValue | 0 <= i < c.elems.size }

public void insert(int x) {
    // MODIFIES: this
    // EFFECTS: Adds x to the set such that this_post = this ∪ {x}
    if (getIndex(x) < 0)
        elems.add(new Integer(x));
}

[Figure: the implementation takes the vector to a vector with the element added if and only if it did not already exist; the AF maps the first vector to this and the second to this_post = this ∪ {x}]


Abstraction Function: Remove

AF ( c ) = { c.elems[i].intValue| 0 <= i < c.elems.size }

public void remove(int x) {
    // MODIFIES: this
    // EFFECTS: this_post = this - {x}
    int i = getIndex(x);
    if (i < 0) return; // Not found
    // Move last element to the index i
    elems.set(i, elems.lastElement());
    elems.remove(elems.size() - 1);
}

[Figure: the implementation takes the vector to a vector with the first instance of the element removed, if it exists; the AF maps the first vector to this and the second to this_post = this - {x}]


Abstraction Function: IsIn

AF( c ) = { c.elems[i].intValue | 0 <= i < c.elems.size }

public boolean isIn(int x) {
    // EFFECTS: Returns true if x belongs to this, false otherwise
    return getIndex(x) >= 0;
}

[Figure: the AF maps the vector to this; the implementation returns true if and only if x is present in the vector, which matches the spec: true if x belongs to this, false otherwise]


Proof Summary

• This completes the proof. We have shown that the ADT implements its spec correctly. This method is called “data type induction”, because it proceeds by induction over the operations.
  – Step 0: Write the implementation of the ADT
  – Step 1: Show that the RI is maintained by the ADT
  – Step 2: Assuming that the RI is maintained, show using the AF that the translation from the rep to the abstraction matches each method’s spec
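
Step 2 can also be exercised mechanically: apply the AF before and after an operation and compare the result with what the spec demands. The sketch below assumes the unsorted-vector IntSet from the slides; abstractValue() and the SpecCheck class are hypothetical helpers added for illustration, with java.util.HashSet standing in for the mathematical set.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.Vector;

// The IntSet from the slides, plus a helper that computes AF(this).
class IntSet {
    private Vector<Integer> elems = new Vector<>();

    private int getIndex(int x) {
        for (int i = 0; i < elems.size(); i++)
            if (x == elems.get(i).intValue()) return i;
        return -1;
    }

    public void insert(int x) { if (getIndex(x) < 0) elems.add(x); }

    public void remove(int x) {
        int i = getIndex(x);
        if (i < 0) return;
        elems.set(i, elems.lastElement());
        elems.remove(elems.size() - 1);
    }

    public int size() { return elems.size(); }

    // AF(c) = { c.elems[i].intValue | 0 <= i < c.elems.size }
    // (hypothetical helper, not part of the IntSet spec)
    Set<Integer> abstractValue() { return new HashSet<>(elems); }
}

class SpecCheck {
    // insert: AF(post-rep) must equal AF(pre-rep) union {x}
    static boolean insertMatchesSpec(IntSet s, int x) {
        Set<Integer> expected = s.abstractValue();
        expected.add(x);
        s.insert(x);
        return s.abstractValue().equals(expected);
    }

    // remove: AF(post-rep) must equal AF(pre-rep) minus {x}
    static boolean removeMatchesSpec(IntSet s, int x) {
        Set<Integer> expected = s.abstractValue();
        expected.remove(Integer.valueOf(x));
        s.remove(x);
        return s.abstractValue().equals(expected);
    }

    // size: the result must equal the cardinality of AF(rep)
    static boolean sizeMatchesSpec(IntSet s) {
        return s.size() == s.abstractValue().size();
    }
}
```

Each check corresponds to one square of the Pre-Rep / Post-Rep diagram: the implementation runs on the rep, and the AF is used to compare the outcome with the spec on abstract values.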


Group Activity

• Consider the implementation of the Polynomial Datatype described earlier (also on the code handout sheet)

• Show that the ADT’s implementation matches its specification assuming that the RI holds.



Exposing the Rep

• Note that the proof we just wrote assumes that the only way to modify the representation is through the ADT’s operations
  – Otherwise, the rep invariant can be violated

• Is this always true ?
  – What if you expose the representation outside the class, so that any outside entity can change it ?


Mistakes that lead to exposing the rep - 1

• Making rep components public

public class IntSet {
    public Vector<Integer> elements;

Your rep must always be private. Otherwise, all bets are off.
Hopefully, your code will not have this bug ….


Mistakes that lead to exposing the rep - 2

public class IntSet {
    // OVERVIEW: IntSets are mutable, unbounded sets of integers.
    // A typical IntSet is {x1, …, xn}

    private Vector<Integer> elems; // no duplicates in vector

    public Vector<Integer> allElements() {
        // EFFECTS: Returns a vector containing the elements of this,
        // each exactly once, in arbitrary order
        return elems; // Mistake: returns a reference to the rep itself
    }
}

IntSet intSet = new IntSet();
intSet.allElements().add(new Integer(5));
intSet.allElements().add(new Integer(5)); // RI violated – duplicates !
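
One way to avoid this mistake is to return a copy of the vector instead of the rep itself. The following is a sketch of that fix, assuming the unsorted-vector rep; a shallow copy is enough here only because Integer objects are immutable.

```java
import java.util.Vector;

class IntSet {
    private Vector<Integer> elems = new Vector<>(); // no duplicates in vector

    public void insert(int x) {
        if (!elems.contains(x)) elems.add(x);
    }

    public int size() { return elems.size(); }

    // Returns a fresh copy, so callers cannot reach the rep through it.
    public Vector<Integer> allElements() {
        return new Vector<>(elems);
    }
}
```

With this version, intSet.allElements().add(5) only changes the caller's copy; the set itself, and therefore the RI, is unaffected.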


Mistakes that lead to exposing the rep - 3

public class IntSet {
    // OVERVIEW: IntSets are mutable, unbounded sets of integers.
    // A typical IntSet is {x1, …, xn}

    private Vector<Integer> elems;

    // Constructors
    public IntSet(Vector<Integer> els) throws NullPointerException {
        // EFFECTS: If els is null, throws NullPointerException, else
        // initializes this to contain as elements all the ints in els.
        if (els == null) throw new NullPointerException();
        elems = els; // Mistake: the rep now aliases the caller's vector
    }
}

Vector<Integer> someVector = new Vector<Integer>();
IntSet intSet = new IntSet(someVector);
someVector.add(new Integer(5));
someVector.add(new Integer(5)); // RI violated – duplicates !
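
A possible fix is to copy the argument into a fresh vector inside the constructor. The sketch below also drops duplicates and null entries while copying so that the RI holds on return; skipping nulls is an assumption made here, since the spec above does not say how null entries in els should be treated.

```java
import java.util.Vector;

class IntSet {
    private Vector<Integer> elems;

    // Copies els element by element, skipping nulls and duplicates, so the new
    // IntSet shares nothing with the caller's vector and the RI holds on return.
    public IntSet(Vector<Integer> els) {
        if (els == null) throw new NullPointerException();
        elems = new Vector<>();
        for (Integer e : els) {
            if (e != null && !elems.contains(e)) elems.add(e);
        }
    }

    public int size() { return elems.size(); }
}
```

Later changes to someVector no longer affect the set, because the two vectors are distinct objects.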


Summary of mistakes that expose the Rep

1. NOT making rep components private
2. Returning a reference to the rep’s mutable components
3. Initializing rep components with a reference to an “outside” mutable object
4. NOT performing a deep copy of rep elements
   1. Use the clone method instead
   2. Perform manual copies


Group Activity

• For the polynomial example, how many mistakes that expose the rep can you find ? How will you fix them ? (refer to code handout sheet)



Mutable objects

• Objects whose abstract state can be modified
  – Applies to the abstraction, not the representation

• Mutable objects: Can be modified once they are created, e.g., IntSet, IntList

• Immutable objects: Cannot be modified
  – Examples: Polynomials, Strings


Equality: Equals Method

• All classes inherit from Object, which has a method “boolean equals(Object o)”
  – Returns true if o is the same object as the current one
  – Returns false otherwise

• Note that the default equals tests whether two references denote the same object, not whether two objects have the same state
  – If a and b are different objects, a.equals(b) will return false even if they are functionally identical


Equality: IntSet Example

IntSet a = new IntSet();
a.insert(1); a.insert(2); a.insert(3);
IntSet b = new IntSet();
b.insert(1); b.insert(2); b.insert(3);
if (a.equals(b)) {
    System.out.println("Equal");
}

What is printed by the above code ?


Equality: IntSet Example

• It prints nothing. Why ?
  – Because the IntSets are different objects, and the default Object.equals method only compares references (object identity)
  – Therefore, a.equals(b) returns false

• But this is in fact the correct behavior !
  – To see this, assume that you added an element to a but not to b after the equals comparison
  – a.equals(b) would no longer be true, even though you have not changed the references a or b


Rule of Object Equality

• Two objects should be equal if it is impossible to distinguish between them using any sequence of calls to the object’s methods

• Corollary: Once two objects are equal, they should always be equal. Otherwise it is possible to distinguish between them using some combination of the object’s methods.


Mutability and the Equals Method

• For mutable objects, you can distinguish between two objects by mutating them after the comparison. Therefore, they are NOT equal. The default equals method does the right thing – i.e., returns false.

• If the objects are immutable AND have the same state, then the equals method should return true. So we need to override equals for immutable objects to do the right thing.
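
For an immutable type, overriding equals might look like the following sketch. Rational is a hypothetical class chosen for illustration (the handout's Poly would follow the same pattern); it keeps its rep in lowest terms so that equal values have equal reps. Java's contract for Object also requires hashCode to be overridden consistently with equals.

```java
// Hypothetical immutable type; the rep is kept in lowest terms with den > 0.
final class Rational {
    private final int num;
    private final int den;

    Rational(int n, int d) {
        if (d == 0) throw new IllegalArgumentException("zero denominator");
        int g = gcd(Math.abs(n), Math.abs(d));
        if (d < 0) g = -g;       // normalize the sign so that den > 0
        num = n / g;
        den = d / g;
    }

    private static int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

    // Two Rationals are equal when they denote the same abstract value.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Rational)) return false;
        Rational r = (Rational) o;
        return num == r.num && den == r.den;
    }

    // Equal objects must have equal hash codes.
    @Override
    public int hashCode() { return 31 * num + den; }
}
```

Here new Rational(1, 2).equals(new Rational(2, 4)) is true, and it stays true forever because neither object can be mutated afterwards.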


Immutable Abstractions

• ADT does not change once created
  – No mutator methods
  – Producer methods to create new objects

• Appropriate for modeling objects that do not change during their existence
  – Mathematical entities such as Rational numbers
  – Certain objects may be implemented more efficiently, e.g., Strings


Why use immutable ADTs ?

• Safety
  – Don’t need to worry about accidental changes
  – Can be assured that rep doesn’t change

• Efficiency
  – May hurt efficiency if you need to copy the object
  – In some cases, it may be more efficient by sharing representations across objects, e.g., Strings

• Ease of Implementation
  – May be easier for concurrency control


Equality: Immutable objects

• Immutable objects should define their own equals method
  – Return true if the abstract state matches, even if the internal state (i.e., rep) is different

• Therefore, methods of an immutable object can modify its rep, but not the abstraction
  – Such methods are said to have benevolent side effects
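
A benevolent side effect can be sketched with a fraction that is stored unreduced and reduced lazily. LazyRational is a hypothetical class used only for illustration: its rep (num, den) may change, but the abstract value num/den never does, and equals compares abstract values rather than reps. The sketch is not thread-safe.

```java
// Hypothetical immutable abstraction whose rep may change: the fraction is
// stored unreduced and reduced lazily, which leaves the abstract value intact.
final class LazyRational {
    private int num;   // the rep may change, the abstract value num/den may not
    private int den;   // RI: den > 0

    LazyRational(int n, int d) {
        if (d <= 0) throw new IllegalArgumentException("denominator must be positive");
        num = n;
        den = d;
    }

    private static int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

    // Benevolent side effect: rewrites the rep in lowest terms.
    private void reduce() {
        int g = gcd(Math.abs(num), den);
        num /= g;
        den /= g;
    }

    @Override
    public String toString() {
        reduce();
        return num + "/" + den;
    }

    // Compares abstract values: n1/d1 == n2/d2 iff n1*d2 == n2*d1.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof LazyRational)) return false;
        LazyRational r = (LazyRational) o;
        return (long) num * r.den == (long) r.num * den;
    }

    @Override
    public int hashCode() {
        reduce();                 // equal values must hash alike
        return 31 * num + den;
    }
}
```

A LazyRational created as 2/4 equals one created as 1/2 both before and after toString() has reduced its rep, so no sequence of calls can tell the two apart.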


Group Activity

• Design an equals method for two polynomials. What will you do if the polynomials are not in their canonical forms ?



To do before next class

• Submit assignment 2 in the lab

• Start working on assignment 3

• Prepare for the midterm exam
  – Portions include everything covered so far
  – In class on Feb 28th