
Post on 22-Dec-2015


Chapter 13—Collection Classes

The Art and Science of Java
An Introduction to Computer Science
ERIC S. ROBERTS

CHAPTER 13
Collection Classes

I think this is the most extraordinary collection of talent, of human knowledge, that has ever been gathered at the White House—with the possible exception of when Thomas Jefferson dined alone.

—John F. Kennedy, dinner for Nobel laureates, 1962

13.1 The ArrayList class revisited
13.2 The HashMap class
13.3 The Java Collections Framework
13.4 Principles of object-oriented design

The ArrayList Class Revisited

• You have already seen the ArrayList class in section 11.8, which included examples showing how to use ArrayLists in both the pre- and post-Java 5.0 worlds. The purpose of section 13.1 is to look at the idea behind the ArrayList class from a more general perspective that paves the way for a discussion of the Java Collections Framework.

• The most obvious difference between the ArrayList class and Java’s array facility is that ArrayList is a full-fledged Java class. As such, the ArrayList class can support more sophisticated operations than arrays can. All of the operations that pertain to arrays must be built into the language; the operations that apply to the ArrayList class, by contrast, can be provided by extension.

The Power of Dynamic Allocation

• Much of the added power of the ArrayList class comes from the fact that ArrayLists allow you to expand the list of values dynamically. The class also contains methods that add and remove elements from any position in the list; no such operations are available for arrays.

• The extra flexibility offered by the ArrayList class can reduce the complexity of programs substantially. As an example, the next few slides show two versions of the code for the readLineArray method, which reads and returns an array of lines from a reader.
    – The first version is the one that appears in section 12.4 and uses the ArrayList class, adding each line as it appears.
    – The second version uses only arrays in the implementation and therefore has to allocate space as the program reads additional lines from the reader. In this implementation, the code doubles the size of the internal array each time it runs out of space.

readLineArray (ArrayList version)

    /*
     * Reads all available lines from the specified reader and returns an array
     * containing those lines.  This method closes the reader at the end of the
     * file.
     */
    private String[] readLineArray(BufferedReader rd) {
        ArrayList<String> lineList = new ArrayList<String>();
        try {
            while (true) {
                String line = rd.readLine();
                if (line == null) break;
                lineList.add(line);
            }
            rd.close();
        } catch (IOException ex) {
            throw new ErrorException(ex);
        }
        String[] result = new String[lineList.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = lineList.get(i);
        }
        return result;
    }

readLineArray (array version)

    /*
     * Reads all available lines from the specified reader and returns an array
     * containing those lines.  This method closes the reader at the end of the
     * file.
     */
    private String[] readLineArray(BufferedReader rd) {
        String[] lineArray = new String[INITIAL_CAPACITY];
        int nLines = 0;
        try {
            while (true) {
                String line = rd.readLine();
                if (line == null) break;
                if (nLines + 1 >= lineArray.length) {
                    lineArray = doubleArrayCapacity(lineArray);
                }
                lineArray[nLines++] = line;
            }
            rd.close();
        } catch (IOException ex) {
            throw new ErrorException(ex);
        }
        String[] result = new String[nLines];
        for (int i = 0; i < nLines; i++) {
            result[i] = lineArray[i];
        }
        return result;
    }

    /*
     * Creates a string array with twice as many elements as the old array and
     * then copies the existing elements from the old array to the new one.
     */
    private String[] doubleArrayCapacity(String[] oldArray) {
        String[] newArray = new String[2 * oldArray.length];
        for (int i = 0; i < oldArray.length; i++) {
            newArray[i] = oldArray[i];
        }
        return newArray;
    }

    /* Private constants */
    private static final int INITIAL_CAPACITY = 10;

The HashMap Class

• The HashMap class is one of the most valuable tools exported by the java.util package and comes up in a surprising number of applications.

• The HashMap class implements the abstract idea of a map, which is an associative relationship between keys and values. A key is an object that never appears more than once in a map and can therefore be used to identify a value, which is the object associated with a particular key.

• Although the HashMap class exports other methods as well, the essential operations on a HashMap are the ones listed in the following table:

    new HashMap()          Creates a new HashMap object that is initially empty.
    map.put(key, value)    Sets the association for key in the map to value.
    map.get(key)           Returns the value associated with key, or null if none.

Generic Types for Keys and Values

• Beginning with Java Standard Edition 5.0, the HashMap class is a generic type that allows clients to specify the types of the keys and values.

• As with the ArrayList class introduced in Chapter 11, the type information is written in angle brackets after the class name. The only difference is that a HashMap requires two type parameters: one for the key and one for the value. For example, the type designation HashMap<String,Integer> indicates a HashMap that uses strings as keys to obtain integer values.

• In versions that predate Java 5.0, the HashMap class uses Object as the type for both keys and values. This definition makes it possible to use any object type as a key or value, but often means that you need a type cast to convert the result of a HashMap method to the intended type.
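As a minimal sketch of the difference between the two styles, the following class performs the same lookup twice: once with the raw, pre-5.0 form of HashMap and once with the generic form. The class name GenericMapDemo, the method names, and the sample entries are illustrative assumptions and do not come from the text.

```java
import java.util.HashMap;

class GenericMapDemo {

    /* Pre-Java 5.0 style: keys and values are plain Objects, so get needs a cast. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int lookupRaw(String key) {
        HashMap map = new HashMap();
        map.put("one", Integer.valueOf(1));
        map.put("two", Integer.valueOf(2));
        Integer value = (Integer) map.get(key);   // cast required
        return value.intValue();
    }

    /* Java 5.0 and later: the type parameters make the cast unnecessary. */
    static int lookupGeneric(String key) {
        HashMap<String,Integer> map = new HashMap<String,Integer>();
        map.put("one", 1);                        // autoboxing converts int to Integer
        map.put("two", 2);
        return map.get(key);                      // unboxed automatically; key must be present
    }

    public static void main(String[] args) {
        System.out.println(lookupRaw("two"));
        System.out.println(lookupGeneric("two"));
    }
}
```

Both methods assume that the key is present; a missing key would make get return null and the conversion to int would fail.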

A Simple HashMap Application

• Suppose that you want to write a program that displays the name of a state given its two-letter postal abbreviation.

• This program is an ideal application for the HashMap class because what you need is a map between two-letter codes and state names. Each two-letter code uniquely identifies a particular state and therefore serves as a key for the HashMap; the state names are the corresponding values.

• To implement this program in Java, you need to perform the following steps, which are illustrated on the following slide:

1. Create a HashMap containing all 50 key/value pairs.
2. Read in the two-letter abbreviation to translate.
3. Call get on the HashMap to find the state name.
4. Print out the name of the state.

The PostalLookup Application

    public void run() {
        HashMap<String,String> stateMap = new HashMap<String,String>();
        initStateMap(stateMap);
        while (true) {
            String code = readLine("Enter two-letter state abbreviation: ");
            if (code.length() == 0) break;
            String state = stateMap.get(code);
            if (state == null) {
                println(code + " is not a known state abbreviation");
            } else {
                println(code + " is " + state);
            }
        }
    }

    private void initStateMap(HashMap<String,String> map) {
        map.put("AL", "Alabama");
        map.put("AK", "Alaska");
        map.put("AZ", "Arizona");
        . . .
        map.put("FL", "Florida");
        map.put("GA", "Georgia");
        map.put("HI", "Hawaii");
        . . .
        map.put("WI", "Wisconsin");
        map.put("WY", "Wyoming");
    }

Sample run (PostalLookup console):

    Enter two-letter state abbreviation: HI
    HI is Hawaii
    Enter two-letter state abbreviation: WI
    WI is Wisconsin
    Enter two-letter state abbreviation: VE
    VE is not a known state abbreviation
    Enter two-letter state abbreviation:

Implementation Strategies for Maps

There are several strategies you might choose to implement the map operations get and put. Those strategies include:

1. Linear search in parallel arrays. Keep the two-character codes in one array and the state names in a second, making sure that the index numbers of the code and its corresponding state name always match. Such structures are called parallel arrays. You can use linear search to find the two-letter code and then take the state name from that position in the other array. This strategy takes O(N) time.

2. Binary search in parallel arrays. If you keep the arrays sorted by the two-character code, you can use binary search to find the key. Using this strategy improves the performance to O(log N).

3. Table lookup in a two-dimensional array. In this specific example, you could store the state names in a 26 x 26 string array in which the first and second indices correspond to the two letters in the code. Because you can now find any code in a single step, this strategy is O(1), although this performance comes at a cost in memory space.
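The three strategies can be sketched side by side as follows. This is only an illustration under stated assumptions: the class name StateLookupStrategies, the method names, and the six-state sample data are hypothetical, and a real table would hold all 50 states.

```java
import java.util.Arrays;

class StateLookupStrategies {

    /* Parallel arrays, kept sorted by code so that binary search also applies. */
    static final String[] CODES = { "AK", "AL", "AZ", "HI", "WI", "WY" };
    static final String[] NAMES = { "Alaska", "Alabama", "Arizona", "Hawaii", "Wisconsin", "Wyoming" };

    /* 26 x 26 table built once from the parallel arrays. */
    static final String[][] TABLE = buildTable();

    /* Strategy 1: linear search through the code array, O(N). */
    static String linearLookup(String code) {
        for (int i = 0; i < CODES.length; i++) {
            if (CODES[i].equals(code)) return NAMES[i];
        }
        return null;
    }

    /* Strategy 2: binary search on the sorted code array, O(log N). */
    static String binaryLookup(String code) {
        int index = Arrays.binarySearch(CODES, code);
        return (index >= 0) ? NAMES[index] : null;
    }

    /* Strategy 3: direct indexing by the two letters, O(1). */
    static String tableLookup(String code) {
        if (code.length() != 2) return null;
        int row = code.charAt(0) - 'A';
        int col = code.charAt(1) - 'A';
        if (row < 0 || row >= 26 || col < 0 || col >= 26) return null;
        return TABLE[row][col];
    }

    private static String[][] buildTable() {
        String[][] table = new String[26][26];
        for (int i = 0; i < CODES.length; i++) {
            table[CODES[i].charAt(0) - 'A'][CODES[i].charAt(1) - 'A'] = NAMES[i];
        }
        return table;
    }

    public static void main(String[] args) {
        System.out.println(linearLookup("HI"));
        System.out.println(binaryLookup("HI"));
        System.out.println(tableLookup("HI"));
    }
}
```

The table version allocates 676 slots no matter how few codes are stored, which is the memory cost the third strategy mentions.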

The Idea of Hashing

• The third strategy on the preceding slide shows that one can make the get and put operations run very quickly, even to the point that the cost of finding a key is independent of the number of keys in the table. This O(1) performance is possible only if you know where to look for a particular key.

• To get a sense of how you might achieve this goal in practice, it helps to think about how you find a word in a dictionary. You certainly don’t start at the beginning and look at every word, but you probably don’t use binary search either. Most dictionaries have thumb tabs that indicate where each letter appears. Words starting with A are in the A section, and so on.

• The HashMap class uses a strategy called hashing, which is conceptually similar to the thumb tabs in a dictionary. The critical idea is that you can improve performance enormously if you use the key to figure out where to look.

Hash Codes

• To make it possible for the HashMap class to know where to look for a particular key, every object defines a method called hashCode that returns an integer associated with that object. As you will see in a subsequent slide, this hash code value tells the HashMap implementation where it should look for a particular key, dramatically reducing the search time.

• In general, clients of the HashMap class have no reason to know the actual value of the integer returned as a hash code for some key. The important things to remember are:

1. Every object has a hash code, even if you don’t know what it is.
2. The hash code for any particular object is always the same.
3. If two objects have equal values, they have the same hash code.
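The three properties above can be checked directly for strings. The sketch below is only an illustration; the class name HashCodeDemo and its helper methods are assumptions, not part of the text.

```java
class HashCodeDemo {

    /* Property 2: asking the same object twice yields the same code. */
    static boolean sameCodeEachCall(Object obj) {
        return obj.hashCode() == obj.hashCode();
    }

    /* Property 3: objects that are equal must report equal hash codes. */
    static boolean equalValuesShareCode(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String first = "Hawaii";
        String second = new String("Hawaii");     // a distinct object with an equal value
        System.out.println(sameCodeEachCall(first));
        System.out.println(equalValuesShareCode(first, second));
        System.out.println("AL".hashCode());      // prints 2091
    }
}
```

The value 2091 follows from the documented String formula: 'A' is 65 and 'L' is 76, so the code is 65 * 31 + 76.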

Hash Codes and Collisions

• For any Java object, the hashCode method returns an int that can be any one of the 4,294,967,296 (2^32) possible values for that type.

• While 4,294,967,296 seems huge, it is insignificant compared to the total number of objects that can be represented inside a machine, which would be infinite if there were no limits on the size of memory.

• The fact that there are more possible objects than hash codes means that there must be some distinct objects that have the same hash codes. For example, the strings "hierarch" and "crinolines" have the same hash code, which happens to be -1732884796.

• Because different keys can generate the same hash codes, any strategy for implementing a map using hash codes must take that possibility into account, even though it happens rarely.
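A short pair that is easy to verify by hand shows the same effect: the strings "Aa" and "BB" both hash to 2112, because 65 * 31 + 97 and 66 * 31 + 66 are equal. The sketch below checks that pair and also prints the codes for the pair quoted above so that you can compare them yourself; the class name CollisionDemo and the collide helper are illustrative assumptions.

```java
class CollisionDemo {

    /* True if the two strings are different but share a hash code. */
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());      // prints 2112
        System.out.println("BB".hashCode());      // prints 2112
        System.out.println(collide("Aa", "BB"));  // prints true

        /* The pair from the slide; printed here rather than assumed. */
        System.out.println("hierarch".hashCode());
        System.out.println("crinolines".hashCode());
    }
}
```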

The Bucket Hashing Strategy

• One common strategy for implementing a map is to use the hash code for an object to select an index into an array that will contain all the keys with that hash code. Each element of that array is conventionally called a bucket.

• In practice, the array of buckets is smaller than the number of hash codes, making it necessary to convert the hash code into a bucket index, typically by executing a statement like

int bucket = Math.abs(key.hashCode()) % N_BUCKETS;

• The value in each element of the bucket array cannot be a single key/value pair given the chance that different keys fall into the same bucket. Such situations are called collisions.

• To take account of the possibility of collisions, each element of the bucket array is usually a linked list of the keys that fall into that bucket, as shown in the simulation on the next slide.
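The bucket computation can be tried out in isolation. The sketch below applies the statement shown above to a few state codes, using seven buckets as an assumed value for N_BUCKETS. It also shows one edge case worth knowing: Math.abs(Integer.MIN_VALUE) is still negative, so the formula can yield a negative index for a key whose hash code happens to be that value. Math.floorMod (available since Java 8) is one possible alternative, offered here as a suggestion rather than as what the text uses. The class and method names are hypothetical.

```java
class BucketIndexDemo {

    static final int N_BUCKETS = 7;   // assumed bucket count for this sketch

    /* The formula from the slide. */
    static int bucketFor(String key) {
        return Math.abs(key.hashCode()) % N_BUCKETS;
    }

    /* Alternative that never returns a negative index for a positive bucket count. */
    static int safeBucketFor(int hashCode) {
        return Math.floorMod(hashCode, N_BUCKETS);
    }

    public static void main(String[] args) {
        System.out.println(bucketFor("AL"));      // prints 5
        System.out.println(bucketFor("AK"));      // prints 4
        System.out.println(bucketFor("AZ"));      // prints 5

        /* Math.abs overflows for the most negative int, so the result is negative. */
        System.out.println(Math.abs(Integer.MIN_VALUE) % N_BUCKETS);
        System.out.println(safeBucketFor(Integer.MIN_VALUE));
    }
}
```

For negative hash codes the two formulas may select different buckets; either is acceptable as long as put and get use the same one.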

Simulating Bucket Hashing

The simulation uses seven buckets, numbered 0 through 6.

stateMap.put("AL", "Alabama")
    "AL".hashCode() → 2091
    Math.abs(2091) % 7 → 5
    The key "AL" therefore goes in bucket 5.

stateMap.put("AK", "Alaska")
    "AK".hashCode() → 2090
    Math.abs(2090) % 7 → 4
    The key "AK" therefore goes in bucket 4.

stateMap.put("AZ", "Arizona")
    "AZ".hashCode() → 2105
    Math.abs(2105) % 7 → 5
    The key "AZ" therefore goes in bucket 5.
    Because bucket 5 already contains "AL", the "AZ" must be added to the chain.

The rest of the keys are added similarly. Once all 50 keys have been added, the buckets hold the following chains, with the most recently added entry at the front of each chain:

    0: WY Wyoming → NC North Carolina → NJ New Jersey → MT Montana → KS Kansas → ID Idaho → DE Delaware → CO Colorado → CA California → null
    1: VA Virginia → TN Tennessee → SC South Carolina → OH Ohio → ND North Dakota → NY New York → MN Minnesota → IL Illinois → null
    2: SD South Dakota → NE Nebraska → MO Missouri → MA Massachusetts → HI Hawaii → null
    3: UT Utah → NM New Mexico → MI Michigan → IN Indiana → null
    4: WV West Virginia → WA Washington → TX Texas → RI Rhode Island → PA Pennsylvania → OR Oregon → OK Oklahoma → IA Iowa → AR Arkansas → AK Alaska → null
    5: WI Wisconsin → NH New Hampshire → NV Nevada → MD Maryland → GA Georgia → CT Connecticut → AZ Arizona → AL Alabama → null
    6: VT Vermont → MS Mississippi → ME Maine → LA Louisiana → KY Kentucky → FL Florida → null

Suppose you call stateMap.get("HI")
    "HI".hashCode() → 2305
    Math.abs(2305) % 7 → 2
    The key "HI" must therefore be in bucket 2 and can be located by searching the chain.

Achieving O(1) Performance

• The simulation on the previous slide uses only seven buckets to emphasize what happens when collisions occur: the smaller the number of buckets, the more likely collisions become.

• In practice, the real implementation of HashMap uses a much larger value for N_BUCKETS to minimize the opportunity for collisions. If the number of buckets is considerably larger than the number of keys, most of the bucket chains will either be empty or contain exactly one key/value pair.

• The ratio of the number of keys to the number of buckets is called the load factor of the HashMap. Because a HashMap achieves O(1) performance only if the load factor is small, the library implementation of HashMap automatically increases the number of buckets when the table becomes too full.

• The next few slides show an implementation of HashMap that uses this approach.
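The SimpleStringMap class that follows keeps N_BUCKETS fixed, so it does not show the automatic growth step. As a rough sketch of how that step could work, the class below doubles its bucket array whenever adding a key would push the load factor above a threshold. Everything here is an assumption made for illustration: the class name ResizingStringMap, the growth rule, and the use of Math.floorMod are not taken from the text, and the library implementation of HashMap differs in its details. The 0.75 threshold matches the documented default load factor of java.util.HashMap.

```java
class ResizingStringMap {

    private static final double MAX_LOAD_FACTOR = 0.75;   // assumed threshold

    /* One key/value pair plus a link to the next entry in the same bucket. */
    private static class Entry {
        final String key;
        String value;
        Entry link;

        Entry(String key, String value, Entry link) {
            this.key = key;
            this.value = value;
            this.link = link;
        }
    }

    private Entry[] buckets = new Entry[7];
    private int nKeys = 0;

    /* Adds or replaces the value for key, growing the table first if needed. */
    void put(String key, String value) {
        Entry entry = find(key);
        if (entry != null) {
            entry.value = value;
            return;
        }
        if (nKeys + 1 > MAX_LOAD_FACTOR * buckets.length) expand();
        int b = indexFor(key, buckets.length);
        buckets[b] = new Entry(key, value, buckets[b]);
        nKeys++;
    }

    /* Returns the value for key, or null if the key is absent. */
    String get(String key) {
        Entry entry = find(key);
        return (entry == null) ? null : entry.value;
    }

    int bucketCount() {
        return buckets.length;
    }

    double loadFactor() {
        return (double) nKeys / buckets.length;
    }

    private Entry find(String key) {
        for (Entry e = buckets[indexFor(key, buckets.length)]; e != null; e = e.link) {
            if (e.key.equals(key)) return e;
        }
        return null;
    }

    private static int indexFor(String key, int nBuckets) {
        return Math.floorMod(key.hashCode(), nBuckets);
    }

    /* Allocates a larger bucket array and moves every entry to its new bucket. */
    private void expand() {
        Entry[] old = buckets;
        buckets = new Entry[2 * old.length + 1];
        for (Entry chain : old) {
            Entry e = chain;
            while (e != null) {
                Entry next = e.link;
                int b = indexFor(e.key, buckets.length);
                e.link = buckets[b];
                buckets[b] = e;
                e = next;
            }
        }
    }
}
```

Every entry has to be moved when the table grows because the bucket index depends on the number of buckets.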

The Code for Bucket Hashing

    public class SimpleStringMap {

        /** Creates a new SimpleStringMap with no key/value pairs */
        public SimpleStringMap() {
            bucketArray = new HashEntry[N_BUCKETS];
        }

        /**
         * Sets the value associated with key.  Any previous value for key is lost.
         * @param key The key used to refer to this value
         * @param value The new value to be associated with key
         */
        public void put(String key, String value) {
            int bucket = Math.abs(key.hashCode()) % N_BUCKETS;
            HashEntry entry = findEntry(bucketArray[bucket], key);
            if (entry == null) {
                entry = new HashEntry(key, value);
                entry.setLink(bucketArray[bucket]);
                bucketArray[bucket] = entry;
            } else {
                entry.setValue(value);
            }
        }

        /**
         * Retrieves the value associated with key, or null if no such value exists.
         * @param key The key used to look up the value
         * @return The value associated with key, or null if no such value exists
         */
        public String get(String key) {
            int bucket = Math.abs(key.hashCode()) % N_BUCKETS;
            HashEntry entry = findEntry(bucketArray[bucket], key);
            if (entry == null) {
                return null;
            } else {
                return entry.getValue();
            }
        }

        /*
         * Scans the entry chain looking for an entry that matches the specified key.
         * If no such entry exists, findEntry returns null.
         */
        private HashEntry findEntry(HashEntry entry, String key) {
            while (entry != null) {
                if (entry.getKey().equals(key)) return entry;
                entry = entry.getLink();
            }
            return null;
        }

        /* Private constants */
        private static final int N_BUCKETS = 7;

        /* Private instance variables */
        private HashEntry[] bucketArray;

    }

    /* Package-private class: HashEntry */
    /*
     * This class represents a pair of a key and a value, along with a reference
     * to the next HashEntry in the chain.  The methods exported by the class
     * consist only of getters and setters.
     */
    class HashEntry {

        /* Creates a new HashEntry for the specified key/value pair */
        public HashEntry(String key, String value) {
            entryKey = key;
            entryValue = value;
        }

        /* Returns the key component of a HashEntry */
        public String getKey() {
            return entryKey;
        }

        /* Returns the value component of a HashEntry */
        public String getValue() {
            return entryValue;
        }

        /* Sets the value component of a HashEntry to a new value */
        public void setValue(String value) {
            entryValue = value;
        }

        /* Returns the next link in the entry chain */
        public HashEntry getLink() {
            return entryLink;
        }

        /* Sets the link to the next entry in the chain */
        public void setLink(HashEntry nextEntry) {
            entryLink = nextEntry;
        }

        /* Private instance variables */
        private String entryKey;       /* The key component for this HashEntry   */
        private String entryValue;     /* The value component for this HashEntry */
        private HashEntry entryLink;   /* The next entry in the chain            */
    }

Writing hashCode Methods

If you want to use one of your own classes as a HashMap key, you will usually need to override its hashCode method. When you do, it is useful to keep the following principles in mind:

1. The hashCode method must always return the same code if it is called on the same object.

2. The implementation of hashCode must be consistent with the implementation of the equals method, because hashing uses the equals method to compare keys. This condition is stronger than the first one, which says only that the hash code for a specific object should not change arbitrarily.

3. The hashCode implementation should seek to minimize collisions.

4. The hashCode method should be relatively simple to compute. If you write a hashCode method that takes a long time to evaluate, you give up the primary advantage of hashing, which is that the basic algorithm runs very quickly.

The hashCode and equals Methods

• As noted on the preceding slide, hashing can work only if the definitions of hashCode and equals are compatible. The practical implication of this principle is that you should never override one without overriding the other.

• The following code, for example, makes it possible to use objects of the Rational class from Chapter 6 as HashMap keys:

    public int hashCode() {
        return Integer.valueOf(num).hashCode() ^ (37 * Integer.valueOf(den).hashCode());
    }

    public boolean equals(Object obj) {
        if (obj instanceof Rational) {
            Rational r = (Rational) obj;
            return this.num == r.num && this.den == r.den;
        } else {
            return false;
        }
    }
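To see the two methods working together, the sketch below wraps them in a minimal stand-in for the Rational class and uses instances as HashMap keys. The stand-in is an assumption made for illustration: it stores only the num and den fields and, unlike a full implementation, does not reduce fractions to lowest terms. The class name RationalKeyDemo and the lookup helper are likewise hypothetical.

```java
import java.util.HashMap;

class RationalKeyDemo {

    /* Minimal stand-in for Rational; only the parts that hashing needs. */
    static final class Rational {
        private final int num;
        private final int den;

        Rational(int num, int den) {
            this.num = num;
            this.den = den;
        }

        @Override
        public int hashCode() {
            return Integer.valueOf(num).hashCode() ^ (37 * Integer.valueOf(den).hashCode());
        }

        @Override
        public boolean equals(Object obj) {
            if (obj instanceof Rational) {
                Rational r = (Rational) obj;
                return this.num == r.num && this.den == r.den;
            }
            return false;
        }
    }

    /* Looks up a name using a freshly created key object. */
    static String lookup(int num, int den) {
        HashMap<Rational,String> names = new HashMap<Rational,String>();
        names.put(new Rational(1, 2), "one half");
        names.put(new Rational(1, 3), "one third");
        return names.get(new Rational(num, den));
    }

    public static void main(String[] args) {
        System.out.println(lookup(1, 2));   // prints one half
        System.out.println(lookup(3, 4));   // prints null
    }
}
```

The lookup succeeds even though the key passed to get is a different object from the one passed to put, because the two objects are equal and therefore produce the same hash code.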

The Java Collections Framework

• The ArrayList and HashMap classes are part of a larger set of classes called the Java Collections Framework, which is part of the java.util package.

• The classes in the Java Collections Framework fall into three general categories:

1. Lists. A list is an ordered collection of values that allows the client to add and remove elements. As you would expect, the ArrayList class falls into this category.

2. Sets. A set is an unordered collection of values in which a particular object can appear at most once.

3. Maps. A map implements an association between keys and values. The HashMap class is in this category.

• The next slide shows the Java class hierarchy for the first two categories, which together are called collections.

The Collection Hierarchy

The following diagram shows the portion of the Java Collections Framework that implements the Collection interface. The dotted lines specify that a class implements a particular interface.

[Diagram: class hierarchy. Interfaces: Collection, List, Set, SortedSet. Abstract classes: AbstractCollection, AbstractList, AbstractSet. Concrete classes: ArrayList and LinkedList on the List side; HashSet and TreeSet on the Set side.]

ArrayList vs. LinkedList

• If you look at the left side of the collections hierarchy on the preceding slide, you will discover that there are two classes in the Java Collections Framework that implement the List interface: ArrayList and LinkedList.

• Because these classes implement the same interface, it is generally possible to substitute one for the other.

• The fact that these classes have the same effect, however, does not imply that they have the same performance characteristics.
    – The ArrayList class is more efficient if you are selecting a particular element or searching for an element in a sorted array.
    – The LinkedList class is more efficient if you are adding or removing elements from a large list.

• Choosing which list implementation to use is therefore a matter of evaluating the performance tradeoffs.
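Substituting one class for the other is easiest when the code is written against the List interface. In the sketch below, the same method runs unchanged on an ArrayList and a LinkedList; the class name ListChoiceDemo and the fill method are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ListChoiceDemo {

    /* Uses only List operations, so any List implementation will do. */
    static List<String> fill(List<String> list) {
        list.add("Alaska");
        list.add("Arizona");
        list.add(0, "Alabama");    // insert at the front of the list
        list.remove("Arizona");    // remove by value
        return list;
    }

    public static void main(String[] args) {
        System.out.println(fill(new ArrayList<String>()));    // prints [Alabama, Alaska]
        System.out.println(fill(new LinkedList<String>()));   // prints [Alabama, Alaska]
    }
}
```

Only the constructor call names the concrete class, so switching implementations after measuring performance is a one-line change.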

The Set Interface

• The right side of the collections hierarchy diagram contains classes that implement the Set interface, which is used to represent an unordered collection of objects. The two concrete classes in this category are HashSet and TreeSet.

• A set is in some ways a stripped-down version of a list. Both structures allow you to add and remove elements, but the set form does not offer any notion of index positions. All you can know is whether an object is present or absent from a set.

• The difference between the HashSet and TreeSet classes reflects a difference in the underlying implementation. The HashSet class is built on the idea of hashing; the TreeSet class is based on a structure called a binary tree, the details of which are beyond the scope of the text. In practice, the main difference arises when you iterate over the elements of a set, which is described on the next slide.
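The basic set operations can be sketched as follows. The example counts the distinct words in an array by relying on the rule that an object appears in a set at most once; the class name SetBasicsDemo and its methods are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class SetBasicsDemo {

    /* Adding a value that is already present leaves the set unchanged. */
    static int countDistinct(String[] words) {
        Set<String> seen = new HashSet<String>();
        for (String word : words) {
            seen.add(word);
        }
        return seen.size();
    }

    /* A set can report only whether a value is present, not where it is. */
    static boolean isPresent(String[] words, String target) {
        Set<String> seen = new HashSet<String>();
        for (String word : words) {
            seen.add(word);
        }
        return seen.contains(target);
    }

    public static void main(String[] args) {
        String[] words = { "to", "be", "or", "not", "to", "be" };
        System.out.println(countDistinct(words));      // prints 4
        System.out.println(isPresent(words, "not"));   // prints true
    }
}
```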

Page 28:

Iteration in Collections

• One of the most useful operations for any collection is the ability to run through each of the elements in a loop. This process is called iteration.

• The java.util package includes an interface called Iterator that supports iteration over the elements of a collection. In older versions of Java, the programming pattern for using an iterator looks like this:

    Iterator iterator = collection.iterator();
    while (iterator.hasNext()) {
        type element = (type) iterator.next();
        . . . statements that process this particular element . . .
    }

• Java Standard Edition 5.0 allows you to simplify this code to

    for (type element : collection) {
        . . . statements that process this particular element . . .
    }
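The two patterns above can be compared side by side in a runnable sketch. The class name IterationDemo and both method names are hypothetical; the example uses a generic Iterator<String>, so the cast shown in the older pattern is not needed.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

class IterationDemo {

    // Explicit iterator pattern: ask for an iterator, then loop while
    // elements remain.
    static int totalLengthWithIterator(Collection<String> words) {
        int total = 0;
        Iterator<String> iterator = words.iterator();
        while (iterator.hasNext()) {
            String word = iterator.next();
            total += word.length();
        }
        return total;
    }

    // Enhanced for loop: the compiler generates the iterator calls.
    static int totalLengthWithForEach(Collection<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        Collection<String> words = Arrays.asList("alpha", "beta");
        System.out.println(totalLengthWithIterator(words));   // 9
        System.out.println(totalLengthWithForEach(words));    // 9
    }
}
```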

Page 29:

Iteration Order

• For a collection that implements the List interface, the order in which iteration proceeds through the elements of the list is defined by the underlying ordering of the list. The element at index 0 comes first, followed by the other elements in order.

• The ordering of iteration in a Set is more difficult to specify because a set is, by definition, an unordered collection. A set that implements only the Set interface, for example, is free to deliver up elements in any order, typically choosing an order that is convenient for the implementation.

• If, however, a Set also implements the SortedSet interface (as the TreeSet class does), the iterator is forced to deliver elements that appear in ascending order according to the compareTo method for that class. An iterator for a TreeSet of strings is therefore required to deliver its elements in lexicographic order, as illustrated on the next slide.
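The contrast between the two kinds of set can be sketched with a few state codes. The class name SetOrderDemo and the helper inIterationOrder are hypothetical. The TreeSet result is guaranteed by the SortedSet contract; the HashSet result is whatever order suits the implementation, so the example makes no claim about it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {

    // Hypothetical helper: copies the elements of a set into a list in the
    // order in which the set's iterator delivers them.
    static List<String> inIterationOrder(Set<String> set) {
        List<String> result = new ArrayList<String>();
        for (String element : set) {
            result.add(element);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> codes = Arrays.asList("OH", "CA", "NY", "AK");
        // Order unspecified: depends on the hashing implementation.
        System.out.println(inIterationOrder(new HashSet<String>(codes)));
        // Ascending order according to compareTo: [AK, CA, NY, OH]
        System.out.println(inIterationOrder(new TreeSet<String>(codes)));
    }
}
```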

Page 30:

Iteration Order in a HashMap

The following method iterates through the keys in a map:

    private void listKeys(Map<String,String> map, int nPerLine) {
        // Strip the package prefix so that only the short class name is shown
        String className = map.getClass().getName();
        int lastDot = className.lastIndexOf(".");
        String shortName = className.substring(lastDot + 1);
        println("Using " + shortName + ", the keys are:");
        // Print the keys in iteration order, ending the line after every nPerLine keys
        Iterator<String> iterator = map.keySet().iterator();
        for (int i = 1; iterator.hasNext(); i++) {
            print(" " + iterator.next());
            if (i % nPerLine == 0) println();
        }
    }

If you call this method on a HashMap containing the two-letter state codes, you get:

    Using HashMap, the keys are:
     SC VA LA GA DC OH MN KY WA IL OR NM MA DE MS WV HI FL KS SD AK TN ID RI NC NY NH MT WI CO OK NE NV MI MD TX VT AZ PR IN AL CA UT WY ND PA AR CT NJ ME MO IA


Page 31:

Iteration Order in a TreeMap

The following method iterates through the keys in a map:

    private void listKeys(Map<String,String> map, int nPerLine) {
        // Strip the package prefix so that only the short class name is shown
        String className = map.getClass().getName();
        int lastDot = className.lastIndexOf(".");
        String shortName = className.substring(lastDot + 1);
        println("Using " + shortName + ", the keys are:");
        // Print the keys in iteration order, ending the line after every nPerLine keys
        Iterator<String> iterator = map.keySet().iterator();
        for (int i = 1; iterator.hasNext(); i++) {
            print(" " + iterator.next());
            if (i % nPerLine == 0) println();
        }
    }

If you instead call this method on a TreeMap containing the same values, you get:

    Using TreeMap, the keys are:
     AK AL AR AZ CA CO CT DC DE FL GA HI IA ID IL IN KS KY LA MA MD ME MI MN MO MS MT NC ND NE NH NJ NM NV NY OH OK OR PA PR RI SC SD TN TX UT VA VT WA WI WV WY

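The listKeys method relies on the print and println methods of the enclosing program class. A self-contained variant that uses only the standard library might look like the sketch below; the class name MapKeyOrderDemo and the helpers fill and joinKeys are hypothetical, and the four entries stand in for the full table of state codes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class MapKeyOrderDemo {

    // Hypothetical helper: stores a few state codes in whatever map it is given.
    static Map<String, String> fill(Map<String, String> map) {
        map.put("OH", "Ohio");
        map.put("CA", "California");
        map.put("NY", "New York");
        map.put("AK", "Alaska");
        return map;
    }

    // Joins the keys with spaces in the order the key set's iterator delivers them.
    static String joinKeys(Map<String, String> map) {
        StringBuilder line = new StringBuilder();
        for (String key : map.keySet()) {
            if (line.length() > 0) line.append(' ');
            line.append(key);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // HashMap: key order is unspecified.
        System.out.println(joinKeys(fill(new HashMap<String, String>())));
        // TreeMap: keys come out in ascending order, here AK CA NY OH.
        System.out.println(joinKeys(fill(new TreeMap<String, String>())));
    }
}
```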

Page 32:

The Map Hierarchy

The following diagram shows the portion of the Java Collections Framework that implements the Map interface. The structure matches that of the Set interface in the Collection hierarchy. The distinction between HashMap and TreeMap is the same as that between HashSet and TreeSet.

    «interface» Map
        AbstractMap (implements Map)
            HashMap
            TreeMap (also implements SortedMap)
    «interface» SortedMap (extends Map)
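The relationships in this part of the hierarchy can be checked at run time with instanceof. The sketch below is only an illustration; the class name MapHierarchyDemo and the helper describe are hypothetical.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class MapHierarchyDemo {

    // Hypothetical helper: reports which parts of the hierarchy a map belongs to.
    static String describe(Map<String, Integer> map) {
        boolean viaAbstractMap = map instanceof AbstractMap;
        boolean sorted = map instanceof SortedMap;
        return map.getClass().getSimpleName()
            + ": AbstractMap=" + viaAbstractMap + ", SortedMap=" + sorted;
    }

    public static void main(String[] args) {
        System.out.println(describe(new HashMap<String, Integer>()));
        // prints "HashMap: AbstractMap=true, SortedMap=false"
        System.out.println(describe(new TreeMap<String, Integer>()));
        // prints "TreeMap: AbstractMap=true, SortedMap=true"
    }
}
```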

Page 33:

The Collections Toolbox

• The Collections class (not the same as the Collection interface) exports several static methods that operate on lists, the most important of which appear in the following table:

    binarySearch(list, key)    Finds key in a sorted list using binary search.
    sort(list)                 Sorts a list into ascending order.
    min(list)                  Returns the smallest value in a list.
    max(list)                  Returns the largest value in a list.
    reverse(list)              Reverses the order of elements in a list.
    shuffle(list)              Randomly rearranges the elements in a list.
    swap(list, p1, p2)         Exchanges the elements at index positions p1 and p2.
    replaceAll(list, x1, x2)   Replaces all elements matching x1 with x2.

• The java.util package exports a similar Arrays class that provides comparable operations, such as sort and binarySearch, for arrays.
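A short sketch of several of the methods in the table follows. The class name CollectionsToolboxDemo, the helper rearrange, and the sample values are hypothetical. The shuffle method is left out because its result is random.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CollectionsToolboxDemo {

    // Hypothetical helper: applies sort, reverse, swap, and replaceAll in turn
    // to a copy of the sample values and returns the result.
    static List<Integer> rearrange() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(31, 41, 59, 26, 53));
        Collections.sort(list);                // [26, 31, 41, 53, 59]
        Collections.reverse(list);             // [59, 53, 41, 31, 26]
        Collections.swap(list, 0, 4);          // [26, 53, 41, 31, 59]
        Collections.replaceAll(list, 41, 0);   // [26, 53, 0, 31, 59]
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(31, 41, 59, 26, 53));
        System.out.println(Collections.min(list));                // 26
        System.out.println(Collections.max(list));                // 59
        Collections.sort(list);   // binarySearch requires a sorted list
        System.out.println(Collections.binarySearch(list, 41));   // 2
        System.out.println(rearrange());                          // [26, 53, 0, 31, 59]
    }
}
```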

Page 34:

Principles of Object-oriented Design

Section 13.4 offers the following guidelines for package design:

• Unified. Each package should define a consistent abstraction with a clear unifying theme. If a class does not fit within that theme, it should not be part of the package.

• Simple. The package design should simplify things for the client. To the extent that the underlying implementation is itself complex, the package must seek to hide that complexity.

• Sufficient. For clients to adopt a package, it must provide classes and methods that meet their needs. If some critical operation is missing from a package, clients may decide to abandon it and develop their own tools.

• Flexible. A well-designed package should be general enough to meet the needs of many different clients. A package that offers narrowly defined operations for one client is not nearly as useful as one that can be used in many different situations.

• Stable. The methods defined in a class exported as part of a package should continue to have precisely the same structure and effect, even as the package evolves. Making changes in the behavior of a class forces clients to change their programs, which reduces the utility of the package.

Page 35:

The End