Data Structure and Algorithms in Java Programs Discussion

Content Type

User Generated

User

uznnne

Subject

Computer Science

Description

Hello, I would like a good one-page (A4) summary of the following chapter, Lists and Iterators, from Data Structures and Algorithms in Java.

Thank you

Unformatted Attachment Preview

Chapter 7 List and Iterator ADTs

Contents

7.1 The List ADT
7.2 Array Lists
    7.2.1 Dynamic Arrays
    7.2.2 Implementing a Dynamic Array
    7.2.3 Amortized Analysis of Dynamic Arrays
    7.2.4 Java’s StringBuilder class
7.3 Positional Lists
    7.3.1 Positions
    7.3.2 The Positional List Abstract Data Type
    7.3.3 Doubly Linked List Implementation
7.4 Iterators
    7.4.1 The Iterable Interface and Java’s For-Each Loop
    7.4.2 Implementing Iterators
7.5 The Java Collections Framework
    7.5.1 List Iterators in Java
    7.5.2 Comparison to Our Positional List ADT
    7.5.3 List-Based Algorithms in the Java Collections Framework
7.6 Sorting a Positional List
7.7 Case Study: Maintaining Access Frequencies
    7.7.1 Using a Sorted List
    7.7.2 Using a List with the Move-to-Front Heuristic
7.8 Exercises

7.1 The List ADT

In Chapter 6, we introduced the stack, queue, and deque abstract data types, and discussed how either an array or a linked list could be used for storage in an efficient concrete implementation of each. Each of those ADTs represents a linearly ordered sequence of elements.
The deque is the most general of the three, yet even so, it only allows insertions and deletions at the front or back of a sequence.

In this chapter, we explore several abstract data types that represent a linear sequence of elements, but with more general support for adding or removing elements at arbitrary positions. However, designing a single abstraction that is well suited for efficient implementation with either an array or a linked list is challenging, given the very different nature of these two fundamental data structures.

Locations within an array are easily described with an integer index. Recall that the index of an element e in a sequence is equal to the number of elements before e in that sequence. By this definition, the first element of a sequence has index 0, and the last has index n − 1, assuming that n denotes the total number of elements. The notion of an element’s index is well defined for a linked list as well, although we will see that it is not as convenient a notion, because the only way to reach the element at a given index is to traverse a portion of the linked list whose length depends on the magnitude of the index.

With that said, Java defines a general interface, java.util.List, that includes the following index-based methods (and more):

size(): Returns the number of elements in the list.
isEmpty(): Returns a boolean indicating whether the list is empty.
get(i): Returns the element of the list having index i; an error condition occurs if i is not in range [0, size() − 1].
set(i, e): Replaces the element at index i with e, and returns the old element that was replaced; an error condition occurs if i is not in range [0, size() − 1].
add(i, e): Inserts a new element e into the list so that it has index i, moving all subsequent elements one index later in the list; an error condition occurs if i is not in range [0, size()].
remove(i): Removes and returns the element at index i, moving all subsequent elements one index earlier in the list; an error condition occurs if i is not in range [0, size() − 1].

We note that the index of an existing element may change over time, as other elements are added or removed in front of it. We also draw attention to the fact that the range of valid indices for the add method includes the current size of the list, in which case the new element becomes the last.

Example 7.1 demonstrates a series of operations on a list instance, and Code Fragment 7.1 below provides a formal definition of our simplified version of the List interface; we use an IndexOutOfBoundsException to signal an invalid index argument.

Example 7.1: We demonstrate operations on an initially empty list of characters.

    Method       Return Value   List Contents
    add(0, A)    –              (A)
    add(0, B)    –              (B, A)
    get(1)       A              (B, A)
    set(2, C)    “error”        (B, A)
    add(2, C)    –              (B, A, C)
    add(4, D)    “error”        (B, A, C)
    remove(1)    A              (B, C)
    add(1, D)    –              (B, D, C)
    add(1, E)    –              (B, E, D, C)
    get(4)       “error”        (B, E, D, C)
    add(4, F)    –              (B, E, D, C, F)
    set(2, G)    D              (B, E, G, C, F)
    get(2)       G              (B, E, G, C, F)

    /** A simplified version of the java.util.List interface. */
    public interface List<E> {
      /** Returns the number of elements in this list. */
      int size();

      /** Returns whether the list is empty. */
      boolean isEmpty();

      /** Returns (but does not remove) the element at index i. */
      E get(int i) throws IndexOutOfBoundsException;

      /** Replaces the element at index i with e, and returns the replaced element. */
      E set(int i, E e) throws IndexOutOfBoundsException;

      /** Inserts element e to be at index i, shifting all subsequent elements later. */
      void add(int i, E e) throws IndexOutOfBoundsException;

      /** Removes/returns the element at index i, shifting subsequent elements earlier. */
      E remove(int i) throws IndexOutOfBoundsException;
    }

Code Fragment 7.1: A simple version of the List interface.

7.2 Array Lists

An obvious choice for implementing the list ADT is to use an array A, where A[i] stores (a reference to) the element with index i. We will begin by assuming that we have a fixed-capacity array, but in Section 7.2.1 describe a more advanced technique that effectively allows an array-based list to have unbounded capacity. Such an unbounded list is known as an array list in Java (or a vector in C++ and in the earliest versions of Java).

With a representation based on an array A, the get(i) and set(i, e) methods are easy to implement by accessing A[i] (assuming i is a legitimate index). Methods add(i, e) and remove(i) are more time consuming, as they require shifting elements up or down to maintain our rule of always storing an element whose list index is i at index i of the array. (See Figure 7.1.) Our initial implementation of the ArrayList class follows in Code Fragments 7.2 and 7.3.

[Figure 7.1: Array-based implementation of an array list that is storing n elements: (a) shifting up for an insertion at index i; (b) shifting down for a removal at index i.]

    public class ArrayList<E> implements List<E> {
      // instance variables
      public static final int CAPACITY = 16;    // default array capacity
      private E[] data;                         // generic array used for storage
      private int size = 0;                     // current number of elements
      // constructors
      public ArrayList() { this(CAPACITY); }    // constructs list with default capacity
      public ArrayList(int capacity) {          // constructs list with given capacity
        data = (E[]) new Object[capacity];      // safe cast; compiler may give warning
      }

Code Fragment 7.2: An implementation of a simple ArrayList class with bounded capacity. (Continues in Code Fragment 7.3.)

      // public methods
      /** Returns the number of elements in the array list. */
      public int size() { return size; }
      /** Returns whether the array list is empty. */
      public boolean isEmpty() { return size == 0; }
      /** Returns (but does not remove) the element at index i. */
      public E get(int i) throws IndexOutOfBoundsException {
        checkIndex(i, size);
        return data[i];
      }
      /** Replaces the element at index i with e, and returns the replaced element. */
      public E set(int i, E e) throws IndexOutOfBoundsException {
        checkIndex(i, size);
        E temp = data[i];
        data[i] = e;
        return temp;
      }
      /** Inserts element e to be at index i, shifting all subsequent elements later. */
      public void add(int i, E e) throws IndexOutOfBoundsException,
                                         IllegalStateException {
        checkIndex(i, size + 1);
        if (size == data.length)                // not enough capacity
          throw new IllegalStateException("Array is full");
        for (int k = size - 1; k >= i; k--)     // start by shifting rightmost
          data[k+1] = data[k];
        data[i] = e;                            // ready to place the new element
        size++;
      }
      /** Removes/returns the element at index i, shifting subsequent elements earlier. */
      public E remove(int i) throws IndexOutOfBoundsException {
        checkIndex(i, size);
        E temp = data[i];
        for (int k = i; k < size - 1; k++)      // shift elements to fill hole
          data[k] = data[k+1];
        data[size-1] = null;                    // help garbage collection
        size--;
        return temp;
      }
      // utility method
      /** Checks whether the given index is in the range [0, n-1]. */
      protected void checkIndex(int i, int n) throws IndexOutOfBoundsException {
        if (i < 0 || i >= n)
          throw new IndexOutOfBoundsException("Illegal index: " + i);
      }
    }

Code Fragment 7.3: An implementation of a simple ArrayList class with bounded capacity. (Continued from Code Fragment 7.2.)
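The operations of Example 7.1 can be replayed against Java's own java.util.ArrayList, which offers the same index-based methods. The following is a minimal sketch rather than part of the chapter's code: the class name Example71Demo and the helper attempt are hypothetical names introduced here, and the comments show the list contents the example table predicts.

```java
import java.util.ArrayList;
import java.util.List;

/** Replays the operations of Example 7.1 on a java.util.ArrayList (illustrative sketch). */
class Example71Demo {

  /** Runs the thirteen operations of Example 7.1 and returns the final list. */
  static List<Character> run() {
    List<Character> list = new ArrayList<>();
    list.add(0, 'A');                   // (A)
    list.add(0, 'B');                   // (B, A)
    list.get(1);                        // returns A
    attempt(() -> list.set(2, 'C'));    // "error": set requires an index below size
    list.add(2, 'C');                   // (B, A, C): index == size is valid for add
    attempt(() -> list.add(4, 'D'));    // "error": 4 exceeds the current size of 3
    list.remove(1);                     // returns A, leaving (B, C)
    list.add(1, 'D');                   // (B, D, C)
    list.add(1, 'E');                   // (B, E, D, C)
    attempt(() -> list.get(4));         // "error": valid indices are 0 through 3
    list.add(4, 'F');                   // (B, E, D, C, F)
    list.set(2, 'G');                   // returns D, giving (B, E, G, C, F)
    list.get(2);                        // returns G
    return list;
  }

  /** Hypothetical helper: runs op and reports whether it completed without an index error. */
  static boolean attempt(Runnable op) {
    try {
      op.run();
      return true;
    } catch (IndexOutOfBoundsException ex) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Note that list.remove(1) with an int argument selects the index-based remove, not the remove(Object) overload.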
The Performance of a Simple Array-Based Implementation

Table 7.1 shows the worst-case running times of the methods of an array list with n elements realized by means of an array. Methods isEmpty, size, get and set clearly run in O(1) time, but the insertion and removal methods can take much longer than this. In particular, add(i, e) runs in time O(n). Indeed, the worst case for this operation occurs when i is 0, since all the existing n elements have to be shifted forward. A similar argument applies to method remove(i), which runs in O(n) time, because we have to shift backward n − 1 elements in the worst case, when i is 0. In fact, assuming that each possible index is equally likely to be passed as an argument to these operations, their average running time is O(n), for we will have to shift n/2 elements on average.

    Method       Running Time
    size()       O(1)
    isEmpty()    O(1)
    get(i)       O(1)
    set(i, e)    O(1)
    add(i, e)    O(n)
    remove(i)    O(n)

Table 7.1: Performance of an array list with n elements realized by a fixed-capacity array.

Looking more closely at add(i, e) and remove(i), we note that they each run in time O(n − i + 1), for only those elements at index i and higher have to be shifted up or down. Thus, inserting or removing an item at the end of an array list, using the methods add(n, e) and remove(n − 1) respectively, takes O(1) time each. Moreover, this observation has an interesting consequence for the adaptation of the array list ADT to the deque ADT from Section 6.3.1. If we do the “obvious” thing and store elements of a deque so that the first element is at index 0 and the last element at index n − 1, then methods addLast and removeLast of the deque each run in O(1) time. However, methods addFirst and removeFirst of the deque each run in O(n) time.
Actually, with a little effort, we can produce an array-based implementation of the array list ADT that achieves O(1) time for insertions and removals at index 0, as well as insertions and removals at the end of the array list. Achieving this requires that we give up on our rule that an element at index i is stored in the array at index i, however, as we would have to use a circular array approach like the one we used in Section 6.2 to implement a queue. We leave the details of this implementation as Exercise C-7.25.

7.2.1 Dynamic Arrays

The ArrayList implementation in Code Fragments 7.2 and 7.3 (as well as those for a stack, queue, and deque from Chapter 6) has a serious limitation; it requires that a fixed maximum capacity be declared, throwing an exception if attempting to add an element once full. This is a major weakness, because if a user is unsure of the maximum size that will be reached for a collection, there is risk that either too large an array will be requested, causing an inefficient waste of memory, or that too small an array will be requested, causing a fatal error when exhausting that capacity.

Java’s ArrayList class provides a more robust abstraction, allowing a user to add elements to the list, with no apparent limit on the overall capacity. To provide this abstraction, Java relies on an algorithmic sleight of hand that is known as a dynamic array.

In reality, elements of an ArrayList are stored in a traditional array, and the precise size of that traditional array must be internally declared in order for the system to properly allocate a consecutive piece of memory for its storage. For example, Figure 7.2 displays an array with 12 cells that might be stored in memory locations 2146 through 2157 on a computer system.

[Figure 7.2: An array of 12 cells, allocated in memory locations 2146 through 2157.]
Because the system may allocate neighboring memory locations to store other data, the capacity of an array cannot be increased by expanding into subsequent cells.

The first key to providing the semantics of an unbounded array is that an array list instance maintains an internal array that often has greater capacity than the current length of the list. For example, while a user may have created a list with five elements, the system may have reserved an underlying array capable of storing eight object references (rather than only five). This extra capacity makes it easy to add a new element to the end of the list by using the next available cell of the array.

If a user continues to add elements to a list, all reserved capacity in the underlying array will eventually be exhausted. In that case, the class requests a new, larger array from the system, and copies all references from the smaller array into the beginning of the new array. At that point in time, the old array is no longer needed, so it can be reclaimed by the system. Intuitively, this strategy is much like that of the hermit crab, which moves into a larger shell when it outgrows its previous one.

7.2.2 Implementing a Dynamic Array

We now demonstrate how our original version of the ArrayList, from Code Fragments 7.2 and 7.3, can be transformed to a dynamic-array implementation, having unbounded capacity. We rely on the same internal representation, with a traditional array A, that is initialized either to a default capacity or to one specified as a parameter to the constructor.

The key is to provide means to “grow” the array A, when more space is needed. Of course, we cannot actually grow that array, as its capacity is fixed. Instead, when a call to add a new element risks overflowing the current array, we perform the following additional steps:

1. Allocate a new array B with larger capacity.
2. Set B[k] = A[k], for k = 0, . . . , n − 1, where n denotes the current number of items.
3. Set A = B, that is, we henceforth use the new array to support the list.
4. Insert the new element in the new array.

An illustration of this process is shown in Figure 7.3.

[Figure 7.3: An illustration of “growing” a dynamic array: (a) create new array B; (b) store elements of A in B; (c) reassign reference A to the new array. Not shown is the future garbage collection of the old array, or the insertion of a new element.]

Code Fragment 7.4 provides a concrete implementation of a resize method, which should be included as a protected method within the original ArrayList class. The instance variable data corresponds to array A in the above discussion, and local variable temp corresponds to array B.

    /** Resizes internal array to have given capacity >= size. */
    protected void resize(int capacity) {
      E[] temp = (E[]) new Object[capacity];    // safe cast; compiler may give warning
      for (int k = 0; k < size; k++)
        temp[k] = data[k];
      data = temp;                              // start using the new array
    }

Code Fragment 7.4: An implementation of the ArrayList.resize method.

The remaining issue to consider is how large a new array to create. A commonly used rule is for the new array to have twice the capacity of the existing array that has been filled. In Section 7.2.3, we will provide a mathematical analysis to justify such a choice.

To complete the revision to our original ArrayList implementation, we redesign the add method so that it calls the new resize utility when detecting that the current array is filled (rather than throwing an exception). The revised version appears in Code Fragment 7.5.

    /** Inserts element e to be at index i, shifting all subsequent elements later. */
    public void add(int i, E e) throws IndexOutOfBoundsException {
      checkIndex(i, size + 1);
      if (size == data.length)                  // not enough capacity
        resize(2 * data.length);                // so double the current capacity
      // rest of method unchanged...

Code Fragment 7.5: A revision to the ArrayList.add method, originally from Code Fragment 7.3, which calls the resize method of Code Fragment 7.4 when more capacity is needed.

Finally, we note that our original implementation of the ArrayList class includes two constructors: a default constructor that uses an initial capacity of 16, and a parameterized constructor that allows the caller to specify a capacity value. With the use of dynamic arrays, that capacity is no longer a fixed limit. Still, greater efficiency is achieved when a user selects an initial capacity that matches the actual size of a data set, as this can avoid time spent on intermediate array reallocations and potential space that is wasted by having too large an array.

7.2.3 Amortized Analysis of Dynamic Arrays

In this section, we will perform a detailed analysis of the running time of operations on dynamic arrays. As a shorthand notation, let us refer to the insertion of an element to be the last element in an array list as a push operation.

The strategy of replacing an array with a new, larger array might at first seem slow, because a single push operation may require Ω(n) time to perform, where n is the current number of elements in the array. (Recall, from Section 4.3.1, that big-Omega notation describes an asymptotic lower bound on the running time of an algorithm.) However, by doubling the capacity during an array replacement, our new array allows us to add n further elements before the array must be replaced again. In this way, there are many simple push operations for each expensive one (see Figure 7.4). This fact allows us to show that a series of push operations on an initially empty dynamic array is efficient in terms of its total running time.
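The claim that expensive pushes are rare can be checked by counting. The sketch below is an illustration under stated assumptions, not the chapter's code: it simulates only the bookkeeping of a doubling dynamic array that starts at capacity one, and the class name DoublingCost and method copiesForPushes are hypothetical. It counts one unit per push plus one unit per element copied during a resize.

```java
/** Counts the element copies caused by n pushes on a doubling dynamic array (sketch). */
class DoublingCost {

  /** Returns the total number of element copies performed by resizes during n pushes. */
  static long copiesForPushes(int n) {
    int capacity = 1;            // assumed initial capacity of one
    int size = 0;
    long copies = 0;
    for (int k = 0; k < n; k++) {
      if (size == capacity) {    // array is full: allocate double and copy every element
        copies += size;
        capacity *= 2;
      }
      size++;                    // the push itself
    }
    return copies;
  }

  public static void main(String[] args) {
    for (int n : new int[] {16, 1000, 1_000_000}) {
      long copies = copiesForPushes(n);
      System.out.println(n + " pushes: " + copies + " copies, "
          + (n + copies) + " steps in total");
    }
  }
}
```

For 16 pushes the resizes happen at sizes 1, 2, 4, and 8, for 15 copies in total, so the whole sequence costs 31 steps, comfortably below three steps per push.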
[Figure 7.4: Running times of a series of push operations on a dynamic array. The horizontal axis shows the current number of elements (1 through 16); the vertical axis shows the primitive operations for a push.]

Using an algorithmic design pattern called amortization, we show that performing a sequence of push operations on a dynamic array is actually quite efficient. To perform an amortized analysis, we use an accounting technique where we view the computer as a coin-operated appliance that requires the payment of one cyber-dollar for a constant amount of computing time. When an operation is executed, we should have enough cyber-dollars available in our current “bank account” to pay for that operation’s running time. Thus, the total amount of cyber-dollars spent for any computation will be proportional to the total time spent on that computation. The beauty of using this analysis method is that we can overcharge some operations in order to save up cyber-dollars to pay for others.

Proposition 7.2: Let L be an initially empty array list with capacity one, implemented by means of a dynamic array that doubles in size when full. The total time to perform a series of n push operations in L is O(n).

Justification: Let us assume that one cyber-dollar is enough to pay for the execution of each push operation in L, excluding the time spent for growing the array. Also, let us assume that growing the array from size k to size 2k requires k cyber-dollars for the time spent initializing the new array. We shall charge each push operation three cyber-dollars. Thus, we overcharge each push operation that does not cause an overflow by two cyber-dollars. Think of the two cyber-dollars profited in an insertion that does not grow the array as being “stored” with the cell in which the element was inserted. An overflow occurs when the array list L has $2^i$ elements, for some integer i ≥ 0, and the capacity of the array representing L is $2^i$.
Thus, doubling the size of the array will require $2^i$ cyber-dollars. Fortunately, these cyber-dollars can be found stored in cells $2^{i-1}$ through $2^i - 1$. (See Figure 7.5.)

[Figure 7.5: Illustration of a series of push operations on a dynamic array: (a) an 8-cell array is full, with two cyber-dollars “stored” at cells 4 through 7; (b) a push operation causes an overflow and a doubling of capacity. Copying the eight old elements to the new array is paid for by the cyber-dollars already stored in the table. Inserting the new element is paid for by one of the cyber-dollars charged to the current push operation, and the two cyber-dollars profited are stored at cell 8.]

Note that the previous overflow occurred when the number of elements became larger than $2^{i-1}$ for the first time, and thus the cyber-dollars stored in cells $2^{i-1}$ through $2^i - 1$ have not yet been spent. Therefore, we have a valid amortization scheme in which each operation is charged three cyber-dollars and all the computing time is paid for. That is, we can pay for the execution of n push operations using 3n cyber-dollars. In other words, the amortized running time of each push operation is O(1); hence, the total running time of n push operations is O(n).

Geometric Increase in Capacity

Although the proof of Proposition 7.2 relies on the array being doubled each time it is expanded, the O(1) amortized bound per operation can be proven for any geometrically increasing progression of array sizes. (See Section 2.2.3 for discussion of geometric progressions.) When choosing the geometric base, there exists a tradeoff between runtime efficiency and memory usage. If the last insertion causes a resize event, with a base of 2 (i.e., doubling the array), the array essentially ends up twice as large as it needs to be.
If we instead increase the array by only 25% of its current size (i.e., a geometric base of 1.25), we do not risk wasting as much memory in the end, but there will be more intermediate resize events along the way. Still it is possible to prove an O(1) amortized bound, using a constant factor greater than the 3 cyber-dollars per operation used in the proof of Proposition 7.2 (see Exercise R-7.7). The key to the performance is that the amount of additional space is proportional to the current size of the array.

Beware of Arithmetic Progression

To avoid reserving too much space at once, it might be tempting to implement a dynamic array with a strategy in which a constant number of additional cells are reserved each time an array is resized. Unfortunately, the overall performance of such a strategy is significantly worse. At an extreme, an increase of only one cell causes each push operation to resize the array, leading to a familiar 1 + 2 + 3 + · · · + n summation and $\Omega(n^2)$ overall cost. Using increases of 2 or 3 at a time is slightly better, as portrayed in Figure 7.6, but the overall cost remains quadratic.

[Figure 7.6: Running times of a series of push operations on a dynamic array using arithmetic progression of sizes. Part (a) assumes an increase of 2 in the size of the array, while part (b) assumes an increase of 3. In both panels the horizontal axis shows the current number of elements (1 through 16) and the vertical axis shows the primitive operations for a push.]

Using a fixed increment for each resize, and thus an arithmetic progression of intermediate array sizes, results in an overall time that is quadratic in the number of operations, as shown in the following proposition. In essence, even an increase of 10,000 cells per resize will become insignificant for large data sets.
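The contrast between the two growth rules is easy to see numerically. The following sketch is an illustration only: the class GrowthPolicies and its methods are hypothetical names, the initial capacities (one increment for the fixed rule, one cell for the geometric rule) are assumptions, and the count covers just the elements copied during resizes rather than the full initialization cost used in the formal argument.

```java
/** Compares resize-copy counts for fixed-increment and geometric growth (sketch). */
class GrowthPolicies {

  /** Copies caused by n pushes when every resize adds a fixed number of cells. */
  static long copiesFixedIncrement(int n, int increment) {
    int capacity = increment;    // assumed initial capacity: one increment
    int size = 0;
    long copies = 0;
    for (int k = 0; k < n; k++) {
      if (size == capacity) {
        copies += size;          // every existing element moves to the new array
        capacity += increment;
      }
      size++;
    }
    return copies;
  }

  /** Copies caused by n pushes when every resize multiplies the capacity by factor > 1. */
  static long copiesGeometric(int n, double factor) {
    int capacity = 1;            // assumed initial capacity of one
    int size = 0;
    long copies = 0;
    for (int k = 0; k < n; k++) {
      if (size == capacity) {
        copies += size;
        // grow by the factor, but always by at least one cell
        capacity = Math.max(capacity + 1, (int) (capacity * factor));
      }
      size++;
    }
    return copies;
  }

  public static void main(String[] args) {
    int n = 1_000_000;
    System.out.println("increment 10,000: " + copiesFixedIncrement(n, 10_000));
    System.out.println("factor 1.25:      " + copiesGeometric(n, 1.25));
    System.out.println("factor 2:         " + copiesGeometric(n, 2.0));
  }
}
```

With 16 pushes, increments of 1, 2, and 3 cost 120, 56, and 45 copies respectively, against 15 for doubling; as n grows, the fixed-increment counts grow quadratically while the geometric counts stay proportional to n.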
Proposition 7.3: Performing a series of n push operations on an initially empty dynamic array using a fixed increment with each resize takes $\Omega(n^2)$ time.

Justification: Let c > 0 represent the fixed increment in capacity that is used for each resize event. During the series of n push operations, time will have been spent initializing arrays of size c, 2c, 3c, . . . , mc for m = ⌈n/c⌉, and therefore, the overall time is proportional to c + 2c + 3c + · · · + mc. By Proposition 4.3, this sum is

$$\sum_{i=1}^{m} ci \;=\; c \cdot \sum_{i=1}^{m} i \;=\; c\,\frac{m(m+1)}{2} \;\ge\; c\,\frac{\frac{n}{c}\left(\frac{n}{c}+1\right)}{2} \;\ge\; \frac{1}{2c} \cdot n^2 .$$

Therefore, performing the n push operations takes $\Omega(n^2)$ time.

Memory Usage and Shrinking an Array

Another consequence of the rule of a geometric increase in capacity when adding to a dynamic array is that the final array size is guaranteed to be proportional to the overall number of elements. That is, the data structure uses O(n) memory. This is a very desirable property for a data structure.

If a container, such as an array list, provides operations that cause the removal of one or more elements, greater care must be taken to ensure that a dynamic array guarantees O(n) memory usage. The risk is that repeated insertions may cause the underlying array to grow arbitrarily large, and that there will no longer be a proportional relationship between the actual number of elements and the array capacity after many elements are removed.

A robust implementation of such a data structure will shrink the underlying array, on occasion, while maintaining the O(1) amortized bound on individual operations. However, care must be taken to ensure that the structure cannot rapidly oscillate between growing and shrinking the underlying array, in which case the amortized bound would not be achieved.
In Exercise C-7.29, we explore a strategy in which the array capacity is halved whenever the number of actual elements falls below one-fourth of that capacity, thereby guaranteeing that the array capacity is at most four times the number of elements; we explore the amortized analysis of such a strategy in Exercises C-7.30 and C-7.31.

7.2.4 Java’s StringBuilder class

Near the beginning of Chapter 4, we described an experiment in which we compared two algorithms for composing a long string (Code Fragment 4.2). The first of those relied on repeated concatenation using the String class, and the second relied on use of Java’s StringBuilder class. We observed that the StringBuilder was significantly faster, with empirical evidence that suggested a quadratic running time for the algorithm with repeated concatenations, and a linear running time for the algorithm with the StringBuilder. We are now able to explain the theoretical underpinning for those observations.

The StringBuilder class represents a mutable string by storing characters in a dynamic array. With analysis similar to Proposition 7.2, it guarantees that a series of append operations resulting in a string of length n execute in a combined time of O(n). (Insertions at positions other than the end of a string builder do not carry this guarantee, just as they do not for an ArrayList.)

In contrast, the repeated use of string concatenation requires quadratic time. We originally analyzed that algorithm on page 172 of Chapter 4. In effect, that approach is akin to a dynamic array with an arithmetic progression of size one, repeatedly copying all characters from one array to a new array with size one greater than before.

7.3 Positional Lists

When working with array-based sequences, integer indices provide an excellent means for describing the location of an element, or the location at which an insertion or deletion should take place.
However, numeric indices are not a good choice for describing positions within a linked list because, knowing only an element’s index, the only way to reach it is to traverse the list incrementally from its beginning or end, counting elements along the way.

Furthermore, indices are not a good abstraction for describing a more local view of a position in a sequence, because the index of an entry changes over time due to insertions or deletions that happen earlier in the sequence. For example, it may not be convenient to describe the location of a person waiting in line based on the index, as that requires knowledge of precisely how far away that person is from the front of the line. We prefer an abstraction, as characterized in Figure 7.7, in which there is some other means for describing a position.

[Figure 7.7: We wish to be able to identify the position of an element in a sequence without the use of an integer index. The label “me” represents some abstraction that identifies the position.]

Our goal is to design an abstract data type that provides a user a way to refer to elements anywhere in a sequence, and to perform arbitrary insertions and deletions. This would allow us to efficiently describe actions such as a person deciding to leave the line before reaching the front, or allowing a friend to “cut” into line right behind him or her.

As another example, a text document can be viewed as a long sequence of characters. A word processor uses the abstraction of a cursor to describe a position within the document without explicit use of an integer index, allowing operations such as “delete the character at the cursor” or “insert a new character just after the cursor.” Furthermore, we may be able to refer to an inherent position within a document, such as the beginning of a particular chapter, without relying on a character index (or even a chapter number) that may change as the document evolves.
For these reasons, we temporarily forego the index-based methods of Java’s formal List interface, and instead develop our own abstract data type that we denote as a positional list. Although a positional list is an abstraction, and need not rely on a linked list for its implementation, we certainly have a linked list in mind as we design the ADT, ensuring that it takes best advantage of particular capabilities of a linked list, such as O(1)-time insertions and deletions at arbitrary positions (something that is not possible with an array-based sequence).

We face an immediate challenge in designing the ADT; to achieve constant time insertions and deletions at arbitrary locations, we effectively need a reference to the node at which an element is stored. It is therefore very tempting to develop an ADT in which a node reference serves as the mechanism for describing a position. In fact, our DoublyLinkedList class of Section 3.4.1 has methods addBetween and remove that accept node references as parameters; however, we intentionally declared those methods as private.

Unfortunately, the public use of nodes in the ADT would violate the object-oriented design principles of abstraction and encapsulation, which were introduced in Chapter 2. There are several reasons to prefer that we encapsulate the nodes of a linked list, for both our sake and for the benefit of users of our abstraction:

• It will be simpler for users of our data structure if they are not bothered with unnecessary details of our implementation, such as low-level manipulation of nodes, or our reliance on the use of sentinel nodes. Notice that to use the addBetween method of our DoublyLinkedList class to add a node at the beginning of a sequence, the header sentinel must be sent as a parameter.

• We can provide a more robust data structure if we do not permit users to directly access or manipulate the nodes.
  We can then ensure that users do not invalidate the consistency of a list by mismanaging the linking of nodes. A more subtle problem arises if a user were allowed to call the addBetween or remove method of our DoublyLinkedList class, sending a node that does not belong to the given list as a parameter. (Go back and look at that code and see why it causes a problem!)
• By better encapsulating the internal details of our implementation, we have greater flexibility to redesign the data structure and improve its performance. In fact, with a well-designed abstraction, we can provide a notion of a nonnumeric position, even if using an array-based sequence. (See Exercise C-7.43.)

Therefore, in defining the positional list ADT, we also introduce the concept of a position, which formalizes the intuitive notion of the “location” of an element relative to others in the list. (When we do use a linked list for the implementation, we will later see how we can privately use node references as natural manifestations of positions.)

7.3.1 Positions

To provide a general abstraction for the location of an element within a structure, we define a simple position abstract data type. A position supports the following single method:

  getElement( ): Returns the element stored at this position.

A position acts as a marker or token within a broader positional list. A position p, which is associated with some element e in a list L, does not change, even if the index of e changes in L due to insertions or deletions elsewhere in the list. Nor does position p change if we replace the element e stored at p with another element. The only way in which a position becomes invalid is if that position (and its element) are explicitly removed from the list.

Having a formal definition of a position type allows positions to serve as parameters to some methods and return values from other methods of the positional list ADT, which we next describe.
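The three properties of a position stated above (unaffected by insertions elsewhere, unaffected by replacing its element, invalidated only by removal) can be illustrated with a deliberately tiny linked structure. This is a sketch under stated assumptions, not the chapter's implementation: the names TokenDemo, Pos, and MiniList are hypothetical, and the sketch omits the argument validation that a robust class would perform.

```java
// Hypothetical names throughout; a minimal illustration of position semantics.
class TokenDemo {
    /** A position is a token that exposes only its element. */
    interface Pos<E> { E getElement(); }

    /** Minimal doubly linked list with sentinels; its nodes double as positions. */
    static class MiniList<E> {
        private static class Node<E> implements Pos<E> {
            E element; Node<E> prev, next;
            Node(E e, Node<E> p, Node<E> n) { element = e; prev = p; next = n; }
            public E getElement() {
                if (next == null)   // assumed convention: a removed node has no successor
                    throw new IllegalStateException("Position no longer valid");
                return element;
            }
        }
        private final Node<E> header = new Node<>(null, null, null);
        private final Node<E> trailer = new Node<>(null, header, null);
        MiniList() { header.next = trailer; }

        private Pos<E> link(E e, Node<E> pred, Node<E> succ) {
            Node<E> n = new Node<>(e, pred, succ);
            pred.next = n;
            succ.prev = n;
            return n;
        }
        Pos<E> addFirst(E e) { return link(e, header, header.next); }
        Pos<E> addLast(E e)  { return link(e, trailer.prev, trailer); }
        void set(Pos<E> p, E e) { ((Node<E>) p).element = e; }   // no validation in this sketch
        E remove(Pos<E> p) {
            Node<E> n = (Node<E>) p;
            n.prev.next = n.next;
            n.next.prev = n.prev;
            E old = n.element;
            n.element = null; n.prev = null; n.next = null;      // mark the node as defunct
            return old;
        }
    }

    static String demo() {
        MiniList<String> line = new MiniList<>();
        Pos<String> me = line.addLast("me");
        line.addFirst("ann");            // the index of "me" changes from 0 to 1 ...
        line.addFirst("bob");            // ... and then to 2
        String a = me.getElement();      // the position itself is unaffected
        line.set(me, "ME");              // replacing the element keeps the position valid
        String b = me.getElement();
        line.remove(me);                 // only explicit removal invalidates it
        String c;
        try { c = me.getElement(); }
        catch (IllegalStateException ex) { c = "invalid"; }
        return a + "," + b + "," + c;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The caller never learns that a Pos is really a node, which is the encapsulation argument made in the bullets above.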
7.3.2 The Positional List Abstract Data Type

We now view a positional list as a collection of positions, each of which stores an element. The accessor methods provided by the positional list ADT include the following, for a list L:

  first( ): Returns the position of the first element of L (or null if empty).
  last( ): Returns the position of the last element of L (or null if empty).
  before(p): Returns the position of L immediately before position p (or null if p is the first position).
  after(p): Returns the position of L immediately after position p (or null if p is the last position).
  isEmpty( ): Returns true if list L does not contain any elements.
  size( ): Returns the number of elements in list L.

An error occurs if a position p, sent as a parameter to a method, is not a valid position for the list.

Note well that the first( ) and last( ) methods of the positional list ADT return the associated positions, not the elements. (This is in contrast to the corresponding first and last methods of the deque ADT.) The first element of a positional list can be determined by subsequently invoking the getElement method on that position, as first( ).getElement( ). The advantage of receiving a position as a return value is that we can subsequently use that position to traverse the list.

As a demonstration of a typical traversal of a positional list, Code Fragment 7.6 traverses a list, named guests, that stores string elements, and prints each element while traversing from the beginning of the list to the end.

    Position<String> cursor = guests.first();
    while (cursor != null) {
      System.out.println(cursor.getElement());
      cursor = guests.after(cursor);            // advance to the next position (if any)
    }

Code Fragment 7.6: A traversal of a positional list.

This code relies on the convention that the null reference is returned when the after method is called upon the last position.
(That return value is clearly distinguishable from any legitimate position.) The positional list ADT similarly indicates that the null value is returned when the before method is invoked at the front of the list, or when first or last methods are called upon an empty list. Therefore, the above code fragment works correctly even if the guests list is empty.

Update Methods of a Positional List

The positional list ADT also includes the following update methods:

  addFirst(e): Inserts a new element e at the front of the list, returning the position of the new element.
  addLast(e): Inserts a new element e at the back of the list, returning the position of the new element.
  addBefore(p, e): Inserts a new element e in the list, just before position p, returning the position of the new element.
  addAfter(p, e): Inserts a new element e in the list, just after position p, returning the position of the new element.
  set(p, e): Replaces the element at position p with element e, returning the element formerly at position p.
  remove(p): Removes and returns the element at position p in the list, invalidating the position.

There may at first seem to be redundancy in the above repertoire of operations for the positional list ADT, since we can perform operation addFirst(e) with addBefore(first( ), e), and operation addLast(e) with addAfter(last( ), e). But these substitutions can only be done for a nonempty list, because first( ) and last( ) return null when the list is empty.

Example 7.4: The following table shows a series of operations on an initially empty positional list storing integers. To identify position instances, we use variables such as p and q. For ease of exposition, when displaying the list contents, we use subscript notation to denote the position storing an element.
    Method            Return Value   List Contents
    addLast(8)        p              (8_p)
    first( )          p              (8_p)
    addAfter(p, 5)    q              (8_p, 5_q)
    before(q)         p              (8_p, 5_q)
    addBefore(q, 3)   r              (8_p, 3_r, 5_q)
    r.getElement( )   3              (8_p, 3_r, 5_q)
    after(p)          r              (8_p, 3_r, 5_q)
    before(p)         null           (8_p, 3_r, 5_q)
    addFirst(9)       s              (9_s, 8_p, 3_r, 5_q)
    remove(last( ))   5              (9_s, 8_p, 3_r)
    set(p, 7)         8              (9_s, 7_p, 3_r)
    remove(q)         “error”        (9_s, 7_p, 3_r)

Java Interface Definitions

We are now ready to formalize the position ADT and positional list ADT. A Java Position interface, representing the position ADT, is given in Code Fragment 7.7. Following that, Code Fragment 7.8 presents a Java definition for our PositionalList interface. If the getElement( ) method is called on a Position instance that has previously been removed from its list, an IllegalStateException is thrown. If an invalid Position instance is sent as a parameter to a method of a PositionalList, an IllegalArgumentException is thrown. (Both of those exception types are defined in the standard Java hierarchy.)

    public interface Position<E> {
      /**
       * Returns the element stored at this position.
       *
       * @return the stored element
       * @throws IllegalStateException if position no longer valid
       */
      E getElement() throws IllegalStateException;
    }

Code Fragment 7.7: The Position interface.

    /** An interface for positional lists. */
    public interface PositionalList<E> {

      /** Returns the number of elements in the list. */
      int size();

      /** Tests whether the list is empty. */
      boolean isEmpty();

      /** Returns the first Position in the list (or null, if empty). */
      Position<E> first();

      /** Returns the last Position in the list (or null, if empty). */
      Position<E> last();

      /** Returns the Position immediately before Position p (or null, if p is first). */
      Position<E> before(Position<E> p) throws IllegalArgumentException;

      /** Returns the Position immediately after Position p (or null, if p is last).
       */
      Position<E> after(Position<E> p) throws IllegalArgumentException;

      /** Inserts element e at the front of the list and returns its new Position. */
      Position<E> addFirst(E e);

      /** Inserts element e at the back of the list and returns its new Position. */
      Position<E> addLast(E e);

      /** Inserts element e immediately before Position p and returns its new Position. */
      Position<E> addBefore(Position<E> p, E e)
        throws IllegalArgumentException;

      /** Inserts element e immediately after Position p and returns its new Position. */
      Position<E> addAfter(Position<E> p, E e)
        throws IllegalArgumentException;

      /** Replaces the element stored at Position p and returns the replaced element. */
      E set(Position<E> p, E e) throws IllegalArgumentException;

      /** Removes the element stored at Position p and returns it (invalidating p). */
      E remove(Position<E> p) throws IllegalArgumentException;
    }

Code Fragment 7.8: The PositionalList interface.

7.3.3 Doubly Linked List Implementation

Not surprisingly, our preferred implementation of the PositionalList interface relies on a doubly linked list. Although we implemented a DoublyLinkedList class in Chapter 3, that class does not adhere to the PositionalList interface.

In this section, we develop a concrete implementation of the PositionalList interface using a doubly linked list. The low-level details of our new linked-list representation, such as the use of header and trailer sentinels, will be identical to our earlier version; we refer the reader to Section 3.4 for a discussion of the doubly linked list operations. What differs in this section is our management of the positional abstraction.

The obvious way to identify locations within a linked list is by node references. Therefore, we declare the nested Node class of our linked list so as to implement the Position interface, supporting the required getElement method. So the nodes are the positions.
Yet, the Node class is declared as private, to maintain proper encapsulation. All of the public methods of the positional list rely on the Position type, so although we know we are sending and receiving nodes, these are only known to be positions from the outside; as a result, users of our class cannot call any method other than getElement( ).

In Code Fragments 7.9–7.12, we define a LinkedPositionalList class, which implements the positional list ADT. We provide the following guide to that code:

• Code Fragment 7.9 contains the definition of the nested Node class, which implements the Position interface. Following that are the declaration of the instance variables of the outer LinkedPositionalList class and its constructor.
• Code Fragment 7.10 begins with two important utility methods that help us robustly cast between the Position and Node types. The validate(p) method is called anytime the user sends a Position instance as a parameter. It throws an exception if it determines that the position is invalid, and otherwise returns that instance, implicitly cast as a Node, so that methods of the Node class can subsequently be called. The private position(node) method is used when about to return a Position to the user. Its primary purpose is to make sure that we do not expose either sentinel node to a caller, returning a null reference in such a case. We rely on both of these private utility methods in the public accessor methods that follow.
• Code Fragment 7.11 provides most of the public update methods, relying on a private addBetween method to unify the implementations of the various insertion operations.
• Code Fragment 7.12 provides the public remove method. Note that it sets all fields of the removed node back to null, a condition we can later detect to recognize a defunct position.

    /** Implementation of a positional list stored as a doubly linked list.
     */
    public class LinkedPositionalList<E> implements PositionalList<E> {
      //---------------- nested Node class ----------------
      private static class Node<E> implements Position<E> {
        private E element;       // reference to the element stored at this node
        private Node<E> prev;    // reference to the previous node in the list
        private Node<E> next;    // reference to the subsequent node in the list
        public Node(E e, Node<E> p, Node<E> n) {
          element = e;
          prev = p;
          next = n;
        }
        public E getElement() throws IllegalStateException {
          if (next == null)      // convention for defunct node
            throw new IllegalStateException("Position no longer valid");
          return element;
        }
        public Node<E> getPrev() {
          return prev;
        }
        public Node<E> getNext() {
          return next;
        }
        public void setElement(E e) {
          element = e;
        }
        public void setPrev(Node<E> p) {
          prev = p;
        }
        public void setNext(Node<E> n) {
          next = n;
        }
      } //----------- end of nested Node class -----------

      // instance variables of the LinkedPositionalList
      private Node<E> header;    // header sentinel
      private Node<E> trailer;   // trailer sentinel
      private int size = 0;      // number of elements in the list

      /** Constructs a new empty list. */
      public LinkedPositionalList() {
        header = new Node<>(null, null, null);      // create header
        trailer = new Node<>(null, header, null);   // trailer is preceded by header
        header.setNext(trailer);                    // header is followed by trailer
      }

Code Fragment 7.9: An implementation of the LinkedPositionalList class. (Continues in Code Fragments 7.10–7.12.)

      // private utilities
      /** Validates the position and returns it as a node.
       */
      private Node<E> validate(Position<E> p) throws IllegalArgumentException {
        if (!(p instanceof Node)) throw new IllegalArgumentException("Invalid p");
        Node<E> node = (Node<E>) p;     // safe cast
        if (node.getNext() == null)     // convention for defunct node
          throw new IllegalArgumentException("p is no longer in the list");
        return node;
      }

      /** Returns the given node as a Position (or null, if it is a sentinel). */
      private Position<E> position(Node<E> node) {
        if (node == header || node == trailer)
          return null;                  // do not expose user to the sentinels
        return node;
      }

      // public accessor methods
      /** Returns the number of elements in the linked list. */
      public int size() { return size; }

      /** Tests whether the linked list is empty. */
      public boolean isEmpty() { return size == 0; }

      /** Returns the first Position in the linked list (or null, if empty). */
      public Position<E> first() {
        return position(header.getNext());
      }

      /** Returns the last Position in the linked list (or null, if empty). */
      public Position<E> last() {
        return position(trailer.getPrev());
      }

      /** Returns the Position immediately before Position p (or null, if p is first). */
      public Position<E> before(Position<E> p) throws IllegalArgumentException {
        Node<E> node = validate(p);
        return position(node.getPrev());
      }

      /** Returns the Position immediately after Position p (or null, if p is last). */
      public Position<E> after(Position<E> p) throws IllegalArgumentException {
        Node<E> node = validate(p);
        return position(node.getNext());
      }

Code Fragment 7.10: An implementation of the LinkedPositionalList class. (Continued from Code Fragment 7.9; continues in Code Fragments 7.11 and 7.12.)

      // private utilities
      /** Adds element e to the linked list between the given nodes.
       */
      private Position<E> addBetween(E e, Node<E> pred, Node<E> succ) {
        Node<E> newest = new Node<>(e, pred, succ);   // create and link a new node
        pred.setNext(newest);
        succ.setPrev(newest);
        size++;
        return newest;
      }

      // public update methods
      /** Inserts element e at the front of the linked list and returns its new Position. */
      public Position<E> addFirst(E e) {
        return addBetween(e, header, header.getNext());     // just after the header
      }

      /** Inserts element e at the back of the linked list and returns its new Position. */
      public Position<E> addLast(E e) {
        return addBetween(e, trailer.getPrev(), trailer);   // just before the trailer
      }

      /** Inserts element e immediately before Position p, and returns its new Position. */
      public Position<E> addBefore(Position<E> p, E e)
                                    throws IllegalArgumentException {
        Node<E> node = validate(p);
        return addBetween(e, node.getPrev(), node);
      }

      /** Inserts element e immediately after Position p, and returns its new Position. */
      public Position<E> addAfter(Position<E> p, E e)
                                    throws IllegalArgumentException {
        Node<E> node = validate(p);
        return addBetween(e, node, node.getNext());
      }

      /** Replaces the element stored at Position p and returns the replaced element. */
      public E set(Position<E> p, E e) throws IllegalArgumentException {
        Node<E> node = validate(p);
        E answer = node.getElement();
        node.setElement(e);
        return answer;
      }

Code Fragment 7.11: An implementation of the LinkedPositionalList class. (Continued from Code Fragments 7.9 and 7.10; continues in Code Fragment 7.12.)

      /** Removes the element stored at Position p and returns it (invalidating p).
       */
      public E remove(Position<E> p) throws IllegalArgumentException {
        Node<E> node = validate(p);
        Node<E> predecessor = node.getPrev();
        Node<E> successor = node.getNext();
        predecessor.setNext(successor);
        successor.setPrev(predecessor);
        size--;
        E answer = node.getElement();
        node.setElement(null);          // help with garbage collection
        node.setNext(null);             // and convention for defunct node
        node.setPrev(null);
        return answer;
      }
    }

Code Fragment 7.12: An implementation of the LinkedPositionalList class. (Continued from Code Fragments 7.9–7.11.)

The Performance of a Linked Positional List

The positional list ADT is ideally suited for implementation with a doubly linked list, as all operations run in worst-case constant time, as shown in Table 7.2. This is in stark contrast to the ArrayList structure (analyzed in Table 7.1), which requires linear time for insertions or deletions at arbitrary positions, due to the need for a loop to shift other elements.

Of course, our positional list does not support the index-based methods of the official List interface of Section 7.1. It is possible to add support for those methods by traversing the list while counting nodes (see Exercise C-7.38), but that requires time proportional to the sublist that is traversed.

    Method                             Running Time
    size( )                            O(1)
    isEmpty( )                         O(1)
    first( ), last( )                  O(1)
    before(p), after(p)                O(1)
    addFirst(e), addLast(e)            O(1)
    addBefore(p, e), addAfter(p, e)    O(1)
    set(p, e)                          O(1)
    remove(p)                          O(1)

Table 7.2: Performance of a positional list with n elements realized by a doubly linked list. The space usage is O(n).

Implementing a Positional List with an Array

We can implement a positional list L using an array A for storage, but some care is necessary in designing objects that will serve as positions. At first glance, it would seem that a position p need only store the index i at which its associated element is stored within the array.
We can then implement method getElement(p) simply by returning A[i]. The problem with this approach is that the index of an element e changes when other insertions or deletions occur before it. If we have already returned a position p associated with element e that stores an outdated index i to a user, the wrong array cell would be accessed when the position was used. (Remember that positions in a positional list should always be defined relative to their neighboring positions, not their indices.)

Hence, if we are going to implement a positional list with an array, we need a different approach. We recommend the following representation. Instead of storing the elements of L directly in array A, we store a new kind of position object in each cell of A. A position p stores the element e as well as the current index i of that element within the list. Such a data structure is illustrated in Figure 7.8.

[Figure 7.8: An array-based representation of a positional list. Cells 0 through 3 of an array with cells 0 through N−1 hold the position objects (0,JFK), (1,BWI), (2,PVD), and (3,SFO).]

With this representation, we can determine the index currently associated with a position, and we can determine the position currently associated with a specific index. We can therefore implement an accessor, such as before(p), by finding the index of the given position and using the array to find the neighboring position. When an element is inserted or deleted somewhere in the list, we can loop through the array to update the index variable stored in all later positions in the list that are shifted during the update.

Efficiency Trade-Offs with an Array-Based Sequence

In this array implementation of a sequence, the addFirst, addBefore, addAfter, and remove methods take O(n) time, because we have to shift position objects to make room for the new position or to fill in the hole created by the removal of the old position (just as in the insert and remove methods based on index). All the other position-based methods take O(1) time.
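The recommended representation can be sketched in a few dozen lines. The code below is one possible reading of that description, not the book's solution: the names ArrayPositionalListDemo, ArrayPositionalList, Loc, and insertAt are hypothetical, the Position interface is redeclared locally so the sketch stands alone, and marking a removed position with index -1 is an assumed convention.

```java
import java.util.Arrays;

// Hypothetical sketch of an array-based positional list; names are illustrative.
class ArrayPositionalListDemo {
    interface Position<E> { E getElement(); }   // local stand-in for the chapter's interface

    static class ArrayPositionalList<E> {
        /** Each array cell holds a position object that knows its own current index. */
        private static class Loc<E> implements Position<E> {
            int index; E element;
            Loc(int i, E e) { index = i; element = e; }
            public E getElement() {
                if (index < 0) throw new IllegalStateException("Position no longer valid");
                return element;
            }
        }

        private Loc<E>[] cells = newArray(4);
        private int size = 0;

        @SuppressWarnings("unchecked")
        private static <T> Loc<T>[] newArray(int cap) { return (Loc<T>[]) new Loc[cap]; }

        private Loc<E> validate(Position<E> p) {
            if (!(p instanceof Loc)) throw new IllegalArgumentException("Invalid p");
            Loc<E> loc = (Loc<E>) p;
            if (loc.index < 0 || loc.index >= size || cells[loc.index] != loc)
                throw new IllegalArgumentException("p is not in this list");
            return loc;
        }

        /** Inserts e at index i, shifting later positions right and renumbering them: O(n). */
        private Position<E> insertAt(int i, E e) {
            if (size == cells.length) cells = Arrays.copyOf(cells, 2 * size);
            for (int k = size - 1; k >= i; k--) {
                cells[k + 1] = cells[k];
                cells[k + 1].index = k + 1;      // keep the stored index current
            }
            cells[i] = new Loc<>(i, e);
            size++;
            return cells[i];
        }

        Position<E> addFirst(E e) { return insertAt(0, e); }
        Position<E> addLast(E e)  { return insertAt(size, e); }
        Position<E> addBefore(Position<E> p, E e) { return insertAt(validate(p).index, e); }
        Position<E> addAfter(Position<E> p, E e)  { return insertAt(validate(p).index + 1, e); }

        Position<E> first() { return size == 0 ? null : cells[0]; }
        Position<E> before(Position<E> p) {
            int i = validate(p).index;
            return (i == 0) ? null : cells[i - 1];
        }
        Position<E> after(Position<E> p) {
            int i = validate(p).index;
            return (i == size - 1) ? null : cells[i + 1];
        }

        /** Removes p, shifting later positions left and renumbering them: O(n). */
        E remove(Position<E> p) {
            Loc<E> loc = validate(p);
            for (int k = loc.index; k < size - 1; k++) {
                cells[k] = cells[k + 1];
                cells[k].index = k;
            }
            cells[--size] = null;
            loc.index = -1;                      // assumed convention for a defunct position
            return loc.element;
        }
    }

    static String demo() {
        ArrayPositionalList<String> list = new ArrayPositionalList<>();
        Position<String> bwi = list.addLast("BWI");
        Position<String> sfo = list.addLast("SFO");
        list.addFirst("JFK");                    // shifts BWI and SFO and updates their indices
        list.addAfter(bwi, "PVD");               // JFK BWI PVD SFO, the contents of Figure 7.8
        StringBuilder sb = new StringBuilder();
        for (Position<String> p = list.first(); p != null; p = list.after(p))
            sb.append(p.getElement()).append(' ');
        list.remove(bwi);                        // JFK PVD SFO
        sb.append("| before SFO: ").append(list.before(sfo).getElement());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The positions bwi and sfo remain usable after elements are inserted in front of them because every shift rewrites the index stored inside the shifted position object. That renumbering loop is also why insertions and removals cost O(n) here, whereas before and after stay O(1).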
7.4 Iterators

An iterator is a software design pattern that abstracts the process of scanning through a sequence of elements, one element at a time. The underlying elements might be stored in a container class, streaming through a network, or generated by a series of computations.

In order to unify the treatment and syntax for iterating objects in a way that is independent from a specific organization, Java defines the java.util.Iterator interface with the following two methods:

  hasNext( ): Returns true if there is at least one additional element in the sequence, and false otherwise.
  next( ): Returns the next element in the sequence.

The interface uses Java’s generic framework, with the next( ) method returning a parameterized element type. For example, the Scanner class (described in Section 1.6) formally implements the Iterator<String> interface, with its next( ) method returning a String instance.

If the next( ) method of an iterator is called when no further elements are available, a NoSuchElementException is thrown. Of course, the hasNext( ) method can be used to detect that condition before calling next( ).

The combination of these two methods allows a general loop construct for processing elements of the iterator. For example, if we let variable, iter, denote an instance of the Iterator<String> type, then we can write the following:

    while (iter.hasNext()) {
      String value = iter.next();
      System.out.println(value);
    }

The java.util.Iterator interface contains a third method, which is optionally supported by some iterators:

  remove( ): Removes from the collection the element returned by the most recent call to next( ). Throws an IllegalStateException if next has not yet been called, or if remove was already called since the most recent call to next.

This method can be used to filter a collection of elements, for example to discard all negative numbers from a data set.
For the sake of simplicity, we will not implement the remove method for most data structures in this book, but we will give two tangible examples later in this section. If removal is not supported, an UnsupportedOperationException is conventionally thrown.

7.4.1 The Iterable Interface and Java’s For-Each Loop

A single iterator instance supports only one pass through a collection; calls to next can be made until all elements have been reported, but there is no way to “reset” the iterator back to the beginning of the sequence.

However, a data structure that wishes to allow repeated iterations can support a method that returns a new iterator, each time it is called. To provide greater standardization, Java defines another parameterized interface, named Iterable, that includes the following single method:

  iterator( ): Returns an iterator of the elements in the collection.

An instance of a typical collection class in Java, such as an ArrayList, is iterable (but not itself an iterator); it produces an iterator for its collection as the return value of the iterator( ) method. Each call to iterator( ) returns a new iterator instance, thereby allowing multiple (even simultaneous) traversals of a collection.

Java’s Iterable interface also plays a fundamental role in support of the “for-each” loop syntax (described in Section 1.5.2). The loop syntax,

    for (ElementType variable : collection) {
      loopBody                                  // may refer to "variable"
    }

is supported for any instance, collection, of an iterable class. ElementType must be the type of object returned by its iterator, and variable will take on element values within the loopBody. Essentially, this syntax is shorthand for the following:

    Iterator<ElementType> iter = collection.iterator();
    while (iter.hasNext()) {
      ElementType variable = iter.next();
      loopBody                                  // may refer to "variable"
    }

We note that the iterator’s remove method cannot be invoked when using the for-each loop syntax.
Instead, we must explicitly use an iterator. As an example, the following loop can be used to remove all negative numbers from an ArrayList of floating-point values.

    ArrayList<Double> data;    // populate with random numbers (not shown)
    Iterator<Double> walk = data.iterator();
    while (walk.hasNext())
      if (walk.next() < 0.0)
        walk.remove();

7.4.2 Implementing Iterators

There are two general styles for implementing iterators that differ in terms of what work is done when the iterator instance is first created, and what work is done each time the iterator is advanced with a call to next( ).

A snapshot iterator maintains its own private copy of the sequence of elements, which is constructed at the time the iterator object is created. It effectively records a “snapshot” of the sequence of elements at the time the iterator is created, and is therefore unaffected by any subsequent changes to the primary collection that may occur. Implementing snapshot iterators tends to be very easy, as it requires a simple traversal of the primary structure. The downside of this style of iterator is that it requires O(n) time and O(n) auxiliary space, upon construction, to copy and store a collection of n elements.

A lazy iterator is one that does not make an upfront copy, instead performing a piecewise traversal of the primary structure only when the next( ) method is called to request another element. The advantage of this style of iterator is that it can typically be implemented so the iterator requires only O(1) space and O(1) construction time. One downside (or feature) of a lazy iterator is that its behavior is affected if the primary structure is modified (by means other than by the iterator’s own remove method) before the iteration completes. Many of the iterators in Java’s libraries implement a “fail-fast” behavior that immediately invalidates such an iterator if its underlying collection is modified unexpectedly.
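The difference between the two styles can be observed directly with java.util.ArrayList. The sketch below is illustrative only: the class IteratorStylesDemo and the helper snapshotIterator are hypothetical names, and the snapshot is made by simply copying the list. The fail-fast outcome shown is what the standard library's ArrayList iterator does in this single-threaded scenario; the library documents that behavior as best-effort in general.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class contrasting snapshot and lazy iteration.
class IteratorStylesDemo {
    /** Snapshot style: copy the elements up front (O(n) time and space), then iterate the copy. */
    static <E> Iterator<E> snapshotIterator(List<E> source) {
        return new ArrayList<>(source).iterator();
    }

    static String demo() {
        List<String> guests = new ArrayList<>(Arrays.asList("ann", "bob"));
        Iterator<String> snapshot = snapshotIterator(guests);
        Iterator<String> lazy = guests.iterator();   // ArrayList's own lazy, fail-fast iterator
        guests.add("cat");                           // modify the primary collection mid-iteration

        StringBuilder sb = new StringBuilder();
        while (snapshot.hasNext())                   // the snapshot still reports only ann and bob
            sb.append(snapshot.next()).append(' ');
        try {
            lazy.next();
            sb.append("lazy iterator continued");
        } catch (ConcurrentModificationException ex) {
            sb.append("lazy iterator failed fast");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The snapshot pays for its independence with a full copy at construction time, while the lazy iterator costs nothing up front but is tied to the state of the live collection.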
We will demonstrate how to implement iterators for both the ArrayList and LinkedPositionalList classes as examples. We implement lazy iterators for both, including support for the remove operation (but without any fail-fast guarantee).

Iterations with the ArrayList class

We begin by discussing iteration for the ArrayList class. We will have it implement the Iterable interface. (In fact, that requirement is already part of Java’s List interface.) Therefore, we must add an iterator( ) method to that class definition, which returns an instance of an object that implements the Iterator interface. For this purpose, we define a new class, ArrayIterator, as a nonstatic nested class of ArrayList (i.e., an inner class, as described in Section 2.6). The advantage of having the iterator as an inner class is that it can access private fields (such as the array A) that are members of the containing list.

Our implementation is given in Code Fragment 7.13. The iterator( ) method of ArrayList returns a new instance of the inner ArrayIterator class. Each iterator maintains a field j that represents the index of the next element to be returned. It is initialized to 0, and when j reaches the size of the list, there are no more elements to return. In order to support element removal through the iterator, we also maintain a boolean variable that denotes whether a call to remove is currently permissible.

      //---------------- nested ArrayIterator class ----------------
      /**
       * A (nonstatic) inner class. Note well that each instance contains an implicit
       * reference to the containing list, allowing it to access the list's members.
       */
      private class ArrayIterator implements Iterator<E> {
        private int j = 0;                    // index of the next element to report
        private boolean removable = false;    // can remove be called at this time?
        /**
         * Tests whether the iterator has a next object.
         * @return true if there are further objects, false otherwise
         */
        public boolean hasNext() { return j < size; }    // size is field of outer instance

        /**
         * Returns the next object in the iterator.
         *
         * @return next object
         * @throws NoSuchElementException if there are no further elements
         */
        public E next() throws NoSuchElementException {
          if (j == size) throw new NoSuchElementException("No next element");
          removable = true;     // this element can subsequently be removed
          return data[j++];     // post-increment j, so it is ready for future call to next
        }

        /**
         * Removes the element returned by most recent call to next.
         * @throws IllegalStateException if next has not yet been called
         * @throws IllegalStateException if remove was already called since recent next
         */
        public void remove() throws IllegalStateException {
          if (!removable) throw new IllegalStateException("nothing to remove");
          ArrayList.this.remove(j - 1);    // that was the last one returned
          j--;                             // next element has shifted one cell to the left
          removable = false;               // do not allow remove again until next is called
        }
      } //------------ end of nested ArrayIterator class ------------

      /** Returns an iterator of the elements stored in the list. */
      public Iterator<E> iterator() {
        return new ArrayIterator();        // create a new instance of the inner class
      }

Code Fragment 7.13: Code providing support for ArrayList iterators. (This should be nested within the ArrayList class definition of Code Fragments 7.2 and 7.3.)

Iterations with the LinkedPositionalList class

To support the concept of iteration with the LinkedPositionalList class, a first question is whether to support iteration of the elements of the list or the positions of the list. If we allow a user to iterate through all positions of the list, those positions could be used to access the underlying elements, so support for position iteration is more general.
However, it is more standard for a container class to support iteration of the core elements, by default, so that the for-each loop syntax could be used to write code such as the following,

    for (String guest : waitlist)

assuming that variable waitlist has type LinkedPositionalList<String>.

For maximum convenience, we will support both forms of iteration. We will have the standard iterator( ) method return an iterator of the elements of the list, so that our list class formally implements the Iterable interface for the declared element type.

For those wishing to iterate through the positions of a list, we will provide a new method, positions( ). At first glance, it would seem a natural choice for such a method to return an Iterator. However, we prefer for the return type of that method to be an instance that is Iterable (and hence, has its own iterator( ) method that returns an iterator of positions). Our reason for the extra layer of complexity is that we wish for users of our class to be able to use a for-each loop with a simple syntax such as the following:

    for (Position<String> p : waitlist.positions())

For this syntax to be legal, the return type of positions( ) must be Iterable.

Code Fragment 7.14 presents our new support for the iteration of positions and elements of a LinkedPositionalList. We define three new inner classes. The first of these is PositionIterator, providing the core functionality of our list iterations. Whereas the array list iterator maintained the index of the next element to be returned as a field, this class maintains the position of the next element to be returned (as well as the position of the most recently returned element, to support removal).

To support our goal of the positions( ) method returning an Iterable object, we define a trivial PositionIterable inner class, which simply constructs and returns a new PositionIterator object each time its iterator( ) method is called.
The positions() method of the top-level class returns a new PositionIterable instance. Our framework relies heavily on these being inner classes, not static nested classes.

Finally, we wish to have the top-level iterator() method return an iterator of elements (not positions). Rather than reinvent the wheel, we trivially adapt the PositionIterator class to define a new ElementIterator class, which lazily manages a position iterator instance, while returning the element stored at each position when next() is called.

    //---------------- nested PositionIterator class ----------------
    private class PositionIterator implements Iterator<Position<E>> {
      private Position<E> cursor = first();   // position of the next element to report
      private Position<E> recent = null;      // position of last reported element

      /** Tests whether the iterator has a next object. */
      public boolean hasNext() { return (cursor != null); }

      /** Returns the next position in the iterator. */
      public Position<E> next() throws NoSuchElementException {
        if (cursor == null) throw new NoSuchElementException("nothing left");
        recent = cursor;           // element at this position might later be removed
        cursor = after(cursor);
        return recent;
      }

      /** Removes the element returned by most recent call to next. */
      public void remove() throws IllegalStateException {
        if (recent == null) throw new IllegalStateException("nothing to remove");
        LinkedPositionalList.this.remove(recent);   // remove from outer list
        recent = null;             // do not allow remove again until next is called
      }
    } //------------ end of nested PositionIterator class ------------

    //---------------- nested PositionIterable class ----------------
    private class PositionIterable implements Iterable<Position<E>> {
      public Iterator<Position<E>> iterator() { return new PositionIterator(); }
    } //------------ end of nested PositionIterable class ------------

    /** Returns an iterable representation of the list's positions. */
    public Iterable<Position<E>> positions() {
      return new PositionIterable();   // create a new instance of the inner class
    }

    //---------------- nested ElementIterator class ----------------
    /* This class adapts the iteration produced by positions() to return elements. */
    private class ElementIterator implements Iterator<E> {
      Iterator<Position<E>> posIterator = new PositionIterator();
      public boolean hasNext() { return posIterator.hasNext(); }
      public E next() { return posIterator.next().getElement(); }   // return element!
      public void remove() { posIterator.remove(); }
    }

    /** Returns an iterator of the elements stored in the list. */
    public Iterator<E> iterator() { return new ElementIterator(); }

Code Fragment 7.14: Support for providing iterations of positions and elements of a LinkedPositionalList. (This should be nested within the LinkedPositionalList class definition of Code Fragments 7.9–7.12.)

7.5 The Java Collections Framework

Java provides many data structure interfaces and classes, which together form the Java Collections Framework. This framework, which is part of the java.util package, includes versions of several of the data structures discussed in this book, some of which we have already discussed and others of which we will discuss later in this book.
The root interface in the Java Collections Framework is named Collection. This is a general interface for any data structure, such as a list, that represents a collection of elements. The Collection interface includes many methods, including some we have already seen (e.g., size(), isEmpty(), iterator()). It is a superinterface for other interfaces in the Java Collections Framework that can hold elements, including the java.util interfaces Deque, List, and Queue, and other subinterfaces discussed later in this book, including Set (Section 10.5.1) and Map (Section 10.1).

The Java Collections Framework also includes concrete classes implementing various interfaces with a combination of properties and underlying representations. We summarize but a few of those classes in Table 7.3. For each, we denote which of the Queue, Deque, or List interfaces are implemented (possibly several). We also discuss several behavioral properties. Some classes enforce, or allow, a fixed capacity limit. Robust classes provide support for concurrency, allowing multiple processes to share use of a data structure in a thread-safe manner. If the structure is designated as blocking, a call to retrieve an element from an empty collection waits until some other process inserts an element. Similarly, a call to insert into a full blocking structure must wait until room becomes available.

| Class                 | Queue | Deque | List | Capacity Limit | Thread-Safe | Blocking | Array | Linked List |
|-----------------------|-------|-------|------|----------------|-------------|----------|-------|-------------|
| ArrayBlockingQueue    | X     |       |      | X              | X           | X        | X     |             |
| LinkedBlockingQueue   | X     |       |      | X              | X           | X        |       | X           |
| ConcurrentLinkedQueue | X     |       |      |                | X           |          |       | X           |
| ArrayDeque            | X     | X     |      |                |             |          | X     |             |
| LinkedBlockingDeque   | X     | X     |      | X              | X           | X        |       | X           |
| ConcurrentLinkedDeque | X     | X     |      |                | X           |          |       | X           |
| ArrayList             |       |       | X    |                |             |          | X     |             |
| LinkedList            | X     | X     | X    |                |             |          |       | X           |

Table 7.3: Several classes in the Java Collections Framework. Queue, Deque, and List are the interfaces implemented; Capacity Limit, Thread-Safe, and Blocking are properties; Array and Linked List denote the underlying storage.

7.5.1 List Iterators in Java

The java.util.LinkedList class does not expose a position concept to users in its API, as we do in our positional list ADT.
Instead, the preferred way to access and update a LinkedList object in Java, without using indices, is to use a ListIterator that is returned by the list's listIterator() method. Such an iterator provides forward and backward traversal methods as well as local update methods. It views its current position as being before the first element, between two elements, or after the last element. That is, it uses a list cursor, much like a screen cursor is viewed as being located between two characters on a screen. Specifically, the java.util.ListIterator interface includes the following methods:

    add(e): Adds the element e at the current position of the iterator.
    hasNext(): Returns true if there is an element after the current position of the iterator.
    hasPrevious(): Returns true if there is an element before the current position of the iterator.
    previous(): Returns the element e before the current position and sets the current position to be before e.
    next(): Returns the element e after the current position and sets the current position to be after e.
    nextIndex(): Returns the index of the next element.
    previousIndex(): Returns the index of the previous element.
    remove(): Removes the element returned by the most recent next or previous operation.
    set(e): Replaces the element returned by the most recent call to the next or previous operation with e.

It is risky to use multiple iterators over the same list while modifying its contents. If insertions, deletions, or replacements are required at multiple "places" in a list, it is safer to use positions to specify these locations. But the java.util.LinkedList class does not expose its position objects to the user. So, to avoid the risks of modifying a list that has created multiple iterators, the iterators have a "fail-fast" feature that invalidates such an iterator if its underlying collection is modified unexpectedly.
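The cursor-based methods and the fail-fast behavior described above can be tried out with a short, self-contained sketch. The class name ListIteratorDemo and its two helper methods are hypothetical names chosen for illustration; only the java.util calls themselves come from the standard library.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class; not part of the chapter's code.
class ListIteratorDemo {

    /** Edits a list in place using only cursor-relative operations. */
    static List<String> editWithCursor() {
        List<String> list = new LinkedList<>(List.of("A", "B", "D"));
        ListIterator<String> it = list.listIterator(); // cursor before "A"
        it.next();       // returns "A"; cursor is now between "A" and "B"
        it.next();       // returns "B"; cursor is now between "B" and "D"
        it.add("C");     // inserts at the cursor: A B C D
        it.next();       // returns "D"; cursor is after the last element
        it.set("E");     // replaces the element just returned: A B C E
        it.previous();   // returns "E"; cursor is between "C" and "E"
        it.previous();   // returns "C"; cursor is between "B" and "C"
        it.remove();     // removes "C", the element just returned: A B E
        return list;
    }

    /** Returns true if a second iterator fails fast after the first one removes an element. */
    static boolean staleIteratorFails() {
        List<Integer> list = new LinkedList<>(List.of(1, 2, 3));
        ListIterator<Integer> first = list.listIterator();
        ListIterator<Integer> second = list.listIterator();
        first.next();
        first.remove();          // structural change made through 'first'
        try {
            second.next();       // 'second' notices the list changed behind its back
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(editWithCursor());      // prints [A, B, E]
        System.out.println(staleIteratorFails());  // prints true
    }
}
```

The iterator that performs the modification stays valid; only the other iterators over the same list are invalidated.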
For example, if a java.util.LinkedList object L has returned five different iterators and one of them modifies L, a ConcurrentModificationException is thrown if any of the other four is subsequently used. That is, Java allows many list iterators to be traversing a linked list L at the same time, but if one of them structurally modifies L (using its add or remove method), then all the other iterators for L become invalid. (In java.util, replacing an element with set is not counted as a structural modification, so it does not invalidate other iterators.) Likewise, if L is structurally modified by one of its own update methods, then all existing iterators for L immediately become invalid.

7.5.2 Comparison to Our Positional List ADT

Java provides functionality similar to our array list and positional list ADTs in the java.util.List interface, which is implemented with an array in java.util.ArrayList and with a linked list in java.util.LinkedList.

Moreover, Java uses iterators to achieve a functionality similar to what our positional list ADT derives from positions. Table 7.4 shows corresponding methods between our (array and positional) list ADTs and the java.util interfaces List and ListIterator, with notes about their implementations in the java.util classes ArrayList and LinkedList.
| Positional List ADT Method | java.util.List Method | ListIterator Method | Notes                                           |
|----------------------------|-----------------------|---------------------|-------------------------------------------------|
| size()                     | size()                |                     | O(1) time                                       |
| isEmpty()                  | isEmpty()             |                     | O(1) time                                       |
|                            | get(i)                |                     | A is O(1), L is O(min{i, n − i})                |
| first()                    | listIterator()        |                     | first element is next                           |
| last()                     | listIterator(size())  |                     | last element is previous                        |
| before(p)                  |                       | previous()          | O(1) time                                       |
| after(p)                   |                       | next()              | O(1) time                                       |
| set(p, e)                  |                       | set(e)              | O(1) time                                       |
|                            | set(i, e)             |                     | A is O(1), L is O(min{i, n − i})                |
|                            | add(i, e)             |                     | O(n) time                                       |
| addFirst(e)                | add(0, e)             |                     | A is O(n), L is O(1)                            |
| addFirst(e)                | addFirst(e)           |                     | only exists in L, O(1)                          |
| addLast(e)                 | add(e)                |                     | O(1) time                                       |
| addLast(e)                 | addLast(e)            |                     | only exists in L, O(1)                          |
| addAfter(p, e)             |                       | add(e)              | insertion is at cursor; A is O(n), L is O(1)    |
| addBefore(p, e)            |                       | add(e)              | insertion is at cursor; A is O(n), L is O(1)    |
| remove(p)                  |                       | remove()            | deletion is at cursor; A is O(n), L is O(1)     |
|                            | remove(i)             |                     | A is O(1), L is O(min{i, n − i})                |

Table 7.4: Correspondences between methods in our positional list ADT and the java.util interfaces List and ListIterator. We use A and L as abbreviations for java.util.ArrayList and java.util.LinkedList (or their running times).

7.5.3 List-Based Algorithms in the Java Collections Framework

In addition to the classes that are provided in the Java Collections Framework, there are a number of simple algorithms that it provides as well. These algorithms are implemented as static methods in the java.util.Collections class (not to be confused with the java.util.Collection interface) and they include the following methods:

    copy(Ldest, Lsrc): Copies all elements of the Lsrc list into corresponding indices of the Ldest list.
    disjoint(C, D): Returns a boolean value indicating whether the collections C and D are disjoint.
    fill(L, e): Replaces each element of the list L with element e.
    frequency(C, e): Returns the number of elements in the collection C that are equal to e.
    max(C): Returns the maximum element in the collection C, based on the natural ordering of its elements.
    min(C): Returns the minimum element in the collection C, based on the natural ordering of its elements.
    replaceAll(L, e, f): Replaces each element in L that is equal to e with element f.
    reverse(L): Reverses the ordering of elements in the list L.
    rotate(L, d): Rotates the elements in the list L by the distance d (which can be negative), in a circular fashion.
    shuffle(L): Pseudorandomly permutes the ordering of the elements in the list L.
    sort(L): Sorts the list L, using the natural ordering of its elements.
    swap(L, i, j): Swaps the elements at indices i and j of list L.

Converting Lists into Arrays

Lists are a beautiful concept and they can be applied in a number of different contexts, but there are some instances where it would be useful if we could treat a list like an array. Fortunately, the java.util.Collection interface includes the following helpful methods for generating an array that has the same elements as the given collection:

    toArray(): Returns an array of elements of type Object containing all the elements in this collection.
    toArray(A): Returns an array of elements of the same element type as A containing all the elements in this collection.

If the collection is a list, then the returned array will have its elements stored in the same order as that of the original list. Thus, if we have a useful array-based method that we want to use on a list or other type of collection, then we can do so by simply using that collection's toArray() method to produce an array representation of that collection.

Converting Arrays into Lists

In a similar vein, it is often useful to be able to convert an array into an equivalent list. Fortunately, the java.util.Arrays class includes the following method:

    asList(A): Returns a list representation of the array A, with the same element type as the elements of A.

The list returned by this method uses the array A as its internal representation for the list.
So this list is guaranteed to be an array-based list and any changes made to it will automatically be reflected in A. Because of these types of side effects, use of the asList method should always be done with caution, so as to avoid unintended consequences. But, used with care, this method can often save us a lot of work. For instance, the following code fragment could be used to randomly shuffle an array of Integer objects, arr:

    Integer[] arr = {1, 2, 3, 4, 5, 6, 7, 8};     // allowed by autoboxing
    List<Integer> listArr = Arrays.asList(arr);
    Collections.shuffle(listArr);                 // this has side effect of shuffling arr

It is worth noting that the array A sent to the asList method should be a reference type (hence, our use of Integer rather than int in the above example). This is because the List interface is generic, and requires that the element type be an object.

7.6 Sorting a Positional List

In Section 3.1.2, we introduced the insertion-sort algorithm in the context of an array-based sequence. In this section, we develop an implementation that operates on a PositionalList, relying on the same high-level algorithm in which each element is placed relative to a growing collection of previously sorted elements.

We maintain a variable named marker that represents the rightmost position of the currently sorted portion of a list. During each pass, we consider the position just past the marker as the pivot and consider where the pivot's element belongs relative to the sorted portion; we use another variable, named walk, to move leftward from the marker, as long as there remains a preceding element with value larger than the pivot's. A typical configuration of these variables is diagrammed in Figure 7.9. A Java implementation of this strategy is given in Code Fragment 7.15.

[Figure 7.9 shows the list 15, 22, 25, 29, 36, 23, 53, 11, 42, with walk at 25, marker at 36, and pivot at 23.]

Figure 7.9: Overview of one step of our insertion-sort algorithm. The shaded elements, those up to and including marker, have already been sorted. In this step, the pivot's element should be relocated immediately before the walk position.

    /** Insertion-sort of a positional list of integers into nondecreasing order */
    public static void insertionSort(PositionalList<Integer> list) {
      Position<Integer> marker = list.first();     // last position known to be sorted
      while (marker != list.last()) {
        Position<Integer> pivot = list.after(marker);
        int value = pivot.getElement();            // number to be placed
        if (value > marker.getElement())           // pivot is already sorted
          marker = pivot;
        else {                                     // must relocate pivot
          Position<Integer> walk = marker;         // find leftmost item greater than value
          while (walk != list.first() && list.before(walk).getElement() > value)
            walk = list.before(walk);
          list.remove(pivot);                      // remove pivot entry and
          list.addBefore(walk, value);             // reinsert value in front of walk
        }
      }
    }

Code Fragment 7.15: Java code for performing insertion-sort on a positional list.

7.7 Case Study: Maintaining Access Frequencies

The positional list ADT is useful in a number of settings. For example, a program that simulates a game of cards could model each person's hand as a positional list (Exercise P-7.60). Since most people keep cards of the same suit together, inserting and removing cards from a person's hand could be implemented using the methods of the positional list ADT, with the positions being determined by a natural order of the suits. Likewise, a simple text editor embeds the notion of positional insertion and deletion, since such editors typically perform all updates relative to a cursor, which represents the current position in the list of characters of text being edited.

In this section, we will consider maintaining a collection of elements while keeping track of the number of times each element is accessed.
Keeping such access counts allows us to know which elements are among the most popular. Examples of such scenarios include a Web browser that keeps track of a user's most accessed pages, or a music collection that maintains a list of the most frequently played songs for a user. We will model this with a new favorites list ADT that supports the size and isEmpty methods as well as the following:

    access(e): Accesses the element e, adding it to the favorites list if it is not already present, and increments its access count.
    remove(e): Removes element e from the favorites list, if present.
    getFavorites(k): Returns an iterable collection of the k most accessed elements.

7.7.1 Using a Sorted List

Our first approach for managing a list of favorites is to store elements in a linked list, keeping them in nonincreasing order of access counts. We access or remove an element by searching the list from the most frequently accessed to the least frequently accessed. Reporting the k most accessed elements is easy, as they are the first k entries of the list.

To maintain the invariant that elements are stored in nonincreasing order of access counts, we must consider how a single access operation may affect the order. The accessed element's count increases by one, and so it may become larger than one or more of its preceding neighbors in the list, thereby violating the invariant.

Fortunately, we can reestablish the sorted invariant using a technique similar to a single pass of the insertion-sort algorithm, introduced in the previous section. We can perform a backward traversal of the list, starting at the position of the element whose access count has increased, until we locate a valid position after which the element can be relocated.

Using the Composition Pattern

We wish to implement a favorites list by making use of a PositionalList for storage.
If elements of the positional list were simply elements of the favorites list, we would be challenged to maintain access counts and to keep the proper count with the associated element as the contents of the list are reordered. We use a general object-oriented design pattern, the composition pattern, in which we define a single object that is composed of two or more other objects. (See, for example, Section 2.5.2.)

Specifically, we define a nonpublic nested class, Item, that stores the element and its access count as a single instance. We then maintain our favorites list as a PositionalList of Item instances, so that the access count for a user's element is embedded alongside it in our representation. (An Item is never exposed to a user of a FavoritesList.)

    import java.util.Iterator;

    /** Maintains a list of elements ordered according to access frequency. */
    public class FavoritesList<E> {
      // ---------------- nested Item class ----------------
      protected static class Item<E> {
        private E value;
        private int count = 0;
        /** Constructs new item with initial count of zero. */
        public Item(E val) { value = val; }
        public int getCount() { return count; }
        public E getValue() { return value; }
        public void increment() { count++; }
      } //----------- end of nested Item class -----------

      PositionalList<Item<E>> list = new LinkedPositionalList<>();   // list of Items
      public FavoritesList() { }    // constructs initially empty favorites list

      // nonpublic utilities
      /** Provides shorthand notation to retrieve user's element stored at Position p. */
      protected E value(Position<Item<E>> p) { return p.getElement().getValue(); }

      /** Provides shorthand notation to retrieve count of item stored at Position p. */
      protected int count(Position<Item<E>> p) { return p.getElement().getCount(); }

      /** Returns Position having element equal to e (or null if not found). */
      protected Position<Item<E>> findPosition(E e) {
        Position<Item<E>> walk = list.first();
        while (walk != null && !e.equals(value(walk)))
          walk = list.after(walk);
        return walk;
      }

Code Fragment 7.16: Class FavoritesList. (Continues in Code Fragment 7.17.)

      /** Moves item at Position p earlier in the list based on access count. */
      protected void moveUp(Position<Item<E>> p) {
        int cnt = count(p);                    // revised count of accessed item
        Position<Item<E>> walk = p;
        while (walk != list.first() && count(list.before(walk)) < cnt)
          walk = list.before(walk);            // found smaller count ahead of item
        if (walk != p)
          list.addBefore(walk, list.remove(p));   // remove/reinsert item
      }

      // public methods
      /** Returns the number of items in the favorites list. */
      public int size() { return list.size(); }

      /** Returns true if the favorites list is empty. */
      public boolean isEmpty() { return list.isEmpty(); }

      /** Accesses element e (possibly new), increasing its access count. */
      public void access(E e) {
        Position<Item<E>> p = findPosition(e);   // try to locate existing element
        if (p == null)
          p = list.addLast(new Item<E>(e));      // if new, place at end
        p.getElement().increment();              // always increment count
        moveUp(p);                               // consider moving forward
      }

      /** Removes element equal to e from the list of favorites (if found). */
      public void remove(E e) {
        Position<Item<E>> p = findPosition(e);   // try to locate existing element
        if (p != null)
          list.remove(p);
      }

      /** Returns an iterable collection of the k most frequently accessed elements. */
      public Iterable<E> getFavorites(int k) throws IllegalArgumentException {
        if (k < 0 || k > size())
          throw new IllegalArgumentException("Invalid k");
        PositionalList<E> result = new LinkedPositionalList<>();
        Iterator<Item<E>> iter = list.iterator();
        for (int j = 0; j < k; j++)
          result.addLast(iter.next().getValue());
        return result;
      }
    }

Code Fragment 7.17: Class FavoritesList. (Continued from Code Fragment 7.16.)

7.7.2 Using a List with the Move-to-Front Heuristic

The previous implementation of a favorites list performs the access(e) method in time proportional to the index of e in the favorites list. That is, if e is the kth most popular element in the favorites list, then accessing it takes O(k) time. In many real-life access sequences (e.g., Web pages visited by a user), once an element is accessed it is more likely to be accessed again in the near future. Such scenarios are said to possess locality of reference.

A heuristic, or rule of thumb, that attempts to take advantage of the locality of reference that is present in an access sequence is the move-to-front heuristic. To apply this heuristic, each time we access an element we move it all the way to the front of the list. Our hope, of course, is that this element will be accessed again in the near future. Consider, for example, a scenario in which we have n elements and the following series of n^2 accesses:
• element 1 is accessed n times.
• element 2 is accessed n times.
• ···
• element n is accessed n times.

If we store the elements sorted by their access counts, inserting each element the first time it is accessed, then
• each access to element 1 runs in O(1) time.
• each access to element 2 runs in O(2) time.
• ···
• each access to element n runs in O(n) time.
Thus, the total time for performing the series of accesses is proportional to

    n + 2n + 3n + ··· + n·n = n(1 + 2 + 3 + ··· + n) = n · n(n + 1)/2,

which is O(n^3).

On the other hand, if we use the move-to-front heuristic, inserting each element the first time it is accessed, then
• each subsequent access to element 1 takes O(1) time.
• each subsequent access to element 2 takes O(1) time.
• ···
• each subsequent access to element n runs in O(1) time.

So the running time for performing all the accesses in this case is O(n^2). Thus, the move-to-front implementation has faster access times for this scenario. Still, the move-to-front approach is just a heuristic, for there are access sequences where using the move-to-front approach is slower than simply keeping the favorites list ordered by access counts.

The Trade-Offs with the Move-to-Front Heuristic

If we no longer maintain the elements of the favorites list ordered by their access counts, when we are asked to find the k most accessed elements, we need to search for them. We will implement the getFavorites(k) method as follows:

1. We copy all entries of our favorites list into another list, named temp.
2. We scan the temp list k times. In each scan, we find the entry with the largest access count, remove this entry from temp, and add it to the results.

This implementation of method getFavorites(k) takes O(kn) time. Thus, when k is a constant, method getFavorites(k) runs in O(n) time. This occurs, for example, when we want to get the "top ten" list. However, if k is proportional to n, then the method getFavorites(k) runs in O(n^2) time. This occurs, for example, when we want a "top 25%" list.

In Chapter 9 we will introduce a data structure that will allow us to implement getFavorites in O(n + k log n) time (see Exercise P-9.51), and more advanced techniques could be used to perform getFavorites in O(n + k log k) time.
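As a rough illustration of the heap-based idea mentioned above, the sketch below uses java.util.PriorityQueue as a stand-in for the Chapter 9 structure. The class TopKSketch, its nested Counted class, and the topK method are hypothetical names, not the book's code. The O(n + k log n) bound assumes the queue is built from the collection in linear time, which the class documentation does not explicitly promise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; names are illustrative and not taken from the chapter.
class TopKSketch {

    /** Stand-in for the chapter's Item class: a value together with its access count. */
    static class Counted implements Comparable<Counted> {
        final String value;
        final int count;

        Counted(String value, int count) {
            this.value = value;
            this.count = count;
        }

        /** Orders larger counts first, so the head of the queue is the most accessed entry. */
        @Override
        public int compareTo(Counted other) {
            return Integer.compare(other.count, this.count);
        }
    }

    /** Returns the values of the k entries with the largest counts, most accessed first. */
    static List<String> topK(List<Counted> entries, int k) {
        if (k < 0 || k > entries.size())
            throw new IllegalArgumentException("Invalid k");
        // Build a heap of all n entries, ordered by compareTo above.
        PriorityQueue<Counted> heap = new PriorityQueue<>(entries);
        List<String> result = new ArrayList<>();
        for (int j = 0; j < k; j++)
            result.add(heap.poll().value);   // each removal costs O(log n)
        return result;
    }

    public static void main(String[] args) {
        List<Counted> entries = List.of(
            new Counted("a", 3), new Counted("b", 7),
            new Counted("c", 5), new Counted("d", 1));
        System.out.println(topK(entries, 2));   // prints [b, c]
    }
}
```

The original list is left untouched, since the queue works on its own copy of the entries.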
We could easily achieve O(n log n) time if we use a standard sorting algorithm to reorder the temporary list before reporting the top k (see Chapter 12); this approach would be preferred to the original in the case that k is Ω(log n). (Recall the big-Omega notation introduced in Section 4.3.1 to give an asymptotic lower bound on the running time of an algorithm.) There is a specialized sorting algorithm (see Section 12.3.2) that can take advantage of the fact that access counts are integers in order to achieve O(n) time for getFavorites, for any value of k.

Implementing the Move-to-Front Heuristic in Java

We give an implementation of a favorites list using the move-to-front heuristic in Code Fragment 7.18. The new FavoritesListMTF class inherits most of its functionality from the original FavoritesList as a base class.

By our original design, the access method of the original class relies on a protected utility named moveUp to enact the potential shifting of an element forward in the list, after its access count has been incremented. Therefore, we implement the move-to-front heuristic by simply overriding the moveUp method so that each accessed element is moved directly to the front of the list (if not already there). This action is easily implemented by means of the positional list ADT.

The more complex portion of our FavoritesListMTF class is the new definition for the getFavorites method. We rely on the first of the approaches outlined above, inserting copies of the items into a temporary list and then repeatedly finding, reporting, and removing an element that has the largest access count of those remaining.

    /** Maintains a list of elements ordered with move-to-front heuristic. */
    public class FavoritesListMTF<E> extends FavoritesList<E> {

      /** Moves accessed item at Position p to the front of the list. */
      @Override
      protected void moveUp(Position<Item<E>> p) {
        if (p != list.first())
          list.addFirst(list.remove(p));   // remove/reinsert item
      }

      /** Returns an iterable collection of the k most frequently accessed elements. */
      @Override
      public Iterable<E> getFavorites(int k) throws IllegalArgumentException {
        if (k < 0 || k > size())
          throw new IllegalArgumentException("Invalid k");

        // we begin by making a copy of the original list
        PositionalList<Item<E>> temp = new LinkedPositionalList<>();
        for (Item<E> item : list)
          temp.addLast(item);

        // we repeatedly find, report, and remove element with largest count
        PositionalList<E> result = new LinkedPositionalList<>();
        for (int j = 0; j < k; j++) {
          Position<Item<E>> highPos = temp.first();
          Position<Item<E>> walk = temp.after(highPos);
          while (walk != null) {
            if (count(walk) > count(highPos))
              highPos = walk;
            walk = temp.after(walk);
          }
          // we have now found element with highest count
          result.addLast(value(highPos));
          temp.remove(highPos);
        }
        return result;
      }
    }

Code Fragment 7.18: Class FavoritesListMTF implementing the move-to-front heuristic. This class extends FavoritesList (Code Fragments 7.16 and 7.17) and overrides methods moveUp and getFavorites.
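The chapter's FavoritesListMTF depends on the book's PositionalList and LinkedPositionalList classes, which are defined in earlier code fragments. For readers who want to experiment without them, here is a minimal standalone sketch of the same move-to-front idea built on java.util.LinkedList. The class MoveToFrontSketch and its order() helper are hypothetical names, and this is a simplified approximation rather than the book's implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical standalone sketch of a move-to-front favorites list.
class MoveToFrontSketch<E> {

    /** Composition pattern: pairs a user's element with its access count. */
    private static final class Item<T> {
        final T value;
        int count = 0;
        Item(T value) { this.value = value; }
    }

    private final LinkedList<Item<E>> items = new LinkedList<>();

    /** Accesses e (adding it if new), increments its count, and moves it to the front. */
    void access(E e) {
        Item<E> found = null;
        Iterator<Item<E>> it = items.iterator();
        while (it.hasNext()) {
            Item<E> item = it.next();
            if (item.value.equals(e)) {
                found = item;
                it.remove();        // unlink from its current place
                break;
            }
        }
        if (found == null)
            found = new Item<>(e);  // first access of this element
        found.count++;
        items.addFirst(found);      // move-to-front
    }

    /** Returns the k most accessed elements by scanning a temporary copy k times: O(kn). */
    List<E> getFavorites(int k) {
        if (k < 0 || k > items.size())
            throw new IllegalArgumentException("Invalid k");
        List<Item<E>> temp = new ArrayList<>(items);   // copy, so the list itself is unchanged
        List<E> result = new ArrayList<>();
        for (int j = 0; j < k; j++) {
            int high = 0;                              // index of largest count seen so far
            for (int i = 1; i < temp.size(); i++)
                if (temp.get(i).count > temp.get(high).count)
                    high = i;
            result.add(temp.remove(high).value);       // report it and drop it from temp
        }
        return result;
    }

    /** Returns the elements in their current list order (most recently accessed first). */
    List<E> order() {
        List<E> values = new ArrayList<>();
        for (Item<E> item : items)
            values.add(item.value);
        return values;
    }

    public static void main(String[] args) {
        MoveToFrontSketch<String> favs = new MoveToFrontSketch<>();
        for (String s : new String[] {"a", "a", "a", "b", "b", "c"})
            favs.access(s);
        System.out.println(favs.order());           // prints [c, b, a]
        System.out.println(favs.getFavorites(2));   // prints [a, b]
    }
}
```

The example run shows the trade-off discussed in the text: the list order reflects recency of access, so finding the most accessed elements requires a separate search.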
Tags: data structure JAVA program ADT inner class composition pattern

Explanation & Answer

Hi, here is the final document. In case you have any rectifications that you would like made on the paper, feel free to reach out. I am here to help. Thanks buddy 😎

Running head: Data Structures in Java

Data Structures in Java
Name
Institutional Affiliation
Course
Date of Submission

An Abstract Data Type (ADT) is a data type specified by a mathematical model, with its
behavior defined by a set of values and the operations permitted on them. A list is an ADT
that represents a linearly ordered sequence of items. Indices are used to identify the
location of an element in a list. In Java, the List interface declares a number of methods
for working with lists, for example the size() method, which returns the number of items in
a list. While a list is an Interface, an
ArrayList is defined a...