Advanced Programming Techniques (2011, 2013)
2. Array Lists
25
An array list (or vector as it is sometimes called) is an abstract data structure that holds a list of values in an array. In C++, the standard template library (STL) vector can hold pointers and primitive variables, such as intand double. In Java the ArrayList and Vector collections can hold only references to Objects, not primitive variables. However, in Java both StringBuffer and StringBuilder are essentially array lists that contain characters. Array lists normally provide the following very useful methods:
· add - Appends an element at the end of the list.
· insert - Inserts an element at the beginning or in the middle of the list.
· size - Returns how many elements are in the list.
· get - Returns the element at a specified location.
· find - Returns the location of a specified element.
· remove - Removes an element from the list.
Internally, array lists have an array with the elements stored at the beginning of the array and extra space at the end of the array. The entire length of the array is known as the array list’s capacity, and the actual number of elements stored in the array is known as the array list’s size. While adding elements, the extra space at the end of the array gradually fills up, meaning the size becomes equal to the capacity. If the array is full when a programs adds another element to it, the code for the array list will cause the computer to
1. Create a new larger array
2. Copy the contents of the old array to the new larger one
3. Discard the old array
4. Add the element to the new larger array
Initial Capacity and Quantum
Whenever a programmer implements an array list, two important questions arise.
1. How big should the initial array be? This is called the initial capacity.
2. When the array needs to grow, by how much should its capacity grow? This is known as the quantum and can be constant, such as 20, or variable, for example double the capacity.
Figure 6 shows an array list with an initial capacity of eight elements that was full with the characters “Wonderfu”. The program called the add method to add the character “l”, so the computer increased the array list’s capacity by six elements and then added the “l” to the new larger array within the array list.
26
old array |
|||||||||||||
'W' |
'o' |
'n' |
'd' |
'e' |
'r' |
'f' |
'u' |
||||||
[0] |
[1] |
[2] |
[3] |
[4] |
[5] |
[6] |
[7] |
||||||
← initial capacity (8) → |
|||||||||||||
← old size (8) → |
|||||||||||||
new array |
|||||||||||||
'W' |
'o' |
'n' |
'd' |
'e' |
'r' |
'f' |
'u' |
'l' |
|||||
[0] |
[1] |
[2] |
[3] |
[4] |
[5] |
[6] |
[7] |
[8] |
[9] |
[10] |
[11] |
[12] |
[13] |
← initial capacity (8) → |
← quantum (6) → |
||||||||||||
← new capacity (14) → |
|||||||||||||
← new size (9) → |
|||||||||||||
Figure 6: A representation of an array list that was full, and the computer increased its capacity and added another element.
It is difficult to choose a good constant quantum, and a constant quantum is usually inefficient in either compute time or memory space. If a programmer chooses a small constant quantum, such as 8, and a program adds thousands of elements several at a time to the array list, the computer will waste a lot of time repeatedly creating a new larger array and copying the contents of the old array to the new one. If the programmer chooses a large constant quantum, and a program adds only a few elements to the array list, the array list will contain a lot of empty, wasted space at the end.
A variable quantum, such as doubling the capacity each time the array list needs to grow, eliminates much of the wasted compute time because as the array list grows so does the quantum, resulting in far less array creating and copying. However, doubling the size of an array easily leads to wasted memory space. Consider an empty StringBuilder that begins with a capacity of 16 characters. Now assume the computer adds 513 characters, several characters at a time. The StringBuilder will first double its capacity to 32 characters, then 64, then 128, then 256, then 512, and finally to a capacity of 1024 characters. However, the StringBuilder will have only 513 characters stored in it, resulting in 511 characters of wasted space.
The following table shows the initial capacities and quantums for several Java classes that use array lists.
Initial Capacity and Quantum of Java Lists |
||
Data Structure |
Initial Capacity |
Quantum |
ArrayList |
10 references |
1.5 times the current capacity |
Vector |
10 references |
a programmer specified amount or, if not specified, twice the current capacity |
StringBuffer |
16 chars |
twice the current capacity |
StringBuilder |
16 chars |
twice the current capacity |
27
Fibonacci Variable Quantum
A nice compromise between a fixed quantum and the doubling variable quantum is a variable quantum that follows a Fibonacci series. A Fibonacci series, named after an Italian mathematician, is a series of numbers where each number is the sum of the previous two numbers. Of course, the first two numbers in the series must be given. The classic Fibonacci series starts with 0 and 1: 0, 1, 1, 2, 3, 5, 8… However, we can start a Fibonacci series with any two numbers we choose. Consider the series that begins with 4 and 8: 4, 8, 12, 20, 32, 52… Every number in this series is a multiple of 4. Most memory allocation functions, including malloc and new allocate memory in chunks whose size are multiples of 4. Even if you write code that asks for 7 bytes, the memory allocation function will probably allocate 8 bytes. So, if we use a Fibonacci series that contains only multiples of 4 as our variable quantum, we can eliminate some wasted bytes in ou
It is interesting to plot the capacity of array lists that use a different quantum to see how the capacity grows over time. Figure 7 shows a graph of the capacity of four array lists that each uses a different quantum:
1. A fixed quantum of 12 elements
2. A variable quantum that is 1.5 times the previous capacity
3. A Fibonacci variable quantum that begins with 12 and 20
4. A variable quantum that is 2 times the previous capacity
Figure 7: A graph of the capacity of array lists that use four different quantums.
28
From the graph we see how slowly the capacity of an array list with a fixed quantum grows over time which usually results in lots of wasted computer time allocating larger arrays and copying data. We also see how quickly the capacity of an array list with a 2 times variable quantum grows which often results in wasted memory space. The Fibonacci variable quantum is a nice compromise between the fixed quantum and the 2 times variable quantum.
Memory Efficient StringBuilder
StringBuilder |
–FIBS : int[] |
+StringBuilder() |
Figure 8: A UML class diagram for a StringBuilder class that uses a Fibonacci variable quantum.
As shown in the previous table, the Java StringBuilder class uses a double variable quantum. We can write our own StringBuilder class as shown in figure 8 and the example code below that uses the more memory efficient Fibonacci series as the quantum. In the code below notice the four append methods call the ensureCapacity method which callsexpandCapacity if necessary. expandCapacity calls findFibonacci which chooses the new larger capacity by finding the smallest Fibonacci number that is greater than or equal to the suggested capacity.
/** A mutable sequence of characters. You can think of
* a string builder as a string that can be modified. */
public class StringBuilder {
/** An array to store the character data. */
char[] array;
/** Actual number of characters stored in the array */
int count;
/** Constructs a string builder with no characters
* in it and an initial capacity of 12 characters. */
public StringBuilder() { this(12); }
/** Constructs a string builder with no characters
* in it and the specified initial capacity. */
public StringBuilder(int capacity) {
capacity = findFibonacci(capacity);
array = new char[capacity];
count = 0;
}
29
/** Returns the number of characters
* stored in this StringBuilder. */
public int length() { return count; }
/** Returns the character stored in this
* StringBuilder at the specified index. */
public char charAt(int index) { return array[index]; }
@Override
public String toString() {
return new String(array, 0, count);
}
Example 1
/** Appends a single character to this StringBuilder. */
public StringBuilder append(char c) {
ensureCapacity(count + 1);
array[count++] = c;
return this;
}
Desk Check |
||||||
this.array |
this.count |
c |
||||
'U' |
'n' |
2 |
'd' |
|||
[0] |
[1] |
[2] |
[3] |
Example 2
/** Appends the contents of an array of
* characters to this StringBuilder. */
public StringBuilder append(char[] a) {
ensureCapacity(count + a.length);
for (int i = 0; i < a.length; ++i) {
array[count++] = a[i];
}
return this;
}
Desk Check |
||||||||||
this.array |
this.count |
i |
||||||||
'U' |
'n' |
'd' |
3 |
|||||||
[0] |
[1] |
[2] |
[3] |
[4] |
[5] |
[6] |
[7] |
|||
a |
||||||||||
'a' |
'u' |
'n' |
||||||||
[0] |
[1] |
[2] |
30
Example 3
/** Appends the contents of another
* StringBuilder to this StringBuilder. */
public StringBuilder append(StringBuilder sb) {
ensureCapacity(count + sb.count);
for (int i = 0; i < sb.count; ++i) {
array[count++] = sb.array[i];
}
return this;
}
Desk Check |
||||||||||
this.array |
this.count |
i |
||||||||
'U' |
'n' |
'd' |
3 |
|||||||
[0] |
[1] |
[2] |
[3] |
[4] |
[5] |
[6] |
[7] |
|||
sb.array |
sb.count |
|||||||||
'a' |
'u' |
'n' |
3 |
|||||||
[0] |
[1] |
[2] |
[3] |
Example 4
/** Appends the contents of a
* String to this StringBuilder. */
public StringBuilder append(String s) {
int len = s.length();
int newLen = count + len;
ensureCapacity(newLen);
s.getChars(0, len, array, count);
count = newLen;
return this;
}
Desk Check |
|||||||||||||
this.array |
this.count |
||||||||||||
'U' |
'n' |
'd' |
'a' |
'u' |
'n' |
6 |
|||||||
[0] |
[1] |
[2] |
[3] |
[4] |
[5] |
[6] |
[7] |
[8] |
[9] |
[10] |
[11] |
newLen |
s |
"ted" |
/** Ensures the capacity of this
* StringBuilder is greater than or
* equal to the specified capacity. */
public void ensureCapacity(int capacity) {
if (array.length < capacity) {
expandCapacity(capacity);
}
}
31
Example 5
/** Expands the capacity of this StringBuilder. */
protected void expandCapacity(int suggest) {
int capacity = findFibonacci(suggest);
char[] old = array;
array = new char[capacity];
System.arraycopy(old, 0, array, 0, count);
}
Desk Check |
|||||||
old |
this.count |
suggest |
capacity |
||||
'U' |
'n' |
'd' |
3 |
6 |
|||
[0] |
[1] |
[2] |
[3] |
this.array |
|||||||
[0] |
[1] |
[2] |
[3] |
[4] |
[5] |
[6] |
[7] |
Example 6
/** Finds the Fibonacci number
* greater than or equal to value. */
private static int findFibonacci(int value) {
int index = java.util.Arrays.binarySearch(FIBS, value);
if (index < 0) {
index = -(index + 1);
}
return FIBS[index];
}
Desk Check |
||
value |
index |
return |
6 |
||
/** An array that contains the Fibonacci
* series that begins with 4 and 8. */
private static final int FIBS[] = {
4, 8, 12, 20,
32, 52, 84, 136,
220, 356, 576, 932,
1508, 2440, 3948, 6388,
10336, 16724, 27060, 43784,
70844, 114628, 185472, 300100,
485572, 785672, 1271244, 2056916,
3328160, 5385076, 8713236, 14098312,
22811548, 36909860, 59721408, 96631268,
156352676, 252983944, 409336620, 662320564,
1071657184, 1733977748, Integer.MAX_VALUE
};
}
Programming Exercises
1. Add to the StringBuilder class in this chapter an insert method that inserts a single character at the beginning of the string builder’s array.
2. Add to the StringBuilder class in this chapter an insert method that inserts a String at any position in the string builder’s array.