Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Beginning Algorithms (2006)

.pdf
Скачиваний:
252
Добавлен:
17.08.2013
Размер:
9.67 Mб
Скачать

Chapter 9

Try It Out

Implementing the Tests

The first set of tests you perform search for all values between 0 and TEST_SIZE, in order, and print the number of comparisons:

public void testRecursiveBinarySearch() {

performInOrderSearch(new RecursiveBinaryListSearcher(_comparator));

}

public void testIterativeBinarySearch() {

performInOrderSearch(new IterativeBinaryListSearcher(_comparator));

}

public void testLinearSearch() {

performInOrderSearch(new LinearListSearcher(_comparator));

}

private void performInOrderSearch(ListSearcher searcher) { for (int i = 0; i < TEST_SIZE; ++i) {

searcher.search(_sortedList, new Integer(i));

}

reportCalls();

}

The next set performs some random searches:

public void testRandomRecursiveBinarySearch() { performRandomSearch(new RecursiveBinaryListSearcher(_comparator));

}

public void testRandomIterativeBinarySearch() { performRandomSearch(new IterativeBinaryListSearcher(_comparator));

}

public void testRandomLinearSearch() { performRandomSearch(new LinearListSearcher(_comparator));

}

private void performRandomSearch(ListSearcher searcher) { for (int i = 0; i < TEST_SIZE; ++i) {

searcher.search(_sortedList,

new Integer((int) (TEST_SIZE * Math.random())));

}

reportCalls();

}

214

Binary Searching and Insertion

How It Works

The in-order tests each construct one of the three different list searchers, which is then passed to performOrderSearch() to perform the in-order (0, 1, 2, . . .) search and finally report the number of comparisons made.

The random tests also construct an appropriate searcher with the counting comparator but pass them to performRandomSearch() to randomly generate some values to look up.

If you run the tests, depending on the tool you use, you will see something like the following:

testRecursiveBinarySearch: 9197 calls testIterativeBinarySearch: 9197 calls testLinearSearch: 521731 calls

testRandomRecursiveBinarySearch: 9197 calls testRandomIterativeBinarySearch: 9132 calls testRandomLinearSearch: 531816 calls

These results have been summarized in Table 9-1 to make it easier to make a comparison between the various search methods.

Table 9-1: Performance Comparison for 1021 Searches of a Sorted List

 

Recursive Binary

Iterative Binary

Linear

Comparisons (In-Order)

9,197

9,197

521,731

Comparisons* (Random)

9,158

9,132

531,816

Comparisons* (Average)

9

9

515

 

 

 

 

* Actual results will vary due to the random nature of the test data.

The recursive and iterative implementations perform the same number of comparisons for the in-order search. (The difference between the number of comparisons for the random search for the recursive and iterative implementations is merely an artifact of the random nature of the test.) This is probably what you expected, but it’s always nice to have your assumptions confirmed. It’s also worth noting that the recursive version will suffer a slight penalty due to the overhead associated with the recursive method calls, but the difference will be negligible.

The important thing to observe, and the one of most interest to us, is the difference between the binary search and the linear search. In the case of the binary search, the average number of comparisons performed is approximately: 9,000 / 1,000 = 9, whereas the average for the linear search is around 500,000 / 1,000 = 500. These figures certainly confirm our original predictions about the performance characteristics of not only binary searching, but also linear searching. We expected our binary search to take, on average, O(log N) comparisons, and our linear search to take around O(N).

Having said this, the actual performance of binary searching is excellent as long as you are using a data structure (such as an array list) that supports fast, indexed-based lookup. When using a linked list, for example, although the number of comparisons remains the same as for an array list, the continual traversal of the list to find the next item for comparison imposes considerable time penalties.

215

Chapter 9

Understanding Binar y Inser tion

Binary insertion is a technique based on binary searching that enables you to maintain your data in sorted order. Clearly, you could use some of the sorting algorithms already covered in this book to keep your data in order, but as we will show, performing a sort after every insertion into a list can be very expensive. Indeed, performing a sort even after all of the data has been added to a list still turns out to be relatively more expensive than inserting the data in sorted order right from the start.

Binary insertion works pretty much like binary searching. In fact, the only difference between the two is that a binary search returns the position of the search key within the data, and a binary insert, as the name suggests, inserts the new key into the list at the appropriate position.

Imagine you wanted to add the letter G into the list of letters defined in the previous example. (Refer to Figure 9-1.) As with a binary search, you start at the middle item, I. Comparing the I with the new value, G, you determine that the insertion point must lie in the lower half of the list (I sorts lower than G).

Figure 9-6 shows that the next letter you compare against is the D. This time, the new value sorts higher than the current item, so you need to concentrate on the upper half of the remaining list.

0

1

2

3

4

5

6

7

8

A

D

F

H

I

K

L

M

P

 

 

 

 

 

 

 

 

 

Figure 9-6: The search moves to the lower half of the list.

You’re now down to only two letters: F and H. Figure 9-7 shows that our new value sorts higher than the current item, F, so look to the H, as shown in Figure 9-8.

0

1

2

3

4

5

6

7

8

A

D

F

H

I

K

L

M

P

 

 

 

 

 

 

 

 

 

Figure 9-7: The search moves to the upper half of the remaining list.

The new value, G, sorts lower than the current item, H, but this time you have no more items to compare against; it’s time to perform the insertion.

0

1

2

3

4

5

6

7

8

A

D

F

H

I

K

L

M

P

 

 

 

 

 

 

 

 

 

Figure 9-8: The search narrows to only one item.

216

Binary Searching and Insertion

The new value belongs before the last item, so you shift all the elements after and including the H to the right one position to make room for G, as shown in Figure 9-9.

 

 

0

1

2

3

4

5

6

7

8

9

 

 

 

A

D

F

 

 

H

I

K

L

M

P

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

G

Figure 9-9: The key is inserted so as to maintain the correct ordering.

You can then insert the G into the correct position that ensures that the ordering of the list is maintained.

Now that you know how binary insertion works, you can write some code to actually perform a binary insertion into a list.

A List Inserter

In this section, you develop a very simple class that inserts a value into a list such that the ordering of the list is maintained. Rather than reinvent the wheel, though, you’ll use a list searcher to find the insertion point.

Try It Out

Creating the Tests

The tests themselves will use a binary insert algorithm to add numbers to a list. You add the numbers in sorted order and randomly. In all cases, expect the values to be inserted in the correct order.

Start by creating the test class as follows:

package com.wrox.algorithms.bsearch;

import com.wrox.algorithms.iteration.Iterator; import com.wrox.algorithms.lists.ArrayList; import com.wrox.algorithms.lists.List;

import com.wrox.algorithms.sorting.NaturalComparator; import junit.framework.TestCase;

public class ListInserterTest extends TestCase { private static final int TEST_SIZE = 1023;

private ListInserter _inserter; private List _list;

protected void setUp() throws Exception { super.setUp();

_inserter = new ListInserter(

new IterativeBinaryListSearcher(NaturalComparator.INSTANCE));

217

Chapter 9

_list = new ArrayList(TEST_SIZE);

}

private void verify() {

int previousValue = Integer.MIN_VALUE; Iterator i = _list.iterator();

for (i.first(); !i.isDone(); i.next()) {

int currentValue = ((Integer) i.current()).intValue(); assertTrue(currentValue >= previousValue); previousValue = currentValue;

}

}

...

}

The first test involves inserting values into the list in ascending order. That is, add numbers starting at 0 followed by 1, 2, 3, and so on up to our specified maximum value, TEST_SIZE:

public void testAscendingInOrderInsertion() { for (int i = 0; i < TEST_SIZE; ++i) {

assertEquals(i, _inserter.insert(_list, new Integer(i)));

}

verify();

}

The next test is really a variation on the first. Instead of inserting values in ascending order, this time you insert them in descending order. That is, start at your specified maximum and work your way down until you reach 0:

public void testDescendingInOrderInsertion() { for (int i = TEST_SIZE - 1; i >= 0; --i) {

assertEquals(0, _inserter.insert(_list, new Integer(i)));

}

verify();

}

The final test inserts random values into the list. Doing so ensures that the inserter isn’t somehow working by coincidence when given values in order (as the previous two tests have done):

public void testRandomInsertion() {

for (int i = 0; i < TEST_SIZE; ++i) { _inserter.insert(_list,

new Integer((int) (TEST_SIZE * Math.random())));

}

verify();

}

218

Binary Searching and Insertion

How It Works

The test class holds an instance of a ListInserter and a list into which each of the tests will insert values, both of which are initialized in the setUp() method.

The verify() method is called by each of the tests to ensure that the contents of the resulting list are in order. It does this by iterating over each item in the list. Each successive value (currentValue) is then checked to ensure that it is not less than the one before it (previousValue). Notice how you have initialized the previous value to Integer.MIN_VALUE. This guarantees that assertion will always succeed the first time through even though there was no previous value to speak of.

In the first test, a simple for loop enables you to add the values in ascending order, using an instance of the inserter. Each time a value is inserted, you also make sure that the return value accurately reflects the insertion point. In the case of this ascending insertion, the insertion point always matches the value inserted: 0 goes into position 0, 1 into position 1, and so on. Finally, after all the values have been added, you call verify() to ensure that they all actually went into the correct positions — just because insert() reported that they went into the correct positions doesn’t actually mean that they did!

The second test used a for loop to add the values in descending order. This time, because you are inserting in descending order, you make sure that insert() reports that each value has been placed into position 0 (shifting all the existing values right by one spot). Finally, you call verify() to ensure that the insertion has actually worked as expected — again, trust that the return value accurately reflects the actual insertion point.

The final test still adds only TEST_SIZE integers to the list, but the value of each is determined randomly using Math.random(). Recall that Math.random() returns a double-precision floating-point number in the range 0.0 to 1.0 — therefore, we multiply the result by TEST_SIZE to ensure that we obtain integers in the range 0 to TEST_SIZE. Notice that this time we make no assertions about the value returned from insert(). How could we? The values are being inserted in random order.

Your tests are in place. In the next Try It Out, you implement the actual inserter.

Try It Out

Implementing the Inserter

The code to perform binary insertion is quite simple. It involves creating the ListInserter class as shown here:

package com.wrox.algorithms.bsearch;

import com.wrox.algorithms.lists.List;

public class ListInserter {

private final ListSearcher _searcher;

public ListInserter(ListSearcher searcher) {

assert searcher != null : “searcher can’t be null”; _searcher = searcher;

}

public int insert(List list, Object value) { assert list != null : “list can’t be null”;

219

Chapter 9

int index = _searcher.search(list, value);

if (index < 0) {

index = -(index + 1);

}

list.insert(index, value);

return index;

}

}

How It Works

As you can see, the constructor for the ListInserter class takes as its sole argument a ListSearcher. This will be used to find the insertion point — most of binary insertion is the same as binary search, so why reinvent the wheel?

The insert() method then uses the list searcher to find the insertion point. If the search was successful, an element with the same value already exists. Not to worry, though, as you allow duplicates, so you simply insert the new value into this position and have the existing value move to the right one spot. If, however, the value wasn’t found (index < 0), then you know from the discussion on searching that you can convert the return value into a valid position with (index + 1). Then, once you know where the new value should go, you insert it into the list and return the insertion point to the caller.

Assessing Performance

You now have a class that uses an efficient binary search mechanism for adding items to a list while at the same time maintaining the list in sorted order. Now the question is, what is the performance like? More specifically, how well does it stack up against the various sorting algorithms you developed in Chapters 6 and 7? Surely it would be better to just create an ordinary list, populate it with data and then sort it.

In the next Try It Out, you create a test suite that enables you to compare the performance of your binary insertion code with several of the sorting algorithms.

Try It Out

Comparing the Binary Inserter with Other Sorting Algorithms

Just as you did when assessing the performance of the list searchers, you create a test suite that exercises the binary inserter and compares it with sorting a list using various sorting algorithms:

package com.wrox.algorithms.bsearch;

import com.wrox.algorithms.lists.ArrayList; import com.wrox.algorithms.lists.List;

import com.wrox.algorithms.sorting.CallCountingComparator; import com.wrox.algorithms.sorting.ListSorter;

import com.wrox.algorithms.sorting.MergesortListSorter; import com.wrox.algorithms.sorting.NaturalComparator; import com.wrox.algorithms.sorting.QuicksortListSorter;

220

Binary Searching and Insertion

import com.wrox.algorithms.sorting.ShellsortListSorter; import junit.framework.TestCase;

public class BinaryInsertCallCountingTest extends TestCase { private static final int TEST_SIZE = 4091;

private List _list;

private CallCountingComparator _comparator;

protected void setUp() throws Exception { super.setUp();

_list = new ArrayList(TEST_SIZE);

_comparator = new CallCountingComparator(NaturalComparator.INSTANCE);

}

...

}

The first test you write exercises the binary inserter just developed to add a number of values to the list and count the number of comparisons made in the process:

public void testBinaryInsert() { ListInserter inserter = new ListInserter(

new IterativeBinaryListSearcher(_comparator));

for (int i = 0; i < TEST_SIZE; ++i) { inserter.insert(_list,

new Integer((int) (TEST_SIZE * Math.random())));

}

reportCalls();

}

private void reportCalls() { System.out.println(getName() + “: “

+ _comparator.getCallCount() + “ calls”);

}

Now that you have a test for binary insertion, you create similar tests for some of the sorting alternatives. For this, you use the same sorting algorithms you used with binary searching earlier in this chapter:

public void testMergeSort() {

populateAndSortList(new MergesortListSorter(_comparator));

}

public void testShellsort() {

populateAndSortList(new ShellsortListSorter(_comparator));

}

public void testQuicksort() {

populateAndSortList(new QuicksortListSorter(_comparator));

}

221

Chapter 9

private void populateAndSortList(ListSorter sorter) { for (int i = 0; i < TEST_SIZE; ++i) {

_list.add(new Integer((int) (TEST_SIZE * Math.random())));

}

_list = sorter.sort(_list);

reportCalls();

}

How It Works

The test class holds a list into which values will be inserted, and of course a comparator to use when ordering and sorting. As in the previous performance tests, you use an array list and a call counting comparator to collect statistics necessary in order to evaluate the different approaches.

The method testBinaryInsert() first creates an iterative binary list searcher. (You could have used the recursive version instead but the iterative one doesn’t have the overhead of the nested calls.) You then insert TEST_SIZE random integer values into the list. The method reportCalls() is then used to print the number of calls made to the comparator, in the following form:

test-name: #### calls

The final three tests each create a different sorting algorithm, which is then passed to populateAndSortList(), where all of the real work is done.

In populateAndSortList(), you add TEST_SIZE random integers to the list (using the same technique as used previously), sort it using the provided sorting algorithm, and finally report the number of calls made to the comparator. Again, you need to multiply the value obtained from Math.random() to ensure that the inserted value falls within the range 0 to TEST_SIZE.

If you run these tests, you should see output similar to the following:

testBinaryInsert: 41471 calls testMergeSort: 43928 calls

testShellsort: 102478 calls testQuicksort: 97850 calls

Table 9-2 summarizes what’s going on.

Table 9-2: Performance Comparison for 4091 Random Inserts into a List

Sort Type

Comparisons*

Binary Insert

41,471

Mergesort

43,928

Shellsort

102,478

Quicksort

97,850

* Actual results will vary due to the random nature of the test data.

222

Binary Searching and Insertion

As Table 9-3 quite clearly shows, binary insert performs the best, with mergesort coming in a close second, and shellsort and quicksort way behind. Remember, however, that although the performance of mergesort is comparable, it requires another list for its results, whereas binary insert adds new values into the same list.

From these results, you can work out the average number of comparisons performed using a binary insert. The binary insert works by performing a binary search; and as you already know, binary search runs in O(log N) comparisons. Because the list starts off empty, there would initially be log20 comparisons required. The next insert would require log21 comparisons, followed by log22, and so on all the way to log2N. A simplistic calculation would suggest performance around N log2N, but actually it’s better, much better, being more like log2N!.

These comparisons have actually been a little unfair. Each time binary insert adds a new value, it goes directly into the correct position to maintain the ordering of values. This means that the list will always be in sorted order, no matter how many values you add or when you choose to add them. Conversely, these tests applied the three sorting algorithms only after all the values were inserted. This means that the list remains largely unsorted until the very end. What would happen if you instead sort the list after every insertion?

Table 9-3 summarizes the results obtained when changing populateAndSort() to sort after every insertion:

private void populateAndSort(ListSorter sorter) { for (int i = 0; i < TEST_SIZE; ++i) {

_list.add(nextValue()); _list = sorter.sort(_list);

}

reportCalls();

}

Table 9-3: Performance Comparison When Sorting after Every Insert

Sort Type

Comparisons*

Binary insert

41,481

Mergesort

48,852,618

Shellsort

44,910,616

Quicksort

N/A**

*Actual results will vary due to the random nature of the test data.

**After 5 minutes, we killed the process because it still hadn’t completed.

223