CS 161 Lab #10

April 11th

Goals:

This week you'll get a chance to play around with the searching code we wrote in class. You'll add some code that keeps track of the number of comparisons done by the searching methods as they operate, then run some experiments to see if our predictions from class were correct regarding the number of comparisons required to search arrays of various lengths.

Getting Started

In each lab this semester you will work with a randomly assigned partner. (I'll have Zoom randomly set up breakout rooms.) Please be kind in your interactions with your partners! Keep in mind that students in this class have a range of previous programming experience, and that some have been college students for longer than others. We're all in this together, and you have something to learn from your partner, no matter who they are or what their previous experiences have been. I expect that group members will collaborate and work together on each step of the lab.

Download the SearchingLab project and extract its contents, then start BlueJ and open the project. In the project you'll find the Searches and SearchTester classes. Our code from class is in Searches. You'll add a bit of code there, but most of your work will go into the SearchTester class.

Directions:

  1. Take a look at the code in the Searches class. The two search methods we wrote in lecture are there (renamed linearSearch and binarySearch for clarity), plus the start of some code to help us count the number of comparisons they perform: I've defined a variable (cleverly named counter) to hold the count, and methods for retrieving and for resetting the count. Your first task is to edit the code in linearSearch so that counter is incremented each time the == operator is evaluated. You can test out your work via the codepad by creating an array of numbers, running linearSearch, then checking the value of the counter:

    > int[] nums = {2,5,6,8,9,11,15,17,18,20,21,23};
    > Searches s = new Searches();
    > s.getCount()
      0   (int)
    > s.linearSearch(nums, 6);
    > s.getCount()
      3   (int)
    > s.clearCount();
    > s.linearSearch(nums, 3);
    > s.getCount()
      12   (int)
    

  2. Now add code to binarySearch to count the number of comparisons performed during a search. Make sure you increment the counter for every comparison between data values! (Don't bother counting how many comparisons get done as part of the test at the top of the loop — we're only interested in how many times we inspect data in the array.) You can ignore the commented-out lines for now, but later it will be interesting to add them back and see how that changes things. Verify that your counter is getting updated as expected before proceeding.
  3. As we saw in the sample interactions for step #1 above, the number of comparisons for a search can vary depending on the position of the key within the array. To get a more accurate estimate, we could do multiple searches (for items at random positions), and average the required number of comparisons. Open the SearchTester class and finish the definition of averageLinearSearchCost. The method takes an array size and the number of searches desired. It should create an ordered array of the specified size, do the requested number of searches, then print the average number of comparisons performed per search. For each search, you should pick an item from the array at random (from a random position) and then use linearSearch to search for it in the array. (Note that there's a buildOrderedArray method at the top of the class that creates and returns ordered arrays of integers of any size.)

    > SearchTester tester = new SearchTester();
    > tester.averageLinearSearchCost(1000, 5);
    Did 5 linear searches on array of length 1000. Took 515.4 comparisons on avg.
    > tester.averageLinearSearchCost(1000, 500);
    Did 500 linear searches on array of length 1000. Took 497.27 comparisons on avg.
    > tester.averageLinearSearchCost(2000, 500);
    Did 500 linear searches on array of length 2000. Took 969.594 comparisons on avg.
    

  4. Now finish the definition of averageBinarySearchCost, which does the same thing as the previous method, but with binarySearch. (This should pretty much be a copy-and-paste job.)
  5. Time to generate lots of data! The goal of this next step is to learn more about how hard it is to solve search problems of various sizes, and get a sense of how the effort scales as the problem size increases. (How much harder is it to search an array of size 2000 than an array of size 1000? Does it stay basically the same? Increase linearly with the size of the array? Other?) We'll use the average number of comparisons performed by a search as a measure of the effort required. Your job is to run a bunch of tests on arrays of various sizes, then graph the results (with #comparisons on the vertical axis and array size on the horizontal axis).

    Add a new method, runTests, to the SearchTester class. In runTests, use a loop to call averageLinearSearchCost on arrays of increasing size (up to at least size 10,000), using numSearches of at least a few hundred at each array size to get a good average. Increase the array size by 100 each time, rather than just increasing by one on each test.

    For the full impact, you should graph these results: Enter your data in Excel or a Google Docs spreadsheet, and create a chart from the data. Pro tip: If you modify the output in averageLinearSearchCost so that it just prints out the array size and the number of comparisons, separated by a tab character (\t), you can copy and paste rows of text from BlueJ's output window into the spreadsheet! If you're new to Google Docs, you can follow these instructions to get a chart.

  6. Add additional code to runTests that explores the behavior of binarySearch, just like you did with linearSearch. You'll want to get up to much larger sizes here to see the trend. Graph your results for this one too.
  7. Finally, uncomment the three lines of code in binarySearch that check whether the item at position mid is the value we're looking for. This check has the potential to stop the search early, but requires an additional comparison. Fix up your counter code so that it accurately counts comparisons for the revised code, then re-run your tests and see how much things change. We should need fewer iterations of the loop, but each iteration does more comparisons. Do we end up doing fewer comparisons in total with the new version?
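
If you get stuck on the counting in steps 1, 2, and 7, here is a rough sketch of the idea. It is not the code from the SearchingLab project: the class name CountingSearches, the method bodies, and the -1 "not found" return value are all assumptions made for illustration. Only the names counter, getCount, clearCount, linearSearch, and binarySearch come from the lab. The thing to notice is that the counter goes up once for every comparison against array data, and never for the index test at the top of the loop.

```java
// Hypothetical stand-in for the lab's Searches class. The method bodies are
// assumptions for illustration, not the code handed out with the project.
class CountingSearches {
    private int counter = 0;   // number of comparisons against array data

    public int getCount() { return counter; }

    public void clearCount() { counter = 0; }

    // Linear search: one == comparison for each element inspected.
    public int linearSearch(int[] data, int key) {
        for (int i = 0; i < data.length; i++) {
            counter++;                      // counts the == test below
            if (data[i] == key) {
                return i;
            }
        }
        return -1;                          // assumed "not found" value
    }

    // Binary search on an ordered array, including an early-exit equality
    // check like the one described in step 7 (up to two data comparisons
    // per loop iteration).
    public int binarySearch(int[] data, int key) {
        int low = 0;
        int high = data.length - 1;
        while (low <= high) {               // index test: deliberately not counted
            int mid = low + (high - low) / 2;
            counter++;                      // counts the == test below
            if (data[mid] == key) {
                return mid;
            }
            counter++;                      // counts the < test below
            if (data[mid] < key) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;                          // assumed "not found" value
    }
}
```

With the 12-element array from step 1, this sketch's linearSearch reports 3 comparisons when searching for 6 and 12 when searching for 3, matching the codepad session shown there.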

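The averaging and data-generation work in steps 3 through 6 might look something like the sketch below. Treat it as one possible shape rather than the expected solution. The class name SearchCostSketch is made up, the buildOrderedArray shown here is only a guess at what the lab's helper does, and this version of averageLinearSearchCost returns the average instead of printing it (a deliberate departure from the lab's method, so that runTests can print tab-separated rows for pasting into a spreadsheet).

```java
import java.util.Random;

// Hypothetical sketch of the SearchTester logic. Everything here is an
// assumption for illustration; it does not use the lab's own classes.
class SearchCostSketch {
    private int counter = 0;                  // comparisons against array data
    private final Random rng = new Random();

    // Assumed stand-in for the lab's buildOrderedArray helper:
    // returns an ordered array of distinct integers.
    int[] buildOrderedArray(int size) {
        int[] data = new int[size];
        for (int i = 0; i < size; i++) {
            data[i] = 2 * i;
        }
        return data;
    }

    // Minimal counted linear search, so this sketch stands on its own.
    int linearSearch(int[] data, int key) {
        for (int i = 0; i < data.length; i++) {
            counter++;                        // one comparison per element inspected
            if (data[i] == key) {
                return i;
            }
        }
        return -1;
    }

    // Searches for numSearches randomly chosen items from an ordered array
    // of the given size (size must be at least 1) and returns the average
    // number of comparisons per search.
    double averageLinearSearchCost(int size, int numSearches) {
        int[] data = buildOrderedArray(size);
        counter = 0;                          // start each experiment from zero
        for (int i = 0; i < numSearches; i++) {
            int key = data[rng.nextInt(size)];   // item at a random position
            linearSearch(data, key);
        }
        return (double) counter / numSearches;   // cast avoids integer division
    }

    // Prints "size<TAB>average" rows for sizes 100, 200, ..., 10000.
    void runTests() {
        for (int size = 100; size <= 10000; size += 100) {
            System.out.println(size + "\t" + averageLinearSearchCost(size, 500));
        }
    }
}
```

Two details are easy to get wrong: resetting the counter before each batch of searches, and casting to double before dividing so the average isn't truncated. A binary-search version would follow the same pattern with a different search call and a larger upper bound on the array size.
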
Extensions

Looking for an extra challenge?


Brad Richards, 2024