# Sorting - First Thoughts

Often in programming, we must work with large arrays of data. As a simple example, imagine working with student records at a large university. A small piece of that array might look similar to the table shown below. Notice, each row (like the one shown in blue) corresponds to a single item -- a student, in this case.

Suppose we frequently must search for the student information associated with a particular last name. When we search an array for an item containing a particular value (like the name "Andrews"), the value we seek is referred to as the key.

 Chen A 991-878-4944 308 Blair Rohde A 231-343-5555 343 Forbes Gazsi B 766-093-9873 101 Brown Furia A 766-093-9873 101 Brown Kanaga B 898-122-9643 22 Brown Andrews A 664-480-0023 097 Little Battle C 874-088-1212 121 Whitman

Not surprisingly, we can search for a given student by name much more quickly if the student records are stored in order by student name. As such, before conducting searches of this type, we should sort the array so that students appear in ascending order by student name, as shown below:

 Andrews A 664-480-0023 097 Little Battle C 874-088-1212 121 Whitman Chen A 991-878-4944 308 Blair Furia A 766-093-9873 101 Brown Gazsi B 766-093-9873 101 Brown Kanaga B 898-122-9643 22 Brown Rohde A 231-343-5555 343 Forbes

As it turns out, there are many ways to sort arrays, and some methods are more efficient than others. We, of course, wish to analyze these sorting algorithms so that we know which one is the best to use in a particular situation. To this end, let us "standardize" our analysis in the following ways:

• Let us assume that we always wish to sort items so that their keys are in ascending order.

• Let us construct cost models for each algorithm based on the number of comparisons and exchanges needed to completely sort $N$ items.

• Let us also note whether the sorting requires extra memory (i.e., a copy of the array to be sorted), or if the sorting can be accomplished "in place".

Also -- so that we don't have to rewrite the sorting methods for every type of data we might wish to sort -- we assume that the data we wish to sort implements the Comparable interface.

Recall that the Comparable interface requires the class have a compareTo() method that returns an integer. As an example, suppose we define a Rectangle class and wish to be able to compare rectangles based on their area. Such a class could implement the Comparable interface in the following way:

public class Rectangle implements Comparable {

int width;
int height;

public Rectangle(int width, int height) {
this.width = width;
this.height = height;
}

public int compareTo(Rectangle o) {
int diffArea = this.width * this.height - o.width * o.height;
return diffArea;
}

public static void main(String[] args) {
Rectangle r1 = new Rectangle(3,5);
Rectangle r2 = new Rectangle(4,4);

// Below, we invoke the compareTo() method.
// Since r2 > r1, we know that r2.compareTo(r1) will be positive.
// Had r2 < r1, then r2.compareTo(r1) would have been negative.
// Had r2 == r1, then r2.compareTo(r1) would have been zero.
System.out.println(r2.compareTo(r1));
}
}


As described in the API, regardless of the class involved, one expects x.compareTo(y) to be negative, zero, or positive, depending on whether $x < y$, $x=y$, or $x > y$, respectively.

We can use the comparable interface to make two "helper" methods for sorting. These methods (named less() and exch()) allow us to quickly identify every time we compare two items or exchange two items during a sorting process. Consequently, they make the cost functions for each sorting algorithm much easier to find.