Warning

This book is new. If you'd like to go through it, then join the Learn Code Forum to get help while you attempt it. I'm also looking for feedback on the difficulty of the projects. If you don't like public forums, then email help@learncodethehardway.org to talk privately.

Exercise 16: Bubble, Quick, and Merge Sort

You are now going to attempt to implement sorting algorithms for your DoubleLinkedList data structure. For these descriptions I'm going to use a "list of numbers" to mean a randomized list of things. This could be a deck of poker cards, sheets of paper with numbers on them, lists of names, or anything else you can sort. There are three usual suspects when you're attempting to sort a list of numbers:

Bubble Sort
This is most likely how you would attempt to sort a list of numbers if you knew nothing about sorting. It involves simply going through the list and swapping any out of order pairs you find. You continually loop through the list, swapping pairs, until you've gone through without swapping anything. It's simple to understand, but crazy slow.
Merge Sort
This kind of sorting algorithm divides the list into halves, then quarters, and further partitions until it can't divide it anymore. Then it merges these back, but does it in the right order by checking the ordering of each partition as it merges it. It's a clever algorithm that works really well on linked lists but isn't so great on fixed size arrays as you'll need a Queue of some kind to keep track of the partitions.
Quick Sort
This is similar to merge sort since it's a "divide and conquer" algorithm, but it works by swapping elements around the partition point instead of breaking and merging the list together. In the simplest form, you pick a range from low to high,and a partition point. You then swap the elements that are greater than the partition point above it and those lower below it. Then you pick a new low, high, and partition that's inside this newly shuffled set and do it again. It's dividing the list into smaller chunks, but it doesn't break them apart like a merge sort does.

Exercise Challenge

The purpose of this exercise is to learn how to implement an algorithm based on a "pseudo-code" description or "p-code". You'll be studying the algorithms in references I tell you (primarily Wikipedia) and then using the p-code to implement them. I'll do the first two quickly here and more elaborately in the videos for this exercise. Then your job is to do the Quicksort algorithm on your own. First, let's look at the description of the Bubble Sort from Wikipedia to get started:

procedure bubbleSort( A : list of sortable items )
    n = length(A)
    repeat
        swapped = false
        for i = 1 to n-1 inclusive do
            /* if this pair is out of order */
            if A[i-1] > A[i] then
                /* swap them and remember something changed */
                swap( A[i-1], A[i] )
                swapped = true
            end if
        end for
    until not swapped
end procedure

You'll find that because p-code is just a loose description of the algorithm, it will end up being wildly different between books, authors, and pages on Wikipedia. It's assumed that you can read this "programming-like language" and translate it to what you want. Sometimes the language will look like an old language called Algol, other times it'll look like a poorly formatted JavaScript or Python. You just have to try to guess at what it's saying and then translate it to what you need. Here's my initial implementation of this particular p-code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
def bubble_sort(numbers):
    """Sorts a list of numbers using bubble sort."""
    while True:
        # start off assuming it's sorted
        is_sorted = True
        # comparing 2 at a time, skipping ahead
        node = numbers.begin.next
        while node:
            # loop through comparing node to the next
            if node.prev.value > node.value:
                # if the next is greater, then we need to swap
                node.prev.value, node.value = node.value, node.prev.value
                # oops, looks like we have to scan again
                is_sorted = False
            node = node.next

        # this is reset at the top but if we never swapped then it's sorted
        if is_sorted: break

I've added additional comments here so you can study this and follow along, comparing what I've done here to the p-code. You should also see that the Wikipedia page is using a completely different data structure from your DoubleLinkedList. The Wikipedia code is assuming some sort of functioning array or list structure. You have to translate lines such as:

if A[i-1] > A[i] then

Into Python using your DoubleLinkedList:

if node.prev.value > node.value:

We can't randomly access a DoubleLinkedList easily, so we have to convert these array index operations into .next and .prev. We also have to watch for the next or prev attributes being None while we loop. This kind of conversion requires a lot of translation, study, and guessing at the semantics of the p-code you're reading.

Study Bubble Sort

You should now take the time to study this bubble_sort Python code to see how I translated it. Be sure to watch the video so you can watch me do it in real time and get more insight. You should also draw diagrams of how this works on different kinds of lists (already sorted, random, duplicates, etc). Once you have an understanding of how I did it, study the pytest for this and the merge_sort algorithm:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
import sorting
from dllist import DoubleLinkedList
from random import randint

max_numbers = 30

def random_list(count):
    numbers = DoubleLinkedList()
    for i in range(count, 0, -1):
        numbers.shift(randint(0, 10000))
    return numbers


def is_sorted(numbers):
    node = numbers.begin
    while node and node.next:
        if node.value > node.next.value:
            return False
        else:
            node = node.next

    return True


def test_bubble_sort():
    numbers = random_list(max_numbers)

    sorting.bubble_sort(numbers)

    assert is_sorted(numbers)


def test_merge_sort():
    numbers = random_list(max_numbers)

    sorting.merge_sort(numbers)

    assert is_sorted(numbers)

An important part of this test code is that I'm using the random.randint function to generate randomized data for testing. This test doesn't test for many edge cases, but it's a start, and we'll be improving it later. Remember that you don't have sorting.merge_sort implemented yet, so you can either not write that test function or comment it out for now.

Once you have the test, and this code written up, study the Wikipedia page again and attempt some of the other versions of bubble_sort before attempting merge_sort.

Merge Sort

I'm not quite ready to leave you on your own yet. I'm going to have you repeat this process again for the merge_sort function, but this time I want you to attempt to solve the algorithm from just the p-code on the Merge Sort Wikipedia page before you look at how I did it. There are several proposed implementations, but I used the "top down" version:

function merge_sort(list m)
    if length of m ≤ 1 then
        return m

    var left := empty list
    var right := empty list
    for each x with index i in m do
        if i < (length of m)/2 then
            add x to left
        else
            add x to right

    left := merge_sort(left)
    right := merge_sort(right)

    return merge(left, right)

function merge(left, right)
    var result := empty list

    while left is not empty and right is not empty do
        if first(left) ≤ first(right) then
            append first(left) to result
            left := rest(left)
        else
            append first(right) to result
            right := rest(right)

    while left is not empty do
        append first(left) to result
        left := rest(left)
    while right is not empty do
        append first(right) to result
        right := rest(right)
    return result

Write the remaining test case function for test_merge_sort, and then make an attempt at this implementation. One clue I will give you is that this algorithm works best when given only the first DoubleLinkedListNode. You'll also probably need a way to count the number of nodes from just a given node. That's something the DoubleLinkedList doesn't do.

Merge Sort Cheat Mode

If you attempted it for a while and need to cheat, here's what I did:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
def count(node):
    count = 0

    while node:
        node = node.next
        count += 1

    return count


def merge_sort(numbers):
    numbers.begin = merge_node(numbers.begin)

    # horrible way to get the end
    node = numbers.begin
    while node.next:
        node = node.next
    numbers.end = node


def merge_node(start):
    """Sorts a list of numbers using merge sort."""
    if start.next == None:
        return start

    mid = count(start) // 2

    # scan to the middle
    scanner = start
    for i in range(0, mid-1):
        scanner = scanner.next

    # set mid node right after the scan point
    mid_node = scanner.next
    # break at the mid point
    scanner.next = None
    mid_node.prev = None

    merged_left = merge_node(start)
    merged_right = merge_node(mid_node)

    return merge(merged_left, merged_right)



def merge(left, right):
    """Performs the merge of two lists."""
    result = None

    if left == None: return right
    if right == None: return left

    if left.value > right.value:
        result = right
        result.next = merge(left, right.next)
    else:
        result = left
        result.next = merge(left.next, right)

    result.next.prev = result
    return result

I would use this code as a "cheat sheet" to get quick clues while attempting your implementation. You'll also see me in the video attempt to re-implement this code from scratch, so you can watch me struggle with the same issues you are probably having.

Quick Sort

Finally, it's your turn to attempt the quick_sort implementation and to create a test_quicksort test case. I recommend you first implement a simple quicksort using just Python's normal list types. This will help you understand it better. Then, take your simple Python code and make it use the DoubleLinkedList. Remember to take your time working this out and, obviously, do lots of debugging and tests in your test_quicksort.

Study Drills

  1. These implementations are definitely not the best when it comes to performance. Try to write some heinous tests that demonstrate this. You may need to throw a large list at the algorithms. Use your research to find out what pathological (absolute worst) cases exist. For example, what happens when you give quick_sort an already sorted list?
  2. Don't implement any improvements yet, but research various improvements you can make to these algorithms.
  3. Find other sorting algorithms and try to implement them.
  4. Do these also work on SingleLinkedList? What about Queue and Stack? Is this useful?
  5. Read about the theoretical speed of these algorithms. You'll see references to O(n^2) or O(n log n), which is a way of saying that in the worst case these algorithms will perform that poorly. Determining the "Big-O" for an algorithm is beyond the scope of this book, but we'll discuss these measurements briefly in Exercise 18.
  6. I implemented these as a separate module, but would it be simpler to add them as functions on DoubleLinkedList? If you do, then would you need to copy that code to the other data structures it can work on? We're not at a design decision for how to make these sorting algorithms work with any "linked list-like data structure."
  7. Never use bubble sort again. I've included it here because you'll run into it often in bad code and because we'll be improving the performance in Exercise 19.

Pre-order Learn More Python The Hard Way

When you pre-order Learn More Python The Hard Way, you'll receive the Python 3 Edition as it's being created. All files are DRM free and you can download them to your computer for offline viewing. Digital Download Only! You do not get a physical book.

$29.00

Buy the Online Course

Or, you can read Learn More Python the Hard Way for free right here, video lectures not included.

Other Buying Options

Other buying options coming soon.

No content available for this exercise. You can view all available downloads at your customer account page.