Warning

This book is new. If you'd like to go through it, then join the Learn Code Forum to get help while you attempt it. I'm also looking for feedback on the difficulty of the projects. If you don't like public forums, then email help@learncodethehardway.org to talk privately.

# Exercise 16: Bubble, Quick, and Merge Sort

You are now going to attempt to implement sorting algorithms for your `DoubleLinkedList` data structure. For these descriptions I'm going to use a "list of numbers" to mean a randomized list of things. This could be a deck of poker cards, sheets of paper with numbers on them, lists of names, or anything else you can sort. There are three usual suspects when you're attempting to sort a list of numbers:

- Bubble Sort
- This is most likely how you would attempt to sort a list of numbers if you knew nothing about sorting. It involves simply going through the list and swapping any out of order pairs you find. You continually loop through the list, swapping pairs, until you've gone through without swapping anything. It's simple to understand, but
*crazy*slow. - Merge Sort
- This kind of sorting algorithm divides the list into halves, then quarters, and further partitions until it can't divide it anymore. Then it merges these back, but does it in the right order by checking the ordering of each partition as it merges it. It's a clever algorithm that works really well on linked lists but isn't so great on fixed size arrays as you'll need a
`Queue`of some kind to keep track of the partitions. - Quick Sort
- This is similar to merge sort since it's a "divide and conquer" algorithm, but it works by swapping elements around the partition point instead of breaking and merging the list together. In the simplest form, you pick a range from low to high,and a partition point. You then swap the elements that are greater than the partition point above it and those lower below it. Then you pick a new low, high, and partition that's inside this newly shuffled set and do it again. It's dividing the list into smaller chunks, but it doesn't break them apart like a merge sort does.

# Exercise Challenge

The purpose of this exercise is to learn how to implement an algorithm based on a "pseudo-code" description or "p-code". You'll be studying the algorithms in references I tell you (primarily Wikipedia) and then using the p-code to implement them. I'll do the first two quickly here and more elaborately in the videos for this exercise. Then your job is to do the Quicksort algorithm on your own. First, let's look at the description of the Bubble Sort from Wikipedia to get started:

procedure bubbleSort( A : list of sortable items ) n = length(A) repeat swapped = false for i = 1 to n-1 inclusive do /* if this pair is out of order */ if A[i-1] > A[i] then /* swap them and remember something changed */ swap( A[i-1], A[i] ) swapped = true end if end for until not swapped end procedure

You'll find that because p-code is just a loose description of the algorithm, it will end up being wildly different between books, authors, and pages on Wikipedia. It's assumed that you can read this "programming-like language" and translate it to what you want. Sometimes the language will look like an old language called Algol, other times it'll look like a poorly formatted JavaScript or Python. You just have to try to guess at what it's saying and then translate it to what you need. Here's my initial implementation of this particular p-code:

I've added additional comments here so you can study this and follow along, comparing what I've done here to the p-code. You should also see that the Wikipedia page is using a completely different data structure from your `DoubleLinkedList`. The Wikipedia code is assuming some sort of functioning array or list structure. You have to translate lines such as:

if A[i-1] > A[i] then

Into Python using your `DoubleLinkedList`:

if node.prev.value > node.value:

We can't randomly access a `DoubleLinkedList` easily, so we have to convert these array index operations into `.next` and `.prev`. We also have to watch for the `next` or `prev` attributes being `None` while we loop. This kind of conversion requires a lot of translation, study, and guessing at the semantics of the p-code you're reading.

## Study Bubble Sort

You should now take the time to study this `bubble_sort` Python code to see how I translated it. Be sure to watch the video so you can watch me do it in real time and get more insight. You should also draw diagrams of how this works on different kinds of lists (already sorted, random, duplicates, etc). Once you have an understanding of how I did it, study the `pytest` for this and the `merge_sort` algorithm:

An important part of this test code is that I'm using the `random.randint` function to generate randomized data for testing. This test doesn't test for many edge cases, but it's a start, and we'll be improving it later. Remember that you don't have `sorting.merge_sort` implemented yet, so you can either not write that test function or comment it out for now.

Once you have the test, and this code written up, study the Wikipedia page again and attempt some of the other versions of `bubble_sort` before attempting `merge_sort`.

## Merge Sort

I'm not quite ready to leave you on your own yet. I'm going to have you repeat this process again for the `merge_sort` function, but this time I want you to attempt to solve the algorithm from just the p-code on the Merge Sort Wikipedia page *before* you look at how I did it. There are several proposed implementations, but I used the "top down" version:

function merge_sort(list m) if length of m ≤ 1 then return m var left := empty list var right := empty list for each x with index i in m do if i < (length of m)/2 then add x to left else add x to right left := merge_sort(left) right := merge_sort(right) return merge(left, right) function merge(left, right) var result := empty list while left is not empty and right is not empty do if first(left) ≤ first(right) then append first(left) to result left := rest(left) else append first(right) to result right := rest(right) while left is not empty do append first(left) to result left := rest(left) while right is not empty do append first(right) to result right := rest(right) return result

Write the remaining test case function for `test_merge_sort`, and then make an attempt at this implementation. One clue I will give you is that this algorithm works best when given *only* the first `DoubleLinkedListNode`. You'll also probably need a way to count the number of nodes from just a given node. That's something the `DoubleLinkedList` doesn't do.

## Merge Sort Cheat Mode

If you attempted it for a while and need to cheat, here's what I did:

I would use this code as a "cheat sheet" to get quick clues while attempting your implementation. You'll also see me in the video attempt to re-implement this code from scratch, so you can watch me struggle with the same issues you are probably having.

## Quick Sort

Finally, it's your turn to attempt the `quick_sort` implementation and to create a `test_quicksort` test case. I recommend you *first* implement a simple quicksort using just Python's normal `list` types. This will help you understand it better. Then, take your simple Python code and make it use the `DoubleLinkedList`. Remember to take your time working this out and, obviously, do lots of debugging and tests in your `test_quicksort`.

# Study Drills

- These implementations are definitely
*not*the best when it comes to performance. Try to write some*heinous*tests that demonstrate this. You may need to throw a large list at the algorithms. Use your research to find out what pathological (absolute worst) cases exist. For example, what happens when you give`quick_sort`an already sorted list? - Don't implement any improvements yet, but research various improvements you can make to these algorithms.
- Find other sorting algorithms and try to implement them.
- Do these also work on
`SingleLinkedList`? What about`Queue`and`Stack`? Is this useful? - Read about the theoretical speed of these algorithms. You'll see references to
`O(n^2)`or`O(n log n)`, which is a way of saying that in the worst case these algorithms will perform that poorly. Determining the "Big-O" for an algorithm is beyond the scope of this book, but we'll discuss these measurements briefly in Exercise 18. - I implemented these as a separate module, but would it be simpler to add them as functions on
`DoubleLinkedList`? If you do, then would you need to copy that code to the other data structures it can work on? We're not at a design decision for how to make these sorting algorithms work with any "linked list-like data structure." - Never use bubble sort again. I've included it here because you'll run into it often in bad code and because we'll be improving the performance in Exercise 19.