This book is new. If you'd like to go through it, then join the Learn Code Forum to get help while you attempt it. I'm also looking for feedback on the difficulty of the projects. If you don't like public forums, then email email@example.com to talk privately.
Exercise 19: Improving Performance
This is a mostly video exercise where I will demonstrate improving the performance of the code you've written so far, but first you should attempt it. You've analyzed how slow and fast the code is from Exercise 18, so now it's time to implement some of your ideas. I'll give you a quick list of things to look for and change when fixing simple performance problems:
- Loops inside loops repeating calculations that can be avoided. The bubble sort is the classic case of this, and that's why I taught it. Once you can see how terrible bubble sort is compared to other methods you'll start to recognize this a common pattern to avoid.
- Repeatedly calculating something that doesn't really change or can be calculated once during the change. The use of a count() function in the sorting.py and other data structures is a great example of this. You can keep track of the count inside the functions to the data structure. Every time you add, you can increase it and every time you remove, decrease it. There's no need to go through the whole list every time. You can also use this pre-calculated count to improve the logic of other functions by checking for count == 0.
- Using the wrong data structure for the job. In the Dictionary I'm using the DoubleLinkedList as a demonstration of this problem. A Dictionary needs to have random access to element, at least in the list of buckets. Using a DoubleLinkedList with DoubleLinkedList inside means every time you want to access the nth element you have to go through all elements up to n. Replacing this with a Python list would vastly improve the performance. This was an exercise in using your existing code to craft a data structure from simpler data structures so not necessarily an exercise in making the best Python Dictionary (it already has one).
- Using the wrong algorithms on data structures. Bubble sort is obviously the wrong algorithm (never use that again), but remember how merge sort and quick sort are better, depending on the data structure? Merge sort is great for these kind of linked data structures but not as good on arrays like Python's list. Quick sort is better on a list but not really as good on linked data structures.
- Not optimizing for common operations in the best place. In the DoubleLinkedList you'll frequently start at the beginning of a bucket and search the slots for a value. In the current code these slots are simply added as they come in, and that may be random or not. If you adopted a discipline of sorting these lists on insert, then finding an element would be easier and quicker. You could stop when a slot's value is greater than what you're looking for because you know it's already sorted. Doing this makes inserts slower but speeds up nearly every other operation, so choose the right design for the job. If you have a need to do tons of inserts, then this isn't smart. But if your analysis shows you're doing few inserts but many accesses, then this is a way to speed it up.
- Hand rolling your own instead of using existing code. We're doing exercises to learn data structures, but in the real world you would not do this. Python already has great data structures that are built into the language and optimized. You should use those first, and if a performance analysis says your own data structure would be quicker then write your own. Even then you should look up an existing data structure someone else has already proven works instead of rolling your own. In this exercise, write a few tests that compare your Dictionary and lists to the Python built-in types to see how far off you may be by comparison.
- Using recursion in a language that's not good at it. The merge_sort code can be broken by simply handing it a list that's bigger than the Python stack. Try giving it something insane like 3000 elements and then slowly bring that down till you find the sweet spot that causes Python to run out of stack. Python doesn't do certain recursive optimizations, so using recursion without special considerations is going to fail like this. In this case, rewriting the merge_sort to use a loop would be better (but much more difficult).
These are some of the big wins you should have discovered during the analysis in Exercise 18. Your task now is to attempt to implement them and improve the performance of this code.
Attempt to use your analysis and the description of suggested improvements above to methodically improve the performance of your code. Methodically means to do it in a lock step controlled way using data to confirm that you have actually improved things. Here's your process to follow during this exercise:
- Pick your first, smallest, slowest piece of code and make sure there's a test showing how slow it is. Make sure you have a series of measurements that give you an idea of the speed. Graph it if you can.
- Attempt your speed improvement changes and then run your test again. Keep trying to squeeze all the performance you can out of this piece of code.
- If you attempt a code change and it does not improve things, then either figure out what you did wrong or back out that change and try something else. This is important because you are working on a hypothesis, so if you leave a useless code change in it may change the performance of other functions you can fix. Back the change out and try a different approach or move on to another piece of code.
- Re-run your measurements of the other smallest slowest pieces of code to see if they have changed. Your fixes may have fixed other code, so reconfirm what you think you know.
- Once you've gone through everything you've identified, run your measurements again and pick new pieces of code to attempt to improve.
- Keep your tests from step 1 (they should be automated tests) because you want to avoid regressions. If you see a change to a function causing other functions to be slower, then either fix that or simply back that change out and try something a new approach.
You should study the Python Tim Sort original email and finally the bug found in 2015 by researchers from the EU FP7 ENVISAGE. The original email was sent in 2002 and subsequently implemented. It took 13 years before this bug was found. Keep that in mind when you go to implement your own ideas for algorithms. Even top notch developers on big projects have lurking bugs in their algorithms that aren't found for a very long time. Another example would be the OpenSSL project, which had lurking bugs for decades simply because everyone believed that "professional cryptographers" created the code. Turns out, even supposedly professional cryptographers can write terrible code. Getting new algorithms correct requires special skills and--I think--the use of a theorem proving tool to validate the correctness. Unless you have this background, creating new algorithms and data structures can lead to your peril. That includes cryptography algorithms and encrypted network protocols. Implementing algorithms other people have proven is totally fine and a good exercise, so long as you are up front about your skills in implementation. But don't go crafting your own hair-brained data structures without some help.