Warning
This book is new. If you'd like to go through it, then join the Learn Code Forum to get help while you attempt it. I'm also looking for feedback on the difficulty of the projects. If you don't like public forums, then email help@learncodethehardway.org to talk privately.
Exercise 14: Double Linked Lists
The previous exercise may have taken you quite a while to complete since you have to figure out how to make a single linked list work. Hopefully, the video gave you enough information to complete the exercise and also showed you exactly how to do an audit of your code. In this exercise you're going to implement the better version of linked lists called DoubleLinkedList.
In the SingleLinkedList you should have realized that any operation involving the end of the list has to travel through each node until it gets to the end. A SingleLinkedList is only efficient from the front of the list where you can easily change the next pointer. The shift and unshift operations are fast, but pop and push cost you as the list gets bigger. You might be able to speed this up by keeping a reference to the next-to-last element, but then what if you want to replace that element? Again, you'll have to go through all of the elements to find this one. You can get some speed improvements with a few minor changes like this, but a better solution is to simply change the structure to make it easier to work from any position.
The DoubleLinkedList is nearly the same as the SingleLinkedList except it also has a prev (for previous) link to the DoubleLinkedListNode that comes before it. One extra pointer per node and suddenly many of the operations become much easier. You can also easily add a pointer to the end in the DoubleLinkedList so you have direct access to both the beginning and the end. This makes push and pop efficient again since you can access the end directly and use the node.prev pointer to get the previous node.
With these changes in mind our node class looks like this:
1 2 3 4 5 6 7 8 9 10 11 |
All that's added is the self.prev = prev line and then handling that in the __repr__ function. Implementing the DoubleLinkedList class uses the same operations as the SingleLinkedList class, except you add one more variable for the end of the list:
1 2 3 4 5 |
Introducing Invariant Conditions
All of the operations to implement are the same, but now we have some additional considerations:
With a prev pointer you now have to handle more conditions in each operation:
- Are there zero elements? Then self.begin and self.end need to be None.
- If there is one element, then self.begin and self.end have to be equal (point at same node).
- The first element must always have a prev that is None.
- The last element must always have a next that is None.
These truths have to hold during the life of the DoubleLinkedList, which makes them "invariant conditions" or just "invariants". The idea of an invariant is that, no matter what, these are base checks that show the structure is working correctly. A way to look at invariants is any tests or assert calls you seem to do repeatedly can be moved into a special function called _invariant which does these checks. You can then call this function in your tests or at the beginning and end of each function. Doing that will cut down on your defect rates since you are saying, "No matter what I do, these have to be true."
The only problem with invariant checks is they can cost time to run. If every function call also calls another function twice then you are adding a potentially significant burden to every function. If your invariant function also does something costly it gets worse. Imagine if you added the invariant: "All nodes have a next and prev except the first and last ones." That would mean every function call does a complete traversal through the list twice. When you have to make sure the class works all the time, this is worth it. If you don't, then it becomes a problem.
In this book you'll use invariant functions when you can, but keep in mind that they aren't something you always need to use. Finding ways to only activate them in test suites or debugging or using them during initial development is the key to using them effectively. I recommend that you only call invariants at the top of functions or only call them inside the test suite. This is a good compromise.
Exercise Challenge
In this exercise you'll implement the operations for the DoubleLinkedList, but this time you'll also use an _invariant function to check that it is working correctly before and after each operation. The best way to do this is to call the invariant at the top of each function and then in the test suite at key spots. Your test suite for the DoubleLinkedList is almost a copy-paste replica of the SingleLinkedList test, except you are adding _invariant calls at key points.
As with the SingleLinkedList you'll want to manually study this data structure on your own. You should draw out the node structures on paper and do some of the operations manually. Next, work the DoubleLinkedListNode manually in your dllist.py file. After that spend one or two 45-minute sessions trying to hack up a few operations to figure it out. I recommend push and pop. After that you can watch the video to see me work on it and how I use a combination of auditing my code and the _invariant function to check what I'm doing.
Study Drills
As with the previous exercise you'll want to implement this data structure again from memory. Put what you know about it in one room and your laptop in another room. You'll want to do this until you can implement the DoubleLinkedList from memory without reference.