This book is new. If you'd like to go through it, then join the Learn Code Forum to get help while you attempt it. I'm also looking for feedback on the difficulty of the projects. If you don't like public forums, then email firstname.lastname@example.org to talk privately.
Exercise 29: diff and patch
To finish Part IV you will simply apply the full TDD process you've been studying on a much more involved project that may be unfamiliar to you. Refer back to Exercise 28 to confirm you know the process, and make sure you follow it strictly. Create a check-list to follow if you must.
When you are actually working, all this strict process is not very useful. Currently you are studying the process and working on internalizing it so you can use it in the real world. That's why I am being strict about how you should follow it. This is only practice, so don't become a zealot about it when you are doing real work. The purpose of the book is to teach you a set of strategies to get work done, not teach you a religious rite you can preach to the masses.
The diff command takes two files and produces a third file (or output) that encodes what changed in the first to make the second. It's the basis of tools like git and other revision control tools. Implementing diff in Python is fairly trivial since the standard library's difflib module does it for you, so you don't need to work on the algorithms (which can be very complex).
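As a rough illustration of how little code the library route needs, here is a minimal sketch that uses difflib.unified_diff on two lists of lines. The helper name diff_lines and the first line of each sample list are my own assumptions for the example. Note that unified_diff emits the "unified" format, which is not the same as the classic format the diff command prints by default.

```python
import difflib


def diff_lines(a_lines, b_lines, a_name="A.txt", b_name="B.txt"):
    """Return a unified diff of two lists of lines as a list of strings.

    diff_lines is a hypothetical helper name, not part of difflib.
    """
    return list(
        difflib.unified_diff(
            a_lines, b_lines, fromfile=a_name, tofile=b_name, lineterm=""
        )
    )


# Sample data: the first line of each list is assumed for illustration.
a = [
    "mary had a little lamb",
    "her fleece was white a mud",
    "and every where that marry",
    "her lamb would chew cud",
]
b = [
    "mary had a little lamb",
    "her fleece was white a snow",
    "and every where that marry went",
    "her lamb was sure to go",
]

result = diff_lines(a, b)
for line in result:
    print(line)
```

Lines starting with "-" come only from the first list, lines starting with "+" come only from the second, and lines starting with a space are unchanged context.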
The patch tool is the companion to the diff tool: it takes a diff file and applies it to the original file to reproduce the modified one. This lets you take two versions of a file, run diff to produce only the changes, then send that .diff file to someone. That person can then use their original copy of the file and your .diff with patch to rebuild your changes.
Here's an example workflow to demonstrate how diff and patch work. I have two files, A.txt and B.txt. The A.txt file contains some simple text, and then I copied it and created B.txt with some modifications:
    $ diff A.txt B.txt > AB.diff
    $ cat AB.diff
    2,4c2,4
    < her fleece was white a mud
    < and every where that marry
    < her lamb would chew cud
    ---
    > her fleece was white a snow
    > and every where that marry went
    > her lamb was sure to go
This produces a file AB.diff that has changes from A.txt to B.txt, which you can see is fixing a rhyme I broke. Once you have this AB.diff you can use patch to apply the changes:
    $ patch A.txt AB.diff
    $ diff A.txt B.txt
That final command should show no output since the patch command before it effectively made A.txt have the same contents as B.txt.
Implementing these two should start with the diff command, since there is a fully implemented diff written in Python for you to cheat from. You can find it at the end of the difflib documentation, but try to implement your own version and see how it compares to theirs.
The real meat of this exercise is the patch tool, which Python does not implement for you. You will want to read up on the SequenceMatcher class in difflib and specifically look at the SequenceMatcher.get_opcodes method. That is your only clue to making patch work, but it's a very good clue.
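To make the clue a little more concrete, here is a minimal sketch of how the opcodes could drive a patch. The functions make_edits and apply_edits and the in-memory edit tuples are hypothetical names and formats I made up for illustration. This is not the file format the real patch command reads, and a real tool would still need to write the edits to a file and parse them back. The first line of each sample list is also assumed.

```python
import difflib


def make_edits(a_lines, b_lines):
    """Record only what changes between a_lines and b_lines.

    Each edit is (tag, i1, i2, new_lines): replace a_lines[i1:i2]
    with new_lines. This tuple layout is an assumption for this sketch.
    """
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # For "delete" the slice b_lines[j1:j2] is empty.
            edits.append((tag, i1, i2, b_lines[j1:j2]))
    return edits


def apply_edits(a_lines, edits):
    """Rebuild the second file from the first file plus the edits."""
    result = []
    pos = 0
    for tag, i1, i2, new_lines in edits:
        result.extend(a_lines[pos:i1])  # copy untouched lines before the edit
        result.extend(new_lines)        # add replacement or inserted lines
        pos = i2                        # skip the lines that were replaced or deleted
    result.extend(a_lines[pos:])        # copy whatever is left after the last edit
    return result


# Sample data: the first line of each list is assumed for illustration.
a = [
    "mary had a little lamb",
    "her fleece was white a mud",
    "and every where that marry",
    "her lamb would chew cud",
]
b = [
    "mary had a little lamb",
    "her fleece was white a snow",
    "and every where that marry went",
    "her lamb was sure to go",
]

edits = make_edits(a, b)
patched = apply_edits(a, edits)
print(patched == b)
```

The point of the design is that the edits hold only the changed lines and their positions in the first file, so the person applying them never needs a copy of the second file.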
- How far can you take this diff and patch combination? Can you combine them into one tool? Can you make it work like a miniature git?
Find as many diff algorithms as you can. Another thing to research is how a tool like git works.