Almost everyone uses git every day and almost no one knows what it does. We learn the verbs (add, commit, push, pull) and treat the rest as a black box that occasionally eats our work. But the core of git is small. Not small as in simplified. Small as in you can rebuild the real thing in about a hundred and fifty lines of Python, and the objects it produces will be byte-for-byte identical to the ones real git produces.
That last part is the fun bit. By the end of this you will have a program that hashes a file and prints exactly the hash that git hash-object prints for the same file. Same trees, same commits. Not a toy that mimics git. Git.
This is the written companion to the video build. It goes a step further at the end and shows how merge works, which the video only teases.
The one idea
Git is a key-value store. You hand it some content, it hands you back a hash of that content, and it files the content away under that hash. The key is the hash of the value. That is content addressing, and it is the same idea behind IPFS, Bitcoin block hashes, and the Nix package manager.
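To make that concrete before touching any real git objects, here is a minimal sketch of a content-addressed store. It is not git's format yet: it keeps everything in a dictionary in memory and hashes the raw bytes with SHA-1, and the names ContentStore, put, and get are made up for this illustration.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps a hex digest to the bytes that produced it.
        self._objects = {}

    def put(self, content: bytes) -> str:
        # The key is derived from the value itself, never chosen by the caller.
        key = hashlib.sha1(content).hexdigest()
        self._objects[key] = content
        return key

    def get(self, key: str) -> bytes:
        # Look the content back up by its hash.
        return self._objects[key]


store = ContentStore()
key = store.put(b"hello\n")
same_key = store.put(b"hello\n")      # identical content maps to the identical key
other_key = store.put(b"hello!\n")    # any change in content produces a different key
```

Two things fall out of this design for free. Storing the same content twice costs nothing, because it lands under the same key. And a key doubles as an integrity check, because anyone can rehash the content and compare.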






