Articles
This article is part of a series that provides an introduction to git internals.
Introduction
Git is "the stupid content tracker" (as per its man page).
As a content tracker, it tracks the modifications made to content.
It acts like a diary of how, when, and where content was modified.
This enables the following:
- Tracking ownership of changes.
- Backing up and restoring content to an earlier point.
- Synchronizing changes between multiple copies of the content.
Git keeps track of the whole system as a Directed Acyclic Graph of commits, each identified by a hash. Each hash represents a commit, that is, the state of the content at a given time. Let's dig a bit deeper into the graph part.
Directed Acyclic Graph
A graph is a structure that represents relationships between objects using connections.
- The objects in this case are commits.
- The relationship in this case is the parent-child relationship.
- The relationship can be many-to-many.
- Multiple children can share a single parent (a branch).
- A single child may have multiple parents (a merge).
Directed refers to the fact that these relationships have a direction.
- The direction in this case is from child to parent.
- When the graph is drawn, the arrowhead points from the child to its parent.
Acyclic means the graph does not have cycles or loops.
- More information at wiki.
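The properties above can be illustrated with a minimal sketch. It assumes commits are represented by short, made-up names (real Git identifies them by hash) and stores, for each commit, the list of its parents. The `parents` mapping and both helper functions are hypothetical names used only for illustration; this is not how Git stores its graph.

```python
# Hypothetical commit names; real Git identifies commits by their hash.
# Each commit maps to the list of its parents (edges point child -> parent).
parents = {
    "A": [],          # root commit: no parent
    "B": ["A"],       # B and C share the single parent A (a branch)
    "C": ["A"],
    "D": ["B", "C"],  # D has two parents (a merge)
}


def ancestors(commit, parents):
    """Return every commit reachable by following child -> parent edges."""
    seen = set()
    stack = list(parents[commit])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def is_acyclic(parents):
    """The graph is acyclic if no commit is its own ancestor."""
    return all(c not in ancestors(c, parents) for c in parents)


print(sorted(ancestors("D", parents)))  # ['A', 'B', 'C']
print(is_acyclic(parents))              # True
```

Storing only the child-to-parent direction is enough to walk the full history of any commit, which is why the direction of the edges matters.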
Hash
The nodes of the Directed Acyclic Graph are identified by hashes. A hash function maps variable-size input to fixed-size output. Here, each piece of content has a different size, and hashing it yields a fixed-size output. That fixed-size output is the hash of the input.
Cryptographic hashes are hashes that are hard to reverse: you cannot feasibly recover the input from the output of the hash. Git uses SHA-1 for hashing by default. Recent versions can also use SHA-256 (as per the hash function transition article).
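As a rough illustration of the fixed-size property, the sketch below hashes two inputs of very different sizes with Python's standard `hashlib` module. The helper `git_blob_sha1` is a hypothetical name; it assumes the way Git hashes file content (a blob) with SHA-1, which is to prepend a small header of the form `blob <size>\0` before hashing. Treat it as a sketch for experimentation, not as Git's implementation.

```python
import hashlib

small = b"hi"
large = b"a much longer piece of content " * 1000

# Inputs of any size produce digests of the same length:
# 40 hex characters for SHA-1, 64 hex characters for SHA-256.
for data in (small, large):
    sha1_hex = hashlib.sha1(data).hexdigest()
    sha256_hex = hashlib.sha256(data).hexdigest()
    print(len(data), len(sha1_hex), len(sha256_hex))


def git_blob_sha1(content: bytes) -> str:
    """Hash file content the way Git hashes a blob (assumed header format)."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_sha1(b"hello world\n"))
```

The last line can be compared against the output of `echo 'hello world' | git hash-object --stdin` in a SHA-1 repository.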
Commit
A snapshot of the content at a point in time is called a commit. Even though the content is tracked at the folder level, changes are maintained at the file level, so the commit contains the details of these changes. More about commits will be covered in the next chapter.