Git in short
Let's imagine two fictional developers, Zoltán and Gábor. The former is an extremely clever and handsome programmer; the latter is just as clever, but even more handsome. They are talking about Git, and version control in general, over their imaginary beers. Here I just summarize their conversation, as Zoltán likes to have documentation, even for imaginary beer-and-Git chats.
What to read
This document is only a very short introduction. If you are eager for more, there are many resources available: http://ftp.newartisans.com/pub/git.from.bottom.up.pdf or http://progit.org/book/ to name just a few.
Git vs SVN - some facts
Git stores content (snapshots, not differences)
You may recall that most CVCSs (centralized version control systems, such as SVN) store differences. So they think of version-controlled files like this:
| version1 | version2 | version3 | version4 |
|---|---|---|---|
| file1 | no change | file1 + x | file1 + x + y |
| file2 | file2 + z | no change | file2 + z + k |
On your computer you get only the part that you check out, usually the state at the most recent commit. To dig deeper into the history, you need to go back to the server.
Git thinks only about the content of files and stores it as a snapshot of a mini file system. So every commit is a picture of your project's root directory that you chose to "save" because it represents a satisfying state of your project.
| version1 | version2 | version3 | version4 |
|---|---|---|---|
| 1 | (1) | 1.1 | 1.2 |
| 2 | 2.1 | (2.1) | 2.2 |
So every version stores its own file system. Git is clever enough not to store the same content twice (shown here with parentheses). This means that you have the content of every file at every point in the history, all of it, locally. You can go back to any point in time, without an internet connection, and change, modify, alter, work on, or do whatever you want with the whole project.
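To make the snapshot idea a bit more tangible, here is a minimal Python sketch of a content-addressed store. It is not how Git is implemented: the names `store`, `save`, `snapshot`, and `checkout` are made up for illustration, and real Git adds a small header before hashing, which this toy skips. It only shows that full snapshots can share identical content instead of storing it twice.

```python
import hashlib

# Toy content-addressed store: content is saved once, keyed by its hash.
# Assumption: plain SHA-1 of the content; real Git hashes a header plus the content.
store = {}

def save(content: bytes) -> str:
    key = hashlib.sha1(content).hexdigest()
    store[key] = content  # saving identical content again changes nothing
    return key

def snapshot(files: dict) -> dict:
    """Record a full picture of the project: every filename mapped to a content key."""
    return {name: save(data) for name, data in files.items()}

# The four versions from the table above: file1 is untouched in version2,
# file2 is untouched in version3.
history = [
    snapshot({"file1": b"1", "file2": b"2"}),
    snapshot({"file1": b"1", "file2": b"2.1"}),
    snapshot({"file1": b"1.1", "file2": b"2.1"}),
    snapshot({"file1": b"1.2", "file2": b"2.2"}),
]

def checkout(version: int) -> dict:
    """Rebuild the complete file set of any version from local data only."""
    return {name: store[key] for name, key in history[version].items()}

print(len(store))   # number of distinct contents actually stored
print(checkout(1))  # the full state of version2
```

The four snapshots list eight file entries in total, yet only six distinct contents end up in the store, because the unchanged files point at content that is already there.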
Almost everything is local
When you create your own repo (by cloning or initializing it), you can work locally. This was stated before, but it is worth repeating. You can do whatever you want without breaking anyone else's work.
Git thinks only of data
Git cares about data, not filenames or paths. It computes the SHA-1 hash of a file's data and stores the data under that hash. That means that if two repositories anywhere on the internet contain an object with the same hash, they hold the same content. This makes merging and collaborating very efficient.
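As a small illustration, the hash Git assigns to a file's content can be reproduced in a few lines of Python. Git does not hash the raw bytes alone: it first prepends a short header of the form `blob <size>` followed by a NUL byte. The function name `git_blob_hash` is just a name chosen for this sketch.

```python
import hashlib

def git_blob_hash(data: bytes) -> str:
    """Hash file content the way Git does for a blob: header + content."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

# The filename is not an input at all: only the bytes matter.
same_1 = git_blob_hash(b"hello world\n")
same_2 = git_blob_hash(b"hello world\n")
other = git_blob_hash(b"hello world!\n")

print(same_1)
print(same_1 == same_2)  # identical content, identical hash
print(same_1 == other)   # different content, different hash
```

If you have Git at hand, the result for a given file should match what `git hash-object <filename>` prints for it.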
To give an example without any background: if your root directory holds two files that you want to commit to preserve them in the repository, Git computes the hashes of, and saves, four objects: two blobs that represent the files, one tree that holds references to the files (this is your root directory), and one commit object that holds a reference to the tree. This system makes Git very robust at handling, finding, merging, rebasing, and manipulating files and history.
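The four objects from this example can be sketched in Python as well. This is a simplified model, not Git's actual code: the `objects` dictionary stands in for the object database, the file names, author, and timestamp are invented, and compression and directory handling are left out. The object layouts (header, tree entries, commit text) follow the format Git uses for a simple root commit, as far as this sketch goes.

```python
import hashlib

objects = {}  # hash -> (type, raw body), a stand-in for Git's object database

def store_object(kind: str, body: bytes) -> str:
    # Every object is hashed as "<type> <size>\0" followed by its body.
    header = kind.encode() + b" " + str(len(body)).encode() + b"\0"
    digest = hashlib.sha1(header + body).hexdigest()
    objects[digest] = (kind, body)
    return digest

def store_tree(entries: dict) -> str:
    # entries maps filename -> blob hash. Each tree entry is
    # "<mode> <name>", a NUL byte, then the 20 raw bytes of the hash.
    # Sorting by name is enough here because there are only plain files.
    body = b""
    for name in sorted(entries):
        body += b"100644 " + name.encode() + b"\0" + bytes.fromhex(entries[name])
    return store_object("tree", body)

def store_commit(tree: str, message: str) -> str:
    # Hypothetical author and a fixed timestamp, purely for illustration.
    who = "Zoltan <zoltan@example.com> 0 +0000"
    body = f"tree {tree}\nauthor {who}\ncommitter {who}\n\n{message}\n"
    return store_object("commit", body.encode())

blob_a = store_object("blob", b"first file\n")
blob_b = store_object("blob", b"second file\n")
tree = store_tree({"a.txt": blob_a, "b.txt": blob_b})
commit = store_commit(tree, "Initial commit")

for digest, (kind, _) in objects.items():
    print(kind, digest)
```

Note how the references run in one direction only: the commit names the tree, and the tree names the blobs. Changing one byte in a file changes its blob hash, which changes the tree hash, which changes the commit hash.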
Three places for files
In daily work, Git knows three places for a given file:
- Working directory (workspace): This is where you work with your files. Git places here the files that you request with the checkout operation. Basically it checks out a branch, which is just a pointer to a given commit, which in turn is a given state of the project's file system (read it again :-)). The default branch is called master. You can have as many branches as you like, as working with them is extremely convenient, easy, and quick (they are only pointers, not directories). HEAD is the latest commit of the branch that you are working on (and just a pointer too).
- Index (staging area): Here are the files that will be included in the next commit. You place files here with the git add command; use add for new (untracked) files and for modified files too. Files can be removed from the index with the git rm --cached <filename> command (which keeps the file in your working directory, but Git stops tracking it, so the file becomes untracked). If you use the git rm <filename> command, the file is deleted from the hard drive too.
- Repository: When you type git commit, everything that is in the index is committed to the repository. You can skip the index if you like and commit modified files directly with the git commit -a command (this only picks up files that are already tracked). If you would rather not use the editor to write a message, you can use the git commit -m 'Message here' command.
So let's see a quick workflow. When you check out a branch, the files and directory structure represented by the last commit of that branch (for example master), the HEAD, are placed in your working directory. All files are unmodified. When you create a new file (with an editor, for example), it is untracked. If you change an existing file, it becomes modified. You add files to the index (both new and modified ones) with the git add command; they are then staged. You type git commit and all staged files become unmodified again, as they now match the new HEAD, which has moved to point to your newly created commit object.
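The workflow above can be played through with a toy model in Python. Everything here (the `ToyRepo` class, its attributes, the file name) is invented for this sketch and is far simpler than Git: for instance, real Git can report a file as staged and modified at the same time, while this model returns a single state per file.

```python
class ToyRepo:
    """Toy model of the three places; names and structure are illustrative only."""

    def __init__(self):
        self.working = {}  # working directory: filename -> content
        self.index = {}    # staging area: content that goes into the next commit
        self.commits = []  # repository: a list of snapshots
        self.head = None   # position of the latest commit on the current branch

    def head_snapshot(self):
        return {} if self.head is None else self.commits[self.head]

    def add(self, name):
        # Like "git add": copy the current content into the index.
        self.index[name] = self.working[name]

    def commit(self):
        # Like "git commit": snapshot the index and move HEAD to the new commit.
        self.commits.append(dict(self.index))
        self.head = len(self.commits) - 1

    def status(self, name):
        content = self.working[name]
        if name not in self.index:
            return "untracked"
        if self.index[name] != content:
            return "modified"
        if self.head_snapshot().get(name) != content:
            return "staged"
        return "unmodified"


repo = ToyRepo()
states = []

repo.working["notes.txt"] = "v1"  # create a new file
states.append(repo.status("notes.txt"))
repo.add("notes.txt")
states.append(repo.status("notes.txt"))
repo.commit()
states.append(repo.status("notes.txt"))

repo.working["notes.txt"] = "v2"  # edit the existing file
states.append(repo.status("notes.txt"))
repo.add("notes.txt")
states.append(repo.status("notes.txt"))
repo.commit()
states.append(repo.status("notes.txt"))

print(states)
```

The file walks through untracked, staged, unmodified, and then modified, staged, unmodified again, which is the same cycle that git status shows you in a real repository.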
