Git - User-Manual Documentation
Version 2.39.0
git --version
Introduction
Git is a fast distributed revision control system.
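Comprehensive reference documentation is available through the man
pages or the git-help[1] command. For example, for the command
git clone, you can either use:
$ man git-clone
or:
$ git help clone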
With the latter, you can use the manual viewer of your choice; see
git-help[1] for more information.
See also Git Quick Reference for a brief overview of Git commands,
without any explanation.
Finally, see Notes and todo list for this manual for ways that you can help make this manual more
complete.
The initial clone may be time-consuming for a large project, but you
will only need to clone once.
The clone command creates a new directory named after the project
(git or linux in the examples above). After you cd into this
directory, you will see that it contains a copy of the project files,
called the working tree, together with a special top-level directory
named .git, which contains all the information about the history of
the project.
* master
v2.6.11
v2.6.11-tree
v2.6.12
v2.6.12-rc2
v2.6.12-rc3
v2.6.12-rc4
v2.6.12-rc5
v2.6.12-rc6
v2.6.13
...
Create a new branch head pointing to one of these versions and check it
out using git-switch[1]:
$ git switch -c new v2.6.13
The working directory then reflects the contents that the project had
when it was tagged v2.6.13, and git-branch[1] shows two
branches, with an asterisk marking the currently checked-out branch:
$ git branch
master
* new
If you decide that you’d rather see version 2.6.17, you can modify
the current branch to point at v2.6.17 instead, with
$ git reset --hard v2.6.17
Note that if the current branch head was your only reference to a
particular point in history, then resetting that branch may leave you
with no way to find the history it used to point to; so use this
command carefully.
Understanding History: Commits
$ git show
commit 17cf781661e6d38f737f15f53ab552f1e95960d7
--- a/init-db.c
+++ b/init-db.c
@@ -7,7 +7,7 @@
int len, i;
As you can see, a commit shows who made the latest change, what they
did, and why.
Every commit has a 40-hexdigit id, sometimes called the "object name"
or the "SHA-1 id", shown on the first line of the git show output. You
can usually refer to a commit by a shorter name, such as a tag or a
branch name, but this longer name can also be useful. Most importantly,
it is a globally unique name for this commit: so if you tell somebody
else the object name (for example in email), then you are guaranteed
that name will refer to the same commit in their repository that it
does in yours (assuming their repository has that commit at all). Since
the object name is computed as a hash over the contents of the commit,
you are guaranteed that the commit can never change without its name
also changing.
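The relationship between a commit's contents and its name can be checked by hand. The following is a minimal sketch, assuming a throwaway repository created with mktemp and a placeholder identity; it re-hashes the raw commit object and compares the result with the name Git reports:

```shell
set -eu
repo=$(mktemp -d)                  # scratch repository, assumed for illustration
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email you@yourdomain.example.com

echo "hello" >file.txt
git add file.txt
git commit -q -m "first commit"

# The object name Git reports for the commit...
name=$(git rev-parse HEAD)
# ...and the name obtained by hashing the raw commit contents again.
rehashed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)

echo "$name"
echo "$rehashed"
```

Both lines printed at the end should be identical; changing anything in the commit (message, author, tree, or parent) would produce a different name.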
Every commit (except the very first commit in a project) also has a
parent commit which shows what happened before this commit. Following
the chain of parents will eventually take you back to the beginning of
the project.
However, the commits do not form a simple list; Git allows lines of
development to diverge and then reconverge, and the point where two
lines of development reconverge is called a "merge". The commit
representing a merge can therefore have more than one parent, with
each parent representing the most recent commit on one of the lines
of development leading to that point.
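To see a merge commit's parents without a graphical viewer, a small experiment along these lines may help. It is only a sketch: the repository, the branch name topic, and the file names are made up for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name for this sketch
git config user.name "Example User"
git config user.email you@yourdomain.example.com

echo base >base.txt; git add base.txt; git commit -q -m "base"

# Two lines of development diverge from the base commit...
git switch -q -c topic
echo topic >topic.txt; git add topic.txt; git commit -q -m "work on topic"
git switch -q master
echo master >master.txt; git add master.txt; git commit -q -m "work on master"
old_master=$(git rev-parse master)

# ...and reconverge in a merge commit.
git merge -q --no-edit topic

# %P prints the ids of all parents, separated by spaces.
parents=$(git show -s --format=%P HEAD)
parent_count=$(echo "$parents" | wc -w)
first_parent=$(git rev-parse HEAD^1)      # tip of master before the merge
second_parent=$(git rev-parse HEAD^2)     # tip of the merged branch
echo "merge has $parent_count parents"
```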
The best way to see how this works is using the gitk[1]
command; running gitk now on a Git repository and looking for merge
commits will help understand how Git organizes history.
We will sometimes represent Git history using diagrams like the one
below. Commits are shown as "o", and the links between them with lines
drawn with - / and \. Time goes left to right:
o--o--o <-- Branch A
When we need to be precise, we will use the word "branch" to mean a line
of development, and "branch head" (or just "head") to mean a reference
to the most recent commit on a branch. In the example above, the branch
head named "A" is a pointer to one particular commit, but we refer to
the line of three commits leading up to that point as all being part of
"branch A".
However, when no confusion will result, we often just use the term
"branch" both for branches and for branch heads.
Manipulating branches
git branch
The special symbol "HEAD" can always be used to refer to the current
branch. In fact, Git uses a file named HEAD in the .git directory to
remember which branch is current:
$ cat .git/HEAD
ref: refs/heads/master
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
If you want to create a new branch to retain commits you create, you may
The HEAD then refers to the SHA-1 of the commit instead of to a branch,
and git branch shows that you are no longer on a branch:
$ cat .git/HEAD
427abfa28afedffadfca9dd8b067eb6d36bac53f
$ git branch
master
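The two states of the HEAD file can be compared side by side. This sketch assumes a scratch repository with a single commit; it records the contents of .git/HEAD while a branch is checked out and again after detaching:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name for this sketch
git config user.name "Example User"
git config user.email you@yourdomain.example.com
echo hello >file.txt; git add file.txt; git commit -q -m "first commit"

attached=$(cat .git/HEAD)        # symbolic reference to the current branch
git switch -q --detach HEAD      # detach at the same commit
detached=$(cat .git/HEAD)        # now a bare commit id

echo "on a branch: $attached"
echo "detached:    $detached"
```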
$ git branch -r
origin/HEAD
origin/html
origin/maint
origin/man
origin/master
origin/next
origin/seen
origin/todo
Note that the name "origin" is just the name that Git uses by default
to refer to the repository that you cloned from.
The full name is occasionally useful if, for example, there ever
exists a tag and a branch with the same name.
For the complete list of paths which Git checks for references, and
the order it uses to decide which to choose when there are multiple
references with the same shorthand name, see the "SPECIFYING
REVISIONS" section of gitrevisions[7].
You can also track branches from repositories other than the one you
cloned from, using git-remote[1]:
...
From git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
$ git branch -r
origin/master
staging/master
staging/staging-linus
staging/staging-next
If you examine the file .git/config, you will see that Git has added
a new stanza:
$ cat .git/config
...
[remote "staging"]
url = git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
fetch = +refs/heads/*:refs/remotes/staging/*
...
This is what causes Git to track the remote’s branches; you may modify
or delete these configuration options by editing .git/config with a
text editor. (See the "CONFIGURATION FILE" section of git-config[1]
for details.)
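The same stanza can be inspected with git config instead of reading the file directly. The sketch below assumes two empty scratch repositories and reuses the remote name staging from the example above; the local path stands in for the real URL:

```shell
set -eu
top=$(mktemp -d)
git init -q "$top/upstream"      # stand-in for the remote repository
git init -q "$top/work"
cd "$top/work"

# Registering the remote writes the [remote "staging"] stanza.
git remote add staging "$top/upstream"

url=$(git config --get remote.staging.url)
fetchspec=$(git config --get remote.staging.fetch)
echo "url   = $url"
echo "fetch = $fetchspec"
```

The fetch line is the refspec that maps the remote's branch heads onto remote-tracking branches under refs/remotes/staging/.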
Git provides extremely flexible and fast tools for exploring the
history of a project.
We start with one specialized tool that is useful for finding the
commit that introduced a bug into a project.
[65934a9a028b88e83e2b0f8b36618fe503349f8e] BLOCK: Make USB storage depend on SCSI rather than selecting it [try #6]
If you run git branch at this point, you’ll see that Git has
temporarily moved you to "(no branch)". HEAD is now detached from any
branch and points directly to a commit (with commit id 65934) that is
reachable from "master" but not from v2.6.18. Compile and test it, and
see whether it crashes. Assume it does crash. Then:
$ git bisect bad
checks out an older version. Continue like this, telling Git at each
stage whether the version it gives you is good or bad, and notice
that the number of revisions left to test is cut approximately in
half each time.
After about 13 tests (in this case), it will output the commit id of
the guilty commit. You can then examine the commit with
git-show[1], find out who wrote it, and mail them your bug
report with the commit id. Finally, run
$ git bisect reset
Note that the version which git bisect checks out for you at each
point is just a suggestion, and you’re free to try a different
version if you think it would be a good idea. For example,
occasionally you may land on a commit that broke something unrelated;
run
$ git bisect visualize
which will run gitk and label the commit it chose with a marker that
says "bisect". Choose a safe-looking commit nearby, note its commit
id, and check it out with:
$ git reset --hard fb47ddb2db
In this case, though, Git may not eventually be able to tell the first
bad one between some first skipped commits and a later bad commit.
There are also ways to automate the bisecting process if you have a
test script that can tell a good from a bad commit. See git-bisect[1]
for more information about this and other git bisect features.
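As a rough illustration of the automated form, the sketch below builds a scratch history of eight commits in which the fifth adds a marker line standing in for a bug, then lets git bisect run drive the search. The marker BUG, the file name, and the grep-based test command are all assumptions made for the example:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email you@yourdomain.example.com

# Eight commits; the fifth one introduces the hypothetical bug marker.
for i in 1 2 3 4 5 6 7 8
do
    echo "line $i" >>file.txt
    if [ "$i" -eq 5 ]; then echo "BUG" >>file.txt; fi
    git add file.txt
    git commit -q -m "change $i"
done

good=$(git rev-list --max-parents=0 HEAD)   # root commit, known good
git bisect start HEAD "$good" >/dev/null    # HEAD is known bad

# The test command exits 0 for a good revision and 1 for a bad one.
git bisect run sh -c '! grep -q BUG file.txt' >/dev/null

culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $culprit"
```

A real test script would typically build the project and run a regression test; an exit status of 125 tells git bisect run to skip a revision that cannot be tested.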
Naming commits
We have seen several ways of naming commits already:
There are many more; see the "SPECIFYING REVISIONS" section of the
gitrevisions[7] man page for the complete list of ways to name
revisions. Some examples:
$ git show fb47ddb2 # the first few characters of the object name
Recall that merge commits may have more than one parent; by default,
^ and ~ follow the first parent listed in the commit, but you can
also choose:
$ git show HEAD^1 # show the first parent of HEAD
The git fetch operation always stores the head of the last fetched
branch in FETCH_HEAD. For example, if you run git fetch without
specifying a local branch as the target of the operation
When we discuss merges we’ll also see the special name MERGE_HEAD,
which refers to the other branch that we’re merging in to the current
branch.
e05db0fd4f31dde7005f075a84f96b360d05984b
Creating tags
We can also create a tag to refer to a particular commit; after
running
Browsing revisions
$ git log test..master # commits reachable from master but not test
$ git log --since="2 weeks ago" # commits from the last 2 weeks
$ git log fs/ # ... which modify any file under fs/
$ git log -S'foo()' # commits which add or remove any file data matching the string 'foo()'
And of course you can combine all of these; the following finds
commits since v2.5 which touch the Makefile or any file under fs:
$ git log v2.5.. Makefile fs/
See the --pretty option in the git-log[1] man page for more
display options.
Note that git log starts with the most recent commit and works
backwards through the parents; however, since Git history can contain
multiple independent lines of development, the particular order that
commits are listed in may be somewhat arbitrary.
Generating diffs
You can generate diffs between any two versions using
git-diff[1]:
$ git diff master..test
That will produce the diff between the tips of the two branches. If
you’d prefer to find the diff from their common ancestor to test, you
can use three dots instead of two:
$ git diff master...test
Sometimes what you want instead is a set of patches; for this you can
use git-format-patch[1]:
$ git format-patch master..test
will generate a file with a patch for each commit reachable from test
but not from master.
You can always view an old version of a file by just checking out the
correct revision first. But sometimes it is more convenient to be
able to view an old version of a single file without checking
anything out; this command does that:
$ git show v2.5:fs/locks.c
Before the colon may be anything that names a commit, and after it
may be any path to a file tracked by Git.
Examples
Suppose you want to know how many commits you’ve made on mybranch
since it diverged from origin:
$ git log --pretty=oneline origin..mybranch | wc -l
Alternatively, you may often see this sort of thing done with the
lower-level command git-rev-list[1], which just lists the SHA-1’s of
all the given commits:
$ git rev-list origin..mybranch | wc -l
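The counting can also be left to git rev-list itself through its --count option. The following sketch assumes a scratch repository in which master plays the role of origin and mybranch carries three extra commits:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name for this sketch
git config user.name "Example User"
git config user.email you@yourdomain.example.com
echo base >file.txt; git add file.txt; git commit -q -m "base"

git switch -q -c mybranch
for n in 1 2 3
do
    echo "$n" >>work.txt
    git add work.txt
    git commit -q -m "work $n"
done

# Commits reachable from mybranch but not from master, counted two ways.
piped=$(git rev-list master..mybranch | wc -l)
counted=$(git rev-list --count master..mybranch)
echo "commits since divergence: $counted"
```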
Suppose you want to check whether two branches point at the same point
in history.
$ git diff origin..master
will tell you whether the contents of the project are the same at the
two branches; in theory, however, it’s possible that the same project
contents could have been arrived at by two different historical
routes. You could compare the object names:
$ git rev-list origin
e05db0fd4f31dde7005f075a84f96b360d05984b
$ git rev-list master
e05db0fd4f31dde7005f075a84f96b360d05984b
Or you could recall that the ... operator selects all commits
reachable from either one reference or the other but not both; so
$ git log origin...master
will return no commits when the two branches are equal.
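A quick way to convince yourself of this behaviour is to count the commits selected by the ... operator before and after one branch moves. This is a sketch on a scratch repository; the branch name other is arbitrary:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name for this sketch
git config user.name "Example User"
git config user.email you@yourdomain.example.com
echo base >file.txt; git add file.txt; git commit -q -m "base"

git branch other                          # both heads name the same commit
before=$(git rev-list --count master...other)

echo change >>file.txt
git commit -q -a -m "only on master"      # master moves ahead of other
after=$(git rev-list --count master...other)

echo "equal branches: $before commits, after diverging: $after"
```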
Suppose you know that the commit e05db0fd fixed a certain problem.
You’d like to find the earliest tagged release that contains that
fix.
Of course, there may be more than one answer: if the history branched
after commit e05db0fd, then there could be multiple "earliest" tagged
releases.
e05db0fd tags/v1.5.0-rc1^0~23
v1.5.0-rc0-260-ge05db0f
but that may sometimes help you guess which tags might come after the
given commit.
e05db0fd4f31dde7005f075a84f96b360d05984b
available
...
available
Suppose you would like to see all the commits reachable from the branch
head named master but not from any other head in your repository.
We can list all the heads in this repository with
git-show-ref[1]:
bf62196b5e363d73353a9dcf094c59595f3153b7 refs/heads/core-tutorial
db768d5504c1bb46f63ee9d6e1772bd047e05bf9 refs/heads/maint
a07157ac624b2524a059a3414e99f6f44bebc1e7 refs/heads/master
24dbc180ea14dc1aebe09f14c8ecf32010690627 refs/heads/tutorial-2
1e87486ae06626c2f31eaa63d26fc0fd646c8af2 refs/heads/tutorial-fixes
We can get just the branch-head names, and remove master, with
the help of the standard utilities cut and grep:
$ git show-ref --heads | cut -d' ' -f2 | grep -v '^refs/heads/master'
refs/heads/core-tutorial
refs/heads/maint
refs/heads/tutorial-2
refs/heads/tutorial-fixes
And then we can ask to see all the commits reachable from master
but not from these other heads:
$ gitk master --not $( git show-ref --heads | cut -d' ' -f2 |
grep -v '^refs/heads/master' )
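A variation on the pipeline above, offered only as a sketch: git for-each-ref can print the ref names directly, and grep -x matches whole lines, so a head whose name merely starts with master (for example a hypothetical master-fixes) would not be dropped from the exclusion list. The repository layout is invented for the example, and git rev-list stands in for gitk so that the result can be counted:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name for this sketch
git config user.name "Example User"
git config user.email you@yourdomain.example.com
echo base >file.txt; git add file.txt; git commit -q -m "base"

git branch maint                          # another head at the base commit
echo one >>file.txt; git commit -q -a -m "only on master 1"
echo two >>file.txt; git commit -q -a -m "only on master 2"

# Every branch head except master itself (exact, whole-line match).
others=$(git for-each-ref --format='%(refname)' refs/heads/ |
    grep -v -x 'refs/heads/master')

# $others is deliberately unquoted so that each ref becomes its own argument.
only_master=$(git rev-list master --not $others | wc -l)
echo "commits only on master: $only_master"
```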
$ git archive -o latest.tar.gz --prefix=project/ HEAD
will use HEAD to produce a gzipped tar archive in which each filename
is preceded by project/. The output file format is inferred from
the output file extension if possible; see git-archive[1] for
details.
Versions of Git older than 1.7.7 don’t know about the tar.gz format,
so you’ll need to use gzip explicitly:
$ git archive --format=tar --prefix=project/ HEAD | gzip >latest.tar.gz
Linus Torvalds, for example, makes new kernel releases by tagging them,
then running:
$ release-script 2.6.12 2.6.13-rc6 2.6.13-rc7
stable="$1"
last="$2"
new="$3"
and then he just cut-and-pastes the output commands after verifying that
they look OK.
[user]
email = you@yourdomain.example.com
$ mkdir project
$ cd project
$ git init
$ cd project
$ git init
$ git commit
3. Creating the commit using the content you told Git about
in step 2.
To update the index with the contents of a new or modified file, use
$ git add path/to/file
To remove a file from the index and from the working tree, use
$ git rm path/to/file
After each step you can verify that
$ git diff --cached
always shows the difference between the HEAD and the index file (this
is what you’d commit if you created the commit now) and that
$ git diff
shows the difference between the working tree and the index file.
Note that git add always adds just the current contents of a file
to the index; further changes to the same file will be ignored unless
you run git add on the file again.
When you’re ready, just run
$ git commit
and Git will prompt you for a commit message and then create the new
commit. Check to make sure it looks like what you expected with
$ git show
As a special shortcut,
$ git commit -a
will update the index with any files that you’ve modified or removed
and create a commit, all in one step.
$ git diff --cached # difference between HEAD and the index; what would be committed if you ran "commit" now.
$ git diff HEAD # difference between HEAD and working tree; what would be committed if you ran "commit -a" now.
Ignoring files
A project will often generate files that you do not want to track with
Git. This typically includes files generated by a build process or
temporary backup files made by your editor. Of course, not tracking
files with Git is just a matter of not calling git add on them. But it
quickly becomes annoying to have these untracked files lying around;
e.g. they make git add . practically useless, and they keep showing up
in the output of git status.
You can tell Git to ignore certain files by creating a file called
.gitignore in the top level of your working directory, with contents
such as:
# Lines starting with '#' are considered comments.
foo.txt
*.html
!foo.html
*.[oa]
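To check how patterns like these are applied, git check-ignore can be asked about individual paths. The sketch below writes the lines shown above into a scratch repository and tests a handful of made-up file names; the paths do not need to exist:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

cat >.gitignore <<'EOF'
# Lines starting with '#' are considered comments.
foo.txt
*.html
!foo.html
*.[oa]
EOF

# git check-ignore -q exits 0 when a path is ignored and 1 when it is not.
ignored=""
for path in foo.txt index.html foo.html main.o libx.a main.c
do
    if git check-ignore -q "$path"
    then
        ignored="$ignored $path"
    fi
done
echo "ignored:$ignored"
```

Here foo.html escapes the *.html pattern because of the later !foo.html line, and main.c matches no pattern at all.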
How to merge
You can rejoin two diverging branches of development using
git-merge[1]:
Auto-merged file.txt
Automatic merge failed; fix conflicts and then commit the result.
If you examine the resulting commit using gitk, you will see that it
has two parents, one pointing to the top of the current branch, and
one to the top of the other branch.
Resolving a merge
When a merge isn’t resolved automatically, Git leaves the index and
the working tree in a special state that gives you all the
information you need to help resolve the merge.
Files with conflicts are marked specially in the index, so until you
resolve the problem and update the index, git-commit[1] will fail:
$ git commit
Hello world
=======
Goodbye
>>>>>>> 77976da35a11db4580b80ae27e8d65caf5208086:file.txt
All you need to do is edit the files to resolve the conflicts, and then
$ git commit
Note that the commit message will already be filled in for you with
some information about the merge. Normally you can just use this
default message unchanged, but you may add additional commentary of
your own if desired.
The above is all you need to know to resolve a simple merge. But Git
also provides more information to help resolve conflicts:
All of the changes that Git was able to merge automatically are
already added to the index file, so git-diff[1] shows only the
conflicts. It uses an unusual syntax:
$ git diff
index 802992c,2b60207..0000000
--- a/file.txt
+++ b/file.txt
++<<<<<<< HEAD:file.txt
+Hello world
++=======
+ Goodbye
++>>>>>>> 77976da35a11db4580b80ae27e8d65caf5208086:file.txt
Recall that the commit which will be committed after we resolve this
conflict will have two parents instead of the usual one: one parent
will be HEAD, the tip of the current branch; the other will be the
tip of the other branch, which is stored temporarily in MERGE_HEAD.
During the merge, the index holds three versions of each file. Each of
these three "file stages" represents a different version of the file:
The diff above shows the differences between the working-tree version of
file.txt and the stage 2 and stage 3 versions. So instead of preceding
each line by a single + or -, it now uses two columns: the first
column is used for differences between the first parent and the working
directory copy, and the second for differences between the second parent
and the working directory copy. (See the "COMBINED DIFF FORMAT" section
of git-diff-files[1] for details of the format.)
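The three file stages can be read back from the index with the :N:path syntax. The sketch below reproduces a conflict like the one above in a scratch repository (the branch name other and the base content "Hello" are assumptions) and prints each stage, where stage 1 is the common ancestor, stage 2 the version from HEAD, and stage 3 the version from MERGE_HEAD:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name for this sketch
git config user.name "Example User"
git config user.email you@yourdomain.example.com

echo "Hello" >file.txt; git add file.txt; git commit -q -m "base"
git switch -q -c other
echo "Goodbye" >file.txt; git commit -q -a -m "say goodbye"
git switch -q master
echo "Hello world" >file.txt; git commit -q -a -m "say hello world"

# The merge stops with a conflict, so its non-zero exit status is expected.
git merge other >/dev/null 2>&1 || true

base=$(git show :1:file.txt)      # stage 1: common ancestor
ours=$(git show :2:file.txt)      # stage 2: version from HEAD
theirs=$(git show :3:file.txt)    # stage 3: version from MERGE_HEAD
echo "base=$base ours=$ours theirs=$theirs"
```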
After resolving the conflict in the obvious way (but before updating the
index), the diff will look like:
$ git diff
index 802992c,2b60207..0000000
--- a/file.txt
+++ b/file.txt
- Hello world
-Goodbye
++Goodbye world
This shows that our resolved version deleted "Hello world" from the
first parent, deleted "Goodbye" from the second parent, and added
"Goodbye world", which was previously absent from both.
Some special diff options allow diffing the working directory against
any of these stages:
$ gitk --merge
You may also use git-mergetool[1], which lets you merge the
unmerged files using external tools such as Emacs or kdiff3.
Each time you resolve the conflicts in a file and update the index:
Undoing a merge
If you get stuck and decide to just give up and throw the whole mess
away, you can always return to the pre-merge state with
$ git merge --abort
Or, if you’ve already committed the merge that you want to throw away,
$ git reset --hard ORIG_HEAD
Fast-forward merges
Fixing mistakes
If you’ve messed up the working tree, but haven’t yet committed your
mistake, you can return the entire working tree to the last committed
state with
$ git restore --staged --worktree :/
If you make a commit that you later wish you hadn’t, there are two
fundamentally different ways to fix the problem:
1. You can create a new commit that undoes whatever was done
by the old commit. This is the correct thing if your
mistake has already been made public.
2. You can go back and modify the old commit. You should
never do this if you have already made the history public;
Git does not normally expect the "history" of a project to
change, and cannot correctly perform repeated merges from
a branch that has had its history changed.
This will create a new commit which undoes the change in HEAD. You
will be given a chance to edit the commit message for the new commit.
You can also revert an earlier change, for example, the next-to-last:
$ git revert HEAD^
In this case Git will attempt to undo the old change while leaving
intact any changes made since then. If more recent changes overlap
with the changes to be reverted, then you will be asked to fix
conflicts manually, just as in the case of resolving a merge.
If the problematic commit is the most recent commit, and you have not
yet made that commit public, then you may just destroy it using
git reset.
Alternatively, you can edit the working directory and update the index
to fix your mistake, just as if you were going to create a new commit,
then run
$ git commit --amend
which will replace the old commit by a new commit incorporating your
changes, giving you a chance to edit the old commit message first.
Again, you should never do this to a commit that may already have
been merged into another branch; use git-revert[1] instead in that
case.
This command will save your changes away to the stash, and
reset your working tree and the index to match the tip of your
current branch. Then you can make your fix as usual.
... edit and test ...
After that, you can go back to what you were working on with
git stash pop:
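The whole round trip might look like the following sketch. It assumes a scratch repository, a half-finished edit to file.txt, and an unrelated fix committed while the edit is stashed; the file names and messages are placeholders:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email you@yourdomain.example.com
echo "stable" >file.txt; git add file.txt; git commit -q -m "base"

echo "work in progress" >>file.txt        # half-finished change
git stash push -q -m "work in progress for foo feature"

# The working tree now matches the tip of the current branch again.
if git diff --quiet; then clean=yes; else clean=no; fi

# ... edit and test ...
echo "fix" >fix.txt; git add fix.txt; git commit -q -m "Fix typo"

git stash pop -q                          # bring the stashed change back
last_line=$(tail -n 1 file.txt)
echo "clean while stashed: $clean; restored line: $last_line"
```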
Ensuring reliability
$ git fsck
...
You will see informational messages on dangling objects. They are objects
that still exist in the repository but are no longer referenced by any of
your branches, and can (and will) be removed after a while with gc.
You can run git fsck --no-dangling to suppress these messages, and still
view real errors.
Reflogs
This lists the commits reachable from the previous version of the
master branch head. This syntax can be used with any Git command
that accepts a commit, not just with git log. Some other examples:
$ git show master@{2} # See where the branch pointed 2, 3, ... changes ago.
will show what HEAD pointed to one week ago, not what the current branch
pointed to one week ago. This allows you to see the history of what
you’ve checked out.
The reflogs are kept by default for 30 days, after which they may be
pruned. See git-reflog[1] and git-gc[1] to learn how to control this
pruning, and see the "SPECIFYING REVISIONS" section of gitrevisions[7]
for details.
Note that the reflog history is very different from normal Git history.
While normal history is shared by every repository that works on the
same project, the reflog history is not shared: it tells you only about
how the branches in your local repository have changed over time.
In some situations the reflog may not be able to save you. For example,
suppose you delete a branch, then realize you need the history it
contained. The reflog is also deleted; however, if you have not yet
pruned the repository, then you may still be able to find the lost
commits in the dangling objects that git fsck reports. See
Dangling objects for the details.
$ git fsck
...
which does what it sounds like: it says that you want to see the commit
history that is described by the dangling commit(s), but not the
history that is described by all your existing branches and tags. Thus
you get exactly the history reachable from that commit that is lost.
(And notice that it might not be just one commit: we only report the
"tip of the line" as being dangling, but there might be a whole deep
and complex commit history that was dropped.)
If you decide you want the history back, you can always create a new
reference pointing to it, for example, a new branch:
$ git branch recovered-branch 7281251ddd
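The recovery can be rehearsed end to end on a scratch repository. In this sketch a branch named topic is deleted and its tip is found again through git fsck. Because the reflog of HEAD still remembers the commit in a fresh repository, --no-reflogs is passed so that fsck reports it as dangling; that option and the awk filter are choices made for the example:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name for this sketch
git config user.name "Example User"
git config user.email you@yourdomain.example.com
echo base >file.txt; git add file.txt; git commit -q -m "base"

git switch -q -c topic
echo more >>file.txt; git commit -q -a -m "work on topic"
git switch -q master
git branch -q -D topic                    # no branch refers to the commit now

# Pick the id out of the "dangling commit <id>" line.
lost=$(git fsck --no-reflogs 2>/dev/null |
    awk '/^dangling commit/ { print $3 }')

git branch recovered-branch "$lost"
subject=$(git log -1 --format=%s recovered-branch)
echo "recovered: $subject"
```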
Other types of dangling objects (blobs and trees) are also possible, and
dangling objects can arise in other situations.
After you clone a repository and commit a few changes of your own, you
may wish to check the original repository for updates and merge them
into your own work.
$ git fetch
In fact, if you have master checked out, then this branch has been
configured by git clone to get changes from the HEAD branch of the
origin repository. So often you can accomplish the above with just a
simple
$ git pull
This command will fetch changes from the remote branches to your
remote-tracking branches origin/*, and merge the default branch into
the current branch.
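The difference between fetching and pulling can be observed with two scratch repositories on the local disk. In this sketch the directory names upstream and work are invented, and --ff-only is added to git pull so that the example never creates a merge commit:

```shell
set -eu
top=$(mktemp -d)
git init -q "$top/upstream"
cd "$top/upstream"
git symbolic-ref HEAD refs/heads/master   # fix the branch name for this sketch
git config user.name "Example User"
git config user.email you@yourdomain.example.com
echo one >file.txt; git add file.txt; git commit -q -m "first"

git clone -q "$top/upstream" "$top/work"

# New work appears upstream after the clone was made.
echo two >>file.txt; git commit -q -a -m "second"

cd "$top/work"
git fetch -q                              # updates origin/master only
behind=$(git rev-list --count master..origin/master)

git pull -q --ff-only                     # fetches and merges origin/master
local_tip=$(git rev-parse master)
remote_tip=$(git rev-parse origin/master)
echo "behind after fetch: $behind"
```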
The git pull command can also be given . as the "remote" repository,
in which case it just merges in a branch from the current repository;
so the commands
git format-patch can include an initial "cover letter". You can insert
commentary on individual patches after the three dash line which
format-patch places after the commit message but before the patch
itself. If you use git notes to track your cover letter material,
git format-patch --notes will include the commit’s notes in a similar
manner.
You can then import these into your mail client and send them by
hand. However, if you have a lot to send at once, you may prefer to
use the git-send-email[1] script to automate the process.
Consult the mailing list for your project first to determine
their requirements for submitting patches.
Git will apply each patch in order; if any conflicts are found, it
will stop, and you can fix the conflicts as described in
"Resolving a merge". (The -3 option tells Git to perform a merge; if
you would prefer it just to abort and leave your tree and index
untouched, you may omit that option.)
$ git am --continue
and Git will create the commit for you and continue applying the
remaining patches from the mailbox.
The final result will be a series of commits, one for each patch in
the original mailbox, with authorship and commit log message each
taken from the message containing each patch.
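A patch can make the trip from one repository to another without any mail client involved, which is a convenient way to try the commands out. In the sketch below the two identities, the directory names, and the patch directory are all hypothetical:

```shell
set -eu
top=$(mktemp -d)
git init -q "$top/contributor"
cd "$top/contributor"
git config user.name "Alice Contributor"  # hypothetical identity
git config user.email alice@example.com
echo base >file.txt; git add file.txt; git commit -q -m "base"

git clone -q "$top/contributor" "$top/maintainer"

# The contributor makes a change and exports it as a patch file.
echo change >>file.txt
git commit -q -a -m "Improve file.txt"
git format-patch -q -1 -o "$top/patches" HEAD

# The maintainer applies it; -3 falls back to a three-way merge if needed.
cd "$top/maintainer"
git config user.name "Mia Maintainer"     # hypothetical identity
git config user.email mia@example.com
git am -q -3 "$top"/patches/*.patch

author=$(git log -1 --format=%an)
committer=$(git log -1 --format=%cn)
subject=$(git log -1 --format=%s)
echo "$subject (author: $author, committer: $committer)"
```

The author recorded in the new commit comes from the patch, while the committer is whoever ran git am.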
If you and the maintainer both have accounts on the same machine, then
you can just pull changes from each other’s repositories directly;
commands that accept repository URLs as arguments will also accept a
local directory name:
$ git clone /path/to/repository
you push
^ |
| |
| |
| |
| they push V
Next, copy proj.git to the server where you plan to host the
public repository. You can use scp, rsync, or whatever is most
convenient.
If someone else administers the server, they should tell you what
directory to put the repository in, and what git:// URL it will
appear at. You can then skip to the section "Pushing changes to a
public repository", below.
You can also run git daemon as an inetd service; see the
git-daemon[1] man page for details. (See especially the examples
section.)
All you need to do is place the newly created bare Git repository in
a directory that is exported by the web server, and make some
adjustments to give web clients some extra information they need:
$ mv proj.git /home/you/public_html/proj.git
$ cd proj.git
$ mv hooks/post-update.sample hooks/post-update
(See also setup-git-server-over-http for a slightly more sophisticated
setup using WebDAV which also allows pushing over HTTP.)
or just
$ git push ssh://yourserver.com/~you/proj.git master
As with git fetch, git push will complain if this does not result in a
fast-forward; see the following section for details on handling this
case.
url = yourserver.com:proj.git
fetch = +refs/heads/*:refs/remotes/example/*
hint: Updates were rejected because the tip of your current branch is behind
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
This can happen, for example, if you:
You may force git push to perform the update anyway by preceding the
branch name with a plus sign:
$ git push ssh://yourserver.com/~you/proj.git +master
Note the addition of the + sign. Alternatively, you can use the
-f flag to force the remote update, as in:
$ git push -f ssh://yourserver.com/~you/proj.git master
It’s also possible for a push to fail in this way when other people have
the right to push to the same repository. In that case, the correct
solution is to retry the push after first updating your work: either by
a pull, or by a fetch followed by a rebase; see the next section and
gitcvs-migration[7] for more.
However, while there is nothing wrong with Git’s support for shared
repositories, this mode of operation is not generally recommended,
simply because the mode of collaboration that Git supports (exchanging
patches and pulling from public repositories) has so many advantages
over the central shared repository:
The gitweb cgi script provides users an easy way to browse your
project’s revisions, file contents and logs without having to install
Git. Features like RSS/Atom feeds and blame/annotation details may
optionally be enabled.
Examples
This describes how Tony Luck uses Git in his role as maintainer of the
IA64 architecture for the Linux kernel.
A "test" tree into which patches are initially placed so that they
can get some exposure when integrated with other ongoing development.
This tree is available to Andrew for pulling into -mm whenever he
wants.
A "release" tree into which tested patches are moved for final sanity
checking, and as a vehicle to send them upstream to Linus (by sending
him a "please pull" request).
To set this up, first create your work tree by cloning Linus’s public
tree:
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git work
$ cd work
Now create the branches in which you are going to work; these start out
at the current tip of origin/master branch, and should be set up (using
the --track option to git-branch[1]) to merge changes in from
Linus by default.
Important note! If you have any local changes in these branches, then
this merge will create a commit object in the history (with no local
changes Git will simply do a "fast-forward" merge). Many people dislike
the "noise" that this creates in the Linux history, so you should avoid
doing this capriciously in the release branch, as these noisy commits
will become part of the permanent history when you ask Linus to pull
from the release branch.
[remote "mytree"]
url = master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux.git
push = release
push = test
EOF
Then you can push both the test and release trees using
git-push[1]:
$ git push mytree
or
Now you apply the patch(es), run some tests, and commit the change(s). If
the patch is a multi-part series, then you should apply each as a
separate commit to this branch.
$ ... patch ... test ... commit [ ... patch ... test ... commit ]*
When you are happy with the state of this change, you can merge it into the
"test" branch in preparation to make it public:
$ git switch test && git merge speed-up-spinlocks
It is unlikely that you would have any conflicts here, but you might if
you spent a while on this step and had also pulled new versions from
upstream.
Sometime later, when enough time has passed and testing is done, you can
pull the same branch into the release tree ready to go upstream. This is
where you see the value of keeping each patch (or patch series) in its
own branch. It means that the patches can be moved into the release tree
in any order.
$ git switch release && git merge speed-up-spinlocks
After a while, you will have a number of branches, and despite the
well-chosen names you picked for each of them, you may forget what
they are for, or what status they are in. To get a reminder of what
changes are in a specific branch, use:
$ git log linux..branchname | git shortlog
To see whether it has already been merged into the test or release branches,
use:
$ git log test..branchname
or
$ git log release..branchname
(If this branch has not yet been merged, you will see some log entries.
If it has been merged, then there will be no output.)
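The check described above can be scripted. The following is a minimal sketch, not part of the original workflow scripts: it builds a throwaway repository (the identity values, the temporary directory and the helper name unmerged_count are assumptions made for illustration) and counts the commits that are on the topic branch but not yet in test, before and after the merge.

```shell
#!/bin/sh
# Sketch: count commits on a topic branch that a target branch lacks.
# A count of 0 corresponds to "git log target..topic" printing nothing.
set -e
# Throwaway identity so commits work on an unconfigured machine (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git commit -q --allow-empty -m "base"
git branch test
git checkout -q -b speed-up-spinlocks
git commit -q --allow-empty -m "topic work"

# Hypothetical helper: $1 = target branch, $2 = topic branch.
unmerged_count() {
    git rev-list --count "$1..$2"
}

before=$(unmerged_count test speed-up-spinlocks)
git checkout -q test
git merge -q speed-up-spinlocks
after=$(unmerged_count test speed-up-spinlocks)

echo "unmerged commits before merge: $before, after merge: $after"
```

A non-zero count means the branch still carries work that the target has not seen; zero means it is safe to consider the branch merged into that target.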
Once a patch completes the great cycle (moving from test to release,
then pulled by Linus, and finally coming back into your local
origin/master branch), the branch for this change is no longer needed.
You detect this when the output from:
Here are some of the scripts that simplify all this even further.
==== update script ====
case "$1" in
test|release)
;;
origin)
if [ $before != $after ]
then
fi
;;
*)
exit 1
;;
esac
pname=$0
usage()
exit 1
usage
case "$2" in
test|release)
then
exit 1
fi
;;
*)
usage
;;
esac
gb=$(tput setab 2)
rb=$(tput setab 1)
restore=$(tput setab 9)
then
echo $rb Warning: commits in release that are not in test $restore
fi
do
then
continue
fi
status=
do
then
status=$status${ref:0:1}
fi
done
case $status in
trl)
;;
rl)
;;
l)
;;
"")
;;
*)
;;
esac
done
If you present all of your changes as a single patch (or commit), they
may find that it is too much to digest all at once.
If you present them with the entire history of your work, complete with
mistakes, corrections, and dead ends, they may be overwhelmed.
4. The complete series produces the same end result as your own
(probably much messier!) development process did.
We will introduce some tools that can help you do this, explain how to
use them, and then explain some of the problems that can arise because
you are rewriting history.
$ vi file.txt
$ git commit
$ vi otherfile.txt
$ git commit
...
Some more interesting work has been done in the upstream project, and
origin has advanced:
At this point, you could use pull to merge your changes back in;
the result would create a new merge commit, like this:
o--o--O--o--o--o <-- origin
\ \
This will remove each of your commits from mywork, temporarily saving
them as patches (in a directory named .git/rebase-apply), update mywork
to point at the latest version of origin, then apply each of the saved
patches to the new mywork. The result will look like:
o--o--O--o--o--o <-- origin
At any point you may use the --abort option to abort this process and
return mywork to the state it had before you started the rebase:
$ git rebase --abort
which will replace the old commit by a new commit incorporating your
changes, giving you a chance to edit the old commit message first.
This is useful for fixing typos in your last commit, or for adjusting
the patch contents of a poorly staged commit.
If you need to amend commits from deeper in your history, you can
use interactive rebase’s edit instruction.
Reordering or selecting from a patch series
You can also edit a patch series with an interactive rebase. This is
the same as reordering a patch series using format-patch, so use
whichever interface you like best.
Rebase your current HEAD on the last commit you want to retain as-is.
For example, if you want to reorder the last 5 commits, use:
$ git rebase -i HEAD~5
This will open your editor with a list of steps to be taken to perform
your rebase.
pick deadbee The oneline of this commit
...
# Commands:
# These lines can be re-ordered; they are executed from top to bottom.
The rebase will stop where pick has been replaced with edit or when a
step in the list fails to mechanically resolve conflicts and needs your
help. When you are done editing and/or resolving conflicts you can
continue with git rebase --continue. If you decide that things are
getting too hairy, you can always bail out with git rebase --abort.
Even after the rebase is complete, you can still recover the original
branch by using the reflog.
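As an illustration of recovering a branch through the reflog, here is a minimal sketch under stated assumptions: the repository is a throwaway one, the branch mywork mirrors the name used earlier, and the name mywork-before-rebase is made up for the example. It relies on the branch's reflog recording the pre-rebase tip as mywork@{1}.

```shell
#!/bin/sh
# Sketch: find the pre-rebase tip of a branch in its reflog.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
echo base > base.txt
git add base.txt
git commit -q -m "base"
start=$(git symbolic-ref --short HEAD)   # whatever the initial branch is called

# Local work on mywork, then the starting branch moves ahead.
git checkout -q -b mywork
echo mine > mine.txt
git add mine.txt
git commit -q -m "my work"
git checkout -q "$start"
echo theirs > theirs.txt
git add theirs.txt
git commit -q -m "upstream work"

# Rebase mywork and remember where it pointed before.
git checkout -q mywork
old=$(git rev-parse mywork)
git rebase -q "$start"
new=$(git rev-parse mywork)

# The previous value of the branch is one step back in its reflog.
recovered=$(git rev-parse 'mywork@{1}')
git branch mywork-before-rebase "$recovered"
echo "old tip: $old"
echo "new tip: $new"
echo "recovered from reflog: $recovered"
```

Creating a branch at the recovered commit keeps the original series reachable even after the reflog entry eventually expires.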
Other tools
There are numerous other tools, such as StGit, which exist for the
purpose of maintaining a patch series. These are outside of the scope
of this manual.
\ \
\ \
Git has no way of knowing that the new head is an updated version of
the old head; it treats this situation exactly the same as it would if
two developers had independently done the work on the old and new heads
in parallel. At this point, if someone attempts to merge the new head
into their branch, Git will attempt to merge together the two (old and
new) lines of development, instead of trying to replace the old by the
new. The results are likely to be unexpected.
Why bisecting merge commits can be harder than bisecting linear history
The git-bisect[1] command correctly handles history that
includes merge commits. However, when the commit that it finds is a
merge commit, the user may need to work harder than usual to figure out
why that commit introduced a problem.
---Z---o---X---...---o---A---C---D
\ /
o---o---Y---...---o---B
Partly for this reason, many experienced Git users, even when
working on an otherwise merge-heavy project, keep the history
linear by rebasing against the latest upstream version before
publishing.
The first argument, origin, just tells Git to fetch from the
repository you originally cloned from. The second argument tells Git
to fetch the branch named todo from the remote repository, and to
store it locally under the name refs/heads/my-todo-work.
In some cases it is possible that the new head will not actually be
a descendant of the old head. For example, the developer may have
realized a serious mistake was made and decided to backtrack,
resulting in a situation like:
o--o--o--o--a--b <-- old head of the branch
In this case, git fetch will fail, and print out a warning.
In that case, you can still force Git to update to the new head, as
described in the following section. However, note that in the
situation above this may mean losing the commits labeled a and b,
unless you’ve already created a reference of your own pointing to
them.
Note the addition of the + sign. Alternatively, you can use the -f
flag to force updates of all the fetched branches, as in:
core.repositoryformatversion=0
core.filemode=true
core.logallrefupdates=true
remote.origin.url=git://git.kernel.org/pub/scm/git/git.git
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
branch.master.remote=origin
branch.master.merge=refs/heads/master
If there are other repositories that you also use frequently, you can
create similar configuration options to save typing; for example,
$ git remote add example git://example.com/proj.git
url = git://example.com/proj.git
fetch = +refs/heads/*:refs/remotes/example/*
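To see what git remote add records, the sketch below runs it in a throwaway repository and reads the two settings back with git config. No network access is needed, since adding a remote does not contact it; the temporary directory is an assumption for the example.

```shell
#!/bin/sh
# Sketch: inspect the configuration that "git remote add" writes.
set -e
cd "$(mktemp -d)"
git init -q

git remote add example git://example.com/proj.git

# Read back the values stored under [remote "example"] in .git/config.
url=$(git config --get remote.example.url)
fetch=$(git config --get remote.example.fetch)

echo "url   = $url"
echo "fetch = $fetch"
```

The leading + in the fetch refspec is what allows remote-tracking branches to be updated even when the update is not a fast-forward.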
After configuring the remote, the following three commands will do the
same thing:
$ git fetch git://example.com/proj.git +refs/heads/*:refs/remotes/example/*
Git concepts
Git is built on a small number of simple but powerful ideas. While it
is possible to get things done without understanding them, you will
find Git much more intuitive if you do.
We already saw in Understanding History: Commits that all commits are
stored under a 40-digit "object name". In fact, all the information
needed to represent the history of a project is stored in objects with
such names. In each case the name is calculated by taking the SHA-1
hash of the contents of the object. The SHA-1 hash is a cryptographic
hash function. What that means to us is that it is practically
impossible to find two different objects with the same name. This has a
number of advantages; among others:
Git can quickly determine whether two objects are identical or not,
just by comparing names.
Since object names are computed the same way in every repository, the
same content stored in two repositories will always be stored under
the same name.
Git can detect errors when it reads an object, by checking that the
object’s name is still the SHA-1 hash of its contents.
(See Object storage format for the details of the object formatting and
SHA-1 calculation.)
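The naming rule can be checked by hand. The sketch below assumes a SHA-1 repository format and the availability of the sha1sum utility (both assumptions, not statements from this manual): it hashes the header "blob <size>" followed by a NUL byte and the file contents, which is how a blob's name is derived, and compares the result with git hash-object when Git is installed.

```shell
#!/bin/sh
# Sketch: compute a blob's object name without Git, then compare.
set -e
f=$(mktemp)
printf 'hello\n' > "$f"

# Size in bytes; the arithmetic expansion strips any padding from wc.
size=$(( $(wc -c < "$f") ))

# Object name = SHA-1 of "blob <size>\0" followed by the raw contents.
manual=$( { printf 'blob %d\000' "$size"; cat "$f"; } | sha1sum | cut -d' ' -f1 )
echo "computed by hand: $manual"

# Optional cross-check; hash-object without -w does not need a repository.
if command -v git >/dev/null 2>&1; then
    echo "git hash-object:  $(git hash-object "$f")"
fi
```

Because the name depends only on the type, size and contents, the same file yields the same name in every repository.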
There are four different types of objects: "blob", "tree", "commit", and
"tag".
Commit Object
commit 2be7fcb4764f2dbcee52635b91fedb1b3dcf7ab4
tree fb3a8bdd0ceddd019615af4d57a53f43d8cee2bf
parent 257a84d9d02e90447b149af58b271c19405edb6a
a tree: The SHA-1 name of a tree object (as defined below), representing
the contents of a directory at a certain point in time.
parent(s): The SHA-1 name(s) of some number of commits which represent
the immediately previous step(s) in the history of the project. The
example above has one parent; merge commits may have more than one. A
commit with no parents is called a "root" commit, and represents the
initial revision of a project. Each project must have at least one
root. A project can also have multiple roots, though that isn’t common
(or necessarily a good idea).
an author: The name of the person responsible for this change, together
with its date.
a committer: The name of the person who actually created the commit,
with the date it was done. This may be different from the author, for
example, if the author was someone who wrote a patch and emailed it
to the person who used it to create the commit.
Note that a commit does not itself contain any information about what
actually changed; all changes are calculated by comparing the contents
of the tree referred to by this commit with the trees associated with
its parents. In particular, Git does not attempt to record file renames
explicitly, though it can identify cases where the existence of the same
file data at changing paths suggests a rename. (See, for example, the
-M option to git-diff[1].)
Tree Object
...
As you can see, a tree object contains a list of entries, each with a
mode, object type, SHA-1 name, and name, sorted by name. It represents
the contents of a single directory tree.
Note that the files all have mode 644 or 755: Git actually only pays
attention to the executable bit.
Blob Object
Note that the only valid version of the GPL as far as this project
...
Trust
If you receive the SHA-1 name of a blob from one source, and its contents
from another (possibly untrusted) source, you can still trust that those
contents are correct as long as the SHA-1 name agrees. This is because
the SHA-1 is designed so that it is infeasible to find different contents
that produce the same hash.
Similarly, you need only trust the SHA-1 name of a top-level tree object
to trust the contents of the entire directory that it refers to, and if
you receive the SHA-1 name of a commit from a trusted source, then you
can easily verify the entire history of commits reachable through
parents of that commit, and all of those contents of the trees referred
to by those commits.
So to introduce some real trust in the system, the only thing you need
to do is to digitally sign just one special note, which includes the
name of a top-level commit. Your digital signature shows others
that you trust that commit, and the immutability of the history of
commits tells others that they can trust the whole history.
Tag Object
A tag object contains an object, object type, tag name, the name of the
person ("tagger") who created the tag, and a message, which may contain
a signature, as can be seen using git-cat-file[1]:
$ git cat-file tag v1.5.0
object 437b1b20df4b356c9342dac8d38849f24ef44f27
type commit
tag v1.5.0
GIT 1.5.0
iD8DBQBF0lGqwMbZpPMRm5oRAuRiAJ9ohBLd7s2kqjkKlq1qqC57SbnmzQCdG4ui
nLE/L9aUXdWeTFPron96DLA=
=2E+0
See the git-tag[1] command to learn how to create and verify tag
objects. (Note that git-tag[1] can also be used to create
"lightweight tags", which are not tag objects at all, but just simple
references whose names begin with refs/tags/).
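The difference between the two kinds of tag is visible from the type of the object each name resolves to. A minimal sketch, using a throwaway repository and the made-up tag names light and annotated:

```shell
#!/bin/sh
# Sketch: a lightweight tag is only a reference; an annotated tag is an object.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
git commit -q --allow-empty -m "initial"

git tag light                                  # reference straight to the commit
git tag -a -m "annotated tag" annotated        # creates a tag object

light_type=$(git cat-file -t light)
annotated_type=$(git cat-file -t annotated)

# Peeling the annotated tag yields the commit it points to.
peeled=$(git rev-parse 'annotated^{commit}')
head=$(git rev-parse HEAD)

echo "light     -> $light_type"
echo "annotated -> $annotated_type (peels to $peeled)"
```

Both names can be used wherever a commit is expected, because Git dereferences the tag object automatically.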
Newly created objects are initially created in a file named after the
object’s SHA-1 hash (stored in .git/objects).
You can save space and make Git faster by moving these loose objects
into a "pack file", which stores a group of objects in an efficient
compressed format; the details of how pack files are formatted can be
found in gitformat-pack[5].
To put the loose objects into a pack, just run git repack:
$ git repack
$ git prune
to remove any of the "loose" objects that are now contained in the
pack. This will also remove any unreferenced objects (which may be
created when, for example, you use git reset to remove a commit).
You can verify that the loose objects are gone by looking at the
.git/objects directory or by running
$ git count-objects
0 objects, 0 kilobytes
Although the object files are gone, any commands that refer to those
objects will work exactly as they did before.
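The sequence above can be tried end to end in a scratch repository. This is a sketch under the assumption that nothing else touches the repository while it runs; the file name and commit message are invented for the example.

```shell
#!/bin/sh
# Sketch: pack loose objects, prune them, and confirm history still works.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
echo data > file.txt
git add file.txt
git commit -q -m "one commit: a blob, a tree and a commit object"

loose_before=$(git count-objects)

git repack -q     # copy the loose objects into a pack file
git prune         # drop loose objects that now live in the pack

loose_after=$(git count-objects)
subject=$(git log -1 --format=%s)   # objects are still readable from the pack

echo "before: $loose_before"
echo "after:  $loose_after"
```

Commands read packed objects transparently, which is why the log lookup at the end still succeeds.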
Dangling objects
Similarly, when the "ort" merge strategy runs, and finds that there are
criss-cross merges and thus more than one merge base (which is fairly
unusual, but it does happen), it will generate one temporary midway
tree (or possibly even more, if you had lots of criss-crossing merges
and more than two merge bases) as a temporary internal merge base, and
again, those are real objects, but the end result will not end up
pointing to them, so they end up "dangling" in your repository.
This asks for all the history reachable from the given commit but not
from any branch, tag, or other reference. If you decide it’s something
you want, you can always create a new reference to it, e.g.,
For blobs and trees, you can’t do the same, but you can still examine
them. You can just do
to show what the contents of the blob were (or, for a tree, basically
what the ls for that directory was), and that may give you some idea
of what the operation was that left that dangling object.
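Finding and rescuing a dangling commit can be sketched as follows. The example deliberately creates an unreferenced commit with git commit-tree, then locates it with git fsck; the --no-reflogs option and the branch name recovered are choices made for this illustration.

```shell
#!/bin/sh
# Sketch: create a dangling commit, find it with fsck, and give it a name.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
echo data > file.txt
git add file.txt
git commit -q -m "base"

# A commit object that no branch, tag or reflog refers to.
lost=$(echo "orphaned work" | git commit-tree 'HEAD^{tree}' -p HEAD)

# fsck reports it as "dangling commit <name>".
found=$(git fsck --no-reflogs 2>/dev/null |
        awk '$1 == "dangling" && $2 == "commit" { print $3 }')

# A new reference makes it reachable again, so prune will keep it.
git branch recovered "$found"
echo "rescued $found as branch 'recovered'"
```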
Anyway, once you are sure that you’re not interested in any dangling
state, you can just prune all unreachable objects:
$ git prune
and they’ll be gone. (You should only run git prune on a quiescent
repository; it’s kind of like doing a filesystem fsck recovery: you
don’t want to do that while the filesystem is mounted. git prune is
designed not to cause any harm in such cases of concurrent accesses to
a repository but you might receive confusing or scary messages.)
The first defense against such problems is backups. You can back up a
Git directory using clone, or just using cp, tar, or any other backup
mechanism.
As a last resort, you can search for the corrupted objects and attempt
to replace them by hand. Back up your repository before attempting this
in case you corrupt things even more in the process.
Before starting, verify that there is corruption, and figure out where
it is with git-fsck[1]; this may be time-consuming.
to blob 4b9458b3786228369c63936db65827de3cc06200
Now you know that blob 4b9458b3 is missing, and that the tree 2d9263c6
points to it. If you could find just one copy of that missing blob
object, possibly in some other repository, you could move it into
.git/objects/4b/9458b3... and be done. Suppose you can’t. You can
still examine the tree that pointed to it with git-ls-tree[1],
which might output something like:
...
...
So now you know that the missing blob was the data for a file named
myfile. And chances are you can also identify the directory (let’s say
it’s in somedirectory). If you’re lucky the missing copy might be the
same as the copy you have checked out in your working tree at
somedirectory/myfile; you can test whether that’s right with
git-hash-object[1]:
$ git hash-object -w somedirectory/myfile
which will create and store a blob object with the contents of
somedirectory/myfile, and output the SHA-1 of that object. If you’re
extremely lucky it might be 4b9458b3786228369c63936db65827de3cc06200, in
which case you’ve guessed right, and the corruption is fixed!
Otherwise, you need more information. How do you tell which version of
the file has been lost?
Because you’re asking for raw output, you’ll now get something like
commit abc
Author:
Date:
...
commit xyz
Author:
Date:
...
This tells you that the immediately following version of the file was
"newsha", and that the immediately preceding version was "oldsha".
You also know the commit messages that went with the change from oldsha
to 4b9458b and with the change from 4b9458b to newsha.
If you’ve been committing small enough changes, you may now have a good
shot at reconstructing the contents of the in-between state 4b9458b.
If you can do that, you can now recreate the missing object with
(Btw, you could have ignored the fsck, and started with doing a
and just looked for the sha of the missing object (4b9458b) in that
whole thing. It’s up to you: Git does have a lot of information, it is
just missing one particular blob version.)
The index
...
Note that in older documentation you may see the index called the
"current directory cache" or just the "cache". It has three important
properties:
2. The index enables fast comparisons between the tree object it defines
and the working tree.
It does this by storing some additional data for each entry (such as
the last modified time). This data is not displayed above, and is not
stored in the created tree object, but it can be used to determine
quickly which files in the working directory differ from what was
stored in the index, and thus save Git from having to read all of the
data from such files to look for changes.
We saw in Getting conflict-resolution help during a merge that during a
merge the index can store multiple versions of a single file (called
"stages"). The third column in the git-ls-files[1] output above is the
stage number, and will take on values other than 0 for files with merge
conflicts.
The index is thus a sort of temporary staging area, which is filled with
a tree which you are in the process of working on.
If you blow the index away entirely, you generally haven’t lost any
information as long as you have the name of the tree that it described.
Submodules
Large projects are often composed of smaller, self-contained modules. For
example, an embedded Linux distribution’s source tree would include every
piece of software in the distribution with some local modifications; a
movie player might need to build against a specific, known-working
version of a decompression library; several independent programs might
all share the same build scripts.
Git does not allow partial checkouts, so duplicating this approach in Git
would force developers to keep a local copy of modules they are not
interested in touching. Commits in an enormous checkout would be slower
than you’d expect as Git would have to scan every directory for changes.
If modules have a lot of local history, clones would take forever.
On the plus side, distributed revision control systems can much better
integrate with external sources. In a centralized model, a single
arbitrary snapshot of the external project is exported from its own
revision control and then imported into the local revision control on a
vendor branch. All the history is hidden. With distributed revision
control you can clone the entire external history and much more easily
follow development and re-merge local changes.
Git’s submodule support allows a repository to contain, as a subdirectory,
a checkout of an external project. Submodules maintain their own identity;
the submodule support just stores the submodule repository location and
commit ID, so other developers who clone the containing project
("superproject") can easily clone all the submodules at the same revision.
Partial checkouts of the superproject are possible: you can tell Git to
clone none, some or all of the submodules.
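What the superproject actually records can be inspected directly. The sketch below is an illustration under several assumptions: it uses throwaway repositories in a temporary directory, a local path as the submodule URL (unsuitable for a published superproject), and passes protocol.file.allow=always on the command line because recent Git versions refuse local-path submodule clones by default.

```shell
#!/bin/sh
# Sketch: a submodule is stored as a "gitlink" entry naming one commit.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

top=$(mktemp -d)
cd "$top"

# The project that will become a submodule.
git init -q a
git -C a commit -q --allow-empty -m "Initial commit, submodule a"

# The superproject.
git init -q super
cd super
git -c protocol.file.allow=always submodule --quiet add "$top/a" a
git commit -q -m "Add submodule a"

# The tree entry for "a" has mode 160000 and type "commit".
entry=$(git ls-tree HEAD a)
mode=${entry%% *}
recorded=$(git rev-parse HEAD:a)
sub_head=$(git -C "$top/a" rev-parse HEAD)
url=$(git config -f .gitmodules --get submodule.a.url)

echo "tree entry:       $entry"
echo "recorded commit:  $recorded"
echo "URL in .gitmodules: $url"
```

The commit ID lives in the superproject's tree, while the location lives in .gitmodules; together they are all a fresh clone needs to reproduce the same submodule revision.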
$ cd ~/git
$ for i in a b c d
do
mkdir $i
cd $i
git init
cd ..
done
$ cd super
$ git init
$ for i in a b c d
do
done
Note: Do not use local URLs here if you plan to publish your superproject!
. .. .git .gitmodules a b c d
The git submodule add <repo> <path> command does a couple of things:
It clones the submodule from <repo> to the given <path> under the
current directory and by default checks out the master branch.
$ cd ..
$ cd cloned
. ..
-d266b9873ad50488163457f025db7cdd9683d88b a
-e81d457da15309b4fef4249aba9b50187999670d b
-c1536a972b9affea0f16e0680ba87332dc059146 c
-d96249ff5d57de5de093e6baff9e0aafa5276a74 d
Note: The commit object names shown above would be different for you,
but they should match the HEAD commit object names of your
repositories. You can check it by running git ls-remote ../a.
Pulling down the submodules is a two-step process. First run git
submodule init to add the submodule repository URLs to .git/config:
Now use git submodule update to clone the repositories and check out the
commits specified in the superproject:
$ cd a
$ ls -a
. .. .git a.txt
One major difference between git submodule update and git submodule add
is that git submodule update checks out a specific commit, rather than
the tip of a branch. It’s like checking out a tag: the head is detached,
so you’re not working on a branch.
$ git branch
master
If you want to make a change within a submodule and you have a detached
head, then you should create or check out a branch, make your changes,
publish the change within the submodule, and then update the
superproject to reference the new commit:
$ git switch master
or
$ git switch -c fix-up
then
$ echo "adding a line again" >> a.txt
$ git push
$ cd ..
$ git diff
--- a/a
+++ b/a
@@ -1 +1 @@
$ git add a
$ git push
You have to run git submodule update after git pull if you want to update
submodules, too.
$ cd ..
$ git add a
$ git push
$ cd ~/git/cloned
$ git pull
error: pathspec '261dfac35cb99d380eb966e102c1197139f7fa24' did not match any file(s) known to git.
--- a/sub
+++ b/sub
@@ -1 +1 @@
You also should not rewind branches in a submodule beyond commits that were
ever recorded in any superproject.
It’s not safe to run git submodule update if you’ve made and committed
changes within a submodule without checking out a branch first. They
will be silently overwritten:
$ cat a.txt
module a
$ cd ..
$ cd a
$ cat a.txt
module a
The Workflow
but to avoid common mistakes with filename globbing etc., the command
will not normally add totally new entries or remove old entries,
i.e. it will normally just update existing cache entries.
To tell Git that yes, you really do realize that certain files no
longer exist, or that new files should be added, you should use the
--remove and --add flags respectively.
NOTE! A --remove flag does not mean that subsequent filenames will
necessarily be removed: if the files still exist in your directory
structure, the index will be updated with their new status, not
removed. The only thing --remove means is that update-index will be
considering a removed file to be a valid thing, and if the file really
does not exist any more, it will update the index accordingly.
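The behavior of the two flags can be observed in a scratch repository. In this sketch the file name new.txt and the variable names are invented for the example; it first shows that a plain update-index refuses a path that is not yet in the index, then adds the file, and finally removes the entry once the file is gone from the working directory.

```shell
#!/bin/sh
# Sketch: --add and --remove tell update-index that new or vanished
# files are intentional.
set -e
cd "$(mktemp -d)"
git init -q
echo hello > new.txt

# Without --add, a path unknown to the index is rejected.
if git update-index new.txt 2>/dev/null; then
    plain=accepted
else
    plain=refused
fi

git update-index --add new.txt
tracked=$(git ls-files)

# --remove only drops the entry because the file no longer exists.
rm new.txt
git update-index --remove new.txt
after_remove=$(git ls-files)

echo "plain update-index: $plain"
echo "after --add:        $tracked"
echo "after --remove:     ${after_remove:-<empty index>}"
```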
You write your current index file to a "tree" object with the program
$ git write-tree
that doesn’t come with any options; it will just write out the
current index into the set of tree objects that describe that state,
and it will return the name of the resulting top-level tree. You can
use that tree to re-generate the index at any time by going in the
other direction:
You read a "tree" file from the object database, and use that to
populate (and overwrite: don’t do this if your index contains any
unsaved state that you might want to restore later!) your current
index. Normal operation is just
and your index file will now be equivalent to the tree that you saved
earlier. However, that is only your index file: your working
directory contents have not been modified.
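The round trip between the index and a tree object can be sketched as follows, in a throwaway repository with an invented file name. The index file is deleted on purpose to show that the saved tree name is enough to rebuild it.

```shell
#!/bin/sh
# Sketch: write the index out as a tree, destroy the index, read it back.
set -e
cd "$(mktemp -d)"
git init -q
echo hello > file.txt
git update-index --add file.txt

tree=$(git write-tree)            # index -> object database

rm .git/index                     # blow the index away entirely
after_rm=$(git ls-files)          # nothing is staged now

git read-tree "$tree"             # object database -> index
restored=$(git write-tree)

echo "saved tree:    $tree"
echo "restored tree: $restored"
```

Only the index is rebuilt here; the files in the working directory are neither read nor written by read-tree in this form.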
You update your working directory from the index by "checking out"
files. This is not a very common operation, since normally you’d just
keep your files updated, and rather than write to your working
directory, you’d tell the index files about the changes in your
working directory (i.e. git update-index).
or, if you want to check out all of the index, use -a.
NOTE! git checkout-index normally refuses to overwrite old files, so
if you have an old version of the tree already checked out, you will
need to use the -f flag (before the -a flag or the filename) to
force the checkout.
Finally, there are a few odds and ends which are not purely moving
from one representation to the other:
Normally a "commit" has one parent: the previous state of the tree
before a certain change was made. However, sometimes it can have two
or more parent commits, in which case we call it a "merge", due to the
fact that such a commit brings together ("merges") two or more
previous states represented by other commits.
You create a commit object by giving it the tree that describes the
state at the time of the commit, and a list of parents:
and then giving the reason for the commit on stdin (either through
redirection from a pipe or file, or by just typing it at the tty).
git commit-tree will return the name of the object that represents
that commit, and you should save it away for later use. Normally,
you’d commit a new HEAD state, and while Git doesn’t care where you
save the note about that state, in practice we tend to just write the
result to the file pointed at by .git/HEAD, so that we can always see
what the last committed state was.
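Putting the plumbing steps together, a commit can be made without git commit at all. This is a minimal sketch in a throwaway repository; the identity values, file name and message are assumptions, and git update-ref is used here as a safe way of writing the result to the branch that HEAD points at.

```shell
#!/bin/sh
# Sketch: working directory -> index -> tree -> commit, using plumbing only.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
echo hello > file.txt

git update-index --add file.txt                    # working directory -> index
tree=$(git write-tree)                             # index -> tree object
commit=$(echo "first commit, made by hand" |
         git commit-tree "$tree")                  # tree -> commit (no parent)
git update-ref HEAD "$commit"                      # record it as the new HEAD

type=$(git cat-file -t "$commit")
head=$(git rev-parse HEAD)
head_tree=$(git rev-parse 'HEAD^{tree}')

echo "tree:   $tree"
echo "commit: $commit ($type)"
```

A follow-up commit would pass -p "$commit" to git commit-tree so that the new commit names this one as its parent.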
                     commit-tree
                      commit obj
                       +----+
                       |    |
                       |    |
                       V    V
                    +-----------+
                    | Object DB |
                    |  Backing  |
                    |   Store   |
                    +-----------+
                       ^
           write-tree  |     |
             tree obj  |     |
                       |     |  read-tree
                       |     |  tree obj
                             V
                    +-----------+
                    |   Index   |
                    |  "cache"  |
                    +-----------+
         update-index  ^
             blob obj  |     |
                       |     |
    checkout-index -u  |     |  checkout-index
                             V
                    +-----------+
                    |  Working  |
                    | Directory |
                    +-----------+
You can examine the data represented in the object database and the
index with various helper tools. For every object, you can use
git-cat-file[1] to examine details about the object:
$ git cat-file -t <objectname>
shows the type of the object, and once you have the type (which is
usually implicit in where you find the object), you can use
to show its contents. NOTE! Trees have binary content, and as a result
there is a special helper for showing that content, called
git ls-tree, which turns the binary content into a more easily
readable form.
To perform a three-way merge, you start with the two commits you
want to merge, find their closest common parent (a third commit),
and compare the trees corresponding to these three commits.
To get the "base" for the merge, look up the common parent of two
commits:
This prints the name of a commit they are both based on. You should
now look up the tree objects of those commits, which you can easily
do with
$ git cat-file commit <commitname> | head -1
since the tree object information is always the first line in a commit
object.
Once you know the three trees you are going to merge (the one "original"
tree, aka the common tree, and the two "result" trees, aka the branches
you want to merge), you do a "merge" read into the index. This will
complain if it has to throw away your old index contents, so you should
make sure that you’ve committed those; in fact you would normally
always do a merge against your last commit (which should thus match what
you have in your current index anyway).
To do the merge, do
$ git read-tree -m -u <origtree> <yourtree> <targettree>
which will do all trivial merge operations for you directly in the
index file, and you can just write the result out with git write-tree.
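A trivial three-way merge can be walked through by hand. The sketch below is an illustration in a throwaway repository: one branch changes a.txt, the other (named theirs here, an invented name) changes b.txt, so every path differs from the base on at most one side and read-tree can resolve everything itself.

```shell
#!/bin/sh
# Sketch: merge-base + read-tree -m -u + write-tree on a conflict-free merge.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
echo base > a.txt
echo base > b.txt
git add a.txt b.txt
git commit -q -m "base"
start=$(git symbolic-ref --short HEAD)

git checkout -q -b theirs
echo theirs > b.txt
git commit -q -a -m "theirs changes b.txt"

git checkout -q "$start"
echo ours > a.txt
git commit -q -a -m "ours changes a.txt"

# Find the common ancestor, then merge the three trees into the index.
# read-tree accepts commits here and uses their trees.
base=$(git merge-base HEAD theirs)
git read-tree -m -u "$base" HEAD theirs
merged=$(git write-tree)

echo "merged tree: $merged"
echo "a.txt: $(git cat-file -p "$merged:a.txt")"
echo "b.txt: $(git cat-file -p "$merged:b.txt")"
```

The resulting tree could then be turned into a merge commit with git commit-tree, passing both branch tips as parents with -p.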
Sadly, many merges aren’t trivial. If there are files that have
been added, moved or removed, or if both branches have modified the
same file, you will be left with an index tree that contains "merge
entries" in it. Such an index tree can NOT be written out to a tree
object, and you will have to resolve any such merge clashes using
other tools before you can write out the result.
You can examine such index state with the git ls-files --unmerged
command. An example:
$ mv -f hello.c~2 hello.c
and that is what higher level git merge -s resolve is implemented with.
Hacking Git
This chapter covers internal details of the Git implementation which
probably only Git developers need to understand.
It is not always easy for new developers to find their way through Git’s
source code. This section gives you a little guidance to show where to
start.
A good place to start is with the contents of the initial commit, with:
The initial revision lays the foundation for almost everything Git has
today, but is small enough to read in one sitting.
Note that terminology has changed since that revision. For example, the
README in that revision uses the word "changeset" to describe what we
now call a commit.
Also, we do not call it "cache" any more, but rather "index"; however,
the file is still called cache.h. Remark: Not much reason to change it
now, especially since there is no good single name for it anyway,
because it is basically the header file which is included by all of
Git’s C sources.
If you grasp the ideas in that initial commit, you should check out a
more recent version and skim cache.h, object.h and commit.h.
In the early days, Git (in the tradition of UNIX) was a bunch of programs
which were extremely simple, and which you used in scripts, piping the
output of one into another. This turned out to be good for initial
development, since it was easier to test new things. However, recently
many of these parts have become builtins, and some of the core has been
"libified", i.e. put into libgit.a for performance and portability
reasons, and to avoid code duplication.
By now, you know what the index is (and find the corresponding data
structures in cache.h), and that there are just a couple of object types
(blobs, trees, commits and tags) which inherit their common structure
from struct object, which is their first member (and thus, you can cast
e.g. (struct object *)commit to achieve the same as &commit->object,
i.e. get at the object name and flags).
Now is a good point to take a break to let this information sink in.
Next step: get familiar with the object naming. Read Naming commits.
There are quite a few ways to name an object (and not only revisions!).
All of these are handled in sha1_name.c. Just have a quick look at
the function get_sha1(). A lot of the special handling is done by
functions like get_sha1_basic() or the likes.
This is just to get you into the groove for the most libified part of Git:
the revision walker.
LESS=-S ${PAGER:-less}
git rev-parse is not as important any more; it was only used to filter
out options that were relevant for the different plumbing commands that
were called by the script.
Sometimes, more than one builtin is contained in one source file. For
example, cmd_whatchanged() and cmd_log() both reside in builtin/log.c,
since they share quite a bit of code. In that case, the commands which
are not named like the .c file in which they live have to be listed in
BUILT_INS in the Makefile.
Lesson three is: study the code. Really, it is the best way to learn
about the organization of Git (after you know the basic concepts).
So, think about something which you are interested in, say, "how can I
access a blob just knowing the object name of it?". The first step is to
find a Git command with which you can do it. In this example, it is
either git show or git cat-file.
For the sake of clarity, let’s stay with git cat-file, because it
is plumbing, and was around even in the initial commit (it literally
went only through some 20 revisions as cat-file.c, was renamed to
builtin/cat-file.c when made a builtin, and then saw fewer than 10
versions).
So, look into builtin/cat-file.c, search for cmd_cat_file() and look what
it does.
git_config(git_default_config);
if (argc != 3)
if (get_sha1(argv[2], sha1))
Let’s skip over the obvious details; the only really interesting part
here is the call to get_sha1(). It tries to interpret argv[2] as an
object name, and if it refers to an object which is present in the
current repository, it writes the resulting SHA-1 into the variable
sha1.
case 0:
This is how you read a blob (actually, not only a blob, but any type of
object). To know how the function read_object_with_reference() actually
works, find the source code for it (something like git grep
read_object_with | grep ":[a-z]" in the Git repository), and read
the source.
To find out how the result can be used, just read on in cmd_cat_file():
Sometimes, you do not know where to look for a feature. In many such cases,
it helps to search through the output of git log,
and then git show the
corresponding commit.
Example: If you know that there was some test case for git bundle, but
do not remember where it was (yes, you could git grep bundle t/, but that
does not illustrate the point!):
Voila.
You see, Git is actually the best tool to find out about the source of Git
itself!
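The log-then-show search described above can be tried on a small scale. This is only a sketch under stated assumptions: the repository, the file names (code.c, t/t-bundle.sh), the commit messages and the identity values are all hypothetical, and --grep stands in for searching the log output by hand in a pager.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=you@example.com

mkdir t
echo 'main code' > code.c
git add code.c
git commit -q -m 'Add code'
echo 'test_bundle' > t/t-bundle.sh
git add t
git commit -q -m 'Add test for bundle'

# List only the commits that touched t/, keeping those that mention "bundle".
found=$(git log --no-merges --format=%H --grep=bundle -- t/)

# Show the subject and the files changed by the commit that was found.
shown=$(git show --stat --format=%s "$found")
echo "$shown"
```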
Git Glossary
Git explained
bare repository
blob object
branch
cache
chain
changeset
checkout
cherry-picking
In SCM jargon, "cherry pick" means to choose a subset of changes out of a
series of changes (typically commits) and record them as a new series of
changes on top of a different codebase. In Git, this is performed by the
"git cherry-pick" command to extract the change introduced by an existing
commit and to record it based on the tip of the current branch as a new
commit.
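As a rough illustration of the definition above, the following sketch picks one commit out of a two-commit series. The branch name topic, the file names and the identity values are assumptions made for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=you@example.com

echo base > base.txt
git add base.txt
git commit -q -m 'Base'
# The initial branch may be "master" or "main", depending on configuration.
main=$(git symbolic-ref --short HEAD)

# Two commits on a side branch; only the second one is wanted.
git checkout -q -b topic
echo one > one.txt; git add one.txt; git commit -q -m 'Add one'
echo two > two.txt; git add two.txt; git commit -q -m 'Add two'
wanted=$(git rev-parse HEAD)

# Back on the original branch, record just that change as a new commit.
git checkout -q "$main"
git cherry-pick "$wanted" >/dev/null
picked=$(git rev-parse HEAD)
```

The new commit carries the same change as the original but has a different parent, so its object name differs from the one that was picked.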
clean
commit
commit-graph file
commit object
commit-ish (also committish)
A commit object or an object that can be recursively dereferenced to
a commit object. The following are all commit-ishes:
a commit object,
a tag object that points to a commit object,
a tag object that points to a tag object that points to a commit object,
etc.
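The recursive dereferencing can be observed with git rev-parse and the ^{commit} suffix. This is a minimal sketch; the tag names v1 and v1-again and the identity values are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=you@example.com

git commit -q --allow-empty -m 'Initial'
commit=$(git rev-parse HEAD)

# A tag object that points to a commit object.
git tag -a -m 'first tag' v1 HEAD
# A tag object that points to a tag object that points to a commit object.
# (Git may print a hint about nested tags on stderr; it is silenced here.)
git tag -a -m 'tag of a tag' v1-again v1 2>/dev/null

# Peel each name until a commit is reached.
via_tag=$(git rev-parse 'v1^{commit}')
via_nested=$(git rev-parse 'v1-again^{commit}')
```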
core Git
DAG
dangling object
detached HEAD
Note that commands that operate on the history of the current branch
(e.g. git commit to build a new history on top of it) still work
while the HEAD is detached. They update the HEAD to point at the tip
of the updated history without affecting any branch. Commands that
update or inquire information about the current branch (e.g. git
branch --set-upstream-to that sets what remote-tracking branch the
current branch integrates with) obviously do not work, as there is no
(real) current branch to ask about in this state.
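A short sketch of the behaviour described in this note, assuming a throwaway repository and placeholder identity values: a commit made while HEAD is detached moves HEAD but leaves the branch where it was.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=you@example.com

git commit -q --allow-empty -m 'Initial'
branch=$(git symbolic-ref --short HEAD)
tip_before=$(git rev-parse "$branch")

# Detach HEAD at the current commit, then build new history on top of it.
git checkout -q --detach
git commit -q --allow-empty -m 'Made while detached'
tip_after=$(git rev-parse "$branch")
head_now=$(git rev-parse HEAD)

# Asking which branch HEAD refers to fails, because there is none.
if git symbolic-ref -q HEAD >/dev/null; then on_branch=yes; else on_branch=no; fi
echo "on a branch: $on_branch"
```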
directory
dirty
evil merge
fast-forward
fetch
file system
Git archive
gitfile
grafts
Note that the grafts mechanism is outdated and can lead to problems
transferring objects between repositories; see git-replace[1]
for a more flexible and robust system to do the same thing.
hash
head
HEAD
The current branch. In more detail: Your working tree is normally derived
from the state of the tree referred to by HEAD. HEAD is a reference to one
of the heads in your repository, except when using a detached HEAD, in
which case it directly references an arbitrary commit.
head ref
hook
During the normal execution of several Git commands, call-outs are made
to optional scripts that allow a developer to add functionality or
checking. Typically, the hooks allow for a command to be pre-verified
and potentially aborted, and allow for a post-notification after the
operation is done. The hook scripts are found in the
$GIT_DIR/hooks/ directory, and are enabled by simply
removing the .sample suffix from the filename. In earlier versions
of Git you had to make them executable.
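To see a hook pre-verify and abort a command, the sketch below installs a small pre-commit script. It writes its own script instead of renaming a .sample file, since the samples may not be present in every installation; the DO_NOT_COMMIT policy and the identity values are hypothetical and exist only for this example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=you@example.com
git commit -q --allow-empty -m 'Initial'

# Install a pre-commit hook in the repository's hooks directory.
hooks_dir=$(git rev-parse --git-dir)/hooks
mkdir -p "$hooks_dir"
cat > "$hooks_dir/pre-commit" <<'EOF'
#!/bin/sh
# Hypothetical policy: refuse to commit while DO_NOT_COMMIT is staged.
if git diff --cached --name-only | grep -qx DO_NOT_COMMIT; then
    echo "pre-commit: remove DO_NOT_COMMIT first" >&2
    exit 1
fi
EOF
chmod +x "$hooks_dir/pre-commit"

# Stage the forbidden file and try to commit; the hook should abort it.
echo x > DO_NOT_COMMIT
git add DO_NOT_COMMIT
if git commit -q -m 'Should be rejected' 2>/dev/null; then
    result=committed
else
    result=aborted
fi
echo "commit was $result"
```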
index
index entry
master
merge
object
object database
object name
object type
One of the identifiers "commit", "tree", "tag" or "blob" describing the
type of an object.
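The four identifiers can be seen by asking git cat-file -t about one object of each kind. A minimal sketch, assuming a throwaway repository; the file name, tag name and identity values are made up.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=you@example.com

echo hi > file.txt
git add file.txt
git commit -q -m 'Initial'
git tag -a -m 'annotated' v1

# Ask for the type of one object of each kind.
t_commit=$(git cat-file -t HEAD)
t_tree=$(git cat-file -t 'HEAD^{tree}')
t_blob=$(git cat-file -t HEAD:file.txt)
t_tag=$(git cat-file -t v1)
echo "$t_commit $t_tree $t_blob $t_tag"
```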
octopus
origin
overlay
Only update and add files to the working directory, but don’t
delete them, similar to how cp -R would update the contents
in the destination directory. This is the default mode in a
checkout when checking out files from the index or a tree-ish. In
contrast, no-overlay mode also deletes tracked files not
present in the source, similar to rsync --delete.
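The difference between the two modes can be sketched as follows. The example assumes a Git version that supports git checkout --no-overlay; the file names and identity values are hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=you@example.com

echo a > a.txt; git add a.txt; git commit -q -m 'Add a'
echo b > b.txt; git add b.txt; git commit -q -m 'Add b'

# Overlay (the default): check out paths from the older commit.
# b.txt does not exist there, but it is left alone.
git checkout -q HEAD~1 -- .
if [ -f b.txt ]; then overlay=kept; else overlay=deleted; fi

# No-overlay: tracked files missing from the source are removed as well.
git checkout -q --no-overlay HEAD~1 -- .
if [ -f b.txt ]; then no_overlay=kept; else no_overlay=deleted; fi

echo "overlay: b.txt $overlay; no-overlay: b.txt $no_overlay"
```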
pack
A set of objects which have been compressed into one file (to save space
or to transmit them efficiently).
pack index
pathspec
top
literal
Wildcards in the pattern such as * or ? are treated
as literal characters.
icase
glob
attr
Note that when matching against a tree object, attributes are still
obtained from the working tree, not from the given tree object.
exclude
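A few of the magic words listed above can be tried with git ls-files, which matches pathspecs against the index. This is a sketch only; the file names are invented, and one of them contains a literal asterisk, which not every file system allows.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir docs
echo x > README.txt
echo x > notes.md
echo x > docs/guide.md
echo x > 'star*.txt'     # a name that contains a literal asterisk
echo x > starry.txt
git add .

# icase: match regardless of case.
icase=$(git ls-files ':(icase)readme.txt')

# literal: the * is not a wildcard, so starry.txt does not match.
literal=$(git ls-files ':(literal)star*.txt')

# exclude: take every .md file except those under docs.
excluded=$(git ls-files -- '*.md' ':(exclude)docs')

printf '%s\n' "$icase" "$literal" "$excluded"
```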
parent
pickaxe
porcelain
per-worktree ref
pseudoref
pull
push
reachable
reachability bitmaps
rebase
ref
There are a few special-purpose refs that do not begin with refs/.
The most notable example is HEAD.
reflog
A reflog shows the local "history" of a ref. In other words,
it can tell you what the 3rd last revision in this repository
was, and what the current state of this repository was
yesterday at 9:14pm. See git-reflog[1] for details.
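A reflog entry can be addressed with the ref@{n} syntax. The sketch below assumes a throwaway non-bare repository, where reflogs are normally enabled by default, and placeholder identity values.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=you@example.com

git commit -q --allow-empty -m 'first'
first=$(git rev-parse HEAD)
git commit -q --allow-empty -m 'second'

# HEAD@{1} is where HEAD pointed one update ago, according to the reflog.
previous=$(git rev-parse 'HEAD@{1}')

# Print the local history of HEAD, newest entry first.
git reflog
```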
refspec
remote repository
remote-tracking branch
repository
resolve
revision
rewind
SCM
SHA-1
shallow clone
shallow repository
stash entry
submodule
A repository that holds the history of a separate project inside
another repository (the latter of which is called the superproject).
superproject
symref
tag
tag object
topic branch
tree
tree object
unmerged index
unreachable object
upstream branch
working tree
The tree of actual checked out files. The working tree normally
contains the contents of the HEAD commit’s tree, plus
any local changes that you have made but not yet committed.
worktree
$ cd project
$ git init
$ git add .
$ git commit
$ cd project
Managing branches
$ git branch # list all local branches in this repo
$ git branch new test~10 # ten commits before tip of branch "test"
Update and examine branches from the repository you cloned from:
$ git fetch # update
origin/master
origin/next
...
example
origin
* remote example
URL: git://example.com/project.git
master
next
...
Exploring history
$ gitk # visualize and browse history
Making changes
[user]
email = you@yourdomain.example.com
EOF
Select file contents to include in the next commit, then make the
commit:
$ git commit
Merging
$ git merge test # merge branch "test" into the current branch
Store the fetched branch into a local branch before merging into the
current branch:
Repository maintenance
$ git gc
Think about how to create a clear chapter dependency graph that will
allow people to get to important topics without necessarily reading
everything in between.
some of technical/?
hooks
Scan man pages to see if any assume more background than this manual
provides.