Git

From HerzbubeWiki

This page is about the version control software Git and how to use it on the client side. This wiki page covers the server side.


References

Homepage 
http://git-scm.com/
Pro Git book 
http://book.git-scm.com/
Git User Manual 
http://www.kernel.org/pub/software/scm/git/docs/user-manual.html
Git - SVN Crash Course 
http://git-scm.com/course/svn.html
Git Magic 
http://www-cs-students.stanford.edu/~blynn/gitmagic/index.html
Link collection 
http://git-scm.com/documentation
man pages 
git, gittutorial, gittutorial-2, gitcore-tutorial, gitglossary


GUI clients

  • GitX used to be a promising Mac OS X client. Apparently it was abandoned upstream. The project was subsequently forked by enthusiasts, but I have not followed their progress.
  • These days I use SourceTree by Atlassian. The app is free to use, although you need to register to get the free license.
  • GitHub also has a Mac OS X app, but I don't really use it except once in a while for doing something that is too complicated with SourceTree or on the command line (I forget, though, what exactly that is)


Creating and configuring local repositories

git init: Administration of repositories

Create a new non-bare repository in .git in the current directory:

git init

(set the GIT_DIR variable to create the repository in a directory named differently, or in a different location; by default, GIT_DIR points to .git)

Create a new bare repository:

mkdir repodir
cd repodir
git init --bare

Difference between bare and non-bare repositories:

  • A non-bare repository has a working tree and a hidden directory .git containing the version control information
  • A bare repository just contains the version control information and no working tree. All the contents of the .git directory are placed in the main directory itself
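  • By default, git refuses a push that would update the branch that is currently checked out in a non-bare repository, so in practice the target of a push is usually a bare repository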
  • The purpose of bare repositories is for having a central (usually remote) repository that a number of people can push to
  • To convert a bare into a non-bare repository: Clone the bare repo, then delete the original
  • To convert a non-bare repo into a bare one:
git clone --bare -l /path/to/non/bare/repo /path/to/new/bare/repo
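The difference between the two repository types can be seen directly on disk. The following is a minimal sketch that works in a throwaway directory; the names central.git and work are made up for the illustration.

```shell
set -eu
tmp=$(mktemp -d)                       # throwaway scratch directory
cd "$tmp"

git init -q --bare central.git         # bare: repository data only
git init -q work                       # non-bare: working tree plus .git

# Ask each repository whether it is bare
bare_answer=$(git -C central.git rev-parse --is-bare-repository)
work_answer=$(git -C work rev-parse --is-bare-repository)
echo "central.git bare? $bare_answer"
echo "work bare?        $work_answer"

# The bare repository keeps HEAD, objects/ and refs/ at its top level,
# the non-bare one keeps them inside .git
ls central.git
ls work/.git
```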


Working Tree vs. Working Copy

In Subversion, users check out one revision from the central, shared repository into a directory that is then called the "working copy". The working copy therefore contains one, and only one, revision (this is simplified, but I need it to be so that I can make a useful comparison to git).

In git, the "working copy" is called "working tree". However, the directory space that contains the working tree at the same time also stores the git repository. As opposed to Subversion where each directory has its own .svn directory, the git working tree has exactly one .git directory in its root folder: That folder contains the entire git repository with all the branches.

A single git repository can track an arbitrary number of branches, but your working tree is associated with just one of them (the "current" or "checked out" branch), and HEAD points to that branch (= the tip, or head, of that branch).


Configuration file

The git configuration file contains a number of variables that affect the behaviour of git commands. A non-comprehensive list is available on the man page of git config. Configuration files exist on three levels:

  • .git/config file for each repository is used to store the information for that repository
  • $HOME/.gitconfig is used to store per user information
  • The file /etc/gitconfig can be used to store system-wide defaults


To write to a configuration file, use the command

git config <section.variable> <value>

Using the --global argument writes to the user specific configuration file, --system writes to the system-wide defaults. With no argument, the repository specific configuration file will be written to.


Important settings, or settings that I like to have are:

  • User name and email address in ~/.gitconfig. These are used for things like git-commit
git config --global user.name "Patrick Näf"
git config --global user.email herzbube@herzbube.ch
git config --global user.signingkey 3FF38573
  • User specific gitignore patterns:
git config --global core.excludesfile "$HOME/.gitignore"
  • Use colors for git-status and git-diff
git config --global color.status true
git config --global color.diff true
  • Don't store a backup file with .orig extension after a successful merge
git config --global mergetool.keepBackup false
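To see how the configuration levels interact, here is a small sketch. It assumes a sandboxed HOME (so that the real ~/.gitconfig is not touched) and uses made-up names as values; the repository-specific value is expected to take precedence over the user-specific one.

```shell
set -eu
tmp=$(mktemp -d)
export HOME="$tmp/home"                        # assumption: sandboxed HOME for this demo only
export XDG_CONFIG_HOME="$HOME/.config"         # keep git away from the real user config
mkdir -p "$HOME"
: > "$HOME/.gitconfig"                         # make sure the user-level file exists
cd "$tmp"
git init -q demo
cd demo

git config --global user.name "Global Name"    # user-specific configuration file
git config user.name "Repo Name"               # repository-specific .git/config

effective=$(git config --get user.name)        # value that git commands will actually use
global_only=$(git config --global --get user.name)
echo "effective: $effective"
echo "global:    $global_only"
```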


Ignoring files

Files generated by a build process (e.g. object files), or by the operating system (.DS_Store), or whatever, should not be versioned. git ignores those files if you tell it their names. You do so by specifying so-called gitignore patterns, either on the command line of certain git commands, or in so-called gitignore files.

Patterns are read from various sources in the following order (this list is taken almost verbatim from the man page of gitignore(5)):

  • Patterns read from the command line for those commands that support them.
  • Patterns read from a .gitignore file in the same directory as the path, or in any parent directory, with patterns in the higher level files (up to the root) being overridden by those in lower level files down to the directory containing the file. These patterns match relative to the location of the .gitignore file. A project normally includes such .gitignore files in its repository, containing patterns for files generated as part of the project build.
  • Patterns read from $GIT_DIR/info/exclude.
  • Patterns read from the file specified by the configuration variable core.excludesfile; you would set that variable by saying, for instance:
git config --global core.excludesfile "$HOME/.gitignore"


Example for $HOME/.gitignore:

.DS_Store
.svn/
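To find out which pattern is responsible for ignoring a file, git check-ignore can help. The following sketch uses a project-level .gitignore in a scratch repository; the file names main.c and main.o are just examples.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo

printf '%s\n' '*.o' '.DS_Store' > .gitignore   # project-level patterns
touch main.c main.o .DS_Store

# -v reports the file, line and pattern that cause a path to be ignored
git check-ignore -v main.o .DS_Store

# Only the files that are not ignored show up as untracked
untracked=$(git status --porcelain --untracked-files=all)
echo "$untracked"
```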


Complaining about whitespace

Almost all editors I have encountered add unnecessary whitespace at the end of a line in certain situations, usually when they try to help with line indentation. Some editors have an option that removes trailing whitespace when a file is saved, but most do not. Fortunately, git has support for checking for common whitespace problems.

The option core.whitespace in ~/.gitconfig defines which whitespace problems should be noticed whenever a whitespace check is run. The default already enables checking for the most important problem, blank-at-eol, so usually you will not have to modify your .gitconfig file.

To enable whitespace problem checks in a repository (local or remote), you can enable the default pre-commit hook:

cd /path/to/repo
cd .git/hooks
mv pre-commit.sample pre-commit
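A whitespace check can also be run by hand with the --check option of git diff. As far as I can tell the sample hook relies on the same kind of check, but treat that as an assumption. The sketch below works in a scratch repository with a made-up identity.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.name "Demo User"          # hypothetical identity for the scratch repository
git config user.email demo@example.com

printf 'clean line\n' > foo.txt
git add foo.txt
git commit -q -m "initial commit"

printf 'clean line\ntrailing blanks   \n' > foo.txt

# --check lists whitespace problems and exits non-zero if any are found
if git diff --check; then
    check_result=clean
else
    check_result=problems
fi
echo "whitespace check: $check_result"
```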


Basic operations

git add: Adding files/directories or making changes to existing files/directories

Files:

  • a new file or directory needs to be added using git add
  • a file whose content has changed needs to be added using git add
  • when git add is run it looks at the file's current content and determines what needs to be added; the content is said to be staged for inclusion in the next commit
  • when a file's content changes after git add has been run, git add needs to be run AGAIN because the new content is NOT automatically staged for inclusion


Directories:

  • a new empty directory can NOT be added using git add: git tracks files, not directories, so an empty directory contains nothing that could be staged (a common workaround is to put a placeholder file into the directory)
  • if a directory contains files, it is sufficient to git add the directory; the operation will then recursively iterate over the files; if another file is later added to the directory, the new file is NOT automatically staged for inclusion - git add needs to be run AGAIN
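The rules above can be tried out in a scratch repository. This is only a sketch; foo.txt and emptydir are made-up names.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo

echo "version 1" > foo.txt
git add foo.txt                        # stages "version 1"
echo "version 2" > foo.txt             # changed again after staging

git diff                               # the second edit shows up as NOT staged
unstaged=$(git diff --name-only)

mkdir emptydir
git add .                              # stages "version 2", finds nothing to stage in emptydir
tracked=$(git ls-files)                # paths known to the index
echo "tracked: $tracked"
```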


git mv: Renaming or moving files/directories

Existing files and directories can be renamed or moved to a new location using git mv. The result must still be committed.

Note: If a directory becomes empty due to a move operation, the next commit will remove it from source control. If people pull the change, the directory will disappear on their side, too. If a puller has a local change in the directory, the directory will not be deleted, though.


git rm: Removing files/directories

Existing files can be removed using git rm. The result must still be committed.

Note 1: Directories are not normally removed, unless the -r option is specified.

Note 2: If a directory becomes empty due to a remove operation, the same rules apply as with git mv.


git status/diff: See local changes

git status prints out

  • which changes will be committed next time git commit is run
  • which changes have not been staged for committing yet; Note: Empty directories do not appear here; directories only appear if they have at least one file inside

git diff prints out

  • the changes that have not been staged yet
  • in other words: the difference between the working tree and the index

git diff --cached prints out

  • the changes that have been staged and will be included in the next commit
  • in other words: the difference between the index and the HEAD of the current branch (usually "master")

Show all changed files between two commits:

git diff --name-only SHA1 SHA2
git diff --name-only TAG1 TAG2
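The relationship between HEAD, index and working tree becomes clearer when a file is in a different state in each of the three. A minimal sketch in a scratch repository (identity and file content are made up):

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.name "Demo User"       # hypothetical identity for the scratch repository
git config user.email demo@example.com

echo "committed" > foo.txt
git add foo.txt
git commit -q -m "initial commit"

echo "staged" > foo.txt
git add foo.txt                        # index now holds "staged"
echo "unstaged" > foo.txt              # working tree now holds "unstaged"

index_vs_head=$(git diff --cached)     # HEAD -> index
tree_vs_index=$(git diff)              # index -> working tree
echo "$index_vs_head"
echo "$tree_vs_index"
```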


git reset: Undo changes

git reset can be used to undo all sorts of changes, including destroying commits already made. The command is rather dangerous and you must know what you are doing or you may damage your repository...

Unstage all files that have been staged with git add, keeping all local changes:

git reset

Unstage a single file:

git reset foo.c

Discard unstaged changes for a single file. Surprisingly, this does not require git reset. Warning: There is no warning, the changes are immediately discarded!

git checkout foo.c

Throw away all local changes that have not been committed yet (this is useful after a merge, e.g. to throw away the merge results because of too many conflicts):

git reset --hard

Discard the last commit(s) from the repository, including all changes that were made in that commit (SO question shows a way to get the commit back before 90 days have elapsed):

git reset --hard HEAD^   # discard last commit
git reset --hard HEAD^^  # discard last 2 commits

Discard the last commit from the repository, but keep the changes that were made in that commit in the working tree. This is useful, for instance, if the commit was incomplete, or just not quite right, and you want to redo the commit with a few changes. Note that git commit --amend is simpler if you just need to edit the commit message, or add a file that was forgotten.

git reset HEAD^          # variant 1: leave changes in the working tree, but not the index; the old head is stored in .git/ORIG_HEAD
git reset --soft HEAD^   # variant 2 (use INSTEAD of variant 1, not in addition): leave changes in the working tree AND in the index
<do some changes>
git add .
git commit -c ORIG_HEAD  # redo commit, re-using the previous commit message (can still be edited)
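Here is the "redo the last commit" workflow as a non-interactive sketch in a scratch repository. It uses the --soft variant, and -C instead of -c so that no editor is opened; the identity and the commit messages are made up.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.name "Demo User"       # hypothetical identity for the scratch repository
git config user.email demo@example.com

echo one > foo.txt
git add foo.txt
git commit -q -m "first commit"
echo two > foo.txt
git commit -q -a -m "second commit, not quite right"

git reset -q --soft HEAD^              # drop the commit, keep its changes staged
staged=$(git diff --cached --name-only)

echo three > foo.txt                   # fix up the change
git add foo.txt
git commit -q -C ORIG_HEAD             # reuse the old commit message as-is

count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
echo "$count commits, last one: $subject"
```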


git stash: Temporarily stash all local changes

Sometimes one needs to interrupt the current work and do something else. A useful workflow is this:

  • Temporarily stash all local changes and revert to a clean working tree
  • Do something else, probably commit
  • Get back the changes that have been stashed away and resume the original work

The commands for this are:

git stash
# do some work
git commit
git stash apply   # either: the stash is applied and kept
git stash pop     # or: the stash is applied and then thrown away

It is possible to have multiple stashes. Useful commands:

git stash push -m "blablabla"  # create new stash with a message (older git versions: git stash save "blablabla")
git stash list               # list all stashes
git stash show -p stash@{1}  # display diff (-p = patch format) between named stash and its original parent
git stash pop stash@{1}      # apply the named stash
git stash drop stash@{1}     # throw away the named stash
git stash clear              # throw away all stashes
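The stash workflow can be sketched end to end in a scratch repository. The sketch assumes a reasonably recent git that knows "git stash push"; identity, file name and stash message are made up.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.name "Demo User"            # hypothetical identity for the scratch repository
git config user.email demo@example.com

echo base > foo.txt
git add foo.txt
git commit -q -m "initial commit"

echo "half-finished work" > foo.txt
git stash push -q -m "work in progress"     # working tree is clean again
during=$(cat foo.txt)

git stash list                              # shows stash@{0} together with the message
git stash pop -q                            # re-apply the changes and drop the stash
after=$(cat foo.txt)
remaining=$(git stash list | wc -l | tr -d ' ')
echo "stashes left: $remaining"
```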


git commit: Make changes to the repository

Commit staged changes:

git commit -m "bla bla bla"

Notes:

  • Author name and email address are taken from ~/.gitconfig.
  • As a convenience, the -a option can be used to automatically stage files that have been modified and deleted. New files are not staged, though.


To fix the commit message of the last commit:

git commit --amend


To add another file to the last commit, or make additional changes to a file already in the commit:

git add <file>
git commit --amend


See git reset for a more sophisticated example of how to modify the last commit.


git show: Display information about commits and other stuff

Note: The man page for git-show is totally incomplete, for instance it does not show the --name-only option :-(


Display the files that changed in a commit:

git show --name-only 356da73

Also display diffs:

git show 356da73


git tag: Working with tags

git tag -s -m "tagging release 0.8.5" 0.8.5 356da73
  • Creates a tag named "0.8.5"
  • The tag refers to commit object 356da73
  • The message specified by -m is associated with the tag
  • Using GnuPG, the tag is PGP-signed, using the PGP key that matches the committer's email address; although I have not formally researched this, I presume that the committer's email address would be the one that has been defined in ~/.gitconfig under the option "user.email".
  • The tag created is an "annotated" tag, i.e. a tag that carries with it more information than just the tag name (in this case the additional information consists of a message, the tagger's name and email, and a PGP signature)


To use a specific PGP key, i.e. not the default one that matches the committer's email address, one has to set the "user.signingkey" option, either in the repository's configuration file, or the global configuration file. For instance:

git config --global user.signingkey 3FF38573   # use the key ID

Set the GIT_COMMITTER_DATE environment variable to create a tag with a given date instead of the current date (useful to backdate a tag, e.g. after populating a Git repository with content from another SCM). For instance:

GIT_COMMITTER_DATE="2009-06-14 12:58:50" git tag -s -m "tagging release 0.3" 0.3 ed598d1f3d6fac50b67daac2c191798c451cc962

Delete an existing tag:

git tag -d 0.1

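Note: If the tag has already been pushed to the server, it must be deleted both on the client and on the server. The deletion on the server can be pushed with "git push origin :refs/tags/0.1", but clones that have already fetched the tag will keep it. THIS IS NOT RECOMMENDED!!! See the man page for git-tag for details.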

List all tags that exist in the repository:

git tag

Verify the signature of a tag:

git tag -v 0.1

Find tags that contain the given commit:

git tag --contains 356da73

Checkout a specific tag:

git checkout 0.1
git checkout tags/0.1  # in case there is a branch that is also named 0.1

Find out which tag you are on

git describe --tags   # --tags is required to also find tags that are not annotated
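The difference between annotated and lightweight tags, and its effect on git describe, can be sketched like this. The sketch uses -a instead of -s so that no PGP key is needed; the tag names and the identity are made up.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.name "Demo User"            # hypothetical identity for the scratch repository
git config user.email demo@example.com

echo one > foo.txt
git add foo.txt
git commit -q -m "release commit"
git tag -a -m "tagging release 0.1" 0.1     # annotated, unsigned
echo two > foo.txt
git commit -q -a -m "work after the release"
git tag nightly                             # lightweight tag on the newer commit

annotated_type=$(git cat-file -t 0.1)       # annotated tags are objects of their own
light_type=$(git cat-file -t nightly)       # lightweight tags point straight at a commit
with_tags=$(git describe --tags)            # also considers lightweight tags
without=$(git describe)                     # annotated tags only
echo "$annotated_type / $light_type / $with_tags / $without"
```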


git log: Information on the history

This looks somewhat like what I am used to from svn log:

git log --name-status

Another abbreviated version of the history:

git log --stat --summary

Commits since v2.5 which modify Makefile:

git log v2.5.. Makefile

Commits between v2.5 and v2.6:

git log v2.5..v2.6

Commits made on the current branch (which is not master), all the way back since it was branched

git log ^master HEAD
git log master..HEAD        # equivalent, but apparently the more common shorthand
git log HEAD --not master   # equivalent, but here it's important to place --not at the end because --not affects all of the subsequent arguments (not just 1)
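That the three notations really select the same commits can be checked in a scratch repository. A minimal sketch; the branch name feature and the identity are made up.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is called master
git config user.name "Demo User"            # hypothetical identity for the scratch repository
git config user.email demo@example.com

echo base > foo.txt
git add foo.txt
git commit -q -m "commit on master"
git checkout -q -b feature
echo a >> foo.txt
git commit -q -a -m "feature commit 1"
echo b >> foo.txt
git commit -q -a -m "feature commit 2"

caret=$(git log --format=%s ^master HEAD)
dots=$(git log --format=%s master..HEAD)
not_form=$(git log --format=%s HEAD --not master)
echo "$dots"                                # only the commits made on the feature branch
```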


Working with remote repositories

git clone

Get a copy of a remote/upstream repository:

git clone /path/to/repo

Notes:

  • The copy is created in the current directory in a folder named "repo"
  • The same branch is checked out that is currently active in the remote/upstream repository
  • The origin is set to the remote/upstream repository; it is said that we are tracking that remote/upstream repo; the origin is later going to be used by pull and fetch commands


git pull

Important: Pulling is NOT what you want if you need to get at a branch that was newly created in the remote repository. You first need to create a local tracking branch with either "git branch" or "git checkout".


Pull all changes in the "master" branch from the remote repository that is our origin, into the local repository:

git pull origin master

Notes:

  • The changes are not only pulled, but the changes in the remote current branch are also merged immediately into the working tree
  • It is therefore a good idea to commit local changes before pulling
  • In addition, the branch that is currently checked out locally should correspond to the remote branch that is being pulled
  • Instead of pulling, which means an immediate merge, one could first do a "fetch" and then inspect the remote changes
git fetch /path/to/repo master
git log -p HEAD..FETCH_HEAD    # shows remote changes since histories forked
git log -p HEAD...FETCH_HEAD   # shows remote AND local changes since histories forked
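The "fetch first, inspect, then merge" approach can be sketched with two scratch repositories on the local disk. The repository names upstream and local are made up, and so is the identity.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master     # make sure the branch is called master
git config user.name "Demo User"            # hypothetical identity for the scratch repository
git config user.email demo@example.com
echo one > foo.txt
git add foo.txt
git commit -q -m "first upstream commit"

cd "$tmp"
git clone -q upstream local
cd upstream
echo two >> foo.txt
git commit -q -a -m "second upstream commit"

cd "$tmp/local"
git fetch -q "$tmp/upstream" master         # download only, nothing is merged yet
incoming=$(git log --format=%s HEAD..FETCH_HEAD)
echo "incoming: $incoming"
git merge -q --ff-only FETCH_HEAD           # integrate once the changes look fine
```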


git fetch

To fetch the content of a remote branch into a local branch:

git fetch origin work-for-0.4:work-for-0.4

Notes:

  • "origin" is an alias for a repository URL that has previously been set with "git remote"
  • On the left-hand side of the ":" is the name of the remote branch
  • On the right-hand side of the ":" is the name under which the branch should be stored locally; I have not found out how to abbreviate this so that git automatically uses the remote name locally (another one of git's many mysteries)

Warning: Although the local branch is created if it does not exist, the local branch will NOT track the remote branch. Refer to "git branch" or "git checkout" for examples how to create a tracking branch, or convert a non-tracking into a tracking branch.


git push

Push local changes in the currently checked out branch into the remote repository that is tracked by the branch (the default remote is origin):

git push

Tags are not affected by the above command. To also sync tags:

git push --tags

Push local changes for a named repository or branch:

git push origin           # push to origin; which branches are pushed depends on push.default (since Git 2.0 the default "simple" pushes only the current branch)
git push origin mybranch  # push changes only in mybranch

The second example above also creates a branch in the remote repository if it did not exist before. However, after the push the local branch is not tracking the remote branch. This can be fixed by using the -u command line option. A tracking branch is useful because in the future you can simply type git push or git pull to sync the local with the remote branch, and vice versa.

git push -u origin newbranch

Local changes can be pushed only if they result in a fast-forward in the remote repository. This is a problem when you have changed history locally (e.g. removed a commit) and want to push these changes. It is possible to force the push by prefixing the branch name with a "+" character:

git push origin +master
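The effect of -u can be verified by asking git for the upstream of a branch. The following sketch uses a bare scratch repository as the "remote"; the names central.git, work and newbranch are made up.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare central.git
git clone -q central.git work 2>/dev/null   # cloning an empty repository may print a warning
cd work
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is called master
git config user.name "Demo User"            # hypothetical identity for the scratch repository
git config user.email demo@example.com

echo one > foo.txt
git add foo.txt
git commit -q -m "initial commit"
git push -q origin master                   # creates master in the remote repository

git checkout -q -b newbranch
git push -q -u origin newbranch             # creates the remote branch AND sets up tracking

upstream=$(git rev-parse --abbrev-ref 'newbranch@{upstream}')
echo "newbranch tracks: $upstream"
```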


git remote: Manage remote ("tracked") repositories

Show a list of existing "remotes", i.e. remote repositories whose branches are tracked in the local repo (the "-v" option tells git to be verbose and also list the remote URL):

git remote -v

Add a new remote named "foo" that points to the repository at the given URL

git remote add foo git://linux-nfs.org/pub/linux/nfs-2.6.git
git remote add foo gitolite-user:scjd.git                         # the server gitolite-user is defined in .ssh/config

Once a remote repository has been set, its content can be fetched:

git fetch foo   # fetch all branches

Rename a remote

git remote rename old new

Remove a remote

git remote rm foo


Working with local and remote branches

git branch: List/create/delete branches

List all branches that exist:

git branch     # local branches only
git branch -r  # remote branches only
git branch -a  # both local and remote branches

Create a new branch (but don't check it out). The first example splits off at the head of the currently checked out branch, the second splits off at the named commit.

git branch work-for-0.5
git branch work-for-0.5 7a8c9912

Create a new tracking branch, so-called because it is connected to and "tracks" a remote branch. The branch is not checked out!

git branch --track experimental origin/experimental

Connect an existing local branch to an existing remote branch, i.e. convert a local non-tracking branch into a tracking branch:

git branch -u origin/experimental experimental

Rename a branch:

git branch -m old new

Delete a branch:

git branch -d mybranch             # delete locally
git push origin --delete mybranch  # delete remote branch (local branch remains untouched)


git checkout: Switch working tree

Switch to another branch

Change the working tree to point to a different branch:

git checkout newbranch

Create a new branch and check it out immediately. The first example splits off at the head of the currently checked out branch, the second splits off at the named branch, the third creates a tracking branch.

git checkout -b newbranch
git checkout -b newbranch oldbranch
git checkout -b experimental origin/experimental

If you have local changes, the checkout command will fail unless you specify one of the following:

git checkout --merge newbranch     # merge changes
git checkout -f newbranch          # discards changes

Notes:

  • The merge works regardless of whether the changes have been added to the index or not
  • Conflicts are not reported in any way, though, you have to detect these by yourself :-(((( A conflicted file will contain markers such as this one: "<<<<<<< master:doc/ChangeLog"
  • Files that are present but matched by .gitignore are retained - it has not been verified yet if this applies to any file that is not version-controlled


Switch to an earlier commit

To go back in history and get the repository's state as it was in a specific commit:

git checkout 7a8c9912

After this you are no longer on a branch. This can be verified as follows:

nargothrond:~/Documents/dev/littlego --> git branch
* (no branch)
  master

To return to HEAD of the master branch:

git checkout master
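Whether HEAD currently points to a branch can also be checked in a script, because git symbolic-ref fails when HEAD is detached. A minimal sketch in a scratch repository (the identity is made up):

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is called master
git config user.name "Demo User"            # hypothetical identity for the scratch repository
git config user.email demo@example.com

echo one > foo.txt
git add foo.txt
git commit -q -m "first commit"
first=$(git rev-parse HEAD)
echo two > foo.txt
git commit -q -a -m "second commit"

git checkout -q "$first"                    # go back in history: HEAD is now detached
if git symbolic-ref -q HEAD >/dev/null; then
    state=on-branch
else
    state=detached
fi
content=$(cat foo.txt)

git checkout -q master                      # return to the tip of master
branch=$(git symbolic-ref --short HEAD)
echo "$state, then back on $branch"
```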


git merge: Merge changes from another branch

The following example pulls changes from a source branch and merges them with the current HEAD and working tree. If possible, git will do a so-called "fast-forward". Fast-forwarding is nicely explained in this section of the "Pro Git" book.

git merge sourcebranch

If you delete the source branch after fast-forwarding was applied, history will not show that the source branch ever existed. Instead it will appear as if all of the commits were made directly inside the target branch. To make sure that this does not happen, fast-forwarding can be disabled:

git merge --no-ff sourcebranch
git merge --no-ff --no-commit sourcebranch  # same, but do not auto-commit

Without fast-forwarding, history will always reflect that there was a source branch, and which commits were made on that branch, even if the branch itself has been deleted. The integration point will be marked by a merge commit with two parents; compared to its first parent, the merge commit looks like one cumulative "patch" of all the commits made on the source branch. The individual commits remain part of the history, so the "blame" command still attributes a source line to the commit that actually changed it, not to the merge commit.


The following is similar to --no-ff: All changes in the source branch are squashed together into a single "patch" which is then applied to the target branch. Unlike --no-ff, however, no relationship between the source and target branch is visible in the history after the "patch" has been committed. Once the source branch has been deleted, there will be no record of the individual commits, and it will appear as if the "patch" commit has been developed as a single change.

git merge --squash sourcebranch
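The three merge flavours can be compared by looking at the number of parents of the resulting commit. This is a sketch in a scratch repository; the branch names topic, with-noff, with-ff and with-squash are made up.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.name "Demo User"            # hypothetical identity for the scratch repository
git config user.email demo@example.com

echo base > foo.txt
git add foo.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)
git checkout -q -b topic
echo change >> foo.txt
git commit -q -a -m "topic commit"
topic_head=$(git rev-parse topic)

git checkout -q -b with-noff "$base"
git merge -q --no-ff -m "merge topic" topic # always creates a merge commit
noff_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

git checkout -q -b with-ff "$base"
git merge -q topic                          # fast-forward: the branch just moves to topic
ff_head=$(git rev-parse HEAD)

git checkout -q -b with-squash "$base"
git merge -q --squash topic                 # stages the combined change, does not commit
git commit -q -m "squashed topic"
squash_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

echo "no-ff parents: $noff_parents, squash parents: $squash_parents"
```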


Further notes:

  • Do not merge while you have uncommitted local changes unless you are sure that the merge will not result in conflicts, or you are able to resolve conflicts
  • If a merge aborts due to a conflict that you cannot (or do not want to) resolve, you can recover by discarding the local changes in the working tree with git reset --hard


git rebase

TODO. Also say something how "git rebase --interactive" followed by a fast-forward merge may have the same effect as "git merge --squash".


Modify older commits:

  • Start the procedure with this command:
git rebase 96f7a7f^ --interactive
  • In the editor that pops up, find the line whose commit you want to modify. Change "pick" to "edit".
  • Make changes
    • If a file was changed accidentally, simply copy the unchanged file over the changed file
  • Stage changes (git add ...)
  • Modify the commit
git commit --amend
  • Conclude the procedure
git rebase --continue


Conflict handling

Note: Stuff in this chapter has been extracted from the man page of git merge.

Throw away all local changes (e.g. too many conflicts):

git reset --hard

Show different versions of files that are in conflict (usually 3 versions: 1 = common ancestor, 2 = HEAD version, 3 = remote version):

git ls-files -u

Show each one of these three versions of a conflicted file:

git show :1:filename   # common ancestor
git show :2:filename   # HEAD version
git show :3:filename   # remote version
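The three versions of a conflicted file can be inspected in a scratch repository in which a conflict is provoked on purpose. A sketch; the branch name other, the file content and the identity are made up.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.name "Demo User"            # hypothetical identity for the scratch repository
git config user.email demo@example.com

echo base > foo.txt
git add foo.txt
git commit -q -m "base"
first_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b other
echo theirs > foo.txt
git commit -q -a -m "change on other"
git checkout -q "$first_branch"
echo ours > foo.txt
git commit -q -a -m "change on first branch"

git merge other >/dev/null 2>&1 || true     # expected to stop with a conflict

git ls-files -u                             # three index entries for foo.txt
v_ancestor=$(git show :1:foo.txt)
v_head=$(git show :2:foo.txt)
v_remote=$(git show :3:foo.txt)

git reset -q --hard                         # give up on the merge
```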

Run graphical merge tool (on Mac OS X usually launches FileMerge via the opendiff cmdline utility):

git mergetool 


Other stuff

Generating and applying patches

A good overview is this: http://ariejan.net/2009/10/26/how-to-create-and-apply-a-patch-with-git/


Generating patches

Generate a patch that contains one commit A only:

git format-patch -1 A

Note: The resulting file is placed in the current working directory and named after the first line of the commit message. For instance:

0001-final-changes-for-release-0.1.patch

Write the above patch to a different output directory:

git format-patch -1 A -o /tmp/patchdir

Generate a series of patches from commit A+1 up to HEAD:

git format-patch A -o /tmp/patchdir

Generate a series of patches from commit A up to HEAD:

git format-patch A^ -o /tmp/patchdir

Generate a series of patches from the beginning of history up to commit A:

git format-patch A -o /tmp/patchdir --root

Generate a series of patches from commit A to commit B:

git format-patch A^..B -o /tmp/patchdir


Applying patches

Get an overview of what is in the patch:

git apply --stat /path/to/patch

Test whether the patch applies cleanly. If no errors are printed, the patch applies cleanly.

git apply --check /path/to/patch

Apply the patch (without committing):

git apply /path/to/patch

Apply the patch and generate a "Signed-off-by" tag in the commit message. This tag is read by Github and others to provide useful info about how the commit ended up in the code.

git am --signoff /path/to/patch

Notes:

  • Working in a single-committer environment, I find the generated tag not so useful
  • Very useful, however, is that git am automatically uses the comment that is part of the patch file to generate a commit message AND even performs the commit for you. This makes applying patches very fast - if they apply cleanly
  • To recover from a patch that did not apply, use this command
git am --abort
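A complete round trip (generate a patch in one repository, apply it in another) might look like the following sketch. The repository names source and target, the patch directory and the identity are made up.

```shell
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q source
cd source
git config user.name "Demo User"            # hypothetical identity for the scratch repository
git config user.email demo@example.com
echo one > foo.txt
git add foo.txt
git commit -q -m "initial commit"

cd "$tmp"
git clone -q source target                  # target does not have the next commit yet
cd source
echo two >> foo.txt
git commit -q -a -m "add second line"
patch=$(git format-patch -1 HEAD -o "$tmp/patchdir")   # prints the name of the patch file

cd "$tmp/target"
git config user.name "Demo User"            # git am needs a committer identity
git config user.email demo@example.com
git apply --check "$patch"                  # silent if the patch applies cleanly
git am -q --signoff "$patch"                # apply AND commit

subject=$(git log -1 --format=%s)
signed=$(git log -1 --format=%b | grep -c '^Signed-off-by:')
echo "applied: $subject"
```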


Submodules

Documentation


Add a new submodule to a project. The remote repo is cloned into the local subfolder 3rdparty/foo. The submodule name, local path and remote URL are recorded in the .gitmodules file. The .gitmodules file is created if this is the first submodule being added.

git submodule add git://github.com/herzbube/foo.git 3rdparty/foo

By default the master branch will be checked out in the submodule. It may be desirable to checkout a different commit, for instance a specific tag:

cd 3rdparty/foo
git checkout 3.1  # or "git checkout tags/3.1", in case there is a branch that is also named 3.1
cd -

Stage the submodule's new commit and commit the change. The exact commit that is checked out in the submodule is recorded in the superproject's commit.

git add 3rdparty/foo
git commit -m "added submodule foo @ tag 3.1"

Cloning a repo with submodules

git clone ...
git submodule init     # initialize local config file
git submodule update   # check out the submodule's commit that is recorded in the superproject

If a submodule has other submodules, then the "init" and "update" operations both must be performed recursively. A real-world example that requires this is the "modularized boost" repository on GitHub. The following combines the two operations into one command:

git submodule update --init --recursive

Get remote changes into a repo with submodules

git merge origin/master   # merge changes from remote
git submodule update      # check out the submodule commit that the superproject now references

Making changes to a submodule

cd submodule
# make changes
git commit
cd ..
git add submodule
git commit                # record in the superproject that it is now referencing a new submodule commit

Completely remove a submodule

git submodule deinit 3rdparty/foo
git rm 3rdparty/foo

Change the URL of a submodule (recipe from StackOverflow)

# Use any text editor to change the URL
vi .gitmodules
# This command updates the URL in .git/config so that it matches the URL in .gitmodules
git submodule sync

Other notes

  • A submodule is a full Git repository in its own right. Git commands in the submodule path therefore operate on the submodule repo and not the superproject repo.
  • Any changes made in the submodule repo must be "published" (= committed/pushed) BEFORE the superproject changes are pushed, otherwise someone who merges the superproject will get a reference to a non-published commit in the submodule
  • If you want to make changes in a submodule, it is a good idea to create a branch first, otherwise you will work in a "detached head" environment (i.e. HEAD points directly to a commit, not to a symbolic reference), which may make your commits inaccessible if you merge updates from remote


Convert sub-directory into repository of its own

The following command "rewrites" a repository to look as if sub-directory foo has been its project root, and discards all other history. This effectively turns the sub-directory into a repository of its own. I don't pretend to understand in the least what this command does, but I got the magic from this stackoverflow.com question.

git filter-branch --subdirectory-filter foo -- --all

Important note: This action drastically modifies the repository!!! Perform this only on a clone, or push all changes first, or make a backup first.


Depending on how long the old repository has been in use before it was rewritten, the newly rewritten repository still contains quite a bit of overhead and hidden cruft from the old repository. Although there probably are other and better ways to do this, my way of cleaning up is to clone the newly rewritten repository.


The following is a full transcript of how I extracted my HTB repository from the Tools repository:

cd /tmp
git clone gitolite-user:tools.git   # get a fresh copy of the repository to convert
cd tools
git filter-branch --subdirectory-filter htb -- --all
cd ..
git clone file:///tmp/tools htb
du -sh tools htb
880K	tools
524K	htb


Further cleanup steps:

  • Get another fresh clone of the original repository and remove the sub-directory that has been extracted. I do this with a simple git rm -r foo. This leaves the sub-directory's history intact, but I am sure there is a way to destroy the history as well.
  • Add the newly rewritten repository to gitolite. The only problem here is that the rewritten repository already has a remote that is still connected to the original repository - this can be easily resolved by removing the remote first:
cd /tmp/htb
git remote rm origin
git remote add origin gitolite-user:htb.git


Replay commits into a different repository

The following steps are taken verbatim from this Stack Overflow question. I used these commands to set up fuego-on-ios a second time, after I had decided to use svn2git instead of git svn.

cd /path/to/destination/repo
git remote add temp file:///path/to/source/repo
git fetch temp
git checkout temp/master -b wip   # wip = work-in-progress
git rebase master                 # replays commits (rebase onto master)
git checkout master
git merge wip
# Cleanup
git branch -d wip
git remote rm temp

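The reason this works: git rebase master re-creates every commit of wip that is not yet part of master on top of master, so the source repository's commits end up as a linear continuation of the destination's history. After that, master is an ancestor of wip, so git checkout master followed by git merge wip is a plain fast-forward of master to the tip of wip.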
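
The sequence can be reproduced with two throwaway repositories. This is a sketch under assumptions: the repository names, files and commit messages are invented, and the two repositories touch different files so that the rebase cannot run into conflicts. Unlike the commands above, the sketch creates wip with --no-track so that the later git branch -d compares wip against the current branch rather than against temp/master.

```shell
set -e
# Hypothetical identity so that commits work without any global configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# Source repository with two commits
git init -q source
cd source
git symbolic-ref HEAD refs/heads/master
echo one >s1.txt; git add s1.txt; git commit -q -m "source: first"
echo two >s2.txt; git add s2.txt; git commit -q -m "source: second"
cd ..

# Destination repository with an unrelated history
git init -q destination
cd destination
git symbolic-ref HEAD refs/heads/master
echo dest >d.txt; git add d.txt; git commit -q -m "destination: first"

git remote add temp file://"$work"/source
git fetch -q temp
git checkout -q --no-track -b wip temp/master
git rebase -q master     # re-creates both source commits on top of master
git checkout -q master
git merge -q wip         # fast-forward
git branch -q -d wip
git remote rm temp

git log --format=%s master   # newest first: the two source commits, then the destination commit
```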


Diagnostics & error recovery

Check repository integrity:

git fsck --full

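If a packed archive exists (pack files are normally located in GITDIR/objects/pack), extract the individual objects from the pack and write them to the current repository as loose objects. Notes: the pack is read from standard input, so the accompanying .idx file is not needed for this command; and objects that already exist in the repository are skipped, so the pack file must be located outside of the target repository (e.g. moved to /tmp first):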

git unpack-objects </tmp/foo.pack

To see the type of an object (the example object would be located in GITDIR/objects/6c/8cae4994b5ec7891ccb1527d30634997a978ee):

git cat-file -t 6c8cae4994b5ec7891ccb1527d30634997a978ee

To see the content (pretty-printed) of an object with ID ID:

git cat-file -p ID

To see the content of a tree object with object ID T (equivalent to git cat-file -p T if the object is a tree):

git ls-tree T

To see the content of a tree object that belongs to commit with object ID C:

git ls-tree C

To recursively list the content of a tree object (note: it is important to specify the -r option in front of the tree object ID, otherwise git will interpret the option as a pattern to match):

git ls-tree -r 6c8cae4994b5ec7891ccb1527d30634997a978ee

Recreate a tree object from ls-tree formatted text:

cd ~/git/backups/foo.git
git ls-tree 6c8cae4994b5ec7891ccb1527d30634997a978ee >/tmp/lstree.txt
cd ~/git/recovery/foo.git
git mktree </tmp/lstree.txt
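
The inspection commands and the ls-tree/mktree round trip can be tried in a scratch repository. The sketch below uses a single made-up repository instead of separate backup and recovery repositories; this matters because git mktree verifies that every listed object exists in the repository it runs in, unless --missing is given.

```shell
set -e
# Hypothetical identity so that commits work without any global configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
mkdir doc
echo "read me" >doc/README
echo "hello" >hello.txt
git add .
git commit -q -m "Initial commit"

tree=$(git rev-parse 'HEAD^{tree}')
git cat-file -t "$tree"      # prints "tree"
git ls-tree "$tree"          # top-level entries: doc (a tree) and hello.txt (a blob)
git ls-tree -r "$tree"       # recursive: blobs only, with their full paths

# Round trip: feed the non-recursive listing back into mktree
git ls-tree "$tree" >"$work/lstree.txt"
rebuilt=$(git mktree <"$work/lstree.txt")
echo "$tree"
echo "$rebuilt"              # same ID as the line before
```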

Show information about a commit with object ID C:

git show C

Recreate a commit object from tree with object ID T, linking it to the parent commit object with object ID C (note that author name, email and date are taken from environment variables, or from configuration file items):

git commit-tree T -p C </tmp/changelog

Print out the object ID that a file would get if it were made into a blob:

git hash-object doc/README

Recreate a blob object (and print its ID):

git hash-object -w doc/README
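
Putting the last few commands together, the following sketch rebuilds a blob, a tree and a commit by hand in an empty scratch repository. File name, content and commit message are invented for illustration. Note that git hash-object is given the file as an argument here; it reads standard input only when --stdin is specified.

```shell
set -e
# commit-tree takes author and committer from these variables (hypothetical values)
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q recovery
cd recovery
mkdir doc
echo "read me" >doc/README

# Compute the ID only, then actually write the blob to the object database
id_only=$(git hash-object doc/README)
blob=$(git hash-object -w doc/README)
git cat-file -t "$blob"      # prints "blob"

# Build a one-entry tree from ls-tree formatted text: mode, type, ID, TAB, name
tree=$(printf '100644 blob %s\tREADME\n' "$blob" | git mktree)

# Create a root commit (no -p option) from that tree; the message comes from stdin
commit=$(echo "Recovered README" | git commit-tree "$tree")
git cat-file -p "$commit"
```

The new commit is not referenced by any branch yet; something like git update-ref or git branch would be needed to make it reachable.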


Working with Subversion

General information

The command to interact with a Subversion repository is

git svn

A Git repository that is connected to a Subversion repository stores the link in

.git/config


Cloning and tracking an upstream Subversion repository

This section shows how to clone a third-party software project's Subversion repository and how to track the project's upstream progress on an ongoing basis, while maintaining my own modifications in a separate branch of a Git repository that I control.


The following commands clone the upstream Subversion repo into a local Git repo. A word of warning: This clones the entire Subversion repo, including all branches and tags. The operation therefore might take quite a while, depending on the size of the upstream repo.

mkdir fuego-on-ios
cd fuego-on-ios
git svn init --stdlayout http://svn.code.sf.net/p/fuego/code/ .
git svn fetch

Notes:

  • This creates a master branch that tracks the upstream trunk
  • This also creates many remote-tracking branches, one for each tag and branch in the upstream repo. git branch -a lists them, for instance:
* master
  remotes/EGC2008
  remotes/OLYMPIAD2008
  remotes/VERSION_0_1_FIXES
  remotes/VERSION_0_2_FIXES
  remotes/VERSION_0_3_FIXES
  remotes/VERSION_0_4_1_FIXES
  remotes/VERSION_0_4_FIXES
  remotes/VERSION_1_FIXES
  remotes/tags/EGC2008_1
  remotes/tags/OLYMPIAD2008_1
  remotes/tags/PAMPLONA_2009
  remotes/tags/UEC_CUP_2013
  remotes/tags/VERSION_0_1
  remotes/tags/VERSION_0_1_1
  remotes/tags/VERSION_0_2
  remotes/tags/VERSION_0_2_2
  remotes/tags/VERSION_0_3
  remotes/tags/VERSION_0_3_1
  remotes/tags/VERSION_0_3_2
  remotes/tags/VERSION_0_4
  remotes/tags/VERSION_0_4_1
  remotes/tags/VERSION_1_0
  remotes/tags/VERSION_1_1
  remotes/trunk

The main integration branch that holds my own modifications cannot be named "master" because git svn has already taken that name for itself. For this reason I like to create a branch that is named after the repository:

git branch fuego-on-ios


Integrating upstream changes into a tracking Git repository

Download upstream revisions and convert them into Git commits on the remote-tracking refs (e.g. remotes/trunk), but do not touch the current branch or the working tree:

git svn fetch

Integrate upstream revisions into the current branch:

git svn rebase

Notes:

  • Performs git svn fetch first, i.e. downloads upstream revisions. To skip this step, i.e. to rebase only revisions that are already fetched: git svn rebase --local
  • Creates Git commits from all outstanding upstream revisions
  • Replays (rebases) all local commits that have not been committed back to the upstream Subversion repository on top of the latest Subversion revision commit
  • Performs a fast-forward if no local changes exist (the ideal case!)


Reconnecting a Git repository with upstream Subversion repository after a clone

(the solution to the following problem is a glorified copy of this Stack Overflow answer)

The problem:

  1. Machine 1: You create a local Git repository that tracks an upstream Subversion repository
  2. Machine 1: You make the local Git repository public, e.g. you push it to GitHub
  3. Machine 2: You clone the public repository
  4. Machine 2: The cloned repository is no longer connected to the upstream Subversion repository. For instance, it is not possible to say git svn info or sync with upstream with git svn rebase. The error message you receive is this:
Unable to determine upstream SVN information from working tree history

The first thing to fix is to add some necessary entries to .git/config. In the original Git repository that was created with git svn clone there is a section like this:

[svn-remote "svn"]
	url = http://svn.code.sf.net/p/fuego/code
	fetch = trunk:refs/remotes/trunk
	branches = branches/*:refs/remotes/*
	tags = tags/*:refs/remotes/tags/*

We need to replicate this section in the cloned repository. This can be done either by manually editing the .git/config file, or by issuing a number of commands:

git config svn-remote.svn.url http://svn.code.sf.net/p/fuego/code
git config svn-remote.svn.fetch trunk:refs/remotes/trunk
git config svn-remote.svn.branches 'branches/*:refs/remotes/*'
git config svn-remote.svn.tags 'tags/*:refs/remotes/tags/*'

This is not yet enough, though: git svn info still produces the error message from above. What we still need to do is to set up the remote-tracking ref named "trunk". In the original Git repository there is a file such as this:

cat .git/refs/remotes/trunk 
3318343b099ae9649fadfa5dd53a87adff095ed7

So we need to create the same file in the cloned repository with an appropriate hash:

  1. In the cloned repository, look at the output of git log when you are on the master branch. Note down the hash of the most recent commit that represents a Subversion commit.
  2. Create the file .git/refs/remotes/trunk and add a line to it with the hash you noted down in the previous step.
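
The two steps above can also be done with Git commands instead of editing the file by hand. The sketch below assumes that the commits created by git svn carry the usual git-svn-id: line in their message (this is the default, but it is absent if the original clone was made with --no-metadata). The toy repository, the URL and the UUID are placeholders; no Subversion server is involved.

```shell
set -e
# Hypothetical identity so that commits work without any global configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q clone
cd clone
git symbolic-ref HEAD refs/heads/master

# Stand-ins for commits that git svn created
echo a >file; git add file
git commit -q -m "Upstream change" \
    -m "git-svn-id: http://svn.example.invalid/repo/trunk@1 00000000-0000-0000-0000-000000000000"
echo b >file
git commit -q -am "Another upstream change" \
    -m "git-svn-id: http://svn.example.invalid/repo/trunk@2 00000000-0000-0000-0000-000000000000"
# A purely local commit that was never sent to Subversion
echo c >file
git commit -q -am "Local change"

# Step 1: most recent commit on master that represents a Subversion commit
svn_head=$(git log -1 --format=%H --grep='^git-svn-id:' master)

# Step 2: has the same effect as writing the hash into .git/refs/remotes/trunk
git update-ref refs/remotes/trunk "$svn_head"

git log -1 --format=%s refs/remotes/trunk
```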


The final step is to restore the contents of the .git/svn folder. I prefer to do this with

git svn info

but other "git svn" commands such as git svn fetch should do as well.


External Tools

DiffMerge

DiffMerge is a freely available (though not open source) visual diff and merge tool. If the diffmerge command line utility is installed, DiffMerge can be integrated as a merge tool into Git using the following configuration:

git config --global merge.tool diffmerge
git config --global mergetool.diffmerge.cmd 'diffmerge --merge --result="$MERGED" "$LOCAL" "$(if test -f "$BASE"; then echo "$BASE"; else echo "$LOCAL"; fi)" "$REMOTE"'
git config --global mergetool.diffmerge.trustExitCode true

To also configure DiffMerge as a diff tool:

git config --global diff.tool diffmerge
git config --global difftool.diffmerge.cmd 'diffmerge "$LOCAL" "$REMOTE"'


GitHub

Overview

Signing up with GitHub provides a free (for open source projects) public place to host Git repositories. A few general notes:

  • To be allowed to push over SSH, an account needs to be associated with one or more public SSH keys
  • Old versions of the GitHub API required the use of a secret API token if applications wanted to do special things on GitHub. This has changed with v3 of the GitHub API, which uses OAuth tokens instead.


Local Git configuration

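GitHub requires "user.name" and "user.email" entries to be in your Git settings file ~/.gitconfig: Git records both values in every commit, and GitHub uses the email address to associate commits with your account. The following commands set the entries (an existing value is overwritten):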

git config --global user.name "Billy Everyteen"
git config --global user.email "me@here.com"


Note: Old versions of the GitHub API required the presence of "github.user" and, optionally, "github.token". Since the release of the GitHub API v3 these entries are no longer necessary.


Local SSH configuration

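Add the following snippet to your ~/.ssh/config file. UseKeychain is specific to the OpenSSH version that ships with macOS; on other systems omit that line.

Host github.com
    UseKeychain yes
    AddKeysToAgent yes
    IdentityFile ~/.ssh/foo.id_rsa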


Cloning a GitHub repository

git clone git@github.com:herzbube/reponame.git 


Adding and removing branches to a GitHub repository

I did not find a way to create or delete a branch using GitHub's web interface. Presumably the idea is that this must be done locally and then pushed to GitHub.

Create a new branch locally, then push it to GitHub. The push automatically creates the branch upstream. In addition, use -u to let the local branch track the remote branch.

git branch newbranch
git push -u origin newbranch

Delete a branch locally, then do the same remotely on GitHub:

git branch -d mybranch
git push origin --delete mybranch
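
Both operations can be rehearsed without touching GitHub. In the following sketch a local bare repository (hosted.git, a made-up name) stands in for the repository on GitHub; everything else follows the commands above.

```shell
set -e
# Hypothetical identity so that commits work without any global configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q --bare hosted.git                      # stand-in for the GitHub repository
git clone -q file://"$work"/hosted.git local 2>/dev/null
cd local
git symbolic-ref HEAD refs/heads/master
echo hello >README; git add README; git commit -q -m "Initial commit"
git push -q -u origin master

# Create the branch locally, then publish it and set up tracking
git branch newbranch
git push -q -u origin newbranch
git ls-remote --heads origin     # lists master and newbranch

# Delete the branch locally, then remotely
git branch -q -d newbranch
git push -q origin --delete newbranch
git ls-remote --heads origin     # only master is left
```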


Creating a GitHub repository with the content of an external repository

This task was needed after I had decided to move my Little Go repository from my own server to GitHub, including all branches, history and tags.

First create the desired repository on GitHub, then run the following commands. Note that this may not work with older versions of git - I don't recollect which version is the minimum, but if you have trouble, try the alternate command sequence further down. The following commands should work with newer versions of git (git 1.8 should do).

git clone --bare http://git.herzbube.ch/littlego.git
cd littlego.git
git push --mirror git@github.com:herzbube/littlego.git
cd ..
rm -rf littlego.git
git clone git@github.com:herzbube/littlego.git

This alternative command sequence should also work, but here you have to take care that you push all branches.

git clone --bare gitolite-user:littlego.git
cd littlego.git
git remote rm origin
git remote add origin git@github.com:herzbube/littlego.git
git push -u origin master
git push -u origin develop
git push -u origin --tags
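
To see what the first command sequence (bare clone plus git push --mirror) transfers, it can be run against local stand-ins. In this sketch the repositories seed and new.git are invented placeholders for the old server and the freshly created GitHub repository.

```shell
set -e
# Hypothetical identity so that commits work without any global configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for the old server: two branches and one tag
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo v1 >file; git add file; git commit -q -m "First commit"
git tag v1.0
git checkout -q -b develop
echo v2 >file; git commit -q -am "Work on develop"
cd ..

# Stand-in for the new, still empty repository on GitHub
git init -q --bare new.git

# The migration itself
git clone -q --bare file://"$work"/seed littlego.git
cd littlego.git
git push -q --mirror file://"$work"/new.git
cd ..
rm -rf littlego.git

# All branches and tags have arrived
git --git-dir=new.git for-each-ref --format='%(refname)'
```

Because --mirror pushes every ref, there is no need to name the branches one by one as in the alternative sequence.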


Add a patch to a project where you don't have write access

  • Fork the project
  • Clone the project locally
  • Make changes, commit & push back to GitHub
  • On GitHub navigate to the forked project, then at the top of the screen click the button "Pull Request" (not the link "Pull Requests" which will display requests for your forked repository)
  • GitHub help on pull requests has all the details
  • Once the request has been sent, an issue will be created for the target (original) repository. The commits that were included in the pull request are attached to the issue.


Sync a forked repo

The following section is based on this GitHub help section.

Syncing a forked repo with the upstream repo is done locally, i.e. it is not possible to do this directly in the GitHub web interface.

The first step is to locally clone the forked repo. The local clone will now have a remote labelled (by default) "origin" that points to the forked repo on GitHub. You can check the currently configured remotes like this:

git remote -v

The next step is to add a remote to the local clone that points to the upstream repo. The remote is labelled "upstream" in the following example.

git remote add upstream https://github.com/original_owner/original_repository.git


Next, retrieve the content of the upstream repository into your local clone. Note that "upstream" is the label we used for the remote in the previous example.

git fetch upstream


Finally, merge changes in upstream branches into your forked branches. For the "master" branch this looks like this:

# Switch to forked "master"
git checkout master
# Merge
git merge upstream/master


Repeat for all branches that you want to sync. Obviously, to make the changes visible on GitHub you must then push them:

git push
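
The whole sync procedure can be walked through offline. The sketch below replaces the two GitHub repositories with local ones (original and fork.git are made-up names) and pushes master explicitly instead of relying on the default push behaviour.

```shell
set -e
# Hypothetical identity so that commits work without any global configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for the upstream project
git init -q original
cd original
git symbolic-ref HEAD refs/heads/master
echo v1 >file; git add file; git commit -q -m "Initial commit"
cd ..

# Stand-in for the fork on GitHub
git clone -q --bare file://"$work"/original fork.git

# Upstream moves on after the fork was made
cd original
echo v2 >file; git commit -q -am "Upstream change"
cd ..

# Local clone of the fork: "origin" points to the fork
git clone -q file://"$work"/fork.git local
cd local
git remote add upstream file://"$work"/original
git fetch -q upstream
git checkout -q master
git merge -q upstream/master   # a fast-forward here, because the fork has no commits of its own
git push -q origin master

# The fork now contains the upstream change
git --git-dir="$work/fork.git" log -1 --format=%s master
```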


GitHub Pages

GitHub Pages provides a convenient and easy way to create user and project websites. Everything is nicely documented here: https://help.github.com/categories/20/articles.

The help articles are none too obvious about what to do if you just want a standalone project page accessible under a custom subdomain. For instance, I wanted to have a page for the "Little Go" project to be accessible under littlego.herzbube.ch. It's actually very simple:

  • Add a CNAME for the subdomain to DNS and let it resolve to pages.github.com. For instance, I added the CNAME littlego.herzbube.ch. (Current GitHub documentation names username.github.io as the CNAME target instead.)
  • Create the project's gh-pages branch. For Little Go, I went to the project's settings page and under "GitHub Pages" clicked the button "Automatic Page Generator". This triggers a wizard that you need to step through to create the branch. You can also choose from among several very nice layouts and the wizard will populate the branch with the files necessary to display your page in that layout.
  • Add a file CNAME to the gh-pages branch. The file's content is the name of the subdomain that the page should be accessible under. In my example this is "littlego.herzbube.ch"
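
Adding the CNAME file is an ordinary commit on the gh-pages branch. The following sketch uses a scratch repository in which a hand-made gh-pages branch stands in for the one generated by the wizard; in the real repository the last step would be a push of gh-pages to GitHub.

```shell
set -e
# Hypothetical identity so that commits work without any global configuration
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/gh-pages    # stand-in for the generated branch
echo "<h1>Little Go</h1>" >index.html
git add index.html
git commit -q -m "Generated page"

# The CNAME file holds a single line: the custom subdomain
echo "littlego.herzbube.ch" >CNAME
git add CNAME
git commit -q -m "Add CNAME for custom subdomain"

git show gh-pages:CNAME
```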