Advanced Guide to Git

As a modern software engineer, you've hopefully been introduced to the most fundamental Git commands, but there is a surprising amount more to the tool than you might think. So much so, in fact, that it might challenge the way you think of Git. For one thing, your commit history is much more fluid than you might assume; it isn't written in stone the way you're led to believe early on. This guide is about unlocking some of the hidden powers of Git and taking your version control to the next level. As such, it assumes you already have a decent grasp of committing, branching, and merging. It's also a helpful reference for me personally: there are a few super helpful commands that I use so rarely that I have to Google them every time. Ready? Good, let's start.

Global Gitignore

You might already know that if you create a file called .gitignore in your Git project and add some ignore rules, Git will automatically ignore any files that match those rules. What you may not know is that you can also create a global ignore file whose rules apply to every project on your machine. Try creating one in your home directory:

~/.gitignore

# Global gitignore

# Build files
*.bin

# Editor-specific files
.idea/
.vscode/

Note that you can name your file anything and keep it anywhere. Then, simply run:

git config --global core.excludesfile ~/.gitignore

and Git will ignore untracked files matching those rules in every repository on your machine. I recommend keeping editor-specific ignore rules here; this keeps each team member from adding their own editor- and workspace-specific rules to every project's ignore rules.
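To see the global ignore file in action without touching your real configuration, here is a minimal sketch. It assumes a throwaway HOME inside a temporary directory; the sandbox path, the project directory, and the file names firmware.bin and main.c are made up for illustration.

```shell
set -eu
# Use a throwaway HOME so your real ~/.gitconfig is left alone.
sandbox=$(mktemp -d)
export HOME="$sandbox/home"
unset XDG_CONFIG_HOME
mkdir -p "$HOME"

# Write the global ignore rules and tell Git where to find them.
printf '%s\n' '*.bin' '.idea/' '.vscode/' > "$HOME/.gitignore"
git config --global core.excludesFile "$HOME/.gitignore"

# Any repository now honours those rules, even with no local .gitignore.
git -c init.defaultBranch=master init -q "$sandbox/project"
cd "$sandbox/project"
touch firmware.bin main.c

# check-ignore -v reports which rule (and which file) matched.
git check-ignore -v firmware.bin
untracked=$(git status --porcelain)
echo "$untracked"
```

Only main.c should show up as untracked; firmware.bin is hidden by the global rule.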

Git Grep

Occasionally, you'll come across a project where the build directory is significantly larger than the codebase. In cases like this, it's often very slow to use regular grep. Fortunately, Git ships with a tool called git grep, which only searches files tracked by Git. Simply use:

git grep "search term"
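As a small illustration of the difference, the sketch below builds a scratch repository in which only src/ is tracked. The directory layout and file contents are invented for the example.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q project
cd project

mkdir src build
echo 'int answer = 42; /* search term */' > src/main.c
echo 'generated output mentioning search term' > build/output.log

# Only src/main.c is tracked; build/ stays untracked.
git add src/main.c

# -l lists matching file names, -n would show line numbers instead.
matches=$(git grep -l "search term")
echo "$matches"
```

The untracked build/output.log contains the same phrase, but git grep never opens it.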

Remotes

You might have also heard about the distributed nature of Git, but have you ever actually taken advantage of it? It's good to know how to handle remotes flexibly in case you ever run into a situation where you need to. For example, your Git server might be down for an extended period of time, or you may need to collaborate with teammates in a place without Internet access.

If you need to, you can clone directly from a peer machine on the network over SSH:

git clone ssh://gituser@hostname/path/to/project

Then you'll have to ask the owner of the peer machine to enter their password (or alternatively, ask them to set up a new user on their machine with a shared password) and you're good to go! Push and pull like normal.

Alternatively, you may run into a situation where code is being developed on a machine which doesn't have access to your Git server, such as a contractor's machine that isn't allowed network access or a lab machine that isn't on the main building network. In this case, you would add the access-less machine as another remote of your project on your machine using

git remote add mylabmachine username@hostname[.domain.com]:path/to/project.git

Check to make sure it got added correctly with git remote -v; you should see two remotes listed now. Now you can simply git pull mylabmachine master and git push origin master whenever you need to pull code from the lab machine and sync it with the common remote.
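The two-remote workflow can be rehearsed entirely on one machine. The following sketch uses local directories as stand-ins for the SSH URLs: origin.git plays the shared server and lab plays the isolated machine. All paths, names, and commit messages here are assumptions made for the example.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# The shared server (bare) and your own working copy.
git -c init.defaultBranch=master init -q --bare origin.git
git -c init.defaultBranch=master init -q mine
git -C mine commit -q --allow-empty -m "Initial commit"
git -C mine remote add origin ../origin.git
git -C mine push -q origin master

# The lab machine has a copy of the project and gains a new commit.
git clone -q mine lab
git -C lab commit -q --allow-empty -m "Work done on the lab machine"

# Add the lab machine as a second remote, then relay its work to origin.
cd mine
git remote add mylabmachine ../lab
git remote -v
git pull -q mylabmachine master
git push -q origin master
```

With real machines, only the two remote URLs change; the pull and push commands stay the same.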

Stashing

You might already know this one, but Git's stash is incredibly helpful. It sets your uncommitted changes aside without committing them, so that you can reapply them later, even on a different branch. For example, if you're trying to git pull but you have a possibly conflicting change that's preventing Git from pulling, git stash your changes so you can pull, then run git stash pop or git stash apply (pop will also delete the stash unless it causes a merge conflict). Now you can properly deal with merge conflicts if there are any.

Stashing is also very helpful when uncommitted changes prevent you from performing a Git action, e.g. checking out another branch. You can have multiple stashed changes. Run git stash list to see them all and git stash show -p stash@{#} to view a stash in diff form, replacing '#' with the index of the stash you want. Use git stash drop stash@{#} to delete a single stash and git stash clear to delete all of them.

All of these commands can be run without a stash index, in which case the action simply applies to the topmost stash (stash@{0}). To keep your stashes from getting out of control, you should attach memorable messages when stashing. Older guides use git stash save for this, but that form is deprecated in favor of git stash push. Use

git stash push -m "My memorable message"
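Here is a short sketch of one stash round trip in a scratch repository. The file name notes.txt and its contents are hypothetical.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
git -c init.defaultBranch=master init -q project
cd project

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Make an uncommitted change, then set it aside with a message.
echo "half-finished idea" >> notes.txt
git stash push -q -m "My memorable message"
during=$(git status --porcelain)   # empty: the working tree is clean again
listing=$(git stash list)
echo "$listing"

# Bring the change back and remove it from the stash list.
git stash pop -q
after=$(git status --porcelain)
echo "$after"
```

While the change is stashed you are free to pull or switch branches; git stash pop then restores it as an ordinary unstaged modification.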

Patches

Sometimes you need to move code around but can't or don't want to push it to a remote. Maybe it's test code that you want to send your colleague to debug, and you'd rather not push it, even as a WIP. In this case, you can write your changes into a patch, which is just a text file containing the diff. Simply run

git diff > mypatch.patch

and your unstaged changes will be saved into the file (use git diff HEAD instead if you also want to include staged changes). Send it to your colleague, and let them apply it to their copy of the project with

git apply mypatch.patch
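The sketch below plays both roles on one machine: mine is your copy and colleague is a clone standing in for your teammate's copy. The directory and file names are invented, and git apply --check is added as an optional dry run before applying.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Your copy of the project, plus a colleague's clone of it.
git -c init.defaultBranch=master init -q mine
cd mine
echo "line one" > app.txt
git add app.txt
git commit -q -m "Initial commit"
cd ..
git clone -q mine colleague

# Make an uncommitted change and write it out as a patch.
cd mine
echo "experimental line" >> app.txt
git diff > ../mypatch.patch

# The colleague checks that the patch applies cleanly, then applies it.
cd ../colleague
git apply --check ../mypatch.patch
git apply ../mypatch.patch
cat app.txt
```

After applying, the colleague's working tree has the same uncommitted change, and nothing was pushed anywhere.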

Branching

As a mindful developer, you should strive to keep your Git history as maintainable as your code. There's a variety of techniques you can use, but be forewarned: some of them sound like heresy because they result in altered Git histories. It's common to hear colleagues call these techniques evil, but the truth is that anything can seem evil if you attempt to use it without fully understanding how it works. Bear with me and you'll find they actually result in cleaner code bases and easier debugging when used properly.

The only rule you need to follow is, if you do choose to alter your Git history, either make sure it only affects a local set of unpushed commits, or wait until you're about to merge and your branch will be deleted right after the merge. Unless you know what you're doing, do not ever use

git push --force # Don't do this unless you know what you're doing.

to push your changes to the remote when collaborating. This will overwrite commits on the remote branch and could unintentionally destroy someone else's work.
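If you ever do need to overwrite a remote branch, Git also offers git push --force-with-lease, a flag not covered above, which refuses to push when the remote branch has moved since you last fetched it. The sketch below is one way to see that safety check fire; alice, bob, and origin.git are hypothetical local stand-ins for two teammates and a shared server.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# A shared remote with one commit (empty commits keep the sketch short).
git -c init.defaultBranch=master init -q --bare origin.git
git -c init.defaultBranch=master init -q seed
git -C seed commit -q --allow-empty -m "Initial commit"
git -C seed push -q ../origin.git master

git clone -q origin.git alice
git clone -q origin.git bob

# Bob pushes new work that Alice has not fetched yet.
git -C bob commit -q --allow-empty -m "Bob's work"
git -C bob push -q origin master

# Alice rewrites her local history and tries to overwrite the remote.
git -C alice commit -q --amend --allow-empty -m "Alice rewrites history"
if git -C alice push -q --force-with-lease origin master 2>/dev/null; then
  result=overwritten
else
  result=rejected   # the remote moved since Alice's last fetch
fi
echo "$result"
```

A plain git push --force in the same situation would have silently discarded Bob's commit.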

Amend

Amending a commit is a handy technique. Say you're in a situation where you just committed your code, only to realize it doesn't build because you missed a semicolon somewhere. What now? Should you make another commit just for the semicolon fix? Well, if you haven't pushed yet, you're in luck! Make the fix, stage the file(s), and use:

git commit --amend

You'll end up with the fix amended to your last commit. Note that the hash of that commit is no longer the same after the amend, which is why this technique counts as changing Git's history. You can also use this technique to just edit the last commit's message.

What happens if you accidentally amend a commit instead of creating a new one? Fortunately, Git has your back there, too. The history of which commit your HEAD pointer was pointing to is maintained in Git's reflog. View the reflog with:

git reflog

HEAD should currently be pointing to your amended commit, but HEAD@{1} should be pointing to your pre-amend commit, which still exists but is no longer part of your branch's history. So, to undo the amend, reset back to HEAD@{1}, the commit HEAD was pointing to before the amend. You don't want to lose the amended changes, so use a soft reset, which leaves them staged. Finally, commit those staged changes, reusing the message and author details of the amended commit (which the reflog now also calls HEAD@{1}, because the reset added a new entry). This leaves your pre-amend commit intact, with the amended changes in a separate commit on top of it:

git reset --soft HEAD@{1}
git commit -C HEAD@{1}
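To check what those two commands do, here is a sketch that performs an accidental amend in a scratch repository and then undoes it. The file main.c and the commit message are made up for the example.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
git -c init.defaultBranch=master init -q project
cd project

echo "int main() { return 0 }" > main.c
git add main.c
git commit -q -m "Add main"
original=$(git rev-parse HEAD)

# Oops: this fix was meant to be its own commit, not an amend.
echo "int main() { return 0; }" > main.c
git add main.c
git commit -q --amend --no-edit

# Undo: go back to the pre-amend commit, keeping the fix staged...
git reset -q --soft "HEAD@{1}"
# ...then commit it separately, reusing the amended commit's message.
git commit -q -C "HEAD@{1}"

git log --oneline
```

The log now shows two commits: the original one, with its original hash, and a new one on top carrying the fix.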

Moving Commits

Have you ever committed to the wrong branch? It's okay, it happens to the best of us! Just don't git push yet. Check out the branch you meant to commit on, then run

git cherry-pick <commit-hash>

to copy over the accidental commit to this branch. Now go back to the other branch and run

git reset --hard <commit-hash>

with the hash of the last commit that should be on that branch. Now you can push your commit to the correct branch. Phew, no problem. Cherry-pick can help you out of tight spots, but be careful not to abuse it.

Git cherry-pick can help you out of tight spots.

For reference, git reset --hard moves your branch back to the given commit and discards any later commits and uncommitted changes from your local branch. If the discarded commits were already pushed, git pull will bring them back from the remote; otherwise they are only reachable through the reflog. Using git reset --soft, on the other hand, will undo the commit but leave the changes from that commit staged. That's useful when you realize your commit message has a typo but haven't pushed yet. (On that note, you can also use git commit --amend to add changes to the last commit and/or change the commit message.)
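The whole rescue can be rehearsed in a scratch repository. In this sketch a commit lands on master by mistake and is moved to a branch named topic; the branch, file, and message names are hypothetical.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
git -c init.defaultBranch=master init -q project
cd project

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"
git branch topic

# Oops: this commit was meant for topic, but we are still on master.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"
wrong=$(git rev-parse HEAD)

# Copy the commit onto the branch it belongs to...
git checkout -q topic
git cherry-pick "$wrong" > /dev/null

# ...then move master back to the last commit that should be there.
git checkout -q master
git reset -q --hard HEAD~1

git log --oneline --all
```

Note that the cherry-picked commit on topic gets a new hash; it is a copy of the original, not the same commit.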

Rebase

Finally, one of the most powerful tools in Git is called rebase. It has two main uses—rebasing on top of another branch and interactive rebase, which is the one that allows rearranging history.

Say you've been working locally on an unpushed branch called topic, which initially branched from master. You see some new commits appear on master that are useful to you. At this point, you have two options. You could merge master into your branch, but if you have to do this often, your branch's history will become hard to parse. Instead, because your branch is unpushed, you have another option: rebasing (yes, it does exactly what it sounds like). Once your local master has those new commits (or after a git fetch, using origin/master in place of master), simply run

git rebase master

and voila! Your commits have now moved on top of master. If you encounter a conflict, don't fret; just resolve it like you would for a merge, then run git rebase --continue (Git will prompt you). Now your branch history will stay clean, you'll be able to trace commit history more easily across the project, and you'll get a clean fast-forward when you merge back into master, as long as master hasn't moved again in the meantime. As an added benefit, the commits related to your feature will stay together as a group instead of being interspersed with many others in master.

Git rebase can be used instead of merge.

As an aside, if you're working on a pushed branch but have some unpushed commits, you can pull your teammates' changes to that branch with git pull --rebase to rebase your commits on top of theirs.
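To make the plain git rebase master flow concrete, here is a sketch in a scratch repository where topic and master have each gained one commit since they diverged. The file names and commit messages are invented.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
git -c init.defaultBranch=master init -q project
cd project

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Work on topic...
git checkout -q -b topic
echo "topic" > topic.txt
git add topic.txt
git commit -q -m "Topic work"

# ...while master moves ahead independently.
git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "New work on master"

# Replay the topic commits on top of the new master.
git checkout -q topic
git rebase -q master

git log --oneline --graph topic
```

The graph comes out as a straight line with "Topic work" sitting directly on top of "New work on master", so merging topic into master is a fast-forward.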

The other rebase, interactive rebase, is an interesting feature that comes with great power and great responsibility. Imagine that you've been working again on a branch called topic. You've been diligently marking commits with WIP to signify that they don't necessarily build, and you're getting ready to merge back to master soon. At this point, you might realize that nobody will ever want to explicitly check out one of the WIP commits, so why not combine them? This is called squashing; it's a feature within Git's interactive rebase. Now, if all your commits are unpushed, you can clean up your commits as you go without messing up anyone's history. Otherwise, you should only perform the interactive rebase when you're about to merge back to master and you'll be deleting the topic branch after. Run

git rebase -i HEAD~#

replacing # with the number of commits you want to rearrange. You'll enter the interactive rebase interface, which lists those commits from oldest to newest. You'll see instructions there, but in this interface you would type s or squash next to each WIP commit to fold it into the commit listed directly above it. You'll have an opportunity later to assign new commit messages to these super-commits. You can also rearrange and delete commits in this interface, but be careful not to lose any work. And that's it! You've successfully changed your Git history. Again, do not use git push --force when collaborating unless you know what you're doing, because you'll risk undoing someone else's work.
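Interactive rebase normally opens your editor, which makes it awkward to demonstrate in a script. As a rough sketch, the example below swaps the editor for a tiny helper script through the GIT_SEQUENCE_EDITOR environment variable; the helper (squash-editor.sh), the file name, and the commit messages are all assumptions made for the demo. In day-to-day use you would edit the list by hand instead.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
git -c init.defaultBranch=master init -q project
cd project

# One base commit followed by a feature commit and two WIP commits.
for msg in "Initial commit" "Add feature" "WIP: more" "WIP: done"; do
  echo "$msg" >> work.txt
  git add work.txt
  git commit -q -m "$msg"
done

# Stand-in for typing "squash" by hand: keep the first "pick" in the
# todo list and turn every later one into "squash".
cat > "$sandbox/squash-editor.sh" <<'EOF'
#!/bin/sh
awk 'NR > 1 && $1 == "pick" { sub(/^pick/, "squash") } { print }' "$1" > "$1.tmp"
mv "$1.tmp" "$1"
EOF
chmod +x "$sandbox/squash-editor.sh"

# GIT_EDITOR=true accepts the combined commit message unchanged.
GIT_SEQUENCE_EDITOR="$sandbox/squash-editor.sh" GIT_EDITOR=true \
  git rebase -i HEAD~3

git log --oneline
```

The three newest commits collapse into a single commit whose subject is "Add feature", with the WIP messages kept in its body.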

Conclusion

I hope that these techniques, life-changing or not, will make you a better developer than ever before. For a more thorough guide, check out Pro Git. As mentioned earlier, I believe a great developer should maintain their source tree just as fastidiously as their code. Not only does a clean history help out your teammates, but it also helps you when bug-hunting (and/or git bisecting, another technique for another time). Try out the commands in a test project to get used to them and you'll reap the rewards. Now git outta here!