www.it-ebooks.info
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit the Creative Commons website or send a letter
to Creative Commons, 171 Second Street, Suite 300, San Francisco, California,
94105, USA.
Preface
Welcome to the second edition of Pro Git. The first edition was published over
four years ago now. Since then a lot has changed and yet many important
things have not. While most of the core commands and concepts are still valid
today as the Git core team is pretty fantastic at keeping things backward compatible, there have been some significant additions and changes in the community surrounding Git. The second edition of this book is meant to address those
changes and update the book so it can be more helpful to the new user.
When I wrote the first edition, Git was still a relatively difficult-to-use and
barely adopted tool for the hard-core hacker. It was starting to gain steam in
certain communities, but had not reached anywhere near the ubiquity it has today. Since then, nearly every open source community has adopted it. Git has
made incredible progress on Windows, in the explosion of graphical user interfaces for it on all platforms, in IDE support and in business use. The Pro Git of
four years ago knows about none of that. One of the main aims of this new edition is to touch on all of those new frontiers in the Git community.
The Open Source community using Git has also exploded. When I originally
sat down to write the book nearly five years ago (it took me a while to get the
first version out), I had just started working at a very little-known company developing a Git hosting website called GitHub. At the time of publishing there
were maybe a few thousand people using the site and just four of us working on
it. As I write this introduction, GitHub is announcing our 10 millionth hosted
project, with nearly 5 million registered developer accounts and over 230 employees. Love it or hate it, GitHub has heavily changed large swaths of the Open
Source community in a way that was barely conceivable when I sat down to
write the first edition.
I wrote a small section in the original version of Pro Git about GitHub as an
example of hosted Git, which I was never very comfortable with. I didn’t much
like that I was writing what I felt was essentially a community resource and also
talking about my company in it. While I still don’t love that conflict of interest,
the importance of GitHub in the Git community is unavoidable. Instead of an
example of Git hosting, I have decided to turn that part of the book into a
deeper description of what GitHub is and how to use it effectively. If you are going
to learn how to use Git then knowing how to use GitHub will help you take part
in a huge community, which is valuable no matter which Git host you decide to
use for your own code.
The other large change since the first edition was published has been the development and rise of the HTTP protocol for Git network transactions. Most of
the examples in the book have been changed to HTTP from SSH because it’s so
much simpler.
It’s been amazing to watch Git grow over the past few years from a relatively
obscure version control system to basically dominating commercial and open
source version control. I’m happy that Pro Git has done so well and has also
been able to be one of the few technical books on the market that is both quite
successful and fully open source.
I hope you enjoy this updated edition of Pro Git.
Contributors
Since this is an Open Source book, we have had several errata and content
changes donated over the years. Here are all the people who have contributed
to the English version of Pro Git as an open source project, each listed with
the number of contributions. Thank you all for helping make this a better book
for everyone.
 2  Aaron Schumacher
 4  Aggelos Orfanakos
 4  Alec Clews
 1  Alex Moundalexis
 2  Alexander Harkness
 1  Alexander Kahn
 1  Andrew McCarthy
 1  AntonioK
 1  Benjamin Bergman
 1  Brennon Bortz
 2  Brian P O'Rourke
 1  Bryan Goines
 1  Cameron Wright
 1  Chris Down
 1  Christian Kluge
 1  Christoph Korn
 2  Ciro Santilli
 2  Cor
 1  Dan Croak
 1  Dan Johnson
 1  Daniel Kay
 2  Daniel Rosen
 1  DanielWeber
 1  Dave Dash
10  Davide Fiorentino lo Regio
 2  Dilip M
 1  Dimitar Bonev
 1  Emmanuel Trillaud
 1  Eric-Paul Lecluse
 1  Eugene Serkin
 1  Fernando Dobladez
 2  Gordon McCreight
 1  Helmut K. C. Tessarek
31  Igor Murzov
 1  Ilya Kuznetsov
 1  Jason St. John
 1  Jay Taggart
 1  Jean Jordaan
51  Jean-Noël Avila
 1  Jean-Noël Rouvignac
 1  Jed Hartman
 1  Jeffrey Forman
 1  John DeStefano
 1  Junior
 1  Kieran Spear
 1  Larry Shatzer, Jr
 1  Linquize
 1  Markus
 7  Matt Deacalion Stevens
 1  Matthew McCullough
 1  Matthieu Moy
 1  Max F. Albrecht
 1  Michael Schneider
 8  Mike D. Smith
 1  Mike Limansky
 1  Olivier Trichet
 1  Ondrej Novy
 6  Ori Avtalion
 1  Paul Baumgart
 1  Peter Vojtek
 1  Philipp Kempgen
 2  Philippe Lhoste
 1  PowerKiKi
 1  Radek Simko
 1  Rasmus Abrahamsen
 1  Reinhard Holler
 1  Ross Light
 1  Ryuichi Okumura
 1  Sebastian Wiesinger
 1  Severyn Kozak
 1  Shane
 2  Shannen
 8  Sitaram Chamarty
 5  Soon Van
 4  Sven Axelsson
 2  Tim Court
 1  Tuomas Suutari
 1  Vlad Gorodetsky
 3  W. Trevor King
 1  Wyatt Carss
 1  Włodzimierz Gajda
 1  Xue Fuqiao
 1  Yue Lin Ho
 2  adelcambre
 1  anaran
 1  bdukes
 1  burningTyger
 1  cor
 1  iosias
 7  nicesw123
 1  onovy
 2  pcasaretto
 1  sampablokuper
Introduction
You’re about to spend several hours of your life reading about Git. Let’s take a
minute to explain what we have in store for you. Here is a quick summary of the
ten chapters and three appendices of this book.
In Chapter 1, we’re going to cover Version Control Systems (VCSs) and Git
basics—no technical stuff, just what Git is, why it came about in a land full of
VCSs, what sets it apart, and why so many people are using it. Then, we’ll explain how to download Git and set it up for the first time if you don’t already
have it on your system.
In Chapter 2, we will go over basic Git usage—how to use Git in the 80% of
cases you’ll encounter most often. After reading this chapter, you should be
able to clone a repository, see what has happened in the history of the project,
modify files, and contribute changes. If the book spontaneously combusts at
this point, you should already be pretty useful wielding Git in the time it takes
you to go pick up another copy.
Chapter 3 is about the branching model in Git, often described as Git’s killer
feature. Here you’ll learn what truly sets Git apart from the pack. When you’re
done, you may feel the need to spend a quiet moment pondering how you lived
before Git branching was part of your life.
Chapter 4 will cover Git on the server. This chapter is for those of you who
want to set up Git inside your organization or on your own personal server for
collaboration. We will also explore various hosted options if you prefer to let
someone else handle that for you.
Chapter 5 will go over in full detail various distributed workflows and how to
accomplish them with Git. When you are done with this chapter, you should be
able to work expertly with multiple remote repositories, use Git over e-mail and
deftly juggle numerous remote branches and contributed patches.
Chapter 6 covers the GitHub hosting service and tooling in depth. We cover
signing up for and managing an account, creating and using Git repositories,
common workflows to contribute to projects and to accept contributions to
yours, GitHub’s programmatic interface and lots of little tips to make your life
easier in general.
Chapter 7 is about advanced Git commands. Here you will learn about topics like mastering the scary reset command, using binary search to identify
bugs, editing history, revision selection in detail, and a lot more. This chapter
will round out your knowledge of Git so that you are truly a master.
Chapter 8 is about configuring your custom Git environment. This includes
setting up hook scripts to enforce or encourage customized policies and using
environment configuration settings so you can work the way you want to. We
will also cover building your own set of scripts to enforce a custom committing
policy.
Chapter 9 deals with Git and other VCSs. This includes using Git in a Subversion (SVN) world and converting projects from other VCSs to Git. A lot of organizations still use SVN and are not about to change, but by this point you’ll have
learned the incredible power of Git, and this chapter shows you how to cope if
you still have to use an SVN server. We also cover how to import projects from
several different systems in case you do convince everyone to take the plunge.
Chapter 10 delves into the murky yet beautiful depths of Git internals. Now
that you know all about Git and can wield it with power and grace, you can
move on to discuss how Git stores its objects, what the object model is, details
of packfiles, server protocols, and more. Throughout the book, we will refer to
sections of this chapter in case you feel like diving deep at that point; but if you
are like me and want to dive into the technical details, you may want to read
Chapter 10 first. We leave that up to you.
In Appendix A we look at a number of examples of using Git in various specific environments. We cover a number of different GUIs and IDEs
in which you may want to use Git, and what is available for you in each. If
you’re interested in an overview of using Git in your shell, in Visual Studio or
Eclipse, take a look here.
In Appendix B we explore scripting and extending Git through tools like libgit2 and JGit. If you’re interested in writing complex and fast custom tools and
need low level Git access, this is where you can see what that landscape looks
like.
Finally, in Appendix C we go through all the major Git commands one at a
time and review where in the book we covered them and what we did with
them. If you want to know where in the book we used any specific Git command,
you can look that up here.
Let’s get started.
Table of Contents
Preface
Contributors
Introduction
CHAPTER 1: Getting Started
  About Version Control
    Local Version Control Systems
    Centralized Version Control Systems
    Distributed Version Control Systems
  A Short History of Git
  Git Basics
    Snapshots, Not Differences
    Nearly Every Operation Is Local
    Git Has Integrity
    Git Generally Only Adds Data
    The Three States
  The Command Line
  Installing Git
    Installing on Linux
    Installing on Mac
    Installing on Windows
    Installing from Source
  First-Time Git Setup
    Your Identity
    Your Editor
    Checking Your Settings
  Getting Help
  Summary
CHAPTER 2: Git Basics
  Getting a Git Repository
    Initializing a Repository in an Existing Directory
    Cloning an Existing Repository
  Recording Changes to the Repository
    Checking the Status of Your Files
    Tracking New Files
    Staging Modified Files
    Short Status
    Ignoring Files
    Viewing Your Staged and Unstaged Changes
    Committing Your Changes
    Skipping the Staging Area
    Removing Files
    Moving Files
  Viewing the Commit History
    Limiting Log Output
  Undoing Things
    Unstaging a Staged File
    Unmodifying a Modified File
  Working with Remotes
    Showing Your Remotes
    Adding Remote Repositories
    Fetching and Pulling from Your Remotes
    Pushing to Your Remotes
    Inspecting a Remote
    Removing and Renaming Remotes
  Tagging
    Listing Your Tags
    Creating Tags
    Annotated Tags
    Lightweight Tags
    Tagging Later
    Sharing Tags
    Checking out Tags
  Git Aliases
  Summary
CHAPTER 3: Git Branching
  Branches in a Nutshell
    Creating a New Branch
    Switching Branches
  Basic Branching and Merging
    Basic Branching
    Basic Merging
    Basic Merge Conflicts
  Branch Management
  Branching Workflows
    Long-Running Branches
    Topic Branches
  Remote Branches
    Pushing
    Tracking Branches
    Pulling
    Deleting Remote Branches
  Rebasing
    The Basic Rebase
    More Interesting Rebases
    The Perils of Rebasing
    Rebase When You Rebase
    Rebase vs. Merge
  Summary
CHAPTER 4: Git on the Server
  The Protocols
    Local Protocol
    The HTTP Protocols
    The SSH Protocol
    The Git Protocol
  Getting Git on a Server
    Putting the Bare Repository on a Server
    Small Setups
  Generating Your SSH Public Key
  Setting Up the Server
  Git Daemon
  Smart HTTP
  GitWeb
  GitLab
    Installation
    Administration
    Basic Usage
    Working Together
  Third Party Hosted Options
  Summary
CHAPTER 5: Distributed Git
  Distributed Workflows
    Centralized Workflow
    Integration-Manager Workflow
    Dictator and Lieutenants Workflow
    Workflows Summary
  Contributing to a Project
    Commit Guidelines
    Private Small Team
    Private Managed Team
    Forked Public Project
    Public Project over E-Mail
    Summary
  Maintaining a Project
    Working in Topic Branches
    Applying Patches from E-mail
    Checking Out Remote Branches
    Determining What Is Introduced
    Integrating Contributed Work
    Tagging Your Releases
    Generating a Build Number
    Preparing a Release
    The Shortlog
  Summary
CHAPTER 6: GitHub
  Account Setup and Configuration
    SSH Access
    Your Avatar
    Your Email Addresses
    Two Factor Authentication
  Contributing to a Project
    Forking Projects
    The GitHub Flow
    Advanced Pull Requests
    Markdown
  Maintaining a Project
    Creating a New Repository
    Adding Collaborators
    Managing Pull Requests
    Mentions and Notifications
    Special Files
    README
    CONTRIBUTING
    Project Administration
  Managing an organization
    Organization Basics
    Teams
    Audit Log
  Scripting GitHub
    Hooks
    The GitHub API
    Basic Usage
    Commenting on an Issue
    Changing the Status of a Pull Request
    Octokit
  Summary
CHAPTER 7: Git Tools
  Revision Selection
    Single Revisions
    Short SHA
    Branch References
    RefLog Shortnames
    Ancestry References
    Commit Ranges
  Interactive Staging
    Staging and Unstaging Files
    Staging Patches
  Stashing and Cleaning
    Stashing Your Work
    Creative Stashing
    Creating a Branch from a Stash
    Cleaning your Working Directory
  Signing Your Work
    GPG Introduction
    Signing Tags
    Verifying Tags
    Signing Commits
    Everyone Must Sign
  Searching
    Git Grep
    Git Log Searching
  Rewriting History
    Changing the Last Commit
    Changing Multiple Commit Messages
    Reordering Commits
    Squashing Commits
    Splitting a Commit
    The Nuclear Option: filter-branch
  Reset Demystified
    The Three Trees
    The Workflow
    The Role of Reset
    Reset With a Path
    Squashing
    Check It Out
    Summary
  Advanced Merging
    Merge Conflicts
    Undoing Merges
    Other Types of Merges
  Rerere
  Debugging with Git
    File Annotation
    Binary Search
  Submodules
    Starting with Submodules
    Cloning a Project with Submodules
    Working on a Project with Submodules
    Submodule Tips
    Issues with Submodules
  Bundling
  Replace
  Credential Storage
    Under the Hood
    A Custom Credential Cache
  Summary
CHAPTER 8: Customizing Git
  Git Configuration
    Basic Client Configuration
    Colors in Git
    External Merge and Diff Tools
    Formatting and Whitespace
    Server Configuration
  Git Attributes
    Binary Files
    Keyword Expansion
    Exporting Your Repository
    Merge Strategies
  Git Hooks
    Installing a Hook
    Client-Side Hooks
    Server-Side Hooks
  An Example Git-Enforced Policy
    Server-Side Hook
    Client-Side Hooks
  Summary
CHAPTER 9: Git and Other Systems
  Git as a Client
    Git and Subversion
    Git and Mercurial
    Git and Perforce
    Git and TFS
  Migrating to Git
    Subversion
    Mercurial
    Perforce
    TFS
    A Custom Importer
  Summary
CHAPTER 10: Git Internals
  Plumbing and Porcelain
  Git Objects
    Tree Objects
    Commit Objects
    Object Storage
  Git References
    The HEAD
    Tags
    Remotes
  Packfiles
  The Refspec
    Pushing Refspecs
    Deleting References
  Transfer Protocols
    The Dumb Protocol
    The Smart Protocol
    Protocols Summary
  Maintenance and Data Recovery
    Maintenance
    Data Recovery
    Removing Objects
  Environment Variables
    Global Behavior
    Repository Locations
    Pathspecs
    Committing
    Networking
    Diffing and Merging
    Debugging
    Miscellaneous
  Summary
Git in Other Environments
Embedding Git in your Applications
Git Commands
Index
CHAPTER 1: Getting Started
This chapter is about getting started with Git. We will begin by explaining
some background on version control tools, then move on to how to get Git running on your system, and finally how to set it up so you can start working with it. By the
end of this chapter you should understand why Git exists and why you should
use it, and you should be all set up to do so.
About Version Control
What is “version control”, and why should you care? Version control is a system
that records changes to a file or set of files over time so that you can recall specific versions later. For the examples in this book you will use software source
code as the files being version controlled, though in reality you can do this with
nearly any type of file on a computer.
If you are a graphic or web designer and want to keep every version of an
image or layout (which you would most certainly want to), a Version Control
System (VCS) is a very wise thing to use. It allows you to revert files back to a
previous state, revert the entire project back to a previous state, compare
changes over time, see who last modified something that might be causing a
problem, who introduced an issue and when, and more. Using a VCS also generally means that if you screw things up or lose files, you can easily recover. In
addition, you get all this for very little overhead.
Local Version Control Systems
Many people’s version-control method of choice is to copy files into another directory (perhaps a time-stamped directory, if they’re clever). This approach is
very common because it is so simple, but it is also incredibly error prone. It is
easy to forget which directory you’re in and accidentally write to the wrong file
or copy over files you don’t mean to.
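The copy-into-a-directory approach described above can be sketched in a few lines of Python. This is only an illustration: the `snapshot` function, its parameters and the directory naming scheme are assumptions made for this example, not part of any particular tool.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(project_dir, backup_root, now=None):
    """Copy project_dir into a new time-stamped directory under backup_root.

    `now` is an optional datetime (hypothetical parameter, handy for testing);
    it defaults to the current time.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    target = Path(backup_root) / f"{Path(project_dir).name}_{stamp}"
    # copytree refuses to write into an existing directory, so a second
    # snapshot with the same time stamp fails instead of overwriting the first.
    shutil.copytree(project_dir, target)
    return target
```

Each call leaves a full copy of the project behind, for example `snapshot("site", "backups")`. Nothing records what changed or why, and restoring a version means copying files back by hand, which is where the mistakes described above tend to creep in.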
To deal with this issue, programmers long ago developed local VCSs that had
a simple database that kept all the changes to files under revision control.
FIGURE 1-1. Local version control.
One of the more popular VCS tools was a system called RCS, which is still
distributed with many computers today. Even the popular Mac OS X operating
system includes the rcs command when you install the Developer Tools. RCS
works by keeping patch sets (that is, the differences between files) in a special
format on disk; it can then re-create what any file looked like at any point in
time by adding up all the patches.
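To make the idea of patch sets concrete, here is a minimal sketch in Python of a store that keeps the first version of a file in full and every later version only as a patch against its predecessor. It is not RCS's actual on-disk format or algorithm; the `PatchStore` class and the `make_patch` and `apply_patch` helpers are hypothetical names invented for this illustration, built on the standard `difflib` module.

```python
from difflib import SequenceMatcher


def make_patch(old, new):
    """Record only the regions of `old` that must change to produce `new`."""
    ops = SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
    # Keep (start, end, replacement_lines) for every non-matching region.
    return [(i1, i2, new[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def apply_patch(old, patch):
    """Rebuild the newer version from `old` plus its patch."""
    result = list(old)
    # Apply from the end of the file so earlier indices stay valid.
    for i1, i2, replacement in reversed(patch):
        result[i1:i2] = replacement
    return result


class PatchStore:
    """Keeps one full base version plus a chain of patches (illustrative only)."""

    def __init__(self, first_version):
        self.base = list(first_version)
        self.patches = []
        self._latest = list(first_version)

    def commit(self, new_version):
        """Store new_version as a patch; return its revision number."""
        self.patches.append(make_patch(self._latest, new_version))
        self._latest = list(new_version)
        return len(self.patches)

    def checkout(self, revision):
        """Re-create a revision by adding up patches; revision 0 is the base."""
        lines = list(self.base)
        for patch in self.patches[:revision]:
            lines = apply_patch(lines, patch)
        return lines
```

A file is represented here as a list of lines. After `store = PatchStore(lines_v0)` and a few calls to `store.commit(...)`, calling `store.checkout(n)` replays the first `n` patches on top of the base to rebuild that revision, so only the differences are stored for each later version.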
Centralized Version Control Systems
The next major issue that people encounter is that they need to collaborate
with developers on other systems. To deal with this problem, Centralized Version Control Systems (CVCSs) were developed. These systems, such as CVS,
Subversion, and Perforce, have a single server that contains all the versioned
files, and a number of clients that check out files from that central place. For
many years, this has been the standard for version control.
FIGURE 1-2. Centralized version control.
This setup offers many advantages, especially over local VCSs. For example,
everyone knows to a certain degree what everyone else on the project is doing.
Administrators have fine-grained control over who can do what; and it’s far easier to administer a CVCS than it is to deal with local databases on every client.
However, this setup also has some serious downsides. The most obvious is
the single point of failure that the centralized server represents. If that server
goes down for an hour, then during that hour nobody can collaborate at all or
save versioned changes to anything they’re working on. If the hard disk the central database is on becomes corrupted, and proper backups haven’t been kept,
you lose absolutely everything – the entire history of the project except whatever single snapshots people happen to have on their local machines. Local VCSs
suffer from this same problem – whenever you have the entire history
of the project in a single place, you risk losing everything.
Distributed Version Control Systems
This is where Distributed Version Control Systems (DVCSs) step in. In a DVCS
(such as Git, Mercurial, Bazaar or Darcs), clients don’t just check out the latest