7. SCCS
• First VCS available on any Unix system
• Developed in SNOBOL at Bell Labs in 1972
• Originally written for IBM System/370 computers
running OS/360
• Its file format is used in BitKeeper and other VCS
• Introduced repositories and a locking mechanism
8. CVS
• Ancestor of many later revision control systems
• First released in 1986 by Dick Grune
• Simple technology with a gentle learning curve
• Useful for sharing and backing up files
• TortoiseCVS is the de facto standard CVS client on Windows
• Introduced merging
• Last stable release was in 2008
9. Apache Subversion
• Created in 2000
• Used by the Apache software projects, and also by
Mono, SourceForge and Google Code
• One of the most widely adopted SCM systems
• Atomic commits
• Maintains versioning for directories, renames, and
file metadata
• Better support for branches and tagging
14. Git
• Distributed revision control and source code
management system
• Designed and developed by Linus Torvalds for
Linux kernel development
• Written to replace BitKeeper, which the kernel
project had used until then
• Development began in April 2005
• Current version at the time of writing: 1.8.2
15. Linus Torvalds
• Swedish-speaking Finnish-American
• Chief architect and project coordinator of the
Linux kernel
• Named after Linus Pauling and Linus van Pelt
• Second lieutenant in the Finnish Army
• Co-winner of the Millennium Technology Prize in 2012
• Calls himself an egotistical bastard
18. Junio Hamano
• Graduated from Tokyo University
• Git maintainer since 2005
• Participated in Linux development
• Currently a developer at Google
19. Design Principles
• Take CVS as an example of what not to do
• Support distributed workflow
• Scale to thousands of developers
• Strong consistency and integrity support
• Free
20. Features
• Fast branching and merging
• Distributed development
• Compatibility and emulation
• Performance breakthrough
• Hashing of revisions
• Garbage collector
• Packed data storage
21. Git Repository
• Database containing revisions and history of the
project
• Retains a complete copy of the entire project
• Maintains an object store and an index
• The object store contains data files, log messages
and audit information
24. Blobs
• Each version of a file is represented as a blob.
• Blob internal structure is ignored by Git.
• A blob holds a file’s data but does not contain any
metadata about the file or even its name.
• The git show command displays the contents of a blob.
25. Trees
• A tree object represents one level of directory
information.
• It records blob identifiers and path names for all the
files in one directory.
• It can also recursively reference other subtree
objects.
• Can be examined with the git show or git ls-tree
commands.
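The tree description above can be made concrete with a small sketch of how one directory level might be serialized. It assumes the commonly documented layout of a tree entry (file mode, space, name, NUL byte, then the 20-byte binary object id); the helper names `object_id` and `tree_body` are hypothetical, and the sort is simplified compared with Git's own ordering rules for directories.

```python
import hashlib

def object_id(obj_type, body):
    """Hash a '<type> <size>' header, a NUL byte, and the body with SHA-1."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    """Serialize (mode, name, hex_id) entries for one directory level.

    Simplification: entries are sorted by plain name only.
    """
    body = b""
    for mode, name, hex_id in sorted(entries, key=lambda entry: entry[1]):
        body += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\0"
        body += bytes.fromhex(hex_id)  # 20 raw bytes, not 40 hex digits
    return body

# A blob knows nothing about its name; the tree supplies the path.
blob_id = object_id("blob", b"hello world\n")
body = tree_body([("100644", "hello.txt", blob_id)])
tree_id = object_id("tree", body)
print(tree_id)
```

The file name appears only in the tree body, which is why renaming a file changes the tree but leaves the blob untouched.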
26. Commits
• A commit object holds metadata for each change
including the author, commit date, and log
message.
• Each commit points to a tree object that
captures the state of the repository at the time the
commit was performed.
• git tag stable-1 1b2e1d63ff
27. Tags
• A tag object assigns an arbitrary yet presumably
human readable name to a specific object, usually
a commit.
• Contains the object name, object type, tag name,
tagger and tag message.
• Can be examined with the git cat-file command.
29. Git Object Model
• Object store is organized and implemented as a
content-addressable storage system.
• Each object has a unique name produced by
applying SHA1 to the contents of the object.
• SHA1 hash is a sufficient index or name for that
object in the object database.
• SHA1 values are 160-bit values that are represented
as a 40-digit hexadecimal number
• 9da581d910c9c4ac93557ca4859e767f5caf5169
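As a rough illustration of content addressing, the sketch below computes the name of a blob the way Git is generally documented to do it: SHA-1 over a short header (`blob <size>` followed by a NUL byte) plus the file contents. The function name `git_object_id` is hypothetical.

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the 40-digit hexadecimal SHA-1 name for an object."""
    # The header records the object type and the content length in bytes.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

empty_id = git_object_id("blob", b"")
hello_id = git_object_id("blob", b"hello world\n")
print(empty_id)
print(hello_id)
```

With Git installed, the second value should match what `echo 'hello world' | git hash-object --stdin` prints.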
30. Advantages
• Git can determine equality of the objects by
comparing names.
• The same content stored in two repositories will
always be stored under the same name.
• Corruption errors can be detected by checking
that the object's name is still the SHA1 hash of its
contents.
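The three advantages above can be sketched with a toy in-memory store. This is not how Git stores data on disk; the class `ObjectStore` and its methods are hypothetical and only model the idea that an object's name is derived from its bytes.

```python
import hashlib

class ObjectStore:
    """Toy content-addressable store: the key is computed from the value."""

    def __init__(self):
        self._objects = {}

    @staticmethod
    def name_of(content: bytes) -> str:
        header = b"blob " + str(len(content)).encode("ascii") + b"\0"
        return hashlib.sha1(header + content).hexdigest()

    def put(self, content: bytes) -> str:
        name = self.name_of(content)
        self._objects[name] = content  # identical content reuses the same key
        return name

    def is_intact(self, name: str) -> bool:
        # Intact means: re-hashing the stored bytes reproduces the name.
        return self.name_of(self._objects[name]) == name

store_a, store_b = ObjectStore(), ObjectStore()
name_a = store_a.put(b"same bytes\n")
name_b = store_b.put(b"same bytes\n")
same_name = name_a == name_b                # equality by comparing names

intact_before = store_a.is_intact(name_a)
store_a._objects[name_a] = b"same bytez\n"  # simulate corruption
intact_after = store_a.is_intact(name_a)
print(same_name, intact_before, intact_after)
```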
31. Name Vs Content
• Git stores each version of a file in full, not as
differences
• The path name is kept separate from the file contents
• The object store is keyed by a hash computed over
the file contents, not the file name
System     Index mechanism                     Data store
Database   Indexed Sequential Access Method    Data records
Unix FS    Directories (/path)                 Blocks of data
Git        .git/objects/hash                   Blob/tree objects
32. Git Directory
• Stores all of Git's history, configuration and
metadata for your project
• There is only one Git directory per project
• By default it’s '.git' in the root of your project
35. Git Directory
• Object Database:
-objects
• Default Git object database
• Contains all content or
pointers to local content.
• All objects are immutable
36. Git Directory
• References:
-refs
• Stores reference pointers for
branches, tags and heads.
• A reference is a pointer to
an object, usually of type
tag or commit.
• References change as
the repository evolves
37. Working Directory
• Holds the current checkout of the files
• Files can be removed or replaced by Git as
you switch branches
• The working directory is a temporary checkout area
38. Index
• The index is a temporary and dynamic binary file
that captures a version of the project’s overall
structure
• The project’s state could be represented by a
commit and a tree from any point in the project’s
history
• The index allows a separation between incremental
development steps and the committal of those
changes.
39. Index
• Staging area between your working directory and
your repository
• A commit records the files as staged in the
index, not as they are in the working directory
• Its state can be viewed with the git status command.
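The staging behaviour described above can be modelled with plain dictionaries. This is a conceptual sketch only; the names `working_dir`, `index`, `add` and `commit` are hypothetical stand-ins, not Git internals.

```python
working_dir = {"file.txt": "v1"}  # what is on disk right now
index = {}                        # what has been staged
commits = []                      # recorded snapshots

def add(path):
    """Stage the current working-directory content of one file."""
    index[path] = working_dir[path]

def commit(message):
    """Record a snapshot of the index, ignoring unstaged edits."""
    snapshot = dict(index)
    commits.append((message, snapshot))
    return snapshot

add("file.txt")
working_dir["file.txt"] = "v2"    # edited after staging, not re-added
snapshot = commit("first commit")
print(snapshot["file.txt"], working_dir["file.txt"])
```

The commit holds the staged `v1`, while the working directory already contains `v2`.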
49. Branching
• A branch is a named line of development in the
commit graph
• The master branch is created by default
• HEAD is a pointer to the current branch
• “git branch test” creates the branch test.
• “git checkout master” switches to the branch master.
• “git merge test” merges the changes from test into
the current branch (here master).
• Merges are done automatically unless the changes
conflict.
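One way to picture branching is as movable pointers into a graph of commits. The toy model below assumes that view; the class `Repo` and its methods are hypothetical, commit ids are plain strings, and merging is deliberately left out.

```python
class Repo:
    """Toy model: branches are names pointing at commits, HEAD names a branch."""

    def __init__(self):
        self.commits = {}                # commit id -> list of parent ids
        self.refs = {}                   # full ref name -> commit id
        self.head = "refs/heads/master"  # HEAD points at the current branch

    def commit(self, commit_id):
        parent = self.refs.get(self.head)
        self.commits[commit_id] = [parent] if parent else []
        self.refs[self.head] = commit_id  # only the current branch moves

    def branch(self, name):
        self.refs["refs/heads/" + name] = self.refs[self.head]

    def checkout(self, name):
        self.head = "refs/heads/" + name

    def log(self, name):
        history, current = [], self.refs["refs/heads/" + name]
        while current:
            history.append(current)
            parents = self.commits[current]
            current = parents[0] if parents else None
        return history

repo = Repo()
repo.commit("c1")
repo.branch("test")    # like: git branch test
repo.checkout("test")  # like: git checkout test
repo.commit("c2")
master_log = repo.log("master")
test_log = repo.log("test")
print(master_log, test_log)
```

Creating a branch copies one pointer, which is why it is cheap regardless of project size.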
50. Conflicts
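• If a conflict cannot be resolved automatically, the
index and working tree are left in a special state
• “git status” lists the unmerged files; the files
themselves contain conflict markers
• After editing a file, mark it as resolved: git add file.txt
• Conclude the merge: git commit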
54. Revert
• Creates a new commit that undoes the changes of
an earlier commit
• git revert HEAD
• git revert HEAD~1 -m 2 (-m selects the mainline
parent when reverting a merge commit)
55. Git References
• All references are named with a slash-separated
path name starting with "refs".
• The branch "test" is short for "refs/heads/test".
• The tag "v1.0" is short for "refs/tags/v1.0".
• "origin/master" is short for
"refs/remotes/origin/master"
56. Git References
• The HEAD file is a symbolic reference to the branch
we are currently using
• git symbolic-ref HEAD prints refs/heads/master
• The HEAD file itself contains: ref: refs/heads/master
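The two reference slides can be summarized in a short sketch: one helper reads the branch out of a symbolic HEAD, the other expands a short name by trying the candidate locations in the order described in Git's revision documentation. Both function names are hypothetical, and the set of existing refs is supplied by hand rather than read from a repository.

```python
def parse_head(head_text):
    """Return the ref a symbolic HEAD points to, or None if HEAD is detached."""
    text = head_text.strip()
    if text.startswith("ref: "):
        return text[len("ref: "):]
    return None  # a raw commit id means no branch is checked out

def expand_ref(short_name, existing_refs):
    """Return the first full ref name that exists for a short name."""
    candidates = [
        short_name,
        "refs/" + short_name,
        "refs/tags/" + short_name,
        "refs/heads/" + short_name,
        "refs/remotes/" + short_name,
        "refs/remotes/" + short_name + "/HEAD",
    ]
    for candidate in candidates:
        if candidate in existing_refs:
            return candidate
    return None

refs = {"refs/heads/test", "refs/tags/v1.0", "refs/remotes/origin/master"}
current = parse_head("ref: refs/heads/master\n")
print(current)
print(expand_ref("test", refs))
print(expand_ref("origin/master", refs))
```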
61. Feature Branches
• Feature branches (or topic branches) are used to
store new features
• Can be merged into develop or
discarded
• git checkout -b newfeature develop
63. Hotfix branches
• Hotfix branches prepare a new, unplanned production
release.
• Created in response to critical bugs in a production
environment.
• Separate ongoing development of the
current version from the hotfix.
75. Git Hooks
• Scripts placed in $GIT_DIR/hooks directory to trigger
action at certain points
• pre-commit
• commit-msg
• post-commit
• post-checkout
• post-merge
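A hook can be any executable, so a commit-msg hook could be sketched in Python as below. Git passes the path of the file holding the proposed message as the first argument, and a non-zero exit status aborts the commit. The 72-character limit and the function names are hypothetical policy choices, not Git defaults.

```python
import sys

MAX_SUBJECT = 72  # hypothetical team policy

def check_message(text):
    """Return a list of problems found in a commit message (empty if none)."""
    # Lines starting with '#' are comments from the message template.
    lines = [line for line in text.splitlines() if not line.startswith("#")]
    problems = []
    if not lines or not lines[0].strip():
        problems.append("subject line is empty")
    elif len(lines[0]) > MAX_SUBJECT:
        problems.append("subject line is longer than %d characters" % MAX_SUBJECT)
    return problems

def main(argv):
    """Entry point for a script saved as $GIT_DIR/hooks/commit-msg."""
    with open(argv[1], encoding="utf-8") as handle:
        problems = check_message(handle.read())
    for problem in problems:
        print("commit-msg:", problem, file=sys.stderr)
    return 1 if problems else 0

# In the installed hook, the last line would be: sys.exit(main(sys.argv))
print(check_message("Fix off-by-one error in parser\n"))
```

The script must be executable for Git to run it.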
76. Object Store
• All objects are stored compressed and are named by
their SHA-1 values.
• They contain the object type, size and contents in a
zlib-compressed format.
• Two storage formats: loose objects and packed objects.
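The storage layout above can be sketched as a round trip: build the header plus contents, name it by SHA-1, and compress it with zlib. The helper names `encode_object` and `decode_object` are hypothetical; only the standard library is used.

```python
import hashlib
import zlib

def encode_object(obj_type, content):
    """Return (sha1_name, compressed_bytes) for one object."""
    raw = f"{obj_type} {len(content)}".encode("ascii") + b"\0" + content
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)

def decode_object(compressed):
    """Return (obj_type, content) from compressed object bytes."""
    raw = zlib.decompress(compressed)
    header, _, content = raw.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content

name, stored = encode_object("blob", b"hello world\n")
obj_type, content = decode_object(stored)
print(name, obj_type, content)
```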
77. Loose Objects
• Compressed data stored in a single file on disk
• Every object written to a separate file
• SHA1 ab04d884140f7b0cf8bbf86d6883869f16a46f65
• $GIT_DIR/objects/ab/04d884140f7b0cf8bbf86d6883869f16a46f65
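The mapping from an object name to its loose-object file, as shown in the example above, amounts to splitting off the first two hex digits as a directory name. A minimal sketch, with a hypothetical helper name:

```python
import posixpath

def loose_object_path(git_dir, sha1_hex):
    """Return the path of the loose object file for a 40-digit SHA-1 name."""
    if len(sha1_hex) != 40:
        raise ValueError("expected a 40-digit hexadecimal SHA-1")
    # First two digits form the directory, the remaining 38 the file name.
    return posixpath.join(git_dir, "objects", sha1_hex[:2], sha1_hex[2:])

path = loose_object_path(".git", "ab04d884140f7b0cf8bbf86d6883869f16a46f65")
print(path)
```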
78. Packed Objects
• A packfile stores one version of a file in full and
only the differences (deltas) for similar versions
• Uses a heuristic to choose which objects to store
as deltas of each other
• git gc packs the data
• git unpack-objects converts the data back into loose format
79. Ignoring files
• # Ignore any file named sample.txt.
• sample.txt
• # Ignore Eclipse files
• *.project
• # except my.project with manual setting.
• !my.project
• # Ignore objects and archives.
• *.[oa]
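How the patterns above interact, in particular the `!` negation, can be approximated with Python's `fnmatch` module. This is a simplified sketch that matches bare file names only and lets the last matching pattern decide; real Git ignore rules also handle directories, anchoring and `**`, which are not modelled here. The function name `is_ignored` is hypothetical.

```python
from fnmatch import fnmatchcase

PATTERNS = ["sample.txt", "*.project", "!my.project", "*.[oa]"]

def is_ignored(filename, patterns=PATTERNS):
    """Return True if the bare file name ends up ignored."""
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        if fnmatchcase(filename, pattern):
            ignored = not negated  # a later match overrides an earlier one
    return ignored

for name in ["sample.txt", "demo.project", "my.project", "main.o", "README"]:
    print(name, is_ignored(name))
```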
84. GitHub
• Web-based hosting service
• Launched in April 2008
• Git repository hosting: paid for private projects and
free for open-source projects
• Built with Ruby on Rails and Erlang
• Provides feeds and followers
85. Growth
Period   State
2009     100,000 users and 50,000 repositories
2011     1 million users
2012     2 million users and 4 million repositories
2013     3 million users and 5 million repositories
86. Octocat
• Introduced by Tom Preston-Werner, cofounder of
GitHub
• The name combines the words octopus and cat
89. Pros
• Painless branching
• Separation between local repository and upstream
• Simplifies work in distributed teams
• Dramatic increase in performance
• Integration with major VCS
90. Cons
• Repository security risks
• No single, central "latest revision"
• No pessimistic locking
• Steep learning curve
• Commit identifiers are hashes rather than sequential
numbers
• Not optimal for single developers