This document discusses Git workflows for teams. It begins by introducing Git and its benefits for version control and collaboration. It then examines different Git branching models and workflows that can be used by development teams, including feature branches, hotfix branches, and pull requests. It also covers tools for code review, continuous integration, and collaboration using Git repositories.
20. But what if my repo is big?
The Linux kernel 3.13 release had:
15+ million LOC
12,000 non-merge commits
446k lines of code added
1,339 contributors
source: lwn.net
23. Can we still fix a bug for the upcoming Release?
Is the code for that Feature complete?
How do we do a Hotfix for the current version?
Has everyone Reviewed the code for this feature?
65. What is an --interactive rebase?
It’s a way to replay commits, one by one, deciding interactively what to do with each:
pick
reword
edit
squash
fixup
exec
72. Which should I use?
Merge Commit | Rebase (FF) | Rebase (Squash)
No merge commits
Verbose history
Easy to read
Can be more difficult to trace changes
“Ugly” history
Full traceability
Hard to screw up
mostly
some
95. Code review: why?
• Better code - this one is obvious
• Teach your co-workers - and learn from them
• Lower your bus factor - remove single points of failure
141. The ultimate workflow
$10 for up to 10 devs
Free for 5 users
Totally free
Editor’s Notes
Tim Pettersen, Developer Provocateur
You’re in the right place if you’re:
using a different VCS, but thinking about moving to Git
using Git, but feel like you don’t have an optimal workflow
someone who just likes talks about Git and workflows
Though the focus of this talk is SaaS, most of the talk applies to Git and general development workflows, so you don’t have to be a SaaS developer to get something out of it.
Who here knows what a version control system is?
Who has used Git?
Welcome to the Ultimate Git Workflow. Today I will be showing you the techniques that we have developed at Atlassian to work with Git.
How many people here are currently using Git?
Subversion? Hg?
Welcome to Getting Git Right.
Great to see so many people excited about learning Git.
I’m a “developer provocateur” from Atlassian.
Why am I here talking about Git? Because I’m one of the two developers who started Stash at Atlassian. So I spent a good two years of my life as a core developer on our Git hosting product and another 18 months speaking and blogging about Git and Git workflows.
I won’t be talking too much about Stash during this session, but feel free to ask me about it in the questions :)
Any Stash users in the Audience?
Who here has heard of Atlassian?
For those of you who don’t know much about Atlassian, we have an engineering team of over 800 developers. We specialize in Java, Python and web technologies like HTML5, CSS and JavaScript. We have a fairly hefty contingent of mobile and desktop application specialists too.
These 800 developers are part of a global team of 1800 software nerds, including supporters, product managers, designers and business types.
We have 9 core products, all focused on making life better for developers and software development teams.
Every team now uses Git for version control.
Atlassian’s tools are designed to make software development teams collaborate and develop software more effectively.
Collaborate transparently
Communicate effectively
Develop and work asynchronously, in a way that fits your team
Our mission is to make developers happier and development teams more productive. We use all of our tools internally and are continually improving them to let us ship software to you, faster and smarter. Which hopefully lets you ship software to your customers, faster and smarter.
Which is also why we love Git. We’ve found that Git is a tool that not only makes us happier and more productive as individual developers, but also more productive and efficient as a software team. Git has let us redesign our development workflows and has dramatically improved the way that we develop and ship software.
Collaboration models… If we have time left.
CI… If we get time. The activity we’ll be running during the intermission is of variable length.
version control sensation that’s sweeping the nation
Developers love:
* speed
* offline
* distributed - good collaboration
* created by Linus Torvalds, the dude who wrote Linux
And there’s a certain peer pressure aspect to using Git too. When you look at the statistics,
Though Mercurial is a pretty great version control system.
One of the reasons these companies are adopting Git is because it speeds up their time to release. More on this later.
Survey of 1000 participants at JavaZone in Norway (September 2013) and Devoxx in Belgium (November 2013):
13% of Git users shipped daily, versus 3% of SVN users
27% of Git users shipped weekly, versus 18% of SVN users
Git doesn’t just speed up releases, it’s also blazingly fast to use. Git’s speed is one of the reasons why developers love it so much. It is an entirely different experience to using something like subversion - commands return instantly and branches that have been active for months are merged in the blink of an eye.
Summary:
You have to experience this!
Git doesn’t get in your way
Git doesn’t have to talk to a server (Don’t elaborate much yet)
VCS speed changes your behavior
Git is fast, even with Linux’s 15M LOC
Full text:
So, speed then. Git is well known for it, and you have to experience it to truly understand its importance.
What I mean by that is that it doesn’t get in your way. Gone are the days where you had to wait a minute to svn-checkout a different revision and wait for everything to download from the server. Or wait for anything you ask svn while it goes back to the server.
Slowness gets in your way and it changes your behavior. You avoid doing things. You avoid switching back and forth between revisions and that’s bad.
Even on huge code bases like the 15+ million lines of code in the Linux kernel, Git is incredibly fast and rarely will you be able to even blink before a command returns.
Summary:
First trick: Everything is local
Your copy is identical to the server
In centralized system, almost every operation needs network
FAST!
Full text:
The biggest trick behind all this is that in Git everything happens locally.
In Git you always get a full copy of the entire history of the project on your local machine. What you have locally is identical to what’s on the server. So for everyday operations such as commit, diff, log, branch and merge, Git never has to reach out over the network to talk to a central server the way svn and cvs do.
Everything the server can do, your local machine can do, and that makes a real difference. Svn really suffers from this centralized design, where almost every command needs to go out over the network, and that is immediately noticeable.
Yes, it means you can work on the plane, but the extra speed alone is the biggest win.
Summary:
Second trick: well written software!
Linus himself
Maintained by exceptional hackers
Bitbucket’s own occasional implementations can’t match
Full Text:
The other half of the secret is Git’s own implementation. Git is almost entirely written in C by the people who also wrote the Linux kernel.
It was Linus Torvalds himself who wrote the first version of Git that had to be capable of handling the big code base of Linux and it’s since been maintained by likeminded and exceptionally skilled hackers. And as kernel devs, they know the tricks of the trade better than anyone.
I can attest to that by the way. Working on Bitbucket, we have on occasion needed to implement custom code that mimics functionality of a particular Git command.
Now we have some pretty smart people on the team that understand Git inside out, but we’ve never ever been able to match the efficiency and speed of native Git.
The result is that Git operations are incredibly fast, even for code bases that are tens of millions of lines of code. This is my proof: a chart that shows how much faster operations are with Git compared, in this case, with Subversion.
The charts are similar with other centralized version control systems.
There is a myth that git doesn’t scale to large repositories.
Summary:
How does it scale? Linux is benchmark
SVN tendency to create one repo for everything
Full text:
How does it scale? Pretty well actually and the Linux kernel again provides a bit of a benchmark. Few projects are as big and actively worked on as the kernel.
Chances are your projects are well below these numbers.
There’s another thing that works in our favor here: in distributed version control systems there is a strong tendency to put every individual project in its own repository. This is in contrast to Subversion, where teams typically have a single repository that contains all the source code of completely independent projects. Smaller, per-project repositories make Git much more scalable.
Everything is available at our git site: atlassian.com/git
Tutorials to assist your developers.
Suggested workflows for different types of projects.
Migration tools and strategies.
Other git resources, including statistics and ammunition to make the business case to your managers that Git is not only good for developers, it’s good for business.
Plug Nicola’s blog.
Also mention the big binary file problem, and how Nicola has some tips to address it.
A good workflow should be able to direct a team of software developers on how to create and maintain a codebase. A great workflow should let you answer certain important questions about your codebase, at any point in time.
If I had to fix a bug today, where would I commit code to ensure it makes the next release?
I can see that code has been committed for a feature, but how do I know if it is code complete?
How do I fix a critical bug in already released code, and perform a maintenance release? And can I do this quickly?
Once I’ve fixed that critical bug, how do I know if it’s production ready; other developers have signed off on it, and it’s ready to release?
I’d love to be able to tell you that there’s a perfect workflow that will solve all of your problems. A one size fits all solution that will get your team up and running with git in no time.
The reality is more complicated. Different software teams have different requirements.
Different companies have different cultures. Are you like a Silicon Valley startup, moving fast and breaking stuff? Let the users find the bugs and we’ll get a hotfix deployed before they can refresh the page? Or are you more conservative? Do you work in banking, or finance, or biotech, where a single bug could cost millions, or even patients’ lives?
What kind of product do you work on? Is it a mobile or desktop application? Or a monolithic SaaS app? Do you maintain multiple versions or do you force users to upgrade with each release?
How big is your team? Are you a solo developer, or are you a large group? Are you a single rockstar developer backed up by external contractors? Do you let tech writers and designers commit to your codebase?
All of these factors contribute to the workflow that will be most effective for your team.
Fortunately, Git is flexible enough to cater for any workflow that you can dream up.
I’m not going to prescribe a particular workflow, but I will show you some of the elements that comprise an efficient workflow, and show you a couple of the workflows that we’ve adopted at Atlassian.
There are two fundamental components to a software project: a set of issues that need to be implemented, and the code that implements the features described in those issues.
If you’re using Subversion and have a traditional linear workflow for your repository, you’re just creating new commits one on top of the other in a line. In SVN parlance, this series of commits is known as the “trunk” of your repository.
This linear approach can make it hard to tell what state the code is in. You might be able to tell that a developer has started work on a particular issue but it’s difficult to tell whether it is feature complete or if they have more changes to commit.
commit interleaving:
if someone breaks the build, everyone has to stop committing
this is stressful and embarrassing for the developer, huge audience watching you fix something. when I last worked with SVN we had a red fire warden’s hat you had to wear until you fixed it. *** PUT ON THE HAT ***
also means everyone has to stop work until it is fixed
broken builds mean you have to go into a code freeze.
code freeze means you stop committing code and generally become unproductive
with a broken build, the code freeze is lifted when the unfortunate developer fixes the build
Commit interleaving causes another problem.
commit interleaving:
hard to tell whether a particular feature is finished
hard to release, have to ask developers whether each feature is ready to ship
performing a release means you have to go into another code freeze.
with a release, the code freeze can take days while the release manager (often a developer) runs around asking everyone if their features are ready
So there has to be a better way, right?
Git feature branches take care of this problem nicely. The “trunk” branch in git is usually called “master”. Instead of committing straight to master all development work is done on an independent branch. <click>
Once the code is finished, reviewed and all the tests are passing it gets merged into master. <click>
This keeps unstable changes isolated from the master branch. <click>
This means if I break the build on my branch - I’m not blocking other developers. No more red hat of shame! *** REMOVE HAT ***
And has the nice side effect that master only contains code that is ready to ship, so it is _always_ releasable.
Branch type: feature, <click> bugfix <click> or hotfix
(Hotfix is for already released code that we need to release and deploy quickly)
Issue key
Short description of issue
Good convention - allows a developer to quickly see what issue a branch addresses on the command line or in other tools, like Stash
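As a rough sketch of that convention, a small helper could assemble the three parts into a branch name. The separators (a slash after the type, hyphens elsewhere), the function name `branch_name` and the issue key `STASH-123` are assumptions for illustration; the slides only say that the name carries a branch type, an issue key and a short description.

```python
import re

# Branch types named on the slide.
BRANCH_TYPES = ("feature", "bugfix", "hotfix")


def branch_name(branch_type: str, issue_key: str, description: str) -> str:
    """Build '<type>/<ISSUE-KEY>-<short-description>' (separator choice is an assumption)."""
    if branch_type not in BRANCH_TYPES:
        raise ValueError(f"unknown branch type: {branch_type!r}")
    # Lowercase the description and collapse every run of characters that is
    # not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{branch_type}/{issue_key}-{slug}"


# Hypothetical issue key and description.
print(branch_name("feature", "STASH-123", "Add branch permissions"))
# prints: feature/STASH-123-add-branch-permissions
```

Keeping the issue key in a fixed position makes it easy for other tools to pick it out of the branch name later.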
The technique we use at Atlassian is to create a new branch for every piece of work that we do.
We love feature branches because they provide:
isolation: a developer is free to experiment on their own branch
stability: the master branch remains stable and can be used for releases, and to create new feature branches
traceability: knowing that all of the work done on a branch relates to a single feature lets you think about your codebase in terms of FEATURES.
If you use feature branching, you can easily determine which features will ship in the next release by using the --merged and --no-merged flags with the git branch command.
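A minimal sketch of how a script might ask that question, assuming it runs inside a Git repository whose release branch is called master. The helper names `parse_branch_list` and `branches` are made up for this example; the only Git behavior relied on is that `git branch --merged <commit>` and `git branch --no-merged <commit>` print one branch per line.

```python
import subprocess


def parse_branch_list(output: str) -> list[str]:
    """Turn the text printed by `git branch` into a list of branch names."""
    names = []
    for line in output.splitlines():
        # The current branch is prefixed with "* "; branches checked out in
        # other worktrees with "+ "; everything else with two spaces.
        name = line.lstrip("*+ ").strip()
        # Skip blank lines and the "(HEAD detached at ...)" pseudo-entry.
        if name and not name.startswith("("):
            names.append(name)
    return names


def branches(flag: str, target: str = "master") -> list[str]:
    """Run `git branch --merged|--no-merged <target>` in the current repository."""
    if flag not in ("--merged", "--no-merged"):
        raise ValueError(f"unsupported flag: {flag!r}")
    result = subprocess.run(
        ["git", "branch", flag, target],
        capture_output=True, text=True, check=True,
    )
    return parse_branch_list(result.stdout)
```

Calling `branches("--merged")` would list the branches already contained in master, and `branches("--no-merged")` the ones still in progress.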
Let’s take a look at the first workflow I want to show you today.
Workflow that Atlassian uses to develop Bitbucket. <click>Bitbucket is a SaaS application, that is, we run Bitbucket on our own servers - we don’t ship the application code to our customers. If you also have a software-as-a-service model this workflow may work for you.
But working in isolation means you can run into integration problems with your branches.
Two branches that work independently, may have problems when they are merged together.
And again, we’ve broken the build and caused a code freeze - on with the hat! *** DON HAT ***
To ensure master is always stable, we can introduce an integration branch where the tests must pass before we merge it to master. We often call this branch “develop”.
This guarantees master remains green - so developers can always create their feature branches from master. This means we’re no longer blocked by a red build on master - so no code freeze!
This also lets us do continuous deployment to our staging and production environments. The “develop” branch is continuously deployed to our “Staging” environment.
Then, when we’re ready to push to production, we merge develop into master.
Usually, we create feature branches off develop. Sometimes if something breaks in production, we need to get a fix deployed as quickly as possible.
In this case we create a hotfix branch.
Once the branch is ready, we merge it to develop first for testing.
If the fix is successful, we then merge it directly into master for deployment to production.
We can’t merge develop into master at this point, because it has some work that we don’t want to ship to production just yet.
Used by the Atlassian Stash team.
Unlike Bitbucket, it is software that is shipped to the customer and installed by them on their server.
Unlike Bitbucket, we support multiple stable versions of Stash, so the software-as-a-service workflow no longer works.
So we need a different workflow.
Say we’re the Stash team and we’re about to release Stash version 1.2. Towards the end of the release cycle we’ll create a branch named 1.2 which will contain all of the code that we intend to release. Creating a branch like this means we can start testing and hardening the code for the 1.2 release. Developers who are hardening the code merge their branches into 1.2, while developers who are working on features for a future release of Stash can merge their branches into master, without affecting the stability of the 1.2 release.
But what happens if we find a critical bug after we’ve released 1.2? First we create a bugfix branch off our stable 1.2 branch.
Then we fix the bug. Once the code has been reviewed and our CI tests are passing, we merge the bugfix branch back into 1.2. Now we can release version 1.2.1 and get it out to our customers.
But we’re not finished, we also have to merge our stable 1.2 branch (which also contains our fix) into master. This ensures that future releases will also contain our bugfix.
But what if we found a bug in an even earlier version of Stash? It’s much the same process, but we create our bugfix branch off the earliest version that we want to introduce the fix in. In this case, it’s 1.1.
Then we merge it all the way forward to our master branch. This means that the next 1.1.x version, the next 1.2.x version and the next major release will all have our code changes in it.
You can repeat this for as many stable branches as you want to maintain.
One problem with this workflow is that the forward merging is tedious and (like most tedious activities that humans are forced to do) prone to human error.
So we decided to automate this in Stash.
The first method is with a git hook. You can think of git hooks as git’s plugin system. They can be used to trigger behavior when certain changes happen to your repository, either on the client or on the server.
One of our developers has written a server-side update hook that will forward merge branches for you, according to an ordered list of branches.
Alternatively if you’re using Stash, it has forward merging baked right into it. Provided you use semantic versioning, Stash will automatically detect and suggest forward merges when you attempt to merge a Pull Request.
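To make the forward-merge chain concrete, here is a small sketch that works out which merges are needed once a fix lands on a stable branch. It is not the hook or the Stash feature mentioned above; the function names are hypothetical, and it assumes release branches are named with plain dotted version numbers such as 1.1 and 1.2.

```python
def version_key(branch: str) -> tuple[int, ...]:
    """Sort key that compares '1.10' after '1.9' (numeric, not alphabetical)."""
    return tuple(int(part) for part in branch.split("."))


def forward_merge_plan(fixed_in: str, release_branches: list[str],
                       mainline: str = "master") -> list[tuple[str, str]]:
    """Return (source, target) merges from the fixed branch forward to mainline."""
    ordered = sorted(release_branches, key=version_key)
    if fixed_in not in ordered:
        raise ValueError(f"{fixed_in!r} is not a known release branch")
    # Every branch from the fixed one onwards, ending with the mainline.
    chain = ordered[ordered.index(fixed_in):] + [mainline]
    # Pair each branch with the next one in the chain.
    return list(zip(chain, chain[1:]))


for source, target in forward_merge_plan("1.1", ["1.2", "1.0", "1.1"]):
    print(f"git checkout {target} && git merge {source}")
# prints:
# git checkout 1.2 && git merge 1.1
# git checkout master && git merge 1.2
```

Older branches such as 1.0 are left alone, because the fix is only merged forward from the earliest version it was made in.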
So I’ve talked a lot about branching. Let’s look at the different ways you can merge those branches together.
A merge occurs when you’ve been working on a branch for a while and you get to the point where your feature is finished and ready to be applied to the main branch, the stable version of your project. This process is called merging, and it leads to a new commit being created on the main branch that unifies the changes from both sides.
Summary:
Merging is important for all VCSes
Should not be avoided. Branching makes teams productive.
Git branching > SVN branching
SVN branching is cheap, merging is not
Explain points on slide
Full text:
The ability for a version control system to perform merges is incredibly important. After all, for every branch you create, you end up having to perform a merge. And that is not something you should avoid. Branching makes teams much more productive.
It’s often said that Git’s merging implementation is better than that of other version control systems, and that is certainly true for svn. The comparison with svn is kind of interesting.
Svn has always claimed branching as one of its core strengths. Branching is cheap it claims. And it is. But if you can’t merge just as easily, then it’s just not as interesting.
So let’s touch on a few reasons why Git is better.
First off, like almost everything else, merging is a local operation and does not require communication with a server. Which makes it substantially faster than centralized systems, especially on larger repositories.
Secondly, Git has access to the entire history of the branches it needs to merge, and it uses that to figure out the point at which the branches started diverging: the “common ancestor” commit. Using just three pieces of information (the common ancestor and the latest commit on each branch), it can work out what happened on each branch and therefore how to combine them. This is known as a three-way merge.
It’s important to realize that Git does not replay the commits on the feature branch against the main branch. It doesn’t even look at the contents of any commit other than the latest and so even merging very old and active branches is very fast.
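The idea of merging from three snapshots can be sketched in a few lines. This is a deliberately simplified toy that works on whole files, modelled as a dictionary from path to content. Git’s real merge also combines changes inside a file, so treat this only as an illustration of the decision rule; `three_way_merge` and the file names are invented for the example.

```python
def three_way_merge(base: dict[str, str], ours: dict[str, str],
                    theirs: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Merge two snapshots using their common ancestor; return (merged, conflicts)."""
    merged: dict[str, str] = {}
    conflicts: list[str] = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (same change, or no change)
            result = o
        elif o == b:      # only their side changed this file
            result = t
        elif t == b:      # only our side changed this file
            result = o
        else:             # both sides changed it differently
            conflicts.append(path)
            continue
        if result is not None:   # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"auth.py": "v1", "db.py": "v1"}
ours = {"auth.py": "v2", "db.py": "v1"}
theirs = {"auth.py": "v1", "db.py": "v2", "api.py": "new"}
print(three_way_merge(base, ours, theirs))
# prints: ({'api.py': 'new', 'auth.py': 'v2', 'db.py': 'v2'}, [])
```

Note that only the three snapshots are consulted; none of the intermediate commits on either branch is looked at.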
Now there is no guarantee that everything can always be merged automatically and depending on what happened to both branches, there can still be merge conflicts that require manual intervention. However here Git has another trick up its sleeve. Changes that appear to be conflicting at first sight, can sometimes be resolved by looking carefully at the ancestry and a series of fairly complex merging algorithms have been developed over time. Git takes advantage of this by making its merging algorithms pluggable and so over time it has become even better at merging.
The result of all this is that in most cases, merges become so simple that you forget they’re even happening. In fact, most of the time, Git will merge fully automatically for you and so aside from the automated commits that appear in the timeline, you’d never even realize they were occurring.
Now let’s look at a slightly more advanced way to combine branches together: a Git command called rebase.
Summary:
Explain scenario with pulls from upstream branch
Explain rebasing solution with manual patches
Full text:
Rebasing can be a little hard to understand at first and I’m going to try to explain it using a common scenario.
Imagine you’re working on a long-lived feature branch. It contains days or weeks worth of commits, but you’re not ready to merge it just yet. It’s not finished. In the meantime, development on master also continued and you want to use those changes. Maybe a bug has been fixed and you need that fix on your branch.
So far we’ve talked about merging feature branches into the main branch, but our branch isn’t finished yet. Now instead we can do the reverse. We can merge the main branch into our feature branch! Note that this does not affect the main branch. It stays where it is. We just pull its changes into our branch. This creates an automated merge commit on our branch. We then continue development.
After doing this a few times, we see a pattern of normal commits, interleaved by merge commits. That’s no big deal, except that if we want to see everything we did on our branch since its creation, we see not only our own work, but also that which was pulled in from master. Work that is completely unrelated to our feature.
Rebasing addresses this problem. Say we’re at the point where we need to pull in master. Imagine we would export the work we did on our branch as a patch. Just a file we store on the side for a while. We then delete our commits and the branch. Next, we go to the tip of the master branch and we create a new branch off of it. Then we apply our patch file onto the new branch. This restores our work, but now it’s based off of the current version of master. And we didn’t need to create a merge commit.
This is essentially what git does when you rebase. It automates what would otherwise have been a huge amount of manual labor.
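The patch analogy can be modelled with plain lists. In this toy sketch a branch is just a list of commits, and `rebase` sets aside the commits made after the fork point and replays them on top of the new base. The `Commit` class, the commit ids and the prime marker are illustrative assumptions; in real Git the replayed commits receive new hashes because their parent has changed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    id: str
    message: str


def rebase(feature: list[Commit], fork_point: str,
           new_base: list[Commit]) -> list[Commit]:
    """Replay the commits made on `feature` after `fork_point` on top of `new_base`."""
    ids = [commit.id for commit in feature]
    # The "patch": only the work done on the branch since it was created.
    own_work = feature[ids.index(fork_point) + 1:]
    # Replayed commits are new commits, so mark their ids with a prime.
    replayed = [Commit(commit.id + "'", commit.message) for commit in own_work]
    return new_base + replayed


master = [Commit("m1", "Initial"), Commit("m2", "Fix bug"), Commit("m3", "Tidy up")]
feature = [Commit("m1", "Initial"), Commit("f1", "Start feature"), Commit("f2", "Finish feature")]
print([commit.id for commit in rebase(feature, "m1", master)])
# prints: ['m1', 'm2', 'm3', "f1'", "f2'"]
```

The resulting history is linear: the feature work sits on top of the current master with no merge commit in between.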
Summary:
Rebase to avoid merging feature branch into master
Full text:
You can also do this when you are ready to merge your branch into master. Instead of letting Git create a merge commit, you can tell it to rebase all your feature commits directly onto the master branch, replaying them one by one, as it were.
This way you can effectively merge your changes in without creating any actual merge commits.
pick: leaves a commit intact
reword: lets you change a commit message
edit: pauses the rebase process and lets you change the code in the commit
squash: combines multiple commits into a single commit, and concatenates the commit messages together
fixup: is the same as squash, but keeps only the earlier commit’s message and discards the message of the commit being folded in
exec: lets you run shell commands in between rebase commands. One common use case is to use it to run tests to make sure you didn’t break anything during the rebase!
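As a rough illustration of how the todo list is processed, the sketch below interprets a list of (command, message) pairs and returns the commit messages that would remain. It only models pick, squash and fixup, and only their effect on messages; reword, edit and exec need a person or a shell and are left out. Here fixup keeps the message of the commit it is folded into, which is how Git’s fixup behaves. The function name and the sample messages are made up.

```python
def apply_todo(todo: list[tuple[str, str]]) -> list[str]:
    """Return the commit messages left after applying a simplified rebase todo list."""
    history: list[str] = []
    for command, message in todo:
        if command == "pick":
            # Keep the commit as it is.
            history.append(message)
        elif command in ("squash", "fixup"):
            if not history:
                raise ValueError(f"{command} needs an earlier commit to fold into")
            if command == "squash":
                # Combine both messages into the earlier commit.
                history[-1] = history[-1] + "\n\n" + message
            # fixup: the changes are folded in, the earlier message is kept.
        else:
            raise ValueError(f"unsupported command: {command!r}")
    return history


todo = [
    ("pick", "Add login form"),
    ("fixup", "fix typo"),
    ("pick", "Add logout"),
    ("squash", "Handle expired sessions"),
]
print(apply_todo(todo))
# prints: ['Add login form', 'Add logout\n\nHandle expired sessions']
```

Four commits go in and two come out, which is the kind of clean-up an interactive rebase is typically used for before a branch is shared.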
Speaking of breaking things. Whenever I talk about a rebase, I need to give you a customary warning. Rebase rewrites the history of your code base. You should only use it carefully and on branches that you haven't shared with anybody.
If you rewrite public history in git, you can end up with the same commit multiple times in your history. Or you can end up with changes that you thought you deleted re-applied when somebody else pushes.
Say the last commit on master is a shiny new postgres adapter that Bob has written to use postgres in the app. Say Bob decides that he wants the app to use mysql instead. Instead of creating a new commit, he replaces the postgres commit by rewriting it thinking that no-one would ever find it useful. When Alice pulls, she’ll be prompted to merge, and we’ll end up with two different database adapter implementations in the code base!
So remember, don’t rewrite history on shared branches!
Now you know what a merge commit and a rebase are, so let’s look at three popular ways to move changes between branches.
There’s a traditional merge commit - creates a commit with two parents.
You can rebase on to master, where you recreate the commits of your branch on top of your master branch.
We don’t like this at Atlassian because you lose information about whether features were originally developed on isolated branches.
And then there’s the squash, where you rewrite all the changes on your branch as a single commit on top of master.
We don’t like this because you lose interstitial information about your history - and sometimes miss the subtle development of a bug.
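The difference between the three options shows up in the commits that land on master. The toy model below represents a commit as an id plus its parent ids and returns what each strategy would add; the ids (`M`, primes, `f1+f2`) and function names are invented for illustration and are not Git output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    id: str
    parents: tuple[str, ...]


def merge_commit(master_tip: str, feature: list[str]) -> list[Commit]:
    """One new commit with two parents: the master tip and the feature tip."""
    return [Commit("M", (master_tip, feature[-1]))]


def rebase_ff(master_tip: str, feature: list[str]) -> list[Commit]:
    """Every feature commit is recreated on top of master, each with a single parent."""
    added, parent = [], master_tip
    for commit_id in feature:
        added.append(Commit(commit_id + "'", (parent,)))
        parent = commit_id + "'"
    return added


def squash(master_tip: str, feature: list[str]) -> list[Commit]:
    """All feature changes are rewritten as a single commit on top of master."""
    return [Commit("+".join(feature), (master_tip,))]


print(merge_commit("m2", ["f1", "f2"]))
print(rebase_ff("m2", ["f1", "f2"]))
print(squash("m2", ["f1", "f2"]))
```

Only the merge commit keeps a second parent pointing back at the feature branch, which is the traceability argument made above.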
And Git is supposed to follow the unix philosophy of individual commands that do "one thing well"..
At Wed Dec 21 00:01:00 apparently, which makes me think the timestamp was forged.
what does quality mean?
it’s not a furry Australian marsupial drinking boiled leaves
quality code is code with:
- few bugs
- good performance
- low technical debt
- easy maintenance
all sorts of things can help quality: testing, static analysis tools, code coverage reports, but I think the most important is..
Code Review!
How many people are doing code review?
Sometimes developers avoid doing code review because they think it will slow them down. However, our research shows that teams who do code review actually release software faster!
Better code - different experience, different knowledge, different specializations
Shared knowledge - back on Crucible .., technologies specific to a stack, patterns specific to a team
Team ownership - has anyone here ever shipped a bug?
You see your code. You see which lines got removed in red and which lines got added in green. Then you can add in-line comments and discuss the code.
Pull requests should be a lightweight process. If you make sure the pull request diff relates only to a single issue, it is easier for the reviewer to reason about why you have changed something.
Similarly, reformatting existing code is just noise - create a new pull request if you want to do this. Remember - do just one thing.
Git, like all other version control systems, outputs diff information by line by default. If you have really long lines, reviewers will spend a lot of time scrolling around in their browser.
Make sure your code works and is tested, before you create the pull request. Some teams will also use a code coverage tool to check that the percentage of the codebase covered by tests doesn’t drop on a feature branch. Stash actually helps you out here by displaying the build status of the commits inside the Stash UI.
Mark’s 10th point is to avoid thrashing. Thrashing is where a reviewer doesn’t like some of the work that you’ve done and asks you to remove it <click>, and you create a new commit removing the work. <click> You now have three extra commits that don’t do anything useful! Instead you could consider using an interactive rebase to remove those commits.
But remember - don’t do this on a shared branch.
But I have two
But there’s also another form of thrashing that’s good to avoid. Sometimes developers might disagree on the best way to implement something. Don’t let your pull request comments look like youtube comments! As a general rule of thumb, if a comment thread gets deeper than 5 levels of comments, it’s time to try something else. Often talking to the person in real life will be a more efficient way to reach a resolution. At Atlassian we also sometimes respectfully resort to architects to have a final say when there is a contentious decision to be made.
One final tip is to make sure that you agree as a team how you want to approach pull requests.
So we’ve discussed many different aspects of software
So now let’s talk about how git can magically improve the quality of your code.
We’ve talked about building master and develop
branching also brings another advantage. in fact - this next thing has saved me literally weeks of time since we moved to git
The conventional wisdom for continuous integration is to always build your master branch, but as I mentioned earlier, we like to build feature branches too so we can make sure features are working before they’re merged into master.
Ideally, your CI server is configured to automatically detect when a new branch is pushed and build it.
When you create a branch in the Stash UI, we also show the build status of the commit you are branching from.
At first, developers are like “creating branches on the server? That’s crazy!”
This little green tick means that the tip of the branch you are branching from is green.
this is important, because it stops you from confounding build breakages
imagine a developer who is a little too cool for school
commits directly on master to fix a bug in your authentication system
accidentally breaks a test
no worries, commits a fix and the build is green again
but there is a problem
imagine another developer comes in to fix a bug in the data access layer, and does the right thing and creates a branch
their build will fail too, and it’s going to be a very surprising situation as it will look like their change has somehow caused a problem with the authentication system, and they’ll spend time debugging it
the little green build indicator makes it hard to miss, eliminating this problem. another good reason to create branches on the server.
You can think of hooks as git’s plugin system. They let you hook into certain git operations and customize the behavior of your repository. If you look in the .git directory of any git repository you’ll see a directory named “hooks” which contains a set of example hook scripts.
There are two broad classes of hooks, client-side and server-side. Client-side hooks run on your developers’ machines and server-side hooks run on your git server.
You can also categorize hooks as pre- or post- hooks. Pre-hooks are invoked before certain git operations, and have the option to cancel an operation if needed. Post-hooks on the other hand run after an operation has completed, and therefore don’t have the option to cancel it.
For example, I have a post-checkout hook installed on my development machine that runs every time I check out a branch. It calls out to my Bamboo server and checks to see whether the tip of the branch has a passing build or not. If it doesn’t, it warns me that it is not safe to create a new branch from that point, as my new branch will presumably have the same test failures.
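To make that a bit more concrete, here is a rough sketch of what a post-checkout hook along those lines could look like. It is not the actual hook from my machine: the status URL, the JSON shape and the "SUCCESSFUL" state name are all assumptions for illustration, so swap in whatever your CI server really exposes. The only thing taken from git itself is the argument list (previous HEAD, new HEAD, and a flag that is 1 for a branch checkout).

```python
#!/usr/bin/env python3
"""post-checkout hook sketch: warn when the checked-out commit has a failing build."""
import json
import sys
import urllib.request

# Hypothetical endpoint (an assumption): replace with your CI server's real API.
BUILD_STATUS_URL = "https://ci.example.com/build-status/{commit}"


def fetch_build_state(commit, opener=urllib.request.urlopen):
    """Return the build state string for a commit, or None if it is unknown."""
    try:
        with opener(BUILD_STATUS_URL.format(commit=commit), timeout=5) as response:
            return json.load(response).get("state")
    except (OSError, ValueError, AttributeError):
        # Network trouble or an unexpected payload: stay quiet rather than guess.
        return None


def warning_for(state):
    """Return a warning for a non-passing build, or None if there is nothing to say."""
    if state is None or state == "SUCCESSFUL":  # state name is an assumption
        return None
    return f"WARNING: build state is {state}; a branch created here inherits the failures."


def main(argv, fetch=fetch_build_state):
    # Git passes: previous HEAD, new HEAD, flag (1 = branch checkout, 0 = file checkout).
    _previous_head, new_head, is_branch_checkout = argv[1:4]
    if is_branch_checkout != "1":
        return 0
    message = warning_for(fetch(new_head))
    if message:
        print(message, file=sys.stderr)
    return 0  # a post-hook cannot cancel the checkout, so it always succeeds
```

To try it, you would save this as .git/hooks/post-checkout, make it executable, and end the file with sys.exit(main(sys.argv)).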
Another popular hook is the post-receive hook. Post-receive hooks are called when branches are updated on your git server. One popular use case for this kind of hook is notifications. Some teams at Atlassian use post-receive hooks to send notifications to their team’s HipChat rooms when new branches are pushed to the server.
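As a small illustration, a post-receive hook reads one line per updated ref from standard input, in the form "old-sha new-sha ref-name", and a brand new branch shows up with an all-zero old sha. The sketch below picks out the new branches and hands each one to a send function. That function is a stand-in I have made up for whatever chat client you use; it is not the HipChat API.

```python
#!/usr/bin/env python3
"""post-receive hook sketch: announce newly created branches."""

BRANCH_PREFIX = "refs/heads/"


def new_branches(lines):
    """Yield the names of branches created by this push.

    Each input line has the form "<old-sha> <new-sha> <ref-name>".
    A created ref has an all-zero old sha; a deleted ref has an all-zero new sha.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore anything that is not a ref update line
        old_sha, new_sha, ref_name = fields
        created = set(old_sha) == {"0"} and set(new_sha) != {"0"}
        if created and ref_name.startswith(BRANCH_PREFIX):
            yield ref_name[len(BRANCH_PREFIX):]


def announce(lines, send=print):
    """Send one message per new branch; `send` is a hypothetical notifier."""
    for branch in new_branches(lines):
        send(f"New branch pushed: {branch}")
```

Installed as hooks/post-receive on the server, the script would finish with announce(sys.stdin) after importing sys, with send replaced by a call into your chat tool.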
Pre-hooks are a very powerful type of hook, as they allow you to prevent certain operations from proceeding. The pre-commit hook is a hook that lets you intercept (and optionally prevent) a commit operation before it happens. One common use of the pre-commit hook is to check that your code passes automated style or lint rules before you create a commit. Since this type of automated check typically runs very quickly, it’s convenient to run it locally before you commit your changes. This prevents you from pushing a commit that will break the checkstyle or lint build.
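Here is a minimal sketch of that kind of pre-commit check, assuming a Python codebase and assuming pyflakes is the linter you have installed (both are my assumptions for the example, not something the talk prescribes). It asks git for the staged files, runs the linter over them, and returns the linter's exit code. A non-zero exit from a pre-commit hook is what cancels the commit.

```python
#!/usr/bin/env python3
"""pre-commit hook sketch: lint staged files and cancel the commit on failure."""
import subprocess
import sys


def staged_files(suffix=".py"):
    """List staged files (added, copied or modified) that end with `suffix`."""
    output = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [name for name in output.splitlines() if name.endswith(suffix)]


def run_linter(files, command=(sys.executable, "-m", "pyflakes")):
    """Run the linter over `files` and return its exit code (0 means clean).

    The default command assumes pyflakes is installed; substitute your own tool.
    """
    if not files:
        return 0  # nothing relevant is staged, so there is nothing to check
    return subprocess.run([*command, *files]).returncode


def main():
    # Simplification: this lints the working-tree copy of each staged file,
    # which can differ from the staged content if you staged only part of it.
    return run_linter(staged_files())
```

Saved as .git/hooks/pre-commit and made executable, the file would end with sys.exit(main()).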
Some operations are too expensive to run locally though. This is where the powerful server-side pre-receive hook comes in. Pre-receive hooks can be used to prevent developers from pushing code to your server, unless the code meets certain conditions. <click> You can think of these hooks as elite ninja guardians, protecting the master branch from bad code.
There are many different techniques you can use with pre-receive hooks to protect your master.
One popular technique is to block merges unless the tip of the branch being merged in has a green build. By only allowing branches with green builds to be merged into master, we vastly improve our chances of keeping master green and stable too. When a developer attempts to push the merge commit to the git server, the pre-receive hook is executed and calls out to your CI server. If the CI server reports that the tip of the merged branch is broken, the push is rejected and the developer is notified that they will have to fix the build before merging.
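A rough sketch of the decision logic might look like the following. It is a simplified illustration rather than the hooks from the repository I mention later: it only inspects the commit being pushed to master (not every merge inside the push), it treats the second parent of that commit as the tip of the merged branch, and build_is_green is a hypothetical callable that you would implement against your own CI server.

```python
#!/usr/bin/env python3
"""pre-receive hook sketch: reject merges into master whose branch tip is not green."""
import subprocess

PROTECTED_REF = "refs/heads/master"


def parse_updates(lines):
    """Turn "<old-sha> <new-sha> <ref-name>" lines into (old, new, ref) tuples."""
    return [tuple(line.split()) for line in lines if len(line.split()) == 3]


def second_parent(commit):
    """Return the second parent of a merge commit, or None for a non-merge commit."""
    result = subprocess.run(
        ["git", "rev-parse", "--verify", "--quiet", f"{commit}^2"],
        capture_output=True, text=True,
    )
    return result.stdout.strip() or None


def rejections(updates, build_is_green, find_merged_tip=second_parent):
    """Return one message per update that should block the push."""
    problems = []
    for _old_sha, new_sha, ref_name in updates:
        if ref_name != PROTECTED_REF:
            continue  # only the protected branch is guarded
        merged_tip = find_merged_tip(new_sha)
        if merged_tip is not None and not build_is_green(merged_tip):
            problems.append(
                f"{ref_name}: merged branch tip {merged_tip[:8]} does not have a green build"
            )
    return problems
```

The hook itself would read sys.stdin, print each message to stderr so the developer sees it in the push output, and exit non-zero if the list is not empty, which is what makes git reject the push.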
A similar technique can be used for checking code coverage or Checkstyle violations. If your CI server generates coverage metrics as part of the build, you can use a pre-receive hook to check whether a particular commit increases or decreases your code coverage. If code coverage decreases, the update is rejected and the developer has to improve their test code and try again.
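The coverage comparison itself can be very small. In this sketch the two percentages are assumed to have already been fetched from your CI server, and the tolerance parameter is my own addition so that tiny rounding differences do not block a push.

```python
def coverage_verdict(baseline_percent, candidate_percent, tolerance=0.0):
    """Return (accepted, message) for a pushed commit's coverage.

    baseline_percent: coverage of the current tip of the protected branch.
    candidate_percent: coverage reported by CI for the pushed commit.
    tolerance: allowed drop in percentage points (an assumption, defaults to none).
    """
    drop = baseline_percent - candidate_percent
    if drop > tolerance:
        return False, (
            f"coverage dropped from {baseline_percent:.1f}% to "
            f"{candidate_percent:.1f}%; add tests and push again"
        )
    return True, f"coverage is {candidate_percent:.1f}%"
```

A pre-receive hook would print the message and exit non-zero whenever the first element of the result is False.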
If you’re interested in playing around with some of these hooks, check out this link: bit.do/git-ci. It’ll take you to a Bitbucket repository with some Ruby hooks for enforcing Checkstyle, code coverage and green builds on your master branch.
Now I’ll hand you back over to Sarah to wrap up.
allows core development team control over what ships to your customers
allows junior developers to work on the main repository while having their code reviewed by senior developers
allows contractors to work on up to date forks of the repository using fork synchronization, which means no ugly conflicts for your development team to solve
version control sensation that’s sweeping the nation
Developers love:
* speed
* offline
* created by Linus Torvalds, the dude who wrote Linux
Teams love it:
* makes collaboration among distributed teams really easy
* flexible branching model to get rid of code freezes and make releasing easy
* lightweight pull requests that ensure high quality code and smarter developers
Let’s quickly talk about why you might want to use Stash to host your Git repositories.
But I do have some git credentials. If you look at the commit log for Atlassian Stash, you can see my name on the very first commit. Back in late 2011 another developer and I built the very first spike of Stash during Atlassian’s quarterly ShipIt competition in the Sydney Atlassian office. I then spent a bit of time as a Stash team lead before relocating to San Francisco to join our developer relations team.
Three reasons why I’m really proud of what we’ve built with the Stash project.
Forking: Each developer creates a copy of the main repository, pushes changes to it, and then creates a pull request between the two repositories.
Branching: Each developer creates branches in the same repository and then creates pull requests between branches.
We prefer branching: simpler - developers don’t have to configure multiple remotes, and you don’t have to set up your continuous integration servers to watch multiple repositories. Also you have better visibility into what your team is doing from one place.
Bitbucket and GitHub have good pull request implementations
Other products allow you to discuss changes that have been made
Stash ALSO enforces that certain conditions are met.
Pull request merge checks allow you to enforce certain pre-conditions before a merge is allowed, including requiring green builds and a certain number of reviewers.
<click>
Repository hooks allow you to intercept code changes even earlier - when a developer pushes changes to the server! e.g. HipChat and CI server notifications, or preventing people from doing force pushes or pushing files of a certain type or size.
<click>
Branch permissions allow you to limit who can push to a particular branch. Great for contractors and interns!
Branch permissions can be configured by branch name or an Ant glob pattern, allowing you to do funky things like enforcing naming conventions for your branches.
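If you have not met Ant-style globs before, the rough idea is that ? matches one character, * matches within a single path segment, and ** matches across segments. The sketch below is my own simplified matcher written to illustrate that idea; it is not Stash's implementation and may differ from it in edge cases.

```python
import re


def ant_glob_to_regex(pattern):
    """Translate a simplified Ant-style glob into a compiled regular expression."""
    pieces = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            pieces.append("(?:[^/]+/)*")  # zero or more whole path segments
            i += 3
        elif pattern.startswith("**", i):
            pieces.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            pieces.append("[^/]*")  # anything within one segment
            i += 1
        elif pattern[i] == "?":
            pieces.append("[^/]")  # exactly one non-slash character
            i += 1
        else:
            pieces.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(pieces))


def branch_matches(pattern, branch):
    """Return True if the whole branch name matches the glob pattern."""
    return ant_glob_to_regex(pattern).fullmatch(branch) is not None
```

With a matcher like this, a naming convention check is just a matter of testing each pushed branch name against the patterns your team allows, for example feature/* or release/**.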
Both merge checks and hooks are pluggable, so you can customize them to suit your workflow exactly!
Another reason to use Stash is its tight integration with the rest of the Atlassian stack. We’ve shown you how you can view your Stash repo data in JIRA, and vice versa, and transition your issues based on your code. We’ve also shown you how Stash can notify Bamboo and HipChat when developers push changes or work with Pull Requests.
But Stash also integrates with other tools that you may already be using!
There are freely available community maintained plugins for Jenkins, TeamCity and plenty of other CI servers. Plus there are SVN mirroring plugins and migration tools if you’re looking for a nice way to migrate smoothly.
The reason these plugins have evolved is because Stash has a comprehensive and open API!
Full Java API - with full backwards compatibility and proper deprecation procedures
Repository Hooks & Merge Checks - as we discussed earlier
Pluggable User Interface - allowing you to inject your own HTML, CSS and JavaScript into the Stash UI
Build your own REST end-points and SSH commands - to provide custom functionality & expose your data in interesting ways
Rich plugin points like Filetype renderers, that let you define custom “source” and “diff” views for the file types that you use, e.g. 3D models or proprietary formats
<click>
And what makes Stash really special is that we’ve built it as a platform first.
<click>
Many of the features that we’ve shown you today are actually built as plugins! This means that we’re coding against exactly the same API as external developers use, meaning it’s rock solid and battle-hardened when you come to do your own customization.
<click>
Plus, if you have a commercial license - even a $10 starter license - you get the source for free. So while we’re not exactly open source, you can audit what we’re doing if you want - and use it as inspiration for your own plugins.
(use previous slide instead?)
When we first started the Stash project, we had two teams. One was mainly front-end developers, and the other was mainly back-end. The contract between the two teams was the REST API. This means that Stash has a very strong mapping between its REST API and Web UI. Almost all the functionality available in the UI is available via REST. In fact - all you need to do is insert /rest/api/latest at the front of a particular resource and bingo! <click>
.. you get the REST variant of that resource.
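As a quick sketch of that mapping, the helper below rewrites a web UI URL into its REST counterpart by inserting /rest/api/latest in front of the resource path. The host names and project paths in the usage example are made up, and the context_path parameter is my own assumption for servers that are deployed under a prefix such as /stash.

```python
from urllib.parse import urlsplit, urlunsplit

REST_PREFIX = "/rest/api/latest"


def rest_url(web_url, context_path=""):
    """Return the REST variant of a web UI URL.

    context_path is the prefix the server is deployed under, if any
    (for example "/stash"); it is kept in front of the REST prefix.
    """
    parts = urlsplit(web_url)
    if not parts.path.startswith(context_path + "/"):
        raise ValueError(f"{web_url!r} is not under context path {context_path!r}")
    resource = parts.path[len(context_path):]
    return urlunsplit(parts._replace(path=context_path + REST_PREFIX + resource))
```

For example, rest_url("https://stash.example.com/projects/DEMO/repos/app/commits") gives "https://stash.example.com/rest/api/latest/projects/DEMO/repos/app/commits".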
The third reason I’m proud of what we’ve done with Stash, is what we’ve done to improve its performance and scalability.
To ensure uptime, we’ve implemented a throttling system that limits the number of concurrent git commands that are invoked at any one time. There are separate limits for hosting operations - like clones, pushes and pulls - and operations that are used to populate the user interface.
This means that your developers will still be able to do pull requests while your CI servers are hammering the hosting APIs, and vice-versa.
The limits themselves are configurable - just like everything else in Stash. There are very few hardcoded constants - almost everything is configurable by your system administrator.
Speaking of CI servers <click> one thing we implemented fairly early on was a repository hook to notify CI servers when a branch was updated - this is much more efficient than having your CI servers poll the repository for changes. <click> It did, however, mean that you’d have a large number of incoming requests coming from CI servers whenever a change was pushed - sometimes pushing Stash over the throttling limit. This would occasionally result in rejected requests, which would cause the CI server to fail the build. So we implemented request queuing, which holds the request open for a short period of time, waiting to be allowed to proceed. This time is also configurable.
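The throttling and queuing idea can be sketched with a pair of semaphores. This is an illustration of the general technique under my own assumptions, not Stash's code: the class name, the "hosting" and "ui" labels and the numbers are invented. Each kind of operation gets its own pool of tickets, and a request that cannot get a ticket waits for a configurable time before it is rejected.

```python
import threading
from contextlib import contextmanager


class Rejected(Exception):
    """Raised when no ticket became free within the queue time."""


class Throttle:
    def __init__(self, limits, queue_seconds):
        # One independent pool per kind of operation, e.g. {"hosting": 20, "ui": 50}.
        self._slots = {
            kind: threading.BoundedSemaphore(limit) for kind, limit in limits.items()
        }
        self._queue_seconds = queue_seconds

    @contextmanager
    def ticket(self, kind):
        """Hold a ticket for `kind` while the with-block runs."""
        slot = self._slots[kind]
        # Queue: wait up to queue_seconds for a ticket instead of failing at once.
        if not slot.acquire(timeout=self._queue_seconds):
            raise Rejected(f"too many concurrent {kind} operations")
        try:
            yield
        finally:
            slot.release()
```

A request handler would wrap the git command in "with throttle.ticket("hosting"):", so heavy clone traffic can exhaust the hosting pool without touching the tickets reserved for the user interface.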
Then we got even smarter about it. Because these CI servers are all typically requesting the same data - that is the same set of ref updates - we realized it was a pretty good candidate for caching. So now, <click>when the first request comes in from a CI server, <click>Stash spools the response out to disk whilst queuing up any other concurrent requests for the same objects. Then, <click> subsequent requests are served off the cached file - which is much more efficient, and much less load on the server, meaning we can serve more requests!
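Here is a small sketch of that spool-and-share pattern, again as an illustration rather than the real implementation. The first caller for a key produces the response and writes it to disk while other callers for the same key wait; everyone after that is served from the file. The class name and the key format are assumptions, and a real cache would also need to invalidate entries when the refs change, which this sketch leaves out.

```python
import hashlib
import os
import threading


class SpoolingCache:
    def __init__(self, directory):
        self._directory = directory
        self._locks = {}
        self._guard = threading.Lock()

    def _lock_for(self, key):
        # One lock per key, so requests for different data do not block each other.
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key, produce):
        """Return the cached bytes for `key`, calling `produce()` at most once."""
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        path = os.path.join(self._directory, name)
        with self._lock_for(key):
            if not os.path.exists(path):
                # First request: spool the response to a temporary file, then
                # publish it atomically so readers never see a partial file.
                temporary = path + ".tmp"
                with open(temporary, "wb") as spool:
                    spool.write(produce())
                os.replace(temporary, path)
        with open(path, "rb") as cached:
            return cached.read()
```

The expensive produce callback runs once per key, no matter how many CI servers ask for the same data at the same moment.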
But we still weren’t 100% happy, because even though it was much more efficient - there was still an upper bound to how many concurrent requests a single Stash server could handle. <click>So we decided to implement clustering. Stash is the only clusterable Git solution out there, which effectively means you can scale up your instance to support as many developers, CI servers and other clients as you like!
So these are the three reasons we are super proud of what we’ve achieved with Stash. Quick recap on workflows; APIs, extensions & integrations; and scalability.
These features are why there are all sorts of teams using Git and Stash.
These are not small companies - one of the common misconceptions about git is that it is only used by startups and open source projects. Not true! Huge enterprises are using Git.
If you want to try out any of the products we’ve shown off tonight, we have a free 30-day trial of Stash, Bitbucket is free for five users and SourceTree is 100% free. We also have $10 starter licenses for all of our products that are fully featured for up to 10 users.
Bring up Jesse Miller from ServiceRocket, our sponsors for this evening, to help answer any questions.
Before we head into Q & A I’d like to pull Paul, the organizer of the Atlassian User Group, up on stage to tell you a bit about it.
Thanks Paul! Now we’ll take a bit of time for questions.