14. Code Review with Gerrit
• Web interface – accessible from multiple sites.
• Scales to handle a large number of repositories
• Inline comments and block comments
• Code review required before changes hit the master
branch
15. Big Picture
• Code Review workflow is integrated into the
development process
• Developer does not have to do anything “extra” to start
review process.
• The CI Server is an active participant in the review
process
• The final, cleaned patchset is what appears in the
project history
• My project history looks good. Damn good.
• My developers appear to be geniuses.
• fin
16. Action items:
All the tools discussed today are Free Open Source Software.
• http://git-scm.com/
• https://code.google.com/p/gerrit/
• http://jenkins-ci.org/
• https://wiki.jenkins-ci.org/display/JENKINS/Gerrit+Trigger
• http://hudson-ci.org/
Developer is given a task to implement a new feature / fix a bug.
The developer checks the code into the SCM so that the changes are not just local to one machine and can be shared with other developers for review.
Other developers are tasked with reviewing the code. This can involve patching a local copy of the source tree, getting a fresh clone into a temp dir, etc.
Build Server: runs a compile of the source code on a nightly/weekly basis. This catches compile errors for targets that the developer may not be building for.
The developer checks code into Gerrit/Git. They do not have to do anything special; it looks and feels like a regular push.
Gerrit holds the changes in a patchset for review. It also notifies anyone in the review queue for this project that a patchset is ready for review.
Reviewers log into Gerrit and can review all the patchsets in their queue, as they have time/resources available.
Reviewers add comments to the patchset.
The developer can look at the comments (inline in the code as well as on the patchset as a whole) and make any recommended changes.
The developer then pushes this “new” change to Gerrit. Gerrit sees the new patchset, “knows” that it is an update to the original, and creates patchset 2. Rinse and repeat until all parties are satisfied with the patchset.
The final change is pushed into the backend Git repository.
Jenkins is an active participant in the review process. It can check out a patchset and verify that the patchset builds for the targets it is configured for. It then marks the Verified column in the Gerrit review with a pass/fail.

Pre-commit code reviews
This one item alone makes git/Gerrit worth using. Using Gerrit as a frontend provides a ‘standard’ git interface to the developers. They push their code to the git server: no special check-in process, no special software to install. The developer just pushes to a special ref, “refs/for/<branch>”, that Gerrit understands. Gerrit then takes the changes, creates a patchset from them, and posts it to its web interface for review.
The patchset is ‘held’ in Gerrit until the code has been reviewed. It can then be submitted into the git repository. The patchset can be updated, abandoned, resurrected, etc., all without impacting the git repository it was pushed to. This allows changesets to be in review and pending without impacting the code base. The developer can also update the patchset based on comments made during review: they make the requested changes, amend the same commit, and push it again. Gerrit recognizes the push as a new patchset of the existing change (it matches on the Change-Id line in the commit message) and adds it to the review as patchset<x>.

Review Process Workflow that can be integrated into the development process.
This is where the rubber hits the road. Using Gerrit/git allows the review process to be fully integrated into the development process. The developer does not have to learn any new process; they just push their changes to git and Gerrit takes care of the magic. There is no special check-in process and no special software to install. The code is pushed to the special ref “refs/for/<branch>”, and Gerrit creates a patchset from the changes and posts it to its web interface for review. It then emails everyone on the review list that a new review is in their queue. When the reviewers log into Gerrit, they see the patchset they have been asked to review in their queue.

Integration with a build server
This is where the three amigos meet. Jenkins (the build server), through its Gerrit Trigger plugin, can monitor and build against a Gerrit/git SCM system. This allows automated builds to be triggered whenever a patchset is pushed to Gerrit for review. The developer does not have to do anything special to trigger the build; it happens automatically based on the patchset and the branch it is being pushed to in Gerrit/git. This can be used to build a set group of targets for a given branch, or all of the targets that a given project builds for.
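The push-for-review step and the build server's vote described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the default remote name "origin", and the target names are hypothetical, and the +1/-1 values assume a Verified label configured with that range. It builds the git command as a list rather than running it.

```python
from __future__ import annotations


def review_ref(branch: str) -> str:
    """Return the special ref Gerrit watches for a target branch."""
    if not branch:
        raise ValueError("branch must not be empty")
    return f"refs/for/{branch}"


def push_for_review_command(branch: str, remote: str = "origin") -> list[str]:
    """Build (but do not run) the git command that uploads HEAD for review.

    The remote name "origin" is an assumption; use whatever remote
    points at the Gerrit server.
    """
    return ["git", "push", remote, f"HEAD:{review_ref(branch)}"]


def verified_vote(build_results: dict[str, bool]) -> int:
    """Map per-target build results to a pass/fail vote.

    Returns +1 only if every configured target built, otherwise -1.
    Assumes a Verified label that ranges from -1 to +1.
    """
    if not build_results:
        raise ValueError("no build targets were reported")
    return 1 if all(build_results.values()) else -1


if __name__ == "__main__":
    # Hypothetical branch and target names, for illustration only.
    print(" ".join(push_for_review_command("master")))
    print(verified_vote({"arm": True, "x86": False}))
```

From the developer's side the whole workflow is the single push command; the vote function stands in for what the build server reports back once it has built the patchset for each target.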
Web Interface
Gerrit provides a web interface for code review, patchset generation, cherry-picking, etc. of patchsets that have been submitted for review. Access to this web interface and the underlying repositories can be restricted so that developers only have access to the projects they are working on. The interface also allows a custom view of the patchset under review: a reviewer can choose to view any number of lines surrounding the change, up to the whole file. Each reviewer can therefore see as much context as they need without having to check out any code.
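The idea of an adjustable amount of context around a change can be illustrated with Python's standard difflib module. This is not how Gerrit renders diffs; it is a minimal sketch in which the function name, the "a/" and "b/" path prefixes, and the sample file contents are all assumptions made for illustration.

```python
from __future__ import annotations

import difflib


def review_diff(old: str, new: str, context: int | None = 3,
                path: str = "file") -> str:
    """Return a unified diff of two file versions.

    context is the number of unchanged lines shown around each change;
    None means "show the whole file". Inputs are expected to end with
    a newline so the joined output stays line-aligned.
    """
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    if context is None:
        # Enough context to cover every line of either version.
        context = max(len(old_lines), len(new_lines))
    diff = difflib.unified_diff(
        old_lines, new_lines,
        fromfile=f"a/{path}", tofile=f"b/{path}",
        n=context,
    )
    return "".join(diff)


if __name__ == "__main__":
    before = "a\nb\nc\nd\ne\n"
    after = "a\nb\nX\nd\ne\n"
    print(review_diff(before, after, context=1))     # one line either side
    print(review_diff(before, after, context=None))  # the whole file
```

Each reviewer picks the context value that suits them; the underlying change is the same in every view.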
Inline comments and block comments
Gerrit allows the reviewer(s) to enter both inline and block comments on any patchset they are reviewing. It also keeps a history of the patchset as it goes through the review process, so the reviewer and the developer can go back through past comments on the change.
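One way to picture the difference between the two comment types, and the history kept across patchsets, is a small data model. This is a hypothetical sketch, not Gerrit's actual schema: the class names, field names, and sample comments are assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Comment:
    author: str
    message: str
    patchset: int                # which patchset the comment was made on
    path: Optional[str] = None   # set only for inline comments
    line: Optional[int] = None   # set only for inline comments

    @property
    def is_inline(self) -> bool:
        """Inline comments point at a file and line; block comments do not."""
        return self.path is not None and self.line is not None


@dataclass
class Change:
    subject: str
    comments: list[Comment] = field(default_factory=list)

    def add_block_comment(self, author: str, message: str, patchset: int) -> None:
        self.comments.append(Comment(author, message, patchset))

    def add_inline_comment(self, author: str, message: str, patchset: int,
                           path: str, line: int) -> None:
        self.comments.append(Comment(author, message, patchset, path, line))

    def history(self, patchset: Optional[int] = None) -> list[Comment]:
        """All comments in the order made, optionally for one patchset."""
        if patchset is None:
            return list(self.comments)
        return [c for c in self.comments if c.patchset == patchset]


if __name__ == "__main__":
    change = Change(subject="Fix buffer handling")
    change.add_block_comment("alice", "Please expand the commit message.", 1)
    change.add_inline_comment("bob", "Also used in ABCD.c, please check.", 2,
                              path="ABCD.c", line=10)
    for c in change.history():
        kind = "inline" if c.is_inline else "block"
        print(f"patchset {c.patchset} [{kind}] {c.author}: {c.message}")
```

Keeping both kinds of comment in one ordered list is what lets the full review history be replayed later, patchset by patchset.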
“Ideal” Code Review System

Web Interface
Today we have development teams spread around the world. The old adage that the sun never sets on the British Empire could be applied to some of our current development teams. With developers not always located in the same geographical area or time zone, it becomes important to have a web interface so that code review does not depend on sitting around a table. A developer can submit code for review, and a teammate can review it when they get into the office.

Pre-commit code reviews
One of the biggest problems facing code review is how to satisfy both requirements: having the code under SCM while not impacting the current code base with unreviewed code. There are many ways to implement this; a separate (sandbox) repository for untested/unreviewed code, or submitting patchsets for changes into an SCM, are a couple of them. The problem is that most of these methods add overhead to your development process: either maintaining two repos, one for production and one for development, or adding steps to the development process to create the patchset for a change to be reviewed.

Can handle a large number of repositories
Development teams today work on multiple projects, and each one normally has its own code base that needs to be maintained. Being able to maintain a large number of different repositories, while not a major issue with SCM systems today, is worth mentioning. It can become an issue depending on how the SCM stores the repository and how much space it takes on the server.

Inline comments and block comments
It is important that reviewers can comment not only on the change as a whole, but also inline in the patchset/code being reviewed. Think of a global comment on the change, “The commit message needs some more verbiage to describe the change better,” versus a local comment in the code, “This variable is being used in file ABCD.c, check this file to make sure we do not have an issue.” Both types of comments, inline and block, should be part of the code review history.

Integration with a build server
Projects that share code across platforms need to be able to cross-check common code for multiple build targets. Having a build server that can do the ‘grunt’ work of building multiple targets for a code base puts a check in place that does not depend on a developer doing the builds. With some projects having many targets, a build server helps to automate and standardize the process.

Review Process Workflow that can be integrated into the development process.
The trick is to integrate the code review so it is a part of your ‘normal’ code development process. If there is any “exception” path that allows engineers to bypass code review for emergencies, it will become the normal path. From the developer’s point of view, the code review process should have a minimal impact on the development process. The best case is that the developer’s normal check-in/commit process for submitting code into the SCM is the code review process.

Can handle a large number of repositories
All modern SCM systems can handle multiple repositories. Where git stands out, though, is in the size of the repository and how it stores the files. For example, the Mozilla repository is reported to be almost 12 GB when stored in SVN using the fsfs backend. Previously, the fsfs backend also required over 240,000 files in one directory to record all 240,000 commits made over the 10-year project history. The exact same history is stored in git in only two files totaling just over 420 MB. This means that SVN requires about 30x the disk space to store the same history.
Working copies are smaller as well. An SVN working directory always contains two copies of each file: one for the user to actually work with and another hidden in .svn/ to aid operations such as status, diff and commit. In contrast, a git working directory requires only one small index file that stores about 100 bytes of data per tracked file. On projects with a large number of files this can be a substantial difference in the disk space required per working copy. The same comparison can be made between git and CVS, where a 3x improvement in disk space usage has been seen.
A side effect of how git manages its repository is that each time you clone a repository locally you get the full repository. All the history, etc. is cloned to the local machine from the server. This allows developers to work on code, switch between branches, search history, etc. without having to be connected to the ‘central’ SCM.
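The disk-space figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers from the text (12 GB, 420 MB, about 100 bytes per tracked file); the function names are hypothetical, and the 50,000-file project is an assumed example rather than a measured one.

```python
GB = 1024 ** 3
MB = 1024 ** 2

INDEX_BYTES_PER_FILE = 100  # "about 100 bytes" per tracked file, per the text


def size_ratio(svn_bytes: int, git_bytes: int) -> float:
    """How many times larger the SVN repository is than the git one."""
    return svn_bytes / git_bytes


def index_size_bytes(tracked_files: int) -> int:
    """Approximate size of git's index file for a working copy."""
    return tracked_files * INDEX_BYTES_PER_FILE


if __name__ == "__main__":
    ratio = size_ratio(12 * GB, 420 * MB)
    print(f"SVN / git repository size: {ratio:.1f}x")  # prints 29.3x

    files = 50_000  # hypothetical project size
    print(f"index for {files} files: {index_size_bytes(files) / MB:.1f} MB")
```

The ratio comes out just under 30, which matches the rounded figure in the text, and even for a large tree the index stays in the low megabytes.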