The document describes building a low-cost sample tracking system using G Suite and Jira Cloud. It discusses using current off-the-shelf technology to create a serverless solution, how low-cost solutions can accelerate academic research, and developing the minimum viable product through iterative delivery. Permission to learn new skills can help develop capabilities to address problems and move research forward.
2019-04-17 Bio-IT World G Suite-Jira Cloud Sample Tracking
1. Building a low-cost sample tracking system
with G Suite & Jira Cloud
What you can do with a little knowledge, a lot of ignorance,
some time, and permission to take a boondoggle
For Bio-IT World
2019/04/17 v1
2. About the Broad Institute of MIT and Harvard
• Propelling the understanding and
treatment of disease
• Collaborating deeply
• Reaching globally
• Empowering scientists
• Building partnerships
• Sharing data and knowledge
• Promoting inclusion
3. Take aways
• Current off-the-shelf technology allows for a serverless sample tracking solution (backed by a
lot of infrastructure)
• Low-cost solutions suit academic research because of the effects of overhead; having
them removes the search for funding as a rate-limiting factor in accelerating science
• Developing the Minimum Viable Product, along with short-cycle/iterative delivery of solutions
to users, allows rapid feedback on what works to increase the velocity of science
• Meeting delivery deadlines builds faith that further iterations are worth the investment of
the project team’s time/focus
• Permission to invest time in learning a new skill not obviously in line with a job description
can move research forward by developing new capabilities to apply to problems
4. A little history…
• 2014: I arrive at the Broad to work on solutions for management of laboratory/scientific data,
divided into functions (graphic by Scott Sutherland)
5. A little history
• Turns out the biggest need: Where’s my stuff (i.e., samples, data)?
6. One view of sample tracking at the Broad
• The parable of the blind people and the elephantiformes
7. One view of sample tracking at the Broad
Sample lifecycle | Activity | Tracking systems
Before physical samples received | Project launch; find participants/samples; ship sample kits to participants | Jira Cloud; Google Sheets/Google Forms; consent systems; Smartsheet
Before processing | Store samples; process samples prior to sequencing | Bespoke LIMS, COTS lab data management systems; Google Sheets; Jira Cloud
During processing (e.g., sequencing) | Sequencing at GP/elsewhere; analysis by Proteomics | Bespoke LIMS, on-premises Jira
After processing | Data analysis; data transfer | Google Sheets, Jira Cloud; Aspera, Trello
After initial use | Compare samples; reuse samples | ?; consent systems
10. Components of a G Suite & Jira Cloud-based
sample tracking system
<name>@broadinstitute.org
<name>@broad.mit.edu
16. Suitable for all?
• Discovery: Can experimental techniques
produce data to answer scientific questions?
• Scale Discovery: Scaling experimental techniques
so they can more reliably produce data at high rate
• Data Production: Regularly producing experimental
data and producing quality control data
• Iterative Refinement: Refining production-scale processes
and some level of change management is expected to
ensure the quality of the data produced is maintained
or improved
Early stages: technology development (e.g., PRISM)
Platform (e.g., DMX)
17. Projects helped so far…
Project | Things tracked | Approximate go-live date
Comparative Medicine | Can’t tell (Issue security!) | 2018/01/01
Firehose to FireCloud Migration | ~2800 | 2018/03/01
Regev Lab (scRSP) | ~2700 | 2018/04/01
Archive of Lines in Artificial Societies | ~500 | 2018/04/15
NeuroGAP-Psychosis Ship Log | ~100 | 2018/05/01
External Compound Request | ~50 | 2018/06/01
Microbial Omics | ~950 | 2018/08/01
Data Map Expansion | | In planning stages
18. Common factor? Each of these groups is piloting solutions with
rapid iterations, applying Agile techniques to speed science
• Sheila Dodge's Dynamic Work Design paper
• Agile Academia (Broad Affinity Group)
• Kendra West's The Agile Laboratory Handbook
• Kendall Square Agilists & Agile Biotech Boston
19. Development principles
1. Move science forward
2. Usability to encourage people to use it!
3. Low cost (i.e., no Jira Cloud add-ons, no outside labor)
4. Solution sustainable beyond initial development team
5. Deliver solutions to users in short time frames and rapidly iterate
6. Users in control as much as possible for shape of solution (e.g., layout of Google Sheet,
which fields needed, columns in Jira Boards, etc.)
7. Have as little code as necessary/leave as much to other components as possible (e.g.,
VLOOKUP in Google Sheets)
8. Limit dependencies between components where possible (hah)
9. At least attempt to think about security (e.g., limit storage of credentials)
10. Document, document, document…
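Principle 9 (limit storage of credentials) could be approached roughly as sketched below: the Jira Cloud account email and API token are read from a key-value store at run time instead of being written into the script. This is an illustration, not the code behind this talk. The property names `JIRA_EMAIL` and `JIRA_API_TOKEN` and the helper `buildAuthHeader` are hypothetical. In Apps Script the store would presumably be `PropertiesService.getScriptProperties()` and the encoder `Utilities.base64Encode`; the stand-ins at the bottom exist only so the sketch runs outside Apps Script.

```javascript
// Sketch only: build an HTTP Basic auth header from stored credentials.
// `props` is any object with getProperty(key) returning a string or null.
// `encode` is any function that base64-encodes a string.
function buildAuthHeader(props, encode) {
  var email = props.getProperty('JIRA_EMAIL');      // hypothetical property name
  var token = props.getProperty('JIRA_API_TOKEN');  // hypothetical property name
  if (!email || !token) {
    throw new Error('Missing JIRA_EMAIL or JIRA_API_TOKEN in stored properties');
  }
  return { Authorization: 'Basic ' + encode(email + ':' + token) };
}

// Stand-ins for running the sketch outside Apps Script (placeholder values).
var fakeProps = {
  values: { JIRA_EMAIL: 'role@example.org', JIRA_API_TOKEN: 'not-a-real-token' },
  getProperty: function (key) {
    return Object.prototype.hasOwnProperty.call(this.values, key) ? this.values[key] : null;
  }
};
function nodeBase64(text) {
  return Buffer.from(text, 'utf8').toString('base64');
}

var authHeader = buildAuthHeader(fakeProps, nodeBase64);
```

Keeping the token out of the source also means nothing sensitive is copied along when the code is pasted into GitHub.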
20. Why G Suite & Jira Cloud?
G Suite
• Already established at the Broad
• High level of user familiarity/skill already exists
• Cost covered by overhead already*
• Users able to prototype solutions quickly
• Metadata datasets are small
• Adequate feature set, i.e., can persist data,
flexible data types (+/-)
• Can share outside Broad easily
• SaaS/integrated into BITS architecture
• Developer (me) had easily transferable
skills/experience
• Lots of resources from which to learn and copy
Jira Cloud
• Already established at the Broad
• Some level of user familiarity/skill already exists
• Cost covered by overhead already
• Developer able to prototype solutions quickly
• Metadata datasets are small
• Adequate feature set, i.e., configurable workflows,
separable workflows by item type, custom fields
• SaaS/integrated into BITS architecture
• Developer (me) had some transferable
skills/experience and some good history to follow
• Lots of resources from which to learn and copy
21. Lots of resources to learn how to automate them!
• w3schools.com
• https://www.w3schools.com/js/default.asp
• Atlassian
• https://developer.atlassian.com/cloud/jira/platform/rest/v3/?utm_source=%2Fcloud%2Fjira%2Fplatform%2Frest&utm_medium=302
• https://developer.atlassian.com/server/jira/platform/jira-rest-api-examples/
• Stack Overflow
• https://stackoverflow.com
• Style guides
• https://www.w3schools.com/js/js_conventions.asp
• https://google.github.io/styleguide/jsguide.html
22. Design considerations
• Put code as close to where it needs to be as possible
• Google Sheet: Code to add menus to Google Sheets
• Google Forms or Google Sheets: Code to call Google Apps Script Module
• Google Apps Script module: Code to do extract/transform/load from Google
Sheet, upload to Jira Cloud, link Issues in Jira Cloud, etc.
• Use Google Forms to design intake forms for collaborators
• Use Google Sheets to store data from Google Forms and data necessary in
Issues in Jira
• Use Google Groups to establish role accounts for G Suite and Jira Cloud
• Multiple Boards in Jira for different views of the same data
• Prefer using each component for what it does best
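The split described above (menu code in the Sheet, extract/transform/load and linking in a shared module) might look something like the following. This is a minimal sketch under stated assumptions, not the code from this project: the menu title, the handler name `uploadRowsToJira`, the project key `SAMP`, the column name `Tissue`, and the custom field ID `customfield_10042` are all hypothetical. The request bodies follow the shapes expected by Jira Cloud's `POST /rest/api/3/issue` and `POST /rest/api/3/issueLink`.

```javascript
// Sheet-bound code: add a custom menu. `ui` is expected to be
// SpreadsheetApp.getUi() when running inside Apps Script.
function addTrackingMenu(ui) {
  ui.createMenu('Sample Tracking')                        // hypothetical menu title
    .addItem('Upload rows to Jira', 'uploadRowsToJira')   // hypothetical handler name
    .addToUi();
}

// Apps Script runs onOpen automatically for a bound script.
// It is defined here but never called outside Apps Script.
function onOpen() {
  addTrackingMenu(SpreadsheetApp.getUi());
}

// Module code: turn one sheet record into a create-issue request body.
function toIssuePayload(record, config) {
  var fields = {
    project: { key: config.projectKey },
    issuetype: { name: config.issueType },
    summary: String(record.sampleId)
  };
  // Copy mapped sheet columns into their Jira custom fields, skipping blanks.
  Object.keys(config.customFields).forEach(function (column) {
    var value = record[column];
    if (value !== undefined && value !== '') {
      fields[config.customFields[column]] = value;
    }
  });
  return { fields: fields };
}

// Module code: request body that links two existing issues.
function toIssueLinkPayload(inwardKey, outwardKey, linkTypeName) {
  return {
    type: { name: linkTypeName },
    inwardIssue: { key: inwardKey },
    outwardIssue: { key: outwardKey }
  };
}

// Example with made-up values.
var exampleConfig = {
  projectKey: 'SAMP',
  issueType: 'Task',
  customFields: { Tissue: 'customfield_10042' }
};
var examplePayload = toIssuePayload({ sampleId: 'S-001', Tissue: 'Liver' }, exampleConfig);
var exampleLink = toIssueLinkPayload('SAMP-1', 'SAMP-2', 'Relates');
```

Inside Apps Script, a payload like this would typically be sent with `UrlFetchApp.fetch`, using `method: 'post'`, `contentType: 'application/json'`, and `payload: JSON.stringify(examplePayload)`. Keeping the transform functions free of Apps Script calls means they can be exercised without a spreadsheet.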
23. Keys for success (thus far)
• Not quarterly-driven pharma (time = $$$) so space to learn new things
• (Some) freedom to work on interesting and pressing issues
• Feasible to pick up required knowledge to deliver minimum viable product
• Culture of volunteerism (no one said I couldn’t work on it)
• Supportive environment for learning and applying agile techniques
• Iterative development/focus on minimum viable product
(or hang yourself)
24. Not all is copacetic
• Google Apps Script editor is a bit
primitive (I miss colors,
autocompletion)
• Using another editor seemingly
requires a lot of futzing; that time could be
better spent fixing bugs/delivering
features
• No integration to GitHub
• Must remember to NOT paste
username:password into GitHub :-|
• Calling code in GAS modules is
sloooow
• Google Sheets configuration can be
brittle (must know a priori about
columns and sheets)
• Have to know a lot about Jira Cloud
configuration to connect with it (e.g.,
custom field code ID)
• Our Jira Cloud instance configuration
needs housecleaning
• Versioning to be refined
• Security to be refined
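One way to soften the brittleness noted above (having to know columns a priori) is to look columns up by header name and fail loudly when one is missing, rather than hard-coding column positions. The sketch below assumes the 2D array shape that `Range.getValues()` returns in Apps Script, with headers in the first row. The function names and the column names in the example are hypothetical.

```javascript
// Map header names to column indexes; throw if a required header is absent.
function mapHeaders(headerRow, requiredHeaders) {
  var index = {};
  headerRow.forEach(function (name, i) {
    index[String(name).trim()] = i;
  });
  var missing = requiredHeaders.filter(function (name) {
    return !Object.prototype.hasOwnProperty.call(index, name);
  });
  if (missing.length > 0) {
    throw new Error('Sheet is missing column(s): ' + missing.join(', '));
  }
  return index;
}

// Convert rows below the header into objects keyed by header name.
function rowsToRecords(values, requiredHeaders) {
  var index = mapHeaders(values[0], requiredHeaders);
  return values.slice(1).map(function (row) {
    var record = {};
    requiredHeaders.forEach(function (name) {
      record[name] = row[index[name]];
    });
    return record;
  });
}

// Example with made-up data; column order in the sheet does not matter.
var exampleValues = [
  ['Notes', 'Sample ID', 'Tissue'],
  ['fragile', 'S-001', 'Liver'],
  ['', 'S-002', 'Kidney']
];
var exampleRecords = rowsToRecords(exampleValues, ['Sample ID', 'Tissue']);
```

With this approach users can reorder or add columns without breaking the upload, and a renamed column produces a readable error instead of silently wrong data.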
25. To do list
• GAS project setup refinements
• Need to not run in development mode
• Code changes
• Use API tokens per role account
• Checks on user permissions (does the
Google Sheets user have access to the
Jira Cloud Project?)
• Use Jira Cloud Webhook to call a Google
Cloud Function, then modify the passed
JSON object to call back to Jira Cloud to
extend Jira functionality (e.g., transition to a
new status once all required fields filled out)
• A developer’s guide would be useful
• Setup of role account via Google Group
and some BITS trickery
• Ensure role account given access to the
Jira Cloud Project
• What is the appropriate role in the Jira
Cloud Project/what permissions does it
need
• Add Google Group to BITS’ Google
automation
• What is best done in each component
• Remove password:username from code
before posting to GitHub (always!)
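The webhook idea in the to-do list (transition an Issue once all required fields are filled out) might reduce to a decision function like the one below. This is a sketch of a possible approach, not something already built: it assumes the webhook body carries an `issue` object with `key` and `fields`, and the field ID `customfield_10042` and transition ID `31` are hypothetical placeholders that would have to be looked up in the actual Jira Cloud configuration. The returned object describes a call to `POST /rest/api/3/issue/{issueIdOrKey}/transitions`.

```javascript
// True when a field value counts as "filled out".
function isFilled(value) {
  if (value === null || value === undefined) return false;
  if (typeof value === 'string') return value.trim() !== '';
  if (Array.isArray(value)) return value.length > 0;
  return true;
}

// Decide whether to call back to Jira Cloud. Returns a request
// description, or null when the issue should be left alone.
function planTransition(webhookBody, requiredFieldIds, transitionId) {
  var issue = webhookBody && webhookBody.issue;
  if (!issue || !issue.key || !issue.fields) return null;
  var allFilled = requiredFieldIds.every(function (id) {
    return isFilled(issue.fields[id]);
  });
  if (!allFilled) return null;
  return {
    method: 'POST',
    path: '/rest/api/3/issue/' + encodeURIComponent(issue.key) + '/transitions',
    body: { transition: { id: String(transitionId) } }
  };
}

// Example with made-up webhook bodies.
var completeBody = {
  issue: { key: 'SAMP-7', fields: { summary: 'S-001', customfield_10042: 'Liver' } }
};
var incompleteBody = {
  issue: { key: 'SAMP-8', fields: { summary: 'S-002', customfield_10042: '' } }
};
var plan = planTransition(completeBody, ['summary', 'customfield_10042'], 31);
var noPlan = planTransition(incompleteBody, ['summary', 'customfield_10042'], 31);
```

A real handler would probably also check the Issue's current status before transitioning, so that the callback does not trigger the same webhook in a loop.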
26. Silly things
• let vs var
• No warnings on changing values of const!?!?!
• Style guide, what’s that?
• Changing GAS library names == bad idea
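For the `let` vs `var` and `const` items above, a small demonstration of how a standards-compliant JavaScript engine behaves is sketched below. The surprise about `const` on the slide may reflect the Apps Script runtime available at the time; that is an assumption, not something the deck states. The function name `scopeDemo` is made up for illustration.

```javascript
// Collects observations about var, let, and const scoping.
function scopeDemo() {
  var results = {};

  for (var i = 0; i < 3; i++) { /* no body needed */ }
  results.varAfterLoop = i;          // var is function-scoped: i is still visible here

  for (let j = 0; j < 3; j++) { /* no body needed */ }
  results.letAfterLoop = typeof j;   // let is block-scoped: j does not exist here

  const limits = { max: 10 };
  limits.max = 20;                   // allowed: const fixes the binding, not the contents
  results.mutatedMax = limits.max;

  try {
    limits = { max: 30 };            // rebinding a const
    results.rebind = 'no error';
  } catch (e) {
    results.rebind = e instanceof TypeError ? 'TypeError' : 'other error';
  }
  return results;
}

var scopeResults = scopeDemo();
```

In practice this suggests preferring `let` for loop counters and treating `const` as protection against reassignment only, not against changes to an object's properties.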
27. Even with modest skills, you can still deliver value…
I started with OLD technical skills
• C++, Java, Perl, OLE Automation
• RCS, SourceSafe, ClearCase
• HP-UX, QNX, Red Hat (pre RHEL)
• Client server development
• Excel/Word/Outlook automation (VBA)
• ERP, MES, ELN, SDMS, LIMS
• Jira administration
and ended up updating them
• Google Apps Script/JavaScript
• Rudimentary GitHub
• Web development (GET, PUT)
• Debug in cloud
• G Suite, Jira REST API
• (Rudimentary) Google Cloud IAM
authentication with BITS infrastructure
28. Take aways
• Current off-the-shelf technology allows for a serverless sample tracking solution (backed by a
lot of infrastructure)
• Low-cost solutions suit academic research because of the effects of overhead; having
them removes the search for funding as a rate-limiting factor in accelerating science
• Developing the Minimum Viable Product, along with short-cycle/iterative delivery of solutions
to users, allows rapid feedback on what works to increase the velocity of science
• Meeting delivery deadlines builds faith that further iterations are worth the investment of
the project team’s time/focus
• Permission to invest time in learning a new skill not obviously in line with a job description
can move research forward by developing new capabilities to apply to problems
29. Acknowledgements
(even if they’d rather that their names not be listed)
• Broad Information Technology
Services (BITS)
• Scientific Computing Services
(SCS) group: Vicky Guo
(manager), Michelle Campo,
Eric Jones, Michael Kirby,
Anthony Losada, Peter Ragone,
Gordon Saksena
• Other BITS people:
Jared Bancroft, Lukas Karlsson,
Bill Mayo, Katie Shakun,
Andrew Teixeira, Elsa Tsao…
• Scientific collaborators
• Thomas Cleland, Danielle Dionne,
Joshua Gould, Zach Leber,
Yenarae Lee, Anna Neumann,
Jenna Pfiffner-Borges,
Anne Stevenson, Kendra West,
Alec Wysoker, Didi Vaz
• Broad alumni
• Sadiya Akasha, Marc Monnar,
Scott Rich
31. Learn Broad Institute's best practices using the Atlassian tools
(Since it was on the advertisement!)
• Integration with Broad infrastructure (Single Sign On mostly)
• Understand our environment and tailor approach to it
• Flexible and ever changing workforce (groups and personnel), i.e., graduate students, post
docs, Associates, outside collaborators, interns, normal turnover, new groups, refactored
groups…
• A collection of semi-independent entities with a common goal
• Training, training, training
• Jira 101: Aimed at those new to Jira Cloud
• Jira 102: Jira Cloud Board and Project Administration
• Jira 201: Advanced Boards, JQL, Importing Issues, Mass Change (planned)
• Jira 301: Jira Cloud Project Administration (in development)
• Jira 401: Integrating Jira Cloud (planned)
32. Learn Broad Institute's best practices using the Atlassian tools
(Since it was on the advertisement!)
• Standardish practices/models
• KISS, e.g., start simple (To Do, Doing, Done) and iterate (To Do, Doing, Checked, Done)
• Share as little as possible (what’s not possible to separate, e.g., Statuses, Issue Types, custom fields)
• Separate things as much as possible (e.g., Workflows, Notification Schemes)
• Keep things private by default (e.g., only add people to Projects instead of all users in your instance)
• Use a tool for what it is best suited to do, and not other things
• Sometimes ServiceNow, Trello, or Smartsheets might be better suited to people’s needs
• Deliver value in short increments
• Attempt to follow IT system management/software engineering best practices
• Ticketing system for Jira Cloud Project requests
• Change Requests for major changes
• Test plans for major changes