Building a low-cost sample tracking system
with G Suite & Jira Cloud
What you can do with a little knowledge, a lot of ignorance,
some time, and permission to take a boondoggle
For Bio-IT World
2019/04/17 v1
About the Broad Institute of MIT and Harvard
• Propelling the understanding and
treatment of disease
• Collaborating deeply
• Reaching globally
• Empowering scientists
• Building partnerships
• Sharing data and knowledge
• Promoting inclusion
Takeaways
• Current off-the-shelf technology allows for a serverless sample tracking solution (backed by a
lot of infrastructure)
• Low-cost solutions suit academic research because of the effects of overhead, and having
them removes the search for funding as a rate-limiting factor in accelerating science
• Developing the Minimum Viable Product, along with short-cycle/iterative delivery of solutions
to users, allows rapid feedback on what works to increase the velocity of science
• Meeting delivery deadlines builds faith that further iterations are worth the investment of
the project team’s time/focus
• Permission to invest time in learning a new skill not obviously in line with a job description
can move research forward by developing new capabilities to apply to problems
A little history…
• 2014: I arrive at the Broad to work on solutions for management of laboratory/scientific data,
divided into functions (graphic by Scott Sutherland)
A little history
• Turns out the biggest need: Where’s my stuff (i.e., samples, data)?
One view of sample tracking at the Broad
• The parable of the blind people and the elephantiformes
One view of sample tracking at the Broad
Sample lifecycle | Activity | Tracking systems
Before physical samples received | Project launch; Find participants/samples; Ship sample kits to participants | Jira Cloud; Google Sheets/Google Forms; Consent systems; Smartsheet
Before processing | Store samples; Process samples prior to sequencing | Bespoke LIMS, COTS lab data management systems; Google Sheets; Jira Cloud
During processing (e.g., sequencing) | Sequencing at GP/elsewhere; Analysis by Proteomics | Bespoke LIMS, on-premises Jira
After processing | Data analysis; Data transfer | Google Sheets, Jira Cloud; Aspera, Trello
After initial use | Compare samples; Reuse samples | ?; Consent systems
Components of a G Suite & Jira Cloud-based
sample tracking system
<name>@broadinstitute.org
<name>@broad.mit.edu
Collaborative/iterative design
Suitable for all?
• Discovery: Can experimental techniques
produce data to answer scientific questions?
• Scale Discovery: Scaling experimental techniques
so they can more reliably produce data at a high rate
• Data Production: Regularly producing experimental
data and quality control data
• Iterative Refinement: Refining production-scale processes,
with some level of change management expected to
ensure the quality of the data produced is maintained
or improved
Early-stage technology development (e.g., PRISM)
Platform (e.g., DMX)
Projects helped so far…
Project | Things tracked | Approximate go-live date
Comparative Medicine | Can’t tell (Issue security!) | 2018/01/01
Firehose to FireCloud Migration | ~2800 | 2018/03/01
Regev Lab (scRSP) | ~2700 | 2018/04/01
Archive of Lines in Artificial Societies | ~500 | 2018/04/15
NeuroGAP-Psychosis Ship Log | ~100 | 2018/05/01
External Compound Request | ~50 | 2018/06/01
Microbial Omics | ~950 | 2018/08/01
Data Map Expansion | | In planning stages
Common factor? Each of these groups is piloting solutions with
rapid iterations, applying Agile techniques to speed science
• Sheila Dodge's Dynamic Work Design paper
• Agile Academia (Broad Affinity Group)
• Kendra West's The Agile Laboratory Handbook
• Kendall Square Agilists & Agile Biotech Boston
Development principles
1. Move science forward
2. Usability to encourage people to use it!
3. Low cost (i.e., no Jira Cloud add-ons, no outside labor)
4. Solution sustainable beyond initial development team
5. Deliver solutions to users in short time frames and rapidly iterate
6. Users in control as much as possible for shape of solution (e.g., layout of Google Sheet,
which fields needed, columns in Jira Boards, etc.)
7. Have as little code as necessary/leave as much to other components as possible (e.g.,
VLOOKUP in Google Sheets)
8. Limit dependencies between components where possible (hah)
9. At least attempt to think about security (e.g., limit storage of credentials)
10. Document, document, document…
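Principle 9 (limit storage of credentials) can be sketched in Google Apps Script as follows. This is a minimal sketch, not the code used in the projects described here: the property names are hypothetical, and it assumes the Jira Cloud role account authenticates with HTTP Basic auth built from an email address and an API token, both kept in script properties rather than in the source.

```javascript
// Hypothetical property names; the deck does not say where credentials were kept.
const EMAIL_PROPERTY = "JIRA_ROLE_ACCOUNT_EMAIL";
const TOKEN_PROPERTY = "JIRA_API_TOKEN";

/**
 * Builds an HTTP Basic Authorization header value from an email and API token.
 * The base64 encoder is passed in so the function runs in Apps Script or Node.
 */
function buildBasicAuthHeader(email, apiToken, base64Encode) {
  if (!email || !apiToken) {
    throw new Error("Missing Jira credentials; set them outside the source code.");
  }
  return "Basic " + base64Encode(email + ":" + apiToken);
}

/** Apps Script only: reads the credentials from script properties at run time. */
function getJiraAuthHeader() {
  const props = PropertiesService.getScriptProperties();
  return buildBasicAuthHeader(
    props.getProperty(EMAIL_PROPERTY),
    props.getProperty(TOKEN_PROPERTY),
    function (text) { return Utilities.base64Encode(text); }
  );
}
```

Because nothing secret appears in the file itself, the script can be copied to GitHub without first scrubbing a username:password pair.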
Why G Suite & Jira Cloud?
G Suite
• Already established at the Broad
• High level of user familiarity/skill already exists
• Cost covered by overhead already*
• Users able to prototype solutions quickly
• Metadata datasets are small
• Adequate feature set, i.e., can persist data,
flexible data types (+/-)
• Can share outside Broad easily
• SaaS/integrated into BITS architecture
• Developer (me) had easily transferable
skills/experience
• Lots of resources from which to learn and copy
Jira Cloud
• Already established at the Broad
• Some level of user familiarity/skill already exists
• Cost covered by overhead already
• Developer able to prototype solutions quickly
• Metadata datasets are small
• Adequate feature set, i.e., configurable workflows,
separable workflows by item type, custom fields
• SaaS/integrated into BITS architecture
• Developer (me) had some transferable
skills/experience and some good history to follow
• Lots of resources from which to learn and copy
Lots of resources to learn how to automate them!
• w3schools.com
• https://www.w3schools.com/js/default.asp
• Atlassian
• https://developer.atlassian.com/cloud/jira/platform/rest/v3/?utm_source=%2Fcloud%2Fjira%2Fplatform%2Frest&utm_medium=302
• https://developer.atlassian.com/server/jira/platform/jira-rest-api-examples/
• Stack Overflow
• https://stackoverflow.com
• Style guides
• https://www.w3schools.com/js/js_conventions.asp
• https://google.github.io/styleguide/jsguide.html
Design considerations
• Put code as close to where it needs to be as possible
• Google Sheet: Code to add menus to Google Sheets
• Google Forms or Google Sheets: Code to call Google Apps Script Module
• Google Apps Script module: Code to do extract/transform/load from Google
Sheet, upload to Jira Cloud, link Issues in Jira Cloud, etc.
• Use Google Forms to design intake forms for collaborators
• Use Google Sheets to store data from Google Forms and data necessary in
Issues in Jira
• Use Google Groups to establish role accounts for G Suite and Jira Cloud
• Multiple Boards in Jira for different views of the same data
• Prefer using each component for what it does best
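The split above (menu code in the Sheet, extract/transform/load in an Apps Script module) might look roughly like the sketch below. It is illustrative only: the project key, issue type, column names, custom field ID, site URL, and the JIRA_AUTH_HEADER script property are all hypothetical, and the custom field is assumed to be a single-line text field. The two functions marked "Apps Script only" use SpreadsheetApp, PropertiesService, and UrlFetchApp and will not run outside Google Apps Script.

```javascript
// Hypothetical configuration values, not taken from the Broad's Jira instance.
const JIRA_BASE_URL = "https://example.atlassian.net";
const JIRA_PROJECT_KEY = "SAMP";
const JIRA_ISSUE_TYPE = "Task";
const COLLABORATOR_FIELD_ID = "customfield_10042";

/** Extract: pairs every data row with the header row (first row of the sheet). */
function rowsToObjects(values) {
  const headers = values[0];
  return values.slice(1).map(function (cells) {
    const row = {};
    headers.forEach(function (header, i) { row[header] = cells[i]; });
    return row;
  });
}

/** Transform: turns one row object into a Jira Cloud create-issue request body. */
function rowToIssuePayload(row) {
  const fields = {
    project: { key: JIRA_PROJECT_KEY },
    issuetype: { name: JIRA_ISSUE_TYPE },
    summary: String(row["Sample ID"]).trim(),
  };
  fields[COLLABORATOR_FIELD_ID] = String(row["Collaborator"]).trim();
  return { fields: fields };
}

/** Apps Script only: adds a custom menu when the Google Sheet is opened. */
function onOpen() {
  SpreadsheetApp.getUi()
    .createMenu("Sample tracking")
    .addItem("Upload rows to Jira Cloud", "uploadRows")
    .addToUi();
}

/** Apps Script only: load step, one create-issue call per data row. */
function uploadRows() {
  const authHeader = PropertiesService.getScriptProperties()
    .getProperty("JIRA_AUTH_HEADER");
  const values = SpreadsheetApp.getActiveSheet().getDataRange().getValues();
  rowsToObjects(values).forEach(function (row) {
    UrlFetchApp.fetch(JIRA_BASE_URL + "/rest/api/3/issue", {
      method: "post",
      contentType: "application/json",
      headers: { Authorization: authHeader },
      payload: JSON.stringify(rowToIssuePayload(row)),
    });
  });
}
```

Keeping the extract and transform steps as plain functions means they can be checked without a live Sheet or Jira Cloud site; only the thin load step depends on either service.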
Keys for success (thus far)
• Not quarterly-driven pharma (time = $$$) so space to learn new things
• (Some) freedom to work on interesting and pressing issues
• Feasible to pick up required knowledge to deliver minimum viable product
• Culture of volunteerism (no one said I couldn’t work on it)
• Supportive environment for learning and applying agile techniques
• Iterative development/focus on minimum viable product
(or hang yourself)
Not all is copacetic
• Google Apps Script editor is a bit
primitive (I miss colors,
autocompletion)
• Using another editor seemingly
requires a lot of futzing that could be
better spent fixing bugs/delivering
features
• No integration to GitHub
• Must remember to NOT paste
username:password into GitHub :-|
• Calling code in GAS modules is
sloooow
• Google Sheets configuration can be
brittle (must know a priori about
columns and sheets)
• Have to know a lot about Jira Cloud
configuration to connect with it (e.g.,
custom field code ID)
• Our Jira Cloud instance configuration
needs housecleaning
• Versioning to be refined
• Security to be refined
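One way to soften the brittleness around columns is to look them up by header name and fail loudly when one is missing, rather than hard-coding column positions. The sketch below is an assumption about how that could be done, not a description of the existing code; the function name and the error message are hypothetical.

```javascript
/**
 * Maps each required column name to its zero-based index in the header row.
 * Throws a single error that names every missing column.
 */
function locateColumns(headerRow, requiredNames) {
  const positions = {};
  headerRow.forEach(function (name, index) {
    // Trim so that a stray space in a header cell does not break the lookup.
    positions[String(name).trim()] = index;
  });
  const missing = requiredNames.filter(function (name) {
    return !Object.prototype.hasOwnProperty.call(positions, name);
  });
  if (missing.length > 0) {
    throw new Error("Sheet is missing column(s): " + missing.join(", "));
  }
  const located = {};
  requiredNames.forEach(function (name) { located[name] = positions[name]; });
  return located;
}
```

With this in place, users can reorder or add columns in their Google Sheet without a code change; only renaming or deleting a required column stops the upload, and the error says which one.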
To do list
• GAS project setup refinements
• Need to not run in development mode
• Code changes
• Use API tokens per role account
• Checks on user permissions (does the
Google Sheets user have access to the
Jira Cloud Project?)
• Use Jira Cloud Webhook to call a Google
Cloud Function, then modify the passed
JSON object to call back to Jira Cloud to
extend Jira functionality (e.g., transition to a
new status once all required fields filled out)
• A developer’s guide would be useful
• Setup of role account via Google Group
and some BITS trickery
• Ensure role account given access to the
Jira Cloud Project
• What is the appropriate role in the Jira
Cloud Project/what permissions does it
need
• Add Google Group to BITS’ Google
automation
• What is best done in each component
• Remove password:username from code
before posting to GitHub (always!)
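The webhook idea on this list (transition an Issue once all required fields are filled out) could be sketched as below. Nothing here was built at the time of the talk, so treat it as an assumption-laden outline: the required field IDs and the transition ID are hypothetical, and the function only decides what request to send back to Jira Cloud, leaving the HTTP call and the Google Cloud Function wrapper out.

```javascript
// Hypothetical configuration: field IDs and transition ID depend on the Jira Cloud site.
const REQUIRED_FIELD_IDS = ["summary", "assignee", "customfield_10042"];
const READY_TRANSITION_ID = "31";

/** True when a field value from the webhook body counts as filled in. */
function isFilled(value) {
  if (value === null || value === undefined) return false;
  if (typeof value === "string") return value.trim() !== "";
  if (Array.isArray(value)) return value.length > 0;
  return true;
}

/**
 * Given the parsed JSON body of a Jira issue webhook, returns the transition
 * request to send back to Jira Cloud, or null if a required field is still empty.
 */
function planTransition(webhookBody) {
  const issue = webhookBody && webhookBody.issue;
  if (!issue || !issue.fields) return null;
  const complete = REQUIRED_FIELD_IDS.every(function (id) {
    return isFilled(issue.fields[id]);
  });
  if (!complete) return null;
  return {
    method: "POST",
    path: "/rest/api/3/issue/" + encodeURIComponent(issue.key) + "/transitions",
    body: { transition: { id: READY_TRANSITION_ID } },
  };
}
```

A real function would probably also check the Issue's current status before transitioning, since the transition itself can fire the webhook again.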
Silly things
• let vs var
• No warnings on changing values of const!?!?!
• Style guide, what’s that?
• Changing GAS library names == bad idea
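For anyone else tripping over the first two items, here is a small sketch of what a standards-compliant JavaScript engine (for example Node.js) does. The missing const warning noted above presumably reflects the Apps Script runtime available at the time, which is an assumption on my part; the function names are just for illustration.

```javascript
/** var is function-scoped: every callback sees the final value of i. */
function collectWithVar() {
  var callbacks = [];
  for (var i = 0; i < 3; i++) {
    callbacks.push(function () { return i; });
  }
  return callbacks.map(function (cb) { return cb(); });
}

/** let is block-scoped: each loop iteration gets its own binding of i. */
function collectWithLet() {
  const callbacks = [];
  for (let i = 0; i < 3; i++) {
    callbacks.push(function () { return i; });
  }
  return callbacks.map(function (cb) { return cb(); });
}

/** Returns the name of the error raised by reassigning a const, or null if none. */
function reassignConst() {
  const limit = 1;
  try {
    limit = 2; // a compliant engine throws here instead of silently continuing
  } catch (err) {
    return err.name;
  }
  return null;
}

console.log(collectWithVar()); // [ 3, 3, 3 ]
console.log(collectWithLet()); // [ 0, 1, 2 ]
console.log(reassignConst());  // TypeError
```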
Even with modest skills, you can still deliver value…
I started with OLD technical skills
• C++, Java, Perl, OLE Automation
• RCS, SourceSafe, ClearCase
• HP-UX, QNX, Red Hat (pre RHEL)
• Client server development
• Excel/Word/Outlook automation VBA
• ERP, MES, ELN, SDMS, LIMS
• Jira administration
and ended up updating them
• Google Apps Script/JavaScript
• Rudimentary GitHub
• Web development (GET, PUT)
• Debug in cloud
• G Suite, Jira REST API
• (Rudimentary) Google Cloud IAM
authentication with BITS infrastructure
Takeaways
• Current off-the-shelf technology allows for a serverless sample tracking solution (backed by a
lot of infrastructure)
• Low-cost solutions suit academic research because of the effects of overhead, and having
them removes the search for funding as a rate-limiting factor in accelerating science
• Developing the Minimum Viable Product, along with short-cycle/iterative delivery of solutions
to users, allows rapid feedback on what works to increase the velocity of science
• Meeting delivery deadlines builds faith that further iterations are worth the investment of
the project team’s time/focus
• Permission to invest time in learning a new skill not obviously in line with a job description
can move research forward by developing new capabilities to apply to problems
Acknowledgements
(even if they’d rather that their names not be listed)
• Broad Information Technology
Services (BITS)
• Scientific Computing Services
(SCS) group: Vicky Guo
(manager), Michelle Campo,
Eric Jones, Michael Kirby,
Anthony Losada, Peter Ragone,
Gordon Saksena
• Other BITS people:
Jared Bancroft, Lukas Karlsson,
Bill Mayo, Katie Shakun,
Andrew Teixeira, Elsa Tsao…
• Scientific collaborators
• Thomas Cleland, Danielle Dionne,
Joshua Gould, Zach Leber,
Yenarae Lee, Anna Neumann,
Jenna Pfiffner-Borges,
Anne Stevenson, Kendra West,
Alec Wysoker, Didi Vaz
• Broad alumni
• Sadiya Akasha, Marc Monnar,
Scott Rich
Thanks for listening (and being kind)
Learn the Broad Institute's best practices using the Atlassian tools
(Since it was on the advertisement!)
• Integration with Broad infrastructure (Single Sign On mostly)
• Understand our environment and tailor approach to it
• Flexible and ever-changing workforce (groups and personnel), i.e., graduate students,
postdocs, Associates, outside collaborators, interns, normal turnover, new groups, refactored
groups…
• A collection of semi-independent entities with a common goal
• Training, training, training
• Jira 101: Aimed at those new to Jira Cloud
• Jira 102: Jira Cloud Board and Project Administration
• Jira 201: Advanced Boards, JQL, Importing Issues, Mass Change (planned)
• Jira 301: Jira Cloud Project Administration (in development)
• Jira 401: Integrating Jira Cloud (planned)
Learn the Broad Institute's best practices using the Atlassian tools
(Since it was on the advertisement!)
• Standardish practices/models
• KISS, e.g., start simple (To Do, Doing, Done) and iterate (To Do, Doing, Checked, Done)
• Share as little as possible (what’s not possible to separate, e.g., Statuses, Issue Types, custom fields)
• Separate things as much as possible (e.g., Workflows, Notification Schemes)
• Keep things private by default (e.g., only add people to Projects instead of all users in your instance)
• Use a tool for what it is best suited to do, and not other things
• Sometimes ServiceNow, Trello, or Smartsheet might be better suited to people’s needs
• Deliver value in short increments
• Attempt to follow IT system management/software engineering best practices
• Ticketing system for Jira Cloud Project requests
• Change Requests for major changes
• Test plans for major changes

2019-04-17 Bio-IT World G Suite-Jira Cloud Sample Tracking

  • 1. Building a low-cost sample tracking system with G Suite & Jira Cloud What you can do with a little knowledge, a lot of ignorance, some time, and permission to take a boondoggle For Bio-IT World 2019/04/17 v1
  • 2. About the Broad Institute of MIT and Harvard • Propelling the understanding and treatment of disease • Collaborating deeply • Reaching globally • Empowering scientists • Building partnerships • Sharing data and knowledge • Promoting inclusion
  • 3. Take aways • Current off-the-shelf technology allows for a serverless sample tracking solution (backed by a lot of infrastructure) • Low-cost solutions in academic research are good due to the effects of overhead and having them removes finding sources of funding as a rate limiting factor for accelerating science • Developing the Minimum Viable Product, along with short cycle/iterative delivery of solutions to users, allows rapid feedback of what works to increase the velocity of science • Making delivery deadlines on time builds faith that further iterations are worth the investment of the project team’s time/focus • Permission to invest time into learning a new skill not obviously in line with a job description can move research forward by developing of new capabilities to apply to problems V fungible
  • 4. A little history… • 2014: I arrive at the Broad to work on solutions for management of laboratory/scientific data, divided into functions (graphic by Scott Sutherland)
  • 5. A little history • Turns out the biggest need: Where’s my stuff (i.e., samples, data)?
  • 6. One view of sample tracking at the Broad • The parable of the blind people and the elephantiformes
  • 7. One view of sample tracking at the Broad (columns: Sample lifecycle | Activity | Tracking systems) • Before physical samples received | Project launch; Find participants/samples; Ship sample kits to participants | Jira Cloud; Google Sheets/Google Forms; Consent systems; Smartsheet • Before processing | Store samples; Process samples prior to sequencing | Bespoke LIMS; COTS lab data management systems; Google Sheets; Jira Cloud • During processing (e.g., sequencing) | Sequencing at GP/elsewhere; Analysis by Proteomics | Bespoke LIMS; on-premises Jira • After processing | Data analysis; Data transfer | Google Sheets, Jira Cloud; Aspera, Trello • After initial use | Compare samples; Reuse samples | ?; Consent systems
  • 10. Components of a G Suite & Jira Cloud-based sample tracking system <name>@broadinstitute.org <name>@broad.mit.edu
  • 16. Suitable for all? • Discovery: Can experimental techniques produce data to answer scientific questions? • Scale Discovery: Scaling experimental techniques so they can more reliably produce data at a high rate • Data Production: Regularly producing experimental data and quality control data • Iterative Refinement: Refining production-scale processes; some level of change management is expected to ensure the quality of the data produced is maintained or improved • Early-stage technology development (e.g., PRISM) • Platform (e.g., DMX)
  • 17. Projects helped so far… (columns: Project | Things tracked | Approximate go-live date) • Comparative Medicine | Can’t tell (Issue security!) | 2018/01/01 • Firehose to FireCloud Migration | ~2800 | 2018/03/01 • Regev Lab (scRSP) | ~2700 | 2018/04/01 • Archive of Lines in Artificial Societies | ~500 | 2018/04/15 • NeuroGAP-Psychosis Ship Log | ~100 | 2018/05/01 • External Compound Request | ~50 | 2018/06/01 • Microbial Omics | ~950 | 2018/08/01 • Data Map Expansion | In planning stages
  • 18. Common factor? Each of these groups is piloting solutions with rapid iterations, applying Agile techniques to speed science • Sheila Dodge's Dynamic Work Design paper • Agile Academia (Broad Affinity Group) • Kendra West's The Agile Laboratory Handbook • Kendall Square Agilists & Agile Biotech Boston
  • 19. Development principles 1. Move science forward 2. Usability to encourage people to use it! 3. Low cost (i.e., no Jira Cloud add-ons, no outside labor) 4. Solution sustainable beyond initial development team 5. Deliver solutions to users in short time frames and rapidly iterate 6. Users in control as much as possible for shape of solution (e.g., layout of Google Sheet, which fields needed, columns in Jira Boards, etc.) 7. Have as little code as necessary/leave as much to other components as possible (e.g., VLOOKUP in Google Sheets) 8. Limit dependencies between components where possible (hah) 9. At least attempt to think about security (e.g., limit storage of credentials) 10. Document, document, document…
  • 20. Why G Suite & Jira Cloud? • G Suite: • Already established at the Broad • High level of user familiarity/skill already exists • Cost covered by overhead already* • Users able to prototype solutions quickly • Metadata datasets are small • Adequate feature set, i.e., can persist data, flexible data types (+/-) • Can share outside Broad easily • SaaS/integrated into BITS architecture • Developer (me) had easily transferable skills/experience • Lots of resources from which to learn and copy • Jira Cloud: • Already established at the Broad • Some level of user familiarity/skill already exists • Cost covered by overhead already • Developer able to prototype solutions quickly • Metadata datasets are small • Adequate feature set, i.e., configurable workflows, separable workflows by item type, custom fields • SaaS/integrated into BITS architecture • Developer (me) had some transferable skills/experience and some good history to follow • Lots of resources from which to learn and copy
  • 21. Lots of resources to learn how to automate them! • w3schools.com • https://www.w3schools.com/js/default.asp • Atlassian • https://developer.atlassian.com/cloud/jira/platform/rest/v3/?utm_source=%2Fcloud%2Fjira%2Fplatform%2Frest&utm_medium=302 • https://developer.atlassian.com/server/jira/platform/jira-rest-api-examples/ • Stack Overflow • https://stackoverflow.com • Style guides • https://www.w3schools.com/js/js_conventions.asp • https://google.github.io/styleguide/jsguide.html
  • 22. Design considerations • Put code as close to where it needs to be as possible • Google Sheet: Code to add menus to Google Sheets • Google Forms or Google Sheets: Code to call Google Apps Script Module • Google Apps Script module: Code to do extract/transform/load from Google Sheet, upload to Jira Cloud, link Issues in Jira Cloud, etc. • Use Google Forms to design intake forms for collaborators • Use Google Sheets to store data from Google Forms and data necessary in Issues in Jira • Use Google Groups to establish role accounts for G Suite and Jira Cloud • Multiple Boards in Jira for different views of the same data • Prefer using each component for what it does best
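The extract/transform/load step on this slide can be sketched roughly as follows. This is a minimal illustration, not the code used at the Broad: the column names, `FIELD_MAP`, the custom field IDs, the project key `DEMO`, and the issue type `Task` are all hypothetical. The payload shape follows the Jira Cloud REST API create-issue body (`fields.project.key`, `fields.issuetype.name`, `fields.summary`). The Apps Script calls appear only in comments so the sketch also runs outside Google Apps Script.

```javascript
// Hypothetical mapping from Google Sheet column headers to Jira field IDs.
// The customfield_* IDs are placeholders; real IDs depend on the Jira instance.
const FIELD_MAP = {
  'Sample ID': 'summary',
  'Tissue': 'customfield_10001',
  'Collaborator': 'customfield_10002',
};

// Transform one sheet row into a create-issue payload.
// Columns that are not in FIELD_MAP, and empty cells, are skipped.
function rowToIssuePayload(headers, row, projectKey, issueTypeName) {
  const fields = {
    project: { key: projectKey },
    issuetype: { name: issueTypeName },
  };
  headers.forEach((header, i) => {
    const fieldId = FIELD_MAP[header];
    const value = row[i];
    if (fieldId && value !== '' && value !== null && value !== undefined) {
      fields[fieldId] = String(value);
    }
  });
  return { fields };
}

// In Apps Script, `values` would typically come from
// SpreadsheetApp.getActiveSheet().getDataRange().getValues(), and each
// payload would be sent with UrlFetchApp.fetch(); neither call is made here.
const values = [
  ['Sample ID', 'Tissue', 'Collaborator', 'Notes'],
  ['S-001', 'Blood', 'Lab A', 'free text that is not uploaded'],
  ['S-002', '', 'Lab B', ''],
];
const [headers, ...rows] = values;
const payloads = rows.map((row) => rowToIssuePayload(headers, row, 'DEMO', 'Task'));
console.log(JSON.stringify(payloads, null, 2));
```

Keeping the transform a pure function, separate from the sheet read and the HTTP call, is one way to follow the "put code as close to where it needs to be" idea while still being able to test the mapping on its own.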
  • 23. Keys for success (thus far) • Not quarterly-driven pharma (time = $$$) so space to learn new things • (Some) freedom to work on interesting and pressing issues • Feasible to pick up required knowledge to deliver minimum viable product • Culture of volunteerism (no one said I couldn’t work on it) • Supportive environment for learning and applying agile techniques • Iterative development/focus on minimum viable product (or hang yourself)
  • 24. Not all is copacetic • Google Apps Script editor is a bit primitive (I miss colors, autocompletion) • Using another editor seemingly requires a lot of futzing; that time could be better spent fixing bugs/delivering features • No integration with GitHub • Must remember to NOT paste username:password into GitHub :-| • Calling code in GAS modules is sloooow • Google Sheets configuration can be brittle (must know a priori about columns and sheets) • Have to know a lot about Jira Cloud configuration to connect with it (e.g., custom field code ID) • Our Jira Cloud instance configuration needs housecleaning • Versioning to be refined • Security to be refined
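One way to soften the brittleness of knowing columns a priori is to look columns up by header name and fail early when the layout has drifted. The sketch below is an illustration under assumptions, not the talk's implementation: `buildColumnIndex` and the column names are hypothetical, and the header row is passed in as a plain array so the code runs outside Apps Script.

```javascript
// Build a Map from header name to column index, and throw a readable error
// if any required column is missing from the sheet's header row.
function buildColumnIndex(headerRow, requiredColumns) {
  const index = new Map();
  headerRow.forEach((name, i) => {
    const key = String(name).trim();
    // Keep the first occurrence if a header is accidentally duplicated.
    if (key !== '' && !index.has(key)) {
      index.set(key, i);
    }
  });
  const missing = requiredColumns.filter((name) => !index.has(name));
  if (missing.length > 0) {
    throw new Error('Missing required column(s): ' + missing.join(', '));
  }
  return index;
}

// Example: users may reorder columns or add new ones without breaking
// the script, as long as the required headers are still present.
const header = ['Collaborator', ' Tissue ', 'Sample ID', 'Notes'];
const cols = buildColumnIndex(header, ['Sample ID', 'Tissue']);
const row = ['Lab A', 'Blood', 'S-001', ''];
console.log(row[cols.get('Sample ID')], row[cols.get('Tissue')]);
```

This keeps users in control of the sheet layout while turning a silent mismatch into an explicit error message.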
  • 25. To do list • GAS project setup refinements • Need to not run in development mode • Code changes • Use API tokens per role account • Checks on user permissions (does the Google Sheets user have access to the Jira Cloud Project?) • Use a Jira Cloud Webhook to call a Google Cloud Function, then modify the passed JSON object to call back to Jira Cloud to extend Jira functionality (e.g., transition to a new status once all required fields are filled out) • A developer’s guide would be useful • Setup of role account via Google Group and some BITS trickery • Ensure role account is given access to the Jira Cloud Project • What is the appropriate role in the Jira Cloud Project/what permissions does it need? • Add Google Group to BITS’ Google automation • What is best done in each component • Remove password:username from code before posting to GitHub (always!)
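The webhook idea in this to-do list could be prototyped along these lines. This is a sketch under stated assumptions: the required field IDs, the transition ID `31`, and the site URL are hypothetical, and the Google Cloud Function wiring and authentication are left out. It assumes the webhook body carries the issue under `issue.key` and `issue.fields`, and it builds (without sending) a request for the Jira Cloud REST transitions endpoint.

```javascript
// Hypothetical configuration: field IDs and transition ID vary per Jira instance.
const REQUIRED_FIELDS = ['summary', 'customfield_10001', 'customfield_10002'];
const READY_TRANSITION_ID = '31';

// Return the IDs of required fields that are absent or empty on the issue.
function missingRequiredFields(issue, requiredFieldIds) {
  const fields = (issue && issue.fields) || {};
  return requiredFieldIds.filter((id) => {
    const value = fields[id];
    return value === undefined || value === null || value === '';
  });
}

// Decide what to send back to Jira Cloud for one webhook delivery.
// Returns null when the issue is not yet ready to be transitioned.
function buildTransitionRequest(baseUrl, webhookBody) {
  const issue = webhookBody.issue;
  if (missingRequiredFields(issue, REQUIRED_FIELDS).length > 0) {
    return null;
  }
  return {
    method: 'POST',
    url: baseUrl + '/rest/api/3/issue/' + encodeURIComponent(issue.key) + '/transitions',
    body: { transition: { id: READY_TRANSITION_ID } },
  };
}

// Example webhook bodies (trimmed to the parts this sketch reads).
const complete = {
  issue: {
    key: 'DEMO-7',
    fields: { summary: 'S-001', customfield_10001: 'Blood', customfield_10002: 'Lab A' },
  },
};
const incomplete = {
  issue: { key: 'DEMO-8', fields: { summary: 'S-002', customfield_10001: null } },
};
const readyRequest = buildTransitionRequest('https://example.atlassian.net', complete);
const notReadyRequest = buildTransitionRequest('https://example.atlassian.net', incomplete);
console.log(readyRequest, notReadyRequest);
```

Separating the decision from the HTTP call means the same function could be reused from a Cloud Function or from Apps Script, with the per-role-account API token supplied only at the point where the request is actually sent.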
  • 26. Silly things • let vs var • No warnings on changing values of const!?!?! • Style guide, what’s that? • Changing GAS library names == bad idea
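For readers who have not met the `let`/`var`/`const` surprises, here is one possible reading of the first two bullets, assuming a standard ES2015+ engine such as Node.js or the V8-based Apps Script runtime. The runtime available at the time of the talk may have behaved differently, so this is illustrative only; the function and variable names are made up.

```javascript
// var is scoped to the enclosing function; let is scoped to the enclosing block.
function scopeDemo() {
  if (true) {
    var functionScoped = 'declared with var';
    let blockScoped = 'declared with let';
  }
  // functionScoped is still visible here; blockScoped is not.
  return {
    varVisible: typeof functionScoped,
    letVisible: typeof blockScoped,
  };
}

// const protects the binding, not the contents of the object it points to.
function constDemo() {
  const sample = { id: 'S-001', status: 'To Do' };
  sample.status = 'Done'; // allowed, and no warning is given
  let reassignmentError = null;
  try {
    sample = { id: 'S-002' }; // reassigning the binding itself fails at runtime
  } catch (err) {
    reassignmentError = err.name;
  }
  return { status: sample.status, reassignmentError: reassignmentError };
}

console.log(scopeDemo());
console.log(constDemo());
```

If the contents also need protecting, `Object.freeze(sample)` is the usual next step, though it only freezes the top level of the object.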
  • 27. Even with modest skills, you can still deliver value… • I started with OLD technical skills: • C++, Java, Perl, OLE Automation • RCS, SourceSafe, ClearCase • HP-UX, QNX, Red Hat (pre-RHEL) • Client-server development • Excel/Word/Outlook automation (VBA) • ERP, MES, ELN, SDMS, LIMS • Jira administration • …and ended up updating them: • Google Apps Script/JavaScript • Rudimentary GitHub • Web development (GET, PUT) • Debug in cloud • G Suite, Jira REST API • (Rudimentary) Google Cloud IAM authentication with BITS infrastructure
  • 28. Takeaways • Current off-the-shelf technology allows for a serverless sample tracking solution (backed by a lot of infrastructure) • Low-cost solutions in academic research are good because of the effects of overhead; having them removes the search for funding as a rate-limiting factor for accelerating science • Developing the Minimum Viable Product, along with short-cycle/iterative delivery of solutions to users, allows rapid feedback on what works to increase the velocity of science • Meeting delivery deadlines builds faith that further iterations are worth the investment of the project team’s time/focus • Permission to invest time in learning a new skill not obviously in line with a job description can move research forward by developing new capabilities to apply to problems
  • 29. Acknowledgements (even if they’d rather that their names not be listed) • Broad Information Technology Services (BITS) • Scientific Computing Services (SCS) group: Vicky Guo (manager), Michelle Campo, Eric Jones, Michael Kirby, Anthony Losada, Peter Ragone, Gordon Saksena • Other BITS people: Jared Bancroft, Lukas Karlsson, Bill Mayo, Katie Shakun, Andrew Teixeira, Elsa Tsao… • Scientific collaborators • Thomas Cleland, Danielle Dionne, Joshua Gould, Zach Leber, Yenarae Lee, Anna Neumann, Jenna Pfiffner-Borges, Anne Stevenson, Kendra West, Alec Wysoker, Didi Vaz • Broad alumni • Sadiya Akasha, Marc Monnar, Scott Rich
  • 30. Thanks for listening (and being kind)
  • 31. Learn the Broad Institute's best practices using the Atlassian tools (Since it was on the advertisement!) • Integration with Broad infrastructure (Single Sign On mostly) • Understand our environment and tailor approach to it • Flexible and ever-changing workforce (groups and personnel), i.e., graduate students, postdocs, Associates, outside collaborators, interns, normal turnover, new groups, refactored groups… • A collection of semi-independent entities with a common goal • Training, training, training • Jira 101: Aimed at those new to Jira Cloud • Jira 102: Jira Cloud Board and Project Administration • Jira 201: Advanced Boards, JQL, Importing Issues, Mass Change (planned) • Jira 301: Jira Cloud Project Administration (in development) • Jira 401: Integrating Jira Cloud (planned)
  • 32. Learn the Broad Institute's best practices using the Atlassian tools (Since it was on the advertisement!) • Standardish practices/models • KISS, e.g., start simple (To Do, Doing, Done) and iterate (To Do, Doing, Checked, Done) • Share as little as possible (only what cannot be separated, e.g., Statuses, Issue Types, custom fields) • Separate things as much as possible (e.g., Workflows, Notification Schemes) • Keep things private by default (e.g., add only specific people to Projects instead of all users in your instance) • Use a tool for what it is best suited to do, and not other things • Sometimes ServiceNow, Trello, or Smartsheet might be better suited to people’s needs • Deliver value in short increments • Attempt to follow IT system management/software engineering best practices • Ticketing system for Jira Cloud Project requests • Change Requests for major changes • Test plans for major changes