Managing Data Flow Through the Barcoding Pipeline

Amy Driskell
Laboratories of Analytical Biology (LAB)
National Museum of Natural History
Smithsonian Institution
What is the “pipeline”?
[Diagram: pipeline stages Specimen, LIMS, Data QC, Data Deposition]
Outline
1. BEFORE the LIMS
2. LIMS
  – Data recorded
  – Exploring laboratory success/failure
  – Tracking project completion
3. Data QC
  – Criteria and data requirements
  – Checking for contamination and validity
Critical data management BEFORE
specimen enters the laboratory pipeline

• Data elements (“metadata”) necessary for laboratory
  processing:
   – Taxonomy, collection information, etc.


IMPORTANT!
• Assess laboratory successes/failures in light of this
  information
• Tailor/change lab protocols
Careful Metadata Collection at Specimen Collection or Harvest
• Metadata can be formatted at the beginning of a
  project (e.g. at specimen collection) to guarantee a
  smooth information transfer into the LIMS
• Multiple sources for metadata:
   – Spreadsheets
   – Field Information Management Systems (FIMS)
   – Museum databases
   – Fusion tables
Rockin’ It “Old School” -- Spreadsheets

• Modified BOLD specimen spreadsheet for use in field/museum
• Additional fields desired by PIs
• Modified easily to interface with multiple kinds of databases
• 96-well format – 2D barcoded tubes, extraction plates
• NOT directly connected to other databases, including LIMS
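
A spreadsheet laid out in 96-well format needs a consistent way to move between well labels and row positions. The following is a minimal sketch, assuming the common 8-row (A to H) by 12-column plate filled row by row; the function names are hypothetical and not part of the BOLD spreadsheet.

```python
ROWS = "ABCDEFGH"          # 8 plate rows
COLUMNS = range(1, 13)     # 12 plate columns


def well_to_index(well):
    """Convert a well label such as 'A1' or 'H12' to a 0-based index, row by row."""
    row, column = well[0].upper(), int(well[1:])
    if row not in ROWS or column not in COLUMNS:
        raise ValueError(f"not a 96-well position: {well!r}")
    return ROWS.index(row) * 12 + (column - 1)


def index_to_well(index):
    """Convert a 0-based index (0 to 95) back to a well label."""
    if not 0 <= index < 96:
        raise ValueError(f"index out of range for a 96-well plate: {index}")
    row, column = divmod(index, 12)
    return f"{ROWS[row]}{column + 1}"
```

With this layout, spreadsheet row 13 of a plate (index 12) corresponds to well B1, so tube barcodes scanned in plate order can be matched to specimen rows.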
An Elegant Solution: BiocodeMoorea FIMS
Actively connected to their LIMS

http://biocode.berkeley.edu/
bioValidator – cleaning up the collection of metadata
• Many aspects of metadata require specific formats:
  digital lat/long, meters, names
• bioValidator enforces adherence to formatting and
  other rules
• Photo matcher




http://biovalidator.sourceforge.net/
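
The kind of rule checking described above can be sketched in a few lines. This is not bioValidator's code or rule set; the field names (`decimal_latitude`, `decimal_longitude`, `elevation_m`, `genus`) and the rules themselves are assumptions chosen only to illustrate format enforcement.

```python
def validate_record(record):
    """Return a list of problems found in one specimen metadata record."""
    problems = []
    # Latitude and longitude must be decimal degrees within the valid range.
    for field, limit in (("decimal_latitude", 90.0), ("decimal_longitude", 180.0)):
        try:
            value = float(record.get(field, ""))
        except (TypeError, ValueError):
            problems.append(f"{field}: not a decimal number")
            continue
        if abs(value) > limit:
            problems.append(f"{field}: outside +/-{limit:g}")
    # Elevation is expected as a plain number of meters, without a unit suffix.
    try:
        float(record.get("elevation_m", ""))
    except (TypeError, ValueError):
        problems.append("elevation_m: not a number of meters")
    # A genus name must be present for downstream taxonomy checks.
    if not str(record.get("genus", "")).strip():
        problems.append("genus: missing")
    return problems
```

A record passes when the returned list is empty; otherwise each entry names the field to correct before the data move on to the LIMS.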
Museum Collection Databases
• Sampling directly from existing collections?
• Some museum databases cannot link directly
  to lab-based information systems (LIMS)
• Requires output from collection
  database, input into lab database – no
  automatic updates
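
When the collection database cannot link to the lab database, the export/import step usually comes down to renaming columns. A minimal sketch follows, assuming a CSV export; the column names on both sides of `COLUMN_MAP` are hypothetical and would need to match the actual museum and lab systems.

```python
import csv
import io

# Hypothetical mapping from museum export headers to lab database field names.
COLUMN_MAP = {
    "CatalogNumber": "specimen_id",
    "ScientificName": "taxon",
    "DateCollected": "collection_date",
}


def convert_export(export_text):
    """Read a museum CSV export and return rows keyed by lab database field names."""
    reader = csv.DictReader(io.StringIO(export_text))
    # Fail early if the export lacks a column the lab database needs.
    missing = [name for name in COLUMN_MAP if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"export is missing columns: {missing}")
    # Keep only the mapped columns, renamed for the lab database.
    return [{COLUMN_MAP[name]: row[name] for name in COLUMN_MAP} for row in reader]
```

Because there are no automatic updates, a conversion like this has to be rerun whenever the collection records change.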
Why?
1. Downstream insertion of data into other databases is simplified
2. Metadata has important uses in the lab
   • Determine possible causes of failure: taxonomy, collection
     event, specimen age
   • Adjust extraction or amplification protocols
   • Design new primers, e.g. for smaller fragments
Specimens enter the lab
Metadata enters the LIMS
[Diagram: pipeline stages Specimen & Metadata, LIMS, Data QC, Data Deposition]
What is a LIMS?
• An electronic lab “notebook” (aka database) to
  replace our traditional paper lab notebooks.
• Tracks a specimen through lab processes from
  extraction through to barcode sequence completion
  (data QC may use external software).
• Records every lab procedure.
• Provides information to guide further lab efforts –
  success rates, “redo” lists
• Records the physical location of extracts, etc.
My requirements for a LIMS
• I want a system that records every piece of
  information about each specimen/extract for
  which I produce a barcode sequence.
• I want my procedures and protocols to be
  transparent enough so that anyone can
  reproduce my results.
• This includes my QC procedures.
• Currently no good place to publish these data.
Data to be recorded
• Extraction: protocol, digestion time, etc.
• PCR: recipes, DNA concentration, cycling parameters, clean-up
  method (PCR machine, brand of enzyme, lot #)
• Gel photos
• Sequencing: recipe, clean-up, machine, etc.

• Bonus: success or failure can be mapped back to any
  of these recorded values. Maybe the Taq was bad?
  Or the PCR machine needs repair?
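
Mapping success back to a recorded value amounts to grouping reactions by that value and comparing pass rates. A minimal sketch, assuming each reaction is a dictionary with a boolean `passed` key plus whatever fields were recorded (the key names are illustrative, not taken from any particular LIMS):

```python
from collections import defaultdict


def success_rate_by(records, field):
    """Group reaction records by one recorded field and return the pass rate per value."""
    totals = defaultdict(lambda: [0, 0])  # value -> [passed, attempted]
    for record in records:
        counts = totals[record[field]]
        counts[0] += 1 if record["passed"] else 0
        counts[1] += 1
    return {value: passed / attempted for value, (passed, attempted) in totals.items()}
```

Calling `success_rate_by(reactions, "taq_lot")` or `success_rate_by(reactions, "pcr_machine")` would show whether failures cluster on one enzyme lot or one thermocycler; the same function works for metadata fields such as family or collection event.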
• A LIMS can be homegrown (like LAB’s barcoding LIMS or SI’s
  plant barcoding LIMS): relatively simple relational databases
• Or sophisticated and commercially produced: the Geneious
  plug-in MooreaBiocode LIMS (the plug-in is free)

• Software is updated and maintained
• Plugs into the Geneious data analysis software




   http://software.mooreabiocode.org
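
To make "relatively simple relational database" concrete, here is a minimal sketch of a homegrown schema using Python's built-in `sqlite3` module. The tables and columns are assumptions for illustration only; they are not the schema of LAB's LIMS or of the Biocode plug-in.

```python
import sqlite3

# Hypothetical three-table layout: specimen -> extraction -> PCR attempts.
SCHEMA = """
CREATE TABLE specimen (
    specimen_id TEXT PRIMARY KEY,
    taxon       TEXT
);
CREATE TABLE extraction (
    extraction_id TEXT PRIMARY KEY,
    specimen_id   TEXT NOT NULL REFERENCES specimen(specimen_id),
    protocol      TEXT,
    plate         TEXT,
    well          TEXT
);
CREATE TABLE pcr (
    pcr_id        INTEGER PRIMARY KEY,
    extraction_id TEXT NOT NULL REFERENCES extraction(extraction_id),
    primer_pair   TEXT,
    taq_lot       TEXT,
    passed        INTEGER
);
"""


def build_lims(connection):
    """Create the tables on an open SQLite connection and return it."""
    connection.executescript(SCHEMA)
    return connection


def pcr_history(connection, specimen_id):
    """Return every PCR attempt for one specimen, in the order it was entered."""
    query = """
        SELECT pcr.pcr_id, pcr.primer_pair, pcr.taq_lot, pcr.passed
        FROM pcr JOIN extraction USING (extraction_id)
        WHERE extraction.specimen_id = ?
        ORDER BY pcr.pcr_id
    """
    return connection.execute(query, (specimen_id,)).fetchall()
```

Keeping each lab step in its own table, linked by identifiers, is what lets a specimen be followed from extraction through every later reaction.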
Workflow
Mapping workflow elements to success
Tracking project progress & identifying next steps
• Which specimens have completed barcodes?
• Which specimens need additional labwork?
• Which specimens should be abandoned?
• Where are the original DNA extracts or tissue samples?
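
The first three questions can be answered from LIMS records with a simple classification. The sketch below is one possible rule, not the one used at LAB: the slides do not say how a specimen is judged abandoned, so the `max_attempts` cutoff and the record keys are assumptions.

```python
def next_step(specimen, max_attempts=3):
    """Classify one specimen as 'complete', 'redo', or 'abandon'."""
    if specimen["has_barcode"]:
        return "complete"
    # Assumed rule: stop after a fixed number of failed attempts.
    if specimen["attempts"] >= max_attempts:
        return "abandon"
    return "redo"


def progress_report(specimens, max_attempts=3):
    """Sort specimen IDs into the three categories for a project overview."""
    report = {"complete": [], "redo": [], "abandon": []}
    for specimen in specimens:
        report[next_step(specimen, max_attempts)].append(specimen["specimen_id"])
    return report
```

The "redo" list is the starting point for the next round of labwork, and it can be joined to recorded extract locations to answer the fourth question.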
Project Progress
Raw data enters the QC process
[Diagram: pipeline stages Specimen, LIMS, Data QC, Data Deposition]
Data QC
• OUTSIDE of LIMS database
• “Clean up” raw data – trim, examine quality
• Assemble passed traces into a contig for each
  specimen
• Examine/edit contigs
• Check validity of resulting sequences
My data QC ethos
• All criteria for each step of data analysis are
  recorded
• For raw trace processing: trimming
  criteria, length and quality requirements, binning
  criteria
• For assembly: assembly parameters, product
  length, etc.
• Hand editing is minimized*
• It would be possible for anyone to recreate the
  barcode sequence
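
One way to make the criteria recoverable is to store them as a small structured record alongside each barcode sequence. This is a minimal sketch; the parameter names and the choice of JSON are assumptions, not settings taken from any specific software.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class QCCriteria:
    """The settings someone would need to recreate a barcode sequence (hypothetical names)."""
    trim_error_probability: float     # trimming criterion for raw traces
    min_read_length: int              # length requirement after trimming
    min_percent_high_quality: float   # quality requirement after trimming
    assembly_min_overlap: int         # assembly parameter
    assembly_min_identity: float      # assembly parameter


def criteria_to_json(criteria):
    """Serialize the criteria so they can be stored next to the sequence."""
    return json.dumps(asdict(criteria), sort_keys=True)
```

Storing the serialized record with every sequence means the analysis can be repeated later with exactly the same settings.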
Any DNA sequence analysis software can be used for data QC

• Sequencher (Gene Codes) and Geneious (Biomatters)
   – Trim ends of raw sequences with adjustable criteria, explore
     effects of trim criteria
   – Discard short or poor sequences
   – Assemble trimmed reads with stringent, but adjustable criteria
   – Output completed sequences
• Geneious LIMS is plugged into the data analysis software
   – direct communication
   – binning*
• Sequencher data must be exported and imported into LIMS
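
The trim-then-discard step can be illustrated with a deliberately simple rule: remove bases from each end until a base meets a quality threshold, then reject reads that end up too short. This is a simplified stand-in, not the trimming algorithm used by Sequencher or Geneious, and the default thresholds are assumptions.

```python
def trim_ends(bases, qualities, min_quality=20):
    """Trim low-quality bases from both ends of a read; return (bases, qualities)."""
    start, end = 0, len(bases)
    # Advance from the 5' end until a base meets the threshold.
    while start < end and qualities[start] < min_quality:
        start += 1
    # Retreat from the 3' end until a base meets the threshold.
    while end > start and qualities[end - 1] < min_quality:
        end -= 1
    return bases[start:end], qualities[start:end]


def passes_length(bases, min_length=100):
    """Keep a trimmed read only if enough sequence remains."""
    return len(bases) >= min_length
```

Rerunning `trim_ends` with different `min_quality` values on the same reads is a quick way to explore the effects of the trim criteria before settling on one.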
Data analysis

Here are the traces. You can see some FIMS data in the document fields
(e.g. identified by, tissue id). You will also notice a binning column
(see the following slide).
Binning
Automatic categorization of reads and assemblies

• Change binning parameters, examine effects
• Trimming and assembly dialog boxes are similar
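
Binning can be thought of as a threshold test on a few summary statistics. The sketch below assumes two statistics (read length and percent high-quality bases) and three bins; the bin names and default thresholds are illustrative, not the plug-in's defaults.

```python
def bin_read(length, percent_high_quality, high=(500, 90.0), medium=(300, 70.0)):
    """Assign a read or assembly to 'high', 'medium', or 'low'.

    Each threshold pair is (minimum length, minimum percent high-quality bases).
    """
    for name, (min_length, min_hq) in (("high", high), ("medium", medium)):
        if length >= min_length and percent_high_quality >= min_hq:
            return name
    return "low"
```

Because the thresholds are ordinary parameters, the effect of changing them can be examined by re-binning the same reads and comparing the counts in each bin.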
Final Steps:
  Is it a contaminant? Is it identified correctly?

• A number of procedures for identifying
  contamination or incorrect identification
   – BLASTingdatabase of known contaminants; Genbank;
     BOLD
   – Quick and dirty assembly tests
   – NJ trees
   – Geneious taxonomy verification tool
Verify Taxonomy
• BLASTs your sequences
• Gets the NCBI taxonomy for the best hit(s)
• Compares to the taxonomy from the FIMS
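
The comparison step can be sketched as finding the most specific rank at which the best hit and the FIMS record still agree. This is not the Geneious tool's implementation; it assumes both taxonomies are already available as dictionaries keyed by rank (the BLAST search and NCBI lookup are outside the sketch), and the list of ranks is an assumption.

```python
RANKS = ("phylum", "class", "order", "family", "genus")


def deepest_matching_rank(fims_taxonomy, hit_taxonomy):
    """Return the most specific rank down to which both taxonomies agree, or None."""
    matched = None
    for rank in RANKS:
        expected = fims_taxonomy.get(rank, "").strip().lower()
        observed = hit_taxonomy.get(rank, "").strip().lower()
        # Stop at the first rank that is missing or differs.
        if not expected or expected != observed:
            break
        matched = rank
    return matched
```

A result of `None` (disagreement at phylum) suggests contamination, while agreement that stops at order or family is more likely a misidentification or a gap in the reference database.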
Good, clean, barcode sequences
  • Feed back into LIMS*
      – Monitor progress
      – Connect sequences and traces to specimen data
  • Prepare for output to databases: GenBank or
    BOLD upload packages
[Diagram: pipeline stages Specimen, LIMS & Data QC, Data Deposition]
Positive Information Flow from field or
     museum to final data deposition

1. Collect metadata to flow easily into LIMS and
   other databases
2. Record all aspects of all laboratory procedures
   (LIMS)
3. Use the LIMS for reporting, protocol
   investigation, and monitoring of project progress
4. Input information and data from QC procedures
   into LIMS*
5. The LIMS outputs upload packages for public
   databases

Editor's Notes

  1. An example workflow. This workflow was very straightforward: everything worked the first time, so we didn’t have to rerun anything. Reaction templates and cocktails are on the left, reaction thermocycles on the right.