Notes from Nature
Citizen Science data transcription

Peter Oboyski, Jun Ying Lim, Joyce Gross,
Chris Snyder*, Arfon Smith*, Joanie Ball,
Kip Will, Rosemary Gillespie
Essig Museum of Entomology
* Zooniverse Citizen Science Alliance
How does it work?
• Introduction to CalBug
• What is Zooniverse?
• What do we provide?
• What happens online?
• What do we get back?
• Technical issues
• Maintaining interest
• How can you get involved?
What is CalBug?
NSF - ADBC grant
Collaboration among the eight major entomology collections in California
Digitize 1.2 million specimens
Essig Museum of Entomology
California Academy of Sciences
California State Collection of Arthropods
Bohart Museum, UC Davis
Entomology Research Museum, UC Riverside
San Diego Natural History Museum
Santa Barbara Museum of Natural History
LA County Museum
Stephen Dowlan

CalPhotos
MySQL database

Berkeley Mapper
http://calbug.berkeley.edu
Berkeley Natural History Museums

• In development
– Integrating point data (specimen records) with habitat, range maps, elevation, climate, etc.
– Historical recreation of the environment
– Predict potential impacts of environmental change
– Facilitate land use/management decisions
Digitization workflow
• Handling & Imaging
– (Optional) Sort by locality, date, sex, etc.
– Remove labels, add unique identifier
– Take digital image, name and save file
– Replace labels, return to collection
• Data Capture
– Manually enter data into MySQL database
– Online crowd-sourcing of manual data entry
– Optical Character Recognition (OCR) & automated data parsing
• Data Manipulation
– Aggregate data in online cache
– Error checking
– Geographic referencing
– Temporospatial analyses
Why Image Labels?
• Magnify difficult-to-read labels
• Verbatim archive of label data
– Essential for proofing data
– Useful for taxonomists interested in label data

• Data capture can be done remotely
Digital camera tethered to computer
Average 50-55 images per hour
Including imaging, file renaming, and upload

Filename = EMEC218958 Paracotalpa ursina.jpg
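
The filename convention above can be read back by a script. The following is a minimal sketch, assuming every filename follows the pattern of the single example shown (an uppercase collection prefix plus digits, a space, then the taxon name); the function name and the regular expression are illustrative and not part of the CalBug tooling.

```python
import re
from pathlib import Path

# Assumed pattern, inferred from the one example filename on the slide:
# uppercase prefix + digits (catalog number), whitespace, then the taxon name.
FILENAME_RE = re.compile(r"^(?P<catalog>[A-Z]+\d+)\s+(?P<taxon>.+)$")


def parse_image_filename(filename):
    """Split an image filename into (catalog number, taxon name)."""
    stem = Path(filename).stem  # drop the .jpg extension
    match = FILENAME_RE.match(stem)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    return match.group("catalog"), match.group("taxon")


catalog, taxon = parse_image_filename("EMEC218958 Paracotalpa ursina.jpg")
```

Raising an error on a non-matching name, rather than guessing, keeps misnamed files from entering the database silently.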
Slide Scanning
average 150 slides per hour
including scan, file renaming, and upload
400 DPI
Seems to provide high enough resolution for difficult-to-read labels while keeping file size relatively small
But not high enough resolution for taxonomic work
Using Citizen Scientists to transcribe label data
http://www.notesfromnature.org/

Launched April 22, 2013
Images in, transcriptions out
• We supply JPEG images
– 400 DPI (300 DPI good)
– Deposited as zip file
– Stored in Amazon Cloud

• In development
– Automated service to upload images to Amazon Cloud
– Be able to prioritize image set

• Zooniverse provides
– MongoDB data dump
– 1 record = 1 transcription
– 4 transcriptions / image

• In development
– Automated daily dump
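
Because each record in the dump is one transcription and each image receives four, a first processing step is to group records by image. The sketch below assumes the dump has already been loaded as a list of dictionaries and that each record carries an `image_id` key; that field name is hypothetical, since the slides do not show the dump's schema.

```python
from collections import defaultdict


def group_by_image(records, expected=4):
    """Group transcription records by image and separate complete sets.

    `records` is a list of dicts; the "image_id" key is an assumed field name.
    Returns (complete, pending): images with at least `expected`
    transcriptions, and images still waiting for more.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record["image_id"]].append(record)
    complete = {k: v for k, v in grouped.items() if len(v) >= expected}
    pending = {k: v for k, v in grouped.items() if len(v) < expected}
    return complete, pending


# Made-up records for illustration only.
sample = [{"image_id": "A", "county": "Kern"} for _ in range(4)]
sample.append({"image_id": "B", "county": "Inyo"})
complete, pending = group_by_image(sample)
```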
Reconciling transcriptions
• Drop-down lists (Country, State, County, Date) are compared for exact match
– Occasionally missing, sometimes wrong
– Majority rule

• Free-form text fields (Locality, Collectors) are much more problematic
– Transcribers asked to record label data verbatim
– Punctuation, capitalization, spacing between words
– Misspelling, expanding abbreviations, interpretations
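
The majority rule for drop-down fields can be sketched in a few lines. This is an illustration under stated assumptions, not the project's script: missing answers are represented as `None` or an empty string, and a value wins only if more than half of all transcriptions for the image agree on it exactly.

```python
from collections import Counter


def majority_value(values):
    """Return (winning value, agreement) for one drop-down field.

    `values` holds one entry per transcription; None or "" means missing.
    Agreement is the winner's share of all transcriptions. The winner is
    None when no value is held by a strict majority (an assumed threshold).
    """
    answered = [v for v in values if v]  # drop missing entries
    if not answered:
        return None, 0.0
    value, count = Counter(answered).most_common(1)[0]
    agreement = count / len(values)
    if count * 2 <= len(values):
        return None, agreement  # tie or minority: leave for manual vetting
    return value, agreement


state, state_agreement = majority_value(["California", "California", "California", ""])
county, county_agreement = majority_value(["Kern", "Kern", "Tulare", None])
```

Returning the agreement alongside the value lets low-agreement fields be queued for review instead of being accepted outright.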
Reconciling transcriptions
• Developing scripts in R to reconcile free-form text

• Text matching for maximum correspondence among multiple transcriptions (cf. DNA alignment methods)
• Final result = 1 transcription in our database with links to the 4 original transcriptions, marked as Citizen Science transcribed record
• Vetting by CalBug personnel still necessary, but we can prioritize based on record-matching confidence scores
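
One simple way to picture free-form reconciliation with a confidence score is shown below. It is a Python stand-in, not the R scripts mentioned above, and it swaps in a simpler technique than sequence alignment: each transcription is normalized (case, punctuation, spacing), compared pairwise with `difflib.SequenceMatcher`, and the transcription most similar on average to the others is kept. The normalization rules and the use of mean similarity as the confidence score are assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, strip punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def reconcile(transcriptions):
    """Pick the transcription closest to all others; return (text, confidence).

    Confidence is the mean similarity (0 to 1) between the chosen
    transcription and each of the others, after normalization.
    """
    if not transcriptions:
        raise ValueError("no transcriptions to reconcile")
    if len(transcriptions) == 1:
        return transcriptions[0], 1.0
    normalized = [normalize(t) for t in transcriptions]
    best_index, best_score = 0, -1.0
    for i, candidate in enumerate(normalized):
        ratios = [
            SequenceMatcher(None, candidate, other).ratio()
            for j, other in enumerate(normalized)
            if j != i
        ]
        score = sum(ratios) / len(ratios)
        if score > best_score:  # first of equally good candidates is kept
            best_index, best_score = i, score
    return transcriptions[best_index], best_score


# Made-up transcriptions of one locality label, for illustration only.
text, confidence = reconcile([
    "Berkeley, Alameda Co.",
    "Berkeley Alameda Co",
    "berkeley,  alameda co.",
    "Berkley, Alameda County",
])
```

Records could then be sorted by ascending confidence so that staff vet the least consistent ones first.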
Generating & Maintaining Interest
Number of Notes from Nature transcriptions for CalBug
Generating & Maintaining Interest
• Popular media, social media, and press releases
– Only so many occasions for a press release

• Campaigns
– Highlight particular taxa, habitats, geographic regions

• Education
– High quality, high resolution photo of species transcribed
– Create links to other services to learn more about species

• Competitions
– Prizes are worth more than badges
– However, need to watch for bad data in pursuit of prize
How can you get involved?
• Right now you cannot
• iDigBio is interested in getting involved
• iDigBio hosting a hackathon in December
• Begin building up collections of images
Thank you
And a HUGE thank you to the
CalBug Army
who image our specimens
Chris Amy, Maritess Aristorenas, Jazmin Calderon, Alex Carolina, Sonia Castillo, Matthew Chan, Sabina Cook, Alex Darwish, John Davie, Jesson Go, Nick
Grady-Grote, Ginger Haight, Laura Hayes, Dennis Ho, Aubrey Huey, Leah Humphreys, Veronica Hurd, Hanna Huynh, Eseosa Igbinedion, Ilona Istenes, Emma
Kohlsmith, Asia Kwan, Tiffany Kyo, Jerry Lee, Ken Lee, Christina Lew, Maggie Lewis, Alex Lim, Derick Matano, Christian Munevar, Frank Ngo, Kent Nguyen,
Minh Nguyen, Riley O'Brien, Marielle Pinheiro, Rammonhan Reddy, Jessica Rothery, Stacey Rutherford, Anna Szendrenyi, Anni Sheh, Hannah Shin, Erika So,
Mee Thao, Cindy Truong, Darleen Tu, Skyler Valle, Daug Vaughn, Hayden Wong, Yiu Kei Wong, Keane Yang, Kevin Yao, Frances Zhang

Oboyski ecn2013

  • 1. Notes from Nature Citizen Science data transcription Peter Oboyski, Jun Ying Lim, Joyce Gross, Chris Snyder*, Arfon Smith*, Joanie Ball, Kip Will, Rosemary Gillespie Essig Museum of Entomology * Zooniverse Citizen Science Alliance
  • 2.
  • 3. How does it work? • • • • • • • • Introduction to CalBug What is Zooniverse? What do we provide? What happens online? What do we get back? Technical issues Maintaining interest How can you get involved?
  • 4. What is CalBug? NSF - ADBC grant Collaboration among the eight major entomology collections in California Digitize 1.2 million specimens Essig Museum of Entomology California Academy of Sciences California State Collection of Arthropods Bohart Museum, UC Davis Entomology Research Museum, UC Riverside San Diego Natural History Museum Santa Barbara Museum of Natural History LA County Museum
  • 5. Stephen Dowlan CalPhotos MySQL database Berkeley Mapper http://calbug.berkeley.edu
  • 6. Berkeley Natural History Museums • In development – Integrating point data (specimen records) with Habitat, Range maps, Elevation, Climate, etc. – Historical recreation of the environment – Predict potential impacts of environmental change – Facilitate land use/management decisions
  • 7. Digitization workflow • Handling & Imaging – (Optional) Sort by locality, date, sex, etc. – Remove labels, add unique identifier – Take digital image, name and save file – Replace labels, return to collection • Data Capture – Manually enter data into MySQL database – Online crowd-sourcing of manual data entry – Optical Character Recognition (OCR) & automated data parsing • Data Manipulation – Error checking – Geographic referencing – Aggregate data in online cache – Temporospatial analyses
  • 8. Why Image Labels? • Magnify difficult to read labels • Verbatim archive of label data – Essential for proofing data – Useful for taxonomists interested in label data • Data capture can be done remotely
  • 9. Digital camera tethered to computer • Average 50-55 images per hour, including imaging, file renaming, and upload • Filename = EMEC218958 Paracotalpa ursina.jpg
  • 10. Slide scanning • Average 150 slides per hour, including scanning, file renaming, and upload
  • 11. 400 DPI seems to provide high enough resolution for difficult-to-read labels while keeping file size relatively small
  • 12. But not high enough resolution for taxonomic work
  • 13. Using citizen scientists to transcribe label data
  • 14.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24. Images in, transcriptions out • We supply JPEG images – 400 DPI (300 DPI good) – Deposited as zip file – Stored in Amazon Cloud • In development – Automated service to upload images to Amazon Cloud – Be able to prioritize image set • Zooniverse provides – MongoDB data dump – 1 record = 1 transcription – 4 transcriptions / image • In development – Automated daily dump
  • 25. Reconciling transcriptions • Drop-down lists (Country, State, County, Date) are compared for exact match – Occasionally missing, sometimes wrong – Majority rule • Free-form text fields (Locality, Collectors) are much more problematic – Transcribers asked to record label data verbatim – Punctuation, capitalization, spacing between words – Misspelling, expanding abbreviations, interpretations
  • 26. Reconciling transcriptions • Developing scripts in R to reconcile free-form text • Text matching for maximum correspondence among multiple transcriptions (cf. DNA alignment methods) • Final result = 1 transcription in our database with links to the 4 original transcriptions marked as Citizen Science transcribed record • Vetting by CalBug personnel still necessary, but we can prioritize based on record-matching confidence scores
  • 27. Generating & Maintaining Interest Number of Notes from Nature transcriptions for CalBug
  • 29. Generating & Maintaining Interest • Popular media, social media, and press releases – Only so many occasions for a press release • Campaigns – Highlight particular taxa, habitats, geographic regions • Education – High quality, high resolution photo of species transcribed – Create links to other services to learn more about species • Competitions – Prizes are worth more than badges – However, need to watch for bad data in pursuit of prize
  • 30. How can you get involved? • Right now you cannot • iDigBio is interested in getting involved • iDigBio hosting a hackathon in December • Begin building up collections of images
  • 31. Thank you And a HUGE thank you to the CalBug Army who image our specimens Chris Amy, Maritess Aristorenas, Jazmin Calderon, Alex Carolina, Sonia Castillo, Matthew Chan, Sabina Cook, Alex Darwish, John Davie, Jesson Go, Nick Grady-Grote, Ginger Haight, Laura Hayes, Dennis Ho, Aubrey Huey, Leah Humphreys, Veronica Hurd, Hanna Huynh, Eseosa Igbinedion, Ilona Istenes, Emma Kohlsmith, Asia Kwan, Tiffany Kyo, Jerry Lee, Ken Lee, Christina Lew, Maggie Lewis, Alex Lim, Derick Matano, Christian Munevar, Frank Ngo, Kent Nguyen, Minh Nguyen, Riley O'Brien, Marielle Pinheiro, Rammonhan Reddy, Jessica Rothery, Stacey Rutherford, Anna Szendrenyi, Anni Sheh, Hannah Shin, Erika So, Mee Thao, Cindy Truong, Darleen Tu, Skyler Valle, Daug Vaughn, Hayden Wong, Yiu Kei Wong, Keane Yang, Kevin Yao, Frances Zhang
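
The reconciliation approach on slides 25 and 26 can be illustrated with a short sketch. The CalBug scripts are written in R and are not shown in the slides, so the Python below is only an assumption about how majority rule and text matching might be combined. The function names, the normalization rules, and the use of `difflib.SequenceMatcher` as the similarity measure are all hypothetical choices for illustration, not the authors' method.

```python
from collections import Counter
from difflib import SequenceMatcher
import re


def majority_vote(values):
    """Majority rule for a drop-down field.

    Returns (winner, share), where share is the fraction of ALL
    transcriptions (including missing ones) that agree with the winner.
    """
    present = [v for v in values if v]
    if not present:
        return None, 0.0
    winner, count = Counter(present).most_common(1)[0]
    return winner, count / len(values)


def normalize(text):
    """Remove differences in punctuation, capitalization, and spacing."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def reconcile_free_text(values):
    """Pick the transcription that agrees best with all the others.

    Returns (winner, confidence), where confidence is the winner's mean
    similarity (0 to 1) to the other normalized transcriptions.
    """
    present = [v for v in values if v and v.strip()]
    if not present:
        return None, 0.0
    if len(present) == 1:
        return present[0], 0.0  # nothing to corroborate it
    norm = [normalize(v) for v in present]
    best_index, best_score = 0, -1.0
    for i, candidate in enumerate(norm):
        ratios = [
            SequenceMatcher(None, candidate, other).ratio()
            for j, other in enumerate(norm)
            if j != i
        ]
        score = sum(ratios) / len(ratios)
        if score > best_score:  # ties keep the earliest transcription
            best_index, best_score = i, score
    return present[best_index], best_score


# Four transcriptions of one label (made-up example values).
states = ["California", "California", "", "Calfornia"]
localities = [
    "Berkeley, Strawberry Cyn.",
    "Berkeley Strawberry Cyn",
    "berkeley, strawberry cyn.",
    "Berkeley, Strawberry Canyon",
]
state, state_share = majority_vote(states)
locality, locality_conf = reconcile_free_text(localities)
```

A score like `locality_conf` could serve as the record-matching confidence used to prioritize vetting: records whose transcriptions disagree strongly would be reviewed first. Returning one of the original transcriptions, rather than a merged string, keeps the result verbatim to what a transcriber actually typed.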

Editor's Notes

  1. Notes from Nature is a collaboration between Zooniverse, a citizen science portal that hosts a number of citizen science projects with a very large following, and CalBug, SERNEC (SouthEast Regional Network of Expertise and Collections), and the ornithology collection of the Natural History Museum, London.
  2. The site went live while I was at the iDigBio meeting at the Field Museum in April. Since that time we have surpassed a quarter million transcriptions by over 3,500 citizen scientists.
  3. CalBug is an NSF-ADBC collaborative project among the eight major arthropod collections in California to digitize over one million specimens from our combined collections. Although we are collecting all the data together in a single cache and sharing techniques and workflows, each museum has developed its own approach based on the people and resources they have available. Therefore, what I am presenting is the approach we use at the Essig Museum, which may be somewhat different from the other institutions.
  4. The goal is to make California arthropod diversity data available online through our own web service as well as through aggregators such as GBIF.
  5. Our workflow for digitization can be broken down into three general categories. First is specimen handling and imaging where we remove the labels (from pinned specimens), add unique identifiers (we use datamatrix barcodes), and image the labels placed next to the specimens. Next we capture data from the images either with our own people directly in our own MySQL database, or through our citizen science project, Notes from Nature. We are also looking into ways to incorporate OCR into data capture. Finally, the data are proofed, georeferenced, aggregated and analyzed.
  6. During the iDigBio meeting at the Field Museum in Chicago in April I learned that although many institutions are doing some form of imaging, hardly any were using the images as part of their databasing workflow! Personally I see an overwhelming benefit to imaging the individual specimens with their labels.
  7. Here is an example of one of our pinned specimens. We use a digital camera tethered to a computer. Using IrfanView software to batch-process image files, we rename each file to include the unique identifier, genus, and species name. Although the genus and species name may change for this specimen over time, it is critical that these elements are in the filename for fast and efficient management of image files.
  8. And now … slide scanning
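
The filename convention from slide 9 and note 7 (unique identifier, genus, species) can be handled programmatically when managing image files. The sketch below is illustrative only: the regular expression is inferred from the single example filename `EMEC218958 Paracotalpa ursina.jpg`, and the function names and dictionary keys are hypothetical. Real filenames may include cases this pattern does not cover, such as subspecies or specimens identified only to genus.

```python
import re
from pathlib import Path

# Assumed pattern: catalog prefix + digits, then genus, then species,
# separated by single spaces (inferred from one example filename).
FILENAME_RE = re.compile(
    r"^(?P<catalog>[A-Z]+\d+) (?P<genus>[A-Z][a-z]+) (?P<species>[a-z-]+)$"
)


def parse_image_filename(filename):
    """Split an image filename into catalog number, genus, and species.

    Returns a dict, or None if the name does not follow the assumed pattern.
    """
    match = FILENAME_RE.match(Path(filename).stem)
    return match.groupdict() if match else None


def build_image_filename(catalog, genus, species, ext=".jpg"):
    """Assemble a filename in the same identifier-genus-species order."""
    return f"{catalog} {genus} {species}{ext}"


record = parse_image_filename("EMEC218958 Paracotalpa ursina.jpg")
rebuilt = build_image_filename(**record)
```

Because the identifier comes first, a plain alphabetical sort of the image folder also sorts by catalog number, while the genus and species remain searchable by name.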