Digitization: Image Capture and Workflow
Martin Kalfatovic, Keri Thompson &
Connie Rinaldo
Collection Management Cycle
Selection → Refinement → Digitization → Curation → Use → Selection
• Communication
• Workflow has become more complicated
• Difficulty finding books that are easy to scan
• Reviewing titles in copyright takes time
• Fragile books need repair
• The same amount of work, but a different kind
Upload a spreadsheet of the titles you plan to scan. Include OCLC number, title, volume number, author, publisher, and date
The tool tries to find matches in the other spreadsheets submitted
Lesson: metadata is always worse than you think
Title, volumes needed
Which library has which volumes, additional information
Conversation about which volumes need to be scanned
GEMINI: A Critical Tool
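The matching step described above can be sketched roughly as follows. This is a minimal illustration, not Gemini's actual implementation: the column names (`oclc`, `title`, `volume`), the helper names, the sample rows, and the choice to match on OCLC number plus volume are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def normalize_oclc(raw):
    """Keep only digits and drop leading zeros, so '(OCoLC)00123' equals '123'."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return digits.lstrip("0")


def find_overlaps(sheets):
    """Return {(oclc, volume): [libraries]} for entries listed by more than one library.

    `sheets` maps a library name to CSV text with the hypothetical columns
    oclc, title, volume.
    """
    holders = defaultdict(set)
    for library, text in sheets.items():
        for row in csv.DictReader(io.StringIO(text)):
            key = (normalize_oclc(row["oclc"]), row["volume"].strip().lower())
            if key[0]:  # skip rows with no usable OCLC number
                holders[key].add(library)
    return {key: sorted(libs) for key, libs in holders.items() if len(libs) > 1}


# Made-up sample data: two libraries planning to scan overlapping volumes.
sheets = {
    "Library A": "oclc,title,volume\n(OCoLC)00123,Example Journal,v.1\n456,Other Title,v.2\n",
    "Library B": "oclc,title,volume\n123,Example journal.,V.1\n789,Third Title,v.1\n",
}
overlaps = find_overlaps(sheets)
print(overlaps)
```

Normalizing the OCLC number and volume label before comparing is the part that matters in practice: as the slide notes, submitted metadata is rarely as clean as expected, so an exact string comparison would miss most real duplicates.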
• Purpose: to provide an accurate digital representation of the original object
• One page per image
• (except field notebooks: 2 pages per image)
• No image editing
• Reuse existing metadata
• in the library catalog
• other sources (BioStor etc.)
Capture: Scanning
Capture-Scanning
• Most BHL US/UK libraries use the Internet Archive (IA) to scan books
• Some shared funds / one contract for all of BHL
• Open access, nonprofit
• Inexpensive services
• Each member library has its own workflow
• Members provide basic metadata from the library catalog
• In-house digitization or hiring another vendor
• MACAW
• Scan books* from cover to cover, one image per page
• * A book, also called a "volume" or "item", is a physical unit, not an intellectual unit; i.e., a book may hold multiple articles or be a single monograph
Cover
Cover
good stuff
Partial replication in Alexandria, Egypt
Secondary backup is at the Smithsonian, including TIFFs of volumes scanned in-house (SIL), ~90 TB
Primary file storage and the "staging area" are at the Internet Archive in San Francisco, USA
Images scanned by the library or another vendor
Metadata collected through Z39.50
Additional item and page metadata entered by library staff using the Macaw software (which mimics IA's biblio software)
In-house scanning
Smithsonian Libraries uses 2 Phase One setups:
a P65 60 MP camera on a copy stand and a BC100 dual-camera 40 MP system
Capture One software
For folios (> 36 cm) and fragile books
Exception: Field Notebooks Project (Smithsonian Archives), 2 pages per image for notebooks; letters on a flatbed scanner
Capture: Harvest
• Automated scheduled tasks
• Books already in the Internet Archive
• Subject terms
• Library "call numbers"
• BioStor/articles
Interface for staff to edit records and put serial volumes in order
Curation includes adding and editing book metadata, merging records and authors, removing volumes that are outside the scope of the collection, and re-scanning books with errors.
CURATION
Allows people to enter page-level metadata such as page number and page type (picture, text, etc.)
Creates XML files to upload to IA
Replicates software functionality from the Internet Archive
Installed on a shared SI server for partners to use
MACAW: MetadatA Collection And Workflow
A Critical Tool
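As a rough picture of what "enter page-level metadata, then create XML for upload" involves, here is a minimal sketch. It is not Macaw's code or the Internet Archive's actual file format: the element and attribute names (`item`, `page`, `sequence`, `type`, `number`), the `build_page_xml` helper, and the sample filenames are hypothetical, chosen only to show the idea.

```python
import xml.etree.ElementTree as ET


def build_page_xml(item_id, pages):
    """Build a small XML document describing each scanned page.

    `pages` is a list of (image filename, printed page number or None, page type).
    Element and attribute names are hypothetical, not a real upload schema.
    """
    root = ET.Element("item", {"id": item_id})
    for sequence, (filename, number, page_type) in enumerate(pages, start=1):
        page = ET.SubElement(root, "page", {"sequence": str(sequence), "file": filename})
        ET.SubElement(page, "type").text = page_type
        if number is not None:  # covers and plates often carry no printed number
            ET.SubElement(page, "number").text = number
    return ET.tostring(root, encoding="unicode")


xml_text = build_page_xml(
    "example-item-0001",
    [
        ("0001.jpg", None, "cover"),
        ("0002.jpg", "1", "text"),
        ("0003.jpg", "2", "picture"),
    ],
)
print(xml_text)
```

Keeping the scan sequence separate from the printed page number reflects the cover-to-cover, one-image-per-page rule: every image gets a sequence position, but only some pages carry a printed number.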
• "Title" record: MARC from the library catalog
• Transformed into MARCXML and MODS
• "Volume" information: from the catalog or entered by humans, stored in XML
• "Segment" (article) information: entered by humans or taken from BioStor etc. (after scanning)
• "Page" metadata: entered by humans, stored in the XML file that provides structure to the digital object
Metadata
Add page-level metadata, such as page numbers or article titles
• Other files derived from Internet Archive processes
– PDF
– DjVu (OCR text: .txt and .xml)
– ePub/Daisy/Kindle
• Other files created by BHL processes
– Taxonomic names
– OCR text
– BHL METS
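Once taxonomic names have been found in each page's OCR text, turning them into a per-name list of pages is a simple inversion. The sketch below assumes the names have already been extracted; the page identifiers, the sample names, and the `build_species_index` helper are illustrative, not part of BHL's actual pipeline.

```python
from collections import defaultdict


def build_species_index(page_names):
    """Invert page -> names into name -> sorted list of pages.

    `page_names` maps a page identifier to the taxonomic names found in that
    page's OCR text. Finding the names is out of scope for this sketch.
    """
    index = defaultdict(set)
    for page_id, names in page_names.items():
        for name in names:
            index[name].add(page_id)
    return {name: sorted(pages) for name, pages in index.items()}


# Made-up sample input: three pages across two scanned items.
page_names = {
    "item1/p12": ["Panthera leo", "Panthera tigris"],
    "item1/p13": ["Panthera leo"],
    "item2/p4": ["Panthera tigris"],
}
index = build_species_index(page_names)
print(index["Panthera leo"])
```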
Discovering and storing species names associated with pages allows the creation of
"species bibliographies," EOL.org connections, GBIF connections
Users can (and do!):
• Report technical problems
• Request new functionality
• Report data errors
• Request scanning of specific titles
Gemini: which library has which volumes, additional information
Gemini: title, volumes needed
Assigned to Cornell University
Requestor
As far as we know, responding to user requests is rare in the digital library world.
Smithsonian Libraries Workflows
[Workflow diagram. Systems: database, library catalog, Macaw, Internet Archive, BHL. Steps: move & de-duplicate; tracking & shipping; scanning & metadata harvesting; transform & package; create page metadata; create derivatives; MARC → MARCXML; URL to BHL into MARC record; species names; quality control (% sample).]
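The "quality control (% sample)" step suggests checking a percentage of pages rather than every page. A minimal sketch of drawing such a sample is below; the 5% rate, the page identifiers, and the `pick_qc_sample` helper are assumptions for illustration, since the slides do not state the actual percentage or method.

```python
import math
import random


def pick_qc_sample(page_ids, percent, seed=None):
    """Return a random subset of pages covering roughly `percent` of the item.

    At least one page is always chosen for a non-empty item.
    """
    if not page_ids:
        return []
    count = max(1, math.ceil(len(page_ids) * percent / 100))
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return sorted(rng.sample(page_ids, count))


pages = [f"page-{n:04d}" for n in range(1, 201)]
sample = pick_qc_sample(pages, percent=5, seed=42)
print(len(sample))
```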
• Thank you!
Serial Gemini workflow

Mais conteúdo relacionado

Destaque

Blog SciELO y sistemas de gestión editorial - Alex Mendonça
Blog SciELO y sistemas de gestión editorial - Alex MendonçaBlog SciELO y sistemas de gestión editorial - Alex Mendonça
Blog SciELO y sistemas de gestión editorial - Alex Mendonça
SciELO - Scientific Electronic Library Online
 
Critérios SciELO Brasil – Aumentando la visibilidad y el impacto de las revis...
Critérios SciELO Brasil – Aumentando la visibilidad y el impacto de las revis...Critérios SciELO Brasil – Aumentando la visibilidad y el impacto de las revis...
Critérios SciELO Brasil – Aumentando la visibilidad y el impacto de las revis...
SciELO - Scientific Electronic Library Online
 
ScholarOne e os Critérios SciELO (II Curso de Atualização SciELO-ScholarOne)
ScholarOne e os Critérios SciELO (II Curso de Atualização SciELO-ScholarOne)ScholarOne e os Critérios SciELO (II Curso de Atualização SciELO-ScholarOne)
ScholarOne e os Critérios SciELO (II Curso de Atualização SciELO-ScholarOne)
SciELO - Scientific Electronic Library Online
 
Control de calidadde los números antes del envío para procesamiento - Equipo ...
Control de calidadde los números antes del envío para procesamiento - Equipo ...Control de calidadde los números antes del envío para procesamiento - Equipo ...
Control de calidadde los números antes del envío para procesamiento - Equipo ...
SciELO - Scientific Electronic Library Online
 
Boas práticas, erros comuns, resolução de problemas e dicas de uso no Scholar...
Boas práticas, erros comuns, resolução de problemas e dicas de uso no Scholar...Boas práticas, erros comuns, resolução de problemas e dicas de uso no Scholar...
Boas práticas, erros comuns, resolução de problemas e dicas de uso no Scholar...
SciELO - Scientific Electronic Library Online
 
Panorama geral do ScholarOne na Coleção SciELO Brasil e retrospectiva 2015 (I...
Panorama geral do ScholarOne na Coleção SciELO Brasil e retrospectiva 2015 (I...Panorama geral do ScholarOne na Coleção SciELO Brasil e retrospectiva 2015 (I...
Panorama geral do ScholarOne na Coleção SciELO Brasil e retrospectiva 2015 (I...
SciELO - Scientific Electronic Library Online
 
Preparación de archivos para marcación - Equipo Producción SciELO Brasil
Preparación de archivos para marcación - Equipo Producción SciELO BrasilPreparación de archivos para marcación - Equipo Producción SciELO Brasil
Preparación de archivos para marcación - Equipo Producción SciELO Brasil
SciELO - Scientific Electronic Library Online
 
Workshop scholar one e os novos critérios SciELO - Alex Mendonça
Workshop scholar one e os novos critérios SciELO - Alex MendonçaWorkshop scholar one e os novos critérios SciELO - Alex Mendonça
Workshop scholar one e os novos critérios SciELO - Alex Mendonça
SciELO - Scientific Electronic Library Online
 

Destaque (18)

Blog SciELO y sistemas de gestión editorial - Alex Mendonça
Blog SciELO y sistemas de gestión editorial - Alex MendonçaBlog SciELO y sistemas de gestión editorial - Alex Mendonça
Blog SciELO y sistemas de gestión editorial - Alex Mendonça
 
Critérios SciELO Brasil – Aumentando la visibilidad y el impacto de las revis...
Critérios SciELO Brasil – Aumentando la visibilidad y el impacto de las revis...Critérios SciELO Brasil – Aumentando la visibilidad y el impacto de las revis...
Critérios SciELO Brasil – Aumentando la visibilidad y el impacto de las revis...
 
Construindo a vasta biblioteca de Vida - Nancy Gwinn
Construindo a vasta biblioteca de Vida - Nancy GwinnConstruindo a vasta biblioteca de Vida - Nancy Gwinn
Construindo a vasta biblioteca de Vida - Nancy Gwinn
 
Sobre o impacto dos artigos científicos de autores no Brasil - Carlos Henriqu...
Sobre o impacto dos artigos científicos de autores no Brasil - Carlos Henriqu...Sobre o impacto dos artigos científicos de autores no Brasil - Carlos Henriqu...
Sobre o impacto dos artigos científicos de autores no Brasil - Carlos Henriqu...
 
Interfaces to inform biodiversity public policies - Carlos Joly
Interfaces to inform biodiversity public policies - Carlos JolyInterfaces to inform biodiversity public policies - Carlos Joly
Interfaces to inform biodiversity public policies - Carlos Joly
 
ScholarOne e os Critérios SciELO (II Curso de Atualização SciELO-ScholarOne)
ScholarOne e os Critérios SciELO (II Curso de Atualização SciELO-ScholarOne)ScholarOne e os Critérios SciELO (II Curso de Atualização SciELO-ScholarOne)
ScholarOne e os Critérios SciELO (II Curso de Atualização SciELO-ScholarOne)
 
A Biblioteca Global de Patrimônio da Biodiversidade: BHL Day - Constance Rinaldo
A Biblioteca Global de Patrimônio da Biodiversidade: BHL Day - Constance RinaldoA Biblioteca Global de Patrimônio da Biodiversidade: BHL Day - Constance Rinaldo
A Biblioteca Global de Patrimônio da Biodiversidade: BHL Day - Constance Rinaldo
 
Control de calidadde los números antes del envío para procesamiento - Equipo ...
Control de calidadde los números antes del envío para procesamiento - Equipo ...Control de calidadde los números antes del envío para procesamiento - Equipo ...
Control de calidadde los números antes del envío para procesamiento - Equipo ...
 
GBIF – avanços e perspectivas - Tim Hirsch
GBIF – avanços e perspectivas - Tim HirschGBIF – avanços e perspectivas - Tim Hirsch
GBIF – avanços e perspectivas - Tim Hirsch
 
Boas práticas, erros comuns, resolução de problemas e dicas de uso no Scholar...
Boas práticas, erros comuns, resolução de problemas e dicas de uso no Scholar...Boas práticas, erros comuns, resolução de problemas e dicas de uso no Scholar...
Boas práticas, erros comuns, resolução de problemas e dicas de uso no Scholar...
 
Panorama geral do ScholarOne na Coleção SciELO Brasil e retrospectiva 2015 (I...
Panorama geral do ScholarOne na Coleção SciELO Brasil e retrospectiva 2015 (I...Panorama geral do ScholarOne na Coleção SciELO Brasil e retrospectiva 2015 (I...
Panorama geral do ScholarOne na Coleção SciELO Brasil e retrospectiva 2015 (I...
 
Preparación de archivos para marcación - Equipo Producción SciELO Brasil
Preparación de archivos para marcación - Equipo Producción SciELO BrasilPreparación de archivos para marcación - Equipo Producción SciELO Brasil
Preparación de archivos para marcación - Equipo Producción SciELO Brasil
 
Workshop scholar one e os novos critérios SciELO - Alex Mendonça
Workshop scholar one e os novos critérios SciELO - Alex MendonçaWorkshop scholar one e os novos critérios SciELO - Alex Mendonça
Workshop scholar one e os novos critérios SciELO - Alex Mendonça
 
Dimensões nacional e internacional do impacto e consumo de informação dos per...
Dimensões nacional e internacional do impacto e consumo de informação dos per...Dimensões nacional e internacional do impacto e consumo de informação dos per...
Dimensões nacional e internacional do impacto e consumo de informação dos per...
 
A Global Biodiversity Heritage Library - Ely Wallis
A Global Biodiversity Heritage Library - Ely WallisA Global Biodiversity Heritage Library - Ely Wallis
A Global Biodiversity Heritage Library - Ely Wallis
 
Mini curso Cognos: construindo relatórios de autores, pareceristas e editores...
Mini curso Cognos: construindo relatórios de autores, pareceristas e editores...Mini curso Cognos: construindo relatórios de autores, pareceristas e editores...
Mini curso Cognos: construindo relatórios de autores, pareceristas e editores...
 
A informação na palma da mão - Keila Elizabeth
A informação na palma da mão - Keila ElizabethA informação na palma da mão - Keila Elizabeth
A informação na palma da mão - Keila Elizabeth
 
Local / universal: questões e desafios para os periódicos das ciências humana...
Local / universal: questões e desafios para os periódicos das ciências humana...Local / universal: questões e desafios para os periódicos das ciências humana...
Local / universal: questões e desafios para os periódicos das ciências humana...
 

Semelhante a Digitalização: Captura de Imagem e Fluxo de Trabalho - Constance Rinaldo

Thesis Proposal: User Application Profiles for Publishing Linked Data in HTM...
Thesis Proposal: User Application Profiles for Publishing Linked Data in  HTM...Thesis Proposal: User Application Profiles for Publishing Linked Data in  HTM...
Thesis Proposal: User Application Profiles for Publishing Linked Data in HTM...
Sean Petiya
 
Kampmeier ecn 2012
Kampmeier ecn 2012Kampmeier ecn 2012
Kampmeier ecn 2012
ECNOfficer
 
The Big Data Stack
The Big Data StackThe Big Data Stack
The Big Data Stack
Zubair Nabi
 
BHL-Africa Launch 2013: Collection Mgmt Overview
BHL-Africa Launch 2013: Collection Mgmt OverviewBHL-Africa Launch 2013: Collection Mgmt Overview
BHL-Africa Launch 2013: Collection Mgmt Overview
Bianca Crowley
 
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Lucidworks
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
Yahoo Developer Network
 

Semelhante a Digitalização: Captura de Imagem e Fluxo de Trabalho - Constance Rinaldo (20)

Thesis Proposal: User Application Profiles for Publishing Linked Data in HTM...
Thesis Proposal: User Application Profiles for Publishing Linked Data in  HTM...Thesis Proposal: User Application Profiles for Publishing Linked Data in  HTM...
Thesis Proposal: User Application Profiles for Publishing Linked Data in HTM...
 
A Production Quality Sketching Library for the Analysis of Big Data
A Production Quality Sketching Library for the Analysis of Big DataA Production Quality Sketching Library for the Analysis of Big Data
A Production Quality Sketching Library for the Analysis of Big Data
 
What do you want to discover today? / Janet Aucock, University of St Andrews
What do you want to discover today? / Janet Aucock, University of St AndrewsWhat do you want to discover today? / Janet Aucock, University of St Andrews
What do you want to discover today? / Janet Aucock, University of St Andrews
 
Ils on a shoe string budget
Ils on a shoe string budgetIls on a shoe string budget
Ils on a shoe string budget
 
Kampmeier ecn 2012
Kampmeier ecn 2012Kampmeier ecn 2012
Kampmeier ecn 2012
 
Council on Botanical and Horticultural Libraries 2016 Presentation
Council on Botanical and Horticultural Libraries 2016 PresentationCouncil on Botanical and Horticultural Libraries 2016 Presentation
Council on Botanical and Horticultural Libraries 2016 Presentation
 
Big data berlin
Big data berlinBig data berlin
Big data berlin
 
Possibilities for Koha 4
Possibilities for Koha 4Possibilities for Koha 4
Possibilities for Koha 4
 
The Big Data Stack
The Big Data StackThe Big Data Stack
The Big Data Stack
 
Kings fund - implementing Hyku
Kings fund - implementing HykuKings fund - implementing Hyku
Kings fund - implementing Hyku
 
Latest trends in AI and information Retrieval
Latest trends in AI and information Retrieval Latest trends in AI and information Retrieval
Latest trends in AI and information Retrieval
 
Cataloging Basics Webinar (NEKLS)
Cataloging Basics Webinar (NEKLS)Cataloging Basics Webinar (NEKLS)
Cataloging Basics Webinar (NEKLS)
 
04 open source_tools
04 open source_tools04 open source_tools
04 open source_tools
 
BHL-Africa Launch 2013: Collection Mgmt Overview
BHL-Africa Launch 2013: Collection Mgmt OverviewBHL-Africa Launch 2013: Collection Mgmt Overview
BHL-Africa Launch 2013: Collection Mgmt Overview
 
BHL Collections Management
BHL Collections ManagementBHL Collections Management
BHL Collections Management
 
The Elephant in the Library - Integrating Hadoop
The Elephant in the Library - Integrating HadoopThe Elephant in the Library - Integrating Hadoop
The Elephant in the Library - Integrating Hadoop
 
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented ...
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Making the Big Move: Moving to Cloud-Based OCLC’s WorldShare Management Servi...
Making the Big Move: Moving to Cloud-Based OCLC’s WorldShare Management Servi...Making the Big Move: Moving to Cloud-Based OCLC’s WorldShare Management Servi...
Making the Big Move: Moving to Cloud-Based OCLC’s WorldShare Management Servi...
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
 

Mais de SciELO - Scientific Electronic Library Online

Publons (IV Curso SciELO-ScholarOne)
Publons (IV Curso SciELO-ScholarOne)Publons (IV Curso SciELO-ScholarOne)
Publons (IV Curso SciELO-ScholarOne)
SciELO - Scientific Electronic Library Online
 
Panorama Geral do ScholarOne na Coleção SciELO Brasil e Retrospectiva 2017-20...
Panorama Geral do ScholarOne na Coleção SciELO Brasil e Retrospectiva 2017-20...Panorama Geral do ScholarOne na Coleção SciELO Brasil e Retrospectiva 2017-20...
Panorama Geral do ScholarOne na Coleção SciELO Brasil e Retrospectiva 2017-20...
SciELO - Scientific Electronic Library Online
 
Patricia Méndez - Asociación de Revistas Latinoamericanas de Arquitectura (AR...
Patricia Méndez - Asociación de Revistas Latinoamericanas de Arquitectura (AR...Patricia Méndez - Asociación de Revistas Latinoamericanas de Arquitectura (AR...
Patricia Méndez - Asociación de Revistas Latinoamericanas de Arquitectura (AR...
SciELO - Scientific Electronic Library Online
 
Adriano Codato - Utilizando citações para além do fator de impacto: uma alter...
Adriano Codato - Utilizando citações para além do fator de impacto: uma alter...Adriano Codato - Utilizando citações para além do fator de impacto: uma alter...
Adriano Codato - Utilizando citações para além do fator de impacto: uma alter...
SciELO - Scientific Electronic Library Online
 
Patricia Muñoz Palma - SciELO-Chile: Una mirada de 20 años
Patricia Muñoz Palma - SciELO-Chile: Una mirada de 20 añosPatricia Muñoz Palma - SciELO-Chile: Una mirada de 20 años
Patricia Muñoz Palma - SciELO-Chile: Una mirada de 20 años
SciELO - Scientific Electronic Library Online
 
James Testa - International Collaboration Top Cited journals
James Testa - International Collaboration Top Cited journals James Testa - International Collaboration Top Cited journals
James Testa - International Collaboration Top Cited journals
SciELO - Scientific Electronic Library Online
 
Isidro F. Aguillo - Web Identity and positioning: Non-citation metrics for th...
Isidro F. Aguillo - Web Identity and positioning: Non-citation metrics for th...Isidro F. Aguillo - Web Identity and positioning: Non-citation metrics for th...
Isidro F. Aguillo - Web Identity and positioning: Non-citation metrics for th...
SciELO - Scientific Electronic Library Online
 
Cathy Holland - The Performance of SciELO journals
Cathy Holland - The Performance of SciELO journalsCathy Holland - The Performance of SciELO journals
Cathy Holland - The Performance of SciELO journals
SciELO - Scientific Electronic Library Online
 
Stephanie Faulkner - Using Metrics Beyond Citations to Demonstrate the Impact...
Stephanie Faulkner - Using Metrics Beyond Citations to Demonstrate the Impact...Stephanie Faulkner - Using Metrics Beyond Citations to Demonstrate the Impact...
Stephanie Faulkner - Using Metrics Beyond Citations to Demonstrate the Impact...
SciELO - Scientific Electronic Library Online
 
Rogério Mugnaini - Livros e editoras: evolução do impacto nas diversas áreas ...
Rogério Mugnaini - Livros e editoras: evolução do impacto nas diversas áreas ...Rogério Mugnaini - Livros e editoras: evolução do impacto nas diversas áreas ...
Rogério Mugnaini - Livros e editoras: evolução do impacto nas diversas áreas ...
SciELO - Scientific Electronic Library Online
 
Kátia de Oliveira Rodrigues, et al. - O livro no Sistema de Avaliação da Capes
Kátia de Oliveira Rodrigues, et al. - O livro no Sistema de Avaliação da CapesKátia de Oliveira Rodrigues, et al. - O livro no Sistema de Avaliação da Capes
Kátia de Oliveira Rodrigues, et al. - O livro no Sistema de Avaliação da Capes
SciELO - Scientific Electronic Library Online
 
Flávia Rosa, Susane Barros - Reflexões sobre o livro acadêmico no contexto da...
Flávia Rosa, Susane Barros - Reflexões sobre o livro acadêmico no contexto da...Flávia Rosa, Susane Barros - Reflexões sobre o livro acadêmico no contexto da...
Flávia Rosa, Susane Barros - Reflexões sobre o livro acadêmico no contexto da...
SciELO - Scientific Electronic Library Online
 
Leila Posenato Garcia - A participação feminina no meio editorial de saúde co...
Leila Posenato Garcia - A participação feminina no meio editorial de saúde co...Leila Posenato Garcia - A participação feminina no meio editorial de saúde co...
Leila Posenato Garcia - A participação feminina no meio editorial de saúde co...
SciELO - Scientific Electronic Library Online
 
Angela Maria Belloni Cuenca, Milena Maria de Araújo Lima Barbosa, Ivan França...
Angela Maria Belloni Cuenca, Milena Maria de Araújo Lima Barbosa, Ivan França...Angela Maria Belloni Cuenca, Milena Maria de Araújo Lima Barbosa, Ivan França...
Angela Maria Belloni Cuenca, Milena Maria de Araújo Lima Barbosa, Ivan França...
SciELO - Scientific Electronic Library Online
 

Mais de SciELO - Scientific Electronic Library Online (20)

Sebastião V. Canevarolo Jr. - Transferência do conhecimento para desenvolvime...
Sebastião V. Canevarolo Jr. - Transferência do conhecimento para desenvolvime...Sebastião V. Canevarolo Jr. - Transferência do conhecimento para desenvolvime...
Sebastião V. Canevarolo Jr. - Transferência do conhecimento para desenvolvime...
 
Cláudia Valentina Galian - A revista Educação e Pesquisa e as práticas ligada...
Cláudia Valentina Galian - A revista Educação e Pesquisa e as práticas ligada...Cláudia Valentina Galian - A revista Educação e Pesquisa e as práticas ligada...
Cláudia Valentina Galian - A revista Educação e Pesquisa e as práticas ligada...
 
Angelina Zanesco - ALINHAMENTO DOS PERIÓDICOS SCIELO BRASIL COM AS BOAS PRÁTI...
Angelina Zanesco - ALINHAMENTO DOS PERIÓDICOS SCIELO BRASIL COM AS BOAS PRÁTI...Angelina Zanesco - ALINHAMENTO DOS PERIÓDICOS SCIELO BRASIL COM AS BOAS PRÁTI...
Angelina Zanesco - ALINHAMENTO DOS PERIÓDICOS SCIELO BRASIL COM AS BOAS PRÁTI...
 
Abel Packer - SciELO - Estado de avanço e perspectivas futuras
Abel Packer - SciELO - Estado de avanço e perspectivas futurasAbel Packer - SciELO - Estado de avanço e perspectivas futuras
Abel Packer - SciELO - Estado de avanço e perspectivas futuras
 
Carlos Menck - Revistas Brasileiras e sustentabilidade: As dificuldades de mo...
Carlos Menck - Revistas Brasileiras e sustentabilidade: As dificuldades de mo...Carlos Menck - Revistas Brasileiras e sustentabilidade: As dificuldades de mo...
Carlos Menck - Revistas Brasileiras e sustentabilidade: As dificuldades de mo...
 
Reinaldo Cantarutti - Sustentabilidade operacional & modalidades de financiam...
Reinaldo Cantarutti - Sustentabilidade operacional & modalidades de financiam...Reinaldo Cantarutti - Sustentabilidade operacional & modalidades de financiam...
Reinaldo Cantarutti - Sustentabilidade operacional & modalidades de financiam...
 
Publons (IV Curso SciELO-ScholarOne)
Publons (IV Curso SciELO-ScholarOne)Publons (IV Curso SciELO-ScholarOne)
Publons (IV Curso SciELO-ScholarOne)
 
Panorama Geral do ScholarOne na Coleção SciELO Brasil e Retrospectiva 2017-20...
Panorama Geral do ScholarOne na Coleção SciELO Brasil e Retrospectiva 2017-20...Panorama Geral do ScholarOne na Coleção SciELO Brasil e Retrospectiva 2017-20...
Panorama Geral do ScholarOne na Coleção SciELO Brasil e Retrospectiva 2017-20...
 
Patricia Méndez - Asociación de Revistas Latinoamericanas de Arquitectura (AR...
Patricia Méndez - Asociación de Revistas Latinoamericanas de Arquitectura (AR...Patricia Méndez - Asociación de Revistas Latinoamericanas de Arquitectura (AR...
Patricia Méndez - Asociación de Revistas Latinoamericanas de Arquitectura (AR...
 
Adriano Codato - Utilizando citações para além do fator de impacto: uma alter...
Adriano Codato - Utilizando citações para além do fator de impacto: uma alter...Adriano Codato - Utilizando citações para além do fator de impacto: uma alter...
Adriano Codato - Utilizando citações para além do fator de impacto: uma alter...
 
Patricia Muñoz Palma - SciELO-Chile: Una mirada de 20 años
Patricia Muñoz Palma - SciELO-Chile: Una mirada de 20 añosPatricia Muñoz Palma - SciELO-Chile: Una mirada de 20 años
Patricia Muñoz Palma - SciELO-Chile: Una mirada de 20 años
 
James Testa - International Collaboration Top Cited journals
James Testa - International Collaboration Top Cited journals James Testa - International Collaboration Top Cited journals
James Testa - International Collaboration Top Cited journals
 
Isidro F. Aguillo - Web Identity and positioning: Non-citation metrics for th...
Isidro F. Aguillo - Web Identity and positioning: Non-citation metrics for th...Isidro F. Aguillo - Web Identity and positioning: Non-citation metrics for th...
Isidro F. Aguillo - Web Identity and positioning: Non-citation metrics for th...
 
Cathy Holland - The Performance of SciELO journals
Cathy Holland - The Performance of SciELO journalsCathy Holland - The Performance of SciELO journals
Cathy Holland - The Performance of SciELO journals
 
Stephanie Faulkner - Using Metrics Beyond Citations to Demonstrate the Impact...
Stephanie Faulkner - Using Metrics Beyond Citations to Demonstrate the Impact...Stephanie Faulkner - Using Metrics Beyond Citations to Demonstrate the Impact...
Stephanie Faulkner - Using Metrics Beyond Citations to Demonstrate the Impact...
 
Rogério Mugnaini - Livros e editoras: evolução do impacto nas diversas áreas ...
Rogério Mugnaini - Livros e editoras: evolução do impacto nas diversas áreas ...Rogério Mugnaini - Livros e editoras: evolução do impacto nas diversas áreas ...
Rogério Mugnaini - Livros e editoras: evolução do impacto nas diversas áreas ...
 
Kátia de Oliveira Rodrigues, et al. - O livro no Sistema de Avaliação da Capes
Kátia de Oliveira Rodrigues, et al. - O livro no Sistema de Avaliação da CapesKátia de Oliveira Rodrigues, et al. - O livro no Sistema de Avaliação da Capes
Kátia de Oliveira Rodrigues, et al. - O livro no Sistema de Avaliação da Capes
 
Flávia Rosa, Susane Barros - Reflexões sobre o livro acadêmico no contexto da...
Flávia Rosa, Susane Barros - Reflexões sobre o livro acadêmico no contexto da...Flávia Rosa, Susane Barros - Reflexões sobre o livro acadêmico no contexto da...
Flávia Rosa, Susane Barros - Reflexões sobre o livro acadêmico no contexto da...
 
Leila Posenato Garcia - A participação feminina no meio editorial de saúde co...
Leila Posenato Garcia - A participação feminina no meio editorial de saúde co...Leila Posenato Garcia - A participação feminina no meio editorial de saúde co...
Leila Posenato Garcia - A participação feminina no meio editorial de saúde co...
 
Angela Maria Belloni Cuenca, Milena Maria de Araújo Lima Barbosa, Ivan França...
Angela Maria Belloni Cuenca, Milena Maria de Araújo Lima Barbosa, Ivan França...Angela Maria Belloni Cuenca, Milena Maria de Araújo Lima Barbosa, Ivan França...
Angela Maria Belloni Cuenca, Milena Maria de Araújo Lima Barbosa, Ivan França...
 

Último

Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
Areesha Ahmad
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 

Último (20)

9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
Digitization: Image Capture and Workflow - Constance Rinaldo

  • 1. Digitization: Image Capture and Workflow. Martin Kalfatovic, Keri Thompson & Connie Rinaldo
  • 3.
  • 6. • Workflow has become more complicated • Difficulty finding books that are easy to scan • Reviewing titles in copyright takes time • Fragile books need repair • The same amount of work, but a different kind
  • 7. Upload a spreadsheet of the titles you plan to scan. Include OCLC number, title, volume number, author, publisher, date. The tool tries to find matches in other submitted spreadsheets. Lesson: metadata is always worse than you think.
  • 8. GEMINI: A Critical Tool. Title, volumes needed. Which library has which volumes, additional information. Conversation about which volumes need to be scanned.
  • 10. Capture: Scanning • Purpose: to provide an accurate digital representation of the original object • one page per image (except field notebooks: 2 pages per image) • no image editing • Reuse existing metadata • in the library catalog • other sources (BioStor etc.)
  • 11. Capture: Scanning • Most BHL US/UK libraries use the Internet Archive (IA) for scanning books • Some shared funds/one contract for all of BHL • Open Access, nonprofit • Inexpensive services • Each member library has its own workflow • Members provide basic metadata from their library catalogs • In-house digitization or contracting with another vendor • Macaw
  • 12. • Scan books from cover to cover, one image per page • A book, also called a "volume" or "item", is a physical unit, not an intellectual unit, i.e., a book may contain multiple articles or be a single monograph. Diagram labels: Cover, good stuff, Cover.
  • 13. Primary file storage and "staging area" is at the Internet Archive in San Francisco, USA. Secondary backup is at the Smithsonian, including TIFFs of volumes scanned in-house (SIL). Partial replication in Alexandria, Egypt. ~90 TB
  • 14. In-house scanning: Images scanned by the library or another vendor. Metadata harvested via Z39.50. Additional metadata for the item and pages entered by library staff using Macaw software (mimics IA's biblio software).
  • 15. Smithsonian Libraries uses 2 Phase One systems: a P65 60 MP camera on a copy stand and a BC100 with dual 40 MP cameras, with CaptureOne software. Used for folios (> 36 cm) and fragile books. EXCEPT the Field Notebooks Project (Smithsonian Archives): 2 pages per image for notebooks, flatbed scanner for letters.
  • 16. Capture: Harvest • Automated, scheduled tasks • Books already in the Internet Archive • subject terms • library "call numbers" • BioStor/articles
  • 18. CURATION: Interface for staff to edit records and put serial volumes in order. Curation includes adding and editing metadata for books, merging records and authors, removing volumes that are outside the scope of the collection, and rescanning books with errors.
  • 19. MACAW: MetadatA Collection And Workflow, A Critical Tool. Allows people to enter page-level metadata such as page number and page type (picture, text, etc.). Creates XML files to upload to IA. Replicates software functionality from the Internet Archive. Installed on a shared SI server for partners to use.
  • 20. Metadata • "Title": MARC record from the library catalog, transformed into MARCXML and MODS • "Volume": information from the catalog or entered by humans, stored in XML • "Segment" (article): information entered by humans or from BioStor etc. (after scanning) • "Page": metadata entered by humans, stored in the XML file that provides structure to the digital object
  • 21. Add page-level metadata, such as page numbers or article titles
  • 22. • Other files derived from Internet Archive processes: PDF, DjVu (OCR text as .txt and .xml), ePub/DAISY/Kindle • Other files created by BHL processes: taxonomic names, OCR text, BHL METS
  • 23. Discovering and storing species names associated with pages allows the creation of "species bibliographies," EOL.org connections, GBIF connections
  • 25. Gemini: Users can (and do!) report technical problems, request new functionality, report data errors, and request scanning of specific titles.
  • 26. Gemini: Title, volumes needed. Which library has which volumes, additional information. Requestor. Assigned to Cornell University. As far as we know, responding to user requests is rare in the digital library world.
  • 27. Smithsonian Libraries Workflow (diagram). Systems: database, library catalog, Macaw, Internet Archive, BHL. Steps: move & de-duplicate; tracking & shipping; scanning & metadata harvesting; create page metadata; create derivatives; transform & package; MARC to MARCXML; URL to BHL added into the MARC record; species names; quality control (% sample).
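
The spreadsheet matching described on slide 7 can be illustrated with a minimal Python sketch. It is not the actual BHL tool: the column names (`oclc`, `title`, `volume`), the 0.9 threshold, and the use of `difflib` for the fuzzy step are all assumptions made for illustration. It reflects the point that matching works best on numbers (the OCLC number), with fuzzy title matching only as a weaker fallback.

```python
from difflib import SequenceMatcher


def normalize_oclc(raw):
    """Keep only the digits and drop leading zeros, so 'ocm01234567' equals '1234567'."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return digits.lstrip("0")


def title_similarity(a, b):
    """Crude fuzzy score between 0 and 1 on lowercased, trimmed titles."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def find_matches(mine, others, threshold=0.9):
    """Return (my_row, other_row, reason) for each likely duplicate scan plan.

    Rows are dicts with hypothetical keys 'oclc', 'title' and 'volume'.
    Two rows are only compared when they name the same volume.
    """
    matches = []
    for row in mine:
        oclc = normalize_oclc(row.get("oclc", ""))
        for other in others:
            if row.get("volume", "").strip() != other.get("volume", "").strip():
                continue
            if oclc and oclc == normalize_oclc(other.get("oclc", "")):
                matches.append((row, other, "oclc"))
            elif title_similarity(row["title"], other["title"]) >= threshold:
                matches.append((row, other, "title"))
    return matches


# Invented example rows, not real submissions.
my_plan = [{"oclc": "ocm01234567", "title": "Annales de l'Institut Pasteur", "volume": "1"}]
other_plans = [
    {"oclc": "1234567", "title": "Annales Inst. Pasteur", "volume": "1"},
    {"oclc": "", "title": "annales de l'institut pasteur ", "volume": "1"},
    {"oclc": "999", "title": "Flora Brasiliensis", "volume": "1"},
]
for _, other, reason in find_matches(my_plan, other_plans):
    print(reason, "->", other["title"].strip())
```

Title matches are best treated as candidates for a human to review, since abbreviated or variant titles can easily fall below or above any fixed threshold.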

Editor's Notes

  1. [1 min] Collection mgmt to me is a continuous cycle of pre-digitization and post-digitization workflows Getting the content scanned is 1 thing and managing the content after it’s been scanned is just as important You’ll notice that our users play a key role in the cycle
  2. At the start of the project, trying to scan as much as possible as fast as possible “feed the beast” = low hanging fruit As the project matures, it becomes more difficult to find material to scan that is in good condition, that is in the Public Domain, or that is on the shelf! Hired a full-time in house scanner to do folios, rare fragile material Most staff, like scanning, is funded by grants. Not permanent, which means not truly programmatic/infrastructure.
  3. 18 plus institutions, 30 plus people, 4 plus time zones
  4. [1 min] Collection mgmt to me is a continuous cycle of pre-digitization and post-digitization workflows Getting the content scanned is 1 thing and managing the content after it’s been scanned is just as important You’ll notice that our users play a key role in the cycle
  5. Workflow has become more complicated Difficulty finding books that are easy to scan Copyright review takes time Fragile books need repair Same quantity of work but different type, slower collection growth
  6. Upload spreadsheet of titles you plan to scan. Include OCLC number, Title, Volume Number, Author, Publisher, Date. Tool tries to find matches in other submitted spreadsheets. Lesson: your metadata is always worse than you think it is. Problem: does not match against BHL in real time. Still must check BHL to be sure, and that doesn't always happen. Problem: fuzzy matching algorithm is not that great. Works best against numbers (OCLC number; OCLC's WorldCat is a union catalog for libraries).
  7. Repurpose a generic "issue tracking" system to do many things: track requests for scanning; track titles libraries plan to scan (serial volumes); track metadata error reports; track website bugs. Comment trail can be very long. Conversation vs. database. Confusing to database people (me) but shows history of selection. The selection refinement process can take a long time!
  8. Some background: Most BHL US/UK libraries use Internet Archive as our scanning “vendor” (partner) this was part of the original BHL formation and grant agreement with MacArthur. IA chosen because committed to Open Access, Non-profit, and low cost services – more than just digitization Members can also do their own scanning, or contract to other vendor, but all scans must be “staged” at internet archive Members provide basic metadata from their library catalogs
  9. This decision to scan physical units of books is based in the limitations of available library data. Libraries typically assign data at the “title” level, with maybe some data about individual volumes of a serial. Workflow is designed around scanning physical books. We are working on incorporating born-digital publications. Focus is on the information content of the book rather than the book-as-historical-object
  10. TO REITERATE: For BHL storage, IA uses Petaboxes and SI uses Isilon. Total BHL storage is currently ~90 TB. It is so low because IA supplies compressed JP2s, and we store them in a .zip file.
  11. Images scanned by library or other vendor Metadata harvested via Z39.50 Additional metadata for item and pages entered by library staff using Macaw software (mimics IA biblio software)
  12. Scanned by library or other vendor Smithsonian Libraries uses 2 systems: P65+ 60MP camera on a copy stand BC100 – dual camera 40MP scanning backs CaptureOne image editing software Macaw for extra metadata
  13. Capture - Harvest: automated, scheduled tasks. Books from Internet Archive; subject terms; library "call numbers"; manually entering an identifier; article citations from BioStor. Analysis of MARCXML records in IA (not all books have MARC records) for 050 and 090 (call number) and 650 (subject headings).
  14. Curation includes adding and correcting metadata for books, merging records and authors, removing volumes that are outside of the scope of the collection, rescanning books with errors Title id 3971 Edit record for item (from MARC) Edit volumes attached to the title record – correct volume information, re-order volumes
  15. 4 levels of descriptive metadata (administrative and structural data are produced while scanning): "Title": MARC record from library catalog, transformed into MARCXML and MODS. "Volume": information from catalog or entered by human, stored in XML file. "Segment" (article): information entered by human OR from BioStor etc. "Page": metadata entered by human, stored in XML file that provides structure to digital object.
  16. Item 22379. Add article "segment" title and other information. "Paginate" = add page data.
  17. Run taxonomic intelligence to find names (this shows manual editing, but it is an automated process). Discovering and storing species names associated with pages enables creation of "species bibliographies", connections to EOL.org, spLink and other useful tools.
  18. Won't show portal functionality; save for William tomorrow. Users are a big part of the data management process (collection management).
  19. Here is a request from Dr. Karl Siegert that BHL scan Annales de l’Institut Pasteur. As far as we know, responding to user requests is rare in the Digital Library world. MBLWHOI has this title, as does Cornell University
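
The harvest step in note 13 (reading MARCXML records for call numbers in fields 050 and 090 and subject headings in field 650) can be sketched in Python with the standard library. This is an illustration only, not BHL's harvesting code: the function names and the sample record are invented, and real records carry many more fields. Because not all books have MARC records, the sketch returns empty lists when no record is found.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML (MARC 21 "slim" schema).
MARC_URI = "http://www.loc.gov/MARC21/slim"
MARC_NS = {"marc": MARC_URI}


def field_values(record, tag):
    """Return one space-joined string per datafield carrying the given tag."""
    values = []
    for field in record.findall(f"marc:datafield[@tag='{tag}']", MARC_NS):
        parts = [sf.text.strip() for sf in field.findall("marc:subfield", MARC_NS) if sf.text]
        if parts:
            values.append(" ".join(parts))
    return values


def harvest(marcxml):
    """Extract call numbers (050, 090) and subject headings (650) from MARCXML text."""
    root = ET.fromstring(marcxml)
    # The root may be a single <record> or a <collection> wrapping records.
    if root.tag == f"{{{MARC_URI}}}record":
        record = root
    else:
        record = root.find("marc:record", MARC_NS)
    if record is None:
        return {"call_numbers": [], "subjects": []}
    return {
        "call_numbers": field_values(record, "050") + field_values(record, "090"),
        "subjects": field_values(record, "650"),
    }


# Invented sample record for illustration.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="050" ind1=" " ind2="4">
    <subfield code="a">QK1</subfield>
    <subfield code="b">.A5</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Botany</subfield>
    <subfield code="v">Periodicals.</subfield>
  </datafield>
</record>"""

print(harvest(SAMPLE))
```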