Haley Childers
LIS 688-04
April 26, 2012
Professor Oguz



Automatic Metadata Generation

  A machine process that combines metadata extraction and
   metadata harvesting.


     Metadata extraction uses automatic indexing
      techniques to search resource content and
      produce structured metadata according to metadata
      standards.


     Metadata harvesting is the machine collection of
      tagged metadata previously created by machines or humans.
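
The difference between the two processes can be illustrated with a small sketch. It uses only Python's standard library; the sample page, the stop-word list, and the function names `harvest` and `extract_keywords` are assumptions made for illustration, and the word-frequency count is a deliberately naive stand-in for real automatic indexing techniques.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stop-word list; a real indexer would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "by"}

# Hypothetical sample page used only for this illustration.
SAMPLE = """<html><head>
<meta name="Keywords" content="metadata, cataloging">
<meta name="Author" content="Example Author">
</head><body>
<h1>Metadata harvesting</h1>
<p>Harvesting collects metadata. Extraction mines content to produce metadata.</p>
</body></html>"""


class PageParser(HTMLParser):
    """Collects embedded META tags and the visible body text of a page."""

    def __init__(self):
        super().__init__()
        self.harvested = {}   # tagged metadata already present in the header
        self.body_text = []   # resource content, mined during extraction
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and attr.get("name") and attr.get("content"):
            self.harvested[attr["name"].lower()] = attr["content"]
        elif tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_body:
            self.body_text.append(data)


def harvest(html):
    """Harvesting: collect metadata that is already tagged in the resource."""
    parser = PageParser()
    parser.feed(html)
    return parser.harvested


def extract_keywords(html, top_n=2):
    """Extraction: mine the content itself (naive word-frequency indexing)."""
    parser = PageParser()
    parser.feed(html)
    words = re.findall(r"[a-z]+", " ".join(parser.body_text).lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


print(harvest(SAMPLE))
print(extract_keywords(SAMPLE))
```

Harvesting only reads what an author or tool has already tagged, while extraction has to derive terms from the content, which is why its output depends heavily on the indexing technique chosen.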




Why choose automatic metadata generation
over manually created metadata?


  Advantages:
    Efficiency
    Cost
    Consistency




Automatic Metadata Generation: Concepts and Examples

Metadata extraction.
  Concept: The process of automatically pulling (extracting) metadata from a
  resource’s content. Resource content is mined to produce structured
  (“labeled”) metadata for object representation.
  Example: Metadata extraction for a Web page involves extracting metadata
  from the resource’s content that is displayed via a Web browser.

Metadata harvesting.
  Concept: The process of automatically collecting resource metadata already
  embedded in or associated with a resource. The harvested metadata is
  originally produced by humans or by fully or semiautomatic processes
  supported by software.
  Example: Metadata harvested from a Web page is found in the “header” source
  code of an HTML (or XHTML) resource (e.g., “Keywords” META tags). Metadata
  for a Microsoft WORD document is found under file properties (e.g., “Type of
  file,” which is automatically generated, and “Keywords,” which can be added
  by a resource author).

Fully-automatic metadata generation.
  Concept: Complete (or total) reliance on automatic processes to create
  metadata.
  Example: Web editing software (e.g., Macromedia’s Dreamweaver and
  Microsoft’s FrontPage) and selected document software (e.g., Microsoft WORD
  and Acrobat) automatically produce metadata at the time a resource is
  created or updated (e.g., “Date of creation” or “Date modified”) without
  human intervention.

Semi-automatic metadata generation.
  Concept: Partial reliance on software to create metadata; a combination of
  fully-automatic and human processes to create metadata.
  Examples: (1) Fully-automatic techniques are used to generate metadata
  (e.g., “Keywords”) as a first pass, and software then presents the metadata
  to a person, who may manually edit the metadata. (2) Software may present a
  person (e.g., resource author or Web architect) with a “template” that
  guides the manual input of metadata, and then automatically converts the
  metadata to appropriate encoding (e.g., XML tags). The software may even
  automatically embed metadata in a resource.

    Greenberg (2005), p. 25
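
The semi-automatic workflow described above (an automatic first pass, human editing, then automatic conversion to XML) might look roughly like the following sketch. The function names `merge_record` and `encode_as_xml`, the wrapper element `metadata`, and the choice of Dublin Core element names are assumptions for illustration; they are not taken from any particular application.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace; using DC here is an assumption for the example.
DC_NS = "http://purl.org/dc/elements/1.1/"


def merge_record(auto_values, human_edits):
    """Combine an automatic first pass with a person's edits.

    Values entered by the person override automatic ones; fields the
    person left empty keep the automatically generated value.
    """
    record = dict(auto_values)
    record.update({name: value for name, value in human_edits.items() if value})
    return record


def encode_as_xml(record):
    """Automatically convert the reviewed record to XML tags."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# First pass produced by software (values are made up for the example).
auto_values = {"date": "2012-04-26", "subject": "metadata"}
# Values typed into the "template" by a resource author.
human_edits = {"subject": "automatic metadata generation", "title": "Sample resource"}

reviewed = merge_record(auto_values, human_edits)
print(encode_as_xml(reviewed))
```

Keeping the merge step separate from the encoding step mirrors the division of labor in the table: the person only judges values, and the software handles the markup.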
The AMeGA (Automatic Metadata Generation Applications) Project

 Created to “identify and recommend functionalities for
  automatic metadata generation applications”

 Discusses the current state of automatic metadata generation
  applications

    Problem areas

 Conducted a survey of metadata experts

 Suggests functionalities that future applications should
  incorporate

Found at:
  http://www.loc.gov/catdir/bibcontrol/lc_amega_final_report.pdf

 Problems with current automatic metadata applications:


    They do not support standard bibliographic functions and element
     qualifications

    Sophisticated automatic indexing algorithms have not been
     incorporated into metadata applications

    Automatic metadata applications are developed separately from one
     another

    There are no standards for creating automatic metadata generation
     applications




The purpose of the survey conducted by AMeGA was to:
       Learn what libraries are currently doing for
        metadata creation
       See whether they are aware of current automatic metadata generation
        applications
       See which developments they would most like to see in
        metadata creation

Survey participants: 217 completed the survey
75.2% of participants had three or more years of cataloging and/or indexing experience

Participant roles:
 •29.5% administrators/executives
 •28.3% catalogers/metadata librarians
 •Remaining participants divided among 8 other categories

Participant organizations:
 •40.7% Academic library
 •13.4% Government agency/department
 •12.8% Academic community (not the library)
 •11.6% Government library
 •9.3% Non-profit organization
 •8.1% Corporation/company
 •1.2% Public library
 •0.1% Corporate library
 •2.3% Other
 Top 4 metadata standards used in the libraries where participants worked: MARC, DC
  simple, DC qualified, and EAD.

 Top 4 metadata standards used in the non-library organizations where participants worked: DC
  simple, DC qualified, MARC, and DC application profile.

 94 organizations were using 1 metadata system
  55 organizations were using 2 metadata systems
  22 organizations were using 3 metadata systems
  6 organizations were using 4 metadata systems
  4 organizations were using 5 metadata systems
  2 organizations were using 6 metadata systems
  1 organization was using 7 metadata systems
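
The counts above can be summarized with a little arithmetic. The sketch below is only a convenience calculation on the numbers listed on this slide; the derived totals and averages are not figures quoted from the report, and the variable names are illustrative.

```python
# Number of metadata systems in use -> number of organizations (from the slide).
systems_per_org = {1: 94, 2: 55, 3: 22, 4: 6, 5: 4, 6: 2, 7: 1}

# Organizations that reported a count.
total_orgs = sum(systems_per_org.values())

# Total systems in use across all of those organizations.
total_systems = sum(n * count for n, count in systems_per_org.items())

# Average number of systems per organization.
mean_systems = total_systems / total_orgs

# Share of organizations relying on a single system.
share_single = systems_per_org[1] / total_orgs

print(f"organizations: {total_orgs}")
print(f"systems in use: {total_systems}")
print(f"mean systems per organization: {mean_systems:.2f}")
print(f"share using one system: {share_single:.1%}")
```

On these counts, 184 organizations account for 333 systems, an average of about 1.81 per organization, and roughly 51% of them use a single system.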

 The most commonly used metadata generation systems (most used first):
  Custom/in-house
  CONTENTdm
  Endeavor/Voyager
  OCLC/Innovative Interfaces
  OCLC/Connexion
  Microsoft Access
  XMetaL
  NoteTab (or similar text editor)
  XMLSpy
  DSpace
  (etc.)


     Greenberg (2005), p. 24
 Survey participants were asked a series of experience or opinion
  questions regarding the automatic metadata generation of digital
  document-like objects using the Dublin Core Metadata Element Set.

 Participants experienced or predicted the greatest accuracy for technical
  metadata (ID, language, format).

 Less accuracy was predicted for subject and description, since they require
  intellectual judgment.

 Participants were divided when asked whether they would devote a “moderate”
  amount of research resources to intellectual metadata (subject,
  description) or to complete automation of physical metadata (ID, format,
  language).

 A majority of participants believed that research on generating metadata for
  nontextual and foreign-language material is important and valuable.

 70% of participants would like applications to run automatic algorithms
  first and allow human evaluation and editing afterwards.

 Most participants would also want to be able to incorporate subject
  schemes, content creation guidelines, and cataloging and metadata examples
  into metadata generation applications.


 Based on the results of the survey, AMeGA created a list of
  functionalities needed in automatic metadata generation
  applications:

    The system should allow profiles to be configured before metadata
     generation

    The system should automatically identify and collect any
     metadata associated with a resource

    The system should enhance and refine manually generated and
     automatically generated metadata

    The system should automatically evaluate the quality of
     metadata and provide a rating score

    The system should be able to create metadata for nontextual
     resources
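
As one possible reading of the "rating score" functionality, the sketch below rates a record by completeness: the fraction of expected elements that actually carry a value. This is an assumption made for illustration only. The slides do not say how quality should be scored, and the element list, the function names, and the sample record are all hypothetical.

```python
# Hypothetical profile: the elements an application expects every record to have.
REQUIRED_ELEMENTS = (
    "title", "creator", "subject", "description",
    "date", "format", "language", "identifier",
)


def completeness_score(record, required=REQUIRED_ELEMENTS):
    """Return the fraction (0.0 to 1.0) of required elements with a non-blank value."""
    filled = sum(1 for name in required if str(record.get(name, "")).strip())
    return filled / len(required)


def missing_elements(record, required=REQUIRED_ELEMENTS):
    """List the required elements a person would still need to supply."""
    return [name for name in required if not str(record.get(name, "")).strip()]


# Made-up record: technical elements are filled in, intellectual ones are not.
record = {
    "title": "Sample resource",
    "creator": "Example Author",
    "date": "2012-04-26",
    "format": "text/html",
    "language": "en",
    "identifier": "example-001",
    "subject": "",
}

print(completeness_score(record))
print(missing_elements(record))
```

Completeness is only one dimension of quality; it says nothing about whether the values are accurate, which is where the survey participants expected human evaluation to remain necessary.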


Conclusion

 Experimental researchers and metadata experts need to work
  together on developing applications.

 Application standards need to be created.

 Much more funding and research need to be devoted to
  automatic metadata generation.

 The priority now is to develop metadata generation
  applications that automatically identify and collect
  metadata, aid human metadata generation, enhance
  previously created metadata, and evaluate the quality of
  metadata.




DCMI (2008). Dublin Core Metadata Initiative: Scorpion. Retrieved from
        http://www.dublincore.org/tools/tools/tool-11.shtml

Greenberg, J. (2003). Metadata Generation: Processes, People and Tools. Bulletin of the
        American Society for Information Science and Technology, 29(2).
        Retrieved from http://www.asis.org/Bulletin/Dec-02/greenberg.html

Greenberg, J., Spurgin, K., & Crystal, A. (2005). Final Report for the AMeGA (Automatic Metadata
        Generation Applications) Project. Retrieved from
        http://www.loc.gov/catdir/bibcontrol/lc_amega_final_report.pdf

Greenberg, J., Spurgin, K., & Crystal, A. (2006). Functionalities for automatic metadata generation
        applications: a survey of metadata experts’ opinions. International Journal of Metadata,
        Semantics and Ontologies, 1(1), 3-20.

Ojokoh, B., Adewale, O., & Falaki, S. (2009). Automated document metadata extraction. Journal of
        Information Science, 35(5), 563-570.

Park, J., & Lu, C. (2009). Application of semi-automatic metadata generation in libraries: Types,
        tools, and techniques. Library & Information Science Research, 31(4), 225-231.

Shafer, K. E. (2001). Automatic Subject Assignment via the Scorpion System. Journal of
        Library Administration, 34(1/2), 187.

Shafer, K. E. (2001). Evaluating Scorpion Results. Journal of Library Administration, 34(3/4), 237.

Su, S. T., Long, Y., & Cromwell, D. E. (2002). E2M: Automatic Generation of MARC-Formatted Metadata by
        Crawling E-Publications. Information Technology & Libraries, 21(4), 171-180.



Thank you!



   For any questions or concerns, please contact me at:
                    hachilde@uncg.edu

                         _________



It’s been a wonderful class with everyone! Good luck in all
    of your future endeavors! I hope to see you all around!




                                                          13

Mais conteúdo relacionado

Semelhante a Automatic metadata generation

MetadataTheory: Metadata Tools (7th of 10)
MetadataTheory: Metadata Tools (7th of 10)MetadataTheory: Metadata Tools (7th of 10)
MetadataTheory: Metadata Tools (7th of 10)Nikos Palavitsinis, PhD
 
Open Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache AtlasOpen Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache AtlasDataWorks Summit
 
LIS688_Group1
LIS688_Group1 LIS688_Group1
LIS688_Group1 e_chae
 
Apache atlas sydney 2017-v4
Apache atlas   sydney 2017-v4Apache atlas   sydney 2017-v4
Apache atlas sydney 2017-v4Nigel Jones
 
Metadata-powered dissemination of content
Metadata-powered dissemination of contentMetadata-powered dissemination of content
Metadata-powered dissemination of contentNikos Manouselis
 
Metadata: Towards Machine-Enabled Intelligence
Metadata: Towards Machine-Enabled IntelligenceMetadata: Towards Machine-Enabled Intelligence
Metadata: Towards Machine-Enabled Intelligencedannyijwest
 
Metadata: Towards Machine-Enabled Intelligence
Metadata: Towards Machine-Enabled Intelligence               Metadata: Towards Machine-Enabled Intelligence
Metadata: Towards Machine-Enabled Intelligence dannyijwest
 
The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...DataWorks Summit
 
Automated metadata creation - Possibilities and pitfalls
Automated metadata creation - Possibilities and pitfallsAutomated metadata creation - Possibilities and pitfalls
Automated metadata creation - Possibilities and pitfallsNASIG
 
Opinioz_intern
Opinioz_internOpinioz_intern
Opinioz_internSai Ganesh
 
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...DataWorks Summit
 
Approaches to automated metadata extraction : FixRep Project
Approaches to automated metadata extraction : FixRep ProjectApproaches to automated metadata extraction : FixRep Project
Approaches to automated metadata extraction : FixRep ProjectUKOLN (dev), University of Bath
 
Bioschemas Workshop
Bioschemas WorkshopBioschemas Workshop
Bioschemas WorkshopNiall Beard
 
Search Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By DesignSearch Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By DesignMarianne Sweeny
 
Embracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital LibrariesEmbracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital LibrariesAkhmad Riza Faizal
 
3 Understanding Search
3 Understanding Search3 Understanding Search
3 Understanding Searchmasiclat
 
Extensis DAM Forum at MCN
Extensis DAM Forum at MCNExtensis DAM Forum at MCN
Extensis DAM Forum at MCNesmithextensis
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebAmit Sheth
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebAmit Sheth
 

Semelhante a Automatic metadata generation (20)

MetadataTheory: Metadata Tools (7th of 10)
MetadataTheory: Metadata Tools (7th of 10)MetadataTheory: Metadata Tools (7th of 10)
MetadataTheory: Metadata Tools (7th of 10)
 
Open Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache AtlasOpen Metadata and Governance with Apache Atlas
Open Metadata and Governance with Apache Atlas
 
LIS688_Group1
LIS688_Group1 LIS688_Group1
LIS688_Group1
 
Apache atlas sydney 2017-v4
Apache atlas   sydney 2017-v4Apache atlas   sydney 2017-v4
Apache atlas sydney 2017-v4
 
Metadata-powered dissemination of content
Metadata-powered dissemination of contentMetadata-powered dissemination of content
Metadata-powered dissemination of content
 
Metadata: Towards Machine-Enabled Intelligence
Metadata: Towards Machine-Enabled IntelligenceMetadata: Towards Machine-Enabled Intelligence
Metadata: Towards Machine-Enabled Intelligence
 
Metadata: Towards Machine-Enabled Intelligence
Metadata: Towards Machine-Enabled Intelligence               Metadata: Towards Machine-Enabled Intelligence
Metadata: Towards Machine-Enabled Intelligence
 
The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...
 
Automated metadata creation - Possibilities and pitfalls
Automated metadata creation - Possibilities and pitfallsAutomated metadata creation - Possibilities and pitfalls
Automated metadata creation - Possibilities and pitfalls
 
Opinioz_intern
Opinioz_internOpinioz_intern
Opinioz_intern
 
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
 
Approaches to automated metadata extraction : FixRep Project
Approaches to automated metadata extraction : FixRep ProjectApproaches to automated metadata extraction : FixRep Project
Approaches to automated metadata extraction : FixRep Project
 
Bioschemas Workshop
Bioschemas WorkshopBioschemas Workshop
Bioschemas Workshop
 
Search Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By DesignSearch Solutions 2011: Successful Enterprise Search By Design
Search Solutions 2011: Successful Enterprise Search By Design
 
Embracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital LibrariesEmbracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital Libraries
 
3 Understanding Search
3 Understanding Search3 Understanding Search
3 Understanding Search
 
Extensis DAM Forum at MCN
Extensis DAM Forum at MCNExtensis DAM Forum at MCN
Extensis DAM Forum at MCN
 
Web mining
Web miningWeb mining
Web mining
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic Web
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic Web
 

Último

Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jisc
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibitjbellavia9
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Pooja Bhuva
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...Nguyen Thanh Tu Collection
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxUmeshTimilsina1
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - Englishneillewis46
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 

Último (20)

Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 

Automatic metadata generation

  • 1. Haley Childers LIS 688-04 April 26, 2012 Professor Oguz 1
  • 2. Automatic Metadata Generation  Is a machine process of metadata extraction and metadata harvesting.  Metadata extraction uses automatic indexing techniques to search and obtain resource content and produce structured metadata according to metadata standards  Metadata harvesting is completed by machine to collect tagged metadata created by machine or humans. 2
  • 3. Why choose automatic metadata generation over manually created metadata? Advantages: Efficiency Cost Consistency 3
  • 4. Automatic Metadata Generation Concept Example(s) Metadata extraction. The process of automatically Metadata extraction for a Web page involves extracting pulling (extracting) metadata from a resource’s metadata from the resource's content that is displayed content. Resource content is mined to produce via a Web browser. structured (“labeled”) metadata for object representation. Metadata harvesting. The process of automatically Metadata harvested from a Web page is found in the collecting resource metadata already embedded in or "header” source code of an HTML (or XHTML) resource associated with a resource. The harvested metadata is (e.g., "Keywords" META tags). Metadata for a Microsoft originally produced by humans or by fully or WORD document is found under file properties semiautomatic processes supported by software. (e.g., "Type of file," which is automatically generated, and "Keywords," which can be added by a resource author). Fully-automatic metadata generation. Web editing software (e.g., Macromedia’s Dreamweaver Complete (or total) reliance on automatic processes to and Microsoft’s FrontPage) and selected document create metadata. software (e.g., Microsoft WORD and Acrobat) automatically produce metadata at the time a resource is created or updated (e.g., “Date of creation" or "Date modified") without human intervention. Semi-automatic metadata generation. (1) Fully-automatic techniques are used to generate Partial reliance on software to create metadata; a metadata (e.g.,"Keywords") as a first pass, and combination of fully-automatic and human processes software then presents the metadata to a person, who to create metadata. may manually edit the metadata. (2)Software may present a person (e.g., resource author or Web architect) with a “template” that guides the manual input of metadata, and then automatically converts the metadata to appropriate encoding (e.g., XML tags). The software may even automatically embed metadata in a resource. 
4 Greenberg (2005), p. 25
  • 5.  Created to “identify and recommend functionalities for automatic metadata generation applications”  Discusses current state of automatic metadata generation applications  Problem areas  Conducted survey of metadata experts  Suggests functionalities that future applications should incorporate Found at: http://www.loc.gov/catdir/bibcontrol/lc_amega_final_report.pdf 5
  • 6.  Problems with current automatic metadata applications:  Do not support standard bibliographic functions and element qualifications  Sophisticated automatic indexing algorithms have not been incorporated to metadata applications  Automatic metadata applications are developed separate from each other  There is no standards for creating automatic metadata generation applications 6
  • 7. The purpose of the survey conducted by AMeGA was to:  Get an idea of what current libraries are currently doing for metadata creation  See if they are aware of current automatic metadata generation applications  See what developments they would most like to see happen for metadata creation Survey participants: 217 completed the survey 75.2% of participants had three or more years of cataloging and/or indexing experience •29.5% were administrators/executives •40.7% of participants were from Academic •28.3% catalogers/metadata librarians libraries •Remaining percentages divided by 8 •13.4% from Government categories agency/department •12.8% Academic community (not the library) •11.6% Government library •9.3% Non-profit organization •8.1% Cooperation/company •1.2% Public library •0.1% Corporate library 7 •2.3% Other
  • 8.
      • Top 4 metadata standards used in the libraries where participants worked: MARC, DC simple, DC qualified, and EAD.
      • Top 4 metadata standards used in the non-library organizations where participants worked: DC simple, DC qualified, MARC, and DC application profile.
      • Number of metadata systems in use:
          94 organizations were using 1 metadata system
          55 organizations were using 2 metadata systems
          22 organizations were using 3 metadata systems
          6 organizations were using 4 metadata systems
          4 organizations were using 5 metadata systems
          2 organizations were using 6 metadata systems
          1 organization was using 7 metadata systems
      • The most common metadata generation systems in use (in order of most used): Custom/in-house, CONTENTdm, Endeavor/Voyager, OCLC/Innovative Interfaces, OCLC/Connexion, Microsoft Access, XMetaL, NoteTab (or similar text editor), XML Spy, DSpace (etc.)
    Source: Greenberg (2005), p. 24
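The counts of metadata systems per organization can be summarized with a little arithmetic. The sketch below uses only the figures listed on this slide; the variable names are illustrative, and the derived totals are computed here rather than taken from the report.

```python
# Number of metadata systems in use -> number of organizations (from the slide).
systems_per_org = {1: 94, 2: 55, 3: 22, 4: 6, 5: 4, 6: 2, 7: 1}

# Organizations accounted for in this breakdown.
total_orgs = sum(systems_per_org.values())

# Total system installations, weighting each count by the number of systems.
total_systems = sum(n * count for n, count in systems_per_org.items())

share_single = systems_per_org[1] / total_orgs   # fraction using exactly one system
mean_systems = total_systems / total_orgs        # average systems per organization

print(f"organizations counted: {total_orgs}")
print(f"using a single system: {share_single:.1%}")
print(f"mean systems per organization: {mean_systems:.2f}")
```

On these figures, roughly half of the organizations counted rely on a single system and the rest juggle two or more.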
  • 9.
      • Survey participants were asked a series of experience or opinion questions about the automatic generation of metadata for digital document-like objects using the Dublin Core Metadata Element Set.
      • Participants experienced or predicted the greatest accuracy for technical metadata (ID, language, format).
      • Less accuracy was predicted for subject and description, since these require intellectual judgment.
      • When asked whether they would devote a "moderate" amount of resources to research on intellectual metadata (subject, description) or to complete automation of physical metadata (ID, format, language), participants were divided.
      • A majority of participants believed that research on generating metadata for nontextual and foreign-language material is important and valuable.
      • 70% of participants would like applications to run automatic algorithms and then allow human evaluation and editing.
      • Most participants would also like to be able to incorporate subject schemes, content creation guidelines, and cataloging and metadata examples into metadata generation applications.
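The workflow most participants preferred (automatic algorithms first, human evaluation and editing afterwards) might look roughly like the following Python sketch. Everything here is an assumption made for illustration: the function names, the shape of the `resource` dictionary, and the wrapping `metadata` element are hypothetical. Only the Dublin Core element names and namespace URI come from the standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def first_pass(resource):
    """Automatic step: technical metadata that needs no intellectual judgment."""
    return {
        "identifier": resource["url"],
        "format": resource["mime_type"],
        "language": resource["language"],
    }


def apply_human_edits(record, edits):
    """Human step: a cataloger adds or overrides elements such as subject."""
    merged = dict(record)
    merged.update(edits)
    return merged


def to_dc_xml(record):
    """Encoding step: convert the reviewed record to Dublin Core XML tags."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for element, value in sorted(record.items()):
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical resource description used only for this example.
resource = {
    "url": "http://example.org/report.html",
    "mime_type": "text/html",
    "language": "en",
}

draft = first_pass(resource)
reviewed = apply_human_edits(draft, {"subject": "Metadata", "language": "en-US"})
print(to_dc_xml(reviewed))
```

The split mirrors the survey result: the machine fills in ID, format, and language, and the person supplies or corrects the elements that call for judgment.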
  • 10. Based on the results of the survey, AMeGA created a list of functionalities needed in automatic metadata generation applications:
      • The system should be able to configure profiles before metadata generation
      • The system should automatically identify and collect any metadata associated with a resource
      • The system should enhance and refine manually generated and automatically generated metadata
      • The system should automatically evaluate the quality of metadata and provide a rating score
      • The system should be used to create metadata for nontextual resources
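As one possible reading of the "rating score" functionality, the sketch below scores a record by completeness against the fifteen Dublin Core elements. This is an assumption made for illustration, not the measure AMeGA proposes: the slides do not say how quality should be scored, and the function name `completeness_score` is hypothetical.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def completeness_score(record):
    """Fraction of Dublin Core elements that carry a non-empty value.

    Assumption: completeness is used as a crude proxy for quality; it says
    nothing about whether the values themselves are accurate.
    """
    filled = sum(1 for element in DC_ELEMENTS if str(record.get(element, "")).strip())
    return filled / len(DC_ELEMENTS)


# Hypothetical record: three usable values and one blank field.
record = {
    "title": "Automatic Metadata Generation",
    "format": "text/html",
    "language": "en",
    "subject": "   ",
}

print(f"completeness: {completeness_score(record):.0%}")
```

A real application would likely combine several signals (completeness, conformance to a profile, consistency with controlled vocabularies), but a single number like this is enough to flag records for human review.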
  • 11. Conclusion
      • Experimental researchers and metadata experts need to work together on developing applications.
      • Application standards need to be created.
      • Much more funding and research need to be devoted to automatic metadata generation.
      • The most important next step is to develop metadata generation applications that automatically identify and collect metadata, aid human metadata generation, enhance previously created metadata, and evaluate the quality of metadata.
  • 12. References
      DCMI (2008). Dublin Core Metadata Initiative: Scorpion. Retrieved from http://www.dublincore.org/tools/tools/tool-11.shtml
      Greenberg, J. (2003). Metadata Generation: Processes, People and Tools. Bulletin of the American Society for Information Science and Technology, 29(2). Retrieved from http://www.asis.org/Bulletin/Dec-02/greenberg.html
      Greenberg, J., Spurgin, K., & Crystal, A. (2005). Final Report for the AMeGA (Automatic Metadata Generation Applications) Project. Retrieved from http://www.loc.gov/catdir/bibcontrol/lc_amega_final_report.pdf
      Greenberg, J., Spurgin, K., & Crystal, A. (2006). Functionalities for automatic metadata generation applications: a survey of metadata experts' opinions. International Journal of Metadata, Semantics and Ontologies, 1(1), 3-20.
      Ojokoh, B., Adewale, O., & Falaki, S. (2009). Automated document metadata extraction. Journal of Information Science, 35(5), 563-570.
      Park, J., & Lu, C. (2009). Application of semi-automatic metadata generation in libraries: Types, tools, and techniques. Library & Information Science Research, 31(4), 225-231.
      Shafer, K. E. (2001). Automatic Subject Assignment via the Scorpion System. Journal of Library Administration, 34(1/2), 187.
      Shafer, K. E. (2001). Evaluating Scorpion Results. Journal of Library Administration, 34(3/4), 237.
      Su, S. T., Long, Y., & Cromwell, D. E. (2002). E2M: Automatic Generation of MARC-Formatted Metadata by Crawling E-Publications. Information Technology & Libraries, 21(4), 171-180.
  • 13. Thank you! For any questions or concerns, please contact me at: hachilde@uncg.edu
    It's been a wonderful class with everyone! Good luck in all of your future endeavors! I hope to see you all around!