Taxonomy Assessments -
                                 Part Two
                                 February 9, 2012




                                  Access Innovations, Inc.
             Leveraging Your Content Semantically
                                             Jay Ven Eman, Ph.D., CEO
                                                  j_ven_eman@accessinn.com
                                                      www.accessinn.com
                                                     www.dataharmony.com
                                                        +1.505.998.0800
                                                       Albuquerque, NM




© 2012. Access Innovations, Inc. All rights reserved.
Indexing
     Subject term assignment
      Permanent metadata attached to the indexed object
     Used for retrieval and evaluation
     Processes
      •     Manual
            •     Publisher
            •     3rd party aggregators
            •     Authors
      •     Automated methods


Integration / workflow
      Content sources
      •     Author submission system
      •     Books
      •     Conference proceedings
      •     Web sites
      •     Etc.
      Classification System
      •     Data Harmony MAIstro Server
      •     Thesaurus Master
      •     M.A.I.
      Destinations
      •     Content Repository “A” or intermediate processes
      •     Content Repository “B”, etc.
      Integration via APIs, client/server, web services, HTTP-TCP/IP
Select the document collection
                                                                 CMS



                               Please select the database and the document directory to load




CMS




Sample unstructured document




Run the documents through a metadata extraction
process to create well-formed, rich XML




                                                       • Automatic (per doc template)
                                                       • E.g. Dublin Core Metadata
                                                       • Bibliographic citation
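
As a rough illustration of the extraction step, the sketch below turns a few already-extracted fields into well-formed XML using Dublin Core element names. The field values, the `record` wrapper element, and the helper name `to_dublin_core` are assumptions made for this example; the slide does not show the actual template or output format.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

def to_dublin_core(fields):
    """Build a <record> element (hypothetical wrapper) with Dublin Core children."""
    record = ET.Element("record")
    for name in ("title", "creator", "date"):
        value = fields.get(name)
        if value:  # skip fields the extraction step did not find
            child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            child.text = value
    return record

# Hypothetical values standing in for the output of a per-template extractor.
fields = {"title": "Solving the Challenge", "creator": "Ven Eman, Jay", "date": "09-14-11"}
record = to_dublin_core(fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Parsing `xml_text` back with `ET.fromstring` is a quick check that the output is well-formed.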




Automatically add the taxonomy
terms




                                                    Entity extraction: People,
                                                      Places, Things
                                                    Conceptual indexing: using the
                                                      taxonomy
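
Conceptual indexing against a taxonomy can be pictured, in its simplest form, as matching known term variants in the text and recording the preferred term. The sketch below assumes a tiny hand-made taxonomy and plain whole-word matching; it is not how M.A.I. works internally, and it does not attempt entity extraction.

```python
import re

# Hypothetical mini-taxonomy: preferred term -> variants that trigger it.
TAXONOMY = {
    "Indexing": ["indexing", "subject term assignment"],
    "Thesauri": ["thesaurus", "thesauri"],
    "Classification": ["classification", "classifying"],
}

def assign_terms(text, taxonomy=TAXONOMY):
    """Return preferred terms whose variants occur as whole words in the text."""
    lowered = text.lower()
    assigned = []
    for preferred, variants in taxonomy.items():
        for variant in variants:
            if re.search(r"\b" + re.escape(variant) + r"\b", lowered):
                assigned.append(preferred)
                break  # one matching variant is enough for this term
    return assigned

sample = "The process of indexing a content object begins with a thesaurus."
print(assign_terms(sample))  # ['Indexing', 'Thesauri']
```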




Classification Process or Assigned Indexing
      Unstructured
       09-14-11
       “Solving the Challenge”
       By Jay Ven Eman
       The process of indexing a content object begins with…
      Structured
       <Anchor><Date>09-14-11</Date>
       <TI>“Solving the Challenge”</TI>
       <BLH>By</BLH>
       <Author>
       <AU_FN>Jay</AU_FN>
       <AU_MI></AU_MI>
       <AU_LN>Ven Eman</AU_LN>
       </Author>
       <Body>The process of indexing a content object begins with…</Body>
       <Subject>Indexing</Subject>
       <Subject>Thesauri</Subject>
       <Subject>Standards</Subject>
       <Subject>Classification</Subject>
       </Anchor>
      Classification System: Data Harmony MAIstro Server (Thesaurus Master, M.A.I.)
      Content Repository, e.g. Database
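
Once the record is structured, loading it into a repository is mostly a matter of reading the tagged fields. The sketch below parses the sample record from this slide and flattens it into a dictionary; the dictionary keys and the helper name `to_row` are assumptions, and a real repository loader would depend on the target database.

```python
import xml.etree.ElementTree as ET

# The structured record from the slide, reproduced as a string.
RECORD = """<Anchor><Date>09-14-11</Date>
<TI>“Solving the Challenge”</TI>
<BLH>By</BLH>
<Author><AU_FN>Jay</AU_FN><AU_MI></AU_MI><AU_LN>Ven Eman</AU_LN></Author>
<Body>The process of indexing a content object begins with…</Body>
<Subject>Indexing</Subject>
<Subject>Thesauri</Subject>
<Subject>Standards</Subject>
<Subject>Classification</Subject>
</Anchor>"""

def to_row(xml_text):
    """Flatten one record into a dict (hypothetical shape for a database row)."""
    root = ET.fromstring(xml_text)
    first = root.findtext("Author/AU_FN", default="")
    last = root.findtext("Author/AU_LN", default="")
    return {
        "date": root.findtext("Date"),
        "title": root.findtext("TI"),
        "author": f"{first} {last}".strip(),
        "subjects": [s.text for s in root.findall("Subject")],
    }

row = to_row(RECORD)
print(row["author"], row["subjects"])
```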
Indexing
     Indexing measures
      •     Indexing experts
      •     Subject matter experts (SME)
      •     Hits, misses, & noise
      •     85% hits
     In conjunction with taxonomy measures
      •     Over & under used terms
      •     Over & under indexed content
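
The taxonomy measures above can be approximated with simple counts. The sketch below assumes an in-memory index mapping document ids to assigned terms; the sample data, the thresholds (`max_share`, `min_terms`, `max_terms`), and the function names are all hypothetical and would need tuning for a real collection.

```python
from collections import Counter

# Hypothetical index: document id -> assigned taxonomy terms.
index = {
    "doc1": ["Indexing", "Thesauri"],
    "doc2": ["Indexing"],
    "doc3": ["Indexing", "Standards", "Classification", "Thesauri"],
    "doc4": [],
}
taxonomy_terms = ["Indexing", "Thesauri", "Standards", "Classification", "Ontologies"]

def term_usage(index, taxonomy_terms):
    """Count how many times each taxonomy term was assigned."""
    counts = Counter(term for terms in index.values() for term in terms)
    return {term: counts[term] for term in taxonomy_terms}

def flag_terms(index, taxonomy_terms, max_share=0.5):
    """Under-used: never assigned. Over-used: on more than max_share of documents."""
    usage = term_usage(index, taxonomy_terms)
    n_docs = max(len(index), 1)
    under_used = [t for t, c in usage.items() if c == 0]
    over_used = [t for t, c in usage.items() if c / n_docs > max_share]
    return under_used, over_used

def flag_documents(index, min_terms=1, max_terms=3):
    """Under-indexed: fewer than min_terms. Over-indexed: more than max_terms."""
    under_indexed = [d for d, terms in index.items() if len(terms) < min_terms]
    over_indexed = [d for d, terms in index.items() if len(terms) > max_terms]
    return under_indexed, over_indexed

print(flag_terms(index, taxonomy_terms))  # (['Ontologies'], ['Indexing'])
print(flag_documents(index))              # (['doc4'], ['doc3'])
```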



Indexing & Search Metrics
     Hit, Miss, Noise
     Subjective
      •     Relevance
      •     Aboutness
     Statistical
      •     Precision
      •     Recall
      •     Level of effort
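
One common way to connect the two groups of metrics is to treat the human indexer's terms as the reference set, so that hits act as true positives, noise as false positives, and misses as false negatives. That mapping is an assumption of this sketch rather than something stated on the slide, and the counts in the usage line are made up.

```python
def precision_recall(hits, misses, noise):
    """Precision = hits / (hits + noise); recall = hits / (hits + misses)."""
    assigned = hits + noise    # everything the system assigned
    expected = hits + misses   # everything the human indexer would assign
    precision = hits / assigned if assigned else 0.0
    recall = hits / expected if expected else 0.0
    return precision, recall

# Hypothetical counts from a review of system-assigned terms.
p, r = precision_recall(hits=85, misses=15, noise=10)
print(f"precision={p:.3f} recall={r:.3f}")  # precision=0.895 recall=0.850
```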



Hit, Miss, Noise
     Hit – exactly what a human indexer would use
     Miss – human indexer would use, but system
      did not assign
     Noise – system assigned, but human did not
      •     Relevant noise – could have been assigned
      •     Irrelevant noise – just plain wrong
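
The definitions above map directly onto set operations. In this sketch, `acceptable_terms` is a hypothetical list of terms a reviewer has marked as "could have been assigned"; deciding what goes on that list is an editorial judgment, not something the code can infer.

```python
def compare_indexing(human_terms, system_terms, acceptable_terms=()):
    """Classify system-assigned terms against a human indexer's terms."""
    human, system = set(human_terms), set(system_terms)
    noise = system - human                       # system assigned, human did not
    relevant_noise = noise & set(acceptable_terms)
    return {
        "hits": sorted(human & system),          # both agree
        "misses": sorted(human - system),        # human only
        "relevant_noise": sorted(relevant_noise),
        "irrelevant_noise": sorted(noise - relevant_noise),
    }

result = compare_indexing(
    human_terms=["Indexing", "Thesauri", "Standards"],
    system_terms=["Indexing", "Thesauri", "Classification", "Astronomy"],
    acceptable_terms=["Classification"],
)
print(result)
```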




Subjective
     Relevance
      •     Reflects how closely it matches the user's request
     “Aboutness”
      •     Reflects the topical match between the document
            content and the term
      •     How well the topic describes what the document is
            about
     Varies with level of conceptual terms vs. factual
      terms in the thesaurus




Indexing
     All content types & sources
      •     Inventory control
      •     Everything in, everything out
     Document types
      •     Articles
      •     Proceedings
      •     Corporate




Link to Community Resources
(Source: Helen Atkins, AACR)
      Journal Article on Topic A, linked to:
      •     Other Journal Articles on Topic A
      •     CME Activity on Topic A
      •     Upcoming Conference on Topic A
      •     Job Posting for Expert on Topic A
      •     Podcast Interview with Researcher Working on Topic A
      •     Author Networks, Social Networking, SME – Topic A
      •     Grant Available for Researchers Working on Topic A
Indexing with Data Harmony® M.A.I.™
     Rule base development
      •     80/20 rule
      •     Indexing objectives
     GUI
     Time-to-market
      •     Level of effort to build
      •     Level of effort to maintain
      •     Less than all other alternatives when
            indexing for high precision & recall


Updating Rule Base
     Automatic for matching rules when using
      Data Harmony MAIstro™
     80/20 rule
      Re-index when 5% to 10% of the taxonomy has changed;
      the ranges below are arbitrary:
      •     Monthly with small databases – 5k to 20k
      •     Quarterly with medium – 20k to 1 million
      •     Annual with large – greater than 1 million
     Depends on search software, too
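
The re-indexing guidance above can be written down as two small checks. The sketch assumes the database sizes are record counts, treats anything at or below 20k (including collections under 5k, which the slide does not mention) as "small", and uses the lower 5% bound as the default trigger; all of these are assumptions, and the slide itself calls the ranges arbitrary.

```python
def reindex_interval(record_count):
    """Suggested re-indexing cadence by database size (boundaries are assumptions)."""
    if record_count > 1_000_000:
        return "annual"
    if record_count > 20_000:
        return "quarterly"
    return "monthly"

def should_reindex(changed_terms, total_terms, threshold=0.05):
    """True once the share of changed taxonomy terms reaches the threshold."""
    if total_terms <= 0:
        raise ValueError("total_terms must be positive")
    return changed_terms / total_terms >= threshold

print(reindex_interval(10_000))     # monthly
print(reindex_interval(500_000))    # quarterly
print(should_reindex(60, 1_000))    # True
```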

NAMES




What’s in a name?
      Juliet: "What's in a name? That which we call a rose
       By any other name would smell as sweet."
      Romeo and Juliet (II, ii, 1-2)
Magnitude of the Problem:
Facebook - 700 Million Users Projected for 2011 (Open-First)




         700 Million Names

        How will your boss, peers,
        anyone ever find you?


What’s in a name?
      My name
      •     Jay Ven Eman
      •     Ven Eman, Jay
      •     <First_Name>Jay</First_Name>
      •     <Last_Name>Ven Eman</Last_Name>
      Name variants
      •     Jay Von Eman
      •     Jay Van Eman
      •     Jay van Eman
      •     Jay ven Eman
      •     Jay Veneman
      •     Jay Venema
      Aliases
      •     William Henry McCarty
      •     Henry Antrim
      •     William H. Bonney
      •     Billy the Kid
      National & Cultural Conventions
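
Spelling variants of a name can often be caught by normalizing the string and comparing similarity, while aliases cannot. The sketch below uses Python's standard `difflib` and `unicodedata` modules; the 0.85 threshold and the helper names are assumptions. An alias such as "Billy the Kid" shares no spelling with the person's other names, so it would need an explicit lookup table or authority file rather than string matching.

```python
import difflib
import unicodedata

def normalize(name):
    """Lowercase, strip accents, and reorder 'Last, First' to 'First Last'."""
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())

def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized names."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def likely_same(a, b, threshold=0.85):
    """Heuristic match; the threshold is an arbitrary starting point."""
    return similarity(a, b) >= threshold

canonical = "Jay Ven Eman"
for variant in ["Ven Eman, Jay", "Jay Von Eman", "Jay Veneman", "Billy the Kid"]:
    print(variant, round(similarity(canonical, variant), 2), likely_same(canonical, variant))
```

Matches above the threshold are candidates for editorial review, not confirmed identities.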
Names
     Computationally & editorially intense
     Author submissions
     Membership records & the like
     Industry initiatives – ORCID, VIVO
     Subject term disambiguation
     Inventory control basics apply here, too
     Difficulty level is high
      Constant maintenance needed


                                 Thank you! Questions?
© 2012. Access Innovations, Inc. All rights reserved.

Mais conteúdo relacionado

Semelhante a Taxonomy Assessments - Part Two

Taxonomies for Publishing
Taxonomies for PublishingTaxonomies for Publishing
Taxonomies for PublishingTSoholt
 
SharePoint Taxonomy and Metadata 11-19-09
SharePoint Taxonomy and Metadata 11-19-09SharePoint Taxonomy and Metadata 11-19-09
SharePoint Taxonomy and Metadata 11-19-09Stephanie Lemieux
 
“It’s not rocket science!” Applying CMS and semantic enrichment to transform...
“It’s not rocket science!”  Applying CMS and semantic enrichment to transform...“It’s not rocket science!”  Applying CMS and semantic enrichment to transform...
“It’s not rocket science!” Applying CMS and semantic enrichment to transform...Sarah Silveri, RSI Content Solutions
 
10 mistakes when moving to topic-based authoring
10 mistakes when moving to topic-based authoring10 mistakes when moving to topic-based authoring
10 mistakes when moving to topic-based authoringSharon Burton
 
Business Objects....is it LOV?
Business Objects....is it LOV?Business Objects....is it LOV?
Business Objects....is it LOV?Terry Smith
 
Don't Re-write Code to Get Better Analytics
Don't Re-write Code to Get Better AnalyticsDon't Re-write Code to Get Better Analytics
Don't Re-write Code to Get Better AnalyticsSplunk
 
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...Dr. Haxel Consult
 
Elsevier Smart Content LDR SemTech NYC Oct-17-2012
Elsevier Smart Content LDR SemTech NYC Oct-17-2012Elsevier Smart Content LDR SemTech NYC Oct-17-2012
Elsevier Smart Content LDR SemTech NYC Oct-17-2012Alan Yagoda
 
Why I teach Content Strategy in Information Architecture
Why I teach Content Strategy in Information ArchitectureWhy I teach Content Strategy in Information Architecture
Why I teach Content Strategy in Information ArchitectureMisty Weaver
 
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...Dr. Haxel Consult
 
Better front-end development in Atlassian plugins
Better front-end development in Atlassian pluginsBetter front-end development in Atlassian plugins
Better front-end development in Atlassian pluginsAtlassian
 
Taxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureTaxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureAccess Innovations, Inc.
 
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?TCUK Conference
 
Enforcing SharePoint Governance
Enforcing SharePoint GovernanceEnforcing SharePoint Governance
Enforcing SharePoint GovernanceRandy Williams
 
FatWire Tutorial For Site Studio Developers
FatWire Tutorial For Site Studio DevelopersFatWire Tutorial For Site Studio Developers
FatWire Tutorial For Site Studio DevelopersBrian Huff
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebAmit Sheth
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebAmit Sheth
 

Semelhante a Taxonomy Assessments - Part Two (20)

Taxonomies for Publishing
Taxonomies for PublishingTaxonomies for Publishing
Taxonomies for Publishing
 
SharePoint Taxonomy and Metadata 11-19-09
SharePoint Taxonomy and Metadata 11-19-09SharePoint Taxonomy and Metadata 11-19-09
SharePoint Taxonomy and Metadata 11-19-09
 
“It’s not rocket science!” Applying CMS and semantic enrichment to transform...
“It’s not rocket science!”  Applying CMS and semantic enrichment to transform...“It’s not rocket science!”  Applying CMS and semantic enrichment to transform...
“It’s not rocket science!” Applying CMS and semantic enrichment to transform...
 
10 mistakes when moving to topic-based authoring
10 mistakes when moving to topic-based authoring10 mistakes when moving to topic-based authoring
10 mistakes when moving to topic-based authoring
 
Business Objects....is it LOV?
Business Objects....is it LOV?Business Objects....is it LOV?
Business Objects....is it LOV?
 
Don't Re-write Code to Get Better Analytics
Don't Re-write Code to Get Better AnalyticsDon't Re-write Code to Get Better Analytics
Don't Re-write Code to Get Better Analytics
 
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
 
Elsevier Smart Content LDR SemTech NYC Oct-17-2012
Elsevier Smart Content LDR SemTech NYC Oct-17-2012Elsevier Smart Content LDR SemTech NYC Oct-17-2012
Elsevier Smart Content LDR SemTech NYC Oct-17-2012
 
Why I teach Content Strategy in Information Architecture
Why I teach Content Strategy in Information ArchitectureWhy I teach Content Strategy in Information Architecture
Why I teach Content Strategy in Information Architecture
 
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
AI-SDV 2020: Can There Be Profitable Revenue from an AI Deployment? The Upsid...
 
Better front-end development in Atlassian plugins
Better front-end development in Atlassian pluginsBetter front-end development in Atlassian plugins
Better front-end development in Atlassian plugins
 
Taxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureTaxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information Architecture
 
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
TCUK 2012, Nolwenn Kerzreho, Metadata: Why Should Technical Communicators Care?
 
(27.05) MOSSCA Invita - Búsqueda empresarial 2
(27.05) MOSSCA Invita - Búsqueda empresarial 2(27.05) MOSSCA Invita - Búsqueda empresarial 2
(27.05) MOSSCA Invita - Búsqueda empresarial 2
 
(28/05) MOSSCA Invita - Administración de Contenido Empresarial
(28/05) MOSSCA Invita - Administración de Contenido Empresarial(28/05) MOSSCA Invita - Administración de Contenido Empresarial
(28/05) MOSSCA Invita - Administración de Contenido Empresarial
 
Enforcing SharePoint Governance
Enforcing SharePoint GovernanceEnforcing SharePoint Governance
Enforcing SharePoint Governance
 
Alfresco content model
Alfresco content modelAlfresco content model
Alfresco content model
 
FatWire Tutorial For Site Studio Developers
FatWire Tutorial For Site Studio DevelopersFatWire Tutorial For Site Studio Developers
FatWire Tutorial For Site Studio Developers
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic Web
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic Web
 

Mais de Access Innovations, Inc.

Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy ResultsMaking AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy ResultsAccess Innovations, Inc.
 
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8Access Innovations, Inc.
 
Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)Access Innovations, Inc.
 
Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021Access Innovations, Inc.
 
Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)Access Innovations, Inc.
 
Tagging overview - Why Keywords Don't Cut It
Tagging overview  - Why Keywords Don't Cut ItTagging overview  - Why Keywords Don't Cut It
Tagging overview - Why Keywords Don't Cut ItAccess Innovations, Inc.
 
DHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository InteroperabilityDHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository InteroperabilityAccess Innovations, Inc.
 
DHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project FundedDHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project FundedAccess Innovations, Inc.
 

Mais de Access Innovations, Inc. (20)

Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy ResultsMaking AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
Making AI Behave: Using Knowledge Domains to Produce Useful, Trustworthy Results
 
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
ISO 25964-1Working Group ISO/TC 46/SC 9/WG 8
 
Smart submit
Smart submitSmart submit
Smart submit
 
Plos taxonomy beyond search dhug 2021
Plos taxonomy beyond search   dhug 2021Plos taxonomy beyond search   dhug 2021
Plos taxonomy beyond search dhug 2021
 
Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)Hindawi taxonomy and personalization 27.10 (1)
Hindawi taxonomy and personalization 27.10 (1)
 
Data harmonycloudpowerpointclientfacing
Data harmonycloudpowerpointclientfacingData harmonycloudpowerpointclientfacing
Data harmonycloudpowerpointclientfacing
 
Data harmony update 2021
Data harmony update 2021 Data harmony update 2021
Data harmony update 2021
 
Atypon dhug2021
Atypon dhug2021Atypon dhug2021
Atypon dhug2021
 
Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021Asco using ai-taxos-for meta-titles-february-2021
Asco using ai-taxos-for meta-titles-february-2021
 
Asce more than just topic taxonomies
Asce more than just topic taxonomiesAsce more than just topic taxonomies
Asce more than just topic taxonomies
 
Acs discoverability-dhug2021
Acs discoverability-dhug2021Acs discoverability-dhug2021
Acs discoverability-dhug2021
 
Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)Ai webinar 2 -what's in a name (consolidated pdf)
Ai webinar 2 -what's in a name (consolidated pdf)
 
Tagging overview - Why Keywords Don't Cut It
Tagging overview  - Why Keywords Don't Cut ItTagging overview  - Why Keywords Don't Cut It
Tagging overview - Why Keywords Don't Cut It
 
Health Affairs - Why Keywords Don't Cut It
Health Affairs - Why Keywords Don't Cut ItHealth Affairs - Why Keywords Don't Cut It
Health Affairs - Why Keywords Don't Cut It
 
Why Keywords Don't Cut It
Why Keywords Don't Cut ItWhy Keywords Don't Cut It
Why Keywords Don't Cut It
 
Data Harmony update 2020 final
Data Harmony update 2020 finalData Harmony update 2020 final
Data Harmony update 2020 final
 
Data Harmony Update 2020 final
Data Harmony Update 2020 finalData Harmony Update 2020 final
Data Harmony Update 2020 final
 
DHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository InteroperabilityDHUG 2018: Towards Web-Centric Repository Interoperability
DHUG 2018: Towards Web-Centric Repository Interoperability
 
DHUG 2018 - Florida Thesis OCR
DHUG 2018 - Florida Thesis OCRDHUG 2018 - Florida Thesis OCR
DHUG 2018 - Florida Thesis OCR
 
DHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project FundedDHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
DHUG 2017 - Understanding ROI Just Enough to Get Your Project Funded
 

Último

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...anjaliyadav012327
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 

Último (20)

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
microwave assisted reaction. General introduction

Taxonomy Assessments - Part Two

  • 1. Taxonomy Assessments - Part Two February 9, 2012 Access Innovations, Inc. Leveraging Your Content Semantically Jay Ven Eman, Ph.D., CEO j_ven_eman@accessinn.com www.accessinn.com www.dataharmony.com +1.505.998.0800 Albuquerque, NM © 2012. Access Innovations, Inc. All rights reserved.
  • 2. Indexing
      • Subject term assignment
      • Permanent meta-data to indexed object
      • Used for retrieval and evaluation
      • Processes
          • Manual
              • Publisher
              • 3rd party aggregators
              • Authors
          • Automated methods
  • 3. Integration / workflow [Diagram. Labels: Author Submission System; Books; Conference Proceedings; etc.; APIs, Client/Server, Web Services, HTTP-TCP/IP; Content Repository “A” or Intermediate Processes; Content Repository “B”, etc.; Thesaurus Master; M.A.I.; Data Harmony MAIstro Server; Classification System; Web Sites]
  • 4. Select the document collection (CMS): Please select the database and the document directory to load
  • 5. CMS
  • 6. Sample unstructured document
  • 7. Run the documents through a metadata extraction process to create well-formed, rich XML
      • Automatic (per doc template)
      • E.g. Dublin Core Metadata
      • Bibliographic citation
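As a rough illustration of what "well-formed, rich XML" with Dublin Core metadata might look like, the sketch below builds a small record with Python's standard library. The wrapper element name `metadata`, the helper `to_dublin_core`, and the choice of fields are assumptions made for illustration; the sample values are borrowed from the document shown on slide 9. This is not the extraction process the presenter used.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core(fields):
    """Build a wrapper element with one dc:* child per (name, value) pair.

    The wrapper name "metadata" is a hypothetical choice for this sketch.
    """
    root = ET.Element("metadata")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Sample values taken from the slide 9 example document.
record = to_dublin_core([
    ("title", "Solving the Challenge"),
    ("creator", "Ven Eman, Jay"),
    ("date", "09-14-11"),
])

# Serialize to a string; ElementTree guarantees the result is well-formed.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Building the tree with an XML library rather than string concatenation is what keeps the output well-formed: escaping and tag closing are handled for you.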
  • 8. Automatically add the taxonomy terms
      • Entity extraction: People, Places, Things
      • Conceptual indexing: using the taxonomy
  • 9. Classification Process or Assigned Indexing
      Unstructured:
          09-14-11
          “Solving the Challenge”
          By Jay Ven Eman
          The process of indexing a content object begins with…
      Structured:
          <Anchor>
            <Date>09-14-11</Date>
            <TI>“Solving the Challenge”</TI>
            <BLH>By</BLH>
            <Author>
              <AU_FN>Jay</AU_FN>
              <AU_MI></AU_MI>
              <AU_LN>Ven Eman</AU_LN>
            </Author>
            <Body>The process of indexing a content object begins with…</Body>
            <Subject>Indexing</Subject>
            <Subject>Thesauri</Subject>
            <Subject>Standards</Subject>
            <Subject>Classification</Subject>
          </Anchor>
      [Diagram labels: Thesaurus Master; M.A.I.; Data Harmony MAIstro Server; Classification System; Content Repository, e.g. Database]
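The step from the structured record to one carrying `<Subject>` elements can be sketched as follows. This is a deliberately naive stand-in: it assigns a term when the term's text appears in the body, whereas the slides describe a rule base in Data Harmony M.A.I., whose rules are not shown. The function name `assign_subjects` and the four-term list are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Structured record from the slide, before any subject terms are assigned.
RECORD = """<Anchor>
  <Date>09-14-11</Date>
  <TI>Solving the Challenge</TI>
  <Author><AU_FN>Jay</AU_FN><AU_MI></AU_MI><AU_LN>Ven Eman</AU_LN></Author>
  <Body>The process of indexing a content object begins with…</Body>
</Anchor>"""

# Hypothetical mini-taxonomy, using the four subjects shown on the slide.
TAXONOMY = ["Indexing", "Thesauri", "Standards", "Classification"]


def assign_subjects(record_xml, terms):
    """Append a <Subject> element for every term found in the <Body> text.

    Naive case-insensitive substring match; an assumption of this sketch,
    not the rule-based method the presentation refers to.
    """
    root = ET.fromstring(record_xml)
    body = (root.findtext("Body") or "").casefold()
    for term in terms:
        if term.casefold() in body:
            ET.SubElement(root, "Subject").text = term
    return root


tagged = assign_subjects(RECORD, TAXONOMY)
subjects = [s.text for s in tagged.findall("Subject")]
print(subjects)
```

Only "Indexing" is matched here because the sample body is truncated on the slide; the other three subjects would have to come from text that is not shown.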
  • 10. Indexing
      • Indexing measures
          • Indexing experts
          • Subject matter experts (SME)
          • Hits, misses, & noise
          • 85% hits
      • In conjunction with taxonomy measures
          • Over & under used terms
          • Over & under indexed content
  • 11. Indexing & Search Metrics
      • Hit, Miss, Noise
      • Subjective
          • Relevance
          • Aboutness
      • Statistical
          • Precision
          • Recall
          • Level of effort
  • 12. Hit, Miss, Noise
      • Hit – exactly what a human indexer would use
      • Miss – human indexer would use, but system did not assign
      • Noise – system assigned, but human did not
          • Relevant noise – could have been assigned
          • Irrelevant noise – just plain wrong
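The hit, miss, and noise definitions above map naturally onto set operations, and from there onto the precision and recall listed on slide 11. The sketch below assumes the usual correspondence (hit = true positive, noise = false positive, miss = false negative); the slides do not spell out that mapping, and the function name and sample term lists are hypothetical.

```python
def score_indexing(human_terms, system_terms):
    """Compare system-assigned terms against a human indexer's terms."""
    human, system = set(human_terms), set(system_terms)
    hits = human & system      # both human and system assigned the term
    misses = human - system    # human would use it, system did not assign it
    noise = system - human     # system assigned it, human did not
    # Precision: share of system-assigned terms that were hits.
    precision = len(hits) / len(system) if system else 0.0
    # Recall: share of human-chosen terms that the system found.
    recall = len(hits) / len(human) if human else 0.0
    return {
        "hits": hits,
        "misses": misses,
        "noise": noise,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical example: four human terms, three system terms.
result = score_indexing(
    human_terms=["Indexing", "Thesauri", "Standards", "Classification"],
    system_terms=["Indexing", "Thesauri", "Taxonomies"],
)
print(result)
```

Separating relevant from irrelevant noise, as the slide does, would need a second human judgment per noise term; set arithmetic alone cannot make that call.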
  • 13. Subjective
      • Relevance
          • Reflects how akin it is to the user's request
      • “Aboutness”
          • Reflects the topical match between the document content and the term
          • How well the topic describes what the document is about
      • Varies with level of conceptual terms vs. factual terms in the thesaurus
  • 14. Indexing
      • All content types & sources
          • Inventory control
          • Everything in, everything out
      • Document types
          • Articles
          • Proceedings
          • Corporate
  • 15. Link to Community Resources (Source: Helen Atkins, AACR) [Diagram. Labels: CME Activity on Topic A; Upcoming Conference on Topic A; Other Journal Articles on Topic A; Job Posting for Expert on Topic A; Journal Article on Topic A; Grant Available for Researchers Working on Topic A; Podcast Interview with Researcher Working on Topic A; Author Networks; Social Networking; SME – Topic A]
  • 16. Indexing with Data Harmony® M.A.I.™
      • Rule base development
          • 80/20 rule
          • Indexing objectives
      • GUI
      • Time-to-market
          • Level of effort to build
          • Level of effort to maintain
          • Less than all other alternatives when indexing for high precision & recall
  • 17. Updating Rule Base
      • Automatic for matching rules when using Data Harmony MAIstro™
      • 80/20 rule
      • Re-index when 5% to 10% changes to taxonomy – arbitrary ranges:
          • Monthly with small databases – 5k to 20k
          • Quarterly with medium – 20k to 1 million
          • Annual with large – greater than 1 million
      • Depends on search software, too
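The re-indexing guidance above can be written down as two small checks. The slide itself calls the ranges arbitrary, so treat this purely as a sketch: the function names, the default 5% threshold (the low end of the slide's 5% to 10% range), and the handling of boundary values and of databases under 5k records are all assumptions.

```python
def taxonomy_changed_enough(changed_terms, total_terms, threshold=0.05):
    """True when the share of changed taxonomy terms reaches the threshold.

    The default of 0.05 is the low end of the slide's 5% to 10% range.
    """
    if total_terms <= 0:
        raise ValueError("total_terms must be positive")
    return changed_terms / total_terms >= threshold


def reindex_cadence(record_count):
    """Map database size to the re-indexing cadence suggested on the slide.

    Assumptions: anything up to 20k records (including fewer than 5k, which
    the slide does not cover) counts as small, and exactly 1 million records
    counts as medium.
    """
    if record_count <= 20_000:
        return "monthly"
    if record_count <= 1_000_000:
        return "quarterly"
    return "annual"


print(taxonomy_changed_enough(60, 1000), reindex_cadence(500_000))
```

As the slide notes, the search software matters too, so a real schedule would likely need inputs beyond these two.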
  • 18. NAMES
  • 19. What’s in a name?
      • Juliet: "What's in a name? That which we call a rose / By any other name would smell as sweet."
      • Romeo and Juliet (II, ii, 1-2)
  • 20. © 2012. Access Innovations, Inc. All rights reserved.
  • 21. Magnitude of the Problem: Facebook - 700 Million Users Projected for 2011 (Open-First)
      • 700 Million Names
      • How will your boss, peers, anyone ever find you?
  • 22. What’s in a name?
      • My name
          • Jay Ven Eman
          • Ven Eman, Jay
          • <First_Name>Jay</First_Name> <Last_Name>Ven Eman</Last_Name>
      • Name variants
          • Jay Von Eman
          • Jay Van Eman
          • Jay van Eman
          • Jay ven Eman
          • Jay Veneman
          • Jay Venema
      • Aliases
          • William Henry McCarty
          • Henry Antrim
          • William H. Bonney
          • Billy the Kid
      • National & Cultural Conventions
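One common way to handle variants and aliases like those above is an authority table that maps every known form to a single preferred name. The sketch below is a minimal version of that idea and is not taken from the presentation. The `AUTHORITY` table, the function names, and the choice of "William Henry McCarty" as the preferred form are assumptions, and the normalization is ASCII-only, so it does not address the national and cultural conventions the slide mentions.

```python
import re

# Hypothetical authority table: preferred name -> variants that simple
# normalization cannot reconcile on its own (misspellings, aliases).
AUTHORITY = {
    "Jay Ven Eman": ["Jay Von Eman", "Jay Van Eman", "Jay Venema"],
    "William Henry McCarty": ["Henry Antrim", "William H. Bonney", "Billy the Kid"],
}


def normalize(name):
    """Reduce a name to a comparison key.

    Flips "Last, First" into "First Last", lowercases, and drops everything
    except the letters a-z.
    """
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return re.sub(r"[^a-z]", "", name.casefold())


# Index every preferred name and variant by its normalized key.
INDEX = {
    normalize(form): preferred
    for preferred, variants in AUTHORITY.items()
    for form in [preferred, *variants]
}


def resolve(name):
    """Return the preferred form of a name, or None if it is unknown."""
    return INDEX.get(normalize(name))


for candidate in ["Ven Eman, Jay", "Jay van Eman", "Jay Veneman", "Billy the Kid"]:
    print(candidate, "->", resolve(candidate))
```

Normalization alone reconciles case, spacing, and inverted order ("Jay ven Eman", "Jay Veneman", "Ven Eman, Jay"); the misspellings and aliases resolve only because someone entered them in the table, which is the editorial effort the next slide points to.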
  • 23. Names
      • Computationally & editorially intense
      • Author submissions
      • Membership records & the like
      • Industry initiatives – ORCID, VIVO
      • Subject term disambiguation
      • Inventory control basics apply here, too
      • Difficulty level is high
      • Constant maintenance needed
  • 24. Taxonomy Assessments - Part Two February 9, 2012 Thank you! Questions? Access Innovations, Inc. Leveraging Your Content Semantically Jay Ven Eman, Ph.D., CEO j_ven_eman@accessinn.com www.accessinn.com www.dataharmony.com +1.505.998.0800 Albuquerque, NM © 2012. Access Innovations, Inc. All rights reserved.

Editor's Notes

  1. PDF
  2. Post processing. “Labels” content item. But also classifies author.
  3. Thanks to Helen Atkins of AACR for this illustration. The real power of this is that the links can all go in all directions, so we take advantage of having the user’s attention regardless of how they step into our “web”. Continuing Medical Education (CME).
  4. Johnny Carson