Subject Access Enhancement via FocusOn Search and CategoryMap: An Integrated Approach for Discovery of University Resources and Library on the Web. 2009 Serials Solutions Workshop, March 11, 2009, Seattle, WA. Updated March 10, 2011. By Amanda Xu, St. John’s University Library, Jamaica, New York
Overview
- Subject Access Enhancement Introduction – FocusOn Search & CategoryMap
- Business Scenario for FocusOn Search and CategoryMap
- System Front End for FocusOn Search and CategoryMap
- System Backend for FocusOn Search and CategoryMap (Taxonomy Management Module)
- FocusOn Search and CategoryMap in Distributed Network/Web (Logical Network Diagram)
- DFD (Data Flow Diagram) Context Level for FocusOn Search
- ER Diagram Adapted from RDA (Resource Description and Access)
- System Flow Chart for FocusOn Search and CategoryMap
- System Flow Chart for the Data Movement of all Vocabularies
- Suggestions for Future
- References
Subject Access Enhancement – FocusOn Search and CategoryMap (1 of 20)
- Data: structured (20%); semi-structured and unstructured (80%)
- IDC - Percentage of searches on the Web: “aboutness” of a topic (45%); scientific and technical information (35%)
- Queries are limited to Boolean, relevance ranking, phrase, and link analysis on indexes refined by keyword, media, and file type on the Web
- Unknown named entities and topics are often discovered by accident on the Web
- Result lists often make no sense for an “aboutness” search on the Web, let alone support business intelligence
- Information-sharing processes for enterprise-wide information discovery are cumbersome
Subject Access Enhancement – FocusOn Search and CategoryMap (2 of 20) Google Query: Algebra – Data Processing – Periodical – Computer Algebra – ACM
Subject Access Enhancement – FocusOn Search and CategoryMap (3 of 20) OPAC: Subject Keyword “AND” w/ Relevance Ranking – SKEY(^*) in Simple Query Mode
Subject Access Enhancement – FocusOn Search and CategoryMap (4 of 20) OPAC: Advanced Query Mode: Subject Keyword Boolean “AND”
Subject Access Enhancement – FocusOn Search and CategoryMap (5 of 20) OPAC Rendering: Brief Display and Record Display
Subject Access Enhancement – FocusOn Search and CategoryMap (6 of 20) OPAC: Subject Browse (SUBJ): Algebra – Data processing – Periodicals
Subject Access Enhancement – FocusOn Search and CategoryMap (7 of 20) OPAC: LC Classification – QA 150-272: Algebra; QA 155.7.E4: Algebra – Electronic Data Processing
Subject Access Enhancement – FocusOn Search and CategoryMap (8 of 20) OPAC: Call No. Browse (CALL): QA155.7, collocating print collections on the topic
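The collocation that call number browse provides can be approximated in a few lines of code. The following is a minimal sketch, assuming a one-entry range table built from the QA 150-272 Algebra range named on the slides; the names `LC_RANGES`, `parse_call_number`, and `category_for` are hypothetical and are not part of Voyager or any Serials Solutions product.

```python
import re

# Hypothetical range table: (class letters, low, high, category term).
# Only the QA 150-272 "Algebra" range is taken from the slides.
LC_RANGES = [
    ("QA", 150.0, 272.0, "Algebra"),
]

# Class letters followed by the class number, e.g. "QA155.7" in "QA155.7.E4".
CALL_NO = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")


def parse_call_number(call_no):
    """Split an LC call number into (class letters, class number)."""
    match = CALL_NO.match(call_no.upper())
    if match is None:
        return None
    return match.group(1), float(match.group(2))


def category_for(call_no):
    """Return the category term whose LC range contains the call number."""
    parsed = parse_call_number(call_no)
    if parsed is None:
        return None
    letters, number = parsed
    for cls, low, high, term in LC_RANGES:
        if letters == cls and low <= number <= high:
            return term
    return None
```

Under these assumptions, `category_for("QA155.7.E4")` falls inside the Algebra range, while a call number outside every listed range yields `None`. A real table would need the full LC schedule and handling for Cutter numbers, which this sketch ignores.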
Subject Access Enhancement – FocusOn Search and CategoryMap (9 of 20) Full-text E-J Portal on Library Web: Known Item Search by Title, ISSN only
Subject Access Enhancement – FocusOn Search and CategoryMap (10 of 20) Full-text E-J Portal on Library Web: Unknown Item Browse by Subject – Mathematics: Algebra
Subject Access Enhancement – FocusOn Search and CategoryMap (11 of 20) Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc.
Subject Access Enhancement – FocusOn Search and CategoryMap (12 of 20) Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc.
Subject Access Enhancement – FocusOn Search and CategoryMap (13 of 20) Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc.
Subject Access Enhancement – FocusOn Search and CategoryMap (14 of 20) Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc.
Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (15 of 20) Linking between University Website and OCLC WorldCat Identities Services for Named Entities Resolution. Screen 1 of 4: Overview and Work Activity Period on OCLC WorldCat Identities
Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (16 of 20) Linking between University Website and OCLC WorldCat Identities Services for Named Entities Resolution. Screen 2 of 4: Works Created by Charles Wankel on OCLC WorldCat Identities
Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (17 of 20) Linking between University Website and OCLC WorldCat Identities Services for Named Entities Resolution. Screen 3 of 4: Audience Level and Works Related to Dr. Charles Wankel on OCLC WorldCat Identities
Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (18 of 20) Linking between University Website and OCLC WorldCat Identities Services for Named Entities Resolution. Screen 4 of 4: Concept Terms Related to Dr. Charles Wankel
Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (19 of 20)
Subject categorization using LC Classification: Serials Solutions offers subject browse for the full-text e-j A-Z title list, categorized using LC classification. Should the same subject categorization be applied to the MARC title list?
- If so, library operations within the ACQ, CIRC, CAT, Reserve, OPAC, Collection Development modules, etc. can all be categorized, tracked, and reported under the LC classification scheme, in addition to integrated discovery of university resources and library on the Web.
- If so, library guides at title level on a particular topic can also be created via WebVoyage with a single hot-link.
- If so, a conformed dimension for enterprise bus architecture finally becomes obtainable via the LC classification scheme.
Implications for Voyager, if the page on Dr. Charles Wankel were categorized using the LC classification scheme:
- ACQ: fund code, e.g. 271-7652-302 Queens Books-Management.
- Cataloging Client: use the MARC 698 field in the bib record for category terms obtained from Serials Solutions’ full-text e-j list by subject. Use item statistical category codes 901 and 902 for management in the Tobin College of Business, and item code 900 for management.
- OPAC browse and search: MARC 698 in the bib record and statistical category 900 in the item, e.g. management is the conformed dimension in EBA (Enterprise Bus Architecture).
- CIRC: patron group type matches item codes 901 and 902.
- Collection Management Module: How many faculty at Tobin College are interested in business management? Faculty teaching and student learning assessment by patron group type, item codes 901 and 902, and associated activities?
- Business Office: accounting and budgeting by subject category? Other processes?
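The tagging described for the Cataloging Client can be pictured as a small data operation: one category term is written to the bib record's 698 field and the matching statistical category codes are written to the item. The sketch below is illustrative only. The record classes and the `STAT_CODES` table are assumptions made for this example, populated from the codes quoted on the slide, and do not reflect Voyager's actual data model or API.

```python
from dataclasses import dataclass, field

# Assumed mapping from category term to item statistical category codes,
# following the slide's example (900 = management; 901 and 902 =
# management in the Tobin College of Business).
STAT_CODES = {"Management": ["900", "901", "902"]}


@dataclass
class BibRecord:
    title: str
    fields: dict = field(default_factory=dict)  # MARC tag -> list of values


@dataclass
class ItemRecord:
    barcode: str
    stat_codes: list = field(default_factory=list)


def categorize(bib, item, term):
    """Write the category term to bib field 698 and its codes to the item."""
    terms = bib.fields.setdefault("698", [])
    if term not in terms:
        terms.append(term)
    for code in STAT_CODES.get(term, []):
        if code not in item.stat_codes:
            item.stat_codes.append(code)
    return bib, item
```

Because the same term ends up in both the bib and the item, reports from acquisitions, circulation, and the OPAC can be grouped on one shared value, which is the role the slide assigns to the conformed dimension.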
Subject Access Enhancement – FocusOn Search and CategoryMap (20 of 20)
- Enable trend analysis for collection development needs on “Combinatorics” or “Henry George”
- Enable repackaging and unbundling of resources by fine-grained topics
- Answer questions like “Whom will the collection serve, e.g. which school program, instructor, or courses?” and “How well does the collection meet the needs of faculty, and at what cost?”
- Browse both print and electronic collections on “Algebra – Electronic Data Processing” and Mathematics by LC classification scheme with a single click
- Enable a single measurement point to benchmark processes on university resources and library
- Integrate one or more category maps by classifying university resources and library consistently
Business Scenario for FocusOn Search and CategoryMap (1 of 3)
Business Scenario for FocusOn Search and CategoryMap (2 of 3)
Business Scenario for FocusOn Search and CategoryMap (3 of 3)
System Front End for FocusOn Search and CategoryMap (1 of 6)
System Front End for FocusOn Search and CategoryMap – Record Validation Configuration for Atom Data Feed Consumption in Voyager Cataloging Client (2 of 6)
System Front End for FocusOn Search and CategoryMap – Network Connection Configuration to Various DBs for Data Feed Consumption, from Bibliographic Utilities and NAFs to Content Management Systems, etc., in Voyager Cataloging Client (3 of 6)
System Front End for FocusOn Search and CategoryMap – General Import Profile Configuration for Atom Data Feed Consumption in Voyager Cataloging Client (4 of 6)
System Front End for FocusOn Search and CategoryMap – Template Configuration for Constant Holdings Data in Voyager Cataloging Client (5 of 6)
System Front End for FocusOn Search and CategoryMap – Template Configuration for Constant Item-level Data and Category Term (6 of 6)
System Backend for FocusOn Search and CategoryMap (Taxonomy Management Module) (1 of 2)
FocusOn Search application packages entail a stack of services:
- Centralized catalog
- Handling of media types in the catalog
- Named entities: person, family, and corporate body are linked and mashed up to obtain the aboutness and of-ness of a person, locally and remotely, via publicly available APIs on top of HTTP and/or ESBs within the private cloud computing network
- Other entities, e.g. concept, object, event, and geographic name
- Search facility: suggests spelling corrections based on patterns, rules, keywords, phonetics, synonyms, dictionary, and controlled vocabulary within one dialogue box in a single interface. It will also suggest categories that facilitate discovery, based on statistical analysis of queries, documents, user profiles and activities, usage, and vocabulary services consumed from other vocabulary service providers
- Google Maps API for geographic names
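One piece of the search facility, spelling suggestion against a controlled vocabulary, can be sketched with Python's standard `difflib` module. This is a minimal illustration under stated assumptions, not the FocusOn Search implementation: the vocabulary list is a hypothetical sample drawn from topics mentioned in this deck, and plain string similarity stands in for the pattern, rule, phonetic, and synonym matching the slide lists.

```python
import difflib

# Hypothetical controlled vocabulary; terms come from examples in this deck.
VOCABULARY = [
    "Algebra",
    "Combinatorics",
    "Data processing",
    "Management",
    "Mathematics",
]


def suggest(query, vocabulary=VOCABULARY, limit=3):
    """Return up to `limit` vocabulary terms that closely match the query."""
    # Compare case-insensitively, then map back to the preferred form.
    lowered = {term.lower(): term for term in vocabulary}
    hits = difflib.get_close_matches(
        query.lower(), list(lowered), n=limit, cutoff=0.6
    )
    return [lowered[hit] for hit in hits]
```

A misspelled query such as `suggest("algebre")` returns the preferred form `"Algebra"` first, and a query with no near match returns an empty list. The `cutoff` of 0.6 is `difflib`'s default and would need tuning against real query logs.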
System Backend for FocusOn Search and CategoryMap (Taxonomy Management Module) (2 of 2)
- Link user services, collection management, circulation, acquisitions, cataloging, and other processes across the units of Library and University Resources
- Maintain taxonomy in conformance with institution and industry standards
- The CategoryMap will manage category terms, which can take the form of a concept, object, event, or place, harmonized from subject terms that are:
  - Clustered by an application
  - Looked up through controlled vocabularies such as LCSH, MeSH, and AAT
  - Tagged with user-defined terms
  - Structured by LC and Dewey classification
  - Referenced directly from the fund expenditure structure in acquisitions
  - Analyzed based on usage statistics reports aggregated from circulation, content suppliers, etc., and the number of documents/objects likely carrying the category term
  - Managed in a knowledge base for vocabulary filtering, mapping, ETL, etc., and in a data warehouse for data mining
- The search facility will also handle query processing in relational database management systems and ontological database management systems
- Relationships between concepts, objects, events, and geographic names are constructed according to controlled vocabularies developed by LC, NLM, and Getty
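The harmonization role described for the CategoryMap can be illustrated as a two-way index between category terms and the source terms that feed them. The class below is a rough sketch under assumptions: its name mirrors the slide, but the methods, the normalization rule, and the source labels used in the example are hypothetical and only show the shape of the mapping.

```python
class CategoryMap:
    """Map terms from several vocabularies onto shared category terms."""

    def __init__(self):
        self._by_source = {}  # (source, normalized term) -> category
        self._members = {}    # category -> set of (source, original term)

    @staticmethod
    def _norm(term):
        # Lowercase and collapse whitespace so minor variants match.
        return " ".join(term.lower().split())

    def add(self, category, source, term):
        """Register a source term (e.g. an LCSH heading) under a category."""
        self._by_source[(source, self._norm(term))] = category
        self._members.setdefault(category, set()).add((source, term))

    def lookup(self, source, term):
        """Return the category for a source term, or None if unmapped."""
        return self._by_source.get((source, self._norm(term)))

    def terms(self, category):
        """List every (source, term) pair harmonized into a category."""
        return sorted(self._members.get(category, ()))
```

For example, after `add("Algebra", "LCSH", "Algebra -- Data processing")` and `add("Algebra", "LCC", "QA155.7")`, a lookup from either vocabulary resolves to the same category, so print and electronic holdings described in different schemes can be browsed together.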
FocusOn Search and CategoryMap in Distributed Network/Web (Logical Network Diagram)
Data Flow Diagram Context Level for FocusOn Search
ER Diagram Adapted from RDA (Resource Description and Access) – ER Diagram View of Title and FRAD Named Entity in Authority Control by IMT, 2008 (1 of 11)
ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Mocked-up Named Entity for Personal Name in MARC Format in LC NAF (2 of 11) |e rda
ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Personal Name as Subject Access Point in LC Catalog (3 of 11)
ER Diagram Adapted from RDA (Resource Description and Access) – ER Diagram View of WEMI, Named Entities, and Subjects by IMT, 2008 (4 of 11)
ER Diagram Adapted from RDA (Resource Description and Access) – Schema View of RDA Record, FRBR WEMI, RDA Entities by IMT, 2009 (5 of 11)
ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Related Topical Headings in LC Subject Authority File (6 of 11)
ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Mocked-up & Related Topical Headings in MARC Format in LC Subject Authority File (7 of 11) |e rda
ER Diagram Adapted from RDA (Resource Description and Access) – ER Diagram View of Person as Named Entity by IMT, 2009 (8 of 11)
ER Diagram Adapted from RDA (Resource Description and Access) – Schema View of Person as Named Entity by IMT, 2009 (9 of 11)
ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Personal Name as Refined Subject Access Point in LC Catalog (10 of 11) Refine search by subject
ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Personal Name as Refined Subject Access Point in LC Catalog (11 of 11)
System Flow Chart for the Discovery Layer of FocusOn Search and CategoryMap (1 of 2)
System Flow Chart for FocusOn Search and CategoryMap (2 of 2)
System Flow Chart for the Data Movement of all Vocabularies
Suggestions for Future
- Expand Content Selection to Unstructured Data on the Web
- Leverage Named Entities Resolution Services Provided by OCLC WorldCat
- Build Data Filters for Media and File Types
- Build a Plug-in Reformat Utility
- Build a Plug-in Meta-data Conversion Utility
- Evaluate Change Management Strategies, Packages, and Techniques
References
Duggan, J., & Stang, D. B. (2008). Magic quadrant for software change and configuration management for distributed platforms, 2008. Gartner RAS Core Research Note, G00153962, 1-10.
Hoffer, J. A., George, J. F., & Valacich, J. S. (2008). Modern systems analysis and design (5th ed., pp. 130-159). New Jersey: Pearson Prentice Hall.
IMT (Information Management Team). “ER Diagram for RDA Taxonomy: High-Level Relationship Among Entities.” Available: http://www.rdaonline.org/ERDiagramRDA_24June2008.pdf
Inmon, W. H. “Architecting for Business Intelligence and Data Warehousing: Integrating the Structured and Unstructured Data World.” Data Warehouse Seminar ‘08, sponsored by Data Management Forum, Dec. 8, 2008.
Joint Steering Committee for Development of RDA. RDA Element Analysis. 26 Oct. 2008. http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf
Xu, Amanda (2000). “Beyond Seamless Access: Meta-data in the Age of Content Integration.” Presented, with a led discussion forum, at the Spring Program, Information Technology Interest Group of ACRL, New England Chapter, Univ. of Connecticut, May 26, 2000.
Xu, Amanda (2007). “Mending the Gap Between the Library’s Electronic and Print Collections on Library’s Web Site Using Semantic Web.” Presented at the ExLibris Voyager End User Group Meeting, Chicago, Ill., April 19-20, 2007.

Subject Access Enhancement: FocusOn Search and CategoryMap: An Integrated Approach for Discovery of University Resources and Library on the Web

  • 1. Subject Access Enhancement via FocusOn Search and CategoryMap: An Integrated Approach for Discovery of University Resources and Library on the Web. 2009 Serials Solutions Workshop, March 11, 2009, Seattle, WA. Updated March 10, 2011. By Amanda Xu, St. John’s University Library, Jamaica, New York
  • 2. Overview: Subject Access Enhancement Introduction – FocusOnSearch & CategoryMap; Business Scenario for FocusOn Search and CategoryMap; System Front End for FocusOn Search and CategoryMap; System Backend for FocusOn Search and CategoryMap (Taxonomy Management Module); FocusOn Search and CategoryMap in Distributed Network/Web (Logical Network Diagram); DFD (Data Flow Diagram) Context Level for FocusOnSearch; ER Diagram Adapted from RDA (Resource Description and Access); System Flow Chart for FocusOn Search and CategoryMap; System Flow Chart for the Data Movement of all Vocabularies; Suggestions for Future; References
  • 3. Subject Access Enhancement – FocusOnSearch and CategoryMap (1 of 20). Data: structured (20%), semi-structured and unstructured (80%). IDC, percentage of searches on the Web: “aboutness” of a topic (45%), and scientific and technical information (35%). Queries on the Web are limited to Boolean, relevance ranking, phrase, and link analysis on indexes refined by keyword, media, and file type. Unknown named entities and topics are often discovered by accident on the Web. The rendered result list often makes no sense for an “aboutness” search on the Web, let alone for supporting business intelligence. Information-sharing processes for enterprise-wide information discovery are cumbersome.
  • 4. Subject Access Enhancement – FocusOnSearch and CategoryMap (2 of 20) Google Query: Algebra – Data Processing – Periodical – Computer Algebra – ACM
  • 5. Subject Access Enhancement – FocusOnSearch and CategoryMap (3 of 20) OPAC: Subject Keyword “AND” w/ Relevance Ranking – SKEY(^*) in Simple Query Mode
  • 6. Subject Access Enhancement – FocusOnSearch and CategoryMap (4 of 20) OPAC: Advanced Query Mode: Subject Keyword Boolean “AND”
  • 7. Subject Access Enhancement – FocusOnSearch and CategoryMap (5 of 20) OPAC Rendering: Brief Display; Record Display
  • 8. Subject Access Enhancement – FocusOnSearch and CategoryMap (6 of 20) OPAC: Subject Browse (SUBJ): Algebra – Data processing – Periodicals
  • 9. Subject Access Enhancement – FocusOnSearch and CategoryMap (7 of 20) OPAC: LC Classification – QA 150-272: Algebra; QA 155.7.E4: Algebra – Electronic Data Processing
  • 10. Subject Access Enhancement – FocusOnSearch and CategoryMap (8 of 20) OPAC: Call No. Browse – CALL Browse: QA155.7, collocating print collections on the topic
  • 11. Subject Access Enhancement – FocusOnSearch and CategoryMap (9 of 20) Full-text E-J Portal on Library Web: Known Item Search by Title, ISSN only
  • 12. Subject Access Enhancement – FocusOnSearch and CategoryMap (10 of 20) Full-text E-J Portal on Library Web: Unknown Item Browse by Subject – Mathematics: Algebra
  • 13. Subject Access Enhancement – FocusOnSearch and CategoryMap (11 of 20) Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc.
  • 14. Subject Access Enhancement – FocusOnSearch and CategoryMap (12 of 20) Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc.
  • 15. Subject Access Enhancement – FocusOnSearch and CategoryMap (13 of 20) Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc.
  • 16. Subject Access Enhancement – FocusOnSearch and CategoryMap (14 of 20) Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc.
  • 17. Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (15 of 20) Linking between University Website and OCLC WorldCat Identities Services for Named Entities Resolution. Screen 1 of 4: Overview, and Work Activity Period on OCLC WorldCat Identities
  • 18. Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (16 of 20) Linking between University Website and OCLC WorldCat Identities Services for Named Entities Resolution. Screen 2 of 4: Works Created by Charles Wankel on OCLC WorldCat Identities
  • 19. Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (17 of 20) Linking between University Website and OCLC WorldCat Identities Services for Named Entities Resolution. Screen 3 of 4: Audience Level and Works Related to Dr. Charles Wankel on OCLC WorldCat Identities
  • 20. Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (18 of 20) Linking between University Website and OCLC WorldCat Identities Services for Named Entities Resolution. Screen 4 of 4: Concept Terms Related to Dr. Charles Wankel
  • 21. Query Submitted to the Search Box on University Website Retrieved Info on People, Events, Curriculum, etc.: Algebra Electronic Data Processing, Combinatorics, Henry George, Wankel, Charles, etc. (19 of 20) Subject categorization using LC Classification: Serials Solutions offers subject browse for the full-text e-j A-Z title list, categorized using LC classification. Should the same subject categorization be applied to the MARC title list? If so, library operations within the ACQ, CIRC, CAT, Reserve, OPAC, and Collection Development modules, etc. can all be categorized, tracked, and reported under the LC classification scheme, in addition to integrated discovery of university resources and library on the Web. If so, library guides at title level on a particular topic can also be created via WebVoyage with a single hot-link. If so, a conformed dimension for enterprise bus architecture finally becomes obtainable via the LC classification scheme. Implications for Voyager if the page on Dr. Charles Wankel were categorized using the LC classification scheme: ACQ – fund code, e.g. 271-7652-302 Queens Books-Management; Cataloging Client – use the MARC 698 field in the bib for category terms obtained from Serials Solutions’ full-text e-j list by subject, item statistical category codes 901 and 902 for management in Tobin College of Business, and item 900 for management; OPAC browse and search – MARC 698 in bib and statistical category 900 in item, e.g. management is the conformed dimension in EBA (Enterprise Bus Architecture); CIRC – patron group type matches item 901 and 902; Collection Management Module – How many faculty at Tobin College are interested in Business Management? Faculty teaching and student learning assessment by patron group type, item 901 and 902, and associated activities? Business Office – accounting and budgeting by subject category? Other processes?
  • 22. Subject Access Enhancement – FocusOnSearch and CategoryMap (20 of 20) Enable trend analysis for collection development needs on “Combinatorics” or “Henry George”; Enable repackaging and unbundling of resources by fine-grained topics; Answer questions like “Whom will the collection serve, e.g. which school program, instructor, courses, etc.?” and “How well does the collection meet the needs of faculty, and at what cost?”; Browse both print and electronic collections on “Algebra – Electronic Data Processing” and Mathematics by LC classification scheme with a single click; Enable a single measurement point to benchmark processes on university resources and library; Integrate one or more category maps by classifying university resources and library consistently
  • 23. Business Scenario for FocusOn Search and CategoryMap (1 of 3)
  • 24. Business Scenario for FocusOn Search and CategoryMap (2 of 3)
  • 25. Business Scenario for FocusOn Search and CategoryMap (3 of 3)
  • 26. System Front End for FocusOn Search and CategoryMap (1 of 6)
  • 27. Systems Front End for FocusOn Search and CategoryMap – Record Validation Configuration for Atom Data Feed Consumption in Voyager Cataloging Client (2 of 6)
  • 28. Systems Front End for FocusOn Search and CategoryMap – Network Connection Configuration to Various DBs for Data Feed Consumption from Bibliographic Utilities, NAFs, to Content Management Systems, etc. in Voyager Cataloging Client (3 of 6)
  • 29. Systems Front End for FocusOn Search and CategoryMap – General Import Profile Configuration for Atom Data Feed Consumption in Voyager Cataloging Client (4 of 6)
  • 30. Systems Front End for FocusOn Search and CategoryMap – Template Configuration for Constant Holdings Data in Voyager Cataloging Client (5 of 6)
  • 31. Systems Front End for FocusOn Search and CategoryMap – Template Configuration for Constant Item-level Data and Category Term (6 of 6)
  • 32. System Backend for FocusOn Search and CategoryMap (Taxonomy Management Module) (1 of 2) FocusOn Search application packages entail a stack of services: Centralized catalog; Handling of media types in the catalog; Named entities – person, family, and corporate body are linked and mashed up for obtaining the aboutness and of-ness of a person, locally and remotely via publicly available APIs on top of HTTP and ESBs within the private cloud computing network; Other entities, e.g. concept, object, event, and geographic name; Search facility – suggests spelling corrections based on patterns, rules, keywords, phonics, synonyms, dictionary, and controlled vocabulary within one dialogue box in a single interface. It will also suggest categories that would facilitate discovery based on statistical analysis of queries, documents, user profiles and activities, usage, and vocabulary services consumed from other vocabulary service providers; Google Map API for geographic name
  • 33. System Backend for FocusOn Search and CategoryMap (Taxonomy Management Module) (2 of 2) Link user services, collection management, circulation, acquisitions, cataloging, and other processes across the units of Library and University Resources. Maintain taxonomy in conformance to institution and industry standards. The CategoryMap will manage category terms, which can be in the form of a concept, object, event, or place, harmonized from subject terms: Clustered by an application; Looked up through controlled vocabulary such as LCSH, MESH, and AAT; Tagged by user-defined terms; Structured by LC and Dewey classification; Referenced directly from fund expenditure structure in acquisitions; Analyzed based on usage statistics reports aggregated from circulation, content suppliers, etc., and the number of documents/objects likely carrying the category term; Managed in a knowledge base for vocabulary filtering, mapping, ETL, etc., and in a data warehouse for data mining. The search facility will also handle query processing in relational database management systems and ontological database management systems. Relationships between concepts, objects, events, and geographic names are constructed according to controlled vocabularies developed by LC, NLM, and Getty.
  • 34. FocusOn Search and CategoryMap in Distributed Network/Web (Logical Network Diagram)
  • 35. Data Flow Diagram Context Level for FocusOn Search
  • 36. ER Diagram Adapted from RDA (Resource Description and Access) – ER Diagram View of Title and FRAD Named Entity in Authority Control by IMT, 2008 (1 of 11)
  • 37. ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Mocked-up Named Entity for Personal Name in MARC Format in LC NAF (2 of 11) |e rda
  • 38. ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Personal Name as Subject Access Point in LC Catalog (3 of 11)
  • 39. ER Diagram Adapted from RDA (Resource Description and Access) – ER Diagram View of WEMI, Named Entities, and Subjects by IMT, 2008 (4 of 11)
  • 40. ER Diagram Adapted from RDA (Resource Description and Access) – Schema View of RDA Record, FRBR WEMI, RDA Entities by IMT, 2009 (5 of 11)
  • 41. ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Related Topical Headings in LC Subject Authority File (6 of 11)
  • 42. ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Mocked-up & Related Topical Headings in MARC Format in LC Subject Authority File (7 of 11) |e rda
  • 43. ER Diagram Adapted from RDA (Resource Description and Access) – ER Diagram View of Person as Named Entity by IMT, 2009 (8 of 11)
  • 44. ER Diagram Adapted from RDA (Resource Description and Access) – Schema View of Person as Named Entity by IMT, 2009 (9 of 11)
  • 45. ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Personal Name as Refined Subject Access Point in LC Catalog (10 of 11) Refine search by subject
  • 46. ER Diagram Adapted from RDA (Resource Description and Access) – Instance View of Personal Name as Refined Subject Access Point in LC Catalog (11 of 11)
  • 47. System Flow Chart for the Discovery Layer of FocusOn Search and CategoryMap (1 of 2)
  • 48. System Flow Chart for FocusOn Search and CategoryMap (2 of 2)
  • 49. System Flow Chart for the Data Movement of all Vocabularies
  • 50. Suggestions for Future: Expand Content Selection to Unstructured Data on the Web; Leverage Named Entities Resolution Services Provided by OCLC WorldCat; Build Data Filters for Media and File Types; Build a Plug-in Reformat Utility; Build a Plug-in Meta-data Conversion Utility; Evaluate Change Management Strategies, Packages and Techniques
  • 51. References: Duggan, J., & Stang, D. B. (2008). Magic quadrant for software change and configuration management for distributed platforms, 2008. Gartner RAS Core Research Note, G00153962, 1-10. | Hoffer, J. A., George, J. F., & Valacich, J. S. (2008). Modern systems analysis and design (5th ed., pp. 130-159). New Jersey: Pearson Prentice Hall. | IMT (Information Management Team). “ER Diagram for RDA Taxonomy: High-Level Relationship Among Entities.” Available: http://www.rdaonline.org/ERDiagramRDA_24June2008.pdf | Inmon, W. H. “Architecting for Business Intelligence and Data Warehousing: Integrating the Structured and Unstructured Data World.” Data Warehouse Seminar ‘08, sponsored by Data Management Forum, Dec. 8, 2008. | Joint Steering Committee for Development of RDA. RDA Element Analysis. 26 Oct. 2008: http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf | Xu, Amanda (2000). “Beyond Seamless Access: Meta-data in the Age of Content Integration.” Presented, with a led discussion forum, at the Spring Program, Information Technology Interest Group of ACRL, New England Chapter, Univ. of Connecticut, May 26, 2000. | Xu, Amanda (2007). “Mending the Gap Between the Library’s Electronic and Print Collections on Library’s Web Site Using Semantic Web.” Presented at the ExLibris Voyager End User Group Meeting, Chicago, Ill., April 19-20, 2007.
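
Slides 9, 10 and 21 propose LC classification as the shared subject category for print and electronic holdings. The following is only a minimal sketch of that idea, not the Voyager or Serials Solutions implementation. It assumes records are plain dictionaries with hypothetical keys `holdings_852h` (the call number Voyager indexes for print) and `bib_050` (the classification number in the bib record, typical for e-journals). The category table holds only the two ranges named on slide 9, and the sample records and their call numbers are made up for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Optional

# Illustrative category table holding only the two ranges named on slide 9.
# More specific ranges come first so they win over the broader one.
LC_CATEGORIES = [
    ("QA", 155.7, 155.7, "Algebra - Electronic Data Processing"),
    ("QA", 150.0, 272.0, "Algebra"),
]

# Class letters followed by the class number, e.g. "QA155.7" in "QA155.7.E4".
_CLASS_PART = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)")


def categorize(call_number: str) -> Optional[str]:
    """Return the category label for an LC call number, or None if unmapped."""
    match = _CLASS_PART.match(call_number.upper())
    if match is None:
        return None
    letters, number = match.group(1), float(match.group(2))
    for cat_letters, low, high, label in LC_CATEGORIES:
        if letters == cat_letters and low <= number <= high:
            return label
    return None


def call_number_for(record: Dict[str, str]) -> str:
    """Prefer holdings 852$h; fall back to bib 050 (hypothetical key names)."""
    return record.get("holdings_852h") or record.get("bib_050") or ""


def collocate(records: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group titles by category regardless of print or electronic format."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        label = categorize(call_number_for(record))
        if label is not None:
            groups[label].append(record["title"])
    return dict(groups)


# Made-up records; the call numbers are not taken from a real catalog.
records = [
    {"title": "Print monograph A", "holdings_852h": "QA155.7.E4 A25"},
    {"title": "E-journal B", "bib_050": "QA 164 .B1"},
    {"title": "Print monograph C", "holdings_852h": "HD30.2 .C3"},
]
print(collocate(records))
```

Falling back from 852$h to 050 is one possible way to let e-journals, which have no call number in a holdings record, appear in the same browse as print. Whether a local system can index them that way is a separate question that this sketch does not answer.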

Editor's Notes

  1. Acknowledgement: Prof. Isael Moskowitz from NYU; Andrew Sankowsi, Cynthia Chambers and Theresa Maylone from St. John’s Univ. Libraries
  2. Definition: Subject access refers to finding and locating the ‘Aboutness’ of a named entity (person, family, corporate body) or a concept, object, event, and place. To make this happen, on the document processing side, we do subject analysis and other processing as follows: Provide classification to the document; Sometimes, provide categorization to the document; Describe the Aboutness of the document, e.g. identify the named entity, concept, object, event, and place; Tag the named entity, concept, object, event and place in a controlled manner (e.g. provide authority control for the named entity and subjects, including thesauri such as AAT, LCSH, MESH); Index them, hoping that the query the searcher enters into the system matches a term that we have in the index; Usually store the tags/metadata in relational database systems, and the associated documents in flat file systems. By the way, this is what people call ‘semi-structured’ data. Unstructured data refers to documents such as .doc files, .txt files, .xls files, email, and telephone transcripts. Structured data refers to data in relational database systems, object-oriented database systems, and other structured systems. The world has 20% of its data in structured systems, and 80% in unstructured and semi-structured systems. Subject access enhancement refers to providing integrated subject access to structured, semi-structured, and unstructured data. FocusOnSearch and CategoryMap are considered essential components for enhancing subject access to such data.
  3. Google Search example – The result list does not differentiate ‘books written by Henry George himself’ from ‘books or topics about Henry George.’ OPAC search comes in handy, as we mark up both books written by Henry George and books or topics about Henry George in the bib records.
  4. Google works for known item search, e.g. “ACM Communications in Computer Algebra” by title. Google does not work for unknown item search, e.g. “Algebra-Data Processing-Periodical”. The above title cannot be found on the first page.
  5. OPAC supports both known and unknown item search. Example of unknown item search in OPAC – WebVoyage by subject keyword “AND” with relevance ranking in simple search mode.
  6. Unknown item search in OPAC – WebVoyage by subject Boolean keyword “AND” in advanced search mode.
  7. OPAC rendering result for unknown item search by subject keyword in OPAC - WebVoyage
  8. Unknown item search in OPAC – WebVoyage by subject browse.
  9. Use LC classification QA 150-272 to group items whose “Aboutness” is Algebra; QA 155.7.E4 is Algebra – Electronic Data Processing. Unknown item search in OPAC – WebVoyage by LC classification browse in a bib. This is still a challenge. Why?
  10. Only print collections can be collocated using call no. browse. E-J collections cannot be browsed even though LC classification exists in the 050 field of a bib record. In Voyager, the call number index and browse come out of MARC holdings field 852$h, rather than the classification number in a bib. Collocating print, electronic, and other types of collections under LC classification is still a challenge. Serials Solutions supplies subject categories to the full-text e-j A-Z list using LC classification. A workaround can be made for the Serials Solutions MARC title list by using the same subject category scheme as the one used for the full-text e-j A-Z list. The note fields in slide number 21 indicate the implications for ILS operations and others.
  11. Full-text e-j portal on the library Web supports only known-item search by title and ISSN.
  12. Full-text e-j portal on the library Web supports unknown item browse by subject. Two titles have been highlighted: 1) ACM communications in computer algebra; and 2) Annals of combinatorics.
  13. The next few slides are examples of a page rendering from the University website in a single search box: Algebra Electronic Data Processing, Combinatorics, Henry George, Charles Wankel, etc. A search of the term “Algebra Electronic Data Processing” retrieves schools and bulletin info on the university website.
  14. A search of the term “Combinatorics” retrieves related academic events, faculty info, and school bulletins in 2004. Page rendering at the University Website – should we group the result list as bread crumbs or a folder structure for named entity, concept, object, event, place, timeline, etc.? What about lifecycle maintenance of the web page content?
  15. A search of the term “Henry George” retrieves related academic events and a library resource info page on the university Website. Group the “Aboutness” of “Henry George” into a sense-making page regardless of the location of the page?
  16. Identify works “BY” or “ABOUT” Charles Wankel on the University Website by linking between University Website and OCLC WorldCat Identities Services for Named Entities Resolution? More examples of Charles Wankel page lookup from OCLC WorldCat Identities Services in the next few slides.
  17. More Examples on Result List – It supports horizontal and vertical scans, and featured recognition about the named entity “Wankel, Charles” on OCLC WorldCat Identities. Link between University Website and OCLC WorldCat Identities Services for Named Entities Resolution?
  18. Link between University Website and OCLC WorldCat Identities Services for Named Entities Resolution?
  19. Link between University Website and OCLC WorldCat Identities Services for Named Entities Resolution?
  20. Link between University Website and OCLC WorldCat Identities Services for Named Entities Resolution?
  21. The diagram depicts a snapshot of the information infrastructure for the University resources, especially in regard to faculty, and Libraries. FocusOn Search and CategoryMap sit on top of the information discovery layer, building bridges, e.g. among faculty, university resources and libraries. They enable us to understand who the users are, and what processes are involved in info creation and consumption, especially in regard to faculty. More category types to mark up faculty activities, university resources and the library? What is considered input, and what is considered output? What are the processes that generate the output? How does information flow between the processes? This diagram details the facts to collect and mark up at the contextual level.
  22. This diagram indicates the flows of the systems. We aggregate contents through the aggregation of technologies, and distribute the contents to users. Librarians deploy systems, such as Collection Development, Cataloging, and LibGuide, capable of selecting, organizing, accessing, guiding, enhancing, and distributing contents to the user through technologies. Yet, there are still complaints … Where is the user’s behavior context? We index tons of info and present it to the user without any filtering, e.g. by who the users are and what they are looking for.
  23. On the document side, if we have a CategoryMap, it will: Look up and consume vocabulary services provided by LC, NLM, OCLC, and Getty in manual and batch modes; Process vocabulary and enable the choice of the appropriate form of a named entity in reference to terms clustered by applications, tagged by end-users, or structured in a classification scheme; Distribute the contents to the end-users through the analysis of existing collections, activities and users; Classify the users’ behavior context better?
  24. There are objects to be embedded within the front end of FocusOn Search and CategoryMap. The objects selected for insertion in a Word document are: St. John’s Logo; Login/Create My Account; User preferences; Simple search and advanced search modes; Suggest; Reset; Email; Print; AskUs; Exit. The “Preview” button is expected to show the full text of ‘Search results selected’ when limiting to online only, etc. Refine search results by subject, and then limit the subject to concept only. By clicking browse CategoryMap, relationships among highlighted subject terms about the person can be explored from OCLC Named Entities for the person. St. John’s FocusOn Search as Google Gadget. ‘TextThis’ is the button to send a few final result sets to a mobile phone, mocked up from North Carolina State University’s Quick Search: http://www.lib.ncsu.edu/catalog/ The button ‘Save’ means ‘Save To Bag’ for further processing. After ‘Save to Bag’, users have the choice of saving the items into ‘my library.’ The list of ‘Add Note’, ‘Edit labels’, ‘Write review’, and ‘Remove’ will appear in the brief item listing display. Two books in the user’s library are selected for such display, extracted and mocked up from Google Books. The label ‘Add note’ applies to the entire banner of the 1st book in brief display. The label ‘Write review’ applies to the entire banner of the 2nd book in brief display. User-created labels will be indexed by CategoryMap. Two trails of bread crumbs for folder navigation are designed to integrate FocusOn Search with the existing Websites of the University and Libraries. The top one sits right above the user actions for Print, Attach/RSS, Libraries, Text This, Reformat and Gadget. It indicates users’ paths, e.g. Home > Academics & School > Libraries > Resources > Focus On > About Henry George. Clicking on the trail leads users back to the next higher level of the folder-structured trail. The second trail, at the bottom of the page, indicates available services provided by the University, including feedback, privacy, safety, sitemap, and copyright information, etc. Clicking on the trail leads users into the services provided by the University and the Libraries.
  25. Build CategoryMap into the session configuration for the existing cataloging client, whether it is browser-based or window-based, for a single user. Validation of content and record structure within CategoryMap. The example shows how record structures such as Atom and Dublin Core can be accommodated and validated in such an environment, including heading types, e.g. category, etc.
  26. Client-configurable CategoryMap connection options to consume data services from a list of databases, e.g. WorldCat, LC authority files, NLM Mesh, Getty AAT, NLC Authority file, dictionaries, and commonly used reference tools, etc.
  27. Build CategoryMap into the session configuration for general holdings library, including choice of call no. hierarchies, import and duplicated profiles, etc.
  28. Build CategoryMap into the session configuration for format specific holdings library if MARC format is chosen
  29. Build CategoryMap into the session configuration for format-specific item in cataloging client, where item-level category is displayed as category code, e.g. 900. Build CategoryMap into the session configuration for format-specific item in circulation client, where item-level category is displayed as category name description, e.g. Management - Tobin.
  30. Centralized catalog: 1. Is part of the common service of the discovery layer, sitting on top of existing university information resources and Libraries on the Web, ILS (Integrated Library Systems), university resource planning systems (enterprise legacy systems), teaching and learning systems, and discipline-specific research repositories at institutional and regional level once the systems implemented in full-scale; 2. Provides interfaces for human-machine and machine-machine communication, interaction, collaboration, problem solving, and decision support; 3. Provides an inventory of structured data (xml, RSS, atom) and unstructured data (email, web page, .doc, .pdf, .excel) via a set of meta-data records. A meta-data record conformed to the institutional and industry standards describe the of-ness and about-ness of an information object and provide links to the object. Media Type:All media types in the catalog will be given descriptive meta-data for media type identification, discovery, search and retrieval, and linkage. 1. Like the rest of the collections in the catalog, they are classified for role-based access, arranged alphabetically for browsing, categorized for discovery, filtered, ETL and indexed for search and retrieval, recommended for reputation, top-ranked for analysis and other processes in the pipeline, and linked for obtaining the media object locally or mashing up with external applications remotely via public available APIs on top of HTTP and enterprise service bus within the private cloud computing environment.  2. The administrative and structural metadata for the maintenance and manipulation of each media type (e.g. 
reformatting images, videos, and audios) as a media object is beyond the scope of this project at the moment.NAMED ENTITIESThe named entity for a person, family, and corporate is considered as an information object that comes with the following attributes when appropriate: Zip-code, address, country;Area code, phone number, device profiles, etc.; Web page and email in the form of URI;Language;Timeline that is specific to a named entity. For a person, timeline refers to dates associated with the person’s birth date, death date, and period of activity in Gregorian calendar; Category appropriate to the level of granularity of the information object, e.g. skills and specialty for a person, and correlated with: subject terms clustered by an application; controlled vocabulary such as LCSH and MESH provided by a lookup; user-tagged terms; classification scheme such as LC classification and Dewey; Association related to the about-ness of a named entity. For a person, the associated attributes are not limited to the followings, e.g. title, gender, affiliation, field of activity, occupation and biographical information. At runtime, a search of the named entity of a person, all resources, works, expressions, manifestations and items about the named entity will be retrieved and displayed along with the bio info of the person; Association related to the of-ness of a named entity. At runtime, a search of the named entity of a person, all works, expressions, manifestations, and items created by the named entity will be retrieved and displayed based on content model for rendering;Relationships between named entities for persons, families, and corporate bodies are tagged, mapped, grouped, and visualized according to user-tagged terms, association rules, classification, and user profiles specified in web form during initial registration. A user can also modify such relationship manually. 
The backend systems will recommend additional relationships by running a recommendation engine on behalf of the user; Top-ranked for other processes in the pipeline, e.g. supporting collection development decision, users and collection performance analysis, e.g. query expansion; Like media type, the specific named entity, e.g. person, will be linked and mashed up for obtaining the aboutness and of-ness of a person, locally and remotely via public available APIs on top of HTTP and ESBs within the private cloud computing network; Privacy, copyright, and information security, including opt-in and opt-out option for the named entities to be exposed and shared across the enterprise; The output of the focused page can also be rendered for import and export, RSS, preview, citation list generation, sharing, printing, email and texting in user-defined formats and devices. Other entities such as concept term, object name, event name, and geographic name will carry similar system functionality and capability as the named entities for persons, families, and corporate bodies. At run-time, given a concept term, for instance, works, expression, manifestations, and items related to the concept term will be retrieved and displayed regardless of its structure, media type, format, repository, etc. according to the classification of the documents, controlled vocabulary, role-based access, and content models for rendering. At run-time, the relationship between the concept term, for instance, and its broader terms, narrower terms, used terms, etc. can be exposed and consumed by other applications, which might take it as an input for making choices and validation of the form of a name or subject, assigning classification and subject terms to the resources, in addition to the development and maintenance of the vocabulary for categories. 
The search facility in FocusOn Search will suggest spelling corrections based on patterns, rules, keywords, phonetics, synonyms, dictionaries, and controlled vocabulary within one dialogue box in a single interface. It will also suggest categories that facilitate discovery, based on statistical analysis of queries, documents, user profiles and activities, usage, and vocabulary services consumed from other vocabulary service providers. For geographic names, zip code and area code processing will be part of the application where applicable. Ideally, Google Maps API lookup should be supported as well.
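The spelling suggestion described above can be illustrated with a minimal sketch. It assumes the controlled vocabulary is available as a plain list of headings and uses only approximate string matching from Python's standard library; the vocabulary contents, the function name `suggest_terms`, and the cutoff value are illustrative assumptions, not part of FocusOn Search.

```python
from difflib import get_close_matches

# Hypothetical controlled vocabulary. A real deployment would load headings
# from a vocabulary service (e.g. an LCSH lookup) instead of hard-coding them.
CONTROLLED_VOCABULARY = ["Single tax", "Land tenure", "Economics", "Algebra"]


def suggest_terms(query, vocabulary=CONTROLLED_VOCABULARY, limit=3, cutoff=0.6):
    """Return up to `limit` vocabulary terms that closely match `query`.

    Matching is case-insensitive; results keep the vocabulary's own casing.
    """
    by_lowercase = {term.lower(): term for term in vocabulary}
    matches = get_close_matches(
        query.lower(), list(by_lowercase), n=limit, cutoff=cutoff
    )
    return [by_lowercase[match] for match in matches]


if __name__ == "__main__":
    # A misspelled query is matched against the vocabulary.
    print(suggest_terms("Singel tax"))
```

String similarity covers only the "patterns" part of the description; phonetic matching, synonyms, and usage statistics would need additional signals combined into the ranking.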
  31. Fine-grained taxonomy management is important not only for subject searches but also for mission-critical operations at the University and Libraries. For the Libraries, for example, it is important to make informed decisions about what we are doing and how well we are doing it through baselining and reporting on user services, collection management, circulation, acquisitions, cataloging, etc. The CategoryMap application, along with its program, will link these processes across the units of the Libraries and the University. Therefore, it is our job to maintain such a taxonomy for the reuse and sharing of enterprise-wide information resources among ERP systems, ILS, institutional repositories, etc., in conformance with institutional and industry standards. The CategoryMap will serve as the backbone of an enterprise’s common data services, in addition to the time of day and locations. The CategoryMap will manage category terms, which can take the form of a concept, object, event, or place, harmonized from subject terms that are:
Clustered by an application;
Looked up through controlled vocabularies such as LCSH, MESH, and AAT;
Tagged with user-defined terms;
Structured by LC and Dewey classification;
Referenced directly from the fund expenditure structure in acquisitions;
Analyzed based on usage statistics reports aggregated from circulation, content suppliers, etc., and on the number of documents/objects likely to carry the category term;
Managed in a knowledge base for vocabulary filtering, mapping, ETL, etc., and in a data warehouse for data mining.
The search facility will also handle query processing in relational database management systems and ontological database management systems.
Relationships between concepts, objects, events, and geographic names are constructed according to controlled vocabularies developed by LC, NLM, and Getty.
All named entities such as personal name (PN), family name (FN), corporate name (CN), concept term (CT), object name (ON), event name (EN), geographic name (GN), and timeline (TN) in a metadata record will have their own authority records stored and maintained centrally in a logical/physical name resolver facility distributed globally by authorized vocabulary service providers such as LC, OCLC, the British Library, and the National Library of Canada on the Web. Named headings in the authority records at a name resolver facility such as OCLC WorldCat are:
Constructed in conformance with tagging standards and rules;
Contributed by a community of users who have defined their roles and responsibilities in service contribution and consumption, and who have registered and exposed their services with major vocabulary service providers;
Validated by templates, encoding levels, schemas, name authority files, controlled vocabularies, reference tools, and business rules;
Governed for the enforcement of policies, service level agreements (SLAs), operational level agreements (OLAs), service reconciliation, service lifecycle management, compliance, SSO (Single Sign On), etc.;
Monitored, measured, and reported for information quality, fiduciary, and security.
The CategoryMap application will perform dynamic lookup or batch processing for named entities and subjects in a name resolver facility via Web services for service consumption. 
User-tagged terms collected in this manner will be reviewed, card-sorted, and integrated into a master list of commonly used vocabulary before they are contributed to the vocabulary service providers when appropriate. The application will map a user-tagged term for an object to its variant name, preferred form of name, and default form of name, as appropriate to the user’s choice, according to statistical processing, tag-based ranking algorithms, and other methods. See the references for the information criteria defined by the COBIT Conceptual Framework and the ISACA Model Curriculum.
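The mapping from a user-tagged term to a preferred form of name can be sketched as a lookup over variant forms. This is a minimal illustration, not the CategoryMap implementation: the heading "George, Henry, 1839-1897" is the example used later in these slides, while the variant forms, the record layout, and the function names are assumptions made for the example. Statistical and tag-based ranking are left out.

```python
# Hypothetical authority records: each preferred heading lists variant forms.
# A real system would obtain these from a name resolver facility.
AUTHORITY_RECORDS = {
    "George, Henry, 1839-1897": ["Henry George", "George, Henry"],
}


def normalize(term):
    """Lowercase a term and collapse runs of whitespace for comparison."""
    return " ".join(term.lower().split())


def build_variant_index(records):
    """Map every normalized preferred or variant form to its preferred heading."""
    index = {}
    for preferred, variants in records.items():
        for form in [preferred, *variants]:
            index[normalize(form)] = preferred
    return index


def resolve_tag(tag, index):
    """Return the preferred heading for a user tag, or None if it is unknown.

    Unknown tags are the ones that would go to review and card sorting.
    """
    return index.get(normalize(tag))


VARIANT_INDEX = build_variant_index(AUTHORITY_RECORDS)

if __name__ == "__main__":
    print(resolve_tag("  henry   GEORGE ", VARIANT_INDEX))
    print(resolve_tag("land value tax", VARIANT_INDEX))
```

Returning `None` for unmatched tags keeps the review step explicit: only tags that fail the lookup need a librarian's attention before they join the master list.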
  32. There are two tiers: 1) Cloud tier – user processes on the internet (OS for Browser); 2) Vocabulary tier – document processes on the intranet (OS for Windows); Sync desktop application from both tiers;
  33. DFD (Data Flow Diagram) Context Level for FocusOnSearch
  34. Reference: ER Diagram for RDA Taxonomy: High-Level Relationship Among Entities by IMT (Information Management Team). 1. Uncontrolled access points, explanatory headings, community-generated tags, etc. are excluded from the diagram.
  35. Example of the named entity - Person: George, Henry, 1839-1897, using the LC Authority File
  36. Example of books about Henry George marked up in MARC 600 field. The personal heading has been established in LC Authority File.
  37. This ER Diagram indicates the entity relationships among named entities and subjects (e.g. concept, object, event, place). Reference: ER Diagram for RDA Taxonomy: High-Level Relationship Among Entities by IMT (Information Management Team) (4 of 8)
  38. LC subject authority indicates relationship between topical headings – ‘Single tax, Land, Nationalization of, etc.’
  39. Here is how it is marked up in the authority file.
  40. Reference: ER Diagram for RDA Taxonomy: High-Level Relationship Among Entities by IMT (Information Management Team) (7 of 8)
  41. Here is refined search by subject.
  42. System Flow Chart for FocusOn Search and CategoryMap: Info Sharing Processes for Enterprise Wide Information Discovery
  43. The CategoryMap has to leverage a vocabulary framework such as Topic Map as its formal taxonomy building block, which sits on top of commonly used thesauri such as LC LCSH, NLM MESH, and Getty AAT; in addition, it presents the topic map and other vocabulary processing features to FocusOn Search in the discovery layer. On the one hand, we will leverage existing vocabulary frameworks such as OCLC WorldCat Identities by developing service consumption applications; on the other hand, we have to collaborate actively with others in developing the common vocabulary infrastructure for the Web.
  44. 1. Info Sharing Processes for Enterprise Wide Information Discovery 2. Maintain, Trace, Track, Analyze, Report
  45. Continue to collect sample unstructured source data from the St. John’s University Web site, such as the Tobin College of Business faculty page for Dr. Charles Wankel, and integrate the page using the CategoryMap application, which will be integrated into the Discovery Layer for FocusOn Search.
Continue to collect sample unstructured source data composed by a group of librarians as the Libraries’ guides to events of current and future interest and published on the St. John’s University Web site, such as the Topic Guide titled “Focus on Henry George”.
Continue to collect sample unstructured source data from OCLC WorldCat Identities services for named entity resolution, using the LCCN as the identifier to locate the personal name page for Dr. Charles Wankel.
Continue to collect sample structured data to be syndicated from Google Books by FocusOn Search, using Henry George’s “Dreamer or Realist” as a use case for developing the detailed display of a selected item in the Front End of FocusOn Search.
To syndicate data feeds from the university resources and Libraries on the Web, the ‘Attach’ button would allow the system to obtain HTML pages and their associated files (e.g. PDF, Excel, Word, etc.) from sites recommended by the discovery layer of FocusOn Search. The file filtering layer prebuilt within FocusOn Search will automatically convert the native pages into format-independent files, ready to be reviewed, extracted, transformed, and loaded (ETL), and integrated with the repositories of FocusOn Search.
A plug-in metadata conversion utility will capture the attached metadata and convert it into a centralized metadata repository for the entire discovery layer, ready to be reused by other applications.
The ‘RSS’ button is going to store dynamic content on the Web. Special change management strategies, packages, and techniques have been evaluated, e.g. Rational Asset Manager, for SOA services, etc.
‘Reformat’ is an export facility that presents users with output options for further processing, e.g. RefWorks.
All the cataloged resources are expected to have a zip code lookup function, to be interfaced with Google Maps, and to be localized in the way the systems behave in OCLC Open WorldCat (http://www.worldcat.org/). Such visualization features are expected to be implemented after final refinement.
Two sample result sets indicate that the discovery layer of FocusOn Search will send open API requests to a list of service providers, dynamically determining the appropriate copy to present if there are multiple choices and the appropriate format template to use for rendering, based on the following criteria: a) preferences predefined by the users; b) pre-processed OpenURL links according to known contracts, service level agreements, and trust management; c) patterns, heuristic rules, statistical analysis, and data mining of resources, users, activities, etc. in the data warehouse and the knowledge base of the discovery layer.
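The copy selection described in the last paragraph can be sketched as a simple fallback chain. The sketch assumes the three criteria are applied in the order listed (user preference first, then contracted providers, then a usage-based score); the slides do not state that order, so it is an assumption. The `Copy` record, its fields, and the example URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Copy:
    """One available copy of a resource (hypothetical record layout)."""
    provider: str
    url: str
    under_contract: bool = False  # criterion b: known contract / SLA
    usage_score: float = 0.0      # criterion c: score from usage analysis


def choose_copy(
    copies: Sequence[Copy], preferred_provider: Optional[str] = None
) -> Optional[Copy]:
    """Pick the copy to present when several providers offer the same item."""
    if not copies:
        return None
    # a) A provider predefined by the user wins outright when available.
    if preferred_provider is not None:
        for copy in copies:
            if copy.provider == preferred_provider:
                return copy
    # b) Otherwise restrict to contracted providers, if there are any.
    contracted = [copy for copy in copies if copy.under_contract]
    pool = contracted or list(copies)
    # c) Within the remaining pool, take the highest usage-based score.
    return max(pool, key=lambda copy: copy.usage_score)


if __name__ == "__main__":
    available = [
        Copy("ProviderA", "https://example.org/a", usage_score=0.9),
        Copy("ProviderB", "https://example.org/b", under_contract=True, usage_score=0.2),
    ]
    print(choose_copy(available))
    print(choose_copy(available, preferred_provider="ProviderA"))
```

The same chain could select the format template for rendering, by replacing the copy records with template records and keeping the three-step order.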