Shenghui Wang
Rob Koopman
Exploring a world of
networked information
built from free-text
metadata
OCLC Research EMEA
ELAG2015
What would you do if you were
interested in a topic?
Difficult to answer these questions:
• What are the different aspects of this topic?
• Are there related aspects missing in my search terms?
• Who are the most prominent authors about this topic?
• Which journals publish most about this topic?
• How have others — e.g. librarians — described and classified
this topic?
Demo
• http://thoth.pica.nl/relate?input=opac
How do we do this?
• OFFLINE: generate a semantic representation
for each entity
• ONLINE: find the most related entities and
display them using multidimensional scaling
Build semantic representation
• Basic assumptions
– An entity can be represented by its context
– Entities which share more context are more likely
to be related
• Context is the textual environment where an
entity occurs
• The effects of state prekindergarten programs on young
children’s school readiness in five states
• [author:jung kwanghee]
• [subject:readiness for school]
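The example above can be turned into context counts along these lines. This is only a minimal sketch: the record layout, the stopword list, the tokenizer and the function names are all assumptions made for illustration, not the pipeline used for the slides.

```python
from collections import Counter, defaultdict

# Hypothetical record layout: a title plus the bracketed entities attached to it.
records = [
    {
        "title": "The effects of state prekindergarten programs on young "
                 "children's school readiness in five states",
        "entities": ["author:jung kwanghee", "subject:readiness for school"],
    },
]

STOPWORDS = {"the", "of", "on", "in", "a", "an", "and"}  # illustrative only


def tokenize(text):
    """Lowercase, split on whitespace, strip edge punctuation, drop stopwords."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]


def build_context(records):
    """Map each entity to a Counter of the terms it co-occurs with."""
    context = defaultdict(Counter)
    for rec in records:
        terms = tokenize(rec["title"])
        for entity in rec["entities"]:
            context[entity].update(terms)
    return context


context = build_context(records)
print(context["author:jung kwanghee"].most_common(3))
```

Each entity ends up with one row of term counts, so two entities that appear in similar titles get similar rows.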
Dataset
● ArticleFirst, 65 million articles
● Selected 4 million entities (topical terms,
authors, ISSNs, Dewey decimal codes)
● Represented by 1 million topical terms
But a matrix of 4M x 1M is too big to process
Dimension reduction based on Random Projection
C: a co-occurrence matrix
R: a random matrix with entries +1 or -1
C’: an approximation of C after random projection (the semantic matrix)
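A toy version of the projection, assuming it is computed as C' = C R with a 1/sqrt(k) scaling. The slides give only the three matrices, so the product form, the scaling, the reduced dimension k and the matrix sizes below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions); the slides describe roughly 4M entities x 1M terms.
n_entities, n_terms, k = 50, 2000, 256

# C: sparse-looking co-occurrence counts, filled with made-up data.
C = rng.poisson(0.05, size=(n_entities, n_terms)).astype(float)

# R: random matrix whose entries are +1 or -1.
R = rng.choice([-1.0, 1.0], size=(n_terms, k))

# C': each entity is now described by k numbers instead of n_terms.
C_reduced = C @ R / np.sqrt(k)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


before = cosine(C[0], C[1])
after = cosine(C_reduced[0], C_reduced[1])
print(C_reduced.shape, round(before, 3), round(after, 3))
```

The similarity between two rows is roughly preserved, with an error that shrinks as k grows, which is why the much smaller matrix can stand in for C.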
Online interface
• Find mutual nearest neighbors
• Use multidimensional scaling to display
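The display step can be sketched with classical (Torgerson) multidimensional scaling. The slides do not say which MDS variant or which distance measure is used, so both are assumptions here, and the four points are made up.

```python
import numpy as np


def classical_mds(D, n_components=2):
    """Embed points in n_components dimensions from a symmetric distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Toy check: the four corners of a unit square.
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(D)
D_hat = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.allclose(D, D_hat))
```

For an interface like the one described, D would hold distances between the selected related entities (for example 1 minus cosine similarity), and coords would give their positions on screen.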
Nearest neighbors
Mutual nearest neighbors
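The difference between the two can be sketched as follows, assuming cosine similarity over the semantic vectors. The function names and the toy vectors are hypothetical.

```python
import numpy as np


def top_k_neighbors(vectors, k):
    """For each row, the indices of its k most cosine-similar other rows."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # an entity is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def mutual_neighbors(vectors, query, k):
    """Neighbors of `query` that also list `query` among their own top k."""
    nn = top_k_neighbors(vectors, k)
    return [int(j) for j in nn[query] if query in nn[j]]


# Made-up vectors: rows 0 and 1 are close to each other; row 4 is closest
# to row 1, but row 1 prefers row 0.
vectors = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
    [0.7, 0.3],
])

print(top_k_neighbors(vectors, 1)[4])   # nearest neighbor of row 4
print(mutual_neighbors(vectors, 0, 1))  # row 0 and row 1 pick each other
print(mutual_neighbors(vectors, 4, 1))  # row 4 has no mutual neighbor at k=1
```

Requiring the relation to hold in both directions filters out entities that are merely close to everything.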
Possible applications
• Explorative interface
• Context based search:
– brain
• Journal finder
– Arctic ice journals
– http://brain.oxfordjournals.org/
• Author name disambiguation
– pre kindergarten
Context matters!
• What does “young” mean in
– ArticleFirst
– WorldCat
– Astrophysics
– Art
Ariadne
(demo) http://thoth.pica.nl/relate
• An extremely fast way of navigating large-scale
heterogeneous entities
• Generalisable to different datasets
– Full WorldCat
– Small but highly curated astrophysics dataset
• Supports explorative information retrieval and
entity disambiguation
References
• Koopman, Rob, and Shenghui Wang. 2014. “Where Should I Publish? Detecting
Journal Similarity Based on What Has Been Published There.” In Proceedings of
Digital Libraries 2014, 483–484. London, United Kingdom. Association for
Computing Machinery.
• Koopman, Rob, Shenghui Wang, Andrea Scharnhorst, and Gwenn Englebienne.
2015. “Ariadne’s Thread — Interactive Navigation in a World of Networked
Information”. In CHI '15 Extended Abstracts on Human Factors in Computing
Systems. ACM, Seoul, South Korea.
• Koopman, Rob, Shenghui Wang and Andrea Scharnhorst. 2015. “Contextualization
of topics - browsing through terms, authors, journals and cluster allocations”. In
Proceedings of 15th International Conference on Scientometrics & Informetrics.
Istanbul, Turkey.
Explore. Share. Magnify.
Thank you
Shenghui Wang
Rob Koopman
OCLC Research EMEA
shenghui.wang@oclc.org
rob.koopman@oclc.org

 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 

Exploring a world of networked information built from free-text metadata

  • 1. Shenghui Wang Rob Koopman Exploring a world of networked information built from free-text metadata OCLC Research EMEA ELAG2015
  • 2. What would you do if you are interested in a topic?
  • 3.
  • 4.
  • 5. Difficult to answer these questions: • What are the different aspects of this topic? • Are there related aspects missing in my search terms? • Who are the most prominent authors about this topic? • Which journals publish most about this topic? • How have others — e.g. librarians — described and classified this topic?
  • 7. How do we do this? • OFFLINE: generates a semantic representation for each entity • ONLINE: finds the most related entities and uses multidimensional scaling to display them
  • 8. Build semantic representation • Basic assumptions – An entity can be represented by its context – Entities which share more context are more likely to be related • Context is the textual environment where an entity occurs • The effects of state prekindergarten programs on young children’s school readiness in five states • [author:jung kwanghee] • [subject:readiness for school]
  • 9. Dataset ● ArticleFirst, 65 million articles ● Selected 4 million entities (topical terms, authors, ISSNs, Dewey decimal codes) ● Represented by 1 million topical terms But a matrix of 4M x 1M is too big to process
  • 10. Dimension reduction based on Random Projection C: a co-occurrence matrix R: a random matrix of +/-1 C’: approximation of C after random projection -- Semantic matrix
  • 11. Online interface • Find mutual nearest neighbors • Use multidimensional scaling to display
  • 14.
  • 15. Possible applications • Explorative interface • Context based search: – brain • Journal finder – Arctic ice journals – http://brain.oxfordjournals.org/ • Author name disambiguation – pre kindergarten
  • 16. Context matters! • What does “young” mean in - ArticleFirst - WorldCat - Astrophysics - Art
  • 17. Ariadne (demo) http://thoth.pica.nl/relate • An extremely fast way of navigating large-scale heterogeneous entities • Generalisable to different datasets – Full WorldCat – Small but highly curated astrophysics dataset • Supports explorative information retrieval and entity disambiguation
  • 18. References • Koopman, Rob, and Shenghui Wang. 2014. “Where Should I Publish? Detecting Journal Similarity Based on What Has Been Published There.” In Proceedings of Digital Libraries 2014, 483–484. London, United Kingdom. Association for Computing Machinery. • Koopman, Rob, Shenghui Wang, Andrea Scharnhorst, and Gwenn Englebienne. 2015. “Ariadne’s Thread: Interactive Navigation in a World of Networked Information.” In CHI '15 Extended Abstracts on Human Factors in Computing Systems. ACM, Seoul, South Korea. • Koopman, Rob, Shenghui Wang, and Andrea Scharnhorst. 2015. “Contextualization of topics - browsing through terms, authors, journals and cluster allocations.” In Proceedings of 15th International Conference on Scientometrics & Informetrics. Istanbul, Turkey.
  • 19. Explore. Share. Magnify. Thank you Shenghui Wang Rob Koopman OCLC Research EMEA shenghui.wang@oclc.org rob.koopman@oclc.org
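
The offline and online steps on slides 8 to 11 can be illustrated with a small sketch. This is not the authors' implementation: the entity labels, counts, the projected dimension `k=16`, and the use of cosine similarity are all assumptions made for illustration, and the toy data is invented. It only shows the idea of projecting a sparse co-occurrence matrix C with a random +/-1 matrix R to get a smaller semantic matrix C', then keeping neighbors that are mutual. The multidimensional scaling step used for display is left out.

```python
import math
import random


def random_projection(cooc, n_terms, k, seed=0):
    """Compute C' = C * R, where R is an n_terms x k matrix of +1/-1 entries.

    cooc maps each entity to a sparse row {term_index: count}.
    """
    rng = random.Random(seed)
    R = [[rng.choice((-1, 1)) for _ in range(k)] for _ in range(n_terms)]
    projected = {}
    for entity, row in cooc.items():
        vec = [0] * k
        for term, count in row.items():
            for j in range(k):
                vec[j] += count * R[term][j]
        projected[entity] = vec
    return projected


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbors(vectors, n):
    """For every entity, list the n other entities with the highest cosine."""
    result = {}
    for entity, vec in vectors.items():
        scored = sorted(
            ((cosine(vec, other_vec), other)
             for other, other_vec in vectors.items() if other != entity),
            reverse=True,
        )
        result[entity] = [other for _, other in scored[:n]]
    return result


def mutual_nearest_neighbors(vectors, query, n):
    """Keep only neighbors of `query` that also list `query` as a neighbor."""
    nn = nearest_neighbors(vectors, n)
    return [other for other in nn[query] if query in nn[other]]


# Hypothetical toy data: entity -> {topical-term index: co-occurrence count}.
cooc = {
    "entity_a": {0: 2, 1: 1},
    "entity_b": {0: 4, 1: 2},  # same context as entity_a, seen twice as often
    "entity_c": {1: 1, 2: 3},
    "entity_d": {3: 5},
}

vectors = random_projection(cooc, n_terms=4, k=16)
print(mutual_nearest_neighbors(vectors, "entity_a", n=2))
```

At the scale mentioned on slide 9 (4 million entities by 1 million terms), a sparse matrix library would be needed instead of Python lists. The brute-force neighbor search here is quadratic and only suitable for a toy example.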

Editor's Notes

  1. Opac -> journal -> author -> [author:medeiros norm] -> worldcat. Ambiguous names: [author:balas janet l], [author:balas j l]
  2. Journal finder; name disambiguation