Semantically Indexed Hypermedia:
Linking Information Disciplines
Douglas Tudhope and Daniel Cunliffe
University of Glamorgan Web: http://www.glam.ac.uk/
Hypermedia Research Unit Web: http://www.comp.glam.ac.uk/pages/research/hypermedia/
School of Computing Web: http://www.comp.glam.ac.uk/
Pontypridd, CF37 1DL, Wales, UK
Email: dstudhope@glam.ac.uk, djcunlif@glam.ac.uk
Web: www.comp.glam.ac.uk/people/staff/dstudhope, www.comp.glam.ac.uk/people/staff/djcunlif

Categories and Subject Descriptors: H.5.4 [Information Interfaces and Presentation]:
Hypertext/Hypermedia; H.3.1 [Information Storage and Retrieval]: Thesauruses; H.3.3 [Information
Storage and Retrieval]: Information Search and Retrieval; H.3.5 [Information Storage and Retrieval]:
web-based services; H.3.7 [Information Storage and Retrieval]: Digital Libraries
Additional Key Words and Phrases: Semantic index, Semantic distance measures, Metadata, Dublin Core

Semantic linking has always been a strand of hypermedia research and is becoming central to current attempts to
facilitate access to information in large hypertexts and the emerging 'semantic web' [Berners-Lee 1998a]. Due to the
scaling problems with explicitly authored links between information items, it is likely that future large scale
hypertexts will employ a mixture of authored links and indirect, computed links via some form of indexing system.
Problems of information access are heightened by the lack of precision of current WWW retrieval technology and
users unfamiliar with indexing conventions. There is a critical need for tools that will assist users to formulate and
refine queries, and navigate through information spaces. Recent years have seen the growth of metadata, Digital
Libraries, and interest in the application of traditional information science and library cataloguing techniques to the
new environment of hypertext and the WWW. Semantic indexing provides a bridge between the various information
disciplines. With the growing influence of the Resource Description Framework [Lassila 1999], semantic tagging
and cataloguing of information is likely to become a key component of the information architecture of intranet
hypertexts and the WWW.

1 Semantic indexing
Representing semantic knowledge about a domain or application area in order to facilitate access to information has
been a major focus in hypermedia since its early days (e.g. [Collier 1987], [Trigg 1986]). One approach has been to
assign semantic labels or more formal typing to authored hypertext links [Nanard 1991], [Schnase 1993]. Another
approach, the one followed here, includes a semantic index layer in its model of hypermedia architecture. In addition
to explicitly authored links, each information item is indexed with descriptor terms - frequently more than one term
will be required. Frisse and Cousins [Frisse 1989] first introduced the notion of separate index and document spaces
to hypertext, observing that different conformations of those spaces allow for different possibilities in automated
reasoning.
__________________________
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided
that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on
the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is
permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or
a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org
© Copyright 2000 ACM 0360-0300/99/12es

Different types of indexing system are possible. It is useful to categorise indexing systems according to
three dimensions [van Rijsbergen 1979]:
1. whether index terms are automatically derived or manually assigned.
2. whether index terms belong to a controlled vocabulary or are uncontrolled ('free').
3. whether terms can be combined as ordered strings representing a single concept when indexing (pre-coordinated
terms), e.g. "Association for Computing Machinery", or must be post-coordinated on retrieval. The latter allows
the possibility of 'false positives', where an item is returned because each query term occurs somewhere in its
indexing, although the terms are not connected in any single source string.
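
The false positives mentioned under the third dimension can be illustrated with a minimal Python sketch. The two items, their index strings, and the function names are hypothetical and chosen only for illustration; they are not drawn from any system discussed here.

```python
# Hypothetical items, each indexed with pre-coordinated strings
# (ordered tuples of terms that together name a single concept).
PRECOORDINATED = {
    "item_a": [("history", "mathematics")],
    "item_b": [("history", "science"), ("teaching", "mathematics")],
}


def precoordinated_search(index, query):
    """Return items indexed with the exact ordered string `query`."""
    return sorted(item for item, strings in index.items() if tuple(query) in strings)


def postcoordinated_search(index, query):
    """Return items whose individual terms cover every query term.

    The strings are flattened into a set of single terms, so the
    connection between the terms within one string is lost.
    """
    wanted = set(query)
    hits = []
    for item, strings in index.items():
        terms = {term for string in strings for term in string}
        if wanted <= terms:
            hits.append(item)
    return sorted(hits)


query = ["history", "mathematics"]
print(precoordinated_search(PRECOORDINATED, query))   # ['item_a']
print(postcoordinated_search(PRECOORDINATED, query))  # ['item_a', 'item_b']
```

Here item_b is a false positive: it concerns the history of science and the teaching of mathematics, yet post-coordination combines 'history' and 'mathematics' at retrieval time.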

Information Retrieval (IR) has tended towards automatically generated free text index terms (post-coordinated),
weighted by statistical frequency of terms in documents and collections. On the other hand, distinguishing features
of a semantic index are that semantic relationships exist between controlled index terms, usually (but not
necessarily) the result of manual cataloguing. Semantically indexed hypermedia links are, by definition, computed,
corresponding to Intensional-Retrieval links [DeRose 1989]. This allows the possibility of flexible query-based
navigation tools.

2 Thesauri and Classification Systems
The semantic index approach employs a set of semantic relationships between index terms, following the well
established thesaurus tradition in information science (ISO 2788, ISO 5964). A large number of thesauri exist,
covering a variety of subject domains, for example the Medical Subject Headings [MeSH 1999] and the Art and
Architecture Thesaurus [AAT 1999]. Classification systems, such as Dewey Decimal or Library of Congress, focus
on hierarchical relationships. These controlled vocabularies are part of standard cataloguing practice in libraries and
museums and are now being applied to digital hypertexts via thematic keywords in metadata resource descriptors.
For example, the Dublin Core [DC 1999] standard metadata set includes elements for Title, Creator, Date, Format,
etc. in addition to the more complex notion of the Subject (or theme) of a resource. Guidelines recommend that,
where possible, the Subject element be taken from a relevant controlled vocabulary. Links between concepts in the
subject domain can be expressed by the semantic relationships in a thesaurus. The three main thesaurus relationships
are Equivalence (equivalent terms), Hierarchical (broader/narrower terms), and Associative (more loosely Related
Terms). Sometimes specialisations of the three main relationships are included (for example distinguishing
taxonomic and instance hierarchical relationships). Following a minimalist approach to semantic modelling by
restricting the set of relationships permits interoperability of cataloguing/retrieval tools and techniques. It also
facilitates automated reasoning over this core set of relationships.
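
As a rough illustration of how small this core set of relationships is, the following Python sketch models a thesaurus with only Equivalence, Hierarchical, and Associative links. The class names, method names, and the furniture terms are assumptions made for this example; they are not taken from ISO 2788 or from any published thesaurus.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A preferred term with the three core thesaurus relationships."""
    label: str
    use_for: set = field(default_factory=set)   # Equivalence: non-preferred synonyms
    broader: set = field(default_factory=set)   # Hierarchical: broader terms
    narrower: set = field(default_factory=set)  # Hierarchical: narrower terms
    related: set = field(default_factory=set)   # Associative: related terms


class Thesaurus:
    def __init__(self):
        self.terms = {}
        self.entry_vocabulary = {}  # synonym -> preferred label

    def add_term(self, label, use_for=()):
        term = self.terms.setdefault(label, Term(label))
        for synonym in use_for:
            term.use_for.add(synonym)
            self.entry_vocabulary[synonym] = label
        return term

    def add_hierarchy(self, broader, narrower):
        # Hierarchical links are stored in both directions.
        self.add_term(broader).narrower.add(narrower)
        self.add_term(narrower).broader.add(broader)

    def add_related(self, a, b):
        # The associative relationship is treated as symmetric in this sketch.
        self.add_term(a).related.add(b)
        self.add_term(b).related.add(a)

    def preferred(self, word):
        """Map a user's word to the preferred term, or None if unknown."""
        if word in self.terms:
            return word
        return self.entry_vocabulary.get(word)


thesaurus = Thesaurus()
thesaurus.add_term("chairs", use_for=["seats"])
thesaurus.add_hierarchy("furniture", "chairs")
thesaurus.add_hierarchy("chairs", "armchairs")
thesaurus.add_related("chairs", "upholstery")

print(thesaurus.preferred("seats"))                # chairs
print(sorted(thesaurus.terms["chairs"].narrower))  # ['armchairs']
```

Because every tool only needs to understand these three link types, the same structure could in principle be loaded from different thesauri, which is the interoperability argument made above.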

3 Using semantic index links
Navigation is provided indirectly by queries to the semantic index space, as opposed to directly following explicit
links between information items. The queries can be simple or complex. The conventional hypermedia navigation
techniques may be implemented by relatively simple queries [Tudhope 1994], although there would be no particular
reason to use a semantic index to achieve that functionality. One additional possibility provided by a semantic index
space is an organised set of browsable concept descriptors, as a means of comprehending the associated layer of
media items [Bruza 1990], [Pollard 1993]. The user can browse the index space, 'beam down' to view media items of
interest, and conversely 'beam up' to the index space from media items. Additionally, when index terms are
combined, the user may browse around each term, broadening and narrowing the specificity of description and
seeing the effect on likely 'hits' [Pollitt 1997]. Alternatively, the combined terms can be considered as locating a
position in a 'hyperindex', permitting a string of terms to be broadened or narrowed in one navigation action [Bruza
1990]. If a user enters a set of query terms as opposed to browsing the index space, equivalence relationships permit
a broad entry vocabulary of synonyms to be tied together for retrieval purposes, without the user having to specify
the exact term employed for indexing. As a simple example, this document is indexed by a set of controlled
vocabulary terms from the ACM Computing Classification [ACM 1998] (see Categories and Subject Descriptors
above). In the ACM Digital Library pages, explicit hypertext links can be navigated. In addition, controlled
vocabulary index terms can be combined with free text terms when searching the library and the hypertext version
of the classification can be browsed as a subject index in order to select terms for searching.

ACM Computing Surveys, Vol. 31, Number 4es, December 1999

Beyond this, the inclusion of semantic information in the index space provides the opportunity for knowledge-based
hypermedia systems that provide intelligent navigation support and retrieval, with the system taking a more active
role in the navigation process rather than relying on the user's manual browsing alone. For example, rules governing permitted
combinations of terms can filter a user's possible navigation options [Arents 1993], [Rada 1993]. Work at the
University of Glamorgan explores the potential of reasoning over the semantic relationships in the index space.
Traversal of transitive relationships makes possible imprecise matching between query and media item, or between
two media items, rather than relying on an exact match of controlled vocabulary terms [Tudhope 1997]. Expanding
terms offers an augmented browsing capacity based on measures of distance in the semantic index space. Results
can be post-processed for expression in a particular retrieval tool. Various possibilities exist for indirect computed
links with such hybrid query/navigation tools [Cunliffe 1997]. For example, information items with semantically
close terms can be ranked in the result or destination set, or the system might automatically suggest terms to be
considered for inclusion in a query. If facets exist for time and place in the index space, then a result set can be
returned as a dynamic guided tour based on temporal or spatial relationships (or indeed other orderings).
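
One way to picture term expansion and ranking by semantic closeness is a shortest-path traversal of the index space. The Python sketch below is only an illustration under stated assumptions: the terms, the per-relationship traversal costs, and the distance threshold are invented here, and the measures actually used in [Tudhope 1997] may differ.

```python
import heapq

# Hypothetical index space: term -> list of (neighbour, relationship).
# "NT"/"BT" are narrower/broader (hierarchical), "RT" is related (associative).
LINKS = {
    "furniture": [("chairs", "NT"), ("tables", "NT")],
    "chairs": [("furniture", "BT"), ("armchairs", "NT"), ("upholstery", "RT")],
    "tables": [("furniture", "BT")],
    "armchairs": [("chairs", "BT")],
    "upholstery": [("chairs", "RT")],
}

# Assumed traversal costs per relationship type (illustrative values only).
COST = {"NT": 1.0, "BT": 1.5, "RT": 2.0}


def expand(term, max_distance):
    """Return {term: distance} for all terms within max_distance of `term`.

    Uses Dijkstra's algorithm over the relationship graph.
    """
    distances = {term: 0.0}
    queue = [(0.0, term)]
    while queue:
        dist, current = heapq.heappop(queue)
        if dist > distances.get(current, float("inf")):
            continue  # stale queue entry
        for neighbour, relation in LINKS.get(current, []):
            new_dist = dist + COST[relation]
            if new_dist <= max_distance and new_dist < distances.get(neighbour, float("inf")):
                distances[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return distances


def rank_items(query_term, item_terms, max_distance=2.0):
    """Rank items by the smallest distance from the query term to any of their index terms."""
    reachable = expand(query_term, max_distance)
    scored = []
    for item, terms in item_terms.items():
        best = min((reachable[t] for t in terms if t in reachable), default=None)
        if best is not None:
            scored.append((best, item))
    return [item for _, item in sorted(scored)]


ITEM_TERMS = {
    "photo1": ["armchairs"],
    "photo2": ["chairs"],
    "photo3": ["tables"],
    "photo4": ["upholstery"],
}
print(rank_items("chairs", ITEM_TERMS))  # ['photo2', 'photo1', 'photo4']
```

The exact match comes first, followed by items indexed with progressively more distant terms; photo3 is omitted because 'tables' lies beyond the assumed threshold.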
Alternatively, the focus of a user's navigation can remain in the document (media) space, typically requiring less
cognitive overhead than constructing a formal query [Marchionini 1995]. In this case, having found an information
item of interest, the navigation action consists of requesting "More items like this one", with the system responsible
for a (best-match) similarity measure of the item's index terms. At the cost of greater cognitive demand on the user,
the source context for the navigation may be modified and particular media items or terms (de)emphasised (cf.
relevance feedback techniques in IR).
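
A "More items like this one" action could be sketched as follows. This is a hypothetical best-match measure written for illustration, not the similarity function of any cited system; the precomputed distances, the closeness formula, and the item data are all assumptions.

```python
# Hypothetical precomputed semantic distances between index terms (symmetric).
DISTANCE = {
    frozenset(["chairs", "armchairs"]): 1.0,
    frozenset(["chairs", "upholstery"]): 2.0,
    frozenset(["oak", "wood"]): 1.0,
}


def closeness(a, b):
    """Map a semantic distance to a closeness in [0, 1]; unknown pairs score 0."""
    if a == b:
        return 1.0
    distance = DISTANCE.get(frozenset([a, b]))
    return 0.0 if distance is None else 1.0 / (1.0 + distance)


def similarity(source_terms, candidate_terms):
    """Average, over the source terms, of the best match among the candidate terms."""
    if not source_terms or not candidate_terms:
        return 0.0
    best = [max(closeness(s, c) for c in candidate_terms) for s in source_terms]
    return sum(best) / len(best)


def more_like_this(source, items, limit=3):
    """Return up to `limit` other items with non-zero similarity, most similar first."""
    scored = [
        (similarity(items[source], terms), name)
        for name, terms in items.items()
        if name != source
    ]
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


ITEMS = {
    "item1": ["chairs", "oak"],
    "item2": ["armchairs", "wood"],
    "item3": ["upholstery"],
    "item4": ["ceramics"],
}
print(more_like_this("item1", ITEMS))  # ['item2', 'item3']
```

Note that this particular measure is asymmetric, since it averages over the source item's terms only. Emphasising or de-emphasising terms, as described above, could be added by weighting each source term in the average.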

4 Key application to RDF and the WWW
Semantically based retrieval underpins diverse efforts to provide access to distributed multimedia resources, such as
the many projects involving SGML (XML) and Z39.50 for networked access to cross-platform information. Major
efforts are underway to create subject-based gateways to Internet resources, sometimes combining manually indexed
and robot harvested metadata. The W3C Recommendation for a 'machine-understandable' Resource Description
Framework supports the thrust of this research [Lassila 1999]. An RDF descriptor might include the Dublin Core
element, Subject, specifying a classification or thesaurus to which keywords belong. Precise semantic index retrieval
tools will be required to provide a manageable set of results to requests that may span several collections [Doerr
1997], and may involve networked terminology servers and more than one thesaurus or classification. One point
worth emphasising is the social dimension to access and the link with existing cataloguing practice. Controlled
vocabularies are often the result of standards efforts in subject domains, continue to evolve, and are part of a
network of practice and education/training in the information science community. They have the potential to act as a
bridge between information provider and seeker, "a semantic road map for searchers and indexers" [Soergel 1995],
if tools can be devised that visualise their structure and how they may be used.
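
To make the idea of a Subject element tied to a named vocabulary concrete, the sketch below represents a descriptor as plain subject-predicate-object triples in Python. It does not use an RDF library or RDF syntax. The resource URL and the `scheme` property are hypothetical and introduced only for this example; the Dublin Core element names and the classification codes H.5.4 and H.3.1 are those of this paper.

```python
DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/terms/"  # hypothetical namespace for this sketch

RESOURCE = "http://example.org/papers/semantic-index"  # hypothetical resource URL

# Each statement is a (subject, predicate, object) triple.
# "_:s1" and "_:s2" stand for intermediate nodes that pair a keyword with its scheme.
TRIPLES = [
    (RESOURCE, DC + "title", "Semantically Indexed Hypermedia: Linking Information Disciplines"),
    (RESOURCE, DC + "creator", "Douglas Tudhope"),
    (RESOURCE, DC + "creator", "Daniel Cunliffe"),
    (RESOURCE, DC + "subject", "_:s1"),
    ("_:s1", RDF + "value", "H.5.4"),
    ("_:s1", EX + "scheme", "ACM Computing Classification 1998"),
    (RESOURCE, DC + "subject", "_:s2"),
    ("_:s2", RDF + "value", "H.3.1"),
    ("_:s2", EX + "scheme", "ACM Computing Classification 1998"),
]


def objects(triples, subject, predicate):
    """Return all objects of statements with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects_in_scheme(triples, resource, scheme):
    """Return the Subject keywords of `resource` that belong to `scheme`."""
    keywords = []
    for node in objects(triples, resource, DC + "subject"):
        if scheme in objects(triples, node, EX + "scheme"):
            keywords.extend(objects(triples, node, RDF + "value"))
    return keywords


print(subjects_in_scheme(TRIPLES, RESOURCE, "ACM Computing Classification 1998"))
# ['H.5.4', 'H.3.1']
```

Recording the scheme alongside each keyword is what would let a retrieval tool decide which thesaurus or classification to consult when a request spans several collections.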

5 Research issues
A number of key issues for research remain if the potential of significant gains in precision of information access is
to be realised.
• An advantage of building query functionality into hypertext navigation is a smooth transition between
querying and browsing. Can we identify the appropriate extent of cognitive effort demanded by interfaces
to navigation tools? How far should the internal workings of matching functions or the detail of the
underlying semantic network be brought to the surface?

• Some applications may lend themselves to the specialisation of the standard thesaurus relationships into
richer sets, particularly the associative relationship. For example, in some situations it may be useful to
distinguish various kinds of causal relationships from the generic associative relationship.

• The problem of expressing similarity between pre-coordinated strings of semantic index terms needs
further investigation. How much should be pre-computed and what can be left to dynamic computation?
How best can we express syntax or structure in such strings? This effort converges with work on
description logic ontologies [Bullock 1998], [Weinstein 1998].

• Various efforts attempt to combine statistical IR and semantic controlled vocabulary approaches. For
example, Agosti et al [Agosti 1995] propose a three layer architecture for Hypermedia IR systems
combining a statistical index layer and a semantic (thesaurus) layer (see also [Aslandogan 1997],
[Chiaramella 1996]). Studies of online searching behaviour have investigated conditions influencing choice
of free text or controlled vocabulary terms (e.g. [Fidel 1991]). How should the two approaches be best
integrated - should they be seen as different components of a toolkit, or should a matching function
incorporate both statistical weighting and semantic measures? In addition, indirect semantic links and
explicit authored links will soon be combined in link/search engines. What principles should guide this
integration?

• The semantic interoperability of overlapping but different thesauri is an important issue for remote access
to distributed sets of resources employing controlled vocabularies in metadata. A concept may exist in one
vocabulary but not another, or may map (partially) to various concepts.
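
The last issue, where a concept may be missing from the other vocabulary or map only partially, can be made concrete with a small Python sketch. The mapping table, the degrees of match, and the threshold are all invented for illustration; how such mappings should actually be derived and weighted is the open question.

```python
# Hypothetical mapping table from vocabulary A to vocabulary B.
# Each source concept maps to zero or more target concepts with an
# assumed degree of match between 0 and 1 (1.0 = exact equivalent).
MAPPING = {
    "armchairs": [("easy chairs", 1.0)],
    "seating": [("chairs", 0.6), ("benches", 0.4)],
    "upholstery": [],  # no counterpart in vocabulary B
}


def translate(terms, mapping, threshold=0.5):
    """Translate query terms from vocabulary A into weighted terms of vocabulary B.

    Returns (translated, unmapped): a dict of target term -> best weight,
    and the list of source terms with no sufficiently close target.
    """
    translated = {}
    unmapped = []
    for term in terms:
        targets = [(t, w) for t, w in mapping.get(term, []) if w >= threshold]
        if not targets:
            unmapped.append(term)
        for target, weight in targets:
            translated[target] = max(weight, translated.get(target, 0.0))
    return translated, unmapped


print(translate(["armchairs", "seating", "upholstery"], MAPPING))
# ({'easy chairs': 1.0, 'chairs': 0.6}, ['upholstery'])
```

Reporting the unmapped terms separately, rather than dropping them silently, leaves room for a retrieval tool to fall back on free text matching or to ask the user for an alternative term.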

References
[AAT 1999] Art and Architecture Thesaurus Browser, [Online: http://shiva.pub.getty.edu/aat_browser/], 1999.
[ACM 1998] ACM Computing Classification. http://www.acm.org/class/1998/
[Agosti 1995] Maristella Agosti, Massimo Melucci, and Fabio Crestani. "Automatic Authoring and Construction of
Hypermedia for Information Retrieval" in ACM Multimedia Systems, 3(1), 15-24, 1995.
[Arents 1993] Hans C. Arents and Walter F. L. Bogaerts. "Navigation without Links and Nodes without Contents:
Intensional Navigation in a Third-Order Hypermedia System" in Hypermedia, 5(3), 187-204, 1993.
[Aslandogan 1997] Y. Alp Aslandogan, Chuck Thier, Clement T. Yu, Jon Zou, and Naphtali Rishe. "Using
Semantic Contents and WordNet in Image Retrieval" in Proceedings of ACM SIGIR '97, 286-295, 1997.
[Berners-Lee 1998a] Tim Berners-Lee. World Wide Web Design Issues: A Roadmap to the Semantic Web,
[Online: http://www.w3.org/DesignIssues/Semantic.html], 1998.
[Bruza 1990] Peter Bruza. "Hyperindices: A Novel Aid for Searching in Hypermedia" in Proceedings of the ACM
European Conference on Hypertext '90 (ECHT '90), Versailles, France, 109-122, November 1990.
[Bullock 1998] Joseph Bullock and Carole Goble. "TourisT: The Application of a Description Logic based
Semantic Hypermedia System for Tourism" in Proceedings of ACM Hypertext '98, Pittsburgh PA, 132-141, June
1998.
[Chiaramella 1996] Yves Chiaramella and Ammar Kheirbek. "An Integrated Model for Hypermedia and
Information Retrieval" in Information Retrieval and Hypertext, Maristella Agosti and Alan Smeaton (editors),
Kluwer, 139-178, 1996.
[Collier 1987] George Collier. "Thoth-II: Hypertext with Explicit Semantics" in Proceedings of ACM Hypertext
'87, Chapel Hill, NC, 269-289, November 1987.
[Cunliffe 1997] Daniel Cunliffe, Carl Taylor, and Douglas Tudhope. "Query-based Navigation in Semantically
Indexed Hypermedia" in Proceedings of ACM Hypertext '97, Southampton, UK, 87-95, April 1997.
[DC 1999] Dublin Core. [Online: http://purl.org/metadata/dublin_core], 1999.
[DeRose 1989] Steven J. DeRose. "Expanding the Notion of Links" in Proceedings of ACM Hypertext '89,
Pittsburgh, PA, 249-257, November 1989.
[Doerr 1997] Martin Doerr, Irene Fundulaki and Vassilis Christophidis. "The Specialist Seeks Expert Views:
Managing Digital Folders in the AQUARELLE Project" in Proceedings of Museums and the Web, David Bearman
and Jennifer Trant (editors), 261-270, 1997.
[Fidel 1991] Raya Fidel. "Searchers' Selection of Search Keys (I-III)" in Journal of American Society for
Information Science, 42(7), 490-527, 1991.
[Frisse 1989] Mark E. Frisse and Steven B. Cousins. "Information retrieval from hypertext: Update on the Dynamic
Medical Handbook" in Proceedings of ACM Hypertext '89, Pittsburgh, PA, 199-211, November 1989.
[Lassila 1999] Ora Lassila and Ralph Swick (editors), "Resource Description Framework (RDF) Model and Syntax
Specification" World Wide Web Consortium Recommendation, [Online: http://www.w3.org/TR/REC-rdf-syntax/],
February 22 1999.
[Marchionini 1995] Gary Marchionini. Information Seeking in Electronic Environments. Cambridge University
Press, 1995.
[MeSH 1999] MeSH 1999. Medical Subject Headings homepage. http://www.nlm.nih.gov/mesh/meshhome.html
[Nanard 1991] Jocelyne Nanard and Marc Nanard. "Using structured types to incorporate knowledge in hypertext"
in Proceedings of ACM Hypertext '91, San Antonio, TX, 329-344, December 1991.
[Pollard 1993] Richard Pollard. "A hypertext-based thesaurus as a subject browsing aid for bibliographic databases"
in Information Processing and Management, 29(3), 345-357, 1993.
[Pollitt 1997] Steven Pollitt, Martin P Smith and Patrick A J Braekevelt. "View-based Searching Systems" in
Proceedings of Joint Workshop of BCS IR and HCI Specialist Groups, (Johnson and Dunlop eds.), 73-77, 1997.
[Rada 1993] Roy Rada, Weigang Wang, Alex Birchall. "Retrieval hierarchies in hypertext" in Information
Processing and Management 29(3), 359-371, 1993.
[Schnase 1993] John L. Schnase, John J. Leggett, David L. Hicks, and Ron L. Szabo. "Semantic Data Modeling of
Hypermedia Associations" in ACM Transactions on Information Systems (TOIS), 11(1), 27-49, January 1993.
[Soergel 1995] Dagobert Soergel. "The Art and Architecture Thesaurus (AAT): a critical appraisal" in Visual
Resources, 10(4), 369-400, 1995.
[Trigg 1986] Randall H. Trigg and Mark Weiser. "Textnet: A Network-based Approach to Text Handling" in ACM
Transactions on Office Information Systems (TOIS), 4(1), 1-23, January 1986.
[Tudhope 1994] Douglas Tudhope, Paul Beynon-Davies, Carl Taylor, and Chris B. Jones. "Virtual Architecture
Based on a Binary Relational Model: A Museum Hypermedia Application" in Hypermedia, 6(3), 174-192, 1994.
[Tudhope 1997] Douglas Tudhope and Carl Taylor. "Navigation via Similarity: Automatic Linking Based on
Semantic Closeness" in Information Processing and Management, 33(2), 233-242, 1997.
[van Rijsbergen 1979] C. J. "Keith" van Rijsbergen. Information Retrieval. Butterworth, 1979.
[Weinstein 1998] Peter C. Weinstein. "Ontology-based metadata: transforming the MARC legacy" in Proceedings
of ACM Digital Libraries '98, 254-263, 1998.

ACM Computing Surveys, Vol. 31, Number 4es, December 1999

6

Mais conteúdo relacionado

Mais procurados

A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...University of Bari (Italy)
 
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...WGBH Media Library and Archives
 
Text mining and analytics v6 - p1
Text mining and analytics   v6 - p1Text mining and analytics   v6 - p1
Text mining and analytics v6 - p1Dave King
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisDataminingTools Inc
 
Text analytics in social media
Text analytics in social mediaText analytics in social media
Text analytics in social mediaJeremiah Fadugba
 
Beyond Seamless Access: Meta-data In The Age of Content Integration
Beyond Seamless Access: Meta-data In The Age of Content IntegrationBeyond Seamless Access: Meta-data In The Age of Content Integration
Beyond Seamless Access: Meta-data In The Age of Content IntegrationNew York University
 
A Survey on Text Mining-techniques and application
A Survey on Text Mining-techniques and applicationA Survey on Text Mining-techniques and application
A Survey on Text Mining-techniques and applicationRyota Eisaki
 
Towards Webpage Steganography with Attribute Truth Table
Towards Webpage Steganography with Attribute Truth TableTowards Webpage Steganography with Attribute Truth Table
Towards Webpage Steganography with Attribute Truth TableSreekanth Reddy
 
Enhancing access privacy of range retrievals over b+trees
Enhancing access privacy of range retrievals over b+treesEnhancing access privacy of range retrievals over b+trees
Enhancing access privacy of range retrievals over b+treesMigrant Systems
 
Dynamic & Attribute Weighted KNN for Document Classification Using Bootstrap ...
Dynamic & Attribute Weighted KNN for Document Classification Using Bootstrap ...Dynamic & Attribute Weighted KNN for Document Classification Using Bootstrap ...
Dynamic & Attribute Weighted KNN for Document Classification Using Bootstrap ...IJERA Editor
 
A NEAR-DUPLICATE DETECTION ALGORITHM TO FACILITATE DOCUMENT CLUSTERING
A NEAR-DUPLICATE DETECTION ALGORITHM TO FACILITATE DOCUMENT CLUSTERINGA NEAR-DUPLICATE DETECTION ALGORITHM TO FACILITATE DOCUMENT CLUSTERING
A NEAR-DUPLICATE DETECTION ALGORITHM TO FACILITATE DOCUMENT CLUSTERINGIJDKP
 
An Efficient Annotation of Search Results Based on Feature Ranking Approach f...
An Efficient Annotation of Search Results Based on Feature Ranking Approach f...An Efficient Annotation of Search Results Based on Feature Ranking Approach f...
An Efficient Annotation of Search Results Based on Feature Ranking Approach f...Computer Science Journals
 
Compressed full text indexes
Compressed full text indexesCompressed full text indexes
Compressed full text indexesunyil96
 
IRJET- Data Retrieval using Master Resource Description Framework
IRJET- Data Retrieval using Master Resource Description FrameworkIRJET- Data Retrieval using Master Resource Description Framework
IRJET- Data Retrieval using Master Resource Description FrameworkIRJET Journal
 
A Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User TransactionsA Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User TransactionsTELKOMNIKA JOURNAL
 

Mais procurados (19)

A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
 
ANALYSIS OF RESEARCH ISSUES IN WEB DATA MINING
ANALYSIS OF RESEARCH ISSUES IN WEB DATA MINING ANALYSIS OF RESEARCH ISSUES IN WEB DATA MINING
ANALYSIS OF RESEARCH ISSUES IN WEB DATA MINING
 
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
 
Text mining and analytics v6 - p1
Text mining and analytics   v6 - p1Text mining and analytics   v6 - p1
Text mining and analytics v6 - p1
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
 
Text analytics in social media
Text analytics in social mediaText analytics in social media
Text analytics in social media
 
Beyond Seamless Access: Meta-data In The Age of Content Integration
Beyond Seamless Access: Meta-data In The Age of Content IntegrationBeyond Seamless Access: Meta-data In The Age of Content Integration
Beyond Seamless Access: Meta-data In The Age of Content Integration
 
A Survey on Text Mining-techniques and application
A Survey on Text Mining-techniques and applicationA Survey on Text Mining-techniques and application
A Survey on Text Mining-techniques and application
 
Improving Tag Clouds
Improving Tag CloudsImproving Tag Clouds
Improving Tag Clouds
 
Towards Webpage Steganography with Attribute Truth Table
Towards Webpage Steganography with Attribute Truth TableTowards Webpage Steganography with Attribute Truth Table
Towards Webpage Steganography with Attribute Truth Table
 
Enhancing access privacy of range retrievals over b+trees
Enhancing access privacy of range retrievals over b+treesEnhancing access privacy of range retrievals over b+trees
Enhancing access privacy of range retrievals over b+trees
 
Dynamic & Attribute Weighted KNN for Document Classification Using Bootstrap ...
Dynamic & Attribute Weighted KNN for Document Classification Using Bootstrap ...Dynamic & Attribute Weighted KNN for Document Classification Using Bootstrap ...
Dynamic & Attribute Weighted KNN for Document Classification Using Bootstrap ...
 
A NEAR-DUPLICATE DETECTION ALGORITHM TO FACILITATE DOCUMENT CLUSTERING
A NEAR-DUPLICATE DETECTION ALGORITHM TO FACILITATE DOCUMENT CLUSTERINGA NEAR-DUPLICATE DETECTION ALGORITHM TO FACILITATE DOCUMENT CLUSTERING
A NEAR-DUPLICATE DETECTION ALGORITHM TO FACILITATE DOCUMENT CLUSTERING
 
An Efficient Annotation of Search Results Based on Feature Ranking Approach f...
An Efficient Annotation of Search Results Based on Feature Ranking Approach f...An Efficient Annotation of Search Results Based on Feature Ranking Approach f...
An Efficient Annotation of Search Results Based on Feature Ranking Approach f...
 
Sub1522
Sub1522Sub1522
Sub1522
 
Compressed full text indexes
Compressed full text indexesCompressed full text indexes
Compressed full text indexes
 
IRJET- Data Retrieval using Master Resource Description Framework
IRJET- Data Retrieval using Master Resource Description FrameworkIRJET- Data Retrieval using Master Resource Description Framework
IRJET- Data Retrieval using Master Resource Description Framework
 
A Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User TransactionsA Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User Transactions
 
Rules
RulesRules
Rules
 

Destaque (9)

CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...
CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...
CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...
 
The role of Thesauri and Standard Vocabularies in linking data
The role of Thesauri and Standard Vocabularies in linking data The role of Thesauri and Standard Vocabularies in linking data
The role of Thesauri and Standard Vocabularies in linking data
 
Hosameldin mostafa cv2010 (2)
Hosameldin mostafa cv2010 (2)Hosameldin mostafa cv2010 (2)
Hosameldin mostafa cv2010 (2)
 
Cambridge ppt
Cambridge pptCambridge ppt
Cambridge ppt
 
Implementation of semantic network dictionary system
Implementation of semantic network dictionary system Implementation of semantic network dictionary system
Implementation of semantic network dictionary system
 
PoolParty Search Server
PoolParty Search ServerPoolParty Search Server
PoolParty Search Server
 
Semantic Search Trend
Semantic Search TrendSemantic Search Trend
Semantic Search Trend
 
Semantic SharePoint
Semantic SharePointSemantic SharePoint
Semantic SharePoint
 
Instrumentação: Cordas (Disc. Arranjos e Transcrições)
Instrumentação: Cordas (Disc. Arranjos e Transcrições)Instrumentação: Cordas (Disc. Arranjos e Transcrições)
Instrumentação: Cordas (Disc. Arranjos e Transcrições)
 

Semelhante a Semantically indexed hypermedia linking information disciplines

Dynamic hypertext querying and linking
Dynamic hypertext querying and linkingDynamic hypertext querying and linking





Semantically Indexed Hypermedia: Linking Information Disciplines

Douglas Tudhope and Daniel Cunliffe
University of Glamorgan Web: http://www.glam.ac.uk/
Hypermedia Research Unit Web: http://www.comp.glam.ac.uk/pages/research/hypermedia/
School of Computing Web: http://www.comp.glam.ac.uk/
Pontypridd, CF37 1DL, Wales, UK
Email: dstudhope@glam.ac.uk, djcunlif@glam.ac.uk
Web: www.comp.glam.ac.uk/people/staff/dstudhope, www.comp.glam.ac.uk/people/staff/djcunlif

Categories and Subject Descriptors: H.5.4 [Information Interfaces and Presentation]: Hypertext/Hypermedia; H.3.1 [Information Storage and Retrieval]: Thesauruses; H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval; H.3.5 [Information Storage and Retrieval]: web-based services; H.3.7 [Information Storage and Retrieval]: Digital Libraries
Additional Key Words and Phrases: Semantic index, Semantic distance measures, Metadata, Dublin Core

Semantic linking has always been a strand of hypermedia research and is becoming central to current attempts to facilitate access to information in large hypertexts and the emerging 'semantic web' [Berners-Lee 1998a]. Due to the scaling problems with explicitly authored links between information items, it is likely that future large scale hypertexts will employ a mixture of authored links and indirect, computed links via some form of indexing system. Problems of information access are heightened by the lack of precision of current WWW retrieval technology and users unfamiliar with indexing conventions. There is a critical need for tools that will assist users to formulate and refine queries, and navigate through information spaces. Recent years have seen the growth of metadata, Digital Libraries, and interest in the application of traditional information science and library cataloguing techniques to the new environment of hypertext and the WWW. Semantic indexing provides a bridge between the various information disciplines.
With the growing influence of the Resource Description Framework [Lassila 1999], semantic tagging and cataloguing of information is likely to become a key component of the information architecture of intranet hypertexts and the WWW.

__________________________
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org
© Copyright 2000 ACM 0360-0300/99/12es

1 Semantic indexing

Representing semantic knowledge about a domain or application area in order to facilitate access to information has been a major focus in hypermedia since the early days (e.g. [Collier 1987], [Trigg 1986]). One approach has been to assign semantic labels or more formal typing to authored hypertext links [Nanard 1991], [Schnase 1993]. Another approach, the one followed here, includes a semantic index layer in its model of hypermedia architecture. In addition to explicitly authored links, each information item is indexed with descriptor terms; frequently more than one term will be required. Frisse and Cousins [Frisse 1989] first introduced the notion of separate index and document spaces to hypertext, observing that different conformations of those spaces allow for different possibilities in automated reasoning.

Different types of indexing system are possible. It is useful to categorise indexing systems according to three dimensions [van Rijsbergen 1979]:
1. whether index terms are automatically derived or manually assigned.
2. whether index terms belong to a controlled vocabulary or are uncontrolled ('free').
3. whether terms can be combined as ordered strings representing a single concept when indexing (precoordinated terms), e.g. "Association for Computing Machinery", or must be post-coordinated on retrieval. Post-coordination allows the possibility of 'false positives', where items are returned that have no connection between the different terms in the source string.

Information Retrieval (IR) has tended towards automatically generated free text index terms (post-coordinated), weighted by statistical frequency of terms in documents and collections. On the other hand, the distinguishing feature of a semantic index is that semantic relationships exist between controlled index terms, usually (but not necessarily) the result of manual cataloguing. Semantically indexed hypermedia links are, by definition, computed, corresponding to Intensional-Retrieval links [DeRose 1989]. This allows the possibility of flexible query-based navigation tools.

2 Thesauri and Classification Systems

The semantic index approach employs a set of semantic relationships between index terms, following the well established thesaurus tradition in information science (ISO 2788, ISO 5964). A large number of thesauri exist, covering a variety of subject domains, for example the Medical Subject Headings [MeSH 1999] and the Art and Architecture Thesaurus [AAT 1999]. Classification systems, such as Dewey Decimal or Library of Congress, focus on hierarchical relationships.
These controlled vocabularies are part of standard cataloguing practice in libraries and museums and are now being applied to digital hypertexts via thematic keywords in metadata resource descriptors. For example, the Dublin Core [DC 1999] standard metadata set includes elements for Title, Creator, Date, Format, etc. in addition to the more complex notion of the Subject (or theme) of a resource. Guidelines recommend that, where possible, the Subject element be taken from a relevant controlled vocabulary.

Links between concepts in the subject domain can be expressed by the semantic relationships in a thesaurus. The three main thesaurus relationships are Equivalence (equivalent terms), Hierarchical (broader/narrower terms), and Associative (more loosely Related Terms). Sometimes specialisations of the three main relationships are included (for example distinguishing taxonomic and instance hierarchical relationships). A minimalist approach to semantic modelling, restricting the set of relationships to this core, permits interoperability of cataloguing/retrieval tools and techniques. It also facilitates automated reasoning over this core set of relationships.

3 Using semantic index links

Navigation is provided indirectly by queries to the semantic index space, as opposed to directly following explicit links between information items. The queries can be simple or complex. The conventional hypermedia navigation techniques may be implemented by relatively simple queries [Tudhope 1994], although there would be no particular reason to use a semantic index to achieve that functionality. One additional possibility provided by a semantic index space is an organised set of browsable concept descriptors, as a means of comprehending the associated layer of media items [Bruza 1990], [Pollard 1993]. The user can browse the index space, 'beam down' to view media items of interest, and conversely 'beam up' to the index space from media items.
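As a rough illustration of the three main thesaurus relationships described above, the following sketch stores a tiny invented vocabulary and shows how a browsing tool might map an entry term to its preferred term, broaden or narrow it, and 'beam down' to indexed media items. All terms, item identifiers, class names, and function names are hypothetical; they are not drawn from any real thesaurus or from the systems cited.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term in a toy thesaurus (all data here is invented)."""
    name: str
    use_for: set = field(default_factory=set)   # Equivalence: non-preferred synonyms
    broader: set = field(default_factory=set)   # Hierarchical: broader terms
    narrower: set = field(default_factory=set)  # Hierarchical: narrower terms
    related: set = field(default_factory=set)   # Associative: related terms


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def add(self, name, use_for=()):
        self.terms[name] = Term(name, use_for=set(use_for))

    def link_hierarchy(self, broader, narrower):
        # Hierarchical links are stored in both directions.
        self.terms[broader].narrower.add(narrower)
        self.terms[narrower].broader.add(broader)

    def link_related(self, a, b):
        # Associative links are treated as symmetric in this sketch.
        self.terms[a].related.add(b)
        self.terms[b].related.add(a)

    def preferred(self, entry):
        """Map an entry term (possibly a synonym) to its preferred term."""
        if entry in self.terms:
            return entry
        for term in self.terms.values():
            if entry in term.use_for:
                return term.name
        return None


thesaurus = Thesaurus()
thesaurus.add("furniture")
thesaurus.add("seating", use_for=["seats"])
thesaurus.add("chairs")
thesaurus.add("upholstery")
thesaurus.link_hierarchy("furniture", "seating")
thesaurus.link_hierarchy("seating", "chairs")
thesaurus.link_related("chairs", "upholstery")

# Hypothetical document space: media item -> assigned index terms.
index = {"photo-17": {"chairs"}, "catalogue-3": {"seating", "upholstery"}}


def beam_down(term):
    """Return the media items indexed with the given preferred term."""
    return sorted(item for item, terms in index.items() if term in terms)


start = thesaurus.preferred("seats")          # entry vocabulary -> "seating"
print(start, thesaurus.terms[start].broader)  # broaden the description
print(thesaurus.terms[start].narrower)        # narrow the description
print(beam_down("chairs"))                    # view items for a narrower term
```

Keeping the relationship set this small is what makes the structure easy to share between tools; a real thesaurus would add scope notes, facets, and possibly specialised hierarchical relationships.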
Additionally, when index terms are combined, the user may browse around each term, broadening and narrowing the specificity of description and seeing the effect on likely 'hits' [Pollitt 1997]. Alternatively, the combined terms can be considered as locating a position in a 'hyperindex', permitting a string of terms to be broadened or narrowed in one navigation action [Bruza 1990].

If a user enters a set of query terms as opposed to browsing the index space, equivalence relationships permit a broad entry vocabulary of synonyms to be tied together for retrieval purposes, without the user having to specify the exact term employed for indexing. As a simple example, this document is indexed by a set of controlled vocabulary terms from the ACM Computing Classification [ACM 1998] (see Categories and Subject Descriptors above). In the ACM Digital Library pages, explicit hypertext links can be navigated. In addition, controlled vocabulary index terms can be combined with free text terms when searching the library and the hypertext version of the classification can be browsed as a subject index in order to select terms for searching.

ACM Computing Surveys, Vol. 31, Number 4es, December 1999

Beyond this, the inclusion of semantic information in the index space provides the opportunity for knowledge-based hypermedia systems that provide intelligent navigation support and retrieval, with the system taking a more active role in the navigation process than relying on manual browsing alone. For example, rules governing permitted combinations of terms can filter a user's possible navigation options [Arents 1993], [Rada 1993].

Work at the University of Glamorgan explores the potential of reasoning over the semantic relationships in the index space. Traversal of transitive relationships makes possible imprecise matching between query and media item, or between two media items, rather than relying on an exact match of controlled vocabulary terms [Tudhope 1997]. Expanding terms offers an augmented browsing capacity based on measures of distance in the semantic index space. Results can be post-processed for expression in a particular retrieval tool. Various possibilities exist for indirect computed links with such hybrid query/navigation tools [Cunliffe 1997]. For example, information items with semantically close terms can be ranked in the result or destination set, or the system might automatically suggest terms to be considered for inclusion in a query.
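One way to picture term expansion by distance in the semantic index space is a shortest-path traversal over thesaurus relationships, with a cost attached to each relationship type. The sketch below is only an assumption about how such a measure could work: the cost values, the toy vocabulary, the item identifiers, and the function names are all invented, and it should not be read as the measure used in [Tudhope 1997].

```python
import heapq

# Hypothetical traversal costs per relationship type (assumed values).
COST = {"narrower": 1.0, "broader": 1.5, "related": 2.0}

# Toy semantic index: term -> list of (relationship, neighbouring term).
LINKS = {
    "furniture": [("narrower", "seating")],
    "seating": [("broader", "furniture"), ("narrower", "chairs"),
                ("narrower", "benches")],
    "chairs": [("broader", "seating"), ("related", "upholstery")],
    "benches": [("broader", "seating")],
    "upholstery": [("related", "chairs")],
}

# Toy document space: media item -> assigned index terms.
ITEM_TERMS = {
    "photo-17": {"chairs"},
    "catalogue-3": {"upholstery"},
    "plan-8": {"benches"},
    "essay-2": {"furniture"},
}


def expand(term, max_distance):
    """Return {term: distance} for every term within max_distance.

    Uses Dijkstra's algorithm, so each distance is the cheapest path
    from the starting term through the relationship graph.
    """
    best = {term: 0.0}
    queue = [(0.0, term)]
    while queue:
        dist, current = heapq.heappop(queue)
        if dist > best.get(current, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for relation, neighbour in LINKS.get(current, []):
            new_dist = dist + COST[relation]
            if new_dist <= max_distance and new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return best


def rank_items(query_term, max_distance=3.0):
    """Rank items by the distance of their closest index term to the query."""
    distances = expand(query_term, max_distance)
    scored = []
    for item, terms in ITEM_TERMS.items():
        reachable = [distances[t] for t in terms if t in distances]
        if reachable:
            scored.append((min(reachable), item))
    return [item for _, item in sorted(scored)]


print(rank_items("chairs"))                    # exact match first, then near terms
print(rank_items("chairs", max_distance=2.0))  # tighter threshold, fewer items
```

The expanded term set could equally be offered to the user as suggested query terms rather than applied automatically; the threshold controls how far the match is allowed to drift from the original term.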
If facets exist for time and place in the index space, then a result set can be returned as a dynamic guided tour based on temporal or spatial relationships (or indeed other orderings). Alternatively, the focus of a user's navigation can remain in the document (media) space, typically requiring less cognitive overhead than constructing a formal query [Marchionini 1995]. In this case, having found an information item of interest, the navigation action consists of requesting "More items like this one", with the system responsible for a (best-match) similarity measure of the item's index terms. At the cost of greater cognitive demand on the user, the source context for the navigation may be modified and particular media items or terms (de)emphasised (cf. relevance feedback techniques in IR).

4 Key application to RDF and the WWW

Semantically based retrieval underpins diverse efforts to provide access to distributed multimedia resources, such as the many projects involving SGML (XML) and Z39.50 for networked access to cross-platform information. Major efforts are underway to create subject-based gateways to Internet resources, sometimes combining manually indexed and robot harvested metadata. The W3C Recommendation for a 'machine-understandable' Resource Description Framework supports the thrust of this research [Lassila 1999]. An RDF descriptor might include the Dublin Core element, Subject, specifying a classification or thesaurus to which keywords belong. Precise semantic index retrieval tools will be required to provide a manageable set of results to requests that may span several collections [Doerr 1997], and may involve networked terminology servers and more than one thesaurus or classification.

One point worth emphasising is the social dimension to access and the link with existing cataloguing practice.
Controlled vocabularies are often the result of standards efforts in subject domains, continue to evolve, and are part of a network of practice and education/training in the information science community. They have the potential to act as a bridge between information provider and seeker, "a semantic road map for searchers and indexers" [Soergel 1995], if tools can be devised that visualise their structure and how they may be used.

5 Research issues

A number of key issues for research remain if the potential of significant gains in precision of information access is to be realised.

• An advantage of building query functionality into hypertext navigation is a smooth transition between querying and browsing. Can we identify the appropriate extent of cognitive effort demanded by interfaces to navigation tools? How far should the internal workings of matching functions or the detail of the underlying semantic network be brought to the surface?
• Some applications may lend themselves to the specialisation of the standard thesaurus relationships into richer sets, particularly the associative relationship. For example, in some situations it may be useful to distinguish various kinds of causal relationships from the generic associative relationship.
• The problem of expressing similarity between pre-coordinated strings of semantic index terms needs further investigation. How much should be pre-computed and what can be left to dynamic computation? How best can we express syntax or structure in such strings? This effort converges with work on description logic ontologies [Bullock 1998], [Weinstein 1998].
• Various efforts attempt to combine statistical IR and semantic controlled vocabulary approaches. For example, Agosti et al. [Agosti 1995] propose a three-layer architecture for Hypermedia IR systems combining a statistical index layer and a semantic (thesaurus) layer (see also [Aslandogan 1997], [Chiaramella 1996]). Studies of online searching behaviour have investigated conditions influencing choice of free text or controlled vocabulary terms (e.g. [Fidel 1991]). How should the two approaches best be integrated: should they be seen as different components of a toolkit, or should a matching function incorporate both statistical weighting and semantic measures? In addition, indirect semantic links and explicit authored links will soon be combined in link/search engines. What principles should guide this integration?
• The semantic interoperability of overlapping but different thesauri is an important issue for remote access to distributed sets of resources employing controlled vocabularies in metadata. A concept may exist in one vocabulary but not another, or may map (partially) to various concepts.

References

[AAT 1999] Art and Architecture Thesaurus Browser, [Online: http://shiva.pub.getty.edu/aat_browser/], 1999.
[ACM 1998] ACM Computing Classification, [Online: http://www.acm.org/class/1998/], 1998.
[Agosti 1995] Maristella Agosti, Massimo Melucci, and Fabio Crestani. "Automatic Authoring and Construction of Hypermedia for Information Retrieval" in ACM Multimedia Systems, 3(1), 15-24, 1995.
[Arents 1993] Hans C. Arents and Walter F. L. Bogaerts. "Navigation without Links and Nodes without Contents: Intensional Navigation in a Third-Order Hypermedia System" in Hypermedia, 5(3), 187-204, 1993.
[Aslandogan 1997] Y. Alp Aslandogan, Chuck Thier, Clement T. Yu, Jon Zou, and Naphtali Rishe. "Using Semantic Contents and WordNet in Image Retrieval" in Proceedings of ACM SIGIR '97, 286-295, 1997.
[Berners-Lee 1998a] Tim Berners-Lee. World Wide Web Design Issues: A Roadmap to the Semantic Web, [Online: http://www.w3.org/DesignIssues/Semantic.html], 1998.
[Bruza 1990] Peter Bruza. "Hyperindices: A Novel Aid for Searching in Hypermedia" in Proceedings of the ACM European Conference on Hypertext '90 (ECHT '90), Versailles, France, 109-122, November 1990.
[Bullock 1998] Joseph Bullock and Carole Goble. "TourisT: The Application of a Description Logic based Semantic Hypermedia System for Tourism" in Proceedings of ACM Hypertext '98, Pittsburgh PA, 132-141, June 1998.
[Chiaramella 1996] Yves Chiaramella and Ammar Kheirbek. "An Integrated Model for Hypermedia and Information Retrieval" in Information Retrieval and Hypertext, Maristella Agosti and Alan Smeaton (editors), Kluwer, 139-178, 1996.
[Collier 1987] George Collier. "Thoth-II: Hypertext with Explicit Semantics" in Proceedings of ACM Hypertext '87, Chapel Hill, NC, 269-289, November 1987.
[Cunliffe 1997] Daniel Cunliffe, Carl Taylor, and Douglas Tudhope. "Query-based Navigation in Semantically Indexed Hypermedia" in Proceedings of ACM Hypertext '97, Southampton, UK, 87-95, April 1997.
[DC 1999] Dublin Core, [Online: http://purl.org/metadata/dublin_core], 1999.
[DeRose 1989] Steven J. DeRose. "Expanding the Notion of Links" in Proceedings of ACM Hypertext '89, Pittsburgh, PA, 249-257, November 1989.
[Doerr 1997] Martin Doerr, Irene Fundulaki and Vassilis Christophidis. "The Specialist Seeks Expert Views: Managing Digital Folders in the AQUARELLE Project" in Proceedings of Museums and the Web, David Bearman and Jennifer Trant (editors), 261-270, 1997.
[Fidel 1991] Raya Fidel. "Searchers' Selection of Search Keys (I-III)" in Journal of the American Society for Information Science, 42(7), 490-527, 1991.
[Frisse 1989] Mark E. Frisse and Steven B. Cousins. "Information retrieval from hypertext: Update on the Dynamic Medical Handbook" in Proceedings of ACM Hypertext '89, Pittsburgh, PA, 199-211, November 1989.
[Lassila 1999] Ora Lassila and Ralph Swick (editors). "Resource Description Framework (RDF) Model and Syntax Specification", World Wide Web Consortium Recommendation, [Online: http://www.w3.org/TR/REC-rdf-syntax/], February 22 1999.
[Marchionini 1995] Gary Marchionini. Information Seeking in Electronic Environments. Cambridge University Press, 1995.
[MeSH 1999] Medical Subject Headings homepage, [Online: http://www.nlm.nih.gov/mesh/meshhome.html], 1999.
[Nanard 1991] Jocelyne Nanard and Marc Nanard. "Using structured types to incorporate knowledge in hypertext" in Proceedings of ACM Hypertext '91, San Antonio, TX, 329-344, December 1991.
[Pollard 1993] Richard Pollard. "A hypertext-based thesaurus as a subject browsing aid for bibliographic databases" in Information Processing and Management, 29(3), 345-357, 1993.
[Pollitt 1997] Steven Pollitt, Martin P Smith and Patrick A J Braekevelt. "View-based Searching Systems" in Proceedings of Joint Workshop of BCS IR and HCI Specialist Groups, (Johnson and Dunlop eds.) 73-77.
[Rada 1993] Roy Rada, Weigang Wang, Alex Birchall. "Retrieval hierarchies in hypertext" in Information Processing and Management, 29(3), 359-371, 1993.
[Schnase 1993] John L. Schnase, John J. Leggett, David L. Hicks, and Ron L. Szabo. "Semantic Data Modeling of Hypermedia Associations" in ACM Transactions on Information Systems (TOIS), 11(1), 27-49, January 1993.
[Soergel 1995] Dagobert Soergel. "The Art and Architecture Thesaurus (AAT): a critical appraisal" in Visual Resources, 10(4), 369-400, 1995.
[Trigg 1986] Randall H. Trigg and Mark Weiser. "Textnet: A Network-based Approach to Text Handling" in ACM Transactions on Office Information Systems (TOIS), 4(1), 1-23, January 1986.
[Tudhope 1994] Douglas Tudhope, Paul Beynon-Davies, Carl Taylor, and Chris B. Jones. "Virtual Architecture Based on a Binary Relational Model: A Museum Hypermedia Application" in Hypermedia, 6(3), 174-192, 1994.
[Tudhope 1997] Douglas Tudhope and Carl Taylor. "Navigation via Similarity: Automatic Linking Based on Semantic Closeness" in Information Processing and Management, 33(2), 233-242, 1997.
[van Rijsbergen 1979] C. J. "Keith" van Rijsbergen. Information Retrieval. Butterworth, 1979.
[Weinstein 1998] Peter C. Weinstein. "Ontology-based metadata: transforming the MARC legacy" in Proceedings of ACM Digital Libraries '98, 254-263, 1998.

ACM Computing Surveys, Vol. 31, Number 4es, December 1999