Analysis on semantic web layer cake entities
1. Keyword and URI: An Analysis in Semantic Content Retrieval
By
D.Teja Santosh, Assistant Professor
Computer Science and Engineering
GITAM University,
Rudraram, Hyderabad
2. Aim: Replace keyword-based search with meaning-based search, so
that a “COMPUTER” can process the content (machine-processable) and find
the information the user actually expects.
Technologies Used:
RDF, SPARQL, RDF-S and OWL.
Novelty of the concept:
• Taking redundant pages (different URLs that describe the same resource URI) and
filtering them down to the resource with a SPARQL query.
• This filtering depends on the RDF, RDF-S and OWL vocabularies used to
generate the linked graph and to develop the classification.
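The idea of redundant pages can be illustrated with a minimal Python sketch. It is not the author's implementation: the page URLs, the resource URIs under example.org, and the helper names are all hypothetical, and plain Python grouping stands in for the SPARQL filtering step.

```python
from collections import defaultdict

# Hypothetical page records: (page URL, URI of the resource the page describes).
PAGES = [
    ("http://example.org/listen?id=1", "http://example.org/resource/track-001"),
    ("http://example.org/audio/track-001.html", "http://example.org/resource/track-001"),
    ("http://example.org/audio/track-002.html", "http://example.org/resource/track-002"),
]


def group_by_resource(pages):
    """Map each resource URI to the list of page URLs that describe it."""
    groups = defaultdict(list)
    for page_url, resource_uri in pages:
        groups[resource_uri].append(page_url)
    return dict(groups)


def redundant_resources(pages):
    """Keep only the resources that are reachable through more than one page URL."""
    grouped = group_by_resource(pages)
    return {uri: urls for uri, urls in grouped.items() if len(urls) > 1}


if __name__ == "__main__":
    for uri, urls in redundant_resources(PAGES).items():
        print(uri, "is described by", len(urls), "pages")
```

Grouping by resource URI rather than by page URL is what lets a single answer be returned for pages that differ only in their address.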
3. “I want to know about the artist of the audio page”: The Problem
7. PROBLEMS:
• This unwanted feature increases the searching time, and with it the
effort of interpreting the content (evident from the search results).
• As a result, the accuracy measures, Precision and Recall, are very low.
8. How does RDF start to reduce this?
• The RDF model is made up of triples: subject-predicate-object.
• The resources in these triples are uniquely identified on the web through URIs. [Like a “PASSPORT NUMBER”
that uniquely identifies a person across the real world].
• This lets machines understand human knowledge statements. [Computer saying: Oh!]
• The RDF model is essentially the canonicalization of a (directed) graph, and as such has all
the advantages (and generality) of structuring information using graphs.
• The triples are understood as a basic “lexis” (a synonym of “vocabulary”, taken from Microsoft Word) of
the web resources. On their own, they give no additional information about the resource properties
or about the relationships among them and with other properties.
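The triple model described above can be sketched in a few lines of Python. This is only an illustration: the `EX` namespace, the property names, and the data are hypothetical, and a list of tuples stands in for a real RDF store.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace and data, used only for illustration.
EX = "http://example.org/"
TRIPLES = [
    Triple(EX + "track-001", EX + "title", "Example Track"),
    Triple(EX + "track-001", EX + "artist", EX + "artist-01"),
    Triple(EX + "artist-01", EX + "name", "Example Artist"),
]


def outgoing(triples, node):
    """Edges leaving `node`: each triple is a directed edge from subject to
    object, labelled by the predicate."""
    return [(t.predicate, t.obj) for t in triples if t.subject == node]


if __name__ == "__main__":
    for predicate, obj in outgoing(TRIPLES, EX + "track-001"):
        print(predicate, "->", obj)
```

Because the object of one triple (`artist-01`) is the subject of another, the statements chain into a directed graph that a machine can walk.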
10. Thanks to SPARQL, the query language for RDF data
• I call SPARQL a test bed that gives us (as Web 3.0 learners) a clear idea of
the result accuracy.
• Queries RDF data. If your data is in RDF, then SPARQL can query it
natively.
• Implicit join syntax. SPARQL queries RDF graphs, which consist of various
triples expressing binary relations between resources. As all relationships
are of a fixed size and data lives in a single graph, SPARQL does not
require explicit joins that specify the relationship between differently
structured data.
• A SPARQL SELECT query has the following structure:
SELECT <variable list>
WHERE {<graph pattern> }
• FROM names the RDF graph (dataset) to be queried, identified by its IRI.
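The implicit join can be made concrete with a small Python sketch of graph-pattern matching. It is a toy evaluator under stated assumptions, not a SPARQL engine: the `ex:` names and data are hypothetical, variables are strings starting with `?`, and only SELECT with a list of triple patterns is covered.

```python
# Hypothetical data: each tuple is (subject, predicate, object).
TRIPLES = [
    ("ex:track-001", "ex:artist", "ex:artist-01"),
    ("ex:artist-01", "ex:name", "Example Artist"),
    ("ex:track-002", "ex:artist", "ex:artist-02"),
    ("ex:artist-02", "ex:name", "Another Artist"),
]


def is_var(term):
    """Variables are written like SPARQL variables: '?name'."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; None on conflict."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # A variable bound earlier must keep the same value: this is the join.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def select(variables, patterns, triples):
    """Toy version of: SELECT <variables> WHERE { <patterns> }."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for t in triples
            if (extended := match_pattern(pattern, t, b)) is not None
        ]
    return [tuple(b[v] for v in variables) for b in bindings]


if __name__ == "__main__":
    query = [("ex:track-001", "ex:artist", "?a"), ("?a", "ex:name", "?name")]
    print(select(["?name"], query, TRIPLES))
```

The two patterns share the variable `?a`, and that sharing alone joins the track to its artist's name. No join condition is written explicitly.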
11. The computer now knows only the URI, but does not know how the resource
relates to the keyword: What is the solution?
• The solution is to use a vocabulary description language: RDF-S.
• A schema defines not only the properties of the resource (e.g., title, author,
subject, size, color, etc.) but may also define the kinds of resources being
described (books, Web pages, people, companies, etc.).
• E.g.:
<owl:Class rdf:about="http://media.srichaganti.net/audio/Bhagavatam/001-bhagavatam-02_02_06.mp3">
  <rdfs:comment>An audio file, which may be available on a local file system or
  through http, ftp, etc.</rdfs:comment>
  <rdfs:label>audio file</rdfs:label>
  <rdfs:subClassOf rdf:resource="http://english.srichaganti.net/SrimadBhagavatam.aspx#"/>
</owl:Class>
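What a machine can do with an rdfs:subClassOf statement like the one above is follow it transitively. The following Python sketch shows that step under stated assumptions: the class names in `SUB_CLASS_OF` are hypothetical and a dictionary stands in for the schema graph.

```python
# Hypothetical class hierarchy: class -> list of direct superclasses.
SUB_CLASS_OF = {
    "ex:AudioFile": ["ex:MediaFile"],
    "ex:MediaFile": ["ex:Document"],
}


def superclasses(cls, sub_class_of):
    """All classes reachable from `cls` by following subClassOf links,
    directly or through intermediate classes."""
    seen = []
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(superclasses("ex:AudioFile", SUB_CLASS_OF))
```

With this closure, a query for every `ex:Document` can also return resources that were only declared as `ex:AudioFile`.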
12. RDF-S restricts the subjects and objects with its vocabulary, which
is a good sign
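A minimal Python sketch of how rdfs:domain and rdfs:range act on subjects and objects follows. In RDFS semantics these are inference rules rather than validation checks: using a property lets a reasoner conclude the types of its subject and object. The schema, the property `ex:artist`, and the data are hypothetical.

```python
RDF_TYPE = "rdf:type"

# Hypothetical schema: property -> (domain class, range class).
SCHEMA = {"ex:artist": ("ex:AudioFile", "ex:Person")}

# Hypothetical data: one statement that uses the property.
DATA = [("ex:track-001", "ex:artist", "ex:artist-01")]


def infer_types(triples, schema):
    """Apply the domain and range rules: the subject of a property gets the
    domain class as its type, and the object gets the range class."""
    inferred = set()
    for s, p, o in triples:
        if p in schema:
            domain, range_ = schema[p]
            inferred.add((s, RDF_TYPE, domain))
            inferred.add((o, RDF_TYPE, range_))
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer_types(DATA, SCHEMA)):
        print(triple)
```

Once these type statements are available, a query for "the artist of this audio file" can be narrowed to objects known to be of the range class.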
13. Now, can I get the suitable (needed) response? The answer is yes
• The answer is YES, through the ontology.
• Because RDF-S constrains each property through its “domain” and “range”, it is now simple to infer the
response through a query.
• Put simply: the ontology uses richer vocabularies to infer the
response.
• E.g.:
<owl:Class rdf:about="http://media.srichaganti.net/audio/Bhagavatam/001-bhagavatam-02_02_06.mp3">
…………………………
</owl:Class>
• <owl:Class rdf:ID="ConferenceVenuePlace">
…………………………
</owl:Class>