Session 1.4: Sustainable Urban Delta knowledge and semantic search
1. SUD MODEL
RVO Sustainable Urban Delta
Dr. Evgeny Knutov
John Walker
KnowSyms B.V. / Semaku B.V.
18 Aug. 2017
2. PROBLEM STATEMENT
SUD MODEL
A system for the flexible management of a dynamic, co-evolving document collection and knowledge structures in a focused domain. The work reported here is part of the document and knowledge management activities of the SUD data modelling project at RVO.
Within this framework, a vast amount of unstructured information becomes available in the form of reports (primarily PDF) submitted by various companies and experts. There is a need to automate the processing of these reports, to help domain experts find and analyze the most important information, and to turn this information into a knowledge base.
4. SUD ONTOLOGY BASICS
• An ontology explains the semantics of the information: “how do we convey the meaning?”
• Formal naming and definition of the types and interrelationships that exist in the described domain
• Consists of triples (also called semantic triples)
• A triple represents a subject-predicate-object statement
• e.g. (RVO) - (located in) - (Utrecht)
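The triple above can be written down in a few lines of code. This is a minimal sketch, not the project's actual data model: the `http://example.org/sud#` namespace and the names `Triple` and `to_ntriples` are assumptions made for illustration, since the slides do not give the real SUD ontology IRIs.

```python
from typing import NamedTuple

# Hypothetical namespace; the real SUD ontology IRIs are not given in the slides.
SUD = "http://example.org/sud#"


class Triple(NamedTuple):
    """One subject-predicate-object statement, all three given as IRIs."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(t: Triple) -> str:
    """Serialize a triple of IRIs as a single N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


# (RVO) - (located in) - (Utrecht)
rvo_location = Triple(SUD + "RVO", SUD + "locatedIn", SUD + "Utrecht")
line = to_ntriples(rvo_location)
print(line)
```

N-Triples is one of the standard RDF serializations, so a line like this can be loaded into any RDF store.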
7. SUD ONTOLOGY
• information is represented in
triples
• started with ~300 triples and
counting
• additional energy-related
ontologies with >1000 triples
• RDF format (industry standard)
• “easily” add new instances and
concepts
• interchangeable ontologies
(switch your SUD-related
knowledge base on the fly)
9. SEARCH AND EXPLORE (CONT.)
• provides search and exploration functionality across
the knowledge base(s) and all the documents
• integrates knowledge terms into the search and
triggers refinement of the document search results
• adjustable search and viewing options
• change your knowledge base on the fly
• adjust the viewing option and the knowledge depth
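One way to picture how knowledge terms and the "knowledge depth" setting could refine a document search is query expansion: follow related terms in the knowledge base up to a chosen number of hops and search for all of them. The sketch below assumes this interpretation; the function names, the toy knowledge base, and the relations between its terms are illustrative, not taken from the system.

```python
from collections import deque


def related_terms(kb, start, depth):
    """Breadth-first walk over the knowledge base, at most `depth` hops from `start`."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        term, hops = queue.popleft()
        if hops == depth:
            continue  # do not expand beyond the requested knowledge depth
        for neighbour in kb.get(term, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append((neighbour, hops + 1))
    return order


def build_query(terms):
    """Combine the terms into a simple OR query of quoted phrases."""
    return " OR ".join(f'"{t}"' for t in terms)


# Toy knowledge base; the links between these terms are assumed for illustration.
kb = {
    "challenge": ["agriculture"],
    "agriculture": ["greenport"],
    "greenport": ["venlo"],
}

print(build_query(related_terms(kb, "challenge", depth=1)))
print(build_query(related_terms(kb, "challenge", depth=2)))
```

Raising the depth widens the query; swapping `kb` for another dictionary corresponds to changing the knowledge base on the fly.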
10. BASIC DOCUMENT VIEW
• basic view of the
text snippets
containing the
found information
• immediately get
access to the
original PDF
document
• highlighted term(s)
and predefined wiki
links
11. EXTENDED DOCUMENT VIEW
• “enable detailed snippets”
provides extra insight on the
found document and
keywords
• virtually every aspect of the
document presentation is
internally adjustable
• length of textual
information, highlights,
clickable links, etc.
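The snippet views can be approximated with a small helper that cuts a window around the found term and marks the hit. This is only a sketch under assumptions: the `<em>` markup, the `width` parameter, and the function name are hypothetical, and the real system may delegate this to its search engine.

```python
import re


def make_snippet(text, term, width=40):
    """Return a window of `width` characters around the first hit, with hits highlighted.

    Returns None when the term does not occur in the text.
    """
    pattern = re.escape(term)
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    window = text[start:end]
    # Wrap every occurrence inside the window, keeping the original casing.
    return re.sub(pattern, lambda m: f"<em>{m.group(0)}</em>", window, flags=re.IGNORECASE)


sample = "The Greenport Venlo region focuses on agriculture."
print(make_snippet(sample, "venlo", width=10))
print(make_snippet(sample, "venlo", width=40))
```

A larger `width` plays the role of the "detailed snippets" option: same hit, more surrounding text.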
15. EXAMPLE 1: MAIN EXPLORATION SCENARIO (CONT.)
Challenges - Agriculture - Greenport - Venlo - Location - Eindhoven - Brainport
16. EXAMPLE 2: ADDING NEW KNOWLEDGE ELEMENT
• search for
“challenge” (currently
results in 12 challenge
types)
• adding a new “security”
challenge (i.e. a new individual
of the “Challenge” class)
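In RDF terms, adding a new individual amounts to adding one `rdf:type` triple. The sketch below shows this on an in-memory set of triples; the namespace, the helper names, and the two placeholder individuals are assumptions for illustration and do not reproduce the 12 challenge types in the actual ontology.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUD = "http://example.org/sud#"  # hypothetical namespace, not the real SUD IRIs


def individuals_of(graph, class_iri):
    """List all subjects declared as members of the given class."""
    return sorted(s for (s, p, o) in graph if p == RDF_TYPE and o == class_iri)


def add_individual(graph, class_iri, individual_iri):
    """Declare a new individual of a class by adding a single rdf:type triple."""
    graph.add((individual_iri, RDF_TYPE, class_iri))


# Placeholder individuals only; the real ontology currently holds 12 challenge types.
graph = {
    (SUD + "PlaceholderChallengeA", RDF_TYPE, SUD + "Challenge"),
    (SUD + "PlaceholderChallengeB", RDF_TYPE, SUD + "Challenge"),
}

before = len(individuals_of(graph, SUD + "Challenge"))
add_individual(graph, SUD + "Challenge", SUD + "Security")
after = len(individuals_of(graph, SUD + "Challenge"))
print(before, "->", after)
```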
17. EXAMPLE 2: ADDING NEW ONTOLOGY ELEMENT (CONT.2)
• in the main search you can now
explore more challenges and
thereby narrow down the
document search
• the new challenge types become
instantly discoverable across the
whole document set
18. EXAMPLE 5: KNOWLEDGE INTEROPERABILITY
• switch the knowledge
on the fly
• use different
knowledge with the
same documents
19. EXAMPLE 5: KNOWLEDGE INTEROPERABILITY (CONT.)
• or the same knowledge
with a different
document set (not in
the system)
• system is agnostic to
the documents and/or
the knowledge
• can be used throughout
multiple domains
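The idea that the system is agnostic to both documents and knowledge can be sketched as a search function that takes each as a plain argument. Everything here is illustrative: the function, the two toy knowledge bases, and the document texts are assumptions, not content from the SUD system.

```python
def search(documents, knowledge, term):
    """Return ids of documents mentioning the term or any term the knowledge base links to it."""
    terms = {term, *knowledge.get(term, ())}
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if any(t.lower() in text.lower() for t in terms)
    )


# Made-up documents and knowledge bases, for illustration only.
documents = {
    "report-1.pdf": "Heat pumps reduce energy demand.",
    "report-2.pdf": "Greenhouse horticulture near Venlo.",
}
sud_knowledge = {"challenge": ["horticulture"]}
energy_knowledge = {"challenge": ["heat pumps"]}

# Same documents, different knowledge.
print(search(documents, sud_knowledge, "challenge"))
print(search(documents, energy_knowledge, "challenge"))
```

Passing a different `documents` dictionary with the same knowledge base gives the other direction of the swap.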
20. EXAMPLE 6: EXTENDING KNOWLEDGE
• extending the knowledge beyond
the domain concerned (e.g. with
Wikipedia or DBpedia)
• incorporating it into the ontology
• using external features
21. TAKING IT ONE STEP FURTHER
• lots of possibilities to
adjust and enrich the
system functionality
• interchanging
ontologies and
document sets
• user feedback: the system
improves when users
rate the relevance of
documents
• automatic
summarization on a
certain topic
• automatic report
generation
• custom features, etc.
22. COMBINED AND INTERCHANGEABLE KNOWLEDGE
• Combine knowledge from
multiple sources
• e.g. via federated
queries among multiple
knowledge bases
including “SUDmodel”
• generally accepted
knowledge such as
DBpedia (a Wikipedia of
concepts)
• easily re-use ontologies
from a different domain
• use the current knowledge
with a different
document set
• can be used with an
external document
supplier (with a generic
formatting/schema)
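A federated query of the kind mentioned above could look roughly like the following, using the `SERVICE` keyword from SPARQL 1.1 Federated Query to reach DBpedia from the local knowledge base. This is a sketch under assumptions: the `sud:` namespace, the `sud:Location` class, and the `owl:sameAs` links to DBpedia are hypothetical; only the DBpedia endpoint and `dbo:abstract` are real.

```python
from string import Template

# The sud: namespace and class names are assumptions; the slides do not list them.
FEDERATED_QUERY = Template("""\
PREFIX sud: <http://example.org/sud#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?item ?abstract WHERE {
  # Local part: items in the SUD knowledge base that are linked to DBpedia.
  ?item a $local_class ;
        owl:sameAs ?dbpediaResource .
  # Remote part: fetch the English abstract from the external endpoint.
  SERVICE <$endpoint> {
    ?dbpediaResource dbo:abstract ?abstract .
    FILTER (lang(?abstract) = "en")
  }
}
""")


def federated_query(local_class, endpoint="https://dbpedia.org/sparql"):
    """Fill in the local class and the remote SPARQL endpoint."""
    return FEDERATED_QUERY.substitute(local_class=local_class, endpoint=endpoint)


query = federated_query("sud:Location")
print(query)
```

The resulting string would be sent to the local SPARQL endpoint, which forwards the `SERVICE` block to the remote one.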
23. OTHER VERSIONS
❖ Heat and Energy-related
❖ Separation technology-related
❖ Map integration
❖ Plain version (sandbox)
24. TECHNICAL DETAILS OF THE ENVIRONMENT
• runs on Ubuntu Server
16.04 LTS
• uses open-source third-
party components
• Apache Solr 4.10 - 6.10
• Apache Jena Fuseki 2.4.1
• Apache HTTP2 server
• custom-built JavaScript
framework
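To show how a front end might talk to the two back ends listed above, the sketch below only builds the request URLs for a Solr full-text search with highlighting and for a Fuseki SPARQL query; it does not send them. Host names, the `sud_docs` core, the `sud` dataset, and the `content` field are assumptions; the ports are the usual defaults and the query parameters (`q`, `hl`, `hl.fl`, `hl.fragsize`, `wt`, `query`) are standard for Solr and the SPARQL protocol.

```python
from urllib.parse import urlencode


def solr_select_url(base, core, text, fragsize=100):
    """URL for a Solr search with highlighted snippets of about `fragsize` characters."""
    params = {
        "q": text,
        "hl": "true",            # switch highlighting on
        "hl.fl": "content",      # hypothetical field holding the extracted PDF text
        "hl.fragsize": fragsize,
        "wt": "json",
    }
    return f"{base}/solr/{core}/select?{urlencode(params)}"


def fuseki_query_url(base, dataset, sparql):
    """URL for a SPARQL query against a Fuseki dataset's query service."""
    return f"{base}/{dataset}/sparql?{urlencode({'query': sparql})}"


solr_url = solr_select_url("http://localhost:8983", "sud_docs", "challenge")
fuseki_url = fuseki_query_url(
    "http://localhost:3030", "sud", "SELECT * WHERE { ?s ?p ?o } LIMIT 5"
)
print(solr_url)
print(fuseki_url)
```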