Andreas Blumauer
CEO & Managing Partner
Semantic Web Company /
PoolParty Semantic Suite
Taxonomy Boot Camp 2017
Washington, DC
Leveraging
Taxonomy Management
With Machine Learning
INTRODUCTION
2
Andreas Blumauer: founder & CEO of Semantic Web Company
Semantic Web Company: founded 2004, located in Vienna, serves >200 customers
PoolParty Semantic Suite: developed and sold by Semantic Web Company, current version 6.0
[Diagram: knowledge graph about the speaker. Further nodes: Enterprise Knowledge Graphs, Taxonomies, Ontologies, Text Mining. Edge labels: active at, based on, part of, manages, standard for, enriches, editor of, is about, graduates, used for]
Agenda
▸ Cognitive Computing:
Semantic Technologies & Machine Learning
▸ Terms, Concepts, Shadow Concepts
▸ Corpus Analysis & (Shadow) Concept Extraction
with PoolParty
▸ A comparison with LSA and Word2Vec
▸ Use Cases
▹ Document Annotation & Indexing
▹ Text Classification (incl. Benchmarks)
▹ Recommender Systems (incl. Use Case)
3
Cognitive
Computing
Combining Semantic Technologies
With Machine Learning
4
A key
assumption
of this talk
People do not search
for documents only,
they seek facts about
things and smaller
chunks of information.
Machines shall help to
create links across
data silos to give
answers to questions.
5
Converging A.I.
Technologies
A quick
question at the
beginning
Will Artificial
Intelligence
make
Subject Matter
Experts
obsolete?
6 Imagine you want to
build an application
that helps to identify
patient-treatment pairings.
Which would you prefer?
Applications based solely on machine learning, applications based only
on doctors' knowledge, or a combination of both?
How Semantic
Computing
and Machine
Learning
complement
each other
7
Structured Data
Machine
Learning
Cognitive
Applications
How Semantic
Computing
and Machine
Learning
complement
each other
8 Unstructured Data
Structured Data
Machine
Learning
Cognitive
Applications
How Semantic
Computing
and Machine
Learning
complement
each other
9 Unstructured Data
Structured Data
Knowledge Graphs
Machine
Learning
Cognitive
Applications
Towards a
Digital Twin
Proposal for a
Cognitive
Computing
Platform
Architecture
10 Unstructured Data
Structured Data
Knowledge Graphs
Machine
Learning
Semantic
Layer
IoT & Cognitive
Applications
Terms, Concepts,
Shadow Concepts
How to make sense of text and data
11
Terms and
co-occurrence
models
12
Document
Corpus
- Websites
- PDF, Word, …
- Abstracts from
DBpedia
- RSS Feeds
Term 8
Term 3
Term 7
Term 8
Term 6
Term 9
Term 5
Term 10
- Relevant terms and phrases
- Relevance of terms
- Co-occurrence between terms
Term 1
Term 4
Term 2
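The co-occurrence model on this slide can be sketched in a few lines of Python. This is only an illustration, not PoolParty's implementation: the tiny corpus, the whitespace tokenizer, and the choice to count co-occurrence per document are all assumptions.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(documents):
    """Count how often two terms appear in the same document."""
    pair_counts = Counter()
    for text in documents:
        # Assumption: lowercase whitespace tokens stand in for real term extraction.
        terms = sorted(set(text.lower().split()))
        for a, b in combinations(terms, 2):
            pair_counts[(a, b)] += 1
    return pair_counts

# Hypothetical mini corpus.
corpus = [
    "inca site peru",
    "inca emperor pachacuti",
    "inca site cusco peru",
]
counts = cooccurrence_counts(corpus)
```

Pairs are stored in alphabetical order, so `counts[("inca", "site")]` gives the number of documents that mention both terms.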
‘Things’ but not Strings:
Using a ‘Semantic Knowledge Graph’
http://www.my.com/
taxonomy/62346723
prefLabel
Retina
image
http://www.my.com/
images/90546089
http://www.my.com/
taxonomy/
97345854
prefLabel
Funduscope
altLabel
Ophthalmoscope
http://www.mycom.com
/taxonomy/4543567
prefLabel
Diagnostic Equipment
has broader
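A minimal sketch of why 'things, not strings' matters: each concept has a URI, and any of its labels resolves to that URI. The dictionary layout and the `find_concept` helper are hypothetical, and the reading that Funduscope has the broader concept Diagnostic Equipment is an assumption drawn from the flattened diagram.

```python
# Hypothetical in-memory model of the SKOS-style graph on the slide.
concepts = {
    "http://www.my.com/taxonomy/97345854": {
        "prefLabel": "Funduscope",
        "altLabels": ["Ophthalmoscope"],
        # Assumption: the "has broader" edge starts at Funduscope.
        "broader": "http://www.mycom.com/taxonomy/4543567",
    },
    "http://www.mycom.com/taxonomy/4543567": {
        "prefLabel": "Diagnostic Equipment",
        "altLabels": [],
        "broader": None,
    },
}

def find_concept(label):
    """Resolve any label (preferred or alternative) to a concept URI."""
    wanted = label.lower()
    for uri, concept in concepts.items():
        labels = [concept["prefLabel"], *concept["altLabels"]]
        if wanted in (l.lower() for l in labels):
            return uri
    return None

uri = find_concept("ophthalmoscope")
parent = concepts[concepts[uri]["broader"]]["prefLabel"]
```

Searching for the synonym and searching for the preferred label lead to the same concept, and from there to its broader concept.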
Shadow Concepts
Use co-occurrences
between concepts
and terms to
extract ‘shadow
concepts’
14 This site is a
15th-century Inca
site located 2,430
metres above sea
level. It is located
in Cusco, Peru.
It is situated on a mountain ridge above
the Sacred Valley through which the
Urubamba River flows. Most
archaeologists believe that it was built as
an estate for the Inca emperor Pachacuti.
Often mistakenly referred to as the "Lost
City of the Incas", it is the most familiar
icon of Inca civilization. The Incas built
the estate around 1450, but abandoned it
a century later at the time of the
Spanish Conquest.
Inca
site
Machu
Picchu
Cusco
Inca
empire
Inca
emperor
Peru
Spanish
Conquest
Sacred
Valley
Chankas
Lost
City
Pachacuti
In addition to explicitly used concepts and terms, Machu Picchu is
extracted from the article as a Shadow Concept. As a prerequisite,
one has to provide and analyze a representative text corpus first.
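The idea can be sketched as follows. This is not PoolParty's algorithm: the association weights are invented for illustration (in practice they would come from the corpus analysis mentioned above), and the additive scoring and the threshold are assumptions.

```python
def shadow_concepts(text_terms, associations, threshold=1.0):
    """Return concepts that are not mentioned in the text but are strongly
    associated with the terms that are."""
    present = set(text_terms)
    scores = {}
    for concept, weights in associations.items():
        if concept in present:
            continue  # explicitly mentioned, so not a shadow concept
        score = sum(w for term, w in weights.items() if term in present)
        if score >= threshold:
            scores[concept] = score
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical association weights, as if learned from a reference corpus.
associations = {
    "Machu Picchu": {"Inca": 0.6, "Cusco": 0.5, "Pachacuti": 0.4, "Sacred Valley": 0.4},
    "Amazon River": {"Peru": 0.3},
    "Cusco": {"Inca": 0.7, "Peru": 0.5},
}
article_terms = ["Inca", "Cusco", "Peru", "Pachacuti", "Sacred Valley", "Spanish Conquest"]
result = shadow_concepts(article_terms, associations)
```

With these made-up weights, only Machu Picchu passes the threshold: it never appears in the article, yet most of its strongly associated terms do.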
Example:
Corpus Analysis
Use PoolParty for Deep Text Analysis
15
Bionics
How do we learn
from a lot of text?
16 Bla bla
bla bla.
Bla bla
bla bla
The stove is on.
The stove is hot!
Ontological model → reasoning
Taxonomical model → is-a abstractions
Bla stove
bla bla.
Bla bla
bla hot
Switched on
devices are
dangerous
devices.
The stove is on.
The stove is hot!
Statistical model/cooccurences → is related
The stove is on.
The stove is hot!
Switched on
devices are
dangerous, only if
the operating
temperature is
above 100 degrees
and the automatic
shutdown
mechanism is
broken.
Bla bla
bla bla.
Bla bla
bla bla
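The difference between the coarse is-a rule and the refined rule on this slide can be written as two predicates. This is a sketch under assumptions: the `Device` fields are hypothetical names, and '100 degrees' is read as degrees Celsius.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    switched_on: bool
    operating_temp_c: float  # assumption: temperature in degrees Celsius
    shutdown_broken: bool

def is_dangerous_taxonomic(device):
    """Coarse is-a rule: every switched-on device is a dangerous device."""
    return device.switched_on

def is_dangerous_ontological(device):
    """Refined rule from the slide: all three conditions must hold."""
    return (device.switched_on
            and device.operating_temp_c > 100
            and device.shutdown_broken)

stove = Device("stove", switched_on=True, operating_temp_c=250, shutdown_broken=False)
```

For this stove the coarse rule raises an alarm, while the refined rule does not, because the automatic shutdown still works.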
Graphs +
Machine Learning
PoolParty as a
supervised
learning system
17 Content Manager
Integrator
Taxonomist/
Ontologist
Thesaurus
Server
Extractor
PowerTagging
uses API
is user of
is user of
is basis of
is basis of
Index
annotates
enriches
Corpus Learning/
Semantic Analysis
CMS
extends
is basis of
analyzes
uses API
proposes
extensions
Knowledge
graphs as a
result of
human-machine
cooperation
18 Manually created parts of graph
Supervised learning
Automatically created parts of graph
(corpus analysis, RDF transformation,
machine learning, ….)
PoolParty
Corpus Analysis
How taxonomists
can extend
taxonomies with
some help from
machine learning
algorithms
19
Candidate concepts derived from sample documents can easily be integrated into the taxonomy.
Candidate concepts are listed per document, or as a list of the most relevant candidates per corpus.
The context of a given taxonomy concept can be visualised with a few mouse clicks.
Terms, concepts and shadow concepts can be highlighted per document.
Network-based
Knowledge
Graph
Assessment
Thesaurus
Harmonizer
20 ▸ Find missing relationships between
concepts, which are of high
semantic relevance
▸ Point out structural flaws in
existing thesauri
▸ Identify corpora that only reflect a
fraction of a thesaurus
▹ Or, vice versa: identify
thesauri that are far too big
for their domain applications,
and possibly missing details
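Two of the checks listed above can be approximated with simple set operations. This is a rough sketch, not the Thesaurus Harmonizer itself: the function names, the sample data, and the `min_count` threshold are hypothetical.

```python
def corpus_coverage(thesaurus_concepts, corpus_concepts):
    """Fraction of thesaurus concepts that occur in the corpus at least once."""
    thesaurus_concepts = set(thesaurus_concepts)
    if not thesaurus_concepts:
        return 0.0
    return len(thesaurus_concepts & set(corpus_concepts)) / len(thesaurus_concepts)

def missing_relations(cooccurrence, existing_edges, min_count=3):
    """Concept pairs that co-occur often but are not linked in the thesaurus."""
    linked = {frozenset(edge) for edge in existing_edges}
    return sorted(
        tuple(sorted(pair))
        for pair, count in cooccurrence.items()
        if count >= min_count and frozenset(pair) not in linked
    )

# Hypothetical sample data.
thesaurus = {"solar energy", "wind energy", "biogas", "district heating"}
seen_in_corpus = {"solar energy", "wind energy"}
cooc = {("solar energy", "wind energy"): 5, ("solar energy", "biogas"): 1}
edges = [("solar energy", "biogas")]

coverage = corpus_coverage(thesaurus, seen_in_corpus)
candidates = missing_relations(cooc, edges)
```

A low coverage value suggests that the corpus reflects only a fraction of the thesaurus (or that the thesaurus is too big for the application), while the candidate pairs are relationships a taxonomist may want to review.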
Use Cases
Benefit from Semantic Knowledge Graphs
and Machine Learning
21
PoolParty
Extractor
Extract concepts
from text even if
not used explicitly
22
Some domains use text that doesn’t always call a spade a spade. With
‘shadow concept extraction’, those ‘masked’ concepts can still be surfaced.
Since these technologies would have become conventional
technologies that are made into products and introduced into market
at the time of their introduction, it would be difficult to differentiate
them as innovative environmental and energy technologies from other
global warming prevention technologies that have already been put to
practical use in the industrial, commercial, residential, and energy
conversion sectors.
- The Innovative Global Warming Prevention Technology Working
Group under the Research and Development Subcommittee
- Council assessed that innovative global warming prevention
technologies would bring about a reduction effect of 7.49 million t-CO2
case of average emissions factor for all power sources of carbon
dioxide in 2010. In view of the difficulty in putting innovative carbon
dioxide sequestration technology into practical use by 2010, the
Working Group reassigned it as an issue of global warming prevention
technology to be tackled by 2030.
The Central Environment Council, however, has not had the
opportunity to examine the contents of these technologies in detail.
(Promotion of climate change prevention activities by every social
actor)
- The Programme encourages every social actor to take actions to
prevent global warming. The actions include measures undertaken by
the public sector.
Climate Change
PoolParty
Semantic
Classifier
Text Classification
based on Machine
Learning and
Semantic
Knowledge Models
23
PoolParty Semantic Classifier combines machine learning algorithms
(SVM, Deep Learning, Naive Bayes, etc.) with Semantic Knowledge Graphs.
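One plausible way to combine the two worlds is at the feature level: terms from the text and concepts from the knowledge graph end up in the same feature vector. The sketch below only shows that merging step. The function name and the prefixes are hypothetical, and the source does not describe how the product actually builds its features.

```python
from collections import Counter

def build_features(tokens, concepts, shadow_concepts):
    """Merge three feature families into one sparse feature dict.
    Prefixes keep a term and a concept with the same label apart."""
    features = Counter()
    for token in tokens:
        features[f"term:{token}"] += 1
    for concept in concepts:
        features[f"concept:{concept}"] += 1
    for concept in shadow_concepts:
        features[f"shadow:{concept}"] += 1
    return dict(features)

# Hypothetical document annotations.
features = build_features(
    tokens=["solar", "panel", "solar"],
    concepts=["Solar energy"],
    shadow_concepts=["Renewable Energy"],
)
```

A dict like this could then be vectorized (for example with scikit-learn's `DictVectorizer`) and passed to a linear SVM or another classifier.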
Benchmarking
the PoolParty
Semantic
Classifier
Improvement of
5.2% compared
to traditional
(term-based)
SVM
24
Features used                                    Classifier   F1 (5 folds)   Variance
Terms                                            LinearSVC    0.83175        0.0008
Concepts from REEGLE + Shadow Concepts           LinearSVC    0.84451        0.0011
Concepts from REEGLE                             LinearSVC    0.84647        0.0009
Terms + Concepts from REEGLE + Shadow Concepts   LinearSVC    0.87474        0.0009
Reegle thesaurus
A comprehensive SKOS taxonomy
for the clean energy sector
(http://data.reeep.org/thesaurus/guide)
● 3,420 concepts
● 7,280 labels (English version)
● 9,183 relations (broader/narrower + related)
Document Training Set
1,800 documents in 7 classes:
Renewable Energy, District Heating Systems,
Cogeneration, Energy Efficiency, Energy (general),
Climate Protection, Rural Electrification
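The headline figure of 5.2% appears to be the relative F1 gain of the combined feature set over the term-only baseline. That reading is an assumption, but it can be checked against the benchmark scores:

```python
# F1 scores (5 folds) as reported in the benchmark table.
f1_scores = {
    "Terms": 0.83175,
    "Concepts from REEGLE + Shadow Concepts": 0.84451,
    "Concepts from REEGLE": 0.84647,
    "Terms + Concepts from REEGLE + Shadow Concepts": 0.87474,
}

baseline = f1_scores["Terms"]
# Relative gain of each feature set over the term-only baseline.
relative_gain = {
    name: (score - baseline) / baseline
    for name, score in f1_scores.items()
}
best = max(f1_scores, key=f1_scores.get)
```

The best feature set is the combination of terms, concepts and shadow concepts, and its relative gain rounds to 0.052.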
Sample
Calculation
Based on an
improvement of
5.2%
25
Inbound
Documents
PoolParty
Semantic
Classifier
Experienced
Agent
● 100,000 documents (emails, tickets, etc.) per month
● 5 Euros extra costs per document when misrouted
● Cost savings per year:
○ 1,200,000 documents per year × €5 × 0.052 = €312,000 per annum
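The sample calculation can be reproduced step by step. Treating the 5.2% improvement as the share of documents that are no longer misrouted is the slide's simplification, and the variable names are hypothetical.

```python
docs_per_month = 100_000
cost_per_misrouted_doc_eur = 5
improvement = 0.052  # assumption: share of documents no longer misrouted

docs_per_year = docs_per_month * 12
annual_savings_eur = docs_per_year * cost_per_misrouted_doc_eur * improvement
```

This yields 1,200,000 documents per year and savings of about €312,000.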
Use Shadow
Concepts to
improve
Recommender
Systems
26
Mini Countryman
And it’s probably more of a
crossover than ever, with the design
to match. Being a Mini, the
Countryman is clearly meant to be
the driver’s car among small
crossovers. The suspension is
sophisticated, and there are lots of
chassis options (a stiffer sports
setup, variable damping, the
electronically controlled ALL4
all-wheel-drive).
But it’s also the crossover for people who’ve bags of cash to blow on
personalisation and luxury.
There’s been a lot of effort on ramping up the cabin quality, but then the
outgoing Countryman was a sad let-down in that department.
On the outside, plastic wheel-arch extensions, with eyebrow creases in the
metalwork above, as well as roof bars and sill protectors all add to the visual
crossover-ness. This remains the only Mini with angular rather than oval
headlamps, and there’s a load of visual posturing going on in the lower face.
There are eight versions at launch, and they’re exactly what you’d expect. It’s
Cooper or Cooper S, each fuelled by petrol or diesel, each of them with front
drive or ALL4. Oh and an eight-speed auto, too, if you count that as a
separate choice. The Cooper petrol is a three-cylinder, the rest fours.
You get extra kit as standard versus the old car, including navigation,
Bluetooth, emergency call and park sensors. Upgrades include a bigger
touch-screen nav with high-definition traffic, various posher seats, a HUD,
and driver aids. Oh and a cushion thingy that folds out from the boot so you
can sit on the rear bumper without getting your clothes mucky.
In June 2017 a Cooper E will launch, which has the Cooper three-cylinder
petrol driving the front wheels, and an electric motor for the rears, with a
capacity to do a claimed 25 miles of gentle all-electric running. So it has the
performance of a Cooper S ALL4 with the tax-busting advantages of a plug-in
hybrid. And you wouldn’t use any fuel if you commuted a short distance.
The platform is BMW’s contemporary transverse-engined hardware, in the
bigger of its two sizes. That means it shares a lot with the BMW X1. The
4WD system is more sophisticated than the previous Countryman’s. The
proportion of drive to the rear is computed by a controller that takes into
account parameters including grip, steering angle and throttle position, as
well as whether you’ve got the sports mode and sports traction systems
selected.
Use a Knowledge
Graph +
Co-occurrences for
precise Content
Recommendation
27 Raving
De-Void
Scott
attack
Stilinski
friend
shame
O’Brien
woman
married
girl
attractive
Similar episodes!
love
Example: Find similar episodes
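A simple way to find similar items is to compare their annotations. The sketch below uses Jaccard overlap of annotation sets. The slide does not say which similarity measure is used, so this is a stand-in, and the episode names and annotations are hypothetical (the labels are borrowed from the slide).

```python
def jaccard(a, b):
    """Overlap of two annotation sets, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def most_similar(target, annotations):
    """Rank all other items by annotation overlap with the target."""
    others = (name for name in annotations if name != target)
    return sorted(
        others,
        key=lambda name: jaccard(annotations[target], annotations[name]),
        reverse=True,
    )

# Hypothetical annotations per episode.
annotations = {
    "episode_a": {"Scott", "Stilinski", "attack", "friend"},
    "episode_b": {"Scott", "Stilinski", "attack", "shame"},
    "episode_c": {"woman", "married", "love"},
}
ranking = most_similar("episode_a", annotations)
```

Annotating with knowledge graph concepts rather than raw strings means that synonyms and shadow concepts also count towards the overlap.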
Rules-based
Recommender
Systems
Example:
Wine-to-Cheese
Harmonizer
Live Demo
28 Dry
Medium-bodied
High acidity
Weingut
Weinrieder
Grüner
Veltliner
Alte Reben
is characterized by
Nutmeg
Full-bodied
Warm finish
Tobacco
is characterized by
Nagelkaas
Cumin
Clove
Hard cheese
Higher fat
?
is characterized by
matches
matches
does not match
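A rules-based recommender of this kind can be sketched as characteristics plus pairing rules. The characteristics below are taken from the slide, but the pairing rules themselves are invented for illustration, because the flattened diagram does not show which characteristic matches which.

```python
# Characteristics as shown on the slide.
wines = {
    "Grüner Veltliner Alte Reben": {"Dry", "Medium-bodied", "High acidity"},
}
cheeses = {
    "Nagelkaas": {"Cumin", "Clove", "Hard cheese", "Higher fat"},
}

# Hypothetical pairing rules: (wine trait, cheese trait, verdict).
rules = [
    ("High acidity", "Higher fat", "matches"),
    ("Dry", "Hard cheese", "matches"),
    ("Full-bodied", "Clove", "does not match"),
]

def explain_pairing(wine, cheese):
    """Return the rules that fire for a wine-cheese pair."""
    return [
        rule for rule in rules
        if rule[0] in wines[wine] and rule[1] in cheeses[cheese]
    ]

fired = explain_pairing("Grüner Veltliner Alte Reben", "Nagelkaas")
```

Because every recommendation comes from explicit rules, the system can explain why a pairing was or was not suggested.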
Why ‘The Knot’
uses Machine
Learning and
Semantic
Models
29 ▹ XO Group runs ‘The Knot’
since 1996
▹ NYSE: XOXO (S&P 600
Component)
▹ 1.5 million active members
▹ The Knot has helped marry
25 million couples
▹ Partnering with 300,000
wedding vendors
▹ Millions of vendor reviews
Thank you for
your interest!
Andreas Blumauer
CEO, Semantic Web Company
▸ Mail andreas.blumauer@semantic-web.com
▸ Company https://www.semantic-web.com
▸ LinkedIn https://www.linkedin.com/in/andreasblumauer
▸ Twitter https://twitter.com/semwebcompany
▸ Blog https://www.linkedin.com/today/
author/andreasblumauer
30
© Semantic Web Company - http://www.semantic-web.com and http://www.poolparty.biz/

 
PoolParty Semantic Suite: Solutions for Sustainable Development: The Climate ...
PoolParty Semantic Suite: Solutions for Sustainable Development: The Climate ...PoolParty Semantic Suite: Solutions for Sustainable Development: The Climate ...
PoolParty Semantic Suite: Solutions for Sustainable Development: The Climate ...
 
PoolParty Semantic Suite - Solutions for Sustainable Development - weadapt.or...
PoolParty Semantic Suite - Solutions for Sustainable Development - weadapt.or...PoolParty Semantic Suite - Solutions for Sustainable Development - weadapt.or...
PoolParty Semantic Suite - Solutions for Sustainable Development - weadapt.or...
 
Dynamic Semantic Publishing
Dynamic Semantic PublishingDynamic Semantic Publishing
Dynamic Semantic Publishing
 
PoolParty Semantic Suite - Management Briefing
PoolParty Semantic Suite - Management BriefingPoolParty Semantic Suite - Management Briefing
PoolParty Semantic Suite - Management Briefing
 
PoolParty Semantic Suite - Functional Overview
PoolParty Semantic Suite - Functional OverviewPoolParty Semantic Suite - Functional Overview
PoolParty Semantic Suite - Functional Overview
 
SKOS in the Public Sector
SKOS in the Public SectorSKOS in the Public Sector
SKOS in the Public Sector
 
SKOS - An Overview
SKOS - An OverviewSKOS - An Overview
SKOS - An Overview
 
SKOS - Some Use Cases
SKOS - Some Use CasesSKOS - Some Use Cases
SKOS - Some Use Cases
 

Último

Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 

Último (20)

Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 

Leveraging Taxonomy Management with Machine Learning

  • 1. Leveraging Taxonomy Management With Machine Learning. Andreas Blumauer, CEO & Managing Partner, Semantic Web Company / PoolParty Semantic Suite. Taxonomy Boot Camp 2017, Washington, DC.
  • 2. Introduction. [Diagram: knowledge graph introducing the speaker. Nodes: Andreas Blumauer, Semantic Web Company, Vienna, 2004, 6.0, >200 customers, Enterprise Knowledge Graphs, Taxonomies, Ontologies, Text Mining. Edge labels: founder & CEO of, developer and vendor of, founded, current version, located, serves customers, active at, based on, part of, manages, standard for, enriches, editor of, is about, graduates, used for.]
  • 3. Agenda ▸ Cognitive Computing: Semantic Technologies & Machine Learning ▸ Terms, Concepts, Shadow Concepts ▸ Corpus Analysis & (Shadow) Concept Extraction with PoolParty ▸ A comparison with LSA and Word2Vec ▸ Use Cases ▹ Document Annotation & Indexing ▹ Text Classification (incl. Benchmarks) ▹ Recommender Systems (incl. Use Case)
  • 5. A key assumption of this talk: People do not search only for documents; they seek facts about things and smaller chunks of information. Machines should help create links across data silos to answer questions. (Converging A.I. Technologies)
  • 6. A quick question at the beginning: Will Artificial Intelligence make Subject Matter Experts obsolete? Imagine you want to build an application that helps to identify patient-treatment pairings. Which would you prefer: applications based solely on machine learning, applications based only on doctors' knowledge, or a combination of both?
  • 7. How Semantic Computing and Machine Learning complement each other. [Diagram layers: Structured Data; Machine Learning; Cognitive Applications]
  • 8. How Semantic Computing and Machine Learning complement each other. [Diagram layers: Unstructured Data; Structured Data; Machine Learning; Cognitive Applications]
  • 9. How Semantic Computing and Machine Learning complement each other. [Diagram layers: Unstructured Data; Structured Data; Knowledge Graphs; Machine Learning; Cognitive Applications]
  • 10. Towards a Digital Twin: Proposal for a Cognitive Computing Platform Architecture. [Diagram layers: Unstructured Data; Structured Data; Knowledge Graphs; Machine Learning; Semantic Layer; IoT & Cognitive Applications]
  • 11. Terms, Concepts, Shadow Concepts: How to make sense of text and data
  • 12. Terms and co-occurrence models. Document corpus: websites; PDF, Word, …; abstracts from DBpedia; RSS feeds. Derived from the corpus: relevant terms and phrases; relevancy of terms; co-occurrence between terms. [Diagram: network of Term 1 to Term 10]
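A minimal sketch of the co-occurrence counting described on this slide, assuming that two terms "co-occur" when they appear in the same document. The function name `cooccurrence_counts`, the naive tokenizer, and the toy corpus are illustrative assumptions, not PoolParty's implementation.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count in how many documents each unordered pair of terms appears together."""
    pair_counts = Counter()
    for text in documents:
        # Naive tokenization (assumption): lowercase, split on whitespace,
        # strip surrounding punctuation. Real systems extract phrases as well.
        terms = {token.strip(".,;:!?").lower() for token in text.split()}
        terms.discard("")
        # Sort so each pair is stored under one canonical key.
        for pair in combinations(sorted(terms), 2):
            pair_counts[pair] += 1
    return pair_counts


# Toy corpus (hypothetical), standing in for websites, PDFs, abstracts or feeds.
corpus = [
    "The stove is on.",
    "The stove is hot!",
    "The oven is hot.",
]
counts = cooccurrence_counts(corpus)
print(counts[("hot", "stove")])  # number of documents containing both terms
```

Raw counts like these are usually normalised by term frequency before being used as a relevancy signal; that step is omitted here.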
  • 13. ‘Things’ but not Strings: Using a ‘Semantic Knowledge Graph’. [Diagram nodes: http://www.my.com/taxonomy/62346723 (prefLabel: Retina); http://www.my.com/images/90546089; http://www.my.com/taxonomy/97345854 (prefLabel: Funduscope, altLabel: Ophthalmoscope); http://www.mycom.com/taxonomy/4543567 (prefLabel: Diagnostic Equipment). Edge labels: image, has broader]
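The idea of "things, not strings" can be sketched with plain Python data structures: every concept has a URI, and several labels resolve to the same URI. The `Concept` class and the helper names are hypothetical, and the "has broader" link from Funduscope to Diagnostic Equipment is an assumption about how the slide's diagram is wired.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_label: str
    alt_labels: list = field(default_factory=list)
    broader: list = field(default_factory=list)  # URIs of broader concepts


FUNDUSCOPE = "http://www.my.com/taxonomy/97345854"
EQUIPMENT = "http://www.mycom.com/taxonomy/4543567"

concepts = {
    EQUIPMENT: Concept(EQUIPMENT, "Diagnostic Equipment"),
    # Assumption: Funduscope "has broader" Diagnostic Equipment.
    FUNDUSCOPE: Concept(
        FUNDUSCOPE,
        "Funduscope",
        alt_labels=["Ophthalmoscope"],
        broader=[EQUIPMENT],
    ),
}


def build_label_index(concept_map):
    """Map every preferred and alternative label (lowercased) to its concept URI."""
    index = {}
    for concept in concept_map.values():
        for label in [concept.pref_label, *concept.alt_labels]:
            index[label.lower()] = concept.uri
    return index


label_index = build_label_index(concepts)


def resolve(label):
    """Return the concept URI for a label, or None if the label is unknown."""
    return label_index.get(label.lower())


# Both strings resolve to the same thing.
print(resolve("Ophthalmoscope") == resolve("Funduscope"))
```

Because synonyms share one URI, annotations and search can work on the concept rather than on whichever string happens to appear in a document.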
  • 14. Shadow Concepts: Use co-occurrences between concepts and terms to extract ‘shadow concepts’. Example: “This site is a 15th-century Inca site located 2,430 metres above sea level. It is located in Cusco, Peru. It is situated on a mountain ridge above the Sacred Valley through which the Urubamba River flows. Most archaeologists believe that it was built as an estate for the Inca emperor Pachacuti. Often mistakenly referred to as the "Lost City of the Incas", it is the most familiar icon of Inca civilization. The Incas built the estate around 1450, but abandoned it a century later at the time of the Spanish Conquest.” Concepts shown: Inca site, Machu Picchu, Cusco, Inca empire, Inca emperor, Peru, Spanish Conquest, Sacred Valley, Chankas, Lost City, Pachacuti. In addition to explicitly used concepts and terms, Machu Picchu is extracted from the article as a Shadow Concept. As a prerequisite, one has to provide and analyze a representative text corpus first.
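One way to picture shadow concept extraction is to score each concept that is absent from the text by the co-occurrence weights of the terms that are present. The sketch below assumes such weights were learned from a reference corpus beforehand; the function name, the threshold, and all numeric weights are made up for illustration and do not describe PoolParty's actual algorithm.

```python
def shadow_concepts(text_terms, cooccurrence, threshold=1.0):
    """Rank concepts that are not mentioned in the text but are strongly
    associated with terms that are mentioned."""
    scores = {}
    for concept, related_terms in cooccurrence.items():
        if concept in text_terms:
            continue  # explicitly mentioned, so not a shadow concept
        score = sum(
            weight for term, weight in related_terms.items() if term in text_terms
        )
        if score >= threshold:
            scores[concept] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical co-occurrence weights, as if learned from a reference corpus.
cooccurrence = {
    "Machu Picchu": {
        "Inca site": 0.9,
        "Cusco": 0.7,
        "Sacred Valley": 0.8,
        "Pachacuti": 0.6,
    },
    "Lake Titicaca": {"Peru": 0.5, "Bolivia": 0.9},
}

# Terms and concepts found explicitly in the example text.
text_terms = {"Inca site", "Cusco", "Peru", "Sacred Valley", "Pachacuti", "Spanish Conquest"}

result = shadow_concepts(text_terms, cooccurrence)
print([concept for concept, _ in result])
```

With these toy weights only "Machu Picchu" passes the threshold, which mirrors the slide's example of a concept surfaced without being named in the text.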
  • 15. Corpus Analysis: Use PoolParty for Deep Text Analysis
  • 16. Bionics: How do we learn from a lot of text? Input: “The stove is on. The stove is hot!” Statistical model / co-occurrences → is related: “Bla stove bla bla. Bla bla bla hot.” Taxonomical model → is-a abstractions: “Switched on devices are dangerous devices.” Ontological model → reasoning: “Switched on devices are dangerous, only if the operating temperature is above 100 degrees and the automatic shutdown mechanism is broken.”
  • 17. Graphs + Machine Learning: PoolParty as a supervised learning system. [Diagram components: Content Manager, Integrator, Taxonomist/Ontologist, Thesaurus Server, Extractor, PowerTagging, Index, Corpus Learning/Semantic Analysis, CMS. Edge labels: uses API, is user of, is basis of, annotates, enriches, extends, analyzes, proposes extensions]
  • 18. Knowledge graphs as a result of human-machine cooperation. [Diagram: manually created parts of graph; supervised learning; automatically created parts of graph (corpus analysis, RDF transformation, machine learning, …)]
  • 19. PoolParty Corpus Analysis: How taxonomists can extend taxonomies with some help from machine learning algorithms. Candidate Concepts derived from sample documents can be easily integrated into the taxonomy. A list of possible Candidate Concepts is shown per document or as a list of the most relevant candidates per corpus. The context of a given taxonomy concept can be visualised with a few mouse clicks. Terms, concepts and shadow concepts can be highlighted per document.
  • 20. Network-based Knowledge Graph Assessment: Thesaurus Harmonizer ▸ Find missing relationships between concepts, which are of high semantic relevance ▸ Point out structural flaws in existing thesauri ▸ Identify corpora that only reflect a fraction of a thesaurus ▹ Or, vice versa: identify thesauri that are far too big for their domain applications, and possibly missing details
  • 21. Use Cases: Benefit from Semantic Knowledge Graphs and Machine Learning
  • 22. PoolParty Extractor: Extract concepts from text even if not used explicitly. Some domains use text that doesn’t always call a spade a spade. With ‘shadow concept extraction’ those ‘masked’ concepts can still be surfaced. Example text: “Since these technologies would have become conventional technologies that are made into products and introduced into market at the time of their introduction, it would be difficult to differentiate them as innovative environmental and energy technologies from other global warming prevention technologies that have already been put to practical use in the industrial, commercial, residential, and energy conversion sectors. - The Innovative Global Warming Prevention Technology Working Group under the Research and Development Subcommittee - Council assessed that innovative global warming prevention technologies would bring about a reduction effect of 7.49 million t-CO2 case of average emissions factor for all power sources of carbon dioxide in 2010. In view of the difficulty in putting innovative carbon dioxide sequestration technology into practical use by 2010, the Working Group reassigned it as an issue of global warming prevention technology to be tackled by 2030. The Central Environment Council, however, has not had the opportunity to examine the contents of these technologies in detail. (Promotion of climate change prevention activities by every social actor) - The Programme encourages every social actor to take actions to prevent global warming. The actions include measures undertaken by the public sector.” Extracted concept: Climate Change
  • 23. PoolParty Semantic Classifier: Text Classification based on Machine Learning and Semantic Knowledge Models. PoolParty Semantic Classifier combines machine learning algorithms (SVM, Deep Learning, Naive Bayes, etc.) with Semantic Knowledge Graphs.
  • 24. Benchmarking the PoolParty Semantic Classifier: Improvement of 5.2% compared to traditional (term-based) SVM.
      Features used | Classifier | F1 (5 folds) | Variance
      Terms | LinearSVC | 0.83175 | 0.0008
      Concepts from REEGLE + Shadow Concepts | LinearSVC | 0.84451 | 0.0011
      Concepts from REEGLE | LinearSVC | 0.84647 | 0.0009
      Terms + Concepts from REEGLE + Shadow Concepts | LinearSVC | 0.87474 | 0.0009
      Reegle thesaurus: a comprehensive SKOS taxonomy for the clean energy sector (http://data.reeep.org/thesaurus/guide) ● 3,420 concepts ● 7,280 labels (English version) ● 9,183 relations (broader/narrower + related)
      Document training set: 1,800 documents in 7 classes: Renewable Energy, District Heating Systems, Cogeneration, Energy Efficiency, Energy (general), Climate Protection, Rural Electrification
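The 5.2% figure appears to be the relative F1 gain of the best feature set over the term-only baseline. The short check below recomputes it from the benchmark numbers on this slide; the helper name `relative_gain` is an illustrative assumption.

```python
# F1 scores (5 folds) as reported in the benchmark on this slide.
f1_scores = {
    "Terms": 0.83175,
    "Concepts from REEGLE + Shadow Concepts": 0.84451,
    "Concepts from REEGLE": 0.84647,
    "Terms + Concepts from REEGLE + Shadow Concepts": 0.87474,
}


def relative_gain(score, baseline):
    """Relative improvement of a score over a baseline score."""
    return (score - baseline) / baseline


baseline = f1_scores["Terms"]
for features, f1 in f1_scores.items():
    print(f"{features}: F1={f1:.5f}, gain={relative_gain(f1, baseline):.1%}")

best_gain = relative_gain(max(f1_scores.values()), baseline)
```

Rounded to one decimal place, the best feature set comes out at about 5.2% above the baseline, which matches the slide's headline number.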
  • 25. Sample Calculation: Based on an improvement of 5.2%. [Diagram: Inbound Documents; PoolParty Semantic Classifier; Experienced Agent] ● 100,000 documents (emails, tickets, etc.) per month ● 5 Euros extra costs per document when misrouted ● Cost savings per year: ○ 1,200,000 x €5.0 x 0.052 = €312,000 per annum
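The sample calculation can be reproduced in a few lines. The function and parameter names are illustrative assumptions; the inputs are the figures from the slide, and the calculation follows the slide in treating the 5.2% improvement as the share of documents that are no longer misrouted.

```python
def annual_savings(docs_per_month, cost_per_misrouted_doc, improvement):
    """Yearly savings if `improvement` is the fraction of documents
    that are no longer misrouted."""
    docs_per_year = docs_per_month * 12
    return docs_per_year * cost_per_misrouted_doc * improvement


savings = annual_savings(
    docs_per_month=100_000,      # emails, tickets, etc.
    cost_per_misrouted_doc=5.0,  # euros
    improvement=0.052,           # 5.2%, taken from the benchmark
)
print(f"€{savings:,.0f} per annum")
```

Changing any of the three inputs shows how sensitive the business case is to document volume and misrouting cost.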
  • 26. Use Shadow Concepts to improve Recommender Systems. Example (Mini Countryman): And it’s probably more of a crossover than ever, with the design to match. Being a Mini, the Countryman is clearly meant to be the driver’s car among small crossovers. The suspension is sophisticated, and there are lots of chassis options (a stiffer sports setup, variable damping, the electronically controlled ALL4 all-wheel-drive). But it’s also the crossover for people who’ve bags of cash to blow on personalisation and luxury. There’s been a lot of effort on ramping up the cabin quality, but then the outgoing Countryman was a sad let-down in that department. On the outside, plastic wheel-arch extensions, with eyebrow creases in the metalwork above, as well as roof bars and sill protectors all add to the visual crossover-ness. This remains the only Mini with angular rather than oval headlamps, and there’s a load of visual posturing going on in the lower face. There are eight versions at launch, and they’re exactly what you’d expect. It’s Cooper or Cooper S, each fuelled by petrol or diesel, each of them with front drive or ALL4. Oh and an eight-speed auto, too, if you count that as a separate choice. The Cooper petrol is a three-cylinder, the rest fours. You get extra kit as standard versus the old car, including navigation, Bluetooth, emergency call and park sensors. Upgrades include a bigger touch-screen nav with high-definition traffic, various posher seats, a HUD, and driver aids. Oh and a cushion thingy that folds out from the boot so you can sit on the rear bumper without getting your clothes mucky. In June 2017 a Cooper E will launch, which has the Cooper three-cylinder petrol driving the front wheels, and an electric motor for the rears, with a capacity to do a claimed 25 miles of gentle all-electric running. So it has the performance of a Cooper S ALL4 with the tax-busting advantages of a plug-in hybrid. And you wouldn’t use any fuel if you commuted a short distance. 
The platform is BMW’s contemporary transverse-engined hardware, in the bigger of its two sizes. That means it shares a lot with the BMW X1. The 4WD system is more sophisticated than the previous Countryman’s. The proportion of drive to the rear is computed by a controller that takes into account parameters including grip, steering angle and throttle position, as well as whether you’ve got the sports mode and sports traction systems selected.
  • 27. Use a Knowledge Graph + Co-occurrences for precise Content Recommendation. Example: Find similar episodes. [Diagram: word cloud with Raving, De-Void, Scott, attack, Stilinski, friend, shame, O’Brien, woman, married, girl, attractive, love; caption: Similar episodes!]
  • 28. Rules-based Recommender Systems. Example: Wine-to-Cheese Harmonizer (Live Demo). [Diagram nodes: Weingut Weinrieder Grüner Veltliner Alte Reben; Nagelkaas; an unnamed item marked "?". Characteristics shown: Dry, Medium-bodied, High acidity; Nutmeg, Full-bodied, Warm finish, Tobacco; Cumin, Clove, Hard cheese, Higher fat. Edge labels: is characterized by, matches, does not match]
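A rules-based recommender of this kind can be sketched as items described by characteristics plus explicit pairing rules over those characteristics. Everything below is a hypothetical illustration: the assignment of characteristics to items, the two rules, and the function name are assumptions and are not taken from the live demo.

```python
# Assumed characteristics per item ("is characterized by" edges).
wines = {
    "Grüner Veltliner Alte Reben": {"Dry", "Medium-bodied", "High acidity"},
    "Unnamed wine": {"Nutmeg", "Full-bodied", "Warm finish", "Tobacco"},
}
cheeses = {
    "Nagelkaas": {"Cumin", "Clove", "Hard cheese", "Higher fat"},
}

# Hypothetical rules: (wine characteristic, cheese characteristic, is a match?).
RULES = [
    ("High acidity", "Higher fat", True),
    ("Tobacco", "Clove", False),
]


def pairing_matches(wine_traits, cheese_traits, rules=RULES):
    """A pairing matches if at least one rule applies and no applicable rule
    says 'does not match'."""
    verdicts = [
        is_match
        for wine_trait, cheese_trait, is_match in rules
        if wine_trait in wine_traits and cheese_trait in cheese_traits
    ]
    return bool(verdicts) and all(verdicts)


for wine_name, wine_traits in wines.items():
    verdict = pairing_matches(wine_traits, cheeses["Nagelkaas"])
    print(wine_name, "->", "matches" if verdict else "does not match")
```

Keeping the rules as data rather than code means a domain expert can add or change pairings without touching the matching logic.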
  • 29. Why ‘The Knot’ uses Machine Learning and Semantic Models ▹ XO Group has run ‘The Knot’ since 1996 ▹ NYSE: XOXO (S&P 600 Component) ▹ 1.5 million active members ▹ The Knot has helped marry 25 million couples ▹ Partnering with 300,000 wedding vendors ▹ Millions of vendor reviews
  • 30. Thank you for your interest! Andreas Blumauer, CEO, Semantic Web Company ▸ Mail andreas.blumauer@semantic-web.com ▸ Company https://www.semantic-web.com ▸ LinkedIn https://www.linkedin.com/in/andreasblumauer ▸ Twitter https://twitter.com/semwebcompany ▸ Blog https://www.linkedin.com/today/author/andreasblumauer © Semantic Web Company - http://www.semantic-web.com and http://www.poolparty.biz/