Concepts in Application Contexts: Exploring Conceptual Modeling with Formal Concept Analysis

Steffen Staab 1
Institute for Web Science and Technologies · University of Koblenz-Landau, Germany &
Web and Internet Science Group · ECS · University of Southampton, UK
Concepts in Application Context
( How we may think conceptually )
Steffen Staab
Including joint work with Lukas Schmelzeisen, Martin Leinberger,
Ralf Lämmel, Claudia Schon, Philipp Seifer & further team
Several slides adapted from Lukas & Martin

Steffen Staab 2
Main thrust of this talk
Pre-existing hypotheses
• Formal concept analysis is a useful tool to
– explore natural and artificial languages
– explore our thinking using language
Recent paradigm changes (2018 / 2019)
– Understanding natural language text
– Understanding software code
Open question to you:
• How does / how can FCA help?

Steffen Staab 3
From one forefather of FCA
Charles Sanders Peirce
Four incapacities:
3. No power of thinking without signs. A cognition must be interpreted in a subsequent
cognition in order to be a cognition at all.

Steffen Staab 4
Public domain,
https://commons.wikimedia.org/w/index.php?curid=25670405
Concepts in natural languages

Steffen Staab 5
Example: WordNet
[Figure: excerpt of the WordNet hierarchy below "event". Each node is a synset (a set of synonyms); each edge is a hypernym–hyponym relation. Synsets shown: event · miracle · act, human action, human activity · happening, occurrence, occurrent, natural event · change, alteration, modification · miracle · transition · damage, harm, impairment · increase · leap, jump, saltation · jump, leap · forfeit, forfeiture, sacrifice · action · group action · resistance, opposition · change · transgression · motion, movement, move · demotion · variation · locomotion, travel · descent · run, running · jump, parachuting · dash, sprint]
Example from the WordNet assignment in the Princeton algorithms course COS 226.

Steffen Staab 6
Morning star, most commonly used as a name for the
planet Venus when it appears in the east before
sunrise
The Egyptians knew the morning star as Tioumoutiri
and the evening star as Ouaiti.
Example text
https://en.wikipedia.org/wiki/Morning_Star
https://en.wikipedia.org/wiki/Venus

Steffen Staab 7
Taxonomy learning from text
Subtasks:
• Term Extraction:
Input: text corpus and domain
Output: terms that should become part of the taxonomy

Steffen Staab 8
Subtasks:
• Hypernym Detection:
Input: pair of terms
Output: relationship (hypernym / hyponym / other)

Steffen Staab 9
Subtasks:
• Hypernym Detection:
Input: pair of terms
Output: relationship (hypernym / hyponym / other)
• Taxonomy Construction:
Input: set of hypernym–hyponym pairs
Output: high-quality taxonomy (cycle-free, etc.)
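The taxonomy construction subtask can be sketched as follows. This is a minimal Python sketch, not the method used in the talk: it greedily accepts hypernym–hyponym pairs and rejects any pair that would close a cycle. The function name `build_taxonomy` and the example pairs (loosely based on WordNet terms, with one deliberately wrong pair) are assumptions for illustration.

```python
from collections import defaultdict


def build_taxonomy(pairs):
    """Greedily add (hypernym, hyponym) edges; skip edges that would close a cycle."""
    children = defaultdict(set)

    def reaches(start, goal):
        # Depth-first search along hypernym -> hyponym edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(children.get(node, ()))
        return False

    rejected = []
    for hyper, hypo in pairs:
        # The edge hyper -> hypo closes a cycle iff hyper is already reachable from hypo.
        if reaches(hypo, hyper):
            rejected.append((hyper, hypo))
        else:
            children[hyper].add(hypo)
    return dict(children), rejected


# Illustrative input; the last pair is a hypothetical detection error.
pairs = [
    ("event", "act"),
    ("act", "action"),
    ("event", "happening"),
    ("happening", "change"),
    ("change", "event"),
]
taxonomy, rejected = build_taxonomy(pairs)
```

Rejecting edges in input order is only one possible policy; a real system would more plausibly rank candidate edges by detection confidence before deciding which one to drop.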

Steffen Staab 10
Semiotic triangle
[Figure: semiotic triangle with the corners "the signifier", "the concept" and "the signified". Signifiers: „Venus“ and „morning star“. Description shown: the second planet in our star system.]
Peirce: sign, interpretant, object
Frege: Zeichen (sign), Sinn (sense), Bedeutung (meaning)

Steffen Staab 11
Possible correspondence between FCA and the semiotic triangle
• Concepts are related in a heterarchy / Hasse diagram
• Concept comprises objects (the signified)
• Concept comprises attributes (the signifiers)

Steffen Staab 12
An early approach (Cimiano et al. 2005)
„The museum houses an impressive
collection of medieval and modern art.
The building combines geometric abstraction
with classical references that allude to the
Roman influence on the region.“

Steffen Staab 13

Steffen Staab 14

Steffen Staab 15
Context-free Word Representations
• Word embeddings
– Word2vec, GloVe, fastText, …
– Dominant paradigm until mid-2018
• Map a term $t \in V$ onto a vector in concept space $\mathbb{R}^n$:
$f: V \to \mathbb{R}^n$
• Just one sense per term
[Figure: semiotic triangle with the signifier „morning star“, the concept $C \in \mathbb{R}^n$, and the signified]
Steffen Staab 16
Explorative analysis

Steffen Staab 17
Morning star, most commonly used as a name for the planet Venus when it appears in the east before sunrise
The Egyptians knew the morning star as Tioumoutiri and
the evening star as Ouaiti.
A morning star is any of several medieval club-like
weapons consisting of a shaft with an attached ball
adorned with one or more spikes
The Morning star is normally considered to be a one
handed weapon but it was also a polearm weapon
Example text continued
https://en.wikipedia.org/wiki/Morning_star_(weapon)
http://medieval.stormthecastle.com/armorypages/polearms/morning-star.htm

Steffen Staab 18
What about ambiguity?
Issues
• Lack of accounting for ambiguous terms
• Lack of quality of taxonomy (e.g. non-transitivity)
[Figure: two semiotic triangles with the corners "the signifier", "the concept" and "the signified". Signifiers: „Venus“ and „morning star“. Descriptions shown: the second planet in our star system; a mace, a weapon with spiked ball.]

Steffen Staab 19
Attributes: Planet | 2nd in solar system | Can be used to kill | Used by knights | Has spikes
Venus: x x
Evening star: x x
Morning star: x x x x x
Mace: x x x
Weapon: x x
Concept lattice labels: ⊤ · weapon · Venus, Evening star · Morning star · mace · ⊥
An analysis that is too shallow cannot reveal the wrong place of „morning star“
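How formal concepts arise from such a cross table can be sketched in a few lines of Python. This is a naive enumeration for illustration, not an efficient FCA algorithm. The column positions of the crosses are lost in the table above, so the incidence below is an assumption (in particular the two attributes given to "Weapon"), and the function names are hypothetical.

```python
from itertools import combinations

# Assumed cross table: object -> set of attributes.
CONTEXT = {
    "Venus": {"planet", "2nd in solar system"},
    "Evening star": {"planet", "2nd in solar system"},
    "Morning star": {"planet", "2nd in solar system",
                     "can be used to kill", "used by knights", "has spikes"},
    "Mace": {"can be used to kill", "used by knights", "has spikes"},
    "Weapon": {"can be used to kill", "used by knights"},  # assumption
}
ATTRIBUTES = set().union(*CONTEXT.values())


def common_attributes(objects):
    """Derivation A -> A': the attributes shared by all objects in A."""
    shared = set(ATTRIBUTES)
    for obj in objects:
        shared &= CONTEXT[obj]
    return shared


def objects_having(attributes):
    """Derivation B -> B': the objects that have every attribute in B."""
    return {obj for obj, attrs in CONTEXT.items() if attributes <= attrs}


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    names = sorted(CONTEXT)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset)
            extent = objects_having(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts()
```

Under these assumed crosses, "Morning star" falls into the extent of both the planet concept and the weapon concepts, which is the misplacement the slide points at: the context treats the ambiguous term as a single object.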

Steffen Staab 20
„Steffen went to Universität Koblenz-Landau.“
Interpretation 1: Steffen studied at Universität Koblenz-Landau.
Interpretation 2: Steffen went to the location
where the premises of Universität Koblenz-Landau are.
Interpretation 2a: Steffen went to Koblenz.
Interpretation 2b: Steffen went to Landau.

Steffen Staab 21
From one forefather of FCA
Charles Sanders Peirce
Four incapacities:
3. No power of thinking without signs. A cognition must be interpreted in a subsequent
cognition in order to be a cognition at all.
However: our thinking allows for ambiguity.
Just as our language does.

Steffen Staab 22
Contextualized Word Representations
• Paradigm-changing results:
– 2018: ULMFiT, ELMo, OpenAI GPT, BERT
– 2019: OpenAI GPT-2
• Map a sequence of $l$ terms from vocabulary $V$ onto a sequence of $l$ vectors in concept space $\mathbb{R}^n$:
$f: V^l \to \mathbb{R}^{l \times n}$
• Representation of a term in concept space is relative to all other representations of terms in concept space
In recent evaluations of natural language systems
(QA, SWAG, Entity recog.,...),
proper usage of BERT led to best overall systems
Ongoing
competition!!
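The difference between a context-free mapping of single terms and a contextualized mapping of whole term sequences can be illustrated with a toy Python sketch. It is emphatically not how ELMo, BERT or GPT compute representations: the vectors are made up, and the "contextualization" is just an average with the sequence mean, chosen as an assumption to show the two function signatures.

```python
# Made-up 2-dimensional static vectors; purely illustrative.
STATIC = {
    "morning": [1.0, 0.0],
    "star": [0.0, 1.0],
    "planet": [0.0, 2.0],
    "weapon": [2.0, 0.0],
}


def context_free(tokens):
    """One fixed vector per term, looked up token by token; context is ignored."""
    return [STATIC[t] for t in tokens]


def contextualized(tokens):
    """Toy sequence-level mapping: every vector is mixed with the sequence mean."""
    vectors = context_free(tokens)
    n = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(n)]
    return [[(v[i] + mean[i]) / 2 for i in range(n)] for v in vectors]


astronomy = ["morning", "star", "planet"]
armoury = ["morning", "star", "weapon"]

# Same vector for "star" in both sequences:
static_same = context_free(astronomy)[1] == context_free(armoury)[1]
# Different vectors for "star", because the surrounding terms differ:
contextual_differs = contextualized(astronomy)[1] != contextualized(armoury)[1]
```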

Steffen Staab 23
Contextualized Word Representations vs. Concepts in Application Context
• Map a sequence of $l$ terms from vocabulary $V$ onto a sequence of $l$ vectors in concept space $\mathbb{R}^n$:
$f: V^l \to \mathbb{R}^{l \times n}$
• Is this useful to deal with ambiguity of attributes?
Our (ongoing) objectives
• Derive hierarchically ordered synonym sets
• Allow inference for similarity and inclusion
• Create taxonomy

Steffen Staab 24
Concept Representations
• Input
– term sequence
• Method
– Compute contextualized representations of terms in concept space
– Cluster representations in concept space
• Output: representations for concept clusters
– E.g. mixtures of multivariate Gaussian distributions
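The method above (cluster the contextualized vectors, then summarize each cluster) might look roughly like the following Python sketch. It makes several simplifying assumptions that are not from the talk: plain k-means stands in for the clustering step, each cluster is summarized by a mixture weight, a mean and a per-dimension variance (a diagonal Gaussian rather than a full multivariate one), and the 2-dimensional input vectors are invented.

```python
from statistics import fmean, pvariance


def kmeans(points, k, iterations=10):
    """Plain Lloyd's algorithm; the first k points serve as initial centers."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [fmean(dim) for dim in zip(*members)]
    return labels


def gaussian_summaries(points, labels):
    """Per cluster: mixture weight, mean vector and per-dimension variance."""
    summaries = {}
    for c in set(labels):
        members = [p for p, label in zip(points, labels) if label == c]
        summaries[c] = {
            "weight": len(members) / len(points),
            "mean": [fmean(dim) for dim in zip(*members)],
            "variance": [pvariance(dim) for dim in zip(*members)],
        }
    return summaries


# Invented vectors standing in for occurrences of one ambiguous term.
points = [(0.0, 1.0), (5.0, 0.0), (0.2, 1.2), (5.2, 0.2), (0.1, 0.8)]
labels = kmeans(points, k=2)
summaries = gaussian_summaries(points, labels)
```

The per-cluster mean is the "average of concept vectors", and the per-cluster variance is the quantity one would compare between hyponyms and hypernyms.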

Steffen Staab 25
Explorative Analysis

Steffen Staab 26
Preliminary observations
• Averages of concept vectors are good representations
• Concept vectors of hyponyms show less variance than those of hypernyms
• Ambiguous words
– Individual concept representations are close to concept representations of similar terms

Steffen Staab 27
Intermediate summary
• Recent deep learning/numerical methods
– hugely successful for addressing a range of natural language processing tasks
– because they are good at representing similarity/analogy
– because they deal with ambiguity
• But: it is hard to understand what they do
• FCA working with application context might be useful
– Explicit
– Explaining
Let's use FCA for analysing vocabulary

Steffen Staab 28
Concepts in software languages
Public Domain,
https://commons.wikimedia.org/w/index.php?curid=358991

Steffen Staab 29
Scenario
[Figure: Data Base 1, Data Base 2 and Data Base 3, accessed by App A, App B and App B]

Steffen Staab 30
Example: How old are these students?
Query for all students, access age
Query fails during evaluation
let students = query { SELECT ?x WHERE { ?x a Student. } }
for student in students do
    printfn "%A" (student.age)
[Figure: example RDF graph. Nodes: alice, bob, 𝑏1; classes: Person, Student, University. Edge labels: type, subClass, studiesAt, name, age, matrNr. Literals: name "Alice" and age 25; name "Bob" and matrNr 211...]
Steffen Staab 31
Example: How old are these students?
Should we use this relation on this signifier?
Depends on application contexts:
1. Conceptualization of data source
2. Query of data source
3. Software code
[Figure: example RDF graph with alice, bob and 𝑏1, as on slide 30]

Steffen Staab 32
1 Conceptualization of data source
• alice and bob are Persons
– Implies having a name
• bob is a Student
– Implies being a Person and having a place to studyAt
• No restrictions with respect to age or matrNr
[Figure: example RDF graph with alice, bob and 𝑏1, as on slide 30]

Steffen Staab 33
SHACL: Shapes Constraint Language
SHACL shapes are integrity constraints
(Namespaces omitted for brevity)
:StudentShape a :NodeShape;
    :targetClass :Student;
    :class :Person;
    :property [
        :path :studiesAt;
        :minCount 1;
        :class :University;
    ].
:PersonShape a :NodeShape;
    :targetClass :Person;
    :property [
        :path :name;
        :minCount 1;
        :datatype xsd:string;
    ].
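What these two shapes demand of a concrete graph can be sketched in Python. This is a hand-written check of just the constraints shown above (target class, class, minCount, datatype), not a SHACL implementation and not a use of any SHACL library. The dictionary encoding of graph and shapes is an assumption, and class membership is listed explicitly instead of being derived from subClass edges.

```python
# Hypothetical encoding of the example graph: node -> property -> list of values.
GRAPH = {
    "alice": {"type": ["Person"], "name": ["Alice"], "age": [25]},
    "bob": {"type": ["Student", "Person"], "name": ["Bob"],
            "matrNr": ["211..."], "studiesAt": ["b1"]},
    "b1": {"type": ["University"]},
}

# Hypothetical encoding of the two shapes from the slide.
SHAPES = {
    "PersonShape": {
        "targetClass": "Person",
        "properties": [{"path": "name", "minCount": 1, "datatype": str}],
    },
    "StudentShape": {
        "targetClass": "Student",
        "class": "Person",
        "properties": [{"path": "studiesAt", "minCount": 1, "class": "University"}],
    },
}


def has_class(node, cls):
    return cls in GRAPH.get(node, {}).get("type", [])


def violations(shape_name):
    """Return (node, reason) pairs for every target node that breaks the shape."""
    shape = SHAPES[shape_name]
    found = []
    for node in GRAPH:
        if not has_class(node, shape["targetClass"]):
            continue  # not a target node of this shape
        if "class" in shape and not has_class(node, shape["class"]):
            found.append((node, "class " + shape["class"]))
        for prop in shape["properties"]:
            values = GRAPH[node].get(prop["path"], [])
            if len(values) < prop.get("minCount", 0):
                found.append((node, "minCount " + prop["path"]))
            if "class" in prop and not all(has_class(v, prop["class"]) for v in values):
                found.append((node, "class of " + prop["path"]))
            if "datatype" in prop and not all(isinstance(v, prop["datatype"]) for v in values):
                found.append((node, "datatype of " + prop["path"]))
    return found
```

Such a check only validates one given graph. Type checking program code needs the stronger statement that a property is guaranteed in every graph that conforms to the shapes.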

Steffen Staab 34
Type checking discovers (potential) run-time errors
• Types interpreted as sets of values
• Based on tests for subsets
[Figure: the set of all students (StudentShape), with one value of the StudentShape set marked]
Not allowed since StudentShape ⊈ ≥1 age.⊤
when considering all conceptually possible RDF graphs

Steffen Staab 35
• Access: matrNr
• No error during evaluation
• Unsafe: Rejected by type checking,
conceptualization not guaranteed
printfn "%A" (student.matrNr)
[Figure: example RDF graph with alice, bob and 𝑏1, as on slide 30]

Steffen Staab 36
2 Query context
• Query for: matrNr
• Type safe access:
matrNr inferred to be given for all values of student
let students = query { SELECT ?x WHERE { ?x matrNr ?y. } }
for student in students do
    printfn "%A" (student.matrNr)
[Figure: example RDF graph with alice, bob and 𝑏1, as on slide 30]

Steffen Staab 37
3 Type safety in one code context
• Accessing studiesAt relation
• Accepted as type safe
printfn "%A" (student.studiesAt)
[Figure: example RDF graph with alice, bob and 𝑏1, as on slide 30]

Steffen Staab 38
3 Lacking type safety in another code context
• Accessing studiesAt relation
• Not type safe for type Person
printfn "%A" (((Person)student).studiesAt)
[Figure: example RDF graph with alice, bob and 𝑏1, as on slide 30]

Steffen Staab 39
Determine type safety in context
1. Use available SHACL constraints
2. Infer additional SHACL constraints from queries
3. Type check using inference
printfn "%A" (student.name)
[Figure: query shape (2), including StudentShape (1), with one value of the StudentShape set marked]
Inference (3): StudentShape ⊆ PersonShape and PersonShape ⊆ ≥1 name.⊤ in all possible graphs
Steffen Staab 40
Inference for type checking
Abstraction: $\lambda x{:}T.\ t$ is a function
• $x$ is a variable of type $T$, $t$ is the body
• Type $T$ constitutes a set of values (domain of the function)
Application: $(\lambda x{:}T_1.\ t_1)\ t_2$
• The question then is: what type $T_2$ does $t_2$ have (to what values can $t_2$ evaluate)?
• Can the function be applied to all possible evaluation results (is $T_2 \subseteq T_1$ true)?
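The subset test can be written as a typing rule for application. This is the usual shape of such a rule under subsumption, given here as a sketch; the typing context $\Gamma$ and the result type $T$ are assumed notation that does not appear on the slide.

```latex
\[
\frac{\Gamma,\, x : T_1 \vdash t_1 : T
      \qquad \Gamma \vdash t_2 : T_2
      \qquad T_2 \subseteq T_1}
     {\Gamma \vdash (\lambda x{:}T_1.\ t_1)\ t_2 : T}
\]
```

For the running example, $T_2$ would be the shape inferred for `student` and $T_1$ the set of values that have at least one value for the accessed property, so the premise $T_2 \subseteq T_1$ is exactly the shape inclusion checked by inference.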

Steffen Staab 41
Intermediate summary
Application contexts
1. Conceptualization
2. Queries
3. Code
Open Questions
• Inference
– Sound
– Complete
– Efficient
• More polymorphism
Data integration leads to ambiguity

Steffen Staab 43
Main thrust of this talk
Pre-existing hypotheses
• Formal concept analysis is a useful tool to
– explore natural and artificial languages
– explore our thinking using language
Recent paradigm changes (2018 / 2019)
– Understanding natural language text
– Understanding software code
Open question to you:
• How does / how can FCA help?

Steffen Staab 44
Open questions
• Representation of ambiguities
– Representation of underspecification (as in Computational Linguistics)
– Representation of alternatives
• Sound and complete inference for subsumption between SHACL shapes
• Ambiguities → polymorphisms and more type inference
Claim
Talking and thinking with ambiguities is efficient and effective!
How to support it?

Steffen Staab 45
Thank you very much!

Steffen Staab 46
References
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT (1) 2019: 4171-4186.
Cimiano, P., Hotho, A., Staab, S.: Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis. Journal of Artificial Intelligence Research (JAIR) 24: 305-339, 2005.
Leinberger, M., Seifer, P., Schon, C., Lämmel, R., Staab, S.: Type Checking Program Code using SHACL. In: Proc. of ISWC 2019. New Zealand, 2019.
Seifer, P., Leinberger, M., Lämmel, R., Staab, S.: Semantic query integration with reason. The Art, Science, and Engineering of Programming 3(3) (2019). https://doi.org/10.22152/programming-journal.org/2019/3/13
