Deep Misconceptions and the Myth
of Data-Driven Language Understanding
On Putting Logical Semantics Back to Work
IMMANUEL KANT
Every thing in nature, in the inanimate
as well as in the animate world, happens
according to some rules, though we do
not always know them
I reject the contention that an important
theoretical difference exists between formal
and natural languages
RICHARD MONTAGUE
One can assume a theory of the world
that is isomorphic to the way we talk
about it… in this case, semantics
becomes very nearly trivial
JERRY HOBBS
Early efforts to find theoretically elegant formal models for various linguistic phenomena did not
result in any noticeable progress, despite nearly three decades of intensive research (late 1950s
through the late 1980s). As the various formal (and in most cases mere symbol-manipulation)
systems seemed to reach a deadlock, disillusionment in the brittle logical approach to language
processing grew, and a number of researchers and practitioners in natural language
processing (NLP) started to abandon theoretical elegance in favor of attaining some quick results
using empirical (data-driven) approaches.
All seemed natural and expected. In the absence of theoretically elegant models that can explain
a number of NL phenomena, it was quite reasonable to find researchers shifting their efforts to
finding practical solutions for urgent problems using empirical methods. By the mid-1990s, a
data-driven statistical revolution that was already brewing took the field of NLP by storm,
putting aside all efforts that were rooted in over 200 years of work in logic, metaphysics,
grammars and formal semantics.
We believe, however, that this trend has overstepped the noble cause of using empirical
methods to find reasonably working solutions for practical problems. In fact, the data-driven
approach to NLP is now believed by many to be a plausible approach to building systems that
can truly understand ordinary spoken language. This is not only a misguided trend, but is a very
damaging development that will hinder significant progress in the field. In this regard, we hope
this study will help start a sane, and an overdue, semantic (counter) revolution.
Copyright © 2017 WALID S. SABA
a spectre is haunting NLP
February 7, 2017
about the resurgence
of and the currently dominant
paradigm in ‘AI’ …
the availability of huge amounts of data, coupled with
advances in computer hardware and distributed
computing, resulted in some advances in certain types
of (data-centric) problems (image, speech, fraud
detection, text categorization, etc.)
But …
many problems in AI require
understanding that is beyond
discovering patterns in data
Identifying an adult female in an image is a data-centric
problem that might be suitable for data-driven image
recognition systems
However, inferring which of the two is a photo of a
teacher and which is a mother requires information that
is not (always!) in the data
which picture would a
data-driven image
recognition system pick out
for a query like
'musical band'?
Musician?
a person who plays a
musical instrument?
So what
is at issue
here?
The issue here is that, ontologically, there
are no musicians, teachers, lawyers, or
even mothers! What exists, ontologically
(metaphysically), are humans, and a
concept such as ‘musician’ is a logical
concept that might be true of a certain
human
Quantitative/data-driven approaches can
only reason with (detect, infer, recognize)
objects that are of an ontological type, but
they cannot detect logical concepts, which
form the majority of the objects of (human)
thought
ONTOLOGICAL CONCEPTS
human
...

LOGICAL CONCEPTS
lawyer
dancer
teacher
mother
...
failure to distinguish between logical and
ontological concepts is not only a flaw in
data-driven approaches
logical/formal semantics also failed to
provide adequate models for natural
language and for exactly the same reason
Notwithstanding achievements in data-centric tasks (e.g., image and speech recognition, or
numerically specifiable and finite-space problems, such as the game Go), statistical and other
data-driven models (e.g., neural networks) cannot model human language comprehension
because these models cannot explain, model or account for very important phenomena in
ordinary spoken language, such as:
• Non-Observable (thus Non-Learnable) Information
• Intensionality and Compositionality
• Inferential Capacity
Criticisms of the statistical data-driven approach to language understanding are very often
automatically associated with the Chomskyan school of linguistics. At best, this is a misinformed
judgement (although in many cases, it is ill-informed). There is a long history of work in logical
semantics (a tradition that forms the background to the proposals we will make here) that has
very little to do (if anything at all) with Chomskyan linguistics.
Notwithstanding Chomsky's (in our opinion valid) Poverty of the Stimulus (POS) argument (an
argument that clearly supports the claim of some kind of innate linguistic abilities), we believe
that Chomskyans put too much emphasis on syntax and grammar (which ironically made their
theory vulnerable to criticism from the statistical and data-driven school). Instead, we think that
syntax and grammar are just the external artifacts used to express internal, logically coherent,
semantic, and compositionally and productively (i.e., recursively) constructed thoughts,
something that is perhaps analogous to Jerry Fodor's Language of Thought (LOT).
Here we should also mention that we agree somewhat with M. C. Corballis ('The Recursive
Mind') that it is thought that brought about the external tool we call language, and not the other
way around.
what this study is not about
Another association that criticism of the statistical and data-driven approaches to NLU often
conjures up is that of building large knowledge bases with brittle rule-based inference engines.
This is perhaps the biggest misunderstanding, held not only by many in the statistical and data-
driven camp, but also by previously over-enthusiastic knowledge engineers who mistakenly
believed at one point that all that was required to crack the NLU problem was to keep adding
more knowledge and more rules. We also do not subscribe to such theories.
In fact, regarding the above, we agree with an observation once made by the late John McCarthy
(at IJCAI 1995) that building ad-hoc systems by simply adding more knowledge and more rules
will result in building systems that we don't even understand. Ockham's Razor, as well as
observing the linguistic skills of 5-year-olds, should both tell us that the conceptual structures that
might be needed in language understanding should not, in principle, require all that
complexity.
As will become apparent later in this study, the conceptual structures that speakers of ordinary
spoken language have access to are not as massive and overwhelming as is commonly believed.
Instead, it will be shown that the key is in the nature of that conceptual structure and the
computational processes involved.
what this study is not about
FINALLY, our concern here is in introducing a plausible model for natural language understanding
(NLU). If your concern is natural language processing (NLP), as it is used, for example, in
applications such as these
word-sense disambiguation (WSD);
entity extraction/named-entity recognition (NER);
spam filtering, categorization, classification;
semantic/topic-based search;
word co-occurrence/concept clustering;
sentiment analysis;
topic identification;
automated tagging;
document clustering;
summarization;
etc.
then it is best if we part ways at this point, since this is not at all our concern here. There are many
NLP and text processing systems that already do a reasonable job on such data-level tasks. In fact, I
am part of a team that developed a semantic technology that does an excellent job on almost all of
the above, but that system (and similar systems) are light years away from doing anything remotely
related to what can be called natural language understanding (NLU), which is our concern here.
what this study is not about
1. WE WILL ARGUE THAT purely data-driven extensional models that ignore
intensionality, compositionality and inferential capacities in natural language are
inappropriate, even when the relevant data is available, since higher-level reasoning (the
kind that's needed in NLU) requires intensional reasoning beyond simple data values.

2. WE WILL ARGUE THAT many language phenomena are not learnable from data because
(i) in most situations what is to be learned is not even observable in the data (or is not
explicitly stated but is implicitly assumed as 'shared knowledge' by a language
community); or (ii) in many situations there is no statistical significance in the data, as the
relevant probabilities are all equal.

3. WE WILL ARGUE THAT the most plausible explanation for a number of
phenomena in natural language is rooted in logical semantics, ontology, and the
computational notions of polymorphism, type unification, and type casting; and we
will do this by proposing solutions to a number of challenging and well-known
problems in language understanding.
what this study is about
We will propose a plausible model rooted in logical semantics, ontology, and the computational notions
of polymorphism, type casting and type unification. Our proposal provides a plausible framework for
modelling various phenomena in natural language, specifically phenomena that require reasoning
beyond the surface structure (external data). To give a hint of the kind of reasoning we have in mind,
consider the following sentences:
(1) a. Jon enjoyed the movie
b. Jon enjoyed watching the movie
(2) a. A small leather suitcase was found unattended
b. A leather small suitcase was found unattended
(3) a. The ham sandwich wants another beer
b. The person eating the ham sandwich wants another beer
(4) a. Dr. Spok told Jon he should soon be done with writing the thesis
b. Dr. Spok told Jon he should soon be done with reading the thesis
Our model will explain why (1a) is understood by all speakers
of ordinary language as (1b); why speakers of multiple languages find (2a) more natural to say than (2b);
why we all understand (3a) as (3b); and why we effortlessly resolve 'he' in (4a) with Jon and 'he' in (4b)
with Dr. Spok. Before we do so, however, we will discuss some serious flaws in proposing a statistical
and data-driven approach to NLU.
more specifically ...
understanding language by
analyzing data?
what if the relevant information
is not even in the data?
Challenges in the computational comprehension of ordinary text are often due to quite a bit of missing text,
text which is not explicitly stated but is often assumed as shared knowledge among a community of
language users. Consider for example the sentences in (1):
(1) a. Don’t worry, Simon is a rock.
b. The truck in front of us is annoying me.
c. Carlos likes to play bridge.
d. Mary enjoyed the apple pie.
e. Jon owns a house on every street in the village.
Clearly, speakers of ordinary English understand the above as
(2) a. Don’t worry, Simon is [as solid as] a rock.
b. The [person driving the] truck in front of us is annoying me.
c. Carlos likes to play [the game] bridge.
d. Mary enjoyed [eating] the apple pie.
e. Jon owns a [different] house on every street in the village.
Since such sentences are quite common and are not at all exotic, farfetched, or contrived, any model for
NLU must clearly somehow 'uncover' this [missing text] for a proper understanding of what is being said.
What is certain here is that data-driven approaches are helpless in this regard, since a crucial part of
understanding NL text is not only interpreting the data, but 'discovering' what is missing from the data.
analyzing missing text?
it's not even in the data
Again, let us consider the sentences below, where there is some [missing text] that is not explicitly
stated in everyday discourse, but is often implicitly assumed:
a. Don’t worry, Simon is [as solid as] a rock.
b. The [person driving the] truck in front of us is annoying me.
c. Carlos likes to play [the game] bridge.
d. Mary enjoyed [eating] the apple pie.
e. Jon owns a [different] house on every street in the village.
Although the above seem to have a common denominator, namely some missing text that is often
implicitly assumed, it is somewhat surprising that in looking at the literature one finds that the
missing-text phenomenon has been studied quite independently and under different labels such
as:
metaphor (a),
metonymy (b),
lexical ambiguity (c),
ellipsis (d),
quantifier scope ambiguity (e)
it's not even in the data
analyzing missing text?
In ordinary spoken
language there’s more
than missing (and
implicitly assumed)
text …
When surface data
probabilities are all equally
likely, we often resort to our
shared (commonsense)
knowledge in resolving
certain types of ambiguities
(e.g., in reference resolution)
One of the most obvious challenges to statistical and data-driven NLU arises in situations where there does
not seem to be any statistical significance in the observed data that can help in making the right
inferences. As an example, consider the sentences in (1) and (2).
(1) The trophy did not fit in the brown suitcase because it was too
a. big
b. small
(2) Dr. Spok told Jon that he should soon be done
a. writing his thesis
b. reading his thesis
For a speaker of ordinary language, the decision as to what 'it' in (1) and 'he' in (2) refer to is
immediately obvious, even for a 5-year-old. On the other hand, a statistical data-driven approach
would be helpless in making such decisions, since the only differences between the sentence pairs in (1)
and (2) are words that co-occur with equal probabilities (this is so because antonyms, or opposites, such
as big/small, night/day, hot/cold, read/write, open/close, etc., have been shown to co-occur in text with
equal frequency). Clearly, then, references such as those in (1) and (2) must be resolved using information
that is not (directly) in the data.
probabilities are all equal
it's not even in the data
In the absence of any statistical significance in the data, we have suggested above that references such
as those in sentences (1) and (2) are resolved by relying on other information that is not (directly) in
the data.
It might still be suggested, however, that a learning algorithm can create statistical significance
between (1a) and (1b), for example, if probabilities of some composites in the sentence (as opposed to
the atomic units) are considered. What this would essentially require is creating a composite feature
for every possible relation. In (1), we would need at least the following:
trophy-fit-in-suitcase-small
trophy-fit-in-suitcase-big
trophy-not-fit-in-suitcase-small
trophy-not-fit-in-suitcase-big
Note here that since data-driven approaches also do not admit the existence of a type hierarchy (or any
knowledge structure, for that matter), i.e., there is nothing that says that a Trophy and a Radio are both
subtypes of an Artifact, and that Purse and Suitcase are both subtypes of some Container, where
the 'fit' relation applies similarly to both, other features (e.g., radio-fit-in-purse-small) would also be
needed to learn how to resolve the reference 'it' in (1).
probabilities are all equal
it's not even in the data
Again, in the absence of a type-hierarchy (or some other source of information) statistical
significance can only be salvaged if composite features are constructed for every possible relation in
a meaningful sentence. Such a story leads us to something like this:
trophy-fit-in-suitcase-small
trophy-fit-in-suitcase-big
trophy-not-fit-in-suitcase-small
trophy-not-fit-in-suitcase-big
radio-fit-in-purse-small
radio-fit-in-purse-big
radio-not-fit-in-purse-small
radio-not-fit-in-purse-big
etc.
Although the point can be made with the above, the story in reality is much worse, as there are
more 'nodes' that must be combined in these features to capture statistical significance. For
example, if 'because' were changed to 'although' in (1b) then 'it' would suddenly refer to the trophy.
Nevertheless, the question now is how many such features would eventually be needed, if every
meaningful sentence requires a handful of composite features to capture all statistical correlations?
Fodor and Pylyshyn (1988) hint that the number is on the order of the number of seconds in
the history of the universe, citing an experiment conducted by the psycholinguist George Miller.
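The multiplicative blow-up can be made concrete with a small sketch. The feature names below are illustrative (they are not from any real learner); the point is only that, without a type hierarchy, one composite feature is needed per combination of object, container, polarity, size adjective, and connective, so the feature count is a product, not a sum:

```python
from itertools import product

# Hypothetical vocabulary for Winograd-style sentences such as
# "The trophy did not fit in the suitcase because it was too big."
objects = ["trophy", "radio", "laptop"]
containers = ["suitcase", "purse", "box"]
polarity = ["fit", "not-fit"]
sizes = ["big", "small"]
connectives = ["because", "although"]  # "although" flips the referent

# One composite feature per combination: the counts multiply.
features = ["-".join(combo)
            for combo in product(objects, polarity, containers, sizes, connectives)]

print(len(features))  # 3 * 2 * 3 * 2 * 2 = 72, for just three nouns of each kind
print(features[0])    # trophy-fit-suitcase-big-because
```

With a type hierarchy, by contrast, a single constraint at the level of Artifact and Container would cover every object/container pair at once.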
probabilities are all equal
it's not even in the data
Incidentally, in the absence of any external knowledge structures, the combinatorially implausible
explosion in the number of features needed by a statistical data-driven (i.e., bottom-up) learner
would also be needed by a top-down learner, one that learns by being told (or by instruction).
Specifically, a top-down learner would ask for some number n of clarifications for every sentence, requiring
therefore a total of n^m clarifications for a paragraph with m sentences. The reader can now easily
work out how many clarifications would be required for a top-down learner to understand just a
small paragraph1.
The point here is that whether the learner tries to discover what is missing bottom-up (from the
data) or top-down (by being told), the infinity lurking
in language (due to the recursive productivity of thoughts) makes learning various language
phenomena from data alone computationally implausible.
a top-down explanation
1The reason a top-down learner would need (n × n), as opposed to (n + n), clarifications for two consecutive
sentences where each requires n is that the preferred reading of one sentence is subject to revision in the context of
the previous and/or the following sentence. This is so because, linguistically, it is paragraphs, not sentences,
that are the smallest linguistic units that can be fully interpreted on their own and should not (in theory) require
any additional text to be fully understood. See The Semantics of Paragraphs (Zadrozny & Jenssen, 19xx) for an
excellent treatment of the subject.
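A toy calculation makes the footnote's point plain. The numbers below are purely illustrative: n clarifications per sentence, m sentences per paragraph. If sentences could be interpreted independently the costs would add; because interpretations must be revised jointly across the paragraph, they multiply:

```python
n = 3  # clarifications needed per sentence (illustrative)
m = 5  # sentences in the paragraph (illustrative)

additive = n * m         # cost if sentences were interpretable independently
multiplicative = n ** m  # cost when readings are revised jointly across the paragraph

print(additive)        # 15
print(multiplicative)  # 243
```

Even for these tiny numbers the joint cost is an order of magnitude larger, and it grows exponentially with paragraph length.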
it's not even in the data
Our argument against statistical data-driven approaches to NLU is not meant to dismiss the role of
statistical/probabilistic reasoning in language understanding. That would, of course, be unwise. Our
argument is about which probabilities are relevant in language understanding. Consider, for example,
the following:
(1) The town councilors refused to give the demonstrators a
permit because they advocated violence and anarchy.
(2) A young teenager fired several shots at a policeman.
Eyewitnesses say he immediately fled away.
While the most likely reading for (1) has 'they' referring to the demonstrators, one can imagine a
scenario where a group of anarchist town councilors refused to give the demonstrators a permit
specifically to incite violence and anarchy. Similarly, while the most likely reading for (2) is the one
where 'he' refers to the young teenager, one can imagine a scenario where a slightly wounded
policeman fled away to escape further injuries.
Obviously such occurrences are rare, and thus, in the absence of other information, the pragmatic
probability of the usual reading wins out with speakers of ordinary language. What is important to
note here is that the likelihoods we are speaking of are a function of pragmatics and have nothing to
do with anything observed in the data.
pragmatic probabilities
it's not even in the data
To summarize this argument, consider the table below. At the data level references can be resolved
during syntactic analysis using simple NUMBER or GENDER data. At the information level,
the resolution would require semantic (type) information, for example that corporations, and not
lawsuits, settle a case out of court. Note also that at this level the possibilities are not all available, once
the type constraints are applied. It is exactly at the pragmatic level where probabilistic/statistical
reasoning factors in, since at this level the referents are all possible, yet some are more probable than
others (e.g., it is more likely that the one who fell down is the one who was shot, etc.)
pragmatic probabilities
REFERENCES RESOLVED BY SYNTAX (data level)
John informed Mary that he passed the exam.
John told Steve and Diane that they were invited to the party.

REFERENCES RESOLVED BY SEMANTICS (information level)
There are a number of lawsuits between Apple and Samsung, and
a. both say they are more about values than patents and money.
b. both say they are ready to settle out of court.

REFERENCES RESOLVED BY PRAGMATICS (knowledge level)
A young teenager fired several shots at a policeman.
Eyewitnesses say he immediately fled away.

REFERENCES CANNOT BE RESOLVED (intentional level: intention not clear)
John told Bill that he has been nominated to head the committee.
it's not even in the data
Perhaps chief among the "it's not even in the data" phenomena is that of Adjective-Ordering Restrictions
(AORs), a phenomenon that can be explained by the examples below:
(1) a. Carlos is a polite young man
b. #Carlos is a young polite man
(2) a. A small brown suitcase was found unattended
b. #A brown small suitcase was found unattended
The readings in (1a) and (2a) are clearly preferred by speakers of ordinary spoken language over the
readings in (1b) and (2b), although there are no rules that speakers of ordinary language seem to be
following. What makes the AOR phenomenon even more intriguing is the fact that these preferences
are also consistently made across multiple languages.
First of all, this phenomenon presents a paradigmatic challenge to the statistical and data-driven story
about language learning, as it does not seem that speakers come to have these preferences by observing
and analyzing data. Furthermore, there does not seem to be a pattern in the observed data suggesting
which adjectives should precede or follow other adjectives. For example, while it
is preferred that 'small' precede 'brown' in (2), in (3) 'small' is
no longer preferred as the first adjective:
(3) A beautiful small suitcase was found unattended
innate preferences?
it's not even in the data
The most crucial challenge to data-driven NLU as it relates to adjective-ordering restrictions is to
explain how beautiful in (4a) could be describing Olga's dancing as well as Olga as a person, while this
reading is not available in (4b):
(4) a. Olga is a tall beautiful dancer
b. Olga is a beautiful tall dancer
We will see later why beautiful in (4b) can no longer modify Olga's dancing (an abstract entity of
type Activity) after it was polymorphically cast into describing a physical object. For now we want to
note, however, that while various investigations on large corpora have not yielded any plausible
explanation as to what seems to govern these adjective-ordering restrictions, we argue that even if
some patterns were to be discovered, the more important question is 'what is behind this phenomenon,
i.e., what is it that makes us have these ordering preferences, and across multiple languages'?
In our opinion, what is behind this phenomenon must be much deeper than the outside (observable)
data of any language. In fact, we believe that a plausible account for this phenomenon must shed some
light on the conceptual structures and the processes that are operating in language. As stated above, a
plausible explanation for this puzzle, one that is rooted in ontology, polymorphism, type unification
and type casting, will be suggested later in this study.
innate preferences?
it's not even in the data
We have thus far argued that in the absence of some process or other source of information, a number of
phenomena in natural language understanding cannot be observed, captured, or learned by simply
analyzing the external linguistic data alone. From adjective-ordering restrictions, which seem to be
not only data-independent but even language-independent, to the missing (not explicitly stated) text that
must somehow be discovered and interpreted, to situations where probabilities in the data are statistically
insignificant, it is clear that data-driven approaches to NLU are inappropriate.
Before we get into our proposals, however, we will next have a small discussion about intensions and
how data alone, even if available, is not enough in high-level reasoning, the kind that is needed in NLU.
no matter how big,
data is (in the end) just data

extensions and
intensions
What do we mean when we write an equality like this?

(1) (A ∧ (B ∨ C)) = (A ∧ B) ∨ (A ∧ C)

Clearly, as objects (e.g., as logical circuits) the expressions in (1) are not the same. For example, a logical
circuit corresponding to the expression on the left-hand side has only two gates, while a circuit for the
other expression would have three, as shown below.
It would seem then that at some level, equality in data only is not enough and saying two objects are the
same is different from saying they are equal (in their data value). In some contexts, as will be seen
shortly, these differences are crucial. What is crucial to our discussion here is that data-driven approaches
deal with data only, that is, equality in that paradigm is equality of one attribute, namely the final value.
Thus, if it does turn out that equality of data alone is not enough in high-level reasoning (e.g., in NLU),
then data-driven approaches to NLU would also (or, again) clearly be inappropriate.
Let us therefore take a closer look at the equality most of us know, and the related notions of intensions
and extensions, notions that some of the most penetrating minds in mathematical logic have studied for
nearly two centuries.
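The two-gates-versus-three-gates point can be sketched in a few lines. This is an illustrative sketch, not from the study: the two Boolean expressions agree on every input (they are extensionally equal), yet their syntax trees, encoded here as nested tuples, contain different numbers of operators:

```python
from itertools import product

def lhs(a, b, c):
    # A AND (B OR C): a circuit with two gates
    return a and (b or c)

def rhs(a, b, c):
    # (A AND B) OR (A AND C): a circuit with three gates
    return (a and b) or (a and c)

# Extensional equality: identical input-output behaviour on all 8 inputs
assert all(lhs(*v) == rhs(*v) for v in product([False, True], repeat=3))

# Intensional difference: the syntax trees have different gate counts
lhs_tree = ("and", "A", ("or", "B", "C"))
rhs_tree = ("or", ("and", "A", "B"), ("and", "A", "C"))

def gates(tree):
    """Count operators in a nested-tuple expression tree."""
    if isinstance(tree, str):
        return 0
    return 1 + sum(gates(child) for child in tree[1:])

print(gates(lhs_tree), gates(rhs_tree))  # 2 3
```

Equality of the final value, in other words, is equality in one attribute only; the objects themselves remain distinct.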
data and intensions
Our grade school teachers once
told us that

√256 = 16

Can we always equate and replace the data value
16 by the data value √256? Let's see …
data and intensions
here's a snapshot of
some reality:

Mary taught
her little brother
that 7 + 9 = 16

Now if we blindly follow what our grade school teachers told us, namely that √256 = 16, we
should be able to replace 16 by √256 without any problem. But if we do that we would then be
able to alter reality and come up with

Mary taught her little brother that 7 + 9 = √256

What happened? Were we taught the wrong thing when we were told that √256 = 16? Not
exactly, but we were also not told the whole story. I guess our grade school teachers did not
know we would end up working in AI and NLU. If they did, they would have told us that
extensional (data-only) equality is not sufficient in high-level reasoning, and if equated with
sameness at that level it can easily lead to false conclusions.
data and intensions
The four objects below are in fact equal, including √256 and 16, but in regard to one attribute only,
namely their data value. As objects, however, they are not the same, as they differ in many other
attributes, for example in the number of operators and the number of operands. Note, however, that
the attributes value, no-of-operators, and no-of-operands are still not enough to establish true intensional
equality between these objects, as demonstrated by the objects (a) and (b). At a minimum, true
(intensional) equality between these objects would require the equality of at least four attributes:
value, no-of-operators, no-of-operands, and syntax-tree.
equality and sameness
(a) (b)
In many domains where the only relevant attribute is the data (value), working with extensional
(data) equality only might be enough. In tasks that require high-level reasoning, such as NLU,
however, this will lead to contradictions and false conclusions, as the example of Mary and her
little brother clearly demonstrates.
data and intensions
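The four attributes can be sketched directly as a small record type. This is a hypothetical illustration (the attribute names follow the slide; the objects "7 + 9" and "16" are the ones from the Mary example): two expressions agree on value, so they are extensionally equal, yet comparing all attributes shows they are not the same object:

```python
from dataclasses import dataclass

@dataclass
class Expr:
    """An expression object with the four attributes named on the slide."""
    value: int           # extensional attribute: the data value
    n_operators: int
    n_operands: int
    syntax_tree: tuple

a = Expr(16, 1, 2, ("+", 7, 9))  # the object "7 + 9"
b = Expr(16, 0, 1, (16,))        # the object "16"

print(a.value == b.value)  # True: extensionally equal (same data value)
print(a == b)              # False: not the same object intensionally
```

The dataclass-generated `==` compares all four attributes at once, which is exactly the stronger, intensional notion of equality the slide calls for.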
As an aside …
Reducing equality of objects to equality of one extensional attribute, namely the
data value, is what is behind the so-called adversarial examples in deep neural
networks, where small perturbations in the image (of a kind that would not lead
the human eye to make a different classification) will cause the network to classify
the image in a completely different category. The same is true in the converse case,
where a completely meaningless image (a blob of pixels) is classified with high
certainty as a real-life object. That is, behind both of these phenomena is something
similar to the fact that √256 is not always (and in all contexts) equal to 9 + 7,
although certain calculations involving these data values might produce the same
output value (bottom line: extensional, data-only equality is
not enough in high-level reasoning)
data and intensions
Beyond grade school, we were told in high school that two functions, f and g, are equal (are the same) if
for every input they produce the same output. In notation, this was expressed as

f = g iff (∀x) f(x) = g(x)

But this is not entirely true, or, our high school teachers also did not tell us the whole truth: if two
functions are equal whenever they agree on their input-output pairings, then MergeSort and
InsertionSort would be the same objects, since for any sequence

MergeSort(sequence) = InsertionSort(sequence)

But computer scientists know that although their external values are always the same (that is, they are
extensionally equal), MergeSort and InsertionSort are not the same objects, as they differ in many other (and
very important) attributes, for example in their space and time complexity.
yet another example
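The sorting example can be run directly. The implementations below are illustrative sketches instrumented with a comparison counter: on every input they return identical output (extensional equality), yet the amount of work they do, one of their intensional attributes, differs sharply:

```python
def insertion_sort(seq, counter):
    """Insertion sort; counter[0] accumulates element comparisons."""
    out = []
    for x in seq:
        i = 0
        while i < len(out):
            counter[0] += 1
            if out[i] > x:
                break
            i += 1
        out.insert(i, x)
    return out

def merge_sort(seq, counter):
    """Merge sort; counter[0] accumulates element comparisons."""
    if len(seq) <= 1:
        return list(seq)
    mid = len(seq) // 2
    left = merge_sort(seq[:mid], counter)
    right = merge_sort(seq[mid:], counter)
    out = []
    while left and right:
        counter[0] += 1
        out.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    return out + left + right

data = list(range(1, 51))  # already sorted: worst case for this insertion variant
c1, c2 = [0], [0]
r1, r2 = insertion_sort(data, c1), merge_sort(data, c2)

print(r1 == r2 == sorted(data))  # True: extensionally equal on this input
print(c1[0], c2[0])              # insertion sort performs far more comparisons
```

Equal input-output behaviour, in other words, does not make the two algorithms the same object.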
data and intensions
data and reasoning
Here we consider an example where working with extensions (data values) only and ignoring
intensions can easily lead to absurd conclusions. Consider the facts shown in the table below.
Now, according to the above, the teacher of Alexander the Great = Aristotle. Notice now that if we simply replace 'the teacher of Alexander the Great' with a value that is only extensionally equal to it, we can get an absurdity from a very meaningful sentence, as shown below
Let us now consider examples illustrating why intensionality cannot be ignored in natural language
understanding. Suppose we have a question-answering system that was to return the names of:
(1) all the tall presidents of the United States?
(2) all the former presidents of the United States?
A simple method for answering (1) would be to get two sets, the set of names of all tall people, and
the set of names of all presidents of the United States, and simply return the intersection as the
result.
What about the query in (2), however? Clearly we cannot do the same, because we cannot, as in the case of tall, represent former by a set (an extension) of all former things. If we did, then Ronald Reagan, for example, would have been a 'former president' even while serving his term as president, because he would have been in both sets: the set of presidents, and the set of 'former things', as he was also a former actor.
The point here is that, unlike tall, which is an extensional adjective that can semantically be represented by a set (the set of all tall things), former is an intensional adjective that logically operates on a concept, returning a subset of that concept as a result.
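A small Python sketch makes the contrast concrete. The sets and names are illustrative, and splitting the concept into 'ever' and 'now' sets is our own simplification of the temporal condition: tall composes by plain intersection, while former must be an operator applied to the concept.

```python
# Toy data: names are illustrative only.
tall_things = {"Lincoln", "Jefferson", "LBJ", "Shaquille O'Neal"}
us_presidents_now = {"Reagan"}   # presidents at the time of evaluation
us_presidents_ever = {"Lincoln", "Jefferson", "LBJ", "Reagan", "Carter"}

# (1) 'tall president': an extensional adjective, so set intersection works.
tall_presidents = tall_things & us_presidents_ever

# (2) 'former president': an operator applied to the concept 'president',
# not an intersection with a set of 'former things'.
def former(concept_ever, concept_now):
    """x was, at some point, an instance of the concept, but is not now."""
    return concept_ever - concept_now

former_presidents = former(us_presidents_ever, us_presidents_now)
assert "Reagan" not in former_presidents   # not 'former' while serving
```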
data, intensions and reasoning
Let us elaborate on this subject some more. The following is a plausible meaning for (1) and (2) above:
(1) tall presidents of the United States ⇒ { x | is-president-of-the-us(x) ∧ is-tall(x) }
(2) former presidents of the United States ⇒ { x | is-president-of-the-us(x) ∧ F(x, president) }
What the above says is: (1) 'tall presidents of the United States' refers to any x that is in the set of presidents and also in the set of tall things; and (2) 'former presidents of the United States' refers to any x that is in the set of presidents and of which some F is also true. Clearly, what F does with an x is something to the effect of making sure that x was, at some point in time, but is not now, a president. The point here is that unlike is-tall(x), F is not a set, and has no extensional value, but is a logical expression that takes a concept and applies some condition, returning a subset of the original concept.
None of this is available in data-driven NLU, where both 'tall' and 'former' are adjectives that equally modify nouns, which, as we have seen, can result in contradictions when executed on real data.
One misguided attempt at salvaging the data-only solution would be to maintain a set for the compound former presidents.
This escape attempt is doomed, however, since composite sets for previous president, former
senator, former governor, previous governor, etc. would also then have to be added and maintained.
In fact, insisting on a data-only solution for intensional adjectives would essentially mean maintaining a set for every construction of the form [Adj1 Adj2 Noun], [Adj1 Adj2 Noun1 Noun2], … where any adjective Adji is an intensional adjective.
This is exactly the same situation we encountered previously (pages 12-15), where composite features for every possible relation were needed to resolve references in a data-driven model. In both cases, such alternatives are neither computationally nor psychologically plausible.
Another major problem with data-driven/statistical approaches to NLU is their complete denial of
compositionality in computing the meaning of larger linguistic units as a function of the meaning of
their constituents. To illustrate, consider the sentences below.
(1) Jon bought an old copy of Das Kapital.
(2) Jon read an old copy of Das Kapital.
Although (1) and (2) refer to the same object, namely to a book entitled 'Das Kapital', the reference in (1) is to a physical object that can be bought (and thus sold, burned, etc.), while in (2) the reference is to the content and ideas in that book. Thus, 'Das Kapital' may refer to different features or properties of the book, depending on the context, where the context could extend over several sentences. For example, consider (3):
(3) Jon read Das Kapital. He then burned it because he did not
agree with anything it espouses.
In (3), we are (at the same time) using 'Das Kapital' to refer to an abstract object (namely the content of Das Kapital) when Jon read it and then disagreed with its content, and to a physical object that can be burned. We will see later on how a strongly-typed system will discover the existence of all the potential types of objects that 'Das Kapital' can refer to (a physical object that can be burned, an abstract object that can be read and disagreed with, etc.)
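One way to picture what such a strongly-typed treatment must eventually deliver is to model the book as an object with two aspects, where each verb selects the aspect it needs. The Python sketch below is our own toy encoding (the class and function names are illustrative), not the machinery proposed later in these slides:

```python
from dataclasses import dataclass

@dataclass
class PhysicalObject:
    title: str

@dataclass
class InfoContent:
    title: str

@dataclass
class Book:
    physical: PhysicalObject   # the copy that can be bought, sold, burned
    content: InfoContent       # the ideas that can be read and disagreed with

def buy(b: Book) -> PhysicalObject:
    return b.physical          # buying selects the physical aspect

def read(b: Book) -> InfoContent:
    return b.content           # reading selects the informational aspect

def burn(b: Book) -> PhysicalObject:
    return b.physical

das_kapital = Book(PhysicalObject("Das Kapital"), InfoContent("Das Kapital"))
# 'Jon read Das Kapital. He then burned it': one noun phrase, two aspects.
assert isinstance(read(das_kapital), InfoContent)
assert isinstance(burn(das_kapital), PhysicalObject)
```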
compositionality
In natural language we can speak of anything we can conceive or imagine, existent or non-existent.
We can thus speak of and refer to an event that did not exist, as in
(1) John cancelled the trip. It was planned for next Saturday.
In (1), we are speaking about and referring to an event (a trip), that did not actually happen, thus a trip
that never existed. We can also refer to or speak of objects that do not exist, as in
(2) John painted a yellow bear.
In (2) what is 'yellow' is not an actual bear, but a depiction of some object, namely a bear. Reference to abstract and nonexistent objects can be quite involved, especially in mixed contexts where the initial reference is to an object that does not necessarily exist, but whose existence is implied by subsequent context. For example, consider the following:
(3) John’s book proposal was not well received.
But it later became a bestseller when it was published.
In (3), the reference was initially to a book proposal, which does not imply the existence of the book,
although subsequent context implies the concrete existence of a book. Such inferences cannot
be made with a simple analysis of the external data.
yellow bears?
Data-driven approaches typically ignore functional words (prepositions, quantifiers, etc.), and for a
good reason: the probabilities of these words are equal in all contexts! But such words cannot be
ignored as these words are what logically glues the various components of a sentence into a coherent
whole. Consider for example the determiner „a‟, the smallest word in English, in the following
sentences:
(1) A paper on genetics was published by every student of Dr. Miller
(2) A paper on genetics was referenced by every student of Dr. Miller
While 'a paper on genetics' may refer to a single and specific paper in (2), this is not likely in (1), where 'a' is most likely under the scope of 'every'. That is, the most likely meaning of (1) is the one implied by
(3) Every student of Dr. Miller published a paper on genetics
Resolving such quantifier scope ambiguities is clearly beyond data-driven approaches and is a function of pragmatic world knowledge (e.g., while it is possible for several students to refer to a single paper, it is not likely that all of Dr. Miller's students published the same paper…)
We shall later on see how a strongly-typed ontology of commonsense concepts can be used to make
such inferences.
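The two readings can be spelled out over a toy model (a Python sketch with hypothetical students and papers): the wide-scope reading of 'a' requires one paper common to all students, while the narrow-scope reading only requires some paper per student.

```python
# Toy model: names are hypothetical.
students = {"ann", "bob", "cal"}
papers = {"p1", "p2", "p3"}
published = {("ann", "p1"), ("bob", "p2"), ("cal", "p3")}  # each their own paper

def wide_scope_a(rel):
    # ∃p ∀s : one specific paper stands in the relation to every student
    return any(all((s, p) in rel for s in students) for p in papers)

def narrow_scope_a(rel):
    # ∀s ∃p : each student has some (possibly different) paper
    return all(any((s, p) in rel for p in papers) for s in students)

assert not wide_scope_a(published)   # no single paper published by everyone
assert narrow_scope_a(published)     # but every student published some paper
```

With 'referenced', world knowledge admits the wide-scope relation as well (all students may cite one paper); with 'published', only the narrow-scope reading survives.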
functional words
We (hopefully) have demonstrated that purely quantitative (statistical data-driven) approaches are
not plausible models for natural language understanding, and for two main reasons:
1. The relevant information is often not even present in the data, or in many cases there is no statistical significance in the data to make the proper inferences. Attempts to remedy this lead to a combinatorial explosion in the number of features that would have to be assumed, which renders these attempts computationally implausible.
2. It was shown that even when the data is available, reasoning with data only and ignoring
intensions and logical definitions can easily lead to absurdities and contradictions.
While statistical and data-driven models may not be appropriate for high-level reasoning tasks in language understanding, we believe that these models have a lot to offer in some linguistic and data-centric tasks. Chief among these are part-of-speech (POS) tagging, statistical parsing, and collecting and analyzing corpus linguistic data to 'enable' and automate some of the tasks needed in building a system that can truly understand ordinary spoken languages.
We are now in a position to start describing our proposal.
data-driven NLU?
so where are we now?
ontological vs. logical concepts
We will start with our proposal by first introducing the general framework, and we will do so gradually. The material presented from hereon assumes some exposure to logic, although we will try to simplify our presentation as much as can possibly be done.
One of the major features in our framework is the crucial idea of distinguishing between what can be
called ontological concepts, or first-intension concepts, as Cocchiarella (19xx) calls them, and logical
concepts (or, second intension concepts). The difference between these two types of concepts can be
illustrated by the following examples:
(1) R2 : heavy(x :: physical)
R3 : hungry(x :: animal)
R4 : articulate(x :: human)
R5 : make(x :: human, y :: artifact)
R6 : imminent(x :: event)
R7 : beautiful(x :: entity)
What the above says is: heavy is a property that can be said of any object x that is of type physical;
that we say hungry of objects that are of type animal; that articulate applies to objects that are of
type human; that we can speak of the make relation between an object of type human and an object of
type artifact; that we can say imminent of objects that are of type event; and, finally, that we can say
beautiful of any entity.
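A minimal Python sketch of such type-constrained (polymorphic) predicates, assuming an illustrative fragment of the subsumption hierarchy (not the full ontology discussed later):

```python
# Immediate supertype of each type: an illustrative fragment only.
parent = {
    "physical": "entity", "abstract": "entity",
    "living": "physical", "artifact": "physical",
    "animal": "living", "human": "animal",
    "event": "abstract",
}

def subsumes(t, s):
    """t ⊑ s : is t the same as, or a descendant of, s?"""
    while t is not None:
        if t == s:
            return True
        t = parent.get(t)
    return False

def applicable(pred_type, obj_type):
    """A predicate typed pred_type applies to obj_type and all its subtypes."""
    return subsumes(obj_type, pred_type)

assert applicable("physical", "human")   # heavy(x :: physical) of a human: sensible
assert applicable("animal", "human")     # hungry(x :: animal) of a human: sensible
assert applicable("entity", "human")     # beautiful(x :: entity) of anything
assert not applicable("event", "human")  # imminent(x :: event) of a human: nonsense
```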
the framework
It is also assumed that the types associated with predicates in
(1), e.g. artifact, event, human, entity, etc. exist in a
subsumption hierarchy as shown in the fragment hierarchy
below, and where the dotted arrows indicate the existence of
intermediate types.
The fact that an object of type human is ultimately an object of type entity is expressed as human ⊑ entity. Furthermore, a property such as heavy can be said of objects of type human and of objects of type artifact, since human ⊑ physical, artifact ⊑ physical, and heavy(x :: physical).
As mentioned earlier, a strongly-typed ontology is assumed throughout this study. Usually this conjures up thoughts of massive amounts of knowledge that have to be hand-coded and engineered by experts. This is not at all what we are assuming here. In fact, the ontological structure we are assuming (and will discuss later on) is not massive at all, since most everyday concepts are actually just instances of the basic ontological types.
For example, there's nothing meaningful (i.e., sensible, regardless of whether it is true or false) that we can say in language about a 'racing car' that we cannot say about a car. Thus, as far as language understanding is concerned, the ontological type car belongs to the ontology, and 'racing car' is just an instance concept. With such an analysis, most everyday concepts are just instances of basic ontological types. This issue is related to a comment that J. Fodor once made, something to the effect that "to be a concept is to be locked to a word in the language". This is also in line with Fred Sommers' idea of applicability in his proposal about The Tree of Language. Gottlob Frege's idea of how a word gets its meaning, namely from all the different ways it can be used in language, is also consistent with the ontological structure we assume, which was discovered by reverse-engineering language itself. That is, what we can say about concepts tells us what structure lies behind.
We will discuss the details of the ontology later on, for now, we will simply assume that this
ontological structure exists.
about the ontological structure
According to the above, in our framework we assume a Platonic universe that includes everything
we can talk about in ordinary discourse, including abstract objects such as events, states, properties,
etc. These ontological concepts exist as types in a strongly-typed ontology, and the logical concepts
are all the properties of, or the relations that can hold between, these ontological concepts. In addition
to logical and ontological concepts there are proper nouns, which are the names of objects; objects
that can be of any type. We use the notation

(∃₁Sheba :: thing)

to state that there is a unique object named Sheba, an object that is of type thing. With this basic machinery, let's consider the interpretation of the simple sentence 'Sheba is a thief', where 〚s〛 stands for 'the meaning of s', ⇒ is used to mean 'is interpreted as', and thief(x :: human) states that the property thief applies to objects that must be of type human:

(2) 〚Sheba is a thief〛
⇒ (∃₁Sheba :: thing)(thief(Sheba :: human))

Thus 'Sheba is a thief' is interpreted as follows: there is some unique object named Sheba, an object that is initially assumed to be a thing, such that the property thief is true of Sheba.
Note that in our interpretation (repeated below) Sheba is now associated with more than one type in a single scope.

(2) 〚Sheba is a thief〛 ⇒ (∃₁Sheba :: thing)(thief(Sheba :: human))

Initially unknown, and thus assumed to be an object of type thing, Sheba was later assigned the type human, when described by the property (or when in the context of being a) thief. In these situations a type unification must occur, and this is done as follows,

(Sheba :: (thing • human)) → (Sheba :: human)

where (s • t) denotes a type unification between the types s and t, and where → stands for 'unifies to'. Note that the unification of thing and human resulted in human since human ⊑ thing; that is, since an object that is of type human is ultimately an object of type thing. The final interpretation of 'Sheba is a thief' is now the following:

(2) 〚Sheba is a thief〛 ⇒ (∃₁Sheba :: human)(thief(Sheba))

In the final analysis 'Sheba is a thief' is simply interpreted as: there is a unique object named Sheba, an object that (we now know) must be of type human, and that object is a thief.
type unification – the basics
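A minimal sketch of the unification operation itself, assuming an illustrative toy fragment of the subsumption hierarchy: unification succeeds when one type subsumes the other, returning the more specific of the two, and fails (⊥) otherwise.

```python
# Immediate supertype of each type: an illustrative fragment only.
parent = {"human": "animal", "animal": "living", "living": "physical",
          "cat": "animal", "physical": "entity", "entity": "thing",
          "activity": "abstract", "abstract": "thing"}

def subsumes(t, s):
    """True if s is t itself or an ancestor of t (t ⊑ s)."""
    while t is not None:
        if t == s:
            return True
        t = parent.get(t)
    return False

BOTTOM = "⊥"

def unify(s, t):
    """(s • t): the more specific of two comparable types, else failure."""
    if subsumes(s, t):       # s ⊑ t : s is the more specific type
        return s
    if subsumes(t, s):       # t ⊑ s
        return t
    return BOTTOM            # incomparable types: unification fails

assert unify("thing", "human") == "human"    # (thing • human) → human
assert unify("human", "activity") == BOTTOM  # incomparable: fails
```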
Although we have interpreted a very simple sentence, we have already seen the power of embedding
ontological types (that exist in some strongly-typed hierarchy) into the powerful machinery
of logical semantics. Specifically, it was the type constraint on the property thief(x :: human), namely
that it applies to objects that must be of type human, that allowed us to discover the fact that Sheba must
be a human. Admittedly, this is a very trivial 'discovery', and in a very simple context. However, the power of type unification and the hidden information it will uncover will be more appreciated as we move on to more involved contexts.
Suppose black(x :: physical) and own(x :: human, y :: entity). That is, we are assuming that black can be said of all objects of type physical, and that objects of type human can own any object of type entity. With that, let us consider now the following:

(3) 〚Sara owns a black cat〛
⇒ (∃₁Sara :: thing)(∃c :: cat)(black(c :: physical)
∧ own(Sara :: human, c :: entity))

Thus 'Sara owns a black cat' is interpreted as follows: there is a unique thing named Sara, and some object c of type cat, such that c is black (and thus here it must be of type physical), and Sara owns c, where in this context Sara must be an object of type human and c an object of type entity.
Our interpretation for 'Sara owns a black cat' is repeated below.

(3) 〚Sara owns a black cat〛 ⇒ (∃₁Sara :: thing)(∃c :: cat)(black(c :: physical)
∧ own(Sara :: human, c :: entity))

Note now that, depending on the contexts they are mentioned in, Sara is assigned two types, and the object c is assigned three types. The type unifications that must occur in this situation are the following:

(Sara :: (thing • human)) → (Sara :: human)
(c :: ((physical • entity) • cat))
→ (c :: (physical • cat))
→ (c :: cat)

Note that the type unification ((physical • entity) • cat) is associative, so the order in which the two type unifications are done does not matter. The final interpretation of 'Sara owns a black cat' is therefore given by:

(3) 〚Sara owns a black cat〛 ⇒ (∃₁Sara :: human)(∃c :: cat)(black(c) ∧ own(Sara, c))

That is, there is a unique object named Sara, which is of type human, and some cat c, and Sara owns c.
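The associativity claim can be checked directly with a fold over the three types, reusing a toy unification sketch (the hierarchy fragment is illustrative):

```python
from functools import reduce

# Illustrative fragment of the subsumption hierarchy.
parent = {"cat": "animal", "animal": "living", "living": "physical",
          "physical": "entity", "entity": "thing"}

def subsumes(t, s):
    while t is not None:
        if t == s:
            return True
        t = parent.get(t)
    return False

def unify(s, t):
    """(s • t): the more specific of two comparable types, else failure."""
    if subsumes(s, t):
        return s
    if subsumes(t, s):
        return t
    return "⊥"

# ((physical • entity) • cat) → cat, regardless of grouping:
assert reduce(unify, ["physical", "entity", "cat"]) == "cat"
assert unify("physical", unify("entity", "cat")) == "cat"
```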
As mentioned in our introduction, in our framework ontological concepts include abstract objects such
as states, processes, events, properties, etc. Let us now consider one of these categories, namely
activities. In our framework a concept such as dancer(x) is true of some x according to the following:
(∀x :: human)(dancer(x)
≡ (∃d :: activity)(dancing(d) ∧ agent(d, x)))
That is, any object x of type human is a dancer iff there is some object d of type activity such that d is a
dancing activity, and x is the agent of d. Note that according to the above, there are at least two objects
that are part of the meaning of „dancer‟, and in particular, some object x of type human, and some
dancing activity, d. Thus, in saying „beautiful dancer‟, for example, one could be using „beautiful‟ to
describe the dancer, or the dancing activity itself. Consider now the interpretation below, assuming that
beautiful(x :: entity); that is, assuming beautiful is a property that can be said of any entity:
(4) 〚Sara is a beautiful dancer〛
⇒ (∃₁Sara :: thing)(∃a :: activity)
(dancing(a) ∧ agent(a :: activity, Sara :: human)
∧ (beautiful(a :: entity) ∨ beautiful(Sara :: entity)))
abstract objects
〚Sara is a beautiful dancer〛
⇒ (∃₁Sara :: thing)(∃a :: activity)
(dancing(a) ∧ agent(a :: activity, Sara :: human)
∧ (beautiful(a :: entity) ∨ beautiful(Sara :: entity)))
Thus 'Sara is a beautiful dancer' is interpreted as follows: there's a unique object named Sara, and some activity a, such that a is a dancing activity, and Sara is the agent of a (and as such must be an object of type human), and either the dancing is beautiful, or Sara is (or, of course, both). Note now that there are a number of type unifications that must occur:
(Sara :: ((thing • human) • entity)) → (Sara :: (human • entity)) → (Sara :: human)
(a :: (activity • entity)) → (a :: activity)
After all is said and done, the interpretation of (4) is the following:
(4) 〚Sara is a beautiful dancer〛
⇒ (∃₁Sara :: human)(∃a :: activity)
(dancing(a) ∧ agent(a, Sara)
∧ (beautiful(a) ∨ beautiful(Sara)))
Note that the ambiguity of what beautiful is describing is still represented in our final interpretation.
Thus far our type unifications have always succeeded. In some cases, however, a type unification between two types s and t could fail, and we write this as

(s • t) → ⊥

Let us see where this might occur and what this would result in. Consider the interpretation of 'Sara is a blonde dancer', where we assume blonde(x :: human); that is, we are assuming that blonde is a property that applies to objects that must be of type human.

(5) 〚Sara is a blonde dancer〛
⇒ (∃₁Sara :: thing)(∃a :: activity)
(dancing(a) ∧ agent(a :: activity, Sara :: human)
∧ (blonde(a :: human) ∨ blonde(Sara :: human)))

The type unifications needed for Sara are quite simple:

(Sara :: ((thing • human) • human)) → (Sara :: (human • human)) → (Sara :: human)

The type unification needed for the activity a, however, is not as straightforward. Before we continue, let us plug in the type unification of Sara to see where we're at.
failed type unifications
a brief detour
Before we continue with our proposal, we would like to illustrate the utility of separating concepts into
logical and ontological concepts. We will do this here by proposing a solution to the so-called Paradox
of the Ravens. Introduced in the 1940's by the logician (and one-time assistant of Rudolf Carnap) Carl Gustav Hempel, the Paradox of the Ravens (or Hempel's Paradox, or the Paradox of Confirmation) has continued to occupy logicians, statisticians, and philosophers of science to this day. The paradox arises when one considers what constitutes evidence for a statement (or hypothesis). To illustrate the Paradox of the Ravens, consider the following:
(H1) All ravens are black
(H2) All non-black things are not ravens
That is, we have the hypothesis H1 that 'All ravens are black'. This hypothesis, however, is logically equivalent to the hypothesis H2 that 'All non-black things are not ravens', as shown below:

(∀x)(raven(x) → black(x)) ≡ (∀x)(¬black(x) → ¬raven(x))

H1 and H2 are logically equivalent, thus any evidence/observation that confirms H1 must also confirm H2, and vice versa. While it sounds reasonable that observing black ravens should confirm H1, observing a white ball, or a red sofa, which do confirm H2, also confirms the logically equivalent hypothesis that all ravens are black, which does not sound plausible.
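The logical equivalence of the two hypotheses is easy to verify exhaustively; the Python sketch below (our own sanity check) evaluates both universal statements under every assignment of raven and black over a small domain:

```python
from itertools import product

def h1(domain, raven, black):
    """All ravens are black."""
    return all(black[x] for x in domain if raven[x])

def h2(domain, raven, black):
    """All non-black things are not ravens (the contrapositive)."""
    return all(not raven[x] for x in domain if not black[x])

# Exhaustively check equivalence over every model on a 4-element domain.
domain = range(4)
for bits in product([False, True], repeat=2 * len(domain)):
    raven = dict(zip(domain, bits[:len(domain)]))
    black = dict(zip(domain, bits[len(domain):]))
    assert h1(domain, raven, black) == h2(domain, raven, black)
```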
what paradox of the ravens?
a temporary diversion
Observing black ravens confirms hypothesis H1, namely that 'All ravens are black' – the case in (a). Observing non-black objects that are not ravens, as in (b), however, confirms hypothesis H2 (that all non-black things are not ravens). But H2 is logically equivalent to H1, leaving us with the unpleasant conclusion that observing red apples, blue suede shoes, or brown briefcases confirms the hypothesis that 'All ravens are black'.
Many solutions have been proposed to the Paradox of the Ravens that range from accepting the
paradox (that observing red apples and other non-black non-ravens does confirm the hypothesis „All
ravens are black‟) to proposals in the Bayesian tradition that try to measure the „degree‟ of confirmation.
The Bayesian proposals essentially amount to proposing that observing a red apple does confirm the
hypothesis „All ravens are black‟ but it does so very minimally, and certainly much less than the
observation of a black raven confirms „All ravens are black‟. Clearly, this is not a satisfactory solution
since observing a red flower should not contribute at all to the confirmation of „All ravens are black‟.
Worse, in the Bayesian analysis, the observation of black but non-raven objects actually negatively
confirms (or disconfirms) the hypothesis that „All ravens are black‟.
One logician who stands out in suggesting an explanation for the Paradox of the Ravens is W. V. Quine, who suggested (in 'Natural Kinds') that there is no paradox in the first place, since universal statements of the form All Fs are Gs can only be confirmed on what he called natural kinds, and 'non-black things' and 'non-ravens' are not natural kinds. Basically, for Quine, members of a natural kind must share most of their properties, and there's hardly anything similar between all 'non-black things', or all non-ravens. While statistical/Bayesian and other logical proposals have still not suggested a reasonable explanation for the Ravens Paradox, we believe that the line of thought Quine was pursuing is the most appropriate. However, Quine's natural kinds were not well-defined. In fact, what Quine was alluding to, probably, was that there is a difference between what we have called here logical concepts and ontological concepts.
The so-called Paradox of the Ravens exists simply because of mistakenly representing both ontological
and logical concepts by predicates, although, ontologically, these two types of concepts are quite
different. First, let us discuss some predicates and how we usually represent them in first-order logic.
Consider the following:

(1) black(x)
(2) imminent(x)
(3) sympathetic(x)
(4) hungry(x)
(5) dog(x)
(6) guitar(x)
Suppose now that we would like to add types to our variables. That is, we would like our logical expressions to be, in computer programming terminology, strongly-typed. Suppose, further, that we would also like our predicates to be polymorphic; that is, they apply to objects of a certain type and to all of their subtypes: if a predicate applies to objects of type vehicle, then it applies to all subtypes of vehicle (e.g., car, truck, bus, …). Given this, what are the appropriate types that one might associate with the variables of the predicates above? Here are some possible type assignments:

(1) black(x :: physical)
(2) imminent(x :: event)
(3) sympathetic(x :: human)
(4) hungry(x :: animal)
What the above suggests is that, ignoring metaphor for the moment, the predicate black applies to
objects that are of type physical. In other words, black is meaningless (or nonsensical) when applied
to (or said of) objects that are not of type physical. Similarly, the above says that imminent is said of
objects that are of type event (and, of course, all its subtypes, so we can say 'an imminent trip', 'an imminent meeting', 'an imminent election', etc.). In the same vein, the above says that sympathetic is
said of objects that must be of type human, and that hungry applies to objects of type animal. But
how about the predicates in (5) and (6)? What are the most appropriate types that can be associated
with the variables in the predicates dog(x) and guitar(x), or of what types of objects can these
predicates be meaningful? The only plausible answer seems to be the following:

(5) dog(x :: dog)
(6) guitar(x :: guitar)

But (5) and (6) are obvious tautologies, since, for example, the predicate dog applied to an object of type dog is always true. Clearly, then, (5) and (6) are quite different from the predicates in (1) through (4): while the predicates in (1) through (4) are logical concepts, dog and guitar are not predicates/logical concepts, but ontological concepts that correspond to types in a strongly-typed ontology. With this background, let us now go back to the so-called Paradox of the Ravens.
Copyright © 2017 WALID S. SABA
the framework
failed
type
unifications
Copyright © 2017 WALID S. SABA
the framework
failed
type
unifications
Copyright © 2017 WALID S. SABA
the framework
failed
type
unifications
Copyright © 2017 WALID S. SABA
the framework
failed
type
unifications
Copyright © 2017 WALID S. SABA
the framework
failed
type
unifications
Copyright © 2017 WALID S. SABA
the framework
failed
type
unifications
Copyright © 2017 WALID S. SABA
the framework
failed
type
unifications
Copyright © 2017 WALID S. SABA
salient properties/relations
ontological semantics: contents
the road ahead
the proposal
word-sense disambiguation
Let us now look at situations where lexical ambiguities translate into ambiguities in both logical and ontological concepts. Consider the sentences in (10) and (11):

(10) Melinda ran for twenty minutes.
(11) The program ran for twenty minutes.

First of all, there is a clear ambiguity in the meaning of 'program', as it could refer to a computer program (i.e., a process), or to a program of some event, among other meanings. Second, it is clear that the running of Melinda in (10) is different from the running of the program in (11). Let us consider the simpler of these two cases, namely the ambiguity in (10), assuming that there are (at least) two kinds of running activities, one whose agent is a (legged) animal, and one whose agent is a process:
What the above says is the following: there's a unique object named Melinda, some twenty minutes that Melinda ran, and either a running activity of some human, or the running of some process.
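A toy Python sketch of how such type constraints can drive sense selection (the sense names, hierarchy fragment, and type assignments are illustrative, not the notation used in these slides):

```python
# Illustrative hierarchy fragment and lexicon.
parent = {"human": "animal", "animal": "living", "living": "physical",
          "process": "abstract"}

def subsumes(t, s):
    """True if s is t itself or an ancestor of t (t ⊑ s)."""
    while t is not None:
        if t == s:
            return True
        t = parent.get(t)
    return False

# Each sense of 'run' constrains the type of its agent.
run_senses = {"run/locomotion": "animal", "run/execution": "process"}
entity_types = {"Melinda": "human", "the program": "process"}

def admissible_senses(subject):
    """Keep only the senses whose agent-type constraint the subject satisfies."""
    t = entity_types[subject]
    return {sense for sense, req in run_senses.items() if subsumes(t, req)}

assert admissible_senses("Melinda") == {"run/locomotion"}
assert admissible_senses("the program") == {"run/execution"}
```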
fragment of the ontology
The corner table
wants another beer
Tables have ‘wants’, and they drink beer?!
To be continued ...

  • 1. Deep Misconceptions and the Myth of Data-Driven Language Understanding: On Putting Logical Semantics Back to Work
  • 2. IMMANUEL KANT: "Everything in nature, in the inanimate as well as in the animate world, happens according to some rules, though we do not always know them." RICHARD MONTAGUE: "I reject the contention that an important theoretical difference exists between formal and natural languages." JERRY HOBBS: "One can assume a theory of the world that is isomorphic to the way we talk about it… in this case, semantics becomes very nearly trivial."
  • 3. Early efforts to find theoretically elegant formal models for various linguistic phenomena did not result in any noticeable progress, despite nearly three decades of intensive research (from the late 1950s through the late 1980s). As the various formal (and in most cases mere symbol-manipulation) systems seemed to reach a deadlock, disillusionment with the brittle logical approach to language processing grew, and a number of researchers and practitioners in natural language processing (NLP) started to abandon theoretical elegance in favor of attaining some quick results using empirical (data-driven) approaches. All of this seemed natural and expected. In the absence of theoretically elegant models that could explain a number of NL phenomena, it was quite reasonable to find researchers shifting their efforts to finding practical solutions for urgent problems using empirical methods. By the mid-1990s, a data-driven statistical revolution that was already brewing took the field of NLP by storm, putting aside all efforts that were rooted in over 200 years of work in logic, metaphysics, grammars and formal semantics. We believe, however, that this trend has overstepped the noble cause of using empirical methods to find reasonably working solutions for practical problems. In fact, the data-driven approach to NLP is now believed by many to be a plausible approach to building systems that can truly understand ordinary spoken language. This is not only a misguided trend, but a very damaging development that will hinder significant progress in the field. In this regard, we hope this study will help start a sane, and an overdue, semantic (counter) revolution. (a spectre is haunting NLP, February 7, 2017)
  • 5. About the resurgence of, and the currently dominant paradigm in, ‘AI’ …
  • 6. The availability of huge amounts of data, coupled with advances in computer hardware and distributed computing, resulted in some advances in certain types of (data-centric) problems (image, speech, fraud detection, text categorization, etc.)
  • 7. But … many problems in AI require understanding that is beyond discovering patterns in data
  • 8. Identifying an adult female in an image is a data-centric problem that might be suitable for data-driven image recognition systems. However, inferring which of the two is a photo of a teacher and which is a mother requires information that is not (always!) in the data.
  • 9. Which picture would a data-driven image recognition system pick out for a query like ‘musical band’?
  • 10. And which picture would a data-driven image recognition system pick out for a query like ‘musician’, i.e., a person who plays a musical instrument?
  • 11. So what is at issue here? The issue is that, ontologically, there are no musicians, teachers, lawyers, or even mothers! What exists, ontologically (metaphysically), are humans; a concept such as ‘musician’ is a logical concept that might be true of a certain human. Quantitative/data-driven approaches can only reason with (detect, infer, recognize) objects that are of an ontological type; they cannot detect logical concepts, which form the majority of the objects of (human) thought.
  • 12. ONTOLOGICAL CONCEPTS: human, ... LOGICAL CONCEPTS: lawyer, dancer, teacher, mother, ...
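The distinction drawn on this slide can be sketched in code (a minimal illustration under our own assumptions; the class and predicate names are ours, not the study's formalism): an ontological type such as 'human' is what exists, while 'musician' or 'teacher' are logical concepts, i.e., predicates that may or may not hold of a particular human.

```python
# Minimal sketch: an ontological type vs. logical concepts (predicates).
# All names here are illustrative assumptions, not the study's formalism.

from dataclasses import dataclass, field

@dataclass
class Human:                          # ontological type: what exists
    name: str
    roles: set = field(default_factory=set)

def musician(h: Human) -> bool:       # logical concept: a predicate over humans
    return "plays_instrument" in h.roles

def teacher(h: Human) -> bool:        # another logical concept
    return "teaches" in h.roles

sara = Human("Sara", {"plays_instrument", "teaches"})
jon = Human("Jon", {"plays_instrument"})

# Both are humans (an ontological fact); which logical concepts hold of
# each differs (logical facts about objects of the same ontological type).
print(musician(sara), teacher(sara), musician(jon), teacher(jon))
```

Note that nothing of ontological type ‘musician’ exists in this sketch: being a musician is a property that happens to be true of some humans, which is exactly why a purely extensional, data-driven detector of kinds has nothing to latch onto.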
  • 13. Failure to distinguish between logical and ontological concepts is not only a flaw of data-driven approaches; logical/formal semantics also failed to provide adequate models for natural language, and for exactly the same reason.
  • 14. Notwithstanding achievements in data-centric tasks (e.g., image and speech recognition, or numerically specifiable and finite-space problems, such as the game of Go), statistical and other data-driven models (e.g., neural networks) cannot model human language comprehension, because these models cannot explain, model or account for very important phenomena in ordinary spoken language, such as: • Non-Observable (thus Non-Learnable) Information • Intensionality and Compositionality • Inferential Capacity
  • 15. What this study is not about: Criticisms of the statistical data-driven approach to language understanding are very often automatically associated with the Chomskyan school of linguistics. At best, this is a misinformed judgement (although in many cases, it is ill-informed). There is a long history of work in logical semantics (a tradition that forms the background to the proposals we will make here) that has very little to do (if anything at all) with Chomskyan linguistics. Notwithstanding Chomsky's (in our opinion valid) Poverty of the Stimulus (POS) argument, an argument that clearly supports the claim of some kind of innate linguistic abilities, we believe that Chomskyans put too much emphasis on syntax and grammar (which ironically made their theory vulnerable to criticism from the statistical and data-driven school). Instead, we think that syntax and grammar are just the external artifacts used to express internal, logically coherent, semantic, and compositionally and productively (i.e., recursively) constructed thoughts, something that is perhaps analogous to Jerry Fodor's Language of Thought (LOT). Here we should also mention that we agree somewhat with M. C. Corballis ('The Recursive Mind') that it is thought that brought about the external tool we call language, and not the other way around.
  • 16. What this study is not about: Another association that criticism of the statistical and data-driven approaches to NLU often conjures up is that of building large knowledge bases with brittle rule-based inference engines. This is perhaps the biggest misunderstanding, held not only by many in the statistical and data-driven camp, but also by previously over-enthused knowledge engineers who mistakenly believed at one point that all that is required to crack the NLU problem is to keep adding more knowledge and more rules. We do not subscribe to such theories either. In fact, regarding the above, we agree with an observation once made by the late John McCarthy (at IJCAI 1995) that building ad-hoc systems by simply adding more knowledge and more rules will result in building systems that we don't even understand. Ockham's Razor, as well as observing the linguistic skills of 5-year-olds, should both tell us that the conceptual structures that might be needed in language understanding should not, in principle, require all that complexity. As will become apparent later in this study, the conceptual structures that speakers of ordinary spoken language have access to are not as massive and overwhelming as is commonly believed. Instead, it will be shown that the key is in the nature of that conceptual structure and the computational processes involved.
  • 17. What this study is not about: FINALLY, our concern here is with introducing a plausible model for natural language understanding (NLU). If your concern is natural language processing (NLP), as it is used, for example, in applications such as: word-sense disambiguation (WSD); entity extraction/named-entity recognition (NER); spam filtering, categorization, classification; semantic/topic-based search; word co-occurrence/concept clustering; sentiment analysis; topic identification; automated tagging; document clustering; summarization; etc., then it is best if we part ways at this point, since this is not at all our concern here. There are many NLP and text processing systems that already do a reasonable job on such data-level tasks. In fact, I am part of a team that developed a semantic technology that does an excellent job on almost all of the above, but that system (and similar systems) are light years away from doing anything remotely related to what can be called natural language understanding (NLU), which is our concern here.
  • 18. What this study is about: 1. WE WILL ARGUE THAT purely data-driven extensional models that ignore intensionality, compositionality and inferential capacities in natural language are inappropriate, even when the relevant data is available, since higher-level reasoning (the kind that is needed in NLU) requires intensional reasoning beyond simple data values. 2. WE WILL ARGUE THAT many language phenomena are not learnable from data because (i) in most situations what is to be learned is not even observable in the data (or is not explicitly stated but is implicitly assumed as 'shared knowledge' by a language community); or (ii) in many situations there is no statistical significance in the data, as the relevant probabilities are all equal. 3. WE WILL ARGUE THAT the most plausible explanation for a number of phenomena in natural language is rooted in logical semantics, ontology, and the computational notions of polymorphism, type unification, and type casting; and we will do this by proposing solutions to a number of challenging and well-known problems in language understanding.
  • 19. More specifically: We will propose a plausible model rooted in logical semantics, ontology, and the computational notions of polymorphism, type casting and type unification. Our proposal provides a plausible framework for modelling various phenomena in natural language, and specifically phenomena that require reasoning beyond the surface structure (external data). To give a hint of the kind of reasoning we have in mind, consider the following sentences: (1) a. Jon enjoyed the movie / b. Jon enjoyed watching the movie; (2) a. A small leather suitcase was found unattended / b. A leather small suitcase was found unattended; (3) a. The ham sandwich wants another beer / b. The person eating the ham sandwich wants another beer; (4) a. Dr. Spok told Jon he should soon be done with writing the thesis / b. Dr. Spok told Jon he should soon be done with reading the thesis. Our model will explain why (1a) is understood by all speakers of ordinary language as (1b); why speakers of multiple languages find (2a) more natural to say than (2b); why we all understand (3a) as (3b); and why we effortlessly resolve 'he' in (4a) to Jon and 'he' in (4b) to Dr. Spok. Before we do so, however, we will discuss some serious flaws in proposing a statistical and data-driven approach to NLU.
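To give a flavor of the type-casting reasoning hinted at by example (1), here is a toy sketch (entirely our own construction; the type labels and the lookup table are illustrative assumptions, not the study's actual formalism). The idea: 'enjoy' expects an activity as its object, so when it receives a plain entity such as a movie, a coercion step casts the entity into the salient activity associated with it, recovering the implicit 'watching'.

```python
# Toy sketch of type casting: "Jon enjoyed the movie" is understood as
# "Jon enjoyed [watching] the movie". The SALIENT_ACTIVITY table and the
# type labels are illustrative assumptions, not the study's proposal.

SALIENT_ACTIVITY = {       # entity -> activity commonly done with it
    "movie": "watching",
    "apple pie": "eating",
    "bridge": "playing",
}

def unify(expected_type: str, obj: str, obj_type: str) -> str:
    """If a verb expects an Activity but receives an Entity, cast the
    entity into its salient associated activity (type coercion)."""
    if expected_type == "Activity" and obj_type == "Entity":
        activity = SALIENT_ACTIVITY.get(obj)
        if activity is not None:
            return f"{activity} the {obj}"   # recovered [missing text]
    return f"the {obj}"

print("Jon enjoyed " + unify("Activity", "movie", "Entity"))
```

A real account would of course derive the salient activity from an ontology rather than a hard-coded table; the sketch only shows where in the composition the type mismatch is detected and repaired.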
  • 20. Understanding language by analyzing data? What if the relevant information is not even in the data?
  • 21. It's not even in the data: analyzing missing text? Challenges in the computational comprehension of ordinary text are often due to quite a bit of missing text: text which is not explicitly stated but is often assumed as shared knowledge among a community of language users. Consider for example the sentences in (1): (1) a. Don't worry, Simon is a rock. b. The truck in front of us is annoying me. c. Carlos likes to play bridge. d. Mary enjoyed the apple pie. e. Jon owns a house on every street in the village. Clearly, speakers of ordinary English understand the above as: (2) a. Don't worry, Simon is [as solid as] a rock. b. The [person driving the] truck in front of us is annoying me. c. Carlos likes to play [the game] bridge. d. Mary enjoyed [eating] the apple pie. e. Jon owns a [different] house on every street in the village. Since such sentences are quite common, and are not at all exotic, far-fetched, or contrived, any model for NLU must clearly somehow 'uncover' this [missing text] for a proper understanding of what is being said. What is certain here is that data-driven approaches are helpless in this regard, since a crucial part of understanding NL text is not only interpreting the data, but 'discovering' what is missing from the data.
  • 22. Again, let us consider the sentences below, where there is some [missing text] that is not explicitly stated in everyday discourse, but is often implicitly assumed: a. Don't worry, Simon is [as solid as] a rock. b. The [person driving the] truck in front of us is annoying me. c. Carlos likes to play [the game] bridge. d. Mary enjoyed [eating] the apple pie. e. Jon owns a [different] house on every street in the village. Although the above seem to have a common denominator, namely some missing text that is often implicitly assumed, it is somewhat surprising that in the literature the missing-text phenomenon has been studied quite independently and under different labels: metaphor (a), metonymy (b), lexical ambiguity (c), ellipsis (d), and quantifier scope ambiguity (e).
  • 23. In ordinary spoken language there is more than missing (and implicitly assumed) text … When surface data probabilities are all equally likely, we often resort to our shared (commonsense) knowledge to resolve certain types of ambiguities (e.g., in reference resolution)
  • 24. One of the most obvious challenges to statistical and data-driven NLU is the set of situations where there does not seem to be any statistical significance in the observed data that can help in making the right inferences. As an example, consider the sentences in (1) and (2). (1) The trophy did not fit in the brown suitcase because it was too a. big b. small (2) Dr. Spok told Jon that he should soon be done a. writing his thesis b. reading his thesis For a speaker of ordinary language, the decisions as to what 'it' in (1) and 'he' in (2) refer to are immediately obvious, even to a 5-year-old. On the other hand, a statistical data-driven approach would be helpless in making such decisions, since the only differences between the sentence pairs in (1) and (2) are words that co-occur with equal probabilities (this is so because antonyms or opposites, such as big/small, night/day, hot/cold, read/write, open/close, etc., have been shown to co-occur in text with equal frequency). Clearly, then, references such as those in (1) and (2) must be resolved using information that is not (directly) in the data. probabilities are all equal
  • 25. In the absence of any statistical significance in the data, we have suggested above that references such as those in sentences (1) and (2) are resolved by relying on other information that is not (directly) in the data. It might still be suggested, however, that a learning algorithm could create statistical significance between (1a) and (1b), for example, if probabilities of some composites in the sentence (as opposed to the atomic units) are considered. What this would essentially require is creating a composite feature for every possible relation. In (1), we would need at least the following: trophy-fit-in-suitcase-small trophy-fit-in-suitcase-big trophy-not-fit-in-suitcase-small trophy-not-fit-in-suitcase-big Note here that since data-driven approaches also do not admit the existence of a type hierarchy (or any knowledge structure, for that matter) – i.e., there is nothing that says that a Trophy and a Radio are both subtypes of an Artifact, and that a Purse and a Suitcase are both subtypes of some Container, where the 'fit' relation applies similarly to both – other features (e.g., radio-fit-in-purse-small) would also be needed to learn how to resolve the reference 'it' in (1).
  • 26. Again, in the absence of a type hierarchy (or some other source of information), statistical significance can only be salvaged if composite features are constructed for every possible relation in a meaningful sentence. Such a story leads us to something like this: trophy-fit-in-suitcase-small trophy-fit-in-suitcase-big trophy-not-fit-in-suitcase-small trophy-not-fit-in-suitcase-big radio-fit-in-purse-small radio-fit-in-purse-big radio-not-fit-in-purse-small radio-not-fit-in-purse-big etc. Although the point can be made with the above, the story in reality is much worse, as there are more 'nodes' that must be combined in these features to capture statistical significance. For example, if 'because' were changed to 'although' in (1b), then 'it' would suddenly refer to the trophy. The question now is: how many such features would eventually be needed, if every meaningful sentence requires a handful of composite features to capture all statistical correlations? Fodor and Pylyshyn (1988) suggest that that number is on the order of the number of seconds in the history of the universe, citing an experiment conducted by the psycholinguist George Miller.
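The multiplicative blow-up described above can be sketched in a few lines. The vocabulary below is a tiny hypothetical toy; a real lexicon has thousands of entries per slot, but the counting argument is identical:

```python
from itertools import product

# Hypothetical toy vocabulary; real language has thousands of entries per slot.
objects = ["trophy", "radio", "book"]
polarity = ["fit", "not-fit"]
containers = ["suitcase", "purse", "box"]
adjectives = ["big", "small"]
connectives = ["because", "although"]

# One composite feature per combination, e.g. 'trophy-not-fit-in-suitcase-big'
features = ["-".join(c)
            for c in product(objects, polarity, ["in"], containers, adjectives)]
print(len(features))  # 3 * 2 * 1 * 3 * 2 = 36

# The connective must be folded in too ('because' vs. 'although' flips the
# referent), multiplying the count yet again:
features = ["-".join(c)
            for c in product(objects, polarity, ["in"], containers,
                             adjectives, connectives)]
print(len(features))  # 72
```

Every additional slot multiplies, rather than adds to, the feature count, which is the combinatorial explosion the argument turns on.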
  • 27. Incidentally, in the absence of any external knowledge structures, the combinatorially implausible explosion in the number of features needed by a statistical data-driven (i.e., bottom-up) learner would also be needed by a top-down learner, one that learns by being told (or by instruction). Specifically, a top-down learner would ask for some number n of clarifications in every sentence, requiring therefore a total of n^m clarifications for a paragraph with m sentences. The reader can now easily work out how many clarifications would be required for a top-down learner to understand just a small paragraph¹. The point here is that whether the learner tries to discover what is missing bottom-up (from the data) or top-down (by being told), the infinity lurking in language (due to the recursive productivity of thoughts) makes learning the various language phenomena from data alone a computationally implausible theory. a top-down explanation ¹ The reason a top-down learner would need (n × n), as opposed to (n + n), clarifications for two consecutive sentences where each requires n is that the preferences of one sentence are subject to revision in the context of the previous and/or the following sentence. This is so because, linguistically, it is paragraphs, not sentences, that are the smallest linguistic units that can be fully interpreted on their own and should not (in theory) require any additional text to be fully understood. See The Semantics of Paragraphs (Zadrozny & Jenssen, 19xx) for an excellent treatment of the subject.
  • 28. Our argument against statistical data-driven approaches in NLU is not meant to dismiss the role of statistical/probabilistic reasoning in language understanding. That would, of course, be unwise. Our argument is about which probabilities are relevant in language understanding. Consider, for example, the following: (1) The town councilors refused to give the demonstrators a permit because they advocated violence and anarchy. (2) A young teenager fired several shots at a policeman. Eyewitnesses say he immediately fled away. While the most likely reading of (1) has 'they' referring to the demonstrators, one can imagine a scenario where a group of anarchist town councilors refused to give the demonstrators a permit specifically to incite violence and anarchy. Similarly, while the most likely reading of (2) is the one where 'he' refers to the young teenager, one can imagine a scenario where a slightly wounded policeman fled to escape further injuries. Obviously such occurrences are rare, and thus, in the absence of other information, the pragmatic probability of the usual reading wins out for speakers of ordinary language. What is important to note here is that the likelihoods we are speaking of are a function of pragmatics and have nothing to do with anything observed in the data. pragmatic probabilities
  • 29. To summarize this argument, consider the examples below. At the data level, references can be resolved during syntactic analysis using simple NUMBER or GENDER data. At the information level, the resolution requires semantic (type) information, for example that corporations, and not lawsuits, settle a case out of court; note that at this level not all referents remain possible once the type constraints are applied. It is exactly at the pragmatic level that probabilistic/statistical reasoning factors in, since at this level the referents are all possible, yet some are more probable than others (e.g., it is more likely that the one who fell down is the one who was shot, etc.) pragmatic probabilities
REFERENCES RESOLVED BY SYNTAX (data level): John informed Mary that he passed the exam. / John told Steve and Diane that they were invited to the party.
REFERENCES RESOLVED BY SEMANTICS (information level): There are a number of lawsuits between Apple and Samsung, and a. both say they are more about values than patents and money. b. both say they are ready to settle out of court.
REFERENCES RESOLVED BY PRAGMATICS (knowledge level): A young teenager fired several shots at a policeman. Eyewitnesses say he immediately fled away.
REFERENCES CANNOT BE RESOLVED (intentional level – intention not clear): John told Bill that he has been nominated to head the committee.
it's not even in the data
  • 30. Perhaps chief among the "it's not even in the data" phenomena is that of Adjective-Ordering Restrictions (AORs), a phenomenon illustrated by the examples below: (1) a. Carlos is a polite young man b. #Carlos is a young polite man (2) a. A small brown suitcase was found unattended b. #A brown small suitcase was found unattended The readings in (1a) and (2a) are clearly preferred by speakers of ordinary spoken language over the readings in (1b) and (2b), although there are no rules that speakers of ordinary language seem to be following. What makes the AOR phenomenon even more intriguing is the fact that these preferences are consistently made across multiple languages. First of all, this phenomenon presents a paradigmatic challenge to the statistical, data-driven story about language learning, as it does not seem that speakers come to have these preferences by observing and analyzing data. Furthermore, there does not seem to be a pattern in the observed data suggesting which adjectives should precede or follow other adjectives. For example, while it is preferred that 'small' precede 'brown' in (2), in (3) 'small' is no longer preferred as the first adjective: (3) A beautiful small suitcase was found unattended innate preferences?
  • 31. The most crucial challenge that adjective-ordering restrictions pose to data-driven NLU is to explain how beautiful in (4a) could be describing Olga's dancing as well as Olga as a person, while this reading is not available in (4b): (4) a. Olga is a tall beautiful dancer b. Olga is a beautiful tall dancer We will see later why beautiful in (4b) can no longer modify Olga's dancing (an abstract entity of type Activity) after it has been polymorphically cast into describing a physical object. For now we want to note that while various investigations on large corpora have not yielded any plausible explanation of what governs these adjective-ordering restrictions, we argue that even if some patterns were to be discovered, the more important question is: what is behind this phenomenon – i.e., what is it that makes us have these ordering preferences, and across multiple languages? In our opinion, what is behind this phenomenon must be much deeper than the outside (observable) data of any language. In fact, we believe that a plausible account of this phenomenon must shed some light on the conceptual structures and the processes that are operating in language. As stated above, a plausible explanation for this puzzle, one that is rooted in ontology, polymorphism, type unification and type casting, will be suggested later in this study.
  • 32. We have thus far argued that, in the absence of some process or other source of information, a number of phenomena in natural language understanding cannot be observed, captured, or learned by simply analyzing the external linguistic data alone. From adjective-ordering restrictions, which seem to be not only data-independent but even language-independent, to the missing (not explicitly stated) text that must somehow be discovered and interpreted, to situations where the probabilities in the data are statistically insignificant, it is clear that data-driven approaches to NLU are inappropriate. Before we get into our proposals, however, we will next have a brief discussion of intensions, and of how data alone, even when available, is not enough for high-level reasoning, the kind that is needed in NLU.
  • 33. data is (in the end) just data, no matter how big: extensions and intensions
  • 34. What do we mean when we write an equality like this? (1) (A ∧ (B ∨ C)) = (A ∧ B) ∨ (A ∧ C) Clearly, as objects (e.g., as logical circuits) the expressions in (1) are not the same. For example, a logical circuit corresponding to the expression on the left-hand side has only two gates, while a circuit for the expression on the right-hand side would have three. It would seem, then, that at some level equality in data only is not enough, and saying two objects are the same is different from saying they are equal (in their data value). In some contexts, as will be seen shortly, these differences are crucial. What is crucial to our discussion here is that data-driven approaches deal with data only; that is, equality in that paradigm is equality of one attribute, namely the final value. Thus, if it does turn out that equality of data alone is not enough in high-level reasoning (e.g., in NLU), then data-driven approaches to NLU would, again, clearly be inappropriate. Let us therefore take a closer look at the equality most of us know, and the related notions of intensions and extensions, notions that some of the most penetrating minds in mathematical logic have studied for nearly two centuries. data and intensions
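The distinction can be made concrete with a small sketch: two Boolean expression trees (a hypothetical toy encoding, not any particular circuit library) that agree on every input, yet differ as objects in their gate count:

```python
from itertools import product

class Node:
    """A tiny Boolean expression tree: 'var' leaves, 'and'/'or' gates."""
    def __init__(self, op, *kids):
        self.op, self.kids = op, kids

    def eval(self, env):
        if self.op == "var":
            return env[self.kids[0]]
        vals = [k.eval(env) for k in self.kids]
        return all(vals) if self.op == "and" else any(vals)

    def gates(self):
        # number of logical gates (non-leaf nodes) in the circuit
        if self.op == "var":
            return 0
        return 1 + sum(k.gates() for k in self.kids)

V = lambda name: Node("var", name)
lhs = Node("and", V("A"), Node("or", V("B"), V("C")))                       # A ∧ (B ∨ C)
rhs = Node("or", Node("and", V("A"), V("B")), Node("and", V("A"), V("C")))  # (A ∧ B) ∨ (A ∧ C)

# Extensionally equal: same truth table over all 8 assignments
same_value = all(
    lhs.eval(dict(zip("ABC", bits))) == rhs.eval(dict(zip("ABC", bits)))
    for bits in product([False, True], repeat=3)
)
print(same_value, lhs.gates(), rhs.gates())  # True 2 3
```

Both sides always compute the same value, yet as objects one has two gates and the other three: equal, but not the same.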
  • 35. Our grade school teachers once told us that √256 = 16 Can we always equate the two, and replace the data value 16 by the data value √256? Let's see …
  • 36. Mary taught her little brother that 7 + 9 = 16 Now if we blindly follow what our grade school teachers told us, namely that √256 = 16, we should be able to replace 16 by √256 without any problem. But if we do that we would then be able to alter reality and come up with: Mary taught her little brother that 7 + 9 = √256. What happened? Were we taught the wrong thing when we were told that √256 = 16? Not exactly, but we were also not told the whole story. Our grade school teachers did not know we would end up working in AI and NLU. If they did, they would have told us that extensional (data-only) equality is not sufficient in high-level reasoning, and that if it is equated with sameness at that level it can easily lead to false conclusions. here's a snapshot of some reality
  • 37. Objects such as 7 + 9, √256, and 16 are in fact equal, but in regard to one attribute only, namely their data value. As objects, however, they are not the same, as they differ in many other attributes, for example in the number of operators and the number of operands. Note, however, that the attributes value, no-of-operators, and no-of-operands are still not enough to establish true intensional equality between such objects. At a minimum, true (intensional) equality between these objects would require the equality of at least four attributes: value, no-of-operators, no-of-operands, and syntax-tree. equality and sameness In many domains where the only relevant attribute is the data (value), working with extensional (data) equality only might be enough. In tasks that require high-level reasoning, such as NLU, however, this will lead to contradictions and false conclusions, as the example of Mary and her little brother clearly demonstrates.
  • 38. As an aside … Reducing equality of objects to equality of one extensional attribute, namely the data value, is what is behind the so-called adversarial examples in deep neural networks, where small perturbations in the image (of a kind that would not lead the human eye to a different classification) cause the network to classify the image in a completely different category. The same is true in the converse case, where a completely meaningless image (a blob of pixels) is classified with high certainty as a real-life object. That is, behind both of these phenomena is something similar to the fact that √256 is not always (and in all contexts) interchangeable with 7 + 9, although both evaluate to the same data value. (Bottom line: extensional, data-only equality is not enough in high-level reasoning.)
  • 39. Beyond grade school, we were told in high school that two functions, f and g, are equal (are the same) if for every input they produce the same output. In notation: f = g iff f(x) = g(x) for every input x. But our high school teachers did not tell us the whole truth either: if two functions are equal whenever they agree on their input-output pairings, then MergeSort and InsertionSort would be the same object, since for any sequence, MergeSort(sequence) = InsertionSort(sequence). But computer scientists know that although their output values are always the same (that is, they are extensionally equal), MergeSort and InsertionSort are not the same objects, as they differ in many other (and very important) attributes – for example in their space and time complexity. yet another example
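A minimal sketch of this point, with hypothetical comparison-counting versions of the two algorithms: both always produce the same output (extensional equality), but at visibly different cost, an intensional attribute invisible in the output:

```python
def insertion_sort(xs):
    """Return a sorted copy of xs along with the number of comparisons made."""
    xs, comps = list(xs), 0
    for i in range(1, len(xs)):
        j = i
        while j > 0:
            comps += 1
            if xs[j - 1] <= xs[j]:
                break
            xs[j - 1], xs[j] = xs[j], xs[j - 1]
            j -= 1
    return xs, comps

def merge_sort(xs):
    """Return a sorted copy of xs along with the number of comparisons made."""
    xs = list(xs)
    if len(xs) <= 1:
        return xs, 0
    mid = len(xs) // 2
    left, cl = merge_sort(xs[:mid])
    right, cr = merge_sort(xs[mid:])
    merged, comps, i, j = [], cl + cr, 0, 0
    while i < len(left) and j < len(right):
        comps += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, comps

data = list(range(50, 0, -1))   # reversed input: worst case for insertion sort
m_out, m_comps = merge_sort(data)
i_out, i_comps = insertion_sort(data)
print(m_out == i_out)           # True: extensionally equal
print(m_comps < i_comps)        # True: intensionally different (cost)
```

On this input insertion sort performs the full quadratic 1225 comparisons while merge sort stays near n log n, even though no input can ever tell their outputs apart.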
  • 40. data and reasoning Here we consider an example where working with extensions (data values) only, and ignoring intensions, can easily lead to absurd conclusions. Consider the fact that the teacher of Alexander the Great = Aristotle. Notice now that if we simply replace 'the teacher of Alexander the Great' with a value that is only extensionally equal to it, we can turn a very meaningful sentence into an absurdity.
  • 41. Let us now consider examples illustrating why intensionality cannot be ignored in natural language understanding. Suppose we have a question-answering system that is to return the names of: (1) all the tall presidents of the United States? (2) all the former presidents of the United States? A simple method for answering (1) would be to get two sets, the set of names of all tall people and the set of names of all presidents of the United States, and simply return their intersection as the result. What about the query in (2), however? Clearly we cannot do the same, because we cannot, as in the case of tall, represent former by a set (an extension) of all former things. If we did, then Ronald Reagan, for example, would have been a 'former president' even while serving his term as president, because he would have been in both sets: the set of presidents, and the set of 'former things', as he was also a former actor. The point here is that unlike tall, which is an extensional adjective that can semantically be represented by a set (the set of all tall things), former is an intensional adjective that logically operates on a concept, returning a subset of that concept as a result. data, intensions and reasoning
  • 42. Let us elaborate on this subject some more. The following is a plausible meaning for (1) and (2) above: (1) tall presidents of the United States ⇒ { x | is-president-of-the-us(x) ∧ is-tall(x) } (2) former presidents of the United States ⇒ { x | is-president-of-the-us(x) ∧ F(x, president) } What the above says is: (1) 'tall presidents of the United States' refers to any x that is in the set of presidents and also in the set of tall things; and (2) 'former presidents of the United States' refers to any x that is in the set of presidents and of which some F is also true. Clearly, what F does with an x is something to the effect of making sure that x was, at some point in time, but is not now, a president. The point here is that unlike is-tall(x), F is not a set and has no extensional value, but is a logical expression that takes a concept and applies some condition, returning a subset of the original concept. All of this is not available in data-driven NLU, where both 'tall' and 'former' are adjectives that equally modify nouns, which, as we have seen, can result in contradictions when executed on real data.
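The contrast can be sketched in a few lines. The individuals and 'history' records below are hypothetical illustrations: 'tall' is just a set intersected with another set, while 'former' is a function applied to the concept's record, not to any set of 'former things':

```python
# Extensional adjective: 'tall' is representable as a set of individuals.
tall_things = {"lincoln", "obama", "eiffel tower"}

# The concept 'president' with its history: who held the office, and
# whether they hold it now (all entries hypothetical).
president_history = {
    "lincoln": {"held": True, "holds_now": False},
    "obama":   {"held": True, "holds_now": False},
    "biden":   {"held": True, "holds_now": True},
}

presidents = {x for x, r in president_history.items() if r["held"]}

# (1) extensional: plain set intersection
tall_presidents = presidents & tall_things

# (2) intensional: 'former' operates on the concept, returning a subset of it
def former(history):
    return {x for x, r in history.items() if r["held"] and not r["holds_now"]}

former_presidents = former(president_history)
print(sorted(tall_presidents))        # ['lincoln', 'obama']
print(sorted(former_presidents))      # ['lincoln', 'obama']
print("biden" in former_presidents)   # False: still in office
```

There is no global set of 'former things' anywhere in this sketch; `former` only makes sense relative to the concept it is applied to, which is exactly the point.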
  • 43. One misguided attempt at salvaging the data-only solution would be to maintain a set for the compound 'former presidents'. This escape attempt is doomed, however, since composite sets for 'previous president', 'former senator', 'former governor', 'previous governor', etc. would then also have to be added and maintained. In fact, insisting on a data-only solution for intensional adjectives would essentially mean maintaining a set for every construction of the form [Adj1 Adj2 Noun], [Adj1 Adj2 Noun1 Noun2], … where any adjective Adji is an intensional adjective. This is exactly the same situation we encountered previously (pages 12-15), where composite features for every possible relation were needed to resolve references in a data-driven model. In both cases, such alternatives are neither computationally nor psychologically plausible.
  • 44. Another major problem with data-driven/statistical approaches to NLU is their complete denial of compositionality – computing the meaning of larger linguistic units as a function of the meaning of their constituents. To illustrate, consider the sentences below. (1) Jon bought an old copy of Das Kapital. (2) Jon read an old copy of Das Kapital. Although (1) and (2) refer to the same object, namely a book entitled 'Das Kapital', the reference in (1) is to a physical object that can be bought (and thus sold, burned, etc.), while in (2) the reference is to the content and ideas in that book. Thus, 'Das Kapital' may refer to different features or properties of the book, depending on the context, where the context could extend over several sentences. For example, consider (3): (3) Jon read Das Kapital. He then burned it because he did not agree with anything it espouses. In (3), we are (at the same time) using 'Das Kapital' to refer to an abstract object (namely the content of Das Kapital), when Jon read it and then disagreed with its content, and to a physical object that can be burned. We will see later on how a strongly-typed system discovers the existence of all the potential types of objects that 'Das Kapital' can refer to (a physical object that can be burned, an abstract object that can be read and disagreed with, etc.) compositionality
  • 45. In natural language we can speak of anything we can conceive or imagine, existent or non-existent. We can thus speak of and refer to an event that did not exist, as in (1) John cancelled the trip. It was planned for next Saturday. In (1), we are speaking about and referring to an event (a trip) that did not actually happen, and thus a trip that never existed. We can also refer to or speak of objects that do not exist, as in (2) John painted a yellow bear. In (2) what is 'yellow' is not an actual bear, but a depiction of some object, namely a bear. Reference to abstract and nonexistent objects can be quite involved, especially in mixed contexts where the initial reference is to an object that does not necessarily exist, but whose existence is implied by subsequent context. For example, consider the following: (3) John's book proposal was not well received. But it later became a bestseller when it was published. In (3), the reference was initially to a book proposal, which does not imply the existence of the book, although subsequent context implies the concrete existence of a book. Such inferences cannot be made with a simple analysis of the external data. yellow bears?
  • 46. Data-driven approaches typically ignore functional words (prepositions, quantifiers, etc.), and for a good reason: the probabilities of these words are equal in all contexts! But such words cannot be ignored, as they are what logically glues the various components of a sentence into a coherent whole. Consider for example the determiner 'a', the smallest word in English, in the following sentences: (1) A paper on genetics was published by every student of Dr. Miller (2) A paper on genetics was referenced by every student of Dr. Miller While 'a paper on genetics' may refer to a single, specific paper in (2), this is not likely in (1), where 'a' is most likely under the scope of 'every'. That is, the most likely meaning of (1) is the one implied by (3) Every student of Dr. Miller published a paper on genetics Resolving such quantifier scope ambiguities is clearly beyond data-driven approaches and is a function of pragmatic world knowledge (e.g., while it is possible for several students to refer to a single paper, it is not likely that all of Dr. Miller's students published the same paper…) We shall later on see how a strongly-typed ontology of commonsense concepts can be used to make such inferences. functional words
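The two scope readings can be written out explicitly; the predicate names below are illustrative shorthand:

```latex
% Preferred reading of (1): 'every' outscopes 'a' (a possibly different paper per student)
\forall x\,\bigl(\mathrm{StudentOfMiller}(x) \rightarrow
    \exists y\,(\mathrm{PaperOnGenetics}(y) \wedge \mathrm{Publish}(x,y))\bigr)

% Available reading of (2): 'a' outscopes 'every' (one specific paper)
\exists y\,\bigl(\mathrm{PaperOnGenetics}(y) \wedge
    \forall x\,(\mathrm{StudentOfMiller}(x) \rightarrow \mathrm{Reference}(x,y))\bigr)
```

The surface strings differ only in the verb, yet pragmatic knowledge selects the wide-scope universal for (1) and leaves the wide-scope existential available for (2).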
  • 47. We have (hopefully) demonstrated that purely quantitative (statistical data-driven) approaches are not plausible models for natural language understanding, for two main reasons: 1. The relevant information is often not even present in the data, or in many cases there is no statistical significance in the data from which to make the proper inferences. Attempts to remedy this lead to a combinatorial explosion in the number of features that would have to be assumed, which renders these attempts computationally implausible. 2. It was shown that even when the data is available, reasoning with data only, ignoring intensions and logical definitions, can easily lead to absurdities and contradictions. While statistical and data-driven models may not be appropriate for high-level reasoning tasks in language understanding, we believe that these models have a lot to offer in some linguistic and data-centric tasks. Chief among these are part-of-speech (POS) tagging, statistical parsing, and collecting and analyzing corpus linguistic data to 'enable' and automate some of the tasks needed in building a system that can truly understand ordinary spoken languages. We are now in a position to start describing our proposal. data-driven NLU? so where are we now?
  • 48. ontological vs. logical concepts
  • 49. We will start by introducing the general framework, and we will do so gradually. The material presented from here on assumes some exposure to logic, although we will try to simplify our presentation as much as possible. One of the major features of our framework is the crucial distinction between what can be called ontological concepts, or first-intension concepts, as Cocchiarella (19xx) calls them, and logical concepts (or second-intension concepts). The difference between these two types of concepts can be illustrated by the following examples: (1) R2: heavy(x :: physical) R3: hungry(x :: animal) R4: articulate(x :: human) R5: make(x :: human, y :: artifact) R6: imminent(x :: event) R7: beautiful(x :: entity) What the above says is: heavy is a property that can be said of any object x that is of type physical; hungry is said of objects of type animal; articulate applies to objects of type human; the make relation can hold between an object of type human and an object of type artifact; imminent is said of objects of type event; and, finally, beautiful can be said of any entity. the framework ontological vs. logical concepts
  • 50. the framework ontological vs. logical concepts It is also assumed that the types associated with the predicates in (1), e.g. artifact, event, human, entity, etc., exist in a subsumption hierarchy (only a fragment of which need be given, with intermediate types left implicit). The fact that an object of type human is ultimately an object of type entity is expressed as human ⊑ entity. Furthermore, a property such as heavy can be said of objects of type human and of objects of type artifact, since human ⊑ physical, artifact ⊑ physical, and heavy(x :: physical).
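A minimal sketch of how such a fragment hierarchy interacts with the type signatures in (1); the parent links below are assumptions for illustration, with intermediate types collapsed:

```python
# Hypothetical fragment of the subsumption hierarchy: child -> parent.
parent = {
    "human":    "animal",
    "animal":   "physical",
    "artifact": "physical",
    "physical": "entity",
    "event":    "entity",
}

def subsumed_by(t, s):
    """True if type t is (ultimately) a subtype of s, i.e. t ⊑ s."""
    while t is not None:
        if t == s:
            return True
        t = parent.get(t)
    return False

# A property applies to objects of its stated type or any of its subtypes.
signatures = {"heavy": "physical", "hungry": "animal", "articulate": "human"}

def applies(prop, t):
    return subsumed_by(t, signatures[prop])

print(applies("heavy", "human"))      # True:  human ⊑ physical
print(applies("heavy", "artifact"))   # True:  artifact ⊑ physical
print(applies("hungry", "artifact"))  # False: artifact is not an animal
```

The same walk up the parent links verifies human ⊑ entity, so the one mechanism covers both type checking and the applicability of properties.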
  • 51. As mentioned earlier, a strongly-typed ontology is assumed throughout this study. Usually this conjures up thoughts of a massive amount of knowledge that has to be hand-coded and engineered by experts. This is not at all what we are assuming here. In fact, the ontological structure we are assuming (and will discuss later on) is not massive at all, since most everyday concepts are actually just instances of the basic ontological types. For example, there is nothing meaningful (i.e., sensible, regardless of whether it is true or false) that we can say about a 'racing car' that we cannot say about a car. Thus, as far as language understanding goes, the ontological type car belongs to the ontology, and 'racing car' is just an instance concept. With such an analysis, most everyday concepts are just instances of basic ontological types. This issue is related to a comment that J. Fodor once made, something to the effect that "to be a concept is to be locked to a word in the language". This is also in line with Fred Sommers' idea of applicability in his proposal about The Tree of Language. Gottlob Frege's idea of how a word gets its meaning, namely from all the different ways it can be used in language, is also consistent with the ontological structure we assume, which was discovered by reverse-engineering language itself. That is, what we can say about concepts tells us what structure lies behind. We will discuss the details of the ontology later on; for now, we will simply assume that this ontological structure exists. about the ontological structure the framework
  • 52. the framework: ontological vs. logical concepts
According to the above, in our framework we assume a Platonic universe that includes everything we can talk about in ordinary discourse, including abstract objects such as events, states, properties, etc. These ontological concepts exist as types in a strongly-typed ontology, and the logical concepts are all the properties of, or the relations that can hold between, these ontological concepts. In addition to logical and ontological concepts there are proper nouns, which are the names of objects; objects that can be of any type. We use the notation (∃!Sheba :: thing) to state that there is a unique object named Sheba, an object that is of type thing. With this basic machinery, let us consider the interpretation of the simple sentence 'Sheba is a thief', where 〚s〛 stands for 'the meaning of s', ⇒ is used to mean 'is interpreted as', and thief(x :: human) states that the property thief applies to objects that must be of type human:
(2) 〚Sheba is a thief〛 ⇒ (∃!Sheba :: thing)(thief(Sheba :: human))
Thus 'Sheba is a thief' is interpreted as follows: there is some unique object named Sheba, an object that is initially assumed to be a thing, such that the property thief is true of Sheba.
  • 53. the framework: type unification – the basics
Note that in our interpretation (repeated below) Sheba is now associated with more than one type in a single scope.
(2) 〚Sheba is a thief〛 ⇒ (∃!Sheba :: thing)(thief(Sheba :: human))
Initially unknown, and thus assumed to be an object of type thing, Sheba was later assigned the type human when described by the property (or when in the context of being a) thief. In these situations a type unification must occur, and this is done as follows:
(Sheba :: (thing • human)) → (Sheba :: human)
where (s • t) denotes a type unification between the types s and t, and → stands for 'unifies to'. Note that the unification of thing and human resulted in human since human ⊑ thing; that is, since an object that is of type human is ultimately an object of type thing. The final interpretation of 'Sheba is a thief' is now the following:
(2) 〚Sheba is a thief〛 ⇒ (∃!Sheba :: human)(thief(Sheba))
In the final analysis, 'Sheba is a thief' is simply interpreted as: there is a unique object named Sheba, an object that (we now know) must be of type human, and that object is a thief.
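The type unification step can be sketched in a few lines. This is a toy illustration under an assumed fragment hierarchy, not the paper's implementation: when one type subsumes the other, unification keeps the more specific of the two.

```python
# Assumed fragment hierarchy for illustration (child -> parent).
PARENT = {"human": "physical", "physical": "entity", "entity": "thing"}

def subsumes(super_type, sub_type):
    """True iff sub_type is (transitively) a super_type."""
    t = sub_type
    while t is not None:
        if t == super_type:
            return True
        t = PARENT.get(t)
    return False

def unify(s, t):
    """Type unification of s and t: the more specific of the two types
    when one subsumes the other; None when neither does."""
    if subsumes(s, t):
        return t  # t is below s, so t carries more information
    if subsumes(t, s):
        return s
    return None
```

For example, `unify("thing", "human")` returns `"human"`, mirroring the step in which Sheba, initially a thing, is refined to human.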
  • 54. the framework: type unification – the basics
Although we have interpreted a very simple sentence, we have already seen the power of embedding ontological types (that exist in some strongly-typed hierarchy) into the powerful machinery of logical semantics. Specifically, it was the type constraint on the property thief(x :: human), namely that it applies to objects that must be of type human, that allowed us to discover the fact that Sheba must be a human. Admittedly, this is a very trivial 'discovery', and in a very simple context. However, the power of type unification and the hidden information it can uncover will be more appreciated as we move on to more involved contexts. Suppose black(x :: physical) and own(x :: human, y :: entity); that is, we are assuming that black can be said of all objects of type physical, and that objects of type human can own any object of type entity. With that, let us now consider the following:
(3) 〚Sara owns a black cat〛 ⇒ (∃!Sara :: thing)(∃c :: cat)(black(c :: physical) ∧ own(Sara :: human, c :: entity))
Thus 'Sara owns a black cat' is interpreted as follows: there is a unique thing named Sara, and some object c of type cat, such that c is black (and thus here it must be of type physical), and Sara owns c, where in this context Sara must be an object of type human and c an object of type entity.
  • 55. the framework: type unification – the basics
Our interpretation of 'Sara owns a black cat' is repeated below.
(3) 〚Sara owns a black cat〛 ⇒ (∃!Sara :: thing)(∃c :: cat)(black(c :: physical) ∧ own(Sara :: human, c :: entity))
Note now that, depending on the contexts they are mentioned in, Sara is assigned two types, and the object c is assigned three types. The type unifications that must occur in this situation are the following:
(Sara :: (thing • human)) → (Sara :: human)
(c :: ((physical • entity) • cat)) → (c :: (physical • cat)) → (c :: cat)
Note that the type unification ((physical • entity) • cat) is associative, so the order in which the two type unifications are done does not matter. The final interpretation of 'Sara owns a black cat' is therefore given by:
(3) 〚Sara owns a black cat〛 ⇒ (∃!Sara :: human)(∃c :: cat)(black(c) ∧ own(Sara, c))
That is, there is a unique object named Sara, which is of type human, and some cat c, and Sara owns c.
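Because unification is associative, the constraints accumulated for one variable can simply be folded in any order. A small sketch under an assumed fragment hierarchy (with cat placed under physical for the example):

```python
from functools import reduce

# Assumed fragment hierarchy for illustration (child -> parent).
PARENT = {"human": "physical", "cat": "physical",
          "physical": "entity", "entity": "thing"}

def subsumes(super_type, sub_type):
    """True iff sub_type is (transitively) a super_type."""
    t = sub_type
    while t is not None:
        if t == super_type:
            return True
        t = PARENT.get(t)
    return False

def unify(s, t):
    """Pairwise type unification; None on failure."""
    if subsumes(s, t):
        return t
    if subsumes(t, s):
        return s
    return None

def unify_all(constraints):
    """Fold unification over all type constraints collected for one
    variable; associativity means the grouping does not matter."""
    return reduce(unify, constraints)
```

Folding the three constraints on c yields cat, and the two constraints on Sara yield human, matching the derivation above.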
  • 57. the framework: abstract objects
As mentioned in our introduction, in our framework ontological concepts include abstract objects such as states, processes, events, properties, etc. Let us now consider one of these categories, namely activities. In our framework a concept such as dancer(x) is true of some x according to the following:
(∀x :: human)(dancer(x) ≡ (∃d :: activity)(dancing(d) ∧ agent(d, x)))
That is, any object x of type human is a dancer iff there is some object d of type activity such that d is a dancing activity and x is the agent of d. Note that, according to the above, there are at least two objects that are part of the meaning of 'dancer': in particular, some object x of type human, and some dancing activity d. Thus, in saying 'beautiful dancer', for example, one could be using 'beautiful' to describe the dancer, or the dancing activity itself. Consider now the interpretation below, assuming that beautiful(x :: entity); that is, assuming beautiful is a property that can be said of any entity:
(4) 〚Sara is a beautiful dancer〛 ⇒ (∃!Sara :: thing)(∃a :: activity)(dancing(a) ∧ agent(a :: activity, Sara :: human) ∧ (beautiful(a :: entity) ∨ beautiful(Sara :: entity)))
  • 58. the framework: abstract objects
〚Sara is a beautiful dancer〛 ⇒ (∃!Sara :: thing)(∃a :: activity)(dancing(a) ∧ agent(a :: activity, Sara :: human) ∧ (beautiful(a :: entity) ∨ beautiful(Sara :: entity)))
Thus 'Sara is a beautiful dancer' is interpreted as follows: there is a unique object named Sara and some activity a, such that a is a dancing activity, Sara is the agent of a (and as such must be an object of type human), and either the dancing is beautiful, or Sara is (or, of course, both). Note now that there are a number of type unifications that must occur:
(Sara :: ((thing • human) • entity)) → (Sara :: (human • entity)) → (Sara :: human)
(a :: (activity • entity)) → (a :: activity)
After all is said and done, the interpretation of (4) is the following:
(4) 〚Sara is a beautiful dancer〛 ⇒ (∃!Sara :: human)(∃a :: activity)(dancing(a) ∧ agent(a, Sara) ∧ (beautiful(a) ∨ beautiful(Sara)))
Note that the ambiguity of what beautiful is describing is still represented in our final interpretation.
  • 59. the framework: failed type unifications
Thus far our type unifications have always succeeded. In some cases, however, a type unification between two types s and t could fail, and we write this as (s • t) → ⊥. Let us see where this might occur and what it would result in. Consider the interpretation of 'Sara is a blonde dancer', where we assume blonde(x :: human); that is, we are assuming that blonde is a property that applies to objects that must be of type human.
(5) 〚Sara is a blonde dancer〛 ⇒ (∃!Sara :: thing)(∃a :: activity)(dancing(a) ∧ agent(a :: activity, Sara :: human) ∧ (blonde(a :: human) ∨ blonde(Sara :: human)))
The type unifications needed for Sara are quite simple:
(Sara :: ((thing • human) • human)) → (Sara :: (human • human)) → (Sara :: human)
The type unification needed for the activity a, however, is not as straightforward. Before we continue, let us plug in the type unification of Sara to see where we're at.
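The disambiguating effect of unification failure can be sketched as follows. This is an illustrative toy under an assumed fragment hierarchy: an adjective may attach to the person or to the dancing activity, and a reading survives only if the adjective's type constraint unifies with that candidate's type.

```python
# Assumed fragment hierarchy for illustration (child -> parent).
PARENT = {"human": "physical", "physical": "entity",
          "activity": "entity", "entity": "thing"}

def subsumes(super_type, sub_type):
    """True iff sub_type is (transitively) a super_type."""
    t = sub_type
    while t is not None:
        if t == super_type:
            return True
        t = PARENT.get(t)
    return False

def unify(s, t):
    """Type unification; None plays the role of failure."""
    if subsumes(s, t):
        return t
    if subsumes(t, s):
        return s
    return None

def readings(adjective_type):
    """Which of {Sara, the dancing activity a} can the adjective
    sensibly describe, given the type its predicate requires?"""
    candidates = {"Sara": "human", "a": "activity"}
    return [name for name, t in candidates.items()
            if unify(adjective_type, t) is not None]
```

With beautiful(x :: entity), both readings survive, so the ambiguity is retained; with blonde(x :: human), unification with the activity fails and only the Sara reading remains.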
  • 60. Copyright © 2017 WALID S. SABA a brief detour
  • 61. a temporary diversion: what paradox of the ravens?
Before we continue with our proposal, we would like to illustrate the utility of separating concepts into logical and ontological concepts. We will do this by proposing a solution to the so-called Paradox of the Ravens. Introduced in the 1940s by the logician (and one-time assistant of Rudolf Carnap) Carl Gustav Hempel, the Paradox of the Ravens (or Hempel's Paradox, or the Paradox of Confirmation) has continued to occupy logicians, statisticians, and philosophers of science to this day. The paradox arises when one considers what counts as evidence for a statement (or hypothesis). To illustrate the Paradox of the Ravens, consider the following:
(H1) All ravens are black
(H2) All non-black things are not ravens
That is, we have the hypothesis H1 that 'All ravens are black'. This hypothesis, however, is logically equivalent (by contraposition) to the hypothesis H2 that 'All non-black things are not ravens'. Since H1 and H2 are logically equivalent, any evidence/observation that confirms H1 must also confirm H2, and vice versa. While it sounds reasonable that observing black ravens should confirm H1, observing a white ball or a red sofa, which does confirm H2, also confirms the logically equivalent hypothesis that all ravens are black, which does not sound plausible.
  • 62. what paradox of the ravens?
Observing black ravens confirms hypothesis H1, namely that 'All ravens are black' (the case in (a)). Observing non-black objects that are not ravens, as in (b), however, confirms hypothesis H2 (that all non-black things are not ravens). But H2 is logically equivalent to H1, leaving us with the unpleasant conclusion that observing red apples, blue suede shoes, or brown briefcases confirms the hypothesis that 'All ravens are black'.
(figures (a) and (b) omitted)
  • 63. a temporary diversion: what paradox of the ravens?
Many solutions have been proposed to the Paradox of the Ravens, ranging from accepting the paradox (that observing red apples and other non-black non-ravens does confirm the hypothesis 'All ravens are black') to proposals in the Bayesian tradition that try to measure the 'degree' of confirmation. The Bayesian proposals essentially amount to saying that observing a red apple does confirm the hypothesis 'All ravens are black', but does so very minimally, and certainly much less than the observation of a black raven does. Clearly, this is not a satisfactory solution, since observing a red flower should not contribute at all to the confirmation of 'All ravens are black'. Worse, in the Bayesian analysis the observation of black but non-raven objects actually negatively confirms (or disconfirms) the hypothesis that 'All ravens are black'. One logician who stands out in suggesting an explanation for the Paradox of the Ravens is W. V. Quine, who suggested (in 'Natural Kinds') that there is no paradox in the first place, since universal statements of the form 'All Fs are Gs' can only be confirmed on what he called natural kinds, and 'non-black things' and 'non-ravens' are not natural kinds. Basically, for Quine, members of a natural kind must share most of their properties, and there is hardly anything similar between all 'non-black things', or all non-ravens. While statistical/Bayesian and other logical proposals have still not suggested a reasonable explanation for the Ravens Paradox, we believe that the line of thought Quine was pursuing is the most appropriate. However, Quine's natural kinds were not well defined. In fact, what Quine was probably alluding to is that there is a difference between what we have called here logical concepts and ontological concepts.
  • 64. a temporary diversion: what paradox of the ravens?
The so-called Paradox of the Ravens exists simply because both ontological and logical concepts are mistakenly represented by predicates, although, ontologically, these two types of concepts are quite different. First, let us discuss some predicates and how we usually represent them in first-order logic. Consider the following:
(1) black(x)  (2) imminent(x)  (3) sympathetic(x)  (4) hungry(x)  (5) dog(x)  (6) guitar(x)
Suppose now that we would like to add types to our variables; that is, we would like our logical expressions to be, in computer programming terminology, strongly typed. Suppose, further, that we also want our predicates to be polymorphic; that is, to apply to objects of a certain type and all of its subtypes. Thus, if a predicate applies to objects of type vehicle, then it applies to all subtypes of vehicle (e.g., car, truck, bus, ...). Given this, what are the appropriate types that one might associate with the variables of the predicates above? Here are some possible type assignments:
(1) black(x :: physical)  (2) imminent(x :: event)  (3) sympathetic(x :: human)  (4) hungry(x :: animal)
  • 65. a temporary diversion: what paradox of the ravens?
What the above suggests is that, ignoring metaphor for the moment, the predicate black applies to objects that are of type physical. In other words, black is meaningless (or nonsensical) when applied to (or said of) objects that are not of type physical. Similarly, the above says that imminent is said of objects that are of type event (and, of course, all its subtypes, so we can say 'an imminent trip', 'an imminent meeting', 'an imminent election', etc.). In the same vein, the above says that sympathetic is said of objects that must be of type human, and that hungry applies to objects of type animal. But how about the predicates in (5) and (6)? What are the most appropriate types that can be associated with the variables in the predicates dog(x) and guitar(x); that is, of what types of objects can these predicates be meaningfully said? The only plausible answer seems to be dog(x :: dog) and guitar(x :: guitar). But these are obvious tautologies, since, for example, the predicate dog applied to an object of type dog is always true. Clearly, then, (5) and (6) are quite different from the predicates in (1) through (4): while the predicates in (1) through (4) are logical concepts, dog and guitar are not predicates/logical concepts, but ontological concepts that correspond to types in a strongly-typed ontology. With this background, let us now go back to the so-called Paradox of the Ravens.
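The distinction can be made concrete with a small sketch, under an assumed fragment hierarchy: each logical concept carries the type its argument must have, while an 'ontological' predicate like dog would require type dog and so is trivially true of everything it can sensibly apply to, a sign that it belongs in the type hierarchy rather than among the predicates.

```python
# Assumed fragment hierarchy for illustration (child -> parent).
PARENT = {"human": "animal", "animal": "physical", "dog": "animal",
          "car": "physical", "trip": "event", "event": "entity",
          "physical": "entity", "entity": "thing"}

def subsumes(super_type, sub_type):
    """True iff sub_type is (transitively) a super_type."""
    t = sub_type
    while t is not None:
        if t == super_type:
            return True
        t = PARENT.get(t)
    return False

# Each logical concept paired with the type its argument must have;
# 'dog' is included only to expose the tautology discussed above.
PRED_TYPE = {"black": "physical", "imminent": "event",
             "sympathetic": "human", "hungry": "animal",
             "dog": "dog"}

def sensible(pred, obj_type):
    """Is pred meaningful (true-or-false, rather than nonsense) when
    said of objects of obj_type?"""
    return subsumes(PRED_TYPE[pred], obj_type)
```

Here sensible("black", "car") and sensible("imminent", "trip") hold, sensible("imminent", "car") fails, and sensible("dog", "dog") is vacuous, which is exactly why dog is treated as a type and not as a predicate.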
  • 66–69. a temporary diversion: what paradox of the ravens? (figure-only slides)
  • 70–76. the framework: failed type unifications (figure-only slides)
  • 77. the framework: salient properties/relations (figure-only slide)
  • 78–79. the framework: failed type unifications (figure-only slides)
  • 80. Copyright © 2017 WALID S. SABA ontological semantics: contents the road ahead
  • 82–84. the proposal: word-sense disambiguation (figure-only slides)
  • 85. the proposal: word-sense disambiguation
Let us now look at situations where lexical ambiguities translate into ambiguities in both logical and ontological concepts. Consider the sentences in (10) and (11):
(10) Melinda ran for twenty minutes.
(11) The program ran for twenty minutes.
First of all, there is a clear ambiguity in the meaning of 'program', as it could refer to a computer program (i.e., a process), or to the program of some event, among other meanings. Second, it is clear that the running of Melinda in (10) is different from the running of the program in (11). Let us consider the simpler of these two cases, namely the ambiguity in (10), assuming that there are (at least) two kinds of running activities: one whose agent is a (legged) animal, and one whose agent is a process. What the above says is the following: there is a unique object named Melinda, some twenty minutes during which Melinda ran, and either a running activity of some human, or the running of some process.
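The disambiguation of 'ran' can be sketched in the same style as the earlier examples: each sense of run constrains the type of its agent, and unification failure eliminates the senses that do not fit. The sense names and type placements below are assumptions for the illustration.

```python
# Assumed fragment hierarchy for illustration (child -> parent).
PARENT = {"human": "animal", "animal": "physical", "process": "entity",
          "physical": "entity", "entity": "thing"}

def subsumes(super_type, sub_type):
    """True iff sub_type is (transitively) a super_type."""
    t = sub_type
    while t is not None:
        if t == super_type:
            return True
        t = PARENT.get(t)
    return False

def unify(s, t):
    """Type unification; None plays the role of failure."""
    if subsumes(s, t):
        return t
    if subsumes(t, s):
        return s
    return None

# Two assumed senses of 'run' and the agent type each requires:
RUN_SENSES = {"run(legged-motion)": "animal", "run(execution)": "process"}

def run_readings(agent_type):
    """Senses of 'run' whose required agent type unifies with the
    subject's type."""
    return [sense for sense, required in RUN_SENSES.items()
            if unify(required, agent_type) is not None]
```

With Melinda of type human, only the legged-motion sense survives; with a subject of type process, only the execution sense does.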
  • 86–87. the proposal (figure-only slides)
  • 88. Copyright © 2017 WALID S. SABA fragment of the ontology
  • 89–96. the proposal: word-sense disambiguation (figure-only slides)
  • 97. the proposal
'The corner table wants another beer' – tables have wants, and they drink beer?!
  • 101. Copyright © 2017 WALID S. SABA To be continued ...