Università degli studi di Bari “Aldo Moro”
Dipartimento di Informatica
The 26th International Conference On Industrial, Engineering & Other Applications of Applied Intelligent Systems - IEA/AIE 2013
Amsterdam, The Netherlands, June 17-21, 2013
L.A.C.A.M.
http://lacam.di.uniba.it
An Approach to Automated Learning of
Conceptual Graphs from Text
Fulvio Rotella, Stefano Ferilli, Fabio Leuzzi
{fulvio.rotella, stefano.ferilli, fabio.leuzzi}@uniba.it
Overview
● Introduction
● Our Framework
● Goals
● Proposal
● Conceptual Graph Construction
● Knowledge Representation Formalism
● Approaching missing/partial knowledge
● Probabilistic Reasoning by Association
● Qualitative Evaluations
● Conclusions & Future Works
Introduction
The spread of electronic documents and document repositories has
generated the need for automatic techniques to
● understand
● handle
document content, in order to help users satisfy their
information needs.
Full Text Understanding is not trivial, due to:
● intrinsic ambiguity of natural language
● huge amount of common sense and conceptual background
knowledge
To face these problems,
lexical and/or conceptual taxonomies are useful,
even though building them manually is very costly and error-prone.
Our framework*
1. Capable of building a conceptual network
   ● Syntactic analysis by Stanford Parser [1] and Stanford
     Dependencies [2]
   ● Handles positive/negative and active/passive forms of sentences
   ● Relationships between subject and (direct/indirect) object
2. Performs generalizations to tackle data poorness and thus to enrich
the graph
3. Performs reasoning ‘by association’ to look for relationships
between concepts
(*) F. Leuzzi, S. Ferilli, F. Rotella, “Improving Robustness and Flexibility of Concept Taxonomy Learning from Text”, New Frontiers in
Mining Complex Patterns, pp. 170-184, 2013, ISBN 978-3-642-37381-7
Our framework
Limits
● Anaphoras not handled
● Concept clustering using flat/vectorial representations
● Concept generalization based on external resources
(e.g. WordNet [3])
● Focused mainly on the definitional portion of the network
Goals and Proposal
1. Automated learning of conceptual graphs from restricted
collections
2. Exploiting probabilistic reasoning ‘by association’ on extracted
knowledge
● To exploit an anaphora resolution strategy
● To face missing/partial knowledge by applying relational
clustering
● To avoid the use of external resources to generalize similar
concepts
Conceptual Graph Construction
STEP 1: Pre-processing
input texts → JavaRAP [4] → texts w/o anaphoras
STEP 2: Sentence elaboration
Stanford Parser → Stanford Dependencies
The final output is a typed syntactic structure of each sentence.
Knowledge representation formalism
● Only subject, verb and complement have been considered.
● Subjects/complements represent concepts, verbs
express relations between them.
● Indirect complements are treated as direct ones by
embedding the corresponding preposition into the verb.
● The frequency of each arc in positive and negative
sentences has been taken into account.
[Figure: graph with nodes (subject, complement, ...) and arcs (subject, verb, ..., complement)]
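The representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout `(subject, verb, preposition, complement, is_positive)`, the function name `build_graph` and the sample sentences are hypothetical and do not come from the authors' system.

```python
from collections import defaultdict

def build_graph(triples):
    """Aggregate parsed sentences into arcs with positive/negative frequencies."""
    # arcs[(subject, relation, complement)] = [positive count, negative count]
    arcs = defaultdict(lambda: [0, 0])
    for subj, verb, prep, compl, positive in triples:
        # An indirect complement is treated as a direct one by embedding
        # its preposition into the verb label ("come" + "from" -> "come_from").
        relation = f"{verb}_{prep}" if prep else verb
        arcs[(subj, relation, compl)][0 if positive else 1] += 1
    return dict(arcs)

# Hypothetical input, as it might come out of the parsing step.
triples = [
    ("dog", "eat", None, "bone", True),
    ("dog", "eat", None, "bone", True),
    ("dog", "eat", None, "mouse", False),
    ("kid", "come", "from", "school", True),
]
graph = build_graph(triples)
```

Keeping the two counters separate per arc is what later allows distinguishing, for instance, typical from rare or prohibited relations.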
Approaching
missing/partial knowledge
The quality of the results of reasoning on the network depends on
the processed texts (plus NOISE)
e.g. if two nodes belong to disjoint graph regions, reasoning cannot
succeed
New Relational Generalization Approach
Concepts Description + Concepts Clustering + Generalization operator
Relational Concept Description
1. Weak components of the graph are extracted using JUNG [5]
   ● a weak component is a maximal sub-graph in which at least one path exists between
     each pair of vertices (ignoring arc direction)
2. For each concept, the k-neighborhood around it is extracted
   ● i.e. the sub-graph induced by the set of concepts that are k or fewer
     hops away from it
3. The Conceptual Graph is translated into a set of Horn clauses:
● <subj, verb_{pos,neg}, compl> → {pos, neg}_verb(subj, compl)
● e.g. dog eats bone → pos_eat(dog, bone)
● concept(X) :- rela(X,Y), relb(Z,X), relc(Y,T)
● e.g. concept(dog) :-
pos_eat(dog,bone), pos_spit(cat,bone), neg_eat(dog,mouse)
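The three steps above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation (which relies on JUNG): the arc layout `(subject, verb, complement, polarity)` and all function names are assumptions, and the `parent`/`kid` arc is added only to obtain a second component.

```python
from collections import defaultdict, deque

# Arcs as (subject, verb, complement, polarity); polarity is "pos" or "neg".
ARCS = [
    ("dog", "eat", "bone", "pos"),
    ("cat", "spit", "bone", "pos"),
    ("dog", "eat", "mouse", "neg"),
    ("parent", "protect", "kid", "pos"),
]

def undirected_adjacency(arcs):
    """Adjacency sets that ignore arc direction (needed for weak components)."""
    adj = defaultdict(set)
    for subj, _verb, compl, _pol in arcs:
        adj[subj].add(compl)
        adj[compl].add(subj)
    return adj

def k_neighborhood(adj, start, k):
    """Concepts that are k or fewer hops away from start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == k:
            continue  # do not expand beyond k hops
        for nxt in adj[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return set(hops)

def weak_components(adj):
    """Maximal sets of concepts connected by some undirected path."""
    remaining, components = set(adj), []
    while remaining:
        # With k = len(adj) the neighborhood covers the whole component.
        comp = k_neighborhood(adj, next(iter(remaining)), len(adj))
        components.append(comp)
        remaining -= comp
    return components

def describe(arcs, concept, k):
    """Horn clause whose body lists the arcs induced by the k-neighborhood."""
    nodes = k_neighborhood(undirected_adjacency(arcs), concept, k)
    body = [f"{pol}_{verb}({s},{c})" for s, verb, c, pol in arcs
            if s in nodes and c in nodes]
    return f"concept({concept}):-" + ",".join(body)

components = weak_components(undirected_adjacency(ARCS))
clause = describe(ARCS, "dog", 2)
```

With `k = 2` the clause for `dog` matches the example in the list above; with `k = 1` the literal about `cat` drops out, since `cat` is two hops away.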
Relational Pairwise clustering
Exploits the relational representation of concepts
The similarity measure fs (formulae similitudo) [6] provides a relational
similarity evaluation between concept descriptions.
concept(X):-
rela(X,Y),
relb(Z,X),
relc(Y,T).
concept(K):-
relb(K,Y),
reld(Z,K),
relf(Y,T),
rela(Z,T).
fs( C',C'' )
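To make the clustering step concrete, here is a rough sketch of threshold-based pairwise grouping. Important assumptions: `stand_in_similarity` is a simple Jaccard overlap used as a placeholder and is NOT the fs measure of [6] (its range is [0,1], not ]0,4[), and the grouping rule (concepts end up together when linked by a chain of pairs scoring at or above the threshold) is only one plausible reading of "pairwise clustering". All names are hypothetical.

```python
from itertools import combinations

def signature(concept, literals):
    """Set of (predicate, argument position) pairs in which the concept occurs."""
    return {(pred, pos) for pred, args in literals
            for pos, arg in enumerate(args) if arg == concept}

def stand_in_similarity(sig_a, sig_b):
    """Jaccard overlap; a placeholder, NOT the fs measure of [6]."""
    union = sig_a | sig_b
    return len(sig_a & sig_b) / len(union) if union else 0.0

def pairwise_clusters(descriptions, threshold):
    """Group concepts linked by pairs whose similarity is >= threshold."""
    sigs = {c: signature(c, lits) for c, lits in descriptions.items()}
    parent = {c: c for c in descriptions}  # union-find forest

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for a, b in combinations(descriptions, 2):
        if stand_in_similarity(sigs[a], sigs[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for c in descriptions:
        clusters.setdefault(find(c), set()).add(c)
    return list(clusters.values())

# Toy descriptions: concept -> list of (predicate, (arg1, arg2)) literals.
descriptions = {
    "visible": [("impact", ("internet", "visible")), ("offer", ("internet", "visible"))],
    "textual": [("impact", ("internet", "textual")), ("offer", ("internet", "textual"))],
    "bone": [("eat", ("dog", "bone"))],
}
clusters = pairwise_clusters(descriptions, threshold=0.5)
```

Swapping in a real relational similarity only requires replacing `stand_in_similarity` and choosing a threshold on its scale.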
Generalization of cluster
Previous approach: generalization taking advantage of an external resource
Problem: often not available for specific domains!
Solution: generalize each cluster using the maximum set of common
descriptors of each concept
1. Performing the logical generalization operator in [7]
   ● a least general generalization (lgg) under θOI-subsumption
     of two clauses is a generalization which is not more general
     than any other such generalization, that is, it is either more
     specific than or not comparable to any other such
     generalization.
2. Exploitable for:
   ● retrieval of documents of interest
   ● introducing new taxonomical relationships
   ● shifting of the representation when needed (abstraction)
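The idea of keeping the common descriptors of a cluster can be illustrated with a deliberately simplified sketch. It is not the operator of [7]: it replaces each described concept with the head variable X, intersects the resulting literal sets, and then turns the remaining constants into distinct variables (distinct constants stay distinct, in the spirit of Object Identity). The result is a generalization of every clause in the cluster, but not necessarily a least general one. Function names and the literal layout `(predicate, arg1, arg2)` are assumptions.

```python
def lift(concept, literals):
    """Replace the described concept with the head variable X."""
    return {(pred, "X" if a == concept else a, "X" if b == concept else b)
            for pred, a, b in literals}

def generalize(cluster):
    """cluster: {concept: set of (predicate, arg1, arg2) literals}."""
    lifted = [lift(c, lits) for c, lits in cluster.items()]
    common = set.intersection(*lifted)
    # Distinct constants get distinct variables (Y0, Y1, ...).
    constants = sorted({t for _, a, b in common for t in (a, b) if t != "X"})
    variables = {const: f"Y{i}" for i, const in enumerate(constants)}
    body = sorted((p, variables.get(a, a), variables.get(b, b))
                  for p, a, b in common)
    # Literals of each concept that the generalization does not account for.
    uncovered = {c: {lit for lit in lits if not lift(c, {lit}) <= common}
                 for c, lits in cluster.items()}
    return body, uncovered

cluster = {
    "kid": {("protect", "parent", "kid"), ("use", "parent", "kid"),
            ("teach", "kid", "school")},
    "guru": {("protect", "parent", "guru"), ("use", "parent", "guru"),
             ("become", "teenager", "guru")},
}
body, uncovered = generalize(cluster)
```

Here `body` corresponds to `concept(X) :- protect(Y0,X), use(Y0,X)`, and `uncovered` lists, per concept, the literals left outside the generalization.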
Probabilistic reasoning ‘by association’
● Reasoning ‘by association’ means:
   ● finding a path of pairwise related concepts that establishes an
     indirect interaction between two concepts c′ and c′′
● Real-world data is noisy and uncertain
   ● Logical reasoning is conclusive, hence the need for a probabilistic approach
   ● Exploit soft relationships among concepts
● Two strategies, (B) and (D):
   ● (B) works in breadth and aims at obtaining the minimal path between
     concepts together with all involved relations
   ● (D) works in depth and exploits ProbLog [8] in order to allow
     probabilistic queries on the conceptual graph
Breadth-First Search (B)
Given two nodes (concepts):
1. a Breadth-First Search starts from both nodes
2. the former searches the latter's frontier and vice versa
3. until the two frontiers meet at common nodes
Then the path is restored going backward to the roots in both
directions.
We also provide:
● the number of positive/negative instances
● the corresponding ratios over the total
Different gradations of actions between two concepts:
● permitted
● prohibited
● typical
● rare
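Strategy (B) can be sketched as a bidirectional breadth-first search. The code below is an illustration under assumptions, not the authors' code: the graph is treated as undirected and unweighted, each side expands one full level at a time (so the first meeting point yields a minimal path), and the per-arc ratio computed in `annotate` is one possible reading of "ratios over the total". All names and counts are hypothetical.

```python
from collections import deque

def _join(parents, meeting):
    """Walk back from the meeting node to both roots and glue the halves."""
    left, node = [], meeting
    while node is not None:
        left.append(node)
        node = parents[0][node]
    right, node = [], parents[1][meeting]
    while node is not None:
        right.append(node)
        node = parents[1][node]
    return left[::-1] + right

def bidirectional_bfs(adj, source, target):
    """Return a minimal path between source and target, or None."""
    if source == target:
        return [source]
    parents = ({source: None}, {target: None})  # one search tree per end
    frontiers = (deque([source]), deque([target]))
    while frontiers[0] and frontiers[1]:
        for side in (0, 1):
            for _ in range(len(frontiers[side])):  # expand one full level
                node = frontiers[side].popleft()
                for nxt in adj.get(node, ()):
                    if nxt in parents[side]:
                        continue
                    parents[side][nxt] = node
                    if nxt in parents[1 - side]:  # the two frontiers meet
                        return _join(parents, nxt)
                    frontiers[side].append(nxt)
    return None

def annotate(path, counts):
    """Relations along the path with positive/negative counts and ratios."""
    steps = []
    for a, b in zip(path, path[1:]):
        for (s, rel, c), (pos, neg) in counts.items():
            if {s, c} == {a, b}:
                total = pos + neg
                steps.append((s, rel, c, pos, neg, pos / total, neg / total))
    return steps

adj = {"dog": {"bone", "mouse"}, "bone": {"dog", "cat"},
       "cat": {"bone"}, "mouse": {"dog"},
       "parent": {"kid"}, "kid": {"parent"}}
counts = {("dog", "eat", "bone"): (2, 0), ("cat", "spit", "bone"): (1, 0),
          ("dog", "eat", "mouse"): (0, 1)}
path = bidirectional_bfs(adj, "dog", "cat")
steps = annotate(path, counts)
```

A `None` result corresponds to the situation mentioned earlier in the deck, where two concepts lie in disjoint regions of the graph and reasoning cannot succeed.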
ProbLog Inference Engine (D)
A formalism based on the ProbLog language has been defined: f :: p
● f is a ground atom:
link(subject,verb,object)
● p is the ratio between:
the sum of all ground atoms for which f holds
and
the sum of all possible links between subject and complement
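As a small worked example of the ratio p, the sketch below counts how often each link was observed and divides by the number of all links observed between the same subject and complement. It only generates text: it does not call the ProbLog engine. Note that ProbLog source files write the probability first (`p::f.`), so the emitted facts use that order. The function name and the observation counts are made up for illustration.

```python
from collections import Counter, defaultdict

def problog_facts(observations):
    """observations: iterable of (subject, verb, complement) occurrences."""
    counts = Counter(observations)
    # Total number of links observed between each (subject, complement) pair.
    totals = defaultdict(int)
    for (subj, _verb, compl), n in counts.items():
        totals[(subj, compl)] += n
    return [
        f"{n / totals[(s, c)]:.2f}::link({s},{v},{c})."
        for (s, v, c), n in sorted(counts.items())
    ]

# Hypothetical counts: "protect" seen three times, "use" once.
observations = [("parent", "protect", "kid")] * 3 + [("parent", "use", "kid")]
facts = problog_facts(observations)
```

The resulting lines could then be loaded into a ProbLog program together with path rules and queries, which are outside the scope of this sketch.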
[Figure: strategies (B) and (D) (*)]
(*) F. Leuzzi, S. Ferilli, F. Rotella, “Improving Robustness and Flexibility of Concept Taxonomy Learning from Text”, New Frontiers in Mining Complex Patterns, pp. 170-184, 2013, ISBN 978-3-642-37381-7
Preliminary Evaluation
Experimental setting
Goal: qualitative examination of the obtained clusters and
their generalizations.
The dataset consists of 18 documents about social networks.
The size of the dataset was deliberately kept small in order to have poor
knowledge.
The similarity function returns values in ]0,4[.
The similarity function threshold has been varied in [2.0, 2.3] with hops equal
to 0.5.
The graph built from text included:
● 695 concepts
● 727 relations
Qualitative examination
Clusters obtained by processing concepts described using one level of their
neighbourhood, with a similarity threshold equal to 2.0.
Apply lgg under θOI to cluster #35.
concept(X) :- impact(Y,X), signal(Y,X), signal_as(Y,X), do_with(Y,X),
consider(Y,X), offer(Y,X), offer_to(Y,X), average(Y,X),
average_about(Y,X), experience(Y,X), flee_in(Y,X),
be(Y,X).
θ = < {internet/Y, visible/X}, {internet/Y, textual/X} >
Apply lgg under θOI to cluster #20.
concept(X) :- protect(Y,X), protect_by(Y,X), become(Y,X), use(Y,X),
have(Y,X), have_to(Y,X), have_in(Y,X), have_on(Y,X),
find(Y,X), go(Y,X), look(Y,X), begin(Y,X), begin_with(Y,X),
begin_about(Y,X), suspect_in(Y,X), suspect_for(Y,X).
θ = < {parent/Y, kid/X}, {parent/Y, guru/X}, {parent/Y, limit/X} >
uncovered portion of kid:
teach(kid, school), launch_about(foundation, kid), teach(kid, contrast), come_from(kid, contrast),
launch(foundation, kid), finish_in(kid, school), invite_from(school, parent), possess_to(school, parent),
invite(school, parent), finish_in(kid, side), come_from(kid, school), find_in(school, parent),
produce(school, parent), come_from(kid, side), find_from(school, parent), finish_in(kid, contrast),
invite_about(school, parent), come_before(school, parent), release(foundation, kid),
invite_to(school, parent), teach(kid, side), release_from(foundation, kid).
uncovered portion of guru: become(teenager, guru).
uncovered portion of limit: be(ability, limit), limit(ability, limit).
Remarks
The improvement can be appreciated by noting the novelty
in the method of description construction.
● Exploiting the Hamming distance, we obtained a first-level, relation-centric description
(i.e. the concept with its direct relations)
● Exploiting our method, we obtained a concept-centric description
(i.e. direct and indirect relations between the first-level concepts)
The results suggest that the procedure is reliable in
recognizing similar concepts on the basis of their structural position in
the graph.
Conclusions
This work proposes an approach to automatically learn conceptual
graphs from text, avoiding the support of external resources.
It works by mixing different techniques.
● It exploits an anaphora resolution technique
● It applies relational clustering to group similar concepts
● It generalizes each cluster to obtain new concepts
● Such concepts can be used to:
   ● build taxonomic relations
   ● bridge disjoint portions of the graph
Preliminary experiments show that this approach can be viable
although extensions and refinements are needed.
Future work
1. Enrich the Conceptual Graph with more information
   ● Collocation extraction
   ● Identification of compound concepts (e.g. House of Representatives)
   ● Identification of concept attributes (e.g. adjectives) and properties
     (e.g. can(John, eat))
2. Perform more extensive experiments, adopting datasets available
online, in order to study the behaviour of the system and its limits
3. Automatic setting of suitable thresholds for searching
generalizations
4. Exploiting more than one level of concept description, interesting
results can be achieved
Thanks for attention
Questions?
References
[1] D. Klein and C. D. Manning. Fast exact inference with a factored model for natural
language parsing. In Advances in Neural Information Processing Systems, volume 15.
MIT Press, 2003.
[2] M.C. de Marneffe, B. MacCartney, and C. D. Manning. Generating typed dependency
parses from phrase structure trees. In LREC, 2006.
[3] C. Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, Cambridge,
MA, 1998.
[4] L. Qiu, M.Y. Kan, and T.S. Chua. A public reference implementation of the RAP
anaphora resolution algorithm. In Proceedings of LREC 2004, pages 291–294, 2004.
[5] J. O’Madadhain, D. Fisher, S. White, and Y. Boey. The JUNG (Java Universal
Network/Graph) Framework. Technical report, UCI-ICS, October 2003.
[6] S. Ferilli, T. M. A. Basile, M. Biba, N. Di Mauro, and F. Esposito. A general similarity
framework for Horn clause logic. Fundam. Inf., 90(1-2):43–66, January 2009.
[7] G. Semeraro, F. Esposito, D. Malerba, N. Fanizzi, and S. Ferilli. A logic framework for the
incremental inductive synthesis of Datalog theories. In Norbert E. Fuchs, editor,
LOPSTR, volume 1463 of LNCS, pages 300–321. Springer, 1997.
[8] L. De Raedt, A. Kimmig, and H. Toivonen. ProbLog: a probabilistic Prolog and its
application in link discovery. In Proc. of 20th IJCAI, pages 2468–2473. AAAI Press,
2007.

 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Último (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

An Approach to Automated Learning of Conceptual Graphs from Text

  • 1. An Approach to Automated Learning of Conceptual Graphs from Text. Fulvio Rotella, Stefano Ferilli, Fabio Leuzzi ({fulvio.rotella, stefano.ferilli, fabio.leuzzi}@uniba.it). Università degli studi di Bari “Aldo Moro”, Dipartimento di Informatica, L.A.C.A.M. (http://lacam.di.uniba.it). The 26th International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems (IEA/AIE 2013), Amsterdam, The Netherlands, June 17-21, 2013.
  • 2. Overview: ● Introduction ● Our Framework ● Goals ● Proposal ● Conceptual Graph Construction ● Knowledge Representation Formalism ● Approaching missing/partial knowledge ● Probabilistic Reasoning by Association ● Qualitative Evaluations ● Conclusions & Future Work
  • 3. Introduction: The spread of electronic documents and document repositories has generated the need for automatic techniques to ● understand and ● handle document content, in order to help users satisfy their information needs. Full Text Understanding is not trivial, due to: ● the intrinsic ambiguity of natural language ● the huge amount of common sense and conceptual background knowledge it requires. To face these problems, lexical and/or conceptual taxonomies are useful, even though building them manually is very costly and error-prone.
  • 4. Our framework (*): 1. Capable of building a conceptual network: ● syntactic analysis by Stanford Parser [1] and Stanford Dependencies [2] ● handles positive/negative and active/passive forms of sentences ● relationships between subject and (direct/indirect) object. 2. Performs generalizations to tackle data scarcity and thus enrich the graph. 3. Performs reasoning ‘by association’ to look for relationships between concepts. (*) F. Leuzzi, S. Ferilli, F. Rotella, “Improving Robustness and Flexibility of Concept Taxonomy Learning from Text”, New Frontiers in Mining Complex Patterns, pp. 170-184, 2013, ISBN 978-3-642-37381-7
  • 5. Our framework: Limits ● Anaphoras not handled ● Concept clustering using flat/vectorial representations ● Concept generalization based on external resources (e.g. WordNet [3]) ● Focused mainly on the definitional portion of the network
  • 6. Goals and Proposal. Goals: 1. Automated learning of conceptual graphs from restricted collections 2. Exploiting probabilistic reasoning ‘by association’ on the extracted knowledge. Proposal: ● To exploit an anaphora resolution strategy ● To face missing/partial knowledge by applying relational clustering ● To avoid the use of external resources to generalize similar concepts
  • 7. Conceptual Graph Construction. STEP 1: Pre-processing (JavaRAP [4]): input texts → texts w/o anaphoras. STEP 2: Sentence elaboration (Stanford Parser, Stanford Dependencies). The final output is a typed syntactic structure of each sentence.
  • 8. Knowledge representation formalism ● Only subject, verb and complement have been considered. ● Subjects/complements represent concepts; verbs express relations between them. ● Indirect complements are treated as direct ones by embedding the corresponding preposition into the verb. ● The frequency of each arc in positive and negative sentences has been taken into account. (Diagram labels: subject, verb, complement.)
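The representation on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `add_sentence`, the `verb_preposition` naming scheme and the sample sentences are hypothetical, not taken from the authors' system.

```python
from collections import Counter

def add_sentence(arcs, subject, verb, complement, preposition=None, positive=True):
    """Record one <subject, verb, complement> arc.

    Positive and negative sentences are counted separately, so each arc
    carries its own frequency for both polarities.
    """
    # An indirect complement is treated as a direct one by embedding
    # the preposition into the verb (assumed naming: verb_preposition).
    relation = f"{verb}_{preposition}" if preposition else verb
    polarity = "pos" if positive else "neg"
    arcs[(subject, relation, complement, polarity)] += 1

# Hypothetical sentences, already reduced to subject/verb/complement.
arcs = Counter()
add_sentence(arcs, "dog", "eat", "bone")
add_sentence(arcs, "dog", "eat", "bone")
add_sentence(arcs, "dog", "eat", "mouse", positive=False)
add_sentence(arcs, "kid", "finish", "school", preposition="in")

for arc, frequency in sorted(arcs.items()):
    print(arc, frequency)
```

A `Counter` keyed by the whole arc keeps concepts as plain strings and leaves the choice of graph library open.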
  • 9. Approaching missing/partial knowledge: The quality of the results of reasoning on the network depends on the processed texts, which also carry noise; e.g. if two nodes belong to disjoint graph regions, reasoning cannot succeed. New relational generalization approach: concept description + concept clustering + generalization operator.
  • 10. Relational Concept Description. 1. Weak components of the graph are extracted by JUNG [5] ● a weak component is a maximal sub-graph in which at least one path exists between each pair of vertices. 2. For each concept, the k-neighborhood around it is extracted ● a sub-graph induced by the set of concepts that are k or fewer hops away from it. 3. The conceptual graph is translated into a set of Horn clauses: ● <subj, verb_{pos,neg}, compl> → {pos, neg}_verb(subj, compl) ● e.g. dog eats bone → pos_eat(dog, bone) ● concept(X) :- rela(X,Y), relb(Z,X), relc(Y,T) ● e.g. concept(dog) :- pos_eat(dog,bone), pos_spit(cat,bone), neg_eat(dog,mouse)
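A minimal Python sketch of steps 2 and 3 follows. It assumes arcs are stored as (subject, verb, complement, polarity) tuples and that hops ignore arc direction; the sample arcs and function names are hypothetical, and the authors' system uses JUNG rather than this code.

```python
from collections import defaultdict, deque

# Hypothetical arcs: (subject, verb, complement, polarity)
ARCS = [
    ("dog", "eat", "bone", "pos"),
    ("cat", "spit", "bone", "pos"),
    ("dog", "eat", "mouse", "neg"),
    ("parent", "protect", "kid", "pos"),
]

def k_neighborhood(arcs, concept, k):
    """Concepts that are k or fewer hops away from `concept` (direction ignored)."""
    adjacent = defaultdict(set)
    for subj, _verb, compl, _pol in arcs:
        adjacent[subj].add(compl)
        adjacent[compl].add(subj)
    seen = {concept}
    frontier = deque([(concept, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand beyond k hops
        for nxt in adjacent[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return seen

def describe(arcs, concept, k=1):
    """Horn clause whose body lists the arcs of the sub-graph induced by the k-neighborhood."""
    nodes = k_neighborhood(arcs, concept, k)
    body = [f"{pol}_{verb}({subj},{compl})"
            for subj, verb, compl, pol in arcs
            if subj in nodes and compl in nodes]
    return f"concept({concept}):- " + ",".join(body)

print(describe(ARCS, "dog", k=2))
```

With these sample arcs and k=2 the body contains the same three literals as the dog example on the slide; with k=1 the `pos_spit(cat,bone)` literal is left out because `cat` is two hops away from `dog`.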
  • 11. Relational pairwise clustering: exploits the relational representation of concepts. The similarity formula similitudo [6] provides a relational similarity evaluation fs(C', C'') between two concept descriptions, e.g. C' = concept(X) :- rela(X,Y), relb(Z,X), relc(Y,T). and C'' = concept(K) :- relb(K,Y), reld(Z,K), relf(Y,T), rela(Z,T).
  • 12. Generalization of clusters. Previous approach: generalization taking advantage of an external resource. Problem: such a resource is often not available for specific domains! Solution: generalize each cluster using the maximum set of common descriptors of each concept.
  • 13. Generalization of clusters. 1. Performing the logical generalization operator in [7]: a least general generalization (lgg) under θOI-subsumption of two clauses is a generalization which is not more general than any other such generalization, that is, it is either more specific than or not comparable to any other such generalization. 2. Exploitable for: ● retrieval of documents of interest ● introducing new taxonomical relationships ● shifting of the representation when needed (abstraction)
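The idea of keeping the "maximum set of common descriptors" can be illustrated with a deliberately simplified Python sketch. Note the assumptions: this is plain set intersection after replacing only the described concept with a variable X, not the lgg under θOI-subsumption of [7]; the cluster contents and function names are hypothetical.

```python
def variabilize(literals, concept):
    """Replace the described concept with the variable X in every literal."""
    return {(rel, *("X" if arg == concept else arg for arg in args))
            for rel, *args in literals}

def generalize(cluster):
    """Return the descriptors common to all concepts and each concept's uncovered portion.

    Simplification (assumption): other constants are kept as they are,
    whereas a real lgg would also turn them into variables where needed.
    """
    lifted = {c: variabilize(lits, c) for c, lits in cluster.items()}
    common = set.intersection(*lifted.values())
    uncovered = {c: lits - common for c, lits in lifted.items()}
    return common, uncovered

# Hypothetical cluster: each concept is described by (relation, subject, complement) literals.
cluster = {
    "kid": {("protect", "parent", "kid"), ("teach", "kid", "school")},
    "guru": {("protect", "parent", "guru"), ("become", "teenager", "guru")},
}

common, uncovered = generalize(cluster)
print(common)
print(uncovered)
```

The uncovered portion returned for each concept plays the same role as the "uncovered portion" listings in the qualitative examination: it is what the generalization does not explain about that concept.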
  • 14. Probabilistic reasoning ‘by association’ ● Reasoning ‘by association’ means finding a path of pairwise related concepts that establishes an indirect interaction between two concepts c′ and c′′. ● Real-world data is noisy and uncertain. ● Logical reasoning is conclusive, hence the need for a probabilistic approach. ● Exploit soft relationships among concepts. ● Two strategies, (B) and (D): (B) works in breadth and aims at obtaining the minimal path between concepts together with all involved relations; (D) works in depth and exploits ProbLog [8] in order to allow probabilistic queries on the conceptual graph.
  • 15. Probabilistic reasoning ‘by association’: Breadth-First Search (B). Given two nodes (concepts): 1. a Breadth-First Search starts from both nodes 2. the former searches the latter's frontier and vice versa 3. until the two frontiers meet at common nodes. Then the path is restored by going backward to the roots in both directions.
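The three steps above can be sketched as a bidirectional breadth-first search in Python. The adjacency dictionary, the concept names and the function names are hypothetical, and the graph is assumed to be undirected; the sketch returns only the node path, without the relations that strategy (B) also reports.

```python
from collections import deque

def _restore(parents, start, goal, meet):
    """Walk backward from the meeting node to both roots and join the two halves."""
    left, node = [], meet
    while node is not None:
        left.append(node)
        node = parents[start][node]
    right, node = [], parents[goal][meet]
    while node is not None:
        right.append(node)
        node = parents[goal][node]
    return left[::-1] + right

def bidirectional_bfs(adjacent, start, goal):
    """Return a shortest path between two concepts, or None if they lie in disjoint regions."""
    if start == goal:
        return [start]
    parents = {start: {start: None}, goal: {goal: None}}  # one search tree per root
    queues = {start: deque([start]), goal: deque([goal])}
    this, other = start, goal
    while queues[start] and queues[goal]:
        # Expand one full level of the current side, then switch sides.
        for _ in range(len(queues[this])):
            node = queues[this].popleft()
            for nxt in adjacent.get(node, ()):
                if nxt in parents[this]:
                    continue
                parents[this][nxt] = node
                if nxt in parents[other]:  # the two frontiers meet
                    return _restore(parents, start, goal, nxt)
                queues[this].append(nxt)
        this, other = other, this
    return None

# Hypothetical undirected concept graph with two disjoint regions.
graph = {
    "dog": ["bone"], "bone": ["dog", "cat"],
    "cat": ["bone", "mouse"], "mouse": ["cat"],
    "kid": ["school"], "school": ["kid"],
}
print(bidirectional_bfs(graph, "dog", "mouse"))
print(bidirectional_bfs(graph, "dog", "kid"))
```

The second call returns `None`, which is the situation the clustering and generalization steps are meant to mitigate: two concepts in disjoint regions cannot be connected by any path.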
  • 16. Probabilistic reasoning ‘by association’: Breadth-First Search (B). We also provide: ● the number of positive/negative instances ● the corresponding ratios over the total. Different gradations of actions between two concepts: ● permitted ● prohibited ● typical ● rare
  • 17. Probabilistic reasoning ‘by association’: ProbLog Inference Engine (D). A formalism based on the ProbLog language has been defined: f :: p ● f is a ground atom: link(subject,verb,object) ● p is the ratio between the sum of all ground atoms for which f holds and the sum of all possible links between subject and complement
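One possible reading of this ratio is sketched below in Python: for each link(subject,verb,object), p is the number of observations of that atom divided by the number of observations of any link between the same subject and complement. This reading, the function name and the sample observations are assumptions. The emitted lines follow ProbLog's own concrete syntax, which puts the probability first (`p::f.`).

```python
from collections import Counter

def problog_facts(observations):
    """Turn a list of (subject, verb, complement) observations into probabilistic facts."""
    per_link = Counter(observations)                          # occurrences of each ground atom
    per_pair = Counter((s, c) for s, _v, c in observations)   # all links between a pair
    return [f"{n / per_pair[(s, c)]:.2f}::link({s},{v},{c})."
            for (s, v, c), n in sorted(per_link.items())]

# Hypothetical observations, one per extracted sentence.
observations = (
    [("dog", "eat", "bone")] * 3
    + [("dog", "bury", "bone")]
    + [("cat", "chase", "mouse")]
)

facts = problog_facts(observations)
for fact in facts:
    print(fact)
```

The resulting lines could be placed in a ProbLog program alongside path rules in order to pose probabilistic queries on the graph; those rules are not given on the slide and are not sketched here.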
  • 18. Probabilistic reasoning ‘by association’ (*): [figure with panels (B) and (D)]. (*) F. Leuzzi, S. Ferilli, F. Rotella, “Improving Robustness and Flexibility of Concept Taxonomy Learning from Text”, New Frontiers in Mining Complex Patterns, pp. 170-184, 2013, ISBN 978-3-642-37381-7
  • 19. Preliminary Evaluation: Experimental setting. Goal: a qualitative examination of the obtained clusters and their generalizations. The dataset consists of 18 documents about social networks. The size of the dataset was deliberately kept small in order to have poor knowledge. The similarity function returns values in ]0,4[. The similarity function threshold has been set in [2.0, 2.3] with hops equal to 0.5. The graph built from the text included: ● 695 concepts ● 727 relations
  • 20. Preliminary Evaluation: Qualitative examination. Clusters obtained by processing concepts described using one level of their neighbourhood, with a similarity threshold equal to 2.0.
  • 21. Preliminary Evaluation: Qualitative examination. Applying the lgg under θOI to cluster #35: concept(X) :- impact(Y,X), signal(Y,X), signal_as(Y,X), do_with(Y,X), consider(Y,X), offer(Y,X), offer_to(Y,X), average(Y,X), average_about(Y,X), experience(Y,X), flee_in(Y,X), be(Y,X). θ = < {internet/Y, visible/X}, {internet/Y, textual/X} >
  • 22. Preliminary Evaluation. Applying the lgg under θOI to cluster #20: concept(X) :- protect(Y,X), protect_by(Y,X), become(Y,X), use(Y,X), have(Y,X), have_to(Y,X), have_in(Y,X), have_on(Y,X), find(Y,X), go(Y,X), look(Y,X), begin(Y,X), begin_with(Y,X), begin_about(Y,X), suspect_in(Y,X), suspect_for(Y,X). θ = < {parent/Y, kid/X}, {parent/Y, guru/X}, {parent/Y, limit/X} > Uncovered portion of kid: teach(kid,school), launch_about(foundation,kid), teach(kid,contrast), come_from(kid,contrast), launch(foundation,kid), finish_in(kid,school), invite_from(school,parent), possess_to(school,parent), invite(school,parent), finish_in(kid,side), come_from(kid,school), find_in(school,parent), produce(school,parent), come_from(kid,side), find_from(school,parent), finish_in(kid,contrast), invite_about(school,parent), come_before(school,parent), release(foundation,kid), invite_to(school,parent), teach(kid,side), release_from(foundation,kid). Uncovered portion of guru: become(teenager,guru). Uncovered portion of limit: be(ability,limit), limit(ability,limit).
  • 23. Preliminary Evaluation: Remarks. The improvement can be appreciated by noting the novelty in the way descriptions are constructed. ● Exploiting the Hamming distance, we obtained a first-level, relation-centric description (i.e. the concept with its direct relations). ● Exploiting our method, we obtained a concept-centric description (i.e. direct and indirect relations between the first-level concepts). The results show that the procedure seems to be reliable in recognizing similar concepts on the basis of their structural position in the graph.
  • 24. Conclusions. This work proposes an approach to automatically learn conceptual graphs from text, avoiding the support of external resources. It works by mixing different techniques: ● It exploits an anaphora resolution technique ● It applies relational clustering to group similar concepts ● It generalizes each cluster to obtain new concepts ● Such concepts can be used to build taxonomic relations and to bridge disjoint portions of the graph. Preliminary experiments show that this approach can be viable, although extensions and refinements are needed.
  • 25. Future work. 1. Enriching the conceptual graph with more information: ● collocation extraction ● identification of compound concepts (e.g. House of Representatives) ● identification of concept attributes (e.g. adjectives) and properties (e.g. can(John,eat)). 2. Performing more extensive experiments using datasets available online, in order to study the behaviour of the system and its limits. 3. Automatic setting of suitable thresholds for searching generalizations. 4. Exploiting more than one level of concept description, which may yield interesting results.
  • 27. References [1] D. Klein and C. D. Manning. Fast exact inference with a factored model for natural language parsing. In Advances in Neural Information Processing Systems, volume 15. MIT Press, 2003. [2] M.C. de Marneffe, B. MacCartney, and C. D. Manning. Generating typed dependency parses from phrase structure trees. In LREC, 2006. [3] C. Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA, 1998. [4] L. Qiu, M.Y. Kan, and T.S. Chua. A public reference implementation of the RAP anaphora resolution algorithm. In Proceedings of LREC 2004, pages 291–294, 2004. [5] J. O’Madadhain, D. Fisher, S. White, and Y. Boey. The JUNG (Java Universal Network/Graph) Framework. Technical report, UCI-ICS, October 2003. [6] S. Ferilli, T. M. A. Basile, M. Biba, N. Di Mauro, and F. Esposito. A general similarity framework for Horn clause logic. Fundam. Inf., 90(1-2):43–66, January 2009. [7] G. Semeraro, F. Esposito, D. Malerba, N. Fanizzi, and S. Ferilli. A logic framework for the incremental inductive synthesis of Datalog theories. In Norbert E. Fuchs, editor, LOPSTR, volume 1463 of LNCS, pages 300–321. Springer, 1997. [8] L. De Raedt, A. Kimmig, and H. Toivonen. ProbLog: a probabilistic Prolog and its application in link discovery. In Proc. of 20th IJCAI, pages 2468–2473. AAAI Press, 2007.