IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p- ISSN: 2278-8727Volume 11, Issue 2 (May. - Jun. 2013), PP 101-117
www.iosrjournals.org
Tools for Ontology Building from Texts: Analysis and
Improvement of the Results of Text2Onto
Sonam Mittal¹, Nupur Mittal²
¹Computer Science, B.K. Birla Institute of Engineering & Technology, Pilani, Rajasthan, India
²Computer Science, École Polytechnique de l'Université de Nantes, France
Abstract: Building ontologies from texts is a difficult and time-consuming process. Several tools have been developed to facilitate this process; however, they are not yet mature enough to automate every task needed to build a good ontology without human intervention. Among these tools, Text2Onto is a framework for learning ontologies from textual data. This case study aims at understanding the architecture and working principles of Text2Onto, analyzing the errors it can produce, and finding a way to reduce human intervention as well as to improve its results. Three texts of different lengths were used in the experiment. The quality of Text2Onto's results was assessed by comparing the entities extracted by Text2Onto with those extracted manually, and some causes of the errors produced by Text2Onto were identified. As an attempt to improve the results, the change-discovery feature of Text2Onto was used: a meta-model of the given text was fed to Text2Onto to obtain a POM, on top of which an ontology was built for the existing text. The meta-model ontology was intended to identify all the core concepts and relations found in the manual ontology, the ultimate objective being to improve the hierarchy of the ontology. The use of a meta-model should help to better classify the concepts under the various core concepts.
Keywords: Ontology, Text2Onto
I. Introduction
In the current scenario, the use of domain ontologies has been increasing. The general method for building such domain ontologies is to extract them from textual resources. This involves processing huge amounts of text, which makes it a difficult and time-consuming task. In order to expedite the process and support ontologists in the different phases of ontology building, several tools based on linguistic or statistical techniques have been developed. However, these tools are not fully automated yet: human intervention is required at some phases to validate their results so as to produce a good outcome. Such human intervention is not only time-consuming but also error-prone. Therefore, minimizing the human activity needed for error correction is key to enhancing these tools.
Text2Onto is a framework for learning ontologies from textual data. It can extract different ontology components such as concepts, relations, instances and hierarchies from documents. It also gives statistical values which help to understand the importance of those components in the text. However, users have to verify its results. We therefore studied this tool in order to assess how relevant its results are and to check whether they can be improved. For this purpose, the architecture and working principles of Text2Onto were first studied. Then we performed some experiments. To assess the results, we mainly considered concepts, instances and relations. We also observed the taxonomy; however, the detailed study revolved around these three components.
II. Literature Review
This section gives a brief overview of ontologies and the ontology building process, and sums up the papers [1], [3], [4], [5], [6], [7].
2.1 Ontology
An ontology is an explicit, formal (i.e. machine-readable) specification of a shared (accepted by a group or community) conceptualization of a domain of interest [2]. It should be restricted to a given domain of interest and therefore model concepts and relations that are relevant to a particular task or application domain. Ontologies are built to be reused or shared anytime, anywhere and independently of the behavior and domain of the application that uses them. The process of instantiating a knowledge base is referred to as ontology population, whereas the automatic support in ontology development is usually referred to as ontology learning. Ontology learning is concerned with knowledge acquisition.
2.2 Ontology life cycle
The ontology development process refers to the activities carried out to build ontologies from scratch [1]. In order to start the process, there is a need to plan the activities to be carried out and the resources used for them; thus an ontology specification document is prepared, recording the requirements and specifications of the development process. The process of ontology building starts with the conceptualization of the acquired knowledge in a conceptual model, in order to describe the problem and its solution with the help of some intermediate representations. Next, the conceptual models are formalized into formal or semi-formal models using frame-oriented or Description Logic (DL) representation systems. The next step is to integrate the current ontology with existing ontologies. Though this is an optional step, reusing existing ontologies should be considered in order to avoid duplicating the effort of building them. After this, the ontology is implemented in a formal language such as OWL or RDF. Once the ontology is implemented, it is evaluated to make a technical judgment with respect to a frame of reference. The ontology also needs to be documented to the best possible extent. Finally, efforts are put into maintaining and updating the ontology.
There are various ways to sequence these activities when developing an ontology. The most common among them are the waterfall life cycle and the incremental life cycle.
III. Methontology
Methontology [1] is a well-structured methodology used to build ontologies from scratch. It
follows a certain number of well-defined steps to guide the ontology development process. Methontology
follows the order of specification, knowledge acquisition, conceptualization, implementation, evaluation and
documentation activities in order to carry out the ontology development process. It also identifies the
management activities like schedule, control and quality assurance and some support activities like
integration and evaluation.
3.1 Specification
The first phase according to Methontology is specification, where an ontology specification document is produced: a formal or semi-formal document written in natural language (NL) containing information such as the purpose of the ontology, the level of formality to be implemented, the scope of the ontology and the sources of knowledge. A well-designed specification document is one in which every term is relevant, which has partial completeness, and which ensures the consistency of all the terms.
3.2 Knowledge Acquisition
The specification is followed by knowledge acquisition, which is an independent activity performed
using techniques like brainstorming, interviews, formal questions, non-structured interviews, informal text
analysis, formal text analysis, structured interviews and knowledge acquisition tools.
3.3 Conceptualization
The next step is structuring the domain knowledge in a conceptual model. This is the conceptualization step, where a glossary of terms is built, relations are identified, a taxonomy is defined, the data dictionary is implemented, and tables of rules and formulas are made. The data dictionary describes and gathers all the useful and potentially usable domain concepts, their meanings, attributes, instances, etc. The table of instance attributes provides information about each attribute and about its values at the instance level. Thus the result of this phase of Methontology is a conceptual model expressed as a set of well-defined deliverables, which allow one to assess the usefulness of the ontology and to compare its scope and completeness with those of other ontologies.
3.4 Integration
Integration is an optional step used to accelerate the process of building an ontology by merging various already existing related ontologies. This leads to the inspection of the meta-ontologies in order to find the best-suited libraries to provide term definitions. As a result, Methontology produces an integration document summarizing the meta-ontology, the names of the terms to be used from the conceptual model, and the names of the ontologies from which the corresponding definitions are taken. Methontology highly recommends the reuse of already existing ontologies.
3.5 Implementation
Implementation of the ontology is done using a formal language and an ontology development
environment which is incorporated with a lexical and syntactic analyzer so as to avoid lexical and
syntactic errors.
3.6 Evaluation
Once the ontology has been implemented, it is judged technically, which results in a small evaluation document describing the methods used to evaluate the ontology.
3.7 Documentation
Documentation should be carried out during all of the above steps. It is the summing up of the steps, procedures and results of each step in a written document.
IV. Ontology Learning Layers
Different aspects of Ontology Learning (OL) have been presented in the form of a stack in paper [6]. OL involves processing the different layers of this stack. It follows an order of identifying the terms (linguistic realizations of domain-specific concepts), finding their synonyms, categorizing them as concepts, defining concept hierarchies and relations, and describing rules in order to restrict the concepts. The different ontology components and the methods for extracting them are explained in detail in the following sections.
V. Ontology modeling components
Methontology conceptualizes ontologies with tabular and graphical intermediate representations (IRs). The components of such IRs are: concepts, relations between the concepts of the domain, instances (specializations of concepts), constants, attributes (properties of concepts in general and of instances in particular), and formal axioms and rules specified in formal or semi-formal notation using DL. These components are used to conceptualize the ontologies by performing certain tasks as proposed by Methontology.
5.1 Term
Terms are linguistic realizations of domain-specific concepts. Term extraction is a mandatory step for all aspects of ontology learning from text. The methods for term extraction are based on information retrieval, NLP research and term indexing. The state of the art is mostly to run a part-of-speech tagger over the domain corpus and then to verify the terms manually, hence constructing ad-hoc patterns. In order to automatically identify only relevant terms, a statistical processing step can be used that compares the distribution of terms between corpora.
5.2 Synonym
Finding synonyms allows the acquisition of semantic term variants within and between languages and hence helps in term translation. The main implementation integrates WordNet to obtain English synonyms. This requires word sense disambiguation algorithms to identify the synonyms according to the meaning of the word in the phrase. Clustering and related techniques can be another alternative for dynamic acquisition. The two main approaches [6] are:
1. Harris' distributional hypothesis: terms are similar in meaning to the extent that they share syntactic contexts.
2. Statistical information measures defined over the Web.
5.3 Concept
The identification of a concept should focus on providing:
1. Definition of the concept.
2. Set of concept instances i.e. its extensions.
3. A set of linguistic realizations of the concept.
Intensional concept learning includes the extraction of formal and informal definitions. An informal definition can be a textual description, whereas a formal description includes the extraction of concept properties and relations with other concepts. The OntoLearn system can be used for this purpose.
5.4 Taxonomy
The three main factors exploited to induce taxonomies are:
1. Application of lexico-syntactic patterns to detect hyponymy relations.
2. Context of synonym extraction and term clustering, mainly using hierarchical clustering.
3. A document-based notion of term subsumption.
5.5 Relation
Relations represent a type of association between concepts of the domain. Text mining using
statistical analysis with more or less complex levels of linguistic analysis is used for extracting relations.
Relation extraction is similar to the problem of acquiring selection restrictions for verb arguments in NLP. An automatic content extraction program is one kind of program used for this purpose.
5.6 Rule
These are used to infer knowledge in the ontology. The important factor for rule extraction is to
learn lexical entailment for application in question answering systems.
5.7 Formal Axiom
Formal axioms are the logical expressions that are always true and are used as constraints in
ontology. The ontologist must identify the formal axioms needed in the ontology and should describe them
precisely. Information like Name, natural language description and logic expression should be identified
for each formal axiom.
5.8 Instance
Relevant instances must be identified from the concept dictionary and recorded in an instance table. An NL tagger can be used to identify the proper nouns and hence the instances.
5.9 Constant
Constants are numeric values that do not change over time.
5.10 Attribute
Attributes describe the properties of instances and concepts. They can be instance attributes or class
attributes accordingly. Ontology development tools usually provide predefined domain-independent class
attributes for all the concepts.
VI. Ontology tools and frameworks
Several tools and frameworks have been developed to aid the ontologist in different steps of ontology building. Different tools are available for extracting ontology components from different kinds of sources such as text, semi-structured text and dictionaries. The scope of these tools varies from basic linguistic processing such as term extraction and tagging to guiding the whole ontology building process. Some of the ontology tools and frameworks are discussed in the following sections. As the scope of this study is limited to Text2Onto, we discuss it in detail; other tools are presented briefly.
VII. Text2Onto
Text2Onto [7] is a framework for learning ontologies from textual data. It is a redesign of TextToOnto and is based on the Probabilistic Ontology Model (POM), which stores the learned primitives independently of a specific Knowledge Representation (KR) language. It calculates a confidence value for each learned object to support user interaction. It also updates the learned knowledge each time the corpus changes, avoiding reprocessing it from scratch. It allows for the easy combination and execution of algorithms as well as the writing of new ones.
7.1 Architecture and Workflow
The main components of Text2Onto are the algorithms, an algorithm controller and the POM. The learning algorithms are initialized by the controller, which triggers the linguistic preprocessing of the data. Text2Onto depends on the output of Gate. During preprocessing, it calls Gate applications to:
i. tokenize the document (identifying words, spaces, tabs, punctuation marks, etc.)
ii. split sentences
iii. tag parts of speech (POS)
iv. match JAPE patterns to find noun/verb phrases
The algorithms then use the results of these applications.
Gate stores the results in an object called an Annotation Set, which is a set of Annotation objects. An Annotation object stores the following information:
a. id - a unique id assigned to the token/element
b. type - the type of the element (Token, SpaceToken, Sentence, Noun, Verb, etc.)
c. features - a map of various information, such as whether the element is a stopword and its category (or tag), e.g. NN
d. start offset - the starting position of the element
e. end offset - the ending position of the element
Text2Onto uses the 'type' property to filter the required entities and then uses the start and end offsets to find the actual word. For example, suppose our corpus begins with the following line:
Ontology evaluation is a critical task. . .
Then the word 'task' is stored in an Annotation object with type 'Token', category 'NN', start offset 34 and end offset 38. Text2Onto uses the offset values to recover the exact word.
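As an illustration, this offset-based lookup can be sketched as follows. The `Annotation` dataclass is a simplified stand-in for Gate's Annotation objects, not Gate's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    """Simplified stand-in for a Gate Annotation (not Gate's real API)."""
    id: int
    type: str              # e.g. "Token", "SpaceToken", "Sentence"
    start: int             # start offset into the document text
    end: int               # end offset (exclusive)
    features: dict = field(default_factory=dict)

text = "Ontology evaluation is a critical task."
ann = Annotation(id=5, type="Token", start=34, end=38,
                 features={"category": "NN", "stopword": False})

# Recover the exact surface form from the offsets, as Text2Onto does.
word = text[ann.start:ann.end]
assert word == "task"
```

Note that in this example the offsets 34 and 38 delimit exactly the substring "task", matching the offsets given in the text above.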
After preprocessing the corpus, the controller executes the ontology learning algorithms in the appropriate order and applies the algorithms' change requests to the POM.
The execution of the algorithms takes place in three phases: the notification phase, the computation phase and the result generation phase. In the first phase, the algorithm learns about recent changes to the corpus. In the second phase, these changes are mapped to changes with respect to the reference repository. Finally, requests for POM changes are generated from the updated content of the reference repository.
Text2Onto includes a Modeling Primitive Library (MPL) which makes the modeling primitives ontology-language independent.
7.2 POM
The POM (Probabilistic Ontology Model, also called Preliminary Ontology Model) is the basic building block of Text2Onto. It is an extensible collection of modeling primitives for different types of ontology elements or axioms, and it uses confidence and relevance annotations to capture uncertainty. It is KR-language independent and can thus be transformed into any reasonably expressive knowledge representation language such as OWL, RDFS or F-logic. The modeling primitives used in Text2Onto are as follows:
i. concepts (CLASS)
ii. concept inheritance (SUBCLASS-OF)
iii. concept instantiation (INSTANCE-OF)
iv. properties/relations (RELATION)
v. domain and range restrictions (DOMAIN/RANGE)
vi. mereological relations
vii. equivalence
The POM is traceable because, for each object, it also stores a pointer to those parts of the document from which the object was derived. It also allows multiple modeling alternatives to be maintained in parallel. Adding new primitives does not imply changing the underlying framework, which makes the POM flexible and extensible.
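A minimal sketch of such a primitive store, with confidence values and provenance pointers, might look like this. The names are illustrative and do not correspond to Text2Onto's actual Java classes:

```python
from dataclasses import dataclass, field

@dataclass
class Primitive:
    kind: str          # "CLASS", "SUBCLASS-OF", "INSTANCE-OF", "RELATION", ...
    args: tuple        # e.g. ("ontology",) or ("domain ontology", "ontology")
    confidence: float  # normalized relevance/probability in [0, 1]
    provenance: list = field(default_factory=list)  # (doc id, start, end) pointers

class POM:
    """KR-language-independent store of learned primitives (illustrative)."""
    def __init__(self):
        self.primitives = []

    def add(self, primitive):
        self.primitives.append(primitive)

    def by_kind(self, kind):
        return [p for p in self.primitives if p.kind == kind]

pom = POM()
pom.add(Primitive("CLASS", ("ontology",), 0.9, [("doc1", 0, 8)]))
pom.add(Primitive("SUBCLASS-OF", ("domain ontology", "ontology"), 0.7))
assert len(pom.by_kind("CLASS")) == 1
```

Storing provenance pointers per primitive is what makes the model traceable, and storing a flat collection of typed primitives rather than a single ontology graph is what allows alternatives to coexist.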
7.3 Data-driven Change Discovery
An important feature of Text2Onto is data-driven change discovery which prevents the whole
corpus from being processed from scratch each time it changes. When there are changes in the corpus,
Text2Onto detects the changes and calculates POM deltas with respect to the changes. As POM is
extensible, it modifies the POM without recalculating it for the whole document collection. The benefits
of this feature are that the document reprocessing time is saved and the evolution of the ontology can be
traced.
7.4 Ontology Learning Algorithms/Methods
Text2Onto combines machine learning approaches with basic linguistic approaches for learning ontologies. The different modeling primitives in the POM are instantiated and populated by different algorithms. Before populating the POM, the text documents undergo linguistic preprocessing, which is initiated by the algorithm controller. Basic linguistic preprocessing involves tokenization, sentence splitting, syntactic tagging of all the tokens by a POS tagger, and lemmatizing by a morphological analyzer or stemming by a stemmer. The output of these steps is an annotated corpus, which is then fed to a JAPE transducer to match the set of patterns required by the ontology learning algorithms. The algorithms use certain criteria to evaluate the confidence of the extracted entities. The following sections present the techniques and criteria used by these algorithms to extract the different ontology components.
7.4.1 Concepts
Text2Onto comes with three algorithms for extracting concepts: EntropyConceptExtraction, RTFConceptExtraction and TFIDFConceptExtraction. Each looks for the type 'Concept' in the Gate results.
All of these algorithms filter the same type; the only difference is the criteria they use for the probability/relevance calculation. These algorithms use statistical measures such as TFIDF (Term Frequency Inverse Document Frequency), Entropy, C-value, NC-value and RTF (Relative Term Frequency). For each term, the values of these measures are normalized to [0, 1] and used as the corresponding probability in the POM.
1. RTFConceptExtraction
This algorithm calculates the relative term frequency, which is obtained by dividing the absolute term frequency of the term t in the document d (the number of times t appears in d) by the maximum absolute term frequency in d (the number of occurrences of the most frequent term in d):

tf(t, d) = absolute term frequency of t in d / maximum absolute term frequency in d
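The RTF computation can be sketched as a plain re-implementation of this formula (not Text2Onto's code):

```python
from collections import Counter

def relative_term_frequencies(tokens):
    """Relative term frequency: the absolute frequency of each term
    divided by the maximum absolute term frequency in the document."""
    counts = Counter(tokens)
    max_freq = max(counts.values())
    return {term: freq / max_freq for term, freq in counts.items()}

doc = ["ontology", "learning", "ontology", "tool"]
rtf = relative_term_frequencies(doc)
assert rtf["ontology"] == 1.0   # the most frequent term always scores 1.0
assert rtf["tool"] == 0.5       # appears half as often as the maximum
```

Because the maximum frequency in the document is the divisor, the values are already normalized to [0, 1], which is what allows them to be used directly as probabilities in the POM.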
2. TFIDFConceptExtraction
This algorithm calculates the term frequency inverse document frequency, which is the product of TF (term frequency) and IDF (inverse document frequency). IDF is obtained by dividing the total number of documents by the number of documents containing the term, and then taking the logarithm of that quotient:

tf-idf(t, d, D) = tf(t, d) × idf(t, D)

where

idf(t, D) = log(|D| / df(t))

|D| = the total number of documents
df(t) = the number of documents containing the term t
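These two factors combine as in the following sketch (a generic textbook tf-idf, not Text2Onto's implementation; the relative-frequency variant of tf is assumed here):

```python
import math
from collections import Counter

def tf(term, doc):
    """Relative term frequency of `term` in the token list `doc`."""
    counts = Counter(doc)
    return counts[term] / max(counts.values())

def idf(term, corpus):
    """log of (total number of documents / documents containing the term)."""
    df = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / df)

def tf_idf(term, doc, corpus):
    return tf(term, doc) * idf(term, corpus)

corpus = [["ontology", "learning"], ["text", "mining"]]
# "ontology" occurs in only 1 of 2 documents, so idf = log(2) > 0
assert tf_idf("ontology", corpus[0], corpus) == math.log(2)
```

A term that appears in every document gets idf = log(1) = 0, so tf-idf correctly discounts terms that are frequent everywhere rather than characteristic of one document.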
3. EntropyConceptExtraction
This algorithm computes an entropy-based measure combining the C-value (an indicator of termhood) with the NC-value (a contextual indicator of termhood).

C-value (a frequency-based measure sensitive to multi-word terms):

C-value(a) = log2|a| · f(a)                                   if a is not nested
C-value(a) = log2|a| · ( f(a) − (1/|Ta|) Σ_{b ∈ Ta} f(b) )    otherwise

where f(a) is the frequency of a and Ta is the set of terms which contain a.

NC-value (incorporates information from context words indicating termhood):

weight(w) = t(w) / n

where t(w) is the number of times that w appears in the context of a term and n is the total number of terms considered.
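The C-value case split can be transcribed directly; the frequency and nesting tables below are hypothetical inputs standing in for the candidate-term statistics a real extractor would compute:

```python
import math

def c_value(term, freq, containers):
    """C-value of a candidate multi-word term.
    term:       tuple of words, e.g. ("ontology", "learning")
    freq:       dict mapping terms to absolute frequencies
    containers: dict mapping a term to the set of longer terms nesting it
    (both dicts are hypothetical stand-ins for corpus statistics)"""
    nested_in = containers.get(term, set())
    base = math.log2(len(term)) * freq[term]
    if not nested_in:
        return base
    # Subtract the average frequency of the longer terms containing `term`.
    return base - math.log2(len(term)) * sum(freq[b] for b in nested_in) / len(nested_in)

freq = {("ontology", "learning"): 4, ("ontology", "learning", "tool"): 2}
containers = {("ontology", "learning"): {("ontology", "learning", "tool")}}
# Nested term: log2(2) * (4 - 2/1) = 2.0
assert c_value(("ontology", "learning"), freq, containers) == 2.0
```

The subtraction discounts occurrences of a shorter candidate that only ever appear inside longer terms, which is what makes the measure sensitive to genuine multi-word terms.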
7.4.2 Instances
An algorithm called TFIDFInstanceExtraction is available in Text2Onto for extraction of
instances. It filters “Instance” type from the gate result and computes TFIDF as in
TFIDFConceptExtraction.
7.4.3 General relations
General relations are identified using linguistic approach. The algorithm SubcatRelationExtraction
filters the types “TransitiveVerbPhrase”, “IntransitivePPVerbPhrase”, and “ TransitivePPVerbPhrase”
in the Gate results which is obtained by shallow parsing to identify the following syntactical frames:
• Transitive, e.g., love (subj, obj)
• Intransitive + PP-complement, e.g., walk (subj, pp (to))
• Transitive + PP-complement, e.g., hit (subj, obj, pp (with))
For each verb phrases, it finds its subject, object and associated preposition. (By filtering Nouns and
Verbs from the sentence) and then stems them and prepares the relation.
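The subject-verb-object extraction for the transitive frame can be sketched naively over POS-tagged tokens. Real extraction uses shallow parsing via JAPE patterns over Gate annotations and is considerably richer; this toy version assumes a simple subject-verb-object word order:

```python
def transitive_relation(tagged):
    """Extract a (verb, subject, object) triple from a flat list of
    (word, POS-tag) pairs, taking the noun before the verb as subject
    and the noun after it as object. Illustrative only."""
    subj = verb = obj = None
    for word, tag in tagged:
        if tag.startswith("NN") and verb is None:
            subj = word
        elif tag.startswith("VB"):
            verb = word
        elif tag.startswith("NN") and verb is not None:
            obj = word
            break
    if subj and verb and obj:
        return (verb, subj, obj)   # relation(domain, range)
    return None

tagged = [("dog", "NN"), ("chases", "VBZ"), ("cat", "NN")]
assert transitive_relation(tagged) == ("chases", "dog", "cat")
```

In the real algorithm the extracted nouns and the verb would additionally be stemmed before the RELATION primitive is added to the POM.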
7.4.4 Subclass-of relations
Subclass-of relations identification involves several algorithms which use hypernym structure of
WordNet, match Hearst patterns and apply linguistic heuristics. The results of these algorithms are combined through combination strategies. These algorithms depend on the results of the concept extraction algorithms. The relevance calculation of one of them is presented below:
1. WordNetClassificationExtraction
This algorithm extracts subclass-of relations among the extracted concepts by identifying the hypernym structure of the concepts in WordNet. Relevance is calculated in the following manner: if a is a subclass of b, then

Relevance(a, b) = (number of synonyms of a for which b is a hypernym) / (number of synonyms of a)
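Assuming simple dictionary-based stand-ins for the WordNet lookups, this relevance ratio can be sketched as:

```python
def subclass_relevance(a, b, synonyms, hypernyms):
    """Relevance that concept a is a subclass of b, per the ratio above.
    synonyms:  dict term -> set of synonyms (senses)  -- WordNet stand-in
    hypernyms: dict synonym -> set of hypernyms       -- WordNet stand-in"""
    syns = synonyms.get(a, set())
    if not syns:
        return 0.0
    matching = sum(1 for s in syns if b in hypernyms.get(s, set()))
    return matching / len(syns)

synonyms = {"dog": {"dog", "domestic_dog"}}
hypernyms = {"dog": {"animal", "canine"}, "domestic_dog": {"canine"}}
# "canine" is a hypernym of both synonyms; "animal" of only one
assert subclass_relevance("dog", "canine", synonyms, hypernyms) == 1.0
assert subclass_relevance("dog", "animal", synonyms, hypernyms) == 0.5
```

The ratio falls in [0, 1] by construction, so it can be used directly as the confidence of the SUBCLASS-OF primitive in the POM.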
7.4.5 Instance-of relations
Lexical patterns and context similarity are taken into account for instance classification. A pattern-matching algorithm similar to the one used for discovering mereological relations is also used for instance-of relation extraction.
7.4.6 Equivalence and equality
The algorithm calculates the similarity between terms on the basis of contextual features
extracted from the corpus.
7.4.7 Disjointness
A heuristic approach based on lexico-syntactic patterns is implemented to learn disjointness. The algorithm learns disjointness from enumeration patterns of the form NounPhrase1, NounPhrase2, ..., (and/or) NounPhraseN.
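A toy regular-expression version of such an enumeration pattern is shown below; real JAPE patterns operate over annotated noun phrases rather than raw strings, so this single-word sketch only illustrates the idea that coordinated siblings are taken as candidate disjoint classes:

```python
import re

def enumeration_siblings(sentence):
    """Return the coordinated items of an enumeration like
    'NP1, NP2, ... and/or NPn' (single-word NPs only; illustrative)."""
    m = re.search(r"((?:\w+, )+\w+,? (?:and|or) \w+)", sentence)
    if not m:
        return []
    parts = re.split(r",\s*(?:and |or )?|\s+(?:and|or)\s+", m.group(1))
    return [p for p in parts if p]

siblings = enumeration_siblings("Students, teachers and researchers attend.")
assert siblings == ["Students", "teachers", "researchers"]
```

Each pair of siblings found this way would be proposed as a disjointness candidate with some heuristic confidence.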
7.4.8 Subtopic-of relations
Subtopic-of relations are discovered using a method for building concept hierarchies. There is also an algorithm for extracting this kind of relation from previously identified subclass-of relations.
7.5 NeOn Toolkit
The NeOn Toolkit is an open-source, multi-platform ontology engineering environment that provides comprehensive support for the ontology engineering life cycle. It is based on the Eclipse platform and provides various plug-ins for the different activities in ontology building. The following plug-ins are within the scope of this case study:
7.5.1 Text2Onto plug-in
It is a graphical front-end for Text2Onto that is available for the NeOn toolkit. It enables the
integration of Text2Onto into a process of semi-automatic ontology engineering.
7.5.2 LeDA Plugin
LeDA, an open-source framework for the automatic generation of disjointness axioms, is implemented in this plug-in, which was developed to support both the enrichment and the evaluation of the acquired ontologies. The plug-in facilitates customized generation of disjointness axioms for various domains by supporting both the training and the classification phase.
7.6 Ontocase
OntoCase is an approach that uses ontology patterns throughout an iterative ontology construction and evolution framework. In OntoCase the patterns constitute the backbone of the reusable solutions because they can be used directly as solutions to specific modeling problems. The central repository consists of a pattern catalogue, an ontology architecture and other reusable assets. The OntoCase cycle consists of four phases: retrieval; reuse; evaluation and revision; and discovery of new pattern candidates. The first phase corresponds to input analysis and pattern retrieval: the input is analyzed and the derived input representation is matched against the pattern base to select appropriate patterns. The second phase includes pattern specialization, adaptation and composition, and constitutes the process of reusing the retrieved patterns and constructing an improved ontology. The third phase concerns the evaluation and revision of the ontology to improve its fit to the input and its quality. The final phase includes the discovery of new pattern candidates or other reusable components, as well as the storing of pattern feedback.
VIII. Learning disjointness axioms (LeDA)
LeDA is an open-source framework for learning disjointness [3] and is based on a Naive Bayes machine-learning classifier. The classifier is trained on a vector of feature values and manually created disjointness axioms (i.e. pairs of classes labeled 'disjoint' or 'not disjoint'). The following features are used in this framework:
Taxonomic overlap: Taxonomic overlap is the set of common individuals.
Semantic distance: The semantic distance between two classes c1 and c2 is the minimum length of a
path consisting of subsumption relationships between atomic classes that connects c1 and c2.
Object properties: This feature encodes the semantic relatedness of two classes, c1 and c2, based on
the number of object properties they share.
Label similarity: This feature gives the semantic similarity between two classes based on a common prefix or suffix shared by them. The Levenshtein edit distance, Q-grams and the Jaro-Winkler distance are taken into account to calculate label similarity in LeDA.
WordNet similarity: LeDA uses a WordNet-based similarity measure that computes the cosine similarity between vector-based representations of the glosses associated with the two synsets.
Features based on Learned Ontology: From the already acquired knowledge such as terminological
overlap, classes, individuals, subsumption and class membership axioms, more features, viz. subsumption,
taxonomic overlap of subclasses and instances and lexical context similarity, are calculated.
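As one concrete example of these measures, the Levenshtein edit distance used for label similarity can be computed with the standard dynamic-programming recurrence (a generic textbook implementation, not LeDA's code):

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions
    needed to turn string a into string b."""
    prev = list(range(len(b) + 1))      # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]                      # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (ca != cb))) # substitution/match
        prev = curr
    return prev[-1]

# "ontology" -> "ontologies": substitute y->i, then insert e and s
assert levenshtein("ontology", "ontologies") == 3
```

A small edit distance between two class labels is weak evidence that the classes are related, which is why LeDA combines it with the other string and semantic measures rather than using it alone.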
IX. LExO for Learning Class Descriptions
LExO (Learning Expressive Ontologies) [3] automatically generates DL axioms from natural language sentences. It analyzes the syntactic structure of the input sentence and generates a dependency tree, which is then transformed into an XML-based format and finally into DL axioms by means of manually engineered transformation rules. However, this automated DL generation needs human intervention to verify that all the axioms are correct.
X. Relexo
Relational Exploration for Learning Expressive Ontologies is a tool for the difficult and time-consuming phase of ontology refinement [4]. It not only supports the user in a stepwise refinement of the ontology but also helps to ensure the compatibility of a logical axiomatization with the user's conceptualization. It combines a method for learning complex class descriptions from textual definitions with the Formal Concept Analysis (FCA)-based technique of relational exploration. The LExO component assists the ontologist in the process of axiomatizing atomic classes; the exploration part helps to integrate newly acquired entities into the ontology. It also helps the user to detect inconsistencies or mismatches between the ontology and her conceptualization, and hence provides a stepwise approximation of the user's domain knowledge.
XI. Alignment To Top-Level Ontologies
This is a special case of ontology matching where the goal is primarily to find correspondences between the more general concepts or relations in the top-level ontology and the more specific concepts and relations in the engineered ontology. Aligning an ontology to a top-level ontology may also be compared to automatically specializing or extending the top-level ontology. Methods such as lexical substitution may be used to find clues as to whether a more general concept in one ontology is related to a more specific one in the other. The alignment can also exploit ontology engineering patterns: determining that a pattern can be applied, and then applying it, provides a connection to the top-level ontology.
XII. Experiment
In order to evaluate the results of Text2Onto and improve them, some experiments were carried out. The objectives of the experiments were:
• to analyze the various algorithms and criteria used by Text2Onto for extracting the different ontology components;
• to analyze the results produced by Text2Onto;
• to compare the components extracted by Text2Onto with those extracted manually;
• to analyze the errors found in the ontology built by Text2Onto and identify their origin;
• to analyze Text2Onto's outcomes when adding a meta-model of the ontology as an additional input.
Details of the experimental data and the experimental protocol are presented in the following sections.
XIII. Experimental Data
The experiments were conducted on three individual texts. The first text, which we will call 'Abstract' from now on, was a compilation of the abstracts of four different papers. The remaining texts will be referred to as 'Text1' and 'Text2'. All of these texts were related to ontology building and ontology learning tools. Ontologies were built from these texts both manually and with Text2Onto.
XIV. Experimental Protocol
The experiments were performed in five phases. The first phase involved the building of ontology
manually from the three texts. The second phase was concerned with the development of ontology using
Text2Onto. In the third phase, the ontology built by Text2Onto was compared with the manual one. In
the next phase, meta-model of the texts were fed to Text2Onto and the corresponding ontology was built
again. Finally, the results were compared with the older ontologies. These phases are further described in
detail in the following section:
14.1 Experimental Work-flow
The following steps were carried out for each text:
1. Building ontology manually
Methontology was followed to build ontologies from the three texts manually. All the steps, such as
glossary building, meta-model and taxonomy, were followed while building the ontologies from Abstract and
Text2, whereas the ontology of Text1 was provided to us. The ontology was conceptualized in the following
way:
1. POS tagging of all the terms in the document.
2. Identify the concepts and relation from the validated terms.
3. Making the meta-model.
The aim is to subsume all the accepted concepts into some of the core concepts.
4. Identifying the accepted terms (concepts), their related core-concepts and finding their synonyms.
5. Defining the is-a hierarchy for the concepts and the identified core-concepts.
6. Identifying other binary relations.
7. Validating the meta-model.
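As an illustration of steps 1 and 2 above, the following sketch (ours, not part of Methontology or Text2Onto) filters POS-tagged tokens into candidate concepts (nouns) and candidate relations (verbs), assuming Penn Treebank tags such as those produced by GATE's tagger:

```python
# Sketch (ours, not the paper's code) of conceptualization steps 1-2:
# split POS-tagged tokens into candidate concepts (nouns) and candidate
# relations (verbs), assuming Penn Treebank tags (NN*, VB*).
def extract_candidates(tagged_tokens):
    concepts = {word.lower() for word, tag in tagged_tokens if tag.startswith("NN")}
    relations = {word.lower() for word, tag in tagged_tokens if tag.startswith("VB")}
    return concepts, relations

tagged = [("A", "DT"), ("tool", "NN"), ("builds", "VBZ"),
          ("ontologies", "NNS"), ("from", "IN"), ("texts", "NNS")]
concepts, relations = extract_candidates(tagged)
# concepts -> {'tool', 'ontologies', 'texts'}, relations -> {'builds'}
```

The candidate sets would then be validated by hand, as in steps 2 and 4.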
2. Building ontology using Text2Onto
This step involved the use of Text2Onto to build the same ontology automatically.
3. Analysis of Text2Onto results
The analysis phase was itself done in two phases. First, the results of the different algorithms of
Text2Onto were compared with each other in order to find interesting criteria for the extraction of the
different components. This was done for concept, instance, relation and hierarchy extraction. The main
criterion for the comparison was the relevance value.
Secondly, a comparison and study of differences between the results of tasks performed in the previous
two phases were carried out to estimate and comment on the quality of the ontology built by the tool.
The comparison was very detailed in the sense that all concepts, instances, relations and hierarchies
extracted from these two methods were compared. It was followed by the identification of causes for the
differences and errors/shortcomings in the performance of the tool.
4. Adding Meta-model to the ontology using Text2Onto
The idea was to observe whether Text2Onto gives better results when the ontology is built on top of its
meta-model. For this, the meta-model built manually in the first phase was introduced into Text2Onto
and ontologies were built upon their corresponding meta-model. This process involved the following
steps:
(a) Conversion of the meta model into text
In order to get a POM of meta-model, we converted meta-model into text from which Text2Onto can
extract core concepts and relations between them. Details about the process of conversion are given in
Section XVI, Conversion of Meta-Model to Text.
(b) Obtaining meta model POM
The meta model text was fed to Text2Onto to obtain a meta model POM which contained all core
concepts and relations between them.
(c) Improving the ontology using meta-model
Once the POM has been obtained from Text2Onto, the original text was added to it to build a new
ontology combined with the meta model.
5. Comparison of the ontology built with and without the meta model
In this phase, the ontology built in the second phase was compared with the one built using the meta-model.
Relevance values, identification of new components and hierarchies were considered during the
comparison.
XV. Results And Observations
15.1 Comparison of Algorithms and criteria of Text2Onto
The algorithms and criteria used by Text2Onto for extracting ontology components were
studied in detail so as to compare their performance. The comparison was done based on the relevance
values computed by these algorithms.
15.1.1 Observations
Though the relevance values produced by the entropy measure differ from those of the other
algorithms, they preserve similar relations and relative values for the concepts. The same holds for
combinations of one or more such evaluation algorithms. We observed that the order of the
extracted components is independent of the algorithm/criterion used, so we cannot say that one algorithm
is superior to the others or that one criterion is better than the others. We observed the same behavior in all
three texts.
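To illustrate this observation, the sketch below (a simplification of ours, not Text2Onto's actual implementation) computes a relative-term-frequency score and an entropy-style score for the same terms, and shows that although the absolute values differ, the ranking of the terms is the same:

```python
import math
from collections import Counter

def rtf(term, tokens):
    # relative term frequency of `term` in the token list
    return Counter(tokens)[term] / len(tokens)

def entropy_score(term, tokens):
    # simplified entropy-style score; monotone in p for the small
    # probabilities typical of content terms (p < 1/e)
    p = rtf(term, tokens)
    return -p * math.log(p) if p > 0 else 0.0

tokens = ["ontology"] * 5 + ["tool"] * 3 + ["text"] * 2 + ["the"] * 10
terms = ["ontology", "tool", "text"]
by_rtf = sorted(terms, key=lambda t: rtf(t, tokens), reverse=True)
by_entropy = sorted(terms, key=lambda t: entropy_score(t, tokens), reverse=True)
# both orderings are ['ontology', 'tool', 'text'], though the raw scores differ
```

This is only meant to show why the relative ordering, not the absolute value, is what the comparison can rely on.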
XVI. Conversion Of Meta-Model To Text
In order to improve the ontology built by the tool Text2Onto, the meta-model is used and
translated to text. Since all concepts and relations of the meta-model should be identified when this text is
executed with the tool, the first attempt was to write a paragraph describing the meta-model. This worked
fine for most of the concepts, but very few relationships could be identified, some concepts were left out, and
some extra concepts were included (those used in the paragraph to structure the meta-model
translation). The next attempt was to write simple sentences consisting of two nouns (the concepts) related
by a verb (the relation between the two concepts). We tried to use only the core concepts and relations
from the text as much as possible. However, this still could not identify all the relations properly. Finally, a
new algorithm was proposed to achieve the desired goal and to enhance the results of
Text2Onto. Below are the translations of the meta-model for the various experimental data used.
16.1 AbstractText
The meta-model of this text is given in Figure 1. For this meta-model, we used the following sentences to
construct the meta-model POM in Text2Onto.
A system is composed of methods.
A method has method components.
A tool implements methods.
An algorithm is used by methods.
An expert participates in ontology building step.
Ontology building step uses resources.
A resource is stored in data repository.
A term is included in resources.
Ontology building step is composed of ontology building process.
Ontology has ontology components.
A user community uses ontologies.
Ontology describes domain.
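The translation into such simple sentences can be mechanized. The following sketch (our own helper, not part of Text2Onto) renders meta-model edges, given as subject-verb-object triples, into sentences of the above form:

```python
# Hypothetical helper: render each meta-model edge (subject, verb, object)
# as one simple sentence that pattern-based relation extractors can parse.
def triples_to_text(triples):
    return "\n".join(f"{subj.capitalize()} {verb} {obj}." for subj, verb, obj in triples)

edges = [("a system", "is composed of", "methods"),
         ("a tool", "implements", "methods"),
         ("ontology", "describes", "domain")]
print(triples_to_text(edges))
# A system is composed of methods.
# A tool implements methods.
# Ontology describes domain.
```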
Figure 1: Abstract-Text Meta Model
16.2 Text1
The meta model of this text is given in the figure 2.
Figure 2: Text1 Meta Model
16.3 Text2
The meta model of this text is given in the figure 3 and the corresponding meta-model text is given
below.
Domain has ontology.
Ontology is composed by ontology components.
Ontology is built by methodology.
Tool builds ontology.
Activity is guided by methodology.
Activity produces model.
Representation is resulted by model.
Tool supports activity.
Organization develops tool.
Methodology is developed by organization.
Tool uses language.
Person uses tool.
Person creates ontology.
Figure 3: Text2 Meta Model
16.4 Comparison of Manual and Automated Ontologies
This section includes the comparison of the two methods of ontology building, i.e., manual
and automated with the tool Text2Onto. The aim of the comparison is to evaluate the process of
ontology building by the tool and then to analyze the results in order to suggest improvements to the tool.
16.4.1 Manual Ontology - Abstract
The Abstract text was the shortest of all the texts. It had 536 terms in total, out of which 34 were
accepted as concepts and 9 as instances.
16.4.2 Automated Ontology - Abstract
The same text was fed to Text2Onto to automate the process of ontology building. As the
relative importance of ontology components based on relevance values was found to be independent of the
algorithms used, we could choose any algorithm from the available list. As we were extracting the
ontology from a single document, the algorithms that use the TFIDF criterion were not interesting for us,
so we did not choose them during the analysis. The evaluation algorithms used in Text2Onto gave
relevance values to the concepts and the other identified components.
Text2Onto did not support writing the results to a separate file, so we added a
method that saves the results in a different Excel file for each execution of Text2Onto. This was also
necessary for the later comparison phases.
Text2Onto extracted 85 concepts, 14 individuals, and 3 general relations.
16.4.3 Comparison of manual and automated ontology - Abstract
The two ontologies were compared mainly on the basis of the identified concepts, instances and
relations. Out of the 34 concepts extracted manually, only 26 matched the ones extracted by Text2Onto.
Only 7 instances were common to both ontologies, and none of the relations were common. We
observed that the manual ontology was better at identifying the concepts, because the ontology made
by Text2Onto also contained some irrelevant concepts. Another major problem was the
identification of composite concepts: unlike in the manual ontology, not all composite concepts (those
consisting of more than one atomic word) were identified. The relations were not at all satisfactory.
The possible reasons attributed for these differences are as follows:
1. The text was not consistent as a whole.
The text was basically a summarization of different texts and hence lacked coherence between its
paragraphs. Thus there was a need to try another, longer and better text in order to conclude
anything significant.
2. The frequency of most of the terms (concepts and relations) was very low.
16.4.4 Manual ontology - Text1
For this ontology, there were 4807 terms after tokenization, of which 472 were nouns and 226 were
verbs. After stemming, the number of nouns was reduced to 357, close to a 25% reduction from
the original count.
16.4.5 Automated ontology - Text1
Text1 was fed to Text2Onto to build the ontology automatically. 406 concepts, 94
instances and 16 relations were extracted by Text2Onto.
16.4.6 Comparison of manual and automated ontologies - Text1
Compared to the 357 terms of the manual ontology, Text2Onto extracted 406 terms. Among
them, only 87 concepts were common to both. Some highly irrelevant terms were also included in
the results of Text2Onto because of their high relevance values. On the other hand, some important composite
terms were missing from the automated ontology.
16.4.7 Manual ontology - Text2
Following the same procedure as above for building the manual ontology, there were 4761 terms in
the knowledge base. 667 valid terms were refined from this knowledge base, of which 200 were
ultimately accepted as concepts of the ontology.
16.4.8 Automated ontology - Text2
350 terms (concepts) were extracted from this text when it was run through Text2Onto. A lot of the
concepts were insignificant and had to be rejected when the comparison was made.
16.4.9 Comparison of Manual and Automated Ontologies
This automated ontology was better than the earlier ones, as it could identify many relations,
and its is-a hierarchy was better than the others.
16.4.10 Observations
Relevance Values and their roles
In order to assess the result of Text2Onto and possibility to automate the process of ontology
building, we examined the role of relevance values for concepts in Text2Onto. The following
observations were made regarding the same:
 Most of the terms that were extracted by Text2Onto as concepts can be accepted based on
their relevance values.
 The core concepts generally have very high relevance.
 Most of the terms with a high relevance value are accepted.
 There are concepts which are always rejected despite their very high values. After studying
many papers and previous works in this field, we found no general rule that can be applied to
automatically reject these terms, but some corpus-specific rules can be written.
 There are concepts which are accepted despite their low values. In order to automate the third
and fourth process, we tried to find out more about these kinds of concepts. We
observed that the high-relevance terms which are nevertheless rejected occur in the same
kind of pattern. For example, the concept "ORDER" is generally observed to appear as "IN
ORDER TO". Thus, predefining many such patterns to exclude can be one solution for rejecting some
terms despite their high relevance values.
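The proposed pattern-based rejection can be sketched as follows (a hypothetical rule of ours; the pattern list and the threshold are corpus-specific assumptions, not part of Text2Onto):

```python
import re

# Hypothetical rejection rule: drop a candidate term when most of its
# occurrences fall inside a predefined fixed phrase such as "in order to".
EXCLUDE_PATTERNS = {"order": re.compile(r"\bin\s+order\s+to\b", re.IGNORECASE)}

def reject_by_pattern(term, text, threshold=0.8):
    # count all occurrences of the term, then those inside its fixed phrase
    total = len(re.findall(rf"\b{re.escape(term)}\b", text, re.IGNORECASE))
    if total == 0 or term not in EXCLUDE_PATTERNS:
        return False
    inside = len(EXCLUDE_PATTERNS[term].findall(text))
    return inside / total >= threshold

sample = "In order to build an ontology, a tool is run. In order to compare them, experts are needed."
# reject_by_pattern("order", sample) -> True (both occurrences are inside the phrase)
# reject_by_pattern("tool", sample) -> False (no exclusion pattern defined)
```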
16.5 Analysis of errors
16.5.1 Identification of errors
Following errors were identified while comparing the ontologies built manually and the ones built
usingText2Onto:
1. Some concepts were also identified as instances by Text2Onto, e.g., ontology, WSD.
2. Acronyms were not identified by Text2Onto, e.g., SSI, POM.
3. Synonyms were not identified properly.
4. Very few relations were identified by Text2Onto, most of which were not appropriate (interesting)
at all.
5. The instance-of algorithm did not give the instances that were given by the instance algorithm.
6. Some verbs like extract and inspect, which we had considered as relations, were identified as concepts
by Text2Onto.
16.5.2 Identification of causes of errors
After an in-depth study of the algorithms of Text2Onto, the following causes of errors were observed:
1. The POS tagger used by GATE tags some words incorrectly; e.g., the verb extract was tagged as a
noun.
2. Errors may also be due to grammatical mistakes in the corpus file.
3. In the case of the Abstract text, errors may also be due to its length and content: the text
contained 4 paragraphs from different papers and hence had few common terminologies.
4. The algorithms to extract concepts and instances work independently. Thus, identification of a
term as both concept and instance is not handled in Text2Onto.
5. The SubcatRelationExtraction algorithm can extract relations from simple sentences only.
The patterns it can identify are:
Subject + transitive verb + object
Subject + transitive verb + object + preposition + object
Subject + intransitive verb + preposition + object
It identifies as relations only those verbs that come with a singular subject (concept). For example, it can
extract the relation build from "A tool builds ontology" but not from "Tools build ontology".
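A simplified sketch of the first pattern (our approximation on pre-tagged tokens, not the actual JAPE-based algorithm) shows why a plural subject yields no relation:

```python
# Simplified approximation (ours) of the first SubcatRelationExtraction
# pattern: subject + transitive verb + object, firing only when the
# subject noun is singular (tag NN), mirroring the limitation above.
def extract_svo(tagged_tokens):
    relations = []
    for i in range(len(tagged_tokens) - 2):
        (s, s_tag), (v, v_tag), (o, o_tag) = tagged_tokens[i:i + 3]
        if s_tag == "NN" and v_tag.startswith("VB") and o_tag.startswith("NN"):
            relations.append((s, v, o))
    return relations

singular = [("tool", "NN"), ("builds", "VBZ"), ("ontology", "NN")]
plural = [("tools", "NNS"), ("build", "VBP"), ("ontology", "NN")]
# extract_svo(singular) -> [('tool', 'builds', 'ontology')]
# extract_svo(plural) -> [] because the subject tag is NNS, not NN
```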
XVII. Improvement Of Text2Onto Results
As the results of Text2Onto were not good compared to the manual ontology, we did two things to
improve them. First, we added an algorithm to improve the relation extraction of Text2Onto. Second, we
performed experiments on Text2Onto, adding the meta-model to the ontologies built above. The
following section describes the added algorithm and the results and observations from the experiment.
17.1 Algorithm to improve Text2Onto results
The relations extracted by Text2Onto were not interesting at all. Moreover, we found it
difficult to make Text2Onto extract all the relations from the meta-model text. So we decided to add an
algorithm to improve the result of relation extraction in Text2Onto. To extract more relations, in order to
make a better meta-model, we added two JAPE rules along with an algorithm to process them.
The added JAPE rules identify sentences in the passive voice and sentences with more than one verb (an
auxiliary verb followed by a main verb) plus a preposition, i.e., the following syntactical patterns:
• Subject + be-verb + main verb + "by" + object, e.g., "Ontology is built by experts"
• Subject + auxiliary verb + main verb + preposition + object, e.g., "Ontology is composed of components"
Though these patterns are similar to each other, we added two patterns instead of one in order to
identify these grammatically significant patterns separately. The new algorithm can find these patterns
in both the meta-model and the ontology text. As a result, we could obtain relations that were not
identified in the text earlier.
The added JAPE expressions are as below:
Rule: PassivePhrase
(
  ({NounPhrase} | {ProperNounPhrase}): object
  {SpaceToken.kind == space}
  ({Token.category == VBZ}
   | {Token.string == "is"}): auxverb
  {SpaceToken.kind == space}
  ({Token.category == VBN}
   | {Token.category == VBD}): verb
  {SpaceToken.kind == space}
  ({Token.string == "by"}): prep
  {SpaceToken.kind == space}
  ({NounPhrase}
   | {ProperNounPhrase}): subject
): passive
-->
  :passive.PassivePhrase = {rule = "PassivePhrase"},
  :verb.Verb = {rule = "PassivePhrase"},
  :subject.Subject = {rule = "PassivePhrase"},
  :object.Object = {rule = "PassivePhrase"},
  :prep.Preposition = {rule = "PassivePhrase"}
Rule: MultiVerbsWithPrep
(
  ({NounPhrase} | {ProperNounPhrase}): subject
  {SpaceToken.kind == space}
  ({Token.category == VBZ}
   | {Token.category == VB}): auxverb
  {SpaceToken.kind == space}
  ({Token.category == VBN}
   | {Token.category == VBD}): verb
  {SpaceToken.kind == space}
  ({Token.category == IN}): prep
  {SpaceToken.kind == space}
  ({NounPhrase} | {ProperNounPhrase}): object
): mvwp
-->
  :mvwp.MultiVerbsWithPrep = {rule = "MultiVerbsWithPrep"},
  :verb.Verb = {rule = "MultiVerbsWithPrep"},
  :subject.Subject = {rule = "MultiVerbsWithPrep"},
  :object.Object = {rule = "MultiVerbsWithPrep"},
  :prep.Preposition = {rule = "MultiVerbsWithPrep"}
These JAPE expressions are used by the GATE application to match the syntactical patterns. Using the
new algorithm, we could extract more relations from the original text.
17.2 Enhancement of Ontology using Meta-Model
The main idea was to try to improve the results of Text2Onto so that the process of building an
ontology can be automated. First of all, the text was fed to Text2Onto and the shortcomings were
identified. In order to overcome them, we fed the meta-model to the tool so as to
obtain a better extraction of concepts, relations and taxonomy. The experiment was carried out for the three
text documents. The results obtained from the text alone were compared with the results obtained from the
meta-model plus the text to assess the improvement of the Text2Onto results.
17.2.1 Observations
Following observations were made when the meta-model and the ontology text were used on the same POM
to build the ontology:
1. All the core concepts were identified and their relevance was increased (the core concepts were
identified earlier as well).
2. The core concepts which are not present in the text had greater values.
3. The relations from the meta-model were identified and included in the ontology. Due to the addition of
more patterns, some more relations were identified from the text. However, the useful relations are
limited to core concepts.
4. The hierarchy does not seem to be improved by the VerticalRelationsConceptClassification and
PatternConceptClassification algorithms. Rather, core concepts with composite terms are further
classified by these algorithms; e.g., Ontology component was classified under Component. We have not
checked this with the WordnetConceptClassification algorithm yet, as it gives lots of irrelevant
subclass-of relations.
From these behaviors, we can present the following ideas for making the meta-model:
• We can make the meta-model with terms not present in the text (point 2).
• If terms present in the text are used for making the meta-model, we can try to increase the
frequency of the core concepts in the meta-model itself (point 1).
• We can avoid composite terms in the meta-model as much as possible (point 4).
XVIII. Conclusion
We studied the architecture and working of the tool Text2Onto, which extracts ontologies
from textual input, and analyzed its results by conducting experiments with three texts. As part of
the experiments, ontologies were built manually as well as with the tool, and the two were compared with
each other. After a detailed analysis of the results, we reached the following conclusions:
1. The relevance measure cannot be a general measure for rejecting or accepting all terms.
In the automated ontology, there are several terms that have high relevance values and are still
rejected by the experts because they do not hold importance for the ontology. There are also terms
which, despite a significantly low relevance value, are accepted; this is very common
with the core concepts.
Hence the idea of directly using relevance values for accepting or rejecting concepts needs further
refinement.
2. The meta-model could not improve the is-a hierarchy of the ontology.
Though the meta-model increased the relevance values of the core concepts, the is-a hierarchy was not
improved. Even with more extracted relations and properly identified core concepts, the
meta-model could not help make the hierarchy better: identifying the relations and concepts
has no effect on the results of the subclass-of algorithm. A few refinements that can address this are
suggested in the next section of the report.
XIX. Future Work
From the study of Text2Onto and the analysis of its results, we suggest the
following future work and enhancements to Text2Onto.
1. Enhance the use of the meta-model to modify the is-a hierarchy of the ontology.
After adding the corpus to the upper ontology (using the meta-model), we should increase the relevance
values of the concepts that were identified only in the upper ontology, because those core concepts may not
be frequent or very relevant.
2. We can try to manually include the following kind of hierarchy in the ontology.
Text2Onto uses the following principle while extracting relations:
if A <is related to> B and C <is related to> D, then A <is related to> D and C <is related to> B also. This
kind of relation structure can be exploited to improve the hierarchy of concepts: if A <related to> B
and C <related to> D, then C and D can be considered to be subclasses of A and B respectively. Though this
idea may not be applicable to all relations, we can enhance the meta-model significantly for some
relations with the same name.
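A literal sketch of this heuristic (our reading of it; propose_subclasses is a hypothetical helper, not part of Text2Onto) for relations sharing the same name:

```python
# Literal sketch (our reading) of the suggested heuristic: when two
# relation instances share a name, the later pair's arguments are
# proposed as subclasses of the earlier pair's arguments.
def propose_subclasses(relation_instances):
    first_seen, proposals = {}, []
    for subj, name, obj in relation_instances:
        if name in first_seen:
            a, b = first_seen[name]
            proposals.append((subj, a))  # subj proposed as subclass of a
            proposals.append((obj, b))   # obj proposed as subclass of b
        else:
            first_seen[name] = (subj, obj)
    return proposals

rels = [("tool", "builds", "ontology"),
        ("Text2Onto", "builds", "domain ontology")]
# -> [('Text2Onto', 'tool'), ('domain ontology', 'ontology')]
```

As the text notes, such proposals would still need validation, since the inference is not sound for arbitrary relations.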
3. Another algorithm can be added in which some "unwanted" domain concepts are predefined and
hence kept out of the ontology. This task will require human interaction before starting to
build the ontology, because the "interestingness" of the concepts depends significantly on the
domain.
A similar approach can be followed for the "infrequent" but "significant" concepts of a particular
domain.
These two approaches could let us use the relevance measure as a significant criterion for accepting or
rejecting a term, and hence overcome the problem of the difference in concepts between the manual and
automated ontologies.
4. As the algorithms are executed separately, some terms are identified as both concepts and
instances.
A feature (or post-processing step) can be included so that a term is listed either as a concept or as an
individual, but not as both. Post-processing is also required to remove unnecessary or irrelevant
subsumption relations. Synonyms can be taken into account to improve the result of the subsumption
algorithm.
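Such a post-processing step might look as follows (a hypothetical sketch of ours; we assume relevance values are available for both listings and keep the term under the higher-scoring one):

```python
# Hypothetical post-processing for point 4: a term extracted both as a
# concept and as an instance is kept only under the listing where its
# relevance value is higher.
def deduplicate(concepts, instances):
    # concepts, instances: dicts mapping term -> relevance value
    for term in set(concepts) & set(instances):
        if concepts[term] >= instances[term]:
            del instances[term]
        else:
            del concepts[term]
    return concepts, instances

concepts = {"ontology": 0.9, "tool": 0.4}
instances = {"ontology": 0.3, "WSD": 0.5}
concepts, instances = deduplicate(concepts, instances)
# -> concepts {'ontology': 0.9, 'tool': 0.4}, instances {'WSD': 0.5}
```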
5. A module can be added to identify acronyms. For example, from the text, POM and "probabilistic
ontology model" should be identified as one term.
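A minimal sketch of such a module (ours, not part of Text2Onto; it only handles the "long form (ACRO)" pattern, where the initials of the preceding words spell the acronym):

```python
import re

def find_acronyms(text):
    # map each parenthesised acronym to the preceding words whose
    # initials spell it, e.g. "probabilistic ontology model (POM)"
    words = text.replace(".", "").split()
    pairs = {}
    for idx, word in enumerate(words):
        match = re.fullmatch(r"\(([A-Z]{2,})\)", word)
        if not match:
            continue
        acro = match.group(1)
        n = len(acro)
        phrase = words[idx - n:idx] if idx >= n else []
        if len(phrase) == n and "".join(w[0] for w in phrase).upper() == acro:
            pairs[acro] = " ".join(phrase)
    return pairs

sample = "Text2Onto stores results in a probabilistic ontology model (POM)."
# find_acronyms(sample) -> {'POM': 'probabilistic ontology model'}
```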
 
Requirements and Challenges for Securing Cloud Applications and Services
Requirements and Challenges for Securing Cloud Applications  and ServicesRequirements and Challenges for Securing Cloud Applications  and Services
Requirements and Challenges for Securing Cloud Applications and ServicesIOSR Journals
 
Implementation of Matching Tree Technique for Online Record Linkage
Implementation of Matching Tree Technique for Online Record LinkageImplementation of Matching Tree Technique for Online Record Linkage
Implementation of Matching Tree Technique for Online Record LinkageIOSR Journals
 
Perplexity of Index Models over Evolving Linked Data
Perplexity of Index Models over Evolving Linked Data Perplexity of Index Models over Evolving Linked Data
Perplexity of Index Models over Evolving Linked Data Thomas Gottron
 
Implementation of redundancy in the effective regulation of temperature in an...
Implementation of redundancy in the effective regulation of temperature in an...Implementation of redundancy in the effective regulation of temperature in an...
Implementation of redundancy in the effective regulation of temperature in an...IOSR Journals
 
A New Approach of Protein Sequence Compression using Repeat Reduction and ASC...
A New Approach of Protein Sequence Compression using Repeat Reduction and ASC...A New Approach of Protein Sequence Compression using Repeat Reduction and ASC...
A New Approach of Protein Sequence Compression using Repeat Reduction and ASC...IOSR Journals
 
Social Network Based Learning Management System
Social Network Based Learning Management SystemSocial Network Based Learning Management System
Social Network Based Learning Management SystemIOSR Journals
 
Итоговое сочинение - 2015
Итоговое сочинение - 2015Итоговое сочинение - 2015
Итоговое сочинение - 2015Natalya Dyrda
 

Destaque (20)

B0530714
B0530714B0530714
B0530714
 
Performance Analysis of New Light Weight Cryptographic Algorithms
Performance Analysis of New Light Weight Cryptographic  AlgorithmsPerformance Analysis of New Light Weight Cryptographic  Algorithms
Performance Analysis of New Light Weight Cryptographic Algorithms
 
B01041018
B01041018B01041018
B01041018
 
A01060107
A01060107A01060107
A01060107
 
F0411925
F0411925F0411925
F0411925
 
An Adaptive Masker for the Differential Evolution Algorithm
An Adaptive Masker for the Differential Evolution AlgorithmAn Adaptive Masker for the Differential Evolution Algorithm
An Adaptive Masker for the Differential Evolution Algorithm
 
Script Identification for printed document images at text-line level using DC...
Script Identification for printed document images at text-line level using DC...Script Identification for printed document images at text-line level using DC...
Script Identification for printed document images at text-line level using DC...
 
A Secure Model for Cloud Computing Based Storage and Retrieval
A Secure Model for Cloud Computing Based Storage and  RetrievalA Secure Model for Cloud Computing Based Storage and  Retrieval
A Secure Model for Cloud Computing Based Storage and Retrieval
 
R120234【メソ研】003
R120234【メソ研】003R120234【メソ研】003
R120234【メソ研】003
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
 
International Medical Careers Forum Oct 15 2016 Sharing My Own Trip Dr Ameed ...
International Medical Careers Forum Oct 15 2016 Sharing My Own Trip Dr Ameed ...International Medical Careers Forum Oct 15 2016 Sharing My Own Trip Dr Ameed ...
International Medical Careers Forum Oct 15 2016 Sharing My Own Trip Dr Ameed ...
 
Mobile Networking and Ad hoc routing protocols validation
Mobile Networking and Ad hoc routing protocols validationMobile Networking and Ad hoc routing protocols validation
Mobile Networking and Ad hoc routing protocols validation
 
Performance Evaluation of High Speed Congestion Control Protocols
Performance Evaluation of High Speed Congestion Control  ProtocolsPerformance Evaluation of High Speed Congestion Control  Protocols
Performance Evaluation of High Speed Congestion Control Protocols
 
Requirements and Challenges for Securing Cloud Applications and Services
Requirements and Challenges for Securing Cloud Applications  and ServicesRequirements and Challenges for Securing Cloud Applications  and Services
Requirements and Challenges for Securing Cloud Applications and Services
 
Implementation of Matching Tree Technique for Online Record Linkage
Implementation of Matching Tree Technique for Online Record LinkageImplementation of Matching Tree Technique for Online Record Linkage
Implementation of Matching Tree Technique for Online Record Linkage
 
Perplexity of Index Models over Evolving Linked Data
Perplexity of Index Models over Evolving Linked Data Perplexity of Index Models over Evolving Linked Data
Perplexity of Index Models over Evolving Linked Data
 
Implementation of redundancy in the effective regulation of temperature in an...
Implementation of redundancy in the effective regulation of temperature in an...Implementation of redundancy in the effective regulation of temperature in an...
Implementation of redundancy in the effective regulation of temperature in an...
 
A New Approach of Protein Sequence Compression using Repeat Reduction and ASC...
A New Approach of Protein Sequence Compression using Repeat Reduction and ASC...A New Approach of Protein Sequence Compression using Repeat Reduction and ASC...
A New Approach of Protein Sequence Compression using Repeat Reduction and ASC...
 
Social Network Based Learning Management System
Social Network Based Learning Management SystemSocial Network Based Learning Management System
Social Network Based Learning Management System
 
Итоговое сочинение - 2015
Итоговое сочинение - 2015Итоговое сочинение - 2015
Итоговое сочинение - 2015
 

Semelhante a Tools for Ontology Building from Texts: Analysis and Improvement of the Results of Text2Onto

SWSN UNIT-3.pptx we can information about swsn professional
SWSN UNIT-3.pptx we can information about swsn professionalSWSN UNIT-3.pptx we can information about swsn professional
SWSN UNIT-3.pptx we can information about swsn professionalgowthamnaidu0986
 
A Review on Evolution and Versioning of Ontology Based Information Systems
A Review on Evolution and Versioning of Ontology Based Information SystemsA Review on Evolution and Versioning of Ontology Based Information Systems
A Review on Evolution and Versioning of Ontology Based Information Systemsiosrjce
 
Implementation of a Knowledge Management Methodology based on Ontologies :Cas...
Implementation of a Knowledge Management Methodology based on Ontologies :Cas...Implementation of a Knowledge Management Methodology based on Ontologies :Cas...
Implementation of a Knowledge Management Methodology based on Ontologies :Cas...rahulmonikasharma
 
A Comparative Study Ontology Building Tools for Semantic Web Applications
A Comparative Study Ontology Building Tools for Semantic Web Applications A Comparative Study Ontology Building Tools for Semantic Web Applications
A Comparative Study Ontology Building Tools for Semantic Web Applications IJwest
 
A Comparative Study Ontology Building Tools for Semantic Web Applications
A Comparative Study Ontology Building Tools for Semantic Web Applications A Comparative Study Ontology Building Tools for Semantic Web Applications
A Comparative Study Ontology Building Tools for Semantic Web Applications dannyijwest
 
A Comparative Study of Ontology building Tools in Semantic Web Applications
A Comparative Study of Ontology building Tools in Semantic Web Applications A Comparative Study of Ontology building Tools in Semantic Web Applications
A Comparative Study of Ontology building Tools in Semantic Web Applications dannyijwest
 
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL DevelopmentProposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL DevelopmentJorge Barreto
 
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...dannyijwest
 
A Survey of Ontology-based Information Extraction for Social Media Content An...
A Survey of Ontology-based Information Extraction for Social Media Content An...A Survey of Ontology-based Information Extraction for Social Media Content An...
A Survey of Ontology-based Information Extraction for Social Media Content An...ijcnes
 
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information RetrievalKeystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information RetrievalMauro Dragoni
 
The Ontology of the Competency-Based Approach and the Perspectives of Impleme...
The Ontology of the Competency-Based Approach and the Perspectives of Impleme...The Ontology of the Competency-Based Approach and the Perspectives of Impleme...
The Ontology of the Competency-Based Approach and the Perspectives of Impleme...IJCSIS Research Publications
 
An Automated Text Summarization Methodology
An Automated Text Summarization MethodologyAn Automated Text Summarization Methodology
An Automated Text Summarization MethodologyJoe Andelija
 
An Ontology Based For Drilling Report Classification
An Ontology Based For Drilling Report ClassificationAn Ontology Based For Drilling Report Classification
An Ontology Based For Drilling Report ClassificationSarah Adams
 
An Approach for Knowledge Extraction Using Ontology Construction and Machine ...
An Approach for Knowledge Extraction Using Ontology Construction and Machine ...An Approach for Knowledge Extraction Using Ontology Construction and Machine ...
An Approach for Knowledge Extraction Using Ontology Construction and Machine ...Waqas Tariq
 
Evaluating Scientific Domain Ontologies for the Electromagnetic Knowledge Dom...
Evaluating Scientific Domain Ontologies for the Electromagnetic Knowledge Dom...Evaluating Scientific Domain Ontologies for the Electromagnetic Knowledge Dom...
Evaluating Scientific Domain Ontologies for the Electromagnetic Knowledge Dom...dannyijwest
 
Iot ontologies state of art$$$
Iot ontologies state of art$$$Iot ontologies state of art$$$
Iot ontologies state of art$$$Sof Ouni
 
Automatic Annotation Approach Of Events In News Articles
Automatic Annotation Approach Of Events In News ArticlesAutomatic Annotation Approach Of Events In News Articles
Automatic Annotation Approach Of Events In News ArticlesJoaquin Hamad
 
An adaptation of Text2Onto for supporting the French language
An adaptation of Text2Onto for supporting  the French language An adaptation of Text2Onto for supporting  the French language
An adaptation of Text2Onto for supporting the French language IJECEIAES
 

Semelhante a Tools for Ontology Building from Texts: Analysis and Improvement of the Results of Text2Onto (20)

SWSN UNIT-3.pptx we can information about swsn professional
SWSN UNIT-3.pptx we can information about swsn professionalSWSN UNIT-3.pptx we can information about swsn professional
SWSN UNIT-3.pptx we can information about swsn professional
 
A Review on Evolution and Versioning of Ontology Based Information Systems
A Review on Evolution and Versioning of Ontology Based Information SystemsA Review on Evolution and Versioning of Ontology Based Information Systems
A Review on Evolution and Versioning of Ontology Based Information Systems
 
F017233543
F017233543F017233543
F017233543
 
Implementation of a Knowledge Management Methodology based on Ontologies :Cas...
Implementation of a Knowledge Management Methodology based on Ontologies :Cas...Implementation of a Knowledge Management Methodology based on Ontologies :Cas...
Implementation of a Knowledge Management Methodology based on Ontologies :Cas...
 
A Comparative Study Ontology Building Tools for Semantic Web Applications
A Comparative Study Ontology Building Tools for Semantic Web Applications A Comparative Study Ontology Building Tools for Semantic Web Applications
A Comparative Study Ontology Building Tools for Semantic Web Applications
 
A Comparative Study Ontology Building Tools for Semantic Web Applications
A Comparative Study Ontology Building Tools for Semantic Web Applications A Comparative Study Ontology Building Tools for Semantic Web Applications
A Comparative Study Ontology Building Tools for Semantic Web Applications
 
A Comparative Study of Ontology building Tools in Semantic Web Applications
A Comparative Study of Ontology building Tools in Semantic Web Applications A Comparative Study of Ontology building Tools in Semantic Web Applications
A Comparative Study of Ontology building Tools in Semantic Web Applications
 
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL DevelopmentProposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
 
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...
 
A Survey of Ontology-based Information Extraction for Social Media Content An...
A Survey of Ontology-based Information Extraction for Social Media Content An...A Survey of Ontology-based Information Extraction for Social Media Content An...
A Survey of Ontology-based Information Extraction for Social Media Content An...
 
Hcome kais
Hcome kaisHcome kais
Hcome kais
 
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information RetrievalKeystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
 
The Ontology of the Competency-Based Approach and the Perspectives of Impleme...
The Ontology of the Competency-Based Approach and the Perspectives of Impleme...The Ontology of the Competency-Based Approach and the Perspectives of Impleme...
The Ontology of the Competency-Based Approach and the Perspectives of Impleme...
 
An Automated Text Summarization Methodology
An Automated Text Summarization MethodologyAn Automated Text Summarization Methodology
An Automated Text Summarization Methodology
 
An Ontology Based For Drilling Report Classification
An Ontology Based For Drilling Report ClassificationAn Ontology Based For Drilling Report Classification
An Ontology Based For Drilling Report Classification
 
An Approach for Knowledge Extraction Using Ontology Construction and Machine ...
An Approach for Knowledge Extraction Using Ontology Construction and Machine ...An Approach for Knowledge Extraction Using Ontology Construction and Machine ...
An Approach for Knowledge Extraction Using Ontology Construction and Machine ...
 
Evaluating Scientific Domain Ontologies for the Electromagnetic Knowledge Dom...
Evaluating Scientific Domain Ontologies for the Electromagnetic Knowledge Dom...Evaluating Scientific Domain Ontologies for the Electromagnetic Knowledge Dom...
Evaluating Scientific Domain Ontologies for the Electromagnetic Knowledge Dom...
 
Iot ontologies state of art$$$
Iot ontologies state of art$$$Iot ontologies state of art$$$
Iot ontologies state of art$$$
 
Automatic Annotation Approach Of Events In News Articles
Automatic Annotation Approach Of Events In News ArticlesAutomatic Annotation Approach Of Events In News Articles
Automatic Annotation Approach Of Events In News Articles
 
An adaptation of Text2Onto for supporting the French language
An adaptation of Text2Onto for supporting  the French language An adaptation of Text2Onto for supporting  the French language
An adaptation of Text2Onto for supporting the French language
 

Mais de IOSR Journals (20)

A011140104
A011140104A011140104
A011140104
 
M0111397100
M0111397100M0111397100
M0111397100
 
L011138596
L011138596L011138596
L011138596
 
K011138084
K011138084K011138084
K011138084
 
J011137479
J011137479J011137479
J011137479
 
I011136673
I011136673I011136673
I011136673
 
G011134454
G011134454G011134454
G011134454
 
H011135565
H011135565H011135565
H011135565
 
F011134043
F011134043F011134043
F011134043
 
E011133639
E011133639E011133639
E011133639
 
D011132635
D011132635D011132635
D011132635
 
C011131925
C011131925C011131925
C011131925
 
B011130918
B011130918B011130918
B011130918
 
A011130108
A011130108A011130108
A011130108
 
I011125160
I011125160I011125160
I011125160
 
H011124050
H011124050H011124050
H011124050
 
G011123539
G011123539G011123539
G011123539
 
F011123134
F011123134F011123134
F011123134
 
E011122530
E011122530E011122530
E011122530
 
D011121524
D011121524D011121524
D011121524
 

Último

IT3401-WEB ESSENTIALS PRESENTATIONS.pptx
IT3401-WEB ESSENTIALS PRESENTATIONS.pptxIT3401-WEB ESSENTIALS PRESENTATIONS.pptx
IT3401-WEB ESSENTIALS PRESENTATIONS.pptxSAJITHABANUS
 
Summer training report on BUILDING CONSTRUCTION for DIPLOMA Students.pdf
Summer training report on BUILDING CONSTRUCTION for DIPLOMA Students.pdfSummer training report on BUILDING CONSTRUCTION for DIPLOMA Students.pdf
Summer training report on BUILDING CONSTRUCTION for DIPLOMA Students.pdfNaveenVerma126
 
nvidia AI-gtc 2024 partial slide deck.pptx
nvidia AI-gtc 2024 partial slide deck.pptxnvidia AI-gtc 2024 partial slide deck.pptx
nvidia AI-gtc 2024 partial slide deck.pptxjasonsedano2
 
CSR Managerial Round Questions and answers.pptx
CSR Managerial Round Questions and answers.pptxCSR Managerial Round Questions and answers.pptx
CSR Managerial Round Questions and answers.pptxssusera0771e
 
How to Write a Good Scientific Paper.pdf
How to Write a Good Scientific Paper.pdfHow to Write a Good Scientific Paper.pdf
How to Write a Good Scientific Paper.pdfRedhwan Qasem Shaddad
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
ChatGPT-and-Generative-AI-Landscape Working of generative ai search
ChatGPT-and-Generative-AI-Landscape Working of generative ai searchChatGPT-and-Generative-AI-Landscape Working of generative ai search
ChatGPT-and-Generative-AI-Landscape Working of generative ai searchrohitcse52
 
Transforming Process Safety Management: Challenges, Benefits, and Transition ...
Transforming Process Safety Management: Challenges, Benefits, and Transition ...Transforming Process Safety Management: Challenges, Benefits, and Transition ...
Transforming Process Safety Management: Challenges, Benefits, and Transition ...soginsider
 
Vertical- Machining - Center - VMC -LMW-Machine-Tool-Division.pptx
Vertical- Machining - Center - VMC -LMW-Machine-Tool-Division.pptxVertical- Machining - Center - VMC -LMW-Machine-Tool-Division.pptx
Vertical- Machining - Center - VMC -LMW-Machine-Tool-Division.pptxLMW Machine Tool Division
 
GENERAL CONDITIONS FOR CONTRACTS OF CIVIL ENGINEERING WORKS
GENERAL CONDITIONS  FOR  CONTRACTS OF CIVIL ENGINEERING WORKS GENERAL CONDITIONS  FOR  CONTRACTS OF CIVIL ENGINEERING WORKS
GENERAL CONDITIONS FOR CONTRACTS OF CIVIL ENGINEERING WORKS Bahzad5
 
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....santhyamuthu1
 
SUMMER TRAINING REPORT ON BUILDING CONSTRUCTION.docx
SUMMER TRAINING REPORT ON BUILDING CONSTRUCTION.docxSUMMER TRAINING REPORT ON BUILDING CONSTRUCTION.docx
SUMMER TRAINING REPORT ON BUILDING CONSTRUCTION.docxNaveenVerma126
 
specification estimation and valuation of a building
specification estimation and valuation of a buildingspecification estimation and valuation of a building
specification estimation and valuation of a buildingswethasekhar5
 
Technology Features of Apollo HDD Machine, Its Technical Specification with C...
Technology Features of Apollo HDD Machine, Its Technical Specification with C...Technology Features of Apollo HDD Machine, Its Technical Specification with C...
Technology Features of Apollo HDD Machine, Its Technical Specification with C...Apollo Techno Industries Pvt Ltd
 
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratoryدليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide LaboratoryBahzad5
 
Design Analysis of Alogorithm 1 ppt 2024.pptx
Design Analysis of Alogorithm 1 ppt 2024.pptxDesign Analysis of Alogorithm 1 ppt 2024.pptx
Design Analysis of Alogorithm 1 ppt 2024.pptxrajesshs31r
 
Basic Principle of Electrochemical Sensor
Basic Principle of  Electrochemical SensorBasic Principle of  Electrochemical Sensor
Basic Principle of Electrochemical SensorTanvir Moin
 

Último (20)

IT3401-WEB ESSENTIALS PRESENTATIONS.pptx
IT3401-WEB ESSENTIALS PRESENTATIONS.pptxIT3401-WEB ESSENTIALS PRESENTATIONS.pptx
IT3401-WEB ESSENTIALS PRESENTATIONS.pptx
 
Summer training report on BUILDING CONSTRUCTION for DIPLOMA Students.pdf
Summer training report on BUILDING CONSTRUCTION for DIPLOMA Students.pdfSummer training report on BUILDING CONSTRUCTION for DIPLOMA Students.pdf
Summer training report on BUILDING CONSTRUCTION for DIPLOMA Students.pdf
 
nvidia AI-gtc 2024 partial slide deck.pptx
nvidia AI-gtc 2024 partial slide deck.pptxnvidia AI-gtc 2024 partial slide deck.pptx
nvidia AI-gtc 2024 partial slide deck.pptx
 
Lecture 4 .pdf
Lecture 4                              .pdfLecture 4                              .pdf
Lecture 4 .pdf
 
CSR Managerial Round Questions and answers.pptx
CSR Managerial Round Questions and answers.pptxCSR Managerial Round Questions and answers.pptx
CSR Managerial Round Questions and answers.pptx
 
Lecture 2 .pdf
Lecture 2                           .pdfLecture 2                           .pdf
Lecture 2 .pdf
 
How to Write a Good Scientific Paper.pdf
How to Write a Good Scientific Paper.pdfHow to Write a Good Scientific Paper.pdf
How to Write a Good Scientific Paper.pdf
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
Litature Review: Research Paper work for Engineering
Litature Review: Research Paper work for EngineeringLitature Review: Research Paper work for Engineering
Litature Review: Research Paper work for Engineering
 
ChatGPT-and-Generative-AI-Landscape Working of generative ai search
ChatGPT-and-Generative-AI-Landscape Working of generative ai searchChatGPT-and-Generative-AI-Landscape Working of generative ai search
ChatGPT-and-Generative-AI-Landscape Working of generative ai search
 
Transforming Process Safety Management: Challenges, Benefits, and Transition ...
Transforming Process Safety Management: Challenges, Benefits, and Transition ...Transforming Process Safety Management: Challenges, Benefits, and Transition ...
Transforming Process Safety Management: Challenges, Benefits, and Transition ...
 
Vertical- Machining - Center - VMC -LMW-Machine-Tool-Division.pptx
Vertical- Machining - Center - VMC -LMW-Machine-Tool-Division.pptxVertical- Machining - Center - VMC -LMW-Machine-Tool-Division.pptx
Vertical- Machining - Center - VMC -LMW-Machine-Tool-Division.pptx
 
GENERAL CONDITIONS FOR CONTRACTS OF CIVIL ENGINEERING WORKS
GENERAL CONDITIONS  FOR  CONTRACTS OF CIVIL ENGINEERING WORKS GENERAL CONDITIONS  FOR  CONTRACTS OF CIVIL ENGINEERING WORKS
GENERAL CONDITIONS FOR CONTRACTS OF CIVIL ENGINEERING WORKS
 
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
SATELITE COMMUNICATION UNIT 1 CEC352 REGULATION 2021 PPT BASICS OF SATELITE ....
 
SUMMER TRAINING REPORT ON BUILDING CONSTRUCTION.docx
SUMMER TRAINING REPORT ON BUILDING CONSTRUCTION.docxSUMMER TRAINING REPORT ON BUILDING CONSTRUCTION.docx
SUMMER TRAINING REPORT ON BUILDING CONSTRUCTION.docx
 
specification estimation and valuation of a building
specification estimation and valuation of a buildingspecification estimation and valuation of a building
specification estimation and valuation of a building
 
Technology Features of Apollo HDD Machine, Its Technical Specification with C...
Technology Features of Apollo HDD Machine, Its Technical Specification with C...Technology Features of Apollo HDD Machine, Its Technical Specification with C...
Technology Features of Apollo HDD Machine, Its Technical Specification with C...
 
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratoryدليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
دليل تجارب الاسفلت المختبرية - Asphalt Experiments Guide Laboratory
 
Design Analysis of Alogorithm 1 ppt 2024.pptx
Design Analysis of Alogorithm 1 ppt 2024.pptxDesign Analysis of Alogorithm 1 ppt 2024.pptx
Design Analysis of Alogorithm 1 ppt 2024.pptx
 
Basic Principle of Electrochemical Sensor
Basic Principle of  Electrochemical SensorBasic Principle of  Electrochemical Sensor
Basic Principle of Electrochemical Sensor
 

Tools for Ontology Building from Texts: Analysis and Improvement of the Results of Text2Onto

To make such domain ontologies, the general method used is to extract them from textual resources. This involves processing a huge amount of text, which makes it a difficult and time-consuming task. To expedite the process and support ontologists in the different phases of ontology building, several tools based on linguistic or statistical techniques have been developed. However, these tools are not fully automated yet: human intervention is required at some phases to validate their results so as to produce a good outcome. Such human intervention is not only time-consuming but also error-prone. Therefore, minimizing the human effort needed for error correction is key to enhancing these tools.

Text2Onto is a framework for learning ontologies from textual data. It can extract ontology components such as concepts, relations, instances and hierarchies from documents, and it attaches statistical values that indicate the importance of those components in the text. However, users have to verify its results. We therefore studied this tool to assess how relevant its results are and whether they can be improved. For this purpose, the architecture and working principles of Text2Onto were first studied, and experiments were then performed. To assess the results, we mainly considered concepts, instances and relations; we also observed the taxonomy, but the detailed study revolved around these three components.

II. Literature Review
This section gives a brief overview of ontology and ontology building processes, and sums up the papers [1], [3], [4], [5], [6], [7].

2.1 Ontology
An ontology is an explicit, formal (i.e. machine-readable) specification of a shared (accepted by a group or community) conceptualization of a domain of interest [2]. It should be restricted to a given domain of interest and therefore model the concepts and relations that are relevant to a particular task or application domain.
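The ingredients of this definition (concepts, a subsumption hierarchy, relations between concepts, and instances) can be made concrete with a toy in-memory model. This is only an illustrative sketch; the names used (`Person`, `Researcher`, `Tool`, `uses`) are invented, not taken from the paper's corpus.

```python
class Ontology:
    """A toy ontology: concepts, a subsumption hierarchy, relations, instances."""

    def __init__(self):
        self.concepts = set()
        self.subclass_of = {}   # child concept -> parent concept
        self.relations = set()  # (domain concept, relation name, range concept)
        self.instances = {}     # instance -> concept it belongs to

    def add_concept(self, name, parent=None):
        self.concepts.add(name)
        if parent is not None:
            self.subclass_of[name] = parent

    def add_relation(self, domain, name, range_):
        self.relations.add((domain, name, range_))

    def add_instance(self, instance, concept):
        # Adding instances to an existing model is "ontology population".
        self.instances[instance] = concept

    def ancestors(self, concept):
        """Walk the subsumption hierarchy upward from a concept."""
        chain = []
        while concept in self.subclass_of:
            concept = self.subclass_of[concept]
            chain.append(concept)
        return chain


onto = Ontology()
onto.add_concept("Person")
onto.add_concept("Researcher", parent="Person")
onto.add_concept("Tool")
onto.add_relation("Researcher", "uses", "Tool")
onto.add_instance("text2onto", "Tool")

print(onto.ancestors("Researcher"))  # -> ['Person']
```

In a real setting such a model would be serialized in a formal language like OWL or RDF rather than kept as Python objects.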
Ontologies are built to be reused or shared anytime, anywhere, independently of the behaviour and domain of the application that uses them. The process of instantiating a knowledge base is referred to as ontology population, whereas automatic support in ontology development is usually referred to as ontology learning. Ontology learning is concerned with knowledge acquisition.
2.2 Ontology life cycle
The ontology development process refers to the activities that are carried out to build an ontology from scratch [1]. To start the process, the activities and the resources they require must be planned, so an ontology specification document is prepared to record the requirements and specifications of the development process. Ontology building then starts with the conceptualization of the acquired knowledge in a conceptual model, which describes the problem and its solution with the help of intermediate representations. Next, the conceptual models are formalized into formal or semi-computable models using frame-oriented or Description Logic (DL) representation systems. The following step is to integrate the current ontology with existing ontologies; although optional, reusing existing ontologies should be considered to avoid duplicating the effort of building them. After this, the ontology is implemented in a formal language such as OWL or RDF. Once implemented, it is evaluated to make a technical judgment with respect to a frame of reference. The ontology should also be documented to the best possible extent. Finally, effort is put into maintaining and updating it. These activities can be organized in various ways; the most common are the waterfall life cycle and the incremental life cycle.

III. Methontology
Methontology [1] is a well-structured methodology used to build ontologies from scratch. It follows a certain number of well-defined steps to guide the ontology development process.
Methontology orders the specification, knowledge acquisition, conceptualization, implementation, evaluation and documentation activities to carry out the ontology development process. It also identifies management activities such as scheduling, control and quality assurance, and support activities such as integration and evaluation.

3.1 Specification
The first phase of Methontology is specification, where an ontology specification document is produced: a formal or semi-formal document written in natural language (NL) containing information such as the purpose of the ontology, its level of formality, its scope and the sources of knowledge. A good specification document is one in which every term is relevant; it should have partial completeness and ensure the consistency of all terms.

3.2 Knowledge Acquisition
Specification is followed by knowledge acquisition, an independent activity performed using techniques such as brainstorming, formal questions, structured and non-structured interviews, informal and formal text analysis, and knowledge acquisition tools.

3.3 Conceptualization
The next step is structuring the domain knowledge in a conceptual model. In this conceptualization step a glossary of terms is built, relations are identified, the taxonomy is defined, the data dictionary is implemented, and tables of rules and formulas are made. The data dictionary describes and gathers all the useful and potentially usable domain concepts, their meanings, attributes, instances, etc. The table of instance attributes provides information about each attribute and about its values at the instance level. The result of this phase of Methontology is thus a conceptual model expressed as a set of well-defined deliverables, which allow one to assess the usefulness of the ontology and to compare the scope and completeness of different ontologies.
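As an illustration, a data dictionary entry produced during conceptualization can be pictured as a small record of a concept, its meaning, synonyms, attributes and instances. The sketch below is our own simplification; the class and field names are illustrative, not Methontology's exact notation.

```python
from dataclasses import dataclass, field

@dataclass
class DataDictionaryEntry:
    """One row of a Methontology-style data dictionary (illustrative field names)."""
    concept: str
    meaning: str
    synonyms: list = field(default_factory=list)
    attributes: list = field(default_factory=list)
    instances: list = field(default_factory=list)

# The glossary is then simply a collection of such entries, keyed by concept name.
glossary = {
    "ontology": DataDictionaryEntry(
        concept="ontology",
        meaning="explicit formal specification of a shared conceptualization",
        synonyms=["conceptual model"],
        instances=["WordNet"],
    )
}
```

A taxonomy or table of binary relations can be kept alongside such a glossary in the same fashion.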
3.4 Integration
Integration is an optional step used to accelerate ontology building by merging related, already existing ontologies. It involves inspecting meta-ontologies and then finding the best-suited libraries to provide term definitions. As a result, Methontology produces an integration document summarizing the meta-ontology, the names of the terms taken from the conceptual model, and the names of the ontologies from which the corresponding definitions are taken. Methontology highly recommends the reuse of existing ontologies.

3.5 Implementation
The ontology is implemented using a formal language in an ontology development environment that incorporates a lexical and syntactic analyzer, so as to avoid lexical and syntactic errors.
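As a minimal illustration of this implementation step, a handful of concepts and subclass links can be serialized to OWL in Turtle syntax. The helper below is a schematic sketch of ours (the namespace is invented), not the output of any particular ontology editor.

```python
def to_turtle(classes, subclass_of, base="http://example.org/onto#"):
    """Serialize class names and (sub, super) pairs as OWL/RDFS Turtle (schematic)."""
    lines = [
        f"@prefix : <{base}> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
    ]
    for c in classes:
        lines.append(f":{c} a owl:Class .")          # declare each concept as a class
    for sub, sup in subclass_of:
        lines.append(f":{sub} rdfs:subClassOf :{sup} .")  # encode the taxonomy
    return "\n".join(lines)

print(to_turtle(["Tool", "Text2Onto"], [("Text2Onto", "Tool")]))
```

A real implementation would go through an ontology environment with lexical and syntactic checking, as the section notes.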
3.6 Evaluation
Once the ontology has been implemented, it is judged technically. The result is a short evaluation document describing the methods used to evaluate the ontology.

3.7 Documentation
Documentation should be carried out during all the above steps. It sums up the steps, procedures and results of each phase in a written document.

IV. Ontology Learning Layers
The different aspects of Ontology Learning (OL) have been presented as a stack in [6]. OL involves processing the different layers of this stack, in the following order: identifying terms (linguistic realizations of domain-specific concepts), finding their synonyms, categorizing them as concepts, defining concept hierarchies and relations, and describing rules that constrain the concepts. The ontology components and the methods for extracting them are explained in detail in the following sections.

V. Ontology modeling components
Methontology conceptualizes ontologies with tabular and graphical intermediate representations (IRs). The components of such IRs are: concepts; relations between the concepts of the domain; instances (specializations of concepts); constants; attributes (properties of concepts in general and of instances in particular); and formal axioms and rules specified in formal or semi-formal notation using DL. These components are used to conceptualize ontologies by performing the tasks proposed by Methontology.

5.1 Term
Terms are linguistic realizations of domain-specific concepts. Term extraction is a mandatory step for all aspects of ontology learning from text. The methods for term extraction are based on information retrieval, NLP research and term indexing.
The state of the art is mostly to run a part-of-speech tagger over the domain corpus and then to verify the terms manually, constructing ad-hoc patterns in the process. To automatically identify only relevant terms, a statistical processing step can be used that compares the distribution of terms between corpora.

5.2 Synonym
Finding synonyms allows the acquisition of semantic term variants within and between languages, and hence helps in term translation. The main implementation integrates WordNet to obtain English synonyms. This requires word sense disambiguation algorithms to pick the synonyms matching the meaning of the word in its phrase. Clustering and related techniques can be an alternative for dynamic acquisition. The two main approaches [6] are:
1. Harris' distributional hypothesis: terms are similar in meaning to the extent that they share syntactic contexts.
2. Statistical information measures defined over the Web.

5.3 Concept
Concept identification should provide:
1. A definition of the concept.
2. The set of concept instances, i.e. its extension.
3. A set of linguistic realizations of the concept.
Intensional concept learning includes the extraction of formal and informal definitions. An informal definition can be a textual description, whereas a formal definition includes the extraction of concept properties and relations with other concepts. The OntoLearn system can be used for this purpose.

5.4 Taxonomy
Three main factors are exploited to induce taxonomies:
1. Application of lexico-syntactic patterns to detect hyponymy relations.
2. The context of synonym extraction and term clustering, mainly using hierarchical clustering.
3. Document-based notions of term subsumption.

5.5 Relation
Relations represent a type of association between concepts of the domain. Text mining using statistical analysis, with more or less complex levels of linguistic analysis, is used for extracting relations.
Relation extraction is similar to the problem of acquiring selectional restrictions for verb arguments in NLP. The Automatic Content Extraction program is one program used for this purpose.

5.6 Rule
Rules are used to infer knowledge in the ontology. An important factor for rule extraction is learning lexical entailment for application in question answering systems.

5.7 Formal Axiom
Formal axioms are logical expressions that are always true and are used as constraints in the ontology. The ontologist must identify the formal axioms needed in the ontology and describe them precisely. For each formal axiom, information such as its name, a natural language description and its logical expression should be identified.

5.8 Instance
Relevant instances must be identified from the concept dictionary and recorded in an instance table. An NL tagger can be used to identify proper nouns and hence instances.

5.9 Constant
Constants are numeric values that do not change over time.

5.10 Attribute
Attributes describe the properties of instances and of concepts; accordingly, they can be instance attributes or class attributes. Ontology development tools usually provide predefined, domain-independent class attributes for all concepts.

VI. Ontology tools and frameworks
Several tools and frameworks have been developed to aid the ontologist in the different steps of ontology building. Different tools are available for extracting ontology components from different kinds of sources, such as text, semi-structured text and dictionaries. The scope of these tools varies from basic linguistic processing (term extraction, tagging, etc.) to guiding the whole ontology building process. Some of these tools and frameworks are discussed in the following sections. As the scope of this study is limited to Text2Onto, we discuss it in detail.
Other tools are presented briefly.

VII. Text2Onto
Text2Onto [7] is a framework for learning ontologies from textual data. It is a redesign of TextToOnto and is based on the Probabilistic Ontology Model (POM), which stores the learned primitives independently of any specific Knowledge Representation (KR) language. It calculates a confidence value for each learned object to support user interaction. It also updates the learned knowledge each time the corpus changes, avoiding reprocessing from scratch, and it allows algorithms to be easily combined and executed, as well as new algorithms to be written.

7.1 Architecture and Workflow
The main components of Text2Onto are the algorithms, an algorithm controller and the POM. The learning algorithms are initialized by the controller, which triggers the linguistic preprocessing of the data. Text2Onto depends on the output of GATE. During preprocessing, it calls GATE applications to
i. tokenize the document (identifying words, spaces, tabs, punctuation marks, etc.)
ii. split sentences
iii. tag parts of speech
iv. match JAPE patterns to find noun/verb phrases
The algorithms then use the results of these applications. GATE stores the results in an object called an Annotation Set, which is a set of Annotation objects. An Annotation object stores the following information:
a. id - the unique id assigned to the token/element
b. type - the type of the element (Token, SpaceToken, Sentence, Noun, Verb, etc.)
c. features - a map of various information, e.g. whether the element is a stopword, its category (or tag, e.g. NN), etc.
d. start offset - the starting position of the element
e. end offset - the ending position of the element
Text2Onto uses the 'type' property to filter the required entity and then uses the start and end offsets to find the actual word. For example, suppose our corpus begins with the following line:
Ontology evaluation is a critical task. . .
Then the information for the word 'task' is stored in an Annotation object with type 'Token', category 'NN', start offset 34 and end offset 38. Text2Onto uses the offset values to recover the exact word.
After preprocessing the corpus, the controller executes the ontology learning algorithms in the appropriate order and applies the algorithms' change requests to the POM. The execution of an algorithm takes place in three phases: a notification phase, a computation phase and a result generation phase. In the first phase, the algorithm learns about recent changes to the corpus. In the second phase, these changes are mapped to changes with respect to the reference repository. Finally, requests for POM changes are generated from the updated content of the reference repository. Text2Onto includes a Modeling Primitive Library (MPL) which makes the modeling primitives independent of the ontology language.

7.2 POM
The POM (Probabilistic Ontology Model, also called Preliminary Ontology Model) is the basic building block of Text2Onto. It is an extensible collection of modeling primitives for different types of ontology elements or axioms, and it uses confidence and relevance annotations to capture uncertainty. It is KR-language-independent and can thus be transformed into any reasonably expressive knowledge representation language such as OWL, RDFS or F-Logic. The modeling primitives used in Text2Onto are as follows:
i. concepts (CLASS)
ii. concept inheritance (SUBCLASS-OF)
iii. concept instantiation (INSTANCE-OF)
iv.
properties/relations (RELATION)
v. domain and range restrictions (DOMAIN/RANGE)
vi. mereological relations
vii. equivalence
The POM is traceable because, for each object, it also stores a pointer to the parts of the document from which the object was derived. It also allows multiple modeling alternatives to be maintained in parallel. Adding new primitives does not imply changing the underlying framework, which makes the POM flexible and extensible.

7.3 Data-driven Change Discovery
An important feature of Text2Onto is data-driven change discovery, which prevents the whole corpus from being reprocessed from scratch each time it changes. When the corpus changes, Text2Onto detects the changes and calculates POM deltas with respect to them. As the POM is extensible, Text2Onto modifies it without recalculating it for the whole document collection. The benefits of this feature are that document reprocessing time is saved and the evolution of the ontology can be traced.

7.4 Ontology Learning Algorithms/Methods
Text2Onto combines machine learning approaches with basic linguistic approaches for learning an ontology. The different modeling primitives in the POM are instantiated and populated by different algorithms. Before the POM is populated, the text documents undergo linguistic preprocessing, initiated by the algorithm controller. Basic linguistic preprocessing involves tokenization, sentence splitting, syntactic tagging of all tokens by a POS tagger, and lemmatizing by a morphological analyzer or stemming by a stemmer. The output of these steps is an annotated corpus, which is then fed to the JAPE transducer to match the particular patterns required by the ontology learning algorithms. The algorithms use certain criteria to evaluate the confidence of the extracted entities. The following section presents the techniques and criteria used by these algorithms to extract the different ontology components.
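The data-driven change discovery of Section 7.3 can be pictured with a toy sketch: rather than re-counting the whole collection, only the term-count deltas of a changed document are applied to the stored counts, from which the POM probabilities can be re-derived. The data structures and function name below are our own simplification, not Text2Onto's internals.

```python
from collections import Counter

def pom_delta(pom_counts, removed_doc_terms, added_doc_terms):
    """Update term counts incrementally: subtract the old version of a document,
    add the new one. Unchanged documents never need to be re-read."""
    updated = Counter(pom_counts)
    updated.subtract(Counter(removed_doc_terms))
    updated += Counter()              # drop entries whose count fell to zero or below
    updated.update(added_doc_terms)
    return updated

counts = Counter({"ontology": 5, "tool": 2})
counts = pom_delta(counts, ["tool", "tool"], ["ontology", "corpus"])
# 'tool' disappears, 'ontology' rises to 6, 'corpus' appears with count 1
```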
7.4.1 Concepts
Text2Onto comes with three algorithms for extracting concepts: EntropyConceptExtraction, RTFConceptExtraction and TFIDFConceptExtraction. Each of them looks for the type 'Concept' in the GATE results.
All of these algorithms filter the same type; the only difference is the criterion they use for the probability/relevance calculation. These algorithms use statistical measures such as TFIDF (Term Frequency Inverted Document Frequency), entropy, C-value, NC-value and RTF (Relative Term Frequency). For each term, the values of these measures are normalized to [0, 1] and used as the corresponding probability in the POM.

1. RTFConceptExtraction
It calculates the relative term frequency, obtained by dividing the absolute term frequency of a term t in a document d (the number of times t appears in d) by the maximum absolute term frequency in d (the number of occurrences of the most frequent term in d):

rtf(t, d) = \frac{tf(t, d)}{\max_{t'} tf(t', d)}

2. TFIDFConceptExtraction
It calculates the term frequency-inverse document frequency, the product of the term frequency (TF) and the inverse document frequency (IDF). The IDF is obtained by dividing the total number of documents by the number of documents containing the term and taking the logarithm of that quotient:

tf\text{-}idf(t, d, D) = tf(t, d) \times idf(t, D), \qquad idf(t, D) = \log \frac{|D|}{df(t)}

where |D| is the number of documents and df(t) is the number of documents containing the term t.

3. EntropyConceptExtraction
It computes an entropy-based score combining the C-value (an indicator of termhood) and the NC-value (a contextual indicator of termhood). The C-value is a frequency-based measure sensitive to multi-word terms:

C\text{-}value(a) = \begin{cases} \log_2 |a| \cdot f(a) & \text{if } a \text{ is not nested} \\ \log_2 |a| \cdot \left( f(a) - \frac{1}{|T_a|} \sum_{b \in T_a} f(b) \right) & \text{otherwise} \end{cases}

where f(a) is the frequency of a and T_a is the set of terms which contain a. The NC-value incorporates information from context words indicating termhood:

weight(w) = \frac{t(w)}{n}

where t(w) is the number of times that w appears in the context of a term.
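The RTF and TFIDF measures above can be sketched in a few lines; this is a toy illustration of the formulas, not Text2Onto's implementation, and Text2Onto would additionally normalize such values to [0, 1] before storing them as POM probabilities.

```python
import math
from collections import Counter

def rtf(term, doc_tokens):
    """Relative term frequency: tf(t, d) divided by the highest tf of any term in d."""
    tf = Counter(doc_tokens)
    return tf[term] / max(tf.values())

def tfidf(term, doc_tokens, corpus):
    """tf-idf(t, d, D) = tf(t, d) * log(|D| / df(t))."""
    tf = Counter(doc_tokens)[term]
    df = sum(1 for d in corpus if term in d)   # number of documents containing the term
    return tf * math.log(len(corpus) / df) if df else 0.0

doc = ["ontology", "tool", "ontology", "text"]
corpus = [doc, ["tool", "corpus"], ["text", "mining"]]
rtf("tool", doc)                  # 1 / 2 = 0.5
tfidf("ontology", doc, corpus)    # 2 * log(3 / 1)
```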
7.4.2 Instances
An algorithm called TFIDFInstanceExtraction is available in Text2Onto for the extraction of instances. It filters the 'Instance' type from the GATE results and computes TFIDF as in TFIDFConceptExtraction.

7.4.3 General relations
General relations are identified using a linguistic approach. The SubcatRelationExtraction algorithm filters the types 'TransitiveVerbPhrase', 'IntransitivePPVerbPhrase' and 'TransitivePPVerbPhrase' in the GATE results, which are obtained by shallow parsing, to identify the following syntactic frames:
• transitive, e.g. love(subj, obj)
• intransitive + PP-complement, e.g. walk(subj, pp(to))
• transitive + PP-complement, e.g. hit(subj, obj, pp(with))
For each verb phrase, it finds the subject, object and associated preposition (by filtering nouns and verbs from the sentence), stems them, and builds the relation.

7.4.4 Subclass-of relations
The identification of subclass-of relations involves several algorithms which use the hypernym structure of
WordNet, match Hearst patterns and apply linguistic heuristics. The results of these algorithms are combined through combination strategies. They all depend on the results of the concept extraction algorithms. The relevance calculation of one of these algorithms is presented below.

1. WordNetClassificationExtraction
It extracts subclass-of relations among the extracted concepts by identifying the hypernym structure of the concepts in WordNet. If a is a subclass of b, the relevance is calculated as:

relevance(a, b) = \frac{|\{\text{synonyms of } a \text{ for which } b \text{ is a hypernym}\}|}{|\{\text{synonyms of } a\}|}

7.4.5 Instance-of relations
Lexical patterns and context similarity are taken into account for instance classification. A pattern-matching algorithm similar to the one used for discovering mereological relations is also used for instance-of relation extraction.

7.4.6 Equivalence and equality
The algorithm calculates the similarity between terms on the basis of contextual features extracted from the corpus.

7.4.7 Disjointness
A heuristic approach based on lexico-syntactic patterns is implemented to learn disjointness. The algorithm learns disjointness from patterns like "NounPhrase1, NounPhrase2, ... (and/or) NounPhraseN".

7.4.8 Subtopic-of relations
Subtopic-of relations are discovered using a method for building concept hierarchies. There is also an algorithm for extracting this kind of relationship from previously identified subclass-of relations.

7.5 NeOn Toolkit
The NeOn Toolkit is an open-source, multi-platform ontology engineering environment providing comprehensive support for the ontology engineering life cycle. It is based on the Eclipse platform and provides various plugins for different activities in ontology building.
The following plugins are within the scope of this case study:

7.5.1 Text2Onto plug-in
This is a graphical front-end for Text2Onto available for the NeOn Toolkit. It enables the integration of Text2Onto into a process of semi-automatic ontology engineering.

7.5.2 LeDA plug-in
LeDA, an open-source framework for the automatic generation of disjointness axioms, has been implemented in this plug-in, which was developed to support both enrichment and evaluation of the acquired ontologies. The plug-in facilitates a customized generation of disjointness axioms for various domains by supporting both the training and the classification phase.

7.6 OntoCase
OntoCase is an approach that uses ontology patterns throughout an iterative ontology construction and evolution framework. In OntoCase the patterns constitute the backbone of reusable solutions because they can be applied directly to specific modeling problems. The central repository consists of a pattern catalogue, an ontology architecture and other reusable assets. The OntoCase cycle consists of four phases: retrieval, reuse, evaluation and revision, and discovery of new pattern candidates. The first phase corresponds to input analysis and pattern retrieval: the input is analyzed and the derived input representation is matched against the pattern base to select appropriate patterns. The second phase includes pattern specialization, adaptation and composition: the retrieved patterns are reused to construct an improved ontology. The third phase concerns the evaluation and revision of the ontology to improve its fit to the input and its quality. The final phase includes the discovery of new pattern candidates or other reusable components, as well as the storing of pattern feedback.
VIII. Learning disjointness axioms (LeDA)
LeDA is an open-source framework for learning disjointness [3] based on a Naive Bayes machine learning classifier. The classifier is trained on a vector of feature values and manually created disjointness axioms (i.e. pairs of classes labeled 'disjoint' or 'not disjoint'). The following features are used in this framework:
Taxonomic overlap: the set of common individuals.
Semantic distance: the semantic distance between two classes c1 and c2 is the minimum length of a path of subsumption relationships between atomic classes connecting c1 and c2.
Object properties: this feature encodes the semantic relatedness of two classes c1 and c2 based on the number of object properties they share.
Label similarity: this feature gives the semantic similarity between two classes based on a common prefix or suffix shared by their labels. The Levenshtein edit distance, q-grams and the Jaro-Winkler distance are taken into account to calculate label similarity in LeDA.
WordNet similarity: LeDA uses a WordNet-based similarity measure that computes the cosine similarity between vector-based representations of the glosses associated with the two synsets.
Features based on the learned ontology: from the already acquired knowledge, such as terminological overlap, classes, individuals, subsumption and class membership axioms, further features are calculated, viz. subsumption, taxonomic overlap of subclasses and instances, and lexical context similarity.

IX. LExO for Learning Class Descriptions
LExO (Learning Expressive Ontologies) [3] automatically generates DL axioms from natural language sentences.
It analyzes the syntactic structure of the input sentence and generates a dependency tree, which is then transformed into an XML-based format and finally into DL axioms by means of manually engineered transformation rules. However, this automated DL generation needs human intervention to verify that all the axioms are correct.

X. RELExO
RELExO (Relational Exploration for Learning Expressive Ontologies) is a tool for the difficult and time-consuming phase of ontology refinement [4]. It not only supports the user in a stepwise refinement of the ontology but also helps to ensure the compatibility of a logical axiomatization with the user's conceptualization. It combines a method for learning complex class descriptions from textual definitions with the Formal Concept Analysis (FCA)-based technique of relational exploration. Its LExO component assists the ontologist in axiomatizing atomic classes; the exploration part helps to integrate newly acquired entities into the ontology. It also helps the user to detect inconsistencies or mismatches between the ontology and her conceptualization, and hence provides a stepwise approximation of the user's domain knowledge.

XI. Alignment To Top-Level Ontologies
This is a special case of ontology matching in which the goal is primarily to find correspondences between more general concepts or relations in the top-level ontology and more specific concepts and relations in the engineered ontology. Aligning an ontology to a top-level ontology can also be compared to automatically specializing or extending the top-level ontology. Methods like lexical substitution may be used to find clues as to whether or not a more general concept is related to a more specific one in the other ontology. The alignment can also make use of ontology engineering patterns: determining that a pattern can be applied, and then applying it, provides a connection to the top-level ontology.

XII.
Experiment
In order to evaluate the results of Text2Onto and to improve them, several experiments were carried out. The objectives of the experiments were:
• to analyze the various algorithms and criteria used by Text2Onto for extracting the different ontology components;
• to analyze the results produced by Text2Onto;
• to compare the components extracted by Text2Onto with those extracted manually;
• to analyze the errors found in the ontology built by Text2Onto and identify their origin;
• to analyze Text2Onto's outcomes when a meta-model of the ontology is added as an additional input.
Details on the experimental data and the experimental protocol are presented in the following sections.
XIII. Experimental Data
The experiments were conducted on three individual texts. The first, referred to as 'Abstract' from here on, was a compilation of the abstracts of four different papers. The remaining texts are referred to as 'Text1' and 'Text2'. All three texts were related to ontology building and ontology learning tools. Ontologies were built from these texts both manually and with Text2Onto.

XIV. Experimental Protocol
The experiments were performed in five phases. The first phase involved building the ontology manually from the three texts. The second phase was concerned with developing the ontology using Text2Onto. In the third phase, the ontology built by Text2Onto was compared with the manual one. In the next phase, the meta-model of each text was fed to Text2Onto and the corresponding ontology was built again. Finally, the results were compared with the earlier ontologies. These phases are described in detail in the following section.

14.1 Experimental Work-flow
The following steps were carried out for each text:
1. Building the ontology manually
Methontology was followed to build the ontologies from the three texts manually. All the steps, such as glossary building, meta-modeling and taxonomy definition, were followed for Abstract and Text2, whereas the ontology of Text1 was provided to us. The ontology was conceptualized in the following way:
1. POS-tagging of all the terms in the document.
2. Identifying the concepts and relations from the validated terms.
3. Building the meta-model, the aim being to subsume all the accepted concepts under a few core concepts.
4. Identifying the accepted terms (concepts) and their related core concepts, and finding their synonyms.
5. Defining the is-a hierarchy for the concepts and the identified core concepts.
6. Identifying other binary relations.
7.
Validating the meta-model.
2. Building the ontology using Text2Onto
This step involved using Text2Onto to build the same ontology automatically.
3. Analysis of the Text2Onto results
The analysis was itself done in two steps. First, the results of the different Text2Onto algorithms were compared with each other in order to find interesting criteria for the extraction of the different components. This was done for concept, instance, relation and hierarchy extraction; the main comparison criterion was the relevance value. Secondly, the results of the tasks performed in the two previous phases were compared and their differences studied, in order to estimate and comment on the quality of the ontology built by the tool. The comparison was detailed in the sense that all concepts, instances, relations and hierarchies extracted by the two methods were compared. It was followed by the identification of the causes of the differences and of the errors and shortcomings in the tool's performance.
4. Adding the meta-model to the ontology using Text2Onto
The idea was to observe whether Text2Onto gives better results when the ontology is built on top of its meta-model. For this, the meta-model built manually in the first phase was introduced into Text2Onto and the ontologies were built upon their corresponding meta-models. This process involved the following steps:
(a) Conversion of the meta-model into text. In order to get a POM of the meta-model, we converted the meta-model into text from which Text2Onto can extract the core concepts and the relations between them. Details about the conversion are given in Section XVI (Conversion of Meta-Model to Text).
(b) Obtaining the meta-model POM. The meta-model text was fed to Text2Onto to obtain a meta-model POM containing all the core concepts and the relations between them.
(c) Improving the ontology using the meta-model. Once the POM had been obtained from Text2Onto, the original text was added to it to build a new ontology combined with the meta-model.
5. Comparison of the ontologies built with and without the meta-model
In this phase, the ontology built in the second phase was compared with the one built using the meta-model. Relevance values, the identification of new components, and hierarchies were considered during the comparison.

XV. Results And Observations
15.1 Comparison of the algorithms and criteria of Text2Onto
The algorithms and criteria used by Text2Onto for extracting ontology components were studied in detail so as to compare their performance. The comparison was based on the relevance values computed by these algorithms.

15.1.1 Observations
Though the relevance values computed with entropy differ from those of the other algorithms, they preserve similar relations and relative values for the concepts. The same holds for combinations of one or more such evaluation algorithms. It was observed that the ranking of the extracted components is independent of the algorithms/criteria used, so we cannot say that one algorithm or criterion is superior to the others. We observed the same behaviour in all three texts.

XVI. Conversion Of Meta-Model To Text
In order to try to improve the ontology built by Text2Onto, the meta-model is used after being translated into text. Since all the concepts and relations of the meta-model should be identified when it is processed by the tool, our first attempt was to write a paragraph about the meta-model. This worked fine for most of the concepts, but very few relationships could be identified; some concepts were also left out, and some extra concepts were included (those used in the paragraph merely to structure the meta-model translation). The next attempt was to write simple sentences consisting of two nouns (the concepts) related by a verb (the relation between the two concepts).
We tried to use only the core concepts and relations from the text as much as possible. However, this too could not identify all the relations properly. Finally, a new algorithm was proposed so as to achieve the desired goal as well as to enhance the results of Text2Onto. Below are the translations of the meta-models for the various experimental data used.
16.1 Abstract Text
The meta-model of this text is given in figure 1. For this meta-model, we used the following lines to construct the meta-model POM in Text2Onto.
A system is composed of methods. A method has method components. A tool implements methods. An algorithm is used by methods. An expert participates in ontology building step. Ontology building step uses resources. A resource is stored in data repository. A term is included in resources. Ontology building step is composed of ontology building process. Ontology has ontology components. A user community uses ontologies. Ontology describes domain.
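The sentence-per-relation conversion described above can be sketched as follows. The triples are taken from the Abstract-Text meta-model sentences just listed; the helper functions themselves are purely illustrative and are not part of Text2Onto.

```python
# Sketch of the meta-model-to-text step: each (subject, verb-phrase, object)
# triple from the meta-model is rendered as one simple sentence that
# Text2Onto's extractors can parse. All names here are illustrative.
TRIPLES = [
    ("system", "is composed of", "methods"),
    ("tool", "implements", "methods"),
    ("algorithm", "is used by", "methods"),
]

def triple_to_sentence(subject, verb, obj):
    """Render one triple as a standalone simple sentence."""
    article = "An" if subject[0].lower() in "aeiou" else "A"
    return f"{article} {subject} {verb} {obj}."

def metamodel_to_text(triples):
    """One sentence per line, ready to feed to Text2Onto as a corpus."""
    return "\n".join(triple_to_sentence(*t) for t in triples)

print(metamodel_to_text(TRIPLES))
# A system is composed of methods.
# A tool implements methods.
# An algorithm is used by methods.
```

Keeping each sentence to a single noun-verb-noun clause is what makes the relations recoverable by the tool, as the paragraph-style translation attempted first did not.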
Figure 1: Abstract-Text Meta Model
16.2 Text1
The meta-model of this text is given in figure 2.
Figure 2: Text1 Meta Model
16.3 Text2
The meta-model of this text is given in figure 3 and the corresponding meta-model text is given below.
Domain has ontology. Ontology is composed by ontology components. Ontology is built by methodology. Tool builds ontology. Activity is guided by methodology. Activity produces model. Representation is resulted by model. Tool supports activity. Organization develops tool. Methodology is developed by organization. Tool uses language.
Person uses tool. Person creates ontology.
Figure 3: Text2 Meta Model
16.4 Comparison of Manual and Automated Ontologies
This section includes the comparison of the two methods of ontology building, i.e. MANUAL and AUTOMATED with the tool Text2Onto. The aim of the comparison is to evaluate the process of ontology building by the tool and then to analyze the results to suggest improvements to the tool.
16.4.1 Manual Ontology - Abstract
The Abstract text was the shortest of all the texts. It had 536 terms in total, out of which 34 terms were accepted as concepts and 9 as instances.
16.4.2 Automated Ontology - Abstract
The same text was fed to Text2Onto to automate the process of ontology building. As the importance of the ontology components based on relevance values was found to be independent of the algorithms used, we could choose any algorithm from the available list. As we were extracting the ontology from a single document, the algorithms that use the TFIDF criterion were not interesting for us, so we did not choose them during the analysis. The evaluation algorithms used in Text2Onto assigned relevance values to the concepts and the other components identified. Text2Onto did not support writing the results to a separate file, and hence we added another method that saves the results in a separate Excel file for each execution of Text2Onto. This was also necessary for the later comparison phases. Text2Onto extracted 85 concepts, 14 individuals, and 3 general relations.
16.4.3 Comparison of manual and automated ontology - Abstract
The two ontologies were compared mainly on the basis of the identified concepts, instances, and relations. Out of the 34 concepts extracted manually, only 26 matched the ones extracted by Text2Onto. Only 7 instances were common to both ontologies, and none of the relations were common to them.
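The result-export method mentioned above (saving each run's output to a spreadsheet-readable file) is not shown in the paper; a minimal sketch of such a step, with all names illustrative and CSV as a format Excel can open, might look like:

```python
import csv

# Hypothetical sketch of the added export step: Text2Onto itself does not
# write results to a file, so each execution dumps the extracted
# (term, relevance) pairs into a CSV file, highest relevance first.
def save_results(concepts, path):
    """Write (term, relevance) pairs to `path` in descending relevance order."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["concept", "relevance"])
        for term, relevance in sorted(concepts, key=lambda c: -c[1]):
            writer.writerow([term, relevance])

save_results([("tool", 0.55), ("ontology", 0.91)], "abstract_run.csv")
```

Persisting one file per execution is what makes the later manual-versus-automated comparisons reproducible.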
We observed that the manual ontology was better at identifying the concepts, because the ontology made by Text2Onto also included some irrelevant concepts. Another major problem was the identification of composite concepts: unlike in the manual ontology, not all composite concepts (consisting of more than one atomic word) were identified. The relations were not at all satisfactory. The possible reasons for these differences are as follows:
1. The text was not consistent as a whole. The text was basically a summarization of different texts and hence lacked synchronization between its different paragraphs. Thus there was a need to try with another, longer and better text so as to conclude anything significant.
2. The frequency of most of the terms (concepts and relations) was very low.
16.4.4 Manual ontology - Text1
For this ontology, there were 4807 terms after tokenization, of which 472 were nouns and 226 were verbs. After stemming, the number of nouns was reduced to 357, close to a 25% reduction compared with the original count.
16.4.5 Automated ontology - Text1
Text1 was fed to Text2Onto to build the ontology automatically. 406 concepts, 94 instances and 16 relations were extracted by Text2Onto.
16.4.6 Comparison of manual and automated ontologies - Text1
Compared to the 357 terms of the manual ontology, Text2Onto extracted 406 terms. Among them, only 87 concepts were common to both. Some highly irrelevant terms were also included in the results of Text2Onto on account of their high relevance values. On the other hand, some important composite terms were missing from the results of the automated ontology.
16.4.7 Manual ontology - Text2
Following the same procedure as above for building the manual ontology, there were 4761 terms in the knowledge base. 667 valid terms were refined from this knowledge base, of which ultimately 200 terms were accepted as concepts of the ontology.
16.4.8 Automated ontology - Text2
350 terms (concepts) were extracted from this text when it was run through Text2Onto. A lot of the concepts were insignificant and had to be rejected when the comparison was made.
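The stemming-based reduction used in the manual process (merging inflected noun variants so each stem is counted once) can be illustrated with a toy sketch; the suffix rules below are a crude stand-in for a real stemmer such as Porter's, not the one actually used.

```python
# Toy illustration of the stemming step used while refining the manual term
# list: nouns that reduce to the same stem are merged and counted once.
def stem(word):
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"          # ontologies -> ontology
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def merge_by_stem(nouns):
    """Distinct stems left after merging inflected variants."""
    return {stem(n) for n in nouns}

nouns = ["ontology", "ontologies", "concept", "concepts", "tool", "tools"]
print(sorted(merge_by_stem(nouns)))  # ['concept', 'ontology', 'tool']
```

In the experiment this kind of merge is what brought the 472 extracted nouns down to 357 candidate concepts.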
16.4.9 Comparison of Manual and Automated Ontologies - Text2
This automated ontology was better than the earlier two, as it could identify many relations and its is-a hierarchy was better than the others.
16.4.10 Observations
Relevance values and their roles
In order to assess the results of Text2Onto and the possibility of automating the process of ontology building, we examined the role of the relevance values of concepts in Text2Onto. The following observations were made:
• Most of the terms extracted by Text2Onto as concepts can be accepted based on their relevance values.
• The core concepts generally have very high relevance.
• Most of the terms with a high relevance value are accepted.
• There are concepts which are always rejected despite their very high values. After studying many papers and previous works in this field, there is no general rule that can be applied to automatically reject these terms, but some corpus-specific rules can be written.
• There are concepts which are accepted despite their low values.
In order to automate the last two cases, we tried to find out some information about these kinds of concepts. We observed that the terms with high relevance values which are nevertheless rejected occur in the same kind of pattern. An example is the concept 'ORDER': it is generally observed to appear as "IN ORDER TO". Thus, predefining many such patterns to exclude can be one solution for rejecting some terms despite their high relevance values.
16.5 Analysis of errors
16.5.1 Identification of errors
The following errors were identified while comparing the ontologies built manually and the ones built using Text2Onto:
1. Some concepts were also identified as instances by Text2Onto, e.g. ontology, WSD.
2. Acronyms were not identified by Text2Onto, e.g. SSI, POM.
3. Synonyms were not identified properly.
4. Very few relations were identified by Text2Onto, most of which were not appropriate (interesting) at all.
5. The instance-of algorithm did not give the instances given by the instance algorithm.
6. Some verbs like extract and inspect, which we had considered as relations, were identified as concepts by Text2Onto.
16.5.2 Identification of causes of errors
After an in-depth study of the algorithms of Text2Onto, the following causes of errors were observed:
1. The POS tagger used by GATE tags some words incorrectly. For example, the verb extract was tagged as a noun.
2. Errors may also be due to grammatical mistakes in the corpus file.
3. In the case of the Abstract text, errors may also be due to its length and content. The text contained 4 paragraphs from different papers and hence had few common terminologies.
4. The algorithms that extract concepts and instances work independently. Thus, the identification of a term as both a concept and an instance is not handled in Text2Onto.
5. The SubcatRelationExtraction algorithm can extract relations from simple sentences only. The patterns it can identify are:
Subject + transitive verb + object
Subject + transitive verb + object + preposition + object
Subject + intransitive verb + preposition + object
It identifies as relations only those verbs which come with a singular subject (concept). For example, it can extract the relation build from "a tool builds ontology" but not from "tools build ontology".
XVII. Improvement Of Text2Onto Results
As the results of Text2Onto were not good compared to the manual ontology, we did two things to improve them. First, we added an algorithm to improve the relation extraction of Text2Onto. Second, we performed some experiments with Text2Onto, adding the meta-model to the ontologies built above.
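The singular-subject restriction of SubcatRelationExtraction can be illustrated with a toy matcher over POS-tagged tokens. This is a sketch of the first pattern only (subject + transitive verb + object), not the tool's actual implementation; the Penn Treebank tags NN (singular noun), NNS (plural noun) and VBZ (3rd-person singular verb) are used.

```python
# Toy illustration of the "Subject + transitive verb + object" pattern:
# a relation is extracted only for a singular noun subject (NN) followed
# by a 3rd-person singular verb (VBZ) and a noun object (NN).
def extract_svo(tagged):
    """tagged: list of (word, POS) pairs; returns (subject, verb, object) triples."""
    relations = []
    for i in range(len(tagged) - 2):
        (w1, t1), (w2, t2), (w3, t3) = tagged[i:i + 3]
        if t1 == "NN" and t2 == "VBZ" and t3 == "NN":
            relations.append((w1, w2, w3))
    return relations

# Singular subject: the relation is found.
tagged = [("a", "DT"), ("tool", "NN"), ("builds", "VBZ"), ("ontology", "NN")]
print(extract_svo(tagged))  # [('tool', 'builds', 'ontology')]

# Plural subject (NNS/VBP): no match, mirroring the limitation described above.
tagged_pl = [("tools", "NNS"), ("build", "VBP"), ("ontology", "NN")]
print(extract_svo(tagged_pl))  # []
```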
The following section describes the added algorithm and the results and observations from the experiment.
17.1 Algorithm to improve Text2Onto results
The relations extracted by Text2Onto were not interesting at all. Moreover, we found it difficult to make Text2Onto extract all the relations from the meta-model text. So, we decided to add an algorithm to improve the result of relation extraction in Text2Onto. To extract more relations in order to make a better meta-model, we added two JAPE rules along with an algorithm to process them. The added JAPE rules identify sentences in passive voice and sentences with more than one verb (one auxiliary verb followed by a main verb) with a preposition, i.e. the following syntactical patterns:
• Subject + be-verb + main verb + "by" + object, e.g. "Ontology is built by experts"
• Subject + auxiliary verb + main verb + preposition + object, e.g. "Ontology is composed of components"
Though these patterns are similar to each other, we added two patterns instead of one in order to identify these grammatically significant patterns separately. The new algorithm can find these patterns in both the meta-model and the ontology text. As a result, we could obtain the relations that were not identified in the text earlier. The added JAPE expressions are as follows:

Rule: PassivePhrase
(
  ({NounPhrase} | {ProperNounPhrase}): object
  {SpaceToken.kind == space}
  ({Token.category == VBZ} | {Token.string == "is"}): auxverb
  {SpaceToken.kind == space}
  ({Token.category == VBN} | {Token.category == VBD}): verb
  {SpaceToken.kind == space}
  ({Token.string == "by"}): prep
  {SpaceToken.kind == space}
  ({NounPhrase} | {ProperNounPhrase}): subject
): passive
-->
  :passive.PassivePhrase = {rule = "PassivePhrase"},
  :verb.Verb = {rule = "PassivePhrase"},
  :subject.Subject = {rule = "PassivePhrase"},
  :object.Object = {rule = "PassivePhrase"},
  :prep.Preposition = {rule = "PassivePhrase"}

Rule: MultiVerbsWithPrep
(
  ({NounPhrase} | {ProperNounPhrase}): subject
  {SpaceToken.kind == space}
  ({Token.category == VBZ} | {Token.category == VB}): auxverb
  {SpaceToken.kind == space}
  ({Token.category == VBN} | {Token.category == VBD}): verb
  {SpaceToken.kind == space}
  ({Token.category == IN}): prep
  {SpaceToken.kind == space}
  ({NounPhrase} | {ProperNounPhrase}): object
): mvwp
-->
  :mvwp.MultiVerbsWithPrep = {rule = "MultiVerbsWithPrep"},
  :verb.Verb = {rule = "MultiVerbsWithPrep"},
  :subject.Subject = {rule = "MultiVerbsWithPrep"},
  :object.Object = {rule = "MultiVerbsWithPrep"},
  :prep.Preposition = {rule = "MultiVerbsWithPrep"}

These JAPE expressions are used by the GATE application to match the syntactical patterns. Using the new algorithm, we could extract more relations from the original text.
17.2 Enhancement of the Ontology using the Meta-Model
The main idea was to try to improve the results of Text2Onto so that the process of building an ontology can be automated. First of all, the text was fed to Text2Onto and the shortcomings were identified. In order to overcome them, we then fed the meta-model to it so as to obtain a better extraction of concepts, relations and taxonomy. The experiment was carried out for the three text documents. The results obtained from the text alone were compared with the results obtained from the meta-model plus the text to assess the improvement of the Text2Onto results.
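As an aside, the behavior of the PassivePhrase rule of section 17.1 can be approximated outside GATE with a small regex sketch. This is illustrative only: the real rule operates over GATE annotations (noun phrases, POS categories), not raw strings. The key point it demonstrates is that in a passive sentence the surface subject is the semantic object, so the roles are swapped in the extracted relation.

```python
import re

# Regex approximation of the passive pattern "X is <verb> by Y".
PASSIVE = re.compile(r"(\w+)\s+is\s+(\w+)\s+by\s+(\w+)")

def passive_relation(sentence):
    """Return (semantic subject, verb, semantic object), or None if no match."""
    m = PASSIVE.search(sentence)
    if m is None:
        return None
    obj, verb, subj = m.groups()  # surface subject is the semantic object
    return (subj, verb, obj)

print(passive_relation("Ontology is built by experts"))
# ('experts', 'built', 'Ontology')
```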
17.2.1 Observations
The following observations were made when the meta-model and the ontology were used on the same POM to make the ontology:
1. All the core concepts were identified and their relevance was increased. (The core concepts were identified earlier too.)
2. The core concepts which are not present in the text had greater values.
3. The relations from the meta-model were identified and included in the ontology. Due to the addition of more patterns, some more relations were identified from the text. However, the useful relations are limited to the core concepts.
4. The hierarchy does not seem to be improved by the algorithms VerticalRelationsConceptClassification and PatternConceptClassification. Rather, core concepts with composite terms are further classified by these algorithms; for example, Ontology component was classified under Component. We have not checked this with the WordnetConceptClassification algorithm yet, as it gives lots of irrelevant subclass-of relations.
From these behaviors, we can present the following ideas for making the meta-model:
• We can make the meta-model with terms not present in the text (point 2).
• If terms present in the text are used for making the meta-model, we can try to increase the frequency of the core concepts in the meta-model itself (point 1).
• We can avoid composite terms in the meta-model as much as possible (point 4).
XVIII. Conclusion
We studied the architecture and working of a tool called Text2Onto that extracts ontologies from textual input, and analyzed its results by conducting experiments with three texts. As part of the experiments, ontologies were built manually as well as with the tool, and they were compared with each other. After a detailed analysis of the results, we reached the following conclusions:
1. The relevance measure cannot be a general measure to reject or accept all the terms. In the automated ontology, there are several terms that have high relevance values and are still rejected by the experts because they hold no importance for the ontology. Also, there are terms which, even with a significantly low relevance value, are accepted. This is also very common with the core concepts. Hence the idea of directly using relevance values for accepting or rejecting concepts needs further refinement.
2. The meta-model could not improve the ontology in terms of its is-a hierarchy.
Though the meta-model increased the relevance values of the core concepts, the is-a hierarchy was not improved. Even with more extracted relations and properly identified core concepts using the meta-model, it could not help in making the hierarchy better. Identifying the relations and concepts has no effect on the results of the subclass-of algorithm. As stated above, there are a few refinements that can be made; they are suggested in the next section of the report.
XIX. Future Work
From the study of Text2Onto and the outcome of the analysis of its results, we suggest the following future work and enhancements to Text2Onto.
1. Enhance the use of the meta-model to modify the is-a hierarchy of the ontology. After adding the corpus to the upper ontology (using the meta-model), we should increase the relevance values of the concepts that were identified only for the upper ontology, because those core concepts may not be frequent or very relevant.
2. We can try to manually include the following kind of hierarchy in the ontology. Text2Onto uses the following idea while extracting relations: if A <is related to> B and C <is related to> D, then A <is related to> D and C <is related to> B also. This kind of relation structure can be exploited to improve the hierarchy of concepts: if A <related to> B and C <related to> D, then C and D can be considered to be subclasses of A and B respectively. Though this idea may not be applicable to all relations, we can enhance the meta-model significantly for some relations with the same name.
3. Another algorithm can be added where some of the "unwanted" domain concepts can be predefined and hence avoided in the ontology. This task will require human interaction before starting to build the ontology, because the "interestingness" of the concepts depends significantly on the domain. A similar approach can be followed for the "infrequent" but "significant" concepts of a particular domain.
These two approaches can lead us to use the relevance measure as a significant criterion to accept or reject a term. Hence the problem of the difference in concepts between the manual and the automated ontology can be overcome.
4. As the algorithms are executed separately, some terms are identified as both concepts and instances. A feature (or post-processing step) can be included so that a term is listed either as a concept or as an individual, but not as both. Post-processing is also required to remove unnecessary or irrelevant subsumption relations. Synonyms can be taken into account to improve the result of the subsumption algorithm.
5. A module can be added to identify acronyms. For example, from the text, POM and "probabilistic ontology model" should be identified as one term.
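The acronym-detection module suggested in point 5 can be sketched as follows, assuming the simple heuristic that an acronym's letters are the initials of a multi-word expansion appearing in the same text. The function names and the heuristic itself are illustrative, not an existing Text2Onto component.

```python
import re

# Sketch of an acronym detector: an ALL-CAPS token is paired with any
# n-word phrase in the text whose initials spell the acronym.
def is_acronym_of(acronym, expansion):
    """True if `acronym` matches the initials of the words in `expansion`."""
    initials = "".join(w[0] for w in expansion.split())
    return acronym.lower() == initials.lower()

def find_acronym_pairs(text):
    """Pair each ALL-CAPS token with a matching multi-word phrase, if any."""
    acronyms = set(re.findall(r"\b[A-Z]{2,}\b", text))
    words = re.findall(r"\b\w+\b", text.lower())
    pairs = []
    for acro in acronyms:
        n = len(acro)
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if is_acronym_of(acro, phrase):
                pairs.append((acro, phrase))
    return pairs

text = "Text2Onto produces a probabilistic ontology model. The POM stores results."
print(find_acronym_pairs(text))  # [('POM', 'probabilistic ontology model')]
```

Merging each detected pair into a single term would prevent the duplicate concepts (POM versus probabilistic ontology model) observed in the experiments.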