SlideShare uma empresa Scribd logo
1 de 4
Baixar para ler offline
Ms. Vidya. V, Mrs. S. C. Punitha / International Journal of Engineering Research and
Applications (IJERA) ISSN: 2248-9622 www.ijera.com
Vol. 3, Issue 4, Jul-Aug 2013, pp.2449-2452
2449 | P a g e
Imitated Rulepossession From Identical Web Sites Using Rule Ontology
Ms. Vidya. V*, Mrs. S. C. Punitha**
*M.Phil Scholar, PSGR Krishnammal College for Women, Coimbatore, India
** Head of the Department, Computer Science, PSGR Krishnammal College for Women, Coimbatore, India.
ABSTRACT
The rule ontology is a generalized,
condensed, and specifically rearranged version of
the existing rules. Rule acquisition research is
relatively unpopular while there are many works
on ontology learning. We proposed an automatic
rule acquisition procedure using ontology, named
Rule To Onto, that includes information about the
rule components and their structures. We started
from the idea that it will be helpful to acquire
rules from a site if we have similar rules acquired
from other similar sites of the same domain. Rule
To Onto is a generalized, condensed, and
specifically rearranged version of the existing
rules. The rule acquisition procedure consists of
the rule component identification step and the rule
composition step. We used stemming and semantic
similarity in the former step and developed the A*
algorithm[1] in the latter step. This paper focuses
on using Rules Extraction to automatically
augment web pages with semantic annotations,
and find out whether and how ontology Rules
Extractions could be used in the extraction
process.
Keywords- rule extraction, rule onto, semantic
similarity, stemming.
I. INTRODUCTION
Automatic Rules possession is a relatively
fresh area of work and in the literature there is still
very little guidance about how Rules Extraction
projects should be managed. Every project typically
starts by setting its scope and goals; in case of Rules
Extraction one needs to define the items to be
extracted, from which documents to perform
extraction, how to integrate and store the extracted
data, and how to use it in a final application. Rule
acquisition is as essential as ontology acquisition,
even though rule acquisition is still a bottleneck in
the deployment of rule-based systems. This is time
consuming and laborious, because it requires
knowledge experts as well as domain experts, and
there are communication problems between them.
However, sometimes rules have already been implied
in Web pages, and it is possible to acquire them from
Web pages in the same manner as ontology learning
[2]. There are some problems with extracting rules
from text. First, which words of the Web page are
rule components and which types of rule components
are they?. Second, how can we compose rules with
the rule components? There are numerous possible
combinations of making rules. Our idea for solving
these problems is using rules of similar sites in
limited situations under a couple of assumptions. Let
us suppose that we have to acquire rules from several
sites of the same domain. The sites have similar Web
pages explaining similar rules from each other. A
comparison shopping portal can be an example.
II. STEPS IN EXTRACTION OF RULES
2.1 Role of Rule Ontology
The purpose of using ontology in our
approach is to automate the rule acquisition
procedure. The starting point of our approach is that
it will be helpful for acquiring rules from a site, if we
have similar rules acquired from other similar sites of
the same domain[3] . Rule ontology, which , includes
the information about rules including terms, rule
component types, and rule structures. We named the
rule ontology Rule To Onto. It has the advantage that
it is structured information and is much smaller than
rule bases, so that it is easy to reuse, share, and
accumulate. However, some part of the burden of
rule acquisition is shifted to ontology acquisition in
our approach. The fact that we need similar sites and
rule bases is a significant constraint in our approach,
even though we have an automatic procedure for
building ontology from the existing rule bases.
The following figure shows the conversion of html
document to text .
FIG.1 converting HTML TO TEXT
2.2 Rule Extraction
A large number of Rules Extraction systems
are based on manually defined grammars whose aim
is to identify segments of interest in the stream of
Ms. Vidya. V, Mrs. S. C. Punitha / International Journal of Engineering Research and
Applications (IJERA) ISSN: 2248-9622 www.ijera.com
Vol. 3, Issue 4, Jul-Aug 2013, pp.2449-2452
2450 | P a g e
processed text. These systems Rules Extraction the
processed text using the phrase representation,
Regular grammars have been the most popular since
text can be searched very effectively for their matches
using finite state automata. The following figure
shows the extractions.
FIG.2 EXTRACTIONS USING RULE ONTO
Rules Extraction system utilized a cascade
of regular grammars to capture occurrences of
increasingly complex events in text; the latter layers
matched output of the former. For example, the first
levels of regular grammars captured multi-word
expressions, then noun and verb groups RULE
ONTO Extractions and Rules Extraction, modeling
parts of reality with domain RULE ONTO
Extractions became increasingly popular and a
number of ontology authoring tools appeared. Rules
Extraction techniques became the natural choice to
populate these ontology Rules Extractions with
instances from text automatically.
2.3 Stemming and Semantic Similarity
In information retrieval, stemming [4] is the
process for reducing inflected (or sometimes derived)
words to their stem, base or root form—generally a
written word form. The stem need not be identical to
the morphological root of the word; it is usually
sufficient that related words map to the same stem,
even if this stem is not in itself a valid root. typically
smaller list of "rules" is stored which provides a path
for the algorithm, given an input word form, to find
its root form. Some examples of the rules include:
 if the word ends in 'ed', remove the 'ed'
 if the word ends in 'ing', remove the 'ing'
 if the word ends in 'ly', remove the 'ly'
The following figure shows the stemming of
the word “timing” to “time”.
FIG 3.STEMMING
Even though the patterns and contents of
rules of different sites are similar, they usually use
different terms that have the same meaning. They use
synonyms in most cases, but they sometimes use
semantically similar[5] concepts with different rule
structures. For example, Amazon uses the concept
region for shipping destinations, but Powells.com
uses country in every shipping rate rules. Country is
not the synonym of region, but is semantically similar
to region. Therefore, we decided to use a semantic
similarity measure in addition to synonyms in order
to increase the recall rate when we identify variables
and values
The following figure shows the semantic
similarity of few keywords.
FIG 4.Semantic Similarity
III. RULE ONTOLOGY GENERATION
RuleToOnto is domain specific knowledge
that provides information about rule components and
structures. It is possible to directly use the rules of the
previous system instead of the proposed ontology.
However, it requires a large space and additional
processes to utilize information on rules, while Rule
To Onto is a generalized compact set of information
for rule acquisition. Thus, we use Rule To Onto
instead of the rules themselves. While the rule
component identification step needs variables, values,
Ms. Vidya. V, Mrs. S. C. Punitha / International Journal of Engineering Research and
Applications (IJERA) ISSN: 2248-9622 www.ijera.com
Vol. 3, Issue 4, Jul-Aug 2013, pp.2449-2452
2451 | P a g e
and the relationship between them, the rule
composition step requires generalized rule structures.
Therefore, Rule To Onto represents the IF and THEN
parts of each rule by connecting rules with variables
with the IF and THEN relations, in addition to basic
information about variables, values, and connections
between variables and values. The Rule To Onto
schema has three object properties Has Value, IF and
THEN, and three classes, Variable, Value, and Rule.
3.1 RuleToOnto Generation Using Protégé
The following depicts the RuleToOnto
Generation of the keywords using Protégé[6]: Email,
mobile, phone, password
<?xml version="1.0"?>
<rdf:RDF
xmlns="http://a.com/ontology#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-
syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-
schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xml:base="http://a.com/ontology">
<owl:Ontology rdf:about=""/>
<owl:Class rdf:ID="email">
<rdfs:subClassOf>
<owl:Class rdf:about="#mobile"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="phone">
<rdfs:subClassOf>
<owl:Class rdf:about="#password"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="null"/>
<owl:Class rdf:ID="null">
<rdfs:subClassOf rdf:resource="#null"/>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf>
<owl:Class rdf:ID="null"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf rdf:resource="#null"/>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf rdf:resource="#null"/>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf>
<owl:Class rdf:about="#null"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf rdf:resource="#null"/>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf>
<owl:Class rdf:about="#null"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf>
<owl:Class rdf:about="#null"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf>
<owl:Class rdf:about="#null"/>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf rdf:resource="#null"/>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf rdf:resource="#null"/>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf rdf:resource="#null"/>
</owl:Class>
<owl:Class rdf:ID="null">
<rdfs:subClassOf rdf:resource="#null"/>
</owl:Class>
</rdf:RDF>
<!-- Created with Protege (with OWL Plugin 1.2
beta, Build 139) http://protege.stanford.edu -->
3.2 Precision and recall
Precision also called positive predictive
value is the fraction of retrieved instances that are
relevant, while recall also known as sensitivity is the
fraction of relevant instances that are retrieved. Both
precision and recall are therefore based on an
understanding and measure of relevance.
The following figure shows the precision
and recall chart of the extracted rules .
FIG.5 Precision and Recall chart
Ms. Vidya. V, Mrs. S. C. Punitha / International Journal of Engineering Research and
Applications (IJERA) ISSN: 2248-9622 www.ijera.com
Vol. 3, Issue 4, Jul-Aug 2013, pp.2449-2452
2452 | P a g e
IV. IV.CONCLUSION
This paper focuses on the approaches of
Rules Extraction from documents, focusing on semi-
structured texts such as HTML pages or emails. One
limitation of our approach is that the experiment
results do not show that the performance of our
approach is better than others, because there is no
other rule possession study that we can directly
compare our results with. This work can be extended
for an image classification method that is later used
for multimedia Rules Extraction that classify images
found on the Internet into irrelevant pictures and
pictures that should be considered for extraction.
V. ACKNOWLEDGEMENTS
This research work is carried out with
valuable support by Mrs. S.C. Punitha, Head, and
Department of Computer Science PSGR
Krishnammal College for Women, Coimbatore,
India.
REFERENCES
[1] J.C. Beck and M. Fox, “A Generic
Framework for Constraint Directed Search
and Scheduling,” AI Magazine, vol. 19, no.
4, pp. 101-130, 1998.
[2] P. Cimiano and J. Volker, “Text2onto-a
Framework for Ontology Learning and
Data-Driven Change Discovery,” Proc.
10th Int’l Conf. Applications of Natural
Language to Rules Systems (NLDB), pp.
227-238, 2005.
[3] Sangun Park and Juyoung Kang “Using Rule
Ontology in Repeated RuleAcquisition from
Similar Web Sites” IEEE TRANSACTIONS
ON KNOWLEDGE AND DATA
ENGINEERING, VOL. 24, NO. 6, JUNE
2012
[4] B. Priyadharshini, A. Mohana, R. Radhika,
“Retrieving Rules Using Composetoonto
Based On Stemming Algorithm From
Similar Web Sites” , International Journal of
Engineering Research & Technology
(IJERT) Vol. 2 Issue 3, March - 2013 ISSN:
2278-0181
[5] David Sánchez, Montserrat Batet, David
Isern, Aida Valls, “Ontology-based semantic
similarity: a new feature-based approach”
Intelligent Technologies for Advanced
Knowledge Acquisition
(ITAKA),Departament d’Enginyeria
Informàtica i Matemàtiques, Universitat
Rovira i Virgili, Avda. Països Catalans, 26.
43007 Tarragona (Spain)
[6] P. Buitelaar, D. Olejnik, and M. Sintek, “A
Prote´ge´ Plug-in for Ontology Extraction
from Text Based on Linguistic Analysis,”
Proc. First European Semantic Web Symp.
(ESWS), 2004.

Mais conteúdo relacionado

Mais procurados

Mais procurados (16)

A03730108
A03730108A03730108
A03730108
 
Identifying the semantic relations on
Identifying the semantic relations onIdentifying the semantic relations on
Identifying the semantic relations on
 
D017422528
D017422528D017422528
D017422528
 
Hyponymy extraction of domain ontology
Hyponymy extraction of domain ontologyHyponymy extraction of domain ontology
Hyponymy extraction of domain ontology
 
Discovering latent informaion by
Discovering latent informaion byDiscovering latent informaion by
Discovering latent informaion by
 
Ontology Engineering for Big Data
Ontology Engineering for Big DataOntology Engineering for Big Data
Ontology Engineering for Big Data
 
[IJET-V1I6P17] Authors : Mrs.R.Kalpana, Mrs.P.Padmapriya
[IJET-V1I6P17] Authors : Mrs.R.Kalpana, Mrs.P.Padmapriya[IJET-V1I6P17] Authors : Mrs.R.Kalpana, Mrs.P.Padmapriya
[IJET-V1I6P17] Authors : Mrs.R.Kalpana, Mrs.P.Padmapriya
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_Habib
 
Ijetcas14 624
Ijetcas14 624Ijetcas14 624
Ijetcas14 624
 
Sentence similarity-based-text-summarization-using-clusters
Sentence similarity-based-text-summarization-using-clustersSentence similarity-based-text-summarization-using-clusters
Sentence similarity-based-text-summarization-using-clusters
 
Ijetcas14 639
Ijetcas14 639Ijetcas14 639
Ijetcas14 639
 
ontology based- data_integration.
ontology based- data_integration.ontology based- data_integration.
ontology based- data_integration.
 
Named entity recognition using web document corpus
Named entity recognition using web document corpusNamed entity recognition using web document corpus
Named entity recognition using web document corpus
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Named Entity Recognition Using Web Document Corpus
Named Entity Recognition Using Web Document CorpusNamed Entity Recognition Using Web Document Corpus
Named Entity Recognition Using Web Document Corpus
 
Translating Ontologies in Real-World Settings
Translating Ontologies in Real-World SettingsTranslating Ontologies in Real-World Settings
Translating Ontologies in Real-World Settings
 

Destaque (8)

Herramientas web
Herramientas webHerramientas web
Herramientas web
 
cumpleaños
cumpleañoscumpleaños
cumpleaños
 
Mirza ghalib famous quotes / sher
Mirza ghalib famous quotes / sherMirza ghalib famous quotes / sher
Mirza ghalib famous quotes / sher
 
Niles West v Glenbrook South 1984
Niles West v Glenbrook South 1984Niles West v Glenbrook South 1984
Niles West v Glenbrook South 1984
 
Sistema nervioso
Sistema nerviosoSistema nervioso
Sistema nervioso
 
Como eliminar lunares del rostro
Como eliminar lunares del rostroComo eliminar lunares del rostro
Como eliminar lunares del rostro
 
REPOSITORIOS
REPOSITORIOSREPOSITORIOS
REPOSITORIOS
 
Unetenet presentation plan
Unetenet presentation plan Unetenet presentation plan
Unetenet presentation plan
 

Semelhante a Nz3424492452

Towards From Manual to Automatic Semantic Annotation: Based on Ontology Eleme...
Towards From Manual to Automatic Semantic Annotation: Based on Ontology Eleme...Towards From Manual to Automatic Semantic Annotation: Based on Ontology Eleme...
Towards From Manual to Automatic Semantic Annotation: Based on Ontology Eleme...
IJwest
 
Ijarcet vol-2-issue-4-1357-1362
Ijarcet vol-2-issue-4-1357-1362Ijarcet vol-2-issue-4-1357-1362
Ijarcet vol-2-issue-4-1357-1362
Editor IJARCET
 
Ck32985989
Ck32985989Ck32985989
Ck32985989
IJMER
 

Semelhante a Nz3424492452 (20)

A Framework for Automated Association Mining Over Multiple Databases
A Framework for Automated Association Mining Over Multiple DatabasesA Framework for Automated Association Mining Over Multiple Databases
A Framework for Automated Association Mining Over Multiple Databases
 
CANDIDATE SET KEY DOCUMENT RETRIEVAL SYSTEM
CANDIDATE SET KEY DOCUMENT RETRIEVAL SYSTEMCANDIDATE SET KEY DOCUMENT RETRIEVAL SYSTEM
CANDIDATE SET KEY DOCUMENT RETRIEVAL SYSTEM
 
IRJET - BOT Virtual Guide
IRJET -  	  BOT Virtual GuideIRJET -  	  BOT Virtual Guide
IRJET - BOT Virtual Guide
 
User friendly pattern search paradigm
User friendly pattern search paradigmUser friendly pattern search paradigm
User friendly pattern search paradigm
 
IRJET - Deep Collaborrative Filtering with Aspect Information
IRJET - Deep Collaborrative Filtering with Aspect InformationIRJET - Deep Collaborrative Filtering with Aspect Information
IRJET - Deep Collaborrative Filtering with Aspect Information
 
Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[
 
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific OntologyConceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology
 
K355662
K355662K355662
K355662
 
K355662
K355662K355662
K355662
 
Browser Extension to Removing Dust using Sequence Alignment and Content Matc...
Browser Extension to Removing Dust using Sequence Alignment and Content  Matc...Browser Extension to Removing Dust using Sequence Alignment and Content  Matc...
Browser Extension to Removing Dust using Sequence Alignment and Content Matc...
 
ONTOLOGICAL TREE GENERATION FOR ENHANCED INFORMATION RETRIEVAL
ONTOLOGICAL TREE GENERATION FOR ENHANCED INFORMATION RETRIEVALONTOLOGICAL TREE GENERATION FOR ENHANCED INFORMATION RETRIEVAL
ONTOLOGICAL TREE GENERATION FOR ENHANCED INFORMATION RETRIEVAL
 
Optimization of Mining Association Rule from XML Documents
Optimization of Mining Association Rule from XML DocumentsOptimization of Mining Association Rule from XML Documents
Optimization of Mining Association Rule from XML Documents
 
Towards From Manual to Automatic Semantic Annotation: Based on Ontology Eleme...
Towards From Manual to Automatic Semantic Annotation: Based on Ontology Eleme...Towards From Manual to Automatic Semantic Annotation: Based on Ontology Eleme...
Towards From Manual to Automatic Semantic Annotation: Based on Ontology Eleme...
 
The Revolution Of Cloud Computing
The Revolution Of Cloud ComputingThe Revolution Of Cloud Computing
The Revolution Of Cloud Computing
 
Ijarcet vol-2-issue-4-1357-1362
Ijarcet vol-2-issue-4-1357-1362Ijarcet vol-2-issue-4-1357-1362
Ijarcet vol-2-issue-4-1357-1362
 
Towards Ontology Development Based on Relational Database
Towards Ontology Development Based on Relational DatabaseTowards Ontology Development Based on Relational Database
Towards Ontology Development Based on Relational Database
 
EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...
EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...
EMPLOYING THE CATEGORIES OF WIKIPEDIA IN THE TASK OF AUTOMATIC DOCUMENTS CLUS...
 
A Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User TransactionsA Soft Set-based Co-occurrence for Clustering Web User Transactions
A Soft Set-based Co-occurrence for Clustering Web User Transactions
 
Ck32985989
Ck32985989Ck32985989
Ck32985989
 
Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach
 

Último

Último (20)

Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Nz3424492452

  • 1. Ms. Vidya. V, Mrs. S. C. Punitha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 3, Issue 4, Jul-Aug 2013, pp.2449-2452 2449 | P a g e Imitated Rulepossession From Identical Web Sites Using Rule Ontology Ms. Vidya. V*, Mrs. S. C. Punitha** *M.Phil Scholar, PSGR Krishnammal College for Women, Coimbatore, India ** Head of the Department, Computer Science, PSGR Krishnammal College for Women, Coimbatore, India. ABSTRACT The rule ontology is a generalized, condensed, and specifically rearranged version of the existing rules. Rule acquisition research is relatively unpopular while there are many works on ontology learning. We proposed an automatic rule acquisition procedure using ontology, named Rule To Onto, that includes information about the rule components and their structures. We started from the idea that it will be helpful to acquire rules from a site if we have similar rules acquired from other similar sites of the same domain. Rule To Onto is a generalized, condensed, and specifically rearranged version of the existing rules. The rule acquisition procedure consists of the rule component identification step and the rule composition step. We used stemming and semantic similarity in the former step and developed the A* algorithm[1] in the latter step. This paper focuses on using Rules Extraction to automatically augment web pages with semantic annotations, and find out whether and how ontology Rules Extractions could be used in the extraction process. Keywords- rule extraction, rule onto, semantic similarity, stemming. I. INTRODUCTION Automatic Rules possession is a relatively fresh area of work and in the literature there is still very little guidance about how Rules Extraction projects should be managed. Every project typically starts by setting its scope and goals; in case of Rules Extraction one needs to define the items to be extracted, from which documents to perform extraction, how to integrate and store the extracted data, and how to use it in a final application. Rule acquisition is as essential as ontology acquisition, even though rule acquisition is still a bottleneck in the deployment of rule-based systems. This is time consuming and laborious, because it requires knowledge experts as well as domain experts, and there are communication problems between them. However, sometimes rules have already been implied in Web pages, and it is possible to acquire them from Web pages in the same manner as ontology learning [2]. There are some problems with extracting rules from text. First, which words of the Web page are rule components and which types of rule components are they?. Second, how can we compose rules with the rule components? There are numerous possible combinations of making rules. Our idea for solving these problems is using rules of similar sites in limited situations under a couple of assumptions. Let us suppose that we have to acquire rules from several sites of the same domain. The sites have similar Web pages explaining similar rules from each other. A comparison shopping portal can be an example. II. STEPS IN EXTRACTION OF RULES 2.1 Role of Rule Ontology The purpose of using ontology in our approach is to automate the rule acquisition procedure. The starting point of our approach is that it will be helpful for acquiring rules from a site, if we have similar rules acquired from other similar sites of the same domain[3] . Rule ontology, which , includes the information about rules including terms, rule component types, and rule structures. We named the rule ontology Rule To Onto. It has the advantage that it is structured information and is much smaller than rule bases, so that it is easy to reuse, share, and accumulate. However, some part of the burden of rule acquisition is shifted to ontology acquisition in our approach. The fact that we need similar sites and rule bases is a significant constraint in our approach, even though we have an automatic procedure for building ontology from the existing rule bases. The following figure shows the conversion of html document to text . FIG.1 converting HTML TO TEXT 2.2 Rule Extraction A large number of Rules Extraction systems are based on manually defined grammars whose aim is to identify segments of interest in the stream of
  • 2. Ms. Vidya. V, Mrs. S. C. Punitha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 3, Issue 4, Jul-Aug 2013, pp.2449-2452 2450 | P a g e processed text. These systems Rules Extraction the processed text using the phrase representation, Regular grammars have been the most popular since text can be searched very effectively for their matches using finite state automata. The following figure shows the extractions. FIG.2 EXTRACTIONS USING RULE ONTO Rules Extraction system utilized a cascade of regular grammars to capture occurrences of increasingly complex events in text; the latter layers matched output of the former. For example, the first levels of regular grammars captured multi-word expressions, then noun and verb groups RULE ONTO Extractions and Rules Extraction, modeling parts of reality with domain RULE ONTO Extractions became increasingly popular and a number of ontology authoring tools appeared. Rules Extraction techniques became the natural choice to populate these ontology Rules Extractions with instances from text automatically. 2.3 Stemming and Semantic Similarity In information retrieval, stemming [4] is the process for reducing inflected (or sometimes derived) words to their stem, base or root form—generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root. typically smaller list of "rules" is stored which provides a path for the algorithm, given an input word form, to find its root form. Some examples of the rules include:  if the word ends in 'ed', remove the 'ed'  if the word ends in 'ing', remove the 'ing'  if the word ends in 'ly', remove the 'ly' The following figure shows the stemming of the word “timing” to “time”. FIG 3.STEMMING Even though the patterns and contents of rules of different sites are similar, they usually use different terms that have the same meaning. They use synonyms in most cases, but they sometimes use semantically similar[5] concepts with different rule structures. For example, Amazon uses the concept region for shipping destinations, but Powells.com uses country in every shipping rate rules. Country is not the synonym of region, but is semantically similar to region. Therefore, we decided to use a semantic similarity measure in addition to synonyms in order to increase the recall rate when we identify variables and values The following figure shows the semantic similarity of few keywords. FIG 4.Semantic Similarity III. RULE ONTOLOGY GENERATION RuleToOnto is domain specific knowledge that provides information about rule components and structures. It is possible to directly use the rules of the previous system instead of the proposed ontology. However, it requires a large space and additional processes to utilize information on rules, while Rule To Onto is a generalized compact set of information for rule acquisition. Thus, we use Rule To Onto instead of the rules themselves. While the rule component identification step needs variables, values,
  • 3. Ms. Vidya. V, Mrs. S. C. Punitha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 3, Issue 4, Jul-Aug 2013, pp.2449-2452 2451 | P a g e and the relationship between them, the rule composition step requires generalized rule structures. Therefore, Rule To Onto represents the IF and THEN parts of each rule by connecting rules with variables with the IF and THEN relations, in addition to basic information about variables, values, and connections between variables and values. The Rule To Onto schema has three object properties Has Value, IF and THEN, and three classes, Variable, Value, and Rule. 3.1 RuleToOnto Generation Using Protégé The following depicts the RuleToOnto Generation of the keywords using Protégé[6]: Email, mobile, phone, password <?xml version="1.0"?> <rdf:RDF xmlns="http://a.com/ontology#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf- syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf- schema#" xmlns:owl="http://www.w3.org/2002/07/owl#" xml:base="http://a.com/ontology"> <owl:Ontology rdf:about=""/> <owl:Class rdf:ID="email"> <rdfs:subClassOf> <owl:Class rdf:about="#mobile"/> </rdfs:subClassOf> </owl:Class> <owl:Class rdf:ID="phone"> <rdfs:subClassOf> <owl:Class rdf:about="#password"/> </rdfs:subClassOf> </owl:Class> <owl:Class rdf:ID="null"/> <owl:Class rdf:ID="null"> <rdfs:subClassOf rdf:resource="#null"/> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf> <owl:Class rdf:ID="null"/> </rdfs:subClassOf> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf rdf:resource="#null"/> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf rdf:resource="#null"/> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf> <owl:Class rdf:about="#null"/> </rdfs:subClassOf> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf rdf:resource="#null"/> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf> <owl:Class rdf:about="#null"/> </rdfs:subClassOf> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf> <owl:Class rdf:about="#null"/> </rdfs:subClassOf> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf> <owl:Class rdf:about="#null"/> </rdfs:subClassOf> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf rdf:resource="#null"/> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf rdf:resource="#null"/> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf rdf:resource="#null"/> </owl:Class> <owl:Class rdf:ID="null"> <rdfs:subClassOf rdf:resource="#null"/> </owl:Class> </rdf:RDF> <!-- Created with Protege (with OWL Plugin 1.2 beta, Build 139) http://protege.stanford.edu --> 3.2 Precision and recall Precision also called positive predictive value is the fraction of retrieved instances that are relevant, while recall also known as sensitivity is the fraction of relevant instances that are retrieved. Both precision and recall are therefore based on an understanding and measure of relevance. The following figure shows the precision and recall chart of the extracted rules . FIG.5 Precision and Recall chart
  • 4. Ms. Vidya. V, Mrs. S. C. Punitha / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 3, Issue 4, Jul-Aug 2013, pp.2449-2452 2452 | P a g e IV. IV.CONCLUSION This paper focuses on the approaches of Rules Extraction from documents, focusing on semi- structured texts such as HTML pages or emails. One limitation of our approach is that the experiment results do not show that the performance of our approach is better than others, because there is no other rule possession study that we can directly compare our results with. This work can be extended for an image classification method that is later used for multimedia Rules Extraction that classify images found on the Internet into irrelevant pictures and pictures that should be considered for extraction. V. ACKNOWLEDGEMENTS This research work is carried out with valuable support by Mrs. S.C. Punitha, Head, and Department of Computer Science PSGR Krishnammal College for Women, Coimbatore, India. REFERENCES [1] J.C. Beck and M. Fox, “A Generic Framework for Constraint Directed Search and Scheduling,” AI Magazine, vol. 19, no. 4, pp. 101-130, 1998. [2] P. Cimiano and J. Volker, “Text2onto-a Framework for Ontology Learning and Data-Driven Change Discovery,” Proc. 10th Int’l Conf. Applications of Natural Language to Rules Systems (NLDB), pp. 227-238, 2005. [3] Sangun Park and Juyoung Kang “Using Rule Ontology in Repeated RuleAcquisition from Similar Web Sites” IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 24, NO. 6, JUNE 2012 [4] B. Priyadharshini, A. Mohana, R. Radhika, “Retrieving Rules Using Composetoonto Based On Stemming Algorithm From Similar Web Sites” , International Journal of Engineering Research & Technology (IJERT) Vol. 2 Issue 3, March - 2013 ISSN: 2278-0181 [5] David Sánchez, Montserrat Batet, David Isern, Aida Valls, “Ontology-based semantic similarity: a new feature-based approach” Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA),Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Avda. Països Catalans, 26. 43007 Tarragona (Spain) [6] P. Buitelaar, D. Olejnik, and M. Sintek, “A Prote´ge´ Plug-in for Ontology Extraction from Text Based on Linguistic Analysis,” Proc. First European Semantic Web Symp. (ESWS), 2004.