Enviar pesquisa
Carregar
105 108
•
0 gostou
•
333 visualizações
E
Editor IJARCET
Seguir
Tecnologia
Denunciar
Compartilhar
Denunciar
Compartilhar
1 de 4
Baixar agora
Baixar para ler offline
Recomendados
625 634
625 634
Editor IJARCET
Cl32543545
Cl32543545
IJERA Editor
Df25632640
Df25632640
IJERA Editor
Similarity based Dynamic Web Data Extraction and Integration System from Sear...
Similarity based Dynamic Web Data Extraction and Integration System from Sear...
IDES Editor
Pxc3893553
Pxc3893553
Ouzza Brahim
PTAR UITM Website review
PTAR UITM Website review
Dirz M
Literature Survey on Web Mining
Literature Survey on Web Mining
IOSR Journals
Website review
Website review
Dirz M
Recomendados
625 634
625 634
Editor IJARCET
Cl32543545
Cl32543545
IJERA Editor
Df25632640
Df25632640
IJERA Editor
Similarity based Dynamic Web Data Extraction and Integration System from Sear...
Similarity based Dynamic Web Data Extraction and Integration System from Sear...
IDES Editor
Pxc3893553
Pxc3893553
Ouzza Brahim
PTAR UITM Website review
PTAR UITM Website review
Dirz M
Literature Survey on Web Mining
Literature Survey on Web Mining
IOSR Journals
Website review
Website review
Dirz M
Aa03401490154
Aa03401490154
ijceronline
01635156
01635156
Mechergui Najla
The Social Web : SCAmore
The Social Web : SCAmore
JISC Netskills
Web personalization using clustering of web usage data
Web personalization using clustering of web usage data
ijfcstjournal
Ijartes v1-i3-002
Ijartes v1-i3-002
IJARTES
Identifying the Number of Visitors to improve Website Usability from Educatio...
Identifying the Number of Visitors to improve Website Usability from Educatio...
Editor IJCATR
A Survey of Issues and Techniques of Web Usage Mining
A Survey of Issues and Techniques of Web Usage Mining
IRJET Journal
An image crawler for content based image retrieval
An image crawler for content based image retrieval
eSAT Publishing House
Minning www
Minning www
Sonali Parab
Pxc3872601
Pxc3872601
StephanieLeBadezet
SEMANTIC WEB: INFORMATION RETRIEVAL FROM WORLD WIDE WEB
SEMANTIC WEB: INFORMATION RETRIEVAL FROM WORLD WIDE WEB
IJCI JOURNAL
L017447590
L017447590
IOSR Journals
Ifyoudeletethisafterreading
Ifyoudeletethisafterreading
Ramakanta Behera
Preval de gingivitis
Preval de gingivitis
dayneris
Changing Media Landscape, The Golden Age of Journalism & Us
Changing Media Landscape, The Golden Age of Journalism & Us
Jayadevan P K
Mapping Digital Media India 2013
Mapping Digital Media India 2013
Melih ÖZCANLI
Online Alışveriş Yapanların Gizemi - Çok Kanallı Perakendecilikte 10 Mit
Online Alışveriş Yapanların Gizemi - Çok Kanallı Perakendecilikte 10 Mit
Melih ÖZCANLI
Caldecott illustration analysis
Caldecott illustration analysis
Holly Manns
Cl32543545
Cl32543545
IJERA Editor
Bb31269380
Bb31269380
IJERA Editor
H0314450
H0314450
iosrjournals
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
ijdkp
Mais conteúdo relacionado
Mais procurados
Aa03401490154
Aa03401490154
ijceronline
01635156
01635156
Mechergui Najla
The Social Web : SCAmore
The Social Web : SCAmore
JISC Netskills
Web personalization using clustering of web usage data
Web personalization using clustering of web usage data
ijfcstjournal
Ijartes v1-i3-002
Ijartes v1-i3-002
IJARTES
Identifying the Number of Visitors to improve Website Usability from Educatio...
Identifying the Number of Visitors to improve Website Usability from Educatio...
Editor IJCATR
A Survey of Issues and Techniques of Web Usage Mining
A Survey of Issues and Techniques of Web Usage Mining
IRJET Journal
An image crawler for content based image retrieval
An image crawler for content based image retrieval
eSAT Publishing House
Minning www
Minning www
Sonali Parab
Pxc3872601
Pxc3872601
StephanieLeBadezet
SEMANTIC WEB: INFORMATION RETRIEVAL FROM WORLD WIDE WEB
SEMANTIC WEB: INFORMATION RETRIEVAL FROM WORLD WIDE WEB
IJCI JOURNAL
L017447590
L017447590
IOSR Journals
Mais procurados
(12)
Aa03401490154
Aa03401490154
01635156
01635156
The Social Web : SCAmore
The Social Web : SCAmore
Web personalization using clustering of web usage data
Web personalization using clustering of web usage data
Ijartes v1-i3-002
Ijartes v1-i3-002
Identifying the Number of Visitors to improve Website Usability from Educatio...
Identifying the Number of Visitors to improve Website Usability from Educatio...
A Survey of Issues and Techniques of Web Usage Mining
A Survey of Issues and Techniques of Web Usage Mining
An image crawler for content based image retrieval
An image crawler for content based image retrieval
Minning www
Minning www
Pxc3872601
Pxc3872601
SEMANTIC WEB: INFORMATION RETRIEVAL FROM WORLD WIDE WEB
SEMANTIC WEB: INFORMATION RETRIEVAL FROM WORLD WIDE WEB
L017447590
L017447590
Destaque
Ifyoudeletethisafterreading
Ifyoudeletethisafterreading
Ramakanta Behera
Preval de gingivitis
Preval de gingivitis
dayneris
Changing Media Landscape, The Golden Age of Journalism & Us
Changing Media Landscape, The Golden Age of Journalism & Us
Jayadevan P K
Mapping Digital Media India 2013
Mapping Digital Media India 2013
Melih ÖZCANLI
Online Alışveriş Yapanların Gizemi - Çok Kanallı Perakendecilikte 10 Mit
Online Alışveriş Yapanların Gizemi - Çok Kanallı Perakendecilikte 10 Mit
Melih ÖZCANLI
Caldecott illustration analysis
Caldecott illustration analysis
Holly Manns
Destaque
(6)
Ifyoudeletethisafterreading
Ifyoudeletethisafterreading
Preval de gingivitis
Preval de gingivitis
Changing Media Landscape, The Golden Age of Journalism & Us
Changing Media Landscape, The Golden Age of Journalism & Us
Mapping Digital Media India 2013
Mapping Digital Media India 2013
Online Alışveriş Yapanların Gizemi - Çok Kanallı Perakendecilikte 10 Mit
Online Alışveriş Yapanların Gizemi - Çok Kanallı Perakendecilikte 10 Mit
Caldecott illustration analysis
Caldecott illustration analysis
Semelhante a 105 108
Cl32543545
Cl32543545
IJERA Editor
Bb31269380
Bb31269380
IJERA Editor
H0314450
H0314450
iosrjournals
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
ijdkp
Web Page Recommendation Using Web Mining
Web Page Recommendation Using Web Mining
IJERA Editor
International Journal of Engineering Research and Development
International Journal of Engineering Research and Development
IJERD Editor
407 409
407 409
Editor IJARCET
A Study of Pattern Analysis Techniques of Web Usage
A Study of Pattern Analysis Techniques of Web Usage
ijbuiiir1
A Study on Web Structure Mining
A Study on Web Structure Mining
IRJET Journal
A Study On Web Structure Mining
A Study On Web Structure Mining
Nicole Heredia
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
James Heller
3 05564736
3 05564736
School
Web crawler
Web crawler
crazyprave12490
Web mining
Web mining
Nandini Sahu
WEB MINING.pptx
WEB MINING.pptx
HarshithRaj21
A detail survey of page re ranking various web features and techniques
A detail survey of page re ranking various web features and techniques
ijctet
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
ijsrd.com
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
IOSR Journals
Web Mining
Web Mining
Shobha Rani
HIGWGET-A Model for Crawling Secure Hidden WebPages
HIGWGET-A Model for Crawling Secure Hidden WebPages
ijdkp
Semelhante a 105 108
(20)
Cl32543545
Cl32543545
Bb31269380
Bb31269380
H0314450
H0314450
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
Web Page Recommendation Using Web Mining
Web Page Recommendation Using Web Mining
International Journal of Engineering Research and Development
International Journal of Engineering Research and Development
407 409
407 409
A Study of Pattern Analysis Techniques of Web Usage
A Study of Pattern Analysis Techniques of Web Usage
A Study on Web Structure Mining
A Study on Web Structure Mining
A Study On Web Structure Mining
A Study On Web Structure Mining
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
3 05564736
3 05564736
Web crawler
Web crawler
Web mining
Web mining
WEB MINING.pptx
WEB MINING.pptx
A detail survey of page re ranking various web features and techniques
A detail survey of page re ranking various web features and techniques
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
Web Mining
Web Mining
HIGWGET-A Model for Crawling Secure Hidden WebPages
HIGWGET-A Model for Crawling Secure Hidden WebPages
Mais de Editor IJARCET
Electrically small antennas: The art of miniaturization
Electrically small antennas: The art of miniaturization
Editor IJARCET
Volume 2-issue-6-2205-2207
Volume 2-issue-6-2205-2207
Editor IJARCET
Volume 2-issue-6-2195-2199
Volume 2-issue-6-2195-2199
Editor IJARCET
Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204
Editor IJARCET
Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194
Editor IJARCET
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189
Editor IJARCET
Volume 2-issue-6-2177-2185
Volume 2-issue-6-2177-2185
Editor IJARCET
Volume 2-issue-6-2173-2176
Volume 2-issue-6-2173-2176
Editor IJARCET
Volume 2-issue-6-2165-2172
Volume 2-issue-6-2165-2172
Editor IJARCET
Volume 2-issue-6-2159-2164
Volume 2-issue-6-2159-2164
Editor IJARCET
Volume 2-issue-6-2155-2158
Volume 2-issue-6-2155-2158
Editor IJARCET
Volume 2-issue-6-2148-2154
Volume 2-issue-6-2148-2154
Editor IJARCET
Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147
Editor IJARCET
Volume 2-issue-6-2119-2124
Volume 2-issue-6-2119-2124
Editor IJARCET
Volume 2-issue-6-2139-2142
Volume 2-issue-6-2139-2142
Editor IJARCET
Volume 2-issue-6-2130-2138
Volume 2-issue-6-2130-2138
Editor IJARCET
Volume 2-issue-6-2125-2129
Volume 2-issue-6-2125-2129
Editor IJARCET
Volume 2-issue-6-2114-2118
Volume 2-issue-6-2114-2118
Editor IJARCET
Volume 2-issue-6-2108-2113
Volume 2-issue-6-2108-2113
Editor IJARCET
Volume 2-issue-6-2102-2107
Volume 2-issue-6-2102-2107
Editor IJARCET
Mais de Editor IJARCET
(20)
Electrically small antennas: The art of miniaturization
Electrically small antennas: The art of miniaturization
Volume 2-issue-6-2205-2207
Volume 2-issue-6-2205-2207
Volume 2-issue-6-2195-2199
Volume 2-issue-6-2195-2199
Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204
Volume 2-issue-6-2190-2194
Volume 2-issue-6-2190-2194
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2177-2185
Volume 2-issue-6-2177-2185
Volume 2-issue-6-2173-2176
Volume 2-issue-6-2173-2176
Volume 2-issue-6-2165-2172
Volume 2-issue-6-2165-2172
Volume 2-issue-6-2159-2164
Volume 2-issue-6-2159-2164
Volume 2-issue-6-2155-2158
Volume 2-issue-6-2155-2158
Volume 2-issue-6-2148-2154
Volume 2-issue-6-2148-2154
Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147
Volume 2-issue-6-2119-2124
Volume 2-issue-6-2119-2124
Volume 2-issue-6-2139-2142
Volume 2-issue-6-2139-2142
Volume 2-issue-6-2130-2138
Volume 2-issue-6-2130-2138
Volume 2-issue-6-2125-2129
Volume 2-issue-6-2125-2129
Volume 2-issue-6-2114-2118
Volume 2-issue-6-2114-2118
Volume 2-issue-6-2108-2113
Volume 2-issue-6-2108-2113
Volume 2-issue-6-2102-2107
Volume 2-issue-6-2102-2107
Último
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
Padma Pradeep
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
soniya singh
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
Allon Mureinik
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
Rafal Los
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
Scott Keck-Warren
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
2toLead Limited
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
carlostorres15106
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
HostedbyConfluent
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
Ridwan Fadjar
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
naman860154
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
OnBoard
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
Michael W. Hawkins
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Delhi Call girls
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
Sinan KOZAK
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
Puma Security, LLC
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
Pooja Nehwal
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
Mark Billinghurst
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
HampshireHUG
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
Memoori
Último
(20)
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
105 108
1.
ISSN: 2278 –
1323 International Journal of Advanced Research in Computer Engineering & Technology Volume 1, Issue 4, June 2012 Reconciling the Website Structure to Improve the Web Navigation Efficiency Joy Shalom Sona, Asha Ambhaikar selecting and pre-processing specific information from Abstract—The www grows tremendously. It increases the retrieved Web resources. complexity of web applications and web navigation. Generalization: automatically discovers general patterns at Recommendations play an important role towards this individual Web sites as well as across multiple sites. direction. Our Recommendation is based on user Browsing patterns. Our approach presents a comprehensive overview of Analysis: validation and/or interpretation of the mined web mining methods and techniques used for the evaluation of patterns. reconciling systems to achieve better web navigation efficiency in order to improve the efficiency of web site. It integrates and There are three areas of Web mining according to the usage coordinates among different reasons for making of the Web data used as input in the data mining process, recommendations including frequency of access, and patterns namely, Web Content Mining (WCM), Web Usage Mining of access by visitors to the web site. We are not argue the (WUM) and Web Structure Mining (WSM). structure or content of the web site but we recommended to web site developer. Our proposed techniques are achieved better web navigation efficiency and it is highly effective from existing Web Mining one. Index Terms—Browsing Efficiency, Reconciling Website System, Web Content Mining, Web Structure Mining. Web Content Web Structure Web Usage Mining Mining Mining I. INTRODUCTION The most of the people browsing the internet for retrieving information. But most of the time, they gets lots of DocumentS Hyperlinks insignificant and irrelevant document even after navigating tructure several links. Factors for web designers when considering the design of a new website include the attractiveness of the design, an Multimedia Structured effective structure to the web page to deliver information Data Record quickly, and user satisfaction among a growing and diverse set of users faced with ever increasing web contents. However, with the development of more and more web-based Web Server Application Application technologies and the growth in web content, the structure of a Log Server Log Level Log website becomes more complex and web navigation becomes a critical issue to both web designers and users. Fig.1 Web Mining Classification A. Web Mining Overview Web content usage mining, Web structure mining, and Web Web mining is an application of the data mining techniques content mining. Web usage mining refers to the discovery of to automatically discover and extract knowledge from the Web. According to Kosala et al [2], Web mining consists of user access patterns from Web usage logs. Web structure the following tasks: mining tries to discover useful knowledge from the structure of hyperlinks which helps to investigate the node and Resource finding: the task of retrieving intended Web connection structure of web sites. According the type of web documents. structural data, web structure mining can be divided into two Information selection and pre-processing: automatically kinds 1) extracting the documents from hyperlinks in the web 2) analysis of the tree-like structure of page structure. Based on the topology of the hyperlinks, web structure mining will Manuscript received May, 2012. categorize the web page and generate the information, such Joy Shalom Sona, Department of Computer Science and Engineering, Chhattisgarh Swami Vivekanand Technical University, RCET,Bhilai, India, as the similarity and mining is concerned with the retrieval of 9926152273(e-mail: sjoyshalom@gmail.com). information from WWW into more structured form and Asha Ambhaikar, Department of Computer Science and Engineering, indexing the information to retrieve it quickly. Web usage Chhattisgarh Swami Vivekanand Technical University, RCET,Bhilai, India,9229655211 (e-mail:asha31.a@rediffmail.com). mining is the process of identifying the browsing patterns by analyzing the user’s navigational behavior. Web structure 105 All Rights Reserved © 2012 IJARCET
2.
ISSN: 2278 –
1323 International Journal of Advanced Research in Computer Engineering & Technology Volume 1, Issue 4, June 2012 mining is to discover the model underlying the link structures from server logs in the extended NSCA format. Chen et al. of the Web pages, catalog them and generate information (1996) introduce the concept of maximal forward reference suchas the similarity and relationship between them, taking to characterize user episodes for the mining of traversal advantage of their hyperlink topology. Web classification is patterns. A maximal forward reference is the sequence of shown in Fig 1. pages requested by a user up to the last page before backtracking occurs during a particular server session. The B. Web Content Mining(WCM) SpeedTracer project [Wu et al., 1998] from IBM Watson is Web Content Mining is the process of extracting useful built upon work originally reported in Chen et al. (1996). In information from the contents of web documents. The web addition to episode identification, SpeedTracer makes use of documents may consists of text, images, audio, video or referrer and agent information in the preprocessing routines structured records like tables and lists. Mining can be applied to identify users and server sessions in the absence of on the web documents as well the results pages produced additional client side information. The Web Utilization from a search engine. There are two types of approach in Miner (WUM) system [Spiliopoulou and Faulstich, 1998] content mining called agent based approach and database provides a robust mining language in order to specify based approach. The agent based approach concentrate on characteristics of discovered frequent paths that are searching relevant information using the characteristics of a interesting to the analyst. Zaiane et al. (1998) have loaded particular domain to interpret and organize the collected Web server logs into a data cube structure in order to perform information. The database approach is used for retrieving the data mining as well as On-Line Analytical Processing semi-structure data from the web. (OLAP) activities such as roll-up and drill-down of the data. Their Weblog Miner system has been used to discover C. Web Usage Mining(WUM) association rules, perform classification and time-series Web Usage Mining is the process of extracting useful analysis. Shahabi et al. (1997) and Zarkesh et al. (1997) have information from the secondary data derived from the one of the few Web Usage mining systems that rely on client interactions of the user while surfing on the Web. It extracts side data collection. The client side agent sends back page data stored in server access logs, referrer logs, agent logs, request and time information to the server every time a page client-side cookies, user profile and Meta data. containing the Java applet is loaded or destroyed [5]. D. Web Structure Mining(WSM) B. Adaptive Website The goal of the Web Structure Mining is to generate the Users interact with a website in multiple ways, while their structural summary about the Web site and Web page. It tries mental model about a particular subject can obviously differ to discover the link structure of the hyperlinks at the from those of other users and the web developer. inter-document level. Based on the topology of the Consequently, improving the interaction between users and hyperlinks, Web Structure mining will categorize the Web websites is of importance. Raskin [6] introduces various pages and generate the information like similarity and ways of quantification in measuring interface design in his relationship between different Web sites. This type of mining book. Especially, he mentions information-theoretic can be performed at the document level (intra-page) or at the efficiency, which is defined similarly to the way efficiency is hyperlink level (inter-page). It is important to understand the defined in thermodynamics; in thermodynamics we calculate Web data structure for Information Retrieval efficiency by dividing the power coming out of a process by the power going into the process. If, during a certain time II. RELATED WORK interval, an electrical generator is producing 820 watts while A. WebMining it is driven by an engine that has an output of 1000 W, it has an efficiency 820/1000, or 0.82. Efficiency is also often expressed as a percentage; in this case, the generator has an Web mining has emerged as a specialized field during the efficiency of 82%. This calculation can be applied to last few years and refers to the application of knowledge calculate the information efficiency. Srikant and Yang [7] discovery techniques specifically to web data. Web content propose an algorithm to automatically find pages in a website and web structure mining, respectively, refer to the analysis whose location is different from where visitors expect to find of the content of web pages and the structure of links between them. The key insight is that visitors will backtrack if they do them. Web usage mining, on the other hand, is the process of not find the information where they expect it: the point from applying data mining techniques to the discovery of patterns where they backtrack is the expected location for the page. in web data [5]. Web usage mining involves four steps: user They also use a time threshold to distinguish whether a page identification, data pre-processing, pattern discovery and is target page or not. Nakayama et al. (2000) proposes a analysis. User access patterns are models of user browsing technique that discovers the gap between website designers’ activity. In most cases these are deduced from web server expectations and users’ behavior. The former are assessed by access logs. An alternative method includes client-side measuring the inter-page conceptual relevance and the latter logging, using techniques such as cookies. This is referred to by measuring the inter-page access co-occurrence. They also as web-log mining [4]. Mining activities help us to know the suggest how to apply quantitative data obtained through a data patterns. User patterns, extracted from Web data, have multiple regression analysis that predicts hyperlink traversal been applied to a wide range of applications. Projects by frequency from page layout features. Most adaptive systems Spiliopoulou and Faulstich (1998), Wu et al. (1998), Zaiane include a procedure on mining web log to understand user et al. (1998), Shahabi et al. (1998) have focused on Web behaviors and patterns and to improve their website Usage Mining ingeneral, without extensive tailoring of the automatically and efficiently. However, none of them try to process towards one of the various sub-categories. The calculate the efficiency to improve the web structure. We WebSIFT project is designed to perform Web Usage Mining 106 All Rights Reserved © 2012 IJARCET
3.
ISSN: 2278 –
1323 International Journal of Advanced Research in Computer Engineering & Technology Volume 1, Issue 4, June 2012 want to apply the efficiency concept from [6] and develop the 𝐸𝑓𝑓𝑖𝑐𝑖𝑒𝑛𝑐𝑦 = (𝑠𝑜𝑟𝑡𝑒𝑠𝑡 𝑝𝑎𝑡 𝑓𝑟𝑜𝑚 𝑠𝑡𝑎𝑟𝑡 𝑝𝑎𝑔𝑒 𝑡𝑜 𝑡𝑎𝑟𝑔𝑒𝑡 𝑝𝑎𝑔𝑒)/𝑜𝑝𝑒𝑟𝑎𝑡𝑖𝑛𝑔 𝑐𝑜𝑠𝑡 efficiency calculation function. (1) User operating behavior and shortcomings of the website III. METHODOLOGY determine by the help of user browsing behavior patterns. For calculating the efficiency of a website, there is need’s Our proposed techniques includes following steps: to determined the user’s operating route that is start page and 1. Mining the Web Architecture target page.The term operating cost refers to the number of 2. Determining user log pages visited between the begin page and the target (end) 3. Obtaining Website browsing efficiency page A. Mining the Web Architecture A website consists of Web pages, which connect to each IV. EXPERIMENTAL RESULTS other through hyperlinks. The website can be modeled as a A reconcile website system uses quantitative measures of the graph, G=(V, E). Vertices V={v1, v2,…vn }, where informational , navigational, and graphical aspects of a web vi(i=1,2,…n) denotes a page. Edges or arcs E={eij | the site to generate a suggestion reports for designers to improve hyperlink from the source page i to the destination page j}. P their sites. Our approach is according to time schedules. The is the set of ordered pares (i,j) such that there is a path from i designer sets up the time schedule for the program execution, to j, where each node is visited once. R is the set of ordered program automatically run the web structure and web usage pares (i,j) such that there is a route user navigate from i to j. mining programs and generates a report according to the user behavior. This program mainly gives suggestions, to web users, on 1 how to access pages more efficiency, and to more easily acquire the information they want. When a designer browses the website, the reports for 2 3 4 adjusting the website architecture from the usage mining result, and the website architecture is generated instantly. The changes are based on user browsing behavior in order to find 5 6 7 8 the most efficiently accessible web architecture. As the result, users become more comfortable and can efficiently surf pages in the website. User browsing behavior is 9 10 11 continually recorded onto the database for the next website adjustment. Fig 2: Graph Form of a Website Practical effort on the website we get the approximately desired outcomes. In this experimental effort we are able to Reconciling the web site structure according to the user The web structure mining program proposes to grab the website structure and save it. When a designer types the IP navigation pattern analysis through the user log records. This address of a website to the program, the program will naturally increase the web site efficiency with the automatically starts to mine the entire website architecture. outcomes Firstly, the program downloads the page and analyzes its HTML code. Next, the system seeks out the hyperlinks in the V. CONCLUSION page and repeats the actions until the page does not belong to In this paper we proposed a Reconciling Website System the domain the designer inputs. Finally, the program obtains which improves the web navigation efficiency and suggests the entire website architecture and saves it. the reorganization of the web site. Reconciling Websites can B. Determining User Log make hit pages more accessible, highlight interesting links, connected related pages. Adaptive web sites can advice to a This involves following four tasks: user identification, data Website’s developer summarizing access information and pre-processing, and pattern discovery and pattern analysis. making suggestions for that particular website. These User access patterns are models of users’ browsing activity. suggestion based on the user navigation behavior which In most cases these are deduced from web server access logs. increase the efficiency of website by reorganizing Web A web server access log is a complete review of access of a Structure. This gives the beneficial information to the web server from a client. User browsing records can be collected developer for providing easier navigation to the Website from three different sources: the web server log file, proxy server log file, and browser cookies. A web server log file REFERENCES records all user access activities on that server. As an example we note here that a log consists of the [1] Ji-Hyun Lee, Wei-Kun Shiu: An adaptive website system to improve efficiency with web mining techniques. Advanced following elements: client’s IP address, user id, access time, EngineeringInformatics 18(3): 129-142 (2004) request method (get or post), URL, protocol error code, [2] R. Kosala, H. Blockeel, “Web Mining Research: A Survey”, SIGKDD number of bytes transmitted. Explorations, Newsletter of the ACM Special Interest Group on Knowledge Discovery and Data Mining Vol. 2, No. 1 pp 1-15, 2000. C. Obtaining Website Browsing Efficiency [3] Perkowitz M, Etzioni O. Adaptive web site: an AI challenge. IJCAI- 97 1997. The browsing efficiency of a website can be calculated by Eq.(1). 107 All Rights Reserved © 2012 IJARCET
4.
ISSN: 2278 –
1323 International Journal of Advanced Research in Computer Engineering & Technology Volume 1, Issue 4, June 2012 [4] Koutri M, Daskalaki S, Avouris N. Adaptive interaction with web site: an overview of methods and techniques. Computer Science and Information Technologies, CSIT 2002. [5] Srivastava J, Cooley R, Deshpande M, Tan P-N. Web usage mining: discovery and applications of usage patterns from web data. ACM SIGKDD 2000. [6] Raskin J. The human interface, first ed. Menlo, CA: Stratford Publishing, Inc.; 2000. [7] Srikant R, Yang Y. Mining web logs to improve website organization. ACM 2001. [8] Spiliopoulou M, Faulstich L. Wum: A web utilization miner. EDBT Workshop WebDB 98. Spain: Valencia; 1998. [9] Wu K-L, Yu P-S, Ballman A. Speed-tracer: A web usage mining and analysis tool. IBM Systems Journal 1998;37(1). [10] Zaiane O, Xin M, Han J. Discovering web access patterns and trends by applying olap and data mining technology on web logs. In: Advances in Digital Libraries. CA: Santa Barbara; 1998. p.19–29. [11] Shahabi C, Zarkesh A, Adibi J, Shah V. Knowledge discovery from users web-page navigation Workshop on Research Issues in Data Engineering. England: Birmingham; 1997. [12] Chen M-S, Park J-S, Yu P-S. Data mining for path traversal patterns in a web environment. 16th International Conference onDistributed Computing Systems 1996;385–92. [13] Zarkesh A, Adibi J, Shahabi C, Sadri R, Shah V. Analysis and design of server informative www-sites 6th International Conference on Information and Knowledge Management. Nevada: Las Vegas; 1997. [14] Nakayama T, Kato H, Yamane Y. Discovering the gap between web site designers’ expectations and users’ behavior. Computer Networks 2000;33(1-6):811–22. [15] Wang, May and Yen, Benjamin, "Web Structure Reorganization to Improve Web Navigation Efficiency" (2007). PACIS 2007 Proceedings. Paper 46. [16] M. Kilfoil , A. Ghorbani , W. Xing , Z. Lei , J. Lu , J. Zhang , X. Xu,” Toward An Adaptive Web: The State of the Art and Science (2003),CNSR 2003. Joy Shalom Sonahas completed his B.E. degree in computer science in 2008. He is pursuing MTech in computer Technology from Chhattisagrah Swami Vivekanand Technical University, Bhilai, CG, India. His research interest includes Data Mining, Web Mining & Cloud Computing. Asha Ambhaikar was born in July 1965 in Nagpur district, MH, India. She is graduated from Nagpur University, Nagpur, India, in Electronics Engineering in the year 2000 and later did her post graduation in Information Technology from Allahabad Deemed University, Allahabad India. She has submitted her PhD on the topic “Design and Development of Manet Routing Protocol for Improving Scalability”. Currently she is working as an Associate Professor in RCET Bhilai, and she has published more than 17 research papers in reputed national and international journal’s and conferences. Her research interests includes, networking, data warehousing and mining, Distributed system, signal processing, image processing, and information systems and security. 108 All Rights Reserved © 2012 IJARCET
Baixar agora