SlideShare uma empresa Scribd logo
1 de 3
1.
#13/ 19, 1st Floor, Municipal Colony, Kangayanellore Road, Gandhi Nagar, vellore – 6.
Off: 0416-2247353 / 6066663 Mo: +91 9500218218 /8870603602,
Project Titles: http://shakastech.weebly.com/2015-2016-titles
Website: www.shakastech.com, Email - id: shakastech@gmail.com, info@shakastech.com
PROGRAM TRANSFORMATIONS FOR ASYNCHRONOUS AND BATCHED QUERY
SUBMISSION
ABSTRACT:
Text datasets can be represented using models that do not preserve text structure, or using
models that preserve text structure. Our hypothesis is that depending on the dataset nature, there
can be advantages using a model that preserves text structure over one that does not, and
viceversa. The key is to determine the best way of representing a particular dataset, based on the
dataset itself. In this work, we propose to investigate this problem by combining text distortion
and algorithmic clustering based on string compression. Specifically, a distortion technique
previously developed by the authors is applied to destroy text structure progressively. Following
this, a clustering algorithm based on string compression is used to analyze the effects of the
distortion on the information contained in the texts. Several experiments are carried out on text
datasets and artificiallygenerated datasets. The results show that in strongly structural datasets
the clustering results worsen as text structure is progressively destroyed. Besides, they show that
using a compressor which enables the choice of the size of the left-context symbols helps to
determine the nature of the datasets. Finally, the results are contrasted with a method based on
multidimensional projections and analogous conclusions are obtained.
1.
#13/ 19, 1st Floor, Municipal Colony, Kangayanellore Road, Gandhi Nagar, vellore – 6.
Off: 0416-2247353 / 6066663 Mo: +91 9500218218 /8870603602,
Project Titles: http://shakastech.weebly.com/2015-2016-titles
Website: www.shakastech.com, Email - id: shakastech@gmail.com, info@shakastech.com
Existing System:
A natural way of taking into account relationships between words (text structure) is
applying compression distances. Such distances give a measure of similarity between two objects
using data compression. This means that they can give a measure of the similarity between two
texts from texts themselves. In other words, texts do not need to be represented using any model,
but they can be used directly. This makes text structure be considered because it is simply
unvaried. This distortion technique removes non-relevant information while preserving both
relevant information and text structure. The way in which this is done is by removing the most
frequent words in the English language from the documents, replacing each of their characters
with an asterisk. This simple idea allows maintenance of text structure, while filtering the
information contained in texts because, thanks to the asterisks, the lengths and the places of
appearance of the removed words are maintained despite the distortion.
PROPOSED SYSTEM:
We apply our distortion technique with a different purpose. In this case, we use our
technique as the tool that allows the discovery of the structural characteristics of datasets, that is,
the discovery of their nature. The analysis carried out to discover dataset nature can be divided
into four parts. First, we study how different compression algorithms capture structure. Second,
we carry out an analysis that studies how changing the size of the context affects the clustering
results. Third, we analyze the dependence of the PPMD orders on the measured NCD using
artificial data generated from probabilistic context-free grammars. Finally, we validate our
approach by comparing it with a method based on visualizing high-dimensional data through
mapping techniques. All the phases of this analysis are focused on evaluating if our approach can
be used to gain an insight into the structural characteristics of datasets.
1.
#13/ 19, 1st Floor, Municipal Colony, Kangayanellore Road, Gandhi Nagar, vellore – 6.
Off: 0416-2247353 / 6066663 Mo: +91 9500218218 /8870603602,
Project Titles: http://shakastech.weebly.com/2015-2016-titles
Website: www.shakastech.com, Email - id: shakastech@gmail.com, info@shakastech.com
HARDWARE REQUIREMENTS:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Floppy Drive : 1.44 Mb.
• Monitor : 15 VGA Colour.
• Mouse : Logitech.
• RAM : 256 Mb.
SOFTWARE REQUIREMENTS:
• Operating system : - Windows XP.
• Front End : - JSP
• Back End : - SQL Server

Mais conteúdo relacionado

Destaque

Cv an nguyen thi
Cv  an nguyen thiCv  an nguyen thi
Cv an nguyen thi
An Nguyen
 

Destaque (11)

A distributed three hop routing protocol to
A distributed three hop routing protocol toA distributed three hop routing protocol to
A distributed three hop routing protocol to
 
Authenticated key exchange protocols for parallel network file system
Authenticated key exchange protocols for parallel network file systemAuthenticated key exchange protocols for parallel network file system
Authenticated key exchange protocols for parallel network file system
 
Gestión del Conocimiento
Gestión del ConocimientoGestión del Conocimiento
Gestión del Conocimiento
 
Authenticated key exchange protocols for parallel
Authenticated key exchange protocols for parallelAuthenticated key exchange protocols for parallel
Authenticated key exchange protocols for parallel
 
Authenticated key exchange protocols for parallel network file systems
Authenticated key exchange protocols for parallel network file systemsAuthenticated key exchange protocols for parallel network file systems
Authenticated key exchange protocols for parallel network file systems
 
Rasgosdelfederalismo equipo2
Rasgosdelfederalismo equipo2Rasgosdelfederalismo equipo2
Rasgosdelfederalismo equipo2
 
etapa de la informacion
 etapa de la informacion etapa de la informacion
etapa de la informacion
 
Federalismo Fiscal
Federalismo FiscalFederalismo Fiscal
Federalismo Fiscal
 
Apoyo a la escuela club rotario san nicolas
Apoyo a la escuela club rotario san nicolasApoyo a la escuela club rotario san nicolas
Apoyo a la escuela club rotario san nicolas
 
Evas persentacion 2
Evas persentacion 2Evas persentacion 2
Evas persentacion 2
 
Cv an nguyen thi
Cv  an nguyen thiCv  an nguyen thi
Cv an nguyen thi
 

Mais de Shakas Technologies

Mais de Shakas Technologies (20)

A Review on Deep-Learning-Based Cyberbullying Detection
A Review on Deep-Learning-Based Cyberbullying DetectionA Review on Deep-Learning-Based Cyberbullying Detection
A Review on Deep-Learning-Based Cyberbullying Detection
 
A Personal Privacy Data Protection Scheme for Encryption and Revocation of Hi...
A Personal Privacy Data Protection Scheme for Encryption and Revocation of Hi...A Personal Privacy Data Protection Scheme for Encryption and Revocation of Hi...
A Personal Privacy Data Protection Scheme for Encryption and Revocation of Hi...
 
A Novel Framework for Credit Card.
A Novel Framework for Credit Card.A Novel Framework for Credit Card.
A Novel Framework for Credit Card.
 
A Comparative Analysis of Sampling Techniques for Click-Through Rate Predicti...
A Comparative Analysis of Sampling Techniques for Click-Through Rate Predicti...A Comparative Analysis of Sampling Techniques for Click-Through Rate Predicti...
A Comparative Analysis of Sampling Techniques for Click-Through Rate Predicti...
 
NS2 Final Year Project Titles 2023- 2024
NS2 Final Year Project Titles 2023- 2024NS2 Final Year Project Titles 2023- 2024
NS2 Final Year Project Titles 2023- 2024
 
MATLAB Final Year IEEE Project Titles 2023-2024
MATLAB Final Year IEEE Project Titles 2023-2024MATLAB Final Year IEEE Project Titles 2023-2024
MATLAB Final Year IEEE Project Titles 2023-2024
 
Latest Python IEEE Project Titles 2023-2024
Latest Python IEEE Project Titles 2023-2024Latest Python IEEE Project Titles 2023-2024
Latest Python IEEE Project Titles 2023-2024
 
EMOTION RECOGNITION BY TEXTUAL TWEETS CLASSIFICATION USING VOTING CLASSIFIER ...
EMOTION RECOGNITION BY TEXTUAL TWEETS CLASSIFICATION USING VOTING CLASSIFIER ...EMOTION RECOGNITION BY TEXTUAL TWEETS CLASSIFICATION USING VOTING CLASSIFIER ...
EMOTION RECOGNITION BY TEXTUAL TWEETS CLASSIFICATION USING VOTING CLASSIFIER ...
 
CYBER THREAT INTELLIGENCE MINING FOR PROACTIVE CYBERSECURITY DEFENSE
CYBER THREAT INTELLIGENCE MINING FOR PROACTIVE CYBERSECURITY DEFENSECYBER THREAT INTELLIGENCE MINING FOR PROACTIVE CYBERSECURITY DEFENSE
CYBER THREAT INTELLIGENCE MINING FOR PROACTIVE CYBERSECURITY DEFENSE
 
Detecting Mental Disorders in social Media through Emotional patterns-The cas...
Detecting Mental Disorders in social Media through Emotional patterns-The cas...Detecting Mental Disorders in social Media through Emotional patterns-The cas...
Detecting Mental Disorders in social Media through Emotional patterns-The cas...
 
COMMERCE FAKE PRODUCT REVIEWS MONITORING AND DETECTION
COMMERCE FAKE PRODUCT REVIEWS MONITORING AND DETECTIONCOMMERCE FAKE PRODUCT REVIEWS MONITORING AND DETECTION
COMMERCE FAKE PRODUCT REVIEWS MONITORING AND DETECTION
 
CO2 EMISSION RATING BY VEHICLES USING DATA SCIENCE
CO2 EMISSION RATING BY VEHICLES USING DATA SCIENCECO2 EMISSION RATING BY VEHICLES USING DATA SCIENCE
CO2 EMISSION RATING BY VEHICLES USING DATA SCIENCE
 
Toward Effective Evaluation of Cyber Defense Threat Based Adversary Emulation...
Toward Effective Evaluation of Cyber Defense Threat Based Adversary Emulation...Toward Effective Evaluation of Cyber Defense Threat Based Adversary Emulation...
Toward Effective Evaluation of Cyber Defense Threat Based Adversary Emulation...
 
Optimizing Numerical Weather Prediction Model Performance Using Machine Learn...
Optimizing Numerical Weather Prediction Model Performance Using Machine Learn...Optimizing Numerical Weather Prediction Model Performance Using Machine Learn...
Optimizing Numerical Weather Prediction Model Performance Using Machine Learn...
 
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
 
Multi-Class Stress Detection Through Heart Rate Variability A Deep Neural Net...
Multi-Class Stress Detection Through Heart Rate Variability A Deep Neural Net...Multi-Class Stress Detection Through Heart Rate Variability A Deep Neural Net...
Multi-Class Stress Detection Through Heart Rate Variability A Deep Neural Net...
 
Identifying Hot Topic Trends in Streaming Text Data Using News Sequential Evo...
Identifying Hot Topic Trends in Streaming Text Data Using News Sequential Evo...Identifying Hot Topic Trends in Streaming Text Data Using News Sequential Evo...
Identifying Hot Topic Trends in Streaming Text Data Using News Sequential Evo...
 
Fighting Money Laundering With Statistics and Machine Learning.docx
Fighting Money Laundering With Statistics and Machine Learning.docxFighting Money Laundering With Statistics and Machine Learning.docx
Fighting Money Laundering With Statistics and Machine Learning.docx
 
Explainable Artificial Intelligence for Patient Safety A Review of Applicatio...
Explainable Artificial Intelligence for Patient Safety A Review of Applicatio...Explainable Artificial Intelligence for Patient Safety A Review of Applicatio...
Explainable Artificial Intelligence for Patient Safety A Review of Applicatio...
 
Ensemble Deep Learning-Based Prediction of Fraudulent Cryptocurrency Transact...
Ensemble Deep Learning-Based Prediction of Fraudulent Cryptocurrency Transact...Ensemble Deep Learning-Based Prediction of Fraudulent Cryptocurrency Transact...
Ensemble Deep Learning-Based Prediction of Fraudulent Cryptocurrency Transact...
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Último (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Program transformations for asynchronous and batched query

  • 1. 1. #13/ 19, 1st Floor, Municipal Colony, Kangayanellore Road, Gandhi Nagar, vellore – 6. Off: 0416-2247353 / 6066663 Mo: +91 9500218218 /8870603602, Project Titles: http://shakastech.weebly.com/2015-2016-titles Website: www.shakastech.com, Email - id: shakastech@gmail.com, info@shakastech.com PROGRAM TRANSFORMATIONS FOR ASYNCHRONOUS AND BATCHED QUERY SUBMISSION ABSTRACT: Text datasets can be represented using models that do not preserve text structure, or using models that preserve text structure. Our hypothesis is that depending on the dataset nature, there can be advantages using a model that preserves text structure over one that does not, and viceversa. The key is to determine the best way of representing a particular dataset, based on the dataset itself. In this work, we propose to investigate this problem by combining text distortion and algorithmic clustering based on string compression. Specifically, a distortion technique previously developed by the authors is applied to destroy text structure progressively. Following this, a clustering algorithm based on string compression is used to analyze the effects of the distortion on the information contained in the texts. Several experiments are carried out on text datasets and artificiallygenerated datasets. The results show that in strongly structural datasets the clustering results worsen as text structure is progressively destroyed. Besides, they show that using a compressor which enables the choice of the size of the left-context symbols helps to determine the nature of the datasets. Finally, the results are contrasted with a method based on multidimensional projections and analogous conclusions are obtained.
  • 2. 1. #13/ 19, 1st Floor, Municipal Colony, Kangayanellore Road, Gandhi Nagar, vellore – 6. Off: 0416-2247353 / 6066663 Mo: +91 9500218218 /8870603602, Project Titles: http://shakastech.weebly.com/2015-2016-titles Website: www.shakastech.com, Email - id: shakastech@gmail.com, info@shakastech.com Existing System: A natural way of taking into account relationships between words (text structure) is applying compression distances. Such distances give a measure of similarity between two objects using data compression. This means that they can give a measure of the similarity between two texts from texts themselves. In other words, texts do not need to be represented using any model, but they can be used directly. This makes text structure be considered because it is simply unvaried. This distortion technique removes non-relevant information while preserving both relevant information and text structure. The way in which this is done is by removing the most frequent words in the English language from the documents, replacing each of their characters with an asterisk. This simple idea allows maintenance of text structure, while filtering the information contained in texts because, thanks to the asterisks, the lengths and the places of appearance of the removed words are maintained despite the distortion. PROPOSED SYSTEM: We apply our distortion technique with a different purpose. In this case, we use our technique as the tool that allows the discovery of the structural characteristics of datasets, that is, the discovery of their nature. The analysis carried out to discover dataset nature can be divided into four parts. First, we study how different compression algorithms capture structure. Second, we carry out an analysis that studies how changing the size of the context affects the clustering results. Third, we analyze the dependence of the PPMD orders on the measured NCD using artificial data generated from probabilistic context-free grammars. Finally, we validate our approach by comparing it with a method based on visualizing high-dimensional data through mapping techniques. All the phases of this analysis are focused on evaluating if our approach can be used to gain an insight into the structural characteristics of datasets.
  • 3. 1. #13/ 19, 1st Floor, Municipal Colony, Kangayanellore Road, Gandhi Nagar, vellore – 6. Off: 0416-2247353 / 6066663 Mo: +91 9500218218 /8870603602, Project Titles: http://shakastech.weebly.com/2015-2016-titles Website: www.shakastech.com, Email - id: shakastech@gmail.com, info@shakastech.com HARDWARE REQUIREMENTS: • System : Pentium IV 2.4 GHz. • Hard Disk : 40 GB. • Floppy Drive : 1.44 Mb. • Monitor : 15 VGA Colour. • Mouse : Logitech. • RAM : 256 Mb. SOFTWARE REQUIREMENTS: • Operating system : - Windows XP. • Front End : - JSP • Back End : - SQL Server