SlideShare uma empresa Scribd logo
1 de 3
Baixar para ler offline
Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34
www.ijera.com 32 | P a g e
Privacy Preservation of Data in Data Mining
Prachi Kohale*, Sheetal Girase**
*(Department of Inforamation Technology, MIT, PUNE PUNE University)
** (Department of Information Technology,MIT PUNE, PUNE University)
ABSTRACT
For data privacy against an un-trusted party, Anonymization is a widely used technique capable of preserving
attribute values and supporting data mining algorithms. The technique deals with Anonymization methods for
users in a domain-driven data mining outsourcing.
Several issues emerge when anonymization is applied in a real world outsourcing scenario. The majority of
methods have focused on the traditional data mining approach; therefore they do not implement domain
knowledge nor optimize data for domain-driven usage. So, existing techniques are mostly non-interactive in
nature, providing little control to users.
To successfully obtain optimal data privacy and actionable patterns in a real world setting, these concerns need
to be addressed. The technique is based on anonymization framework for users in a domain-driven data mining
outsourcing scenario. The framework involves several components designed to anonymize data while preserving
meaningful or actionable patterns that can be discovered after mining.
In the medical field, for enhancing the quality and importance of healthcare, data mining plays an important
role. For example classification analysis on patients data to determine their probability of having heart disease,
so, appropriate actions can be taken, thus enhancing treatment, saving time and cost. It is helpful to anonymize
data while preserving meaningful of actionable patterns that can be found after mining.
Keywords: DDDM, GT, BT
I. INTRODUCTION
In the real world data mining is constraint based
as opposed to traditional data mining which is data
driven trial and error process. In traditional data
mining knowledge discovery usually performed
without domain intelligence, thus affecting model or
actionability for real business needs [1]. So there
exists gap between objective and business goals as
well as outputs and business expectations. Domain
driven data mining aims to bridge the gap between
these by involving domain experts and knowledge to
obtain actionable patterns or models applicable to
real world requirements.[7] Data anonymization in
domain driven data mining uses methods like k
anonymity and recent dynamic anonymization
methods.
II. ANONYMIZATION
The fig.1 shows anonymization in domain driven
data mining outsourcing. It involves several
components designed to anonymize data while
preserving meaningful and actionable patterns that
can be discovered after mining [8]. In contrast to
traditional data mining this framework integrates the
knowledge to retain values after anonymization.
Domain driven data mining DDDM is to make
system deliver business friendly and decision making
rules and actions that are of solid technical
significance as well [2]. Its having effective
involvement of domain intelligence, network
intelligence, Data intelligence, human intelligence,
social intelligence[9].
fig 1 anonymization in domain driven data mining
outsourcing.
III. ANGEL TECHNIQUE
Generalization is well known method for privacy
preservation of data. It has several drawbacks such as
heavy information loss, difficulty of supporting
marginal publication. [3]To overcome this Angel, a
new anonymization technique that is effective as
generalization in privacy protection but able to retain
significantly more information in microdata. It is
applicable to any principle l-diversity, t-closeness and
to overcome the drawback of several methods.
Compared to traditional generalization it ensures
same privacy guarantee preserve significantly more
information in microdata.
Table 1 ANGEL publication 1
Anonymi
zation
(DDDM)
Anonymi
zation
(Data
Protectio
n)
Data
Privacy
(Patient
data)
Data
Mining
(outsour
cing)
RESEARCH ARTICLE OPEN ACCESS
Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34
www.ijera.com 33 | P a g e
Suppose that we want to publish the microdata of
Table 1, conforming to 2-diversity. ANGEL first
divides the table into batches:
Batch 1: {Alan, Carrie},
Batch 2: {Bob, Daisy},
Batch 3: {Eddy, Gloria},
Batch 4: {Frank, Helena}.
Observe that each batch obeys 2-diversity: it
contains one pneumonia- and one bronchitis-tuple.
ANGEL creates a batch table (BT), as in Table 3.1a,
summarizing the Disease-statistics of each batch. For
example, the first row of Table 3.1a states that
exactly one tuple in Batch 1 carries pneumonia.
Then, ANGEL creates another partitioning into
buckets (which do not have to be 2-diverse).
Bucket 1: {Alan, Bob},
Bucket 2: {Carrie, Daisy},
Bucket 3: {Eddy, Frank},
Bucket 4: {Gloria, Helena}
Table 2 Microdata
Name Age Sex Zip Disease
Allan 21 M 10k Pneumonia
Bob 23 M 58k Flu
Carrie 58 F 12k Bronchitis
Daisy 60 F 60k Pneumonia
Eddy 70 M 78k Flu
Frank 72 M 80k Bronchitis
Table 3 2-diverse generalization
Age Sex Zipcod
e
Disease
[21,
23]
M [10k,
58k]
Pneumon
ia
[21,
23]
M [10k,
58k]
Flu
[58,
60]
F [12k,
60k]
Bronchiti
s
[58,
60]
F [12k,
60k]
Pneumon
ia
[70,
72]
M [78k,
80k]
Flu
[70,
72]
M [78k,
80k]
Bronchiti
s
Zipcode Disease
[10k,
12k]
Pneumon
ia
[10k, Bronchiti
12k] s
[58k,
60k]
Flu
[58k,
60k]
Pneumon
ia
[78k,
80k]
Flu
[78k,
80k]
Bronchiti
s
Finally, ANGEL generalizes the tuples of each
bucket into the same form, producing a generalized
table (GT). Table 3 demonstrates the GT. Note that
GT does not include the Disease attribute, but stores,
for each tuple of the microdata, the ID of the batch
containing it. For instance, the first tuple of Table has
a Batch-ID 1, because its owner Alan belongs to
Batch 1. Tables 3.a and 3.b are the final relations
released by ANGEL[10].
IV. CONCLUSION
Angelization is new anonymization technic for
privacy preserving publication which is applicable to
any monotonic anonymization principle. It produces
anonymized relation that achieve privacy guarantee
as conventional generalization but permits much
more accurate reconstruction of original data
distribution. It offers simple and rigorous solution
which was difficult issue.
V. ACKNOWLEDGEMENTS
I would like to take opportunity to express my
humble gratitude to my guide Prof. Shital P. Girase,
Asst. Prof. Maharashtra Academy of Engineering &
Educational Research’s, Maharashtra Institute of
Technology, Pune, who gave me assistance with their
experienced knowledge and whatever required at all
stages of my project.
References
[1] Longbing Cao “DDDM challenges and
prospectus “ june 2010,755-769
[2] Longbing Cao, Philip S. Yu, Chengqi
Zhang, Yanchang Zhao “Domain Driven
Data Mining”
[3] G. Aggarwal, T. Feder, K. Kenthapadi, S.
Khuller, R. Panigrahy, D. Thomas, and A.
Zhu. “Achieving anonymity via clustering”.
In PODS, 2006.
[4] Josep Domingo-Ferrer and Vicenc¸ Torra “A
Critique of k-Anonymity and Some of Its
Enhancements”. 3 International
Conference on Availability, Reliability and
Security
[5] V.Narmada “An enhanced security
algorithm for distributed databases in
privacy preserving data ases” (IJAEST)
INTERNATIONAL JOURNAL OF
Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34
www.ijera.com 34 | P a g e
ADVANCED ENGINEERING SCIENCES
AND TECHNOLOGIES”.
[6] R. Agrawal and R. Srikant. “Privacy-
preserving data mining”.
[7] J. Han and M. Kamber Morgan Kaufmann
“Data Mining: Concepts and
Techniques”.2000.
[8] Brian, C.S. Loh and Patrick, H.H. Then l.
“Ontology-Enhanced Interactive
Anonymization in Domain-Driven Data
Mining Outsourcing” 2010 Second
International Symposium on Data, Privacy,
and E-Commerce
[9] CHARU C. AGGARWAL,PHILIP S. YU
“PRIVACY-PRESERVING DATA
MINING:MODELS AND ALGORITHM”
IBM T. J. Watson Research Center,
Hawthorne, NY 10532
[10] Yufei Tao1 Hekang Chen2 Xiaokui ”
Enhancing the Utility of Generalization for
Privacy Preserving data publishing”

Mais conteúdo relacionado

Mais procurados

Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Kato Mivule
 
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Miningidescitation
 
Using Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data MiningUsing Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data Mining14894
 
Paper id 212014109
Paper id 212014109Paper id 212014109
Paper id 212014109IJRAT
 
Utilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an OverviewUtilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an OverviewKato Mivule
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...IJDKP
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesEditor IJMTER
 
Privacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data SetsPrivacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data SetsIJERA Editor
 
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMPRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMIJNSA Journal
 
78201919
7820191978201919
78201919IJRAT
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches14894
 
Privacy preservation techniques in data mining
Privacy preservation techniques in data miningPrivacy preservation techniques in data mining
Privacy preservation techniques in data miningeSAT Publishing House
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data MiningVrushali Malvadkar
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_pptSagar Verma
 
A review on privacy preservation in data mining
A review on privacy preservation in data miningA review on privacy preservation in data mining
A review on privacy preservation in data miningijujournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 

Mais procurados (19)

Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...
 
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
 
Using Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data MiningUsing Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data Mining
 
Paper id 212014109
Paper id 212014109Paper id 212014109
Paper id 212014109
 
Utilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an OverviewUtilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an Overview
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
 
1699 1704
1699 17041699 1704
1699 1704
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for Databases
 
Privacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data SetsPrivacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data Sets
 
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMPRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
 
78201919
7820191978201919
78201919
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
 
Hy3414631468
Hy3414631468Hy3414631468
Hy3414631468
 
Privacy preservation techniques in data mining
Privacy preservation techniques in data miningPrivacy preservation techniques in data mining
Privacy preservation techniques in data mining
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data Mining
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_ppt
 
A review on privacy preservation in data mining
A review on privacy preservation in data miningA review on privacy preservation in data mining
A review on privacy preservation in data mining
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
130509
130509130509
130509
 

Destaque

Introduccion a la Gerencia
Introduccion a la Gerencia Introduccion a la Gerencia
Introduccion a la Gerencia betdana2010
 
Índice FIPE ZAP divulgação Março de 2014
Índice FIPE ZAP divulgação Março de 2014Índice FIPE ZAP divulgação Março de 2014
Índice FIPE ZAP divulgação Março de 2014Allan Fraga
 
7 สามัญ อังกฤษ
7 สามัญ อังกฤษ7 สามัญ อังกฤษ
7 สามัญ อังกฤษฝฝ' ฝน
 
Blago sa tavana
Blago sa tavanaBlago sa tavana
Blago sa tavanaLimundo
 
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013Gianluigi Spagnoli
 
Peixes gigantes e perigosos
Peixes gigantes e perigososPeixes gigantes e perigosos
Peixes gigantes e perigososvaltermn
 
Centro Comercial Virtual
Centro Comercial VirtualCentro Comercial Virtual
Centro Comercial Virtualjggc
 
Глянец №14 (декабрь-январь 2012-13)
Глянец №14 (декабрь-январь 2012-13)Глянец №14 (декабрь-январь 2012-13)
Глянец №14 (декабрь-январь 2012-13)gorodche
 

Destaque (20)

Tigana trecho
Tigana trechoTigana trecho
Tigana trecho
 
Taller en Juan Cacharpa
Taller en Juan CacharpaTaller en Juan Cacharpa
Taller en Juan Cacharpa
 
Introduccion a la Gerencia
Introduccion a la Gerencia Introduccion a la Gerencia
Introduccion a la Gerencia
 
Aj04602248254
Aj04602248254Aj04602248254
Aj04602248254
 
Trenajer 8 9
Trenajer 8 9Trenajer 8 9
Trenajer 8 9
 
S13 modulo3 s8- primaria matematica
S13 modulo3 s8- primaria matematicaS13 modulo3 s8- primaria matematica
S13 modulo3 s8- primaria matematica
 
Índice FIPE ZAP divulgação Março de 2014
Índice FIPE ZAP divulgação Março de 2014Índice FIPE ZAP divulgação Março de 2014
Índice FIPE ZAP divulgação Março de 2014
 
7 สามัญ อังกฤษ
7 สามัญ อังกฤษ7 สามัญ อังกฤษ
7 สามัญ อังกฤษ
 
Blago sa tavana
Blago sa tavanaBlago sa tavana
Blago sa tavana
 
Ap04605296305
Ap04605296305Ap04605296305
Ap04605296305
 
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
 
Am04606239243
Am04606239243Am04606239243
Am04606239243
 
Peixes gigantes e perigosos
Peixes gigantes e perigososPeixes gigantes e perigosos
Peixes gigantes e perigosos
 
Erroma
ErromaErroma
Erroma
 
Centro Comercial Virtual
Centro Comercial VirtualCentro Comercial Virtual
Centro Comercial Virtual
 
Aj04606216221
Aj04606216221Aj04606216221
Aj04606216221
 
Глянец №14 (декабрь-январь 2012-13)
Глянец №14 (декабрь-январь 2012-13)Глянец №14 (декабрь-январь 2012-13)
Глянец №14 (декабрь-январь 2012-13)
 
Ac04605194204
Ac04605194204Ac04605194204
Ac04605194204
 
Nghi ve me
Nghi ve meNghi ve me
Nghi ve me
 
H046064750
H046064750H046064750
H046064750
 

Semelhante a F046043234

Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining ApproachPrivacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining ApproachIRJET Journal
 
78201919
7820191978201919
78201919IJRAT
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining TechniquesIJMER
 
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...IJSRD
 
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...IRJET Journal
 
A survey on privacy preserving data publishing
A survey on privacy preserving data publishingA survey on privacy preserving data publishing
A survey on privacy preserving data publishingijcisjournal
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...acijjournal
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applicationsSubrat Swain
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata PrivacyTwo-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacydbpublications
 
The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope IJCSEIT Journal
 
Big Data: Privacy and Security Aspects
Big Data: Privacy and Security AspectsBig Data: Privacy and Security Aspects
Big Data: Privacy and Security AspectsIRJET Journal
 
A Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved MiningA Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved MiningIJNSA Journal
 
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...IJECEIAES
 
Privacy preserving on
Privacy preserving onPrivacy preserving on
Privacy preserving onPoonam Hajare
 
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS cscpconf
 
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionMultilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionIOSR Journals
 

Semelhante a F046043234 (20)

Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining ApproachPrivacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
 
78201919
7820191978201919
78201919
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining Techniques
 
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
 
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
 
A survey on privacy preserving data publishing
A survey on privacy preserving data publishingA survey on privacy preserving data publishing
A survey on privacy preserving data publishing
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applications
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata PrivacyTwo-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
 
The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope
 
Data Mining Applications And Feature Scope Survey
Data Mining Applications And Feature Scope SurveyData Mining Applications And Feature Scope Survey
Data Mining Applications And Feature Scope Survey
 
Big Data: Privacy and Security Aspects
Big Data: Privacy and Security AspectsBig Data: Privacy and Security Aspects
Big Data: Privacy and Security Aspects
 
130509
130509130509
130509
 
A Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved MiningA Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved Mining
 
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
 
Privacy preserving on
Privacy preserving onPrivacy preserving on
Privacy preserving on
 
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
 
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionMultilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
 

Último

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 

Último (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 

F046043234

  • 1. Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34 www.ijera.com 32 | P a g e Privacy Preservation of Data in Data Mining Prachi Kohale*, Sheetal Girase** *(Department of Inforamation Technology, MIT, PUNE PUNE University) ** (Department of Information Technology,MIT PUNE, PUNE University) ABSTRACT For data privacy against an un-trusted party, Anonymization is a widely used technique capable of preserving attribute values and supporting data mining algorithms. The technique deals with Anonymization methods for users in a domain-driven data mining outsourcing. Several issues emerge when anonymization is applied in a real world outsourcing scenario. The majority of methods have focused on the traditional data mining approach; therefore they do not implement domain knowledge nor optimize data for domain-driven usage. So, existing techniques are mostly non-interactive in nature, providing little control to users. To successfully obtain optimal data privacy and actionable patterns in a real world setting, these concerns need to be addressed. The technique is based on anonymization framework for users in a domain-driven data mining outsourcing scenario. The framework involves several components designed to anonymize data while preserving meaningful or actionable patterns that can be discovered after mining. In the medical field, for enhancing the quality and importance of healthcare, data mining plays an important role. For example classification analysis on patients data to determine their probability of having heart disease, so, appropriate actions can be taken, thus enhancing treatment, saving time and cost. It is helpful to anonymize data while preserving meaningful of actionable patterns that can be found after mining. Keywords: DDDM, GT, BT I. INTRODUCTION In the real world data mining is constraint based as opposed to traditional data mining which is data driven trial and error process. In traditional data mining knowledge discovery usually performed without domain intelligence, thus affecting model or actionability for real business needs [1]. So there exists gap between objective and business goals as well as outputs and business expectations. Domain driven data mining aims to bridge the gap between these by involving domain experts and knowledge to obtain actionable patterns or models applicable to real world requirements.[7] Data anonymization in domain driven data mining uses methods like k anonymity and recent dynamic anonymization methods. II. ANONYMIZATION The fig.1 shows anonymization in domain driven data mining outsourcing. It involves several components designed to anonymize data while preserving meaningful and actionable patterns that can be discovered after mining [8]. In contrast to traditional data mining this framework integrates the knowledge to retain values after anonymization. Domain driven data mining DDDM is to make system deliver business friendly and decision making rules and actions that are of solid technical significance as well [2]. Its having effective involvement of domain intelligence, network intelligence, Data intelligence, human intelligence, social intelligence[9]. fig 1 anonymization in domain driven data mining outsourcing. III. ANGEL TECHNIQUE Generalization is well known method for privacy preservation of data. It has several drawbacks such as heavy information loss, difficulty of supporting marginal publication. [3]To overcome this Angel, a new anonymization technique that is effective as generalization in privacy protection but able to retain significantly more information in microdata. It is applicable to any principle l-diversity, t-closeness and to overcome the drawback of several methods. Compared to traditional generalization it ensures same privacy guarantee preserve significantly more information in microdata. Table 1 ANGEL publication 1 Anonymi zation (DDDM) Anonymi zation (Data Protectio n) Data Privacy (Patient data) Data Mining (outsour cing) RESEARCH ARTICLE OPEN ACCESS
  • 2. Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34 www.ijera.com 33 | P a g e Suppose that we want to publish the microdata of Table 1, conforming to 2-diversity. ANGEL first divides the table into batches: Batch 1: {Alan, Carrie}, Batch 2: {Bob, Daisy}, Batch 3: {Eddy, Gloria}, Batch 4: {Frank, Helena}. Observe that each batch obeys 2-diversity: it contains one pneumonia- and one bronchitis-tuple. ANGEL creates a batch table (BT), as in Table 3.1a, summarizing the Disease-statistics of each batch. For example, the first row of Table 3.1a states that exactly one tuple in Batch 1 carries pneumonia. Then, ANGEL creates another partitioning into buckets (which do not have to be 2-diverse). Bucket 1: {Alan, Bob}, Bucket 2: {Carrie, Daisy}, Bucket 3: {Eddy, Frank}, Bucket 4: {Gloria, Helena} Table 2 Microdata Name Age Sex Zip Disease Allan 21 M 10k Pneumonia Bob 23 M 58k Flu Carrie 58 F 12k Bronchitis Daisy 60 F 60k Pneumonia Eddy 70 M 78k Flu Frank 72 M 80k Bronchitis Table 3 2-diverse generalization Age Sex Zipcod e Disease [21, 23] M [10k, 58k] Pneumon ia [21, 23] M [10k, 58k] Flu [58, 60] F [12k, 60k] Bronchiti s [58, 60] F [12k, 60k] Pneumon ia [70, 72] M [78k, 80k] Flu [70, 72] M [78k, 80k] Bronchiti s Zipcode Disease [10k, 12k] Pneumon ia [10k, Bronchiti 12k] s [58k, 60k] Flu [58k, 60k] Pneumon ia [78k, 80k] Flu [78k, 80k] Bronchiti s Finally, ANGEL generalizes the tuples of each bucket into the same form, producing a generalized table (GT). Table 3 demonstrates the GT. Note that GT does not include the Disease attribute, but stores, for each tuple of the microdata, the ID of the batch containing it. For instance, the first tuple of Table has a Batch-ID 1, because its owner Alan belongs to Batch 1. Tables 3.a and 3.b are the final relations released by ANGEL[10]. IV. CONCLUSION Angelization is new anonymization technic for privacy preserving publication which is applicable to any monotonic anonymization principle. It produces anonymized relation that achieve privacy guarantee as conventional generalization but permits much more accurate reconstruction of original data distribution. It offers simple and rigorous solution which was difficult issue. V. ACKNOWLEDGEMENTS I would like to take opportunity to express my humble gratitude to my guide Prof. Shital P. Girase, Asst. Prof. Maharashtra Academy of Engineering & Educational Research’s, Maharashtra Institute of Technology, Pune, who gave me assistance with their experienced knowledge and whatever required at all stages of my project. References [1] Longbing Cao “DDDM challenges and prospectus “ june 2010,755-769 [2] Longbing Cao, Philip S. Yu, Chengqi Zhang, Yanchang Zhao “Domain Driven Data Mining” [3] G. Aggarwal, T. Feder, K. Kenthapadi, S. Khuller, R. Panigrahy, D. Thomas, and A. Zhu. “Achieving anonymity via clustering”. In PODS, 2006. [4] Josep Domingo-Ferrer and Vicenc¸ Torra “A Critique of k-Anonymity and Some of Its Enhancements”. 3 International Conference on Availability, Reliability and Security [5] V.Narmada “An enhanced security algorithm for distributed databases in privacy preserving data ases” (IJAEST) INTERNATIONAL JOURNAL OF
  • 3. Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34 www.ijera.com 34 | P a g e ADVANCED ENGINEERING SCIENCES AND TECHNOLOGIES”. [6] R. Agrawal and R. Srikant. “Privacy- preserving data mining”. [7] J. Han and M. Kamber Morgan Kaufmann “Data Mining: Concepts and Techniques”.2000. [8] Brian, C.S. Loh and Patrick, H.H. Then l. “Ontology-Enhanced Interactive Anonymization in Domain-Driven Data Mining Outsourcing” 2010 Second International Symposium on Data, Privacy, and E-Commerce [9] CHARU C. AGGARWAL,PHILIP S. YU “PRIVACY-PRESERVING DATA MINING:MODELS AND ALGORITHM” IBM T. J. Watson Research Center, Hawthorne, NY 10532 [10] Yufei Tao1 Hekang Chen2 Xiaokui ” Enhancing the Utility of Generalization for Privacy Preserving data publishing”