PRIVACY FOR CONTINUAL DATA
PUBLISHING
Junpei Kawamoto, Kouichi Sakurai
(Kyushu University, Japan)

This work is partly supported by Grants-in-Aid for Scientific Research (B)(23300027),
Japan Society for the Promotion of Science (JSPS)
Jan. 10	


Privacy for Continual Data Publishing 	

Analysis of Location data (Big data)	
•  We can easily gather location data from GPS, etc.	

•  Example uses: finding which crossroads are dangerous, finding car accidents quickly, finding available roads.
•  Example analyses: counts, frequent patterns, change point detection, etc.

Privacy for Publishing Location Data	
•  Publishing location data of people.
[Figure: collectors publish location data to an analyst.]
•  Location data should sometimes be kept secret.
•  Some people want to keep secret where they were.
•  Privacy-preserving data publishing is necessary.

Assumption of collector	
•  Collecting people’s location and publishing histograms.
[Figure: at each time step (t = 1, 2, 3) the collector publishes a histogram π to the analyst.]
Example histograms for three consecutive time steps, in the order shown on the slide:
POI   Count   Count   Count
A     15000   15200   15300
B     30300   30100   30000
•  In every time span, the collector publishes a histogram.
•  We discuss what kind of privacy the collector should guarantee.
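
As a rough illustration of the collector's job, the sketch below groups location records by time step and counts people per POI. The record layout `(person_id, t, poi)` and the name `build_histograms` are assumptions made for this example; the talk does not specify how the collector stores its data.

```python
from collections import Counter, defaultdict

def build_histograms(records):
    """Return {t: Counter({poi: count})} from (person_id, t, poi) records."""
    histograms = defaultdict(Counter)
    for _person_id, t, poi in records:
        histograms[t][poi] += 1
    return dict(histograms)

# Hypothetical toy records: three people observed at two time steps.
records = [
    ("u1", 1, "A"), ("u2", 1, "B"), ("u3", 1, "B"),
    ("u1", 2, "B"), ("u2", 2, "B"), ("u3", 2, "A"),
]
histograms = build_histograms(records)
```

Each `histograms[t]` plays the role of the original, unprotected histogram at time t.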

Related Work: Differential Privacy [1]
•  The de facto standard privacy definition.
•  Keeps secret whether any person's locations are in the histograms.
•  Adds Laplace noise to histograms, with density
$$\frac{1}{2\phi} \exp\left(-\frac{|x - \mu|}{\phi}\right)$$
•  Guarantees privacy against attacks using any kind of knowledge.
•  The added noise is too large in less-populated areas.
[Figure: the number of people in a less-populated area.]
[1] C.Dwork, F.McSherry, K.Nissim, A.Smith, “Calibrating noise to sensitivity in private data analysis”, Proc. of the
Third Conference on Theory of Cryptography, pp. 265-284, 2006.
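
To make the noise problem concrete, here is a minimal sketch of adding Laplace noise to a histogram. It assumes the usual calibration φ = sensitivity / ε from [1]. The sensitivity value 2.0 (one person moving changes two bins by one each), the ε value, and the less-populated POIs "C" and "D" are illustrative assumptions, not values from the talk.

```python
import math
import random

def laplace_pdf(x, mu, phi):
    """Density (1 / (2*phi)) * exp(-|x - mu| / phi)."""
    return math.exp(-abs(x - mu) / phi) / (2.0 * phi)

def laplace_noise(phi, rng):
    """Sample Laplace(0, phi) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / phi) - rng.expovariate(1.0 / phi)

def add_laplace_noise(histogram, epsilon, sensitivity, rng):
    """Add independent Laplace noise to every count."""
    phi = sensitivity / epsilon  # noise scale; it does not depend on the counts
    return {poi: count + laplace_noise(phi, rng)
            for poi, count in histogram.items()}

rng = random.Random(0)            # fixed seed only to make the sketch repeatable
busy = {"A": 15000, "B": 30300}   # counts of the size shown in the example histograms
quiet = {"C": 12, "D": 7}         # hypothetical less-populated POIs
noisy_busy = add_laplace_noise(busy, epsilon=0.1, sensitivity=2.0, rng=rng)
noisy_quiet = add_laplace_noise(quiet, epsilon=0.1, sensitivity=2.0, rng=rng)
```

With these assumed settings φ = 20, so the expected absolute noise is 20 per bin. That is negligible next to a count of 15000 but larger than the counts 12 and 7 themselves.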

Our objective: to construct a privacy definition for private histograms that preserves the utility of the outputs as much as possible.

Main idea of our privacy definition	
•  Differential privacy hides every move.
•  We assume it isn't necessary to hide explicit moves.
[Figure: a crossroads with roads A, B, C, and D. Two roads are marked "Under construction", so most people entering from A turn left to B. This is public knowledge.]
If an adversary knows that a victim was in A at time t, and the victim moves to B at time t+1, we do not need to protect the privacy of that move.

Main idea of our privacy definition 	
•  Employing a Markov process to distinguish explicit and implicit moves
•  We assume that if the outputs give adversaries no more information than the Markov process does, the outputs are private.
[Figure: a public Markov process over POIs A and B, with edges labeled 0.9, 0.1, 0.5, and 0.5.]
•  A -> A: explicit
•  A -> B: implicit (we focus on the privacy of this move)
•  We employ "Adversarial Privacy" [2]
•  A privacy definition that bounds the information outputs give to adversaries.

[2] V.Rastogi, M.Hay, G.Miklau, D.Suciu, “Relationship Privacy: Output Perturbation for Queries with Joins”, Proc.
of the ACM Symposium on Principles of Database Systems, pp.107-116, 2009.

Adversarial Privacy	
•  The definition
•  $p(X)$: adversaries' prior belief of an event $X$
•  $p(X \mid O)$: adversaries' posterior belief of $X$ after observing an output $O$
•  The output $O$ is ε-adversarial private iff for any $X$,
$$p(X \mid O) \le e^{\varepsilon} \, p(X)$$
•  To apply adversarial privacy to our problem, we need to design $X$, $O$, and $p$:
•  $X$: a person is in POI $l_j$ at time $t$, i.e. $X_t = l_j$
•  $O$: the published histogram at time $t$, i.e. $\pi(t)$
•  $p$: an algorithm computing adversaries' belief
•  We design $p$ for several adversary classes, depending on use cases. This is one of our contributions.
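
The inequality can be checked mechanically once prior and posterior beliefs are available. The sketch below assumes both are given as dictionaries mapping each event (here, "the victim is in POI l_j at time t") to a probability. The function name and the toy numbers are hypothetical, and how the beliefs are computed is left open.

```python
import math

def is_adversarially_private(prior, posterior, epsilon):
    """True iff posterior[x] <= exp(epsilon) * prior[x] for every event x."""
    bound = math.exp(epsilon)
    return all(posterior[x] <= bound * prior[x] for x in prior)

# Toy beliefs about which POI the victim is in (illustrative values only).
prior = {"A": 0.9, "B": 0.1}
posterior_ok = {"A": 0.85, "B": 0.15}   # belief in B grows by a factor of 1.5
posterior_bad = {"A": 0.6, "B": 0.4}    # belief in B grows by a factor of 4

ok = is_adversarially_private(prior, posterior_ok, epsilon=0.5)
bad = is_adversarially_private(prior, posterior_bad, epsilon=0.5)
```

With ε = 0.5 the allowed growth factor is about 1.65, so the first posterior passes and the second does not.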

Adversary Classes	
•  Markov-Knowledge Adversary (MK)
•  Guessing which POI a victim is in at time t
•  Utilizing the Markov process and output histograms before time t
•  Any-Person-Knowledge Adversary (APK)
•  Guessing which POI a victim is in at time t
•  Utilizing the Markov process and output histograms before time t
and which POI the victim was in at time t – 1


The APK class is stronger than the MK class.
Today, we focus on the APK class.


Beliefs of APK-class adversaries	
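•  Prior belief before observing output $\pi(t)$:
$$p\big(X_t = l_j \mid X_{t-1} = l_i,\ (\pi(t-1)^{\top} P)^{\top},\ \pi(t-1);\ P\big)$$
•  Posterior belief after observing output $\pi(t)$:
$$p\big(X_t = l_j \mid X_{t-1} = l_i,\ \pi(t),\ \pi(t-1);\ P\big)$$
•  Thus, output $\pi(t)$ is ε-adversarial private for the APK class iff, for all $l_i, l_j$,
$$\frac{p\big(X_t = l_j \mid X_{t-1} = l_i,\ \pi(t),\ \pi(t-1);\ P\big)}{p\big(X_t = l_j \mid X_{t-1} = l_i,\ (\pi(t-1)^{\top} P)^{\top},\ \pi(t-1);\ P\big)} \le e^{\varepsilon}$$
Here $^{\top}$ denotes the transpose.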
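
The prior appears to condition on the histogram that the Markov process predicts from the previous output, that is, the previous histogram multiplied by the transition matrix P. The sketch below computes that prediction under stated assumptions: P is row-stochastic with P[i][j] the probability of moving from POI i to POI j, and the values 0.9/0.1/0.5/0.5 are assigned to edges as a guess, since the slide figure does not say which edge carries which number. It does not compute the beliefs themselves, which the talk leaves to the algorithm p.

```python
def predict_histogram(prev_hist, P):
    """Expected counts after one Markov step: entry j is sum_i prev_hist[i] * P[i][j]."""
    n = len(prev_hist)
    return [sum(prev_hist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Assumed transition matrix over POIs [A, B]; rows are the current POI.
P = [
    [0.9, 0.1],   # from A: stay in A, move to B
    [0.5, 0.5],   # from B: move to A, stay in B
]
prev_hist = [15000.0, 30000.0]   # illustrative counts for A and B at time t-1
predicted = predict_histogram(prev_hist, P)
```

For these numbers the prediction is roughly 28500 people in A and 16500 in B, and the total head count is preserved.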


Computing private histograms	
•  Loss of a modified histogram
•  $\pi_0(t)$: original histogram at time $t$
•  $\pi(t)$: adversarial private histogram at time $t$
$$\mathrm{loss}(\pi(t), \pi_0(t)) = \lVert \pi(t) - \pi_0(t) \rVert_2$$
•  Computing adversarial private histograms is an optimization problem:
•  minimize $\mathrm{loss}(\pi(t), \pi_0(t))$
•  s.t. for all $l_i, l_j$,
$$\frac{p\big(X_t = l_j \mid X_{t-1} = l_i,\ \pi(t),\ \pi(t-1);\ P\big)}{p\big(X_t = l_j \mid X_{t-1} = l_i,\ (\pi(t-1)^{\top} P)^{\top},\ \pi(t-1);\ P\big)} \le e^{\varepsilon}$$
•  We employ a heuristic algorithm to solve this.
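
The talk does not describe its heuristic, so the following is only a hypothetical stand-in that shows the shape of the problem. It assumes the prior conditions on the Markov-predicted histogram, so that publishing the prediction itself would make prior and posterior coincide. The sketch walks from the original histogram toward that prediction and stops at the first candidate accepted by a caller-supplied privacy check. The names `blend_search` and `is_private` and the toy check are all assumptions.

```python
import math

def l2_loss(pi, pi0):
    """Euclidean distance between a modified and an original histogram."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pi, pi0)))

def blend_search(pi0, predicted, is_private, steps=100):
    """Move from pi0 toward predicted until is_private(candidate) holds.

    Not the authors' algorithm: a simple line search used for illustration.
    """
    for k in range(steps + 1):
        lam = k / steps
        candidate = [(1 - lam) * a + lam * b for a, b in zip(pi0, predicted)]
        if is_private(candidate):
            return candidate
    return list(predicted)

pi0 = [100.0, 0.0]          # original histogram (toy values)
predicted = [0.0, 100.0]    # Markov-predicted histogram (toy values)

def toy_check(hist):
    # Placeholder for the real constraint on the belief ratio.
    return hist[0] <= 75.0

result = blend_search(pi0, predicted, toy_check, steps=4)
```

A line search like this returns the first grid point that passes the check, so it keeps the loss small but makes no claim of optimality.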

Extension for High-order Markov Process	
•  We assumed a 1st-order Markov process.
•  Each element of a published histogram corresponds to a POI.
[Figure: a 1st-order Markov process over POIs A and B, with edges labeled 0.9, 0.1, 0.5, and 0.5.]
•  A high-order Markov process lets us publish counts of paths.
•  We can convert a high-order Markov process to a 1st-order Markov process.
[Figure: example of a 2nd-order Markov process whose states are the paths A→B, A→D, B→C, and B→D.]
•  We can publish counts of 2-length paths.
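
One way to picture the conversion is to treat every n-length path as a single state, so the published histogram counts paths instead of POIs. The sketch below counts such paths from trajectories. The trajectory format (a list of POI names per person) and the toy data are assumptions for illustration.

```python
from collections import Counter

def count_paths(trajectories, n=2):
    """Count every contiguous n-length path (n-gram) across all trajectories."""
    counts = Counter()
    for traj in trajectories:
        for i in range(len(traj) - n + 1):
            counts[tuple(traj[i:i + n])] += 1
    return counts

# Hypothetical trajectories over the POIs used in the example figure.
trajectories = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "D"],
]
path_counts = count_paths(trajectories, n=2)
```

Here `path_counts` has one entry per 2-length path, for example `("A", "B")` with count 2, and these entries are the bins of the histogram to be protected.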
•  Our proposal guarantees privacy for publishing counts of n-gram paths.

Evaluation	
•  Set two mining tasks
•  Change point detection
•  Frequent paths extraction
•  Datasets
•  Moving people in Tokyo, 1998, provided by the People Flow Project [3]
•  Construct two small datasets: Shibuya and Machida
•  Shibuya: many people moving, to evaluate an urban area
•  Machida: fewer people moving, to evaluate a suburban area

[3] http://pflow.csis.u-tokyo.ac.jp/index-j.html


Number of people (Shibuya)
[Figure: number of people over time in Shibuya. Legend: Plain = original data; AdvP = proposal; DP-1 = differential privacy (ε=1); DP-100 = differential privacy (ε=100). Annotations: "Errors in less-populated times"; "Almost same".]

Change point detection (Shibuya)	
[Figure: change point scores over time in Shibuya, with errors marked.]
•  AdvP (proposal) has errors in rush hours,
•  but there are no false positives.
•  DP-1 and DP-100 have many errors.
•  DP-100 is too weak a privacy setting, yet it still has errors.

Number of people (Machida) 	

[Figure: number of people over time in Machida. Annotations: "Almost same"; "Too much noise".]


Change point detection (Machida)	
[Figure: change point scores over time in Machida, with errors marked.]
•  AdvP (proposal) has errors in rush hours.
•  DP-1 and DP-100 have errors at all times.

Frequent paths extraction
•  We employ NDCG [6] to evaluate the accuracy of outputs.
[Figure: NDCG for Shibuya and Machida; higher is good, lower is bad.]
•  Outputs of our proposal achieve better results than differential privacy in both Shibuya and Machida.
•  Our proposal is effective for publishing counts of paths.

[6] K. Järvelin, J. Kekäläinen, "IR evaluation methods for retrieving highly relevant documents," Proc. of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 41-48, 2000.
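
For readers unfamiliar with NDCG, here is a minimal sketch using one common formulation, with a log2(rank + 1) discount. The talk does not say how relevance was assigned, so the sketch assumes the gain of a path is its count in the original data and that paths are ranked by their published (noisy) counts. The path names and numbers are made up.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(ranking, true_gain, k):
    """NDCG@k of a ranking, scored against the true gains."""
    gains = [true_gain.get(item, 0.0) for item in ranking[:k]]
    ideal = sorted(true_gain.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical original and published path counts.
true_counts = {"A->B": 50.0, "B->C": 30.0, "A->D": 10.0}
noisy_counts = {"A->B": 48.0, "B->C": 52.0, "A->D": 9.0}

ranking = sorted(noisy_counts, key=noisy_counts.get, reverse=True)
score = ndcg(ranking, true_counts, k=3)
```

A score of 1 means the published counts rank the frequent paths exactly as the original data does; in this toy case the noise swaps the top two paths, so the score falls below 1.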

Conclusion	
•  Propose a new privacy definition
•  Preserving utilities of outputs as much as possible
•  Assuming Markov process on people’s moves
•  Employing adversarial privacy framework
•  Evaluations with two data mining tasks
•  Change point detection and frequent paths extraction
•  Our privacy definition achieves better utility than differential privacy
•  Future work
•  Applying to other mining tasks
•  Comparing with other privacy definitions

 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
“Iamnobody89757” Understanding the Mysterious of Digital Identity.pdf
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024
 

Privacy for Continual Data Publishing

  • 1. Privacy for Continual Data Publishing. Junpei Kawamoto, Kouichi Sakurai (Kyushu University, Japan). This work is partly supported by Grants-in-Aid for Scientific Research (B)(23300027), Japan Society for the Promotion of Science (JSPS).
  • 2. Jan. 10: Analysis of Location Data (Big Data)
      • We can easily gather location data from GPS and similar sources.
      • Example questions: Which crossroads are dangerous? How can we find car accidents quickly? Which roads are available?
      • Example analyses: counts, frequent patterns, change point detection, etc.
  • 3. Privacy for Publishing Location Data
      • Setting: people's location data is published.
      • [Figure: collectors publish data to an analyst.]
      • Location data sometimes needs to be kept secret: a person may not want to reveal where they were.
      • Privacy-preserving data publishing is therefore necessary.
  • 4. Assumption of collector
      • The collector gathers people's locations and publishes histograms π to an analyst.
      • Example of published histograms (count per POI):
          POI | t = 3 | t = 2 | t = 1
          A   | 15000 | 15200 | 15300
          B   | 30300 | 30100 | 30000
      • In every time span, the collector publishes a histogram.
      • We discuss what kind of privacy the collector should guarantee.
  • 5. Related Work: Differential Privacy [1]
      • The de facto standard privacy definition.
      • Keeps secret whether any person's location is included in the histograms.
      • Adds Laplace noise to the histograms, with density $\frac{1}{2\phi}\exp\left(-\frac{|x-\mu|}{\phi}\right)$.
      • Guarantees privacy against attacks that use any kind of knowledge.
      • The added noise is too large in less-populated areas.
      • [Figure: the number of people in a less-populated area.]
      [1] C. Dwork, F. McSherry, K. Nissim, A. Smith, "Calibrating noise to sensitivity in private data analysis", Proc. of the Third Conference on Theory of Cryptography, pp. 265-284, 2006.
  • 6. Related Work: Differential Privacy [1] (objective)
      • Our objective: to construct a privacy definition for private histograms that preserves the utility of the outputs as much as possible.
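The Laplace mechanism described in the differential privacy slides can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `sensitivity` parameter, and the example counts are assumptions, and the noise scale `sensitivity / epsilon` plays the role of φ in the density above.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_histogram(hist, epsilon, sensitivity=1.0, rng=None):
    """Return a noisy copy of `hist` (a dict mapping POI -> count).

    `sensitivity` is an assumption of this sketch; the slides do not state it.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon  # corresponds to phi in the Laplace density
    return {poi: count + laplace_noise(scale, rng) for poi, count in hist.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical counts: one busy POI and one nearly empty POI.
    hist = {"busy": 30000, "quiet": 5}
    noisy = laplace_histogram(hist, epsilon=1.0, rng=rng)
    for poi, count in hist.items():
        rel_err = abs(noisy[poi] - count) / count
        print(poi, round(noisy[poi], 2), f"relative error {rel_err:.4f}")
```

Because every bin receives noise of the same scale, the relative error tends to be much larger for the small count, which is the weakness in less-populated areas that the slides point out.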
  • 7. Main idea of our privacy definition
      • Differential privacy hides every move.
      • We assume that it is not necessary to hide explicit moves.
      • [Figure: a crossroads connecting A, B, C, and D. The roads to C and D are under construction, so most people entering from A turn left to B. This is public knowledge.]
      • If an adversary knows that a victim was in A at time t and the victim moves to B at time t+1, we do not need to protect the privacy of that move.
  • 8. Main idea of our privacy definition
      • We employ a Markov process to distinguish explicit and implicit moves.
      • We assume that if the outputs give adversaries no more information than the Markov process does, the outputs are private.
      • [Figure: a public Markov process over POIs A and B with transition probabilities 0.9, 0.1, 0.5, and 0.5. A -> A is an explicit move; A -> B is an implicit move, and the privacy of this move is the focus.]
      • We employ "Adversarial Privacy" [2], a privacy definition that bounds the information that outputs give to adversaries.
      [2] V. Rastogi, M. Hay, G. Miklau, D. Suciu, "Relationship Privacy: Output Perturbation for Queries with Joins", Proc. of the ACM Symposium on Principles of Database Systems, pp. 107-116, 2009.
  • 9. Adversarial Privacy
      • The definition:
          • p(X): the adversary's prior belief in an event X.
          • p(X | O): the adversary's posterior belief in X after observing an output O.
          • The output O is ε-adversarially private iff, for any X, $p(X \mid O) \le e^{\varepsilon}\, p(X)$.
      • To apply adversarial privacy to our problem, we need to design X, O, and p:
          • X: a person is in POI $l_j$ at time t, i.e. $X_t = l_j$.
          • O: the histogram published at time t, i.e. $\pi(t)$.
          • p: an algorithm that computes the adversary's belief.
      • We design p for several adversary classes, depending on the use case. This is one of our contributions.
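The ε-adversarial privacy condition on slide 9 can be checked directly once prior and posterior beliefs are available. The sketch below assumes the beliefs are given as plain dictionaries over events; how those beliefs are computed is left open here, and the function names and numbers are illustrative only.

```python
import math


def is_adversarially_private(prior, posterior, epsilon):
    """Check p(X | O) <= e^epsilon * p(X) for every event X.

    `prior` and `posterior` map each event to the adversary's belief.
    """
    bound = math.exp(epsilon)
    return all(posterior[x] <= bound * prior[x] for x in prior)


def smallest_epsilon(prior, posterior):
    """Smallest non-negative epsilon for which the bound holds."""
    worst = 0.0
    for x in prior:
        if posterior[x] == 0:
            continue  # a zero posterior never violates the bound
        if prior[x] == 0:
            return math.inf  # no finite epsilon can satisfy the bound
        worst = max(worst, math.log(posterior[x] / prior[x]))
    return worst


if __name__ == "__main__":
    # Hypothetical beliefs about which POI a person is in.
    prior = {"A": 0.5, "B": 0.5}
    posterior = {"A": 0.6, "B": 0.4}
    print(is_adversarially_private(prior, posterior, epsilon=0.2))
    print(smallest_epsilon(prior, posterior))
```

Note that the condition only bounds how much an output may increase the adversary's belief; a decrease is not restricted by this inequality.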
  • 10. Adversary Classes
      • Markov-Knowledge Adversary (MK)
          • Guesses which POI a victim is in at time t.
          • Uses the Markov process and the histograms output before time t.
      • Any-Person-Knowledge Adversary (APK)
          • Guesses which POI a victim is in at time t.
          • Uses the Markov process, the histograms output before time t, and which POI the victim was in at time t − 1.
  • 11. Adversary Classes (comparison)
      • The APK class is stronger than the MK class.
      • Today, we focus on the APK class.
  • 12. Beliefs of APK-class adversaries
      • Prior belief before observing the output $\pi(t)$: $p(X_t = l_j \mid X_{t-1} = l_i, (\pi(t-1)^{\top} P)^{\top}, \pi(t-1); P)$
      • Posterior belief after observing the output $\pi(t)$: $p(X_t = l_j \mid X_{t-1} = l_i, \pi(t), \pi(t-1); P)$
      • Thus, the output $\pi(t)$ is ε-adversarially private for the APK class iff, for all $l_i, l_j$: $p(X_t = l_j \mid X_{t-1} = l_i, \pi(t), \pi(t-1); P) \le e^{\varepsilon}\, p(X_t = l_j \mid X_{t-1} = l_i, (\pi(t-1)^{\top} P)^{\top}, \pi(t-1); P)$
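The prior belief on slide 12 conditions on the term (π(t−1)ᵀ P)ᵀ. Reading P as the row-stochastic transition matrix of the public Markov process (an assumption of this sketch), that term is the histogram expected at time t given the previous histogram. The code below computes only this predicted histogram, not the belief p itself; the matrix and counts are hypothetical.

```python
def predict_histogram(prev_hist, transition):
    """Compute (pi(t-1)^T P)^T for a row-stochastic matrix P.

    prev_hist[i] is the count at POI i at time t-1, and transition[i][j]
    is the probability of moving from POI i to POI j.
    """
    n = len(prev_hist)
    return [
        sum(prev_hist[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical two-POI process; the assignment of probabilities
    # to edges is assumed for illustration.
    P = [
        [0.9, 0.1],  # from POI A: stay in A, move to B
        [0.5, 0.5],  # from POI B: move to A, stay in B
    ]
    prev = [100, 200]  # counts at A and B at time t-1
    print(predict_histogram(prev, P))
```

Since each row of P sums to one, the predicted histogram keeps the same total count as the previous one.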
  • 13. Computing private histograms
      • Loss of a modified histogram:
          • $\pi_0(t)$: the original histogram at time t.
          • $\pi(t)$: the adversarially private histogram at time t.
          • $\mathrm{loss}(\pi(t), \pi_0(t)) = \lVert \pi(t) - \pi_0(t) \rVert_2$
      • Computing an adversarially private histogram is an optimization problem:
          • minimize $\mathrm{loss}(\pi(t), \pi_0(t))$
          • subject to, for all $l_i, l_j$: $p(X_t = l_j \mid X_{t-1} = l_i, \pi(t), \pi(t-1); P) \le e^{\varepsilon}\, p(X_t = l_j \mid X_{t-1} = l_i, (\pi(t-1)^{\top} P)^{\top}, \pi(t-1); P)$
      • We employ a heuristic algorithm to solve this problem.
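The shape of the optimization problem on slide 13 can be illustrated with a toy search. This is not the heuristic algorithm mentioned in the slides, which is not described there: the sketch simply picks, from a hypothetical list of candidate histograms, the one with the smallest loss among those accepted by a caller-supplied privacy check. The L2 form of the loss and all names are assumptions.

```python
import math


def l2_loss(published, original):
    """Euclidean distance between two histograms (dicts POI -> count)."""
    return math.sqrt(sum((published[k] - original[k]) ** 2 for k in original))


def best_feasible(candidates, original, is_private):
    """Return the candidate with minimal loss among those passing `is_private`.

    `is_private` stands in for the adversarial privacy constraint and is
    supplied by the caller; returns None if no candidate is feasible.
    """
    feasible = [c for c in candidates if is_private(c)]
    if not feasible:
        return None
    return min(feasible, key=lambda c: l2_loss(c, original))


if __name__ == "__main__":
    original = {"A": 10, "B": 20}
    candidates = [
        {"A": 10, "B": 20},
        {"A": 11, "B": 19},
        {"A": 13, "B": 17},
    ]
    # Hypothetical stand-in for the privacy constraint.
    chosen = best_feasible(candidates, original, lambda c: c["A"] >= 11)
    print(chosen, l2_loss(chosen, original))
```

The original histogram has zero loss, so it is chosen whenever it already satisfies the constraint; modification is needed only when it does not.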
  • 14. Extension for High-order Markov Process
      • So far we assumed a first-order Markov process, in which each element of a published histogram corresponds to a POI.
      • [Figure: first-order Markov process over POIs A and B with transition probabilities 0.9, 0.1, 0.5, and 0.5.]
      • A higher-order Markov process lets us publish counts of paths.
      • A higher-order Markov process can be converted to a first-order Markov process.
      • [Figure: example of a second-order Markov process with states A→B, B→C, B→D, and A→D.]
      • With this example, we can publish counts of paths of length 2.
  • 15. Extension for High-order Markov Process (result)
      • Our proposal guarantees privacy when publishing counts of n-gram paths.
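The conversion described in the high-order Markov slides can be sketched by treating each path as a single state. The code below counts paths in a set of trajectories and counts the first-order transitions between consecutive, overlapping path states. The trajectory data is hypothetical, and "length" is taken to mean the number of POIs in a path, which is an assumption about the slides' wording.

```python
from collections import Counter


def path_states(trajectory, length=2):
    """Split one trajectory into overlapping paths of `length` POIs."""
    return [
        tuple(trajectory[i:i + length])
        for i in range(len(trajectory) - length + 1)
    ]


def path_counts(trajectories, length=2):
    """Histogram over paths: the counts that would be published."""
    counts = Counter()
    for traj in trajectories:
        counts.update(path_states(traj, length))
    return counts


def path_transitions(trajectories, length=2):
    """Counts of first-order transitions between consecutive path states."""
    counts = Counter()
    for traj in trajectories:
        states = path_states(traj, length)
        counts.update(zip(states, states[1:]))
    return counts


if __name__ == "__main__":
    # Hypothetical trajectories over POIs A, B, C, D.
    trajectories = [["A", "B", "C"], ["A", "B", "D"], ["A", "D"]]
    print(path_counts(trajectories))
    print(path_transitions(trajectories))
```

Normalizing the transition counts per source state would give an estimate of the first-order transition matrix over path states.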
  • 16. Evaluation
      • We set two mining tasks:
          • Change point detection
          • Frequent paths extraction
      • Datasets:
          • Movements of people in Tokyo in 1998, provided by the People Flow Project [3].
          • We construct two small datasets, Shibuya and Machida.
          • Shibuya: many people moving, used to evaluate an urban area.
          • Machida: fewer people moving, used to evaluate a suburban area.
      [3] http://pflow.csis.u-tokyo.ac.jp/index-j.html
  • 17. Number of people (Shibuya)
      • [Figure: number of people in Shibuya. Series: Plain (original data), AdvP (proposal), DP-1 (differential privacy, ε = 1), DP-100 (differential privacy, ε = 100). Annotations: "Almost same"; "Errors in less-populated times".]
  • 18. Change point detection (Shibuya)
      • [Figure: change point scores in Shibuya, with errors marked.]
      • AdvP (proposal) has errors during rush hours, but produces no false positives.
      • DP-1 and DP-100 have many errors.
      • DP-100 is too weak a privacy setting, yet it still has errors.
  • 19. Number of people (Machida)
      • [Figure: number of people in Machida. Annotations: "Almost same"; "Too much noise".]
  • 20. Change point detection (Machida)
      • [Figure: change point scores in Machida, with errors marked.]
      • AdvP (proposal) has errors during rush hours.
      • DP-1 and DP-100 have errors at all times.
  • 21. Frequent paths extraction
      • We employ NDCG [6] to evaluate the accuracy of the outputs.
      • [Figure: NDCG for Shibuya and Machida; the axis runs from "bad" to "good".]
      [6] K. Järvelin, J. Kekäläinen, "IR evaluation methods for retrieving highly relevant documents", Proc. of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 41-48, 2000.
  • 22. Frequent paths extraction (results)
      • Accuracy is evaluated with NDCG [6] for Shibuya and Machida.
      • Outputs produced by our proposal achieve better results than differential privacy in both Shibuya and Machida.
      • Our proposal is effective for publishing counts of paths.
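NDCG, the metric used in the frequent paths extraction slides, can be sketched as below. The sketch uses the commonly seen log2(rank + 1) discount, which may differ in detail from the formulation in [6], and it assumes that a path's relevance is its count in the original data; the slides do not state how relevance was assigned. The names and numbers are illustrative.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(ranked_items, true_relevance, k=None):
    """NDCG of a ranking against true relevance scores.

    `ranked_items` is the ranking extracted from the published data and
    `true_relevance` maps each item to its relevance in the original data.
    """
    if k is None:
        k = len(ranked_items)
    gains = [true_relevance.get(item, 0.0) for item in ranked_items[:k]]
    ideal = sorted(true_relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical true path counts used as relevance scores.
    truth = {("A", "B"): 3, ("B", "C"): 2, ("A", "D"): 1}
    exact = [("A", "B"), ("B", "C"), ("A", "D")]
    reversed_ranking = list(reversed(exact))
    print(ndcg(exact, truth))
    print(ndcg(reversed_ranking, truth))
```

A ranking that matches the true order scores 1, and rankings that push frequent paths down the list score lower.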
  • 23. Conclusion
      • We propose a new privacy definition that:
          • Preserves the utility of the outputs as much as possible.
          • Assumes a Markov process on people's moves.
          • Employs the adversarial privacy framework.
      • Evaluations with two data mining tasks:
          • Change point detection and frequent paths extraction.
          • Our privacy definition achieves better utility than differential privacy.
      • Future work:
          • Applying the definition to other mining tasks.
          • Comparing with other privacy definitions.