Advisors:
Dr. Jordi FORNÉ MUÑOZ
Dr. David REBOLLO MONEDERO
In partial fulfilment of the requirements for the degree of Doctor of Philosophy.
Silvia Puglisi
silvia.puglisi@upc.edu
Analysis, modelling and protection
of online private data.
Agenda
Background. Introduction and scope of the investigation.
Objectives. Objectives of the investigation.
Ongoing and future work. Publications and current research efforts.
Background
Online privacy
Is Privacy the right to be forgotten?
In 2011, the amount of digital information created and replicated globally exceeded
1.8 zettabytes (1.8 trillion gigabytes).
75% of this information is created by individuals through new media fora such as
blogs and via social networks.
By the end of 2011, Facebook had 845 million monthly active users, sharing over 30
billion pieces of content.
Library Briefing - Library of the European Parliament - 01/03/2012
What is online privacy anyway?
In an online context, the right to privacy has commonly been interpreted as a right to
“information self-determination”.
Acts typically claimed to breach online privacy concern the collection of personal
information without consent, the selling of personal information and the further processing
of that information.
Do we have online privacy?
Irani, Danesh et al. [1] describe how personal information leaks on social networks can
be used for concrete attacks.
Acquisti, Alessandro, and Ralph Gross [2] also presented a method to infer people's Social
Security numbers using only publicly available information.
Goga, Oana et al. [3] describe how user activity on one site can implicitly reveal their
identity on another site.
Chen, Terence et al. [4] showed a correlation between the amount and type of
information revealed in social network profiles.
The age of the “metadata”
Metadata is collected and stored by public and private organisations about where,
when and by whom a particular piece of online content was created and accessed.
In the private sphere it has been said that “literally, Google knows more about us than we
can remember ourselves”.
This situation has led to growing concerns regarding online privacy.
In China, one estimate suggests there are over 30 000 government censors monitoring
online information.
Library Briefing - Library of the European Parliament - 01/03/2012
Ex: Google Conversion Tracking
<html>
<body>
<!-- Below is a sample text link with a phone number. Replace the number with your
own phone number and the CALL NOW text with the text you want to hyperlink.
goog_report_conversion() is assumed to be defined by the Google conversion
tracking script loaded elsewhere on the page. -->
<a onclick="goog_report_conversion('tel:949-555-1234')" href="#">CALL NOW</a>
</body>
</html>
Some websites implement a Google forwarding number, which measures the calls made by
potential customers.
Why does metadata matter?
Metadata can be more revealing than the content itself. Ex:
● They know you called the suicide prevention hotline. But the actual conversation
remains secret.
● They know you checked HIV related websites, talked to an HIV testing service, then
spoke to your doctor. But they don’t know what was discussed during the calls.
Furthermore, Bizer, Christian et al. [5] have shown that websites already embed
structured data describing products, services and events, and expose user information
directly in their HTML pages using markup standards such as Microformats,
Microdata and RDFa.
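To illustrate how easily such embedded markup can be harvested, here is a minimal sketch using only the Python standard library. The sample page, the person named in it and the `MicrodataExtractor` class are hypothetical; the sketch only handles flat, non-nested `itemprop` elements and is not a full Microdata parser.

```python
from html.parser import HTMLParser

# Hypothetical page fragment annotated with Microdata (schema.org/Person).
SAMPLE = """
<div itemscope itemtype="https://schema.org/Person">
  <span itemprop="name">Alice Example</span>
  <span itemprop="jobTitle">Researcher</span>
</div>
"""


class MicrodataExtractor(HTMLParser):
    """Collect (itemprop, text) pairs from flat Microdata markup."""

    def __init__(self):
        super().__init__()
        self.properties = []   # extracted (name, value) pairs
        self._current = None   # itemprop of the element being read, if any

    def handle_starttag(self, tag, attrs):
        # Remember the property name when the element declares one.
        self._current = dict(attrs).get("itemprop")

    def handle_endtag(self, tag):
        self._current = None

    def handle_data(self, data):
        text = data.strip()
        if text and self._current:
            self.properties.append((self._current, text))


def extract_microdata(html):
    """Return the list of (property, value) pairs found in the HTML."""
    parser = MicrodataExtractor()
    parser.feed(html)
    parser.close()
    return parser.properties


print(extract_microdata(SAMPLE))
```

Any third party that can fetch the page can run the same extraction, which is why structured markup makes personal attributes machine-readable by default.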
Hyperdata && Hypermedia
Hyperdata indicates data objects linked to other data objects in other places, as
hypertext indicates text linked to other text in other places.
Hyperdata enables formation of a web of data, evolving from the "data on the Web" that
is not inter-related (or at least, not linked).
Hypermedia, an extension of the term hypertext, is a nonlinear medium of information
which includes graphics, audio, video, plain text and hyperlinks.
Source: Wikipedia
What is REST?
REST is an architectural style introduced by Roy Thomas Fielding in 2000; it has been
at the core of web design and development.
REST represents an abstraction over the actual architecture of the web.
In REST, identification, representation and format are independent concepts.
Specifically:
A URI can identify a resource without knowing what formats the resource uses to
exchange representations.
Likewise the protocols and representations used by the resource to communicate
can be modified independently from the URI identifying the resource.
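The independence between identifier and representation can be sketched as follows. This is an illustrative in-memory model, not a real HTTP server: the `/users/42` URI, the stored state and the `get` helper are all assumptions made for the example.

```python
import json

# Hypothetical resource store: one URI identifies one piece of state.
RESOURCES = {"/users/42": {"id": 42, "name": "Alice Example"}}


def to_json(state):
    return json.dumps(state, sort_keys=True)


def to_text(state):
    return ", ".join(f"{key}={value}" for key, value in sorted(state.items()))


# Representations can be added or changed here without touching any URI.
RENDERERS = {"application/json": to_json, "text/plain": to_text}


def get(uri, accept):
    """Return (media_type, body) for the resource at `uri` in the requested format."""
    state = RESOURCES[uri]
    return accept, RENDERERS[accept](state)


print(get("/users/42", "application/json"))
print(get("/users/42", "text/plain"))
```

The same URI yields two different representations, and a new media type could be registered in `RENDERERS` without the identifier changing.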
RESTful Architectures
REST Interfaces
The uniformity of REST interfaces is built upon four guiding principles:
● The identification of resources through the URI mechanism.
● The manipulation of resources through their representations.
● The use of self-descriptive messages.
● The use of hypermedia as the engine of application state (HATEOAS).
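The last principle can be sketched with a toy client that moves through an API only by following the links each representation advertises. The `API` dictionary, its URIs and the link relation names are hypothetical and stand in for real HTTP responses.

```python
# Hypothetical API: every representation carries the links a client may follow next.
API = {
    "/": {"links": {"footprints": "/footprints"}},
    "/footprints": {"items": ["tweet-1", "checkin-7"], "links": {"home": "/"}},
}


def fetch(uri):
    """Stand-in for an HTTP GET returning a parsed representation."""
    return API[uri]


def follow(start, *relations):
    """Navigate from `start` by link relation names, never by hard-coded URIs."""
    document = fetch(start)
    for relation in relations:
        document = fetch(document["links"][relation])
    return document


print(follow("/", "footprints")["items"])
```

Because the client knows only the entry point and the relation names, the server can move `/footprints` elsewhere without breaking it.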
Hypermedia and privacy protection
Information self-determination is not possible if users have no control over their online
footprint.
Hypermedia provides context over unstructured footprint information.
Users and applications use REST interfaces to interact with one another exchanging
resource representations.
The web follows REST principles and so do users’ online traces.
Hypermedia and privacy protection
Genc, Yegin et al. [6] introduce a method to map text messages onto a wider context and, by
computing the distance between them, classify their content.
Ducheneaut, Nicolas et al. [7] explain how recommender systems need to incorporate
contextual information from the physical world, as users move continuously and
frequently engage in a variety of activities.
Sakaki, Takeshi et al. [8] discuss how real-time interaction between online users and the
offline world can be used to detect target events, turning the actual users into sensors
themselves.
Objectives
Objective 1
Development of a hypermedia model of the
user online footprint
Objective 1
This hypermedia model of the user's online footprint is constructed by analysing the
different interactions that the user has online with various services and platforms.
Hyperme is the proposed hyperdata model of a user's online footprint.
The hyperme model links the user footprints created across different services and the
features associated with them in a hypergraph.
The user footprint is therefore transformed into an object that can be explored based on
some desired features.
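A minimal sketch of the hypergraph idea follows, assuming that each feature acts as a hyperedge connecting every footprint that exhibits it. The `Hypergraph` class, the footprint identifiers and the feature labels are hypothetical illustrations and are not taken from the hyperme implementation.

```python
from collections import defaultdict


class Hypergraph:
    """Hyperedges (features) connect any number of nodes (footprints)."""

    def __init__(self):
        self.edges = defaultdict(set)  # feature -> footprints sharing it
        self.nodes = defaultdict(set)  # footprint -> features attached to it

    def add(self, footprint, features):
        for feature in features:
            self.edges[feature].add(footprint)
            self.nodes[footprint].add(feature)

    def explore(self, feature):
        """Footprints across all services that share the given feature."""
        return sorted(self.edges.get(feature, set()))

    def linked(self, footprint):
        """Other footprints sharing at least one feature with this one."""
        related = set()
        for feature in self.nodes.get(footprint, set()):
            related |= self.edges[feature]
        related.discard(footprint)
        return sorted(related)


# Hypothetical footprints left on three different services.
graph = Hypergraph()
graph.add("twitter:post/1", {"topic:privacy", "location:Barcelona"})
graph.add("foursquare:checkin/7", {"location:Barcelona"})
graph.add("blog:entry/3", {"topic:privacy"})

print(graph.explore("location:Barcelona"))
print(graph.linked("twitter:post/1"))
```

Exploring by feature is what turns scattered traces into a single navigable object: one shared location or topic is enough to link footprints from otherwise unrelated services.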
Objective 1
Users stream private information
towards devices, applications and
platforms.
This information is shared with
groups of different people with distinct
access rights.
Private (?) information is only shared
with service providers.
Objective 1
The hyperme model captures
different aspects of user activities
online:
● Everything in the hyperme
model is a signal.
● Signals can be easily profiled.
● Signals can be linked
to each other.
● Footprints become objects
that can be explored.
Objective 1
The last two weeks of activity
on Stephen Fry's Twitter
account (@stephenfry) have
been analysed.
Objective 2
Analysis of data flows from social networks to
third party advertisers
Objective 2
The aim of this objective is understanding what data is leaked by third party advertising
networks and how third party advertising networks and social platforms track users as
they surf the web.
The exchange of identity information is followed from the client to third party advertising
platforms.
Methods implemented by third party advertising networks are discovered and classified
by analysing network requests (HTTP) and actual data flow (JavaScript calls).
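As a rough sketch of the first step in such an analysis, the observed HTTP requests can be split into first-party and third-party by comparing hosts. The URLs below are hypothetical, and the subdomain check is a simplification: a real study would compare registrable domains, for example using the Public Suffix List.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Lower-cased host name of a URL, or an empty string if absent."""
    return (urlsplit(url).hostname or "").lower()


def is_third_party(page_url, request_url):
    """Naive check: the request host is neither the page host nor a subdomain of it."""
    page, request = host_of(page_url), host_of(request_url)
    return not (request == page or request.endswith("." + page))


def third_party_hosts(page_url, request_urls):
    """Sorted set of third-party hosts contacted while loading the page."""
    return sorted({host_of(u) for u in request_urls if is_third_party(page_url, u)})


# Hypothetical requests recorded while loading one page.
PAGE = "https://news.example/article"
REQUESTS = [
    "https://news.example/style.css",
    "https://cdn.news.example/app.js",
    "https://ads.tracker.test/pixel?uid=123",
    "https://social.test/like.js",
]

print(third_party_hosts(PAGE, REQUESTS))
```

The resulting host list is the starting point for classifying which requests carry identifying parameters such as the `uid` value in the sample pixel URL.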
The mathematical distance between the user profile and the observed advertising profile is
taken as a measurement of how accurately third party platforms are tracking the user.
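The slides do not say which distance is used. As one possible illustration, the sketch below assumes both profiles are expressed as weights over interest categories and compares them with cosine distance; the category names and weights are invented for the example.

```python
import math


def cosine_distance(p, q):
    """1 - cosine similarity between two sparse category-weight profiles."""
    categories = set(p) | set(q)
    dot = sum(p.get(c, 0.0) * q.get(c, 0.0) for c in categories)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 1.0  # an empty profile shares nothing with any other profile
    return 1.0 - dot / (norm_p * norm_q)


# Hypothetical profiles: what the user actually browses vs. what the ads suggest.
user_profile = {"technology": 0.6, "travel": 0.4}
ad_profile = {"technology": 0.5, "finance": 0.5}

print(round(cosine_distance(user_profile, ad_profile), 3))
```

Under this assumption, a distance close to 0 would indicate that the advertising profile mirrors the user's interests closely, while a distance close to 1 would indicate little overlap.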
Objective 3
Evaluation of different PETs in Content
Recommendation Systems
Objective 3
The goal of this objective is the evaluation of different PETs in Content Recommendation
Systems.
Our aim is to show how a recommendation system is affected by the application of
certain PETs by a part of the user population.
Users may, in fact, wish to protect their privacy while also maintaining a satisfactory level
of utility of the information received by the recommendation platform.
Different levels of privacy protection are evaluated.
Objective 4
Evaluation of different PETs to prevent
information leaks on third party advertising
networks
Objective 4
The goal of this objective is the evaluation of different PETs to prevent third party
advertising networks from pervasively tracking users through their browsing patterns and
social platform profiles.
In particular we are concerned with understanding how third-party advertising networks
can be prevented from accessing certain private data regarding the user.
Objective 5
Extension of the hyperme model to cover
aspects of location identity
Objective 5
This objective aims at:
Analysing the amount and extent of geographically tagged information shared through
online activities.
Establishing links between location information and spatial context.
Evaluating different PETs to protect users’ location privacy.
Ongoing and future work
At the moment we are applying the hyperme hypermedia model to profile user activities
online.
We are especially concerned with answering the following questions:
● How is advertising influenced by online activities?
● To what extent does social network activity influence third party advertising?
● To what extent can mobile phone activity influence third party advertising?
● What PETs can be implemented to protect users’ privacy?
Ongoing and future work
We are collaborating with Dr. Markus Huber @ SBA Research (Vienna, Austria) on the
following topics:
● Analysing the Alexa Top Million websites to compile statistics on the tracking services
currently deployed.
● Testing current anti-tracking technologies to determine how effective they are.
We aim to submit a paper to the 36th IEEE Symposium on Security and
Privacy.
Publications
The following article was submitted to the journal Computer Standards &
Interfaces, on the topic of content-based recommendation systems and
privacy-enhancing techniques:
S. Puglisi, J. Parra-Arnau, D. Rebollo-Monedero and J. Forné,
On Content-Based Recommendation and Users Privacy in Social Tagging Systems,
Preprint submitted to Computer Standards & Interfaces, April 2014. Submitted for
publication.
I grew up with the understanding that the world I lived in was one where people enjoyed a
sort of freedom to communicate with each other in privacy, without it being monitored,
without it being measured or analyzed or sort of judged by these shadowy figures or
systems, any time they mention anything that travels across public lines.
- Edward Snowden
Thank you.
References
[1] D. Irani, S. Webb, and C. Pu, “Modeling unintended personal-information leakage from multiple online social networks,” IEEE Internet Computing, 2011.
[2] A. Acquisti and R. Gross, “Predicting social security numbers from public data,” in Proceedings of the National Academy of Sciences, 2009.
[3] O. Goga, H. Lei, S. H. K. Parthasarathi, G. Friedland, R. Sommer, and R. Teixeira, “Exploiting innocuous activity for correlating users across sites,” in Proceedings of the 22nd International Conference on World Wide Web, 2013.
[4] T. Chen, M. A. Kaafar, A. Friedman, and R. Boreli, “Is more always merrier? A deep dive into online social footprints,” in Proceedings of the 2012 ACM Workshop on Online Social Networks, 2012.
[5] C. Bizer, K. Eckert, R. Meusel, H. Mühleisen, M. Schuhmacher, and J. Völker, “Deployment of RDFa, Microdata, and Microformats on the web – a quantitative analysis,” in 12th International Semantic Web Conference, 21-25 October 2013, Sydney, Australia, In-Use track, 2013.
[6] Y. Genc, Y. Sakamoto, and J. Nickerson, “Discovering context: Classifying tweets through a semantic transform based on Wikipedia,” in Foundations of Augmented Cognition. Directing the Future of Adaptive Systems, ser. Lecture Notes in Computer Science, D. Schmorrow and C. Fidopiastis, Eds. Springer Berlin Heidelberg, 2011, vol. 6780, pp. 484–492.
[7] N. Ducheneaut, K. Partridge, Q. Huang, B. Price, M. Roberts, E. Chi, V. Bellotti, and B. Begole, “Collaborative filtering is not enough? Experiments with a mixed-model recommender for leisure activities,” in User Modeling, Adaptation, and Personalization, ser. Lecture Notes in Computer Science, G.-J. Houben, G. McCalla, F. Pianesi, and M. Zancanaro, Eds. Springer Berlin Heidelberg, 2009, vol. 5535, pp. 295–306.
[8] T. Sakaki, M. Okazaki, and Y. Matsuo, “Earthquake shakes Twitter users: real-time event detection by social sensors,” in Proceedings of the 19th International Conference on World Wide Web. ACM, 2010, pp. 851–860.

More Related Content

What's hot

Cert Overview
Cert OverviewCert Overview
Cert Overviewmattnik
 
Riding The Semantic Wave
Riding The Semantic WaveRiding The Semantic Wave
Riding The Semantic WaveKaniska Mandal
 
From Linked Documentary Resources to Linked Computational Resources
From Linked Documentary Resources to Linked Computational ResourcesFrom Linked Documentary Resources to Linked Computational Resources
From Linked Documentary Resources to Linked Computational ResourcesPhiloWeb
 
Final Next Generation Content Management
Final    Next  Generation  Content  ManagementFinal    Next  Generation  Content  Management
Final Next Generation Content ManagementScott Abel
 
A methodology for internal Web ethics
A methodology for internal Web ethicsA methodology for internal Web ethics
A methodology for internal Web ethicsPhiloWeb
 
Social Targeting: Understanding Social Media Data Mining & Analysis
Social Targeting: Understanding Social Media Data Mining & AnalysisSocial Targeting: Understanding Social Media Data Mining & Analysis
Social Targeting: Understanding Social Media Data Mining & AnalysisInfini Graph
 
website, browser,Domain name, Email, Social networks,Ecommerce
website, browser,Domain name, Email, Social networks,Ecommercewebsite, browser,Domain name, Email, Social networks,Ecommerce
website, browser,Domain name, Email, Social networks,EcommerceSumbal Noureen
 
Linked Data and Users in Library - Does the library communicate efficiently?
Linked Data and Users in Library - Does the library communicate efficiently?Linked Data and Users in Library - Does the library communicate efficiently?
Linked Data and Users in Library - Does the library communicate efficiently?Hansung University
 
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel ApproachMining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approachijma
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...John Breslin
 
Privacy and the library patron: an ongoing ethical challenge
Privacy and the library patron: an ongoing ethical challengePrivacy and the library patron: an ongoing ethical challenge
Privacy and the library patron: an ongoing ethical challengedmcmenemy
 
Harvesting Intelligence from User Interactions
Harvesting Intelligence from User Interactions Harvesting Intelligence from User Interactions
Harvesting Intelligence from User Interactions R A Akerkar
 
The Social Semantic Web
The Social Semantic WebThe Social Semantic Web
The Social Semantic WebJohn Breslin
 
Intelligent Information Agent
Intelligent Information AgentIntelligent Information Agent
Intelligent Information AgentShuvra Ghosh
 
Using Maltego Tungsten to Explore Cyber-Physical Confluence in Geolocation
Using Maltego Tungsten to Explore Cyber-Physical Confluence in GeolocationUsing Maltego Tungsten to Explore Cyber-Physical Confluence in Geolocation
Using Maltego Tungsten to Explore Cyber-Physical Confluence in GeolocationShalin Hai-Jew
 
Tutorial: Social Semantics
Tutorial: Social SemanticsTutorial: Social Semantics
Tutorial: Social SemanticsMatthew Rowe
 

What's hot (18)

Cert Overview
Cert OverviewCert Overview
Cert Overview
 
Threats_Report_2013
Threats_Report_2013Threats_Report_2013
Threats_Report_2013
 
Riding The Semantic Wave
Riding The Semantic WaveRiding The Semantic Wave
Riding The Semantic Wave
 
From Linked Documentary Resources to Linked Computational Resources
From Linked Documentary Resources to Linked Computational ResourcesFrom Linked Documentary Resources to Linked Computational Resources
From Linked Documentary Resources to Linked Computational Resources
 
Final Next Generation Content Management
Final    Next  Generation  Content  ManagementFinal    Next  Generation  Content  Management
Final Next Generation Content Management
 
A methodology for internal Web ethics
A methodology for internal Web ethicsA methodology for internal Web ethics
A methodology for internal Web ethics
 
Social Targeting: Understanding Social Media Data Mining & Analysis
Social Targeting: Understanding Social Media Data Mining & AnalysisSocial Targeting: Understanding Social Media Data Mining & Analysis
Social Targeting: Understanding Social Media Data Mining & Analysis
 
website, browser,Domain name, Email, Social networks,Ecommerce
website, browser,Domain name, Email, Social networks,Ecommercewebsite, browser,Domain name, Email, Social networks,Ecommerce
website, browser,Domain name, Email, Social networks,Ecommerce
 
Linked Data and Users in Library - Does the library communicate efficiently?
Linked Data and Users in Library - Does the library communicate efficiently?Linked Data and Users in Library - Does the library communicate efficiently?
Linked Data and Users in Library - Does the library communicate efficiently?
 
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel ApproachMining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...
 
Pragmatic Web 4.0
Pragmatic Web 4.0Pragmatic Web 4.0
Pragmatic Web 4.0
 
Privacy and the library patron: an ongoing ethical challenge
Privacy and the library patron: an ongoing ethical challengePrivacy and the library patron: an ongoing ethical challenge
Privacy and the library patron: an ongoing ethical challenge
 
Harvesting Intelligence from User Interactions
Harvesting Intelligence from User Interactions Harvesting Intelligence from User Interactions
Harvesting Intelligence from User Interactions
 
The Social Semantic Web
The Social Semantic WebThe Social Semantic Web
The Social Semantic Web
 
Intelligent Information Agent
Intelligent Information AgentIntelligent Information Agent
Intelligent Information Agent
 
Using Maltego Tungsten to Explore Cyber-Physical Confluence in Geolocation
Using Maltego Tungsten to Explore Cyber-Physical Confluence in GeolocationUsing Maltego Tungsten to Explore Cyber-Physical Confluence in Geolocation
Using Maltego Tungsten to Explore Cyber-Physical Confluence in Geolocation
 
Tutorial: Social Semantics
Tutorial: Social SemanticsTutorial: Social Semantics
Tutorial: Social Semantics
 

Viewers also liked

Viewers also liked (8)

Resource recommendation vs privacy enhancement
Resource recommendation vs privacy enhancementResource recommendation vs privacy enhancement
Resource recommendation vs privacy enhancement
 
Asa ceva numai in vise vezi
Asa ceva numai in vise veziAsa ceva numai in vise vezi
Asa ceva numai in vise vezi
 
Magazinul raiului
Magazinul raiului Magazinul raiului
Magazinul raiului
 
On line footprint
On line footprintOn line footprint
On line footprint
 
Mobilitapp
MobilitappMobilitapp
Mobilitapp
 
Excellent presentation on_cancer
Excellent presentation on_cancerExcellent presentation on_cancer
Excellent presentation on_cancer
 
Bull_MASP_3
Bull_MASP_3Bull_MASP_3
Bull_MASP_3
 
you_never_surf_alone
you_never_surf_aloneyou_never_surf_alone
you_never_surf_alone
 

Similar to Analysis, modelling and protection of online private data.

Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...ijtsrd
 
Channel model data2012
Channel model data2012Channel model data2012
Channel model data2012STIinnsbruck
 
F.S. Nucci - Search as an architectural component: searching for a new paradigm
F.S. Nucci - Search as an architectural component: searching for a new paradigmF.S. Nucci - Search as an architectural component: searching for a new paradigm
F.S. Nucci - Search as an architectural component: searching for a new paradigmFIA2010
 
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)paperpublications3
 
Empowerment Technologies Quarter 3 Module 1
Empowerment Technologies Quarter 3 Module 1Empowerment Technologies Quarter 3 Module 1
Empowerment Technologies Quarter 3 Module 1SheilaBungalan1
 
Privacy Management and the Social Web
Privacy Management and the Social Web Privacy Management and the Social Web
Privacy Management and the Social Web Jan Schmidt
 
Integration of Bayesian Theory and Association Rule Mining in Predicting User...
Integration of Bayesian Theory and Association Rule Mining in Predicting User...Integration of Bayesian Theory and Association Rule Mining in Predicting User...
Integration of Bayesian Theory and Association Rule Mining in Predicting User...Editor IJCATR
 
Integration of Bayesian Theory and Association Rule Mining in Predicting User...
Integration of Bayesian Theory and Association Rule Mining in Predicting User...Integration of Bayesian Theory and Association Rule Mining in Predicting User...
Integration of Bayesian Theory and Association Rule Mining in Predicting User...Editor IJCATR
 
Social media in hospitality
Social media in hospitalitySocial media in hospitality
Social media in hospitalityAnil Bilgihan
 
Groundhog day: near duplicate detection on twitter
Groundhog day: near duplicate detection on twitterGroundhog day: near duplicate detection on twitter
Groundhog day: near duplicate detection on twitterDan Nguyen
 
Knime social media_white_paper
Knime social media_white_paperKnime social media_white_paper
Knime social media_white_paperFiras Husseini
 
Scei technical whitepaper-19.06.2012
Scei technical whitepaper-19.06.2012Scei technical whitepaper-19.06.2012
Scei technical whitepaper-19.06.2012STIinnsbruck
 
Challenges and emerging practices for knowledge organization in the electron...
Challenges and emerging practices for knowledge  organization in the electron...Challenges and emerging practices for knowledge  organization in the electron...
Challenges and emerging practices for knowledge organization in the electron...Anil Mishra
 
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...Paul Gilbreath
 
The Web and the Collective Intelligence - How to use Collective Intelligence ...
The Web and the Collective Intelligence - How to use Collective Intelligence ...The Web and the Collective Intelligence - How to use Collective Intelligence ...
The Web and the Collective Intelligence - How to use Collective Intelligence ...Hélio Teixeira
 

Similar to Analysis, modelling and protection of online private data. (20)

Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
 
Channel model data2012
Channel model data2012Channel model data2012
Channel model data2012
 
F.S. Nucci - Search as an architectural component: searching for a new paradigm
F.S. Nucci - Search as an architectural component: searching for a new paradigmF.S. Nucci - Search as an architectural component: searching for a new paradigm
F.S. Nucci - Search as an architectural component: searching for a new paradigm
 
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
 
Empowerment Technologies Quarter 3 Module 1
Empowerment Technologies Quarter 3 Module 1Empowerment Technologies Quarter 3 Module 1
Empowerment Technologies Quarter 3 Module 1
 
Privacy Management and the Social Web
Privacy Management and the Social Web Privacy Management and the Social Web
Privacy Management and the Social Web
 
Integration of Bayesian Theory and Association Rule Mining in Predicting User...
Integration of Bayesian Theory and Association Rule Mining in Predicting User...Integration of Bayesian Theory and Association Rule Mining in Predicting User...
Integration of Bayesian Theory and Association Rule Mining in Predicting User...
 
Integration of Bayesian Theory and Association Rule Mining in Predicting User...
Integration of Bayesian Theory and Association Rule Mining in Predicting User...Integration of Bayesian Theory and Association Rule Mining in Predicting User...
Integration of Bayesian Theory and Association Rule Mining in Predicting User...
 
Social media in hospitality
Social media in hospitalitySocial media in hospitality
Social media in hospitality
 
Groundhog day: near duplicate detection on twitter
Groundhog day: near duplicate detection on twitterGroundhog day: near duplicate detection on twitter
Groundhog day: near duplicate detection on twitter
 
F017433947
F017433947F017433947
F017433947
 
8108-37744-1-PB.pdf
8108-37744-1-PB.pdf8108-37744-1-PB.pdf
8108-37744-1-PB.pdf
 
EMPO ICT.pptx
EMPO ICT.pptxEMPO ICT.pptx
EMPO ICT.pptx
 
RESEARCH PROPOSAL
RESEARCH PROPOSALRESEARCH PROPOSAL
RESEARCH PROPOSAL
 
Knime social media_white_paper
Knime social media_white_paperKnime social media_white_paper
Knime social media_white_paper
 
Scei technical whitepaper-19.06.2012
Scei technical whitepaper-19.06.2012Scei technical whitepaper-19.06.2012
Scei technical whitepaper-19.06.2012
 
Reveal - Social Media Verification - poster
Reveal - Social Media Verification - posterReveal - Social Media Verification - poster
Reveal - Social Media Verification - poster
 
Challenges and emerging practices for knowledge organization in the electron...
Challenges and emerging practices for knowledge  organization in the electron...Challenges and emerging practices for knowledge  organization in the electron...
Challenges and emerging practices for knowledge organization in the electron...
 
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
 
The Web and the Collective Intelligence - How to use Collective Intelligence ...
The Web and the Collective Intelligence - How to use Collective Intelligence ...The Web and the Collective Intelligence - How to use Collective Intelligence ...
The Web and the Collective Intelligence - How to use Collective Intelligence ...
 

Recently uploaded

Magic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMagic exist by Marta Loveguard - presentation.pptx
Magic exist by Marta Loveguard - presentation.pptxMartaLoveguard
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一
定制(AUT毕业证书)新西兰奥克兰理工大学毕业证成绩单原版一比一Fs
 
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一
定制(UAL学位证)英国伦敦艺术大学毕业证成绩单原版一比一Fs
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)Christopher H Felton
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012rehmti665
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITMgdsc13
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作ys8omjxb
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一z xss
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhimiss dipika
 
Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Uttam Nagar Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Blepharitis inflammation of eyelid symptoms cause everything included along w...
Blepharitis inflammation of eyelid symptoms cause everything included along w...Excelmac1
 
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一
定制(Lincoln毕业证书)新西兰林肯大学毕业证成绩单原版一比一Fs
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationLinaWolf1
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书zdzoqco
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Paul Calvano
 
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)
Packaging the Monolith - PHP Tek 2024 (Breaking it down one bite at a time)Dana Luther
 

Analysis, modelling and protection of online private data.

  • 1. Advisors: Dr. Jordi FORNÉ MUÑOZ Dr. David REBOLLO MONEDERO In partial fulfilment of the requirements for the degree of: Doctor of philosophy. Silvia Puglisi silvia.puglisi@upc.edu Analysis, modelling and protection of online private data.
  • 2. Agenda Background. Introduction and scope of the investigation. Objectives. Objectives of the investigation. Ongoing and future work. Publications and current research efforts.
  • 4. Online privacy Is Privacy the right to be forgotten? In 2011, the amount of digital information created and replicated globally exceeded 1.8 zettabytes (1.8 trillion gigabytes). 75% of this information is created by individuals through new media fora such as blogs and via social networks. By the end of 2011, Facebook had 845 million monthly active users, sharing over 30 billion pieces of content. Library Briefing - Library of the European Parliament - 01/03/2012
  • 5. What is online privacy anyway? In an online context, the right to privacy has commonly been interpreted as a right to “information self-determination”. Acts typically claimed to breach online privacy concern the collection of personal information without consent, the selling of personal information and the further processing of that information.
  • 6. Do we have online privacy?
  • 7. Do we have online privacy?
  • 8. Do we have online privacy? Irani, Danesh et al. [1] describe how personal information leaks on social networks can be used for concrete attacks. Acquisti, Alessandro, and Ralph Gross [2] also presented a method to infer people's Social Security numbers by using only publicly available information. Goga, Oana et al. [3] describe how user activity on one site can implicitly reveal their identity on another site. Chen, Terence et al. [4] showed a correlation between the amount and type of information revealed in social network profiles.
  • 9. The age of the “metadata” “Meta-data” is collected and stored by public and private organisations about where, when and by whom particular online content was created and accessed. In the private sphere it has been said that “literally, Google knows more about us than we can remember ourselves”. This situation has led to growing concerns regarding online privacy. In China, one estimate suggests there are over 30 000 government censors monitoring online information. Library Briefing - Library of the European Parliament - 01/03/2012
  • 10. Ex: Google Conversion Tracking <html> <body> <!-- Below is sample text link with a phone number. You need to replace the number with your own phone number and the CALL NOW text with the text you want to hyperlink. --> <a onclick="goog_report_conversion('tel:949-555-1234')" href="#" >CALL NOW</a> </body> </html> Some websites implement a Google forwarding number that measures the calls made by potential customers.
  • 11. Why does metadata matter? Metadata is more interesting than the actual information. Ex: ● They know you called the suicide prevention hotline. But the actual conversation remains secret. ● They know you checked HIV related websites, talked to an HIV testing service, then spoke to your doctor. But they don’t know what was discussed during the calls. Furthermore, Bizer, Christian et al. [5] have shown how websites already embed structured data in their HTML pages to describe products, services and events, and so make user information available, using markup standards such as Microformats, Microdata and RDFa.
  • 12. Hyperdata && Hypermedia Hyperdata indicates data objects linked to other data objects in other places, as hypertext indicates text linked to other text in other places. Hyperdata enables formation of a web of data, evolving from the "data on the Web" that is not inter-related (or at least, not linked). Hypermedia, an extension of the term hypertext, is a nonlinear medium of information which includes graphics, audio, video, plain text and hyperlinks. Source: Wikipedia
  • 13. What is REST? REST is an architectural style introduced by Roy Thomas Fielding in 2000 that has been at the core of web design and development. REST represents an abstraction over the actual architecture of the web. In REST, identification, representation and format are independent concepts. Specifically: A URI can identify a resource without knowing what formats the resource uses to exchange representations. Likewise, the protocols and representations used by the resource to communicate can be modified independently of the URI identifying the resource.
  • 15. REST Interfaces The uniformity of REST interfaces is built upon four guiding principles: ● The identification of resources through the URI mechanism. ● The manipulation of resources through their representations. ● The use of self-descriptive messages. ● The use of hypermedia as the engine of application state (HATEOAS).
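The four principles on the REST Interfaces slide can be illustrated with a minimal sketch. It assumes a hypothetical in-memory resource store in place of a real HTTP server; the URIs, the `_links` field and the link relation names (`footprint`, `owner`) are invented for illustration and do not come from the slides.

```python
# Hypothetical in-memory "server": each URI identifies one resource, and each
# representation is self-descriptive and carries links to related resources.
RESOURCES = {
    "/users/42": {
        "name": "alice",
        "_links": {
            "self": {"href": "/users/42"},
            "footprint": {"href": "/users/42/footprint"},
        },
    },
    "/users/42/footprint": {
        "items": ["tweet", "check-in"],
        "_links": {
            "self": {"href": "/users/42/footprint"},
            "owner": {"href": "/users/42"},
        },
    },
}


def get(uri):
    """Stand-in for an HTTP GET: return the representation stored at uri."""
    return RESOURCES[uri]


def follow(representation, rel):
    """Follow the link with relation `rel` advertised by a representation.

    The client never builds URIs itself; it only uses the links it was given,
    which is the idea behind hypermedia as the engine of application state.
    """
    return get(representation["_links"][rel]["href"])


user = get("/users/42")
footprint = follow(user, "footprint")
owner = follow(footprint, "owner")
```

Because the client only knows the entry URI and the relation names, the store could move the footprint to a different URI without breaking it.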
  • 16. Hypermedia and privacy protection Information self-determination is not even possible if users have no control over their online footprint. Hypermedia provides context over unstructured footprint information. Users and applications use REST interfaces to interact with one another, exchanging resource representations. The web follows REST principles and so do users’ online traces.
  • 17. Hypermedia and privacy protection Genc, Yegin et al. [6] introduce a method to map text messages into a wide context and, by computing the distance between them, classify their content. Ducheneaut, Nicolas et al. [7] explain how recommender systems need to incorporate contextual information from the physical world, as users move continuously and frequently engage in a variety of activities. Sakaki, Takeshi et al. [8] discuss how real-time interaction between online users and the offline world can be used to detect target events, turning the actual users into sensors themselves.
  • 19. Objective 1 Development of a hypermedia model of the user online footprint
  • 20. Objective 1 This hypermedia model of the user online footprint is constructed by analysing the different interactions that the user has online with various services and platforms. Hyperme is the proposed hyperdata model of a user's online footprint. The hyperme model links the user footprints created across different services, and the features associated with them, in a hypergraph. The user footprint is therefore transformed into an object that can be explored based on some desired features.
  • 21. Objective 1 Users stream private information towards devices, applications and platforms. This information is shared with groups of different people with distinct access rights. Private (?) information is only shared with service providers.
  • 22. Objective 1 The hyperme model captures different aspects of user activities online: ● Everything in the hyperme model is a signal. ● Signals can be easily profiled. ● Signals can be linked to each other. ● Footprints become objects that can be explored.
  • 23. Objective 1 The last two weeks of activity on Stephen Fry's Twitter account (@stephenfry) have been analysed.
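One way to picture the hypergraph described on this slide is the rough sketch below. It is not the hyperme implementation: the footprint records, service names and feature labels are hypothetical, and each feature is simply treated as a hyperedge connecting every footprint that carries it.

```python
from collections import defaultdict

# Hypothetical footprints: (footprint id, service, set of features).
footprints = [
    ("f1", "twitter", {"topic:privacy", "place:barcelona"}),
    ("f2", "foursquare", {"place:barcelona"}),
    ("f3", "blog", {"topic:privacy"}),
]


def build_hypergraph(records):
    """Map each feature (a hyperedge) to the set of footprint ids it connects."""
    edges = defaultdict(set)
    for footprint_id, _service, features in records:
        for feature in features:
            edges[feature].add(footprint_id)
    return dict(edges)


def explore(edges, feature):
    """Return the footprints reachable through one feature, in sorted order."""
    return sorted(edges.get(feature, set()))


def linked(edges, footprint_id):
    """Return footprints sharing at least one hyperedge with footprint_id."""
    neighbours = set()
    for members in edges.values():
        if footprint_id in members:
            neighbours |= members
    neighbours.discard(footprint_id)
    return sorted(neighbours)


edges = build_hypergraph(footprints)
same_place = explore(edges, "place:barcelona")
linked_to_tweet = linked(edges, "f1")
```

Here a single hyperedge can join any number of footprints across services, which an ordinary graph with pairwise edges would not express directly.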
  • 26. Objective 2 Analysis of data flows from social networks to third party advertisers
  • 27. Objective 2 The aim of this objective is to understand what data is leaked by third party advertising networks and how third party advertising networks and social platforms track users as they surf the web. The exchange of identity information is followed from the client to third party advertising platforms. Methods implemented by third party advertising networks are discovered and classified by analysing network requests (HTTP) and actual data flow (JavaScript calls). The mathematical distance between the user profile and the observed advertising profile is taken as a measure of how accurately third party platforms are tracking the user.
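The slide does not say which distance is used. As a sketch under that caveat, the example below assumes both profiles are histograms over interest categories and uses the total variation distance; the category names and sample data are hypothetical.

```python
from collections import Counter


def to_profile(categories):
    """Turn a list of observed categories into a relative-frequency histogram."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two histograms (0 = identical, 1 = disjoint)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical data: categories of pages the user visited vs. categories of ads shown.
user_profile = to_profile(["sports", "sports", "travel", "health"])
ad_profile = to_profile(["sports", "travel", "travel", "finance"])

distance = total_variation(user_profile, ad_profile)
```

Under this choice, a smaller distance would mean the advertising profile mirrors the user's interests more closely, that is, more accurate tracking. Other measures, such as a divergence between distributions, could be substituted without changing the structure of the sketch.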
  • 29. Objective 3 Evaluation of different PETs in Content Recommendation Systems
  • 30. Objective 3 The goal of this objective is the evaluation of different PETs in Content Recommendation Systems. Our aim is to show how a recommendation system is affected by the application of certain PETs by a part of the user population. Users may, in fact, wish to protect their privacy while also maintaining a satisfactory level of utility of the information received by the recommendation platform. Different levels of privacy protection are evaluated.
  • 31. Objective 4 Evaluation of different PETs to prevent information leaks on third party advertising networks
  • 32. Objective 4 The goal of this objective is the evaluation of different PETs to prevent third party advertising networks from pervasively tracking users through their browsing patterns and social platform profiles. In particular, we are concerned with understanding how third-party advertising networks can be prevented from accessing certain private data regarding the user.
  • 33. Objective 5 Extension of the hyperme model to cover aspects of location identity
  • 34. Objective 5 This objective aims at: Analysing the amount and extent of geographically tagged information shared through online activities. Establishing links between location information and spatial context. Evaluating different PETs to protect users’ location privacy.
  • 36. Ongoing and future work At the moment we are applying the hyperme hypermedia model to profile user activities online. We are especially concerned with answering the following questions: ● How is advertising influenced by online activities? ● To what extent does social network activity influence third party advertising? ● To what extent can mobile phone activity influence third party advertising? ● What PETs can be implemented to protect users’ privacy?
  • 37. Ongoing and future work We are collaborating with Dr. Markus Huber @ SBA Research (Vienna, Austria) on the following topics: ● Analysing the Alexa Top Million websites to compile statistics on the tracking services currently implemented. ● Testing current anti-tracking technologies to find out how effective they are. We are aiming to submit a paper to the 36th IEEE Symposium on Security and Privacy.
  • 38. Publications The following article was submitted to the journal Computer Standards & Interfaces, on the topic of content-based recommendation systems and privacy enhancing techniques: S. Puglisi, J. Parra-Arnau, D. Rebollo-Monedero and J. Forné, On Content-Based Recommendation and Users Privacy in Social Tagging Systems, Preprint submitted to Computer Standards & Interfaces, April, 2014. Submitted for publication.
  • 39. I grew up with the understanding that the world I lived in was one where people enjoyed a sort of freedom to communicate with each other in privacy, without it being monitored, without it being measured or analyzed or sort of judged by these shadowy figures or systems, any time they mention anything that travels across public lines. - Edward Snowden Thank you.
  • 40. References [1] D. Irani, S. Webb, and C. Pu, “Modeling unintended personal-information leakage from multiple online social networks,” IEEE Internet Computing, 2011. [2] A. Acquisti and R. Gross, “Predicting social security numbers from public data,” in Proceedings of the National Academy of Sciences, 2009. [3] O. Goga, H. Lei, S. H. K. Parthasarathi, G. Friedland, R. Sommer, and R. Teixeira, “Exploiting innocuous activity for correlating users across sites,” in Proceedings of the 22nd international conference on World Wide Web, 2013. [4] T. Chen, M. A. Kaafar, A. Friedman, and R. Boreli, “Is more always merrier? a deep dive into online social footprints,” in Proceedings of the 2012 ACM workshop on Workshop on online social networks, 2012. [5] C. Bizer, K. Eckert, R. Meusel, H. Mühleisen, M. Schuhmacher, and J. Völker, “Deployment of rdfa, microdata, and microformats on the web – a quantitative analysis,” in 12th International Semantic Web Conference, 21-25 October 2013, Sydney, Australia, In-Use track, 2013. [6] Y. Genc, Y. Sakamoto, and J. Nickerson, “Discovering context: Classifying tweets through a semantic transform based on wikipedia,” in Foundations of Augmented Cognition. Directing the Future of Adaptive Systems, ser. Lecture Notes in Computer Science, D. Schmorrow and C. Fidopiastis, Eds. Springer Berlin Heidelberg, 2011, vol. 6780, pp. 484–492. [7] N. Ducheneaut, K. Partridge, Q. Huang, B. Price, M. Roberts, E. Chi, V. Bellotti, and B. Begole, “Collaborative filtering is not enough? experiments with a mixed-model recommender for leisure activities,” in User Modeling, Adaptation, and Personalization, ser. Lecture Notes in Computer Science, G.-J. Houben, G. McCalla, F. Pianesi, and M. Zancanaro, Eds. Springer Berlin Heidelberg, 2009, vol. 5535, pp. 295–306. [8] T. Sakaki, M. Okazaki, and Y. Matsuo, “Earthquake shakes twitter users: real-time event detection by social sensors,” in Proceedings of the 19th international conference on World wide web. ACM, 2010, pp. 851–860.