OCEAN: Open-source Collation of
eGovernment data And Networks
Understanding Privacy Leaks in Open
Government Data
Srishti Gupta
Advisor: Dr. Ponnurangam Kumaraguru
M.Tech Thesis Defense
20-November-2013
Thesis Committee
 Dr. Muttukrishnan Rajarajan, City University,
London
 Dr. Vinayak Naik, IIIT-Delhi
 Dr. PK (Chair), IIIT-Delhi

2
Demo

3
Academic Honors
 Gupta, S., Gupta, M., and Kumaraguru, P. OCEAN: Open-source
Collation of eGovernment data And Networks. Poster
at Security and Privacy Symposium (SPS), IIT-K, 2013.
(Best Poster)

 Gupta, S., Gupta, M., and Kumaraguru, P. Is Government a
Friend or Foe? Privacy in Open Government Data. Poster at
IBM-ICARE, IISc Bangalore, 2012.

4
Recognition
IIITD Homepage [ Aug ’13 ]

Hindustan [ April ’13 ]

550 unique visitors (as of Nov 17, 2013)

5
Presentation Outline

Presentation Outline
 Research Motivation and Aim
 Related Work
 Research Contribution
 Methodology
 Experiments and Analysis
 Conclusion
 Future Work
 Questions

6
Research Motivation and Aim

Identity Theft: On the Rise!

7
Ways to get PII
OSN
E-mail, Docs,
Spreadsheet

Mail Thefts, Pharming
Shoulder Surfing
Dumpster Diving

Social Engineering
(e.g., Fake accounts)

 Not credible
 Limited Info.

Open Government Data Source
8
Research Motivation and Aim

Open Government Data Sources
 ‘Open’: Publicly available
 eGovernment initiatives by different state governments in
the form of databases / services.
 Objective?
 Improve information gathering procedure
 Reduce the burden on citizens to access their data

 Pros: Improved data availability, easy verification.
 Cons: Databases publicly available, leading to information
disclosure, privacy breach.
9
Information Leakage in Open
Government Data Sources ??

10
Research Motivation and Aim

PII Leakage

Voter ID, Name, Father’s name, Age, Gender, Date Of
Birth, DL number, PAN, Phone number

Personally Identifiable
Information (PII)
11
Research Motivation and Aim

The Other Side! “People’s View”

CONSCIOUS
DECISION !

(Kumaraguru, 2012)

12
Citizens do not want their PII to
be leaked !

13
Research Motivation and Aim

Research Aim
 To develop a technology to showcase publicly available
personal information online

 To highlight the privacy issues on aggregation of available
personal information

14
Presentation Outline

System Outline

Identification of
data sources

Data Extraction

Threat Modelling

Evaluation (Privacy
Score, Recall, SUS)

Information Aggregation

15
Presentation Outline

Presentation Outline
 Research Motivation and Aim
 Related Work
 Research Contribution
 Methodology
 Experiments and Analysis
 Conclusion
 Future Work
 Questions

16
Related Work and Research Contribution

Related Work
Yasni
(www.yasni.com)

17
Related Work and Research Contribution

Related Work
Pipl
(www.pipl.com)

18
Related Work and Research Contribution

Related Work
Various country-specific systems built with Open Government Data
Name | Country | Description
IndianKanoon (http://www.indiankanoon.org/) | India | Legal search engine; indexes judgements of the Supreme Court and several High Courts
OpenCivic.in (http://www.opencivic.in/) | India | Application Programming Interface; gives data about state assembly elections and profiles of MPs in Maharashtra
ABQ Ride (http://www.cabq.gov/abq-apps/city-appslisting/abq-ride) | USA | Real-time locations of city buses; fares for other public transportation
Illustreets (http://data.gov.uk/apps/illustreets) | UK | Compares locations; gives crime, education, transport and census data for a location

19
Related Work and Research Contribution

Research Gap
Indian Kanoon

Open
Government Data

Open Source Data
Aggregation

OCEAN

Yasni / Pipl

PII Leakage
20
Presentation Outline

Presentation Outline
 Research Motivation and Aim
 Related Work
 Research Contribution
 Methodology
 Experiments and Analysis
 Conclusion
 Future Work
 Questions

21
Related Work and Research Contribution

Research Contribution
 First deployed system which shows the aggregated personal
information about the residents of Delhi.
 Threat modelling on the various open government databases.
 Privacy Score: risk associated with a person based on the PII leaked about them.
 Empirical understanding of privacy perceptions, awareness and
expectations of the users from the open government data.

22
Presentation Outline

Presentation Outline
 Research Motivation and Aim
 Related Work
 Research Contribution
 Methodology
    Identification of open government data sources
    Threat Modelling
    Data Extraction
    Information Aggregation

 Experiments and Analysis
 Conclusion
 Future Work
 Questions
23
System Architecture

24
Presentation Outline

System Outline

Identification of
data sources

Data Extraction

Threat Modelling

Evaluation (Privacy
Score, Recall, SUS)

Information Aggregation

25
Methodology

Driving Licence
DL-XXYYYYAAAAAAA where
DL: state (Delhi), XX: location in Delhi, YYYY: year of issue of the
licence, AAAAAAA: unique number
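The layout above can be split into its fields with a regular expression. This is a minimal sketch: the slide does not say which characters each field may contain, so treating every position after the `DL-` prefix as a digit is an assumption, and `parse_dl_number` is a hypothetical helper name.

```python
import re

# Layout from the slide: "DL-" + XX (location) + YYYY (year of issue) + AAAAAAA (unique part).
# Assumption: all 13 characters after the prefix are digits.
DL_PATTERN = re.compile(r"DL-(?P<location>\d{2})(?P<year>\d{4})(?P<serial>\d{7})")


def parse_dl_number(value):
    """Return the location, year and serial fields, or None if the layout does not match."""
    match = DL_PATTERN.fullmatch(value.strip().upper())
    if match is None:
        return None
    return {
        "location": match.group("location"),
        "year": int(match.group("year")),
        "serial": match.group("serial"),
    }
```

The example value used to exercise this helper should be synthetic; no real licence number is needed to check the layout.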

26
Methodology

Voter ID
XXX12345678 where
X: a letter ‘A’ – ‘Z’; the last 8 characters are numerals

27
Methodology

PAN
XXXTL1234X where
XXX: ‘A’ – ‘Z’, T: Type of holder, L: First character of last-name,
1234: Sequential number, X: Check digit
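As an illustration of the PAN layout above, the sketch below splits a value into the five parts the slide names. It assumes the final check character is a letter (as the `X` in the layout suggests) and accepts any letter for the holder type; `parse_pan` and the field names are hypothetical.

```python
import re

# Layout from the slide: XXX (letters) + T (type of holder) + L (first character of
# last name) + 1234 (sequential number) + X (check character).
# Assumptions: T and the check character are single uppercase letters.
PAN_PATTERN = re.compile(
    r"(?P<prefix>[A-Z]{3})(?P<holder_type>[A-Z])(?P<name_initial>[A-Z])"
    r"(?P<sequence>\d{4})(?P<check>[A-Z])"
)


def parse_pan(value):
    """Return the named parts of a PAN-shaped string, or None if it does not fit the layout."""
    match = PAN_PATTERN.fullmatch(value.strip().upper())
    return match.groupdict() if match else None
```

A format check of this kind only confirms the shape of the string; it does not verify the check character or that the number was ever issued.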

28
Methodology

Online Social Networks
Name , Gender, Profile image, Profile url

Name , Followers / Following count, Location, Profile image, Profile
url
Name , Gender, Facebook / Twitter contact, Friend / Follower
count, Badge / Mayorship / Check-in count, Location, Profile
image, Profile url
Name , Location, Profile image, Profile url

Name , Gender, Relationship status, Location, Organization,
Birthday, E-mail, Language, Profile image, Profile url
29
Presentation Outline

System Outline

Identification of
data sources

Data Extraction

Threat Modelling

Evaluation (Privacy
Score, Recall, SUS)

Information Aggregation

30
Methodology

II. Threat Modelling
[Data flow diagram: a USER outside the TRUST BOUNDARY queries three OPEN GOVERNMENT DATA sources]
VOTER ROLLS: queried with Name, Constituency; returns Name, Address, Relation name, Age, Gender, Voter ID
DRIVING LICENSE: queried with Driving License number; returns Name, Address, Father’s name, Driving License no., DOB
PAN: queried with Name, DOB; returns Name, PAN
31
Research Motivation and Aim

Attack Scenario (I)
 Online Voter ID card – Multiple fake voter ID cards can be
created from the available PII

32
Research Motivation and Aim

Attack Scenario (II)
 View tax statements (Income tax e-filing) – Fake accounts
can be created to view TDS statements.

33
Research Motivation and Aim

Attack Scenario (III)
 Procure a SIM card / phone connection
 Fake documents can be created

 Credit / debit cards can be applied in victim’s name
 Networking accounts can be created

34
Methodology

II. Threat Modelling
DREAD Model: Microsoft’s Risk Assessment Model
Term | Remarks
Damage | How big would the damage be if the attack succeeded?
Reproducibility | How easy is it to reproduce the attack?
Exploitability | How much time, effort, and expertise is needed to exploit the threat?
Affected Users | If a threat were exploited, what percentage of users would be affected?
Discoverability | How easy is it for an attacker to discover this threat?

35
Methodology

II. Threat Modelling
Scheme: High (3), Medium (2), Low (1)
Threat: Malicious user can identify PII of Delhi residents

[Threat modelling: http://msdn.microsoft.com/en-us/library/ff648644.aspx]

36
Methodology

II. Threat Modelling
According to Microsoft’s DREAD model,
Range | Level of risk
5 – 7 | Low
8 – 11 | Medium
12 – 15 | High

In our case,
Overall rating = 2 + 3 + 2 + 3 + 3 = 13 (High)

This means that the threat poses a significant risk to the
various information portal websites of the Delhi government
and needs to be addressed as soon as possible!
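The DREAD arithmetic above can be sketched in a few lines. The function names are hypothetical, and the order in which the five ratings 2, 3, 2, 3, 3 map to Damage, Reproducibility, Exploitability, Affected Users and Discoverability is an assumption, since the per-category table is not reproduced in this transcript.

```python
def dread_score(damage, reproducibility, exploitability, affected_users, discoverability):
    """Sum five DREAD ratings, each on the High (3) / Medium (2) / Low (1) scheme."""
    ratings = (damage, reproducibility, exploitability, affected_users, discoverability)
    for rating in ratings:
        if rating not in (1, 2, 3):
            raise ValueError("each rating must be 1 (Low), 2 (Medium) or 3 (High)")
    return sum(ratings)


def risk_level(total):
    """Map an overall rating to the risk bands listed on the slide."""
    if 5 <= total <= 7:
        return "Low"
    if 8 <= total <= 11:
        return "Medium"
    if 12 <= total <= 15:
        return "High"
    raise ValueError("overall rating must be between 5 and 15")


# Ratings from the slide (category order assumed).
overall = dread_score(2, 3, 2, 3, 3)
print(overall, risk_level(overall))
```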
37
Presentation Outline

System Outline

Identification of
data sources

Data Extraction

Threat Modelling

Evaluation (Privacy
Score, Recall, SUS)

Information Aggregation

38
Methodology

III. Data Extraction
Data was collected from various open government data sources using
PHP scripts and stored as MySQL databases.

OPEN GOVT. WEBSITES
VOTER [81,95,053]: alphabets a-z for name, across 70 constituencies
DRIVING LICENCE [2,24,982]: 5 random seeds, ‘incremental attack’
PAN [53,419]: name and DOB from DL

39
Methodology

III. Data Extraction
 Public data from various online social networking sites was
collected using public API calls.
 OAuth tokens were used for authentication and authorization.

[Diagram: UNIQUE NAME → API CALLS → the networks below, with record counts]
FACEBOOK [33,77,102]
TWITTER [15,57,715]
FOURSQUARE [29,393]
GOOGLEPLUS [28,900]
LINKEDIN [1,86,798]

40
Presentation Outline

System Outline

Identification of
data sources

Data Extraction

Threat Modelling

Evaluation (Privacy
Score, Recall, SUS)

Information Aggregation

41
Methodology

IV. Information Aggregation
 Family Tree
 Information within Voter ID database aggregated to find
relationships among records.
 OCEAN has 3,90,353 such users.

42
Methodology

IV. Information Aggregation
 Mapping of users across Voter ID and Driving licence database.
Table Schema:
Database | Attributes
Voter ID | Voter ID, Name, Address, Father's / Mother's / Husband's name, Age, Gender
Driving Licence | Name, Address, Father's name, DOB, Validity period, vehicle category

 Done on the basis of similarity between the name, relation name and
address of the users across the two databases.
 OCEAN has 6,384 such users.
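The slides do not say which similarity measure or threshold was used for this mapping. As one possible sketch, the standard-library `difflib.SequenceMatcher` can compare the three fields and accept a pair only when every field is similar enough. The field names, the `0.85` threshold and the function names are all assumptions for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_probable_match(voter, licence, threshold=0.85):
    """Compare name, relation name and address; all three must reach the threshold.

    `voter` and `licence` are dicts with hypothetical keys; the threshold is illustrative.
    """
    field_pairs = (
        ("name", "name"),
        ("relation_name", "father_name"),
        ("address", "address"),
    )
    scores = [similarity(voter[v_key], licence[l_key]) for v_key, l_key in field_pairs]
    return min(scores) >= threshold
```

Requiring every field to pass, rather than averaging the scores, keeps a strong name match from hiding a mismatched address; whether OCEAN made the same choice is not stated.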
43
Methodology

IV. Information Aggregation

Challenge: Address formats differ across sources
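One common way to cope with differing address formats is to normalise each address into a set of tokens before comparing. The sketch below is an illustration only, not the method used in OCEAN: the abbreviation table is hypothetical and far smaller than real data would need, and Jaccard overlap is a swapped-in technique chosen for simplicity.

```python
import re

# Hypothetical abbreviation table (assumption); extend as the data requires.
ABBREVIATIONS = {"rd": "road", "st": "street", "blk": "block", "apt": "apartment"}


def address_tokens(address):
    """Lower-case the address, drop punctuation and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    return {ABBREVIATIONS.get(token, token) for token in tokens}


def address_similarity(a, b):
    """Jaccard overlap of the two token sets, in [0, 1]; 0.0 if either address is empty."""
    tokens_a, tokens_b = address_tokens(a), address_tokens(b)
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

Because the comparison works on sets, it ignores word order, which helps when one source writes the block before the street and another after it.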
44
Methodology

IV. Information Aggregation
 Mapping of users across Voter ID, Driving licence and PAN
database.

 The subset of DL records having a PAN was chosen.

 OCEAN has 1,693 such users.

45
Methodology

IV. Information Aggregation
 Mapping users across Foursquare, Facebook and Twitter.

 Some users specify their other OSN’s contact on Foursquare. The
information available from such users is aggregated together.

 OCEAN has 11 such users

46
Methodology

IV. Information Aggregation

Challenge: Not many users link their OSN accounts
47
Presentation Outline

Presentation Outline
 Research Motivation and Aim
 Related Work and Research Contribution
 Methodology
 System User Interface
 Experiments and Analysis
 Conclusion
 Future Work
 Questions

48
Presentation Outline

System Outline

Identification of
data sources

Data Extraction

Threat Modelling

Evaluation (Privacy Score,
Recall, SUS)

Information Aggregation

49
Experiments and Analysis

Survey Dataset
 62 complete responses.
 51% males, 49% females.
 77% in the age group 20 – 25.
 23% had experienced online identity theft themselves or through friends.

50
Experiments and Analysis

Evaluation Metric I - Privacy Score
 Privacy score measures the risk associated with a person on the
basis of how much PII about that person is revealed from open
government data sources.
 Privacy score (user) = Σ Sensitivity score (attributes)
 Sensitivity score -> {1, 2, 3, 4, 5}
Range (% of respondents unwilling to share) | Level
<20 % | 1
21 – 30 % | 2
31 – 50 % | 3
51 – 60 % | 4
>61 % | 5
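The banding above can be written as a small lookup. The slide leaves the edges between bands unspecified (for example exactly 20 %, or values between 60 % and 61 %), so this sketch assigns each boundary value to the lower band; that choice and the function name are assumptions.

```python
def sensitivity_level(percent_unwilling):
    """Map the share of respondents unwilling to share an attribute to a level from 1 to 5.

    Assumption: boundary values fall into the lower band (20 -> 1, 30 -> 2, 50 -> 3, 60 -> 4).
    """
    if not 0 <= percent_unwilling <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_unwilling <= 20:
        return 1
    if percent_unwilling <= 30:
        return 2
    if percent_unwilling <= 50:
        return 3
    if percent_unwilling <= 60:
        return 4
    return 5
```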
51
Experiments and Analysis

Privacy Score
Attribute | Percentage of users unwilling to share personal information with anyone | Privacy Level
Voter ID | 56.4% | 4
Driving licence no. | 58% | 4
PAN | 67.7% | 5
Full name | 14.5% | 1
Home address | 82.25% | 5
Age | 29% | 2
DOB | 50% | 3
Father’s name | 38.7% | 3
Gender | 14.5% | 1
[Scale: Level 5 → Level 1, willingness to share]
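Given the levels in this table, the privacy score of a record is the sum of the levels of the attributes it exposes. The sketch below uses hypothetical key names and shows the arithmetic for a record exposing Voter ID, name, father's name, age, gender and address. Totals reported elsewhere in the deck may rest on attribute sets not shown here, so only this one sum is illustrated.

```python
# Privacy levels copied from the table; the dictionary keys are hypothetical names.
SENSITIVITY = {
    "voter_id": 4,
    "dl_number": 4,
    "pan": 5,
    "full_name": 1,
    "home_address": 5,
    "age": 2,
    "dob": 3,
    "fathers_name": 3,
    "gender": 1,
}


def privacy_score(attributes):
    """Sum the sensitivity levels of the exposed attributes, counting each attribute once."""
    return sum(SENSITIVITY[attribute] for attribute in set(attributes))


voter_only = ["voter_id", "full_name", "fathers_name", "age", "gender", "home_address"]
print(privacy_score(voter_only))  # 4 + 1 + 3 + 2 + 1 + 5 = 16
```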
52
Experiments and Analysis

Privacy Score
Privacy score for 84,22,459 users:
 Case 1: Users having only Voter ID (97.3%)
PS = Σ(Voter ID, name, father’s name, age, gender, address) = 16

 Case 2: Users having only Driving licence number (2%)
PS = Σ(DL number, name, relative’s name, DOB, address) = 17

 Case 3: Users having only PAN (1%)
PS = Σ(PAN, DL number, name, relative’s name, DOB, address) = 25

53
Experiments and Analysis

Privacy Score
 Case 4: Users having Voter ID and DL number (0.07%)
PS = Σ(Voter ID, DL number, name, father’s name, age, gender, DOB,
address) = 24

 Case 5: Users having Voter ID, DL number and PAN (0.02%)
PS = Σ(Voter ID, DL number, PAN, name, father’s name, age, gender,
DOB, address) = 29

1,693 people
Highest Risk!

54
Evaluation Metrics

Evaluation Metric II
 Recall (Based on user study)
Recall = (Number of people who could be identified in the system) / (Total number of search operations done on the system)

Thus, Recall = ( 179 / 389 ) = 46%

Low Recall
 Data collection not 100%.
(Out of 12 million voter records, we have ~8 million records)

 Respondents might be unclear about constituency.
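The recall figure follows directly from the definition used in this deck (identified people divided by search operations). A minimal sketch with a hypothetical function name:

```python
def recall(identified, total_searches):
    """Fraction of search operations in which the person could be identified."""
    if total_searches <= 0:
        raise ValueError("total_searches must be positive")
    if not 0 <= identified <= total_searches:
        raise ValueError("identified must be between 0 and total_searches")
    return identified / total_searches


# Figures from the user study reported on the slide.
print(f"{recall(179, 389):.0%}")
```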
55
Evaluation Metrics

Evaluation Metric III
 System Usability Score (SUS)
Measured using the standard method defined by Brooke.
For OCEAN, the value was 74.5 / 100, which means that people found the
system usable and convenient to use.

(Brooke, 1996)
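For reference, the standard SUS calculation takes ten answers on a 1 to 5 scale: odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5. The sketch below scores one respondent; that the reported 74.5 is an average over respondents is an assumption, and the function name is hypothetical.

```python
def sus_score(responses):
    """Score one respondent's ten SUS answers (each 1-5, in questionnaire order)."""
    if len(responses) != 10 or any(answer not in (1, 2, 3, 4, 5) for answer in responses):
        raise ValueError("expected ten answers, each an integer from 1 to 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            total += answer - 1  # items 1, 3, 5, 7, 9 are positively worded
        else:
            total += 5 - answer  # items 2, 4, 6, 8, 10 are negatively worded
    return total * 2.5
```

A respondent who answers 3 to every item scores 50, the midpoint of the scale.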

56
Experiments and Analysis

User Awareness
 The government started various open initiatives to increase
the level of transparency with citizens.
 However, only 19% of survey respondents were aware of them.
 Around 76% have been using these for less than 2 years.

 Proper schemes are required to publicize their existence.

57
Experiments and Analysis

User Experience
 A majority (62%) were shocked to see the availability of
personal information to this extent.

 People felt that the information can be used maliciously
against them.

 People now feel scared of sharing their information with
various government departments.

58
Experiments and Analysis

User Expectations

59
Feedback

Feedback
“It was an eye-opener
to a common man.”

“Waiting for an
upgraded version
which will work for
other states also.”

“I am really shocked
that the exact ID
numbers are available
online without much
security against data
mining at this scale.”

“A great shortcoming
and security flaw has
been pointed out by
OCEAN. Great work.”

“Good system. Great
work ! Didn't know
such a system existed.”

60
Presentation Outline

Presentation Outline
 Research Motivation and Aim
 Related Work
 Research Contribution
 Methodology
 Experiments and Analysis
 Conclusion
 Future Work
 Questions

61
Conclusion

Conclusion
 A large amount of personal information is available on
government servers.
 Information aggregation yields more information about a
person.
 Threat modelling on open government data sources shows the risk
associated with PII leakage and the need for preventive measures.
 1,693 users are most vulnerable to identity theft.
 People felt the need for access control on the data and proper
privacy laws against the misuse of information.
62
Presentation Outline

Presentation Outline
 Research Motivation and Aim
 Related Work
 Research Contribution
 Methodology
 Experiments and Analysis
 Conclusion
 Future Work
 Questions

63
Future Work

Future Work
 Datasets can be extended to other states in India.

 Mapping users across offline (govt. databases) and online
(social networking sites) worlds.

 Data collection can be expanded to improve the recall.

64
Future Work

Acknowledgments
Mayank Gupta, B.Tech, DCE
Niharika Sachdeva, PhD, IIIT-Delhi
Precog members, friends and family

65
References
 Kumaraguru, P., and Sachdeva, N. Privacy in India: Attitudes and
Awareness V 2.0. Tech. rep., PreCog-TR-12-001, PreCog@IIIT-Delhi,
2012. http://precog.iiitd.edu.in/research/privacyindia/
 McCallister, Erika, Tim Grance, and Karen Scarfone. "Guide to
protecting the confidentiality of personally identifiable information
(PII) (draft), January 2009." NIST Special Publication 800-122.
 Schwartz, Paul M., and Daniel J. Solove. "PII Problem: Privacy and a
New Concept of Personally Identifiable Information, The." NYUL Rev. 86
(2011): 1814.
 Mont, Marco Casassa, Siani Pearson, and Pete Bramhall. "Towards
accountable management of identity and privacy: Sticky policies and
enforceable tracing services." Database and Expert Systems
Applications, 2003. Proceedings. 14th International Workshop on. IEEE,
2003.
 Jones, Rosie, et al. "I know what you did last summer: query logs and
user privacy." Proceedings of the sixteenth ACM conference on
Conference on information and knowledge management. ACM, 2007
66
References (I)
 Nashash, Hyam. "EDUCATION AS A BUILDING BLOCK IN OPENING UP
GOVERNMENT DATA." European Scientific Journal 9.13 (2013).
 Barber, Grayson. "Personal Information in Government Records:
Protecting the Public Interest in Privacy." St. Louis U. Pub. L. Rev. 25
(2006): 63.
 Krishnamurthy, Balachander, and Craig E. Wills. "On the leakage of
personally identifiable information via online social networks."
Proceedings of the 2nd ACM workshop on Online social networks.
ACM, 2009.
 Jurgens, David. "That’s What Friends Are For: Inferring Location in
Online Social Media Platforms Based on Social Relationships." Seventh
International AAAI Conference on Weblogs and Social Media. 2013.
 Zheleva, Elena, and Lise Getoor. "To join or not to join: the illusion of
privacy in social networks with mixed public and private user profiles."
Proceedings of the 18th international conference on World wide web.
ACM, 2009.

67
References (II)
 Mislove, Alan, et al. "You are who you know: inferring user profiles in
online social networks." Proceedings of the third ACM international
conference on Web search and data mining. ACM, 2010.
 Harel, Amir, et al. "M-score: estimating the potential damage of data
leakage incident by assigning misuseability weight." Proceedings of the
2010 ACM workshop on Insider threats. ACM, 2010.

 Wright, Glover, Pranesh Prakash, Sunil Abraham, and Nishant Shah.
"Open government data study: India." Study commissioned by the
Transparency and Accountability Initiative (2010).
 Godse, Mr Vinayak, and Director–Data Protection. "RISE PROJECT."
(2010).
 Brooke, John. "SUS-A quick and dirty
usability scale." Usability evaluation in industry 189 (1996): 194.
 Social media report 2012: Social media comes of age.
http://www.nielsen.com/us/en/reports/2012/state-of-the-media-thesocial-media-report-2012.html
68
Thank You!

69
Questions?

70
For any further information,
please write to

pk@iiitd.ac.in
precog.iiitd.edu.in

71

 
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasResponsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
 
Identify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIdentify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake News
 
#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI
 
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyData Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityBeyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
 
How to Write a (Good) Research Paper
How to Write a (Good) Research Paper How to Write a (Good) Research Paper
How to Write a (Good) Research Paper
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasData Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBias
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in India
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in India
 
Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...
 
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayPrivacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial Advice
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...
 
A Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesA Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian Languages
 

Último

Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 

OCEAN: Open-source Collation of eGovernment data And Networks: Understanding Privacy Leaks in Open Government Data

  • 1. OCEAN: Open-source Collation of eGovernment data And Networks. Understanding Privacy Leaks in Open Government Data. Srishti Gupta. Advisor: Dr. Ponnurangam Kumaraguru. M.Tech Thesis Defense, 20 November 2013.
  • 2. Thesis Committee: Dr. Muttukrishnan Rajarajan, City University, London; Dr. Vinayak Naik, IIIT-Delhi; Dr. PK (Chair), IIIT-Delhi.
  • 4. Academic Honors: (1) Gupta, S., Gupta, M., and Kumaraguru, P. OCEAN: Open-source Collation of eGovernment data And Networks. Poster at Security and Privacy Symposium (SPS), IIT-K, 2013 (Best Poster). (2) Gupta, S., Gupta, M., and Kumaraguru, P. Is Government a Friend or Foe? Privacy in Open Government Data. Poster at IBM-ICARE, IISc Bangalore, 2012.
  • 5. Recognition: IIITD Homepage [Aug ’13]; Hindustan [April ’13]; 550 unique visitors (as of Nov 17, 2013).
  • 6. Presentation Outline: Research Motivation and Aim; Related Work; Research Contribution; Methodology; Experiments and Analysis; Conclusion; Future Work; Questions.
  • 7. Research Motivation and Aim: Identity theft is on the rise!
  • 8. Ways to get PII: OSN; e-mail, docs, spreadsheets; mail theft, pharming; shoulder surfing; dumpster diving; social engineering (e.g., fake accounts). Not credible; limited information. Open Government Data Source.
  • 9. Research Motivation and Aim: Open Government Data Sources. ‘Open’: publicly available. eGovernment initiatives by different state governments, in the form of databases / services. Objective: improve the information-gathering procedure; reduce the burden on citizens to access their data. Pros: improved data availability, easy verification. Cons: databases are publicly available, leading to information disclosure and privacy breaches.
  • 10. Information leakage in open government data sources?
  • 11. Research Motivation and Aim: PII Leakage. Personally Identifiable Information (PII): Voter ID, name, father’s name, age, gender, date of birth, DL number, PAN, phone number.
  • 12. Research Motivation and Aim: The Other Side! “People’s View”: CONSCIOUS DECISION! (Kumaraguru, 2012)
  • 13. Citizens do not want their PII to be leaked!
  • 14. Research Motivation and Aim: Research Aim. To develop a technology to showcase publicly available personal information online; to highlight the privacy issues that arise when available personal information is aggregated.
  • 15. System Outline: Identification of data sources; Threat Modelling; Data Extraction; Information Aggregation; Evaluation (Privacy Score, Recall, SUS).
  • 17. Related Work and Research Contribution: Related Work. Yasni (www.yasni.com).
  • 18. Related Work and Research Contribution: Related Work. Pipl (www.pipl.com).
  • 19. Related Work and Research Contribution: Related Work. Various country-specific systems built with Open Government Data (Name, Country: Description). IndianKanoon (http://www.indiankanoon.org/), India: legal search engine; indexes judgements of the Supreme Court and several High Courts. OpenCivic.in (http://www.opencivic.in/), India: Application Programming Interface; gives data about state assembly elections and profiles of MPs in Maharashtra. ABQ Ride (http://www.cabq.gov/abq-apps/city-appslisting/abq-ride), USA: real-time locations of city buses; fares for other public transportation. Illustreets (http://data.gov.uk/apps/illustreets), UK: comparing locations; gives crime, education, transport and census data for a location.
  • 20. Related Work and Research Contribution: Research Gap (diagram labels: Indian Kanoon; Open Government Data; Open Source Data Aggregation; OCEAN; Yasni / Pipl; PII Leakage).
  • 22. Related Work and Research Contribution: Research Contribution. First deployed system that shows aggregated personal information about the residents of Delhi. Threat modelling of the various open government databases. Privacy Score: the risk to a person from the PII leaked about them. Empirical understanding of users’ privacy perceptions, awareness and expectations regarding open government data.
  • 23. Presentation Outline: Research Motivation and Aim; Related Work; Research Contribution; Methodology (Identification of open government data sources; Threat Modelling; Data Extraction; Information Aggregation); Experiments and Analysis; Conclusion; Future Work; Questions.
  • 26. Methodology: Driving Licence. Format DL-XXYYYYAAAAAAA, where DL: state (Delhi); XX: location in Delhi; YYYY: year of issue of the licence; AAAAAAA: unique number.
  • 27. Methodology: Voter ID. Format XXX12345678, where each X is a letter ‘A’–‘Z’ and the last 8 characters are numerals.
  • 28. Methodology: PAN. Format XXXTL1234X, where XXX: letters ‘A’–‘Z’; T: type of holder; L: first character of last name; 1234: sequential number; final X: check digit.
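The three identifier formats on slides 26 to 28 can be written down as regular expressions. The sketch below is illustrative only and is not taken from OCEAN: the function name `classify_identifier` is hypothetical, the location and unique parts of the licence number are assumed to be digits, and the final PAN character is assumed to be a letter even though the slide calls it a check digit. The sample values used with it are made up.

```python
import re

# Patterns transcribed from the slide descriptions. Character classes that the
# slides leave open (DL location/unique part, PAN check character) are assumptions.
DL_PATTERN = re.compile(r"DL-(?P<location>\d{2})(?P<year>\d{4})(?P<unique>\d{7})")
VOTER_ID_PATTERN = re.compile(r"[A-Z]{3}\d{8}")
PAN_PATTERN = re.compile(
    r"[A-Z]{3}(?P<holder_type>[A-Z])(?P<surname_initial>[A-Z])(?P<sequence>\d{4})[A-Z]"
)


def classify_identifier(value: str) -> str:
    """Return which of the three slide formats a string matches, if any."""
    text = value.strip().upper()
    if DL_PATTERN.fullmatch(text):
        return "driving_licence"
    if VOTER_ID_PATTERN.fullmatch(text):
        return "voter_id"
    if PAN_PATTERN.fullmatch(text):
        return "pan"
    return "unknown"
```

Named groups make the structured parts (year of issue, holder type) available through `match.group(...)` without further slicing.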
  • 29. Methodology: Online Social Networks. Attributes collected, one set per network: (1) name, gender, profile image, profile URL; (2) name, followers / following count, location, profile image, profile URL; (3) name, gender, Facebook / Twitter contact, friend / follower count, badge / mayorship / check-in count, location, profile image, profile URL; (4) name, location, profile image, profile URL; (5) name, gender, relationship status, location, organization, birthday, e-mail, language, profile image, profile URL.
  • 31. Methodology: II. Threat Modelling (data-flow diagram across the TRUST BOUNDARY between USER and OPEN GOVERNMENT DATA). VOTER ROLLS: name, constituency in; name, address, relation name, age, gender, Voter ID out. DRIVING LICENSE: driving licence number in; name, address, father’s name, driving licence no., DOB out. PAN: name, DOB in; name, PAN out.
  • 32. Research Motivation and Aim: Attack Scenario (I). Online Voter ID card: multiple fake voter ID cards can be created from the available PII.
  • 33. Research Motivation and Aim: Attack Scenario (II). View tax statements (income tax e-filing): fake accounts can be created to view TDS statements.
  • 34. Research Motivation and Aim: Attack Scenario (III). Procure a SIM card / phone connection; fake documents can be created; credit / debit cards can be applied for in the victim’s name; networking accounts can be created.
  • 35. Methodology: II. Threat Modelling. DREAD model, Microsoft’s risk assessment model (Term: Remarks). Damage: how big would the damage be if the attack succeeded? Reproducibility: how easy is it to reproduce the attack? Exploitability: how much time, effort, and expertise is needed to exploit the threat? Affected Users: if a threat were exploited, what percentage of users would be affected? Discoverability: how easy is it for an attacker to discover this threat?
  • 36. Methodology: II. Threat Modelling. Scheme: High (3), Medium (2), Low (1). Threat: a malicious user can identify PII of Delhi residents. [Threat modelling: http://msdn.microsoft.com/en-us/library/ff648644.aspx]
  • 37. Methodology: II. Threat Modelling. According to Microsoft’s DREAD model (Range: Level of risk): 5–7: Low; 8–11: Medium; 12–15: High. In our case, overall rating = 2 + 3 + 2 + 3 + 3 = 13 (High). This means that the threat poses a significant risk to the various information portal websites of the Delhi government and needs to be addressed as soon as possible!
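The DREAD arithmetic on slides 35 to 37 can be sketched in a few lines: five ratings on the 1 to 3 scheme are summed and the total is mapped onto the Low / Medium / High ranges. The class and function names are hypothetical, and assigning the slide's ratings 2, 3, 2, 3, 3 to the categories in D-R-E-A-D order is an assumption, since the transcript does not preserve the rating table.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class DreadRating:
    """One rating per DREAD category, each High (3), Medium (2) or Low (1)."""
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def total(self) -> int:
        scores = astuple(self)
        if any(score not in (1, 2, 3) for score in scores):
            raise ValueError("each DREAD rating must be 1, 2 or 3")
        return sum(scores)


def risk_level(total: int) -> str:
    """Map an overall rating onto the ranges given on slide 37."""
    if 5 <= total <= 7:
        return "Low"
    if 8 <= total <= 11:
        return "Medium"
    if 12 <= total <= 15:
        return "High"
    raise ValueError("a DREAD total must lie between 5 and 15")


# Ratings from the slide; the order of assignment is assumed.
ocean_threat = DreadRating(2, 3, 2, 3, 3)
print(ocean_threat.total(), risk_level(ocean_threat.total()))
```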
  • 39. Methodology: III. Data Extraction. Data was collected from various open government data sources using PHP scripts and stored in MySQL databases. Open govt. websites: VOTER [81,95,053]: alphabets a–z for name, across 70 constituencies; DRIVING LICENCE [2,24,982]: random 5 seeds, ‘incremental attack’; PAN [53,419]: name and DOB from DL.
  • 40. Methodology: III. Data Extraction. Public data from various online social networking sites was collected using public API calls. OAuth tokens were used for authentication and authorization. Diagram labels: UNIQUE NAME; API CALLS; FACEBOOK [33,77,102]; TWITTER [15,57,715]; FOURSQUARE [29,393]; GOOGLEPLUS [28,900]; LINKEDIN [1,86,798].
  • 42. Methodology: IV. Information Aggregation. Family tree: information within the Voter ID database is aggregated to find relationships among records. OCEAN has 3,90,353 such users.
  • 43. Methodology: IV. Information Aggregation. Mapping of users across the Voter ID and Driving licence databases. Table schema (Database: Attributes). Voter ID: Voter ID, name, address, father's / mother's / husband's name, age, gender. Driving Licence: name, address, father's name, DOB, validity period, vehicle category. Mapping is done on the basis of similarity between the name, relation name and address of users across the databases. OCEAN has 6,384 such users.
  • 44. Methodology: IV. Information Aggregation. Challenge: the address formats of the various sources differ.
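The slides do not say which similarity measure was used for the mapping on slides 43 and 44. As a minimal sketch of the idea, the example below normalizes the fields to soften differences in address formatting and then compares them with `difflib.SequenceMatcher` from the Python standard library. The field names, the 0.8 threshold and the sample records are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lower-case, turn punctuation into spaces and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def records_match(voter: dict, licence: dict, threshold: float = 0.8) -> bool:
    """Treat two records as the same person only if every field is similar enough."""
    scores = (
        similarity(voter["name"], licence["name"]),
        similarity(voter["relation_name"], licence["father_name"]),
        similarity(voter["address"], licence["address"]),
    )
    return min(scores) >= threshold


# Made-up records that differ only in punctuation and letter case.
voter = {"name": "A. Kumar", "relation_name": "B. Kumar",
         "address": "H.No. 12, Block C, New Delhi"}
licence = {"name": "a kumar", "father_name": "B Kumar",
           "address": "h no 12 block c new delhi"}
print(records_match(voter, licence))
```

Requiring the minimum of the three scores to clear the threshold is a conservative choice: a strong name match alone is not enough to link two records.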
  • 45. Methodology: IV. Information Aggregation. Mapping of users across the Voter ID, Driving licence and PAN databases. The subset of DL records having a PAN was chosen. OCEAN has 1,693 such users.
  • 46. Methodology: IV. Information Aggregation. Mapping users across Foursquare, Facebook and Twitter. Some users specify their other OSN contacts on Foursquare; the information available from such users is aggregated. OCEAN has 11 such users.
  • 47. Methodology: IV. Information Aggregation. Challenge: not many users link their OSN accounts.
  • 48. Presentation Outline: Research Motivation and Aim; Related Work and Research Contribution; Methodology; System User Interface; Experiments and Analysis; Conclusion; Future Work; Questions.
  • 50. Experiments and Analysis: Survey Dataset. 62 complete responses; 51% males, 49% females; 77% in the age group 20–25; 23% had experienced online identity theft themselves or through friends.
  • 51. Experiments and Analysis: Evaluation Metric I, Privacy Score. The privacy score measures the risk associated with a person on the basis of how much PII about that person is revealed by open government data sources. Privacy score (user) = Σ sensitivity score (attributes), where each sensitivity score is in {1, 2, 3, 4, 5}. Range: Level. <20%: 1; 21–30%: 2; 31–50%: 3; 51–60%: 4; >61%: 5.
  • 52. Experiments and Analysis: Privacy Score (Attribute: percentage of users unwilling to share personal information with anyone, privacy level). Voter ID: 56.4%, 4. Driving licence no.: 58%, 4. PAN: 67.7%, 5. Full name: 14.5%, 1. Home address: 82.25%, 5. Age: 29%, 2. DOB: 50%, 3. Father’s name: 38.7%, 3. Gender: 14.5%, 1. Scale: level 5 (least willing to share) to level 1 (most willing to share).
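The mapping from survey percentages to sensitivity levels on slides 51 and 52 can be sketched as a simple threshold function. The printed ranges leave small gaps (for example between 20% and 21%), so the sketch assumes each range is closed at its upper bound. The function name is hypothetical; the percentages are the ones in the table.

```python
def sensitivity_level(percent_unwilling: float) -> int:
    """Map the share of respondents unwilling to share an attribute to a level 1-5."""
    if not 0 <= percent_unwilling <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_unwilling <= 20:
        return 1
    if percent_unwilling <= 30:
        return 2
    if percent_unwilling <= 50:
        return 3
    if percent_unwilling <= 60:
        return 4
    return 5


# Percentages as reported in the survey table.
SURVEY = {
    "Voter ID": 56.4, "Driving licence no.": 58.0, "PAN": 67.7,
    "Full name": 14.5, "Home address": 82.25, "Age": 29.0,
    "DOB": 50.0, "Father's name": 38.7, "Gender": 14.5,
}
LEVELS = {attribute: sensitivity_level(pct) for attribute, pct in SURVEY.items()}
print(LEVELS)
```

With these boundaries the function reproduces every privacy level listed in the table.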
  • 53. Experiments and Analysis: Privacy Score. Privacy score (PS) for 84,22,459 users. Case 1: users having only Voter ID (97.3%): PS = Σ(Voter ID, name, father’s name, age, gender, address) = 16. Case 2: users having only Driving licence number (2%): PS = Σ(DL number, name, relative’s name, DOB, address) = 17. Case 3: users having only PAN (1%): PS = Σ(PAN, DL number, name, relative’s name, DOB, address) = 25.
  • 54. Experiments and Analysis: Privacy Score. Case 4: users having Voter ID and DL number (0.07%): PS = Σ(Voter ID, DL number, name, father’s name, age, gender, DOB, address) = 24. Case 5: users having Voter ID, DL number and PAN (0.02%): PS = Σ(Voter ID, DL number, PAN, name, father’s name, age, gender, DOB, address) = 29. These 1,693 people are at the highest risk!
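The privacy score is a sum of per-attribute sensitivity levels, which the sketch below spells out using the levels printed in the survey table. The dictionary keys and the function name are hypothetical. With the levels exactly as printed, the Case 1 sum reproduces the reported 16, while the sums for the other cases come out slightly different from the totals on the slides (for example 28 rather than 29 for Case 5), so the code illustrates the formula and is not a reproduction of the reported scores.

```python
# Sensitivity levels as printed in the survey table (slide 52).
SENSITIVITY = {
    "voter_id": 4, "dl_number": 4, "pan": 5, "name": 1, "address": 5,
    "age": 2, "dob": 3, "relation_name": 3, "gender": 1,
}


def privacy_score(exposed_attributes) -> int:
    """Sum the sensitivity levels of the distinct attributes exposed for one person."""
    return sum(SENSITIVITY[attribute] for attribute in set(exposed_attributes))


# Case 1: a person found only in the voter rolls.
voter_only = ["voter_id", "name", "relation_name", "age", "gender", "address"]
print(privacy_score(voter_only))

# A person for whom every attribute in the table is exposed.
print(privacy_score(SENSITIVITY.keys()))
```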
  • 55. Evaluation Metrics: Evaluation Metric II. Recall (based on user study): Recall = (number of people who could be identified in the system) / (total number of search operations done on the system). Thus, Recall = 179 / 389 = 46%. Reasons for low recall: data collection is not 100% complete (out of 12 million voter records, we have ~8 million); respondents might be unclear about their constituency.
  • 56. Evaluation Metrics: Evaluation Metric III. System Usability Scale (SUS) score, measured using the standard method defined by Brooke (1996). For OCEAN, the value was 74.5 / 100, which means that people found the system usable and convenient to use.
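For reference, the standard SUS scoring procedure takes ten responses on a 1 to 5 scale: odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5 to give a score out of 100. The sketch below scores a single questionnaire. The function name is hypothetical and the slides do not show how the per-respondent scores were combined; a study-level value such as 74.5 would typically be the mean over respondents.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one SUS questionnaire of ten responses, each on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS uses exactly ten items")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("each response must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item_number % 2 == 1 else (5 - response)
    return total * 2.5
```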
  • 57. Experiments and Analysis: User Awareness. The government started various open initiatives to increase transparency with citizens, but only 19% of survey respondents were aware of them. Around 76% started using them less than 2 years ago. Proper schemes are required to publicize their existence.
  • 58. Experiments and Analysis: User Experience. The majority (62%) were shocked to see the extent to which personal information is available. People felt that the information could be used maliciously against them. People now feel scared to share their information with various government departments.
  • 59. Experiments and Analysis: User Expectations.
  • 60. Feedback: “It was an eye-opener to a common man.” “Waiting for an upgraded version which will work for other states also.” “I am really shocked that the exact ID numbers are available online without much security against data mining at this scale.” “A great shortcoming and security flaw has been pointed out by OCEAN. Great work.” “Good system. Great work! Didn't know such a system existed.”
  • 62. Conclusion: A large amount of personal information is available on government servers. Information aggregation yields more information about a person. Threat modelling of open government data sources shows the risk associated with PII leakage and the need for preventive measures. 1,693 users are most vulnerable to identity theft. People felt the need for access control on the data and proper privacy laws against the misuse of information.
  • 64. Future Work: Datasets can be extended to other states in India. Mapping users across the offline (govt. databases) and online (social networking sites) worlds. Data collection can be expanded to improve recall.
  • 65. Acknowledgments: Mayank Gupta, B.Tech, DCE; Niharika Sachdeva, PhD, IIIT-Delhi; Precog members, friends and family.
  • 66. References  Kumaraguru, P., and Sachdeva, N. Privacy in India: Attitudes and Awareness V 2.0. Tech. rep., PreCog-TR-12-001, PreCog@IIIT-Delhi, 2012. http://precog.iiitd.edu.in/research/privacyindia/  McCallister, Erika, Tim Grance, and Karen Scarfone. "Guide to protecting the confidentiality of personally identifiable information (PII) (draft), January 2009." NIST Special Publication 800-122.  Schwartz, Paul M., and Daniel J. Solove. "The PII Problem: Privacy and a New Concept of Personally Identifiable Information." NYUL Rev. 86 (2011): 1814.  Mont, Marco Casassa, Siani Pearson, and Pete Bramhall. "Towards accountable management of identity and privacy: Sticky policies and enforceable tracing services." Database and Expert Systems Applications, 2003. Proceedings. 14th International Workshop on. IEEE, 2003.  Jones, Rosie, et al. "I know what you did last summer: query logs and user privacy." Proceedings of the sixteenth ACM conference on Conference on information and knowledge management. ACM, 2007.
  • 67. References (I)  Nashash, Hyam. "Education as a Building Block in Opening Up Government Data." European Scientific Journal 9.13 (2013).  Barber, Grayson. "Personal Information in Government Records: Protecting the Public Interest in Privacy." St. Louis U. Pub. L. Rev. 25 (2006): 63.  Krishnamurthy, Balachander, and Craig E. Wills. "On the leakage of personally identifiable information via online social networks." Proceedings of the 2nd ACM workshop on Online social networks. ACM, 2009.  Jurgens, David. "That’s What Friends Are For: Inferring Location in Online Social Media Platforms Based on Social Relationships." Seventh International AAAI Conference on Weblogs and Social Media. 2013.  Zheleva, Elena, and Lise Getoor. "To join or not to join: the illusion of privacy in social networks with mixed public and private user profiles." Proceedings of the 18th international conference on World wide web. ACM, 2009.
  • 68. References (II)  Mislove, Alan, et al. "You are who you know: inferring user profiles in online social networks." Proceedings of the third ACM international conference on Web search and data mining. ACM, 2010.  Harel, Amir, et al. "M-score: estimating the potential damage of data leakage incident by assigning misuseability weight." Proceedings of the 2010 ACM workshop on Insider threats. ACM, 2010.  Wright, Glover, Pranesh Prakash, Sunil Abraham, and Nishant Shah. "Open government data study: India." Study commissioned by the Transparency and Accountability Initiative (2010).  Godse, Mr Vinayak, and Director–Data Protection. "RISE PROJECT." (2010).  Brooke, John. "SUS-A quick and dirty usability scale." Usability evaluation in industry 189 (1996): 194.  Social media report 2012: Social media comes of age. http://www.nielsen.com/us/en/reports/2012/state-of-the-media-thesocial-media-report-2012.html
  • 71. For any further information, please write to pk@iiitd.ac.in precog.iiitd.edu.in