Just as with Security, building Privacy in at the design stage ensures that privacy is baked into the specific application, process, or initiative. A formal Privacy by Design (PbD) framework exists and has been incorporated into several laws and regulations. Implementing PbD in a specific application requires translating the framework and its principles into specific, detailed, step-by-step guidelines and standards. This Hackathon endeavours to do exactly that.
2. SACON 2020
Privacy Hackathon: An Introduction
Background
• Privacy regulations are multiplying globally, and organizations need to adhere to Privacy Principles and provide Rights to Data Subjects.
• All Privacy regulations require organizations to have a certain degree of insight into and control over the Personal Data they collect and store about individuals, or risk paying hefty fines.
• Hence, gaining visibility and control over Personal Data is a crucial component of Privacy Compliance readiness.
Current Scenario
• A variety of challenges in the Privacy Ecosystem are begging for solutions.
• Some relate directly to the technology being used.
• Legacy technology was not built to support Privacy requirements.
• Key among these is the identification and protection of Personal Data Elements.
Hackathon Goals
• The Privacy Hackathon will attempt to find ways to break into, and therefore defend, the various Personal Data elements that are collected, stored, transmitted, or accessed across the ecosystem.
• The Hackathon looks at new and old ways in which applications are built and will drill down to the database level to identify solutions.
Privacy Hackathon: An Introduction
• Mobile Apps
• Websites/Web Apps
• Databases (SQL, NoSQL)
• ML/AI Programs
Personal Data is collected through various Channels and stored in a structured or unstructured manner. Derived Personal Data is also generated by an organization through AI/ML programs.
The challenges in today's hackathon try to identify solutions for Personal Data Identification and Tagging in each of these Channels/Platforms.
What is Personal Data?
Any data that can, directly or indirectly or in combination with other data, make a person ‘identifiable’.
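The "in combination with other data" part of this definition can be illustrated with a small sketch. It measures what fraction of records become unique once several fields are combined, even though no single field names a person. The records, field names, and the `unique_fraction` helper are hypothetical and exist only for illustration.

```python
from collections import Counter

def unique_fraction(records, fields):
    """Return the fraction of records whose combined values on `fields` are unique."""
    keys = [tuple(record[f] for f in fields) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(records)

# Hypothetical toy records: no field here identifies a person on its own.
records = [
    {"zip": "560001", "birth_year": 1985, "gender": "F"},
    {"zip": "560001", "birth_year": 1985, "gender": "M"},
    {"zip": "560001", "birth_year": 1990, "gender": "F"},
    {"zip": "560002", "birth_year": 1990, "gender": "F"},
]

zip_only = unique_fraction(records, ["zip"])
combined = unique_fraction(records, ["zip", "birth_year", "gender"])
print(zip_only, combined)  # prints 0.25 1.0
```

In this toy data only one record in four is unique by `zip` alone, but every record is unique once the three fields are combined, which is why such fields are treated as Personal Data.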
Personal Data (PD)
Above-the-surface (ATS) Personal Data
• Demographic/Identity Data
• Health/Biometric/Genetic/Gender Data
• Political Affiliations/Personal Beliefs/Criminal History/etc.
• Financial Data
• Govt IDs
Below-the-surface (BTS) Personal Data
• Device Identifiers
• Online Identifiers
• Social Media Markers
• Metadata
• Trackers & Cookies
• Location Data
• Data that has been processed using analytics that can identify a person
Challenge 1: Personal Data Access on a Mobile App
Context
• A Mobile App on a smartphone accesses a range of data stored on the local device. Through “Dangerous Permissions” and embedded SDKs, Mobile Apps have access to Personal Data such as Contacts, Photos, and Location.
Problem
• Organizations deploying Mobile Apps are at risk of Privacy violations depending on how Personal Data is processed by the App. However, it is a challenge to identify what Personal Data is being accessed and how it is being used or shared by the App.
Goal
• We need to write scripts that examine how the data used or stored on the local device is accessed, and identify whether any of it can lead to Personal Data. (Any app may be used; we would prefer one that you have been working on.)
Expected Output
• Techniques (could be code) to access the data and identify the Personal Data elements being used by the Mobile App.
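One possible starting point for this challenge, sketched under stated assumptions: the app is an Android app whose `AndroidManifest.xml` has already been decoded to plain XML (for example with a tool such as apktool), and the script lists the declared permissions that expose Personal Data. The permission names are real Android permissions; the mapping to Personal Data categories, the function name, and the sample manifest are illustrative assumptions and not exhaustive.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Illustrative, non-exhaustive mapping from Android permissions to PD categories.
PERMISSION_TO_PD = {
    "android.permission.READ_CONTACTS": "Contacts",
    "android.permission.ACCESS_FINE_LOCATION": "Location",
    "android.permission.ACCESS_COARSE_LOCATION": "Location",
    "android.permission.READ_EXTERNAL_STORAGE": "Photos/Files",
    "android.permission.CAMERA": "Photos/Video",
    "android.permission.RECORD_AUDIO": "Voice",
    "android.permission.READ_SMS": "Messages",
    "android.permission.READ_CALL_LOG": "Call history",
}

def personal_data_permissions(manifest_xml):
    """Map each declared permission that touches Personal Data to its PD category."""
    root = ET.fromstring(manifest_xml)
    found = {}
    for elem in root.iter("uses-permission"):
        # The permission name lives in the namespaced android:name attribute.
        name = elem.get(f"{{{ANDROID_NS}}}name")
        if name in PERMISSION_TO_PD:
            found[name] = PERMISSION_TO_PD[name]
    return found

# Hypothetical decoded manifest used only to exercise the function.
SAMPLE_MANIFEST = (
    '<manifest xmlns:android="http://schemas.android.com/apk/res/android" '
    'package="com.example.app">'
    '<uses-permission android:name="android.permission.INTERNET" />'
    '<uses-permission android:name="android.permission.READ_CONTACTS" />'
    '<uses-permission android:name="android.permission.ACCESS_FINE_LOCATION" />'
    '</manifest>'
)

for permission, category in personal_data_permissions(SAMPLE_MANIFEST).items():
    print(f"{permission} -> {category}")
```

Declared permissions only show what the app may access. Identifying how the data is used or shared, including by embedded SDKs, would still need inspection of the app's code or its network traffic.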
Challenge 2: Personal Data Access on a Web App/Website
Context
• A Website/Web App/PWA stores and accesses a range of data on the local device. This could be in the form of cookies, trackers, or embedded libraries.
Problem
• Organizations deploying Web Apps are at risk of Privacy violations depending on how Personal Data is processed by the App. However, it is a challenge to identify what Personal Data is being accessed and how it is being used or shared by the App.
Goal
• We need to write scripts that examine how the data used or stored on the local device is accessed, and identify whether any of it can lead to Personal Data. (Any app may be used; we would prefer one that you have been working on.)
Expected Output
• Techniques (could be code) to access the data and identify the Personal Data elements being used by the Web App.
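A minimal sketch of one such technique, assuming the cookies have already been captured as a raw `Cookie` header string (for example from browser developer tools or a proxy). It flags cookies whose names match a few widely used tracker cookie prefixes and cookies whose values contain an email address. The prefix list, the regular expression, and the function name are illustrative assumptions, not a complete detector.

```python
import re
from urllib.parse import unquote

# Illustrative prefixes of widely used analytics/advertising cookies.
TRACKER_COOKIE_PREFIXES = ("_ga", "_gid", "_fbp")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def classify_cookies(cookie_header):
    """Return {cookie name: reason} for cookies that look like Personal Data."""
    findings = {}
    for part in cookie_header.split(";"):
        name, _, value = part.strip().partition("=")
        if not name:
            continue
        value = unquote(value)  # cookie values are often URL-encoded
        if name.startswith(TRACKER_COOKIE_PREFIXES):
            findings[name] = "tracker identifier"
        elif EMAIL_RE.search(value):
            findings[name] = "email address"
    return findings

# Hypothetical header used only to exercise the function.
header = "_ga=GA1.2.123456789.1580000000; theme=dark; user=alice%40example.com"
print(classify_cookies(header))
```

The same pattern could be extended to local storage dumps or to the list of third-party script hosts loaded by a page, which would cover the trackers and embedded libraries mentioned in the Context.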
Challenge 3: Personal Data Discovery on a SQL DB (Metadata Tagging & Personal Data
Identification & Isolation)
Context
• Organizations use databases such as MySQL and Oracle to store data (including Personal Data) in a structured form. Transaction databases use multiple referential integrity models to store and access data in the form of various data types.
Problem
• A key challenge organizations face today is how to identify and tag the data in the SQL DB as Personal Data elements. Unless an organization has strong control over what Personal Data is being accessed, who accesses it, and how it is being used, it is at risk of Privacy violations.
Goal
• The solution needs a way to identify Personal Data (through an interface OR script-based arguments) and to store the tag information as metadata without disrupting the actual data OR the application interfacing with the DB.
Expected Output
• Techniques (could be code) to access the data and identify the Personal Data elements being used across the DB, so that we identify once and map them across the DB.
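The goal above could be approached along these lines. This is a minimal sketch that uses Python's built-in `sqlite3` as a stand-in for MySQL or Oracle. It scans column names and sampled values for Personal Data and records the tags in a separate side table, so the application's own tables are never altered. The `pd_tags` table, the name hints, and the email pattern are assumptions made for illustration.

```python
import re
import sqlite3

# Illustrative column-name hints and a simple value pattern.
NAME_HINTS = {"email": "Email", "phone": "Phone", "dob": "Date of birth", "name": "Name"}
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def tag_personal_data(conn):
    """Scan every user table and store PD tags in a separate pd_tags table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pd_tags ("
        "table_name TEXT, column_name TEXT, pd_type TEXT, "
        "PRIMARY KEY (table_name, column_name))"
    )
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name != 'pd_tags'"
    )]
    for table in tables:
        for info in conn.execute(f'PRAGMA table_info("{table}")').fetchall():
            column = info[1]  # second field of table_info is the column name
            # First try the column name, then fall back to sampling values.
            pd_type = next(
                (tag for hint, tag in NAME_HINTS.items() if hint in column.lower()),
                None,
            )
            if pd_type is None:
                sample = conn.execute(
                    f'SELECT "{column}" FROM "{table}" '
                    f'WHERE "{column}" IS NOT NULL LIMIT 20'
                ).fetchall()
                if sample and all(isinstance(v, str) and EMAIL_RE.match(v) for (v,) in sample):
                    pd_type = "Email"
            if pd_type:
                conn.execute(
                    "INSERT OR REPLACE INTO pd_tags VALUES (?, ?, ?)",
                    (table, column, pd_type),
                )
    conn.commit()

# Hypothetical schema and row used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, full_name TEXT, contact TEXT, plan TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'A. Kumar', 'a.kumar@example.com', 'gold')")
tag_personal_data(conn)
tags = conn.execute(
    "SELECT table_name, column_name, pd_type FROM pd_tags ORDER BY column_name"
).fetchall()
print(tags)
```

Keeping the tags in a side table is one way to meet the "without disrupting the actual data" requirement. On MySQL or Oracle the same idea would read the column list from the database's own catalog views instead of `PRAGMA table_info`.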
Challenge 4: Personal Data Discovery on noSQL DB (Metadata Tagging & Personal Data
Identification & Isolation)
Context
• NoSQL databases are used by organizations and are especially useful for working with large sets of distributed data. A NoSQL DB uses architectural elements to store and access data in the form of various data types.
Problem
• A key challenge organizations face today is how to identify and tag the data in the NoSQL DB as Personal Data elements. Unless an organization has strong control over what Personal Data is being accessed, who accesses it, and how it is being used, it is at risk of Privacy violations.
Goal
• The solution needs a way to identify Personal Data (through an interface OR script-based arguments) and to store the tag information as metadata without disrupting the actual data OR the application interfacing with the DB.
• NoSQL DBs are also used in Analytics applications and can contain data aggregated across various data structures. We therefore need to identify and tag the data accordingly, so that the tag also carries forward when the analytics are performed.
Expected Output
• Techniques (could be code) to access the data and identify the Personal Data elements being used across the DB, so that we identify once and map them across the DB.
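A rough sketch of how this could look for document-style data, assuming documents can be read as nested Python dicts and lists (as most NoSQL drivers return them). It walks each document, records the field paths that hold Personal Data in a separate tag map, and shows one way a tag could carry forward to an aggregated field. The patterns, the path notation, and both function names are hypothetical.

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?\d[\d\s-]{7,14}\d$")

def find_pd_paths(doc, prefix=""):
    """Recursively walk a document and return {field path: PD type}."""
    tags = {}
    if isinstance(doc, dict):
        for key, value in doc.items():
            path = f"{prefix}.{key}" if prefix else key
            tags.update(find_pd_paths(value, path))
    elif isinstance(doc, list):
        for item in doc:
            tags.update(find_pd_paths(item, prefix + "[]"))
    elif isinstance(doc, str):
        if EMAIL_RE.match(doc):
            tags[prefix] = "Email"
        elif PHONE_RE.match(doc):
            tags[prefix] = "Phone"
    return tags

def propagate_tags(source_tags, derived_field, source_paths):
    """Carry tags forward: a derived field inherits the tags of its inputs."""
    inherited = sorted({source_tags[p] for p in source_paths if p in source_tags})
    return {derived_field: inherited} if inherited else {}

# Hypothetical document used only to exercise the functions.
doc = {
    "_id": "u1",
    "profile": {"email": "a.kumar@example.com", "phones": ["+91 98765 43210"]},
    "orders": [{"sku": "X1", "qty": 2}],
}
tags = find_pd_paths(doc)
derived = propagate_tags(tags, "contact_summary", ["profile.email", "orders[].sku"])
print(tags)
print(derived)
```

The tag map is kept outside the documents, so the stored data and the application are left untouched. An aggregation job would call something like `propagate_tags` for each output field it builds.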
Challenge 5: Techniques for trapping identified Personal Data in a ML/AI program
(Metadata Tagging & Personal Data Identification & Isolation)
Context
• The ML/AI program uses various behavioural assessments as well as pre-defined responses to generate data that provides behavioural insights. This generated data could also be classified as Personal Data, and it is an ever-growing data set.
Problem
• A key challenge organizations face today is how to identify and tag the data being used and generated by the ML/AI Platform as Personal Data elements. Unless an organization has strong control over what Personal Data is being accessed, who accesses it, and how it is being used, it is at risk of Privacy violations.
Goal
• The solution needs a way to identify Personal Data (through an interface OR script-based arguments) and to store the tag information as metadata without disrupting the actual data OR the application interfacing with the Platform.
• An additional element to check here is the actual purpose or functionality of the programs, and whether the data use and model sets go beyond the originally specified purpose. This part is an add-on.
Expected Output
• Techniques (could be code) to access the data and identify the Personal Data elements being used and generated across the ML/AI Platform.
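One way to think about both parts of this challenge, sketched with hypothetical names throughout: inputs carry Personal Data tags, the value generated by a model inherits those tags plus a "Derived" marker, and a separate check compares the fields a program uses with the fields allowed for its declared purpose. The `TaggedValue` class, the toy model, and the purpose table are assumptions for illustration and do not describe any real platform's API.

```python
from dataclasses import dataclass

@dataclass
class TaggedValue:
    """A value together with the Personal Data tags attached to it."""
    value: object
    pd_tags: frozenset = frozenset()

def run_tagged(model_fn, inputs):
    """Run model_fn on the raw values and tag its output from the input tags."""
    raw = {name: tagged.value for name, tagged in inputs.items()}
    tags = frozenset().union(*(tagged.pd_tags for tagged in inputs.values()))
    # Output generated from Personal Data is itself marked as derived PD.
    out_tags = tags | {"Derived"} if tags else tags
    return TaggedValue(model_fn(raw), out_tags)

def purpose_violations(used_fields, declared_purpose, allowed_fields_by_purpose):
    """Return the fields used that the declared purpose does not allow."""
    allowed = allowed_fields_by_purpose.get(declared_purpose, set())
    return sorted(set(used_fields) - allowed)

def toy_model(raw):
    # Stand-in for a behavioural model; not a real ML model.
    return "frequent-clinic-visitor" if "clinic" in raw["location_history"] else "other"

# Hypothetical inputs and purpose table used only to exercise the functions.
inputs = {
    "location_history": TaggedValue(["home", "gym", "clinic"], frozenset({"Location"})),
    "app_version": TaggedValue("4.2.1"),
}
insight = run_tagged(toy_model, inputs)
violations = purpose_violations(
    inputs.keys(),
    "fitness_recommendations",
    {"fitness_recommendations": {"app_version", "step_count"}},
)
print(insight.value, sorted(insight.pd_tags))
print(violations)
```

Here the generated insight is tagged with both `Location` and `Derived`, and `location_history` is reported as going beyond the declared purpose, which corresponds to the add-on check described in the Goal.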