Blockchain
- What is Blockchain?
- Blockchain trends
Emerging data protection techniques
- Secure multiparty computation
- Trusted execution environments
- Use cases for analytics
- Industry Standards
Tokenization
- Convert a digital value into a digital token
- Tokenization local or in a centralized model
- Tokenization and scalability
Cloud
- Analytics in Hybrid cloud
2. 2
Ulf Mattsson
• Chief Security Strategist at Protegrity, previously Head of Innovation at TokenEx and Chief Technology Officer at Atlantic BT, Compliance Engineering, and IT Architect at IBM
• Products and Services:
• Data Encryption, Tokenization, Data Discovery, Cloud Application Security Brokers (CASB), Web Application Firewalls (WAF), Robotics, and Applications
• Security Operation Center (SOC), Managed Security Services (MSSP)
• Inventor of more than 70 issued US Patents and developed Industry Standards with ANSI X9, CSA and PCI DSS
Cloud Security Alliance (CSA):
• Tokenization Management and Security
• Cloud Management and Security
Payment Card Industry (PCI) Security Standards Council (SSC):
1. Tokenization Task Force
2. Encryption Task Force, Point to Point Encryption Task Force
3. Risk Assessment SIG
4. eCommerce SIG
5. Cloud SIG, Virtualization SIG
6. Pre-Authorization SIG, Scoping SIG Working Group
Dec 2019
May 2020
May 2020
6. 6
Source: Gartner
Blockchain has five elements
1. Distribution: Blockchain participants are
located physically apart from each other
and are connected on a network
2. Encryption: Blockchain uses technologies
such as public and private keys to record the
data in the blocks securely and semi-
anonymously
3. Immutability: Completed transactions are
cryptographically signed, time-stamped and
sequentially added to the ledger
4. Tokenization: Transactions and other
interactions in a blockchain involve the
secure exchange of value
5. Decentralization: Both network information
and the rules for how the network operates
are maintained by nodes on the distributed
network due to a consensus mechanism
9. 9
Enterprise Blockchain platforms
Enterprise | Blockchain platforms
Amazon | Hyperledger Fabric
Ant Financial | Ant Blockchain Technology, Hyperledger
Anthem | Hyperledger Fabric
Aon | R3 Corda
Baidu | Hyperledger Fabric—
Bitfury | Bitcoin, Exonum
BMW | Hyperledger Fabric, Ethereum, Quorum,
Broadridge | Hyperledger Fabric, Quorum, Corda, DAM
Cargill | Hyperledger Sawtooth, Hyperledger Grid
China Construction Bank | Hyperchain, Hyperledger Fabric
Citigroup | Axcore, Symbiont Assembly, Quorum
Coinbase | Bitcoin, Ethereum, XRP and 24 others
Credit Suisse | Corda, Paxos
Daimler | Hyperledger, Corda, Ethereum
De Beers | Ethereum
Depository Trust & Clearing Corporation (DTCC) | Axcore
Dole Foods | IBM Blockchain, Hyperledger Fabric—
Facebook | Hotstuff
Figure | Hyperledger Fabric
Foxconn | Ethereum
General Electric | Microsoft Azure, Corda, Quorum, Hyperl
Google | Chainlink, Bitcoin, Ethereum, Bitcoin Cas
Honeywell | Hyperledger Fabric
HSBC | Ethereum, Corda, Hyperledger Fabric
Enterprise B
IBM
ING Group
Intercontinental Exchange
JPMorgan
LVMH
Mastercard
Microsoft
Nasdaq
National Settlement Depository
Nestlé
Optum
Overstock
Ripple
Royal Dutch Shell
Samsung
Santander
Signature Bank
Silvergate Bank
Square
Tencent
T-Mobile
UBS
United Nations
Vanguard
VMware
Walmart
Examples of 50 Enterprises
use of Blockchain platforms
Enterprise customers per platform: Hyperledger 27, Ethereum 24, Corda 9, Bitcoin 9
Forbes
15. 15
A Picasso painting valued at $50 million can be tokenized.
• The same applies to gold and diamonds.
Company stocks are more complicated because most jurisdictions prohibit selling fractional parts of company shares.
Bankex — “Bankex provides the universal solution which can transform different asset classes to a digital
system/field/economy/area providing it with liquidity, flexibility, and safety for asset owners and investors like never
before”
Maecenas — “Maecenas is a new online marketplace promises to give art lovers the chance to buy shares in famous
paintings.[The Telegraph]”
LaToken — “LATOKEN’s mission is to make capital markets and trading available 24/7 T+0, with a broader range of asset
classes. We aim to facilitate capital reallocation into promising businesses, which will foster job creation with higher
productivity.”
Transform different asset classes
16. 16
Tokenization in real estate
• Suppose there is a $200,000 apartment
• Tokenization can transform this apartment into 200,000 tokens
• Thus, each token represents a 0.0005% share of the underlying asset
• Finally, we issue the tokens on a platform that supports smart contracts, for example Ethereum
• The tokens can be freely bought and sold on different exchanges
• Imagine you want to invest in real estate, but your initial investment is modest
— say $5,000.
• Perhaps you want to start small and increase your investment gradually.
You are not becoming a legal owner of the property. However, because Blockchain is a public ledger that is immutable, it ensures that
once you buy tokens, nobody can “erase” your ownership even if it is not registered in a government-run registry.
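The arithmetic behind this example can be checked with a short sketch. It assumes, as the slide implies, that every token carries an equal share and that only whole tokens can be bought; the function name `token_share` is made up for illustration.

```python
from fractions import Fraction


def token_share(asset_value_usd: int, token_count: int, investment_usd: int):
    """Return (price per token, whole tokens bought, ownership share)."""
    price_per_token = Fraction(asset_value_usd, token_count)
    tokens_bought = investment_usd // price_per_token  # whole tokens only
    share = Fraction(tokens_bought, token_count)
    return price_per_token, tokens_bought, share


# Figures from the slide: a $200,000 apartment split into 200,000 tokens,
# and an investor who starts with $5,000.
price, tokens, share = token_share(200_000, 200_000, 5_000)
share_per_token_pct = Fraction(1, 200_000) * 100

print(f"price per token: ${float(price):.2f}")
print(f"share per token: {float(share_per_token_pct):.4f}%")
print(f"tokens bought:   {tokens}")
print(f"investor share:  {float(share * 100):.2f}%")
```

With these numbers, one token costs $1 and a $5,000 investment buys 5,000 tokens, which is 2.5% of the apartment.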
17. 17
What happens if a company that handles tokenization sells the property?
• Token owners just own tokens.
• They have no legal rights to the property and thus are not protected by the law.
• Therefore, legal changes are needed to accommodate these new business models.
A problem is that this system brings back some degree of centralization.
• The whole idea of Blockchain, and especially smart contracts, is to create a trustless environment.
• While this is possible when tokenizing digital assets, it is not the case for real-world, physical assets.
• Therefore, we have to accept a certain dose of centralization.
Legislation and centralization
20. 20
Blockchain Plans
Q: What are your organization’s plans in terms of blockchain?
2019 Gartner CIO Survey:
• 60% of CIOs expect some kind of blockchain deployment in the next three years.
• Industries that have deployed blockchain or plan to deploy it in the next 12 months:
1. financial services (18%)
2. services (17%)
3. transportation (16%)
22. 22
Blockchain enabling technologies: 2009-2020
This early phase of blockchain-enabled experiments is built on top of existing systems to reduce cost and friction in private,
proprietary activities. They have only limited distribution capabilities to a small number of nodes either within or between
enterprises.
Blockchain-inspired solutions: 2016-2023
The current phase of blockchain-inspired solutions is usually designed to address a specific operational issue – most often in
terms of inter-organisational process or record keeping inefficiency.
Blockchain complete solutions: 2020s
Blockchain complete offerings,
starting in the 2020s, will have all five
elements, delivering on the full value
proposition of blockchain including
decentralization and tokenization.
Blockchain enhanced solutions: Post-
2025
Blockchain enhanced solutions offer
all five elements and combine them
with complementary technologies
such as AI or IoT.
Blockchain technologies
Gartner
28. 28
• By 2023, blockchain will be scalable technically,
and will support trusted private transactions with
the necessary data confidentiality.
• Over time, permissioned blockchains will integrate
with public blockchains.
• Blockchain adds little value unless it is part of a
network that exchanges information and value.
• The network collaboration challenges have initially
driven organizations to turn to consortia to derive
the most immediate value from blockchain.
• Four types of consortia exist:
• technology-centric; geographically centric; industry
centric and process-centric.
Source: Gartner
Blockchain Will Be Scalable by 2023
Blockchain remains immature for enterprise deployments due to a range of technical issues
including poor scalability and interoperability.
Scalability
Roadmap
30. 30
Increased need for data analytics drives requirements.
Data sources: Data Lake, ETL, Files, …
Diagram labels: External Data; Internal Data; Data Pipeline; Data Collaboration; Data Privacy; Secure Multi Party Computation; Analytics, Data Science, AI and ML; On-premises; Cloud
• Policy Enforcement Point (PEP)
• Protected data fields
• Encryption Key Management
Internal and Individual Third-Party Data Sharing
31. 31
https://royalsociety.org
Secure Multi-Party Computing (MPC)
Using MPC, different parties send encrypted messages to each other, and obtain the model F(A,B,C) they wanted to compute without revealing their own private input, and without the need for a trusted central authority.
Diagram panels: "Central trusted authority" (parties A, B, C; result F(A, B, C)) and "Secure Multi-Party machine learning" (parties A, B, C; protected data fields; result F(A, B, C))
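One common building block for MPC is additive secret sharing. The sketch below is only an illustration under stated assumptions: F(A, B, C) is taken to be a simple sum, all three parties are simulated inside one process, and the modulus `PRIME` and the function names are hypothetical choices. A real deployment would send the shares over encrypted channels between separate machines.

```python
import secrets

PRIME = 2**61 - 1  # assumed modulus; every share is a number below it


def make_shares(value: int, n_parties: int) -> list[int]:
    """Split a value into n random-looking shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_inputs: list[int]) -> int:
    """Compute the sum of all inputs without any party seeing another's input."""
    n = len(private_inputs)
    # Step 1: each party splits its input; share j is sent to party j.
    distributed = [make_shares(v, n) for v in private_inputs]
    # Step 2: each party adds the shares it received, one from every party.
    partial_sums = [
        sum(distributed[sender][receiver] for sender in range(n)) % PRIME
        for receiver in range(n)
    ]
    # Step 3: only the partial sums are published; together they give F(A, B, C).
    return sum(partial_sums) % PRIME


inputs = {"A": 120, "B": 75, "C": 230}
result = secure_sum(list(inputs.values()))
print("F(A, B, C) =", result)
```

Each individual share is uniformly random, so a party that sees only the shares sent to it learns nothing about the other inputs beyond what the final result reveals.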
33. 33
Use case - Financial services industry
Confidential financial datasets are vital for gaining significant insights.
• Using this data requires navigating a minefield of private client information, as well as sharing data between independent financial institutions to create a statistically significant dataset.
• Data privacy regulations such as CCPA, GDPR and other emerging regulations around the world apply.
• Data residency controls must be maintained while enabling data sharing in a secure and private fashion.
Reduce and remove the legal, risk and compliance processes
• Collaboration across divisions, other organizations and jurisdictions where data cannot be relocated or shared
• Generating privacy-respectful datasets with higher analytical value for Data Science and Analytics applications.
34. 34
Use case: Bank - Internal Data Usage by Other Units
A large bank wanted to broaden access to its data lake without compromising data privacy, while preserving the data’s analytical value and keeping infrastructure costs reasonable.
• Current approaches to de-identifying data did not fulfill the compliance requirements and business needs, which had led to several bank projects being stopped.
• The issue with these techniques, like masking, tokenization, and aggregation, was that they did not sufficiently protect the data without overly degrading data quality.
This approach allows the creation of privacy-protected datasets that retain their analytical value for Data Science and business applications.
A plug-in to the organization’s analytical pipeline enforced the compliance policies before the data was consumed from the data lake by data science and business teams.
• The analytical quality of the data was preserved for machine learning purposes by using AI and leveraging privacy models like differential privacy and k-anonymity.
Improved data access for teams increased the business’ bottom line without adding excessive infrastructure costs, while reducing the risk of consumer information exposure.
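To make the differential privacy idea concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. It is not the bank's actual plug-in; the data, the function names and the choice of epsilon are assumptions for illustration, and floating-point noise like this is not hardened for production use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def dp_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that match the predicate."""
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical customer ages; the query asks how many customers are over 50.
ages = [23, 35, 41, 57, 62, 68]
noisy = dp_count(ages, lambda age: age > 50, epsilon=0.5, rng=random.Random())
print(f"noisy count of customers over 50: {noisy:.2f}")
```

A smaller epsilon adds more noise and gives stronger privacy; a larger epsilon keeps the answer closer to the true count of 3.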
35. 35
Use case – Retail - Data for Secondary Purposes
A large aggregator of credit card transaction data wanted to open a new revenue stream
• Using its data with its business partners: retailers, banks and advertising companies.
• It could help its partners achieve better ad conversion rates, improved customer satisfaction, and more timely offerings.
• It needed to respect user privacy and specific regulations. In this specific case, it wanted to work with a retailer.
• The goal was to allow the retailer to gain insights while protecting user privacy and the credit card organization’s IP.
• An analyst at each organization’s office first used the software to link the data without exchanging any of the underlying data.
The data was used to train the machine learning and statistical models.
• In this specific use case, logistic and linear regression models were trained using secure multi-party computation (SMC).
• In its simplest form, SMC splits a dataset into secret shares and enables you to train a model without needing to put the pieces together.
• The information communicated between the peers is encrypted at all times and cannot be reverse engineered.
• The resultant machine learning model coefficients (output of the training) were only shared with the partner identified as the receiver of such information.
With the augmented dataset, the retailer was able to get a better picture of its customers’ buying habits.
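The remark about working on secret shares without putting the pieces together can be illustrated with a small sketch. It assumes additive shares over a hypothetical modulus, non-negative integer features and public weights, and it shows only a linear score. Training a regression model also needs secure multiplication of secret values, which this sketch does not cover and which is not necessarily how the software in the use case works.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus for share arithmetic


def share_vector(values: list[int], n_parties: int) -> list[list[int]]:
    """Split each value into additive shares; return one share vector per party."""
    per_party = [[] for _ in range(n_parties)]
    for value in values:
        parts = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
        parts.append((value - sum(parts)) % MODULUS)
        for party, part in zip(per_party, parts):
            party.append(part)
    return per_party


def local_weighted_sum(weights: list[int], share_vec: list[int]) -> int:
    """Each party applies the public weights to its own shares only."""
    return sum(w * s for w, s in zip(weights, share_vec)) % MODULUS


features = [3, 14, 7]  # hypothetical private feature values
weights = [2, 5, 1]    # hypothetical public model weights
party_shares = share_vector(features, n_parties=2)
partials = [local_weighted_sum(weights, sv) for sv in party_shares]
score = sum(partials) % MODULUS  # only the partial results are combined
print("weighted score:", score)
```

Because the score is linear in the features, each party can compute on its shares independently, and only the combined result equals 2*3 + 5*14 + 1*7 = 83.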
36. 36
Data Protection for Multi-cloud
Shared responsibilities across cloud service models
Use-cases of some data privacy techniques
• Payment Application, Payment Network, Payment Data: policy, tokenization, encryption and keys (Gateway)
• Call Center Application, PI* Data: Tokenization
• Salesforce, Analytics Application, PI* Data: Differential Privacy (DP), K-anonymity model
• Voting Application, Election Data, Microsoft Election Guard development kit: Homomorphic Encryption (HE)
• Data Warehouse, PI* Data: Vault-less tokenization (VLT)
• Dev/test Systems, PI* Data: Masking
*: PI Data (Personal information) means information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a consumer or household according to CCPA
38. 38
Pseudonymization
Field | Privacy Action (PA) | PA Config | Variant Twin Output
Gender | Pseudonymise | | AD-lks75HF9aLKSa
Aggregation/Binning
Field | Privacy Action (PA) | PA Config | Variant Twin Output
Age | Integer Range Bin | Step 10 + Pseud. | Age_KXYC
Age | Integer Range Bin | Custom Steps | 18-25
Rounding
Field | Privacy Action (PA) | PA Config | Variant Twin Output
Balance | Nearest Unit Value | Thousand | 94000
Source data:
Last name | Balance | Age | Gender
Folds | 93791 | 23 | m
… | … | … | …
Generalization
Source data:
Patient | Age | Gender | Region | Disease
173965429 | 57 | Female | Hamburg | Gastric ulcer
Output data:
Patient | Age | Gender | Region | Disease
173965429 | >50 | Female | Germany | Gastric ulcer
Examples of data de-identification
Source: INTERNATIONAL STANDARD ISO/IEC 20889, Privitar, Anonos
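The transformations in these examples can be sketched in a few lines. This is an illustrative sketch, not the behavior of any vendor product named in the source line: the keyed-hash pseudonym, the function names, the bin edges and the region lookup are all assumptions.

```python
import hashlib
import hmac


def pseudonymise(value: str, key: bytes) -> str:
    """Replace a value with a keyed hash so equal inputs map to equal pseudonyms."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def bin_age(age: int, step: int = 10) -> str:
    """Integer range bin with a fixed step, e.g. 23 -> '20-29'."""
    low = (age // step) * step
    return f"{low}-{low + step - 1}"


def bin_age_custom(age: int, bins: list[tuple[int, int]]) -> str:
    """Integer range bin with custom steps, e.g. 23 -> '18-25'."""
    for low, high in bins:
        if low <= age <= high:
            return f"{low}-{high}"
    return "other"


def round_to_unit(amount: int, unit: int = 1000) -> int:
    """Round a non-negative amount to the nearest unit, e.g. 93791 -> 94000."""
    return ((amount + unit // 2) // unit) * unit


def generalise_age(age: int, threshold: int = 50) -> str:
    """Replace an exact age with a coarse range around a threshold."""
    return f">{threshold}" if age > threshold else f"<={threshold}"


REGION_TO_COUNTRY = {"Hamburg": "Germany"}  # hypothetical lookup table

source = {"last_name": "Folds", "balance": 93791, "age": 23, "gender": "m"}
output = {
    "last_name": pseudonymise(source["last_name"], key=b"demo-key"),
    "balance": round_to_unit(source["balance"]),
    "age": bin_age_custom(source["age"], [(18, 25), (26, 40), (41, 65)]),
    "gender": pseudonymise(source["gender"], key=b"demo-key"),
}
print(output)
print(generalise_age(57), REGION_TO_COUNTRY["Hamburg"])
```

The keyed hash keeps pseudonyms consistent across records, so joins still work, while the binning and rounding steps trade precision for a lower re-identification risk.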
40. 40
Pseudonymization vs. Anonymization
Pseudonymization is recognized as an important method for privacy protection of personal health information
• Such services may be used nationally, as well as for trans-border communication.
Application areas include:
• indirect use of clinical data; clinical trials and post-marketing surveillance;
• pseudonymous care; patient identification systems; public health monitoring and assessment;
• confidential patient-safety reporting; comparative quality indicator reporting;
• peer review; consumer groups; field service.
Anonymization
• Anonymization is the process and set of tools used where no longitudinal consistency is needed.
• The anonymization process is also used where pseudonymization has been used to address the remaining data
attributes.
• Anonymization utilizes tools like redaction, removal, blanking, substitution, randomization, shifting, skewing, truncation,
grouping, etc. Anonymization can lead to a reduced possibility of linkage.
• Each element allowed to pass should be justified. Each element should present the minimal risk, given the intended use of
the resulting data-set. Thus, where the intended use of the resulting data-set does not require fine-grain codes, a
grouping of codes might be used.
ISO 25237 Health informatics
41. 41
Risk Reduction
Source: INTERNATIONAL STANDARD ISO/IEC 20889
Columns: Group | Technique name | Use Case / User Story | Data protected in Transit | Data protected in Use | Data protected in Storage | Data truthfulness at record level | Applicable to types of attributes | Reduces the risk of Singling out | Reduces the risk of Linking | Reduces the risk of Inference
Pseudonymization | Tokenization | Protects the data flow from attacks | Yes | Yes | Yes | Yes | Direct identifiers | No | Partially | No
Cryptographic tools | Deterministic encryption | Protects the data when not used in processing operations | Yes | No | Yes | Yes | All attributes | No | Partially | No
Cryptographic tools | Order-preserving encryption | Protects the data from attacks | Partially | Partially | Partially | Yes | All attributes | No | Partially | No
Cryptographic tools | Homomorphic encryption | Protects the data also when used in processing operations | Yes | Yes | Yes | Yes | All attributes | No | No | No
Suppression | Masking | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | Yes | Local identifiers | Yes | Partially | No
Suppression | Local suppression | Protects the data in analytical applications | Yes | Yes | Yes | Yes | Identifying attributes | Partially | Partially | Partially
Suppression | Record suppression | Removes the data from the data set | Yes | Yes | Yes | Yes | All attributes | Yes | Yes | Yes
- | Sampling | Exposes only a subset of the data for analytical applications | Partially | Partially | Partially | Yes | All attributes | Partially | Partially | Partially
Generalization | Generalization | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | Yes | Identifying attributes | Partially | Partially | Partially
Generalization | Rounding | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | Yes | Identifying attributes | No | Partially | Partially
Generalization | Top/bottom coding | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | Yes | Identifying attributes | No | Partially | Partially
Randomization | Noise addition | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | No | Identifying attributes | Partially | Partially | Partially
Randomization | Permutation | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | No | Identifying attributes | Partially | Partially | Partially
Randomization | Micro aggregation | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | No | All attributes | No | Partially | Partially
Privacy models | Differential privacy | Protects the data in analytical applications | No | Yes | Yes | No | Identifying attributes | Yes | Yes | Partially
Privacy models | K-anonymity | Protects the data in analytical applications | No | Yes | Yes | Yes | Quasi identifiers | Yes | Partially | No
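The k-anonymity privacy model listed in this table can be checked mechanically: a dataset is k-anonymous if every combination of quasi-identifier values is shared by at least k records. The sketch below uses made-up records and hypothetical function names, and pairs the check with record suppression, another technique from the table.

```python
from collections import Counter


def group_sizes(records: list[dict], quasi_identifiers: list[str]) -> Counter:
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_anonymity(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Return the size of the smallest group, i.e. the k the dataset achieves."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def suppress_small_groups(records: list[dict], quasi_identifiers: list[str], k: int) -> list[dict]:
    """Record suppression: drop records whose group has fewer than k members."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records if sizes[tuple(r[q] for q in quasi_identifiers)] >= k]


# Made-up, already generalized records.
records = [
    {"age": ">50", "region": "Germany", "disease": "Gastric ulcer"},
    {"age": ">50", "region": "Germany", "disease": "Asthma"},
    {"age": "<=50", "region": "Germany", "disease": "Flu"},
    {"age": "<=50", "region": "Germany", "disease": "Flu"},
    {"age": "<=50", "region": "France", "disease": "Asthma"},
]
quasi = ["age", "region"]
released = suppress_small_groups(records, quasi, k=2)
print("k before suppression:", k_anonymity(records, quasi))
print("k after suppression: ", k_anonymity(released, quasi))
```

The single French record is unique on its quasi-identifiers, so the raw data is only 1-anonymous; suppressing it yields a 2-anonymous release at the cost of one record.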
43. 43
Tokenization Speed
Chart: Transactions per second* on a logarithmic axis from 100 to 10 000 000, comparing Format Preserving Encryption, Vaultless Data Tokenization, AES CBC Encryption Standard and Vault-based Data Tokenization.
*: Speed will depend on the configuration
45. 45
Case Study
A major international bank performed a consolidation of all European operational data sources to Italy.
The consolidation had to protect Personally Identifiable Information (PII) in compliance with the EU Cross Border Data Protection Laws, specifically
• Datenschutzgesetz 2000 (DSG 2000) in Austria, and
• Bundesdatenschutzgesetz in Germany.
This required access to Austrian and German customer data to be restricted to only requesters in each respective country.
• Achieved targeted compliance with EU Cross Border Data Security laws
• Implemented country-specific data access restrictions
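The country-specific restriction in this case study can be pictured as a small policy check. This is a hedged sketch, not the bank's implementation: the policy table, the field names and the rule that unrestricted countries default to clear access are assumptions made for illustration.

```python
# Hypothetical policy: data country -> countries whose requesters may see clear data.
RESTRICTED = {"AT": {"AT"}, "DE": {"DE"}}
PROTECTED_FIELDS = ("name", "account")  # assumed PII fields


def may_view_clear(record_country: str, requester_country: str) -> bool:
    """Austrian and German data is visible in the clear only to local requesters."""
    allowed = RESTRICTED.get(record_country)
    if allowed is None:
        return True  # assumption: no restriction configured for other countries
    return requester_country in allowed


def present(record: dict, requester_country: str) -> dict:
    """Return the record as is, or with the protected fields hidden."""
    if may_view_clear(record["country"], requester_country):
        return dict(record)
    return {k: ("***" if k in PROTECTED_FIELDS else v) for k, v in record.items()}


customer = {"country": "AT", "name": "Anna Example", "account": "AT00 0000"}
print(present(customer, requester_country="AT"))
print(present(customer, requester_country="IT"))
```

A requester in Italy, where the data is consolidated, still receives the record, but only with the protected fields hidden.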
46. 46
Examples of Protected Data
Field | Real Data | Tokenized / Pseudonymized
Name | Joe Smith | csu wusoj
Address | 100 Main Street, Pleasantville, CA | 476 srta coetse, cysieondusbak, CA
Date of Birth | 12/25/1966 | 01/02/1966
Telephone | 760-278-3389 | 760-389-2289
E-Mail Address | joe.smith@surferdude.org | eoe.nwuer@beusorpdqo.org
SSN | 076-39-2778 | 076-28-3390
CC Number | 3678 2289 3907 3378 | 3846 2290 3371 3378
Business URL | www.surferdude.com | www.sheyinctao.com
Fingerprint | | Encrypted
Photo | | Encrypted
X-Ray | | Encrypted
Healthcare / Financial Services | Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc. Financial Services Consumer Products and activities | Protection methods can be equally applied to the actual data, but not needed with de-identification
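The credit card row, where the token keeps the original layout and the last four digits, can be reproduced with a small vault-based tokenizer. This is a minimal sketch under assumptions: the class name `TokenVault` is hypothetical, the vault is an in-memory dictionary, and a real vault would be a hardened, access-controlled store.

```python
import secrets


class TokenVault:
    """Maps values to random, format-preserving tokens and back."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    @staticmethod
    def _random_like(value: str, keep_last: int) -> str:
        """Randomize every digit except the last `keep_last`; keep separators."""
        digit_positions = [i for i, ch in enumerate(value) if ch.isdigit()]
        replaceable = digit_positions[: max(0, len(digit_positions) - keep_last)]
        if not replaceable:
            raise ValueError("value has no digits left to replace")
        chars = list(value)
        for i in replaceable:
            chars[i] = str(secrets.randbelow(10))
        return "".join(chars)

    def tokenize(self, value: str, keep_last: int = 4) -> str:
        if value in self._value_to_token:
            return self._value_to_token[value]  # same input, same token
        while True:
            token = self._random_like(value, keep_last)
            if token != value and token not in self._token_to_value:
                break
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._token_to_value[token]


vault = TokenVault()
card = "3678 2289 3907 3378"
token = vault.tokenize(card)
print(token, "->", vault.detokenize(token))
```

The token has no mathematical relationship to the card number, so it can only be reversed through the vault lookup.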
47. 47
Lower Risk and Higher Productivity with More Access to More Data
Chart: Risk and User Productivity (Low to High) plotted against Access to Data (Low to High). Labels: High Risk Clear Data; Low Risk Tokens; More Access to Data.
49. 49
Data protection techniques: Deployment on-premises, and clouds
Privacy enhancing data de-identification terminology and classification of techniques
Columns: Data Warehouse, Centralized, Distributed, On-premises, Public Cloud, Private Cloud
De-identification techniques
Tokenization
• Vault-based tokenization: y y
• Vault-less tokenization: y y y y y y
Cryptographic tools
• Format preserving encryption: y y y y y
• Homomorphic encryption: y y
Suppression techniques
• Masking: y y y y y y
• Hashing: y y y y y y
Formal privacy measurement models
Differential Privacy
• Server model: y y y y y y
• Local model: y y y y y y
K-anonymity model
• L-diversity: y y y y y y
• T-closeness: y y y y y y
53. 53
A Data Security Gateway can protect sensitive data in Cloud and On-premise
• Policy Enforcement
• Protected data
• Encryption Key
Diagram label: On-premise
Protection throughout the lifecycle of data in Hadoop
• Tokenizes or encrypts sensitive data fields
• Enterprise Policies: privacy policies may be managed on-prem or on the Cloud Platform
• Policy Enforcement Point (PEP): protected data fields
• Encryption Key Management, with Separation of Duties
Diagram labels: Data Producers; Data Users; Big Data Analytics; Google Cloud
Big Data Protection with Granular Field Level Protection for Google Cloud
55. 55
Protect data before landing
• Protection of data in AWS S3 before landing in an S3 bucket
• Data lifted to S3 is protected before use
• Applications can use de-identified data or data in the clear based on policies
• Policy Enforcement Point (PEP)
• Encryption Key Management, with Separation of Duties
Diagram labels: Enterprise on-prem; Enterprise Policies; Sensitive data streams; Apps using de-identified data; S3
Protection of data in AWS S3 with Separation of Duties
Trusted execution environments
Trusted Execution Environments (TEEs) provide secure computation capability through a combination of special-purpose
hardware in modern processors and software built to use those hardware features.
The special-purpose hardware provides a mechanism by which a process can run on a processor without its memory or
execution state being visible to any other process on the processor,
• not even the operating system or other privileged code.
*: Source: http://publications.officialstatistics.org
Computation in a TEE is not performed on data while it remains encrypted.
• Typically, the memory space of each TEE (enclave) application is protected from access
• It is AES-encrypted when and if it is stored off-chip.
Usability is low, and products/services are emerging in MS Azure, IBM’s cloud service and Amazon AWS (late 2020)*
57. 57
Legal Compliance and Nation-State Attacks
• Many companies have information that is attractive to governments and intelligence services.
• Others worry that litigation may result in a subpoena for all their data.
Securosis, 2019
Multi-Cloud Data Privacy considerations
Jurisdiction
• Cloud service provider redundancy is great for resilience, but regulatory concerns arise when moving data across regions, which may have different laws and jurisdictions.
SecuPi
58. 58
Securosis, 2019
Consistency
• Most firms are quite familiar with their
on-premises encryption and key
management systems, so they often
prefer to leverage the same tool and skills
across multiple clouds.
• Firms often adopt a “best of breed” cloud
approach.
Examples of Hybrid Cloud considerations
Trust
• Some customers simply do not trust
their vendors.
Vendor Lock-in and Migration
• A common concern is vendor
lock-in, and an inability to
migrate to another cloud
service provider.
Cloud Gateway
Google Cloud AWS Cloud Azure Cloud
S3
Salesforce
The 2014 Verizon Data Breach Investigations Report concluded that enterprises are losing ground in the fight against persistent cyber-attacks. We simply cannot catch the bad guys until it is too late. This picture is not improving.
Verizon concluded that less than 14% of breaches are detected by internal security tools. Detection by third-party entities increased from approximately 10% to 25% during the last three years.
Specifically, for theft of payment card information, in 99% of the cases someone else told the victim they had suffered a breach.
One reason is that our current approach with monitoring and intrusion detection products can't tell you what normal looks like in your own systems, and SIEM technology is simply too slow to be useful for security analytics.
Big Data security analytics may help over time, but we don't have time to wait.
The biggest hacks and security breaches of 2014 include eBay, Target, Sony and Microsoft, Celebrity iCloud, NSA, Heartbleed, Sony.
The successful attack on JP Morgan Chase surprised me most, as the largest US bank lost personal information of 76 million households and it took several months to detect.
A framework for GDPR readiness
GDPR compliance is complex, because the regulation itself is complex. It outlines obligations for data holders that can affect all parts of a business, from data collection to customer communication practices. However, GDPR is also open-ended: it doesn’t tell you in detail how to meet those obligations, or that any given technological approach will suffice. That’s why IBM has developed a straightforward approach to help simplify the ways you think about conformance. The IBM GDPR framework offers an actionable five-phase approach to GDPR readiness, which recognizes that readiness is a continuum: every organization will have a unique place on the journey to readiness.
In Phase 1, you assess your situation. You figure out which of the data you collect and store is covered by GDPR regulations, and then you plot a course to discover it.
Phase 2 is where you design your approach. You need to come up with a solid plan for data collection, use and storage. And you need to develop an architecture and strategy that will balance risks and business objectives.
Your goal in Phase 3 is to transform your practices, understanding that the data you deem valuable to your organization is equally valuable to the people it represents. This is where you need to develop a sustainable privacy compliance program, implement security and governance controls (TOMs, Technical and Organizational Measures) and potentially appoint a Data Protection Officer.
By the time you get to Phase 4, you’re ready to operate your program. Now you’re continually inspecting your data, monitoring personal data access, testing your security, using privacy and security by design principles and purging unneeded data.
And Phase 5, the final phase, is where you’re ready to conform with the necessary GDPR requirements. Now you’re fulfilling data subject requests for access, correction, erasure and transfer. You’re also prepared for audits with documentation of your activities and ready to inform regulators and data subjects in the event of a data breach.
Simply minimizing the data you collect doesn’t do anything to protect the information that’s left. This is something you should be doing no matter what, however…
De-identification or anonymization can be a cost-effective approach to protecting data