Shane Lamont, Chief Technology Officer - Big Data and Cloud at HSBC Data Services, talks about how to balance conflicting objectives of data access and data privacy on the In:Confidence 2019 main stage (April 4th at Printworks, London).
In:Confidence 2019 - Balancing the conflicting objectives of data access and data privacy
1. 1
Balancing Data Access and Data Privacy in Analytics environments
InConfidence
April 2019
PUBLIC
2. 2
Balancing Data Access and Data Privacy
What environments will we cover today?
Environment Types | Environment Contents & use
Production Environments | Real data, real risks, very valuable, needs protection. Normally one, plus appropriate contingency measures.
Development Environments | Synthetic or anonymised data used for developers to test software. May have many of these for short periods of time.
Test Environments | Real or pre-built synthetic data. Used to verify that the system works as intended. Normally one per project / activity.
3. 3
Balancing Data Access and Data Privacy in Analytics Environments
We want to provide customers with great services through analytics, what’s the challenge?
Diagram labels: Customers, Services, Data Scientists, Access Controls, Data
"I want great services and data privacy"
"I want data access to provide great services"
"I want your data!"
"I want to know how much my [boss, neighbour, father-in-law] earns and what they spend it on"
4. 4
Balancing Data Access and Data Privacy
We add controls, what’s the challenge?
Does Product placement analyst need Salary? Account?
Does Credit Card Analyst need Counterparty?
Does a Financial Crime Analyst need everything?
Does Marketing Analyst need National ID?
Customers
first | last | e-mail | nid | dob | address | occupation
Bob | Smith | bob@smith.com | UK-151 | 23-Sep-67 | 999 Letsby Avenue | Policeman
Iva | House | iva@house.com | UK-23B | 07-Nov-74 | 23b Maddup Avenue | Homeowner
… | … | … | … | … | … | …
S | Holmes | sherlock@d.com | UK-221B | 06-Jan-54 | 221B Baker Street | Detective
Accounts / Transactions
DR/CR | Amount | Counterparty | Type | Country | Account
DR | 100 | Greengrocer | CARD | UK | 12348943
CR | 3000 | HSBC | SAL | UK | 23954804
DR | 500 | Airplane Company | CARD | ES | 23452345
DR | 500 | Political Party | DD | EG | 33445566
Controls = Identity + Approved Limited Access to sensitive fields
5. 5
Balancing Data Access and Data Privacy
We add a few views, what’s the challenge?
Views for each role – simple example with 4 views
Role | first2 | last | e-mail | nid | dob | address | occupation
Marketing
Product
Credit
Fin Crime
Others ……
However, there is complexity from many dimensions
Technology, Geography, Business, Regulatory, Data Privacy
Roles: Marketing, Product, Credit, Fin Crime
6. 6
Balancing Data Access and Data Privacy
We need a lot of views, how do we do that?
Types of views: Files (redacted or not), Database views, Context Aware
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11
Cloud helps with all of these, but context aware helps with complexity
7. 7
Balancing Data Access and Data Privacy
If ‘context aware’ views can help, how could they work in theory (and in practice)?
Approach (MVP1.0….)
1. Identify your critical data elements (CDEs)
2. Catalogue / map your data to your CDEs
3. Identify your roles and map them to the CDEs
4. Add context to Roles and Data
5. Create an access control layer (ACL)
6. Access the data only through the ACL
Context
• “What you are allowed to access”
• Organisational level
• Identifies data sets and roles
• E.g. UK Analysts see UK data
Content
• “What you are allowed to see”
• Data set level
• Your role can see these fields
• E.g. UK Analysts see name, DoB, ….
8. 8
Balancing Data Access and Data Privacy
Practical steps towards context aware views (1)
1. Identify CDEs
2. Map data to CDEs
3. Identify roles and privileges
4. Add context to Roles and Data
5. Create access control layer (ACL)
6. Call the ACL
7. Return the data
Critical Data Element
Name
e-mail
National ID
Date of Birth
Address
Occupation
…..
Non Critical Data Elements
Shoe size
Comments
Country of residence
Everything else
…
Role | Allowed to see
Marketing | Name
Marketing | Address
Financial Crime | Name
Financial Crime | e-mail
Financial Crime | National ID
Financial Crime | Date of Birth
Financial Crime | Address
Financial Crime | Occupation
Credit Analyst | Address
Credit Analyst | Occupation
Product | e-mail
Data Identifier | Is CDE | CDE Type
DB.TABLE.COLUMN | Y/N | List
UK.CUSTDB.FNAME | Y | Name
UK.CUSTDB.EMAIL1 | Y | e-mail
UK.CUSTDB.EMAIL2 | Y | e-mail
UK.CUSTDB.DOB | Y | Date of Birth
UK.CUSTDB.PADDR | Y | Address
UK.CUSTDB.OCC | Y | Occupation
UK.CUSTDB.Preference | N | N/A
UK.CUSTDB.INTERESTS | N | N/A
UK.CUSTDB.FOOD | N | N/A
…….
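The first three steps can be sketched in a few lines of Python. This is only an illustration, not the speaker's implementation: the dictionary names DATA_DICTIONARY and ROLE_PRIVILEGES and the function visible_columns are hypothetical, and the entries are copied from the slide's tables. The assumed rule is that a role may see every non-critical column, plus any column whose CDE type appears in that role's privileges.

```python
# Hypothetical data dictionary: column identifier -> CDE type.
# None marks a column that is not a critical data element.
DATA_DICTIONARY = {
    "UK.CUSTDB.FNAME": "Name",
    "UK.CUSTDB.EMAIL1": "e-mail",
    "UK.CUSTDB.EMAIL2": "e-mail",
    "UK.CUSTDB.DOB": "Date of Birth",
    "UK.CUSTDB.PADDR": "Address",
    "UK.CUSTDB.OCC": "Occupation",
    "UK.CUSTDB.Preference": None,
    "UK.CUSTDB.INTERESTS": None,
    "UK.CUSTDB.FOOD": None,
}

# Hypothetical role privileges: role -> CDE types the role is allowed to see.
ROLE_PRIVILEGES = {
    "Marketing": {"Name", "Address"},
    "Financial Crime": {
        "Name", "e-mail", "National ID",
        "Date of Birth", "Address", "Occupation",
    },
    "Credit Analyst": {"Address", "Occupation"},
    "Product": {"e-mail"},
}


def visible_columns(role):
    """Return the columns a role may see unredacted (assumed rule, see lead-in)."""
    allowed = ROLE_PRIVILEGES.get(role, set())
    return [
        column
        for column, cde_type in DATA_DICTIONARY.items()
        if cde_type is None or cde_type in allowed
    ]
```

Because privileges are expressed against CDE types rather than physical columns, a new e-mail column (such as EMAIL2) only needs a dictionary entry; no role definition changes.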
9. 9
Balancing Data Access and Data Privacy
Practical steps towards context aware views (2)
1. Identify CDE
2. Identify roles and privileges
3. Map data to CDEs
4. Add context to Roles and Data
5. Create access control layer (ACL)
6. Call the ACL
7. Return the data
This context helps you with decisions for access / no access
Entity Key | Entity Value | Element Key | Element Value
Role | Marketing | Allowed Countries | UK
Role | Marketing | Allowed Countries | US
Role | Marketing_HK | Allowed Countries | HK
Role | Marketing | Member | Bob Smith
Person | Bob Smith | Home Country | UK
DataSet | UK Customers | Data Owner | UK
DataSet | US Customers | Data Owner | US
DataSet | HK Customers | Data Owner | HK
A) Marketing role can access UK & US. HK Customers is ‘HK’ owned. No deal.
B) Bob Smith is in Marketing. Marketing is allowed to see UK data. UK Customers? Deal.
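The two decisions A) and B) can be reproduced with a small lookup over the context table. The sketch below is an assumption about how such a check might be written, not the talk's implementation: the Fact type and the helpers values, roles_of and can_access are hypothetical names, and the rule (a person may access a data set when one of their roles lists the data set's owning country among its allowed countries) is inferred from the two examples on the slide.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One row of the context table (hypothetical type name)."""
    entity_key: str
    entity_value: str
    element_key: str
    element_value: str


# Rows copied from the slide's context table.
CONTEXT = [
    Fact("Role", "Marketing", "Allowed Countries", "UK"),
    Fact("Role", "Marketing", "Allowed Countries", "US"),
    Fact("Role", "Marketing_HK", "Allowed Countries", "HK"),
    Fact("Role", "Marketing", "Member", "Bob Smith"),
    Fact("Person", "Bob Smith", "Home Country", "UK"),
    Fact("DataSet", "UK Customers", "Data Owner", "UK"),
    Fact("DataSet", "US Customers", "Data Owner", "US"),
    Fact("DataSet", "HK Customers", "Data Owner", "HK"),
]


def values(entity_key, entity_value, element_key):
    """All element values recorded for one entity and element key."""
    wanted = (entity_key, entity_value, element_key)
    return {
        f.element_value
        for f in CONTEXT
        if (f.entity_key, f.entity_value, f.element_key) == wanted
    }


def roles_of(person):
    """Roles that list the person as a member."""
    return {
        f.entity_value
        for f in CONTEXT
        if f.entity_key == "Role"
        and f.element_key == "Member"
        and f.element_value == person
    }


def can_access(person, dataset):
    """Assumed rule: the data owner must be an allowed country of one of the person's roles."""
    owners = values("DataSet", dataset, "Data Owner")
    return any(
        owners & values("Role", role, "Allowed Countries")
        for role in roles_of(person)
    )
```

With this table, can_access("Bob Smith", "UK Customers") is True (case B) and can_access("Bob Smith", "HK Customers") is False (case A). Adding a new country or role is a matter of adding rows, not writing new views.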
10. 10
Balancing Data Access and Data Privacy
A simple access control layer
1. Identify CDE
2. Identify roles and privileges
3. Map data to CDEs
4. Add context to Roles and Data
5. Create access control layer (ACL)
6. Call the ACL
7. Return the data
Flow (steps 5 to 7, using Context and Content):
Request Data → Access context and data dictionary
CDE? N → Return field
CDE? Y → Allowed? Y → Return field
CDE? Y → Allowed? N → Redact field → Return redacted field
Data returned
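The decision flow on this slide maps naturally onto a per-field loop. The following is a minimal sketch under stated assumptions: the names CDE_TYPES, ROLE_PRIVILEGES, REDACTED and acl_fetch are hypothetical, the field-to-CDE mapping is inferred from the column names used elsewhere in the deck, and the "comments" value in the sample record is illustrative only (it stands in for a non-critical field).

```python
REDACTED = "REDACT"

# Hypothetical mapping from record field to CDE type.
# Fields that are absent here are treated as non-critical.
CDE_TYPES = {
    "first": "Name",
    "last": "Name",
    "e-mail": "e-mail",
    "nid": "National ID",
    "dob": "Date of Birth",
    "address": "Address",
    "occupation": "Occupation",
}

# Role -> CDE types the role may see (from the role table on the earlier slide).
ROLE_PRIVILEGES = {
    "Marketing": {"Name", "Address"},
    "Financial Crime": {
        "Name", "e-mail", "National ID",
        "Date of Birth", "Address", "Occupation",
    },
    "Credit Analyst": {"Address", "Occupation"},
    "Product": {"e-mail"},
}


def acl_fetch(role, record):
    """Return a copy of the record with disallowed CDE fields redacted."""
    allowed = ROLE_PRIVILEGES.get(role, set())
    result = {}
    for field, value in record.items():
        cde_type = CDE_TYPES.get(field)
        if cde_type is None:
            result[field] = value       # CDE? N -> return field
        elif cde_type in allowed:
            result[field] = value       # CDE? Y, Allowed? Y -> return field
        else:
            result[field] = REDACTED    # CDE? Y, Allowed? N -> redact field
    return result


# Sample record from the Customers table; "comments" is an illustrative
# non-critical field that does not appear on the slide.
RECORD = {
    "first": "Bob",
    "last": "Smith",
    "e-mail": "bob@smith.com",
    "nid": "UK-151",
    "dob": "23-Sep-67",
    "address": "999 Letsby Avenue",
    "occupation": "Policeman",
    "comments": "(illustrative)",
}
```

Calling acl_fetch("Credit Analyst", RECORD) keeps address, occupation and comments and replaces every other field with REDACT. An unknown role gets an empty privilege set, so it sees only non-critical fields.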
11. 11
Balancing Data Access and Data Privacy
What would the data returned look like to different roles?
Role | first2 | last | e-mail | nid | dob | address | occupation
Marketing | Bob | Smith | XXX@XXX.XXX | REDACT | 01-Jan-00 | 999 Letsby Avenue | REDACT
Marketing | Iva | House | XXX@XXX.XXX | REDACT | 01-Jan-00 | 23b Maddup Avenue | REDACT
Marketing | S | Holmes | XXX@XXX.XXX | REDACT | 01-Jan-00 | 221B Baker Street | REDACT
Product | REDACT | REDACT | bob@smith.com | REDACT | 01-Jan-00 | REDACT | REDACT
Product | REDACT | REDACT | iva@house.com | REDACT | 01-Jan-00 | REDACT | REDACT
Product | REDACT | REDACT | sherlock@d.com | REDACT | 01-Jan-00 | REDACT | REDACT
Credit | REDACT | REDACT | XXX@XXX.XXX | REDACT | 01-Jan-00 | 999 Letsby Avenue | Policeman
Credit | REDACT | REDACT | XXX@XXX.XXX | REDACT | 01-Jan-00 | 23b Maddup Avenue | Homeowner
Credit | REDACT | REDACT | XXX@XXX.XXX | REDACT | 01-Jan-00 | 221B Baker Street | Detective
Fin Crime | Bob | Smith | bob@smith.com | UK-151 | 23-Sep-67 | 999 Letsby Avenue | Policeman
Fin Crime | Iva | House | iva@house.com | UK-23B | 07-Nov-74 | 23b Maddup Avenue | Homeowner
Fin Crime | S | Holmes | sherlock@d.com | UK-221B | 06-Jan-54 | 221B Baker Street | Detective
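In the example output, redacted e-mail and date-of-birth fields keep the shape of the original value (XXX@XXX.XXX, 01-Jan-00) while other fields become REDACT, presumably so that downstream tools expecting an address or a date still receive one. A possible way to express this is a per-field mask table; the names MASKS, DEFAULT_MASK and mask_row below are hypothetical, and the mask values are the ones shown on the slide.

```python
# Per-field placeholders that keep the original format (values from the slide).
MASKS = {
    "e-mail": "XXX@XXX.XXX",
    "dob": "01-Jan-00",
}
DEFAULT_MASK = "REDACT"


def mask_row(row, visible_fields):
    """Keep visible fields as-is; replace the rest with a format-preserving mask."""
    return {
        field: value if field in visible_fields else MASKS.get(field, DEFAULT_MASK)
        for field, value in row.items()
    }


# Customer row from the slide.
BOB = {
    "first": "Bob",
    "last": "Smith",
    "e-mail": "bob@smith.com",
    "nid": "UK-151",
    "dob": "23-Sep-67",
    "address": "999 Letsby Avenue",
    "occupation": "Policeman",
}

# Fields assumed visible to each role, following the example output.
marketing_view = mask_row(BOB, {"first", "last", "address"})
product_view = mask_row(BOB, {"e-mail"})
```

Here marketing_view matches the first Marketing row of the example output and product_view matches the first Product row.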
12. 12
Balancing Data Access and Data Privacy
Wrap up and thoughts for the audience
Summary
• We want to improve customer service through great analytics
• But we need to ensure we have appropriate controls
• Views help with this, but traditional approaches create lots of complexity and admin
• 'Context aware views' are one way of managing this complexity and reducing admin
• Considering data context & data content enables granular roles
• An access layer that brokers data request & response provides gateway control
• Solutions can be simplified (greatly) with consideration of different environments
Takeaway thoughts
• Do you need this?
• How would you implement this?
• What contexts are important to you?
• How would this apply to streaming in Prod?