NYOUG - New York Oracle Users Group:
- Risks Associated with Cloud Computing
- Data Tokens in a Cloud Environment
- Data Tokenization at the Gateway Layer
- Data Tokenization at the Database Layer
- Risk Management and PCI
Securing data today and in the future - Oracle NYC
1. Securing Data Today and in the Future
Ulf Mattsson
CTO Protegrity
ulf . mattsson [at] protegrity . com
2. Ulf Mattsson
20 years with IBM Development & Global Services
Inventor of 22 patents – Encryption and Tokenization
Co-founder of Protegrity (Data Security)
Research member of the International Federation for Information Processing (IFIP) WG 11.3 Data and Application Security
Member of
• Cloud Security Alliance (CSA)
• PCI Security Standards Council (PCI SSC)
• American National Standards Institute (ANSI) X9
• Information Systems Security Association (ISSA)
• Information Systems Audit and Control Association (ISACA)
5. Best Source of Incident Data
“It is fascinating that the top threat events
in both 2010 and 2011 are the same
and involve external agents hacking and installing malware
to compromise the confidentiality and integrity of servers.”
Source: 2011 Data Breach Investigations Report, Verizon Business RISK team
Source: Securosis, http://securosis.com/
6. Data Breaches – Mainly Online Data Records
900+ breaches
900+ million compromised records
Source: 2010 Data Breach Investigations Report, Verizon Business RISK team and USSS
7. Compromised Data Types - # Records
Payment card data
Personal information
Usernames, passwords
Intellectual property
Bank account data
Medical records
Classified information
System information
Sensitive organizational data
[Bar chart: percentage of compromised records by data type]
Source: Data Breach Investigations Report, Verizon Business RISK team and USSS
8. Industry Groups Represented - # Breaches
Hospitality
Retail
Financial Services
Government
Tech Services
Manufacturing
Transportation
Media
Healthcare
Business Services
[Bar chart: percentage of breaches by industry group]
Source: Data Breach Investigations Report, Verizon Business RISK team and USSS
9. Breach Discovery Methods - # Breaches
Third party fraud detection
Notified by law enforcement
Reported by customer/partner…
Unusual system behavior
Reported by employee
Internal security audit or scan
Internal fraud detection
Brag or blackmail by perpetrator
Third party monitoring service
[Bar chart: percentage of breaches by discovery method]
Source: Data Breach Investigations Report, Verizon Business RISK team and USSS
11. Example of How the Problem is Occurring – PCI DSS
[Diagram] Data is encrypted with SSL as it crosses public networks and encrypted again at rest in the OS file system and storage system (both required by PCI DSS), but it flows as clear text through the application and database inside the private network, and that clear-text gap is where the attacker strikes.
Source: PCI Security Standards Council, 2011
12. PCI DSS - Ways to Render the PAN* Unreadable
• Two-way cryptography with associated key-management processes
• One-way cryptographic hash functions
• Index tokens and pads
• Truncation (or masking – xxxxxx xxxxxx 6781)
* PAN: Primary Account Number (Credit Card Number)
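A minimal Python sketch of three of these methods applied to a sample PAN. The hash salt, the masking format, and the token layout are illustrative assumptions only, not PCI DSS guidance:

    import hashlib
    import secrets

    pan = "4000001234567899"  # sample 16-digit PAN

    # One-way cryptographic hash, salted so equal PANs do not yield equal digests
    salt = secrets.token_bytes(16)
    hashed = hashlib.sha256(salt + pan.encode()).hexdigest()

    # Truncation / masking: retain only the last four digits
    masked = "xxxxxx xxxxxx " + pan[-4:]

    # Index token: a random surrogate value; a real token server would also store
    # the PAN-to-token mapping in a hardened vault
    token = "".join(secrets.choice("0123456789") for _ in range(len(pan)))

    print(hashed)
    print(masked)
    print(token)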
13. Protecting the Data Flow - Example
[Diagram] Enforcement points placed along the data flow turn unprotected sensitive information into protected sensitive information.
17. Positioning Different Protection Options
Evaluation criteria, each rated from Best to Worst for Strong Encryption, Formatted Encryption, and Data Tokens:
• Security & Compliance
• Total Cost of Ownership
• Use of Encoded Data
18. Securing Data Fields – Impact of Different Methods
Methods ordered by intrusiveness to applications and databases, with sample output for the original value 123456 123456 1234:
• Hashing: !@#$%a^///&*B()..,,,gft_+!@4#$2%p^&*
• Strong (standard) encryption: !@#$%a^.,mhu7/////&*B()_+!@
• Tokenizing/encoding or formatted encryption, alpha: aVdSaH 1F4hJ 1D3a
• Tokenizing/encoding or formatted encryption, numeric: 666666 777777 8888
• Partial protection: 123456 777777 1234
• Clear text (original data): 123456 123456 1234
Output length grows from the original length toward longer as the method becomes more intrusive.
21. Hiding Data in Plain Sight – Data Tokenization
[Diagram] Where encryption produces an obviously scrambled value such as Y&SFD%))S(, the tokenization gateway replaces the card number 4000 0012 3456 7899 with the data token 40 12 3456 7890 7899 before it reaches the application and database in the cloud environment, so the protected value hides in plain sight as ordinary-looking data.
Unprotected sensitive information becomes protected sensitive information at the gateway.
22. Token Flexibility for Different Categories of Data
Type of Data     Input                     Token                        Token Properties / Comment
Credit Card      3872 3789 1620 3675       8278 2789 2990 2789          Numeric
Medical ID       29M2009ID                 497HF390D                    Alpha-numeric
Date             10/30/1955                12/25/2034                   Date
E-mail Address   bob.hope@protegrity.com   empo.snaugs@svtiensnni.snk   Alphanumeric, delimiters in input preserved
SSN              075-67-2278               287-38-2567                  Numeric, delimiters in input preserved
Credit Card      3872 3789 1620 3675       8278 2789 2990 3675          Numeric, last 4 digits exposed (policy masking)
Credit Card      3872 3789 1620 3675       3872 37## #### ####          Presentation mask: expose first 6 digits (clear, encrypted, or tokenized at rest)
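A minimal Python sketch of the format-preserving behavior shown above: digits map to digits, letters to alphanumerics, delimiters are kept, and a trailing group can be left exposed. It only illustrates output shape; a real token server would also persist the value-to-token mapping. Function and constant names are assumptions:

    import secrets
    import string

    DIGITS = string.digits
    ALNUM = string.ascii_uppercase + string.digits

    def tokenize(value, keep_last=0):
        """Replace each character with a random one of the same class,
        preserving delimiters and optionally exposing the last keep_last characters."""
        out = []
        for i, ch in enumerate(value):
            if len(value) - i <= keep_last or not ch.isalnum():
                out.append(ch)                      # exposed tail or delimiter kept
            elif ch.isdigit():
                out.append(secrets.choice(DIGITS))  # numeric stays numeric
            else:
                out.append(secrets.choice(ALNUM))   # alpha stays alphanumeric
        return "".join(out)

    print(tokenize("3872 3789 1620 3675", keep_last=4))  # card, last 4 digits exposed
    print(tokenize("075-67-2278"))                        # SSN, delimiters preserved
    print(tokenize("29M2009ID"))                          # alphanumeric medical ID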
23. Example: HIPAA – 18 Direct Identifiers
1. Names
2. Geographic subdivisions smaller than a state (street address, city, county, ZIP code, etc.)
3. All elements of dates (e.g., date of birth, admission)
4. Telephone numbers
5. Fax numbers
6. E-mail addresses
7. Social Security numbers
8. Medical record numbers
9. Health plan beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers, including license plate numbers
13. Device identifiers and serial numbers
14. Web universal resource locators (URLs)
15. IP address numbers
16. Biometric identifiers, including fingerprints and voice prints
17. Full-face photographic images and any comparable images
18. Other unique identifying numbers, characteristics or codes
24. Visa Best Practices for Tokenization Version 1
Published July 14, 2010.
Token generation methods, by token type (single-use and multi-use tokens):
• Algorithm and key (reversible): a known strong, NIST-approved algorithm
• Unique sequence number (irreversible)
• One-way hash function (irreversible): secret per transaction for single-use tokens, secret per merchant for multi-use tokens
• Randomly generated value (irreversible)
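A hedged Python sketch of the two hash-based irreversible options: a per-merchant secret yields a consistent multi-use token, while a per-transaction secret yields a single-use token. The HMAC-SHA-256 construction and variable names are assumptions for illustration, not the Visa specification itself:

    import hashlib
    import hmac
    import secrets

    pan = b"4000001234567899"

    # Multi-use token: HMAC of the PAN under a per-merchant secret, so the same
    # card maps to the same token for that merchant across transactions.
    merchant_secret = secrets.token_bytes(32)
    multi_use_token = hmac.new(merchant_secret, pan, hashlib.sha256).hexdigest()

    # Single-use token: a per-transaction secret (or simply a randomly generated
    # value) produces a different token for every transaction.
    transaction_secret = secrets.token_bytes(32)
    single_use_token = hmac.new(transaction_secret, pan, hashlib.sha256).hexdigest()

    print(multi_use_token)
    print(single_use_token)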
25. Tokenization Use Case Example
A leading retail chain
• 1500 locations in the U.S. market
Simplify PCI Compliance
• 98% of Use Cases out of audit scope
• Ease of install (had 18 PCI initiatives at one time)
Tokenization solution was implemented in 2 weeks
• Reduced PCI Audit from 7 months to 3 months
• No 3rd Party code modifications
• Proved to be the best performance option
• 700,000 transactions per day
• 50 million card holder data records
• Conversion took 90 minutes (plan was 30 days)
• Next step – tokenization server at 1500 locations
26. Different Approaches for Tokenization
Traditional Tokenization
• Dynamic model or pre-generated model
• 5 to 5,000 tokenizations per second
Next Generation Tokenization
• Memory tokenization
• 200,000 to 9,000,000+ tokenizations per second
• "The tokenization scheme offers excellent security, since it is based on fully randomized tables." *
• "This is a fully distributed tokenization approach with no need for synchronization and there is no risk for collisions." *
*: Prof. Dr. Ir. Bart Preneel, Katholieke Universiteit Leuven, Belgium
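A toy Python stand-in for the randomized-table idea quoted above: static, randomly generated per-position substitution tables served purely from memory, with no vault lookups and nothing to synchronize between servers. Real memory tokenization uses far larger tables and chained lookups; the 2-digit head / 4-digit tail retention here is only an assumption for the example:

    import secrets

    # One static, randomly generated digit permutation per position; lookups are
    # pure in-memory reads, so there are no database round trips and the
    # per-position permutation cannot collide.
    _rand = secrets.SystemRandom()
    TABLES = [_rand.sample(range(10), 10) for _ in range(16)]

    def tokenize(pan: str) -> str:
        """Keep the first two and last four digits, substitute the rest per position."""
        head, middle, tail = pan[:2], pan[2:-4], pan[-4:]
        replaced = "".join(str(TABLES[i][int(d)]) for i, d in enumerate(middle))
        return head + replaced + tail

    print(tokenize("4000001234567899"))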
27. Tokenization Summary
Footprint
• Traditional Tokenization: Large, expanding. The large and expanding footprint of traditional tokenization is its Achilles heel: it is the source of poor performance, poor scalability, and limitations on expanded use.
• Memory Tokenization: Small, static. The small, static footprint is the enabling factor that delivers extreme performance, scalability, and expanded use.
High Availability, DR, and Distribution
• Traditional: Complex replication required. Deploying more than one token server for high availability or scalability requires complex and expensive replication or synchronization between the servers.
• Memory: No replication required. Any number of token servers can be deployed without replication or synchronization between them, delivering a simple, elegant, yet powerful solution.
Reliability
• Traditional: Prone to collisions. The synchronization and replication required to support many deployed token servers is prone to collisions, which severely limits the usability of traditional tokenization.
• Memory: No collisions. Because memory tokenization needs no replication or synchronization, the potential for collisions is eliminated.
Performance, Latency, and Scalability
• Traditional: Adversely impacts performance and scalability. The large footprint severely limits the ability to place the token server close to the data; the resulting distance creates latency that degrades performance and scalability to the point that some use cases are not possible.
• Memory: Little or no latency; the fastest tokenization in the industry. The small footprint lets the token server be placed close to the data to reduce latency, and when placed in memory it eliminates latency entirely.
Extendibility
• Traditional: Practically impossible. Given the issues inherent in tokenizing even a single data category, tokenizing more data categories may be impractical.
• Memory: Unlimited tokenization capability. Many data categories can be tokenized with minimal or no impact on footprint or performance.
30. Risks Associated with Cloud Computing
• Handing over sensitive data to a third party
• Threat of data breach or loss
• Weakening of corporate network security
• Uptime / business continuity
• Financial strength of the cloud computing provider
• Inability to customize applications
[Bar chart: percentage of respondents citing each risk]
Source: The Evolving Role of IT Managers and CIOs: Findings from the 2010 IBM Global IT Risk Study
31. Amazon Cloud & PCI DSS
Just because AWS is certified doesn't mean you are
• You still need to deploy a PCI compliant application/service and
anything on AWS is still within your assessment scope
PCI-DSS 2.0 doesn't address multi-tenancy concerns
You can store PAN data on S3, but it still needs to be
encrypted in accordance with PCI-DSS requirements
• Amazon doesn't do this for you
• You need to implement key management, rotation, logging, etc.
If you deploy a server instance in EC2 it still needs to be
assessed by your QSA (PCI auditor)
• Organization's assessment scope isn't necessarily reduced
Tokenization can reduce your handling of PAN data
Source: Securosis, http://securosis.com/
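A minimal sketch of that last encryption point, assuming the boto3 and cryptography libraries: the application encrypts the PAN itself before calling S3, since Amazon will not do it for you. The bucket name and object key are placeholders, and the snippet says nothing about whether the surrounding key management would satisfy a QSA:

    import boto3
    from cryptography.fernet import Fernet

    # Encrypt before upload; in practice the data key would be generated, stored,
    # rotated, and logged by your own key-management system, not left in code.
    data_key = Fernet.generate_key()
    ciphertext = Fernet(data_key).encrypt(b"4000001234567899")

    boto3.client("s3").put_object(
        Bucket="example-cardholder-bucket",  # hypothetical bucket
        Key="pan/0001",                      # hypothetical object key
        Body=ciphertext,
    )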
33. “Pass Security Before Entering The Cloud”
[Diagram] The user's sensitive data (123456 123456 1234) passes through a security check point that replaces it with secured data (123456 999999 1234) before anything enters the cloud, so only protected information crosses the boundary.
34. Data Tokens in a Cloud Environment – Integration Example
[Diagram] A tokenization gateway in front of the cloud environment converts the Social Security number 990-23-1013 and the card number 4000 0012 3456 7899 into the data tokens 123-45-1013 and 40 12 3456 7890 7899, so the applications and databases inside the cloud hold only protected information.
35. Data Tokens in a Cloud Environment – Integration Example
[Diagram] A security administrator manages tokenization gateways placed between the users and the cloud environment, so the applications and databases inside the cloud see only data tokens rather than unprotected sensitive information.
36. Data Tokenization at the Gateway Layer
[Diagram] Users work through applications outside the cloud; a tokenization gateway sits between those applications and the cloud environment, so the databases behind it hold only data tokens.
37. Data Tokenization at the Gateway Layer
[Diagram] A variant of the previous layout: the tokenization gateway again sits between the applications and the cloud environment, and the databases behind it store only data tokens.
38. Data Tokenization at the Application Layer
[Diagram] Tokenization enforced in the application: a security administrator manages a token server that the application calls, so sensitive values are converted to data tokens before they are written to the database in the cloud.
39. Data Tokenization at the Database Layer
[Diagram] The same components with enforcement moved to the database layer: the database calls the token server (managed by the security administrator), so sensitive values are stored in the cloud only as data tokens.
40. Securing Encryption Keys
[Diagram] "An entity that uses a given key should not be the entity that stores that key." Users consume SaaS, PaaS, and IaaS services in the cloud, while encryption key administration and storage remain outside the cloud provider.
Source: http://csrc.nist.gov/groups/SNS/cloud-computing/
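An illustrative Python sketch of that separation-of-duties principle, assuming the cryptography library: the cloud application keeps only a wrapped (encrypted) data key, while the key-encryption key lives with a separate key manager. The names are assumptions for the example:

    from cryptography.fernet import Fernet

    # Held by the external key manager, never by the cloud application:
    key_encryption_key = Fernet.generate_key()

    # Generated by the application, used once to encrypt data, then kept only wrapped:
    data_key = Fernet.generate_key()
    wrapped_data_key = Fernet(key_encryption_key).encrypt(data_key)

    ciphertext = Fernet(data_key).encrypt(b"sensitive record")
    del data_key  # the application retains only ciphertext + wrapped_data_key

    # To decrypt, the application asks the key manager to unwrap the data key:
    unwrapped = Fernet(key_encryption_key).decrypt(wrapped_data_key)
    plaintext = Fernet(unwrapped).decrypt(ciphertext)
    print(plaintext)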
42. Risk Management and PCI – Security Aspects
Different data security methods and algorithms; policy enforcement implemented at different system layers.
Data security methods (columns, each rated from Best to Worst): Hashing, Formatted Encryption, Strong Encryption, Data Tokenization
System layers (rows): Application, Database Column, Database File, Storage Device
43. Risk Management and PCI – Security Aspects
Integration at different system layers with different data security methods and algorithms.
Data security methods (columns, each rated from Best to Worst, with some combinations not applicable): Hashing, Formatted Encryption, Strong Encryption, Data Tokenization
System layers (rows): Application, Database Column, Database File, Storage Device
44. Evaluating Field Encryption & Tokenization
Evaluation criteria, each rated from Best to Worst for Strong Field Encryption, Formatted Encryption, and Tokenization (distributed):
• Disconnected environments
• Distributed environments
• Performance impact when loading data
• Transparent to applications
• Expanded storage size
• Transparent to database schema
• Long life-cycle data
• Unix or Windows mixed with "big iron" (EBCDIC)
• Easy re-keying of data in a data flow
• High-risk data
• Security – compliance with PCI, NIST
45. Vendors/Products Providing Database Protection
Features, each rated from Best to Worst across 3rd-party products, Oracle 9, Oracle 10, Oracle 11, IBM DB2, and MS SQL:
• Database file encryption
• Database column encryption (column encryption adds 32-52 bytes in 10.2.0.4 and 11.1.0.7)
• Formatted encryption
• Data tokenization
• Database activity monitoring
• Multi-vendor encryption
• Data masking
• Central key management
• HSM support (11.1.0.7)
• Re-key support (tablespace)
46. Column Encryption Solutions – Some Considerations
Areas of evaluation, each rated from Best to Worst for 3rd-party solutions, Oracle 10 TDE, and Oracle 11 TDE:
• Performance; management of UDTs or views/triggers
• Support for both encryption and replication
• Support for Oracle Domain Index for fast search
• Keys are local; re-encryption required if data moves from A to B
• Separation of duties / key control vector
• Encryption format specified
• Data type support
• Index support beyond equality comparison
• HSM (hardware crypto) support (11.1.0.6)
• HSM password not stored in a file
• Automated and secure master key backup procedure
• Keys exportable
47. Choose Your Defenses – Cost Effective PCI DSS
• Firewalls
• Encryption/tokenization for data at rest
• Anti-virus & anti-malware solutions
• Encryption for data in motion
• Access governance systems
• Identity & access management systems
• Correlation or event management systems
• Web application firewalls (WAF)
• Endpoint encryption solutions
• Data loss prevention systems (DLP)
• Intrusion detection or prevention systems
• Database scanning and monitoring (DAM)
• ID & credentialing systems
[Bar chart: percentage of respondents deploying each defense, with encryption/tokenization highlighted]
Source: 2009 PCI DSS Compliance Survey, Ponemon Institute
48. Deploy Defenses
Matching Data Protection Solutions with Risk Level
Risk levels and solutions:
• Low Risk (1-5): monitor
• At Risk (6-15): monitor, mask, access control limits, format control encryption
• High Risk (16-25): replacement, strong encryption
Data field risk scores:
• Credit Card Number: 25
• Social Security Number: 20
• CVV: 20
• Customer Name: 12
• Secret Formula: 10
• Employee Name: 9
• Employee Health Record: 6
• Zip Code: 3
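A small Python sketch that encodes the mapping above; the scores and thresholds come from the slide, while the function and dictionary names are assumptions:

    # Field risk scores from the slide (scale 1-25)
    RISK_SCORES = {
        "Credit Card Number": 25, "Social Security Number": 20, "CVV": 20,
        "Customer Name": 12, "Secret Formula": 10, "Employee Name": 9,
        "Employee Health Record": 6, "Zip Code": 3,
    }

    def recommended_protection(score: int) -> str:
        """Map a risk score to the slide's recommended protection level."""
        if score >= 16:
            return "replacement, strong encryption"      # High Risk (16-25)
        if score >= 6:
            return "monitor, mask, access control limits, format control encryption"  # At Risk (6-15)
        return "monitor"                                  # Low Risk (1-5)

    for field, score in RISK_SCORES.items():
        print(f"{field} ({score}): {recommended_protection(score)}")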
49. Choose Your Defenses – Total Cost of Ownership
[Chart] Two curves are plotted against risk level: the cost of aversion (protection of data), which is highest with strong protection, and the expected losses from the risk, which are highest with weak protection. Their sum is the total cost, and the optimal risk level sits at the minimum of that total-cost curve, between strong and weak protection.
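A worked toy example of that curve: total cost is protection cost plus expected loss, and the optimum is the risk level that minimizes the sum. The figures below are invented purely to show the shape of the trade-off, not taken from the presentation:

    # Hypothetical cost figures for three protection levels
    cost_of_protection = {"strong": 90, "medium": 40, "weak": 10}
    expected_losses    = {"strong": 10, "medium": 30, "weak": 100}

    # Total cost per protection level, and the level that minimizes it
    total = {p: cost_of_protection[p] + expected_losses[p] for p in cost_of_protection}
    optimal = min(total, key=total.get)
    print(total, "optimal:", optimal)  # with this toy data, "medium" minimizes total cost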
50. Best Practices - Data Security Management
[Diagram] An Enterprise Data Security Administrator centrally manages the security policy and audit log for the protection services (encryption and tokenization) deployed across the environment: a file system protector, a database protector, an application protector, a tokenization server, and a secure archive.
51. About Protegrity
Proven enterprise data security software and innovation leader
• Sole focus on the protection of data
• Patented Technology, Continuing to Drive Innovation
Growth driven by compliance and risk management
• PCI (Payment Card Industry)
• PII (Personally Identifiable Information)
• PHI (Protected Health Information) – HIPAA
• State and Foreign Privacy Laws, Breach Notification Laws
• High Cost of Information Breach ($4.8m average cost), immeasurable costs of brand damage and loss of customers
• Requirements to eliminate the threat of data breach and non-compliance
Cross-industry applicability
• Retail, Hospitality, Travel and Transportation
• Financial Services, Insurance, Banking
• Healthcare
• Telecommunications, Media and Entertainment
• Manufacturing and Government
52. Please contact me for more information
Ulf Mattsson, CTO Protegrity
Ulf . Mattsson [at] protegrity . com