One problem that every information security organization faces is how to accurately quantify the risks that they manage. In most cases, there is not enough information available to do this. There is now enough known about data breaches to let us draw interesting conclusions, some of which may even have implications in other areas of information security. This talk describes what we can learn from a careful analysis of the available information on data breaches, how we can extend what we learn about data breaches to other aspects of information security, and why doing this makes sense.
Luther Martin, Chief Security Architect, Voltage Security, Inc.
Luther Martin is the Chief Security Architect at Voltage Security, Inc., a vendor of encryption technology and products. He began his career in information security at the National Security Agency, where he graduated from the NSA's Cryptologic Mathematician Program in 1991, and eventually became the Technical Director of the NSA's Engineering and Physical Sciences Security Division.
After leaving the NSA, he worked at both security consulting firms and product companies. Notable accomplishments during this period include creating the security code review practice at consulting firm Ernst & Young, running the first commercial security code review projects, and creating the public-key infrastructure technology that was used in the U.S. Postal Service's PC Postage program.
He is the author of Introduction to Identity-based Encryption, and has contributed to seven other books and over 100 articles on the topics of information security and risk management.
2. Overview
Attempt at humor
Getting in the right frame of mind to think about statistics
A reminder of some concepts from statistics
What we can learn from data breaches
What this tells us
Some generalizations that might or might not be accurate
5. Estimating some numbers
What’s the probability of an exploitable vulnerability existing in your web server right now?
What’s the probability of your web server being hacked in the next 12 months?
If you don’t encrypt email, what’s the probability of it being intercepted and read on the Internet?
Too hard?
6. Some easier questions
What’s the current mortgage foreclosure rate?
What’s the current fraud loss rate in the US for payment (credit and debit) cards?
What’s the current charge-off rate in the US for credit card loans?
7. The foreclosure rate
Currently about 1 in 381 per month, or about 3 percent per year
http://www.realtytrac.com/trendcenter/
8. Payment card fraud loss rate
http://www.kansascityfed.org/Publicat/Econrev/pdf/10q2Sullivan.pdf
9. The charge-off rate for credit cards
http://www.federalreserve.gov/releases/chargeoff/
10. More about statistics
We described each of these using only one number
An average
That’s not the whole story
The average person has less than 2 legs!
1.99… < 2
Most people have an above-average number of legs!
11. Even more about statistics
It’s often useful to have a second number that tells how much variation we have in our data
Two sets of data can have the same average but be very different
Same mean, different variance
12. The normal distribution
The so-called normal distribution (“bell curve”) appears again and again in statistics
Many things end up with a normal distribution when you might not expect it
13. The Central Limit Theorem
If you add random values together you tend to get a normal distribution
Proof by picture:
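The Central Limit Theorem is also easy to check numerically. This sketch (the choice of 12 uniform values per sum is an arbitrary illustration) adds independent uniform random values and checks that about 68 percent of the sums land within one standard deviation of the mean, as a normal distribution predicts:

```python
import random
import statistics

random.seed(42)

# Sum several independent uniform random values; by the Central Limit
# Theorem the sums should be approximately normally distributed.
N_SUMS = 10_000
TERMS = 12  # number of uniform [0, 1) values added per sample

sums = [sum(random.random() for _ in range(TERMS)) for _ in range(N_SUMS)]

mean = statistics.mean(sums)    # expected: TERMS * 0.5 = 6.0
stdev = statistics.stdev(sums)  # expected: sqrt(TERMS / 12) = 1.0

# Roughly 68% of a normal distribution lies within one standard deviation.
within_1sd = sum(abs(s - mean) <= stdev for s in sums) / N_SUMS
print(round(mean, 2), round(stdev, 2), round(within_1sd, 2))
```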
14. Why a known distribution is useful
If we know that we have data that follows a particular probability distribution we can predict what we’ll see in the future with fairly good accuracy
If you flip a fair coin 100 times then
You’ll get about 50 Heads
There’s about a 73 percent chance of getting 45 to 55 Heads
There’s about a 2 percent chance of getting more than 60 Heads
What this doesn’t do is predict how any particular flip of the coin will turn out
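The coin-flip figures above can be computed exactly from the binomial distribution; a quick sketch:

```python
from math import comb

# Exact binomial probability of getting exactly k heads in n fair-coin flips.
def binom_pmf(n: int, k: int, p: float = 0.5) -> float:
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 100
p_45_to_55 = sum(binom_pmf(n, k) for k in range(45, 56))
p_over_60 = sum(binom_pmf(n, k) for k in range(61, n + 1))

print(f"P(45 to 55 heads) = {p_45_to_55:.3f}")  # about 0.73
print(f"P(more than 60)   = {p_over_60:.3f}")   # about 0.02
```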
15. One more review of math: logarithms
Logarithms are exponents
So if we have these numbers:
10, 100, 1,000, 10,000, …
or 10^1, 10^2, 10^3, 10^4, …
Then their logarithms are
1, 2, 3, 4, …
Note that multiplying corresponds to adding exponents (logs): 10^2 x 10^3 = 10^(2+3) = 10^5
16. Logarithms naturally occur in lots of ways
Human perception of sound (or light) is roughly proportional to the logarithm of the sound level rather than the sound level itself
If you double the sound pressure level it doesn’t double how loud it sounds to us
Instead, double the logarithm of the sound pressure level
That’s why decibels are used to measure sound levels, etc.
So logarithms may be annoying but they’re also useful in some cases
17. Another use for logarithms
Logarithms are also a good way to handle big ranges in numbers
Radio: transmit kilowatts (1,000 Watts), receive milliwatts (0.001 Watts)
Hard to plot big ranges on one graph
Very small numbers look just like zero
Taking logarithms makes a big range easier to handle
3 to -3 instead of 1,000 to 0.001
18. What about data breaches?
The most comprehensive data is that maintained by the Open Security Foundation
www.datalossdb.org
Currently has information on close to 3,000 data breaches
Probably the most useful source of information on data breaches
What patterns can we find in the OSF’s data?
22. The log of breach size matches a normal distribution very well
mean 3.2, standard deviation 1.2
23. What does this tell us?
We may be able to understand the process that leads to data breaches
We may be able to predict some things about future data breaches
We may be able to find a good metric for industry-wide efforts to reduce data breaches
We really need comprehensive data to find patterns that might be there
Very small breaches are as important as very big ones
24. Understanding the process
Just like we get a normal distribution from adding several random values together, we get a lognormal distribution when we multiply several random values together
Multiplying corresponds to adding exponents (logs)
This suggests that what we see for data breaches may be explained by a layered model of security
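This multiplicative picture is easy to simulate. The sketch below multiplies random "layer" factors together (the number of layers and the per-layer factor range are arbitrary illustrative choices, not fitted to any data) and checks that the logarithms of the products behave like a normal distribution:

```python
import math
import random
import statistics

random.seed(1)

# Multiply several independent random "layer" factors together. The logs
# of the factors add, so by the Central Limit Theorem the log of the
# product is roughly normal, i.e. the product itself is roughly lognormal.
N = 10_000
LAYERS = 10

products = []
for _ in range(N):
    x = 1.0
    for _ in range(LAYERS):
        x *= random.uniform(0.5, 2.0)  # hypothetical per-layer factor
    products.append(x)

logs = [math.log10(x) for x in products]

# If the logs are roughly normal, about 68% fall within one st. dev.
mean, stdev = statistics.mean(logs), statistics.stdev(logs)
within = sum(abs(v - mean) <= stdev for v in logs) / N
print(round(within, 2))
```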
26. The general case: if we have
1. The security provided by two technologies when they’re both used is greater than or equal to the security of each of the components when they’re used by themselves
2. If two technologies are independent then the security provided by the two technologies when they’re used together is equal to the sum of the security provided by each of the technologies
3. The security provided by any technology is non-negative
27. It’s more than just data breaches
Note that this model of the effect of bypassing layers of security leading to multiplying the hacker’s success doesn’t just apply to data breaches
It also applies to any other aspect of information security
When we learn how to quantify other types of security incidents we’ll probably find that the damage from them also follows a lognormal distribution
28. Then we have to have…
A measure of security that works that way has to essentially be a logarithm
Measuring security breaches in terms of logarithms may end up making more sense than measuring security breaches directly
We see it with data breaches
We’ll probably see it for other types of losses once we learn how to quantify those losses
29. Does this interpretation make sense?
Other places where the lognormal distribution appears:
The concentration of gold or uranium in ore deposits
The latency period of bacterial food poisoning
The age of the onset of Alzheimer's disease
The amount of air pollution in Los Angeles
The abundance of fish species
The size of ice crystals in ice cream
The number of words spoken in a telephone conversation
The length of sentences written by George Bernard Shaw or Gilbert K. Chesterton
http://stat.ethz.ch/~stahel/lognormal/bioscience.pdf
30. What can we predict?
There’s about a 1 percent chance of any breach exposing 1 million or more records
There’s about a 0.1 percent chance of any breach exposing 10 million or more records
We can expect about 68 percent of breaches to expose between 100 and 25,000 records
We can expect about 95 percent of breaches to expose between 6 and 400,000 records
Etc.
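These predictions follow directly from modeling log10 of breach size as normal with the fitted mean 3.2 and standard deviation 1.2; a sketch verifying the figures with the standard normal CDF:

```python
from math import erf, log10, sqrt

# Model log10(records exposed) as normal with the parameters fitted
# to the OSF data: mean 3.2, standard deviation 1.2.
MU, SIGMA = 3.2, 1.2

def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def prob_at_least(records: float) -> float:
    """P(a breach exposes at least `records` records) under the model."""
    z = (log10(records) - MU) / SIGMA
    return 1.0 - norm_cdf(z)

print(f"P(1M or more records):  {prob_at_least(1e6):.4f}")   # about 0.01
print(f"P(10M or more records): {prob_at_least(1e7):.4f}")   # about 0.001
print(f"68% range: {10**(MU - SIGMA):.0f} to {10**(MU + SIGMA):.0f}")      # ~100 to ~25,000
print(f"95% range: {10**(MU - 2*SIGMA):.0f} to {10**(MU + 2*SIGMA):.0f}")  # ~6 to ~400,000
```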
31. What can we NOT predict?
How many data breaches we should expect to see in the next 12 months
Whether or not any particular business will suffer a data breach in the next 12 months
Whether or not your business will suffer a data breach in the next 12 months
Etc.
32. Other patterns: Benford’s law
Benford’s law tells us that the leading digits in data tend to not be evenly distributed
Probability of leading digit being n is
P(n) = log10(1 + 1/n)
n:    1    2    3    4    5    6    7    8    9
P(n): 0.30 0.18 0.12 0.10 0.08 0.07 0.06 0.05 0.05
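The table comes straight from the formula, using base-10 logarithms; a short sketch:

```python
from math import log10

# Benford's law: probability that the leading digit of a number is n.
benford = {n: log10(1 + 1 / n) for n in range(1, 10)}

for n, p in benford.items():
    print(n, round(p, 2))
# The nine probabilities telescope to log10(10), so they sum to 1.
print(round(sum(benford.values()), 6))  # 1.0
```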
33. Why Benford’s law might make sense
Consider what happens with exponential growth
Start with 1 and multiply by 1.1 at each step:
1, 1.10, 1.21, 1.33, 1.46, 1.61, 1.77, 1.95, 2.14, 2.36, 2.59, 2.85, 3.14, 3.45, 3.80, 4.18, 4.59, 5.05, 5.56, 6.12, 6.73, 7.40, 8.14, 8.95, 9.85, 10.83, …
Note that 1 is the most common leading digit, etc.
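The exponential-growth intuition can be checked numerically. This sketch extends the sequence to many steps by tracking only the fractional part of log10 (which avoids floating-point overflow) and compares the observed leading-digit frequencies with Benford's predictions:

```python
from math import log10

# Leading digits of the exponential sequence 1, 1.1, 1.1**2, … tracked via
# the fractional part of log10 so large exponents never overflow.
N = 10_000
step = log10(1.1)
counts = {d: 0 for d in range(1, 10)}
frac = 0.0
for _ in range(N):
    counts[int(10 ** (frac % 1.0))] += 1  # leading digit of 1.1**k
    frac += step

for d in range(1, 10):
    observed = counts[d] / N
    predicted = log10(1 + 1 / d)
    print(d, round(observed, 3), round(predicted, 3))
```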
34. Benford’s law for breaches
[Bar chart comparing the leading-digit frequencies of the OSF breach data (“OSF Data”) against “Benford’s Law” for digits 1 through 9; frequencies range from 0 to 0.35]
35. Other patterns
There are other patterns that we can find
But they’re really just ways to repackage the exponential growth idea
No real new ideas
Zipf’s law
Pareto’s principle
36. Zipf’s law
Zipf’s law
Order the data from biggest to smallest
Then the total contribution from any entry is inversely proportional to its position in the table
Second entry is about 1/2 of the first one
Third entry is about 1/3 of the first one
The nth entry is about 1/n of the first one
R² = 0.873
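A minimal sketch of the 1/n rule, using a made-up largest entry (the breach size here is purely illustrative, not from the OSF data):

```python
# Zipf's law: the nth-largest entry is about 1/n of the largest one.
largest = 100_000_000  # hypothetical size of the biggest breach in the table

predicted = [largest / rank for rank in range(1, 6)]
for rank, size in enumerate(predicted, start=1):
    print(rank, round(size))
```

Fitting real breach sizes against this 1/n curve is presumably what the R² = 0.873 figure on the slide measures.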
37. Pareto’s principle
Sometimes known as the “80-20 rule”
Very similar to the others that we’ve mentioned
In general, k% of the population accounts for (100 - k)% of something, for some k between 50 and 100
For k = 80 we get the 80-20 rule
Empirically, most data seem to cluster around k being in the middle of this range
It’s yet another power law
38. Bottom line
It certainly looks like it’s possible to find some interesting structure in the data that’s available for data breaches
The size of data breaches seems to follow a very well defined pattern
We may see this same pattern in other parts of information security when we learn how to quantify other types of losses due to security breaches
We need lots of data to see the patterns in it
Data on small breaches is as important as the big ones
39. Practical implications (so what?)
Developing metrics
Developing ROI models
Pricing insurance
Are we winning or are hackers winning?
Any time when quantifying a loss is useful
Etc.
40. Some useful references
The OSF’s data breach database
http://datalossdb.org/
E. Limpert, W. Stahel and M. Abbt, “Lognormal Distributions across the Sciences: Keys and Clues”
http://stat.ethz.ch/~stahel/lognormal/bioscience.pdf
The Voltage corporate blog
http://superconductor.voltage.com
CSO Magazine article on finding patterns in data breaches
http://www.csoonline.com/article/501584/data-breaches-patterns-and-their-implications
Editor’s Notes
ISSA Journal, March 2008, https://www.issa.org/Library/Journals/2008/March/Martin%20-%20The%20Information%20Security%20Life%20Cycle.pdf
Offer free VSN to the person with the best answer
NY 1 in 1,660 (0.7%); CA 1 in 194 (6%); NV 1 in 84 (13%); WY 1 in 2,621 (0.4%)
About $0.09 per $100 in the US; much less in other countries
Historically about 4%; now about 10%
So Lake Wobegon is almost possible – all but one child can be above average
Bell Atlantic and Dean’s troposcatter system
HM Revenue & Customs, National Archives and Records Admin
Frank Benford, “The law of anomalous numbers,” Proceedings of the American Philosophical Society, Vol. 78, pp. 551-72, 1938.