WHY CAN’T WE SEEM TO DO MORE
WITH BIG DATA?
We are living in an age inundated with
information. Our world is increasingly in-
strumented—sensors are collecting data
on everything from hospital patients’ vital
signs, to the moment-by-moment naviga-
tion of commercial aircraft, to consumer
behavior based on buying patterns and
the use of membership cards. Waves of
data are coming from social media sites,
from radio-frequency tracking systems,
from the use of UPC barcodes. Our
modern society is wired for data.
Yet there is a growing belief in both
business and government that we should
be doing far more to take advantage of
this wealth of information. We might
use certain types of information for one
purpose or another, but we nearly always
view big data through multiple stove-
pipes, rather than treating it holistically.
We do not appear to be able to tap the full
potential of all the data available to us.
We have the technical ability. There
have been significant innovations in com-
puter technology in recent years, particu-
larly with the advent of cloud computing.
Yet like the promise of big data, the
promise of the cloud—including unprec-
edented savings, much greater access to
data, and better decision-making—still
seems largely unfulfilled.
What holds us back is not technology,
but a mindset. We are locked into an out-
moded approach to data, one that relies
on techniques created well before big data
arrived on the scene. Those techniques
give us access to only limited slices of in-
formation, and are not designed to easily
connect an analyst with multiple sources
of data. They were sufficient in their day,
but are no longer enough. Ultimately, we
are not doing more with big data because
we do not have complete access to it. We
are never able to use it all at once, and so
we are unable to track overall trends, or
see entire patterns, or ask complex ques-
tions that consider everything we know.
To meet this need—and take full
advantage of both big data and cloud
computing—a new approach has been
invented. Known as the Cloud Analytics
Reference Architecture, it is the result of
an ongoing collaboration between Booz
Allen Hamilton and the U.S. government
to leverage big data to search for terror-
ists and other threats. Intelligence ana-
lysts are now using the Cloud Analytics
Reference Architecture to paint a com-
prehensive picture that incorporates the
full range of intelligence data at once, including
both reports already amassed and those
still arriving from the field. Unlike
conventional techniques, this new ap-
proach makes it possible for analysts to
use all available intelligence data, applying
an expanding set of analytic services to help them gain
critical mission insights.
DELIVERING ON THE PROMISE OF BIG DATA AND THE CLOUD
by
Mark Jacobsohn
Senior Vice President
Booz Allen Hamilton
Joshua Sullivan, PhD
Vice President
Booz Allen Hamilton
©2012 Booz Allen Hamilton Inc. All rights reserved. No part of this document may be reproduced without
prior written permission of Booz Allen Hamilton.
NOVEMBER 2012
The Cloud Analytics Reference Architecture, which
is being adapted to the larger business and govern-
ment communities, removes the traditional con-
straints by bringing together innovations in two areas
of current technology. First, it uses the power of the
cloud to put an organization’s entire storehouse of
data into a common pool, or “data lake,” making all
of it easily accessible for the first time. It then uses
sophisticated computer analytics, such as machine
learning and natural language processing, to help ex-
tract the kind of knowledge and insight that creates
value, guides strategy, and drives business and mis-
sion success. Although the Cloud Analytics Refer-
ence Architecture builds upon current techniques, it is
not an incremental step forward. It is an entirely new
approach—one specifically designed for our new age
of data.
One way to understand how the Reference Archi-
tecture works is to view it in layers (see Figure 1). Its
foundation is the cloud computing and network infra-
structure, which supports the methods by which data is
managed—most notably, the data lake. The data lake, in
turn, supports a two-step process to analyze the data.
In the first step, special tools known as pre-analytics
filter information from the data lake, and give it an un-
derlying organization. That sets the stage for computer
analytics—in the next layer up—to search for valuable
knowledge. These elements support the final phase, the
visualization and interaction, where the human insights
and action take place.
THE POWER OF THE CLOUD ANALYTICS
REFERENCE ARCHITECTURE
The Reference Architecture opens up the enormous
potential of big data by allowing us to search for insight
in new ways. It enables us to look for overarching pat-
terns, and ask intuitive questions of all the data, rather
than limiting us to narrowly defined queries within
data sets. The Reference Architecture allows comput-
ers to take over much of the work humans are doing
now—freeing people to focus on the search for insight.
It makes it possible for non-computer experts, for the
first time, to frame the questions, look for patterns, and
follow hunches.
This is not some kind of magical solution—far from
it. The Reference Architecture is simply a new way of
looking at data, but one that revolutionizes our ability
to gain knowledge and insight. With conventional tech-
niques, the data and analytics are locked into stovepipes,
or silos. We can explore only limited amounts of data
at any one time—and then only with predetermined
questions that have already been built in. The Reference
Architecture removes these constraints by eliminating
the silos, and consolidating all the information in the
data lake. What results is not chaotic or overwhelming.
Rather, the rich diversity of information in the data lake
becomes a powerful force. The data lake is more than
a means of storage; it is a medium expressly designed
to foster connections in data. And the Reference Architecture
explores those connections to search for valuable
correlations and patterns. This actually reduces the
complexity of big data, making it manageable and useful,
and creating efficiencies.
Figure 1. Primary Elements of the Cloud Analytics Reference Architecture
Instead of using data to ask “canned” questions that
test what we may already know, the Reference Architec-
ture uses data to discover new possibilities—solutions
and answers that we have not even considered. The
power of the Reference Architecture is that it constant-
ly evolves and adapts as we search for insight, taking us
beyond the limits of our imagination.
WHAT THE CLOUD ANALYTICS REFERENCE
ARCHITECTURE DOES
The Cloud Analytics Reference Architecture re-
moves the constraints created by data silos. While
the rigid structures used in conventional techniques
provide ease of storage, they carry severe disadvan-
tages. They give us an artificial view of the world based
on data models, rather than on reality and meaning. It
is akin to reading a map through a tube—we can never
immerse ourselves in the diversity of big data, and in-
stead make decisions based on limited and constrained
information. Much of data science in the last ten years
has been devoted to improving access to the silos and
building bridges between them. But that does not solve
the underlying problem—that the data is regimented
and locked in.
Eliminating the need for silos gives us access to all
the data at once—including data from multiple outside
sources. Users no longer need to move from database
to database, pulling out specific information. And, be-
cause there are no data silos, there is no need to build
complex bridges between them.
If we want to know, for example, which parts of
our computer network are most vulnerable to attack in
the next six hours, we can take into account a wide va-
riety of data sources at the same time. We might look at
whether today is a holiday in certain foreign countries,
which means that the young hackers known as “script
kiddies” are more likely to be out of school and so have
time on their hands to launch an attack. If we determine
that a particular group is targeting us, we might
examine how its members are connected, asking whether
they had a common professor at a university and, if
so, what techniques he or she taught. The Reference
Architecture gives us the ability to ask a full suite of
questions rather than a pre-selected few.
The Cloud Analytics Reference Architecture al-
lows us to experiment more with the data. The Ref-
erence Architecture’s flexibility provides a new kind of
freedom—to follow hunches wherever they may lead,
to quickly shift direction to pursue promising avenues
of inquiry, to easily factor in new knowledge and in-
sights as they arise.
With the conventional approach, it is difficult to add
or switch variables that are not already part of a dataset
or database. That typically requires tearing apart and
rebuilding both the structure that the data is in and the
computer analytics that are custom-designed to handle
specific lines of inquiry. The process is expensive and
time-consuming, so we tend to focus
instead on doing better analysis with the limited tools
available on our narrow slices of data.
With the Reference Architecture, we might decide,
in the network security example above, to add new vari-
ables to the mix, such as the current propagation speed
of commonly used viruses and botnets. Even if those
variables come from outside data sources, we do not
have to tear down and rebuild our data structures and
analytics to consider them—they seamlessly become
part of our inquiry.
The Cloud Analytics Reference Architecture al-
lows us to ask more intuitive questions. With the
conventional approach, we do not really ask questions
of the data—we create hypotheses, and then test the
data to see whether we are right. In order to pose these
hypotheses, we have to guess in advance what the an-
swers might be, often a difficult proposition.
To determine where our network is most vulnerable,
for example, we would need to start with a hypothe-
sis—say, that any attacks will occur through outdated
operating systems. That hypothesis, accurate or not,
would drive our initial line of inquiry.
With the conventional approach, we also need to
be familiar with the data we are considering, includ-
ing where it is (in what specific datasets or databases),
what format it is in, and even to a large extent what the
data itself contains. That level of knowledge might be
achievable when we are working with a limited number
of datasets or databases, but not with the vast amounts
of information now becoming available to us. We often
have to put aside, or assume away, factors that we might
actually believe are critical.
Add to these handicaps our inability to go beyond
the pre-selected questions or easily change variables,
and it becomes an impossible task. And so we never
try it. We end up settling for marginal questions, and
marginal answers.
With the Reference Architecture, however, we can
structure an inquiry around a single, intuitive, big-pic-
ture question: What part of our computer network is
most vulnerable to attack in the next six hours? We do
not need to know much about any of the data sources
we are consulting—the data will point us to the answer.
The Cloud Analytics Reference Architecture al-
lows us to more readily look for unexpected pat-
terns—it lets the data talk to us, so to speak. Even
if we could ask all the questions we want, the way we
want, there is simply too much data to formulate every
question that might be important. Our questions can
also be limited by our biases about the issues we are
researching. We may not know what areas to explore,
or what we should be looking at. To get the full picture,
and help guide our inquiries, we need to see what pat-
terns naturally emerge in the data.
While we can look for patterns with the convention-
al approach, there are two significant drawbacks. We
can only do such searches within our narrowly defined
datasets and databases, rather than with the entire range
of data available to us. We also must first guess what
those specific patterns might be, and then test them out
with hypotheses. But what about the patterns we do not
even know might exist? How do we get to the hidden
knowledge that often proves so valuable?
Because there are no limiting data and analytic struc-
tures in the Reference Architecture, we do not need to
pose hypotheses, and our search for patterns encom-
passes the entire range of data. For example, the U.S.
military is now using the Reference Architecture to
search for patterns in war zone intelligence data, to map
out convoy routes least likely to encounter improvised
explosive devices (IEDs).
The Cloud Analytics Reference Architecture
allows computers to take over much of the work
humans are doing now—enabling people to focus
on creating value. Conventional methods require that
people play a large role in processing the data—in-
cluding selecting samples to be analyzed, creating data
structures, posing hypotheses, and sifting through and
refining results. That intense level of effort may be
workable for small amounts of data, but no organiza-
tion has the personnel or resources to use that method
to process big data.
The Cloud Analytics Reference Architecture solves
this problem by giving a great deal of that work to the
computers, particularly tasks that are repetitive and
computationally intensive. This reduces human error,
and substantially speeds up the work.
When we use the Reference Architecture to pose
more intuitive questions, or to find patterns, we are es-
sentially asking the computer to take us as close as it can
to finding the answers we want. It is then up to us, using
our cognitive skills, to find meaning in those answers.
By separating out what the computer can do—the
analytics—and what only people can do—the actual
analysis—the Cloud Analytics Reference Architecture
greatly eases the human workload. It is a division of la-
bor that frees subject-matter experts to look at the larg-
er picture. At the same time, the Reference Architecture
rapidly highlights areas that analysts should not waste
their time exploring—enabling them to focus their time
and attention in the right direction.
For example, agencies that investigate consumer
complaints against financial institutions often do not
know which individual complaints are indicative of
broader patterns of consumer abuse, and so deserve
the most attention. Investigators rarely have the time to
sort through the vast array of sources that might pro-
vide valuable clues, such as blogs and social media sites
where consumers commonly air their grievances. With
a data lake that included all such available information,
the Reference Architecture’s analytics could quickly
identify patterns, such as consumer abuse affecting
large numbers of people. Investigators could then fo-
cus their resources on the most serious cases.
The Cloud Analytics Reference Architecture’s
analysis capability enables subject matter experts
to explore the data. If we are to drive business and
mission success, we must give direct access to the data
to the analysts, or subject matter experts, who under-
stand what that success might mean. However, be-
cause of the high level of computer expertise needed
to design custom data storage structures and analytics,
much of the analysis today is conducted by computer
scientists, computer engineers, and mathematicians act-
ing as agents for the subject matter experts. They are
typically the ones who translate the overall goals of the
business and government analysts into the language of
the machine. Whenever there is a middleman in any
field, things tend to get lost in the translation, and data
analysis is no exception. Here, it leads to a disconnect
between the people who need knowledge and insight
(the subject matter experts) and the data itself. It also
substantially slows the process.
In the top layers of the Reference Architecture, the
middleman syndrome goes away. The ability to ask in-
tuitive questions, and to look for patterns, provides the
analysts with direct access to the data. That gives them
the flexibility they need to experiment and explore,
and allows the system to reach maximum velocity. The
computer scientists, computer engineers and mathema-
ticians still play a key role, but now are no longer the
ones who drive the inquiries into the data.
For example, investigators who suspect fraud may
be occurring are often hampered by the need to go
through computer experts to query the data. Their re-
quest may be one of many, and by the time they get
back the information they need to act, the criminals
have often long since committed the fraud and dis-
appeared. With the Reference Architecture, however,
investigators could query the data themselves, quickly
pinpoint the fraud, and take action in time to stop the
activity.
THE FOUNDATION OF THE REFERENCE
ARCHITECTURE: A NEW APPROACH TO
INFRASTRUCTURE
The Reference Architecture takes advantage of
the immense storage ability of the cloud, though in a
different way than in the past. With the conventional
approach, cloud storage does not eliminate the data si-
los—it simply makes them fatter. Organizations must
continually reinvest in infrastructure as analytic needs
change. Building bridges between silos, for example,
typically requires reconfiguring and even expanding the
infrastructure.
The Reference Architecture, by contrast, has an in-
herent flexibility that enables organizations to pursue
new analytical approaches with few if any changes to
the underlying infrastructure. One reason is that the
data lake is easily expandable. Because it stores information
so efficiently, it can accommodate both the
natural growth of an organization’s data and the
addition of data from multiple outside sources. At the
same time, the Reference Architecture replaces the cur-
rent, custom-built analytics with a new generation of
tools that are highly reusable for almost any number
of inquiries. With the Reference Architecture, organi-
zations do not need to rebuild infrastructure as their
levels of data and analytics increase. An organization’s
initial investment in infrastructure is therefore both en-
during and cost-effective.
HOW THE DATA LAKE WORKS
With the conventional approach, the computer is
able to locate the information it needs because it knows
precisely where it is—in one database or another. The
information is identified largely by its location. With the
data lake, information is still identified for use, but now
in a way other than by location. Specific pieces of infor-
mation are identified by “tags”—details that have been
embedded in them for sorting and identification.
For example, an investor’s portfolio balance (the
data) is generally stored with identifying information
such as the name of the investor, the account number,
one or more dates, the location of the account, the
types of investments, the country the investor lives in,
and so on. This “metadata” is what gets tagged, and is
located by the computer during inquiries.
The process of tagging information is not new—it
is commonly done within specific datasets and databas-
es. What is new is using the technique to eliminate the
need for datasets and databases altogether.
The tags themselves are also a way of gaining
knowledge from the data. In the example above, they
might allow us to look for, say, connections between
investors’ countries and their types of investments. The
basic data—the portfolio balance—might not even be
part of the inquiry. Such connections can be made with
the conventional approach, but only if the custom-built
databases and computer analytics have already been de-
signed to take them into consideration. With the data
lake, all the data, metadata and identifying tags are avail-
able for any inquiry or search for patterns. And, such
inquiries or searches can pivot off of any one of those
pieces of information. This greatly expands the usabil-
ity of the data available to an organization. It actually
makes big data even bigger.
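To make the idea of tag-based identification more concrete, the following is a minimal sketch in Python. It is not the Reference Architecture's actual implementation, which this paper does not specify. The record layout, the tag names, and the helper functions `find` and `pivot` are all assumptions made for illustration. The sketch shows records being located by their tags rather than by their location, and an inquiry that pivots on two tags (country and investment type) without touching the portfolio balance itself.

```python
from collections import Counter

# Hypothetical "lake": every record keeps its payload plus descriptive tags.
# Field names and values are invented for illustration.
LAKE = [
    {"data": 125_000.0, "tags": {"investor": "A. Rivera", "country": "Brazil", "investment": "bonds"}},
    {"data": 98_500.0, "tags": {"investor": "B. Chen", "country": "Canada", "investment": "equities"}},
    {"data": 40_250.0, "tags": {"investor": "C. Silva", "country": "Brazil", "investment": "bonds"}},
    # An unstructured record lives in the same pool and simply carries different tags.
    {"data": "Client asked about fees.", "tags": {"source": "call_note", "country": "Canada"}},
]


def find(lake, **wanted):
    """Return every record whose tags match all requested tag values."""
    return [r for r in lake if all(r["tags"].get(k) == v for k, v in wanted.items())]


def pivot(lake, tag_a, tag_b):
    """Count co-occurrences of two tags, skipping records that lack either tag."""
    pairs = Counter()
    for record in lake:
        tags = record["tags"]
        if tag_a in tags and tag_b in tags:
            pairs[(tags[tag_a], tags[tag_b])] += 1
    return pairs


# Locate records by tag, not by which database they sit in.
brazil = find(LAKE, country="Brazil")

# Pivot on country and investment type; the balances are never consulted.
by_country_and_type = pivot(LAKE, "country", "investment")
```

Because the inquiry is expressed only in terms of tags, adding a new tag to future records would not require rebuilding anything in this sketch; existing queries would simply ignore it.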
An important advantage of the data lake is that
there is no need to build, tear down, and rebuild rigid
data structures. For example, suppose we develop an
improved approach to translating English into Chi-
nese. With conventional techniques, the database is the
translation. To make major changes, we would have to
go back to the original data (the English and Chinese
words), and build a completely new structure. With the
Reference Architecture, however, we would simply pull
out the data in a new way, easily reusing it.
In addition, the data lake smoothly accepts every
type of data, including “unstructured” data, meaning information
that has not been organized for inclusion in
a database. An example might be the doctors’ and
nurses’ notes that accompany a patient’s electronic
health records.
Two other critical emerging data types are batch and
streaming. Batch data is typically collected on an auto-
mated basis and then delivered for analysis en masse—
for example, the utility meter readings from homes.
Streaming data is information from a continuous feed,
such as video surveillance.
Most of the flood of big data is unstructured, batch
and streaming, and so it is essential that organizations
have the ability to make full use of all types. With the
data lake, there is no second-class or third-class data.
All of it, including structured, unstructured, batch and
streaming, is equally “ingested” into the data lake, and
available for every inquiry.
It is an environment that is not random and chaotic,
but rather is purposeful. The data lake is like a viscous
medium that holds the data in place, and at the same
time fosters connections. Because the data is all in one
place, it is, in a sense, all connected.
GATHERING INFORMATION FROM THE DATA
LAKE: THE PRE-ANALYTICS
In the first step in analyzing the data, the Reference
Architecture uses tools known as pre-analytics to filter
data from the data lake and then give it an underlying
organization. For example, a recent study by Booz Al-
len and a large hospital chain in the Midwest analyzed
the electronic medical records of hundreds of patients,
to track the progression of a life-threatening condition
known as severe sepsis. Pre-analytics were used to first
pull patients’ vital signs from a version of a data lake,
and—using the time-and-date stamps embedded in the
records—organize them in chronological order. Once
that was accomplished, computer analytics could then
search for patterns in the way the patients’ vital signs
changed over time.
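The two pre-analytic steps described above, pulling the relevant records and then ordering them by their time stamps, can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The record fields (`kind`, `patient`, `time`, `heart_rate`), the sample values, and the function name `vital_sign_timeline` are hypothetical and are not taken from the study.

```python
from datetime import datetime

# Hypothetical lake contents: mixed record types, arriving in no particular order.
RECORDS = [
    {"kind": "vital_signs", "patient": "P1", "time": "2012-03-01T14:00", "heart_rate": 112},
    {"kind": "billing", "patient": "P1", "time": "2012-03-01T09:00", "amount": 250.0},
    {"kind": "vital_signs", "patient": "P1", "time": "2012-03-01T08:00", "heart_rate": 88},
    {"kind": "vital_signs", "patient": "P2", "time": "2012-03-01T10:00", "heart_rate": 76},
    {"kind": "vital_signs", "patient": "P1", "time": "2012-03-01T11:00", "heart_rate": 97},
]


def vital_sign_timeline(records, patient):
    """Filter one patient's vital-sign records, then sort them chronologically."""
    selected = [
        r for r in records
        if r["kind"] == "vital_signs" and r["patient"] == patient
    ]
    # Parse the embedded time stamp so ordering is by time, not by string layout.
    return sorted(selected, key=lambda r: datetime.fromisoformat(r["time"]))


timeline = vital_sign_timeline(RECORDS, "P1")
heart_rates = [r["heart_rate"] for r in timeline]
```

Once the readings are in chronological order, a later analytic can look at how they change over time; the sketch stops at the point where the data has been organized.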
Pre-analytics accomplish a number of tasks at once.
Using the tags, they locate and pull out the relevant data
from the data lake. They then prepare that data for the
analytics, sorting and organizing the information in any
number of ways. The pre-analytics allow great flexibil-
ity in the inquiries—for example, one such tool might
transliterate a name like Muhammad into every possible
spelling (e.g., Mohammad, Mahamed, Muhamet). This
would enable the computer to collect and analyze infor-
mation about a particular person, even if that person’s
name is spelled differently in different sources of data.
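One simple way to picture such a tool is a lookup that maps each known spelling to a single canonical form, so that records from different sources can be collected together. The Python sketch below takes that approach. It is an assumption made for illustration, not a description of how the actual pre-analytic generates spellings. The variant table, the sample records, and the function names are hypothetical; a real tool would need a far richer set of spellings than a hand-written table.

```python
import unicodedata

# Hypothetical variant table using the spellings mentioned in the text.
VARIANTS = {
    "muhammad": {"muhammad", "mohammad", "mahamed", "muhamet"},
}


def normalize(name):
    """Lower-case a name and strip accents so spellings compare consistently."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_accents.lower().strip()


def canonical(name):
    """Map a spelling to its canonical form, or return the normalized name itself."""
    normalized = normalize(name)
    for canon, spellings in VARIANTS.items():
        if normalized in spellings:
            return canon
    return normalized


def collect(records, name):
    """Gather every record whose name is a known spelling of the requested name."""
    target = canonical(name)
    return [r for r in records if canonical(r["name"]) == target]


# Hypothetical records from three different sources.
SOURCES = [
    {"name": "Mohammad", "source": "report_a"},
    {"name": "Muhamet", "source": "report_b"},
    {"name": "Maria", "source": "report_c"},
]

matches = collect(SOURCES, "Muhammad")
```

The same `collect` function works for any name added to the table, which is the reusability the paragraph above contrasts with building a new pre-analytic for each name.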
Although pre-analytical tools are commonly used
in the conventional approach, they are typically part of
the rigid structure that must be torn down and rebuilt
as inquiries change. Generally, they cannot be reused—
for example, each name to be transliterated would re-
quire an entirely new pre-analytic. Because such work is
resource-intensive, only a limited number of such tools
can be built, severely hampering an organization’s abil-
ity to make full use of its data. By contrast, the pre-
analytics in the Cloud Analytics Reference Architecture
are designed for use with the data lake, and so are not
part of a custom-built structure. They are both flex-
ible and reusable, giving organizations almost endless
windows into their data. Moreover, they are designed to
be interoperable from the moment they come on-line,
creating a set of easily shared services for all users of
the data.
THE POWER OF COMPUTER ANALYTICS
Once the data has been prepared, the search for
knowledge and insight can begin. As with the other ele-
ments of the Reference Architecture, computer analyt-
ics are used in an entirely new way.
An analogy might be the difference between the
smartphones of today and the separate functions for
telephones, personal digital assistants and computers of
the not-so-distant past. Smartphones do more than just
combine those functions—they create a new world of
possibilities. The computer analytics in the Cloud Ana-
lytics Reference Architecture do the same.
There are several types of analytics in the Reference
Architecture, including:
Ad hoc queries. These are the analytics that ask
questions of the data. While in the conventional ap-
proach the analytics are part of the narrow, custom-
built structure, here they are free to pursue any line of
inquiry. For example, a financial institution might want
to know which of its foreign investors are at greatest
risk of switching to another firm, based on dozens of
characteristics of current and former customers. Later,
analysts might want to change the question somewhat,
asking the extent to which the political turmoil in cer-
tain countries plays a role. They can use the same ana-
lytic to ask the second question, and any number of
other questions—like the pre-analytics, they are flexible
and reusable. And they enable the kinds of improvised,
intuitive questions that can yield particularly valuable
results.
Machine learning. This is the search for patterns.
Because all of the data is available at once, and because
there is no need to hypothesize in advance what pat-
terns might exist, these analytics can look for patterns
that emerge anywhere across the data.
Alerting. This type of analytic sends an alert when
something unexpected appears in the patterns. Such
anomalies are often clues to the kind of hidden knowl-
edge that can provide business with a competitive ad-
vantage, and help government organizations achieve
their missions.
Pre-Computation. These analytics enable organiza-
tions to do much of the analyzing in advance, creating
efficiencies. For example, an auto insurance company
might pre-compute the policy price for every individual
vehicle in the U.S., so that, with a few additional details,
a potential customer can be given an instant quote.
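Two of the analytic types listed above, alerting and pre-computation, lend themselves to a short sketch. The Python below is illustrative only. The standard-deviation rule used for alerting, the pricing formula, and all names and numbers are assumptions, since the paper does not say how either analytic is implemented.

```python
from statistics import mean, stdev


def alerts(values, threshold=2.0):
    """Return indices of values more than `threshold` sample standard deviations from the mean."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical pricing inputs, in whole dollars, invented for illustration.
BASE_PRICE = {"sedan": 900, "truck": 1200}
PER_YEAR_OF_AGE = 20


def precompute_quotes(vehicles):
    """Do the vehicle-dependent part of the pricing once, ahead of time."""
    return {
        v["vin"]: BASE_PRICE[v["type"]] + PER_YEAR_OF_AGE * v["age_years"]
        for v in vehicles
    }


def instant_quote(table, vin, driver_surcharge):
    """Finish a quote with a lookup plus the few details supplied by the customer."""
    return table[vin] + driver_surcharge


# Alerting: one day's count stands far outside the usual pattern.
daily_logins = [100, 102, 98, 101, 99, 100, 250]
flagged = alerts(daily_logins)

# Pre-computation: build the table in advance, then answer instantly.
quote_table = precompute_quotes([
    {"vin": "VIN001", "type": "sedan", "age_years": 5},
    {"vin": "VIN002", "type": "truck", "age_years": 2},
])
quote = instant_quote(quote_table, "VIN001", driver_surcharge=150)
```

The design point in both cases is the same as in the text: the computer does the repetitive work up front or continuously, and people are brought in only when something unexpected appears or a customer-specific detail is needed.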
PUTTING IT ALL TOGETHER: VISUALIZATION AND
INTERACTION
Decision-makers may be understandably concerned
that all this big data will be overwhelming, that remov-
ing the tube from the map will simply lead to informa-
tion overload. Quite the opposite is true. The Cloud
Analytics Reference Architecture addresses the issue
head-on by incorporating the visualization—how the
knowledge is presented to us—into the analytics from
the outset. That is, the analytics not only conduct the
inquiries, they help contextualize and focus the results.
At the visualization and interaction level of the Reference
Architecture, this focus enables the analysts to
more easily make sense of the information, to frame
better, more intuitive inquiries, and to gain deeper in-
sights. Building the visualization into the analytics has
another advantage—it provides the ability for quick and
effective feedback between the two layers, so that the
presentation of the findings can be continually refined
for the decision-maker.
With the Reference Architecture, the flood of infor-
mation is not overwhelming—it is readied for action as
never before. This breakthrough in visualization could
have as profound an effect on decision-making as bar
graphs and pie charts did in the 1950s and 1960s, when
statistics became widely used in business. Those visuals
presented all the essential information at a glance, chang-
ing the nature of decision-making. The Reference Ar-
chitecture will do the same—but this time with big data.
DELIVERING ON THE PROMISE
The possibilities of big data and the cloud are not
pipe dreams. But they will not be fulfilled on their
own—conscious effort and deliberate planning are
needed. Unless organizations make the right infra-
structure decisions, they cannot hope to build a data
lake. Unless they make the right data management de-
cisions, they will never break free from the rigid data
and analytic structures that are so limiting. The Cloud
Analytics Reference Architecture can be seen as a road
map for that decision-making, one that shows the im-
portance of a holistic, rather than piecemeal, haphazard
approach. Each element is closely tied to each of the
other elements, and so all must be considered together.
The Cloud Analytics Reference Architecture is no
more expensive to build than the traditional approach, and
is considerably more cost-effective in the long run. Be-
cause the elements of the Cloud Analytics Reference
Architecture are largely reusable, they can scale an or-
ganization’s big data in an affordable way.
The Cloud Analytics Reference Architecture is al-
ready being used by the U.S. government to make our
nation safer, and it can help other organizations in gov-
ernment and business create value, solve real-world
problems, and drive success. The grand promise of big
data and the cloud is now within reach.
FOR MORE INFORMATION
Mark Jacobsohn
jacobsohn_mark@bah.com
301-497-6989
Joshua Sullivan, PhD
sullivan_joshua@bah.com
301-543-4611
www.boozallen.com/cloud
This document is part of a collection of papers developed by Booz Allen Hamilton to introduce new concepts and ideas spanning cloud
solutions, challenges, and opportunities across government and business. For media inquiries or more information on reproducing this
document, please contact:
James Fisher—Senior Manager, Media Relations, 703-377-7595, fisher_james_w@bah.com
Carrie Lake—Manager, Media Relations, 703-377-7785, lake_carrie@bah.com

Delivering on the Promise of Big Data and the Cloud

  • 1. 1 WHY CAN’T WE SEEM TO DO MORE WITH BIG DATA? We are living in an age inundated with information. Our world is increasingly in- strumented—sensors are collecting data on everything from hospital patients’ vital signs, to the moment-by-moment naviga- tion of commercial aircraft, to consumer behavior based on buying patterns and the use of membership cards. Waves of data are coming from social media sites, from radio-frequency tracking systems, from the use of UPC barcodes. Our modern society is wired for data. Yet there is a growing belief in both business and government that we should be doing far more to take advantage of this wealth of information. We might use certain types of information for one purpose or another, but we nearly always view big data through multiple stove- pipes, rather than treating it holistically. We do not appear to be able to tap the full potential of all the data available to us. We have the technical ability. There have been significant innovations in com- puter technology in recent years, particu- larly with the advent of cloud computing. Yet like the promise of big data, the promise of the cloud—including unprec- edented savings, much greater access to data, and better decision-making—still seems largely unfulfilled. What holds us back is not technology, but a mindset. We are locked into an out- moded approach to data, one that relies on techniques created well before big data arrived on the scene. Those techniques give us access to only limited slices of in- formation, and are not designed to easily connect an analyst with multiple sources of data. They were sufficient in their day, but are no longer enough. Ultimately, we are not doing more with big data because we do not have complete access to it. We are never able to use it all at once, and so we are unable to track overall trends, or see entire patterns, or ask complex ques- tions that consider everything we know. 
To meet this need—and take full advantage of both big data and cloud computing—a new approach has been invented. Known as the Cloud Analytics Reference Architecture, it is the result of an ongoing collaboration between Booz Allen Hamilton and the U.S. government to leverage big data to search for terrorists and other threats. Intelligence analysts are now using the Cloud Analytics Reference Architecture to paint a comprehensive picture that incorporates the full range of intelligence data at once, including reports that have been amassed and are ongoing from the field.
DELIVERING ON THE PROMISE OF BIG DATA AND THE CLOUD by Mark Jacobsohn, Senior Vice President, Booz Allen Hamilton, and Joshua Sullivan, PhD, Vice President, Booz Allen Hamilton. ©2012 Booz Allen Hamilton Inc. All rights reserved. No part of this document may be reproduced without prior written permission of Booz Allen Hamilton.
  • 2. NOVEMBER 2012
Unlike conventional techniques, this new approach makes it possible for analysts to use all available intelligence data, applying an expanding set of analytic services to help them gain critical mission insights.
The Cloud Analytics Reference Architecture, which is being adapted to the larger business and government communities, removes the traditional constraints by bringing together innovations in two areas of current technology. First, it uses the power of the cloud to put an organization’s entire storehouse of data into a common pool, or “data lake,” making all of it easily accessible for the first time. It then uses sophisticated computer analytics, such as machine learning and natural language processing, to help extract the kind of knowledge and insight that creates value, guides strategy, and drives business and mission success. Although the Cloud Analytics Reference Architecture builds upon current techniques, it is not an incremental step forward. It is an entirely new approach—one specifically designed for our new age of data.
One way to understand how the Reference Architecture works is to view it in layers (see Figure 1). Its foundation is the cloud computing and network infrastructure, which supports the methods by which data is managed—most notably, the data lake. The data lake, in turn, supports a two-step process to analyze the data. In the first step, special tools known as pre-analytics filter information from the data lake, and give it an underlying organization. That sets the stage for computer analytics—in the next layer up—to search for valuable knowledge. These elements support the final phase, the visualization and interaction, where the human insights and action take place.
THE POWER OF THE CLOUD ANALYTICS REFERENCE ARCHITECTURE
The Reference Architecture opens up the enormous potential of big data by allowing us to search for insight in new ways. It enables us to look for overarching patterns, and ask intuitive questions of all the data, rather than limiting us to narrowly defined queries within data sets.
The Reference Architecture allows computers to take over much of the work humans are doing now—freeing people to focus on the search for insight. It makes it possible for non-computer experts, for the first time, to frame the questions, look for patterns, and follow hunches.
This is not some kind of magical solution—far from it. The Reference Architecture is simply a new way of looking at data, but one that revolutionizes our ability to gain knowledge and insight. With conventional techniques, the data and analytics are locked into stovepipes, or silos. We can explore only limited amounts of data at any one time—and then only with predetermined questions that have already been built in. The Reference Architecture removes these constraints by eliminating the silos, and consolidating all the information in the data lake. What results is not chaotic or overwhelming.
Figure 1. Primary Elements of the Cloud Analytics Reference Architecture
  • 3. Rather, the rich diversity of information in the data lake becomes a powerful force. The data lake is more than a means of storage—it is a medium expressly designed to foster connections in data. And the Reference Architecture explores those connections to search for valuable correlations and patterns. This actually reduces the complexity of big data, making it manageable and useful, and creating efficiencies.
Instead of using data to ask “canned” questions that test what we may already know, the Reference Architecture uses data to discover new possibilities—solutions and answers that we have not even considered. The power of the Reference Architecture is that it constantly evolves and adapts as we search for insight, taking us beyond the limits of our imagination.
WHAT THE CLOUD ANALYTICS REFERENCE ARCHITECTURE DOES
The Cloud Analytics Reference Architecture removes the constraints created by data silos. While the rigid structures used in conventional techniques provide ease of storage, they carry severe disadvantages. They give us an artificial view of the world based on data models, rather than on reality and meaning. It is akin to reading a map through a tube—we can never immerse ourselves in the diversity of big data, and instead make decisions based on limited and constrained information. Much of data science in the last ten years has been devoted to improving access to the silos and building bridges between them. But that does not solve the underlying problem—that the data is regimented and locked in.
Eliminating the need for silos gives us access to all the data at once—including data from multiple outside sources. Users no longer need to move from database to database, pulling out specific information. And, because there are no data silos, there is no need to build complex bridges between them.
If we want to know, for example, which parts of our computer network are most vulnerable to attack in the next six hours, we can take into account a wide variety of data sources at the same time. We might look at whether today is a holiday in certain foreign countries, which means that the young hackers known as “script kiddies” are more likely to be out of school and so have time on their hands to launch an attack. If we determine that a particular group is targeting us, we might examine how its members are connected, asking whether they had a common professor at a university, and if so, what techniques he or she taught. The Reference Architecture gives us the ability to ask a full suite of questions rather than a pre-selected few.
The Cloud Analytics Reference Architecture allows us to experiment more with the data. The Reference Architecture’s flexibility provides a new kind of freedom—to follow hunches wherever they may lead, to quickly shift direction to pursue promising avenues of inquiry, to easily factor in new knowledge and insights as they arise.
With the conventional approach, it is difficult to add or switch variables that are not already part of a dataset or database. That typically requires tearing apart and rebuilding both the structure that the data is in and the computer analytics that are custom-designed to handle specific lines of inquiry. The process is expensive and time-consuming, and so we tend to focus instead on doing better analysis with the limited tools available on our narrow slices of data.
With the Reference Architecture, we might decide, in the network security example above, to add new variables to the mix, such as the current propagation speed of commonly used viruses and botnets. Even if those variables come from outside data sources, we do not have to tear down and rebuild our data structures and analytics to consider them—they seamlessly become part of our inquiry.
The Cloud Analytics Reference Architecture allows us to ask more intuitive questions. With the conventional approach, we do not really ask questions of the data—we create hypotheses, and then test the data to see whether we are right. In order to pose these hypotheses, we have to guess in advance what the answers might be, often a difficult proposition.
To determine where our network is most vulnerable, for example, we would need to start with a hypothesis—say, that any attacks will occur through outdated operating systems. That hypothesis, accurate or not, would drive our initial line of inquiry.
With the conventional approach, we also need to be familiar with the data we are considering, including where it is (in what specific datasets or databases), what format it is in, and even to a large extent what the data itself contains. That level of knowledge might be achievable when we are working with a limited number of datasets or databases, but not with the vast amounts of information now becoming available to us. We often have to put aside, or assume away, factors that we might actually believe are critical. Add to these handicaps our inability to go beyond the pre-selected questions or easily change variables, and it becomes an impossible task. And so we never try it. We end up settling for marginal questions, and marginal answers.
  • 4. With the Reference Architecture, however, we can structure an inquiry around a single, intuitive, big-picture question: What part of our computer network is most vulnerable to attack in the next six hours? We do not need to know much about any of the data sources we are consulting—the data will point us to the answer.
The Cloud Analytics Reference Architecture allows us to more readily look for unexpected patterns—it lets the data talk to us, so to speak. Even if we could ask all the questions we want, the way we want, there is simply too much data to formulate every question that might be important. Our questions can also be limited by our biases about the issues we are researching. We may not know what areas to explore, or what we should be looking at. To get the full picture, and help guide our inquiries, we need to see what patterns naturally emerge in the data.
While we can look for patterns with the conventional approach, there are two significant drawbacks. We can only do such searches within our narrowly defined datasets and databases, rather than with the entire range of data available to us. We also must first guess what those specific patterns might be, and then test them out with hypotheses. But what about the patterns we do not even know might exist? How do we get to the hidden knowledge that often proves so valuable?
Because there are no limiting data and analytic structures in the Reference Architecture, we do not need to pose hypotheses, and our search for patterns encompasses the entire range of data. For example, the U.S. military is now using the Reference Architecture to search for patterns in war zone intelligence data, to map out convoy routes least likely to encounter improvised explosive devices (IEDs).
The Cloud Analytics Reference Architecture allows computers to take over much of the work humans are doing now—enabling people to focus on creating value.
Conventional methods require that people play a large role in processing the data—including selecting samples to be analyzed, creating data structures, posing hypotheses, and sifting through and refining results. That intense level of effort may be workable for small amounts of data, but no organization has the personnel or resources to use that method to process big data. The Cloud Analytics Reference Architecture solves this problem by giving a great deal of that work to the computers, particularly tasks that are repetitive and computationally intensive. This reduces human error, and substantially speeds up the work.
When we use the Reference Architecture to pose more intuitive questions, or to find patterns, we are essentially asking the computer to take us as close as it can to finding the answers we want. It is then up to us, using our cognitive skills, to find meaning in those answers. By separating out what the computer can do—the analytics—and what only people can do—the actual analysis—the Cloud Analytics Reference Architecture greatly eases the human workload. It is a division of labor that frees subject-matter experts to look at the larger picture. At the same time, the Reference Architecture rapidly highlights areas that analysts should not waste their time exploring—enabling them to focus their time and attention in the right direction.
For example, agencies that investigate consumer complaints against financial institutions often do not know which individual complaints are indicative of a broader pattern of consumer abuse, and so deserve the most attention. Investigators rarely have the time to sort through the vast array of sources that might provide valuable clues, such as blogs and social media sites where consumers commonly air their grievances. With a data lake that included all such available information, the Reference Architecture’s analytics could quickly identify patterns, such as consumer abuse affecting large numbers of people.
Investigators could then focus their resources on the most serious cases.
The Cloud Analytics Reference Architecture’s analysis capability enables subject matter experts to explore the data. If we are to drive business and mission success, we must give direct access to the data to the analysts, or subject matter experts, who understand what that success might mean. However, because of the high level of computer expertise needed to design custom data storage structures and analytics, much of the analysis today is conducted by computer scientists, computer engineers, and mathematicians acting as agents for the subject matter experts. They are typically the ones who translate the overall goals of the business and government analysts into the language of the machine.
Whenever there is a middleman in any field, things tend to get lost in the translation, and data analysis is no exception. Here, it leads to a disconnect between the people who need knowledge and insight (the subject matter experts) and the data itself. It also substantially slows the process.
In the top layers of the Reference Architecture, the middleman syndrome goes away. The ability to ask intuitive questions, and to look for patterns, provides the analysts with direct access to the data. That gives them the flexibility they need to experiment and explore, and allows the system to reach maximum velocity. The computer scientists, computer engineers and mathematicians still play a key role, but now are no longer the ones who drive the inquiries into the data.
  • 5. For example, investigators who suspect fraud may be occurring are often hampered by the need to go through computer experts to query the data. Their request may be one of many, and by the time they get back the information they need to act, the criminals have often long since committed the fraud and disappeared. With the Reference Architecture, however, investigators could query the data themselves, quickly pinpoint the fraud, and take action in time to stop the activity.
THE FOUNDATION OF THE REFERENCE ARCHITECTURE: A NEW APPROACH TO INFRASTRUCTURE
The Reference Architecture takes advantage of the immense storage ability of the cloud, though in a different way than in the past. With the conventional approach, cloud storage does not eliminate the data silos—it simply makes them fatter. Organizations must continually reinvest in infrastructure as analytic needs change. Building bridges between silos, for example, typically requires reconfiguring and even expanding the infrastructure.
The Reference Architecture, by contrast, has an inherent flexibility that enables organizations to pursue new analytical approaches with few if any changes to the underlying infrastructure. One reason is that the data lake is easily expandable. Because it stores information so efficiently, it can accommodate both the natural growth of an organization’s data and the addition of data from multiple outside sources. At the same time, the Reference Architecture replaces the current, custom-built analytics with a new generation of tools that are highly reusable for almost any number of inquiries. With the Reference Architecture, organizations do not need to rebuild infrastructure as their levels of data and analytics increase. An organization’s initial investment in infrastructure is therefore both enduring and cost-effective.
HOW THE DATA LAKE WORKS
With the conventional approach, the computer is able to locate the information it needs because it knows precisely where it is—in one database or another. The information is identified largely by its location. With the data lake, information is still identified for use, but now in a way other than by location. Specific pieces of information are identified by “tags”—details that have been embedded in them for sorting and identification. For example, an investor’s portfolio balance (the data) is generally stored with identifying information such as the name of the investor, the account number, one or more dates, the location of the account, the types of investments, the country the investor lives in, and so on. This “metadata” is what gets tagged, and is located by the computer during inquiries.
The process of tagging information is not new—it is commonly done within specific datasets and databases. What is new is using the technique to eliminate the need for datasets and databases altogether.
The tags themselves are also a way of gaining knowledge from the data. In the example above, they might allow us to look for, say, connections between investors’ countries and their types of investments. The basic data—the portfolio balance—might not even be part of the inquiry. Such connections can be made with the conventional approach, but only if the custom-built databases and computer analytics have already been designed to take them into consideration. With the data lake, all the data, metadata and identifying tags are available for any inquiry or search for patterns. And, such inquiries or searches can pivot off of any one of those pieces of information. This greatly expands the usability of the data available to an organization. It actually makes big data even bigger.
An important advantage of the data lake is that there is no need to build, tear down, and rebuild rigid data structures.
For example, suppose we develop an improved approach to translating English into Chinese. With conventional techniques, the database is the translation. To make major changes, we would have to go back to the original data (the English and Chinese words), and build a completely new structure. With the Reference Architecture, however, we would simply pull out the data in a new way, easily reusing it.
In addition, the data lake smoothly accepts every type of data, including “unstructured” data—information that has not been organized for inclusion in a database. An example might be the doctors’ and nurses’ notes that accompany a patient’s electronic health records.
Two other critical emerging data types are batch and streaming. Batch data is typically collected on an automated basis and then delivered for analysis en masse—for example, the utility meter readings from homes. Streaming data is information from a continuous feed, such as video surveillance.
Most of the flood of big data is unstructured, batch and streaming, and so it is essential that organizations have the ability to make full use of all types. With the data lake, there is no second-class or third-class data. All of it, including structured, unstructured, batch and streaming, is equally “ingested” into the data lake, and available for every inquiry.
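The tagging idea described in this section can be illustrated with a small sketch. It is not the Reference Architecture's implementation, which this paper does not specify: the `DataLake` class, its `ingest` and `find` methods, the `count_pairs` helper, and the sample records are all hypothetical. The sketch only shows records being located by their tags rather than by the database they sit in, and an inquiry pivoting on two tags (country against investment type) without touching the underlying balance.

```python
from collections import Counter, defaultdict


class DataLake:
    """Toy in-memory pool: each record is stored once and indexed by its tags."""

    def __init__(self):
        self._records = []              # payload plus tags, in ingestion order
        self._index = defaultdict(set)  # (tag name, tag value) -> record ids

    def ingest(self, payload, **tags):
        """Store any payload (structured or free text) alongside its tags."""
        record_id = len(self._records)
        self._records.append({"payload": payload, "tags": tags})
        for key, value in tags.items():
            self._index[(key, value)].add(record_id)
        return record_id

    def find(self, **tags):
        """Return the records that carry every one of the given tags."""
        if not tags:
            return list(self._records)
        id_sets = [self._index.get(item, set()) for item in tags.items()]
        matching = set.intersection(*id_sets)
        return [self._records[i] for i in sorted(matching)]


def count_pairs(records, first_tag, second_tag):
    """Pivot on two tags, ignoring the payload entirely."""
    return Counter(
        (r["tags"][first_tag], r["tags"][second_tag])
        for r in records
        if first_tag in r["tags"] and second_tag in r["tags"]
    )


# Made-up sample data, for illustration only.
lake = DataLake()
lake.ingest(125_000.00, kind="portfolio_balance", investor="A. Example",
            country="DE", investment_type="bonds")
lake.ingest(48_500.00, kind="portfolio_balance", investor="B. Example",
            country="DE", investment_type="bonds")
lake.ingest(310_250.00, kind="portfolio_balance", investor="C. Example",
            country="BR", investment_type="equities")
# Unstructured text is ingested the same way as structured values.
lake.ingest("Patient reports mild fever overnight.", kind="nurse_note", ward="3B")

balances = lake.find(kind="portfolio_balance")
pairs = count_pairs(balances, "country", "investment_type")
print(pairs)
```

Because the index is keyed on tags, a new line of inquiry (for example, pivoting on `ward` instead of `country`) needs no change to how the records are stored, which is the property the text attributes to the data lake.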
  • 6. It is an environment that is not random and chaotic, but rather is purposeful. The data lake is like a viscous medium that holds the data in place, and at the same time fosters connections. Because the data is all in one place, it is, in a sense, all connected.
GATHERING INFORMATION FROM THE DATA LAKE: THE PRE-ANALYTICS
In the first step in analyzing the data, the Reference Architecture uses tools known as pre-analytics to filter data from the data lake and then give it an underlying organization. For example, a recent study by Booz Allen and a large hospital chain in the Midwest analyzed the electronic medical records of hundreds of patients, to track the progression of a life-threatening condition known as severe sepsis. Pre-analytics were used to first pull patients’ vital signs from a version of a data lake, and—using the time-and-date stamps embedded in the records—organize them in chronological order. Once that was accomplished, computer analytics could then search for patterns in the way the patients’ vital signs changed over time.
Pre-analytics accomplish a number of tasks at once. Using the tags, they locate and pull out the relevant data from the data lake. They then prepare that data for the analytics, sorting and organizing the information in any number of ways. The pre-analytics allow great flexibility in the inquiries—for example, one such tool might transliterate a name like Muhammad into every possible spelling (e.g., Mohammad, Mahamed, Muhamet). This would enable the computer to collect and analyze information about a particular person, even if that person’s name is spelled differently in different sources of data.
Although pre-analytical tools are commonly used in the conventional approach, they are typically part of the rigid structure that must be torn down and rebuilt as inquiries change. Generally, they cannot be reused—for example, each name to be transliterated would require an entirely new pre-analytic.
Because such work is resource-intensive, only a limited number of such tools can be built, severely hampering an organization’s ability to make full use of its data. By contrast, the pre-analytics in the Cloud Analytics Reference Architecture are designed for use with the data lake, and so are not part of a custom-built structure. They are both flexible and reusable, giving organizations almost endless windows into their data. Moreover, they are designed to be interoperable from the moment they come on-line, creating a set of easily shared services for all users of the data.
THE POWER OF COMPUTER ANALYTICS
Once the data has been prepared, the search for knowledge and insight can begin. As with the other elements of the Reference Architecture, computer analytics are used in an entirely new way. An analogy might be the difference between the smartphones of today and the separate functions for telephones, personal digital assistants and computers of the not-so-distant past. Smartphones do more than just combine those functions—they create a new world of possibilities. The computer analytics in the Cloud Analytics Reference Architecture do the same.
There are several types of analytics in the Reference Architecture, including:
Ad hoc queries. These are the analytics that ask questions of the data. While in the conventional approach the analytics are part of the narrow, custom-built structure, here they are free to pursue any line of inquiry. For example, a financial institution might want to know which of its foreign investors are at greatest risk of switching to another firm, based on dozens of characteristics of current and former customers. Later, analysts might want to change the question somewhat, asking the extent to which the political turmoil in certain countries plays a role. They can use the same analytic to ask the second question, and any number of other questions—like the pre-analytics, they are flexible and reusable.
And they enable the kinds of improvised, intuitive questions that can yield particularly valuable results.
Machine learning. This is the search for patterns. Because all of the data is available at once, and because there is no need to hypothesize in advance what patterns might exist, these analytics can look for patterns that emerge anywhere across the data.
Alerting. This type of analytic sends an alert when something unexpected appears in the patterns. Such anomalies are often clues to the kind of hidden knowledge that can provide business with a competitive advantage, and help government organizations achieve their missions.
Pre-Computation. These analytics enable organizations to do much of the analyzing in advance, creating efficiencies. For example, an auto insurance company might pre-compute the policy price for every individual vehicle in the U.S., so that, with a few additional details, a potential customer can be given an instant quote.
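The pre-analytic and alerting steps described in this section can be sketched in a few lines. The code is illustrative only: the function names, the record layout, the hand-written variant table, and the threshold rule are assumptions, not details taken from the Reference Architecture. It shows one reusable pre-analytic that selects records by tag, matches a name across spellings, and orders the result by time stamp, followed by a simple alerting analytic that flags a reading far from the average of the earlier ones.

```python
from datetime import datetime
from statistics import mean

# Hypothetical variant table; a real tool would generate spellings, not list them.
NAME_VARIANTS = {
    "muhammad": {"muhammad", "mohammad", "mahamed", "muhamet"},
}


def canonical_name(name):
    """Map a known spelling to one canonical key; unknown names map to themselves."""
    lowered = name.lower()
    for canonical, variants in NAME_VARIANTS.items():
        if lowered in variants:
            return canonical
    return lowered


def pre_analytic(records, kind, name):
    """Reusable pre-analytic: filter by tag and name, then sort chronologically."""
    wanted = canonical_name(name)
    selected = [
        r for r in records
        if r["kind"] == kind and canonical_name(r["name"]) == wanted
    ]
    return sorted(selected, key=lambda r: datetime.fromisoformat(r["time"]))


def alerts(readings, max_jump):
    """Alerting analytic: flag readings far from the mean of all earlier readings."""
    flagged = []
    for i in range(1, len(readings)):
        baseline = mean(r["value"] for r in readings[:i])
        if abs(readings[i]["value"] - baseline) > max_jump:
            flagged.append(readings[i])
    return flagged


# Made-up sample data, for illustration only.
records = [
    {"kind": "heart_rate", "name": "Mohammad", "time": "2012-11-01T10:00:00", "value": 82},
    {"kind": "heart_rate", "name": "Muhammad", "time": "2012-11-01T08:00:00", "value": 78},
    {"kind": "heart_rate", "name": "Muhamet", "time": "2012-11-01T12:00:00", "value": 121},
    {"kind": "heart_rate", "name": "Someone Else", "time": "2012-11-01T09:00:00", "value": 70},
    {"kind": "temperature", "name": "Muhammad", "time": "2012-11-01T08:00:00", "value": 37.1},
]

ordered = pre_analytic(records, kind="heart_rate", name="Mahamed")
flagged = alerts(ordered, max_jump=20)
print([r["time"] for r in ordered])
print([r["value"] for r in flagged])
```

The same `pre_analytic` function serves a different person or a different measurement simply by changing its arguments, which is the kind of reuse the text contrasts with building a new tool for each name.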
  • 7. PUTTING IT ALL TOGETHER: VISUALIZATION AND INTERACTION
Decision-makers may be understandably concerned that all this big data will be overwhelming, that removing the tube from the map will simply lead to information overload. Quite the opposite is true. The Cloud Analytics Reference Architecture addresses the issue head-on by incorporating the visualization—how the knowledge is presented to us—into the analytics from the outset. That is, the analytics not only conduct the inquiries, they help contextualize and focus the results.
At the visualization and interaction level of the Reference Architecture, this focus enables the analysts to more easily make sense of the information, to frame better, more intuitive inquiries, and to gain deeper insights. Building the visualization into the analytics has another advantage—it provides the ability for quick and effective feedback between the two layers, so that the presentation of the findings can be continually refined for the decision-maker.
With the Reference Architecture, the flood of information is not overwhelming—it is readied for action as never before. This breakthrough in visualization could have as profound an effect on decision-making as bar graphs and pie charts did in the 1950s and 1960s, when statistics became widely used in business. Those visuals presented all the essential information at a glance, changing the nature of decision-making. The Reference Architecture will do the same—but this time with big data.
DELIVERING ON THE PROMISE
The possibilities of big data and the cloud are not pipe dreams. But they will not be fulfilled on their own—conscious effort and deliberate planning are needed. Unless organizations make the right infrastructure decisions, they cannot hope to build a data lake. Unless they make the right data management decisions, they will never break free from the rigid data and analytic structures that are so limiting.
The Cloud Analytics Reference Architecture can be seen as a road map for that decision-making, one that shows the importance of a holistic approach rather than a piecemeal, haphazard one. Each element is closely tied to each of the other elements, and so all must be considered together.
The Cloud Analytics Reference Architecture is no more expensive to build than the traditional approach, and is considerably more cost-effective in the long run. Because the elements of the Cloud Analytics Reference Architecture are largely reusable, they can scale an organization’s big data in an affordable way.
The Cloud Analytics Reference Architecture is already being used by the U.S. government to make our nation safer, and it can help other organizations in government and business create value, solve real-world problems, and drive success. The grand promise of big data and the cloud is now within reach.
FOR MORE INFORMATION
Mark Jacobsohn, jacobsohn_mark@bah.com, 301-497-6989
Joshua Sullivan, PhD, sullivan_joshua@bah.com, 301-543-4611
www.boozallen.com/cloud
This document is part of a collection of papers developed by Booz Allen Hamilton to introduce new concepts and ideas spanning cloud solutions, challenges, and opportunities across government and business. For media inquiries or more information on reproducing this document, please contact:
James Fisher—Senior Manager, Media Relations, 703-377-7595, fisher_james_w@bah.com
Carrie Lake—Manager, Media Relations, 703-377-7785, lake_carrie@bah.com