Cloud Playbook

The Cloud Playbook showcases how Booz Allen’s Cloud Analytics Reference Architecture can be used to build technology infrastructures that withstand the weight of massive datasets and deliver the deep insights organizations need to drive innovation.

  1. 1. Cloud Analytics Playbook
  2. 2. An enormous amount of valuable information is out there, waiting to be transformed into differentiating services. Booz Allen Hamilton uses its Cloud Analytics Reference Architecture to build technology infrastructures that can withstand the weight of massive datasets and deliver the deep insights organizations need to drive innovation.
     1.0 | Summary: The problems of explosive data growth and how cloud analytics provide the solution.
     2.0 | Differentiation: Introduces the Architecture and explains how Booz Allen’s unique approach to people, processes, and technology gets the job done.
     3.0 | Depth: Takes the Architecture apart layer by layer with detailed visuals that show you how we frame a solution.
     4.0 | Successes: Presents real-world examples from the hundreds of organizations who have successfully worked with Booz Allen to implement an analytics solution using the Architecture.
     Prefer to read this on your iPad? Search “Booz Allen” at the iTunes App Store®, or simply scan the QR code.
  3. 3. 1.0 Summary. In this section: A majority of executives believe their companies are unprepared to leverage their data. We look at why that is and how to change it.
  4. 4. Extracting True Insights: The Growing Data Analysis Gap. We are living in the greatest age of information discovery the world has ever known. According to recent industry research, we now generate more data every 2 days than we did from the dawn of early civilization through the year 2003 combined. And data rates are still growing, at approximately 40% each year. Fueled in large part by the more than five billion mobile phones in use around the globe, our world is increasingly measured, instrumented, monitored, and automated in ways that generate incredible amounts of rich and complex data.
     Unfortunately, the number of big data analysts and the capabilities of traditional tools aren’t keeping pace with this unprecedented data growth. At Booz Allen, we’ve watched this trend for some time now; we call it the “data analysis gap.” It’s clear that data has outstripped common analytics tools and staffing levels. In order to move forward, organizations must be able to analyze data on a massive scale and quickly use it to provide deeper insights, create new products, and differentiate their services.
     [Figure: the data analysis gap. Quantity of data plotted against time, 2009 to 2020, with zones labeled “sustainable,” “challenging,” and “missed opportunities.” By 2020, the amount of information in our economy will grow 44 times. Very few organizations are prepared for this wave of data. Source: IDC.]
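The two growth figures on this slide can be checked against each other with a little arithmetic. The sketch below is not part of the playbook; it assumes constant compound growth between 2009 and 2020 (our assumption, not the source's), and the function names are illustrative. Under that assumption, 44 times growth over 11 years works out to roughly 41% per year, which is consistent with the "approximately 40% each year" figure.

```python
def implied_annual_growth(total_factor: float, years: int) -> float:
    """Annual growth rate that compounds to total_factor over the given years."""
    return total_factor ** (1.0 / years) - 1.0


def compound_factor(annual_rate: float, years: int) -> float:
    """Total growth factor after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


if __name__ == "__main__":
    years = 2020 - 2009  # the span shown on the chart
    rate = implied_annual_growth(44, years)
    factor = compound_factor(0.40, years)
    print(f"44x over {years} years implies {rate:.1%} growth per year")
    print(f"40% per year over {years} years compounds to {factor:.1f}x")
```

Running the check in the other direction, 40% per year compounds to a little over 40 times across the same span, so the two statements describe the same order of growth.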
  5. 5. Preparing for What’s Ahead. The ability to compete and win in the information economy will come from powerful analytics that draw insights and value from data, and from high-fidelity visualizations that present those insights in impactful, intuitive ways. Both will become key influencers of corporate decision making and consumer purchasing.
     Many of the world’s IT systems are not ready for the technology revolution happening as organizations seek to transform how they use data. Their infrastructures face three major challenges:
     ▶ Volume: Not enough storage capacity and analytical capabilities to handle massive volumes of data
     ▶ Variety: Data comes in many different formats, which can be difficult and expensive to integrate
     ▶ Velocity: Inability to process data in real time in order to extract the most value from it
     To help organizations overcome these hurdles and prepare for what’s next, Booz Allen has pioneered strategies for the implementation of the Digital Enterprise, a way of using technology, machine-based analytics, and human-powered analysis to create competitive and mission advantage.
     A Framework for the Future. Booz Allen has a framework for intelligently integrating cloud computing technology and advanced analytic capabilities, called the Cloud Analytics Reference Architecture. The Architecture is designed to solve compute-intensive problems that were previously out of reach for most organizations, including large-scale image processing, sensor data correlation, social network analysis, encryption/decryption, data mining, simulations, and pattern recognition.
     At the core of the Architecture are systems that accommodate petabytes of data at reasonable cost and allow analytics to run at previously unattainable scales in reasonable amounts of time. However, human insights and action are still the fundamental drivers. The purpose of the Architecture is to allow machines to do 80% of the work (the mundane tasks they are best suited for) and enable people to do the 20% of the work they do best, tasks that involve analysis and creativity.
     Done right, analytics hosted in the cloud will help:
     ▶ Improve overall performance and efficiency
     ▶ Better understand customer and employee needs
     ▶ Translate data into actionable intelligence and faster decision making
     ▶ Reduce IT costs
     ▶ Improve scalability to handle future growth
     However, before you invest in a cloud analytics solution, you should fully understand the scope of what’s involved and engage in the proper planning to ensure that all the right elements will be in place.
  6. 6. The Transformative Power of Cloud Analytics. Booz Allen is the leader in the emerging field of cloud analytics. Our unique approach combines cloud and other technologies with superior analytic tradecraft to create breakthroughs in how organizations capture, store, correlate, pre-compute, and extract value from large sets of data.
     To understand the power of cloud analytics, it helps to see the progression from basic data analytics performed in most organizations today. As an infrastructure is built out along the continuum to cloud analytics, the size and scale of data it can process increases along with the ability to drive performance and improve decision making.
     Basic data analysis usually happens in core business functions with smaller datasets. Reports are usually created on a “one-off” basis, limited to distribution within a specific department to support routine decision making.
     Analysis using standard cloud computing solutions extends basic analytic techniques to large or very large datasets. This is a logical entry point for cloud solutions because cloud technology is the most efficient, cost-effective way to run analytics on large amounts of data.
     Advanced analytics is where predictive capabilities are brought into the mix. It’s generally used to evaluate the future impact of strategic decisions. However, it represents a step back in terms of the size of datasets that can be manipulated.
     Cloud analytics transcends the limits of the other forms of analysis. It delivers insights to answer previously unanswerable questions such as:
     ▶ How can we gain competitive advantage in our market space?
     ▶ Where can we save money within our organization?
     ▶ How should we turn our data into a product?
  7. 7. From Data to Digital Enterprise. Booz Allen clients have a wide variety of data analysis challenges and IT infrastructures. Our flexible, scalable Cloud Analytics Reference Architecture has three stages or entry points to accommodate these differences.
     In each stage, we enable shifts in technology investments while helping manage risk and maximize the reward. That means leveraging the assets you already own and taking logical steps to add what’s needed. This is the only way to build a structure to instrument data at scale so you can truly experience breakthrough analytics.
     Stage 1, IT Efficiencies. The focus is on saving money and reducing risk. You may have already begun some of these initiatives; we leverage what’s working now as we discover new ways to increase efficiency.
     ▶ Data center consolidation
     ▶ Server and data consolidation
     ▶ Increased automation
     ▶ Modernized security posture and metrics
     ▶ Reduced licensing costs
     Stage 2, Smart Data. We begin to modernize applications to handle the demands of advanced analytics. Faster, reusable, and more intuitive applications will enable everyone in your organization to work smarter.
     ▶ Enhanced Enterprise Data Architecture
     ▶ Clarify pedigree (data tagging)
     ▶ Multidimensional indexing
     ▶ Adopt distributed database
     ▶ Reusable applications
     Stage 3, Cloud Analytics. Significant improvements in performance are realized when you achieve success in managing the flow of information at scale and derive the fullest value from your data.
     ▶ Create deep insight into relevant mission data
     ▶ Ask and answer previously unanswerable questions
     [Figure: the three stages plotted along an axis labeled “IT Maturity.”]
  8. 8. A Better Approach. As the leaders in cloud analytics, Booz Allen has a proven approach delivered by some of the industry’s best talent. Here’s why we’re different:
     Technical framework. Our Architecture combines the collective experience of thousands of people who have road tested technologies from across the cloud solution landscape in hundreds of client organizations, ranging from the U.S. Federal Government to commercial and international clients.
     Best practices. We have an exclusive set of lessons learned and breadth of technical knowledge that saves time and money while reducing risk.
     Core principles. These are “rules of the road” we’ve developed to build the most effective solution with the highest return on investment. They encompass everything from how data should be stored to how to improve relationships with the end users of your data.
     Critical skill sets. We bring technologists as architecture and solutions specialists, domain experts who know your industry and your data, and data scientists who explore and examine data from disparate sources and recommend how best to use it. No one else in the industry offers a better combination of talent.
     Vendor neutrality. Our approach utilizes a broad ecosystem of products and custom systems culled from an exhaustive survey of available options. In the crowded, fragmented, and continually evolving landscape of cloud solutions, we recommend only the best fit and value for your organization.
     A Look Ahead
     Section 2.0, Differentiation: Introduces and diagrams the Architecture, and explains how it reflects Booz Allen’s unique approach. You’ll also read about our core design principles, extensive service offerings, and technology choices.
     Section 3.0, Depth: Takes the Architecture apart layer by layer with detailed visuals, design concepts, and recommended solutions from the cloud vendor landscape. The section ends with a look at how security is built into all levels.
     Section 4.0, Successes: Presents real-world examples from our extensive file of case studies. We present the solutions and challenges, describe and diagram the implementations, and explain the results.
  9. 9. 2.0 Differentiation. In this section: Booz Allen’s approach to cloud analytics is unmatched in the industry. Read about our unique principles and best practices.
  10. 10. A Layered Framework. The Booz Allen Cloud Analytics Reference Architecture incorporates a wide range of services to move from a technology infrastructure with chaotic, distributed data burdened by noise to large-scale data processing and analytics characterized by speed, precision, security, scalability, and cost efficiency. However, Booz Allen’s approach is about much more than infrastructure. We start with your need to make better sense and better use of your mission data, and build from there.
     ▶ Human Insights and Actions: Enabled by customizable interfaces and visualizations of the data
     ▶ Analytics and Services: Your tools for analysis, modeling, testing, and simulations
     ▶ Data Management: The single, secure repository for all of your valuable data
     ▶ Infrastructure: The technology platform for storing and managing your data
     [Figure: layered architecture diagram. Components, top to bottom: Visualization, Reporting, Dashboards, and Query Interface; Services (SOA); Analytics and Discovery; Views and Indexes; Streaming Indexes; Data Lake; Metadata Tagging; Data Sources; Infrastructure/Management.]
     A framework for security: Page 30 details our security processes.
  11. 11. Booz Allen Cloud Analytics Service Offerings
     ▶ Cloud strategy and economics: Delivery of strategy, technology, and economic analysis for evaluating and planning all of the business, technical, operational, and financial aspects of a cloud transition
     ▶ Cloud application migration: Expertise in the assessment, prioritization, architectural mapping, re-engineering, and optimization of workloads that have high value and are ready for migration to the cloud
     ▶ Advanced cloud analytics: Delivery of scalable analytics platforms allowing the processing of information at extreme scale; and eDiscovery: high-volume, full text indexing, and context-based search of information
     ▶ Infrastructure: Design and implementation of IaaS offerings to provide global access to data storage, computing, and networking services on demand through self-service portals
     ▶ Cloud security: Unified risk management approach to define cloud security requirements, controls, and a continuous monitoring framework to address data protection, identity, privacy, regulatory, and compliance risks
     ▶ Software and platform integration: Expertise in the secure implementation of SaaS and PaaS service delivery models, data migration, and integration with existing enterprise infrastructure and applications
     ▶ VDI deployment and optimization: Delivery of flexible and dynamic virtual desktop infrastructure to simplify management, reduce licensing costs, and increase desktop security and data protection requirements
     ▶ Data center migration: Identify critical factors, design, and execute the transformation of legacy IT systems to virtualized and cloud computing environments
  12. 12. Complex Ecosystem. We’ll help you navigate the crowded, fragmented, and continually evolving vendor ecosystem to design a best-of-breed solution for your organization.
     [Figure: vendor ecosystem grouped by Human Insights and Actions, Analytics and Services, Data Management, Infrastructure, and Data Integration.]
  13. 13. Booz Allen Cloud Analytics Core Principles
     ▶ In-situ processing: The Architecture demands that “you send the question to the data,” because most big data processes are disk I/O-bound. In-situ processing means that most of the computation is done locally to the data, so that analytics run faster. This can enhance existing analytic capabilities and/or allow you to ask entirely new types of questions.
     ▶ Use commodity hardware: Hardware should be expected to fail as the normal condition. The Architecture supports both scalability and fault tolerance to achieve optimal application load balancing.
     ▶ Schema on read: If you have all the source data indexed and queryable, plus the ability to create aggregations, then you can manage complex ontologies and demands in a very efficient manner.
     ▶ Throw away nothing: Near-linear scalable hardware and software systems allow much more data to be stored, which enables reprocessing of historical data with new algorithms and correlations that bring new insights.
     ▶ Economies of scale: What used to be called service-oriented architecture (SOA) means that you can define the value and cost of services in your enterprise, and plan your development actions, either to reduce the cost of low-value components or increase the scale of high-value components.
     ▶ Change development process: In order to develop a tight, iterative relationship with your end users, you can develop/research a new capability in hours (not months), and the process of discovery and integration with the rest of the enterprise begins much sooner, too.
     ▶ Data tagging: You can now afford to tag all of your data for sensitivity or other controls (such as geographic). This is the fastest, most reliable way to instrument change across your entire Data Lake.
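To make the schema-on-read principle concrete, here is a minimal sketch in Python. It is an illustration under our own assumptions, not code from the playbook: the sample records, `RAW_STORE`, `PURCHASE_SCHEMA`, and both helper functions are hypothetical names. Records are stored exactly as they arrive, and a schema (here, a set of named extractor functions) is applied only when a question is asked.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator, List

# Raw records are kept exactly as they arrived; nothing is enforced on write.
RAW_STORE: List[str] = [
    '{"user": "alice", "amount": "12.50", "country": "US"}',
    '{"user": "bob", "amount": 7, "region": "EU"}',
    '{"user": "carol", "note": "no purchase"}',
]

Schema = Dict[str, Callable[[Dict[str, Any]], Any]]

# A schema is a set of named extractors that run when the data is read.
PURCHASE_SCHEMA: Schema = {
    "user": lambda r: r.get("user"),
    "amount": lambda r: float(r["amount"]) if "amount" in r else None,
}


def read_with_schema(raw: Iterable[str], schema: Schema) -> Iterator[Dict[str, Any]]:
    """Parse each raw record and project it through the schema at read time."""
    for line in raw:
        record = json.loads(line)
        yield {field: extract(record) for field, extract in schema.items()}


def total_amount(rows: Iterable[Dict[str, Any]]) -> float:
    """Aggregate over whichever rows carry an amount."""
    return sum(row["amount"] for row in rows if row["amount"] is not None)


if __name__ == "__main__":
    rows = list(read_with_schema(RAW_STORE, PURCHASE_SCHEMA))
    print(rows)
    print(total_amount(rows))
```

Because the raw records are never rewritten, a different question later only needs a different schema dictionary; the stored data does not have to be migrated.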
  14. 14. Deeper Insights. The Cloud Analytics Reference Architecture enables staff at all levels to quickly gather and act on granular insights from all of your available data, regardless of its format or location. Below are some of the ways human insights and actions are enhanced by this new framework, which fosters greater collaboration and teamwork, and, ultimately, delivers the highest business value from your information and your computing infrastructure.
     Human Insights and Actions (Decision Makers, Investigators, Interdictors, and Analysts)
     ▶ Real-time alerting, situational awareness, and dissemination specific to their clearance level
     ▶ Investigate and provide feedback on reporting and enhance tradecraft
     ▶ Interact and search using tailored tools
     Analytics and Services (Analysts and Data Scientists)
     ▶ Create and use many views into the same data
     ▶ Automatically find trends and outliers
     ▶ Evaluate analysis methods to determine best-of-breed
     ▶ Reference undiscovered trends in original data
     ▶ Apply advanced machine learning and statistical methods
     ▶ In-situ hypothesis testing
     Data Management (Developers and Data Scientists)
     ▶ No longer constrained by years-old schemas
     ▶ Catalog and index the data that is relevant today
     ▶ Free to create new views and reporting metrics
     Infrastructure (System Administrators and IT Staff)
     ▶ Reduce IT costs through commoditization and economies of scale
     ▶ Meet long-term scalability requirements
  15. 15. 3.0 Depth. In this section: We diagram and describe each layer of the Cloud Analytics Reference Architecture, including our design principles and technology choices.
  16. 16. Reference Architecture. Booz Allen’s Cloud Analytics Reference Architecture provides a holistic approach to people, processes, and technology in four tightly integrated layers.
     Key Attributes. By design, the Booz Allen Cloud Analytics Reference Architecture:
     ▶ Is reliable, allowing distributed storage and replication of bytes across networks and hardware that is assumed to fail at any time
     ▶ Allows for massive, world-scale storage that separates metadata from data
     ▶ Supports a write-once, sporadic append, read-many usage structure
     ▶ Stores records of various sizes, from a few bytes up to a few terabytes in size
     ▶ Allows compute cycles to be easily moved to the data store, instead of moving the data to a processor farm
     Layer 1, Human Insights and Actions. Building on results and outputs from various analytical methods, multiple data visualizations can be created in your new cloud analytics solution. These are used to compose the interactive, real-time dashboard interfaces your decision-makers and analysts need to make sense of your data.
     Layer 2, Analytics and Services. Both traditional and “Big Data” tools and software can operate on the information stored in your Data Lake, producing the advanced, specific analysis, modeling, testing, and simulations you need for decision making.
     Layer 3, Data Management. Your Data Lake is a secure, distributed repository of a wide variety of data sources. Security, metadata, and indexing of Big Data are enabled by distributed key value systems (NoSQL), but the Architecture allows for traditional relational databases as well.
     Layer 4, Infrastructure. This foundational layer allows for quick, streamlined, low-risk deployment of the cloud implementation. The plug-and-play, vendor-neutral framework is unique to Booz Allen.
     A framework for security: Page 30 details our security processes.
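The last key attribute, moving compute to the data store rather than data to a processor farm, can be illustrated with a small sketch. Everything here is hypothetical and simplified: plain Python lists stand in for data partitions held on separate storage nodes, and `local_summary` and `combine` are illustrative names. The point is only that each partition is reduced to a tiny summary where it lives, so just the summaries need to travel.

```python
from typing import Dict, List

Partition = List[int]


def local_summary(partition: Partition) -> Dict[str, float]:
    """Runs next to the data: reduce a whole partition to a small summary."""
    return {"count": float(len(partition)), "total": float(sum(partition))}


def combine(summaries: List[Dict[str, float]]) -> float:
    """Runs centrally: merge the small summaries into a global mean."""
    count = sum(s["count"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / count if count else 0.0


if __name__ == "__main__":
    # Stand-ins for data blocks stored on three different nodes.
    partitions: List[Partition] = [[1, 2, 3], [4, 5], [6]]
    summaries = [local_summary(p) for p in partitions]
    print(combine(summaries))
```

In a real cluster the call to `local_summary` would be scheduled on the node that holds each block; only two numbers per partition would cross the network, regardless of partition size.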
  17. 17. Layer 1: Human Insights and Actions, architecture model. [Figure: diagram with visualization types labeled Monitoring, Exploratory, Geospatial, and Line Chart, above the Analytics and Services layer.]
  18. 18. Human Insights and Actions (continued): Principles and Technologies. In analytics solutions built on the Architecture, the data that’s available and the desired results drive the interfaces, not the other way around. When user communities and stakeholders aren’t restricted by their tools, they can perform complex visualizations to identify patterns they previously couldn’t see.
     That freedom defines the guiding principles behind this first layer of the Architecture:
     ▶ Design and build the framework so that the desired data and analytic results define the visualization
     ▶ Reuse results and outputs of analytics across different visualizations
     ▶ Decouple the underlying analytics and data access from the visualizations and interfaces so that it’s possible to build customized, interactive dashboard interfaces composed of dynamically linked visualizations
     Technology examples
     ▶ HTML5, JavaScript, OWF, Synapse: Lightweight, custom web-based applications and dashboards tailored to specific user communities or stakeholders for data exploration, event alerting, and monitoring, as well as continuous quality improvement
     ▶ Commercial products (Splunk, Pentaho, Datameer Business Infographics, etc.): Out-of-the-box, easy-to-build dashboards for historical trending and real-time monitoring to analyze user transactions, customer behavior, network patterns, security threats, and fraudulent activity
     ▶ Adobe Flex and Adobe Flash: Despite the rise of HTML5, Adobe Flex and Flash applications still remain strong candidates for quickly building and deploying rich user interfaces
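  19. 19. Layer 2: Analytics and Services, architecture model. [Figure: diagram below the Human Insights and Actions layer, with analysis types labeled Time Series and Social Network Analysis, and tool groups labeled “R, SAS, Matlab, Mathematica” and “MapReduce, Hive, Pig, Hama.”]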
  20. 20. Analytics and Services (continued): Principles and Technologies. Frequently where data is concerned, the whole is greater than the sum of its parts. In the most strategic business decisions, the ability to combine multiple types of analyses creates a holistic picture that can lead to much more valuable insight. With the Cloud Analytics Reference Architecture, you can implement different types of analytical methods.
     This integrated approach is an anchor for the guiding principles behind our Analytics and Services layer:
     ▶ Allow both traditional and Big Data analysis tools and software to operate on a centralized repository of data (the Data Lake)
     ▶ Integrate results and outputs of analyses and visualize them on dashboards for decision making
     ▶ Decouple tools from the various types of analyses to make the system more extensible and adaptable
     ▶ Include a service-oriented architecture layer to reuse results and outputs in many different ways relevant to different stakeholders and decision makers
     ▶ Incorporate Certified Catastrophe Risk Analysis (CCRA) to allow a variety of data analysis tools and software to be integrated and used; it also enables results and outputs of analyses to be visualized and used across multiple interfaces
     Technology examples
     ▶ Data Mining: Data mining is used to discover patterns in large datasets and draws from multiple fields including artificial intelligence, machine learning, statistics, and database systems.
     ▶ Machine Learning: Machine learning is used to learn classifiers and prediction models in the absence of an expert. It employs many algorithms in the areas of decision trees, association learning, artificial neural networks, inductive logic programming, support vector machines, clustering, Bayesian networks, genetic algorithms, reinforcement learning, and representation learning.
     ▶ Natural Language Processing (NLP): NLP is used to process unstructured and semi-structured documents for the purposes of information retrieval, sentiment analysis, statistical machine translation, and classification.
     ▶ Network Analysis: Graph theory and social network analysis are used to understand associations and relationships between entities of interest.
     ▶ Statistical Analysis: Traditional statistical methods using univariate and multivariate analysis on relatively small datasets are employed to make inferences, test hypotheses, and summarize data.
  21. 21. Layer 2: Analytics and Services (continued): Discovering Your Data. Before working with Booz Allen, most clients faced a fundamental challenge with data discovery. They didn’t know what data was actually available or how to sort through all of it to identify the most important business problems or trends it could reveal.
     Technical framework. Discovery is intimately related to search and analysis. All three feed into insight in a nonlinear fashion. A search-discovery-analytics process that solves business problems without consuming disproportionate resources meets these user needs:
     ▶ Real-time, ad hoc access to content
     ▶ Aggressive prioritization based on importance to the user and the business
     ▶ Data-driven decision making, which relies on the ability to try different approaches and ideas in order to discover previously unimagined insights
     ▶ Feedback/learning from the past intelligently applied to today’s data
     How Booz Allen simplifies discovery. Other solutions require analysts to break down data into numerous subsets and samples before it can be digested. This expensive, time-consuming process is one of the major roadblocks to turning data into true business intelligence.
     Even though the Booz Allen Cloud Analytics Reference Architecture supports the most advanced analysis, it can also allow your staff to sift through all of your data on a basic level. Without tedious or sophisticated sampling and complex tools, they can discover what’s useful and what’s not useful for a specific business problem.
     How does the Architecture support fast, efficient, and scalable search on entire datasets, not just samples?
     ▶ Bulk and soft real-time indexing enable the solution to handle billions of records with subsecond search and faceting
     ▶ Large-scale, cost-effective storage and processing capabilities accommodate “whole data” consumption and analysis; in-memory caching of critical data ensures applications meet performance requirements
     ▶ NLP and machine learning tools can scale to enhance discovery and analysis on very large datasets
     [Figure: cycle diagram linking Search, Discovery, and Analytics.]
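The slide above mentions indexing that supports search and faceting. As a toy illustration of those two ideas (not the Architecture's implementation, and nowhere near the scale it describes), the sketch below builds an in-memory inverted index and counts facet values over the hits. The class name `TinyIndex`, its methods, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set


class TinyIndex:
    """A toy inverted index: term -> set of document ids, plus per-document facets."""

    def __init__(self) -> None:
        self._facets: Dict[int, Dict[str, str]] = {}
        self._postings: Dict[str, Set[int]] = defaultdict(set)

    def add(self, doc_id: int, text: str, **facets: str) -> None:
        """Index the text at ingest time so later searches avoid a full scan."""
        self._facets[doc_id] = facets
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query: str) -> List[int]:
        """Return ids of documents containing every term in the query."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(hits)

    def facet_counts(self, doc_ids: Iterable[int], facet: str) -> Counter:
        """Count how the hits break down by one facet (for example, data source)."""
        return Counter(
            self._facets[d][facet] for d in doc_ids if facet in self._facets[d]
        )


if __name__ == "__main__":
    index = TinyIndex()
    index.add(1, "failed login from unknown host", source="security_log")
    index.add(2, "login succeeded", source="system_log")
    index.add(3, "failed payment login retry", source="transaction_log")
    hits = index.search("failed login")
    print(hits, index.facet_counts(hits, "source"))
```

Faceting is what lets an analyst see at a glance where the matching records come from before deciding which subset to examine more closely.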
  22. 22. Analytics and Services (continued): How the Data Science Lifecycle Distills Insights. The data science lifecycle consists of three basic steps:
     Step 1. First, data is sampled using a cloud analytics platform. This step may involve a sophisticated analytic that runs in the cloud, such as one that crawls a social network to find people with certain types of relationships with an individual or organization. This sampling can be done using either high-level query languages that are specially made for scalable cloud analytics or low-level developer interfaces.
     Step 2. Next, a data scientist models the data sample in order to understand it better. This is usually done using a statistical modeling environment on the data scientist’s workstation.
     Step 3. Finally, once a trend is established using the model, the data scientist works with analysts and domain experts to explain the trend and yield insights.
     This cycle is repeated until the data science team reaches actionable insights and intelligence that can be presented to senior leadership for decision making purposes. Information may be delivered in a visualization, dashboard, or written report.
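The three steps above can be sketched end to end in a few lines of Python. This is an illustration under our own assumptions, not the playbook's tooling: reservoir sampling stands in for the cloud-side sampling step, an ordinary least-squares line stands in for the statistical model, and a one-word trend label stands in for the explanation that analysts would develop. All function names are hypothetical.

```python
import random
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float]


def reservoir_sample(stream: Iterable[Point], k: int, seed: int = 0) -> List[Point]:
    """Step 1: draw a uniform sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample: List[Point] = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


def fit_line(points: Sequence[Point]) -> Tuple[float, float]:
    """Step 2: fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def describe_trend(slope: float, tolerance: float = 1e-9) -> str:
    """Step 3: turn the fitted model into a statement people can discuss."""
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "decreasing"
    return "flat"


if __name__ == "__main__":
    # Synthetic stream standing in for a large dataset.
    stream = ((float(x), 3.0 * x + 2.0) for x in range(100_000))
    sample = reservoir_sample(stream, k=200)
    slope, intercept = fit_line(sample)
    print(describe_trend(slope), round(slope, 3), round(intercept, 3))
```

The sample is small enough to model on a workstation, which mirrors the split the slide describes between cloud-side sampling and local modeling.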
  23. 23. Layer 3: Data Management, architecture model. [Figure: diagram below the Human Insights and Actions and Analytics and Services layers, with data types labeled Batch, Structured, and Unstructured.]
  24. 24. Data Management (continued): Principles and Technologies. A central feature of the Architecture, the Data Lake delivers on the promise of cloud analytics to offer previously hidden insights and drive better decisions. It’s a secure repository for data of all types and origins. Instead of precategorizing data, which restricts its usability from the moment it enters your organization, the Architecture combines unstructured, structured, and streaming data types and makes them available for many different forms of analysis.
     The following principles demonstrate how the Architecture enables your organization to use this repository of enterprise data to the best advantage:
     ▶ Provide inherent replication of the data through a distributed file system
     ▶ Use distributed key value (NoSQL) data storage to enable security and metadata tagging at the data level as well as indexing for specialized retrieval
     ▶ Relax schema constraints and provide the flexibility to adapt to changing data sources and types with the schema-on-read approach of distributed key value data storage
     ▶ Store the Data Lake on commodity hardware and scale linearly in performance and storage
     ▶ Don’t presummarize or precategorize data
     ▶ Enable rapid ingest of data, aggressive indexing, and dynamic question-focused datasets through scale
     Technology examples
     ▶ Hadoop Distributed File System (HDFS): The primary open-source, distributed storage system creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, rapid computations.
     ▶ Accumulo: NoSQL store based on Google’s BigTable design features cell-level security access labels and a server-side programming mechanism that can modify key/value pairs at various points in the data management process.
     ▶ HBase, Cassandra, MongoDB: Open-source NoSQL databases focused on a combination of consistency, availability, and partition tolerance.
     ▶ Neo4j: NoSQL scalable graph database storing data in nodes and the relationships of a graph.
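The idea of security tagging at the data level, which the slide associates with cell-level access labels, can be sketched as follows. This is a deliberately simplified model and not Accumulo's API: here each cell carries a plain set of labels that must all be held by the reader, whereas a real system may support richer label expressions. `Cell`, `required_labels`, and `visible_cells` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Cell:
    """One key/value cell tagged with the labels a reader must hold to see it."""
    row: str
    column: str
    value: str
    required_labels: FrozenSet[str] = frozenset()


def visible_cells(cells: Iterable[Cell], authorizations: Iterable[str]) -> List[Cell]:
    """Return only the cells whose required labels are all held by the reader."""
    held = frozenset(authorizations)
    return [cell for cell in cells if cell.required_labels <= held]


if __name__ == "__main__":
    table = [
        Cell("patient-1", "name", "J. Doe", frozenset({"pii"})),
        Cell("patient-1", "zip", "20001"),
        Cell("patient-1", "diagnosis", "flu", frozenset({"pii", "medical"})),
    ]
    for cell in visible_cells(table, {"pii"}):
        print(cell.column, cell.value)
```

Filtering at the cell level means the same stored row can serve readers with different clearances without keeping separate copies of the data.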
  25. 25. Layer 3: Data Management (continued): Data Lake. Booz Allen works with organizations in corporate and government sectors that have an urgent need to make sense of volumes of data from diverse sources, including those that had been inaccessible or extremely difficult to utilize, such as streams from social networks. Now analysts and decision makers can form new connections between all of this data to uncover previously hidden trends and relationships.
     [Figure: data types flowing into the Data Lake and the outcomes they support. Data types: enterprise data, machine-to-machine communication, transaction logs, sensor data. Sectors: cyber, government, defense, finance, healthcare. Outcomes: intrusion and malware detection, fraud detection, regulatory compliance, enhanced situational awareness, enhanced forecasting. Example sources: security logs, email, quarterly filings, reports, system logs, financials, press articles.]
     Individual organizations require different types of data. Not all types of data listed above may apply to every organization.
  26. 26. Enhanced Media Content. Booz Allen’s strategy and technology consultants are highly regarded subject matter experts. Through groundbreaking conference keynotes, whiteboard talks, and papers, they help educate and shape the analytics industry. We invite you and your team to take advantage of the educational resources listed below to gain strategic insights about the use of analytics, explore technical topics in depth, and stay on top of the latest trends.
     Presentations
     ▶ Yahoo! Hadoop Summit: Biometric Databases and Hadoop. Invented and demonstrated methods for dense data correlation (e.g., imagery and biometrics) within a Hadoop distributed computing platform using new machine learning parallel methods.
     ▶ Yahoo! Hadoop Summit: Culvert: A Robust Framework for Secondary Indexing of Structured and Unstructured Data. Demonstration of Booz Allen’s secondary indexing solutions and design patterns, which support online index updates as well as a variation of the HIVE query language over Accumulo and other BigTable-like databases to allow indexing one or more columns in a table.
     ▶ Slidecast: Hadoop World: Protein Alignment. Demonstration of advanced analytics in using protein alignment sequences to identify disease markers using Hadoop, HBase, Accumulo, and novel machine learning concepts.
     ▶ Slidecast: Innovative Cyber Defense with Cloud Analytics. Presentation on improving intelligence analysis through a hybrid cloud approach to analytics, with descriptions and diagrams from Booz Allen client solutions.
     ▶ Slidecast: Integrating Tahoe with Hadoop’s MapReduce. Invented and demonstrated method to use least-authority encrypted file system as plugin to HDFS within Hadoop cluster.
     Papers
     ▶ Massive Data Analytics in the Cloud. Overview of the business impact of cloud computing, and how data clouds are shaping new advances in intelligence analysis.
     Videos
     ▶ Cloud Whiteboard Playlist. Short instructional videos on a range of topics from introductory talks for executives to tutorials for data analysts. Check back frequently for new material.
     ▶ Cloud Analytics for Executive Leadership. Booz Allen Principal Josh Sullivan discusses how analysis of data can be used as a tool to provide insight to executives.
     ▶ Informed Decision Making: Sampling Techniques for Cloud Data. Booz Allen Data Scientist Ed Kohlwey explains how sampling large amounts of data can be useful for program managers to make informed decisions.
     ▶ Developer Perspectives: The FuzzyTable Database. Booz Allen Data Scientist Drew Farris explains how to use the FuzzyTable biometrics database.
     Workshop
     ▶ O’Reilly Strata Conference: Beyond MapReduce: Getting Creative with Parallel Processing. Technical discussion of MapReduce as an excellent environment for some parallel computing tasks and the many ways to use a cluster beyond MapReduce.
     Learn More. Scan the QR code, or go directly to: boozallen.com/cloud
  27. 27. Layer 4: Infrastructure
      [Figure: architecture model. Labels: Human Insights and Actions; Analytics and Services; Virtual Desktop Integration.]
  28. 28. Infrastructure (continued)
      PRINCIPLES AND TECHNOLOGIES
      Infrastructure is the foundation for any cloud implementation. What makes the Booz Allen Cloud Analytics Reference Architecture unique is its plug-and-play, vendor-neutral framework. This framework not only allows a greater range of choices in selecting resources and building services, it also allows for a faster, more streamlined, more secure, and lower risk deployment.
      The following principles guide the infrastructure layer of the Architecture:
      ▶▶ Make it easy to transform physical resources from legacy IT systems to secure, virtualized data centers and trusted cloud computing environments
      ▶▶ Implement core services to provide the mechanisms to realize on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service
      ▶▶ Employ virtualization to increase utilization of existing assets and resources, and improve operational effectiveness
      ▶▶ Engineer in-depth security to provide controls and continuous monitoring in order to fully address data protection, identity, privacy, regulatory, and compliance risks
      TECHNOLOGY EXAMPLES
      Amazon Web Services, Microsoft Azure, Puppet, VMware, vSphere: Cloud tool chain for provisioning, configuration, orchestration, and monitoring of virtual environment. These tools provide the building blocks for IaaS and PaaS, and the foundation for SaaS. Run multiple operating systems and virtual network platforms on the same hardware, sharing computing, storage, and networking resources.
      Security through VMware, McAfee, Symantec, Cisco, TripWire, EnCase: Protect assets (physical, logical, and virtual) while automating governance and compliance.
  29. 29. Reference Architecture Security Framework
      A framework for security
      The Architecture is designed to protect your data at rest and in flight, with security controls embedded in each layer. This is obviously more than just a technology challenge. We understand the need to embed new processes and training regimens so your staff handles sensitive data correctly. We also advise you on how to secure your facilities and ensure that all off-premise facilities have the right controls in place as well.
      [Figure: security framework matrix. Labels by layer:
      Business Layer: Business Assets to Be Protected; Threats and Processes that Require Security; Organizational Security (Governance | Supply Chain | Strategic Partnerships); Geography (Distributed Sites | Remote Workers | Jurisdictions); Time Dependencies (Transaction Throughput | Lifetimes and Deadlines).
      Conceptual: Business Attributes; Control and Enablement Objectives Resulting from Risk Assessment; Technical and Management Security Strategies; Security Requirements to Support the Business; Roles and Responsibilities; Trust Relationships; Security Domains, Boundaries, and Associations; Time Dependencies (When Is Protection Relevant?).
      Logical: Business Information to Be Secured; Security and Risk Attributes; Management Policy; Security Services (Authentication | Confidentiality and Integrity Protection); Security Entities; Interrelationships.
      Physical: Security-Related Data Structures (Tables | Messages | Pointers | Certificates | Signatures); Security Rules (Conditions | Practices | Procedures); Security Mechanisms (Encryption | Access Control | Digital Signatures); Human Interface (Screen Formats | User Interaction | Access Control Systems); Security Infrastructure (Physical Layout of Hardware, Software, and Communication Lines); Time Dependencies (Sequence of Processes and Sessions).
      Component: Security IT Products; Risk Management Tools for Monitoring and Reporting; Security Process Tools, Standards, and Protocols; Personnel Management Tools (Identities | Roles | Functions | Access Controls | Lists); Locator Tools (Dynamic Inventory of Nodes | Addresses and Locations); Time Dependencies (Time Schedules | Clocks | Timers and Interrupts).
      Service Management: Service Delivery Management (Assurance of Operational Continuity); Operational Risk Management (Risk Assessment | Monitoring and Reporting); Management of Security Operations (Admin | Backups | Monitoring | Emergency Response); Personnel Management (Account Provisioning | User Support | Management); Management of Environment (Buildings | Sites | Platforms and Networks); Management Schedule (Security-Related | Calendar and Timetable).]
  30. 30. 4.0 Successes: Case Studies
      In this section: These case studies show how Booz Allen uses superior technology and analytics expertise to solve complex problems for clients in a wide range of corporate and government sectors.
  31. 31. Case Studies, Example One: Improving Intelligence Analysis
      Mission
      To fulfill their mission, this organization requires data correlation, quick access to analytic results, ad-hoc queries, advanced scalable analytics, and real-time alerting. To provide their analysts with a continuous pipeline of prioritized, actionable information, they needed a secure, scalable, automated solution that would more quickly and precisely sift through large (and growing) volumes of complex data characterized by a variety of formats and noise. In addition, they needed to leverage their existing analytics infrastructure in the new platform.
      Solutions
      Booz Allen worked closely with the client to adopt a data cloud implementation by augmenting the legacy relational databases with cloud computing and analytics. The design focused on keeping transactional-based queries in the current relational databases, while doing the “heavy lifting” in the cloud and outputting the interesting, processed, or desired analytic results into relational data stores for quick transactional access.
      With many existing systems and applications dependent on the legacy relational database for transactional queries of data, Booz Allen pulled together excess servers from the client’s infrastructure to build a hybrid cloud solution. Also, as the client’s needs change to adapt to the mission, the solution is scalable and flexible to support future innovation and evolution without reengineering.
      Interfaces and Visualizations
      Dashboards, web applications, client applications, and rich clients interfaced and integrated with advanced analytics infrastructure and legacy relational databases through a SOA business logic layer.
      Analytics and Services
      The solution called for predictive analytics to forecast potential events from existing data and anomaly detection to extract potentially significant information and patterns. The solution leveraged the core principle of cloud analytics that enables automated analysis techniques, precomputation, and aggressive indexing.
      Data Management
      The data sources had multiple formats, were large in size, and distressed with noise. The solution created deep insight through fusion of different data types at scale. The solution enabled the ability to follow the lineage or pedigree of the data, allowing the client to map cost in relation to the value of the data or how well it is being used.
      Infrastructure
      The solution used Accumulo (distributed key value systems/NoSQL database) for content normalization and indexing, MapReduce as the precomputation engine, and HDFS for scalable ingest and storage.
      Impact
      Rather than simply focus on gaining IT efficiencies by using cloud technology for infrastructure, Booz Allen focused on applying cloud analytics and in-depth understanding of the organization’s operational and mission needs to extract more value faster from massive datasets. The new cloud solution provided immediate and striking improvements across the increasing volume of structured and unstructured data using aggressive indexing techniques, on-demand analytics, and precomputed results for common analytics. The final solution combined sophistication with scalability, moving the organization from a situation in which analysts stitched together sparse bits of data to a platform for distilling real-time, actionable information from the full aggregation of data.
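The hybrid pattern this case study describes (batch precomputation and indexing in the cloud, with results published to a relational store for quick transactional access) can be illustrated with a small sketch. This is not Booz Allen's implementation: it uses plain Python in place of MapReduce, an in-memory SQLite database in place of the legacy relational database, and no Accumulo or HDFS. The record fields, the key naming scheme, and the function names `map_phase`, `reduce_phase`, `precompute`, and `lookup` are all hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw records; in the case study these would live in HDFS.
RECORDS = [
    {"id": 1, "source": "email", "text": "wire transfer flagged"},
    {"id": 2, "source": "syslog", "text": "login failure flagged"},
    {"id": 3, "source": "email", "text": "wire transfer approved"},
]


def map_phase(record):
    """Emit (key, record id) pairs: one per source, one per distinct term."""
    yield ("source:" + record["source"], record["id"])
    for term in set(record["text"].split()):
        yield ("term:" + term, record["id"])


def reduce_phase(pairs):
    """Group record ids by key, producing an inverted index."""
    index = defaultdict(list)
    for key, record_id in pairs:
        index[key].append(record_id)
    return {key: sorted(ids) for key, ids in index.items()}


def precompute(records, conn):
    """Run the batch job and publish its results to the relational store."""
    pairs = (pair for record in records for pair in map_phase(record))
    index = reduce_phase(pairs)
    conn.execute("CREATE TABLE IF NOT EXISTS idx (key TEXT, record_id INTEGER)")
    conn.execute("DELETE FROM idx")  # replace results from any earlier run
    conn.executemany(
        "INSERT INTO idx (key, record_id) VALUES (?, ?)",
        [(key, rid) for key, ids in index.items() for rid in ids],
    )
    conn.commit()
    return index


def lookup(conn, key):
    """Transactional-style query: a cheap read of precomputed results."""
    rows = conn.execute(
        "SELECT record_id FROM idx WHERE key = ? ORDER BY record_id", (key,)
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
precompute(RECORDS, conn)
print(lookup(conn, "term:wire"))      # [1, 3]
print(lookup(conn, "source:syslog"))  # [2]
```

The point of the split is that existing applications keep issuing simple relational lookups, while the expensive scan over raw data happens once per batch run rather than once per query.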