SlideShare uma empresa Scribd logo
1 de 16
EzBake
A Secure Apps Engine for DoDIIS
Matthew Carroll, GM of 42six
February 06, 2014
Outline
Why Build an Apps Engine?
The Architecture
What’s Next

2!
The DoDIIS App Conundrum
Budget Cuts
• 
• 
• 
• 
• 

There is not enough money to transition over 400+ apps within DIA (business and mission)
Outsourcing IaaS to C2S and GovCloud needs to be monitored for cost reimbursable over time
Application elasticity is critical to understanding true costs of ownership and maintenance
Data is a much bigger cost than expected
Need to consolidate systems engineering support

Technology migration is not simple
•  Most apps are CRUD based; write a report, find a report
•  Security business logic is baked into each app
•  Number one question: why can’t I choose the technology that best fits my app?
Not a Big Data problem….yet
•  On the order of TBs at best
•  Highly connected but not big
Security is the ultimate killer of time
•  Most time is spent meeting PL3 needs and encrypting traffic

3!
Analytics are great but…
THE COMMUNITY DRIVE TO ANALYTICS AND ENRICHMENT
ENGINES HAS LEFT DIA PLAYING CATCH UP IN ITS MIGRATION OF
APPS TO A COMMON PLATFORM.
1. 

Make it as easy as possible for any legacy app to transition

2. 

Will not dictate technologies

3. 

Provide standards for security and access to datastores

4.  The platform must deploy across multiple brokers, i.e. EC2,
OpenStack, VMware and be completely transparent to the app
team

4!
Where to start?
Starting was hard. But it became very clear that several epics were essential to
migrate applications in an efficient manner:
1.  Streaming of data into applications must be done in a standard way. Velocity
and size of data is not as much as a factor to DIA as is the method to which the
data is consumed and distributed. To answer this a stream-based data interface
must be built to support the nexus of data distribution within the environment, we
call this Frack.
2.  Everyone likes the concept of migrating to NoSQL but it becomes unmanageable
from a DevOps perspective if everyone picks their own database for their own use
cases. Furthermore, the point is to be multi-tenant. So we created datasets, a
means to expose indexing patterns instead of explicit databases, exposed
through a common security layer.
3.  Too much time is spent on baking in non-application specific logic into each
application vice supporting a common service tier. In order to build standards
around common service-based functions we built Services.

5!
Integration vs. Engineering
Of the major issues identified early on in
the project the most hindering of issues
was the deployment model.
• 

App teams are spending 80% of their
time integrating to new database and
new services vice building application
functionality

• 

Applications would each follow their
own System Installation Procedure (SIP)
by which each would deploy their own
software

• 

Scale was defined through provisioning
of machines vice true automated
elasticity

• 

Start developing within 1 hour and
deploy capability within 30 days

6!
EzBake
EzBake provides an integrated way to compose the different
elements of your application: collecting, processing, storing, and
querying data.
• 

Focus on application logic

• 

Simple API that leverages complex, distributed frameworks

• 

Easy to use local development kit

• 

Deploy in minutes

• 

Framework is accredited, applications inherit accreditation

• 

Subscription-based data-feed-model

• 

Automated elasticity

• 

Design for failure

7!
The Components
The core of the platform is pure open-source solutions and is broken
into the following primary components:
• 

Streaming Ingest (Frack): This is the interface for building data flow topologies which
abstracts the physical stream processor

• 

Common Services (Procedures): Scaled and commonly used thrift services, typically
utilized during streaming ingest

• 

Data Persistence (Dataset): These are our indexing patterns, called Datasets, exposed
as Thrift services and abstracts the physical databases

• 

Query: Both direct access to Datasets and Aggregate Query across the various
Datasets

• 

Security: Both at the data persistence and user access layers

• 

Batch Analytics: MapReduce abstractions that allow input from Datasets and output to
Datasets and will leverage the GovCloud DataCloud

• 

Deployment: Currently use OpenShift for automated deployment but plan to migrate to
Docker + YARN

8!
Technology Agnostic
Each app has their own needs and it is not on the platform builder to
force the team into a particular technology, rather offer a solution to
meet the use case
• 

Instead of a jack-of-all-trades
indexing for free text search,
geospatial search, etc use mission
specific indices for specific
application logic needs

• 

Focus on storage patterns vice
database specific operations
thereby enforcing data access
standards across the enterprise

• 

Allow for new cartridges for web
frameworks including node.js,
python, Ruby, etc.

9!
The Architecture

10!
Sharing
Sharing is the key component of EzBake in order to achieve cost
savings and provide agility for the application developer
• 

Sharing is exposed via the Common Services and the Aggregate Query

• 

The intent of the Common Services is to expose any functionality currently ingrained
within stove-piped applications. By exposing that functionality as a service, other
applications can leverage it, instead of application teams writing the same logic over and
over again, such as entity extraction, date normalization, etc.

• 

The Common Services are wrapped in Thrift services, scaled out on the virtual
infrastructure deployed through OpenShift

• 

The Aggregate Query is in development for delivery in EzBake v2.0, the current design
will extend Impala to expose the EzBake Datasets as input for the distributed query
engine

• 

App teams will expose “intents” within the Datasets for which they can respond, like a
“person”, “place”, or “event” and the Impala engine will query plan and aggregate the
results back to the requestor

11!
Security
Built-in from the start, EzBake implements security across all features.
• 

Datasets are where the bulk of
the security occurs, applying
row level security to the data
based on the user’s
authorization string

• 

Row level security must be
implemented in different ways,
to support multiple types of
datastores, for example, for the
term dataset, which is
ElasticSearch, we included a
filter plugin that applies the
boolean logic check at query
time

• 

Embedding security across the
platform allows the application
teams to streamline their
accreditation process

12!
Metering and Monitoring
Data driven decisions

• 

Javascript API for web
apps, Thrift API for
services and REST for
others

• 

Improve application
usability/usefulness by
examining analytics on
usage patterns

• 

Diagnose issues with
system, services and apps

• 

Determine cost allocation
based on what agencies
and organizations are
using the system

13!
Timeline

14!
What’s Next
EzBake provides an integrated way to compose the different
elements of your application: collecting, processing, storing, and
querying data.
•  Distributed query via Impala (Intents are coming)
•  Apache Spark integration (dynamic ranking)
•  Graph support - Titan
•  Change YARN to control Docker
•  Upgrade to CDH5
•  Extend Apache Sentry

15!
Questions!
Contact Us!
Matthew Carroll
GM, 42six
matt@42six.com
@mcarroll_

Mais conteúdo relacionado

Mais procurados

Q radar architecture deep dive
Q radar architecture   deep diveQ radar architecture   deep dive
Q radar architecture deep diveKamal Mouline
 
Intel IT Open Cloud - What's under the Hood and How do we Drive it?
Intel IT Open Cloud - What's under the Hood and How do we Drive it?Intel IT Open Cloud - What's under the Hood and How do we Drive it?
Intel IT Open Cloud - What's under the Hood and How do we Drive it?Odinot Stanislas
 
Monitoring and Securing a Geo-Dispersed Data Center at Hill AFB
Monitoring and Securing a Geo-Dispersed Data Center at Hill AFBMonitoring and Securing a Geo-Dispersed Data Center at Hill AFB
Monitoring and Securing a Geo-Dispersed Data Center at Hill AFBElasticsearch
 
5 Paths to HPC - SUSE
5 Paths to HPC - SUSE5 Paths to HPC - SUSE
5 Paths to HPC - SUSEJeff Reser
 
The IBM dashboard for operational metrics
The IBM dashboard for operational metricsThe IBM dashboard for operational metrics
The IBM dashboard for operational metricsPlatform CF
 
User Focused Security at Netflix: Stethoscope
User Focused Security at Netflix: StethoscopeUser Focused Security at Netflix: Stethoscope
User Focused Security at Netflix: StethoscopeJesse Kriss
 
Affecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of EngagementAffecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of EngagementAffecto
 
Countering Threats with the Elastic Stack at CERDEC/ARL
Countering Threats with the Elastic Stack at CERDEC/ARLCountering Threats with the Elastic Stack at CERDEC/ARL
Countering Threats with the Elastic Stack at CERDEC/ARLElasticsearch
 
Centralized logging in a changing environment at the UK’s DVLA
Centralized logging in a changing environment at the UK’s DVLACentralized logging in a changing environment at the UK’s DVLA
Centralized logging in a changing environment at the UK’s DVLAElasticsearch
 
Intel apj cloud big data summit sdi press briefing - panhorst
Intel apj cloud  big data summit   sdi press briefing - panhorstIntel apj cloud  big data summit   sdi press briefing - panhorst
Intel apj cloud big data summit sdi press briefing - panhorstIntelAPAC
 
Data Onboarding Breakout Session
Data Onboarding Breakout SessionData Onboarding Breakout Session
Data Onboarding Breakout SessionSplunk
 
Microservices Architecture & Testing Strategies
Microservices Architecture & Testing StrategiesMicroservices Architecture & Testing Strategies
Microservices Architecture & Testing StrategiesAraf Karsh Hamid
 
PaNDA - a platform for Network Data Analytics: an overview
PaNDA - a platform for Network Data Analytics: an overviewPaNDA - a platform for Network Data Analytics: an overview
PaNDA - a platform for Network Data Analytics: an overviewCisco DevNet
 
Cybersecurity with Apache Metron and Apache Solr - Ward Bekker, Hortonworks &...
Cybersecurity with Apache Metron and Apache Solr - Ward Bekker, Hortonworks &...Cybersecurity with Apache Metron and Apache Solr - Ward Bekker, Hortonworks &...
Cybersecurity with Apache Metron and Apache Solr - Ward Bekker, Hortonworks &...Lucidworks
 
Infrastructure monitoring made easy, from ingest to insight
 Infrastructure monitoring made easy, from ingest to insight Infrastructure monitoring made easy, from ingest to insight
Infrastructure monitoring made easy, from ingest to insightElasticsearch
 
How eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their BusinessHow eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their BusinessElasticsearch
 

Mais procurados (20)

Apache Metron: Community Driven Cyber Security
Apache Metron: Community Driven Cyber Security Apache Metron: Community Driven Cyber Security
Apache Metron: Community Driven Cyber Security
 
Q radar architecture deep dive
Q radar architecture   deep diveQ radar architecture   deep dive
Q radar architecture deep dive
 
Intel IT Open Cloud - What's under the Hood and How do we Drive it?
Intel IT Open Cloud - What's under the Hood and How do we Drive it?Intel IT Open Cloud - What's under the Hood and How do we Drive it?
Intel IT Open Cloud - What's under the Hood and How do we Drive it?
 
Monitoring and Securing a Geo-Dispersed Data Center at Hill AFB
Monitoring and Securing a Geo-Dispersed Data Center at Hill AFBMonitoring and Securing a Geo-Dispersed Data Center at Hill AFB
Monitoring and Securing a Geo-Dispersed Data Center at Hill AFB
 
5 Paths to HPC - SUSE
5 Paths to HPC - SUSE5 Paths to HPC - SUSE
5 Paths to HPC - SUSE
 
The IBM dashboard for operational metrics
The IBM dashboard for operational metricsThe IBM dashboard for operational metrics
The IBM dashboard for operational metrics
 
The Life of an Internet of Things Electron
The Life of an Internet of Things ElectronThe Life of an Internet of Things Electron
The Life of an Internet of Things Electron
 
User Focused Security at Netflix: Stethoscope
User Focused Security at Netflix: StethoscopeUser Focused Security at Netflix: Stethoscope
User Focused Security at Netflix: Stethoscope
 
Affecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of EngagementAffecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of Engagement
 
Countering Threats with the Elastic Stack at CERDEC/ARL
Countering Threats with the Elastic Stack at CERDEC/ARLCountering Threats with the Elastic Stack at CERDEC/ARL
Countering Threats with the Elastic Stack at CERDEC/ARL
 
Apache Spot
Apache SpotApache Spot
Apache Spot
 
Centralized logging in a changing environment at the UK’s DVLA
Centralized logging in a changing environment at the UK’s DVLACentralized logging in a changing environment at the UK’s DVLA
Centralized logging in a changing environment at the UK’s DVLA
 
Multi Cloud Architecture Approach
Multi Cloud Architecture ApproachMulti Cloud Architecture Approach
Multi Cloud Architecture Approach
 
Intel apj cloud big data summit sdi press briefing - panhorst
Intel apj cloud  big data summit   sdi press briefing - panhorstIntel apj cloud  big data summit   sdi press briefing - panhorst
Intel apj cloud big data summit sdi press briefing - panhorst
 
Data Onboarding Breakout Session
Data Onboarding Breakout SessionData Onboarding Breakout Session
Data Onboarding Breakout Session
 
Microservices Architecture & Testing Strategies
Microservices Architecture & Testing StrategiesMicroservices Architecture & Testing Strategies
Microservices Architecture & Testing Strategies
 
PaNDA - a platform for Network Data Analytics: an overview
PaNDA - a platform for Network Data Analytics: an overviewPaNDA - a platform for Network Data Analytics: an overview
PaNDA - a platform for Network Data Analytics: an overview
 
Cybersecurity with Apache Metron and Apache Solr - Ward Bekker, Hortonworks &...
Cybersecurity with Apache Metron and Apache Solr - Ward Bekker, Hortonworks &...Cybersecurity with Apache Metron and Apache Solr - Ward Bekker, Hortonworks &...
Cybersecurity with Apache Metron and Apache Solr - Ward Bekker, Hortonworks &...
 
Infrastructure monitoring made easy, from ingest to insight
 Infrastructure monitoring made easy, from ingest to insight Infrastructure monitoring made easy, from ingest to insight
Infrastructure monitoring made easy, from ingest to insight
 
How eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their BusinessHow eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
 

Destaque

Performance Models for Apache Accumulo
Performance Models for Apache AccumuloPerformance Models for Apache Accumulo
Performance Models for Apache AccumuloSqrrl
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudCloudera, Inc.
 
Building Web Scale Apps with Docker and Mesos by Alex Rukletsov (Mesosphere)
Building Web Scale Apps with Docker and Mesos by Alex Rukletsov (Mesosphere)Building Web Scale Apps with Docker and Mesos by Alex Rukletsov (Mesosphere)
Building Web Scale Apps with Docker and Mesos by Alex Rukletsov (Mesosphere)Docker, Inc.
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...Cloudera, Inc.
 
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester WebinarCloudera, Inc.
 

Destaque (6)

Performance Models for Apache Accumulo
Performance Models for Apache AccumuloPerformance Models for Apache Accumulo
Performance Models for Apache Accumulo
 
Iframe src
Iframe srcIframe src
Iframe src
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
 
Building Web Scale Apps with Docker and Mesos by Alex Rukletsov (Mesosphere)
Building Web Scale Apps with Docker and Mesos by Alex Rukletsov (Mesosphere)Building Web Scale Apps with Docker and Mesos by Alex Rukletsov (Mesosphere)
Building Web Scale Apps with Docker and Mesos by Alex Rukletsov (Mesosphere)
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester Webinar
 

Semelhante a EzBake Secure Apps Engine for DoDIIS

Cloudera federal summit
Cloudera federal summitCloudera federal summit
Cloudera federal summitMatt Carroll
 
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014Amazon Web Services
 
Cloud workload migration guidelines
Cloud workload migration guidelinesCloud workload migration guidelines
Cloud workload migration guidelinesJen Wei Lee
 
Cloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native appsCloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native appsVMware Tanzu
 
Cloud migration presentation
Cloud migration presentationCloud migration presentation
Cloud migration presentationyeshlenchetty
 
Technology insights: Decision Science Platform
Technology insights: Decision Science PlatformTechnology insights: Decision Science Platform
Technology insights: Decision Science PlatformDecision Science Community
 
Migração - EBC on the road Brazil Edition [Portuguese]
Migração - EBC on the road Brazil Edition [Portuguese]Migração - EBC on the road Brazil Edition [Portuguese]
Migração - EBC on the road Brazil Edition [Portuguese]Amazon Web Services
 
Migrating Critical Applications to the Cloud - isaca seattle - sanitized
Migrating Critical Applications to the Cloud - isaca seattle - sanitizedMigrating Critical Applications to the Cloud - isaca seattle - sanitized
Migrating Critical Applications to the Cloud - isaca seattle - sanitizedUnifyCloud
 
Migrating Critical Applications To The Cloud - ISACA Seattle - Sanitized
Migrating Critical Applications To The Cloud - ISACA Seattle - SanitizedMigrating Critical Applications To The Cloud - ISACA Seattle - Sanitized
Migrating Critical Applications To The Cloud - ISACA Seattle - SanitizedNorm Barber
 
Over view of software artitecture
Over view of software artitectureOver view of software artitecture
Over view of software artitectureABDEL RAHMAN KARIM
 
NUS-ISS Learning Day 2018- Designing software to make the most of cloud platf...
NUS-ISS Learning Day 2018- Designing software to make the most of cloud platf...NUS-ISS Learning Day 2018- Designing software to make the most of cloud platf...
NUS-ISS Learning Day 2018- Designing software to make the most of cloud platf...NUS-ISS
 
Enabling multicloud in the enterprise with DevSecOps
Enabling multicloud in the enterprise with DevSecOpsEnabling multicloud in the enterprise with DevSecOps
Enabling multicloud in the enterprise with DevSecOpsJosh Boyd
 
Techniques for Developing Systems in IT Management System
Techniques for Developing Systems in IT Management SystemTechniques for Developing Systems in IT Management System
Techniques for Developing Systems in IT Management SystemGruppo Banca Sella
 
Connect Ops and Security with Flexible Web App and API Protection
Connect Ops and Security with Flexible Web App and API ProtectionConnect Ops and Security with Flexible Web App and API Protection
Connect Ops and Security with Flexible Web App and API ProtectionDevOps.com
 
Radu crahmaliuc 23feb2012
Radu crahmaliuc 23feb2012Radu crahmaliuc 23feb2012
Radu crahmaliuc 23feb2012Agora Group
 
Applying systems thinking to AWS enterprise application migration
Applying systems thinking to AWS enterprise application migrationApplying systems thinking to AWS enterprise application migration
Applying systems thinking to AWS enterprise application migrationKacy Clarke
 
Migrating to Cloud: Inhouse Hadoop to Databricks (3)
Migrating to Cloud: Inhouse Hadoop to Databricks (3)Migrating to Cloud: Inhouse Hadoop to Databricks (3)
Migrating to Cloud: Inhouse Hadoop to Databricks (3)Knoldus Inc.
 
How to add security in dataops and devops
How to add security in dataops and devopsHow to add security in dataops and devops
How to add security in dataops and devopsUlf Mattsson
 
Best Practices for Integrating Applications Development
Best Practices for Integrating Applications DevelopmentBest Practices for Integrating Applications Development
Best Practices for Integrating Applications DevelopmentKovair
 
Building Cloud capability for startups
Building Cloud capability for startupsBuilding Cloud capability for startups
Building Cloud capability for startupsSekhar Mohanty
 

Semelhante a EzBake Secure Apps Engine for DoDIIS (20)

Cloudera federal summit
Cloudera federal summitCloudera federal summit
Cloudera federal summit
 
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
 
Cloud workload migration guidelines
Cloud workload migration guidelinesCloud workload migration guidelines
Cloud workload migration guidelines
 
Cloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native appsCloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native apps
 
Cloud migration presentation
Cloud migration presentationCloud migration presentation
Cloud migration presentation
 
Technology insights: Decision Science Platform
Technology insights: Decision Science PlatformTechnology insights: Decision Science Platform
Technology insights: Decision Science Platform
 
Migração - EBC on the road Brazil Edition [Portuguese]
Migração - EBC on the road Brazil Edition [Portuguese]Migração - EBC on the road Brazil Edition [Portuguese]
Migração - EBC on the road Brazil Edition [Portuguese]
 
Migrating Critical Applications to the Cloud - isaca seattle - sanitized
Migrating Critical Applications to the Cloud - isaca seattle - sanitizedMigrating Critical Applications to the Cloud - isaca seattle - sanitized
Migrating Critical Applications to the Cloud - isaca seattle - sanitized
 
Migrating Critical Applications To The Cloud - ISACA Seattle - Sanitized
Migrating Critical Applications To The Cloud - ISACA Seattle - SanitizedMigrating Critical Applications To The Cloud - ISACA Seattle - Sanitized
Migrating Critical Applications To The Cloud - ISACA Seattle - Sanitized
 
Over view of software artitecture
Over view of software artitectureOver view of software artitecture
Over view of software artitecture
 
NUS-ISS Learning Day 2018- Designing software to make the most of cloud platf...
NUS-ISS Learning Day 2018- Designing software to make the most of cloud platf...NUS-ISS Learning Day 2018- Designing software to make the most of cloud platf...
NUS-ISS Learning Day 2018- Designing software to make the most of cloud platf...
 
Enabling multicloud in the enterprise with DevSecOps
Enabling multicloud in the enterprise with DevSecOpsEnabling multicloud in the enterprise with DevSecOps
Enabling multicloud in the enterprise with DevSecOps
 
Techniques for Developing Systems in IT Management System
Techniques for Developing Systems in IT Management SystemTechniques for Developing Systems in IT Management System
Techniques for Developing Systems in IT Management System
 
Connect Ops and Security with Flexible Web App and API Protection
Connect Ops and Security with Flexible Web App and API ProtectionConnect Ops and Security with Flexible Web App and API Protection
Connect Ops and Security with Flexible Web App and API Protection
 
Radu crahmaliuc 23feb2012
Radu crahmaliuc 23feb2012Radu crahmaliuc 23feb2012
Radu crahmaliuc 23feb2012
 
Applying systems thinking to AWS enterprise application migration
Applying systems thinking to AWS enterprise application migrationApplying systems thinking to AWS enterprise application migration
Applying systems thinking to AWS enterprise application migration
 
Migrating to Cloud: Inhouse Hadoop to Databricks (3)
Migrating to Cloud: Inhouse Hadoop to Databricks (3)Migrating to Cloud: Inhouse Hadoop to Databricks (3)
Migrating to Cloud: Inhouse Hadoop to Databricks (3)
 
How to add security in dataops and devops
How to add security in dataops and devopsHow to add security in dataops and devops
How to add security in dataops and devops
 
Best Practices for Integrating Applications Development
Best Practices for Integrating Applications DevelopmentBest Practices for Integrating Applications Development
Best Practices for Integrating Applications Development
 
Building Cloud capability for startups
Building Cloud capability for startupsBuilding Cloud capability for startups
Building Cloud capability for startups
 

Mais de Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Mais de Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Último

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 

Último (20)

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

EzBake Secure Apps Engine for DoDIIS

  • 1. EzBake A Secure Apps Engine for DoDIIS Matthew Carroll, GM of 42six February 06, 2014
  • 2. Outline Why Build an Apps Engine? The Architecture What’s Next 2!
  • 3. The DoDIIS App Conundrum Budget Cuts •  •  •  •  •  There is not enough money to transition over 400+ apps within DIA (business and mission) Outsourcing IaaS to C2S and GovCloud needs to be monitored for cost reimbursable over time Application elasticity is critical to understanding true costs of ownership and maintenance Data is a much bigger cost than expected Need to consolidate systems engineering support Technology migration is not simple •  Most apps are CRUD based; write a report, find a report •  Security business logic is baked into each app •  Number one question: why can’t I choose the technology that best fits my app? Not a Big Data problem….yet •  On the order of TBs at best •  Highly connected but not big Security is the ultimate killer of time •  Most time is spent meeting PL3 needs and encrypting traffic 3!
  • 4. Analytics are great but… THE COMMUNITY DRIVE TO ANALYTICS AND ENRICHMENT ENGINES HAS LEFT DIA PLAYING CATCH UP IN ITS MIGRATION OF APPS TO A COMMON PLATFORM. 1.  Make it as easy as possible for any legacy app to transition 2.  Will not dictate technologies 3.  Provide standards for security and access to datastores 4.  The platform must deploy across multiple brokers, i.e. EC2, OpenStack, VMware and be completely transparent to the app team 4!
  • 5. Where to start? Starting was hard. But it became very clear that several epics were essential to migrate applications in an efficient manner: 1.  Streaming of data into applications must be done in a standard way. Velocity and size of data is not as much as a factor to DIA as is the method to which the data is consumed and distributed. To answer this a stream-based data interface must be built to support the nexus of data distribution within the environment, we call this Frack. 2.  Everyone likes the concept of migrating to NoSQL but it becomes unmanageable from a DevOps perspective if everyone picks their own database for their own use cases. Furthermore, the point is to be multi-tenant. So we created datasets, a means to expose indexing patterns instead of explicit databases, exposed through a common security layer. 3.  Too much time is spent on baking in non-application specific logic into each application vice supporting a common service tier. In order to build standards around common service-based functions we built Services. 5!
  • 6. Integration vs. Engineering Of the major issues identified early on in the project the most hindering of issues was the deployment model. •  App teams are spending 80% of their time integrating to new database and new services vice building application functionality •  Applications would each follow their own System Installation Procedure (SIP) by which each would deploy their own software •  Scale was defined through provisioning of machines vice true automated elasticity •  Start developing within 1 hour and deploy capability within 30 days 6!
  • 7. EzBake EzBake provides an integrated way to compose the different elements of your application: collecting, processing, storing, and querying data. •  Focus on application logic •  Simple API that leverages complex, distributed frameworks •  Easy to use local development kit •  Deploy in minutes •  Framework is accredited, applications inherit accreditation •  Subscription-based data-feed-model •  Automated elasticity •  Design for failure 7!
  • 8. The Components The core of the platform is pure open-source solutions and is broken into the following primary components: •  Streaming Ingest (Frack): This is the interface for building data flow topologies which abstracts the physical stream processor •  Common Services (Procedures): Scaled and commonly used thrift services, typically utilized during streaming ingest •  Data Persistence (Dataset): These are our indexing patterns, called Datasets, exposed as Thrift services and abstracts the physical databases •  Query: Both direct access to Datasets and Aggregate Query across the various Datasets •  Security: Both at the data persistence and user access layers •  Batch Analytics: MapReduce abstractions that allow input from Datasets and output to Datasets and will leverage the GovCloud DataCloud •  Deployment: Currently use OpenShift for automated deployment but plan to migrate to Docker + YARN 8!
  • 9. Technology Agnostic Each app has their own needs and it is not on the platform builder to force the team into a particular technology, rather offer a solution to meet the use case •  Instead of a jack-of-all-trades indexing for free text search, geospatial search, etc use mission specific indices for specific application logic needs •  Focus on storage patterns vice database specific operations thereby enforcing data access standards across the enterprise •  Allow for new cartridges for web frameworks including node.js, python, Ruby, etc. 9!
  • 11. Sharing Sharing is the key component of EzBake in order to achieve cost savings and provide agility for the application developer •  Sharing is exposed via the Common Services and the Aggregate Query •  The intent of the Common Services is to expose any functionality currently ingrained within stove-piped applications. By exposing that functionality as a service, other applications can leverage it, instead of application teams writing the same logic over and over again, such as entity extraction, date normalization, etc. •  The Common Services are wrapped in Thrift services, scaled out on the virtual infrastructure deployed through OpenShift •  The Aggregate Query is in development for delivery in EzBake v2.0, the current design will extend Impala to expose the EzBake Datasets as input for the distributed query engine •  App teams will expose “intents” within the Datasets for which they can respond, like a “person”, “place”, or “event” and the Impala engine will query plan and aggregate the results back to the requestor 11!
  • 12. Security Built-in from the start, EzBake implements security across all features. •  Datasets are where the bulk of the security occurs, applying row level security to the data based on the user’s authorization string •  Row level security must be implemented in different ways, to support multiple types of datastores, for example, for the term dataset, which is ElasticSearch, we included a filter plugin that applies the boolean logic check at query time •  Embedding security across the platform allows the application teams to streamline their accreditation process 12!
  • 13. Metering and Monitoring Data driven decisions •  Javascript API for web apps, Thrift API for services and REST for others •  Improve application usability/usefulness by examining analytics on usage patterns •  Diagnose issues with system, services and apps •  Determine cost allocation based on what agencies and organizations are using the system 13!
  • 15. What’s Next EzBake provides an integrated way to compose the different elements of your application: collecting, processing, storing, and querying data. •  Distributed query via Impala (Intents are coming) •  Apache Spark integration (dynamic ranking) •  Graph support - Titan •  Change YARN to control Docker •  Upgrade to CDH5 •  Extend Apache Sentry 15!
  • 16. Questions! Contact Us! Matthew Carroll GM, 42six matt@42six.com @mcarroll_