On November 6th, we got together at Google Campus to talk about Mesos and DC/OS.
Ignacio Mulas, Sparta & Spark Product Owner at Stratio, explained how to build an environment that can secure and govern its data for operational and analytical applications on top of the DC/OS platform. He showed that analytical and machine learning pipelines can be combined with operational processes while maintaining security and providing governance tools to manage the data. He focused on the architecture and tools needed to achieve such an ecosystem and showed a demo of it. He also explained how pipelines can be developed interactively with auto-discovered data catalogs, and how to explore the results.
Find out more: https://www.stratio.com/events/discover-how-to-deploy-a-secure-big-data-pipeline-with-dcos/
4. THE ROOT OF THE PROBLEM OF PHYSICAL COMPANIES HAS BEEN IDENTIFIED: SILOS & APPLICATION CENTRIC
[Diagram] Applications: SAP (ERP), Mobile App, Campaign Manager, CRM, Call center, E-commerce, TPV APP
[Diagram] Data stores: Big Data Lake, Data Marts, Data Warehouse
Problems:
- Lost data
- No Real Time
- 10X Data Replication
- Low TPO/TCO
- 10X Costs
- Day-1 analytics
- Non-integrated vision
- Silos between departments
- Not a real AI
5. SOLUTION: STRATIO DATACENTRIC
Operationalizing Big Data
[Diagram] Applications around the data: Mobile APP, Campaign Management, Digital Marketing, Legacy Applications, Call center, Core Application, ATG, TPV APP, CRM, E-commerce
[Diagram] Layers: Microservices, Api DaaS (Data as a Service), Data intelligence, DATA, MultiDataStore & Multiprocessing, DC/OS (Infrastructure and container manager)
- Microservices of the Data Intelligence layer: new applications are developed through microservice orchestration, reducing code in half
- Unique data at the center and applications around it using it in real time with maximum intelligence
- Operational and Informational Applications use the microservices of the Data as a Service layer
6. Outer look....
Stratio DataCentric
Products: Stratio EOS, Stratio XData, Stratio Sparta, Stratio Discovery, Stratio Governance, Stratio GoSec
- Deploy and manage all your services with a single click
- Gain a centralized vision of all your data and easily govern its access and management
- Apply real-time and batch processing across multiple engines in distributed environments
- Become a truly data-driven company with AI
- Turn difficult concepts into something simple
- Protect your data against security breaches and maintain compliance
Stratio Intelligence: Begin the journey from data to knowledge
Microservices Framework: Design, develop and manage applications easily
8. Key non-functional requirements on data centric
1. Security levels & profiling: In this scenario, we need to support encrypted communications, authentication and authorization mechanisms, auditing, and a centralized, easy-to-use security manager that enforces complex policies on applications and data.
2. Isolation of resources: We should guarantee that each application or user has what it needs to work properly without stepping on others' resources. Mixing different workloads should not affect the correct functioning of the most critical services, i.e. operational microservices vs. big data frameworks.
3. Data governance tools: Bringing everything together imposes new levels of data management requirements, where data is not modelled but auto-discovered and enriched with business context.
4. DevOps productionalization mechanisms: In the cloud and containers era, maintenance and operations are reduced to a minimum thanks to automation mechanisms. Scaling, upgrading, and deploying are day-to-day tasks, so we need easy mechanisms to perform and manage them.
9. Key non-functional requirements on data centric: 1. Security levels & profiling
[Diagram: microservices (µs, µs-2) with SSO, Policies, Audit, and Secrets]
10. Key non-functional requirements on data centric: 2. Isolation of resources
[Diagram: µs, Big Data Process, ...]
- Network isolation
- CPU, RAM, Disk isolation
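The CPU, RAM and disk isolation listed above can be illustrated with a minimal sketch of per-workload quotas. This is not how DC/OS or Stratio implement isolation. The `Resources` and `QuotaPool` names, the pool names, and all quota figures are hypothetical, and network isolation is out of scope here. The point is only that a big data job which exhausts its own pool cannot take capacity reserved for operational microservices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """A bundle of CPU, RAM and disk (hypothetical units)."""
    cpus: float
    mem_mb: int
    disk_mb: int

    def fits_within(self, available: "Resources") -> bool:
        return (self.cpus <= available.cpus
                and self.mem_mb <= available.mem_mb
                and self.disk_mb <= available.disk_mb)


class QuotaPool:
    """Fixed quota reserved for one class of workload."""

    def __init__(self, name: str, quota: Resources) -> None:
        self.name = name
        self.free = quota

    def allocate(self, request: Resources) -> bool:
        """Reserve resources from this pool only; never borrow from another pool."""
        if not request.fits_within(self.free):
            return False
        self.free = Resources(
            self.free.cpus - request.cpus,
            self.free.mem_mb - request.mem_mb,
            self.free.disk_mb - request.disk_mb,
        )
        return True


# Hypothetical quotas: operational microservices and big data jobs get separate pools.
pools = {
    "operational": QuotaPool("operational", Resources(8.0, 16_384, 51_200)),
    "big-data": QuotaPool("big-data", Resources(32.0, 131_072, 512_000)),
}

big_job = pools["big-data"].allocate(Resources(30.0, 100_000, 100_000))
second_job = pools["big-data"].allocate(Resources(4.0, 8_192, 1_024))
microservice = pools["operational"].allocate(Resources(1.0, 1_024, 512))
```

In this sketch the second big data job is rejected because its pool has too few CPUs left, while the microservice is still admitted from the untouched operational pool.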
11. Key non-functional requirements on data centric: 3. Data governance tools
[Diagram: Big Data Tool; Data Dictionary, Business glossary, Lineage]
A process or application needs data to work properly, but we need to maintain certain guarantees:
- Data security:
  - Who are you?
  - Are you authorized to read/write data from here?
- Data process development:
  - Where can I read a trusted source of information containing my clients' emails?
  - Is this personal data? I need to follow GDPR!
  - Can I delete this record? I do not think it is used in our business…
  - Who created this?
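The questions above map naturally onto a small data catalog. The following is a sketch only, assuming a hypothetical in-memory catalog. It does not reflect the Stratio Governance or GoSec APIs, and every dataset, team, and application name is invented for illustration. It shows how an authorization check, a personal-data flag, ownership, and lineage could each answer one of those questions.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """One hypothetical catalog entry: ownership, access lists and lineage."""
    name: str
    created_by: str
    personal_data: bool = False
    readers: set = field(default_factory=set)
    writers: set = field(default_factory=set)
    upstream: list = field(default_factory=list)  # datasets this one is derived from


# Invented example entries.
catalog = {
    "crm.clients": Dataset(
        "crm.clients", created_by="crm-team", personal_data=True,
        readers={"scoring-app"}, writers={"crm-app"},
    ),
    "scoring.features": Dataset(
        "scoring.features", created_by="data-science",
        readers={"scoring-app"}, writers={"scoring-app"},
        upstream=["crm.clients"],
    ),
}


def is_authorized(principal: str, dataset: str, action: str) -> bool:
    """Are you authorized to read/write data from here?"""
    if action not in ("read", "write"):
        raise ValueError(f"unknown action: {action}")
    entry = catalog[dataset]
    allowed = entry.readers if action == "read" else entry.writers
    return principal in allowed


def is_personal(dataset: str) -> bool:
    """Is this personal data? True if the dataset or anything upstream is flagged.

    Assumes the lineage graph has no cycles.
    """
    entry = catalog[dataset]
    return entry.personal_data or any(is_personal(u) for u in entry.upstream)


def downstream_of(dataset: str) -> list:
    """Can I delete this? Lists the datasets that are derived directly from it."""
    return [d.name for d in catalog.values() if dataset in d.upstream]
```

For instance, `downstream_of("crm.clients")` shows that the client table feeds the scoring features, so deleting records from it is not safe without checking that consumer first.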
12. Key non-functional requirements on data centric: 4. DevOps productionalization mechanisms
Panels: Deployment, Monitoring, Management, Evaluation
Different deployment models:
● Replace version
● Blue/Green
● Canary Testing
Management and evaluation:
● Versioning and history
● Rollback mechanisms
● Models retraining
● Functioning Evaluation
● Metrics tracking
● Versions comparison
Applications are monitored on several metrics:
● Application metrics
● Business metrics
● Computational metrics
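As an illustration of canary testing combined with metrics tracking and rollback, here is a minimal sketch. It is not taken from the talk: the function names, the hashing scheme, and the 1% tolerance are assumptions chosen for the example. A fixed share of requests is routed to the new version, and a rollback is suggested when its error rate exceeds the stable version's by more than the tolerance.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Deterministically send roughly canary_percent% of requests to the canary.

    Hashing the request id keeps the decision stable across retries.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


def should_rollback(stable_errors: int, stable_total: int,
                    canary_errors: int, canary_total: int,
                    tolerance: float = 0.01) -> bool:
    """Suggest a rollback when the canary error rate exceeds the stable rate
    by more than `tolerance` (a hypothetical threshold)."""
    if canary_total == 0:
        return False  # no canary traffic observed yet, nothing to judge
    stable_rate = stable_errors / stable_total if stable_total else 0.0
    canary_rate = canary_errors / canary_total
    return canary_rate > stable_rate + tolerance
```

Setting `canary_percent` to 0 or 100 degenerates into the stable-only and full replace-version cases, so the same routing function can cover a gradual rollout from canary to complete replacement.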
15. Functional case: Client Scoring for a financial institution
1. Data exploration: Occurs early in a project; may include viewing sample data, running queries for statistical profiling and exploratory analysis, and visualizing data.
2. Data preparation: Iterative task; may include cleaning, standardizing, transforming, denormalizing, and aggregating data; typically the most time-intensive task of a project.
3. Data validation: Recurring task; may include viewing sample data, running queries for statistical profiling and aggregate analysis, and visualizing data; typically occurs as part of the data exploration, data preparation, development, pre-deployment, and post-deployment phases.
4. Productionalization: Occurs late in a project; may include deploying code to production, backfilling datasets, training models, validating data, and scheduling workflows.
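The first three stages can be sketched on a toy dataset. This is an illustrative example only: the records, column names, and validation rules are invented, and a real client-scoring pipeline would run on a distributed engine rather than on Python lists.

```python
from statistics import mean

# Invented sample records standing in for raw client data.
raw_clients = [
    {"id": "c1", "income": "42000", "country": " es "},
    {"id": "c2", "income": None, "country": "ES"},
    {"id": "c3", "income": "58000", "country": "fr"},
]


def profile(rows, column):
    """Exploration: count rows, missing values and distinct values in a column."""
    values = [r[column] for r in rows]
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


def prepare(rows):
    """Preparation: drop rows without income, cast types, standardize country codes."""
    cleaned = []
    for r in rows:
        if r["income"] is None:
            continue
        cleaned.append({
            "id": r["id"],
            "income": float(r["income"]),
            "country": r["country"].strip().upper(),
        })
    return cleaned


def validate(rows):
    """Validation: return a list of problems; an empty list means the data passed."""
    problems = []
    for r in rows:
        if r["income"] < 0:
            problems.append(f"{r['id']}: negative income")
        if len(r["country"]) != 2:
            problems.append(f"{r['id']}: country is not a two-letter code")
    return problems


income_profile = profile(raw_clients, "income")
prepared = prepare(raw_clients)
issues = validate(prepared)
average_income = mean(r["income"] for r in prepared)
```

Because validation is a recurring task, the same `validate` function could be rerun after each preparation change and again before and after deployment.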
29. New Capabilities…
● Facial Recognition: ability to correctly identify a high percentage of known individuals, given an image of a face. Ability to learn new faces.
● Emotion classification: ability to correctly classify above 65% of a person's emotions, given an image of their face. The emotions identified are: happiness, sadness, surprise, anger.
● Object Recognition: ability to segment and classify objects from images.
● Natural Interaction Agent: ability to talk to humans in a natural way (typing or through voice using a phone terminal). Ability to trigger basic actions based on the identified intent, e.g., "show a document" or "switch on a light bulb".
● Semantic Document Retrieval: ability to find documents based on their content. Queries are expressed through natural interaction using standard text.
● Question Answering: ability to answer specific questions from a text or a document. E.g., "when was Peter born?" => "May 20th, 2001"
● Awareness: ability to manage any amount of data almost instantaneously in order to reach conclusions, create warnings or trigger actions. The data managed by this ability could come from the previous abilities and/or any other external feed.