Building an enterprise-ready analytics and operational ecosystem on DC/OS
Ignacio Mulas
Index:
● Data Centric Overview
● Non-functional Requirements
● Functional Case:
○ Data Exploration
○ Data Preparation
○ Data Validation
○ Productionalization
○ Evaluation
Overview
THE ROOT OF THE PROBLEM OF PHYSICAL COMPANIES HAS BEEN IDENTIFIED: SILOS & APPLICATION CENTRIC
Applications (diagram):
● SAP: ERP
● Mobile app
● Campaign manager
● CRM
● Call center
● E-commerce
● TPV (point-of-sale) app
Data stores (diagram): Big Data Lake, Data Marts, Data Warehouse
Problems:
● Lost data
● No real time
● 10X data replication
● Low TPO/TCO
● 10X costs
● Day-1 analytics
● Non-integrated vision
● Silos between departments
● No real AI
SOLUTION: STRATIO DATACENTRIC
Operationalizing Big Data
Applications around the data (diagram): Mobile app, Campaign Management, Digital Marketing, Legacy Applications, Call center, Core Application, ATG, TPV app, CRM, E-commerce
● Microservices of the Data Intelligence layer: new applications are developed through microservice orchestration, reducing code by half
● Unique data at the center, with applications around it using it in real time with maximum intelligence
● Operational and informational applications use the microservices of the Data as a Service layer
Layers (diagram):
● Microservices
● DATA
● Data intelligence
● API DaaS (Data as a Service)
● DC/OS: infrastructure and container manager
● MultiDataStore & Multiprocessing
Outer look...
Stratio DataCentric
Products (diagram): Stratio EOS, Stratio XData, Stratio Sparta, Stratio Discovery, Stratio Governance, Stratio GoSec
Captions (in slide order):
● Deploy and manage all your services with a single click
● Gain a centralized vision of all your data and easily govern its access and management
● Apply real-time and batch processing across multiple engines in distributed environments
● Become a truly data-driven company with AI
● Turn difficult concepts into something simple
● Protect your data against security breaches and maintain compliance
Stratio Intelligence: begin the journey from data to knowledge
Microservices Framework: design, develop and manage applications easily
Non-functional requirements
Key non-functional requirements for a data-centric architecture
1. Security levels & profiling: in this scenario, we need to support encrypted communications, authentication and authorization mechanisms, auditing, and a centralized, easy-to-use security manager that enforces complex policies on applications and data.
2. Isolation of resources: we should guarantee that each application or user has what it needs to work properly without encroaching on others' resources. Mixing different workloads should not affect the correct functioning of the most critical services, e.g. operational microservices vs. big data frameworks.
3. Data governance tools: bringing everything together imposes new data management requirements, where data is not modelled up front but auto-discovered and enriched with business context.
4. DevOps productionalization mechanisms: in the era of cloud and containers, maintenance and operations are reduced to a minimum thanks to automation. Scaling, upgrading and deploying are day-to-day tasks, so we need easy mechanisms to perform and manage them.
Security levels & profiling (diagram): microservices (µs, µs-2), SSO, Policies, Audit, Secrets
Isolation of resources (diagram): microservices (µs) and big data processes side by side
- Network isolation
- CPU, RAM, disk isolation
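The CPU, RAM and disk isolation above can be illustrated with a small sketch. It builds app definitions in the style of a Marathon (DC/OS) app definition, using the `id`, `instances`, `cpus`, `mem`, `disk` and `container` fields, and checks each workload group against its own quota. The app ids, image names, sizes and the quota check are hypothetical and only illustrate the idea; they are not taken from the talk, and network isolation is not covered.

```python
import json


def marathon_app(app_id, image, cpus, mem_mb, disk_mb, instances=1):
    """Build a Marathon-style app definition with explicit per-instance limits."""
    if cpus <= 0 or mem_mb <= 0 or disk_mb < 0 or instances < 1:
        raise ValueError("invalid resource request")
    return {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,      # CPUs per instance (fractions allowed)
        "mem": mem_mb,     # memory per instance, in MB
        "disk": disk_mb,   # disk per instance, in MB
        "container": {"type": "DOCKER", "docker": {"image": image}},
    }


def total_request(apps):
    """Sum what a group of apps asks for across all of their instances."""
    return {
        "cpus": sum(a["cpus"] * a["instances"] for a in apps),
        "mem": sum(a["mem"] * a["instances"] for a in apps),
    }


def within_quota(apps, quota):
    """True if the group stays inside its own CPU and memory quota."""
    total = total_request(apps)
    return total["cpus"] <= quota["cpus"] and total["mem"] <= quota["mem"]


# Hypothetical workloads: an operational microservice and a big data job.
scoring_api = marathon_app(
    "/operational/scoring-api", "example/scoring-api:1.0", 0.5, 512, 0, instances=3
)
batch_job = marathon_app(
    "/bigdata/nightly-batch", "example/batch:1.0", 4, 8192, 1024, instances=2
)

# Hypothetical per-group quotas, so the batch job cannot starve the API.
OPERATIONAL_QUOTA = {"cpus": 2, "mem": 2048}
BIGDATA_QUOTA = {"cpus": 6, "mem": 12288}

if __name__ == "__main__":
    print(json.dumps(scoring_api, indent=2))
    print("operational fits:", within_quota([scoring_api], OPERATIONAL_QUOTA))
    print("big data fits:", within_quota([batch_job], BIGDATA_QUOTA))
```

Keeping a separate quota per workload group is one way to make sure an oversized batch request is rejected instead of competing with the critical microservices.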
Data governance tools (diagram): Big Data Tool, Data Dictionary, Business Glossary, Lineage
A process or application needs data to work properly, but we need to maintain certain guarantees:
- Data security:
  - Who are you?
  - Are you authorized to read/write data from here?
- Data process development:
  - Where can I read a trusted source of information containing my clients' emails?
  - Is this personal data? I need to follow GDPR!
  - Can I delete this record? I do not think it is used in our business…
  - Who created this?
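The questions above map naturally onto a data dictionary with lineage plus an access policy table. The following is a minimal sketch under that assumption; `DatasetEntry`, `CATALOG`, `GRANTS`, the dataset names and the principals are all hypothetical, and this is not how Stratio Governance or GoSec are implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One data dictionary record (hypothetical schema)."""
    name: str
    created_by: str
    personal_data: bool      # relevant for GDPR handling
    trusted: bool            # curated source vs. raw feed
    upstream: tuple = ()     # lineage: datasets this one is derived from


CATALOG = {
    entry.name: entry
    for entry in [
        DatasetEntry("crm.clients", "crm-team", personal_data=True, trusted=True),
        DatasetEntry("web.leads_raw", "marketing", personal_data=True, trusted=False),
        DatasetEntry("scoring.features", "data-science", personal_data=False,
                     trusted=True, upstream=("crm.clients",)),
    ]
}

# (principal, dataset) -> allowed actions; anything not listed is denied.
GRANTS = {("scoring-service", "crm.clients"): {"read"}}


def is_authorized(principal, dataset, action):
    """Are you authorized to read/write data from here?"""
    return action in GRANTS.get((principal, dataset), set())


def trusted_sources(personal_data):
    """Where can I read a trusted source of this kind of information?"""
    return sorted(
        e.name for e in CATALOG.values()
        if e.trusted and e.personal_data == personal_data
    )


def consumers_of(dataset):
    """Can I delete this? Lineage shows which datasets depend on it."""
    return sorted(e.name for e in CATALOG.values() if dataset in e.upstream)


def created_by(dataset):
    """Who created this?"""
    return CATALOG[dataset].created_by


if __name__ == "__main__":
    print(is_authorized("scoring-service", "crm.clients", "read"))
    print(trusted_sources(personal_data=True))
    print(consumers_of("crm.clients"))
    print(created_by("crm.clients"))
```

Denying by default keeps the security question separate from the catalog lookups, so a process can discover that a dataset exists without being able to read it.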
DevOps productionalization mechanisms (diagram): Deployment, Management, Evaluation, Monitoring
Deployment: different deployment models
● Replace version
● Blue/Green
● Canary testing
Management:
● Versioning and history
● Rollback mechanisms
● Model retraining
Evaluation:
● Functioning evaluation
● Metrics tracking
● Version comparison
Monitoring: applications are monitored on several metrics
● Application metrics
● Business metrics
● Computational metrics
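Canary testing and rollback can be sketched in a few lines. The example below assumes a hash-based traffic split and a simple error-rate comparison; the function names, the 1% tolerance and the request ids are hypothetical choices for illustration, not the mechanism described in the talk.

```python
import hashlib


def bucket(request_id: str) -> int:
    """Map a request id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def route(request_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of requests to the new version."""
    return "canary" if bucket(request_id) < canary_percent else "stable"


def should_rollback(stable_error_rate: float, canary_error_rate: float,
                    tolerance: float = 0.01) -> bool:
    """Roll back if the canary is clearly worse than the stable version."""
    return canary_error_rate > stable_error_rate + tolerance


if __name__ == "__main__":
    ids = [f"client-{i}" for i in range(1000)]
    share = sum(route(i, 10) == "canary" for i in ids) / len(ids)
    print(f"canary share: {share:.1%}")
    print("rollback:", should_rollback(0.02, 0.05))
```

Hashing the request id, rather than picking at random, keeps a given client on the same version across requests, which makes version comparison on business metrics easier to interpret.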
Functional case: Client Scoring for a financial institution
1. Data exploration: occurs early in a project; may include viewing sample data, running queries for statistical profiling, exploratory analysis, and visualizing data.
2. Data preparation: iterative task; may include cleaning, standardizing, transforming, denormalizing, and aggregating data; typically the most time-intensive task of a project.
3. Data validation: recurring task; may include viewing sample data, running queries for statistical profiling and aggregate analysis, and visualizing data; typically occurs as part of the data exploration, data preparation, development, pre-deployment, and post-deployment phases.
4. Productionalization: occurs late in a project; may include deploying code to production, backfilling datasets, training models, validating data, and scheduling workflows.
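The first three steps can be illustrated on a toy client table. This is a minimal sketch using only the Python standard library; the records, column names and validation rules are invented for illustration and do not come from the case presented in the talk.

```python
import statistics

# Hypothetical raw client records, as they might arrive from a source system.
RAW = [
    {"client_id": "c1", "income": "42000", "age": "34"},
    {"client_id": "c2", "income": " 58000 ", "age": "51"},
    {"client_id": "c3", "income": "", "age": "27"},        # missing income
    {"client_id": "c2", "income": "58000", "age": "51"},   # duplicate client
]


def prepare(rows):
    """Preparation: standardize types, drop missing income, de-duplicate."""
    seen, clean = set(), []
    for row in rows:
        income = row["income"].strip()
        if not income or row["client_id"] in seen:
            continue
        seen.add(row["client_id"])
        clean.append({
            "client_id": row["client_id"],
            "income": float(income),
            "age": int(row["age"]),
        })
    return clean


def profile(rows, column):
    """Exploration: basic statistical profile of one numeric column."""
    values = [r[column] for r in rows]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
    }


def validate(rows):
    """Validation: return a list of rule violations (empty means valid)."""
    errors = []
    for r in rows:
        if r["income"] < 0:
            errors.append(f"{r['client_id']}: negative income")
        if not 18 <= r["age"] <= 120:
            errors.append(f"{r['client_id']}: age out of range")
    return errors


if __name__ == "__main__":
    clients = prepare(RAW)
    print(profile(clients, "income"))
    print(validate(clients))
```

Because validation is a plain function over prepared rows, the same checks can be rerun before and after deployment, which matches the recurring nature of step 3.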
Data Exploration
Data Preparation
Data Validation
Productionalization - Workflow
Productionalization - Workflow Versioning
Productionalization - Workflow Deployment
Evaluation
BIG DATA CHILD'S PLAY
Questions? :)
New Capabilities…
● Facial Recognition: ability to correctly identify a high percentage of known individuals, given an image of a face. Ability to learn new faces.
● Emotion classification: ability to correctly classify more than 65% of people's emotions, given an image of a face. The emotions identified are: happiness, sadness, surprise, anger.
● Object Recognition: ability to segment and classify objects in images.
● Natural Interaction Agent: ability to talk to humans in a natural way (by typing, or by voice using a phone terminal). Ability to trigger basic actions based on the identified intent, e.g., "show a document" or "switch on a light bulb".
● Semantic Document Retrieval: ability to find documents based on their content. Queries are made through natural interaction using standard text.
● Question Answering: ability to answer specific questions from a text or a document, e.g., "when was Peter born?" => "May 20th, 2001"
● Awareness: ability to manage any amount of data almost instantaneously in order to reach conclusions, create warnings or trigger actions. The data managed by this ability could come from the previous abilities and/or any other external feed.
Mesos Meetup - Building an enterprise-ready analytics and operational ecosystem on DC/OS

More Related Content

What's hot

Getting Started with Splunk Enterprise Hands-On
Getting Started with Splunk Enterprise Hands-OnGetting Started with Splunk Enterprise Hands-On
Getting Started with Splunk Enterprise Hands-OnSplunk
 
Horizontal Scaling for Millions of Customers!
Horizontal Scaling for Millions of Customers! Horizontal Scaling for Millions of Customers!
Horizontal Scaling for Millions of Customers! elangovans
 
Dev talks 2021 Data Science @crowdstrike
Dev talks 2021   Data Science @crowdstrikeDev talks 2021   Data Science @crowdstrike
Dev talks 2021 Data Science @crowdstrikeRuxandra Burtica
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoopNiel Dunnage
 
SAP Cloud security overview 2.0
SAP Cloud security overview 2.0SAP Cloud security overview 2.0
SAP Cloud security overview 2.0Rasmi Swain
 
Challenges with Cloud Security by Ken Y Chan
Challenges with Cloud Security by Ken Y ChanChallenges with Cloud Security by Ken Y Chan
Challenges with Cloud Security by Ken Y ChanKen Chan
 
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICSHIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICSHappiest Minds Technologies
 
Hands-On Security Breakout Session- Disrupting the Kill Chain
Hands-On Security Breakout Session- Disrupting the Kill ChainHands-On Security Breakout Session- Disrupting the Kill Chain
Hands-On Security Breakout Session- Disrupting the Kill ChainSplunk
 
Deep Learning Image Processing Applications in the Enterprise
Deep Learning Image Processing Applications in the EnterpriseDeep Learning Image Processing Applications in the Enterprise
Deep Learning Image Processing Applications in the EnterpriseGanesan Narayanasamy
 
How Cloudera SDX can aid GDPR compliance 6.21.18
How Cloudera SDX can aid GDPR compliance 6.21.18How Cloudera SDX can aid GDPR compliance 6.21.18
How Cloudera SDX can aid GDPR compliance 6.21.18Cloudera, Inc.
 
Security Breakout Session
Security Breakout Session Security Breakout Session
Security Breakout Session Splunk
 
Observability – the good, the bad, and the ugly
Observability – the good, the bad, and the uglyObservability – the good, the bad, and the ugly
Observability – the good, the bad, and the uglyTimetrix
 
A Little Security For Big Data
A Little Security For Big DataA Little Security For Big Data
A Little Security For Big DataSaurabh Kheni
 
Introduction to Cloud Applications
Introduction to Cloud ApplicationsIntroduction to Cloud Applications
Introduction to Cloud ApplicationsDataStax
 
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...Cloudera, Inc.
 
Creating A Solvency II Data Governance Framework
Creating A Solvency II Data Governance FrameworkCreating A Solvency II Data Governance Framework
Creating A Solvency II Data Governance Frameworkcolinrickard
 
Preparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissancePreparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissanceCloudera, Inc.
 

What's hot (19)

Observability
ObservabilityObservability
Observability
 
Getting Started with Splunk Enterprise Hands-On
Getting Started with Splunk Enterprise Hands-OnGetting Started with Splunk Enterprise Hands-On
Getting Started with Splunk Enterprise Hands-On
 
Horizontal Scaling for Millions of Customers!
Horizontal Scaling for Millions of Customers! Horizontal Scaling for Millions of Customers!
Horizontal Scaling for Millions of Customers!
 
Dev talks 2021 Data Science @crowdstrike
Dev talks 2021   Data Science @crowdstrikeDev talks 2021   Data Science @crowdstrike
Dev talks 2021 Data Science @crowdstrike
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoop
 
SAP Cloud security overview 2.0
SAP Cloud security overview 2.0SAP Cloud security overview 2.0
SAP Cloud security overview 2.0
 
Challenges with Cloud Security by Ken Y Chan
Challenges with Cloud Security by Ken Y ChanChallenges with Cloud Security by Ken Y Chan
Challenges with Cloud Security by Ken Y Chan
 
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICSHIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
HIGH-IMPACT USE CASES POWERED BY NEXT-GENERATION NETWORK ANALYTICS
 
Hands-On Security Breakout Session- Disrupting the Kill Chain
Hands-On Security Breakout Session- Disrupting the Kill ChainHands-On Security Breakout Session- Disrupting the Kill Chain
Hands-On Security Breakout Session- Disrupting the Kill Chain
 
Deep Learning Image Processing Applications in the Enterprise
Deep Learning Image Processing Applications in the EnterpriseDeep Learning Image Processing Applications in the Enterprise
Deep Learning Image Processing Applications in the Enterprise
 
How Cloudera SDX can aid GDPR compliance 6.21.18
How Cloudera SDX can aid GDPR compliance 6.21.18How Cloudera SDX can aid GDPR compliance 6.21.18
How Cloudera SDX can aid GDPR compliance 6.21.18
 
SIEM game changer
SIEM game changerSIEM game changer
SIEM game changer
 
Security Breakout Session
Security Breakout Session Security Breakout Session
Security Breakout Session
 
Observability – the good, the bad, and the ugly
Observability – the good, the bad, and the uglyObservability – the good, the bad, and the ugly
Observability – the good, the bad, and the ugly
 
A Little Security For Big Data
A Little Security For Big DataA Little Security For Big Data
A Little Security For Big Data
 
Introduction to Cloud Applications
Introduction to Cloud ApplicationsIntroduction to Cloud Applications
Introduction to Cloud Applications
 
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
 
Creating A Solvency II Data Governance Framework
Creating A Solvency II Data Governance FrameworkCreating A Solvency II Data Governance Framework
Creating A Solvency II Data Governance Framework
 
Preparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity RenaissancePreparing for the Cybersecurity Renaissance
Preparing for the Cybersecurity Renaissance
 

Similar to Mesos Meetup - Building an enterprise-ready analytics and operational ecosystem on DC/OS

Kansen voor marketing om snel en slim te analyseren en te activeren
Kansen voor marketing om snel en slim te analyseren en te activerenKansen voor marketing om snel en slim te analyseren en te activeren
Kansen voor marketing om snel en slim te analyseren en te activerenJordie van Rijn
 
Driving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data AssetsDriving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data AssetsEmbarcadero Technologies
 
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...Elemica
 
Top learnings from evaluating and implementing a DLP Solution
Top learnings from evaluating and implementing a DLP Solution Top learnings from evaluating and implementing a DLP Solution
Top learnings from evaluating and implementing a DLP Solution Priyanka Aash
 
Gdpr ccpa steps to near as close to compliancy as possible with low risk of f...
Gdpr ccpa steps to near as close to compliancy as possible with low risk of f...Gdpr ccpa steps to near as close to compliancy as possible with low risk of f...
Gdpr ccpa steps to near as close to compliancy as possible with low risk of f...Steven Meister
 
Digital Transformation in the Cloud: What They Don’t Always Tell You [2020]
Digital Transformation in the Cloud: What They Don’t Always Tell You [2020]Digital Transformation in the Cloud: What They Don’t Always Tell You [2020]
Digital Transformation in the Cloud: What They Don’t Always Tell You [2020]Tudor Damian
 
How to analyze text data for AI and ML with Named Entity Recognition
How to analyze text data for AI and ML with Named Entity RecognitionHow to analyze text data for AI and ML with Named Entity Recognition
How to analyze text data for AI and ML with Named Entity RecognitionSkyl.ai
 
infox technologies
infox technologiesinfox technologies
infox technologiesfidharash
 
The Digital Manufacturer
The Digital ManufacturerThe Digital Manufacturer
The Digital ManufacturerPercy-Mitchell
 
The digital-manufacturer
The digital-manufacturerThe digital-manufacturer
The digital-manufacturerPercy-Mitchell
 
The digital-manufacturer
The digital-manufacturer The digital-manufacturer
The digital-manufacturer Percy-Mitchell
 
final oracle presentation
final oracle presentationfinal oracle presentation
final oracle presentationPriyesh Patel
 
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdfThe Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdfData Science Council of America
 
SG Data Mgt - Findings and Recommendations.pptx
SG Data Mgt - Findings and Recommendations.pptxSG Data Mgt - Findings and Recommendations.pptx
SG Data Mgt - Findings and Recommendations.pptxssuser57f752
 
EMA Presentation: Driving Business Value with Continuous Operational Intellig...
EMA Presentation: Driving Business Value with Continuous Operational Intellig...EMA Presentation: Driving Business Value with Continuous Operational Intellig...
EMA Presentation: Driving Business Value with Continuous Operational Intellig...ExtraHop Networks
 
GDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationGDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationDenodo
 
Information Driven Enterprise Architecture - Connected Brains 2018
Information Driven Enterprise Architecture - Connected Brains 2018Information Driven Enterprise Architecture - Connected Brains 2018
Information Driven Enterprise Architecture - Connected Brains 2018LoQutus
 
8 Tools For Digital Transformation For Every Leader.pdf
8 Tools For Digital Transformation For Every Leader.pdf8 Tools For Digital Transformation For Every Leader.pdf
8 Tools For Digital Transformation For Every Leader.pdflearntransformation0
 
Agile Mumbai 2022 - Balvinder Kaur & Sushant Joshi | Real-Time Insights and A...
Agile Mumbai 2022 - Balvinder Kaur & Sushant Joshi | Real-Time Insights and A...Agile Mumbai 2022 - Balvinder Kaur & Sushant Joshi | Real-Time Insights and A...
Agile Mumbai 2022 - Balvinder Kaur & Sushant Joshi | Real-Time Insights and A...AgileNetwork
 
How To Build Mature SM - final
How To Build Mature SM - finalHow To Build Mature SM - final
How To Build Mature SM - finalDanijel Božić
 

Similar to Mesos Meetup - Building an enterprise-ready analytics and operational ecosystem on DC/OS (20)

Kansen voor marketing om snel en slim te analyseren en te activeren
Kansen voor marketing om snel en slim te analyseren en te activerenKansen voor marketing om snel en slim te analyseren en te activeren
Kansen voor marketing om snel en slim te analyseren en te activeren
 
Driving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data AssetsDriving Business Value Through Agile Data Assets
Driving Business Value Through Agile Data Assets
 
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
Sergio Juarez, Elemica – “From Big Data to Value: The Power of Master Data Ma...
 
Top learnings from evaluating and implementing a DLP Solution
Top learnings from evaluating and implementing a DLP Solution Top learnings from evaluating and implementing a DLP Solution
Top learnings from evaluating and implementing a DLP Solution
 
Gdpr ccpa steps to near as close to compliancy as possible with low risk of f...
Gdpr ccpa steps to near as close to compliancy as possible with low risk of f...Gdpr ccpa steps to near as close to compliancy as possible with low risk of f...
Gdpr ccpa steps to near as close to compliancy as possible with low risk of f...
 
Digital Transformation in the Cloud: What They Don’t Always Tell You [2020]
Digital Transformation in the Cloud: What They Don’t Always Tell You [2020]Digital Transformation in the Cloud: What They Don’t Always Tell You [2020]
Digital Transformation in the Cloud: What They Don’t Always Tell You [2020]
 
How to analyze text data for AI and ML with Named Entity Recognition
How to analyze text data for AI and ML with Named Entity RecognitionHow to analyze text data for AI and ML with Named Entity Recognition
How to analyze text data for AI and ML with Named Entity Recognition
 
infox technologies
infox technologiesinfox technologies
infox technologies
 
The Digital Manufacturer
The Digital ManufacturerThe Digital Manufacturer
The Digital Manufacturer
 
The digital-manufacturer
The digital-manufacturerThe digital-manufacturer
The digital-manufacturer
 
The digital-manufacturer
The digital-manufacturer The digital-manufacturer
The digital-manufacturer
 
final oracle presentation
final oracle presentationfinal oracle presentation
final oracle presentation
 
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdfThe Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
 
SG Data Mgt - Findings and Recommendations.pptx
SG Data Mgt - Findings and Recommendations.pptxSG Data Mgt - Findings and Recommendations.pptx
SG Data Mgt - Findings and Recommendations.pptx
 
EMA Presentation: Driving Business Value with Continuous Operational Intellig...
EMA Presentation: Driving Business Value with Continuous Operational Intellig...EMA Presentation: Driving Business Value with Continuous Operational Intellig...
EMA Presentation: Driving Business Value with Continuous Operational Intellig...
 
GDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationGDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data Virtualization
 
Information Driven Enterprise Architecture - Connected Brains 2018
Information Driven Enterprise Architecture - Connected Brains 2018Information Driven Enterprise Architecture - Connected Brains 2018
Information Driven Enterprise Architecture - Connected Brains 2018
 
8 Tools For Digital Transformation For Every Leader.pdf
8 Tools For Digital Transformation For Every Leader.pdf8 Tools For Digital Transformation For Every Leader.pdf
8 Tools For Digital Transformation For Every Leader.pdf
 
Agile Mumbai 2022 - Balvinder Kaur & Sushant Joshi | Real-Time Insights and A...
Agile Mumbai 2022 - Balvinder Kaur & Sushant Joshi | Real-Time Insights and A...Agile Mumbai 2022 - Balvinder Kaur & Sushant Joshi | Real-Time Insights and A...
Agile Mumbai 2022 - Balvinder Kaur & Sushant Joshi | Real-Time Insights and A...
 
How To Build Mature SM - final
How To Build Mature SM - finalHow To Build Mature SM - final
How To Build Mature SM - final
 

More from Stratio

Can an intelligent system exist without awareness? BDS18
Can an intelligent system exist without awareness? BDS18Can an intelligent system exist without awareness? BDS18
Can an intelligent system exist without awareness? BDS18Stratio
 
Kafka and KSQL - Apache Kafka Meetup
Kafka and KSQL - Apache Kafka MeetupKafka and KSQL - Apache Kafka Meetup
Kafka and KSQL - Apache Kafka MeetupStratio
 
Wild Data - The Data Science Meetup
Wild Data - The Data Science MeetupWild Data - The Data Science Meetup
Wild Data - The Data Science MeetupStratio
 
Ensemble methods in Machine Learning
Ensemble methods in Machine Learning Ensemble methods in Machine Learning
Ensemble methods in Machine Learning Stratio
 
Stratio Sparta 2.0
Stratio Sparta 2.0Stratio Sparta 2.0
Stratio Sparta 2.0Stratio
 
Big Data Security: Facing the challenge
Big Data Security: Facing the challengeBig Data Security: Facing the challenge
Big Data Security: Facing the challengeStratio
 
Operationalizing Big Data
Operationalizing Big DataOperationalizing Big Data
Operationalizing Big DataStratio
 
Artificial Intelligence on Data Centric Platform
Artificial Intelligence on Data Centric PlatformArtificial Intelligence on Data Centric Platform
Artificial Intelligence on Data Centric PlatformStratio
 
Introduction to Artificial Neural Networks
Introduction to Artificial Neural NetworksIntroduction to Artificial Neural Networks
Introduction to Artificial Neural NetworksStratio
 
“A Distributed Operational and Informational Technological Stack”
“A Distributed Operational and Informational Technological Stack” “A Distributed Operational and Informational Technological Stack”
“A Distributed Operational and Informational Technological Stack” Stratio
 
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...Stratio
 
Lunch&Learn: Combinación de modelos
Lunch&Learn: Combinación de modelosLunch&Learn: Combinación de modelos
Lunch&Learn: Combinación de modelosStratio
 
Meetup: Spark + Kerberos
Meetup: Spark + KerberosMeetup: Spark + Kerberos
Meetup: Spark + KerberosStratio
 
Distributed Logistic Model Trees
Distributed Logistic Model TreesDistributed Logistic Model Trees
Distributed Logistic Model TreesStratio
 
Multiplaform Solution for Graph Datasources
Multiplaform Solution for Graph DatasourcesMultiplaform Solution for Graph Datasources
Multiplaform Solution for Graph DatasourcesStratio
 
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016Stratio
 
[Strata] Sparkta
[Strata] Sparkta[Strata] Sparkta
[Strata] SparktaStratio
 
Introduction to Asynchronous scala
Introduction to Asynchronous scalaIntroduction to Asynchronous scala
Introduction to Asynchronous scalaStratio
 
Functional programming in scala
Functional programming in scalaFunctional programming in scala
Functional programming in scalaStratio
 
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Stratio
 

More from Stratio (20)

Can an intelligent system exist without awareness? BDS18
Can an intelligent system exist without awareness? BDS18Can an intelligent system exist without awareness? BDS18
Can an intelligent system exist without awareness? BDS18
 
Kafka and KSQL - Apache Kafka Meetup
Kafka and KSQL - Apache Kafka MeetupKafka and KSQL - Apache Kafka Meetup
Kafka and KSQL - Apache Kafka Meetup
 
Wild Data - The Data Science Meetup
Wild Data - The Data Science MeetupWild Data - The Data Science Meetup
Wild Data - The Data Science Meetup
 
Ensemble methods in Machine Learning
Ensemble methods in Machine Learning Ensemble methods in Machine Learning
Ensemble methods in Machine Learning
 
Stratio Sparta 2.0
Stratio Sparta 2.0Stratio Sparta 2.0
Stratio Sparta 2.0
 
Big Data Security: Facing the challenge
Big Data Security: Facing the challengeBig Data Security: Facing the challenge
Big Data Security: Facing the challenge
 
Operationalizing Big Data
Operationalizing Big DataOperationalizing Big Data
Operationalizing Big Data
 
Artificial Intelligence on Data Centric Platform
Artificial Intelligence on Data Centric PlatformArtificial Intelligence on Data Centric Platform
Artificial Intelligence on Data Centric Platform
 
Introduction to Artificial Neural Networks
Introduction to Artificial Neural NetworksIntroduction to Artificial Neural Networks
Introduction to Artificial Neural Networks
 
“A Distributed Operational and Informational Technological Stack”
“A Distributed Operational and Informational Technological Stack” “A Distributed Operational and Informational Technological Stack”
“A Distributed Operational and Informational Technological Stack”
 
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...
 
Lunch&Learn: Combinación de modelos
Lunch&Learn: Combinación de modelosLunch&Learn: Combinación de modelos
Lunch&Learn: Combinación de modelos
 
Meetup: Spark + Kerberos
Meetup: Spark + KerberosMeetup: Spark + Kerberos
Meetup: Spark + Kerberos
 
Distributed Logistic Model Trees
Distributed Logistic Model TreesDistributed Logistic Model Trees
Distributed Logistic Model Trees
 
Multiplaform Solution for Graph Datasources
Multiplaform Solution for Graph DatasourcesMultiplaform Solution for Graph Datasources
Multiplaform Solution for Graph Datasources
 
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016
 
[Strata] Sparkta
[Strata] Sparkta[Strata] Sparkta
[Strata] Sparkta
 
Introduction to Asynchronous scala
Introduction to Asynchronous scalaIntroduction to Asynchronous scala
Introduction to Asynchronous scala
 
Functional programming in scala
Functional programming in scalaFunctional programming in scala
Functional programming in scala
 
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015Spark Streaming @ Berlin Apache Spark Meetup, March 2015
Spark Streaming @ Berlin Apache Spark Meetup, March 2015
 

Mesos Meetup - Building an enterprise-ready analytics and operational ecosystem on DC/OS

  • 1. Building an enterprise-ready analytics and operational ecosystem on DC/OS Ignacio Mulas
  • 2. Index:
      ● Data Centric Overview
      ● Non-functional Requirements
      ● Functional Case:
          ○ Data Exploration
          ○ Data Preparation
          ○ Data Validation
          ○ Productionalization
          ○ Evaluation
  • 4. THE ROOT OF THE PROBLEM OF PHYSICAL COMPANIES HAS BEEN IDENTIFIED: SILOS & APPLICATION CENTRIC
      Diagram labels: SAP : ERP, Mobile App, Campaign Manager, CRM, Call center, E-commerce, TPV APP, Big Data Lake, DATA MART, DATA MART, DATA WAREHOUSE
      Problems: Lost data; No Real Time; 10X Data Replication; Low TPO/TCO; 10X Costs; Day-1 analytics; Non-integrated vision; Silos between departments; Not a real AI
  • 5. SOLUTION: STRATIO DATACENTRIC. Operationalizing Big Data
      Applications around the data: Mobile APP, Campaign Management, Digital Marketing, Legacy Applications, Call center, Core Application, ATG, TPV APP, CRM, E-commerce
      Layers: DATA; Data intelligence; Microservices of the Data Intelligence layer; API DaaS (Data as a Service); Microservices; MultiDataStore & Multiprocessing; DC/OS (infrastructure and container manager)
      ● Unique data at the center, with applications around it using it in real time with maximum intelligence
      ● Operational and Informational Applications use the microservices of the Data as a Service layer
      ● New Applications are developed through microservice orchestration, reducing code in half
  • 6. Outer look.... Stratio DataCentric
      Products: Stratio EOS, Stratio XData, Stratio Sparta, Stratio Discovery, Stratio Governance, Stratio GoSec
      Taglines: Deploy and manage all your services with a single click; Gain a centralized vision of all your data and easily govern its access and management; Apply real-time and batch processing across multiple engines in distributed environments; Become a truly data-driven company with AI; Turn difficult concepts into something simple; Protect your data against security breaches and maintain compliance
      Stratio Intelligence: Begin the journey from data to knowledge
      Microservices Framework: Design, develop and manage applications easily
  • 8. Key non-functional requirements on data centric
      1. Security levels & profiling: in this scenario, we need to support encrypted communications, authentication and authorization mechanisms, auditing, and a centralized, easy-to-use security manager that enforces complex policies on applications and data.
      2. Isolation of resources: we should guarantee that each application or user has what it needs to work properly without stepping on others' resources. Mixing different workloads should not affect the correct functioning of the most critical services, e.g. operational microservices vs. big data frameworks.
      3. Data governance tools: bringing everything together imposes new levels of data management requirements, where data is not modelled but auto-discovered and enriched with business context.
      4. DevOps productionalization mechanisms: in the cloud and containers era, maintenance and operations are reduced to a minimum thanks to automation. Scaling, upgrading, and deploying are day-to-day tasks, so we need easy mechanisms to perform and manage them.
  • 9. Key non-functional requirements on data centric
      1. Security levels & profiling: in this scenario, we need to support encrypted communications, authentication and authorization mechanisms, auditing, and a centralized, easy-to-use security manager that enforces complex policies on applications and data.
      Diagram labels: µs, µs-2, SSO, Policies, Audit, Secrets
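The centralized policy enforcement and auditing named on this slide can be illustrated with a minimal sketch. It is not Stratio GoSec's API: the `Policy` and `SecurityManager` classes, the principal and resource names, and the default-deny rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: which actions a principal may perform on a resource."""
    principal: str
    resource: str
    actions: frozenset


@dataclass
class SecurityManager:
    """Hypothetical central manager: one place to decide and to audit."""
    policies: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def is_allowed(self, principal: str, resource: str, action: str) -> bool:
        # Default deny: access is granted only if some policy matches exactly.
        allowed = any(
            p.principal == principal
            and p.resource == resource
            and action in p.actions
            for p in self.policies
        )
        # Every decision is recorded, whether it was granted or refused.
        self.audit_log.append((principal, resource, action, allowed))
        return allowed
```

In this sketch a microservice would call `is_allowed` before touching data, so that policies and the audit trail live in one component instead of being reimplemented per application.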
  • 10. Key non-functional requirements on data centric
      2. Isolation of resources: we should guarantee that each application or user has what it needs to work properly without stepping on others' resources. Mixing different workloads should not affect the correct functioning of the most critical services, e.g. operational microservices vs. big data frameworks.
      Diagram labels: µs, Big Data Process, ...
      - Network isolation
      - CPU, RAM, Disk isolation
  • 11. Key non-functional requirements on data centric
      3. Data governance tools: bringing everything together imposes new levels of data management requirements, where data is not modelled but auto-discovered and enriched with business context.
      Diagram labels: Big Data Tool, Data Dictionary, Business glossary, Lineage
      A process or application needs data to work properly, but we need to maintain certain guarantees:
      - Data security:
          - Who are you?
          - Are you authorized to read/write data from here?
      - Data process development:
          - Where can I read a trusted source of information containing my clients' emails?
          - Is this personal data? I need to follow GDPR!
          - Can I delete this record? I do not think it is used in our business…
          - Who created this?
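The governance questions on this slide (is this personal data, who created it, is it used elsewhere) map onto lookups in a data dictionary with lineage. The sketch below is a rough illustration under stated assumptions: `DictionaryEntry`, its fields, and the dataset names are hypothetical and do not describe Stratio Governance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """Hypothetical data dictionary record for one column."""
    dataset: str
    column: str
    owner: str                  # answers "Who created this?"
    personal_data: bool         # answers "Is this personal data?"
    derived_from: tuple = ()    # lineage: upstream "dataset.column" names


def personal_columns(entries, dataset):
    """Columns of a dataset flagged as personal data (e.g. for GDPR review)."""
    return sorted(
        e.column for e in entries if e.dataset == dataset and e.personal_data
    )


def downstream_of(entries, source):
    """Columns directly derived from `source`; empty if nothing depends on it."""
    return sorted(
        f"{e.dataset}.{e.column}" for e in entries if source in e.derived_from
    )
```

With entries like these, "Can I delete this record?" becomes a question of whether `downstream_of` returns anything for the column involved; the sketch follows direct dependencies only.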
  • 12. Key non-functional requirements on data centric
      4. DevOps productionalization mechanisms: in the cloud and containers era, maintenance and operations are reduced to a minimum thanks to automation. Scaling, upgrading, and deploying are day-to-day tasks, so we need easy mechanisms to perform and manage them.
      Deployment (different deployment models): ● Replace version ● Blue/Green ● Canary Testing
      Management: ● Versioning and history ● Rollback mechanisms ● Models retraining
      Evaluation: ● Functioning Evaluation ● Metrics tracking ● Versions comparison
      Monitoring (applications are monitored on several metrics): ● Application metrics ● Business metrics ● Computational metrics
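Of the deployment models listed on this slide, canary testing is the easiest to show in a few lines. The following is a minimal sketch, assuming requests carry a stable identifier; the function name `choose_version` and the hash-bucket scheme are illustrative choices, not something the slides prescribe.

```python
import hashlib


def choose_version(request_id: str, canary_percent: int) -> str:
    """Route a request to the 'canary' or 'stable' version.

    The same request_id always lands in the same bucket, so a given
    client sees a consistent version while the canary is running.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # value in 0..99
    return "canary" if bucket < canary_percent else "stable"
```

Under this scheme, rolling back amounts to setting `canary_percent` to 0, and promoting the new version amounts to raising it to 100.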
  • 14. Functional case: Client Scoring for a financial institution
  • 15. Functional case: Client Scoring for a financial institution
      1. Data exploration: occurs early in a project; may include viewing sample data, running queries for statistical profiling, exploratory analysis, and visualizing data.
      2. Data preparation: iterative task; may include cleaning, standardizing, transforming, denormalizing, and aggregating data; typically the most time-intensive task of a project.
      3. Data validation: recurring task; may include viewing sample data, running queries for statistical profiling and aggregate analysis, and visualizing data; typically occurs as part of the data exploration, data preparation, development, pre-deployment, and post-deployment phases.
      4. Productionalization: occurs late in a project; may include deploying code to production, backfilling datasets, training models, validating data, and scheduling workflows.
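The preparation, validation, and profiling steps described on this slide can be sketched on a handful of in-memory records. Everything specific here is an assumption for illustration: the field names (`client_id`, `monthly_income`, `country`), the cleaning rules, and the checks are hypothetical, and a real client-scoring pipeline would run on a distributed engine, not on Python lists.

```python
from statistics import mean


def prepare(records):
    """Data preparation: drop incomplete rows and standardize fields."""
    prepared = []
    for r in records:
        income = r.get("monthly_income")
        if income is None:
            continue  # cleaning: skip rows with no income value
        prepared.append({
            "client_id": str(r["client_id"]).strip(),
            "monthly_income": float(income),
            "country": str(r.get("country", "")).strip().upper(),
        })
    return prepared


def validate(prepared):
    """Data validation: return a list of problems (empty means none found)."""
    problems = []
    ids = [r["client_id"] for r in prepared]
    if len(ids) != len(set(ids)):
        problems.append("duplicate client_id")
    if any(r["monthly_income"] < 0 for r in prepared):
        problems.append("negative monthly_income")
    return problems


def profile(prepared):
    """Statistical profiling, usable during exploration and validation."""
    incomes = [r["monthly_income"] for r in prepared]
    return {"rows": len(incomes), "mean_income": mean(incomes) if incomes else None}
```

Because validation recurs across phases, keeping `validate` as a separate function means the same checks can be rerun after preparation, before deployment, and on production data.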
  • 29. New Capabilities…
      ● Facial Recognition: ability to correctly identify a high percentage of known individuals, given an image of a face. Ability to learn new faces.
      ● Emotion classification: ability to correctly classify above 65% of people's emotions, given an image of a face. The emotions identified are: happiness, sadness, surprise, anger.
      ● Object Recognition: ability to segment and classify objects from images.
      ● Natural Interaction Agent: ability to talk to humans in a natural way (typing or through voice using a phone terminal). Ability to trigger basic actions based on the identified intent, e.g., "show a document" or "switch on a light bulb".
      ● Semantic Document Retrieval: ability to find documents based on their content. Queries are expressed through natural interaction using standard text.
      ● Question Answering: ability to answer specific questions from a text or a document, e.g., "when was Peter born?" => "May 20th, 2001".
      ● Awareness: ability to manage any amount of data almost instantaneously in order to reach conclusions, create warnings, or trigger actions. The data managed by this ability could come from the previous abilities and/or any other external feed.