How did HADOOP make the NATIXIS PACK more efficient?
A short story by Front Office, PnL, Risks and Finance
Pierre Alexandre PAUTRAT, Cyril MONTAGNON, Emmanuel VAIE
DataWorks Summit Munich 2017, April 6th, 2017 #DWS17
LET US PLAY THE MATCH AGAIN
1. How did we meet Hadoop?
2. A game phase: the Front Office, PnL, Risks and Finance departments
3. Hakas and openings
4. Take away
5. Q&A
How did we meet Hadoop?
- Experiments done by two separate departments in 2014
  - Credit card analysis POC and site creation for Marketing (internal, anonymized, test only)
    - Hadoop, Elasticsearch, Kibana
  - NoSQL persistence for simulated profits and losses by the Market Risks Department
    - HBASE
- Informal exchanges between the Front Office and IT Risks by the end of 2014
How did we meet Hadoop?
- "Big Data Thursday": an open meeting with a positive mood!
- June 2015: we built a first platform, secured as a production platform should be, to host our DEV!
- Target: go live with a PROD platform in summer 2016,
  - if the pilot projects were OK
  - if sharing a platform was OK for everybody (FO and IT Risks)
- January 2016: the first project results accelerated the decision to move forward, especially for regulatory hot topics
Our Technical Journey
[Timeline diagram, 2S 2014 to March 2017. Labels:]
- Stack: HDFS, HBASE, HIVE, KERBEROS, SQOOP, RANGER, AMBARI, SPARK, SPARK ML, PHOENIX, KAFKA, LLAP, ZEPPELIN, ATLAS, INDEXIMA, ANACONDA
- Languages: PYTHON, R, SCALA, JAVA
- Platform: quick BI, data integration, workflows, schedulers, backups, backup and continuity plan, archiving cluster, automatic SSO, integration with NATIXIS authorization, production platform, production version 2.5.3
- Organisation: training, standards, billing, sponsor, platform governance committee, Hadoop committers, data governance (ATLAS, WHEREHOWS, COLLIBRA…)
A game phase:
The Front Office, PnL, Risks and Finance departments ecosystem
- Regulatory evolutions
  - RIM (Regulatory Initial Margin)
    - the initial amount for your loan!
  - FRTB (Fundamental Review of the Trading Book)
    - the new vision of Market Risk for the ECB
A game phase:
The Front Office, PnL, Risks and Finance departments ecosystem
- More efficient
- Stop processing data sequentially
- Do not waste time transferring data from one NAS to another
- Go beyond the limits of the (usual) monolithic and centralized systems
- Process data where it is, in a common and secured place -> HDFS
- Precise and secured synchronization -> KAFKA
- NoSQL persistence versus standard SQL -> HBASE
- Connecting the BIG DATA universe to the BIG COMPUTE paradigm
- Added value: making Golden Sources available on the cluster
A game phase:
The Front Office, PnL, Risks and Finance departments ecosystem
[Ecosystem diagram. Labels:]
- PnL: PnL certification
- Finance: Regulatory, Provision, Accountancy
- Front Office: Positions, Market data
- Big Compute
- Risks: Risk Scenario, Compute, Sensitivities, certification
HAKA
Exchange: Big Data Thursday
«To boldly go where no man has gone before»
- If you are interested and want to know more: welcome on board!
- Diversity improves knowledge
- Our infrastructure team is on board and curious by nature
- Open your minds, exchange with others, contribute to Hadoop!
- Web Champions inspiration (GAFA)
- With the banking industry
- Try to optimise the architecture during this meeting through guided debate
- An iterative way… A progressive way
HAKA
Minimum Viable Solution
- Try your own solution as early as possible
- Proceed iteratively; work on DEV with real data at real size
- Enjoy the Wave between DEV and Prod
- Find a Minimum Viable Solution for each project
- A reference, a starting kit; publish everything on the Enterprise Social Network
- An integrated BI solution (with a Big Data cluster) is crucial: Indexima…
- Demonstrate use cases to build platform legitimacy
- A Machine Learning enabled platform
- With flagship success in the community
Done: more than 40 POCs & projects, and 10 in production
Openings: Technologies
- Infrastructure and security helpers:
  - Ambari: setup comfort
  - Ranger KMS, Ranger: security
  - Ambari Metrics: monitoring
- ETL and stored like processes, data appenders:
  - HDFS DFS copy, WebHDFS
  - HIVE
    - High latency (if not using LLAP…)
- Low latency, version control and NoSQL container:
  - HBASE and Phoenix
Openings: Age of discovery
- Hive, what else?

Pros                 | Cons
Best learning curve  | Not easy to test
JDBC compatible      | Not iterative
                     | Hard to maintain
                     | Latency
                     | Not really ACID…
                     | API is not friendly (UDF)

Data scientists, operatives, POCs
Openings: Age of discovery
- Hive use cases
  - Explore data
  - I am comfortable with SQL
  - Business is pushing hard to produce results
Openings: Age of reason
- OK, now I want a computation engine for developers… Spark!

Pros              | Cons
Reduced latency   | Slow learning curve (Scala…)
Iterative jobs    | Memory greedy
Easy to test      | Evolving very quickly
Easy to maintain  |
Friendly API      |
Large community   |
Openings: Age of reason
Spark use cases
- Iterative computations (cache data!)
- Streaming data
- I want to test my code
- Machine learning
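
The "cache data!" advice for iterative computations can be illustrated with a minimal sketch in plain Python. This is an analogy, not Spark code: `LazyDataset`, `fit_mean` and `load_prices` are hypothetical names invented for the illustration, and `load_prices` stands in for an expensive read from the cluster. In Spark itself the equivalent step is calling `cache()` on an RDD or DataFrame before the loop.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not the Spark API)."""

    def __init__(self, loader):
        self._loader = loader    # function that rebuilds the rows from the source
        self._use_cache = False  # set to True by cache()
        self._cached = None      # rows kept in memory after the first read
        self.loads = 0           # number of times the source was actually read

    def cache(self):
        """Ask for the rows to be kept in memory after the first read."""
        self._use_cache = True
        return self

    def collect(self):
        """Return all rows, re-reading the source unless a cached copy exists."""
        if self._use_cache and self._cached is not None:
            return self._cached
        self.loads += 1
        rows = list(self._loader())
        if self._use_cache:
            self._cached = rows
        return rows


def fit_mean(dataset, iterations=5, step=0.5):
    """Gradient descent toward the mean: every iteration is one full pass over the data."""
    estimate = 0.0
    for _ in range(iterations):
        rows = dataset.collect()
        gradient = sum(estimate - x for x in rows) / len(rows)
        estimate -= step * gradient
    return estimate


def load_prices():
    # Hypothetical source; imagine an expensive read here.
    return [10.0, 12.0, 14.0]


uncached = LazyDataset(load_prices)
cached = LazyDataset(load_prices).cache()

result_uncached = fit_mean(uncached)
result_cached = fit_mean(cached)

print(uncached.loads, cached.loads)
```

Both runs return the same estimate, but the uncached dataset reads its source once per iteration while the cached one reads it only once. That difference is what makes caching worthwhile when the same data is traversed many times.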
Openings: Age of reason
- Another tool to read and write data very fast: HBase!
- Use cases: logs, time series…

Pros         | Cons
Very fast    | Just a distributed multimap
Low latency  | REST API…
TTL          | Data model less flexible (than Hive)
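
To make the "just a distributed multimap" and "TTL" points concrete, here is a minimal single-process sketch in plain Python. It is a toy model under stated assumptions, not the HBase client API: `ToyTable`, `FakeClock`, the row-key layout `app#sequence` and the 60-second TTL are all hypothetical choices for illustration. It shows the three ideas the slide relies on: rows kept sorted by key, several timestamped versions per cell, and values that stop being visible once they are older than the TTL.

```python
import bisect


class FakeClock:
    """Controllable clock so the example is deterministic (hypothetical helper)."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


class ToyTable:
    """row key -> column -> list of (timestamp, value), with a time-to-live."""

    def __init__(self, ttl_seconds, clock):
        self._rows = {}
        self._sorted_keys = []  # row keys kept in sorted order for range scans
        self._ttl = ttl_seconds
        self._clock = clock

    def put(self, row, column, value):
        """Append a new timestamped version of the cell."""
        if row not in self._rows:
            bisect.insort(self._sorted_keys, row)
            self._rows[row] = {}
        self._rows[row].setdefault(column, []).append((self._clock(), value))

    def get(self, row, column):
        """Return the latest version that has not expired, or None."""
        now = self._clock()
        versions = self._rows.get(row, {}).get(column, [])
        live = [value for ts, value in versions if now - ts < self._ttl]
        return live[-1] if live else None

    def scan(self, start, stop):
        """Return row keys in [start, stop), in sorted order (expiry is not checked here)."""
        lo = bisect.bisect_left(self._sorted_keys, start)
        hi = bisect.bisect_left(self._sorted_keys, stop)
        return self._sorted_keys[lo:hi]


clock = FakeClock()
logs = ToyTable(ttl_seconds=60, clock=clock)

logs.put("app1#0001", "msg", "started")
clock.now = 30.0
logs.put("app1#0002", "msg", "running")
logs.put("app2#0001", "msg", "started")

before_expiry = logs.get("app1#0001", "msg")
clock.now = 61.0
after_expiry = logs.get("app1#0001", "msg")
still_live = logs.get("app1#0002", "msg")
app1_rows = logs.scan("app1#", "app1#~")
```

The log and time-series use cases fit this shape because a key that starts with the source and ends with a sequence or timestamp turns "all entries for one source" into a contiguous range scan, and the TTL lets old entries age out without an explicit delete job.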
Openings: Technologies used NOW
- Inter-application messaging
  - Kafka
- Database import
  - SQOOP
- Data science and prototyping
  - Zeppelin with Livy
- BI and restitution
  - Indexima: 10 B records in 10 msec!
  - SQL Server and PolyBase
- Data governance
  - Atlas
  - Collibra
Take Aways
- Associate
- Dynamic, iterative, positive-mood weekly meetings
- Manage your projects as a community
- Minimum Viable Solutions, and iterate
- Integrated BI solution: an open window on the big data
- DEV cluster as a PROD cluster: Kerberos is key
- Hadoop providers: get them involved in your project!
- In our case: thank you, Hortonworks, for your involvement!
CONTACT
SPEAKERS
M. Pierre Alexandre PAUTRAT
pierre-alexandre.pautrat@natixis.com
https://www.linkedin.com/in/pierrealexandrepautrat/
M. Cyril MONTAGNON
cyril.montagnon@natixis.com
M. Emmanuel VAIE
emmanuel.vaie@natixis.com
https://www.linkedin.com/in/emmanuelvaie
ADDRESS
NATIXIS
30, avenue Pierre Mendès France, 75013 Paris, France
www.natixis.com

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

How Hadoop Makes the Natixis Pack More Efficient

  • 1. How did HADOOP make the NATIXIS PACK more efficient? A short story by Front Office, PnL, Risks and Finance. Pierre Alexandre PAUTRAT, Cyril MONTAGNON, Emmanuel VAIE. Dataworks Summit Munich 2017, April 6th, 2017. #DWS17
  • 2. LET US PLAY THE MATCH AGAIN
      1. How did we meet Hadoop?
      2. A game phase: the Front Office, PnL, the Risks and Finance departments
      3. Hakas and openings
      4. Take away
      5. Q&A
  • 3. How did we meet Hadoop?
      - Experiments done by two separate departments in 2014
          - Credit card analysis POC and site creation for Marketing (internal anonymized test only): Hadoop, Elasticsearch, Kibana
          - NoSQL persistence for simulated profits and losses by the Market Risks department: HBase
      - Informal exchanges between the Front Office and IT Risks by the end of 2014
  • 4. How did we meet Hadoop?
      - "Big Data Thursday": an open meeting with a positive mood!
      - June 2015: we built a first platform, secured as a production platform should be, to host our DEV
      - Target: go live with a PROD platform in summer 2016,
          - if the pilot projects were OK
          - if sharing a platform was OK for everybody (FO and IT Risks)
      - January 2016: the first project results accelerated the decision to move forward, especially for regulatory hot topics
  • 5. Our Technical Journey (timeline diagram; labels as extracted, timeline markers "2S 2014" and "2017 MARCH"): STACK; HDFS; HBASE; HIVE; KERBEROS; QUICK BI; DATA INTEGRATION; SQOOP; RANGER; WORKFLOWS; AMBARI; TRAINING; BACKUP AND CONTINUITY PLAN; PYTHON; STANDARDS; SCHEDULERS; SPARK; BACKUPS; R; DATA GOVERNANCE (ATLAS, WHEREHOWS, COLLIBRA…); BILLING; PLATFORM GOVERNANCE COMMITTEE; PRODUCTION PLATFORM; SPONSOR; HADOOP COMMITTERS; SCALA; JAVA; SPARK ML; PHOENIX; KAFKA; INTEGRATION WITH AUTHORIZATION; PRODUCTION VERSION NATIXIS 2.5.3; LLAP; ZEPPELIN; AUTOMATIC SSO; INDEXIMA; ANACONDA; ARCHIVING CLUSTER
  • 6. A game phase: the Front Office, PnL, the Risks and Finance departments ecosystem
      - Regulatory evolutions
          - RIM (Regulatory Initial Margin): the initial amount for your loan!
          - FRTB (Fundamental Review of the Trading Book): the new vision of the Market Risk for the ECB
  • 7. A game phase: the Front Office, PnL, the Risks and Finance departments ecosystem
      - More efficient
          - Stop processing data in a sequential way
          - Do not waste time transferring data from one NAS to another
          - Go beyond the limits of the (usual) monolithic and centralized systems
          - Process data where it is, in a common and secured place -> HDFS
          - Precise and secured synchronization -> Kafka
          - NoSQL persistence versus standard SQL -> HBase
      - Connecting the BIG DATA universe to the BIG COMPUTE paradigm
      - Added value: making Golden Sources available on the cluster
  • 8. A game phase: the Front Office, PnL, the Risks and Finance departments ecosystem (diagram; labels as extracted): PnL (PnL certification); Finance (Regulatory Provision Accountancy); Front Office (Positions, Market data); Big Compute; Risks (Risk Scenario Compute Sensitivities certification)
  • 9. HAKA. Exchange: Big Data Thursday
      - If you are interested and want to know more: welcome on board!
      - Diversity improves knowledge
      - Our Infrastructure team is onboarded and curious by nature
      - Open your minds, exchange with others, contribute to Hadoop!
      - Web Champions inspiration (GAFA)
      - With the banking industry
      - Try to optimise the architecture during this meeting through guided debate
      - An iterative way… A progressive way
      - "To boldly go where no man has gone before"
  • 10. HAKA. Minimum Viable Solution
      - Try your own solution as early as possible
      - Proceed iteratively; work on the DEV with real data of the real size
      - Enjoy the wave between DEV and PROD
      - Find a Minimum Viable Solution for each project
      - A reference, a starting kit; publish everything on the Enterprise Social Network
      - An integrated BI solution (with a Big Data cluster) is crucial: Indexima…
      - Demonstrate use cases to build platform legitimacy
      - A Machine Learning enabled platform
      - With flagship success in the community
      - Done: more than 40 POCs and projects, and 10 in production
  • 11. Openings: Technologies
      - Infrastructure and security helpers:
          - Ambari: setup comfort
          - Ranger KMS, Ranger: security
          - Ambari Metrics: monitoring
      - ETL and stored like processes, data appenders:
          - HDFS DFS copy, WebHDFS
          - Hive: high latency (if not using LLAP…)
      - Low latency, version control and NoSQL container:
          - HBase and Phoenix
  • 12. Openings: Age of discovery
      - Hive, what else?
      - Pros: best learning curve; JDBC compatible
      - Cons: not easy to test; not iterative; hard to maintain; latency; not really ACID…; API is not friendly (UDF)
      - Data scientists, operatives, POCs
  • 13. Openings: Age of discovery
      - Hive use cases
          - Explore data
          - I am comfortable with SQL
          - Business is pushing hard to produce results
  • 14. Openings: Age of reason
      - OK, now I want a computation engine for developers… Spark!
      - Pros: reduced latency; iterative jobs; easy to test; easy to maintain; friendly API; large community
      - Cons: slow learning curve (Scala…); memory greedy; evolving very quickly
  • 15. Openings: Age of reason
      - Spark use cases
          - Iterative computations (cache data!)
          - Streaming data
          - I want to test my code
          - Machine learning
  • 16. Openings: Age of reason
      - Another tool to read and write data very fast: HBase!
      - Use cases: logs, time series…
      - Pros: very fast; latency; TTL
      - Cons: just a distributed multimap; REST API…; data model less flexible (than Hive)
  • 17. Openings: Technologies used NOW
      - Inter-application messaging: Kafka
      - Database import: Sqoop
      - Data science and prototyping: Zeppelin with Livy
      - BI and reporting: Indexima (10 B records in 10 msec!); SQL Server and Polypath
      - Data governance: Atlas, Collibra
  • 18. Takeaways
      - Associate
      - Dynamic, iterative, positive-mood weekly meetings
      - Manage your projects as a community
      - Minimum Viable Solutions, and iterate
      - Integrated BI solution: an open window on the big data
      - DEV cluster as a PROD cluster: Kerberos is key
      - Hadoop providers: get them involved in your project!
      - In our case: thank you, Hortonworks, for your involvement!
  • 19. CONTACT SPEAKERS
      - Mr. Pierre Alexandre PAUTRAT, pierre-alexandre.pautrat@natixis.com, https://www.linkedin.com/in/pierrealexandrepautrat/
      - Mr. Cyril MONTAGNON, cyril.montagnon@natixis.com
      - Mr. Emmanuel VAIE, emmanuel.vaie@natixis.com, https://www.linkedin.com/in/emmanuelvaie
      - ADDRESS: NATIXIS, 30, avenue Pierre Mendès France, 75013 Paris, France, www.natixis.com
  • 20. How did HADOOP make the NATIXIS PACK more efficient? A short story by Front Office, PnL, Risks and Finance. Pierre Alexandre PAUTRAT, Cyril MONTAGNON, Emmanuel VAIE. Dataworks Summit Munich 2017, April 6th, 2017. #DWS17
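
Slide 16 sums up HBase as "just a distributed multimap" with TTL among its pros, and names logs and time series as use cases. The sketch below is a toy, single-process Python model of that idea and nothing more: it does not use the HBase client API, and the class name `ToyMultimap`, its methods, and the row-key layout are all hypothetical names chosen for illustration. It assumes a cell is addressed by (row key, column), can hold several timestamped versions, and stops being visible once it is older than the TTL.

```python
from collections import defaultdict
from typing import List, Optional


class ToyMultimap:
    """Toy in-memory model of a (row, column) -> versioned values store.

    Illustration only: this is not the HBase API, just the shape of the idea.
    """

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        # row key -> column name -> list of (timestamp, value) versions
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row: str, column: str, value: str, ts: float) -> None:
        """Append a new version of the cell with an explicit timestamp."""
        self._rows[row][column].append((ts, value))

    def get(self, row: str, column: str, now: float) -> Optional[str]:
        """Return the newest value that has not expired at time `now`."""
        cells = self._rows.get(row, {}).get(column, [])
        live = [(ts, v) for ts, v in cells if now - ts < self.ttl_seconds]
        if not live:
            return None
        return max(live, key=lambda cell: cell[0])[1]

    def scan(self, start_row: str, stop_row: str) -> List[str]:
        """Return row keys in sorted order within [start_row, stop_row)."""
        return sorted(k for k in self._rows if start_row <= k < stop_row)


# Hypothetical time-series usage: the row key combines a series id and a time.
table = ToyMultimap(ttl_seconds=60)
table.put("desk-A|10:00", "m:pnl", "1.5", ts=0)
table.put("desk-A|10:01", "m:pnl", "1.7", ts=30)
table.put("desk-B|10:00", "m:pnl", "0.2", ts=30)

rows_for_desk_a = table.scan("desk-A|", "desk-A|~")
latest = table.get("desk-A|10:00", "m:pnl", now=45)
expired = table.get("desk-A|10:00", "m:pnl", now=90)
```

Keeping row keys sorted is what makes the prefix scan over one series cheap in this model, and the TTL check is why old log or time-series cells simply disappear from reads without an explicit delete.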

Editor's Notes

  1. Introduction of each speaker: PAP, Cyril, Emmanuel. We are going to tell you about the experience we had at Natixis in the big Hadoop shift from our legacy IT architecture to a new data-centric one. How did we structure and design the change process? How did we convince the different IT actors to embrace the change? And how did we avoid conflicts? Pierre Alexandre will tell you about the history of the project, the way the infrastructure was built, and security and cluster governance. Cyril will talk about technologies: our successes and our ambitions. I'll provide some final points and try to make the journey as pleasant as possible!
  2. The first point deals with history: pioneering. The second is about consolidation. The third deals with our feedback. The fourth is a synthesis. And the last point is for you.
  3. What events could have made us consider Hadoop as a viable solution? Why? The data volume. The intrinsic immutability of HDFS: the data is never lost. The ability to scale out horizontally.
  4. The additional features:
  5. As a matter of fact, our different departments are collaborating in this way. Present the data lake.
  6. What were the key points of success: weekly meetings
  7. Openings: the tools sorted by needs