Real-time energy data analytics with Storm
Hadoop Summit 2014, San José, June 3rd
Rémy Saissy - Simon Maby, Octo Technology
Marie-Luce Picard - Bruno Jacquin - Charles Bernard - Benoît Grossin, EDF R&D
2
Outline
1. CONTEXT
2. OBJECTIVES: USING A CEP FOR REAL-TIME ANALYTICS
3. POC ON STORM: DETAILED ARCHITECTURE AND RESULTS
4. CONCLUSIONS
5. REFERENCES
Photos: Brice Richard - Flickr; KC Tan Photography - Flickr
4
EDF GROUP: A GLOBAL LEADER IN ELECTRICITY
• €72.7 billion in sales
• 39.3 million customers
• 159,740 employees worldwide
• 84.7% of generation does not emit CO2
Net production capacity (chart)
5
EDF R&D: missions and key figures
• €520 million budget in 2012
  - 70% of activity supports the performance of the Group's businesses
  - 30% of activity anticipates and prepares for the future
• 500 major projects ongoing
• 7 research centres: 3 in France and 4 abroad (Germany, United Kingdom, Poland, China), plus 1 USA-based team (technology and innovation watch, foresight)
• 2,100 employees, including:
  - 370 PhDs
  - 150 PhD students
  - 200 researchers teaching at universities and advanced engineering schools
• 15 departments (expertise, partnerships and project management)
• 14 joint research laboratories
• Partnering with 4 venture capital funds in the field of clean technologies
Missions:
- Consolidate a carbon-free energy mix
- Anticipate the electricity of tomorrow
- Develop a flexible range of low-carbon energy
6
OCTO ID
IT consulting company
• 209 employees
• 174 consultants, architects, experts or coaches mastering:
  - Technology
  - Methodology
  - Knowledge of your business needs and challenges
• 24.1 million in turnover worldwide (2013)
• 16 years of experience
• Purely organic growth (20% annually)
• Strong corporate culture and values
"We want to reproduce wherever possible what made us successful: a vision of IT, strong values and sharp skills."
NUMBERS (charts: turnover, employees, international locations)
OUR EXPERIENCED TEAM: 27% junior, 33% senior, 40% experienced
7
What we do
We use technology and creativity to turn your ideas into reality
IT CONSULTING AND EXPERTISE
It is the product of an ambitious business vision
turned reality thanks to a pragmatic use of
technology.
DESIGN OF INNOVATIVE APPLICATIONS
We are committed to fostering the fruition of your
ideas and needs, making them concrete so that
you can start benefitting from them in just a few
weeks.
You can trust us with the implementation of your
software products from start to finish. We can also
help you to design better innovative applications.
8
Electricity industry business and data management
The development of Smart Grids will lead to the creation, collection and use of an unprecedented amount of data for utilities. This brings opportunities for:
• Better optimization of the system
• Improving the value for customers, based on a deep exploitation of consumption data
The whole sector is evolving: “smart” data is everywhere.
Utilities are becoming digital: physical systems are paired with digital ones at all levels (transmission, distribution, production and sales), and the system is becoming more complex (demand response, distributed generation …).
Meter readings per customer:
• Today: 2 index readings a year
• Tomorrow, a daily measurement = +20,000%
• Tomorrow, one measurement every half hour = +900,000%
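The percentages on this slide can be checked with a little arithmetic. The sketch below assumes a 365-day year and 48 half-hourly readings per day (both assumptions, as is the helper name `increase_pct`). Under those assumptions, the slide's figures appear to be rounded orders of magnitude rather than exact values.

```python
# Readings per customer per year under each metering regime.
# Assumptions (not stated on the slide): 365-day year, 48 half-hours per day.
READINGS_TODAY = 2
READINGS_DAILY = 365
READINGS_HALF_HOURLY = 48 * 365


def increase_pct(before: int, after: int) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


daily_pct = increase_pct(READINGS_TODAY, READINGS_DAILY)
half_hourly_pct = increase_pct(READINGS_TODAY, READINGS_HALF_HOURLY)

print(f"daily: +{daily_pct:,.0f}%")
print(f"half-hourly: +{half_hourly_pct:,.0f}%")
```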
10
POC on STORM: objectives
Evaluate Storm's capabilities for various real-time analytical processing needs:
• On time series
• Simple or complex analytics (building KPIs, or running adaptive machine learning algorithms)
• Merging data in motion and data at rest
• With real-time business intelligence constraints (not so extreme)
Gain a deeper understanding of how Storm works (concepts) and be able to compare it with other, classical CEP tools
11
POC Storm: functional picture
Input
• Data in motion: smart metering data stream
• Data at rest: customer data, static or dynamic pricing, weather forecasts
Processing: Storm (http://storm-project.net/)
Output
• Simple aggregations, e.g. national curve
• Complex aggregations, e.g. curves aggregated by tariff
• Analytics, e.g. scoring (for each meter)
• Forecasts, e.g. D+1 forecasts expressed in Wh and in € (adaptive models)
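As a rough illustration of the two aggregation outputs, the sketch below sums per-meter readings into a national curve and into one curve per tariff. The record layout (`meter_id`, `slot`, `wh`), the `TARIFF_BY_METER` lookup standing in for customer data at rest, and the tariff names are all hypothetical. The slides do not show the actual topology code, and a real Storm topology would distribute this work across bolts.

```python
from collections import defaultdict

# Hypothetical data at rest: which tariff each meter is on.
TARIFF_BY_METER = {"m1": "base", "m2": "peak_offpeak", "m3": "base"}

# Hypothetical data in motion: (meter_id, half-hour slot, consumption in Wh).
readings = [
    ("m1", "00:30", 120.0),
    ("m2", "00:30", 300.0),
    ("m3", "00:30", 80.0),
    ("m1", "01:00", 100.0),
    ("m2", "01:00", 250.0),
]


def aggregate(stream, tariff_by_meter):
    """Return (national curve, per-tariff curves) as slot -> total Wh."""
    national = defaultdict(float)
    by_tariff = defaultdict(lambda: defaultdict(float))
    for meter_id, slot, wh in stream:
        national[slot] += wh                  # simple aggregation
        tariff = tariff_by_meter[meter_id]    # join with data at rest
        by_tariff[tariff][slot] += wh         # complex aggregation
    return dict(national), {t: dict(c) for t, c in by_tariff.items()}


national_curve, tariff_curves = aggregate(readings, TARIFF_BY_METER)
print(national_curve)
print(tariff_curves)
```

Keeping the tariff lookup as a plain dictionary reflects the "data at rest" role: it changes rarely and can be loaded once, while the readings arrive continuously.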
12
POC Storm: functional picture (same diagram, with two zoom areas)
1. Zoom on data
2. Zoom on analytics
13
POC Storm: Zoom on data
Use of simulated data (load curves)
• The TURBO-COURBOGEN © simulator aims to generate volatile individual load curves at massive scale
• The simulated aggregated curve should be close to the real aggregate
• Non-parametric and efficient:
  - Java code on a 2 GHz CPU (Xeon E5405)
  - 360,000 tuples/s/CPU (18x real time)
  - See [5]
Pipeline: real individual data → machine learning process → Markov generative model → simulation
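To make the idea of a Markov generative model concrete, here is a deliberately small sketch: it discretizes a load curve into levels, counts first-order transitions between levels, and samples a new sequence from those counts. This is not the Turbo-Courbogen algorithm (which is described in [5]). The bin edges, function names and toy curve are assumptions chosen for illustration.

```python
import random
from bisect import bisect_right
from collections import Counter, defaultdict


def discretize(curve, edges):
    """Map each value to the index of the bin it falls in."""
    return [bisect_right(edges, v) for v in curve]


def fit_transitions(states):
    """Count first-order transitions state -> next state."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return counts


def simulate(counts, start, length, rng):
    """Sample a state sequence by walking the learned transition counts."""
    sequence = [start]
    while len(sequence) < length:
        options = counts.get(sequence[-1])
        if not options:  # dead end: state never seen with a successor
            break
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        sequence.append(nxt)
    return sequence


# Toy "real" curve and hypothetical bin edges: low / medium / high.
real_curve = [0.1, 0.2, 0.1, 1.5, 4.0, 9.0, 8.5, 1.0, 0.2, 0.1, 0.3, 5.0, 0.2]
edges = [1.0, 5.0]

states = discretize(real_curve, edges)
model = fit_transitions(states)
simulated = simulate(model, start=states[0], length=48, rng=random.Random(42))
print(simulated)
```

A generator of this kind only reproduces level-to-level dynamics. Matching the real aggregate curve, as the slide requires, would need additional conditioning (for example on time of day), which this sketch leaves out.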
14
POC Storm: Zoom on analytics
• Individual scores based on the SAX transformation (see the FROST library presentation, a lightning talk at Hadoop Summit Europe 2013 [3])
• Forecasts based on GAMs (Generalized Additive Models), using the mgcv R package (S. Wood), applied to electricity demand forecasting [6]
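For readers unfamiliar with SAX (Symbolic Aggregate approXimation), the sketch below shows the usual three steps: z-normalize the series, average it over equal-width segments (PAA), and map each segment mean to a letter using equiprobable breakpoints of the standard normal distribution. It is a generic illustration, not the FROST library's code. The function name `sax`, the segment count and the alphabet size are assumptions.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Return the SAX word for `series` (length must divide evenly)."""
    if len(series) % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        normalized = [0.0] * len(series)  # flat series: all zeros
    else:
        normalized = [(x - mu) / sigma for x in series]

    # Piecewise Aggregate Approximation: mean of each equal-width segment.
    width = len(series) // n_segments
    paa = [mean(normalized[i:i + width]) for i in range(0, len(series), width)]

    # Breakpoints that cut N(0, 1) into len(alphabet) equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # Each segment's letter index is the number of breakpoints it reaches.
    return "".join(alphabet[sum(v >= b for b in breakpoints)] for v in paa)


# Toy values: low at the start, high at the end.
curve = [0.1, 0.1, 0.2, 0.2, 1.0, 1.2, 4.0, 4.5]
word = sax(curve, n_segments=4)
print(word)
```

Once every meter's curve is reduced to a short word, scoring or comparing meters becomes a cheap string operation, which is what makes the representation attractive for per-meter analytics at scale.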
16
General development context
Storm
• Many concepts to understand (learning curve)
• Easy to get started with
• Easy to test and deploy (Storm client)
Setting up a cluster
• HDP 2.1 cluster
• 11 nodes
• Easy to install with Ambari 1.5
Task force
• Storm newbies
• Statistics, development and architecture skills
• 30 days × 2 people
17
Use Case: Next Day Forecasting
18
Turbo-CourboGen©
Emits tuples as fast as it can
209544,4268,282240,0.596,0.579,0.322,0.115,0.098,0.052,0.053,0.019,0.055,0.051,0.008,0.054,0.02,0.059,0.06,0.555,0.614,0.56,0.651,
0.631,1.529,4.103,14.937,11.796,13.857,9.309,8.511,6.58,13.06,16.016,11.236,9.304,15.057,5.188,0.682,0.284,0.925,0.181,0.268,0.264,
0.525,0.221,0.197,0.215,0.174,0.132,0.118,0.132
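The sample tuple above is a comma-separated record with three leading integer fields followed by 48 decimal values, which matches one value per half hour over a day. The sketch below parses such a record. The slides do not say what the three leading fields mean, so the name `header` is a neutral placeholder, and reading the 48 values as half-hourly consumption is an assumption.

```python
SAMPLE = (
    "209544,4268,282240,0.596,0.579,0.322,0.115,0.098,0.052,0.053,0.019,"
    "0.055,0.051,0.008,0.054,0.02,0.059,0.06,0.555,0.614,0.56,0.651,"
    "0.631,1.529,4.103,14.937,11.796,13.857,9.309,8.511,6.58,13.06,16.016,"
    "11.236,9.304,15.057,5.188,0.682,0.284,0.925,0.181,0.268,0.264,"
    "0.525,0.221,0.197,0.215,0.174,0.132,0.118,0.132"
)


def parse_tuple(line, n_header=3, n_values=48):
    """Split one record into integer header fields and float values."""
    fields = line.strip().split(",")
    expected = n_header + n_values
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    header = [int(f) for f in fields[:n_header]]
    values = [float(f) for f in fields[n_header:]]
    return header, values


header, values = parse_tuple(SAMPLE)
peak_slot = values.index(max(values))  # 0-based position of the largest value
print(header, len(values), max(values), peak_slot)
```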
[Chart: values from 0 to 18,000 (unit not shown) plotted against time of day in half-hour steps, from 0:30 to 0:00]
19
R computation within a CEP
• Reuse of existing scripts
• Skills available within the organization
• Parallelization abstracted away by Storm
• Ability to load new models from R&D on the fly
But…
• Difficult to instantiate
• Difficult to debug
• Slow (potential bottleneck)
20
Performance Metrics
Load test run distribution – Tuples processed per minute
10 workers
Batch size of 2000
Low Parallelism Hint (<10)
21
Performance Metrics
Load test run distribution – Tuples processed per minute
10 workers
Batch size of 2000
Medium Parallelism Hint (~100)
22
Performance Metrics
Load test run distribution – Tuples processed per minute
20 workers
Batch size of 5000
High Parallelism Hint (~400)
23
Performance Metrics
Tuples processed over time
20 workers
Batch size of 5000
High Parallelism Hint (~400)
24
If we had to start over
25
Conclusion
• We had fun
• Behavior within the whole Information System
  - Resource sharing with the rest of the stack
  - Storm-on-YARN, capacity scheduler
• Lack of security
  - Wire encryption
  - User role management (Kerberos?)
• Reliability
  - Transactional
  - Failover
• DevOps
26
Conclusion
Finally, Storm is used in operational conditions to supervise the communication network associated with smart meters [7]
• Processes 8 million events every day
• Needs to build KPIs on the fly to manage the system and ensure QoS
• Uses Trident (mini-batch, idempotency)
• Storm is used with other components (HBase, Kafka …)
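The combination of mini-batches and idempotency can be illustrated without Storm itself. In the sketch below, each batch carries a transaction id and the state stores the id of the last batch it applied, so a replayed batch is ignored instead of being counted twice. This mirrors the general idea behind Trident's transactional state but is not Trident's API. The class and method names are hypothetical.

```python
class IdempotentCounter:
    """Event counter that applies each mini-batch at most once."""

    def __init__(self):
        self.count = 0
        self.last_txid = None  # id of the last batch applied to the state

    def apply_batch(self, txid, events):
        """Apply a batch; return False if it was a replay and was skipped."""
        # Assumes batches are committed in strictly increasing txid order,
        # so a txid at or below the last applied one must be a replay.
        if self.last_txid is not None and txid <= self.last_txid:
            return False
        self.count += len(events)
        self.last_txid = txid
        return True


counter = IdempotentCounter()
counter.apply_batch(1, ["e1", "e2", "e3"])
counter.apply_batch(2, ["e4", "e5"])
replayed = counter.apply_batch(2, ["e4", "e5"])  # same batch delivered again
print(counter.count, replayed)
```

Storing the transaction id alongside the value is what lets a KPI survive a failover and replay without drifting, which matters when the KPIs are used to monitor quality of service.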
References
[1] A proof of concept with Hadoop: storage and analytics of electrical time-series.
Marie-Luce Picard, Bruno Jacquin, Hadoop Summit 2012, California, USA, 2012.
Slides: http://www.slideshare.net/Hadoop_Summit/proof-of-concent-with-hadoop
Video: http://www.youtube.com/watch?v=mjzblMBvt3Q&feature=plcp
[2] Massive Smart Meter Data Storage and Processing on top of Hadoop.
Leeley D. P. dos Santos, Alzennyr G. da Silva, Bruno Jacquin, Marie-Luce Picard, David Worms, Charles Bernard. Workshop Big Data 2012, VLDB (Very Large Data Bases) Conference, Istanbul, Turkey, 2012.
http://www.cse.buffalo.edu/faculty/tkosar/bigdata2012/program.php
[3] Smart Metering x Hadoop x Frost: A Smart Elephant Enabling Massive Time Series Analysis.
Benoît Grossin, Marie-Luce Picard, Hadoop Summit Europe 2013, Amsterdam, March 2013.
http://hadoopsummit.org/amsterdam/
[4] Searching time-series with Hadoop in an electric power company.
Alice Bérard, Georges Hébrail, BigMine Workshop, KDD 2013, Chicago, August 2013.
http://bigdata-mining.org/
[5] Realistic and very fast simulation of individual electricity consumption.
Alexis Bondu, IEEE Transactions on Smart Grid, 2014, to be published.
[6] Short-term electricity load forecasting with Generalized Additive Models.
Amandine Pierrot, Yannig Goude, Proceedings of ISAP Power, pp. 593-600, 2011.
[7] Retour d’expérience du client eRDF. Supervision Linky.
Olivier Pellegrino, Richard Tagliazucchi, RedHat Forum, Paris, June 2014.
Special thanks to:
EDF R&D: Alexis Bondu, Yannig Goude
OCTO Technology: Cyrille Mailley

More Related Content

What's hot

State of enterprise data science
State of enterprise data scienceState of enterprise data science
State of enterprise data scienceYan Xu
 
The Environmental Impact of Cloud Computing
The Environmental Impact of Cloud ComputingThe Environmental Impact of Cloud Computing
The Environmental Impact of Cloud ComputingSuyati Technologies
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big dataRaul Chong
 
Business Intelligence and Data Analytics in Renewable Energy Sector
Business Intelligence and Data Analytics in Renewable Energy SectorBusiness Intelligence and Data Analytics in Renewable Energy Sector
Business Intelligence and Data Analytics in Renewable Energy SectorDarshit Paun
 
BLD() Tech Conference — Data exploration with KSQL
BLD() Tech Conference — Data exploration with KSQLBLD() Tech Conference — Data exploration with KSQL
BLD() Tech Conference — Data exploration with KSQLGillis J. de Nijs
 
Customer insights from telecom data using deep learning
Customer insights from telecom data using deep learning Customer insights from telecom data using deep learning
Customer insights from telecom data using deep learning Armando Vieira
 
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systemsAI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systemsGanesan Narayanasamy
 
Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise WeAreEsynergy
 
Real-time data integration to the cloud
Real-time data integration to the cloudReal-time data integration to the cloud
Real-time data integration to the cloudSankar Nagarajan
 
Big Data Techcon 2014
Big Data Techcon 2014Big Data Techcon 2014
Big Data Techcon 2014Samir Lad
 
Apache Spark and future of advanced analytics
Apache Spark and future of advanced analyticsApache Spark and future of advanced analytics
Apache Spark and future of advanced analyticsMuralidhar Somisetty
 
Data as the New Oil: Producing Value in the Oil and Gas Industry
 Data as the New Oil: Producing Value in the Oil and Gas Industry Data as the New Oil: Producing Value in the Oil and Gas Industry
Data as the New Oil: Producing Value in the Oil and Gas IndustryVMware Tanzu
 
High Availability HPC ~ Microservice Architectures for Supercomputing
High Availability HPC ~ Microservice Architectures for SupercomputingHigh Availability HPC ~ Microservice Architectures for Supercomputing
High Availability HPC ~ Microservice Architectures for Supercomputinginside-BigData.com
 
Big data and Blockchain in HealthIT
Big data and Blockchain in HealthITBig data and Blockchain in HealthIT
Big data and Blockchain in HealthITDave Callaghan
 
7 Habits for Big Data in Production - keynote Big Data London Nov 2018
7 Habits for Big Data in Production - keynote Big Data London Nov 20187 Habits for Big Data in Production - keynote Big Data London Nov 2018
7 Habits for Big Data in Production - keynote Big Data London Nov 2018Ellen Friedman
 
IoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected WorldIoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected WorldDataWorks Summit
 
Transforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform StrategyTransforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform StrategyDatabricks
 

What's hot (20)

State of enterprise data science
State of enterprise data scienceState of enterprise data science
State of enterprise data science
 
The Environmental Impact of Cloud Computing
The Environmental Impact of Cloud ComputingThe Environmental Impact of Cloud Computing
The Environmental Impact of Cloud Computing
 
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
 
02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big data
 
Business Intelligence and Data Analytics in Renewable Energy Sector
Business Intelligence and Data Analytics in Renewable Energy SectorBusiness Intelligence and Data Analytics in Renewable Energy Sector
Business Intelligence and Data Analytics in Renewable Energy Sector
 
BLD() Tech Conference — Data exploration with KSQL
BLD() Tech Conference — Data exploration with KSQLBLD() Tech Conference — Data exploration with KSQL
BLD() Tech Conference — Data exploration with KSQL
 
Big Data Analytics at Vestas Wind Systems
Big Data Analytics at Vestas Wind SystemsBig Data Analytics at Vestas Wind Systems
Big Data Analytics at Vestas Wind Systems
 
Customer insights from telecom data using deep learning
Customer insights from telecom data using deep learning Customer insights from telecom data using deep learning
Customer insights from telecom data using deep learning
 
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systemsAI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems
AI in healthcare and Automobile Industry using OpenPOWER/IBM POWER9 systems
 
Perspective on HPC-enabled AI
Perspective on HPC-enabled AIPerspective on HPC-enabled AI
Perspective on HPC-enabled AI
 
Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise
 
Real-time data integration to the cloud
Real-time data integration to the cloudReal-time data integration to the cloud
Real-time data integration to the cloud
 
Big Data Techcon 2014
Big Data Techcon 2014Big Data Techcon 2014
Big Data Techcon 2014
 
Apache Spark and future of advanced analytics
Apache Spark and future of advanced analyticsApache Spark and future of advanced analytics
Apache Spark and future of advanced analytics
 
Data as the New Oil: Producing Value in the Oil and Gas Industry
 Data as the New Oil: Producing Value in the Oil and Gas Industry Data as the New Oil: Producing Value in the Oil and Gas Industry
Data as the New Oil: Producing Value in the Oil and Gas Industry
 
High Availability HPC ~ Microservice Architectures for Supercomputing
High Availability HPC ~ Microservice Architectures for SupercomputingHigh Availability HPC ~ Microservice Architectures for Supercomputing
High Availability HPC ~ Microservice Architectures for Supercomputing
 
Big data and Blockchain in HealthIT
Big data and Blockchain in HealthITBig data and Blockchain in HealthIT
Big data and Blockchain in HealthIT
 
7 Habits for Big Data in Production - keynote Big Data London Nov 2018
7 Habits for Big Data in Production - keynote Big Data London Nov 20187 Habits for Big Data in Production - keynote Big Data London Nov 2018
7 Habits for Big Data in Production - keynote Big Data London Nov 2018
 
IoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected WorldIoT: How Data Science Driven Software is Eating the Connected World
IoT: How Data Science Driven Software is Eating the Connected World
 
Transforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform StrategyTransforming GE Healthcare with Data Platform Strategy
Transforming GE Healthcare with Data Platform Strategy
 

Similar to Real-time Energy Data Analytics with Storm

Andmekeskuse hüperkonvergents
Andmekeskuse hüperkonvergentsAndmekeskuse hüperkonvergents
Andmekeskuse hüperkonvergentsPrimend
 
Big data presentation, explanations and use cases in industrial sector
Big data presentation, explanations and use cases in industrial sectorBig data presentation, explanations and use cases in industrial sector
Big data presentation, explanations and use cases in industrial sectorNicolas Sarramagna
 
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.HP Enterprise Italia
 
UberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for BeginnersUberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for Beginnershpcexperiment
 
Workload Automation for Cloud Migration and Machine Learning Platform
Workload Automation for Cloud Migration and Machine Learning PlatformWorkload Automation for Cloud Migration and Machine Learning Platform
Workload Automation for Cloud Migration and Machine Learning PlatformActiveeon
 
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)Denny Muktar
 
Powering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data SystemsPowering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data SystemsHitachi Vantara
 
How eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their BusinessHow eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their BusinessElasticsearch
 
Microservices: The Future-Proof Framework for IoT
Microservices: The Future-Proof Framework for IoTMicroservices: The Future-Proof Framework for IoT
Microservices: The Future-Proof Framework for IoTCapgemini
 
Activeeon technology for Big Compute and cloud migration
Activeeon technology for Big Compute and cloud migrationActiveeon technology for Big Compute and cloud migration
Activeeon technology for Big Compute and cloud migrationActiveeon
 
Le Bourget 2017 - From earth observation to actionable intelligence
Le Bourget 2017 - From earth observation to actionable intelligenceLe Bourget 2017 - From earth observation to actionable intelligence
Le Bourget 2017 - From earth observation to actionable intelligenceLeonardo
 
Introduction to OVH Analytics Data Platform
Introduction to OVH Analytics Data PlatformIntroduction to OVH Analytics Data Platform
Introduction to OVH Analytics Data PlatformOVHcloud
 
Hey IT, Meet OT with Hima Mukkamala
Hey IT, Meet OT with Hima MukkamalaHey IT, Meet OT with Hima Mukkamala
Hey IT, Meet OT with Hima Mukkamalagogo6
 
IDC Nutanix - Hyperconvergence and the Pulling Forces in the Datacenter
IDC Nutanix - Hyperconvergence and the Pulling Forces in the DatacenterIDC Nutanix - Hyperconvergence and the Pulling Forces in the Datacenter
IDC Nutanix - Hyperconvergence and the Pulling Forces in the DatacenterNEXTtour
 
Applying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System IntegrationsApplying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System Integrationsinside-BigData.com
 

Similar to Real-time Energy Data Analytics with Storm (20)

Andmekeskuse hüperkonvergents
Andmekeskuse hüperkonvergentsAndmekeskuse hüperkonvergents
Andmekeskuse hüperkonvergents
 
Big data presentation, explanations and use cases in industrial sector
Big data presentation, explanations and use cases in industrial sectorBig data presentation, explanations and use cases in industrial sector
Big data presentation, explanations and use cases in industrial sector
 
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
HP Moonshot. Progettato per i Data Center, costruito per il pianeta.
 
CTE Phase III
CTE Phase IIICTE Phase III
CTE Phase III
 
UberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for BeginnersUberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for Beginners
 
Workload Automation for Cloud Migration and Machine Learning Platform
Workload Automation for Cloud Migration and Machine Learning PlatformWorkload Automation for Cloud Migration and Machine Learning Platform
Workload Automation for Cloud Migration and Machine Learning Platform
 
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
 
Powering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data SystemsPowering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data Systems
 
How eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their BusinessHow eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
How eStruxture Data Centers is Using ECE to Rapidly Scale Their Business
 
Microservices: The Future-Proof Framework for IoT
Microservices: The Future-Proof Framework for IoTMicroservices: The Future-Proof Framework for IoT
Microservices: The Future-Proof Framework for IoT
 
IBM Think Milano
IBM Think MilanoIBM Think Milano
IBM Think Milano
 
Activeeon technology for Big Compute and cloud migration
Activeeon technology for Big Compute and cloud migrationActiveeon technology for Big Compute and cloud migration
Activeeon technology for Big Compute and cloud migration
 
Mundi
MundiMundi
Mundi
 
Le Bourget 2017 - From earth observation to actionable intelligence
Le Bourget 2017 - From earth observation to actionable intelligenceLe Bourget 2017 - From earth observation to actionable intelligence
Le Bourget 2017 - From earth observation to actionable intelligence
 
Introduction to OVH Analytics Data Platform
Introduction to OVH Analytics Data PlatformIntroduction to OVH Analytics Data Platform
Introduction to OVH Analytics Data Platform
 
NetApp - Digital Transformation
NetApp - Digital TransformationNetApp - Digital Transformation
NetApp - Digital Transformation
 
Hey IT, Meet OT with Hima Mukkamala
Hey IT, Meet OT with Hima MukkamalaHey IT, Meet OT with Hima Mukkamala
Hey IT, Meet OT with Hima Mukkamala
 
IDC Nutanix - Hyperconvergence and the Pulling Forces in the Datacenter
IDC Nutanix - Hyperconvergence and the Pulling Forces in the DatacenterIDC Nutanix - Hyperconvergence and the Pulling Forces in the Datacenter
IDC Nutanix - Hyperconvergence and the Pulling Forces in the Datacenter
 
Applying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System IntegrationsApplying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System Integrations
 
What Makes Software Green?
What Makes Software Green?What Makes Software Green?
What Makes Software Green?
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Real-time Energy Data Analytics with Storm

  • 1. Real-time energy data analytics with Storm. Hadoop Summit 2014, San José, June 3rd. Rémy Saissy, Simon Maby (Octo Technology); Marie-Luce Picard, Bruno Jacquin, Charles Bernard, Benoît Grossin (EDF R&D).
  • 2. Outline: 1. Context; 2. Objectives: using a CEP for real-time analytics; 3. POC on Storm: detailed architecture and results; 4. Conclusions; 5. References. Photo credits: Brice Richard (Flickr), KC Tan Phoyography (Flickr).
  • 3. Outline (repeated).
  • 4. EDF Group: a global leader in electricity. €72.7 billion in sales; 39.3 million customers; 159,740 employees worldwide; 84.7% of generation does not emit CO2. [Figure: net production capacity]
  • 5. EDF R&D: missions and key figures. Budget: €520 million in 2012; 70% of activity supports the performance of Group businesses and 30% anticipates and prepares for the future; 500 major projects ongoing. 7 international centres: 3 in France and 4 in Germany, the United Kingdom, Poland and China, plus 1 USA-based team (technology/innovation watch and foresight). 2,100 employees, including 370 PhDs, 150 PhD students and 200 researchers teaching at universities and advanced engineering schools. 15 departments (expertise, partnerships and project management); 14 joint research laboratories; partnering with 4 venture capital funds in the field of clean technologies. Missions: consolidate a carbon-free energy mix; anticipate the electricity of tomorrow; develop a flexible range of low-carbon energy.
  • 6. OCTO ID: IT consulting company. 209 employees, including 174 consultants, architects, experts or coaches mastering technology, methodology, and knowledge of your business needs and challenges. 24.1 million in turnover worldwide (2013); 16 years of experience; purely organic growth (20% annually); strong corporate culture and values. Our experienced team: 27% junior, 33% senior, 40% experienced ("confirmés"). "We want to reproduce wherever possible what made us successful: a vision of IT, strong values and sharp skills." [Figure panels: numbers, turnover, employees, international locations]
  • 7. What do we do? We use technology and creativity to turn your ideas into reality. IT consulting and expertise: it is the product of an ambitious business vision turned reality thanks to a pragmatic use of technology. Design of innovative applications: we are committed to fostering the fruition of your ideas and needs, making them concrete so that you can start benefitting from them in just a few weeks. You can trust us with the implementation of your software products from start to finish. We can also help you to design better innovative applications.
  • 8. Electricity industry business and data management. The development of Smart Grids will lead to the creation, collection and use of an unprecedented amount of data for utilities. This brings opportunities for better optimization of the system and for improving the value for customers, based on a deep exploitation of consumption data. The whole sector is evolving: "smart" data is everywhere. Utilities become digital: physical systems come with digital ones (at all levels: transportation, distribution, production and sales), and the system becomes more complex (demand response, distributed generation, ...). Today: 2 index readings a year. Tomorrow: a daily measurement = +20,000%; one measurement every half hour = +900,000%.
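The growth figures on this slide follow from simple counting. A minimal sketch, assuming a 365-day year and 48 half-hour slots per day (neither is stated on the slide), suggests that the exact increases are close to the rounded +20,000% and +900,000% quoted:

```python
# Assumptions (not stated on the slide): 365 days per year, 48 half-hours per day.
READINGS_PER_YEAR_TODAY = 2
DAYS_PER_YEAR = 365
HALF_HOURS_PER_DAY = 48


def growth_percent(new_per_year, old_per_year=READINGS_PER_YEAR_TODAY):
    """Relative increase in readings per meter per year, in percent."""
    return (new_per_year - old_per_year) / old_per_year * 100


daily_growth = growth_percent(DAYS_PER_YEAR)                             # 18150.0
half_hourly_growth = growth_percent(DAYS_PER_YEAR * HALF_HOURS_PER_DAY)  # 875900.0
```

Under these assumptions a daily reading is roughly 180 times today's volume per meter, and a half-hourly reading roughly 8,760 times.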
  • 9. Outline (repeated).
  • 10. POC on Storm: objectives. Evaluate Storm capabilities for various real-time analytical processing needs: on time series; simple or complex analytics (build KPIs, or run adaptive machine learning algorithms); merging data in motion and data at rest; with real-time business intelligence constraints (not so extreme). Gain a deeper understanding of how Storm works (concepts) and be able to compare it with other classical CEP tools.
  • 11. POC Storm: functional picture (http://storm-project.net/). Input (data in motion and data at rest): smart metering data stream; customer data; static or dynamic pricing; weather forecasts. Output: simple aggregations (e.g. national curve); complex aggregations (e.g. curves aggregated by tariff); analytics (e.g. scoring for each meter); forecasts (e.g. D+1 forecasts expressed in Wh and in €, adaptive models).
  • 12. POC Storm: functional picture (http://storm-project.net/). Input (data in motion and data at rest): smart metering data stream; customer data; static or dynamic pricing; weather forecasts. Output: simple aggregations (e.g. national curve); complex aggregations (e.g. curves aggregated by tariff); analytics (e.g. scoring for each meter); forecasts (e.g. D+1 forecasts expressed in Wh and in €, adaptive models). Zoom 1: data. Zoom 2: analytics.
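To make the "simple" and "complex" aggregations of the functional picture concrete, here is a minimal sketch in plain Python, not Storm code. It assumes each reading is a hypothetical (meter_id, half_hour_slot, Wh) triple and that the customer data at rest is a hypothetical lookup from meter to tariff; none of these names come from the slides.

```python
from collections import defaultdict


def aggregate(readings, tariff_by_meter):
    """Sum consumption per time slot, nationally and per tariff.

    readings: iterable of (meter_id, slot, wh) triples (data in motion).
    tariff_by_meter: dict mapping meter_id to a tariff name (data at rest).
    """
    national = defaultdict(float)
    by_tariff = defaultdict(lambda: defaultdict(float))
    for meter_id, slot, wh in readings:
        national[slot] += wh  # simple aggregation: national curve
        tariff = tariff_by_meter.get(meter_id, "unknown")
        by_tariff[tariff][slot] += wh  # complex aggregation: curve per tariff
    return dict(national), {t: dict(curve) for t, curve in by_tariff.items()}


# Hypothetical sample data, for illustration only.
readings = [("m1", 0, 100.0), ("m2", 0, 50.0), ("m1", 1, 120.0), ("m3", 1, 30.0)]
tariff_by_meter = {"m1": "base", "m2": "peak_offpeak"}
national, by_tariff = aggregate(readings, tariff_by_meter)
```

The per-tariff curve is where data in motion meets data at rest: each incoming reading has to be joined with the customer record before it can be routed to the right aggregate.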
  • 13. POC Storm: zoom on data. Use of simulated data (load curves). The simulator TURBO-COURBOGEN © aims to generate individual, volatile load curves at massive scale. The simulated aggregated curve should be close to the real aggregate. Non-parametric and efficient: Java code on a 2 GHz CPU (Xeon E5405); 360,000 tuples/s/CPU (18x real-time); see [5]. [Diagram: real individual data, machine learning process, Markov generative model, simulation]
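The slide only names a "Markov generative model"; the actual TURBO-COURBOGEN algorithm is described in [5] and is not reproduced here. As an illustration of the general idea only, the following sketch fits a first-order Markov chain over discretized load levels and samples a synthetic curve from it. The level names, function names and training data are all hypothetical.

```python
import random
from collections import defaultdict


def fit_transitions(curves):
    """Count level-to-level transitions observed in the training curves."""
    counts = defaultdict(lambda: defaultdict(int))
    for curve in curves:
        for current, following in zip(curve, curve[1:]):
            counts[current][following] += 1
    return {state: dict(nexts) for state, nexts in counts.items()}


def simulate(transitions, start, length, rng):
    """Sample a curve of up to `length` levels, starting from `start`."""
    state, curve = start, [start]
    while len(curve) < length:
        successors = transitions.get(state)
        if not successors:
            break  # no successor was ever observed for this level
        states = list(successors)
        weights = [successors[s] for s in states]
        state = rng.choices(states, weights=weights, k=1)[0]
        curve.append(state)
    return curve


# Hypothetical training data: one curve of discretized load levels.
training = [["low", "mid", "high", "low", "mid", "high"]]
transitions = fit_transitions(training)
synthetic = simulate(transitions, "low", 6, random.Random(0))
```

With this toy training set every level has a single observed successor, so the sampled curve simply repeats the pattern; real curves would give several weighted successors per level.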
  • 14. POC Storm: zoom on analytics. Individual scores based on the SAX transformation (see the FROST library presentation, a lightning talk at Hadoop Summit Europe 2013 [3]). Forecasts based on GAM (Generalized Additive Models), using the mgcv R package (S. Wood), applied to electricity demand forecasting [6].
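SAX (Symbolic Aggregate approXimation) turns a numeric series into a short word: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), then map each average to a letter using breakpoints that split the standard normal distribution into equiprobable regions. The sketch below shows that textbook transformation only; how FROST derives a score from the resulting words is not described on the slide, and the function name and parameters are assumptions.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, segments, alphabet="abcd"):
    """Return the SAX word of `series` using `segments` letters."""
    if len(series) % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        normalized = [0.0] * len(series)  # flat series: nothing to scale
    else:
        normalized = [(x - mu) / sigma for x in series]
    size = len(series) // segments
    # Piecewise aggregate approximation: one mean per segment.
    paa = [mean(normalized[i * size:(i + 1) * size]) for i in range(segments)]
    # Breakpoints cutting N(0, 1) into len(alphabet) equiprobable regions.
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / a) for i in range(1, a)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


word = sax([0, 0, 0, 0, 10, 10, 10, 10], segments=4)  # "aadd"
```

For a daily curve of 48 half-hourly values, a word of a few letters gives a compact signature that is cheap to compare across millions of meters.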
  • 15. Outline (repeated).
  • 16. General development context. Storm: many concepts to understand (learning curve); easy to get started with; easy to test and deploy (Storm client). Setting up a cluster: HDP 2.1 cluster; 11 nodes; easy to install with Ambari 1.5. Task force: Storm newbies; statistics, development and architecture skills; 30 days * 2 persons.
  • 17. Use case: next-day forecasting.
  • 18. Turbo-CourboGen© emits tuples as fast as it can. Sample tuple: 209544,4268,282240,0.596,0.579,0.322,0.115,0.098,0.052,0.053,0.019,0.055,0.051,0.008,0.054,0.02,0.059,0.06,0.555,0.614,0.56,0.651,0.631,1.529,4.103,14.937,11.796,13.857,9.309,8.511,6.58,13.06,16.016,11.236,9.304,15.057,5.188,0.682,0.284,0.925,0.181,0.268,0.264,0.525,0.221,0.197,0.215,0.174,0.132,0.118,0.132 [Figure: chart with a vertical axis from 0 to 18000 and half-hourly time labels from 0:30 to 0:00]
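The sample tuple is a comma-separated line: three integer fields followed by 48 decimal values, which matches the 48 half-hourly labels on the chart. The slide does not say what the three leading fields mean or which unit the values use, so the sketch below keeps them as an opaque header; the function name and the header_fields parameter are assumptions.

```python
SAMPLE = (
    "209544,4268,282240,0.596,0.579,0.322,0.115,0.098,0.052,0.053,0.019,"
    "0.055,0.051,0.008,0.054,0.02,0.059,0.06,0.555,0.614,0.56,0.651,0.631,"
    "1.529,4.103,14.937,11.796,13.857,9.309,8.511,6.58,13.06,16.016,11.236,"
    "9.304,15.057,5.188,0.682,0.284,0.925,0.181,0.268,0.264,0.525,0.221,"
    "0.197,0.215,0.174,0.132,0.118,0.132"
)


def parse_tuple(line, header_fields=3):
    """Split one simulator line into integer header fields and float values."""
    parts = [p.strip() for p in line.split(",")]
    header = [int(p) for p in parts[:header_fields]]
    values = [float(p) for p in parts[header_fields:]]
    return header, values


header, values = parse_tuple(SAMPLE)
peak_slot = values.index(max(values))  # 0-based position of the largest value
```

Reading one full day per tuple, rather than one reading per tuple, would keep the per-meter curve together for downstream steps such as scoring, assuming that is indeed how the fields are laid out.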
  • 19. R computation within a CEP. Benefits: reuse of existing scripts; skills available within the organization; parallelization abstracted by Storm; ability to load new models from R&D on the fly. But: difficult to instantiate; difficult to debug; slow (potential bottleneck).
  • 20. Performance metrics. Load test run distribution: tuples processed per minute. 10 workers; batch size of 2000; low parallelism hint (<10).
  • 21. Performance metrics. Load test run distribution: tuples processed per minute. 10 workers; batch size of 2000; medium parallelism hint (~100).
  • 22. Performance metrics. Load test run distribution: tuples processed per minute. 20 workers; batch size of 5000; high parallelism hint (~400).
  • 23. Performance metrics. Tuples processed over time. 20 workers; batch size of 5000; high parallelism hint (~400).
  • 24. If we had to start over
  • 25. Conclusion. We had fun. Behavior within the whole information system: resource sharing with the rest of the stack; Storm-on-YARN, capacity scheduler. Lack of security: wire encryption; user role management (Kerberos?). Reliability: transactional; failover. DevOps.
  • 26. Conclusion. Finally, Storm is used in operational conditions for supervising the communication network associated with smart meters [7]: it processes 8 million events every day; KPIs must be built on the fly for managing the system and ensuring QoS; Trident is used (mini-batch, idempotency); Storm is used with other components (HBase, Kafka, ...).
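The pairing of mini-batches with idempotency can be illustrated without any Storm code. The sketch below is a conceptual model, not the Trident API: it assumes batches carry increasing identifiers and that the state remembers the last identifier it applied, so a replayed batch does not double-count. The class and method names are hypothetical.

```python
# 8 million events per day is an average of about 92.6 events per second.
AVERAGE_EVENTS_PER_SECOND = 8_000_000 / 86_400


class IdempotentCounter:
    """KPI counter that ignores batches it has already applied."""

    def __init__(self):
        self.count = 0
        self.last_batch_id = None

    def apply_batch(self, batch_id, events):
        """Apply a mini-batch once; return False if it was a replay."""
        if self.last_batch_id is not None and batch_id <= self.last_batch_id:
            return False  # already applied: replay after a failure
        self.count += len(events)
        self.last_batch_id = batch_id
        return True


kpi = IdempotentCounter()
kpi.apply_batch(1, ["e1", "e2", "e3"])
replayed = kpi.apply_batch(1, ["e1", "e2", "e3"])  # replay is ignored
kpi.apply_batch(2, ["e4", "e5"])
```

Storing the batch identifier next to the value is what makes a retry after failure safe: the KPI stays at 5 here even though batch 1 was delivered twice.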
  • 27. References
    [1] A proof of concept with Hadoop: storage and analytics of electrical time-series. Marie-Luce Picard, Bruno Jacquin. Hadoop Summit 2012, California, USA, 2012. Presentation: http://www.slideshare.net/Hadoop_Summit/proof-of-concent-with-hadoop Video: http://www.youtube.com/watch?v=mjzblMBvt3Q&feature=plcp
    [2] Massive Smart Meter Data Storage and Processing on top of Hadoop. Leeley D. P. dos Santos, Alzennyr G. da Silva, Bruno Jacquin, Marie-Luce Picard, David Worms, Charles Bernard. Workshop Big Data 2012, VLDB (Very Large Data Bases) Conference, Istanbul, Turkey, 2012. http://www.cse.buffalo.edu/faculty/tkosar/bigdata2012/program.php
    [3] Smart Metering x Hadoop x Frost: A Smart Elephant Enabling Massive Time Series Analysis. Benoît Grossin, Marie-Luce Picard. Hadoop Summit Europe 2013, Amsterdam, March 2013. http://hadoopsummit.org/amsterdam/
    [4] Searching time-series with Hadoop in an electric power company. Alice Bérard, Georges Hébrail. BigMine Workshop, KDD2013, Chicago, August 2013. http://bigdata-mining.org/
    [5] Realistic and very fast simulation of individual electricity consumption. Alexis Bondu. IEEE Transactions on Smart Grid, 2014, to be published.
    [6] Short-term electricity load forecasting with Generalized Additive Models. Amandine Pierrot, Yannig Goude. Proceedings of ISAP Power, pp. 593-600, 2011.
    [7] Retour d’expérience du client eRDF. Supervision Linky. Olivier Pellegrino, Richard Tagliazucchi. RedHat Forum, Paris, June 2014.
  • 28. Special thanks to: EDF R&D: Alexis Bondu, Yannig Goude; OCTO Technology: Cyrille Mailley.
