Real-time Energy Data Analytics with Storm
1. Real-time energy data analytics with Storm
Hadoop Summit 2014, San José, June 3rd
Rémy Saissy - Simon Maby, Octo Technology
Marie-Luce Picard - Bruno Jacquin - Charles Bernard - Benoît Grossin, EDF R&D
2. 2
Outline
1. CONTEXT
2. OBJECTIVES: USING A CEP FOR REAL-TIME ANALYTICS
3. POC ON STORM: DETAILED ARCHITECTURE AND RESULTS
4. CONCLUSIONS
5. REFERENCES
Photo credits: Brice Richard (Flickr), KC Tan Photography (Flickr)
4. 4
EDF GROUP: A GLOBAL LEADER IN ELECTRICITY
€72.7 billion in sales
39.3 million customers
159,740 employees worldwide
84.7% of generation does not emit CO2
Net production capacity
5. 5
EDF R&D: missions and key figures
€520 million budget in 2012
70% of activity to support the performance of Group businesses
30% of activity to anticipate and prepare for the future
500 major projects ongoing
7 international centres, including 3 in France and 4 in Germany, the United Kingdom, Poland and China
Plus 1 USA-based team (technology and innovation watch and foresight)
2,100 employees, including:
370 PhDs
150 PhD students
200 researchers teaching at universities and advanced engineering schools
15 departments (expertise, partnerships and project management)
14 joint research laboratories
Partnering with 4 venture capital funds in the field of clean technologies
- Consolidate a carbon-free energy mix
- Anticipate the electricity of tomorrow
- Develop a flexible range of low-carbon energy
6. 6
IT consulting company
209 employees
174 consultants, architects, experts or coaches mastering:
Technology
Methodology
Knowledge of your business needs and challenges
24.1 million in turnover worldwide (2013)
16 years of experience
Purely organic growth (20% annually)
Strong corporate culture and values
OCTO ID
« We want to reproduce wherever possible what made us successful: a vision of IT, strong values and sharp skills. »
Our experienced team: 27% junior, 40% experienced, 33% senior
[Charts: turnover, employees, international locations]
7. 7
What do we do?
We use technology and creativity to turn your ideas into reality
IT CONSULTING AND EXPERTISE
It is the product of an ambitious business vision
turned reality thanks to a pragmatic use of
technology.
DESIGN OF INNOVATIVE APPLICATIONS
We are committed to fostering the fruition of your
ideas and needs, making them concrete so that
you can start benefitting from them in just a few
weeks.
You can trust us with the implementation of your
software products from start to finish. We can also
help you to design better innovative applications.
8. 8
Electricity industry business and data management
The development of Smart Grids will lead to the creation, collection and use of an unprecedented amount of data for utilities. This brings opportunities for:
- Better optimization of the system
- Improving the value for customers, based on a deep exploitation of consumption data
The whole sector is evolving: “smart” data is everywhere
Utilities are becoming digital: physical systems come with digital ones (at all levels: transportation, distribution, production and sales), and the system becomes more complex (demand response, distributed generation…)
Today: 2 index readings a year.
Tomorrow: one measurement a day = +20,000%
Tomorrow: one measurement every ½ hour = +900,000%
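The growth figures above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a 365-day year and 48 half-hour readings per day; the function name is illustrative. It suggests the slide's figures are rounded orders of magnitude.

```python
DAYS_PER_YEAR = 365        # assumption: non-leap year
HALF_HOURS_PER_DAY = 48


def increase_percent(new_per_year, old_per_year):
    """Relative increase in readings per meter per year, in percent."""
    return (new_per_year / old_per_year - 1) * 100


today = 2                                        # 2 index readings a year
daily = DAYS_PER_YEAR                            # one reading a day
half_hourly = HALF_HOURS_PER_DAY * DAYS_PER_YEAR  # one reading every 1/2 hour

print(increase_percent(daily, today))        # 18150.0, about +20,000%
print(increase_percent(half_hourly, today))  # 875900.0, about +900,000%
```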
10. 10
POC on STORM: objectives
Evaluate Storm capabilities for various real-time analytical processing needs:
- On time series
- Simple or complex analytics (build KPIs, or run adaptive machine learning algorithms)
- Merging data in motion and data at rest
- With real-time business intelligence constraints (not so extreme)
Gain a deeper understanding of how Storm works (concepts) and be able to compare it with other classical CEP tools
11. 11
POC Storm: functional picture
Input (data in motion and data at rest)
Smart metering data stream
Customer data
Static or dynamic pricing
Weather forecasts
Processing: Storm (http://storm-project.net/)
Output
• Simple aggregations, e.g. national curve
• Complex aggregations, e.g. curves aggregated by tariff
• Analytics, e.g. scoring (for each meter)
• Forecasts, e.g. D+1 forecasts expressed in Wh and in € (adaptive models)
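The two aggregation outputs can be illustrated outside Storm. The sketch below is plain Python, not the POC's topology: it joins a stream of meter readings (data in motion) with a tariff lookup (data at rest) to build a national curve and one curve per tariff. The meter ids, tariff names and tuple layout are all hypothetical.

```python
from collections import defaultdict

# Data at rest (hypothetical): which tariff each meter is on.
TARIFF_BY_METER = {"m1": "base", "m2": "peak/off-peak", "m3": "base"}


def aggregate(readings, tariff_by_meter):
    """Sum consumption (Wh) per time slot, nationally and per tariff."""
    national = defaultdict(float)
    by_tariff = defaultdict(lambda: defaultdict(float))
    for meter_id, time_slot, wh in readings:
        national[time_slot] += wh                        # simple aggregation
        tariff = tariff_by_meter.get(meter_id, "unknown")
        by_tariff[tariff][time_slot] += wh               # complex aggregation
    return dict(national), {t: dict(curve) for t, curve in by_tariff.items()}


# Data in motion (hypothetical): (meter id, time slot, consumption in Wh).
readings = [
    ("m1", "00:00", 120.0), ("m2", "00:00", 80.0), ("m3", "00:00", 100.0),
    ("m1", "00:30", 110.0), ("m2", "00:30", 90.0),
]
national, by_tariff = aggregate(readings, TARIFF_BY_METER)
print(national)
print(by_tariff)
```

In a streaming setting the same sums would be kept as running state and updated as each tuple arrives, rather than computed over a list.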
12. 12
POC Storm: functional picture
1. Zoom on data
2. Zoom on analytics
14. 14
POC Storm: Zoom on analytics
Individual scores based on SAX transformation (see the FROST library presentation, a lightning talk at Hadoop Summit Europe 2013 [3])
Forecasts based on GAM models: Generalized Additive Models, using the mgcv R package (S. Wood), applied to electricity demand forecasting [6]
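For readers unfamiliar with SAX (Symbolic Aggregate approXimation), here is a minimal sketch of the textbook transformation: z-normalize the series, average it over equal-width segments (PAA), then map each segment mean to a letter using equiprobable Gaussian breakpoints. It is not the FROST implementation; the function name, the default alphabet and the requirement that the series length be a multiple of the segment count are assumptions made to keep the example short.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments, alphabet="abcd"):
    """Return the SAX word of a numeric series (illustrative sketch)."""
    if len(series) % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")
    # 1. z-normalization (a constant series maps to all zeros).
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # 2. Piecewise Aggregate Approximation: mean of each segment.
    width = len(z) // n_segments
    paa = [mean(z[i * width:(i + 1) * width]) for i in range(n_segments)]
    # 3. Breakpoints cutting N(0, 1) into len(alphabet) equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


# A load curve that is low for the first half and high for the second half.
print(sax_word([1, 1, 1, 1, 10, 10, 10, 10], n_segments=2))  # ad
```

A per-meter score can then be derived from such words, for example by comparing them between meters or between days.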
16. 16
General development context
Storm
Many concepts to understand (learning curve)
Easy to get started with
Easy to test and deploy (Storm client)
Setting up a cluster
HDP 2.1 cluster
11 nodes
Easy to install with Ambari 1.5
Task force
Storm newbies
Statistics, development and architecture skills
30 days × 2 people
19. 19
R computation within a CEP
Reuse of existing scripts
Skills available within the organization
Parallelization abstraction thanks to Storm
Being able to load new models from the R&D on the fly
But…
Difficult to instantiate
Difficult to debug
Slow (potential bottleneck)
20. 20
Performance Metrics
Load test run distribution – Tuples processed per minute
10 workers
Batch size of 2000
Low Parallelism Hint (<10)
21. 21
Performance Metrics
Load test run distribution – Tuples processed per minute
10 workers
Batch size of 2000
Medium Parallelism Hint (~100)
22. 22
Performance Metrics
Load test run distribution – Tuples processed per minute
20 workers
Batch size of 5000
High Parallelism Hint (~400)
25. 25
Conclusion
We had fun
Behavior within the whole Information System
Resource sharing with the rest of the stack
Storm-on-YARN, capacity scheduler
Lack of Security
Wire encryption
User role management (Kerberos?)
Reliability
Transactional
Failover
DevOps
26. 26
Conclusion
Finally, Storm is used in operational conditions to supervise the communication network associated with smart meters [7]
Processes 8 million events every day
Needs to build KPIs on the fly for managing the system and ensuring QoS
Uses Trident (mini-batch, idempotency)
Storm is used with other components (HBase, Kafka…)
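The "mini-batch, idempotency" point can be illustrated with a small sketch in plain Python. It does not use the Trident API. It only shows the underlying idea: store the id of the last applied batch next to the state, so that a batch replayed after a failure is not counted twice. The class name, the method name and the assumption that batches are committed in increasing id order are all illustrative.

```python
class IdempotentCounter:
    """Event counter that tolerates replayed mini-batches (sketch)."""

    def __init__(self):
        self.count = 0
        self.last_batch_id = None  # id of the last batch applied to count

    def apply_batch(self, batch_id, events):
        """Apply a batch once; return False if it was already applied."""
        # Assumption: batches are committed in increasing batch_id order.
        if self.last_batch_id is not None and batch_id <= self.last_batch_id:
            return False  # replay after a failure: state is left unchanged
        self.count += len(events)
        self.last_batch_id = batch_id
        return True


kpi = IdempotentCounter()
kpi.apply_batch(1, ["e1", "e2", "e3"])
kpi.apply_batch(2, ["e4", "e5"])
kpi.apply_batch(2, ["e4", "e5"])  # batch 2 replayed, ignored
print(kpi.count)                  # 5
```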
27. References
[1] A proof of concept with Hadoop: storage and analytics of electrical time-series.
Marie-Luce Picard, Bruno Jacquin, Hadoop Summit 2012, California, USA, 2012.
Slides: http://www.slideshare.net/Hadoop_Summit/proof-of-concent-with-hadoop
Video: http://www.youtube.com/watch?v=mjzblMBvt3Q&feature=plcp
[2] Massive Smart Meter Data Storage and Processing on top of Hadoop.
Leeley D. P. dos Santos, Alzennyr G. da Silva, Bruno Jacquin, Marie-Luce Picard, David Worms, Charles Bernard. Workshop Big Data 2012, VLDB (Very Large Data Bases) Conference, Istanbul, Turkey, 2012.
http://www.cse.buffalo.edu/faculty/tkosar/bigdata2012/program.php
[3] Smart Metering x Hadoop x Frost: A Smart Elephant Enabling Massive Time Series Analysis.
Benoît Grossin, Marie-Luce Picard, Hadoop Summit Europe 2013, Amsterdam, March 2013.
http://hadoopsummit.org/amsterdam/
[4] Searching time-series with Hadoop in an electric power company.
Alice Bérard, Georges Hébrail, BigMine Workshop, KDD2013, Chicago, August 2013.
http://bigdata-mining.org/
[5] Realistic and very fast simulation of individual electricity consumption.
Alexis Bondu, IEEE Transactions on Smart Grid, 2014, to be published.
[6] Short-term electricity load forecasting with Generalized Additive Models.
Amandine Pierrot, Yannig Goude, Proceedings of ISAP Power, pp. 593-600, 2011.
[7] Retour d’expérience du client eRDF. Supervision Linky.
Olivier Pellegrino, Richard Tagliazucchi, RedHat Forum, Paris, June 2014.
28. Special thanks to:
EDF R&D: Alexis Bondu, Yannig Goude
OCTO Technology: Cyrille Mailley