With the arrival of the Internet of Things, plenty of data is being generated. Do we need to maintain all that data in the cloud, or are there other ways to do this? This presentation, originally given at the BI symposium in the Netherlands, complements a blog entry you can find at http://community.hpe.com/t5/Cloud-Source/IoT-Big-Data-let-s-think-differently/ba-p/6813967
3. Data Generation increases at a fast pace
– World Population: 7.210 billion
– Active Internet Users: 3.010 billion (penetration: 42%)
– Active Social Media Accounts: 2.078 billion (penetration: 29%)
– Unique Mobile Users: 3.649 billion (penetration: 51%)
– Active Mobile Social Accounts: 1.685 billion (penetration: 23%)
– Things in IoT: 6.400 billion
4. Welcome to the new vocabulary
– 10^30 Geopbyte*: This will be our digital universe tomorrow…
– 10^27 Brontobyte: A 1 BB hard drive would cover the earth 23,000 times
– 10^24 Yottabyte: This is our digital universe today = 250 trillion DVDs
– 10^21 Zettabyte: 1.3 ZB of network traffic by 2016
– 10^18 Exabyte: 1 EB of data is created on the internet each day = 250 million DVDs worth of information. The proposed Square Kilometer Array telescope will generate an EB of data per day
– 10^15 Petabyte: The CERN Large Hadron Collider generates 1 PB per second
– 10^12 Terabyte: 500 TB of new data per day are ingested in Facebook databases
– 10^9 Gigabyte
– 10^6 Megabyte
*The terms Gegobyte and Geobyte are also used in the literature.
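The DVD equivalences quoted for the exabyte and the yottabyte can be checked against each other with a little arithmetic. The sketch below is only an illustration: the `UNITS` table and the helper names are made up for this example, and it assumes the decimal (power-of-ten) definitions used on the slide.

```python
# Decimal byte units as powers of ten, as listed on the slide.
# "Brontobyte" is an informal term, not an official SI prefix.
UNITS = {
    "Megabyte": 6,
    "Gigabyte": 9,
    "Terabyte": 12,
    "Petabyte": 15,
    "Exabyte": 18,
    "Zettabyte": 21,
    "Yottabyte": 24,
    "Brontobyte": 27,
}


def bytes_in(unit: str) -> int:
    """Number of bytes in one of the named units."""
    return 10 ** UNITS[unit]


def implied_dvd_bytes(unit: str, dvd_count: int) -> int:
    """Capacity per DVD implied by 'one <unit> = <dvd_count> DVDs'."""
    return bytes_in(unit) // dvd_count


# "1 EB = 250 million DVDs" and "1 YB = 250 trillion DVDs"
exabyte_dvd = implied_dvd_bytes("Exabyte", 250_000_000)
yottabyte_dvd = implied_dvd_bytes("Yottabyte", 250 * 10**12)

print(exabyte_dvd, yottabyte_dvd)
```

Both statements imply the same capacity of about 4 GB per DVD, so the two equivalences are consistent with each other.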
5. In 2013, 4.4 zettabytes were generated; by 2020, 44 zettabytes will be generated.
By 2020, 37% of the digital universe will contain information that might be valuable if analyzed.
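To get a feel for what a tenfold increase between 2013 and 2020 means per year, the growth can be expressed as a compound annual rate. This is a rough sketch that assumes steady exponential growth between the two quoted figures, which the slide itself does not claim; the variable names are illustrative.

```python
import math

start_zb = 4.4   # zettabytes generated in 2013 (from the slide)
end_zb = 44.0    # zettabytes expected in 2020 (from the slide)
years = 2020 - 2013

# Compound annual growth rate under the steady-growth assumption.
growth_factor = end_zb / start_zb
annual_rate = growth_factor ** (1 / years) - 1

# Time for the yearly volume to double at that rate.
doubling_years = math.log(2) / math.log(1 + annual_rate)

# Volume covered by the 37% figure, assuming it applies to the 2020 total.
potentially_valuable_zb = 0.37 * end_zb

print(f"annual growth: {annual_rate:.1%}")
print(f"doubling time: {doubling_years:.1f} years")
print(f"potentially valuable in 2020: {potentially_valuable_zb:.1f} ZB")
```

Under these assumptions the volume grows by roughly 39% per year and doubles about every two years.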
6. Sources of Data
– Transaction & Application Data
– Internet of Things
– Social Media
– Enterprise Content
Structured Data, Unstructured Data
10% of Data; By 2020, 12% of Data
7. By 2030, storing all the data generated would require a datacenter six times the size of Greater London. This datacenter would consume 25% of the world's energy.
10. Digital Universe
[Chart: data in zettabytes (ZB) by year, 2006 to 2022. Plotted values: 0.3, 0.8, 1.2, 1.8, 4.4 (2013), 7.9, 44 (2020).]
Compute is not keeping up
11. Healthcare - Epidemiology
Don’t bring data to the algorithm, bring the algorithm to data
Confidential Healthcare Data – Geographically Constrained
[Diagram labels: Extract and Anonymize (at each data source), Join, Analyze, Analyse (at each data source), Report, Finalize Analysis]
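One way to read "bring the algorithm to data" is that each site runs the analysis on its own confidential records and only an aggregate leaves the site, after which a central step finalizes the result. The sketch below illustrates that idea for a simple mean. The class, function names, and sample values are hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate that leaves a site; it holds no patient-level rows."""
    count: int
    total: float


def analyse_locally(values):
    """Runs inside the site: reduce confidential records to an aggregate."""
    return SiteSummary(count=len(values), total=float(sum(values)))


def finalize_analysis(summaries):
    """Runs centrally: combine per-site aggregates into one overall mean."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no records at any site")
    return sum(s.total for s in summaries) / n


# Hypothetical per-site measurements, invented for illustration only.
site_data = {
    "site_a": [4.0, 6.0],
    "site_b": [5.0, 7.0, 9.0],
    "site_c": [8.0],
}

summaries = [analyse_locally(values) for values in site_data.values()]
overall_mean = finalize_analysis(summaries)
print(overall_mean)
```

The raw records never cross a site boundary; only counts and totals do. Aggregates over very small groups can still identify individuals, so a real deployment would need further safeguards.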
12. Mesh Networks – Information that is important locally
– Process local information locally, don’t clog the internet.
– Information is only valid for a very short amount of time
– If there is a gap in the network, it means there is no car, so the information does not need to be taken further.
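The mesh behaviour described in the bullets can be sketched as a small simulation: a message hops from car to car within radio range, expires after a short lifetime, and stops at a gap. Everything here is an assumption made for illustration, including the `Message` class, the `propagate` function, the one-dimensional road, and the range, delay and lifetime values.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    text: str
    created_at: float  # seconds
    lifetime: float    # seconds during which the information is valid

    def is_valid(self, now: float) -> bool:
        return now - self.created_at <= self.lifetime


def propagate(positions, origin, radio_range, msg, now=0.0, hop_delay=0.1):
    """Return the indices of cars that receive msg, relayed car to car.

    positions are metres along a road. A car relays only to cars within
    radio_range, and only while the message is still valid on arrival.
    """
    reached = {origin}
    frontier = deque([(origin, now)])
    while frontier:
        car, t = frontier.popleft()  # breadth-first: fewest hops first
        arrival = t + hop_delay
        if not msg.is_valid(arrival):
            continue  # too old to be worth passing on
        for other, pos in enumerate(positions):
            in_range = abs(pos - positions[car]) <= radio_range
            if other not in reached and in_range:
                reached.add(other)
                frontier.append((other, arrival))
    return reached


cars = [0, 80, 150, 400, 470]  # gap of 250 m between the third and fourth car

warning = Message("obstacle ahead", created_at=0.0, lifetime=2.0)
reached = propagate(cars, origin=0, radio_range=100, msg=warning)

brief = Message("obstacle ahead", created_at=0.0, lifetime=0.15)
short_lived = propagate(cars, origin=0, radio_range=100, msg=brief)

print(sorted(reached), sorted(short_lived))
```

In this setup the warning stops at the gap and never leaves the local group of cars, and the short-lived message dies out after a single hop. Nothing is sent to a central service in either case.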
14. Computer Technology is Changing
[Diagram: memory and storage hierarchy]
– Processor: CPU registers, Level 1 cache, Level 2 cache, Level 3 cache
– Computer: Main memory, Local disk, Flash accelerator, SSD
– Network: Network drive array*, Network backup, Archive
* actually an entire computer system with its own hierarchy
[Diagram labels: Physical Server (x2), each with SoC and Local DRAM, linked by Network; SoC (x2) with Local DRAM, sharing a Memory Pool of NVM (x4)]
18. Conclusion
– We will go from centralized to decentralized processing of data and information
  – Allows the processing
  – Creates opportunities for a whole new market of data providers
  – Enables improved analysis and allows new problems to be addressed
  – Improves security and reduces data duplication
– Requires new thinking and a different approach to analytics
– Builds on a new computing technology
– Enables full exploitation of IoT