This document summarizes a presentation given by Dr. Kim K. Larsen on big data strategies at Deutsche Telekom. Some key points:
- Deutsche Telekom is developing a "Right-in-Time Big Data Architecture" to support real-time and non-real-time use cases for the foreseeable future.
- This architecture is aligned with IT and other business segments and embraces open source solutions.
- Big data brings challenges of high volume, velocity, and variety of network data, including IoT connections expected to generate up to 300+ billion extra events per day.
- Most real-time demands range from about 5 ms to 500 ms, while non-real-time use cases run on longer timescales of minutes to hours.
Big Data @ NT - A Network Technology Perspective
1. Big Data @ NT
Network Technology Perspective
Big Data Day
Frankfurt am Main, Germany
September 22nd, 2016
Dr. Kim Kyllesbech Larsen,
Group Technology, Deutsche Telekom.
2. Dr. Kim K. Larsen / Big Data @ NT 2
Big Data is Big Team Work.
3. Big Data @ NT.
Strategy & Vision.
We call it the “Right-in-Time Big Data Architecture”.
Serves all needs for foreseeable future (i.e., min. next 5 yrs).
Supports all existing & proposed new network use cases.
Supports Real Time and non-Real Time Technology use cases.
Alignment with IT & Segments.
Fully aligned over-arching Big Data architectural principles.
High degree of synergy with IT and other segments embracing open source solutions.
New components required by Network have been identified & conceptually aligned with IT.
4. Big Network Data.
An illustration …
Timing: Action & Reaction.
High Velocity (events/sec): approx. 30+ events per millisecond.
Large Variety (e.g., 10k+ event categories).
Very High Volume: daily (mobile) IP User Plane Data of 750+ Terabyte, approx. 20 Megabyte per millisecond.
Event Process: approx. 1+ alarm per sec.
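To put the per-millisecond figures on this slide on a daily scale, the conversion can be sketched as below. This is a minimal sketch, assuming decimal prefixes (1 Terabyte = 10**12 bytes) and a traffic rate averaged evenly over 24 hours; the function names are illustrative only.

```python
# Unit conversions for the rates quoted on the slide.
# Assumption: decimal prefixes (1 Terabyte = 10**12 bytes, 1 Megabyte = 10**6 bytes).
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 milliseconds per day


def per_day_from_per_ms(rate_per_ms: float) -> float:
    """Scale a per-millisecond rate to a per-day count."""
    return rate_per_ms * MS_PER_DAY


def avg_megabyte_per_ms(terabyte_per_day: float) -> float:
    """Average Megabyte per millisecond for a given daily volume in Terabyte."""
    bytes_per_day = terabyte_per_day * 10**12
    return bytes_per_day / MS_PER_DAY / 10**6


events_per_day = per_day_from_per_ms(30)   # slide: approx. 30+ events per ms
avg_mb_per_ms = avg_megabyte_per_ms(750)   # slide: 750+ Terabyte per day

print(f"{events_per_day:.3e} events per day")
print(f"{avg_mb_per_ms:.2f} Megabyte per ms (daily average)")
```

Under these assumptions the daily average comes out at roughly 8.7 Megabyte per millisecond. The slide quotes approx. 20 Megabyte per millisecond, which would be consistent with a busy-period rather than a daily-average rate, although the slide does not say which is meant.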
5. Future of Bigger Network Data …
~2.5 IoT connections per household.
~13 IoT connections per household.
~300+ IoT connections per km² urban area.
~1700+ IoT connections per km² urban area.
Frankfurt City has ca. 3,000 pop per km².
Germany 2024: expect 250–500 million IoT connections, up to 300+ billion extra events per day.
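The Germany 2024 figures can be broken down per connection and per millisecond. The sketch below is plain arithmetic on the slide's numbers, assuming the "up to 300+ billion" upper bound and an even spread of events over the day; the variable names are illustrative.

```python
# Breakdown of the Germany 2024 IoT expectation quoted on the slide.
# Assumption: events are spread evenly over a 24-hour day.
MS_PER_DAY = 24 * 60 * 60 * 1000

EXTRA_EVENTS_PER_DAY = 300e9   # slide: up to 300+ billion extra events per day
CONNECTIONS_LOW = 250e6        # slide: 250 million IoT connections
CONNECTIONS_HIGH = 500e6       # slide: 500 million IoT connections

# Events per connection per day at both ends of the connection range.
events_per_conn_max = EXTRA_EVENTS_PER_DAY / CONNECTIONS_LOW
events_per_conn_min = EXTRA_EVENTS_PER_DAY / CONNECTIONS_HIGH

# Average extra event rate per millisecond.
extra_events_per_ms = EXTRA_EVENTS_PER_DAY / MS_PER_DAY

print(f"{events_per_conn_min:.0f} to {events_per_conn_max:.0f} events per connection per day")
print(f"{extra_events_per_ms:.0f} extra events per ms on average")
```

On these assumptions the extra IoT load is on the order of a few thousand events per millisecond, about two orders of magnitude above the approx. 30+ events per millisecond quoted for the network today.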
6. A Network-Centric View.
€: User Experience (in Network), Network Incidents, Network Optimization.
7. The Functional Scope.
Focuses on main strategic directions from NT Perspective.
Data Driven Business:
• Network Enrichment of data-driven business models & decisions.
• Enables 360° user experience management.
• External monetization possibilities (e.g., B2B, location, adverts, credit rating, security, etc.).
Network Operations (Minutes→Milliseconds):
• Anomaly detection.
• Events & Incidents.
• Self-restoration.
• Self-Healing.
• Security.
• “Zero-touch” Operations.
Network Optimization (Month→Week→Milliseconds):
• Utilization management.
• Self-optimization.
• Self-configuring.
• Resource management.
• Congestion management.
• “Zero-touch” Optimization.
User Experience (Reactive→Proactive):
• Reporting, KPIs, ….
• Data enrichment/augmentation.
• Classical (re-active) CEM.
• NG (pro-active) CEM (NRT).
• AI-driven customer care.
• “Zero-touch” UX.
8. Plan ahead for Big Data.
Avoid the usual suspects …
Use case 1: Reqs x, y, z, ..
Use case 2: Reqs a, b, z, ..
Use case 3: Reqs a, k, p, ..
….
Use case N: Reqs x, y, q, ..
Harmonized Architectural Concept: Design RT, Design Near-RT, Design Non-RT.
Single use case designs: Use Case 1 Design, Use Case 2 Design, Use Case 3 Design, … Design.
We have to deal with a (large) number of use cases.
Go for a Harmonized
Architectural Concept!
Avoid Ad-hoc Single Use Case
driven system design.
9. Big Network.
The network context and its relation to Big Data and ML.
Telco Network
Machine Learning Apps
Big Data Analytics
10. What is your Real-Time time-scale?
Merriam-Webster: “Real-Time is the actual time
during which something takes place.”
11. Telco Real-Time Domain.
Time scale: µs, ~5 ms, ~50 ms, ~500 ms, minutes, hours, days.
RT Telco World.
New territory.
Most Real-Time demands range from 5 ms up to 500 ms.
The wide range is covered by different Big Data technologies.
The “Tactile domain” drives new use cases asking for 1 ms and lower.
Domains along the time scale: Tactile domain, Real-Time, Near-Real Time, Non-Real Time.
12. The Meaning of “Right-in-Time”
Use case dependent time-scale.
Reaction times; µs, ms, sec up to min or even hours.
E.g., if the relevant time scale is hours, there is no need to analyze in milliseconds.
Time scale: µs, ~5 ms, ~50 ms, ~500 ms, minutes, hours, days.
Use cases: SQM / CEM, status reporting, “Tactile” apps, network optimization, fault detection, incident mgmt, marketing related data analytics.
Processing modes: streaming, micro batch processing, batch and backend processing.
RT Telco World.
New territory.
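The "Right-in-Time" idea of picking a processing mode from the use case's reaction time can be sketched as a simple lookup. This is a minimal sketch: the slide names the three modes but gives no exact boundaries, so the two thresholds below (500 ms and 60 s) and the function name are assumptions made for illustration.

```python
# Map a use case's required reaction time to a processing mode.
# Assumption: the boundaries below are illustrative; the slide shows only a
# rough axis (streaming, micro batch processing, batch and backend processing).
STREAMING_MAX_S = 0.5      # assumed: up to ~500 ms is handled by streaming
MICRO_BATCH_MAX_S = 60.0   # assumed: up to ~1 minute is handled by micro batches


def processing_mode(reaction_time_s: float) -> str:
    """Return the processing mode suited to the required reaction time."""
    if reaction_time_s <= 0:
        raise ValueError("reaction time must be positive")
    if reaction_time_s <= STREAMING_MAX_S:
        return "streaming"
    if reaction_time_s <= MICRO_BATCH_MAX_S:
        return "micro batch processing"
    return "batch and backend processing"


# Hypothetical reaction-time requirements, in seconds.
for name, seconds in [("fault detection", 0.05),
                      ("network optimization", 5.0),
                      ("status reporting", 3600.0)]:
    print(f"{name}: {processing_mode(seconds)}")
```

The point of the sketch is the one made on the slide: a use case whose relevant time scale is hours gains nothing from a millisecond pipeline, so the time scale should drive the technology choice rather than the other way round.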
13. Right-in-Time Network Architecture.
Converged network vision.
Right-in-Time Big Data
Real-Time Network & Service Manager
Virtualized Network and Service functions
Infrastructure Cloud
NG IP Network (BNG/TeraStream)
Mobile Access, Fixed Access
CPE, SIM
Hybrid
14. Challenges ... The Next Steps.
ML in the Real-Time Domain … from seconds to milliseconds.
[Figure: data flow for ML in the real-time domain]
Data Sources (Data Generation Entity); Data Stream { X(t) }; Transport; Process (e.g., filter, route, enrich, compute); Decision Point (e.g., ML model); Data Stream { X(t), F(X(t)) }; Change Order.
Input at t0, Output at t1; roundtrip time scale ~ms.
Streaming or micro-batch processing: typical timescales from ms and up.
Transport to Store (e.g., HDFS), or store in-memory; Batch Process (t2).
Insights: typical timescales of minutes, daily, monthly, plus ad-hoc.
Machine Learning Apps.
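The streaming part of the data flow on this slide (an event X(t) is filtered and enriched, passed to a decision point F, and the roundtrip from t0 to t1 is measured) can be sketched as follows. This is a minimal single-process sketch under stated assumptions: the event fields, the 80% threshold, and the `threshold_model` stand-in for an ML model are all hypothetical, and no transport or storage layer is modelled.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Event = Dict[str, float]
Result = Tuple[Event, str, float]


def threshold_model(x: Event) -> str:
    """Hypothetical stand-in for F(X(t)); a real decision point would call an ML model."""
    return "rebalance" if x["load_pct"] > 80.0 else "no_action"


def handle(event: Event, model: Callable[[Event], str]) -> Optional[Result]:
    """Process one event and return (X(t), F(X(t)), roundtrip in ms), or None if filtered."""
    t0 = time.perf_counter()                                  # input time
    if "load" not in event:                                   # filter: drop incomplete events
        return None
    enriched = dict(event, load_pct=event["load"] * 100.0)    # enrich / compute
    decision = model(enriched)                                # decision point
    t1 = time.perf_counter()                                  # output time
    return enriched, decision, (t1 - t0) * 1000.0


def run(stream: Iterable[Event],
        model: Callable[[Event], str] = threshold_model) -> List[Result]:
    """Apply handle() to every event and keep the ones that pass the filter."""
    results = []
    for event in stream:
        out = handle(event, model)
        if out is not None:
            results.append(out)
    return results


# Hypothetical events: cell identifier and fractional load.
stream = [{"cell": 1.0, "load": 0.95}, {"cell": 2.0}, {"cell": 3.0, "load": 0.40}]
results = run(stream)
for enriched, decision, roundtrip_ms in results:
    print(int(enriched["cell"]), decision, f"{roundtrip_ms:.4f} ms")
```

Measuring t1 − t0 per event is what makes the "from seconds to milliseconds" challenge concrete: every step added between input and output, including the model itself, has to fit inside the roundtrip budget of the use case.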
15. Danger of Over-Engineering Solutions.
Which one of the solutions below is the best bike solution?
Need or Desire → Best Solution? → Desired outcome.
A: Good bike. Very efficient solution! (e.g., GLM, Kernels, or parsing)
vs
B: Bad “bike”. Very expensive & complex solution! (e.g., DCNN, RNN, …)
16. The Entanglement Challenge.
Many machine learning agents (or apps) with different
objectives will be present in a modern control system.
Machine Learning App
“Machine Learning Systems mix signals
together, entangling them & makes
isolation of improvements largely
impossible & stability at risk.”
Simple illustration:
(RTx) SON AI: optimize cell for best cell performance.
(RTy) CEM AI: optimize cell (& terminal?) for best user experience.
Reference: D. Sculley et al. (2015), “Hidden Technical Debt in Machine Learning Systems”.
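The entanglement risk described on this slide can be illustrated with a toy simulation in which two agents adjust the same network parameter toward different targets. This is a deliberately simplified sketch, not a model of real SON or CEM systems: the shared integer parameter, the targets 4 and 8, and the one-step-per-turn rule are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def step_toward(value: int, target: int) -> int:
    """Move the shared parameter one unit toward the agent's own target."""
    if value < target:
        return value + 1
    if value > target:
        return value - 1
    return value


def simulate(start: int, targets: Sequence[int], rounds: int) -> List[int]:
    """Let each agent (one per target) act in turn on the same parameter."""
    value = start
    history = [value]
    for _ in range(rounds):
        for target in targets:
            value = step_toward(value, target)
            history.append(value)
    return history


def count_reverts(history: Sequence[int]) -> int:
    """Count actions that exactly undo the previous action."""
    return sum(
        1
        for i in range(1, len(history) - 1)
        if history[i] != history[i - 1] and history[i + 1] == history[i - 1]
    )


alone = simulate(start=6, targets=[4], rounds=5)        # one agent, assumed target 4
together = simulate(start=6, targets=[4, 8], rounds=3)  # two agents, assumed targets 4 and 8

print(alone, count_reverts(alone))
print(together, count_reverts(together))
```

Acting alone, the single agent settles on its target. With both agents active, each one undoes the other's last change and the parameter never settles, even though each agent on its own is trivially simple and well behaved.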
17. Simple Agents Interact in Very Complex Ways!
“Bots reverted another bot’s change on
average 105 times, significantly larger
than the average of 3 times for humans”.
Source: Tsvetkova et al., “Even Good Bots Fight”,
https://arxiv.org/ftp/arxiv/papers/1609/1609.04285.pdf
Bot-Bot interactions
on Wikipedia
Human-Human interactions
on Wikipedia
“Bots intended to support often undo each
other’s changes and these “fights” may
sometimes continue for years”.
“Research suggests that even relatively
“dumb” bots may give rise to complex
interactions.”
18. Architecture is about building stuff.
Rapid prototyping & proof-of-concepts.
Fail Fast, Fail Often.
Does it work? Yes / No.
19. Big Data … Core Technology Beliefs.
Non-exhaustive, i.e., just a subset.
1) We (DT) own the data.
2) Harmonization more important than Centralization.
3) Benefits from shared local Big Data lake substantial.
4) RT and Non-RT co-exist, both need to be embraced in a “Right in Time” concept.
5) “Right in Time” implies that a single technology does not solve every Big Data challenge.
20. Next Development Steps.
Developing a Big Data Architecture in the Tactile Domain
Study Real Time (e.g., ms – sec domain) requirements.
Study System Engineering requirements for Tactile Applications.
Develop proof of concepts – Fail fast philosophy!
Developing RT Applied Machine Learning expertise
Feasibility study of Deep Learning Algorithms applied to RT.
Applied Machine Learning in Tactile Domain, e.g., dynamic algorithms.
Alternatives: Genetic algorithms, scale-free networks.
Developing reinforcement learning applications.
Spectrum auctions, network management, customer experience, self-optimized network applications, etc.