Federated clouds have been discussed for several years, yet in practice there is very little federation across large infrastructure providers. One of the biggest causes of this delay is an insufficient understanding of how to share responsibility across data centers, providers, and so on. This study shows that understanding cloud performance at such a large scale is a crucial part of information support in federated clouds. It also covers cloud performance measurement and modeling, along with several practical ongoing projects and works in progress.
Is It Time to Go Global with Cloud Performance Management?
1.
2. .
Mission Statement
1. federated clouds = diversification
2. many DCs and/or cloud providers
3. we care mostly about performance
4. practical solutions are needed
Marat Zhanikeev -- maratishe@gmail.com Is It Time to Go Global with Cloud Performance Management? -- http://tinyurl.com/marat140328 2/30
3. .
Example: BizStore
4. .
BizStore: One DC is Not Enough
• remember June 2013?
• most services today use vertical integration -- no diversity
• Hitachi does not share DCs with NEC
• regional diversity of one provider is bad
◦ how many Amazon DCs in Japan?
(the only possible) Solution: sign contracts with multiple DCs and manage them on the client side
◦ to be officially presented/released in April [01]
01 myself+0 "High Availability Cloud Storage ... Social Graph ... Smart Distribution" NS研 (April 2014)
5. .
BizStore: One DC is Not Enough
[Figure: BizStore deployment. Locations: Kansai, Kyushu, Okinawa. Data centers DC1 and DC2 are reached from the Osaka office, the Naha office, and employees on business trips over varying network distance (storage network). The proposed software exposes Store APIs on top of a High Availability Data Store that holds content and social metadata.]
6. .
BizStore: Store Diversification
• in software: not a priority list -- optimization engine!
• realtime performance monitoring, read/write optimization, etc.
• sub-file data unit -- chunks
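The bullets above (chunks as the data unit, an optimization engine instead of a priority list) can be illustrated with a minimal sketch. It is not the BizStore implementation: the `Store` type, the `throughput_mbps` metric, and the greedy rule are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass
class Store:
    name: str
    throughput_mbps: float  # hypothetical: latest measured throughput to this store


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut a file into fixed-size chunks; the last chunk may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def assign_chunks(num_chunks: int, stores: list[Store]) -> list[str]:
    """Greedy placement: each chunk goes to the store that would finish it soonest.

    Finish time is estimated as (chunks already queued + 1) / throughput,
    so faster stores receive more chunks but slower ones are still used.
    """
    load = {s.name: 0 for s in stores}
    plan = []
    for _ in range(num_chunks):
        best = min(stores, key=lambda s: (load[s.name] + 1) / s.throughput_mbps)
        load[best.name] += 1
        plan.append(best.name)
    return plan
```

A fixed priority list would send every chunk to the fastest store; the greedy rule spreads chunks in rough proportion to measured throughput, which is one simple way to read "optimization engine".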
[Figure: stores ordered by growing network distance from the user: SSD, HDD, then DC1, DC2, … over the network]
7. .
BizStore: Socially Aware Store
• content relevance based on social graph
• relevance is a distribution
• individual redundancy based on distribution
• other link types: same time, location, filetype, ...
• link strength != 1
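One way to read "individual redundancy based on distribution" is sketched below. The scaling rule (replica count proportional to relevance, capped by a physical limit such as the number of contracted DCs) is an assumption for illustration; the function and parameter names are hypothetical and do not come from the slides.

```python
import math


def replica_counts(relevance: dict[str, float],
                   user_redundancy: int,
                   physical_limit: int) -> dict[str, int]:
    """Map each item's relevance score to a number of replicas.

    The most relevant item gets `user_redundancy` copies (the user setting),
    less relevant items get proportionally fewer, every item keeps at least
    one copy, and nothing exceeds `physical_limit`.
    """
    if not relevance:
        return {}
    top = max(relevance.values())
    counts = {}
    # Walk items in descending order of relevance.
    for name, score in sorted(relevance.items(), key=lambda kv: kv[1], reverse=True):
        share = score / top if top > 0 else 0.0
        copies = math.ceil(user_redundancy * share)
        counts[name] = max(1, min(copies, physical_limit))
    return counts
```

With scores 1.0, 0.5 and 0.1, a user setting of 4 and a physical limit of 3, the items receive 3, 2 and 1 replicas respectively.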
[Figure: relevance distribution over content in descending order, marking redundancy (user setting), the physical limit of redundancy, and the end of content. Inset: link events between files (a file is created, viewed, edited, deleted).]
8. .
Example: Cloud Streaming
9. .
Cloud Streaming: Fixing Problems
[Figure: progression from traditional streaming to P2P streaming, cloud streaming and adaptive streaming. Problems listed along the way: congestion (flash crowds), unreliable throughput, unreliable sources. "Fixed" labels mark the problems resolved by later approaches.]
11. .
Practical Solutions for Federated Clouds
12. .
A Shortlist of (S)olutions
1. S1: Nextgen traffic processors at DCs
2. S2: QoS Context and Performance Visualization at DCs
3. S3: Performance Modeling for Federated Clouds
4. S4: Client Side Traffic Boosters
5. ... definitely not a complete list
13. .
Solution (S) 1: Nextgen Traffic Processors at DCs
(work in progress)
14. .
S1: Multicore Packet Capture
[Figure: traffic between global networks and data center internals is mirrored at the gateway switch to a capture manager, which hands work to multiple CPU cores and storage.]
• multicore is the key
• multicore != traditional parallel processing [03]
• on-demand capture, DPI, heterogeneous tasks [04]
03 myself+0 "...Multicore Capture in Data Center Forensics" ACM AISACCS-SFCS (June 2014)
04 myself+0 "A Lock-Free Shared Memory Design for ... Multicore Packet Traffic Capture" IJNM (in print)
15. .
S1: Multicore Hates Memory Locks
• lock-free design [04]: no messages, no memory locks
[Figure: a manager and capture processes on Core 1 … Core X, each reading packets via PF_RING and exchanging data through shared memory organized as a double-linked list (DLL). Timeline of one thread: create, fork, assign, process/wrap, wrap wait, stale check, lifespan.]
04 myself+0 "A Lock-Free Shared Memory Design for ... Multicore Packet Traffic Capture" IJNM (in print)
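To illustrate what "no messages, no memory locks" can mean in practice, here is a minimal single-producer, single-consumer ring buffer. It is a generic textbook technique, not the design from [04]; the class and method names are hypothetical. A real capture system would implement this in C over shared memory with proper memory ordering, and the Python version only shows the ownership rule that makes locks unnecessary.

```python
class SpscRing:
    """Single-producer, single-consumer ring buffer without locks.

    Only the producer ever writes `_head`; only the consumer ever writes
    `_tail`. Because no variable has two writers, no lock is needed.
    """

    def __init__(self, capacity: int):
        # One slot always stays empty so that "full" and "empty" differ.
        self._slots = [None] * (capacity + 1)
        self._head = 0  # next write position, owned by the producer
        self._tail = 0  # next read position, owned by the consumer

    def push(self, item) -> bool:
        """Producer side: store an item, or return False if the ring is full."""
        nxt = (self._head + 1) % len(self._slots)
        if nxt == self._tail:
            return False  # full: the capture side would drop or count the loss
        self._slots[self._head] = item
        self._head = nxt  # publish only after the slot is written
        return True

    def pop(self):
        """Consumer side: return the oldest item, or None if the ring is empty."""
        if self._tail == self._head:
            return None
        item = self._slots[self._tail]
        self._tail = (self._tail + 1) % len(self._slots)
        return item
```

The capture process would call `push` for each packet record and a processing core would call `pop`; neither ever waits on the other.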
16. .
Solution (S) 2: DC Performance APIs
17. .
S2: E2E QoS, M2M Patterns
[Figure: a probe (meter) between users/clients and the data center exports per-flow statistics over UDP to an analysis machine, where a merger, an analyzer (history, state) and a profiler feed a web application.]
• clean slate: capture QoS context [05]
• visualize user communities
• export via APIs to users and/or service providers
05 myself+0 "A holistic community-based architecture for measuring E2E QoS at data centres" IJCSE (in print)
18. .
S2: Has to be a Clean Slate
[Figure: clean-slate measurement pipeline, from probe to analyzer]
• Probe (at a router in the data center infrastructure): packet fields are source IP, source port, dest IP, dest port, protocol, packet size, timestamp
• Packet hash table: CRC24 key, slots 0 … 2^24, each slot pointing to a double-linked list (DLL) of entries
• Export over UDP or via a file; record fields: source port, dest port, source IP, destination IP, start time (s), start time (us), then data units
• Data unit: psize (packet size) and pspace (packet space, us); on the analysis side each unit also carries D (direction, 0 or 1)
• UDP RX with a 5 s buffer
• Merger: find the flow from the opposite direction
• Analyzer: reads and updates history and state; ring buffer of data units per IP on internal networks
• Statistics, per source-dest pair:
  Statistic   Meaning
  MinOWD      Global minimum OWD
  MaxBatch    Max byte count of a packet burst
  Bulks       Throughputs in flows
• has to be a clean slate!
• Cisco, ntop, sFlow are not feasible
• QoS context is something new
• (figure is vector, so, zoom in!)
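The slide describes a packet hash table with 2^24 slots keyed by CRC24, and data units made of packet size and packet space. A minimal sketch of that idea follows. Several points are assumptions: the key is computed over the 5-tuple only, `zlib.crc32` truncated to 24 bits stands in for CRC24 (the Python standard library has no CRC24), and a plain list stands in for the double-linked list. `FlowTable` and `flow_key` are hypothetical names.

```python
import zlib
from collections import defaultdict

TABLE_BITS = 24  # the slide shows slots 0 … 2^24


def flow_key(src_ip, src_port, dst_ip, dst_port, protocol) -> int:
    """24-bit key for a flow; CRC32 truncated to 24 bits stands in for CRC24."""
    raw = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}/{protocol}".encode()
    return zlib.crc32(raw) & ((1 << TABLE_BITS) - 1)


class FlowTable:
    def __init__(self):
        # slot key -> list of flow records (a list replaces the DLL here)
        self.buckets = defaultdict(list)

    def add_packet(self, five_tuple, timestamp_us: int, size: int) -> dict:
        """Append one data unit (psize, pspace) to the packet's flow."""
        key = flow_key(*five_tuple)
        for flow in self.buckets[key]:
            if flow["tuple"] == five_tuple:
                break  # existing flow found in this slot
        else:
            # First packet of the flow: its packet space is zero by convention.
            flow = {"tuple": five_tuple, "last_us": timestamp_us, "units": []}
            self.buckets[key].append(flow)
        pspace = timestamp_us - flow["last_us"]  # gap since previous packet, in us
        flow["units"].append((size, pspace))
        flow["last_us"] = timestamp_us
        return flow
```

Storing only (psize, pspace) pairs per flow keeps the exported record small, which fits the slide's compact UDP export format.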
19. .
S2: But Payoff is Great!
[Figure: scatter plot. X axis: batch size (bytes), 0 to 19200. Y axis: OWD (ms) + TX time (x0.1 ms), 0 to 4000.]
20. .
Solution (S) 3: Cloud Weather System
(work in progress)
21. .
S3: Cloud Weather System
[Figure: weather map with (high/low) pressure fronts, a typhoon, a drought, and areas of good and bad weather]
• continents: user, services [07]
• water: network
• weather, clouds, etc.: changes in performance
• droughts: insufficiency of infrastructure, users do not get enough capacity
• typhoons: basically, Flash Crowds in services, going viral, ...
• forecasting: possible with enough performance monitoring, similar to stock market
07 myself+0 "Cloud Weather System as a Futuristic Performance Model" IEICE総合大会 (March 2013)
22. .
Solution (S) 4: Mobile Throughput Boosters
23. .
S4: Mobile Throughput Booster
• so far, only possible in wireless -- WiFi Direct
                        Single Connection                   Multipath
Singular Connectivity   Traditional applications            Traditional multipath
Multiple Connectivity   No known cases (wasted potential)   Group communication, 3G/LTE/* + WiFi Direct (THIS PROPOSAL)
24. .
S4: Group Resource Pooling
[Figure: a main client and several delegated clients each reach the content provider over their own 3G/LTE/* access (remote connectivity) and exchange data with each other over local connectivity.]
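Group resource pooling, as drawn on this slide, has a main client delegating parts of a download to nearby clients that each use their own 3G/LTE access. A minimal sketch of the delegation step is given below. The proportional split by measured rate is an assumption for illustration, and `split_ranges` is a hypothetical helper, not part of the proposal.

```python
def split_ranges(total_bytes: int, rates: dict[str, float]) -> dict[str, tuple[int, int]]:
    """Divide a download into half-open byte ranges [start, end), one per client.

    Each client's share is proportional to its measured access rate.
    The last client takes whatever remains, so the ranges always cover
    the whole content without gaps or overlap.
    """
    total_rate = sum(rates.values())
    names = list(rates)
    ranges = {}
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = total_bytes
        else:
            end = start + int(total_bytes * rates[name] / total_rate)
        ranges[name] = (start, end)
        start = end
    return ranges
```

Each delegated client would fetch its range from the content provider and pass it to the main client over the local link (WiFi Direct in the proposal).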
25. .
S4: Converged Wireless Campus
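[Figure: converged wireless campus workflow. (1) The university develops and secures the app and code. (2) The app, code and API tokens are distributed to students on campus. (3) Students meet and delegate, passing API tokens at delegation. (4) Step 4 is labeled on the figure without recoverable text.]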
26. .
Solution (S) 5: Over-the-Network Indexing
27. .
S5: Indexing in Clouds
[Figure: two indexing layouts. Traditional: the client reaches the indexer, index and data across the network. Stringex: the client reads and writes the index over the network.]
28. .
S5: Over-the-Network Optimization
• in short: throughput-centric network storage optimization [08]
[Figure: the Stringex client checks a local cache first (1) and then uses the Stringex index over the network (2); a sync engine and an optimization component sit between the two.]
08 myself+0 "A New Practical Design for Browsable Over-the-Network Indexing" ISEEE (April 2014)
29. .
S5: Performance
[Figure: Lucene vs. Stringex. X axis: index size (log), 3.15 to 6.65. Y axis: throughput (log of bytes/doc), 2.55 to 3.25.]
30. .
That’s all, thank you ...
31. .
[01] myself+0 (April 2014)
High Availability Cloud Storage ... Social Graph ... Smart Distribution
NS研
[02] myself+0 (2014)
Multi-Source Stream Aggregation in the Cloud
Wiley Book on ACDN, Chapter 10
[03] myself+0 (June 2014)
...Multicore Capture in Data Center Forensics
ACM AISACCS-SFCS
[04] myself+0 (in print)
A Lock-Free Shared Memory Design for ... Multicore Packet Traffic Capture
IJNM
[05] myself+0 (in print)
A holistic community-based architecture for measuring E2E QoS at data centres
IJCSE
[06] myself+0 (May 2014)
Towards a Practical Method for Interactive Traffic Visualizations in Data Centers
SC研
[07] myself+0 (March 2013)
Cloud Weather System as a Futuristic Performance Model
IEICE総合大会
[08] myself+0 (April 2014)
A New Practical Design for Browsable Over-the-Network Indexing
ISEEE