2. Who am I?
Development Experience
◆ Bio-Medical Data Processing based on
HPC for Human Brain Mapping
◆ Medical Image Reconstruction
(Computed Tomography)
◆ Enterprise System Architect
◆ Open Source Software Developer
Open Source Development
◆ Linux Kernel (ARM, x86, ppc)
◆ LLVM (x86, ARM, custom)
◆ OpenStack: Orchestration (Heat)
◆ SDN (OpenDaylight, OVS, DPDK)
◆ OPNFV (DPACC, Genesis, Functest, Doctor)
Technical Book
◆ Unix V6 Kernel
Open Frontier Lab.
Manseok (Mario) Cho
hephaex@gmail.com
3. Open Source S/W developer community
http://kernelstudy.net
- Linux Kernel (ARM, x86)
- LLVM Compiler
- SDN/NFV
8. The Technical Challenge
[Figure: growth in technology users across eras, from Mainframe (1960s-1970s, few employees, ~10^2, technologies from OS/360 onward) through Client-Server (1980s, many employees, ~10^4), Web (1990s, customers/consumers, ~10^6), Cloud (~10^7), Social (communities & society, ~10^9), to the Internet of Things (devices & machines, ~10^11), with the source of business value shifting from back-office automation and front-office productivity through e-commerce, line-of-business self-service, and social engagement to real-time optimization. Data integration is becoming the barrier to business success as stand-alone, corporate-IT-driven projects give way to LOB-driven data infrastructure and a data ecosystem spanning people, products & things.]
* http://www.slideshare.net/SanjeevKumar17/tech-mahindra-i5sanjeevdec/
11. Operating System Focus on Storage
[Diagram: applications in user space cross the system-call interface into the operating system (kernel space), whose scheduler, process manager, memory manager, file system, logical block layer, I/O interface, and device drivers mediate all access to the hardware: processor, memory, network, and SSD/HDD.]
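As a concrete illustration of the user-space side of this stack, the snippet below (not from the deck) writes a file across the system-call boundary; everything beneath write(2), namely the file system, logical block layer, and device driver, is the kernel's job.

```python
import os
import tempfile

# An application's view of storage: it only sees system calls.
path = os.path.join(tempfile.gettempdir(), "osdemo.txt")

fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC)  # open(2)
os.write(fd, b"hello")  # write(2): data enters the kernel's page cache
os.fsync(fd)            # fsync(2): push it through the block layer to the device
os.close(fd)            # close(2)
```

Which physical blocks on the SSD/HDD end up holding "hello" is decided entirely by the layers below the system-call interface.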
12. Redundant Arrays of Independent Disks
[Diagram: applications in user space sit above the kernel's resource manager (VFS), file system, and logical block layer; beneath the kernel, a hardware block layer (RAID controller) presents many SSD/HDD devices as a single logical device.]
13. RAID: The First Software-Defined Storage (1988)
* Source: Anil Vasudeva, "A Case for Disk Arrays", conference presentation, Santa Clara, CA, Aug 1988
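The core RAID idea is easy to sketch in code. The toy functions below (hypothetical, for illustration only; not from the deck) model disks as lists of fixed-size chunks and show RAID-0 striping, RAID-1 mirroring, and the XOR parity that RAID-4/5 uses to rebuild a lost chunk.

```python
CHUNK = 4  # stripe unit in bytes

def raid0_write(data: bytes, ndisks: int):
    """RAID-0: split data into chunks and stripe them round-robin."""
    disks = [[] for _ in range(ndisks)]
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    for i, chunk in enumerate(chunks):
        disks[i % ndisks].append(chunk)
    return disks

def raid0_read(disks):
    """Reassemble the original byte stream from the stripes."""
    nchunks = sum(len(d) for d in disks)
    return b"".join(disks[i % len(disks)][i // len(disks)] for i in range(nchunks))

def raid1_write(data: bytes, ndisks: int):
    """RAID-1: every disk holds a full copy (mirroring)."""
    return [[data] for _ in range(ndisks)]

def xor_parity(chunks):
    """RAID-4/5-style parity: byte-wise XOR of equal-length chunks.
    XOR of all-but-one chunk plus the parity reconstructs the missing chunk."""
    out = bytearray(len(chunks[0]))
    for c in chunks:
        for i, b in enumerate(c):
            out[i] ^= b
    return bytes(out)
```

Striping buys bandwidth, mirroring buys durability, and parity buys durability at lower space cost; real arrays combine these per RAID level.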
14. OpenStack
OpenStack is a collection of software for setting up a massive
IaaS (Infrastructure as a Service) environment.
OpenStack consists of six main components.
For storage, OpenStack supports Block Storage (Cinder) and Object Storage (Swift).
* http://www.openstack.org/software/
15. Storage System on OpenStack
[Diagram: guest applications run inside virtual machines managed by Nova (Virtual Computing Machine Manager); beneath the guest operating system (resource manager, file system, logical block layer), storage is served by pools of storage nodes through the Block Storage manager (Cinder), the Object Storage manager (Swift), and the Shared File System manager (Manila).]
16. Comparison of OpenStack Storage

Component          Swift                     Cinder                    Manila
Storage Type       Object                    Block                     File
Primary Interface  REST API                  iSCSI                     NFS, CIFS/SMB
Use Cases          Large datasets            High-performance DBs      VM live migration
                   (movies, images, sounds)  VM guest storage          Storage of VM files
                   Storage of VM files       Snapshots                 Use with legacy
                   Archiving                 VM clones                 applications
Benefit            Scalability, Durability   Manageability             Compatibility

* http://www.openstack.org/openstack-manuals/openstack-ops/content/storage_decision.html
17. Cinder: Block Storage Layer
Cinder provides persistent block storage resources to the virtual
machines running on Nova compute.
Cinder uses plugins to support multiple types of backend storage.
[Diagram: one Cinder service serves VMs across several Nova compute nodes (VM #1-#3, #7-#9, #11-#13); its operations are: create a volume, delete a volume, snapshot, attach a volume, detach a volume.]
18. Cinder: Volume Manage APIs

API no.  Operation group             Function
(1)      Volume operation            Create volume
(2)                                  Create volume from volume
(3)                                  Extend volume
(4)                                  Delete volume
(5)      Connection operation        Attach volume
(6)                                  Detach volume
(7)      Volume snapshot operation   Create snapshot
(8)                                  Create volume from snapshot
(9)                                  Delete snapshot
(10)     Volume image operation      Create volume from image
(11)                                 Create image from volume

[Diagram: a Cinder volume is created (1), cloned from another volume (2), or extended (3); snapshots are created (7) and turned back into volumes (8); Glance images become volumes (10) and volumes become images (11); volumes are attached to Nova VMs (5).]
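The operation table can be made concrete with a toy in-memory model (hypothetical class, for illustration only; this is not Cinder's real implementation or API). It mimics the create/clone/extend/delete and snapshot operations numbered above.

```python
import copy
import itertools

class VolumeManager:
    """Toy stand-in for Cinder's volume operations (illustrative only)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.volumes = {}    # id -> {"size": GB}
        self.snapshots = {}  # id -> frozen copy of a volume

    def create(self, size):                # (1) Create volume
        vid = next(self._ids)
        self.volumes[vid] = {"size": size}
        return vid

    def clone(self, vid):                  # (2) Create volume from volume
        new = next(self._ids)
        self.volumes[new] = copy.deepcopy(self.volumes[vid])
        return new

    def extend(self, vid, new_size):       # (3) Extend volume (grow only)
        assert new_size > self.volumes[vid]["size"]
        self.volumes[vid]["size"] = new_size

    def delete(self, vid):                 # (4) Delete volume
        del self.volumes[vid]

    def snapshot(self, vid):               # (7) Create snapshot (frozen copy)
        sid = next(self._ids)
        self.snapshots[sid] = copy.deepcopy(self.volumes[vid])
        return sid

    def from_snapshot(self, sid):          # (8) Create volume from snapshot
        vid = next(self._ids)
        self.volumes[vid] = copy.deepcopy(self.snapshots[sid])
        return vid
```

In the real service each of these maps to a REST call dispatched to a backend plugin; the point here is only the lifecycle relationships between volumes, snapshots, and clones.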
19. Cinder: Requirements on the Backend

Life cycle of a VM: Create VM -> Launch VM -> Running VM -> Stop VM -> Delete VM

Cinder work per phase, with its technical requirements:
- Create / Attach: when needed, quickly prepare block space; copy and reuse existing blocks
- Extend / Snapshot: flexible addition; automatic block extension
- Detach / Delete: preserve important data; safely delete unnecessary confidential data
20. Cinder: Volume Manage (Scheduler)
[Diagram: volume services 1-5 pass through Filters; the survivors (services 2, 4, and 5, with weights 25, 20, and 41) are ranked by Weighers, and the highest weight (41) wins.]
• Filters: AvailabilityZoneFilter, CapabilitiesFilter, JsonFilter, CapacityFilter, RetryFilter
• Weighers: CapacityWeigher, AllocatedVolumesWeigher, AllocatedSpaceWeigher
* http://www.intel.com/
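The filter-then-weigh pattern is simple enough to sketch. The snippet below (hypothetical names; the real code lives in Cinder's scheduler package) keeps only backends that pass a capacity filter, then picks the one the weigher ranks highest.

```python
def capacity_filter(backends, requested_gb):
    """CapacityFilter idea: keep only backends with enough free space."""
    return [b for b in backends if b["free_gb"] >= requested_gb]

def capacity_weigher(backend):
    """CapacityWeigher idea: more free space means a higher weight."""
    return backend["free_gb"]

def schedule(backends, requested_gb):
    """Filter out unusable backends, then pick the highest-weighted one."""
    candidates = capacity_filter(backends, requested_gb)
    if not candidates:
        raise RuntimeError("No valid host was found")
    return max(candidates, key=capacity_weigher)
```

The real scheduler chains several filters and sums several weighers, but the two-stage shape (eliminate, then rank) is exactly this.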
23. Cinder Plug-In: LVM Case
[Diagram: the Cinder API and scheduler drive a volume service running the LVM plugin; logical volumes LV#1-LV#4 are exported as iSCSI targets, and each Nova compute node's hypervisor (KVM, VMware, ...) attaches them through its iSCSI initiator as /dev/sdx. Create/Delete/Extend/... go through Cinder; Attach/Detach go through the hypervisor.]
24. Cinder Plug-In: FC Case
[Diagram: the Cinder API and scheduler drive a volume service running the FC plugin; a storage controller exposes LUN1-LUN4 over Fibre Channel, and each Nova compute node's hypervisor (KVM, VMware, ...) sees them as /dev/sdx (LUN1) and /dev/sdy (LUN2). Create/Delete/Extend/... go through Cinder; Attach/Detach go through the hypervisor.]
25. Cinder: Compare LVM vs FC

                       LVM                   FC                         Remark
Volume implementation  Managed by LVM        Managed by the storage
                                             controller
Volume operation       LVM (software)        FC (hardware)              LVM is more flexible
Supported storage      Storage-independent   Specific storage           LVM has better support
                                             (requires a plug-in)       coverage
Access path            iSCSI (software)      Fibre Channel (hardware)   FC has better performance
26. Swift: Object Storage
[Diagram: a client talks HTTP (REST API) to a Swift proxy node, which talks HTTP to the storage nodes: account node, container node, object node.]
Reliable, highly scalable, hardware proof:
- Configurable replica model with zones & regions
- Easy-to-use HTTP API: developers don't shard
- High concurrency (supports lots of users)
- Multi-tenant: each account has its own namespace
- Tier & scale any component in the system
- No single point of failure (high availability)
- Assumes unreliable hardware
- Mix & match hardware vendors
* https://www.openstack.org/assets/presentation-media/Swift-Workshop-OSS-Atlanta-2014.pdf
27. Swift: Ring Hash
[Diagram: a consistent-hashing ring maps each object's hash clockwise to the next node on the ring, so adding or removing a node remaps only a fraction of the objects.]
* Source: https://ihong5.wordpress.com/tag/consistent-hashing-algorithm/
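A minimal consistent-hash ring in the spirit of Swift's (illustrative sketch only; the real ring adds partitions, replica counts, and zone awareness) can be written in a few lines: hash each node onto the ring at several virtual points, then walk clockwise from an object's hash to the next node point.

```python
import hashlib
from bisect import bisect

def _hash(key: str) -> int:
    """Map a string to a point on the ring via MD5 (as Swift's ring does)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes=100):
        # Several virtual points per node smooth out the distribution.
        self.points = sorted(
            (_hash("%s-%d" % (node, i)), node)
            for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self.points]

    def get_node(self, key: str) -> str:
        """Walk clockwise from the key's hash to the next node point."""
        idx = bisect(self._hashes, _hash(key)) % len(self.points)
        return self.points[idx][1]
```

The payoff is the property the diagram shows: growing the cluster by one node moves only roughly 1/N of the keys, instead of rehashing everything as naive `hash(key) % N` placement would.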
28. Swift Architecture
[Diagram: applications reach Swift through an HTTP load balancer in front of proxy nodes; the proxy nodes fan out over the network to storage nodes. Adding proxy servers expands "throughput"; adding storage servers expands "volume".]
29. Swift Account Node
[Diagram: under /srv/node, each disk (Disk#1-Disk#3) holds account, container, and object directories; an account's data lives in a hash-addressed partition (Partition #1-#4) stored as Hash.db with a Hash.db.pending file.]
30. Swift Container Node
[Diagram: same layout as the account node: under /srv/node, each disk holds account, container, and object directories; a container's data lives in a hash-addressed partition stored as Hash.db with a Hash.db.pending file.]
32. Swift Replicator
Across nodes #1-#5, replication proceeds as follows:
1. Each node checks the others
2. Defective data is found
3. The data is copied to another node
4. The data is recovered to the original node
5. Temporary data is deleted
33. Swift: Hash Synchronize
[Diagram: Node #1 holds Data #1 and Data #3 under hash ranges A and B; Node #2 holds Data #2 plus a tmp area. Comparing per-range hashes reveals the mismatch; during the sync, copies pass through tmp, and afterwards both nodes hold Data #1, #2, and #3.]
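The trick the diagram shows, comparing one hash per directory instead of comparing every object, can be sketched as follows (illustrative only; Swift's replicator hashes suffix directories and pushes differences with rsync).

```python
import hashlib

def suffix_hash(objects):
    """One hash summarizing all object names and etags in a directory."""
    h = hashlib.md5()
    for name in sorted(objects):
        h.update(name.encode())
        h.update(objects[name].encode())
    return h.hexdigest()

def sync(local: dict, remote: dict) -> dict:
    """Push only the directories whose summary hashes differ."""
    for suffix, objs in local.items():
        if suffix_hash(objs) != suffix_hash(remote.get(suffix, {})):
            # Stand-in for an rsync of that directory: merge local objects in.
            remote.setdefault(suffix, {}).update(objs)
    return remote
```

When the two nodes agree, a handful of hash comparisons settle it; only a genuine mismatch costs a data transfer, which is what makes continuous background replication affordable.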
34. Swift: Object Update
[Diagram: an uploaded object first lands as Data #1 in the node's tmp area and is then moved into its hash-named directory; when an updated version Data #1' is uploaded, it follows the same path and the old copy is deleted.]
37. Swift: Using REST for Object Handling
Basic URL form
- http://swift.server.net/v1/account/container/object
Get a list of all containers in an account
- GET http://swift.server.net/v1/account/
Create a new container
- PUT http://swift.server.net/v1/account/new_container
List all objects in a container
- GET http://swift.server.net/v1/account/container
Create a new object
- PUT http://swift.server.net/v1/account/container/new_object
* Source: https://tw.pycon.org/2013/site_media/media/proposal_files/cinder_2013.pdf
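These calls compose mechanically from the account/container/object path. The tiny helper below (hypothetical, for illustration; the server name is the example from the slide) builds the method/URL pair for each operation; actually sending the request would additionally need an X-Auth-Token header and an HTTP client.

```python
BASE = "http://swift.server.net/v1"  # example endpoint from the slide

def swift_request(account, container=None, obj=None, create=False):
    """Return (HTTP method, URL): GET lists the path, PUT creates it."""
    path = "/".join(p for p in (BASE, account, container, obj) if p)
    return ("PUT" if create else "GET", path)
```

The uniformity is the point of the slide: every level of the hierarchy (account, container, object) is handled with the same verbs on the same URL shape.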
38. Swift vs Ceph
[Diagram: Swift routes clients through a load balancer to proxy nodes and on to storage nodes; Ceph clients talk directly to OSDs, guided by the monitor/metadata servers, placement groups, and the cluster map.]
* http://japan.zdnet.com/article/35072972/
39. Data Science Work Flow
[Diagram: data sources (network elements, content, network logs, social media, external data, transactions) feed a collection layer (real-time streaming, batch, replication, CEP) and a staging layer; a data-integration layer (Hadoop grid, MDM, data quality) loads the EDW, NoSQL stores, and archival Hadoop; a report layer serves business intelligence, data distribution, and data exploration.]
* http://www.slideshare.net/SanjeevKumar17/tech-mahindra-i5sanjeevdec/
40. Data Analysis with OpenStack Storage
[Diagram: storage APIs (storage nodes) and analysis APIs (compute nodes) back a lambda-style pipeline: a new data stream is appended to the "all data" store for the batch layer, which pre-computes batch views, and is simultaneously fed to stream processing in the real-time analysis layer, which maintains realtime views; the serving layer answers queries by merging batch and realtime views.]
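The batch/real-time split can be sketched in a few lines (hypothetical names, lambda-architecture style, not tied to any OpenStack API): a query merges a view pre-computed over all historical data with a view covering only the events that arrived since the last batch run.

```python
from collections import Counter

def batch_view(all_data):
    """Batch layer: pre-compute a view over the full historical dataset."""
    return Counter(all_data)

def realtime_view(stream):
    """Real-time layer: an incrementally maintained view over new events."""
    return Counter(stream)

def query(batch, realtime, key):
    """Serving layer: merge both views at query time."""
    return batch[key] + realtime[key]
```

In the deck's setting, the "all data" store would sit on Swift-backed storage nodes while batch and stream processing run on compute nodes; the merge-at-query-time step is what hides batch latency from users.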
42. The OpenStack® word mark and OpenStack logo are either registered trademarks/service
marks or trademarks/service marks of the OpenStack Foundation in the United States and
other countries and are used with the OpenStack Foundation's permission. We are not affiliated
with, endorsed by, or sponsored by the OpenStack Foundation or the OpenStack community.
• GPFS is a trademark of International Business Machines Corporation in the United States,
other countries, or both.
• GlusterFS, the Gluster ant logo, and the Gluster Community logo are all trademarks of Red
Hat, Inc. All other trademarks, registered trademarks, and product names may be trademarks
of their respective owners.
• Dell is a trademark of Dell Inc.
• EMC and CLARiiON are registered trademarks of EMC Corporation.
• HP is a trademark of Hewlett-Packard Development Company, L.P. in the U.S. and other
countries.
• Other company, product, or service names may be trademarks or service marks of others.