UnaGrid is an opportunistic virtual grid infrastructure that takes advantage of the idle processing capabilities of conventional desktop machines in computer rooms through the use of Customizable Processing Virtual Clusters (CPVCs). These capabilities are used in the development of e-Science projects. This paper presents the design, implementation and assessment of a virtual storage system that allows UnaGrid to take advantage, simultaneously, of the storage and processing capabilities available in tens or hundreds of desktop machines. The first tests have shown that this system attains large storage capacities at low cost and better performance than a dedicated NFS-NAS solution.
1. An Opportunistic Storage System for UnaGrid
Harold Castro, Mario Villamizar
Department of Systems and Computing Engineering
Universidad de los Andes
Bogotá, Colombia
2. Introduction
Grid Computing: Service Grids and Opportunistic Grids
4. Project: Campus Grid Uniandes - UnaGrid
Take advantage of the idle processing capabilities available in
conventional computer labs.
Support the development of e-Science projects.
5. UnaGrid
[Figure: a Linux Processing Virtual Machine with X cores running on a physical machine of a computer room, (a) when an end user is using the physical machine and (b) when no end user is using the physical machine.]
A Processing Virtual Machine (PVM) is executed on each computer of a
lab, in the background and as a low-priority process.
The PVM runs transparently while users carry out
their daily activities.
6. UnaGrid – Customizable Processing Virtual Cluster (CPVC)
A CPVC is composed of PVMs executed on each computer of a lab (cluster slaves) and a dedicated machine outside the computer lab (cluster master).
[Figure: a computer lab whose VMs act as virtual cluster slaves, and a master running on a dedicated computer outside the computer lab.]
Each research group can define its own CPVCs with custom
application environments (middleware, applications, etc.).
8. UnaGrid – Current Storage System
A dedicated NFS-NAS storage solution has been used in which
all CPVCs store their data
9. Problem and motivation
UnaGrid benefits
Disk space available in computer labs
A strategy to implement a Virtual Distributed Storage System
Take advantage of the idle storage capabilities
A transparent system for users and applications
Provide new storage capabilities to UnaGrid infrastructure
10. Possible Solutions
A new file system or opportunistic system
Use an opportunistic system
Use a distributed or parallel file system
The UnaGrid requirements require another approach.
11. UnaGrid Requirements
The desktop machines of the computer labs have Windows, Linux or Mac as their base operating system.
The CPVCs operate with the Linux operating system.
The virtual distributed storage system must be executed from
Windows, Linux or Mac desktops and used from Linux CPVCs.
Solution: Virtualization Technologies
12. Strategy Proposed
[Figure: a Customizable Storage Virtual Cluster (CSVC): storage servers in the computer lab, each contributing n gigabytes, and a metadata server on a computer outside the computer lab.]
A VM is executed on each computer of a computer lab; this
VM operates as a storage server.
An additional VM is necessary as the metadata server.
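The split between storage servers and a metadata server can be illustrated with a minimal sketch. It assumes round-robin striping of a file across the storage VMs, which is only one possible placement policy and is not stated in the slides; the class `MetadataServer`, the server names and the stripe size are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataServer:
    """Hypothetical metadata server: remembers which storage server holds each stripe."""
    storage_servers: List[str]
    stripe_size: int = 64 * 1024  # bytes per stripe (assumed value)
    layouts: Dict[str, List[str]] = field(default_factory=dict)

    def create(self, filename: str, size: int) -> List[str]:
        # Number of stripes, rounded up; an empty file still gets one stripe.
        n_stripes = max(1, -(-size // self.stripe_size))
        # Assumed policy: assign stripes to storage servers in round-robin order.
        layout = [
            self.storage_servers[i % len(self.storage_servers)]
            for i in range(n_stripes)
        ]
        self.layouts[filename] = layout
        return layout

    def locate(self, filename: str, offset: int) -> str:
        # A client asks the metadata server where a byte offset lives,
        # then talks to that storage server directly.
        return self.layouts[filename][offset // self.stripe_size]


if __name__ == "__main__":
    mds = MetadataServer(["svm01", "svm02", "svm03"], stripe_size=4)
    print(mds.create("results.dat", 10))
    print(mds.locate("results.dat", 9))
```

In this sketch the metadata server only stores the layout; the data itself would stay on the storage VMs of the computer lab.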
13. Two Virtual Machines on each Computer
[Figure: a Linux Storage Virtual Machine (X gigabytes) and a Linux Processing Virtual Machine (X cores) running on a Windows physical machine of a computer lab, (a) when no user is using the physical machine and (b) when a user is using the physical machine.]
Intrusion level on the end user.
Priorities and resources assigned to VMs.
Resource competition between the VMs.
14. Solution Strategy
Definition of a virtual storage cluster per computer lab.
Concurrent execution with the CPVCs.
Take advantage of the idle processing and storage capabilities of each computer.
15. Solution Strategy
Any opportunistic system or distributed file system may be
executed on a Customizable Storage Virtual Cluster (CSVC)
This strategy must be validated.
Current opportunistic solutions do not meet the UnaGrid requirements.
Parallel and distributed file systems can be used to validate the proposed strategy.
16. Methodology
Intrusion level on the end user
Resource competition between virtual
machines
Performance evaluation of the strategy
proposed
17. Intrusion level on the end user
Several tests were conducted to determine the best resource assignment for the two virtual machines, so that they run in a non-intrusive manner, considering:
VMs executed in the background.
Resources assigned to the VMs.
Tasks executed by the end user.
Tasks executed by the two virtual machines.
[Figure: a Linux Storage Virtual Machine (X gigabytes) and a Linux Processing Virtual Machine (X cores) on a Windows physical machine of a computer lab, when a user is using the physical machine (case b).]
18. Intrusion level on the end user
[Figure: the Processing Virtual Machine, with 1 or 2 cores, runs an intensive processing task; the Storage Virtual Machine, with 1 or 2 cores, runs an intensive storage task.]
Four types of tests were executed, in which the end user executes:
One or two intensive processing tasks.
One or two intensive storage tasks.
We configured 8 execution environments.
19. Intrusion level on the end user
Results when the end user executes one intensive
processing task.
EVALUATION OF THE PERFORMANCE DEGRADATION WHEN THE USER EXECUTES 1 INTENSIVE PROCESSING TASK
      Processing Virtual Machine                Storage Virtual Machine                   Average Execution Time (4 Tests) - Seconds
ID    In execution  # Cores  Activity  RAM      In execution  # Cores  Activity  RAM      100000  200000  300000  400000  500000
A1    No            NA       NA        NA       No            NA       NA        NA       22.07   44.09   66.14   88.20   110.24
A2    Yes           1        1 Task    1 GB     No            NA       NA        NA       22.15   44.27   66.41   88.53   110.66
A3    Yes           2        2 Tasks   1 GB     No            NA       NA        NA       22.19   44.32   66.48   88.65   110.82
A4    No            NA       NA        NA       Yes           1        1 Task    1 GB     22.14   44.27   66.39   88.54   110.66
A5    No            NA       NA        NA       Yes           2        2 Tasks   1 GB     22.21   44.38   66.56   88.73   110.94
A6    Yes           1        1 Task    1 GB     Yes           1        1 Task    1 GB     22.15   44.29   66.42   88.67   110.70
A7    Yes           2        2 Tasks   1 GB     Yes           1        1 Task    1 GB     22.19   44.35   66.52   88.71   110.89
A8    Yes           2        2 Tasks   1 GB     Yes           2        2 Tasks   1 GB     22.18   44.37   66.55   88.71   110.89
Maximum performance degradation (%):                                                      0.65    0.66    0.64    0.61    0.63
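The maximum degradation row can be recomputed from the table as the worst-case relative increase over the baseline environment A1. The sketch below assumes that definition, which the slides do not spell out. Because the table values are rounded to two decimals, the recomputed percentages may differ from the reported ones in the second decimal place.

```python
# Average execution times in seconds, copied from the table above.
# Each list follows the five workload columns (100000 ... 500000).
baseline = [22.07, 44.09, 66.14, 88.20, 110.24]  # A1: no VM running

with_vms = {
    "A2": [22.15, 44.27, 66.41, 88.53, 110.66],
    "A3": [22.19, 44.32, 66.48, 88.65, 110.82],
    "A4": [22.14, 44.27, 66.39, 88.54, 110.66],
    "A5": [22.21, 44.38, 66.56, 88.73, 110.94],
    "A6": [22.15, 44.29, 66.42, 88.67, 110.70],
    "A7": [22.19, 44.35, 66.52, 88.71, 110.89],
    "A8": [22.18, 44.37, 66.55, 88.71, 110.89],
}


def degradation_percent(base, measured):
    """Relative slowdown of `measured` with respect to `base`, in percent."""
    return (measured - base) / base * 100.0


def max_degradation(base_times, environments):
    """Worst-case degradation per workload column across all environments."""
    worst = []
    for col, base in enumerate(base_times):
        slowest = max(times[col] for times in environments.values())
        worst.append(degradation_percent(base, slowest))
    return worst


worst_case = max_degradation(baseline, with_vms)
print([round(value, 2) for value in worst_case])
```

Every recomputed value stays below 1%, in line with the maximum reported for this test.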
20. Intrusion level on the end user
The execution of the two virtual machines in the
background decreases the QoS perceived by the end user by
less than 4%.
One intensive processing task: 0.66%
Two intensive processing tasks: 1.24%
One intensive storage task: 2.45%
Two intensive storage tasks: 3.35%
We executed 640 tests using an application called
UnaGridLoadSimulator.
21. Resource competition between virtual machines
[Figure: % CPU usage (0 to 100) over samples 1 to 19 for the End User Process, the Processing Virtual Machine and the Storage Virtual Machine.]
The VMs only use the processing capabilities not used by
the end user; these capabilities are divided equitably between
the VMs.
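The arithmetic behind this observation is simple: whatever CPU the end user leaves idle is shared in equal parts by the VMs. The sketch below only models that outcome. In practice the division is made by the host operating system scheduler, and the function name `split_idle_cpu` and the VM labels are hypothetical.

```python
def split_idle_cpu(end_user_pct, vm_names=("processing_vm", "storage_vm")):
    """Divide the CPU percentage not used by the end user equally among the VMs."""
    if not 0.0 <= end_user_pct <= 100.0:
        raise ValueError("end_user_pct must be between 0 and 100")
    idle = 100.0 - end_user_pct
    share = idle / len(vm_names)
    return {name: share for name in vm_names}


if __name__ == "__main__":
    # Example: the end user consumes 60% of the CPU.
    print(split_idle_cpu(60.0))
```

When the end user needs the whole CPU, both shares drop to zero, which is the non-intrusive behaviour the figure illustrates.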
22. Performance evaluation of the strategy proposed
We evaluated 4
distributed file systems
and the NFS-NAS solution.
A computer lab with 31
computers.
62 VMs were configured
for each file system.
Condor scheduler.
23. Performance evaluation of the strategy proposed
File system  Version   Operating System  Aggregated Capacity
PVFS         2.8.1     Debian 4.0        344 GB
Gfarm        2.3.0     Debian 4.0        346 GB
Lustre       1.8.1     RHEL 5            300 GB
GPFS         3.2.1-13  RHEL 5            300 GB
24. Performance Evaluation – From one client
[Figure: two plots, write bandwidth (MB/sec) and read bandwidth (MB/sec) versus I/O request size (KB), for NFS, PVFS, Gfarm, Lustre and GPFS.]
The performance of the file systems was tested from one client
(one PVM of the CPVC), varying the size of the I/O requests for a
1 GB file. We used the IOzone tool.
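To make the measured quantity concrete, the sketch below times sequential writes of a fixed-size file using different request sizes and reports the resulting bandwidth. It is not IOzone and does not reproduce its methodology; it is a simplified stand-in, and the function names, the small default file size and the chosen request sizes are assumptions made for illustration.

```python
import os
import tempfile
import time


def write_bandwidth_mb_s(path, file_size, request_size):
    """Write `file_size` bytes in chunks of `request_size` and return MB/s."""
    buf = b"\0" * request_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        written = 0
        while written < file_size:
            chunk = buf[: min(request_size, file_size - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # include the time needed to reach the device
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (file_size / (1024 * 1024)) / elapsed


def sweep(directory, file_size=8 * 1024 * 1024, request_sizes_kb=(64, 256, 1024)):
    """Measure write bandwidth for each request size (in KB) inside `directory`."""
    results = {}
    for kb in request_sizes_kb:
        path = os.path.join(directory, f"bench_{kb}k.tmp")
        try:
            results[kb] = write_bandwidth_mb_s(path, file_size, kb * 1024)
        finally:
            if os.path.exists(path):
                os.remove(path)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for kb, mb_s in sweep(tmp).items():
            print(f"{kb} KB requests: {mb_s:.1f} MB/s")
```

Pointing `directory` at a mount point of the file system under test, and raising `file_size` towards 1 GB, would bring the sketch closer to the setup described above.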
25. Performance Evaluation – Read from several clients
[Figure: average aggregate read rate (MB/sec) versus number of concurrent clients (1 to 30) for NFS, PVFS, Gfarm, Lustre and GPFS.]
As the number of clients increases, the average bandwidth per client
decreases. As the number of file system clients (PVMs) increases, there
is a higher probability that the two VMs executed on each physical
machine operate simultaneously as a client (PVM) and a server (SVM) of the file system.
26. Performance Evaluation – Read from several clients
[Figure: aggregate read rate (MB/sec) versus number of concurrent clients (1 to 30) for NFS, PVFS, Gfarm, Lustre and GPFS.]
As the number of clients increases, the global performance of the file
systems also increases.
GPFS = 580.79 MB/s, Lustre = 425.17 MB/s,
Gfarm = 310.88 MB/s, PVFS = 244.73 MB/s, NFS = 18.61 MB/s
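The gap with respect to the dedicated NFS-NAS solution is easier to appreciate as a ratio. The sketch below divides each aggregate read rate listed above by the NFS rate; the helper name `speedup_over` is hypothetical.

```python
# Aggregate read rates in MB/s, as listed above.
aggregate_read_mb_s = {
    "GPFS": 580.79,
    "Lustre": 425.17,
    "Gfarm": 310.88,
    "PVFS": 244.73,
    "NFS": 18.61,
}


def speedup_over(baseline_name, rates):
    """Ratio of every rate to the rate of `baseline_name` (baseline excluded)."""
    base = rates[baseline_name]
    return {
        name: rate / base
        for name, rate in rates.items()
        if name != baseline_name
    }


if __name__ == "__main__":
    for name, ratio in speedup_over("NFS", aggregate_read_mb_s).items():
        print(f"{name}: {ratio:.1f}x the NFS aggregate read rate")
```

The same helper can be applied to the aggregate write rates reported for slide 28.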
27. Performance Evaluation – Write from several clients
[Figure: average aggregate write rate (MB/sec) versus number of concurrent clients (1 to 30) for NFS, PVFS, Gfarm, Lustre and GPFS.]
As the number of clients increases, the average bandwidth per
client decreases.
28. Performance Evaluation – Write from several clients
[Figure: aggregate write rate (MB/sec) versus number of concurrent clients (1 to 30) for NFS, PVFS, Gfarm, Lustre and GPFS.]
The global performance of the file systems increases up to a certain
number of clients (15 or 18) and then begins to decrease.
Lustre = 270.01 MB/s, GPFS = 246.88 MB/s,
Gfarm = 211.47 MB/s, PVFS = 116.15 MB/s, NFS = 5.93 MB/s
29. Performance Evaluation Analysis
When the CPVCs execute intensive processing tasks,
performance of the file systems (CSVCs) is affected by less than
4%.
With the use of a CSVC it is possible to achieve read
bandwidths of 4.5 Gbps and write bandwidths of 2.2 Gbps.
Several terabytes may be grouped through the proposed
strategy.
With a CSVC, bandwidths higher than 1 Gbps are attained
without the need to lay additional cable.
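The Gbps figures follow from the aggregate MB/s rates measured earlier. The slides do not state which unit convention was used, so the sketch below assumes decimal units (1 MB = 10^6 bytes, 1 Gb = 10^9 bits); with that assumption the results only approximately match the rounded figures quoted above.

```python
def mb_per_s_to_gbps(rate_mb_s, bytes_per_mb=1_000_000, bits_per_gb=1_000_000_000):
    """Convert a rate in MB/s to Gbps (decimal units by default, an assumption)."""
    return rate_mb_s * bytes_per_mb * 8 / bits_per_gb


if __name__ == "__main__":
    # Best aggregate rates reported in the multi-client tests.
    best_read_mb_s = 580.79   # GPFS, aggregate read
    best_write_mb_s = 270.01  # Lustre, aggregate write
    print(f"read:  {mb_per_s_to_gbps(best_read_mb_s):.2f} Gbps")
    print(f"write: {mb_per_s_to_gbps(best_write_mb_s):.2f} Gbps")
```

Under either decimal or binary conventions, both rates exceed the 1 Gbps of a single network link, which is the point made above.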
30. Conclusions
The strategy of using CPVCs and CSVCs in computer labs
concurrently and transparently makes it possible to take advantage of the
unused processing and storage capabilities.
Hundreds of processing cores and several terabytes may be
grouped through the proposed strategy for the development of
e-Science projects.
The strategy allows customizing the tools, middleware,
applications, and configurations of the CPVCs and the CSVCs,
guaranteeing the usability of the UnaGrid infrastructure.
31. Future Work
Assessment of the strategy proposed in a production
environment and its scalability.
Assessment of the performance of applications that use
CSVCs.
The use of policies and mechanisms of redundancy in the
CSVCs.
The use of strategies for data placement.
Performance evaluation with other opportunistic systems and file
systems.
33. UnaGrid – Implementation
Three computer rooms
(with 35 computers each).
Core 2 Duo processors
and 4 GB of RAM.
Three CPVCs.
Condor.
VMware.
Globus.
35. A new file system or opportunistic system
A long time is required:
Data and metadata distribution.
Metadata management.
Cache management.
Implementation (kernel or user).
Storage media.
Communication protocols.
User management.
Scalability.
POSIX semantics.
Replication tools.
Others.
36. Related Work – Opportunistic Systems
FarSite, FreeLoader, OppStore, Desktop Data Grid (DDG)
What features must the UnaGrid storage system have?
37. Related Work – Opportunistic Systems
Property / System                              Farsite  FreeLoader  OppStore  Desktop Data Grid
Application modification required              no       yes         yes       no
Operation in Linux environments                no       yes         yes       yes
Natively integrated with the operating system  yes      no          no        yes
Data redundancy support by software            yes      yes         yes       yes
Installation on PC desktops                    yes      yes         yes       yes
Non-intrusive operation                        no       yes         yes       yes
Available for installation                     no       no          yes       no
38. Related Work – Opportunistic Systems
Property / System                 Farsite  FreeLoader  OppStore  Desktop Data Grid
Designed for HPC                  no       yes         yes       yes
Security mechanisms               yes      no          NA        NA
License type                      Pr       OS          OS        OS
Data striping support             no       yes         yes       no
Model for metadata management     P2P      C/S         C/S, P2P  C/S
Model for file/fragment transfer  P2P      P2P         P2P       P2P
39. Related Work – Distributed File Systems
[Figure: architectures of the Parallel Virtual File System (PVFS), the General Parallel File System (GPFS), Gfarm (Grid Datafarm) and Lustre (Sun Microsystems), showing data servers, metadata servers and combined data/metadata servers connected to clients through the network.]
40. Related Work – Distributed File Systems
Property / System                              PVFS  Gfarm  Lustre  GPFS
Application modification required              no    no     no      no
Operation in Linux environments                yes   yes    yes     yes
Natively integrated with the operating system  yes   yes    yes     yes
Data redundancy support by software            no    yes    no      yes
Installation on PC desktops                    yes   yes    yes     yes
Non-intrusive operation                        no    no     no      no
Available for installation                     yes   yes    yes     yes
41. Related Work – Distributed File Systems
Property / System                 PVFS  Gfarm  Lustre  GPFS
Designed for HPC                  yes   yes    yes     yes
Security mechanisms               yes   yes    yes     yes
License type                      OS    OS     OS      Pr
Data striping support             yes   no     yes     yes
Model for metadata management     C/S   C/S    C/S     C/S
Model for file/fragment transfer  P2P   P2P    P2P     P2P