2. What have I tested?
➢ GlusterFS
http://glusterfs.org
➢ XtreemFS
http://www.xtreemfs.org/
➢ FhgFS
http://www.fhgfs.com/cms/ (Fraunhofer)
➢ Tahoe-LAFS http://tahoe-lafs.org/
➢ PlasmaFS
http://blog.camlcity.org/blog/plasma4.html
3. What will be compared?
➢ Ease of install and configuration
➢ Sequential write and read (large file)
➢ Sequential write and read (many small files of the same size)
➢ Copy from local to distributed
➢ Copy from distributed to local
➢ Copy from distributed to distributed
➢ Creating many files of random sizes (a real-world case)
➢ Creating many links (cp -al)
4. Why only 1Gbit/s?
➢ It is considered commodity hardware
➢ 6-7 years ago it was considered high performance
➢ Some of these projects started around that time
➢ And lastly, I only had 1Gbit/s switches available for the tests
5. Let's get the theory first
➢ 1Gbit/s has ~950Mbit/s of usable bandwidth
(Wikipedia - Ethernet frame),
which is 118.75 MBytes/s of usable speed
➢ iperf tests:
- 512Mbit/s -> 65MByte/s
- 938Mbit/s -> 117MByte/s
➢ There are many 1Gbit/s adapters
that can not go beyond 70k pps
➢ hping3 TCP pps tests:
- 50096 PPS (75MBytes/s)
- 62964 PPS (94MBytes/s)
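These figures are easy to sanity-check: dividing the usable bit rate by 8 gives the byte rate, and multiplying packets per second by the frame size gives the pps-limited throughput. The 1500-byte frame size below is an assumption (the standard Ethernet MTU; the slide does not state it):

```shell
# Sanity-check of the numbers above, assuming 1500-byte Ethernet frames.
awk 'BEGIN {
    printf "950 Mbit/s / 8         = %.2f MByte/s\n", 950 / 8
    printf "50096 pps * 1500 bytes = %.0f MByte/s\n", 50096 * 1500 / 1e6
    printf "62964 pps * 1500 bytes = %.0f MByte/s\n", 62964 * 1500 / 1e6
}'
```

The results (118.75, 75 and 94 MByte/s) match the slide, which suggests the hping3 runs were MTU-sized packets and the NIC was pps-bound, not bandwidth-bound.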
6. Verify what the hardware can deliver locally
# echo 3 > /proc/sys/vm/drop_caches
# time dd if=/dev/zero of=test1 bs=XX count=1000
# time dd if=test1 of=/dev/null bs=XX
bs=1M    Local write 141MB/s    real 0m7.493s
bs=1M    Local read  228MB/s    real 0m4.605s
bs=100K  Local write 141MB/s    real 0m7.639s
bs=100K  Local read  226MB/s    real 0m4.596s
bs=1K    Local write 126MB/s    real 0m8.354s
bs=1K    Local read  220MB/s    real 0m4.770s
* most distributed filesystems write with the speed of the slowest member node
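The two dd commands above can be swept over all three block sizes in one small script. This is a sketch: the file name and count come from the slide, and the cache drop needs root (failures there are silenced so a non-root dry run still works):

```shell
#!/bin/sh
# Sweep the dd write/read test over the three block sizes from the slide.
TESTFILE=./test1
for BS in 1M 100K 1K; do
    # Drop the page cache so reads hit the disk (needs root; ignored otherwise).
    echo 3 2>/dev/null > /proc/sys/vm/drop_caches || true
    echo "bs=$BS write:"
    dd if=/dev/zero of="$TESTFILE" bs="$BS" count=1000 2>&1 | tail -n 1
    echo 3 2>/dev/null > /proc/sys/vm/drop_caches || true
    echo "bs=$BS read:"
    dd if="$TESTFILE" of=/dev/null bs="$BS" 2>&1 | tail -n 1
done
rm -f "$TESTFILE"
```

Note that without the cache drop, the read numbers would measure RAM rather than the disk.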
8. Linux Kernel Tuning
TCP memory optimizations - double the default TCP memory:
(min / pressure / max)
net.ipv4.tcp_mem=41460 42484 82920
(min / default / max)
net.ipv4.tcp_rmem=8192 87380 6291456
net.ipv4.tcp_wmem=8192 87380 6291456
9. Linux Kernel Tuning
➢ net.ipv4.tcp_syncookies=0 (default 1)
➢ net.ipv4.tcp_timestamps=0 (default 1)
➢ net.ipv4.tcp_app_win=40 (default 31)
➢ net.ipv4.tcp_early_retrans=1 (default 2)
* For more information - Documentation/networking/ip-sysctl.txt
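Applied ad hoc, the tunings from the two slides above look like the following sketch (run as root; the values are copied verbatim from the slides - verify them against Documentation/networking/ip-sysctl.txt for your kernel before using them in production):

```shell
# Apply the TCP tunings from slides 8-9 at runtime.
sysctl -w net.ipv4.tcp_mem="41460 42484 82920"
sysctl -w net.ipv4.tcp_rmem="8192 87380 6291456"
sysctl -w net.ipv4.tcp_wmem="8192 87380 6291456"
sysctl -w net.ipv4.tcp_syncookies=0
sysctl -w net.ipv4.tcp_timestamps=0
sysctl -w net.ipv4.tcp_app_win=40
sysctl -w net.ipv4.tcp_early_retrans=1
```

To make them survive a reboot, put the same key=value lines in /etc/sysctl.conf and reload with `sysctl -p`.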
24. Joomla tests (local to cluster)
# for i in {1..100}; do time cp -a /tmp/joomla /mnt/joomla$i; done
[Bar chart - copy time in seconds, lower is better; data set: 28MB, 6384 inodes;
GlusterFS 62.83, XtreemFS 31.42, FhgFS 19.26]
25. Joomla tests (cluster to local)
# for i in {1..100}; do time cp -a /mnt/joomla /tmp/joomla$i; done
[Bar chart - copy time in seconds, lower is better; data set: 28MB, 6384 inodes;
GlusterFS 200.73, XtreemFS 39.7, FhgFS 19.26]
26. Joomla tests (cluster to cluster)
# for i in {1..100}; do time cp -a joomla joomla$i; done
# for i in {1..100}; do time cp -al joomla joomla$i; done
[Bar chart - copy and link (cp -al) times in seconds, lower is better; data set:
28MB, 6384 inodes; GlusterFS, XtreemFS, FhgFS; copy: 265.02, 113.46, 89.52;
link: 76.44, 51.31, 22.53]
27. Conclusion
➢ Distributed FS for large file storage - FhgFS
➢ General purpose distributed FS - GlusterFS