
Massive Distributed Storage over InfiniBand RDMA
Z RESEARCH, Inc. - Commoditizing Supercomputing and Superstorage
© 2007 Z RESEARCH
What is GlusterFS?
GlusterFS is a cluster file system that aggregates multiple storage bricks over InfiniBand RDMA into one large parallel network file system.

GlusterFS is MORE than making data available over a network or organizing data on disk storage:
• Typical clustered file systems aggregate storage and provide unified views, but:
- scalability comes with increased cost, reduced reliability, difficult management, and longer maintenance and recovery times
- limited reliability means volume sizes are kept small
- capacity and I/O performance can be limited
GlusterFS allows capacity and I/O to scale using inexpensive, industry-standard modules! (A minimal client-side volume file is sketched below.)
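For concreteness, this is roughly what a minimal client-side volume file looks like (a sketch only: the brick name, server address, and directory are assumptions; the syntax follows the 1.x volume-file format shown later in this deck):

volume brick1
type protocol/client
option transport-type ib-verbs/client # InfiniBand RDMA; tcp/client also works
option remote-host 192.168.0.1 # hypothetical address of the storage brick
option remote-subvolume brick # volume name exported by the remote glusterfsd
end-volume

The cluster translators shown on the following slides (unify, AFR, stripe) are what aggregate many such bricks into one parallel file system.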
GlusterFS Features
1. Fully POSIX compliant!
2. Unified VFS!
3. Flexible volume management (stackable features)!
4. Application-specific scheduling / load balancing:
• round-robin; adaptive least usage (ALU); non-uniform file access (NUFA)!
5. Automatic file replication (AFR); Snapshot! and Undelete!
6. Striping for performance!
7. Self-heal! No fsck!
8. Pluggable transport modules (IB verbs, IB-SDP) - see the sketch after this list!
9. I/O accelerators - I/O threads, I/O cache, read-ahead, and write-behind!
10. Policy driven - user/group/directory-level quotas, access control lists (ACLs)
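Feature 8 is visible directly in a volume file: switching transports is a single option line. A sketch (the remote host is hypothetical; only transports named in this deck are listed):

volume remote
type protocol/client
option transport-type ib-verbs/client # InfiniBand verbs (RDMA)
# option transport-type ib-sdp/client # Sockets Direct Protocol over InfiniBand
# option transport-type tcp/client # plain TCP over GigE / 10GigE
option remote-host 192.168.0.1 # hypothetical
option remote-subvolume brick
end-volume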
GlusterFS Design
[Diagram: GlusterFS clustered file system on the x86-64 platform. Client side: a cluster of clients (supercomputer, data center), each running a GLFS client with a clustered volume manager and clustered I/O scheduler. Server side: storage bricks 1 through N, each running GLFSD and exporting a volume. Clients reach the bricks over InfiniBand / GigE / 10GigE using RDMA. Storage gateways (NFS/Samba plus a GLFS client) re-export the file system over TCP/IP for compatibility with MS Windows and other Unices.]
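Each storage brick in the diagram runs the GlusterFS server daemon (GLFSD) exporting a local volume. A minimal brick-side volume file might look like this (a sketch: the export directory and the wide-open auth rule are assumptions):

volume brick
type storage/posix
option directory /data/export # hypothetical local (e.g. ext3) directory
end-volume

volume server
type protocol/server
option transport-type ib-verbs/server # or tcp/server over GigE
subvolumes brick
option auth.ip.brick.allow * # allow any client IP; tighten in production
end-volume

The daemon would then be started on each brick with something like glusterfsd -f server.vol.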
Stackable Design
[Diagram: the stackable translator design. On the client side, the GlusterFS client stacks translators such as Read Ahead, I/O Cache, and Unify beneath the VFS. On the server side, bricks 1 through n each run a GlusterFS server whose POSIX translator sits on a local ext3 volume. Client and servers communicate over TCP/IP (GigE, 10GigE) or InfiniBand RDMA.]
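The client-side stack in this figure maps directly onto a volume file: each translator names the layer below it in a subvolumes line. A sketch for two bricks (host addresses and cache sizes are assumptions; unify in the 1.3 series also expects a small namespace brick, elided here for brevity):

volume client1
type protocol/client
option transport-type ib-verbs/client
option remote-host 192.168.0.1 # hypothetical brick 1
option remote-subvolume brick
end-volume

volume client2
type protocol/client
option transport-type ib-verbs/client
option remote-host 192.168.0.2 # hypothetical brick 2
option remote-subvolume brick
end-volume

volume unify0
type cluster/unify
subvolumes client1 client2 # aggregate the two bricks
option scheduler rr # round-robin file placement
end-volume

volume iocache
type performance/io-cache
subvolumes unify0
option cache-size 64MB # assumed size
end-volume

volume readahead
type performance/read-ahead
subvolumes iocache
option page-size 256KB # assumed size
end-volume

A client would mount the topmost volume with something like glusterfs -f client.vol /mnt/glusterfs.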
GlusterFS Function - unify
[Diagram: with unify (round-robin scheduler), files /files/aaa, /files/bbb, and /files/ccc are each placed on one of server/head nodes 1-3; the client view shows all three files in a single namespace.]
GlusterFS Function – unify+AFR
[Diagram: with unify (round-robin) plus AFR, each of /files/aaa, /files/bbb, and /files/ccc is stored on two of the three server/head nodes; the client view still shows a single copy of each file.]
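One way to get this replicated layout is to layer AFR volumes under unify, pairing bricks so that every file lands on two of them. A sketch, assuming four protocol/client volumes client1..client4 defined as in the earlier sketches:

volume afr1
type cluster/afr
subvolumes client1 client2 # files placed here exist on both bricks
end-volume

volume afr2
type cluster/afr
subvolumes client3 client4
end-volume

volume unify0
type cluster/unify
subvolumes afr1 afr2
option scheduler rr # distribute new files across the replica pairs
end-volume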
GlusterFS Function - stripe
[Diagram: with stripe, each of /files/aaa, /files/bbb, and /files/ccc is split into stripes stored across all three server/head nodes; the client view (stripe) shows each file whole.]
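The stripe translator is configured much the same way; a sketch (the pattern:size form of the block-size option is an assumption based on 1.x-era examples):

volume stripe0
type cluster/stripe
subvolumes client1 client2 client3 # each file is split across all three bricks
option block-size *:1MB # stripe every file (*) in 1 MB blocks
end-volume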
I/O Scheduling
1. Round robin
2. Adaptive least usage (ALU)
3. NUFA
4. Random
5. Custom
volume bricks
type cluster/unify
subvolumes ss1c ss2c ss3c ss4c
option scheduler alu
option alu.limits.min-free-disk 60GB
option alu.limits.max-open-files 10000
option alu.order disk-usage:read-usage:write-usage:open-files-usage:disk-speed-usage
option alu.disk-usage.entry-threshold 2GB # Units in KB, MB and GB are allowed
option alu.disk-usage.exit-threshold 60MB # Units in KB, MB and GB are allowed
option alu.open-files-usage.entry-threshold 1024
option alu.open-files-usage.exit-threshold 32
option alu.stat-refresh.interval 10sec
end-volume
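Swapping schedulers is a one-line change in the same unify volume. For example, NUFA prefers the brick local to the node issuing the I/O (a sketch; the nufa option name is an assumption patterned on the ALU options above):

volume bricks
type cluster/unify
subvolumes ss1c ss2c ss3c ss4c
option scheduler nufa
option nufa.local-volume-name ss1c # assumed: prefer this node's own brick
end-volume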
Benchmarks
GlusterFS Throughput & Scaling Benchmarks
Benchmark Environment
Method: multiple 'dd' streams with varying block sizes are read and written from multiple clients simultaneously.
GlusterFS Brick Configuration (16 bricks)
Processor - Dual Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
RAM - 8GB FB-DIMM
Linux Kernel - 2.6.18-5+em64t+ofed111 (Debian)
Disk - SATA-II 500GB
HCA - Mellanox MHGS18-XT/S InfiniBand HCA
Client Configuration (64 clients)
RAM - 4GB DDR2 (533 MHz)
Processor - Single Intel(R) Pentium(R) D CPU 3.40GHz
Linux Kernel - 2.6.18-5+em64t+ofed111 (Debian)
Disk - SATA-II 500GB
HCA - Mellanox MHGS18-XT/S InfiniBand HCA
Interconnect switch - Voltaire InfiniBand switch (14U)
GlusterFS version 1.3.pre0-BENKI
GlusterFS Performance
Aggregated I/O benchmark on 16 bricks (servers) and 64 clients over the IB verbs transport.
• Peak aggregated read throughput was 13 GBps!
• Past a certain threshold, write performance plateaus because of the disk I/O bottleneck.
• System memory greater than the peak load ensures the best possible performance.
• The ib-verbs transport driver is about 30% faster than the ib-sdp transport driver.
Scalability
Performance improves as the number of bricks increases: throughput rose with each increase in servers from 1 to 16.
GlusterFS Value Proposition
✓ A single solution for tens of terabytes to petabytes
✓ No single point of failure - completely distributed - no centralized metadata
✓ Non-stop storage - withstands hardware failures; self-healing; snapshots
✓ Data easily recovered even without GlusterFS
✓ Customizable schedulers
✓ User friendly - installs and upgrades in minutes
✓ Operating system agnostic!
✓ Extremely cost effective - deploys on any x86-64 hardware!
Thank You!
Backup Slides
GlusterFS vs Lustre Benchmark
Benchmark Environment
Brick Config (10 bricks)
Processor - 2 x AMD Dual-Core Opteron™ Model 275 processors
RAM - 6 GB
Interconnect - InfiniBand 20 Gb/s - Mellanox MT25208 InfiniHost III Ex
Hard disk - Western Digital Corp. WD1200JB-00REA0, ATA DISK drive
Client Config (20 clients)
Processor - 2 x AMD Dual-Core Opteron™ Model 275 processors
RAM - 6 GB
Interconnect - InfiniBand 20 Gb/s - Mellanox MT25208 InfiniHost III Ex
Hard disk - Western Digital Corp. WD1200JB-00REA0, ATA DISK drive
Software Version
Operating System - Red Hat Enterprise Linux 4 (Update 3)
Linux version - 2.6.9-42
Lustre version - 1.4.9.1
GlusterFS version - 1.3-pre2.3
Directory Listing Benchmark
[Chart: GlusterFS vs Lustre, directory-listing time in seconds (lower is better): Lustre 1.7, GlusterFS 1.5]
$ find /mnt/glusterfs
The find command walks the directory tree and prints every path to the console. In this case the tree held thirteen thousand binary files.
Note: the commands are identical for GlusterFS and Lustre except for the mount-point path.
Copy Local to Cluster File System
[Chart: GlusterFS vs Lustre, copy local to cluster file system, time in seconds (lower is better): Lustre 37, GlusterFS 26]
$ cp -r /local/* /mnt/glusterfs/
The cp utility copies files and directories; 12039 files (595 MB) were copied into the cluster file system.
Copy Local from Cluster File System
[Chart: GlusterFS vs Lustre, copy from cluster file system to local disk, time in seconds (lower is better): Lustre 45, GlusterFS 18]
$ cp -r /mnt/glusterfs/* /local/
The cp utility copies files and directories; 12039 files (595 MB) were copied from the cluster file system.
Checksum
[Chart: GlusterFS vs Lustre, md5sum checksum time in seconds (lower is better): Lustre 45.1, GlusterFS 44.4]
An md5sum is computed for every file in the file system; again, the tree held thirteen thousand binary files.
$ find . -type f -exec md5sum {} \;
Base64 Conversion
[Chart: GlusterFS vs Lustre, Base64 conversion time in seconds (lower is better): Lustre 25.7, GlusterFS 25.1]
Base64 is an algorithm for encoding binary data as ASCII and vice versa. This benchmark encoded a 640 MB binary file.
$ base64 encode big-file big-file.base64
Pattern Search
[Chart: GlusterFS vs Lustre, pattern-search time in seconds (lower is better): Lustre 54.3, GlusterFS 52.1]
The grep utility searches a file for a PATTERN and prints the matching lines to the console. This benchmark used a 1 GB ASCII Base64 file.
$ grep GNU big-file.base64
Data Compression
[Chart: GlusterFS vs Lustre, compression and decompression times in seconds (lower is better). Compression: Lustre 18.3, GlusterFS 16.5; decompression: Lustre 14.8, GlusterFS 10.1]
The GNU gzip utility compresses files using Lempel-Ziv coding. This benchmark was performed on a 1 GB tar file.
$ gzip big-file.tar
$ gunzip big-file.tar.gz
Apache Web Server
[Chart: GlusterFS vs Lustre, Apache web server benchmark, time in minutes (lower is better): GlusterFS 3.17; Lustre failed to complete]
Apache served 12039 files (595 MB) over the HTTP protocol; a wget client fetched the files recursively.
**Lustre failed after downloading 33 MB out of 585 MB in 11 minutes.