About SSD
Dongjun Shin
Samsung Electronics
Outline
SSD primer
Optimal I/O for SSD
Benchmarking Linux FS on SSD
Case study: ext4, btrfs, xfs
Design consideration for SSD
What’s next?
New interfaces for SSD
Parallel processing of small I/O
SSD Primer (1/2)
Physical unit of flash memory
NAND page – unit for read & write
NAND block – unit for erase (a.k.a. erasable block)
Physical characteristics
Erase before re-write
Sequential write within an erasable block
[Figure: the Flash Translation Layer maps the LBA space (visible to the OS) onto the flash memory space; NAND page = 2-4kB, NAND block = 64-128 NAND pages]
SSD Primer (2/2)
Internal organization: 2-dimensional (NxM parallelism)
Similar to RAID-0 (stripe size = sector or NAND page)
Effective page & block size is multiplied by NxM (max)
[Figure: SSD internals – a controller running F/W (FTL) behind a host I/F (ex. SATA), with N channels (striping) x M ways (pipelining); logical pages are striped round-robin across Ch0-Ch3 and, within each channel, across Chip0-Chip3]
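The striping in the figure can be sketched as a simple mapping from logical page number to (channel, chip). This is an illustrative model of the slide's 4x4 grid, not vendor firmware; the "four stripe rows per chip" pattern is taken from the figure.

```python
# Sketch of the NxM layout: pages go round-robin across channels,
# and (per the figure) four stripe rows fill a chip before the
# mapping moves to the next way.

N_CHANNELS = 4       # N-channel (striping)
M_WAYS = 4           # M-way (pipelining)
ROWS_PER_CHIP = 4    # stripe rows per chip, as drawn in the figure

def place(page_no):
    """Map a logical page number to (channel, way)."""
    channel = page_no % N_CHANNELS
    row = page_no // N_CHANNELS
    way = (row // ROWS_PER_CHIP) % M_WAYS
    return channel, way

# Consecutive pages land on different channels, so one large request
# keeps several chips busy at once.
for p in (0, 1, 5, 16):
    print(p, place(p))
```

With this mapping, pages 0, 4, 8, 12 all sit on Ch0/Chip0 and page 16 starts Chip1, matching the grid on the slide.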
Optimal I/O for SSD
Key points
Parallelism
• The larger the I/O request, the better
Match with physical characteristics
• Alignment with the NAND page or block size*
• Segmented sequential write (within an erasable block)
What about Linux?
HDD also favors larger I/O: read-ahead, deferred & aggregated writes
A segmented FS layout is good if aligned with erasable block boundaries
Write optimization is FS dependent (ex. allocation policy)
* Usually, the partition layout is not aligned (1st partition at LBA 63)
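The alignment fix behind the footnote is simple arithmetic: round the partition start up to an erasable-block boundary. The 4MB block size below is an illustrative assumption; the test partition on the next slide uses an 8MB-aligned start (LBA 16384).

```python
# Sketch: the traditional DOS layout puts the 1st partition at LBA 63,
# which is misaligned for NAND; round up to an erasable-block multiple.

SECTOR = 512                    # bytes per LBA sector
ERASE_BLOCK = 4 * 1024 * 1024   # assumed effective erasable block (4MB)

def align_up(lba, block_bytes=ERASE_BLOCK, sector=SECTOR):
    """Smallest LBA >= lba that sits on an erasable-block boundary."""
    sectors_per_block = block_bytes // sector
    return -(-lba // sectors_per_block) * sectors_per_block

print(align_up(63))                        # LBA 63 -> 8192 (4MB boundary)
print(align_up(63, 8 * 1024 * 1024))       # 8MB blocks -> 16384, as in the test setup
```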
Test environment (1/2)
Hardware
Intel Core 2 Duo, 1GB RAM
Software
Fedora 7 (Kernel 2.6.24)
Benchmark: postmark
Filesystems
No journaling - ext2
Journaling - ext3, ext4, reiserfs, xfs
• ext3, ext4: data=writeback,barrier=1[,extents]
• xfs: logbsize=128k
COW, log-structured - btrfs (latest unstable, 4k block), nilfs (testing-8)
SSD
Vendor M (32GB, SATA): read 100MB/s, write 80MB/s
Test partition starts at LBA 16384 (8MB, aligned)
Test environment (2/2)
Postmark workload
Ref: Evaluating Block-level Optimization through the IO Path (USENIX 2007)
Workload | File size | # of files (work-set) | # of transactions | Total app read/write
LL       | 0.1-3M    | 4,250                 | 10,000            | 9G/17G
LS       | 0.1-3M    | 1,000                 | 10,000            | 9.7G/12G
SL       | 9-15K     | 100,000               | 100,000           | 600M/1.8G
SS       | 9-15K     | 10,000                | 100,000           | 630M/755M*
* Mostly write-only
Benchmark results (1/2)
Small file size (SS, SL)
[Chart: transactions/sec for the SS and SL workloads on ext2, ext3, ext4, reiserfs, xfs, btrfs, nilfs; y-axis 0-2500 transactions/sec]
Benchmark results (2/2)
Large file size (LS, LL)
[Chart: transactions/sec for the LS and LL workloads on ext2, ext3, ext4, reiserfs, xfs, btrfs, nilfs; y-axis 0-30 transactions/sec]
I/O statistics (1/2)
Average size of I/O
[Chart: average I/O size (Kbytes, 0-140) for reads and writes under SS, SL, LS, LL on ext2, ext3, ext4, reiserfs, xfs, btrfs, nilfs]
I/O statistics (2/2)
Segmented sequentiality of write I/O (segment: 1MB)
[Chart: segmented write sequentiality (1MB segments) under SS, SL, LS, LL on ext2, ext3, ext4, reiserfs, xfs, btrfs, nilfs; most filesystems fall in the 0-20% range, while the 100% bars in all four workloads belong to the log-structured nilfs]
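One plausible way to compute a metric like the chart's "segmented sequentiality" from a write trace: count a write as sequential if it starts exactly where the previous write in the same 1MB segment ended. The slides do not spell out the exact definition, so this is an assumption.

```python
# Sketch of a segmented-sequentiality metric over a write trace.

SEGMENT = 1024 * 1024  # 1MB segment size from the slide

def seq_ratio(writes, segment=SEGMENT):
    """writes: list of (offset, length) in bytes, in issue order."""
    next_expected = {}   # segment id -> next sequential offset
    sequential = 0
    for off, length in writes:
        seg = off // segment
        if next_expected.get(seg) == off:
            sequential += 1          # continues the stream in this segment
        next_expected[seg] = off + length
    return sequential / len(writes) if writes else 0.0

# Three back-to-back 4k writes plus one jump to another segment:
trace = [(0, 4096), (4096, 4096), (8192, 4096), (3 * SEGMENT, 4096)]
print(seq_ratio(trace))   # 0.5
```

A log-structured FS (nilfs) scores near 100% by construction, which matches the chart.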
Case study - ext4
Condition
data=ordered, allocation: default/noreservation/oldalloc
[Chart: transactions/sec (0-1200) for SS and SL with ext4-wb, ext4-ord, ext4-nores, ext4-olda]
1. Almost no difference between allocation policies
2. Why is data=ordered better for SL?
Case study - btrfs
Condition
Block size: 4k/16k, allocation: ssd option on/off
[Chart: transactions/sec (0-1800) for SS, SL, LS, LL with btrfs-4k, btrfs-16k, btrfs-ssd-4k]
1. 4k is better than 16k (sequentiality = 12% : 2%)
2. The ssd option is effective (10-40% improvement)
Case study - xfs
Condition
Mount with barrier on/off
[Chart: transactions/sec (0-800) for SS, SL, LS, LL with xfs-bar and xfs-nobar]
Large barrier overhead
Design consideration for SSD
Lessons from flash FS (ex. logfs)
Sequential writing at multiple logging points
Wandering tree
• Trade-off between sequentiality and the amount of write
• Cf. space map (Sun ZFS)
Need to optimize garbage collection overhead
• Either in the FS itself or in the FTL of the SSD
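The wandering-tree trade-off can be made concrete with a toy model: in a copy-on-write tree, updating one leaf out-of-place rewrites every index node on the path to the root, so perfect write sequentiality is bought with extra writes. The numbers below are illustrative, not measurements from the slides.

```python
# Sketch of wandering-tree write amplification in a COW index tree.
import math

def writes_per_update(num_blocks, fanout):
    """One leaf update rewrites the leaf plus every index node above it."""
    depth = max(1, math.ceil(math.log(num_blocks, fanout)))  # index levels
    return depth + 1                                          # + the leaf itself

# A 1M-block volume with fanout-256 index nodes: each 1-block update
# costs about 4 block writes.
print(writes_per_update(1_000_000, 256))
```

This is the cost that alternatives like ZFS's space map bookkeeping try to reduce.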
Next topic: End-to-end optimization
Exchange info with SSD (trim, SSD identification)
Make best use of parallelism
New interfaces for SSD (t13.org)
Trim command
Let the device know which LBA range is no longer used
• This will be helpful for optimizing the FTL
Should be passed through the whole stack: FS → bio → SCSI → libata
• Passing a bio with no data
• What about I/O reordering & I/O queuing?
SSD identification (added to “ATA identify”)
Report the size of the page and the erasable block
• Physical or effective?
Useful for FS and volume manager
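Why trim helps the FTL can be shown with a toy model (an assumption for illustration, not a real FTL): without trim, every previously written page looks live and must be copied during garbage collection; trim lets the FTL drop pages the file system has deleted.

```python
# Toy page-mapped FTL: trim shrinks the set of pages GC must relocate.

class ToyFTL:
    def __init__(self):
        self.valid = set()           # LBAs the FTL believes are live

    def write(self, lba):
        self.valid.add(lba)

    def trim(self, start, count):
        """Trim: these LBAs are no longer used by the file system."""
        self.valid -= set(range(start, start + count))

    def gc_copy_cost(self, start, count):
        """Live pages GC must copy out before erasing this LBA range."""
        return len(self.valid & set(range(start, start + count)))

ftl = ToyFTL()
for lba in range(128):
    ftl.write(lba)
print(ftl.gc_copy_cost(0, 128))      # 128: everything looks live
ftl.trim(0, 96)                      # FS deleted most of it
print(ftl.gc_copy_cost(0, 128))      # 32: far less to copy
```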
Parallel processing of small I/O
Make better use of I/O queuing (TCQ or NCQ)
Parallel processing of small I/O
Desktop environment? Barrier?
[Figure: four small requests A, B, C, D queued for channels Ch0-Ch3. Without I/O queuing each request is issued and completed in turn (chips sit idle), taking 4 steps; with I/O queuing, requests targeting different idle chips are issued in parallel, taking 2 steps.]
Summary
Optimization for SSD
Alignment is important
Segmented sequentiality
Make better use of parallelism (for either small or large I/O)
• An I/O barrier may stall the pipelined processing
What can you do?
File system: alignment, allocation policy, design (ex. COW)
Block layer: bio w/ hint, barrier, I/O queueing, scheduler(?)
Volume manager: alignment, allocation
Virtual memory: read-ahead
References
T13 spec for SSD
Introduction to SSD and flash memory
FTL description & optimization
BPLRU: A Buffer Management Scheme for Improving Random Writes in Flash Storage (FAST ’08)
Appendix. I/O Pattern
[Plots of I/O patterns, one slide per group:]
SS workload – ext4, xfs
SS workload – btrfs, nilfs
SL workload – ext4, xfs
SL workload – btrfs, nilfs
LS workload – ext4, reiserfs, xfs
LS workload – btrfs, nilfs