NVM Express (NVMe): An Overview and Performance Study · 2018. 11. 19.


NVM Express (NVMe): An Overview and Performance Study

Chris Hollowell <[email protected]>
RHIC/ATLAS Computing Facility
Brookhaven National Laboratory


What is NVMe?

NVMe - NVM Express - Non-Volatile Memory Express
  An industry standard for attaching SSDs (NAND flash) directly to the PCIe bus

Eliminates latency and bandwidth limitations imposed by SAS/SATA storage controllers optimized for traditional rotating media

Architected for highly parallel access
  Support for up to 64k hardware I/O queues, with up to 64k commands per queue
  Excellent for parallel I/O operations in servers with ever-increasing processor core counts

Supported in the Linux kernel since 3.3
  Backported to RHEL/SL 6 (kernel 2.6.32) in the 6.5 release
  Uses “Multi-Queue Block IO Queuing” (blkmq) rather than the standard kernel block I/O schedulers (noop, cfq, deadline) to support parallel hardware queue architecture
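To put the queue limits quoted above in perspective, the following minimal sketch multiplies them out and sets them against a single queue of 32 commands, which is the usual figure for AHCI/SATA. The AHCI numbers are added context rather than something stated in the slides, the 64k values are the slide's round numbers, and all names are illustrative.

```python
# Upper bounds quoted on the slide: 64k I/O queues, 64k commands per queue.
NVME_MAX_QUEUES = 64 * 1024
NVME_MAX_QUEUE_DEPTH = 64 * 1024

# Assumption (not from the slides): AHCI/SATA exposes one queue of 32 commands.
AHCI_MAX_QUEUES = 1
AHCI_MAX_QUEUE_DEPTH = 32


def max_outstanding(queues: int, depth: int) -> int:
    """Theoretical ceiling on the number of commands in flight at once."""
    return queues * depth


nvme_limit = max_outstanding(NVME_MAX_QUEUES, NVME_MAX_QUEUE_DEPTH)
ahci_limit = max_outstanding(AHCI_MAX_QUEUES, AHCI_MAX_QUEUE_DEPTH)

print(f"NVMe ceiling: {nvme_limit:,} commands")
print(f"AHCI ceiling: {ahci_limit} commands")
```

These are protocol ceilings only; real devices and drivers typically configure far fewer queues, often about one per CPU core.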


What is NVMe? (Cont.)

NVMe Command Queue Architecture (from nvmexpress.org)

Several vendors manufacturing NVMe hardware: Intel, Crucial, Samsung, etc.

Sizes over 3 TB available

Available as PCIe add-in cards, or a 2.5” SFF-8639/U.2 form factor with a physically SAS-like connector for drive backplanes

Cost is still fairly high:
  400 GB drive ~$400+
  1 TB drive >$1000


Comparison to Fusion-IO Devices

Fusion-IO has offered PCIe-connected SSD storage (ioDrive) for a number of years: how is NVMe different?

NVMe interface/protocol is an industry standard
  No need for proprietary OS drivers

Commoditization of NVMe makes the technology significantly more affordable

SFF-8639/U.2 form factor NVMe drives are in a familiar 2.5” physically SAS-like form which can be used on a backplane, and easily hotplugged

Helps to reduce downtime when replacing failed devices

Performance of NVMe drives can be better than traditional Fusion-IO devices

NVMe and Fusion-IO Read IOPs (from “The Register”)


NVMe Evaluation

Test Configuration

Dell PowerEdge R630
  2 Intel Xeon 2650v3 2.3 GHz CPUs (32 logical cores total)
  PERC H730 (1 GB cache) storage controller
  64 GB (8x8 GB) 2133 MHz DDR4 DIMMs
  2 300 GB Dell 400-AEEH 15K RPM 6 Gbps SAS 2.5” drives
  2 400 GB Samsung/Dell MZWEI400HAGM NVMe 2.5” drives
    SFF-8639/U.2 form factor - front loading

SSDs:
  Samsung MZ-7PD512 512 GB 6 Gbps SATA 2.5” drive
  Crucial CT1024M550SSD1 1 TB 6 Gbps SATA 2.5” drive

Most tests performed with EXT4

Scientific Linux 6
  Kernel 2.6.32-504.3.3.el6

Benchmarks
  CFQ I/O scheduler used with SAS
  Deadline I/O scheduler used with SSDs
  Blkmq scheduling used with NVMe
    No other scheduling options available
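The scheduler in effect for a given block device can be checked through sysfs. The sketch below is one way to do that, assuming the usual `/sys/block/<device>/queue/scheduler` layout, where the active scheduler is wrapped in square brackets. The device names are hypothetical, and a device that bypasses the classic schedulers typically reports a single word such as `none`.

```python
from __future__ import annotations

from pathlib import Path


def parse_scheduler(text: str) -> tuple[str, list[str]]:
    """Return (active, available) from the contents of a queue/scheduler file.

    The kernel lists the available schedulers and wraps the active one in
    square brackets, e.g. "noop deadline [cfq]". A device with no selectable
    scheduler reports a single unbracketed word such as "none".
    """
    names = text.split()
    active = next((n[1:-1] for n in names if n.startswith("[")), None)
    available = [n.strip("[]") for n in names]
    if active is None:
        # No brackets: nothing to choose from, so the first entry is in effect.
        active = available[0] if available else ""
    return active, available


def read_scheduler(device: str) -> tuple[str, list[str]]:
    """Read the scheduler for a block device name such as "sda" (hypothetical)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler(path.read_text())


# Example strings only; these are not captured from the test system.
print(parse_scheduler("noop deadline [cfq]"))
print(parse_scheduler("none"))
```

On a live system, `read_scheduler("sda")` would return the same kind of tuple for a real device; writing a scheduler name to the same file (as root) is the usual way to switch.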


NVMe Evaluation (Cont.)

Benchmarks (Cont.)

bonnie++
  http://www.coker.com.au/bonnie++/
  Single and synchronized multi-process tests run
  Primarily interested in sequential I/O test results
    Multiple processes performing sequential I/O creates a somewhat randomized workload
    Likely a good simulation of the workload in our batch processing environment, in particular because our batch jobs often use a stage in/out to/from local scratch I/O model

IOzone
  http://www.iozone.org
  Primarily interested in random I/O performance

Pgbench
  Interested in testing NVMe as the backend storage for PostgreSQL, potentially for use with dCache


Bonnie++ - Single Process

MB/s          Sequential Write (block)   Sequential Read (block)
NVMe          960                        778
SAS           192                        237
Crucial SSD   437                        478
Samsung SSD   450                        615

bonnie++ with default options


Bonnie++ - 32 Processes

MB/s          Random Write (block)   Random Read (block)
NVMe          952                    2422
SAS           182                    167
Crucial SSD   425                    566
Samsung SSD   452                    609

32 synchronized parallel bonnie++ processes: bonnie++ -y -r 2560 -s 8120
Parallelism creates a randomized workload
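As a rough illustration of the parallel run, the sketch below builds the 32 command lines from the options quoted on the slide and totals the file data they would write. The scratch path and per-worker directory layout are hypothetical (the slides do not say which directories were used), and `-d` is bonnie++'s test-directory option. Nothing is executed here; the commands are only constructed.

```python
import shlex

N_PROCESSES = 32   # matches the 32 logical cores in the test system
RAM_MB = 2560      # -r value from the slide
FILE_MB = 8120     # -s value from the slide


def bonnie_commands(n: int, scratch_root: str) -> list:
    """Build argv lists for n synchronized bonnie++ workers.

    scratch_root and the worker directory names are hypothetical.
    """
    return [
        ["bonnie++", "-y", "-r", str(RAM_MB), "-s", str(FILE_MB),
         "-d", f"{scratch_root}/worker{i:02d}"]
        for i in range(n)
    ]


cmds = bonnie_commands(N_PROCESSES, "/mnt/nvme/scratch")
print(shlex.join(cmds[0]))

# Total file data across all workers, to compare against the 64 GB of RAM.
total_mb = N_PROCESSES * FILE_MB
print(f"Total file data: {total_mb} MB")
```

The combined data set is several times larger than the system's 64 GB of memory, so the page cache cannot absorb the workload. The bonnie++ man page also describes a separate `-p` invocation that creates the semaphore the `-y` workers wait on; that step is left out of this sketch.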


IOzone – Random Write


IOzone – Random Read


Pgbench

pgbench -M prepared -T 1200

        TPS
NVMe    3222
SAS     311

Postgres server benchmark - database stored on the NVMe device
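pgbench reports its result on summary lines that begin with `tps = `. The sketch below shows one way such a line could be parsed and how the two chart values (3222 and 311 TPS) compare. The sample output string is made up for illustration and is not the study's actual output; the function name is likewise hypothetical.

```python
import re

# Matches the numeric value on a pgbench summary line such as "tps = 85.18 (...)".
TPS_LINE = re.compile(r"^tps = ([0-9.]+)", re.MULTILINE)


def parse_tps(pgbench_output: str) -> float:
    """Return the first 'tps = ...' value found in pgbench's summary."""
    match = TPS_LINE.search(pgbench_output)
    if match is None:
        raise ValueError("no 'tps = ' line found")
    return float(match.group(1))


# Hypothetical sample, shaped like pgbench's summary output.
sample = (
    "number of transactions actually processed: 3866400\n"
    "tps = 3222.000000 (including connections establishing)\n"
)
print(parse_tps(sample))

# Chart values from the slide, in transactions per second.
NVME_TPS = 3222
SAS_TPS = 311
ratio = NVME_TPS / SAS_TPS
print(f"NVMe / SAS: {ratio:.1f}x")
```

With the slide's numbers the ratio comes out a little above ten, consistent with the order-of-magnitude gap visible in the chart.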


NVMe Filesystem Performance Comparison

MB/s    Write   Read
EXT4    952     2422
EXT3    931     2516
XFS     514     2595

32 synchronized parallel bonnie++ processes: bonnie++ -y -r 2560 -s 8120


Conclusions

NVMe drives eliminate latency and bandwidth limitations imposed by SAS/SATA storage controllers which are optimized for traditional rotating media

NVMe technology available today can provide impressive I/O performance
  Typically saw a 100% or more performance improvement for NVMe over traditional SSDs in our sequential and random I/O benchmarks
  It was not uncommon to see the NVMe drive perform ten times better than the SAS drive benchmarked, particularly with smaller record sizes, and with random I/O tests

High density (3 TB+) NVMe drives are available, making this a viable storage alternative to traditional drives and SSDs

Unfortunately, still a relatively expensive option
  May change in the future

As this commoditized hardware becomes increasingly commonplace, expect the cost to drop
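The conclusions mix two ways of stating a gain: a percentage improvement ("100% or more") and a multiplicative factor ("ten times better"). The short sketch below converts between the two, since they are easy to confuse. The function names are illustrative and the inputs are round example figures, not additional measurements.

```python
def percent_improvement(new: float, baseline: float) -> float:
    """Improvement of `new` over `baseline`, as a percentage of the baseline."""
    return (new - baseline) / baseline * 100.0


def factor_from_percent(percent: float) -> float:
    """Convert a percentage improvement into a multiplicative factor."""
    return 1.0 + percent / 100.0


def percent_from_factor(factor: float) -> float:
    """Convert a multiplicative factor into a percentage improvement."""
    return (factor - 1.0) * 100.0


# A 100% improvement means twice the baseline throughput.
print(factor_from_percent(100.0))   # 2.0

# "Ten times better" corresponds to a 900% improvement.
print(percent_from_factor(10.0))    # 900.0

# Example: 1000 MB/s against a 500 MB/s baseline is a 100% improvement.
print(percent_improvement(1000.0, 500.0))   # 100.0
```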


Acknowledgments

Thanks to Shawn Hoose, a student intern at RACF this summer, for his work in conducting this study

Thanks to Dell, Costin Caramarcu (RACF), Tejas Rao (RACF), and Alexandr Zaytsev (RACF) for providing evaluation equipment and additional assistance