The Case for Benchmarking Control Operations in Cloud Native Storage

Alex Merenstein¹, Vasily Tarasov², Ali Anwar², Deepavali Bhagwat², Lukas Rupprecht², Dimitris Skourtis², and Erez Zadok¹

¹Stony Brook University; ²IBM Research - Almaden

12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage ’20)
July 14, 2020



Outline

● Introduction
● Storage Control Operations
● Impact of Storage Control Operations
● Benchmark Design
● Conclusion


New Trends in Clouds

● Cloud native software
  ◆ Often container based
  ◆ Microservice architectures
  ◆ Frequent scaling and updates
● Cloud native storage
  ◆ Used by applications, not systems
  ◆ Automated management
  ◆ Container Storage Interface (CSI) provides standard interface

https://landscape.cncf.io


Benchmarking’s Blind Spot

● Storage challenges
  ◆ Choosing a storage provider
  ◆ Evaluating different storage configurations
● Current benchmarks (e.g., fio¹, pgbench², NoSQLBench³)
  ◆ Cover I/O operations
  ◆ Cover metadata operations
  ◆ Do not generate storage control operations

1. https://fio.readthedocs.io/en/latest/index.html
2. https://www.postgresql.org/docs/current/pgbench.html
3. https://www.datastax.com/blog/2020/03/nosqlbench


Storage Control Operations

● Storage control operations
  ◆ Creating volumes, attaching volumes, snapshotting, resizing, etc.
  ◆ Volume: a single unit of storage provisioned by a storage provider
● More frequent in cloud native environments
● Existing benchmarks do not generate storage control operations


Increasing Number of Storage Control Operations

● Some companies have increased deployments from 2–3 per week to 150 per day¹
● On one platform, 54% of containers ran for ≤5 minutes and hosts ran a median of 30 containers²
  ◆ On a 20-node cluster, that results in a rate of about one container creation per second
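The one-per-second figure can be checked with back-of-the-envelope arithmetic. The sketch below is only one way to arrive at it: it assumes that every short-lived container is replaced as soon as it exits and that each one lives for the full 5 minutes, so it yields a lower bound. The function name and these assumptions are illustrative and do not come from the cited report.

```python
def container_creation_rate(nodes, containers_per_node, short_lived_fraction, lifetime_s):
    """Estimate steady-state container creations per second.

    Assumption (hypothetical): each short-lived container is replaced as soon
    as it exits; churn from longer-lived containers is ignored.
    """
    short_lived = nodes * containers_per_node * short_lived_fraction
    return short_lived / lifetime_s


# Figures from the slide: 20 nodes, a median of 30 containers per host,
# 54% of containers running for at most 5 minutes (300 seconds).
rate = container_creation_rate(20, 30, 0.54, 300)
print(f"{rate:.2f} container creations per second")
```

With these inputs, 324 of the 600 containers turn over every 300 seconds, which is about 1.08 creations per second. If each new container needs storage, every creation also triggers storage control operations.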

1. https://www.weave.works/technologies/going-cloud-native-6-essential-things-you-need-to-know
2. https://sysdig.com/blog/sysdig-2019-container-usage-report/


User Creates Container Requiring Storage

[Diagram: Container, Storage Provider, Node (VM)]


Create Volume (Storage Control Operation #1)

[Diagram: Container, Storage Provider, Node (VM), Volume; storage control operation #1 marked]


Container Scheduled on Node

[Diagram: Storage Provider, Node (VM), Container, Volume; storage control operation #1 marked]


Volume Mounted on Node (Storage Control Operation #2)

[Diagram: Storage Provider, Node (VM), Container, Volume, Volume Mount; storage control operations #1 and #2 marked]


Volume Attached to Container (Storage Control Operation #3)

[Diagram: Storage Provider, Node (VM), Container, Volume, Volume Mount; storage control operations #1, #2, and #3 marked]
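The sequence walked through on the preceding slides can be sketched as a small model. This is a toy illustration, not code from the talk: the class names, the function name, and the operation labels are all hypothetical, and no real storage provider or orchestrator is contacted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Volume:
    name: str
    mounted_on: Optional[str] = None    # node the volume is mounted on
    attached_to: Optional[str] = None   # container the volume is attached to


@dataclass
class ControlOpLog:
    ops: List[str] = field(default_factory=list)

    def record(self, op: str) -> None:
        self.ops.append(op)


def start_container_with_storage(container: str, node: str, log: ControlOpLog) -> Volume:
    """Model the three storage control operations behind one container start."""
    # Storage control operation #1: the storage provider creates a volume.
    volume = Volume(name=f"vol-for-{container}")
    log.record("create_volume")

    # The container is scheduled on a node (not a storage control operation).

    # Storage control operation #2: the volume is mounted on that node.
    volume.mounted_on = node
    log.record("mount_on_node")

    # Storage control operation #3: the volume is attached to the container.
    volume.attached_to = container
    log.record("attach_to_container")
    return volume


log = ControlOpLog()
vol = start_container_with_storage("web-1", "node-a", log)
print(log.ops)
```

The point of the model is the count: a single container that requires storage produces three storage control operations, so container churn multiplies directly into control-operation load.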


Impact of Storage Control Operations

● Experiment 1: creating and attaching volumes
  ◆ Do storage providers have different performance characteristics when executing these operations?
● Experiment 2: snapshots with concurrent workload
  ◆ Can storage control operations impact other workloads?
  ◆ Is the level of impact different across different storage providers?


Experimental Setup

● Kubernetes with three master nodes in a high-availability configuration and two worker nodes
● Three different-by-design storage providers

[Diagram: three Kubernetes master nodes, two Kubernetes worker nodes; storage provider label: Gluster]


Experiment 1: Volume Creation and Attachment


[Plot annotations]

● Median > 2× higher
● Median ~2× lower


Performance does differ between storage providers


Experiment 2: Snapshotting


[Plot annotations]

● P99.9 latency 3.3× higher with 20 snapshots
● P99.9 latency 24× higher with 20 snapshots
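A tail-latency comparison of this kind could be computed along the following lines. This is a minimal sketch, not the authors' measurement code: it uses a nearest-rank percentile, the function names are hypothetical, and the latency samples are synthetic values chosen only to exercise the code.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # round() guards against floating-point noise before taking the ceiling.
    rank = math.ceil(round(p / 100 * len(ordered), 6))
    rank = min(max(rank, 1), len(ordered))
    return ordered[rank - 1]


def tail_slowdown(baseline, with_snapshots, p=99.9):
    """Ratio of the p-th percentile latency with snapshots to the baseline."""
    return percentile(with_snapshots, p) / percentile(baseline, p)


# Synthetic latencies in milliseconds (NOT measurements from the talk).
baseline = [1.0] * 998 + [2.0, 50.0]
with_snapshots = [1.0] * 998 + [8.0, 200.0]

slowdown = tail_slowdown(baseline, with_snapshots)
print(f"P99.9 slowdown: {slowdown:.1f}x")
```

Comparing a high percentile rather than the mean matters here: in the synthetic data the bulk of requests is unchanged, and only the tail reveals the interference from the concurrent control operation.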


● Storage control operations can impact other workloads
● Impact varies depending on storage provider


Benchmark Design Requirements

Workload
1. Create I/O and storage control workloads
2. Specify complex & realistic storage control workloads
3. Use existing tools for I/O workloads
4. Include QoS targets

Usability
1. Enable reproducibility
2. Be easy to use

Result Measurement & Visualization
1. Measurement should be decoupled from I/O generation
2. Results should be aggregated in a clear, actionable manner
3. Metrics collection should have low overhead


Proposed Design

1. Benchmark Controller: creates I/O workload containers and executes control operations


2. User creates Benchmark object
3. Benchmark objects: custom object type, created by users to define a benchmark


4. I/O containers: created by Benchmark Controller to run I/O workload
5. Container image repository: I/O workloads can be created using existing I/O benchmarking tools such as fio or filebench


6. The Benchmark Controller executes control operation workloads by acting directly on PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs)


7. The volumes used by the benchmark are provisioned by the storage provider specified in the Benchmark object


8. Results and metrics are collected and can be analyzed and visualized using tools such as ELK or Grafana
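To make steps 3, 6, and 7 more concrete, the sketch below shows what a Benchmark object might carry and how a controller could turn it into PersistentVolumeClaim manifests. The talk does not define a schema, so `BenchmarkSpec`, its fields, and the helper functions are all hypothetical. The sketch also assumes the storage provider is selected through a Kubernetes StorageClass name. Only the PVC manifest layout itself follows the standard Kubernetes API; nothing is submitted to a cluster.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BenchmarkSpec:
    """Hypothetical stand-in for the Benchmark object; all field names are illustrative."""
    name: str
    storage_class: str        # assumed to select the storage provider (step 7)
    io_image: str             # container image running the I/O workload (steps 4 and 5)
    volume_size: str = "1Gi"
    volume_count: int = 1


def pvc_manifest(spec: BenchmarkSpec, index: int) -> Dict:
    """Build one PersistentVolumeClaim manifest a controller could submit (step 6)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": f"{spec.name}-vol-{index}"},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": spec.storage_class,
            "resources": {"requests": {"storage": spec.volume_size}},
        },
    }


def plan_volume_creation(spec: BenchmarkSpec) -> List[Dict]:
    """List the claims for a volume-creation control workload."""
    return [pvc_manifest(spec, i) for i in range(spec.volume_count)]


# Hypothetical benchmark definition; the class and image names are placeholders.
spec = BenchmarkSpec(name="create-attach", storage_class="example-sc",
                     io_image="example/fio:latest", volume_count=3)
claims = plan_volume_creation(spec)
print([c["metadata"]["name"] for c in claims])
```

Keeping the provider choice in a single field means the same benchmark definition can be rerun against a different storage provider by changing one value, which fits the reproducibility requirement.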


Conclusion

● A new benchmark is needed to support cloud native workflows
● Proposed nine requirements and an initial design for such a benchmark
● Looking for community input, especially for storage control operation rates


Thank You

Q&A

Contact: [email protected]