1 thesis proposal zachary kurmas (v4.0– 24 april 03)

1

Thesis Proposal

Zachary Kurmas

(v4.0– 24 April 03)

2

Outline

• Motivation and discussion of problem
• Overview of solution
• Contributions
• Proposal
• Future Work
• Timeline
• Details of solution (time permitting)

3

Typical disk array

[Figure: typical disk array. Labels: Hosts, Fibre Channel, Controller A (cache), Controller B (cache), SCSI buses.]

4

Motivation

• Potential storage system designs and automated configuration algorithms must be evaluated with respect to some set of workloads.
  • Ideally, these workloads are actual production workloads.
  • This is usually impossible.
• Two alternatives
  • Replay traces of production workloads
  • Construct and use synthetic workloads

5

Problem

• The currently available workload traces and synthetic workloads are not sufficient
  • Can’t get enough of the right traces
    • Companies don’t like to give them out
    • No traces of future workloads
  • Quality of synthetic workloads too low
    • High-quality synthetic workload must share certain key properties with production workload
    • These properties are currently found by trial and error and domain expertise

6

Solution

• Improve quality of synthetic I/O workloads
• Automatically determine what properties a synthetic workload must share with the production workload on which it is based

[Figure: a Production Workload trace, e.g. (R,1024,120932,124) (W,8192,120834,126) (W,8192,120844,127) ..., a List of Properties, and a Synthetic Workload trace, compared by CDF of Response Time.]

7

Contributions

• Prototype system to automatically determine what properties a synthetic workload must share with the production workload on which it is based
  • Library of possible properties and corresponding generation techniques
  • Algorithm for searching through library

• Examination of tradeoffs between size and complexity of properties and quality of synthetic workloads

• Evaluation of whether improved synthetic workloads enable us to make better design decisions

• Exploration of workload scaling using identified properties

16

Review of problem

• Not enough input for evaluations of storage systems
  • Too few workload traces
  • Traces not always right answer
• Synthetic workloads are not practical
  • Don’t know precisely what makes synthetic workloads representative
  • Trial-and-error too cumbersome
  • Can’t maintain every conceivable attribute-value

17

Outline

• Discussion of problem
• Overview of solution
• Contributions
• Proposal
• Future Work
• Timeline
• Details of solution (time permitting)

18

Attributes / Attribute-Values

• An attribute is the name or description of a property
  • Read percentage
  • Mean interarrival time
• An attribute-value is the actual value of the measurement (i.e., the actual property)
  • Read percentage of 67%
  • Mean interarrival time of 0.8 ms
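As a small illustration of the distinction, the sketch below computes two attribute-values from a toy trace. The tuple layout (operation, request size, location, arrival time in ms) follows the trace excerpts shown on these slides; the function names and the toy trace itself are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Toy trace (made up): (operation, request size, location, arrival time in ms).
trace = [
    ("R", 1024, 42912, 2),
    ("W", 8192, 12493, 10),
    ("W", 2048, 20938, 12),
    ("R", 2048, 43943, 15),
]

def read_percentage(trace):
    """Attribute 'read percentage': share of requests that are reads."""
    reads = sum(1 for op, _, _, _ in trace if op == "R")
    return 100.0 * reads / len(trace)

def mean_interarrival_time(trace):
    """Attribute 'mean interarrival time': mean gap between consecutive arrivals."""
    times = sorted(t for _, _, _, t in trace)
    return mean(b - a for a, b in zip(times, times[1:]))

# The attribute is the name (the dictionary key); the attribute-value is the
# measured number for this particular trace.
attribute_values = {
    "read percentage": read_percentage(trace),
    "mean interarrival time": mean_interarrival_time(trace),
}
```

Both functions depend only on the trace, which matches the requirement that attribute-values be properties of the workload alone.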

19

Requirements of attributes

• Attribute-values are properties of only the workload.
  • Response time is not an attribute because its attribute-value depends on both workload and disk array
• Attributes must be quantifiable
  • “Locality” and “burstiness” are qualitative concepts. “runCount” and “Hurst parameter” are attributes

20

The Distiller

• Automate process of choosing necessary attribute-values

• Input: workload trace and large set of attributes

• Output: set of attributes that identifies those attribute-values that synthetic workload must share with target

• Helps identify type of any necessary attribute missing from library
  • (if no known set of attribute-values leads to a representative synthetic workload)

21

Basic Idea

• Begin with simple attribute-values
  • (distributions of I/O request parameters)
• Iteratively add attribute-values until evaluation of original and synthetic workloads is similar

[Figure: a Production Workload trace, e.g. (R,1024,120932,124) (W,8192,120834,126) (W,8192,120844,127) ..., an Attribute-value List, and a Synthetic Workload trace.]
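The iterative idea can be sketched as a simple loop. This is only a minimal sketch under stated assumptions, not the Distiller's actual search: attributes are tried in a fixed order, the evaluation is reduced to a single number, and `synthesize_and_evaluate`, `target_score`, and the toy effect sizes are all hypothetical placeholders.

```python
def distill(candidates, synthesize_and_evaluate, target_score, tolerance):
    """Add attributes one at a time until the synthetic workload's
    evaluation is within `tolerance` of the target workload's evaluation."""
    chosen = []
    for attribute in candidates:
        if abs(synthesize_and_evaluate(chosen) - target_score) <= tolerance:
            break  # evaluations are similar enough; stop adding attributes
        chosen.append(attribute)
    return chosen

# Toy stand-in for "generate a synthetic workload and evaluate it":
# each attribute removes a made-up amount of error from the score.
toy_effect = {
    "request size distribution": 3.0,
    "location distribution": 2.0,
    "arrival time distribution": 1.0,
    "Hurst parameter": 0.5,
}

def toy_evaluate(chosen):
    return 10.0 - sum(toy_effect[a] for a in chosen)

chosen = distill(list(toy_effect), toy_evaluate, target_score=4.0, tolerance=0.5)
```

In practice each call to the evaluation step is expensive, so the order in which attributes are tried matters far more than this sketch suggests.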


23

Challenges: Which attribute to add

• Each iteration takes many minutes; therefore, we must limit the number of iterations

• Addition of necessary attribute-values does not always result immediately in improvement

• Fewer attributes better
  • Smaller compact representation
  • Less complex generation techniques
  • Generation techniques for some attributes can interfere with each other
    • E.g., distribution of location and jump distance

26

Outline

• Discussion of problem
• Overview of solution
• Contributions
  • The Distiller itself
  • An analysis of the key attributes for many different workloads
  • An analysis of the potential uses of synthetic workloads
• Proposal
• Future Work
• Timeline
• Details of solution (time permitting)

27

The Distiller itself

• Distiller makes generating representative synthetic workloads practical
  • Encourage companies to make evaluation workloads available
  • More accurate / relevant research results
  • Basis for “what-if” evaluations
    • Future estimations
    • Stability estimates
  • Improved relevance of old evaluation workloads (possibly)

• Distiller provides library of attributes and corresponding generation techniques

28

Analysis of key attributes for many workloads

• Attribute-values that lead to representative synthetic workloads describe what makes the workload behave like or unlike other workloads
  • The “essence” of the workload

• Possible to study essence of many different (workload, storage system) pairs and look for interesting trends or patterns

29

Potential benefits of analyses

• Help us learn about how workloads and storage systems interact
  • Attribute-values contain all info necessary to predict behavior
  • Focus researchers’ attention on concentrated information
• Help development of analytical models
• Identify potential areas of improvement

30

Outline

• Discussion of problem
• Overview of solution
• Contributions
• Proposal
  • Evaluate the correctness of the Distiller
  • Examine the attributes chosen for different workload/storage system pairs
  • Show that the resulting synthetic workloads are useful
• Future Work
• Timeline
• Details of solution (time permitting)

31

Evaluate correctness of Distiller

• Show that the Distiller works for:
  • One definition of “representative”: response time distribution
  • Up to three storage systems: FC-60, FC-30, and JBOD (Just a Bunch Of Disks)
  • Several artificial workloads
  • Five production workloads: Open Mail, TPC-C, TPC-H, file system trace
• Stopping Condition:
  • Distiller can correctly identify key attributes for artificial workloads

32

Definition of “representative”

• Design decisions are almost always based on performance. Thus, matching response time distributions should be a stronger condition than most design decisions require

• Distribution of response time is a stronger condition than mean response time

• Many decisions choose between competing configurations. Showing applicability across storage system configurations is my next evaluation

• Workloads are considered representative when the RMS difference between distributions of response time is sufficiently small

[Figure: number of I/Os versus seconds, with examples labeled “representative” and “Not representative”.]
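One way to make the RMS criterion concrete is sketched below. The slides do not say exactly how the RMS difference is computed, so this sketch assumes empirical CDFs compared at a fixed list of response-time points; the function names and the sample values are hypothetical.

```python
import bisect
import math

def cdf_at(sorted_samples, x):
    """Empirical CDF: fraction of samples less than or equal to x."""
    return bisect.bisect_right(sorted_samples, x) / len(sorted_samples)

def rms_cdf_difference(times_a, times_b, points):
    """Root-mean-square difference between two empirical response-time
    CDFs, evaluated at the given points (an assumed choice of grid)."""
    a, b = sorted(times_a), sorted(times_b)
    squared = [(cdf_at(a, x) - cdf_at(b, x)) ** 2 for x in points]
    return math.sqrt(sum(squared) / len(squared))

# Made-up response times (seconds) for two workloads.
original = [1, 2, 3, 4]
synthetic = [2, 3, 4, 5]
difference = rms_cdf_difference(original, synthetic, points=[1, 2, 3, 4, 5])
```

A synthetic workload would then be called representative when `difference` falls below a chosen threshold; the threshold itself is a separate decision not fixed by this sketch.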

38

Outline


• Demonstrate that the Distiller works
• Examine the attributes chosen for different workload/storage system pairs



39

Learn about attributes (1)

• Determine if attributes depend on the workload.
  • My guess: Yes. Locality attributes are probably different for write-only workload on FC-60
• Determine if attributes depend on the storage system.
  • Answer: They must. A storage system with constant 2 min response time has no important attributes
  • Better objective: Compare attributes chosen for similar storage systems / system configurations.
    • How much overlap?

40


• Determine which set of attributes does best overall (for a given storage system configuration)
  • average over all workloads
  • best worst-case
  • Can either of these be used in practice for all workloads?
• Attempt to find a single set of attributes that works for almost all workloads (e.g., take union of all chosen attributes)
  • Examine complexity (e.g., number of attributes) of such a set

41


• Examine changes in attributes and attribute-values over time.
  • Compare traces of a file system taken in 1992, 1996, 1999, and 2002.
  • Attempt to develop scaling rules.

• Examine tradeoffs between accuracy and complexity.

• Attempt to assign a “percent contribution” to each attribute and/or attribute group.

42

Outline


• Demonstrate that the Distiller works
• Examine the attributes chosen for different workload/storage system pairs



43

Apply to real life

• Show that synthetic workloads can be used to make design decisions

• Show that currently available traces are not adequate

• Show usefulness of “knobs”

44

Synthetic workloads useful

• Show that synthetic workloads can be used in place of real workloads to make simple design decisions
  • Cache size
  • Prefetch length
  • High-water mark of write-back cache

• Complex design decisions are the basis for entire Ph.D. theses. Can’t practically reproduce at Tech.

• Use Pantheon disk simulator to simulate effects of changing above parameters

45

Synthetic workloads useful (2)

• Take production workload trace
  • Simulate performance given different prefetch lengths
  • Choose best
• Take synthetic workload based on production workload
  • Simulate performance given different prefetch lengths
  • Compare best to best for production workload
• For cache size, find best performance/$ mark.
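The comparison described above can be sketched as follows. The simulator is not modeled here; the dictionaries stand in for its output, and every number in them is a placeholder for illustration, not a measurement. The units (prefetch length in KB, mean response time in ms) and the name `best_setting` are likewise assumptions.

```python
def best_setting(results):
    """Return the parameter setting with the lowest mean response time."""
    return min(results, key=results.get)

# Placeholder numbers for illustration only, not measurements:
# prefetch length (KB) -> simulated mean response time (ms).
production_results = {0: 9.0, 32: 6.5, 64: 6.0, 128: 6.2}
synthetic_results = {0: 9.4, 32: 6.9, 64: 6.3, 128: 6.6}

best_for_production = best_setting(production_results)
best_for_synthetic = best_setting(synthetic_results)

# The synthetic workload is useful for this decision if it leads to the
# same choice as the production workload.
same_decision = best_for_production == best_for_synthetic
```

The point of the check is that the synthetic workload need not reproduce the exact response times, only the design decision that follows from them.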

47

Show available traces inadequate

• Use Pantheon disk simulator to show that using the cello92 and cello02 traces to evaluate simple design decisions results in different answers.

• From this we infer that using cello92 traces to justify more complex design decisions also produces incorrect answer

48

Turning “knobs” useful

• Show that turning “knobs” of compact representation is better than ad-hoc modifications to workload traces
  • Show that turning arrival time knob is better than contracting interarrival times
  • Show that turning request size knob is better than ad-hoc doubling of request size and location values.

• Evaluate turning of knobs versus removing ½ of cello I/Os based on process ID

49

Future Work

• Optimality
  • Find “smallest” set of attributes per workload (e.g., set of attributes with smallest compact representation)
  • Find smallest set of attributes per storage system (if possible)

• Use chosen attributes to develop analytical model of performance
  • Formula for performance, not simulation

50

Timeline

• April-June: Run Distiller on many workloads
  • Submit results to MASCOTS
• June-July: Analyze changes over different workloads / storage systems.
  • Submit results to CMG conference
• July-August
  • Find best overall set of attributes. Find best worst-case
• September-October: Attempt to develop set of attributes that works for all workloads on a given storage system
  • Submit results to SIGMETRICS and/or FAST
• November-December: Evaluate different what-if scenarios.
• February 2004: Defense
• January 2004-February 2004: write
• March 2004-April 2004: interview
• May 2004-July 2004: write
• August 2004: graduate

51

Outline

• Discussion of problem
• Overview of solution
• Contributions
• Proposal
• Future Work
• Timeline
• Details of solution (time permitting)

52

Generating Synthetic Workload

• To generate synthetic workload, randomly choose value for each element in table

• Attribute-values put restrictions on values chosen

• Adding attribute-values reduces the difference between synthetic and production workloads

Read/Write  Request Size  Location  Arrival Time
R           1024          42912     10
W           8192          12493     12
W           2048          20938     15
R           2048          43943     2
W           8192          98238     11
W           8192          76232     23
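A minimal sketch of this generation step is shown below: each column is filled in randomly, and the attribute-values passed in restrict the choices. The specific distributions (uniform locations, exponential interarrival times) and all parameter names are assumptions made for the example; they are not taken from the slides.

```python
import random

def generate(n, read_fraction, sizes, max_location, mean_interarrival_ms, seed=0):
    """Generate n synthetic I/O requests as (op, size, location, arrival time).

    Each argument is an attribute-value that constrains one column:
    read_fraction constrains Read/Write, sizes constrains Request Size,
    max_location constrains Location, mean_interarrival_ms constrains
    Arrival Time.
    """
    rng = random.Random(seed)
    arrival = 0.0
    trace = []
    for _ in range(n):
        op = "R" if rng.random() < read_fraction else "W"
        size = rng.choice(sizes)
        location = rng.randrange(max_location)  # assumed uniform
        arrival += rng.expovariate(1.0 / mean_interarrival_ms)  # assumed exponential gaps
        trace.append((op, size, location, arrival))
    return trace

synthetic = generate(
    100,
    read_fraction=0.67,
    sizes=[1024, 2048, 8192],
    max_location=100000,
    mean_interarrival_ms=0.8,
)
```

With only these four attribute-values the columns are independent of each other and have no patterns over time; adding further attribute-values would narrow the random choices and bring the synthetic trace closer to the production one.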

53

Choosing Attributes Wisely

• Challenge
  • Not all attributes useful
  • Some attributes partially redundant
  • Can’t test all attributes
• My Solution
  • Group attributes
  • Evaluate whole groups at once

[Figure: cloud of candidate attributes: Mean Arrival Time; Arrival Time Dist.; Hurst Parameter; COV of Arrival Time; Mean Request Size; Request Size Dist.; Request Size Attrib 3; Request Size Attrib 4; Dist. of Locations; Mean run length; Jump Distance; Proximity Munge; Read/Write ratio; Markov Read/Write; R/W Attrib. #3; R/W Attrib #4; Mean Read Size; Read Rqst. Size Dist.; Mean (R, W) Sizes; (R, W) Size Dists.; D. of (R,W) Locations; Mean R,W run length; R/W Jump Distance; R/W Proximity Munge.]

54

Attribute Groups

• Attributes measure one or more parameters
  • Mean Request Size → Request Size
  • Distribution of Location → Location
  • Burstiness → Interarrival Time
  • Distribution of Read Size → Request Size, Read/Write
• Attributes grouped by parameter(s) measured
  • Location = {mean location, distribution of location, locality, mean jump distance, mean run length, ...}
  • Arrival Time = {mean interarrival time, Markov model of interarrival time, Hurst parameter, etc.}
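The grouping rule can be sketched as a small lookup: attributes that measure the same parameter, or the same set of parameters, land in the same group. The attribute names come from this slide; the dictionary layout and variable names are hypothetical.

```python
from collections import defaultdict

# Which parameter(s) each attribute measures (examples from the slide).
measures = {
    "mean request size": {"Request Size"},
    "distribution of location": {"Location"},
    "mean jump distance": {"Location"},
    "mean interarrival time": {"Arrival Time"},
    "Hurst parameter": {"Arrival Time"},
    "distribution of read size": {"Request Size", "Read/Write"},
}

# Group attributes by the exact set of parameters they measure.
groups = defaultdict(list)
for attribute, parameters in measures.items():
    groups[frozenset(parameters)].append(attribute)

location_group = groups[frozenset({"Location"})]
read_size_group = groups[frozenset({"Request Size", "Read/Write"})]
```

Using the full set of parameters as the key keeps a two-parameter attribute such as distribution of read size out of the single-parameter Request Size group.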

55

Attribute Groups

• Each group corresponds to a column or set of columns
  • Operation Type
  • Request Size
  • {Arrival Time, Location}
• Measures patterns within column(s)

Workload

Read/Write  Request Size  Location  Arrival Time
R           1024          42912     10
W           8192          12493     12
W           2048          20938     15
R           2048          43943     2
W           8192          98238     11
W           8192          76232     23

56

Do I need (more) attributes from the {Arrival Time} group?

• Idea #1: Add “best” attribute from {Arrival Time} and measure improvement
  • Amount of improvement implies potential benefit

[Figure: two workload tables with columns R/W, RS, Loc, AT, labeled “Current Attributes” and “Attributes for Test”. Columns are marked “Current”; in the test workload the AT column is marked “Perfect”.]

57

Problem with idea #1

• Errors involving other parameters can interfere
  • Very random reads can overshadow moderate queuing effects

[Figure: the “Current Attributes” and “Attributes for Test” workload tables, columns R/W, RS, Loc, AT, with columns marked “Current” or “Perfect”.]

58

Idea #2 --- Idea #1 “backwards”

• Look at a synthetic workload in which everything except Arrival Time is “perfect”.
  • Change in performance implies importance of group.

[Figure: two workload tables with columns R/W, RS, Loc, AT. Labels: “Current Arrival Time Attributes” (AT marked “Current”, other columns “Perfect”), “Everything Perfect”, “Production Workload”, “Workload Trace”.]

59

Problem with idea #2

• Workload on left is missing not only {Arrival Time}
  • Also missing {Arrival Time, Request Size}, {Arrival Time, Location} and {Arrival Time, Operation Type}
  • Cause of any difference not clear

[Figure: two workload tables with columns R/W, RS, Loc, AT. Labels: “Current Operation Type Attributes”, “Workload Trace”, “Production Workload”, “Current”.]

60

Solution

• Remove {Arrival Time, Request Size}, {Arrival Time, Location} and {Arrival Time, Operation Type} from workload trace by “rotating” arrival times.
  • Only difference between workloads is {Arrival Time}

[Figure: two workload tables with columns R/W, RS, Loc, AT. Labels: “Current Operation Type Attributes”, ““Rotated” Arrival Time”, “Production Workload”, “Current”.]
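The slides do not define “rotating” precisely. The sketch below shows one plausible reading, stated as an assumption: the sequence of interarrival gaps is shifted cyclically relative to the other three columns, so patterns inside the arrival-time column survive while its alignment with request size, location, and operation type is broken. The function name and the toy trace are hypothetical.

```python
def rotate_arrival_times(trace, shift):
    """Cyclically shift the interarrival gaps by `shift` positions while
    leaving the operation, size, and location columns in place."""
    if not trace:
        return []
    times = [t for _, _, _, t in trace]
    gaps = [b - a for a, b in zip(times, times[1:])]
    k = shift % len(gaps) if gaps else 0
    rotated_gaps = gaps[k:] + gaps[:k]
    new_times = [times[0]]
    for gap in rotated_gaps:
        new_times.append(new_times[-1] + gap)
    return [(op, size, loc, t) for (op, size, loc, _), t in zip(trace, new_times)]

# Toy trace (made up): gaps between arrivals are 1, 2, 3.
toy_trace = [
    ("R", 1024, 42912, 0),
    ("W", 8192, 12493, 1),
    ("W", 2048, 20938, 3),
    ("R", 2048, 43943, 6),
]
rotated = rotate_arrival_times(toy_trace, shift=1)
```

Under this reading the rotated trace has the same set of interarrival gaps and the same first and last arrival times as the original, so single-column arrival-time statistics are unchanged.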

61

Process

• Add {Operation Type} attributes until two workloads below are representative

• Repeat for other attribute groups

[Figure: two workload tables with columns R/W, RS, Loc, AT. Labels: “Current Operation Type Attributes”, ““Rotated” Operation Types”, “Production Workload”, “Current”.]

62

Hints

• If Distiller is unable to find attributes for a particular group, it identifies the deficiency
  • Helps people develop new attributes

• Attributes for multi-parameter groups must be compatible with single-parameter groups
  • {Operation Type, Location} attribute must maintain same properties as chosen {Operation Type} and {Location} parameters

63

End Of Talk

64

Problem:

• Lack of traces for researchers
  • …. Papers use same … traces

• Traces used may or may not be representative of actual production workloads

• When traces not sufficient, really bad synthetic workloads used instead

• We don’t know how to easily produce representative synthetic workload
  • Lack of synthetic workload generation ability suggests lack of understanding of disk array and storage system interactions

65

Proposed Solution

• Improve our ability to generate synthetic workloads

• (Discuss previous work)

66

Workload Characteristic

• Characteristic: A property of a workload (or workload trace) that can be measured.
  • 27% reads
  • Mean request size of 8KB

• Must be property of workload alone
  • Response time is not a workload characteristic, but a characteristic of both workload and storage system

• Must be concrete measurable property.
  • “burstiness” and “locality” too vague.

• Also called “attribute-values”

67

Attributes

• Attribute: The “name” of a characteristic

  Attribute            Characteristic
  eye color            blue eyes
  read percentage      27% reads
  mean request size    mean size: 8KB

• Hence, characteristics also called “attribute-values”

68

How the Distiller works

• Partition known attributes into groups
  • All characteristics in each group contain similar information

• Choose a “complete” set of characteristics from each group.
  • i.e., choose a set of characteristics that contains all the necessary information from the group

69

$21,000 question

• What attribute-values must a synthetic workload share with the production workload in order to be representative?
  • Do the attributes depend on the workload?
  • Do the attributes depend on the storage system?
  • If so, how can we find them easily?
  • If not, what are they?

70

Trivial “solution” doesn’t work

• Trivial solution: Use many attribute-values
• Problems with trivial solution
  • Many attribute-values contain irrelevant info.
  • Many attribute-values contain duplicate info.
  • High-level description too large and complex
    • Negates advantages of synthetic workload
  • Generating synthetic workload too difficult
    • Obvious algorithms for generating attribute-values often interfere with each other.

71

Challenges of Useful Solution

• Solution: Choose small set of “important” attribute-values
  • That is, attribute-values that have the most impact on evaluation

• Challenges
  • Estimating impact of single attribute-value on evaluation
  • Finding small set of attribute-values with “disjoint” information

72

Goal of Distiller


• Given a workload and storage system,
• automatically find a set of attributes, so
• synthetic workloads with the same values
• will have evaluation similar to original.

[Figure: an Original Workload trace, e.g. (R,1024,120932,124) (W,8192,120834,126) (W,8192,120844,127) ..., an Attribute List, and a Synthetic Workload trace.]

73

High-level approach

• Divide and Conquer
  • Partition attributes into groups according to “type of information”
    • Recall some attributes describe similar info.
  • Find a set of attribute-values that contains all the information for a particular group

• (No, it’s not that simple …)

74

Workload

• I/O request has four parameters
  • Read/Write type
  • Request Size
  • Location
  • Arrival Time
    • shown in ms

• Workload is a series of I/O requests
  • Trace can be viewed as a table with four columns

Read/Write  Request Size  Location  Arrival Time
R           1024          42912     10
W           8192          12493     12
W           2048          20938     15
R           2048          43943     2
W           8192          98238     11
W           8192          76232     23

75

“Engineering” Contributions

• Finding representative synthetic workloads becomes practical
  • Basis for evaluations when traces are unavailable
  • Basis for “what-if” evaluations

• Provides basis for workload similarity metric
  • “Table-based” models

• Highlight what workload features a storage system handles best
  • Help configure storage system for new workload

77

Apply to “real life”

• Attempt to generate a representative synthetic workload when no trace exists
  • Choose workload trace and “hide” it
  • Use lessons from previous slides to choose attributes based on similar workloads
  • Compare synthetic workload to trace

• Compare “what-if” workload based on chosen attributes to ad-hoc “what-if” workload
  • Play workload twice as fast
  • “bootstrapping”

• Attempt to find attributes that determine whether to use RAID 1/0 or RAID 5

78

Proposal: Apply to Real Life (2)

• Attempt to build “table-based” model of performance
  • n-dimensional table
  • Each axis represents one attribute
  • Fill element (w, x, y, …) with performance of workload with attribute-values w, x, y, …
• Given new workload
  • compute attribute-values w, x, y, …
  • Value in corresponding table element is estimate of performance
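A table-based model of this kind can be sketched as a dictionary keyed by one bin index per attribute. The slides do not say how continuous attribute-values map to table cells, so the binning by upper edges below is an assumption, as are the class name, the bin edges, and the performance number, which is a placeholder rather than a measurement.

```python
import bisect

class TableModel:
    """n-dimensional performance table with one axis per attribute."""

    def __init__(self, axes):
        # axes: {attribute name: sorted list of bin upper edges}
        self.axes = axes
        self.table = {}

    def _cell(self, attribute_values):
        # Map each attribute-value to the index of its bin on that axis.
        return tuple(
            bisect.bisect_left(edges, attribute_values[name])
            for name, edges in self.axes.items()
        )

    def record(self, attribute_values, performance):
        """Fill the table element for a measured workload."""
        self.table[self._cell(attribute_values)] = performance

    def estimate(self, attribute_values):
        """Look up the estimate for a new workload (None if the cell is empty)."""
        return self.table.get(self._cell(attribute_values))

model = TableModel({
    "read percentage": [25, 50, 75, 100],
    "mean request size": [4096, 16384, 65536],
})
# Placeholder performance value for illustration only.
model.record({"read percentage": 27, "mean request size": 8192}, 6.5)
estimate = model.estimate({"read percentage": 40, "mean request size": 10000})
missing = model.estimate({"read percentage": 90, "mean request size": 8192})
```

The table grows exponentially with the number of attributes, which is one reason a small set of chosen attributes matters for this kind of model.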

79

Problem

• Both alternatives have problems:
  • Workload traces
    • Companies don’t like to give them out
    • Don’t always meet the researcher’s needs
  • Synthetic workloads
    • Difficult and tedious to generate correctly

80

Overview

• Motivation: Storage system design studies and automated management systems require workloads to drive evaluations

• Problem: Neither traces of production workloads nor simple synthetic workloads are sufficient to drive experimental evaluation

• My solution: Improve the quality of synthetic storage workloads by automatically determining what properties synthetic workloads must share with the production workloads they model
