Introduction

Upload: cruz

Post on 06-Jan-2016

Page 1: Introduction

Introduction

Page 2: Introduction

Readings

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, L. Barroso and U. Hölzle

Page 3: Introduction

Introduction

Increasingly, our applications are moving from the PC to the Internet, e.g.:
• Email: Gmail, Yahoo
• Photo management: Picasa, Kodak, Sutterbug
• Word processing: Google Apps

Why?
• Less work on the user’s part
• Potentially lower cost for the user

Page 4: Introduction

Introduction

Supporting this move from the PC to the “Internet” requires a large number of servers, along with storage, network support, etc.
• Companies like Amazon, Google, and eBay run data centers with tens of thousands of machines

For users to trust these systems, a number of issues must be addressed, e.g., failure handling

Page 5: Introduction

Architecture

Page 6: Introduction

Architecture

Common elements include:
• Low-end servers, typically in a blade enclosure within a rack
• A local Ethernet switch (rack switch) that interconnects the servers within a rack
• A number of uplink connections from the rack switch to one or more cluster-level (data-center-level) Ethernet switches

Page 7: Introduction

Storage

Disks can be connected directly to each server and managed by a global distributed file system (e.g., Google’s GFS); or

Disks can be part of Network Attached Storage (NAS) devices that are directly connected to the cluster level switch

Page 8: Introduction

Storage

NAS
• Reliability is provided by the device through replication and error correction

Server node (directly attached disks)
• Needs a fault-tolerant file system at the cluster level, which is not trivial to implement
  • Writes are slower
• Potentially lower cost than using NAS
  • Disks can be the same as those in your PC

Page 9: Introduction

Storage Hierarchy

Page 10: Introduction

Networking Fabric

Tradeoffs between speed, scale, and cost
• Intra-rack connectivity is relatively inexpensive to achieve
• Network switches with high port counts have a different price structure than the switches used for rack connectivity
  • Much more expensive
• Network switches with few ports require programmers to be aware of the scarce bandwidth

Page 11: Introduction

Latency, Bandwidth, Capacity

It is much faster for an application to retrieve data from local disks than from off-rack disks, but applications often need more storage than a local disk provides (e.g., Google search)

How is this dealt with efficiently?

Page 12: Introduction

Power Usage

Peak power usage measured at one of Google’s data centers:
• CPUs: 33%
• DRAM: 30%
• Disks: 10%
• Networking: 5%
• Other: 22%

Page 13: Introduction

Handling Failures

The high number of components almost guarantees failures
• Disk drives can exhibit annualized failure rates higher than 4%
• Lots of restarts needed

This issue has received a good deal of attention
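To get a feel for why failures are almost guaranteed, the annualized failure rate can be turned into an expected number of disk failures. The sketch below is only illustrative arithmetic: the fleet size (10,000 servers with 2 disks each) is a hypothetical figure, and the function name is made up for this example. Only the 4% rate comes from the slide.

```python
def expected_disk_failures(num_disks: int, afr: float) -> tuple[float, float]:
    """Return (expected failures per year, expected failures per day).

    afr is the annualized failure rate as a fraction (0.04 means 4%).
    """
    per_year = num_disks * afr
    per_day = per_year / 365
    return per_year, per_day


# Hypothetical fleet: 10,000 servers with 2 disks each (assumed, not from the slides).
per_year, per_day = expected_disk_failures(num_disks=20_000, afr=0.04)
print(f"{per_year:.0f} failures/year, about {per_day:.1f} per day")
```

Under these assumed numbers, a disk fails somewhere in the data center every day or so, which is why failure handling has to be automated rather than treated as an exception.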

Page 14: Introduction

Request Handling

With lots of disks, how is data placed so that it can be found?

Let’s look at Amazon
• Partition the data so that groups of servers handle just a part of the inventory (or any other data)
• The router needs to be able to extract keys from the request
  • Hashing is one strategy for doing this
  • Based on the key, you then determine the server to handle the request
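The routing steps above (extract a key, hash it, pick the server group) can be sketched as follows. This is a minimal illustration, not Amazon's actual mechanism: the request shape, the `item_id` field, the server names, and the function names are all hypothetical, and simple modulo hashing is just one of several possible strategies.

```python
import hashlib


def extract_key(request: dict) -> str:
    # Hypothetical request shape: the router pulls out the field used for partitioning.
    return request["item_id"]


def pick_partition(key: str, num_partitions: int) -> int:
    # Use a stable hash so every router maps a key the same way.
    # (Python's built-in hash() for strings is salted per process, so it is avoided here.)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def route(request: dict, partitions: list[list[str]]) -> list[str]:
    """Return the group of servers responsible for this request's key."""
    index = pick_partition(extract_key(request), len(partitions))
    return partitions[index]


# Three hypothetical server groups, each handling one part of the inventory.
partitions = [["srv-a1", "srv-a2"], ["srv-b1", "srv-b2"], ["srv-c1", "srv-c2"]]
print(route({"item_id": "book-12345"}, partitions))
```

One caveat of plain modulo hashing is that changing the number of partitions remaps most keys, so a real system would likely need a scheme that moves less data when servers are added or removed.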

Page 15: Introduction

Online Evolution

Internet-time implies constant change
• Need acceptable quality

Three approaches to managing upgrades
• Fast reboot: cluster at a time
  • Minimize yield impact
• Rolling upgrade: node at a time
  • Versions must be compatible
• Big flip: half the cluster at a time
  • Reserved for complex changes

In all cases: use a staging area and be prepared to revert
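The difference between the three approaches comes down to how many nodes are taken out of service together. The sketch below groups a cluster's nodes into upgrade "waves" for each strategy. The function name, the strategy strings, and the node names are hypothetical, and the sketch deliberately leaves out staging, health checks, and rollback.

```python
def upgrade_waves(nodes: list[str], strategy: str) -> list[list[str]]:
    """Group nodes into waves; all nodes in a wave are upgraded together."""
    if strategy == "fast_reboot":
        # Whole cluster at once: a single wave containing every node.
        return [list(nodes)]
    if strategy == "rolling":
        # One node at a time: old and new versions run side by side,
        # which is why the versions must be compatible.
        return [[node] for node in nodes]
    if strategy == "big_flip":
        # Half the cluster at a time: two waves.
        half = (len(nodes) + 1) // 2
        return [nodes[:half], nodes[half:]]
    raise ValueError(f"unknown strategy: {strategy}")


cluster = ["n1", "n2", "n3", "n4"]  # hypothetical node names
for name in ("fast_reboot", "rolling", "big_flip"):
    print(name, upgrade_waves(cluster, name))
```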

Page 16: Introduction

Summary

We have briefly discussed a high-level view of data centers

In this course we will discuss how Google, Amazon, etc., deal with some of the implications of these architectures