CEG7380 Cloud Computing Lecture 1 Keke Chen

Upload: tracey-sullivan

Post on 11-Jan-2016


TRANSCRIPT

Page 1: CEG7380 Cloud Computing Lecture 1 Keke Chen. Outline  Syllabus Scope of this course Tentative schedule Prerequisites Resources Assignments  Introduction

CEG7380 Cloud Computing

Lecture 1

Keke Chen

Page 2

Outline

• Syllabus
  • Scope of this course
  • Tentative schedule
  • Prerequisites
  • Resources
  • Assignments
• Introduction

Page 3

Scope of this course

• Understand the basic ideas of cloud computing
• Get familiar with
  • Tools
  • Systems
• Get exposed to some research topics

Page 4

Two major parts:

• Processing large data with the cloud
• Scaling up/down web applications with the cloud

Note: some programming parts need self-study

Page 5

Prerequisites

• Some programming skills
  • Java, Python, shell
  • Comfortable with learning new programming frameworks
• Sufficient knowledge about
  • Data structures and databases
  • Operating systems
  • Distributed systems

Page 6

Assignments and Grading

• Reading papers (~3) (10%)
• Some miniprojects (4~5) (60%)
  • Help you master the concepts
  • Learn to use tools and systems
• Self-motivated research projects are strongly encouraged!
• Final exam (20%)
• Class attendance and discussion (10%)

Page 7

Resources

• Updated reference list
• In-house Hadoop cluster
• AWS access
  • Coupon code for each student
• Pilot
  • Submitting reading assignments and projects

Page 8

Tentative Schedule

• Parallel data processing
  • Distributed file systems (GFS, HDFS)
  • MapReduce
  • High-level distributed data management
• Cloud infrastructures
  • Virtualization
  • AWS and Eucalyptus
  • Interactive front-end – Google App Engine
• Cloud security and privacy
• Research topics

Page 9

In projects, we will learn to use

• Hadoop MapReduce, Pig Latin
• AWS
• Google App Engine

Page 10

Cloud Computing

Lecture 1-2

Some slides are borrowed from UC Berkeley RAD Lab

Keke Chen

Page 11

Outline

• What is cloud computing?
• Why now?
• Cloud killer applications
• Cloud economics
• Challenges and opportunities
  • “Above the Clouds”
  • “Claremont Report”

Page 12

What is Cloud Computing?

• Old idea: Software as a Service (SaaS)
  • Def: delivering applications over the Internet
• Recently: “[Hardware, Infrastructure, Platform] as a service”
• Utility Computing: pay-as-you-use computing
  • Illusion of infinite resources
  • No up-front cost
  • Fine-grained billing (e.g. hourly)
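To make the pay-as-you-use idea concrete, here is a minimal Python sketch of fine-grained billing. The hourly rate, the rule of billing every started hour, and the function name `pay_as_you_use_cost` are illustrative assumptions, not any provider's actual pricing.

```python
import math

def pay_as_you_use_cost(hours_used, rate_per_hour):
    """Bill each started hour at a flat rate (assumed billing rule)."""
    billed_hours = math.ceil(hours_used)
    return billed_hours * rate_per_hour

# Hypothetical job: 10 servers for 2.5 hours at $0.10 per server-hour.
servers = 10
cost = servers * pay_as_you_use_cost(2.5, 0.10)
print(f"Total cost: ${cost:.2f}")
```

There is no up-front cost in this model: the bill depends only on what was used, rounded to the billing granularity.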

Page 13

Cloud computing vs. grid computing

• Cloud computing = virtualization + grid + services + utility computing
• Grid computing: resource provisioning, load balancing, parallel processing
• Views of different users
  • System admin/Hadoop users: grid
  • Application owners/service users: service, utility

Page 14

Users and cloud providers

Page 15

Why Now?

• Experience with very large datacenters – profitable for cloud providers
  • Economies of scale
• Pervasive broadband Internet
• Fast x86 virtualization
• Pay-as-you-go billing model
• Large user base
• Online payment
• Online ads
• Content distribution
• Web 2.0 lowers the entry point to e-business
  • More small e-business owners
  • Large user base of clouds

Page 16

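Spectrum of Clouds

• Instruction Set VM (Amazon EC2, 3Tera)
• Bytecode VM (Microsoft Azure)
• Framework VM
  • Google AppEngine, Force.com

[Figure: a spectrum running from EC2 through Azure and AppEngine to Force.com, with the EC2 end labeled “Lower-level, less management” and the Force.com end labeled “Higher-level, more management”.]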

Page 17

Cloud Killer Apps

• Mobile and web applications
• Batch processing / MapReduce
• Data analytics (big data)
  • E.g., OLAP, data mining, machine learning
• Extensions of desktop software
  • Matlab, Mathematica

Page 18

Cloud Economics

• Pay by use instead of provisioning for peak

[Figure: two panels, “Static data center” and “Data center in the cloud”, each plotting demand and capacity against time, with a region labeled “Unused resources”.]
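The difference between provisioning for peak and paying by use can be worked through with a small Python sketch. The demand series below is hypothetical (it is not data from the lecture), and the helper name `static_provisioning` is an assumption made for illustration.

```python
# Hypothetical demand (in servers) for 12 consecutive time periods.
demand = [20, 15, 10, 10, 30, 60, 100, 80, 50, 40, 30, 20]

def static_provisioning(demand):
    """Provision for the peak: capacity is fixed at the maximum demand."""
    capacity = max(demand)
    provisioned = capacity * len(demand)  # server-periods paid for
    used = sum(demand)                    # server-periods actually needed
    return provisioned, used

provisioned, used = static_provisioning(demand)
unused = provisioned - used
utilization = used / provisioned

print(f"Static data center: {provisioned} server-periods provisioned, {unused} unused")
print(f"Utilization: {utilization:.0%}")
print(f"Pay by use: {used} server-periods billed")
```

With a spiky workload like this one, most of the statically provisioned capacity sits idle, which is the gap the slide labels as unused resources.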

Page 19

Economics of Cloud Users

• Risk of over-provisioning: underutilization

[Figure: “Static data center”, plotting demand and capacity against time, with a region labeled “Unused resources”.]

Page 20

Economics of Cloud Users

• Heavy penalty for under-provisioning

[Figure: three plots of demand and capacity against time (days 1 to 3), with regions labeled “Lost revenue” and “Lost users”.]
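A rough Python sketch of the under-provisioning case follows. The demand numbers, the fixed capacity, and the function name `unserved_demand` are hypothetical. The sketch only measures the immediate shortfall in each period; it does not model users leaving permanently, which is the longer-term effect the figure labels as lost users.

```python
def unserved_demand(demand, capacity):
    """Return, per period, how much demand exceeds a fixed capacity."""
    return [max(0, d - capacity) for d in demand]

# Hypothetical demand per period and a fixed capacity below the peak.
demand = [40, 70, 120, 90, 50]
capacity = 80

shortfall = unserved_demand(demand, capacity)
lost_fraction = sum(shortfall) / sum(demand)

print("Unserved demand per period:", shortfall)
print(f"Share of demand turned away: {lost_fraction:.1%}")
```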

Page 21

Economics of Cloud Providers

• 5-7x economies of scale [Hamilton 2008]
• Extra benefits
  • Amazon: utilize off-peak capacity
  • Microsoft: sell .NET tools
  • Google: reuse existing infrastructure

Resource        Cost in Medium DC     Cost in Very Large DC   Ratio
Network         $95 / Mbps / month    $13 / Mbps / month      7.1x
Storage         $2.20 / GB / month    $0.40 / GB / month      5.7x
Administration  ≈140 servers/admin    >1000 servers/admin     7.1x
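The ratio column can be sanity-checked with a short Python sketch that recomputes each ratio from the figures in the table above. Recomputing from these rounded figures gives values close to, but not identical to, the listed network and storage ratios; presumably the listed ratios were derived from unrounded data, which is an assumption on our part.

```python
# Figures copied from the table: (medium DC, very large DC, listed ratio).
costs = {
    "Network ($/Mbps/month)": (95.0, 13.0, 7.1),
    "Storage ($/GB/month)": (2.20, 0.40, 5.7),
}

for resource, (medium, very_large, listed) in costs.items():
    recomputed = medium / very_large
    print(f"{resource}: recomputed {recomputed:.1f}x, listed {listed}x")

# Administration is quoted as servers per admin, so the ratio is inverted:
# more servers per admin means a lower cost per server.
admin_ratio = 1000 / 140
print(f"Administration: recomputed at least {admin_ratio:.1f}x, listed 7.1x")
```

Either way, all three resources fall in or near the 5-7x range quoted on the slide.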

Page 22

Adoption Challenges

Challenge                                        Opportunity
Availability                                     Multiple providers & DCs
Data lock-in                                     Standardization
Data confidentiality, auditability, and privacy  Encryption, VLANs, firewalls; geographical data storage; privacy-preserving data outsourcing

Page 23

Growth Challenges

Challenge                          Opportunity
Data transfer bottlenecks          FedEx-ing disks, data backup/archival
Performance unpredictability       Improved VM support, flash memory, scheduling VMs
Scalable storage                   Invent scalable store
Bugs in large distributed systems  Invent debugger that relies on distributed VMs
Scaling quickly                    Invent auto-scaler that relies on ML; snapshots

Page 24

Policy and Business Challenges

Challenge                Opportunity
Reputation fate sharing  Offer reputation-guarding services like those for email
Software licensing       Pay-for-use licenses; bulk use sales

Page 25

Research Challenges Mentioned by the Database Community (Claremont Report)

Page 26

Functionality and operational cost

• Background: compare massive-scale data-intensive computing systems with today’s DBMS
• Limited functionality
  • Simple APIs (e.g. MapReduce)
  • Pushes more burden on developers
• Benefits
  • Easier to manage
  • Lower operational cost
  • Service Level Agreement (SLA) that is hard to provide for a SQL DBMS

P.S. DB systems are notorious for their expenses in installation and maintenance.
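To illustrate what a “simple API” means here, the following is a minimal single-process Python sketch of the MapReduce programming model, using word count as the example. It is not Hadoop's API; the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample documents are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

# Hypothetical input records.
documents = ["the cloud scales", "the cloud bills by the hour"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

The developer writes only the map and reduce functions, which keeps the interface small. Anything a SQL DBMS would provide declaratively, such as joins or query optimization, has to be coded by hand on top of these two functions, which is the extra burden the slide refers to.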

Page 27

Manageability

• Features of cloud systems
  • Limited human intervention
  • High variance workloads
  • A variety of shared infrastructures
  • No DBAs or administrators to assist developers
• Systems need to do work automatically
  • Self-managing
  • Adaptive (autonomous) computing

Page 28

Data security and privacy

• Users sharing physical resources in a cloud
  • Protect from each other (security)
  • Protect from curious cloud providers (privacy)
• Successes may depend on specific target usage scenarios
• Examples
  • Query-based services
  • Mining-based services

Page 29

Datasets over multiple clouds

• Interesting datasets might be available in different clouds
  • Different cloud providers
  • Private or public clouds
• Services mashing up datasets
  • Inevitably crossing clouds
• Federated cloud architectures

Page 30

Algorithms on Big Data

• Working on “Big Data”
  • Data mining
  • Machine learning
  • Visualization
• Traditional algorithms assume data is in flat files or relational databases
• Distributed data organization poses new challenges
  • Redesign algorithms
  • Redesign frameworks