Advanced data management Jiaheng Lu Department of Computer Science Renmin University of China www.jiahenglu.net


Post on 31-Mar-2015


Page 1

Advanced data management

Jiaheng Lu

Department of Computer Science

Renmin University of China

www.jiahenglu.net

Page 2

Course purpose


Taught in English

The objective is to expose graduate students to exciting data management topics

Page 3

Course contents


Cloud computing and cloud data management

XML data management

Column-store database

Data processing in bioinformatics

Page 4

Lecturer: academic experience

2006.9 ~ 2008.6 University of California, Irvine, postdoctoral researcher

2002.8 ~ 2006.8 National University of Singapore, PhD candidate

1998.9 ~ 2001.1 Shanghai Jiao Tong University, Master's candidate

Page 5

University of California, Irvine

Page 6

Postdoctoral research

Data integration in medical systems [US patent]

Approximate string search [ICDE08]

Page 7

National University of Singapore

Page 8

Course grading


Report 30%

Google App Engine 30%

In-class presence and quiz 40%

Page 9

Any questions or comments?

Page 10

Cloud computing

Page 11
Page 12

Why do we use cloud computing?

Page 13

Why do we use cloud computing?

Case 1:

Write a file

Save

If the computer goes down, the file is lost

Files are always stored in the cloud, never lost

Page 14

Why do we use cloud computing?

Case 2:

Use IE --- download, install, use

Use QQ --- download, install, use

Use C++ --- download, install, use

……

Get the service from the cloud

Page 15
Page 16

What are the cloud and cloud computing?

Cloud

On-demand resources or services over the Internet

With the scale and reliability of a data center

Page 17

What are the cloud and cloud computing?

Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.

Page 18

The architecture of cloud computing system

Page 19

Characteristics of cloud computing

Virtual: software, databases, Web servers, operating systems, storage, and networking are provided as virtual servers.

On demand: add and subtract processors, memory, network bandwidth, and storage.
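As a rough illustration of "on demand", the sketch below computes how many identical servers a service would need for a given load, and how many to add or subtract. The function names, the per-server capacity, and the load figures are all hypothetical; real cloud platforms apply their own scaling policies.

```python
import math


def servers_needed(requests_per_second, capacity_per_server):
    """Smallest number of identical servers that can carry the load (at least 1)."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    return max(1, math.ceil(requests_per_second / capacity_per_server))


def scaling_delta(current_servers, requests_per_second, capacity_per_server):
    """Positive result: servers to add. Negative result: servers to release."""
    return servers_needed(requests_per_second, capacity_per_server) - current_servers


# Hypothetical numbers: each server handles 100 requests per second.
print(scaling_delta(3, 900, 100))   # load grew: add servers
print(scaling_delta(9, 50, 100))    # load dropped: release servers
```

The point of the sketch is only that capacity follows demand in both directions; the user pays for what is currently needed rather than for the peak.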

Page 20

Types of cloud service

IaaS: Infrastructure as a Service

PaaS: Platform as a Service

SaaS: Software as a Service

Page 21

SaaS: software delivery model

No hardware or software to manage

Service delivered through a browser

Customers use the service on demand

Instant scalability

Page 22

SaaS examples

Your current CRM package cannot handle the load, or you simply don’t want to host it in-house. Use a SaaS provider such as Salesforce.com.

Your email is hosted on an Exchange server in your office and it is very slow. Outsource it using Hosted Exchange.

Page 23

PaaS: platform delivery model

Platforms are built upon infrastructure, which is expensive

Estimating demand is not a science!

Platform management is not fun!

Page 24

PaaS examples

You need to host a large file (5 MB) on your website and make it available to 35,000 users for only two months. Use Amazon CloudFront.

You want to start storage services on your network for a large number of files, and you do not have the storage capacity. Use Amazon S3.
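To get a feel for the first example, the short calculation below estimates the total data that would have to be served. It reads the slide's file size as 5 megabytes and assumes each of the 35,000 users downloads the file exactly once; both are assumptions, and the function name is only illustrative.

```python
def total_transfer_gb(file_size_mb, downloads):
    """Total data served, in decimal gigabytes (1 GB = 1000 MB)."""
    return file_size_mb * downloads / 1000


# Assumption: 5 MB file, one download per user, 35,000 users.
print(total_transfer_gb(5, 35_000))  # 175.0
```

Roughly 175 GB over two months is a short-lived burst of demand, which is the kind of load the slide suggests handing to a hosted service instead of provisioning for in-house.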

Page 25

IaaS: computer infrastructure delivery model

A platform virtualization environment

Computing resources, such as storage and processing capacity

Virtualization taken a step further

Page 26

IaaS examples

You want to run a batch job but you don’t have the infrastructure necessary to run it in a timely manner. Use Amazon EC2.

You want to host a website, but only for a few days. Use Flexiscale.

Page 27

Cloud computing and other computing techniques

Page 28
Page 29

An Industry Transformed

http://www.boxofficemojo.com/

Delgo www.delgo.com

Page 30

Shrek, Delgo, and Others

• Why did DreamWorks use this?

• Upsides?

• Downsides?

Page 31

Grid Computing & Cloud Computing

They share a lot in common: intention, architecture, and technology.

They differ in programming model, business model, compute model, applications, and virtualization.

Page 32

Grid Computing & Cloud Computing

The problems are mostly the same:

manage large facilities;

define methods by which consumers discover, request and use resources provided by the central facilities;

implement the often highly parallel computations that execute on those resources.

Page 33

Grid Computing & Cloud Computing

Virtualization

Grid: does not rely on virtualization as much as clouds do; each individual organization maintains full control of its resources

Cloud: virtualization is an indispensable ingredient for almost every cloud

Page 34
Page 35

Any questions or comments?

Page 36

Google App Engine

Page 37

Google App Engine

Does one thing well: running web apps

Simple app configuration

Scalable

Secure

Page 38

App Engine Does One Thing Well

App Engine handles HTTP(S) requests, nothing else

Think RPC: request in, processing, response out

Works well for the web and AJAX; also for other services

App configuration is dead simple

No performance tuning needed
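The "request in, processing, response out" idea can be sketched with a plain WSGI callable. This is a generic illustration using only the Python standard library, not the App Engine SDK; the names `application` and `handle` and the `name` query parameter are assumptions made for the example.

```python
from urllib.parse import parse_qs


def application(environ, start_response):
    """Request in, processing, response out."""
    # Request in: read the (hypothetical) "name" query parameter.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    # Processing: build the reply body.
    body = "Hello, {}!\n".format(name).encode("utf-8")
    # Response out: status line, headers, then the body.
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def handle(query_string):
    """Call the app once without a network, roughly as a server would."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = application({"QUERY_STRING": query_string}, start_response)
    return captured["status"], b"".join(chunks)


print(handle("name=cloud"))
```

Nothing in the handler survives between calls, which is why this style scales easily: any process can serve any request.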

Page 39

App Engine Architecture


[Figure: App Engine architecture. Labels: req/resp; Python VM process; stdlib; app; R/O FS; stateful APIs (memcache, datastore); stateless APIs (mail, images, urlfetch).]
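The architecture slide lists memcache and datastore among the APIs available to the app. A common way to combine a cache with a persistent store is a cache-aside read, sketched below. The `InMemoryStore` class and `load_profile` function are hypothetical stand-ins written for this illustration; they are not the App Engine memcache or datastore APIs.

```python
class InMemoryStore:
    """Hypothetical stand-in for a stateful service such as a cache or datastore."""

    def __init__(self):
        self._data = {}
        self.reads = 0  # counts lookups, to show when the store is hit

    def get(self, key):
        self.reads += 1
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def load_profile(user_id, cache, datastore):
    """Cache-aside read: try the cache first, fall back to the datastore."""
    profile = cache.get(user_id)
    if profile is None:
        profile = datastore.get(user_id)
        if profile is not None:
            cache.set(user_id, profile)  # fill the cache for later requests
    return profile


datastore = InMemoryStore()
cache = InMemoryStore()
datastore.set("u1", {"name": "Ada"})

load_profile("u1", cache, datastore)  # cache miss: reads the datastore
load_profile("u1", cache, datastore)  # cache hit: datastore not touched
print(datastore.reads)
```

Because the request-handling process keeps no state of its own, anything that must outlive a request has to sit behind a service like these.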

Page 40

How to use Google App Engine

Download Java 6

Download Eclipse and the Google plugin

Register a Google user account

Create an application (Python or Java) and upload the code

Page 41
Page 42

In-class quiz

Please answer all questions

You may be requested to answer a question later. Your performance will affect your final score.

Page 43

Study Google App Engine

http://code.google.com/intl/en/appengine/docs/java/gettingstarted/