big data analysis

Prof. K. Thammi Reddy, GITAM University. Massive Data Crunching: Cloud and its Impact

Upload: profktr

Post on 07-May-2017


Page 1: BIG DATA Analysis

Massive Data Crunching: Cloud and its Impact

Prof. K. Thammi Reddy

GITAM University

Page 2: BIG DATA Analysis

Contents to be covered in this talk

• Road map to Cloud Computing

• Definition of Cloud
• Cloud Architecture
• Classification of Clouds
• Desired features of Cloud
• Migration into the Cloud
• Challenges in Cloud Computing

Page 3: BIG DATA Analysis

Road Map to Cloud

1. Evolutionary Computing: Scientific purposes

2. Information Processing: OS, MP, AP

3. Client Server Computing : Databases

4. Three tier architecture and n tiered architecture : Business logic

5. WWW : Web Servers

6. Cluster Computing: Networked environment

7. Grid Computing: WAN

Page 4: BIG DATA Analysis

Computing Paradigms and Attributes: Realizing the ‘Computer Utilities’ Vision

Paradigms
• Web
• Data Centres
• Utility Computing
• Service Computing
• Grid Computing
• P2P Computing
• Market-Oriented Computing
• Cloud Computing
• …

Attributes/Capabilities
• Ubiquitous access
• Reliability
• Scalability
• Autonomic
• Dynamic discovery
• Composability
• QoS
• SLA
• …

Paradigms + Attributes/Capabilities = ?
• Trillion $ business
• Who will own it?

Page 5: BIG DATA Analysis

Road Map to Cloud

1. Different types of computers:
• Mainframe computers
• Minicomputers
• Workstations
• Personal computers

Page 6: BIG DATA Analysis

Web Search Trends & Hot News Items (ref: Google)

Legend: Cluster computing, Grid computing, Cloud computing

Page 7: BIG DATA Analysis

Realizing the ‘Computer Utilities’ Vision: What Consumers and Providers Want?

• Consumers: minimize expenses, meet QoS
– How do I express QoS requirements to meet my goals?
– How do I assign valuation to my applications?
– How do I discover services and map applications to meet QoS needs?
– How do I manage multiple providers and get my work done?
– How do I outperform other competing consumers?
– …

• Providers: maximise Return On Investment (ROI)
– How do I decide service pricing models?
– How do I specify prices?
– How do I translate prices into resource allocations?
– How do I assign and enforce resource allocations?
– How do I advertise and attract consumers?
– How do I perform accounting and handle payments?
– …

• Mechanisms, tools, and technologies: value expression, translation, and enforcement

Page 8: BIG DATA Analysis

Convergence of various advances leading to the advent of cloud computing.

Page 9: BIG DATA Analysis

Defining Cloud:

“A Cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualised computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements established through negotiation between the service provider and consumers.”

Rajkumar Buyya, UOA

“Clouds are a large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and/or services). These resources can be dynamically reconfigured to adjust to a variable load (scale), allowing also for an optimum resource utilization.”

Vaquero et al.

Page 10: BIG DATA Analysis

Technologies such as cluster, grid, and now cloud computing have all aimed at allowing access to large amounts of computing power in a fully virtualized manner, by aggregating resources and offering a single system view. An important aim of these technologies has been delivering computing as a utility. Utility computing describes a business model for on-demand delivery of computing power: consumers pay providers based on usage (“pay-as-you-use”).

The main principle behind this model is offering computing, storage, and software “as a service.”
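The pay-as-you-use idea can be illustrated with a small metering sketch. The resource names and rates below are hypothetical values chosen only for the example; they do not come from any provider's price list.

```python
from dataclasses import dataclass

# Hypothetical unit prices, for illustration only.
RATES = {"cpu_hours": 0.10, "storage_gb_months": 0.02, "data_out_gb": 0.05}


@dataclass
class Usage:
    """Metered consumption for one billing period."""
    cpu_hours: float = 0.0
    storage_gb_months: float = 0.0
    data_out_gb: float = 0.0


def bill(usage: Usage, rates: dict = RATES) -> float:
    """Charge only for what was metered; zero usage costs nothing."""
    total = sum(getattr(usage, item) * price for item, price in rates.items())
    return round(total, 2)


if __name__ == "__main__":
    month = Usage(cpu_hours=100, storage_gb_months=50, data_out_gb=20)
    print("Amount due:", bill(month))
```

The point of the sketch is that there is no fixed fee: the consumer's cost follows the metered usage, which is what distinguishes the utility model from buying hardware up front.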

Page 11: BIG DATA Analysis

Cloud Architecture

Page 12: BIG DATA Analysis

• Infrastructure as a Service (IaaS): delivering a virtual server, desktop computer, or remote storage from the cloud. In other words, a hosting provider such as Dell gives you a remote data center, where it manages the infrastructure, servers, and virtualization, and you access your virtual computers and storage over the internet through a secure channel.

– CPU, Storage: Amazon.com, Nirvanix, GoGrid, …

Page 13: BIG DATA Analysis

• Platform as a Service (PaaS): PaaS delivery models let you use a provider such as Dell or Microsoft® to supply the hardware and software, as well as the provisioning and hosting capabilities, needed to develop, deliver, and maintain applications and other resources.

– Ex: Google App Engine, Microsoft Azure, Manjrasoft Aneka, …

• Software as a Service (SaaS): delivering a software application from the cloud, often to users' browsers as a web-based application. You may already use SaaS applications without knowing it. For example, Google's popular Gmail™ service delivers an email client to your web browser from the cloud.

– Ex: Salesforce.com

Page 14: BIG DATA Analysis

Advantage of the Cloud Components

Page 15: BIG DATA Analysis

An IaaS approach offers:

• Faster responses to changing business conditions or internal customer needs, enabled by rapid system provisioning and rapid scalability, both up and down, without the long-term lock-in of hardware purchases.

• Productivity increases resulting from access to your applications and data from anywhere, and the reliability that comes from a distributed computing model.

• Reduced capital outlay for hardware acquisition, maintenance, data center real estate, and power and cooling, when using a pay-for-use (public cloud) model.

Page 16: BIG DATA Analysis

Challenges: Dealing with too many issues and offerings

Uhm, I am not quite clear…Yet another complex IT paradigm?

Terms shown on the slide:

• Storage, Web 2.0, IaaS, PaaS, SaaS, Web Services
• Public Cloud, Private Cloud, Enterprise Cloud
• Amazon EC2, Amazon S3, Google AppEngine, SalesForce.com, Mosso, VMWare, Hypervisors, Manjrasoft Aneka
• Resource Metering, Billing, QoS, Virtualization, Service Level Agreement, Provisioning on Demand, Pricing, Utility Management
• Security, Privacy, Scalability, Reliability, Software Eng. Complexity, Energy Efficiency

Page 17: BIG DATA Analysis

Types of Clouds: Clouds based on Ownership and Exposure

Private/Enterprise Clouds

Cloud computing model run within a company’s own data center / infrastructure for internal and/or partners' use.

Public/Internet Clouds

3rd-party, multi-tenant Cloud infrastructure & services:

* available on subscription basis (pay as you go)

Hybrid/Mixed Clouds

Mixed usage of private and public Clouds: leasing public cloud services when private cloud capacity is insufficient.
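The hybrid rule stated above (use the private cloud first and lease public capacity only for the overflow) can be sketched as a placement function. The function name and the VM-count model are assumptions made for illustration; real placement decisions also weigh cost, data location, and security.

```python
def place_workload(requested_vms: int, private_free_vms: int) -> dict:
    """Fill private capacity first; lease the remainder from a public cloud."""
    if requested_vms < 0 or private_free_vms < 0:
        raise ValueError("VM counts must be non-negative")
    on_private = min(requested_vms, private_free_vms)
    return {"private": on_private, "public": requested_vms - on_private}


if __name__ == "__main__":
    # Ten VMs requested, six free in-house: four are leased from the public cloud.
    print(place_workload(10, 6))
    # Demand fits in-house: nothing is leased.
    print(place_workload(3, 6))
```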

Page 18: BIG DATA Analysis

The cloud computing service offering and deployment models

Page 19: BIG DATA Analysis

Market-oriented Cloud Architecture

• Users/Brokers
• SLA Resource Allocator
– Service Request Examiner and Admission Control: Customer-driven Service Management, Computational Risk Management, Autonomic Resource Management
– Pricing, Accounting
– VM Monitor, Dispatcher, Service Request Monitor
• Virtual Machines (VMs)
• Physical Machines
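A toy model may help show how the components named on this slide fit together. It is a minimal sketch, not the architecture's actual implementation: the class, its fields, and the flat per-VM-hour price are all hypothetical, and the admission rule (enough free VMs and a quote within the user's budget) is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class ServiceRequest:
    user: str
    vms_needed: int
    hours: int
    budget: float  # the most the user is willing to pay


class SLAResourceAllocator:
    """Toy allocator: examines requests, prices them, and keeps accounts."""

    def __init__(self, free_vms: int, price_per_vm_hour: float):
        self.free_vms = free_vms
        self.price_per_vm_hour = price_per_vm_hour
        self.ledger = {}  # accounting: user -> total charged

    def quote(self, req: ServiceRequest) -> float:
        # Pricing: a flat rate per VM-hour (an assumption for this sketch).
        return req.vms_needed * req.hours * self.price_per_vm_hour

    def submit(self, req: ServiceRequest) -> bool:
        # Admission control: reject what cannot be served or paid for.
        cost = self.quote(req)
        if req.vms_needed > self.free_vms or cost > req.budget:
            return False
        # Dispatch: reserve the VMs, then record the charge.
        self.free_vms -= req.vms_needed
        self.ledger[req.user] = self.ledger.get(req.user, 0.0) + cost
        return True


if __name__ == "__main__":
    allocator = SLAResourceAllocator(free_vms=4, price_per_vm_hour=0.5)
    print(allocator.submit(ServiceRequest("alice", 2, 10, 20.0)))  # admitted
    print(allocator.submit(ServiceRequest("bob", 3, 1, 100.0)))    # too few VMs left
    print(allocator.ledger)
```

Rejecting a request at admission time, rather than accepting it and missing the SLA later, is the design choice the Service Request Examiner stands for in this sketch.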

Page 20: BIG DATA Analysis

Cloud Architecture

• User level: Cloud applications
– Social computing, Enterprise, ISV, Scientific, CDNs, ...
• User-Level Middleware: Cloud programming environments and tools
– Web 2.0 Interfaces, Mashups, Concurrent and Distributed Programming, Workflows, Libraries, Scripting
• Core Middleware: Apps Hosting Platforms
– QoS Negotiation, Admission Control, Pricing, SLA Management, Monitoring, Execution Management, Metering, Accounting, Billing
– Virtual Machine (VM), VM Management and Deployment
• System level: Cloud resources

Labels spanning the layers: Adaptive Management; Autonomic / Cloud Economy

Page 21: BIG DATA Analysis

Desired Features of Cloud

(i) Self-service

(ii) Per-usage metered and billed

(iii) Elastic

(iv) Customizable
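Of these features, elasticity is the easiest to show in code. The following is a minimal sketch of a proportional scaling rule, assuming load is reported as average CPU utilisation in percent; the function name, the 60% target, and the instance bounds are hypothetical defaults, not values from the talk.

```python
import math


def desired_instances(current: int, avg_cpu_pct: float, target_cpu_pct: float = 60,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Scale the instance count in proportion to load, within fixed bounds."""
    if current < 1 or target_cpu_pct <= 0:
        raise ValueError("current must be >= 1 and target_cpu_pct must be > 0")
    wanted = math.ceil(current * avg_cpu_pct / target_cpu_pct)
    return max(min_instances, min(max_instances, wanted))


if __name__ == "__main__":
    print(desired_instances(4, 90))  # load above target: scale out
    print(desired_instances(4, 15))  # load below target: scale in
```

The same rule scales in both directions, which matches the "both up and down" elasticity described earlier for IaaS.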

Page 22: BIG DATA Analysis

The iterative Seven-step Model of Migration into the Cloud

Page 23: BIG DATA Analysis
Page 24: BIG DATA Analysis
Page 25: BIG DATA Analysis

INFRASTRUCTURE AS A SERVICE (IAAS)

Page 26: BIG DATA Analysis

PLATFORM AND SOFTWARE AS A SERVICE

Aneka Architecture

• Application
– Software Development Kit: APIs, Design Explorer
– Management Kit: Management Studio, Administration Portal, SLA-Negotiation Web Services, Management Web Services
• Container
– Programming Models: Task Model, Thread Model, Map Reduce Model, Other Models
– Foundation Services: Membership Services, Reservation Services, Storage Services, License Services, Accounting Services
– Fabric Services: Dynamic Resource Provisioning Services, Hardware Profile Services
– Persistence, Security
– .NET @ Windows, Mono @ Linux
• Infrastructure
– Physical Machines/Virtual Machines
– Private Cloud: LAN network
– Data Center: Amazon, Microsoft, Google, IBM


Page 27: BIG DATA Analysis

Acknowledgements

1. Cloud Computing: Principles and Paradigms, Rajkumar Buyya, James Broberg, Andrzej Goscinski

2. Above the Clouds: A Berkeley View of Cloud Computing, Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, et al.

3. VMware, India