


Ranking and Mapping of Applications to Cloud Computing Services by SVD

Hoi Chan, Trieu Chieu IBM Thomas J. Watson Research Center 19 Skyline Drive Hawthorne, NY 10532

{hychan, [email protected]}

Abstract

Cloud computing promises to provide high-performance, on-demand services in a flexible and affordable manner; it offers the benefits of fast and easy deployment, scalability and a service oriented architecture. It promises substantial cost reduction together with increased flexibility compared to traditional IT operations. Cloud service providers typically come with various levels of services and performance characteristics. In addition, there are different types of user applications with specific requirements such as availability, security and computational power. Currently, there are no standard ranking and classification services to help users select the appropriate providers for their application requirements. Determining the best cloud computing service for a specific application is a challenge and often determines the success of the underlying business of the service consumers. In this paper, we propose a set of cloud computing specific performance and quality of service (QoS) attributes, an information collection mechanism, and an analytic algorithm based on the Singular Value Decomposition (SVD) technique to determine the best service provider for a user application with a specific set of requirements. The technique provides an automatic best-fit procedure which does not require a formal knowledge model.

1. Introduction

Cloud computing [1-4] promises to provide high-performance, flexible and yet low-cost on-demand computing services with the benefits of speed, ease of deployment, scalability and a service oriented architecture. It offers a pay-for-use model that is extremely attractive for many businesses, especially start-ups and small and medium-sized companies: it provides a readily available and scalable computing environment without substantial capital investment or hardware administration and maintenance costs.

It is a general perception that cloud computing is reminiscent of the application service provider (ASP) model and its associated technologies. In practice, cloud computing platforms, such as those offered by Amazon Web Services, AT&T’s Synaptic Hosting, and the IBM/Google cloud initiative, work quite differently from typical ASPs. They provide and support little more than a collection of physical servers and offer another level of virtualization by providing users with virtual machines on which to install and run their own software, instead of owning, maintaining and running the software for them. With the advances in virtualization technologies, resource availability is typically very elastic and responsive (more physical servers are deployed and more VMs provided automatically when needed), with a virtually unlimited amount of computing power and storage capacity readily available on demand.

With the benefits of cloud computing also come new challenges and complexities, such as reliability and security, that must be properly addressed. Indeed, not all service providers are created equal: some are superior in computing power, some are good at offering seamless and unlimited storage, some excel in security management, while others offer the lowest cost. In such a setting, decisions have to be made as to which applications, or fragments of applications, from the users are to be executed on which service providers. It is also important to realize that there are different types of users with different types of applications and different sets of requirements. Some applications require substantial computing and storage power while others have a compelling need for maximum confidentiality. From the users’ perspective, the goal is to run their applications seamlessly and meet their performance, security and cost targets. Therefore, matching and determining the best cloud computing service for a specific application is important and often determines the success of the underlying business of the service consumers.


We classify applications according to their requirements and characteristics, based in part on the security requirements recommended by the “Director of Central Intelligence Directive 6/3” [5], on whether the application is transactional or analytic [13], and on its budget requirement. At the same time, we describe each cloud service provider by a “Quality of Service” (QoS) metric which includes static and dynamic parameters, based in part on the Web service model [6, 7].

Traditional approaches utilize QoS policies [8] or some weighting scheme to sort through a pile of attributes in the hope of finding the ones that best fit the application requirements. This approach is acceptable if the number of QoS attributes and the number of available services are relatively small and do not change much over time. In addition, not all provider attributes that users require are available, which hinders its usability. Last but not least, due to the more “abstract” nature of some of the provider QoS attributes, a rule-based approach [9] to finding the best services may be too restrictive and may not be able to provide a comprehensive set of alternatives. Another approach for mapping applications to service providers is the use of utility functions [10] to maximize utility for an application given the available service provider information. Usually, the utility function approach requires sophisticated utility function algorithms and optimization techniques which may not be readily available and may be too costly to maintain.

This paper explores part of this space by describing an application classification scheme and cloud computing service provider metrics, coupled with a statistical algorithm based on SVD, to map different types of applications to the best service providers. In the rest of this paper, section 2 describes the system design and architecture. Section 3 describes the application classification scheme while section 4 introduces the cloud computing services metric. In section 5, we briefly introduce SVD and use an example to illustrate the application classification and cloud service provider metric, followed by the creation of an input matrix for the SVD analysis engine. Sections 6 and 7 describe the cloud service provider selection process, the choice of dimensional factor, and how prior data can be mined to enable active learning. In section 8, we describe the results. Section 9 discusses possible future work and concludes.

2. System Overview

Figure 1, High Level Architecture (an application execution request is submitted to the Mapper, which assigns one of the available Cloud Service Providers)

Figure 1 shows the high level conceptual architectural view of the application and service mapping system. A user sends a request with its requirement for application execution to the cloud service provider mapper which automatically assigns a service provider based on the application requirements.

Figure 2, Service Provider Mapper Overview (components: Cloud Service Information Collector for info gathering, Cloud Service Provider Metrics repository, QoS-Provider Matrix, QoS-Provider Space Construction via SVD, and Provider Selection in QoS-Provider Space with k value; the Application and Service Provider Mapper takes an Application Metric as input and outputs the Selected Provider)

Figure 2 shows an overview of the proposed application and service provider mapping system. It includes a cloud service provider information collector and a repository for storing the metrics of each of the available cloud service providers. An SVD engine transforms the collected provider information into a matrix with providers as columns and individual QoS attributes as rows. We define this transformed matrix as our provider-QoS attribute (PQ) matrix, which is then transposed and decomposed by SVD transformation to form an n-dimensional space of service providers and QoS attributes wherein service providers and their associated QoS attributes are placed near one another. Service providers which are closely associated with the required QoS attributes in this space are selected. Users can accept or reject the recommended service providers. Each episode of a successful match triggers the system to update the PQ repository by putting a positive or negative weight on the relevant QoS patterns.
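As a concrete illustration of this flow, the sketch below wires the steps together in Python. It is illustrative only: the function names (build_qp_matrix, rank_providers) and data layout are our assumptions rather than code from the paper, and it decomposes the attribute-by-provider matrix directly (transposing first, as the paper does, only swaps which singular-vector factor carries the provider coordinates).

```python
# Illustrative sketch of the Figure 2 workflow; names are hypothetical,
# not taken from the paper.
import numpy as np

def build_qp_matrix(metrics, attributes, providers):
    """Arrange the quality indices into an M x N QP matrix
    (rows = QoS attributes, columns = providers), cf. Section 5."""
    return np.array([[metrics[p][a] for p in providers] for a in attributes])

def rank_providers(qp, requested_rows, k=2):
    """Build the k-dimensional QoS-provider space via SVD and rank the
    providers by cosine similarity to the centroid of the requested attributes."""
    u, s, vt = np.linalg.svd(qp, full_matrices=False)
    attr_coords = u[:, :k]              # one row per QoS attribute
    prov_coords = vt[:k, :].T           # one row per service provider
    centroid = attr_coords[requested_rows].mean(axis=0)
    cos = prov_coords @ centroid / (
        np.linalg.norm(prov_coords, axis=1) * np.linalg.norm(centroid))
    return np.argsort(-cos)             # provider column indices, best first
```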

3. Application Classification

Applications are classified based on their security requirements, resource consumption behaviors and budget constraints. Classification of applications can be achieved both manually and automatically; however, a detailed discussion of application classification is beyond the scope of this paper. Detailed descriptions of security classification technologies and research can be found in the following publications [11, 12]. For application security requirements, the selected parameters are based on the recommendations of the “Director of Central Intelligence Directive 6/3” [5], which include confidentiality, integrity and availability. We further classify applications as transactional, analytic (and/or transactional-analytic) based on their resource consumption behavior. Aboulnaga [13] provides a fairly detailed analysis of transactional and analytical application classification and their characteristics. Transactional applications refer to commercial applications which rely on the ACID guarantees (atomicity, consistency, isolation and durability, a set of properties that guarantee that database transactions are processed reliably) provided by databases and tend to be quite write intensive, such as airline reservation, stock trading, and day-to-day banking activities. They require a comparatively high standard of security, integrity and availability. Analytical applications refer to applications that perform data analysis on data from the data warehouse for use in business planning, problem solving, report generation and decision support. Historical data along with data from multiple operational databases are all typically involved in the analysis. Consequently, the scale of analytical data management systems is generally larger than that of transactional systems. Furthermore, analytical systems tend to be read-mostly (or read-only), with occasional batch inserts. Their requirements, with the exception of confidentiality, may be less stringent than those of transactional applications, but they have different resource consumption characteristics from their transactional counterparts. In addition, we also realize that there is another class of application which is both transactional and analytic in nature, in varying degrees.

Finally, the cost and timing associated with executing the application are also important factors in the selection of providers. These parameters are readily available from the service providers for a specific level of performance. To summarize, the following attributes are used as the application requirement metric to characterize an application: confidentiality, integrity, availability, CPU requirement, memory requirement, IO requirement, application theme (transactional/analytic), and budget.
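Purely as an illustration, this application requirement metric could be encoded as a simple record; the field names follow the attribute list above, while the class name and sample values are invented for the example.

```python
# Illustrative only: one way to encode the application requirement metric;
# the field names follow the text, the class name and values are made up.
from dataclasses import dataclass

@dataclass
class ApplicationRequirement:
    confidentiality: int   # 1 (low) .. 5 (high)
    integrity: int
    availability: int
    cpu: int               # CPU requirement
    memory: int            # memory requirement
    io: int                # IO requirement
    theme: str             # "transactional", "analytic" or "transactional-analytic"
    budget: float          # cost target for the requested performance level

example_app = ApplicationRequirement(
    confidentiality=5, integrity=5, availability=5,
    cpu=3, memory=3, io=4, theme="transactional", budget=1000.0)
```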

4. Metrics for Service Providers

We divide the metric of a cloud service provider into two parts. The first part deals mainly with generally available data collected from rating agencies, mostly physical properties such as mean time to repair; first attempt fix rate, which measures the percentage of incidents that are successfully corrected on the first fix attempt; and change frequency, which measures the frequency of changes in the platform that require either down time or client actions. In addition, lower level performance indicators such as memory availability, CPU utilization and IO performance are also important factors in creating a set of meaningful parameters that best describes the state and characteristics of the service providers. Security, integrity and availability are of paramount importance in selecting a service provider. However, the measurements of these parameters may not be available as readily as the other physical quantities, but they can be estimated based on historical data analysis and feedback from other clients. Lastly, the cost charged by the service providers for a specific level of performance is readily available directly from the service providers.

Figure 3 shows the set of static and dynamic performance and QoS metrics for the cloud computing model; they constitute the initial set of inputs that we use for the SVD analysis.

5. QP Matrix for Service Providers

The QP matrix captures the collection of attributes associated with the set of target service providers. This matrix represents the service providers and their QoS metrics as mathematical objects, capturing the relationship among the service providers and their QoS metrics in a simple, compact and extendable form. Such a representation enables fast and easy computation, as numerous numerical analysis tools [14, 19] and algorithms [20, 21] are readily available, and it therefore eliminates the need to develop the tools needed to support the knowledge model. It is extensible: adding or deleting service providers and QoS attributes requires simply adding or deleting columns and rows of the matrix, and it does not alter the original data structure or the meanings of the entries. It is also a compact way to capture knowledge since it preserves the intrinsic or latent relationships among the entries [22].

In our QP matrix, each cell contains a numerical index from 1 to 5, with 5 being the highest, which represents a weighted rating of a QoS attribute of a service provider. Different rating agencies provide different rating systems and formats for each of the QoS attributes, such as “positive”, “negative” or “neutral” as in eBay, or a user review index of 1 to 5 as in Amazon, so we need to convert these different rating representations into a single representation. For each cell in the matrix, we define a Quality Index (QI) which is a weighted average of values from selected rating agencies for a particular attribute:

QI = Wu (F1(Q1) + F2(Q2) + … + Fn(Qn)) / n (1)

where

Qi = normalized, common quality index between 0 and 1,

Fi = weight assigned to the specific rating agency,

Wu = weight assigned by the user, or automatically based on historical data, for the rating of a specific attribute (it also converts the index to the 1-to-5 scale).
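A minimal sketch of Equation (1) follows, assuming the agency ratings have already been normalized to the common 0-1 index; the function name and the example rating conversions are illustrative assumptions, not part of the paper.

```python
# A sketch of Equation (1); the example rating conversions are assumptions.
def quality_index(normalized_ratings, agency_weights, user_weight):
    """QI = Wu * (F1*Q1 + F2*Q2 + ... + Fn*Qn) / n

    normalized_ratings : the Qi, each already normalized to [0, 1]
    agency_weights     : the Fi, one weight per rating agency
    user_weight        : Wu, user- or history-assigned weight that also
                         rescales the result to the 1-5 range of the QP matrix
    """
    n = len(normalized_ratings)
    weighted_sum = sum(f * q for f, q in zip(agency_weights, normalized_ratings))
    return user_weight * weighted_sum / n

# Example: an eBay-style "positive" mapped to 1.0 and an Amazon-style
# 4-of-5-star review mapped to 0.8, with equal agency weights.
qi = quality_index([1.0, 0.8], agency_weights=[1.0, 1.0], user_weight=5.0)  # 4.5
```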

With N available cloud service providers and M attributes in a QoS metric, our QoS and providers (QP) matrix is an MxN matrix which can be readily decomposed (after transposition of the QP matrix) by the SVD technique. The decomposition enables the transposed NxM matrix to be approximated by the first k singular values of the diagonal matrix (one of the three resulting decomposed matrices) of the original MxN matrix, resulting in a compressed representation of the original data. Using the compressed data, clusters of cloud services with similar qualities can be identified and ranked accordingly.
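The compression step can be sketched as follows, assuming numpy is available; keeping only the k largest singular values yields the best rank-k approximation of the transposed QP matrix.

```python
# Rank-k compression of the transposed QP matrix (a sketch, assuming numpy).
import numpy as np

def rank_k_approximation(qp, k):
    """Keep only the k largest singular values of R = QP^T and rebuild
    the compressed representation of the original M x N QP matrix."""
    u, s, vt = np.linalg.svd(qp.T, full_matrices=False)
    r_k = u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]    # best rank-k approximation of R
    return r_k.T                                   # back to M x N orientation
```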

6. Conceptual Space for QoS Metrics and Service Providers

For ease of illustration and simplicity, a simulated set of data was used as the sample QoS metric and service provider data set, as shown in Figures 3, 4 and 5.

A1: Security
A2: Integrity
A3: Availability
A4: Mean Time for Repair
A5: First Attempt Fix Rate
A6: Change Frequency
A7: Average Server/VM Ratio
A8: Average IO Response
A9: Average CPU Utilization
A10: Cost

Figure 3, Service Provider QoS Metric

P1: Cloud Service Provider 1
P2: Cloud Service Provider 2
P3: Cloud Service Provider 3
P4: Cloud Service Provider 4
P5: Cloud Service Provider 5

Figure 4, Service Provider List

      P1    P2    P3    P4    P5
A1    1.0   2.0   5.0   5.0   4.0
A2    2.0   1.0   3.0   4.0   5.0
A3    3.0   2.0   3.0   3.0   4.0
A4    3.0   3.0   4.0   5.0   5.0
A5    2.0   3.0   3.0   3.0   3.0
A6    2.0   2.0   1.0   3.0   3.0
A7    3.0   3.0   4.0   4.0   5.0
A8    2.0   3.0   4.0   5.0   3.0
A9    5.0   3.0   4.0   2.0   4.0
A10   5.0   3.0   2.0   1.0   2.0

Figure 5, QoS Metrics and Service Provider table

This dataset consists of m quality attributes (Am) and n service providers (Pn), where m = 10 and n = 5. The m quality attributes are entered as rows and the n service providers are entered as columns in the MxN QP matrix (Figure 5). The entries in the QP matrix are simply the QI values, as defined by Equation (1), of a particular QoS attribute for a specific service provider. We then take the transpose of the QP matrix such that R = QP^T is an NxM matrix.

R is decomposed into three matrices [14,15] by SVD as in Equation (2),

R = E S A’ (2)

where E and A’ are the left and right singular vectors of the R matrix, as shown in Figures 6 and 7 respectively. Both of them have orthogonal columns. As shown in Figure 8, S is the diagonal matrix of singular values ordered in decreasing magnitude. These matrices are the result of a breakdown of the original relationships between QoS metrics and service providers into linearly independent QoS metric and service provider components in concept space, so each QoS metric and service provider is represented by its own vector. As shown in Figure 8, many of these singular values can be ignored as they become relatively small (weak correlation). Usually, only the first few largest singular values are needed and the rest are discarded. Thus, a reduced model which approximately equals the original QP model, but with fewer dimensions, can be built. This process, in essence, captures the major relationships among QoS attributes and service providers while ignoring the minor ones by treating them as noise.
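The decomposition of the simulated data can be reproduced with a standard SVD routine, using the Figure 5 values as reconstructed above. Note that singular vectors are only determined up to sign, so a given implementation may flip signs relative to the values printed in Figures 6, 7 and 9.

```python
# Reproducing the decomposition on the simulated Figure 5 data.
import numpy as np

QP = np.array([                  # rows A1..A10, columns P1..P5
    [1.0, 2.0, 5.0, 5.0, 4.0],   # A1 Security
    [2.0, 1.0, 3.0, 4.0, 5.0],   # A2 Integrity
    [3.0, 2.0, 3.0, 3.0, 4.0],   # A3 Availability
    [3.0, 3.0, 4.0, 5.0, 5.0],   # A4 Mean Time for Repair
    [2.0, 3.0, 3.0, 3.0, 3.0],   # A5 First Attempt Fix Rate
    [2.0, 2.0, 1.0, 3.0, 3.0],   # A6 Change Frequency
    [3.0, 3.0, 4.0, 4.0, 5.0],   # A7 Average Server/VM Ratio
    [2.0, 3.0, 4.0, 5.0, 3.0],   # A8 Average IO Response
    [5.0, 3.0, 4.0, 2.0, 4.0],   # A9 Average CPU Utilization
    [5.0, 3.0, 2.0, 1.0, 2.0],   # A10 Cost
])

R = QP.T                                          # 5 x 10, providers as rows
E, S, At = np.linalg.svd(R, full_matrices=False)  # Equation (2): R = E S A'

k = 2                                 # keep the two largest singular values
provider_coords = E[:, :k]            # 2-D coordinates of P1..P5
attribute_coords = At[:k, :].T        # 2-D coordinates of A1..A10
print(np.round(S, 2))                 # singular values, cf. Figure 8
```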

Figure 6, E Matrix (first 5 of 9 columns; the shaded first two columns give the 2-D coordinates of the QoS attributes plotted in Figure 9)

Figure 7, A’ Matrix (its shaded entries give the 2-D coordinates of the service providers plotted in Figure 9)

Figure 8, S Matrix (singular values: 23.09, 5.51, 2.59, 2.01, 0.93)

In a two-dimensional model where k = 2, as shown by the shaded elements in Figures 6, 7 and 8, all the QoS attribute to QoS attribute, service provider to service provider, and QoS attribute to service provider similarities are approximated using the first two largest singular values of S. As a result, the row vectors of the reduced matrices (the shaded columns of the E matrix in Figure 6 and of the A’ matrix in Figure 7) are taken as the coordinates of points representing QoS attributes and service providers in a two-dimensional concept space, as shown in Figure 9, where QoS attributes are represented as triangles and service providers as diamonds. The dot product or cosine between the two vectors representing any two components corresponds to their estimated similarity.
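For instance, using the coordinates plotted in Figure 9, the cosine measure confirms that P4 lies close to A1 (Security) while P1 does not; the helper below is a small sketch, not the paper's code.

```python
# Cosine similarity between points in the 2-D concept space of Figure 9.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

A1 = np.array([-0.35, -0.45])   # Security
P4 = np.array([-0.49, -0.53])   # Cloud Service Provider 4
P1 = np.array([-0.38,  0.78])   # Cloud Service Provider 1

print(round(cosine(A1, P4), 2))  # close to 1: P4 is strongly associated with A1
print(round(cosine(A1, P1), 2))  # negative: P1 is weakly associated with A1
```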

Figure 9, 2-Dimensional Plot of Ps and As. QoS attribute coordinates: A1(-.35, -.45), A2(-.31, -.23), A3(-.30, .08), A4(-.40, -.11), A5(-.27, .05), A6(-.22, .01), A7(-.37, -.01), A8(-.33, -.21), A9(-.34, .48), A10(-.23, .67). Service provider coordinates: P1(-.38, .78), P2(-.34, .27), P3(-.47, -.14), P4(-.49, -.53), P5(-.53, -.11).

7. Selecting Service Provider with a Set of QoS Attributes

When the system receives an application execution request with a QoS attribute set, a pseudo-service provider vector is constructed as the weighted sum of its constituent attribute QoS vectors. With appropriate rescaling of the axes, this amounts to placing the pseudo-service provider at the centroid of its corresponding QoS attribute points, which summarizes the cluster of QoS attribute points:


c = (w1 r1 + w2 r2 + … + wn rn) / Wn

where

c = the centroid, i.e. the weighted average vector of the application's required QoS vectors,

r1, …, rn = the QoS requirement vectors,

w1, …, wn = the weights of the requirement QoS vectors,

Wn = the sum of the weights; when all weights w are equal, Wn = n x w.

This pseudo-service provider is compared against all existing service providers by calculating the cosine between the pseudo-service provider vector and the existing service provider vectors as a similarity metric. The service providers with the highest cosines (the nearest vectors) to the pseudo-service provider are selected. The resulting service providers are ranked according to their closeness to the pseudo-service provider vector and merged to form the recommended service provider set. Clearly, the choice of the threshold cosine value plays a significant role in the number and the accuracy of the service providers selected. The common practice is to use a small cosine value (e.g. cosine value = 0.17, 80 degree coverage) to enable a broader initial search space, and to reduce the search space gradually as more data is accumulated to maximize accuracy.
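A sketch of this selection step is given below, under the assumption that the 2-D coordinates have already been computed as in Section 6; the function names and the weighting interface are illustrative.

```python
# A sketch of the selection step: the pseudo-service provider is the weighted
# centroid of the requested attribute vectors; providers above the cosine
# threshold are returned, best match first. Names are illustrative.
import numpy as np

def pseudo_provider(attr_coords, requested_rows, weights=None):
    """c = (w1*r1 + ... + wn*rn) / Wn for the requested attribute vectors."""
    r = attr_coords[requested_rows]
    w = np.ones(len(r)) if weights is None else np.asarray(weights, float)
    return (w[:, None] * r).sum(axis=0) / w.sum()

def select_providers(provider_coords, centroid, threshold=0.7):
    """Rank providers by cosine to the pseudo-provider, keep those above threshold."""
    cos = provider_coords @ centroid / (
        np.linalg.norm(provider_coords, axis=1) * np.linalg.norm(centroid))
    ranked = np.argsort(-cos)
    return [int(i) for i in ranked if cos[i] >= threshold]
```

For example, select_providers(provider_coords, pseudo_provider(attribute_coords, [0, 2])) would return the providers closest to the centroid of A1 and A3, using the coordinates computed in the earlier sketch.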

When an application request includes at least one new QoS attribute, the system excludes the new QoS attribute and uses only the existing QoS attributes to form the pseudo-service provider and select the recommended service provider as described in the previous paragraph. This recommended service provider is then examined by the user, who, at his or her discretion, can accept or reject it. Upon acceptance of the recommended service provider, the new QoS attribute and the selected service provider are recorded in the QP repository, which triggers the system to construct a new QP space that includes the new QoS attribute for subsequent use. As a result, new knowledge is acquired and captured by the new QP matrix.
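One plausible realization of this update, offered only as a hedged sketch, is to append the new QoS attribute as an extra row of the QP matrix, rated initially for the accepted provider alone, and then rebuild the space by SVD; the initial rating value used here is an assumption, not something specified in the paper.

```python
# Hedged sketch of the update: append the new QoS attribute as an extra row,
# rated initially only for the accepted provider (the initial rating value
# is an assumption, not specified in the paper), then rebuild the space.
import numpy as np

def add_new_attribute(qp, accepted_provider, initial_rating=3.0):
    """Return a QP matrix with one extra attribute row; other providers start at 0."""
    new_row = np.zeros((1, qp.shape[1]))
    new_row[0, accepted_provider] = initial_rating
    return np.vstack([qp, new_row])

# The QoS-provider space is then reconstructed for subsequent requests:
# E, S, At = np.linalg.svd(add_new_attribute(QP, accepted_provider=1).T,
#                          full_matrices=False)
```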

Using the SVD based service provider ranking and selection system, we can expect the collection of service providers, with each of its member service providers ranked and selected individually, to constitute the best available set for an application. This implies that a set consisting of multiple ranked service providers can in fact achieve the best match to the QoS requirements.

8. Results

With k = 2 and a threshold cosine value of 0.7, the following examples illustrate the operation of the system under different application request requirements:

1) An application request requires attributes A1 and A3 (the user is interested solely in confidentiality and availability). The system uses A1 and A3 (we can adjust the weights of A1 and A3 by applying a scaling factor to reflect their relative importance, but for simplicity we choose weight = 1 for both A1 and A3) to form the pseudo service provider represented as point F in Figure 10.

Figure 10, 2-Dimensional Plot of Ps and As with the pseudo service provider F placed at the centroid of A1 and A3 (point coordinates as in Figure 9)

Using a cosine value of 0.9 from F, P4 (service provider 4) is within the similarity space and is selected as the recommended service provider. It is also useful to note that other relevant service providers can be retrieved (such as P3, closest to the first choice P4) depending on their proximity to the pseudo service provider formed by A1 and A3. This is useful when the user needs to consider a broader set of recommended service providers. As we can see from the service provider table (Figure 5), the first ranked service provider P4 provides the best security and availability combination, while P3 is a second choice if we extend the similarity space. This is consistent with the corresponding attribute values of service providers P3 and P4.


2) An application request includes three requirements, A8, A10 and a transactional application type (the user is interested in IO response, running a transactional application, and minimum cost). A pseudo-service provider is constructed from A8 and A10 (with weight = 1 for both A8 and A10) while ignoring the “transactional application type” requirement. This is represented as point q in Figure 11, which is the centroid of vectors A8 and A10; P2 is selected as it is within the dotted cone with a cosine value of 0.7 from q.

Figure 11, 2-Dimensional Plot of Ps and As with the pseudo service provider q placed at the centroid of A8 and A10 (point coordinates as in Figure 9)

If the user accepts this recommended service provider, it triggers the recreation of the QP matrix, and new knowledge is acquired in the form of an additional attribute, “transactional application”, in the respective service provider's QoS attribute list. The interpretation is that, even though the new QoS attribute is not available in the QoS list, the recommendation is computed from the other two available QoS attributes. The new QoS attribute, upon successful execution of the application, is included in the QoS attribute list, and its relationships with the other attributes and service providers are re-established as subsequent execution results are accumulated.

9. Discussion and Conclusion

This paper introduces a system for cloud service provider ranking and selection to match application requirements using a statistical approach with SVD, together with a set of QoS attributes to describe application requirements and service provider characteristics. The SVD method provides a compact and efficient knowledge representation mechanism for QoS attributes and their relationships to cloud service providers, using matrix and dimension reduction techniques to extract meaningful relationships among QoS attributes and the corresponding service providers. It enables the selection of service providers without an exact match of the required QoS attributes. Compared to other statistical approaches, the advantages of using the SVD technique include: 1) SVD is an automatic algorithm which does not require a formal knowledge model. 2) It can be used in any application without customization. 3) It guarantees the best fit. 4) Increasing and decreasing the precision of the fit is relatively straightforward (e.g. by adjusting the threshold cosine value). 5) Theories of SVD and its variants have been well established. 6) Its use in numerous different applications proves its usefulness. 7) Tools for SVD transformation and analysis are readily available. However, for large matrices, it is computationally intensive, and updating the system [23] to reflect new data in a large matrix is a challenge. In addition, the automatically generated correlations among QoS attributes and service providers are sometimes not obvious to humans, especially when the number of orthogonal factors is large.

The initial simulated dataset yields promising results with the number of orthogonal factors k used in the reduced model chosen to be two, representing a 2-dimensional conceptual space. However, the representation of a conceptual space for any large collection of QoS metrics and service providers usually requires a fairly large number of orthogonal factors. Finding a balance point between accuracy and coverage is a challenge as well as an art. In addition, the choice of the cosine value used to locate recommended cloud service providers relative to the pseudo service provider in the QP space is also an important factor, as this parameter determines the degree of matching of the selected service providers. The relationships among the number of QoS attributes, the number of service providers, the factor k and the cosine value are interesting topics for research.

The measurement and selection of appropriate parameters for cloud service providers directly affect the selection of appropriate service providers for an application. The accurate classification of applications with respect to their requirements also plays a major role in the accurate matching of applications to the appropriate service providers. In this paper, we use generally accepted QoS practices to describe service providers and general application characteristics to describe application requirements; however, there is no standard that provides a universal description format and semantics. Research on standardizing QoS parameters for service providers and application classification semantics covers interesting and important topics which will help to advance the use of cloud computing services.

10. References

[1] R. Buyya, Y. S. Chee, and V. Srikumar, "Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities", Department of Computer Science and Software Engineering, University of Melbourne, Australia, July 2008, pp. 9.

[2] D. Chappell, "A Short Introduction to Cloud Platforms", David Chappell & Associates, August 2008.

[3] G. Gruman, "What cloud computing really means", InfoWorld, Jan. 2009.

[4] T. C. Chieu, A. Mohindra, A. A. Karve and A. Segal, "Dynamic Scaling of Web Applications in a Virtualized Cloud Computing Environment", Proceedings of the IEEE International Conference on e-Business Engineering (ICEBE 2009), Macau, China, Oct. 2009, pp. 281-286.

[5] "Director of Central Intelligence Directive 6/3", http://www.fas.org/irp/offdocs/dcid-6-3-manual.pdf

[6] D. A. Menasce, "QoS Issues in Web Services", IEEE Internet Computing, vol. 6, no. 6, Nov/Dec 2002, pp. 72-75.

[7] "Understanding Quality of Services for Web Services and Web Services QoS Requirements", http://www.ibm.com/developerworks/library/ws-quality.html

[8] "Using Policy-Based QoS to Enable and Manage WMM in Enterprise Wireless Deployments", http://blogs.msdn.com/wndp/archive/2006/06/30/653047.aspx

[9] E. Zahoor et al., "Rule-based semi-automatic Web services composition", Proc. 2009 IEEE Congress on Services I (SERVICES 2009), 2009, pp. 805-812.

[10] N. Paton et al., "Optimizing Utility in Cloud Computing through Autonomic Workload Execution", IEEE Data Engineering Bulletin, vol. 32, no. 1, March 2009, pp. 51-58.

[11] J. Zhang and R. J. Figueiredo, "Application classification through monitoring and learning of resource consumption patterns", Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2006, p. 121.

[12] J. Zhang and R. J. Figueiredo, "Autonomic Feature Selection for Application Classification", 2006 IEEE International Conference on Autonomic Computing (ICAC), 2006, pp. 43-52.

[13] A. Aboulnaga, K. Salem, et al., "Deploying Database Appliances in the Cloud", IEEE Data Engineering Bulletin, 32(1):3-12, 2009.

[14] J. E. Gentle, "Singular Value Factorization", in Numerical Linear Algebra for Applications in Statistics, Berlin: Springer-Verlag, 1998, pp. 102-103.

[15] T. Kwok and M. Perrone, "Adaptive N-Best List Handwritten Word Recognition", in Proc. of the 6th Int'l Conf. on Document Analysis and Recognition, 2001, pp. 168-172.

[16] M. E. Wall, A. Rechtsteiner, and L. M. Rocha, "Singular value decomposition and principal component analysis", http://public.lanl.gov/mewall/kluwer2002.html

[17] I. T. Jolliffe, Principal Component Analysis, New York: Springer, 1986.

[18] T. K. Landauer and S. T. Dumais, "The latent semantic analysis theory of acquisition, induction, and knowledge representation", Psychological Review, 104(2):211-240, 1997.

[19] T. Hofmann, "Latent Semantic Models for Collaborative Filtering", ACM Transactions on Information Systems, vol. 22, no. 1, Jan. 2004.

[20] H. Chan and T. Kwok, "Autonomic PD Agents using SVD for Ambiguous Situations", in Proc. of IEEE/WIC/ACM IAT, 2006, pp. 270-275.

[21] R. Berry, "Large-scale sparse singular value computations", The International Journal of Supercomputer Applications, 6(1):13-49, 1992.

[22] T. K. Landauer and S. T. Dumais, "The latent semantic analysis theory of acquisition, induction, and knowledge representation", Psychological Review, 104(2):211-240, 1997.

[23] P. Hall, D. Marshall, and R. Martin, "Merging and splitting eigenspace models", IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(9):1042-1048, 2000.
