

Time-based Criteria for Performance Comparison of Resource-Intensive User Tasks in Virtual Desktops

Mikhail Makarov (1), Prasad Calyam (2), Andrei Sukhov (3), Vitaly Samykin (3)
(1) Volga Oil-trunk Pipelines, Russia; (2) University of Missouri-Columbia, USA; (3) Samara State Aerospace University, Russia

[email protected], [email protected], [email protected], [email protected]

Abstract—In order to reduce the costs of maintaining user applications and increase their scalability and manageability, companies are increasingly implementing virtual desktop infrastructures (VDI). The most important factor in migration from traditional desktops to virtual desktops is the quality of user experience, which directly depends on thin-client protocol performance in VDIs. Performance measurement of thin-client protocols is a complex task because it involves a combined analysis of intensive resource consumption along multiple dimensions (CPU, memory, and network bandwidth) between data centers and thin-client user sites. In this paper, we create a novel method and a “VDtest” benchmarking toolkit for comparing the performance of common thin-client protocols such as RDP (Remote Desktop Protocol) and PCoIP (PC-over-IP) for resource-intensive user tasks involved in graphics-based applications, using a set of “time-based criteria”: task downloading time, application processing time, time of video data output to the thin-client console, and time for I/O operations. Through empirical data analysis relating to these criteria for JPEG-transform tasks with VDtest featuring different image sizes, we show the dominant criteria to evaluate thin-client protocol performance, and explore suitable configurations based on their influence on CPU, memory and network bandwidth resources.

I. INTRODUCTION

Today, user applications are becoming highly complex and distributed, and demand large computing and networking resources to meet remote user access needs within domains of advanced manufacturing, health care and education. Virtual Desktop Infrastructure (VDI) solutions are increasingly being deployed to reduce the costs and overheads of maintaining these user applications and to increase their scalability as well as their manageability by migrating from traditional desktops to virtual desktops (VD) [1]. Moreover, VDI solutions allow resources at data centers to be optimized for meeting resource demands that satisfy user quality of experience needs when accessing virtual desktops [2]. Currently, popular thin-client protocols widely used to access VDI environments include RDP (Remote Desktop Protocol), PCoIP (PC over IP), and RGS (Remote Graphics Software) [3].

This work was partially supported by a grant of the Russian Foundation for Basic Research (RFBR) 13-07-00381a. This material is also based upon work supported by VMware and the National Science Foundation under award numbers CNS-1342499 and CNS-1205658. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of VMware or the National Science Foundation.

However, there are new and complex challenges in delivering a satisfactory user experience within VDIs. For example, limited network bandwidth at edge networks is a critical bottleneck for thin-client protocol performance, and the overall measurement of thin-client protocol performance involves a combined analysis of intensive resource consumption along multiple dimensions (CPU, memory, and network bandwidth) between data centers and thin-client user sites. Another challenge is that different thin-client vendors publish conservative “rule-of-thumb” estimates of the required bandwidth and maximum tolerable latency of an end-to-end network connection to warrant an “impairment free” user experience, that is, one perceived as comparable to the use of a local traditional desktop environment. In the same context, selecting the appropriate resource size and hardware configuration for CPU and memory (i.e., RAM) amongst multiple choices is a challenge at the data center level.

Previous research has shown that suitable thin-client configurations and system resource allocations can give a performance advantage in terms of user experience when performing specific tasks at the user thin-client site [1] - [4]. However, performance tests of thin-client protocols and their impact on CPU and RAM have to date been conducted in the context of common user applications such as word processing, Internet Explorer, and multimedia playback. There is a clear lack of work addressing satisfactory delivery of user quality of experience for resource-intensive tasks such as those seen in graphics-intensive applications.

Building upon earlier works and to address the above gap, there is a need to develop a novel benchmarking approach that can be broadly applied to common applications and advanced graphics-intensive applications in virtual desktops with different thin-client configurations and VDI environments, analogous to the ‘LINear equations software PACKage’ (LINPACK) tests used in challenges to compare High-Performance Computing (HPC) systems. Given the burstiness of virtual desktop resource usage due to user behavior that involves intermittent application context switches, and the demanding user expectations for real-time interaction with remotely hosted applications in virtual desktops, several significant criteria need to be jointly considered to derive a general benchmarking approach. The interaction demands between users and applications in virtual desktops motivate using “time-based” criteria to gain insight into the impact on user experience.


In this paper, we present a novel benchmarking method and a related “VDtest” toolkit for comparing the performance of popular thin-client protocols such as RDP and PCoIP, specifically for resource-intensive user tasks involved in graphics-based applications. The method and toolkit use a set of “time-based criteria”: task downloading time, application processing time, time of video data output to the thin-client console, and time for I/O (input/output) operations. Through empirical data analysis relating to these criteria for JPEG-transform tasks with VDtest featuring different image sizes, we show the dominant criteria to evaluate thin-client protocol performance, and explore suitable configurations based on their influence on CPU, memory and network bandwidth resources. Although our method and toolkit are primarily applied to the JPEG-transform task that uses a variation of the discrete cosine transform (DCT) algorithm in [5], we remark that they are easily extensible to other common user application contexts. Further, our experiments and results provide insights into the selection of thin-client protocol configurations in various resource-intensive user task contexts, and guidance for troubleshooting performance bottlenecks in one or more resource dimensions in VDI environments.

The remainder of the paper is organized as follows: Section II presents an application workload overview and details our VDI testbed setup and VDtest toolkit implementation for empirical data collection. Section III describes our experiment results relating to our investigation of resource-intensive user tasks. Section IV concludes the paper and presents future work.

II. TESTBED SETUP AND APPLICATION WORKLOAD

VDI enables creation of virtual desktop pools on one or more servers running a number of virtual machines. In order to access the required virtual desktop, a user authenticates via a thin-client against an identity management system such as Active Directory or LDAP, and selects an entitled virtual desktop from a corresponding pool. Figure 1 shows our closed-network LAN testbed setup for experiments, which does not feature artificial or congestion-caused packet loss. It features a thin-client and a resource-intensive application workload on a virtual desktop that is benchmarked using our VDtest tool and monitoring analysis scripts. Details of our testbed configuration, VDtest tool implementation and benchmarking methodology are presented in the following sub-sections.

A. Hardware and Software Configuration

Server-side configuration of our testbed included VMware vSphere with ESXi hypervisor, a VMware View Connection Broker, and a Quest Software vWorkspace Connection Broker. The virtual desktop platform specifications were: Windows XP operating system provisioned with a single-core 3 GHz CPU, 1 Gbps NIC and 1 GB of RAM. The thin-client device was a Wyse V90LE with a preinstalled Microsoft Windows XP SP3 Embedded operating system. The thin-client was configured with two RDP protocol implementations, one from Quest Software (RDP Quest) and another from VMware (RDP VMware), as well as with the Teradici PCoIP protocol. Both RDP Quest and RDP VMware are TCP-based, whereas PCoIP is UDP-based and uses as much bandwidth as is available on the network path. It is important to note that these thin-client protocols, as well as other popular thin-client protocols such as Citrix HDX and HP RGS, use various algorithms for coding and data compression between server and client, and are optimized for resiliency against network health fluctuations. For example, RDP Quest is a variant of the Microsoft RDP implementation with enhancements such as EOP (Experience Optimized Protocol) graphics acceleration.

Fig. 1: Testbed Setup

Fig. 2: Test application workflow

B. VDtest Tool

As shown in Figure 1, the thin-client contains a removable flash drive with uncompressed image components, which are accessible at the virtual desktop through a USB Redirection feature that creates a tunnel connection between the thin-client user site and the VDI server side. The VDtest tool uses the image components and performs a JPEG transformation [5] as a test application based on a Microsoft Excel spreadsheet that contains the matrix of the original image. We chose such an application workload because it involves the popular Excel spreadsheet tasks that are commonly performed by users accessing virtual desktops. We also use the Excel program to perform graphics-intensive processing that creates significant load on the CPU and RAM resources allocated to the virtual desktop. The output of the transformation, which uses a variation of the discrete cosine transform (DCT) algorithm, is the matrix of the compressed image. The sequence of actions performed in the test application workflow is shown in Figure 2.

The graphical user interface of the VDtest tool that we developed as a contribution of this paper is shown in Figure 3. It falls under ‘synthetic workload generation based measurement and monitoring’ in the taxonomy of cloud monitoring [6] and helps with repeatability of experiments and data analysis with the test application. The VDtest user interface allows us to choose the initial image, to enable or disable optional features such as the 3-channel transform, and to redirect the output of video data to the console. It also has buttons to observe all intermediate results, such as the matrix of a raw image, the matrix of coefficients after the DCT, and the optimized matrix after rounding and compressing. Lastly, the result button allows us to open measurement results such as the total execution time of the VDtest code, the time of the basic nested loops, and the run-time of I/O with spreadsheets.

Fig. 3: VDtest: Graphical User Interface

Fig. 4: Sequence of steps in an experiment run

Screen scrapes of the JPEG transformation process, which involve video output from the server side to the thin-client during table processing, create significant load on the network bandwidth as well, as seen from our experiment results described in Section III. Thus, we were able to use a resource-intensive application task for benchmarking with the VDtest tool and apply significant load to CPU, RAM and network bandwidth in order to study how load on one resource affects the other two in terms of time-based criteria.

C. Performance Benchmarking Method

Figure 4 shows the sequence of steps in an experiment run for a particular thin-client protocol configuration and image size. The monitoring of resources is started and timestamped when a connection to a virtual machine to access the test application is invoked. The monitoring continues during the various VDtest tool processing steps: the upload of the raw image from the thin-client to the virtual machine with the VD, the processing of the raw image, and the results collection, followed by disconnection of the VD session. The measurement results during the entire experiment run are collected using a result collection protocol in VDtest that relies on common utilities provided with the hypervisor software. The results are then used to profile the resource consumption using analysis scripts, and a measurement report is generated.

TABLE I: Time-based criteria to measure VD performance for a resource-intensive user task

Notation   Definition
t0         Time to download a test task to a virtual desktop
t1         Time to process basic VDtest actions
t2         Transfer time for video data output to the thin-client console
t3         Processing time of standard I/O operations
t_sum      Total measurement time

Our performance benchmarking method divides the total execution time of the test application into individual action time periods. The time periods shown in Table I correspond to the various time-based criteria, whose measured lengths change under different test conditions and thereby act as performance indicators. More specifically, time periods longer than the baseline length under initial (ideal) test conditions indicate degraded performance. Such experiment runs are repeated and results are averaged (the default setting in each of our experiments is 10 runs) for different thin-client configurations such as RDP VMware, RDP Quest and PCoIP, as well as for the different image sizes shown in Table II.

TABLE II: Sets of the source image data

Data Set   Image Size, KB
S1         48
S2         96
S3         144
S4         192
S5         240
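The decomposition of a run into the periods of Table I, averaged over repeated runs, can be sketched as follows. This is a minimal Python illustration and not the VDtest implementation: the phase callables are hypothetical placeholders, and the total is taken here as the plain sum of the four measured periods, which is an assumption.

```python
import time
from statistics import mean

PHASES = ("t0", "t1", "t2", "t3")  # notation from Table I


def timed_run(phase_actions):
    """Run each phase once and return its wall-clock duration in seconds."""
    durations = {}
    for name in PHASES:
        start = time.perf_counter()
        phase_actions[name]()  # hypothetical callable standing in for the real phase
        durations[name] = time.perf_counter() - start
    # Assumption: the total is the sum of the individual periods.
    durations["t_sum"] = sum(durations[name] for name in PHASES)
    return durations


def average_runs(phase_actions, runs=10):
    """Repeat the experiment and average every criterion (10 runs by default, as in the text)."""
    results = [timed_run(phase_actions) for _ in range(runs)]
    return {key: mean(r[key] for r in results) for key in results[0]}


if __name__ == "__main__":
    # Placeholder workloads; a real run would upload, transform, display and save an image.
    actions = {name: (lambda: sum(range(10_000))) for name in PHASES}
    averaged = average_runs(actions, runs=3)
    for key, value in averaged.items():
        print(f"{key}: {value:.6f} s")
```

Comparing the averaged periods against a baseline run under ideal conditions then indicates which criterion has lengthened.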

III. EXPERIMENT RESULTS AND ANALYSIS

Performance of a VDI is measured by three dependent resource types: CPU, memory and network bandwidth consumption. However, to quantify thin-client protocol performance, it is preferable to determine the relative importance of each of the resource types for perceived performance, and to use a generalized set of “time-based” criteria as a unifying set of metrics with a common scale (or measurement units). In the following subsections, we first discuss the generalized criteria. Subsequently, we show the dominant criteria to evaluate thin-client protocol performance, and lastly explore suitable configurations based on their influence on CPU, memory and network bandwidth resources.

A. Generalized Criteria

The approach of using the generalized criteria helps to compare VDI environments that have different resource specifications with a common set of rules. It particularly helps in performance comparison of thin-client protocols and in answering questions such as: “What are the dominant performance metrics or time-based criteria to assess the impact of load and other bottlenecks on the user quality of experience perceived in a VDI?” We derive generalized criteria in our study based on data obtained from the experiments described in Section II. In the data sets, we assume that the total time of the measurement is proportional to the complexity of the test tasks performed by the application. For a given task with a particular image size, Equation (1) below shows how the performance Q_S of thin-client protocols as perceived by a thin-client user is inversely proportional to t_sum, which is inherently composed of a sum of criteria with different weight coefficients. The more dominant criteria are assigned higher weights, and thus have a greater influence on Q_S.

\[
Q_S \propto \frac{1}{t_{sum}} \propto \frac{1}{\sum_{i=1}^{n} k_i t_i} \tag{1}
\]

where t_i is the i-th time component of the generalized criteria corresponding to entries in Table I; k_i is the weighting factor of the i-th component of the generalized criteria; S is the size of the initial matrix, i.e., one of the image sizes in Table II.
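As a worked illustration of Equation (1), the sketch below evaluates the reciprocal of the weighted sum for one set of component times. The paper does not report values for the weighting factors, so both the times and the weights here are hypothetical and serve only to show how a heavier weight on one component lowers the resulting score.

```python
def generalized_score(times, weights):
    """Return 1 / sum(k_i * t_i), the quantity Q_S is proportional to in Equation (1)."""
    if len(times) != len(weights):
        raise ValueError("times and weights must have the same length")
    weighted_total = sum(k * t for k, t in zip(weights, times))
    if weighted_total <= 0:
        raise ValueError("weighted total time must be positive")
    return 1.0 / weighted_total


# Hypothetical values in seconds for t0..t3; not measurements from the paper.
times = [2.0, 5.0, 20.0, 1.0]
equal_weights = [1.0, 1.0, 1.0, 1.0]  # reduces to 1 / (t0 + t1 + t2 + t3)
video_heavy = [1.0, 1.0, 2.0, 1.0]    # assumed: t2 weighted twice as heavily

print(generalized_score(times, equal_weights))
print(generalized_score(times, video_heavy))
```

With equal weights the weighted sum is 28 s; doubling the weight of t2 raises it to 48 s, so the score drops accordingly.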

The determination of the true closed-form expression for the generalized criteria requires carrying out several additional measurements in the large sample space of CPU, memory and network bandwidth; such experimentation is beyond the scope of this paper and is part of our future work. The additional measurements need to be made in small steps that change the parameters of one of the three factors (CPU, RAM, NET) at a time. Next, we need to construct graphs Q(CPU), Q(RAM), Q(NET) using objective or subjective (involving actual human subjects) testing. A step change in each of the three quantities (CPU, RAM and network bandwidth) makes it possible to trace the effect of changing one factor and its impact on the other two. Finally, weights need to be set such that it is possible to express the impact of the CPU, RAM and network bandwidth on each other in the corresponding dimension.
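The stepwise procedure outlined above, in which one factor is varied while the other two stay fixed, might be organized as in the following sketch. The function `synthetic_q` is a made-up stand-in for a real objective or subjective measurement and is not a model from this paper; the baseline values mirror the virtual desktop specification in Section II, and the step values are arbitrary.

```python
def sweep_one_factor(measure_q, baseline, factor, values):
    """Vary a single factor while holding the others at their baseline values.

    Returns a list of (value, Q) pairs, i.e. the points of one curve such as Q(CPU).
    """
    curve = []
    for value in values:
        config = dict(baseline)  # copy so the baseline is never mutated
        config[factor] = value
        curve.append((value, measure_q(config)))
    return curve


def synthetic_q(config):
    """Hypothetical stand-in for a real measurement; NOT a model from the paper."""
    t_sum = 10.0 / config["cpu_ghz"] + 4.0 / config["ram_gb"] + 200.0 / config["net_mbps"]
    return 1.0 / t_sum


if __name__ == "__main__":
    baseline = {"cpu_ghz": 3.0, "ram_gb": 1.0, "net_mbps": 1000.0}
    steps = {
        "cpu_ghz": [1.0, 1.5, 2.0, 2.5, 3.0],
        "ram_gb": [0.5, 1.0, 1.5, 2.0],
        "net_mbps": [10.0, 100.0, 1000.0],
    }
    for factor, values in steps.items():
        for value, q in sweep_one_factor(synthetic_q, baseline, factor, values):
            print(f"Q({factor}={value}) = {q:.4f}")
```

Each sweep yields the points of one of the graphs Q(CPU), Q(RAM) or Q(NET); fitting weights to those curves is the part left to future work.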

B. Dominant Component Identification

Even without a true closed-form expression, examining the dominant components of the generalized criteria can provide useful insights for performance bottleneck troubleshooting and for adapting the corresponding resources to improve user experience.

At the outset, the relative run-time (i.e., t_sum) performance comparison of the generalized criteria components for the different thin-client protocols showed that PCoIP in each case exhibits shorter run-times than either RDP VMware or RDP Quest, which have comparable performance. Figure 5 shows the structure of a few of the dominant generalized criteria for the sizes S1 (the smallest size we considered) and S5 (the largest size we considered) of initial matrices for the PCoIP, RDP VMware and RDP Quest protocols. Analyzing the structure of these criteria, we conclude that the main dominant criterion component is clearly the video output time t2, as seen from its high range of values (20-80%). This criterion needs to be carefully monitored in VDI systems as it determines the performance of the thin-client protocol when CPU and RAM as well as the network bandwidth are significantly loaded.

Fig. 5: The structure of generalized criteria for sets S1 and S5

Fig. 6: Breakdown structure of the generalized criteria for sets S1 and S5. Panels: (a) PCoIP: S1; (b) PCoIP: S5; (c) RDP VMware: S1; (d) RDP VMware: S5

Figure 6 shows the breakdown structure of all generalized criteria for the PCoIP and RDP VMware protocols for the sizes S1 and S5 of initial matrices. We can see from Figures 6a and 6c that if the option of extended video output is enabled, then for small matrices the time criterion t2 is still the predominant factor that impacts the test application the most. For large matrices, the relevant components have a similar ratio, as seen in Figures 6b and 6d, and the weight of t2 is still the most dominant. The time for processing the main cycles of the application (i.e., t1), which is the basis for the DCT transformation, is about the same for all protocols for a given hardware configuration and does not have a significant effect on the user quality of experience. Observing the experiment results for components of the generalized criteria such as downloading the uncompressed image to a virtual machine (t0), processing of the main loops of the VDtest application (t1) and the time allotted for the I/O operations (t3), we can see that their differences in the test application amount to no more than 6% of the total time.
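The kind of breakdown discussed above, where each component is expressed as a share of the total time and the largest share is taken as dominant, can be sketched as follows. The timings are hypothetical and are not read from Figures 5 or 6; the total is again assumed to be the sum of the four components.

```python
def component_shares(times):
    """Return each component's share of the total time, in percent."""
    total = sum(times.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    return {name: 100.0 * value / total for name, value in times.items()}


def dominant_component(times):
    """Return the name of the component with the largest share."""
    shares = component_shares(times)
    return max(shares, key=shares.get)


# Hypothetical timings in seconds; not taken from the paper's figures.
example = {"t0": 3.0, "t1": 6.0, "t2": 40.0, "t3": 1.0}
shares = component_shares(example)
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
print("dominant:", dominant_component(example))
```

Running the same breakdown per protocol and per image size gives a quick check of whether the video output time still dominates a given configuration.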

C. Influence on CPU utilization, RAM and Network Traffic

In order to evaluate the resource consumption performance of thin-client protocols, it is apt to represent the obtained data using the 3-dimensional diagrams shown in Figure 7, as done similarly in [3]. From these diagrams it is clear that the CPU is fully loaded (97-99%) only when the PCoIP protocol is used. The RDP Quest and RDP VMware protocols have average integral values of this parameter of 54% and 47%, respectively. Even though there can be specific RDP optimizations for codecs, such as in the case of RDP Quest, to optimize user quality of experience, they do not exhibit significant influence on the resource consumption profile characteristics.

Fig. 7: Performance comparison of resource consumption of the thin-client protocols. Panels: (a) Set S1; (b) Set S5
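One plausible reading of an "average integral value" of CPU load is the time-weighted mean of the sampled load, that is, the integral of the load curve divided by the duration of the run. The paper does not state how the value was computed, so the trapezoidal rule in the sketch below is an assumption, and the monitoring trace is made up for illustration.

```python
def time_averaged_load(samples):
    """Time-weighted mean of (timestamp_seconds, load_percent) samples.

    Integrates the load curve with the trapezoidal rule and divides by the
    elapsed time. Samples must be sorted by strictly increasing timestamp.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    area = 0.0
    for (t_prev, load_prev), (t_next, load_next) in zip(samples, samples[1:]):
        if t_next <= t_prev:
            raise ValueError("timestamps must be strictly increasing")
        area += 0.5 * (load_prev + load_next) * (t_next - t_prev)
    duration = samples[-1][0] - samples[0][0]
    return area / duration


# Hypothetical monitoring trace: (seconds since connection, CPU load in %).
trace = [(0.0, 10.0), (5.0, 60.0), (15.0, 60.0), (20.0, 10.0)]
print(f"{time_averaged_load(trace):.1f}%")
```

A time-weighted mean is preferable to a plain mean of the samples when the monitoring utility does not sample at a fixed interval.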

Figures 8(a) and 8(b) show general characteristics of the data rate behavior (i.e., bandwidth consumption) for the RDP Quest and PCoIP thin-client protocols used in the test application, respectively. The behavior is shown for components of the time criteria such as loading the task into the virtual machine (first peak) and the subsequent transfer of the video to the thin-client. Network bandwidth is best used by the PCoIP protocol, which efficiently organizes data in the forward and reverse directions. With reference to the CPU load characteristics, we observe a gradual plateau for the family of RDP protocols (RDP VMware data not shown due to space constraints of the paper), while PCoIP has a clear plateau during the data transfer to the thin-client. CPU consumption for RDP is lower on average at a given instant, but is more extended in time. In contrast, PCoIP consumes more CPU to speed up the video output in order to maximize user experience. Further, the experiment results for the consumption of RAM shown in Figure 9 indicate that PCoIP is again more efficient, and that RAM consumption depends on the image size being processed: it increases slightly and approximately linearly with image size for all the protocols. Thus, in comparison to the family of RDP protocols, PCoIP shows aggressive yet efficient use of CPU, network bandwidth and RAM resources in the context of resource-intensive user tasks in virtual desktops.

IV. CONCLUSION

In this paper, we presented a novel approach to measure the performance of a VDI when supporting resource-intensive user tasks involved in graphics-based applications. The measurements were analyzed in terms of a set of generalized time-based criteria obtained by dividing the total execution time of a test application into individual action time periods. Using a testbed that featured our “VDtest” tool and related analysis scripts, we leveraged our approach for performance comparison of popular thin-client protocols such as RDP and PCoIP, which depend on resource dimensions such as CPU, memory and network bandwidth consumption. Through experiments with the test application in VDtest that involved a set of JPEG-transform tasks on raw image matrices featuring different image sizes, we showed that the most dominant criterion to evaluate thin-client protocol performance was the transfer time for video data output from the server side to the thin-client console. In addition, we found that the PCoIP protocol, in comparison to the RDP protocol variants, showed aggressive yet efficient use of CPU, network bandwidth and RAM resources.

Fig. 8: Data rate and CPU load for thin-client protocols during test application execution. Panels: (a) RDP Quest; (b) PCoIP

Fig. 9: Usage of RAM for different thin-client protocols

Our future work is the determination of the true closed-form expression for the generalized criteria, which requires carrying out several additional measurements in the large sample space of CPU, memory and network bandwidth.

REFERENCES

[1] P. Calyam, R. Patali, A. Berryman, A. Lai, R. Ramnath, “Utility-directed Resource Allocation in Virtual Desktop Clouds”, Elsevier COMNET, 2011.

[2] L. Deboosere, B. Vankeirsbilck, P. Simoens, et al., “Cloud-based Desktop Services for Thin Clients”, IEEE Internet Computing, 2011.

[3] A. Lai, J. Nieh, “On The Performance Of Wide-Area Thin-Client Computing”, ACM TOCS, 2006.

[4] J. Kouril, P. Lambertova, “Performance Analysis and Comparison of Virtualization Protocols, RDP and PCoIP”, Proc. of ICCOMP, 2010.

[5] G.K. Wallace, “The JPEG Still Picture Compression Standard”, Communications of the ACM, 1991.

[6] G. Aceto, A. Botta, W. Donato, A. Pescape, “Cloud Monitoring: A Survey”, Elsevier COMNET, 2013.