a scalable blackbox-oriented e-learning system based on desktop grid over private cloud

10
Future Generation Computer Systems 38 (2014) 1–10 Contents lists available at ScienceDirect Future Generation Computer Systems journal homepage: www.elsevier.com/locate/fgcs A scalable blackbox-oriented e-learning system based on desktop grid over private cloud Lung-Pin Chen a,, Jien-An Lin a , Kuan-Ching Li b , Ching-Hsien Hsu c , Zhi-Xian Chen a a Department of Computer Science, TungHai University, Taiwan, ROC b Department of Computer Science and Information Engineering, Providence University, Taiwan, ROC c Department of Computer Science and Information Engineering, Chung Hua University, Taiwan, ROC highlights Enabling e-learning users to contribute to the server computation via desktop grid. Developing agent-based worker deployment for the workers hosted in VMs. Developing new ‘‘workunit-based scheduling algorithm’’ for assigning tasks to workers of different capability. article info Article history: Received 7 December 2011 Received in revised form 10 February 2014 Accepted 24 February 2014 Available online 20 March 2014 Keywords: Cloud computing Desktop grid E-learning Volunteer computing abstract Traditional web-based e-learning system suffers from unstable workloads and security risks of incorpo- rating external executable objects to servers. This paper addresses these issues with emerging technolo- gies, as desktop grid and cloud computing. Learning users are motivated to be volunteers by hosting the virtual machines equipped with e-learning desktop grid applications. We develop components to inte- grate the e-learning system and desktop grid into the circumstance in which each user serves not only a task producer, but also a volunteer that solves tasks. In order to enhance the responsiveness between the passive desktop grid server and e-learning system, we have also developed asynchronous processes to enable the server and volunteer workers to cooperate in a tightly coupled manner. The system achieves the scalability by maintaining the ratio between the number of volunteers and the number of online users beyond certain threshold. © 2014 Elsevier B.V. All rights reserved. 1. Introduction E-learning is a rapid growing and important application in the Internet and is widely used in many enterprises and educational communities [1]. With the continuous evolution of the Internet, there is an increasing demand for incorporating online practice or online certification into traditional e-learning systems. This work focuses on a web-based computer language e-learning system that allows users to upload their programs to the server where the code is parsed, performed and evaluated. The client–server computing model provides the economical and robust medium for adaptive learning management. However, Correspondence to: Department of Computer Science, TungHai University, No. 1727, Sec. 4, Taiwan Boulevard, Xitun District, Taichung 40704, Taiwan, ROC. Tel.: +886 4 23590415; fax: +886 4 23591567. E-mail address: [email protected] (L.-P. Chen). system non-responsiveness is often experienced since the client request rate may exceed the fixed server capacity. A traditional so- lution is to build a dedicated server cluster with a task dispatcher. Since users can submit tasks intermittently in an e-learning sys- tem, a server may stay idle during non-training time. Today, many research works have been devoted to various e-learning systems. Instead of developing new functionalities, this work attempts to explore the feasibility of building a low-cost and scalable e-learning system via user cooperation. The idea is to gather computing resources from e-learning users by using the recently emerged volunteer computing and cloud computing techniques [2–6]. Volunteer computing is a type of distributed computing that allows people donating their computing resources to certain projects [7,8]. BOINC is a volunteer computing platform that has demonstrated its efficacy for large scientific research projects in fields as biology, medicine, astrophysics, engineering and climate study [9,10]. By breaking a monolithic job into a large number http://dx.doi.org/10.1016/j.future.2014.02.017 0167-739X/© 2014 Elsevier B.V. All rights reserved.

Upload: zhi-xian

Post on 30-Dec-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Future Generation Computer Systems 38 (2014) 1–10

Contents lists available at ScienceDirect

Future Generation Computer Systems

journal homepage: www.elsevier.com/locate/fgcs

A scalable blackbox-oriented e-learning system based on desktop gridover private cloudLung-Pin Chen a,∗, Jien-An Lin a, Kuan-Ching Li b, Ching-Hsien Hsu c, Zhi-Xian Chen a

a Department of Computer Science, TungHai University, Taiwan, ROCb Department of Computer Science and Information Engineering, Providence University, Taiwan, ROCc Department of Computer Science and Information Engineering, Chung Hua University, Taiwan, ROC

h i g h l i g h t s

• Enabling e-learning users to contribute to the server computation via desktop grid.• Developing agent-based worker deployment for the workers hosted in VMs.• Developing new ‘‘workunit-based scheduling algorithm’’ for assigning tasks to workers of different capability.

a r t i c l e i n f o

Article history:Received 7 December 2011Received in revised form10 February 2014Accepted 24 February 2014Available online 20 March 2014

Keywords:Cloud computingDesktop gridE-learningVolunteer computing

a b s t r a c t

Traditional web-based e-learning system suffers from unstable workloads and security risks of incorpo-rating external executable objects to servers. This paper addresses these issues with emerging technolo-gies, as desktop grid and cloud computing. Learning users are motivated to be volunteers by hosting thevirtual machines equipped with e-learning desktop grid applications. We develop components to inte-grate the e-learning system and desktop grid into the circumstance in which each user serves not only atask producer, but also a volunteer that solves tasks. In order to enhance the responsiveness between thepassive desktop grid server and e-learning system, we have also developed asynchronous processes toenable the server and volunteer workers to cooperate in a tightly coupled manner. The system achievesthe scalability bymaintaining the ratio between the number of volunteers and the number of online usersbeyond certain threshold.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

E-learning is a rapid growing and important application in theInternet and is widely used in many enterprises and educationalcommunities [1]. With the continuous evolution of the Internet,there is an increasing demand for incorporating online practice oronline certification into traditional e-learning systems. This workfocuses on aweb-based computer language e-learning system thatallows users to upload their programs to the server where the codeis parsed, performed and evaluated.

The client–server computing model provides the economicaland robust medium for adaptive learning management. However,

∗ Correspondence to: Department of Computer Science, TungHai University, No.1727, Sec. 4, Taiwan Boulevard, Xitun District, Taichung 40704, Taiwan, ROC. Tel.:+886 4 23590415; fax: +886 4 23591567.

E-mail address: [email protected] (L.-P. Chen).

http://dx.doi.org/10.1016/j.future.2014.02.0170167-739X/© 2014 Elsevier B.V. All rights reserved.

system non-responsiveness is often experienced since the clientrequest rate may exceed the fixed server capacity. A traditional so-lution is to build a dedicated server cluster with a task dispatcher.Since users can submit tasks intermittently in an e-learning sys-tem, a server may stay idle during non-training time.

Today, many research works have been devoted to variouse-learning systems. Instead of developing new functionalities, thiswork attempts to explore the feasibility of building a low-costand scalable e-learning system via user cooperation. The idea isto gather computing resources from e-learning users by usingthe recently emerged volunteer computing and cloud computingtechniques [2–6].

Volunteer computing is a type of distributed computing thatallows people donating their computing resources to certainprojects [7,8]. BOINC is a volunteer computing platform that hasdemonstrated its efficacy for large scientific research projects infields as biology, medicine, astrophysics, engineering and climatestudy [9,10]. By breaking a monolithic job into a large number

2 L.-P. Chen et al. / Future Generation Computer Systems 38 (2014) 1–10

of units and distributing into different desktop computers calledworkers, BOINC can perform jobs efficiently via massive paral-lelism. Cloud computing [2] is a computing paradigm that de-scribes a model of supplement, consumption, and delegation ofcomputing resources to the users on-demand. Cloud computingplatform is usually viewed as a management system of a pool ofvirtualized resources [2]. Virtualization brings many benefits formanipulating computing resources across complex and heteroge-neous environments.

In this work, the e-learning desktop grid refers to a VM infras-tructure built on top of the user’s computers, in which each user isnot only a generator of tasks, but also a volunteer to solving tasks.The following main issues are addressed.Promotion. Motivating has been always critical for volunteercomputing. A naive restrict approach is to prohibit users from sub-mitting new tasks unless they contribute their resources propor-tionally. In this paper, we adopt the non-restrict approach whichmotivates users via user credits in the e-learning web site.Performance. In the e-learning private cloud, user-perceived re-sponse time is a major concern since the workers are voluntaryand can disconnect spontaneously. The servers’ responsiveness di-rectly depends on the ratio between the number of online learnersand the number of online volunteers, which can be controlled viathe restrict or non-restrict promotion policies mentioned above.We introduce some high capability official workers and develop anew scheduling algorithm to integrate the official and volunteerworkers efficiently.Security. Unlike normal volunteer computing platformwhere tasksare normally generated in authenticated servers, the tasks inthe e-learning system are initiated by those in-trained learners.Executing the e-learning tasks may lead to unexpected resultsin the volunteer’s computer. In addition, to prevent plagiarism,volunteers must be prohibited from inspecting the code delegatedto their computers.

To the best of our knowledge, the notion of gathering com-puting resources from e-learning users to maintain system re-sponsiveness is original. There are several existing web-basede-learning systems that provide automated and distributed onlinejudge functionalities [11–13]. These systems employ more serversas the system scale increases. In this paper, the e-learning desktopgrid achieves scalability by involving user computers. Most cloudcomputing platforms assume that VM hosts are directly controlledby a front-end server, which is not applicable in volunteer com-puting environment. In [14,15], the authors developed the toolsfor automated deployment of VMs to the computers owned byanonymous volunteers. Compared to their works, our deploymentcomponent is more complex since the system has to maintain amembership mapping between the e-learning system and the vol-unteer computing platform.

The proposed architecture allows participants not only to do-nate their resources but also to use others’ as well. When em-ploying a broker, the participants form a desktop grid communitythat helps members to solve overloaded large-scale tasks such ascomputational science projects or artificial intelligence game treesearches. Developing a resource broker that can fairly and effi-ciently assign resource for the desktop grid community will be afuture work.

The remaining of this paper is organized as follows. Section 2introduces the backgrounds of e-learning systems. Section 3discusses the system architecture and implementation issues.Section 3.4 discusses the scheduler for assigning tasks amongofficial and volunteer workers. Finally, conclusion remarks aremade in Section 5.

2. Blackbox-oriented e-learning

This work models an e-learning system as a blackbox-orientedsystem—a platform representing the learning contents as a

composition of blackboxes, which are software components whoseinternals cannot be viewed. The term ‘‘blackbox’’ indicates that thesystem only considers the input/output of learner behaviors andignores completely other details. This model has been significantlyadopted in many organizations and is referred by different termi-nologies, such as outcome-based education or performance-basededucation, etc. [16]. In this paper, we adopt the name ‘‘blackbox’’in terms of software technologies.

An invocation of blackbox bwith input data I generates the out-put referred to by invoke(b, I). We assume that blackboxes are de-terministic. That is, a blackbox always produces the same result forthe same input data. Therefore, the behaviors of a blackbox cannotbe dependent on time-related functions or random-number func-tions.

In the perspective of computer language education, interactivecode review can conduct learning context more effectively. Thisapproach is referred to as the whitebox-oriented e-learning. How-ever, it is also considered difficult and costly to be implemented. Infact, the whitebox-oriented e-learning can be emulated by usingblackboxes with adapted granularity (fine grain vs. coarse grain).In the extreme case, given a sequence of program instructions, thesystem can create one blackbox for each single instruction to forceusers to examine the program line-by-line. On the other hand, con-structing blackboxes with coarse grain leads to a result-orientedlearning approach.

Fig. 1 illustrates the architecture of a blackbox-oriented LMS(Learning Management System). Each learning object in the DAGis associated with an instructor blackbox (or T-box) as well as a setof input data. To complete the lesson, a user has to construct auser blackbox (or U-box) and try to make it act like the T-box. Achecker is a program to determine whether two blackboxes actlike each other for the same input data. In this work, we usean elementary checker which simply uses the string comparisonfunctions to judge the equality of outputs of two blackboxes. Thechecker can be customized to evaluate blackboxes based on exacttextual matches or approximate pattern matches, depending onindividual e-learning application requirements.

When navigating the learning contents, a user is not allowed toadvance to objectw in theDAGunless all the prerequisite objects ofw (i.e., all the objects with edge tow) are completed. Theoretically,if the cardinality of input data is large enough and theU-box alwaysgenerates expected answers that can be approved by the checker,then user k has a high probability of successfully completing thetopic w. For each user, the learning progress is the collection of theobjects completed by the user.

This paper implements the LMS shown in Fig. 1 as a web-based e-learning system [17]. In this system, each T-box representsthe instructional materials of a topic of computer language (C++or Java), including the textual and illustrative explanations, thesource code of the standard answer, and a set of input datafor learning evaluation. The teacher can upload the instructionalmaterials and maintain their precedence relationships up to theLMS via the web-based administrative interface. To complete acourse, a user has to upload aU-boxwhich contains the source codeof the answer for the topic. The server then compiles, executes,and invokes the checker to validate the answer. The user learningprogress is updated if the quality of the U-box is approved. In thisresearch paper, the implementation of the e-learning web site isbeyond the scope of this article and will be omitted.

3. System design and development

3.1. System architectureThis subsection gives an overview and depicts the specific

requirements for the e-learning desktop grid. The system containsan e-learning web server, a BOINC server and the worker control

L.-P. Chen et al. / Future Generation Computer Systems 38 (2014) 1–10 3

Fig. 1. The architecture of a blackbox-oriented e-learning system.

Fig. 2. Architecture of the e-learning desktop grid.

components, as illustrated in Fig. 2. Depending on which serveris registered, a user can be an e-learning user, a BOINC user, orboth. The computers owned by e-learning users and BOINC usersare respectively referred to as user computers (UCs) and workers.The workers are further categorized into volunteer workers (VCs),which are user computers participating the BOINC desktop grid,and official workers (OCs), which are server computers maintainedby the system administrator.

Fig. 2 illustrates the life cycle of task execution involvinge-learning web site, desktop grid, and users. The first step isthe transformation of e-learning tasks generated by e-learningusers to the BOINC tasks. A daemon in the BOINC server, named

WorkGenerator, periodically checks the database and performstask transformations. Formal discussion of task transformation iscovered in Section 3.2. Upon checking the e-learning database,WorkGenerator also tries to bring up some official workers tomaintain the minimal operating capability on the BOINC server.

The above WorkGenerator, transforming tasks between thedatabases, provides a loosely coupled integration without systeminternal modification. However, to construct a practical e-learningdesktop grid, more issues need to be addressed.

(1) Membership binding: the system can recognize the accountnames in different systems that refer to an identical user.

4 L.-P. Chen et al. / Future Generation Computer Systems 38 (2014) 1–10

(2) Credit feedback: the course instructor can feedback the BOINCcredits to the e-learning system since the credits represent theamount of user contribution to the computational tasks.

(3) Security: tasks submitted to a worker must be hidden fromthe view of this worker, which is usually also a learning user.In contrast, the worker computer needs to be prevented fromintrusion when running volunteer tasks.

(4) Scalability: the system remains responsive in different systemscales.

These properties are not orthogonal. The membership bindingproperty not only prevents security risks from anonymous Inter-net publicity, but also enables the feedback of the volunteer con-tribution to the e-learning system. This researchwork develops thecomponents for integrating the e-learning web site and the desk-top grid that satisfy the above properties.

In this architecture, we disable the default registration inter-face provided by the BOINC server. Hence, the only way a usercan participate in the desktop grid is through the AutoDeploy-ment component—a software agent that is downloaded to a usercomputer to perform all the necessary steps for deploying a BOINCclient. The agent is basically a passive component which flatlyreads and performs the instruction sequences sent by the con-troller. The agent brings a VM instance to the user computer,prompts the e-learning account information and then makes theuse of it to configure the BOINC worker hosting in this VM. Withthis approach, the membership binding property is satisfied andconsequently the BOINC credits can be precisely feedback to thelearning system.

3.2. Execution of E-learning tasks

Volunteer computing heavily relies on the donation of com-puter CPU cycles by public participants whose identities are un-known. In this way, the resource availability must be taken intoconsideration before operating on these resources. Redundant com-puting is used to address the issues to the volunteer computing,including malicious attacks, network latency, and hardware mal-functions. Thismechanism involves dispatching two ormore repli-cas of a same job to different computers. A job is complete and thecorresponding results are valid only if a certain number of volun-teers respond and all report the exact same result.

3.2.1. Transforming E-learning tasks to BOINC tasksThe notation for learning object w is defined as follows.

• Tw: instructor blackbox (T-box) of learning object w.• Iw: the set of input data of T-box Tw .• Sw: the set of result data after invoking T-box Tw with input data

Iw . That is, Sw = {invoke(Tw, i) | i ∈ Iw}.

Furthermore, the evaluation results are notated as follows.

• B(k)w : user blackbox (U-box) provided by user k for learning

objectw. The U-box stores the source code of the user program.• O(k)

w : the set of return data after invoking U-box B(k)w with input

data Iw . That is, O(k)w = {invoke(B

(k)w , i) | i ∈ Iw}.

Note that all learners share the same input data for blackboxevaluation. Thus, the input data Iw should be large enough andcannot be re-engineered by learners.

Definition 3.1 defines the e-learning task formally.

Definition 3.1 (e-learning task). Given the U-box B(k)w of a user k

and learning topic w, the tuple (B(k)w , Iw, Sw) refers to an e-learning

task. Also, O(k)w refers to the output data of the execution of this

task. �

For instance, let w be a learning topic of finding the square val-ues of the input parameter Iw = {1, 2, 3}. Since Tw is the instruc-tor’s blackbox with correctness guarantee, we have Sw = {1, 4, 9}.In addition to the standard computer program input/output check-ing model, the blackbox can be even more flexible. Let us considera learning course about setting up the Linux Apache web server.The T-box is a VMwith the web server correctly installed and con-figured. To complete the lesson, a learner has to set up aweb serverin another VM (i.e., the U-box) and try to make it act like the onein the T-box. The input data is a sequence of designated URLs andparameters. The U-box gets approval if it always returns the HTMLcode the same as that of the T-box for the same URL request.

In the BOINC desktop grid, a BOINC task has the form (x, i),where x is an executable program and i is the input data of x. ABOINC application increases the degree of parallelism by partition-ing input data into workunits (WUs) that can be dispatched andperformed in different computers. We employ the BOINC wrap-per [4] for wrapping the actions of the e-learning task into a singleBOINC task. These actions include program compilation, execution,and validation.

Definition 3.2 defines the transformation formally.

Definition 3.2 (BOINC Task).Given an e-learning task (B(k)w , Iw, Sw)

of learning topic w. The corresponding BOINC task has the form(x, i), where

• x = wrapper, and• i = (B(k)

w ∪ Iw ∪ Sw).

Moreover, a BOINC task is partitioned into seg WUs denoted by(x, i[j]), 1 ≤ j ≤ seg , where i[j] is the jth segment of the inputdata. �

3.2.2. Execution of E-learning tasksThis subsection describes the implementation details of the

e-learning desktop grid.For a learning object w, the BOINC server performs the follow-

ing steps to execute the e-learning tasks (these steps are performedconcurrently).

(1) The work generator periodically selects a newly arrivede-learning task (B(k)

w , Iw, Sw) from the e-learning database.(2) The work generator generates the set of workunits {(x, i[j]) |

1 ≤ j ≤ seg} for the e-learning task via the transformationdefined in Definition 3.2.

(3) Upon receiving a task request from some workers, the BOINCserver dispatches a set of WUs (a subset of {(x, i[j]) | 1 ≤ j ≤seg}) to the worker.

(4) Upon receiving the result data ofWUs (computed and reportedby theworker), the BOINC server reconciles the result data andupdates the learning progress for this user.

The BOINC server creates and assigns a set of task replicas of thesameworkunit to differentworkers. A replica is termed a Result inthe BOINC desktop grid. Aworkunit is completewhen amajority ofResult reported from different workers are the same. A workunitdispatched but not completed before the time-out is rescheduledaccording to the BOINC project configuration.

The workers perform the following procedures to execute thee-learning tasks.

• When receiving a WU (x, i[j]), where x = wrapper and i[j] =(B(k)

w ∪ Iw[j]∪Sw[j]). Thewrapper performs the following steps.(1) Extract B(k)

w , Iw[j] and Sw[j] from input data i[j].(2) Compile the source code stored in B(k)

w . All the requiredsoftware in this step, such as compiler and libraries, havebeen already installed and configured in the VM that run inthe worker’s computer.

L.-P. Chen et al. / Future Generation Computer Systems 38 (2014) 1–10 5

Fig. 3. The work generator module that transforms tasks from the e-learning web site to the BOINC server.

(3) Execute the executable file compiled in Step (2) by usinginput Iw[j]. Let out be the execution result.

(4) Compare out and Sw[j], if they are equal, report true to theserver; otherwise, report false.

When executing a workunit, the wrapper can interrupt the execu-tion when detecting an abnormal or time-out event.

3.3. Work generator

The challenge of integrating an e-learning web site and a desk-top grid comes from the different computing models they use.The e-learning website is basically an interactive systemwhich re-sponds to user’s request in an event-driven manner. The desktopgrid, however, tended to be a compute-bound system that gener-ates and dispatches tasks in a batch mode. In the worker generatormodule of the BOINC server, the main loop sleeps for a period oftime (60 s by default) between two activations to reduce workloadon the server. In some cases, this model leads to system irrespon-siveness in terms of interactive learning progress management.

We developed a multithreading work generator in whichseveral threads are created for different task sources. As shown inFig. 3, some threads (e.g., thread 1) are used for generating WUsfrom job pools, and some other threads (e.g., thread N-1) are usedfor reading e-learning tasks from the database. Separating thesecomputations from one main loop (as in the old work generator)enables us to extract tasks and generate WUs concurrently. Also,blocking one thread (e.g., waiting for network connection orperforming database query) does not affect the execution of otherthreads.

3.4. Worker deployment

The computing resources provided by the volunteer workers(VCs) could be unstable due to the autonomous nature ofe-learning users. We maintain several official workers (OCs)which are dedicated for solving e-learning tasks. This subsectiondiscusses the deployment of volunteer and official workers.

The e-learning private cloud establishes an official worker bydeploying the VM equipped with all components required for thee-learning application. As illustrated in Fig. 2, the VM frontendis implemented as a part of WorkGenerator. Upon checking thedatabase and transforming e-learning tasks,WorkGenerator alsotries to bring up some official workers if it detects that the average

job completion time goes beyond a certain threshold. We employthe KVM (Kernel-based virtual machine) solution [18] to managethe official workers. The VM frontend connects to the target hostvia a secure ssh connection and then executes the libvirt pythonscripts for creating/destroying such VM instances. Once the VMinstance is created and powered on, the worker (i.e., the BOINCclient which is pre-installed in the VM) starts and connects to theBOINC server automatically.

Agent-based worker deploymentCompared to official workers, a volunteer computer can have

various operating systems installed and may have dynamic IP be-hind a firewall or NAT, turning not applicable the above server-dominant deployment. A feasible way of deploying volunteerworkers is to initiate the installation procedures by users them-selves. However, a user can register an arbitrary BOINC accountwhich does not bind to the e-learning account. In addition, theuser-controlled procedure implies to grant the user permission tomanipulate the VM, giving a handle for unauthorized snooping in-side the hosted applications.

We develop a safe and ease-of-use agent-based workflow fordeploying volunteer computers (VCs). The architecture of a VC isillustrated in Fig. 4. To deploy, the user needs to log in to thee-learning web site and download an installer program whichis customized on a per user basis by using the e-learning accountinformation. When executing the installer will try to bring theVM instance to the user computer and start a stub to manage allthe successive procedures inside the VM.

In the deployed VM, all the necessary applications, includingstub and BOINC client programs, are pre-installed in a securedaccount namedworkerAdmin. The Stub program is run when theVM starts. The following procedures are performed after the VMboots up.1. Installer logs in to the VM by usingworkerAdmin permission

via an ssh connection.2. Installer connects to the Stub and passes it with the e-learning

account information via the ssh channel.3. Inside the VM, the Stub configures the BOINC client by using the

e-learning account information. The BOINC project account canbe controlled by employing the boinccmd APIs.

In the above deployment procedures, the BOINC worker and itsapplication data inside the VM are protected from inspecting andediting by using the secured workerAdmin account. It is theresponsibility of a user to keep the personal installer script and theconfigured VM instance in a safe place.

6 L.-P. Chen et al. / Future Generation Computer Systems 38 (2014) 1–10

Execute tasks

Maintain credits

Manage learning activities

Manage memberships

Download VM Images

Create installer scripts

Fig. 4. Architecture of the volunteer computer (VC).

Implementation issuesAccording to our experiences, certain issues have to be ad-

dressed when developing the agent-based worker deployment.First, in Step 1, since the VM instance is configured as aDHCP client,the Installer-Stub communication must be initiated by the Stubside. Usually the Installer (runs in the host machine) cannot lo-cate the Stub (runs in the virtual machine) since the Stub is hostedinside a VMofwhich the IP address is dynamically generated and isunknown beforehand. Also, there can have multiple VM instancesusing one physical IP address. On the contrary, a Stub can easilylocate the Installer since the host computer is configured as a de-fault network gateway for the VM. Therefore, in Step 1, before es-tablishing the connection, the Installer has to wait for a TCP/IPstartup message from the Stub.

Another critical issue is security — preventing applicationdata inspection and identity theft for the cooperative computingplatform. The system security is based on the VM with a securedworkerAdmin account as well as the ssh protocol. The onlyinteraction between the VM and the user is through the installerscript. Thus the system is secured if users can keep the personalinstaller script and the deployed VM instance in a safe place.

4. Scheduling algorithm for official and volunteer workers

The traditional scheduler does not distinguish official and vol-unteer workers which are considered having obvious complemen-tary properties. This section develops a new scheduling algorithmthat improves the task response times by trying to maximize theeffect of high capability official workers.

4.1. Utilizing official workers

In the BOINC platform, one workunit is replicated to severalinstances, called Results, and assigned to independent workers.

A scheduled workunit is complete when its Results are reportedand a certain agreement policy, such asmajority voting threshold, isreached. In the traditional scheduler, Result tasks, even belongingto a sameworkunit, are simply treated as individual tasks. Fig. 5(a)illustrates this basic result-based scheduling configuration calledVC_conf in which Result tasks are chosen and assigned tovolunteer workers by using the default BOINC scheduler.

Fig. 5(b) shows an enhanced configuration called OC_confin which a small number of high capability official workersareintroduced. As these workers are operated in the dedicated serverswith the short polling cycle [19,20], tasks can be executed in a highthroughput rate and the system performance can be enhanced.

In the above result-based scheduling system, Result tasksbelonging to the same workunit can be assigned to workersof different capabilities. Often a volunteer worker disconnectswith unsolved Result tasks, halting them until timeout and get-ting rescheduled. Therefore, although speeding up Result tasks,OC_conf may not improve the global performance since WU’scompletion time is determined by the Result instance that finisheslast.

4.2. Workunit-based scheduling algorithm

This subsection discusses a new scheduling algorithm that triesto maximize the effect of using expensive server computers in thee-learning cloud. In the new workunit-based scheduling algorithm,the Result tasks of the same workunit are grouped and effectivelyassigned by taking into consideration the group-wide properties.

In order to improve the throughput of workunits, the newscheduling algorithm tends to select the workunit with the higherprogress ratio value. Also, the tasks in the same workunit areassigned to the workers with the same type in order to make thesetasks finish in the near time period.

The workunit-based scheduling algorithm uses three priorityqueues to store the workunits:

L.-P. Chen et al. / Future Generation Computer Systems 38 (2014) 1–10 7

a b

Fig. 5. System configuration: (a) VC_conf, and (b) OC_conf.

Fig. 6. The configuration for workunit-based scheduler (WU_conf).

• Qinit: stores the Result tasks of initialized workunits,• Qvol: stores the Result tasks of workunits assigned to volunteer

workers, and• Qoff: stores the tasks assigned to official workers.

Eachworkunitw partitions its redundant instances into three lists:

• w.waitResults: records the Result tasks waiting for scheduling,• w.pendResults: records the Result tasks that are assigned to

workers but not reported, and• w.completeResults: records the completed Result tasks.

The order of workunits in the priority queues is determined by theproperty

w.key

=∥w.waitResults∥ + ∥w.pendResults∥

(∥w.waitResults∥ + ∥w.pendResults∥ + ∥w.completeResults∥), (1)

where the denominator is the number of total Results of a worku-nit and the numerator is the number of incomplete tasks.Workunitwith less key has higher priority. Thus, performing an extract-min

operation on a queue will extract the workunit with the higher ra-tio of completed tasks. For example, in Fig. 6, the minimal worku-nit in Qvol has the key 1/4 and theminimal workunit in Qoff has thekey 2/4.

Algorithm 1 describes the workunit-based scheduling algo-rithm.

Algorithm 1Workunit_Based_Scheduler• A: When generating a new workunit w:

(1) Generate the redundant instances (i.e., Result tasks) of w;(2) Add all the Result tasks to list w.waitResults;(3) Add w to Qinit using w.key (see Eq. 1);

• B: When receiving a task request from a worker:

(1) If the worker is a volunteer worker then. Q ← Qvol;

else (the worker is official). Q ← Qoff;

End if(2) If Q is empty then move the minimal element in Qinit to Q ;(3) w← extractMin(Q ); //get the workunit with highest priority(4) If w is not null then

. r ← pop(w.waitResults);

. Move r from w.waitResults to w.pendResults;

. Assign task r to the worker;End if

• C: When receiving a report of task r:

(1) Find the workunit w containing r;(2) Move r from w.pendResults to w.completeResults;(3) Update w.key and the queue order;(4) Delete w from the queue if w matches the predefined task

completion rule;

End if

Fig. 6 illustrates the WU_conf configuration which is the sameas OC_conf but uses the workunit-based scheduling algorithm.Initially, the tasks are added to Qini, waiting for dispatching to Qvolor Qoff, depending on the rate the workers consume tasks (StepA3 and Step B1). In Procedure B of Algorithm 1, Step B3, when acertain type of worker sends a task request, the scheduler selectsthe workunit with the higher task completion rate. This workunit

8 L.-P. Chen et al. / Future Generation Computer Systems 38 (2014) 1–10

is considered most likely to be completed in the near time period.The scheduler then fetches a Result task from the chosenworkunitand assigns it to the client.

In Procedure C of Algorithm 1, the server collects a completedResult task r and changes the task status from pending tocomplete. The workunit containing r is completed and removedfrom queues if it matches the completion rule such as the majorityvoting threshold.

4.3. Experimental results and discussions

This section presents the experimental results of enhancing thee-learning web server by using the volunteer computing platform.The experiment aims to evaluate the system performance withdifferent scheduling policies.

4.3.1. Experimental setupThe architecture of the experimental system is shown in Fig. 2.

The e-learning server is built on top of Microsoft’s Windowsoperating system and implemented using dynamic web page anddatabase technologies [21]. The e-learning and BOINC servers areindividually installed and executed in a personal computer withone Quad-core CPU and 4GB of memory. We prevent anonymousworkers from Internet publicity by disabling the defaultweb-basedBOINC registration, since anonymous users from Internet publicityare often unresponsive and are not of help for highly interactivee-learning applications. We adopted interface-level integration tofacilitate interoperability between the e-learning server and theBOINC server. TheWorkGenerator periodically fetches the newlyarrived e-learning tasks from the e-learning database and thenwraps into BOINC tasks that are stored in the BOINC server. Afterthe BOINC tasks are assigned and completed by some workers,the Assimilator updates the learning progress according to theexecution results of the BOINC tasks. The Assimilator updates thelearning progress by invoking the e-learning web pages via a webautomation tool called cURL [14].

We employ the personal computers with the AMD six-core CPUand 8 GB of memory to host the client VMs. To coordinate theserver and dozens of experimental workers, the VcControllercomponent and the client-side agent component (see Section 3.1)are extended with the connection-oriented TCP communications.According to the experimental script, the VcController controlsthe status of workers, including suspend/resume, attach/detach,or changing CPU utilization, by issuing commands via theclient–server communication links. Upon receiving a command, aworker performs the command by invoking the boinccmd functionin BOINC client API.

The amount of donated resources of a worker depends on itsCPU utilization, which is configured as one in set {25%, 33%, 50%,66%, 75%} in this experiment. As the higher CPU utilization rateimplies lower donation computation time,we configure the officialworkerswith the lowest CPU utilization and the volunteerworkerswith random ratios that are uniformly selected from the abovecandidate set. When a worker receives the instruction, it uses asmall program consisting of sleep and busy loops for keeping theCPU utilization in a certain level.

4.3.2. Experimental results of task assignment policiesThe first experiment evaluates the following approaches for

assigning tasks:

VC_conf Use the default BOINC scheduler and involve volunteerworkers only.

OC_conf The same as VC_conf but add two additional officialworkers that can provide more stable computing re-sources.

WU_conf The same as OC_conf but use the workunit-basedscheduler instead of the default BOINC scheduler.

CS This is the standard client–server computing model. Weemploy two servers for all computation tasks in the website.

Each run of the experiments reads and executes the same setof 400 tasks (200 WUs, each with 2 task replicas) that are issuedin a normal distribution rate. Fig. 7 shows the experimental resultsof average task response time versus volunteer worker numbers.The WU’s response time is the elapsed time from the initiation ofissuing the WU until the completion of all of its task replicas. Thenumber of official workers is fixed at two for all the experiments.

As shown in Fig. 7, although employing two additional officialworkers, OC_conf does not show overall improvement overVC_conf as expected, particularly when the worker number getsgreater. This is due to the fact that there are dozens of volunteerworkers and it is likely that they fetch most of experimentaltasks. An inactive volunteer worker could dramatically increaseWU’s response time which is determined by the task replica thatfinishes last. For another WU_conf approach, the task replicas ofthe sameWUare all assigned to either official workers or volunteerworkers. As a result, the tasks assigned to official workers couldbe performed smoothly and completed in the same time slot. Theexperimental result showed in Fig. 7 demonstrates that WU_confcan obviously enhance the effect of the official workers.

The experimental data of CS model is depicted as a horizontalline (the line CS) in Fig. 7 since it uses two server computersand does not involve volunteer workers. The client–server modeloutperforms WU_conf when the number of workers is less than10. Beyond this value WU_conf becomes dominant. This resultcomes from the significant overhead incurred by BOINC schedulingcycles [20,22] as well as high data transformation latency betweenthe servers. Fig. 7 also compares CS with the other approachesin terms of system load levels. Fig. 7(b) depicts the experimentalresults for the systemwith the light load of which the average taskarrival interval is 60 s. The task execution time ranges from 30 to150 s. In this case, theCS approach outperforms since a task is likelyto be completed before the next task arrival time. Fig. 7(a) depictsthe experimental results for the case of heavy load with averagetask arrival interval being equal to 15 s. In this case, the WU_confapproach outperforms via massive parallelism.

4.3.3. Task parallelism inWU_conf configurationThe second experiment aims to evaluate the parallelism

efficiency in terms of the number of workers when using theWU_conf approach. Let Iw be the set of standard input data of ane-learning task of topic w. One e-learning task is transformedinto seg BOINC workunits (WUs), each of which equally contains|Iw|/seg input data. Thus, as mentioned in Section 3.2.1, each WUcontains a source code file provided by the learner, |Iw|/seg inputdata, and the set of standard output data Sw . In the architectureshown in Fig. 2, the life cycle of a WU can be divided into thefollowing phases.

• Tsche: the period of time that the task stays in the BOINC jobqueue. That is, the interval from the time the WU is generatedto the time it is dispatched to some worker.• Tcompile: the average time required for compiling the source code

in the WU and generating the executable file.• Texe: the time required for executing the executable file of the

WU.• Tcheck: the time required for checker to validate the execution

results. As mentioned in Section 2, we use the elementarychecker, which simply compares the equality between standardoutput and the execution result of the executable file of theWU.

L.-P. Chen et al. / Future Generation Computer Systems 38 (2014) 1–10 9

a b

Fig. 7. Experimental results of response time of various task assignment policies with the average task arrival interval being equal to (a) 15 s and (b) 60 s.

600

500

400

300

200

100

01

5 510 25 45

15

25

40

#Workers

Tim

es (

sec)

#job

s1000%

800%

600%

400%

200%

0%

1 workervs 5

workers

1 workervs 10

workers

1 workervs 25

workers

3-4

20-22

job

len

gth

(se

c)

Sp

eed

up

(%

)

a b

Fig. 8. Parallelism efficiency and scheduling overhead of theWU_conf approach: (a) execution time, and (b) speedup.

• Tcomm: the time required for communication between theBOINC server and the worker.

The WU is correct if the execution result equals the standardoutput. Since the WU has (|Iw|/seg) input data, the worker hasto execute and validate the WU (|Iw|/seg) times of which eachexecution reads one input data from Iw . Therefore, the totalexecution time of WU is given by

Tcompile + (|Iw|/seg) · (Texe + Tcheck)+ Tsche + Tcomm.

Transforming an e-learning task into several equal-size WUsenables parallel execution of WUs. Fig. 8(a) plots the experimentalresults of total execution time in terms of the number of jobs andthe number of workers. The Y axis represents the total executiontime of WUs, which is quantified in seconds, while the Z axisrepresents the workload of system (the number of total jobs inthe system), and the X axis represents the number of volunteerworkers solving the WUs. Each value measured on the Y axis isobtained by taking average on 10 execution results. As shown inthis figure, the parallelism effect in desktop grid improves theperformance since each e-learning task is independent of eachother and can be performed individually.

Fig. 8(b) demonstrates the scheduling overhead of the desktopgrid. This experiment shows that the system gains more speedupfor compute-bound tasks. For example, in figure (b), the speedupof executing tasks of length 20–22 s with 10 workers is equal to 8,though the speedup of executing short-length tasks (of length 3–4s), with 10 workers, equals to 5.

5. Conclusion remarks

The traditional web-based e-learning system suffers fromunstable workloads and the security risk of incorporating externalexecutable objects. This paper addressed these issues and coveredwith the recently emerged desktop grid and cloud computingtechniques. Unlike the normal volunteer computing model wherethe identities of users are known, the learning users are motivatedto be volunteers to share the computations. We have developedthe components to integrate the e-learning system and desktopgrid into the circumstances in which each user is not only a taskproducer, but also a volunteer for solving tasks. Additionally, theresponsiveness is enhanced by using a collection of asynchronousprocesses to enable the desktop grid and the e-learning website tocooperate in amore tightly coupledmanner. As from experimentalresults, we could demonstrate that the parallelism in desktopgrid greatly improves the overall system performance. In addition,we developed the worker type-based task assignment anddemonstrated an obvious enhancement when compared to thedefault task assignment approach.

The proposed architecture allows participants not only solvingtasks but also generating tasks via the volunteer computingplatform. When employing a broker, the participants form adesktop grid community. It is a future work to develop a resourcebroker that can fairly and efficiently manage the resources of thedesktop grid community.

10 L.-P. Chen et al. / Future Generation Computer Systems 38 (2014) 1–10

Acknowledgment

Thisworkwas supported in part by theNational Science Councilof the Republic of China (Taiwan) under Contracts NSC 101-2221-E-029-034 and NSC 102-2221-E-029-029.

References

[1] K. Schewe, B. Thalheim, A.B. Zdanowicz, R. Kaschek, T. Kuss, B. Tschiedel, Aconceptual view of web-based e-learning systems, Educ. Inf. Technol. 10 (1–2)(2005) 83–110.

[2] Opennebula, 2010. http://www.opennebula.org.[3] D.P. Anderson, J. Cobb, E. Korpela, M. Lebofsky, D. Werthimer, SETI@home:

an experiment in public-resource computing, Commun. ACM 45 (11) (2002)56–61.

[4] D.P. Anderson, BOINC: a system for public-resource computing and storage,in: 5th IEEE/ACM International Workshop on Grid Computing, Pittsburgh, PA,Nov. 8 2004, pp. 365–372.

[5] F. Cappello, S. Djilali, G. Fedak, T. Herault, F. Magniette, V. Neri, O. Lodygensky,Computing on large scale distributed systems: XtremWeb architecture,programmingmodels, security, tests and convergencewith grid, Future Gener.Comput. Syst. (2004).

[6] A. Chien, B. Calder, S. Elbert, K. Bhatia, Entropia: architecture and performanceof an enterprise desktop grid system, J. Parallel Distrib. Comput. 63 (2003)597–610.

[7] L.F.G. Sarmenta, S. Hirano, Bayanihan: building and studying web-basedvolunteer computing systems using Java, Future Gener. Comput. Syst. 15 (5–6)(1999).

[8] D.P. Anderson, K. Reed, Celebrating diversity in volunteer computing, in: Proc.of the Hawaii International Conference on System Sciences, HICSS, 2009.

[9] I-Chen Wu, Ching-Ping Chen, Ping-Hung Lin, Guo-Zhan Huang, Lung-PingChen, Der-Johng Sun, Hsin-Yun Tsou, A desktop grid computing service forconnect6 applications, in: International Symposium on Grid Computing 2009,ISGC 2009, Academia Sinica, Taipei, Taiwan, 2009.

[10] T. Estrada, K. Reed, D. Anderson, M. Taufer, Emboinc: an emulator forperformance analysis of boinc projects, in: Proc. of PCGrid, 2009.

[11] DOM judge, 2012. http://domjudge.sourceforge.net.[12] Mooshak, 2012. http://mooshak.dcc.fc.up.pt.[13] UVa Online Judge, 2012. http://uva.onlinejudge.org/.[14] David Garcia Quintas, A host guest VM communication system for virtual box,

2011. http://boinc.berkeley.edu/trac/wiki/VirtualBox.[15] D. Ferreira, F. Araujo, P. Domingues, libboincexec: a generic virtualization

approach for the BOINC middleware, in: IEEE International Symposium onParallel and Distributed Processing Workshops and Ph.D. Forum, IPDPSW,Shanghai, May 2011, pp. 1903–1908.

[16] Outcome-based education, 2011. http://en.wikipedia.org/wiki/Outcome-based_education.

[17] A Blackbox-Oriented e-Learning Desktop Grid, 2012.http://portal.csie.thu.edu.tw/drupal-6.26/.

[18] Virtual machine, 2011. http://en.wikipedia.org/wiki/Virtual_machine.[19] D. Kondo, G. Fedak, F. Cappello, A.A. Chien, H. Casanova, Resource availability in

enterprise desktop grids, Future Gener. Comput. Syst. 23 (7) (2007) 888–903.[20] E.M. Heien, D.P. Anderson, K. Hagihara, Low latency batches with unreliable

workers in volunteer computing environments, J. Grid Comput. (2009).[21] Microsoft Developer Network, 2010. http://msdn.microsoft.com.[22] Sangho Yi, Derrick Kondo, David P. Anderson, Toward real-time, many-

task applications on large distributed systems, in: Proceedings of EuropeanConference on Parallel and Distributed Computing, Euro-Par, August 2010.

Lung-Pin Chen is an associate professor of Depart-ment of computer science and information engineer-ing at Tung-Hai University, Taiwan. He received his B.S.from Soochow University in September 1991, M.S. fromNational Chung-Cheng University in September 1993, andPh.D. from National Chiao-Tung University in January1999, all in computer science. His research interests in-clude distributed algorithm, internet computing, SOA, andpervasive computing.

Jien-An Lin received the M.S.degree in Computer Scienceand Information Engineering from Tunghai University,Tai-Chung, Taiwan, in 2009. His research interests includedistributed system, volunteer computing, and databasesystem.

Kuan-Ching Li is a Professor of Department of ComputerScience and Information Engineering (CSIE) of ProvidenceUniversity in Taiwan. He serves as the Editor-in-chiefof International Journal of Computational Science andEngineering (IJCSE). He is a Fellow of IET society and is aSenior Member of IEEE computer society.

Ching-Hsien (Robert) Hsu received the B.S. and Ph.D. de-grees in Computer Science from Tung Hai University andFeng Chia University, Taiwan, in 1995 and 1999, respec-tively. He is currently a professor of the department ofComputer Science and Information Engineering at ChungHua University, Taiwan. Dr. Hsu is the editor-in-chief ofInternational Journal of Grid and High Performance Com-puting (IJGHPC) and serving in a number of journal edi-torial boards. His research interests include Parallel anddistributed computing, cloud/grid computing, P2P com-puting, Pervasive computing, Services computing, RFID

and smart homes.

Zhi-Xian Chen received the M.S.degree of managementinformation science from Tunghai University, Taiwan,in July 2012. His research interests include paralleland distributed system, volunteer computing and cloudcomputing.