

    Introduction to Cyber Foraging, Tools and Techniques

    R. Gill

    June 30, 2010


    Contents

1 Introduction
  1.1 Example Design Brief
    1.1.1 Use Case
    1.1.2 Design Considerations
2 Cyber Foraging
  2.1 Introduction
  2.2 Cyber Foraging Tools
    2.2.1 Spectra
    2.2.2 Chroma Tactics
    2.2.3 AIDE
    2.2.4 Networked Integrated Multimedia Middleware (NMM)
    2.2.5 Goyal
    2.2.6 Slingshot
    2.2.7 Vivendi
    2.2.8 EyeDentify
    2.2.9 DiET
    2.2.10 Instant-X
    2.2.11 Scavenger
  2.3 Summary
  2.4 Discussion
    2.4.1 Surrogate Discovery
3 Appendix


    List of Figures

1.1 Cyber foraging architecture showing mobile client in a pervasive compute environment with bi-directional communication to surrogates and remote multimedia content server
2.1 Classification of cyber foraging systems as thin and thick ATA and AAA
3.1 The Spectra API from [Flinn02a]
3.2 Simple architecture of Spectra from [Flinn02a]
3.3 Example of a generated tactic file from [Balan03a]
3.4 A description of Chroma tactic components from [Balan03a]
3.5 Overall architecture of AIDE from [Messer02a]
3.6 A simple NMM flow graph from [Lohse05a]
3.7 NMM buffering of nodes from [Lohse05a]
3.8 NMM registry hierarchy from [Lohse05a]
3.9 NMM code snippet adapted from [Lohse05a]
3.10 Goyal architecture from [Goyal04a]
3.11 Instantiating a new application replica in Slingshot, adapted from Su [Su05a]
3.12 An example of a Vivendi tactic file from [Balan07a]
3.13 An example of a Vivendi file with tactic definition and remote invocation parameters from [Balan07a]
3.14 The Ibis framework employed in EyeDentify from [Kemp09b]
3.15 DiET mobile code reduction flowpath from [Kim09a]
3.16 DiET API showing main working components from [Kim09a]
3.17 Scavenger client surrogate interface showing surrogate daemon components, frontend and code execution environment from [Kristensen09a]


    Glossary of Terms

    JVM Java Virtual Machine (JVM)

    Proxy Object Proxy Object acts as an intermediary between the client and an accessible object.

    The purpose of the proxy object is to monitor the life span of the accessible object

    and to forward calls to the accessible object only if it is not destroyed.

    RTP Real-time Transport Protocol

    RTPS Real-Time Publish-Subscribe (RTPS) Wire Protocol provides two main communication models:

    the publish-subscribe protocol, which transfers data from publishers to subscribers;

and the Composite State Transfer (CST) protocol, which transfers state.

RPC Remote Procedure Call (RPC) is an inter-process communication that allows a computer program to cause a subroutine or procedure to execute in another address space, commonly on another computer on a shared network, without the programmer explicitly coding the details for this remote interaction.

Overhead Any combination of excess or indirect computation time, memory, bandwidth, or other resources required to attain a particular goal.

    Broadcast Transferring a message to all recipients simultaneously.

Unicast Transmitting data to a single specific destination.

Multicast Delivery of a message or information to a group of destination computers simultaneously in a single transmission, typically using the User Datagram Protocol (UDP).

SIP Session Initiation Protocol, an IETF-defined signalling protocol widely used for controlling multimedia communication sessions such as voice and video calls over Internet Protocol (IP).

COTS Commercial Off-The-Shelf software


1 Introduction

The exponential uptake of mobile computing devices, such as smartphones and PDAs, is problematic with regard to streaming multimedia rich content to a mobile device. The advantage and attraction of a portable, light mobile computing device is mobility. However, portable power supply technology has not kept pace with the power requirements of the increased CPU and processing capabilities of mobile devices such as smartphones. Consequently, a shortfall exists between the available and required power for processing and playing streamed multimedia content. Portability also constrains physical size, so such devices are compute resource limited and unable to execute applications requiring intensive processing, such as digitising and rendering multimedia rich content. The CPU, memory and energy overheads of such multimedia applications outstrip the capabilities of thin mobile clients and handheld devices. Although smartphone CPU and memory capacities have increased recently, the drain on compute energy still limits the use of compute-intensive application processes. One method to reduce the required power is to offload power-consuming processes to surrogates, called cyber foraging [Satyanaranyanan01a]. Access to and availability of compute surrogates is predicted to become ubiquitous in future pervasive compute environments. The implementation of cyber foraging requires particular essential capabilities, such as surrogate (service) discovery, establishing trust, partitioning which application task operations to give the surrogate, and scheduling surrogate tasks.

    In this work a number of current cyber foraging tools and techniques available to the mobile application

    developer are described. A typical mobile application design brief is included to provide a context in which

    cyber foraging might be used.

    1.1 Example Design Brief

The example application is initially required to automatically search for and discover available surrogates. When required, the application compares the compute resources of the local entity (the mobile client) and the remote entity (surrogate) against the minimum compute resources required to complete its multimedia processing tasks. If the local compute resource is less than the minimum required, the application offloads the processing task to the surrogate. The surrogate performs the task and streams the processed multimedia content back to the application. The Use Case scenarios describe human interaction with and operation of the application in a pervasive compute environment.
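The offload decision in this brief reduces to a resource comparison. The sketch below illustrates it with hypothetical resource units and names that are not part of the brief:

```python
from dataclasses import dataclass

@dataclass
class Resources:
    cpu_mhz: float    # available (or required minimum) CPU capacity
    memory_mb: float  # available (or required minimum) memory

def should_offload(local: Resources, required: Resources) -> bool:
    """Offload when the local client cannot meet the task's minimum needs."""
    return local.cpu_mhz < required.cpu_mhz or local.memory_mb < required.memory_mb

# Example: a weak handset facing an HD-video processing task
local = Resources(cpu_mhz=400, memory_mb=64)
required = Resources(cpu_mhz=1200, memory_mb=128)
assert should_offload(local, required)  # the task goes to the surrogate
```

A real implementation would compare richer resource vectors (battery, bandwidth), but the shape of the decision is the same.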

    1.1.1 Use Case

This use case is intended to show how the proposed system may be used in practice, and to illustrate what is required of the system to fulfill its goals.

Hubert is at airport departures waiting for his plane to France. The airport departure lounge is a designated pervasive compute environment. Hubert is an avid football fan and takes out his mobile phone to watch the highlights of the latest match played by his favourite football team. Hubert has recently registered


    with a service provider that provides high definition video footage of replays of recent football matches. His

    phone has already discovered that it is in a pervasive compute environment. Hubert starts his phone application

    to watch the goals scored by his favourite football team in their most recent match. Unknown to Hubert his

    mobile phone only has the computing power to process high definition video at the lowest quality and with

intermittent stops and jerks during playback. His phone application detects the shortfall in compute power and redirects the video from the service provider to surrogate(s) in the departure lounge, which carry out the video processing and stream the video to Hubert's phone, where he watches the footage in high quality.

When Hubert arrives in France, he tries out his new phone language translation application. Again, his phone does not have the computing power to run the application. Hubert looks for a cyber cafe or a public building, such as a library or museum, that normally has a pervasive compute environment. He spots a cyber cafe sign and walks towards it; as he approaches the cafe entrance his phone discovers the surrogates and displays a message that the language translation application is ready to be executed. Outside the cafe, Hubert asks some locals for directions to his hotel using the language translation application on his phone. During this, his phone offloads the translation application processing tasks to the surrogates in the cafe, which process the tasks and return the processed data back to the phone application. Eventually, one of the locals recognises the name of Hubert's hotel and, with the help of the translation application, Hubert is able to understand the route he must take to get to his hotel.

Before setting off to his hotel, Hubert takes a picture of the local who gave him directions, but his mobile phone cannot execute the language translation application and the phone's high-megapixel camera at the same time. Instead of forcing Hubert to close the language translation application, the phone offloads the camera application processing to the surrogates in the cyber cafe.

    1.1.2 Design Considerations

While we use the term mobile client, in this work the scope of mobility is limited to a single pervasive compute environment. The use case describes a cyber foraging architecture similar to networks A and B shown in Figure 1.1. In both A and B, communication channel 1 is between the client and surrogates within a pervasive compute environment. Communication channel 1 is required for access between client and surrogate: for surrogate discovery, offloading/distributing application tasks to the surrogate, distribution of client and surrogate details, and sending processed task data back to the mobile client. Communication channel 2 is required for access between the client and the remote multimedia content provider, for remote application execution/invocation and transfer of network data and network entity details. Communication channel 3 enables access between the remote server and the pervasive compute environment, for transfer of content for processing and network details.

Previous approaches to supporting client applications using cyber foraging have taken either a thin client or a thick client approach. In network A, previous solutions to run remote applications have used thin client solutions such as VNC [Richardson98a] and SSH [SSH11a], or web based services such as GoToMyPC [Citrix11a]. Thin client solutions have the advantage of ease of use and of requiring no modification to the application. However, thin client solutions require low network connection latency to be effective, and are consequently reliant on bandwidth. To decrease reliance on latency and bandwidth, thicker client solutions have been used.

A thick client approach is one in which part of the application executes on the mobile client, so that when bandwidth is inadequate a lower quality, degraded version of the application may still run on the client. Mobile clients that run a complete application are the extreme of the thick client approach; they make any attempt at cyber foraging completely redundant. The use case describes an example of both thin and thick client approaches with cyber foraging support: networks A and B are essentially thin and thick client implementations, respectively.

The thick client approach may be classified into two methods. In the first, called application-transparent adaptation (in this work, ATA), existing application APIs on surrogates and remote servers are used to control and communicate with the client application. ATA is limited by the application API's ability to adapt to external commands. The second method, called application-aware adaptation (in this work, AAA), requires explicit modification of the application to work at runtime. AAA enables greater scope for application adaptability during runtime, and is not limited by prior application adaptation constraints. However, the application-aware method does require manual intervention and access to the application source.

Figure 1.1: Cyber foraging architecture showing mobile client in a pervasive compute environment with bi-directional communication to surrogates and remote multimedia content server

A hybrid of ATA and AAA is an approach in which some code from the client application is transferred to the surrogate during cyber foraging. The application code transferred is called mobile code. If choosing a hybrid approach, the application must balance the proportion of mobile code transferred to a surrogate, relative to the whole application code, against the time required to execute the mobile code on a surrogate before cyber foraging can take place. In effect, mobile code prepares the surrogate by rewriting the existing surrogate application; this is different from transferring identification and authentication information.

    Whichever approach or method is used, the design of a cyber foraging system to support a mobile client

    application in a pervasive compute environment should consider as a minimum, the following four design

    features.

1. Discovery of surrogates - Without being aware of the presence of available surrogates no cyber foraging can take place.

2. Trust establishment between communicating entities - The transient nature of cyber foraging means that initially both client and surrogate are unknown, un-trusted entities that must satisfy some form of trust criteria before any interaction can take place.

3. Partitioning application tasks to offload to surrogates - Deciding which application tasks execute locally and which are offloaded for remote execution.

4. Scheduling tasks for offloading to surrogates, and retrieving completed task data back to the client application.
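The four features above can be skeletonised as a single client-side loop. Every name below is a hypothetical stand-in, with the discovery, trust, and execution mechanisms injected as callables:

```python
def cyber_forage(task_parts, discover, establish_trust, execute_remote, execute_local):
    """Skeleton of the four design features: (1) discover surrogates,
    (2) keep only trusted ones, (3) partition tasks across them,
    (4) schedule the parts and collect the results."""
    surrogates = [s for s in discover() if establish_trust(s)]  # stages 1 and 2
    if not surrogates:
        # No usable surrogate: degrade gracefully and run everything locally.
        return [execute_local(part) for part in task_parts]
    results = []
    for i, part in enumerate(task_parts):                       # stages 3 and 4
        surrogate = surrogates[i % len(surrogates)]             # naive round-robin
        results.append(execute_remote(surrogate, part))
    return results

# Toy run: two surrogates, a task in three parts, remote work doubles each part
out = cyber_forage([1, 2, 3],
                   discover=lambda: ["s1", "s2"],
                   establish_trust=lambda s: True,
                   execute_remote=lambda s, p: (s, p * 2),
                   execute_local=lambda p: p)
# out == [("s1", 2), ("s2", 4), ("s1", 6)]
```

Real systems replace each callable with a protocol (service discovery, authentication, RPC), but the control flow follows this four-stage shape.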

Human interaction should also be considered: ubiquitous computing ideally requires no conscious human interaction. However, the option of a manual mode that prompts the user to interact prior to offloading mobile code to surrogates has psychological usability benefits regarding privacy. Finally, mobility: although the scope of mobility in this work is confined to a single pervasive compute environment, the user would still be expected to move location within the environment.


2 Cyber Foraging

2.1 Introduction

A decade ago, Satyanarayanan [Satyanaranyanan01a] accurately described and predicted pervasive computing as the next step forward in computing evolution: a hybrid of distributed and mobile computing. The motivation behind the prediction was the continual advances made in distributed and mobile computing in the preceding decade towards the realisation of Weiser's [Wieser91a] vision of ubiquitous computing. Integrating distributed and mobile computing technology continues to present unique challenges. Cyber foraging is the term used by Satyanarayanan to describe a potential solution for one of the unique challenges pervasive computing must address in order to integrate distributed and mobile computing seamlessly within a pervasive computing environment. The specific challenge cyber foraging seeks to address is to dynamically divide and distribute the application processing tasks of a resource constrained mobile client to remote surrogate resources that perform the processing tasks on behalf of the mobile client, and return the processed data back to the mobile client application. Each potential remote resource surrogate hosts some kind of service that the cyber foraging system searches for; consequently the activity of searching for remote resource surrogates is called service discovery. Implementing cyber foraging (also described as offloading [Gu04a]) for a mobile client within a pervasive environment requires four fundamental stages of operation. Stage one is service discovery of remote surrogate entity(s); stage two is establishing trust between the mobile client and surrogate entity(s); thirdly, partitioning the application processing tasks between local and remote execution [Balan07a]; and fourthly, scheduling the partitioned processing tasks to the correct surrogate resource entity [Kristensen10a]. In combination, these four stages define the fundamental features required of a cyber foraging system within a pervasive computing environment.

Intuitively, the speed with which a processing task can be offloaded and processed by a surrogate is paramount. If the cyber foraging process cannot maintain transparency to the user, in the form of uninterrupted application usage, then the cyber foraging system has failed. Factors that affect the offloading and processing of partitioned tasks are called overheads, and play a very important part in cyber foraging research and development. Overheads include the factors that affect overall response time and energy consumption, such as network latency and CPU, memory and battery consumption.

This literature review identifies the current range of cyber foraging systems and the extent to which these systems adhere to the four fundamental prerequisite features that define cyber foraging within a pervasive environment. Firstly, each cyber foraging system is briefly described, and then we discuss the salient points of each system in relation to each of the four design features described earlier (Figure 2.1 presents a tabulated snapshot of overall adherence). Secondly, alternative methods to achieve the functionality of the four design features are discussed.


    2.2 Cyber Foraging Tools

    Introduction

Essentially, a cyber foraging system is a fluid network that has to juggle what processing code to distribute, where to distribute it, and how to distribute it, in order for an application to run transparently on a resource constrained entity with minimum QoS. All the systems introduced in this work have, as a minimum, some form of task scheduler, ranging from deciding where a single task should be performed to distributing multiple tasks over multiple surrogates. Tasks may be modelled from data captured by system/network resource monitors, with prediction algorithms performed on the modelled tasks to allocate surrogate resources to them. The range and scope of information included in the monitoring and modelling vary from system to system. The overall system architecture affects interaction and communication between system activities such as monitoring, scheduling, task/code partitioning and data transfer. Finally, the actual communication methods vary between RPC, RTP, or messages depending on the system and its objectives. All the systems described here are designed as aids to developing applications with cyber foraging and task offloading functionality; they are therefore tools and not applications. The cyber foraging tools vary in maturity and in access to descriptive documentation and technical information; this is reflected in the varying section lengths.

    2.2.1 Spectra

Spectra is arguably the forerunner of today's cyber foraging systems. The motivation behind Spectra was to address the uneven conditioning [Satyanaranyanan01a] that occurs in pervasive compute environments.

Spectra is based on predicting an application's future resource requirements by monitoring and then modelling current application resource usage. Monitoring is performed by six resource monitors, each of which monitors a single or related set of resources, namely CPU, network, battery, file cache state, remote CPU and remote cache state on surrogates. These monitors are within a modular framework shared by Spectra's client and server. Spectra consists of a client running on the application entity and a server running on surrogate entity(s). Spectra uses services provided by the Coda file system for replication and consistent remote file execution, and Odyssey for defining the fidelity of tasks for distribution. The Coda file system was used because it was felt that network latency and bandwidth were not sufficient to sustain the consistency a conventional distributed file system requires. Spectra uses energy consumption during operation execution run times as the primary metric for scheduling and partitioning of tasks.

The information provided by the resource monitors gives a snapshot of the surrogate state, the availability of surrogate resources, and the current state of the existing running system. From this, Spectra predicts the balance or imbalance between the current running system and future processing tasks; based on this balance prediction, Spectra indicates the location of available surrogate(s) to the application. The decision to actually use the surrogate for the next task processing is made by the application.

Predictions of the expected time for process task execution are made by firstly calculating the data transfer time, by dividing the data for transmission by the bandwidth; and secondly calculating the task process time from previous heuristic data, such as system logs, that are modelled for prediction of future usage. These predictions are made by predictors that use the heuristic data to generate the prediction models and update the current log data.
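As a numeric illustration of this two-part prediction (the figures and the mean-of-logs model are invented for illustration; Spectra's actual predictors are more elaborate [Flinn02a]):

```python
def predict_execution_time(data_bytes, bandwidth_bps, past_runtimes):
    """Estimate remote execution time as transfer time (data / bandwidth)
    plus processing time predicted from logged history (here, a simple mean)."""
    transfer = data_bytes / bandwidth_bps
    processing = sum(past_runtimes) / len(past_runtimes)  # heuristic model from logs
    return transfer + processing

# 5 MB over a 10 Mbit/s link, with three logged runs of the operation (seconds)
t = predict_execution_time(5_000_000, 10_000_000 / 8, [2.1, 1.9, 2.0])
# t = 4.0 s transfer + 2.0 s predicted processing = 6.0 s
```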

Scheduling is also based on modelled heuristic data of the surrogates available in the environment, any planned future execution of processes, and fidelity of execution (fidelity is provided by Odyssey). The volume of data retrieved from the environment affects the accuracy of the heuristic prediction models. This means that in a small environment with few surrogates, scheduling predictions cannot be guaranteed to be optimal. Spectra uses a utility function to evaluate process executions. The function takes into account time for execution, fidelity and energy consumption. The utility function predicts execution time by summing CPU time, data transfer time, time to manage the cache, and time required to ensure consistency of data. Whilst cache management and data

consistency times can remain stable, execution time varies with different applications. To overcome application-specific execution time variability, each application must provide Spectra's utility function with a similar function indicating potential application requirement or desire for remote execution, i.e. the function 1/T where T is the predicted execution time; a low value would indicate low desire for local task execution and a potential requirement for remote execution. In this example Spectra provides available compatible surrogate(s) information to the application; the decision to perform remote execution is made by the application. Continually evaluating this desire for remote execution and passing it to Spectra's utility function, which returns surrogate information, carries non-negligible overhead; therefore only operations of approximately one second duration are passed to Spectra's utility function.
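The utility prediction and the 1/T desire metric described above can be sketched as follows (the component times are invented, and Spectra's real function also weighs fidelity and energy consumption):

```python
def predicted_time(cpu, transfer, cache_mgmt, consistency):
    """Utility-function prediction: the sum of CPU time, data transfer
    time, cache management time, and data consistency time (seconds)."""
    return cpu + transfer + cache_mgmt + consistency

def remote_desire(predicted_exec_time):
    """1/T desire metric: a long predicted execution yields a low value,
    flagging the operation as a candidate for remote execution."""
    return 1.0 / predicted_exec_time

T = predicted_time(cpu=3.0, transfer=1.5, cache_mgmt=0.3, consistency=0.2)  # 5.0 s
desire = remote_desire(T)  # 0.2: low, so remote execution looks attractive
```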

The Spectra API is presented in Appendix - Spectra. A register fidelity call identifies and registers start operations. The application then defines a set of possible execution scenarios; these scenarios provide different ways to partition execution between remote surrogate and local entity machines. The set of scenarios also specifies levels of fidelity and input parameters, all of which add to operational complexity. The begin fidelity op call determines how to execute an operation and where the operation is to be executed. In short, Odyssey chooses the fidelity level and Spectra chooses the execution scenario (plan), before any actual execution takes place.

The do local op and do remote op calls mark the start of execution of operations; these calls make RPCs to Spectra's surrogates and local server. Other than starting execution of operations, these calls also serve to continually monitor resource usage at surrogates. The end fidelity op call is made by the application to signal the end of operation execution.

In the background to the execution of operations, a snapshot of resources is generated by the predict avail call, which iterates through Spectra's resource monitors and returns predicted resource availability; this information is used by the register fidelity operation that generates potential scenarios. Spectra's monitors are started and stopped in tandem with the start and stop of execution of operations via the do local op and do remote op calls. Finally, add usage logs the observations made, which may be used as future heuristic data.

    2.2.2 Chroma Tactics

Based on Spectra [Flinn02a], Chroma differs from Spectra in two major ways. Firstly, decisions to execute remotely are not made by the application but by Chroma; the reason for this was to introduce flexibility so that application developers are not required to specifically interface with Spectra. Secondly, parallel execution is possible with the introduction of remote execution tactics, or tactics, developed by Balan [Balan03a] (who also developed Chroma).

Tactics enable different ways to subdivide operations that can be executed sequentially or in parallel using RPCs. Once the different tactics to execute an operation are defined, Chroma decides which tactic to use, and whether execution shall be local or on a remote surrogate resource. Tactics are defined in a generated tactics file, an example of which is found in the Appendix. A tactics file is in two parts: part one is a sequence of RPC calls describing the available RPCs and their associated IN input and OUT output parameters. Part two of a tactics file consists of single tactic definition descriptors, each of which is made up of a sequence of RPCs. The RPCs that make up tactic file descriptors can be executed in parallel (within brackets) or sequentially (separated by an & symbol). Each tactic definition is created by the application developer, ideally at the application development stage.
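As an illustration, a tactic descriptor of the form just described (parallel RPCs within brackets, sequential steps separated by &) could be parsed into an execution plan roughly as follows. The RPC names and the comma separator inside brackets are assumptions for illustration, not taken from [Balan03a]:

```python
def parse_tactic(descriptor):
    """Parse a simplified tactic descriptor into a list of stages.
    Steps joined by '&' run sequentially; a bracketed group runs its
    RPCs in parallel. Each stage is a list of RPC names."""
    stages = []
    for step in descriptor.split('&'):
        step = step.strip()
        if step.startswith('(') and step.endswith(')'):
            # Parallel group: split the bracketed RPC list
            stages.append([rpc.strip() for rpc in step[1:-1].split(',')])
        else:
            stages.append([step])  # single sequential RPC
    return stages

# Hypothetical tactic: two feature extractors in parallel, then a combine step
plan = parse_tactic("(extract_colour, extract_shape) & combine")
# plan == [['extract_colour', 'extract_shape'], ['combine']]
```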

The tactic files for each task are collected by a solver that schedules each task by selecting an appropriate tactic plan for the particular resource availability scenario. A tactic plan specifies which tactic to use and where to execute the related RPCs, taken from the tactic file. In order for Chroma's solver to select an optimal tactic plan it requires resource usage information, which is supplied using multiple resource predictors and heuristic information similar to Spectra's heuristic prediction models and resource utility function. However, unlike Spectra, Chroma's solver selects an optimal tactic plan based on resource priority and enforces it in a brute force manner. The reason for this is the claim that there are only a small number of ways to subdivide an

application task for remote execution to a surrogate. An example of Chroma's architecture showing the work flow using tactics may be found in Appendix Chroma.


    Parallel remote execution of tasks in Chroma is achieved when the pervasive environment is over-resourced for the current application requirements. In such a scenario, the same task RPCs from a tactic plan are executed on different surrogates. Chroma employs three different optimisation techniques, called fastest result, data decomposition, and best fidelity. The fastest result technique performs a tactic at a certain fidelity on multiple surrogates and uses whichever result is returned first; any subsequently returned results are discarded. The fastest result technique increases performance but also increases the overall load on surrogates, especially in an environment where multiple application devices are operating simultaneously. The data decomposition technique requires the programmer to explicitly define a function to subdivide input data so that it can be sent to multiple surrogates; the programmer-defined function must also include a method for merging the returned data. The best fidelity technique is implemented in the tactic file: here Chroma sends different tactics to different surrogates and waits a certain time interval; of the results returned within the interval, the one with the best fidelity is chosen and all others are discarded.
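    The fastest result technique can be sketched in Python as follows. The surrogate names, delays, and work function are invented for illustration; real Chroma dispatches RPCs to remote machines rather than local threads:

```python
import concurrent.futures
import time

def run_on_surrogate(name, delay, task_input):
    """Stand-in for executing the same tactic, at one fidelity, on one surrogate."""
    time.sleep(delay)              # simulate surrogates of differing speed
    return f"{name}:{task_input * 2}"

def fastest_result(task_input, surrogates):
    """Dispatch the task to every surrogate, keep the first result, discard the rest."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(run_on_surrogate, name, delay, task_input)
                   for name, delay in surrogates]
        done, not_done = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        for f in not_done:         # subsequently returned results are discarded
            f.cancel()
        return next(iter(done)).result()

print(fastest_result(21, [("slow-surrogate", 0.3), ("fast-surrogate", 0.01)]))
# the fast surrogate answers first: fast-surrogate:42
```

    Note the trade-off described above: every surrogate still performs the work, so responsiveness improves at the cost of extra aggregate load.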

    2.2.3 AIDE

    AIDE, developed by Messer [Messer02a], takes the approach of partitioning a service running on a client and offloading parts of it to surrogates. AIDE used a modular distributed platform employing the Java Virtual Machine (JVM); three modules address monitoring of application execution, partitioning of tasks, and offloading of components. Dynamically partitioning Java programs and offloading code sections to surrogates is based on memory and processing constraints. AIDE used a graph of the application execution history and subdivided the graph to represent code sections to offload to surrogates. The granularity is defined by Java's class component, in relation to Java's code architecture of objects, classes, and higher level components such as JavaBeans.
    Distributed execution was achieved by modifying the JVM to have hooks instead of unique object references. These hooks took the form of modifications to the JVM that flag object references to remote objects and then intercept accesses to remote objects. With these modifications, the AIDE modules were able to convert remote accesses into RPCs between the JVMs on the client application and surrogates. Any JVM on either client or surrogate that receives a request used a pool of threads from which to perform RPCs on behalf of other JVMs. Here, threads are not migrated; rather, invocations and data accesses determine the placement of objects.

    Partitioning of Java code was done using heuristic execution data in the form of an execution graph. Based on the graph MINCUT heuristic, all graph nodes representing a class that cannot be offloaded are partitioned first and stored on the client entity. Following this, each remaining graph node was evaluated using the MINCUT heuristic. The AIDE MINCUT heuristic produced a group of minimum cut partitions that were individually evaluated to determine which one satisfied the partitioning policy. The partitioning policy was based on a cost function that returns the historical data transferred between partitions. Partitions were then selected which could be offloaded without detriment to overall network operation and use of resources such as memory.
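    A toy sketch of this kind of cost-driven placement is shown below. It is not AIDE's actual min-cut algorithm: a single greedy pass stands in for the evaluation of candidate cuts, and all class names, edge weights, and the cost threshold are invented. Non-offloadable classes are pinned to the client first, as described above, and each remaining class is offloaded only if the compute saved outweighs the data that would cross the cut:

```python
def partition(edges, compute, pinned_client, offloadable, net_cost_per_byte=0.001):
    """edges: {(a, b): bytes transferred}; compute: {class: cost of running locally}."""
    client, surrogate = set(pinned_client), set()
    for cls in offloadable:
        # Bytes this class would exchange across the cut with client-side classes.
        cross = sum(w for (a, b), w in edges.items()
                    if (a == cls and b in client) or (b == cls and a in client))
        if compute[cls] > cross * net_cost_per_byte:
            surrogate.add(cls)     # offloading saves more than the transfer costs
        else:
            client.add(cls)
    return client, surrogate

# Hypothetical execution-graph data: the UI class is pinned to the client.
edges = {("UI", "Codec"): 10, ("Codec", "Matcher"): 900, ("UI", "Matcher"): 5}
compute = {"Codec": 5.0, "Matcher": 50.0}
client, surrogate = partition(edges, compute,
                              pinned_client={"UI"}, offloadable=["Codec", "Matcher"])
print(sorted(client), sorted(surrogate))  # ['UI'] ['Codec', 'Matcher']
```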

    Resource monitoring during partitioning and application execution was achieved by augmenting JVM code for method invocations, data field accesses, object creation, and object deletion. The monitored information is obtained at the object level and aggregated to class level, to coincide with the graph description. Memory usage within interclass interactions was also monitored and returned as values represented by graph edges and graph edge parameters. The graph representation of application execution was a fully weighted execution graph: each node represents a class annotated with the memory usage of objects within the class, interactions between class objects, and data transfer between class objects. Graph adaptation and the adaptive partitioning policy used memory usage by tracking free memory space from the JVM garbage collector. A diagram in Appendix AIDE shows the overall AIDE architecture and the hardware and VM used.

    2.2.4 Networked integrated Multimedia Middleware (NMM)

    Although not presented as either a cyber foraging system or a remote execution system, NMM is included in

    this work because it demonstrates the features required of a cyber foraging system and remote execution of


    tasks.

    NMM is a component orientated middleware framework that integrates and configures components in a network. The primary service goal of NMM is to manipulate and render multimedia content within a network of resources prior to delivery to a mobile device.
    NMM is modelled on a logical flow graph made up of nodes representing individual multimedia content processing tasks. The flow graph represents the overall task requested by an application; the overall task is further divided into subtasks represented by different node elements of the flow graph. Typically, a requested application task may be playback, transcoding, or recording of multimedia data. To complete the application request task the data must be read from a source; for multiple media streams the data requires demultiplexing, and the individual streams are then decoded prior to rendering. The application request task is thus represented as a chain of subtask node elements required to accomplish the overall request. Each node within a logical flow graph accepts as input, and produces as output, data in a defined format via input/output jacks. The format of the data is defined as a tuple made up of media type and encoding, i.e. audio/mpeg3. As data is passed from node to node through the logical flow graph, the format tuple changes until the data is ready for rendering, or whatever the task of the final sink node may be.
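    The flow-graph idea can be modelled in a few lines of Python. This is a deliberately minimal sketch, not NMM's API: the node names, format tuples, and work functions are invented, and the "jacks" are reduced to a format check at connection time:

```python
# Minimal model of an NMM-style flow graph: each node declares the format
# tuple (media type, encoding) it accepts and produces; jacks connect only
# when the formats match.

class Node:
    def __init__(self, name, in_format, out_format, work):
        self.name, self.in_format, self.out_format = name, in_format, out_format
        self.work, self.downstream = work, None

    def connect(self, other):
        if self.out_format != other.in_format:
            raise ValueError(f"jack mismatch: {self.out_format} -> {other.in_format}")
        self.downstream = other

    def push(self, data):
        out = self.work(data)
        return self.downstream.push(out) if self.downstream else out

# source -> decoder -> sink, with the format tuple changing along the chain
source  = Node("file-reader", None, ("audio", "mpeg"), lambda _: "compressed-bytes")
decoder = Node("mpeg-decoder", ("audio", "mpeg"), ("audio", "raw"),
               lambda d: d.replace("compressed", "pcm"))
sink    = Node("audio-sink", ("audio", "raw"), None, lambda d: f"rendered:{d}")

source.connect(decoder)
decoder.connect(sink)
print(source.push(None))  # rendered:pcm-bytes
```

    Because each node only knows its jacks, any node in the chain could equally run on a different resource host, which is the property NMM exploits.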

    Jacks act as connectors between nodes, each node having its own input and output jacks. Exceptions depend on the node type: a source node does not have an input jack because it accepts data directly from the multimedia content repository, so the source node produces the data to be consumed by the flow graph. Another exception is the sink node type, which does not have an output jack because it is used to render data or dump it to a hard drive; a sink node consumes data produced by the source node. The current NMM includes 60 such node types, each defined as a unique multimedia content processing task. The life cycle states of a node run through initialisation, initialise output, activated, and finally started; once started, processing may begin.

    The advantage of using the NMM flow graph description is that different nodes can run on different resource hosts; furthermore, the NMM application itself can run on a separate host from any processing resource host. Resource node management, including discovery, reservation, and instantiation, is accomplished using a registry service. Each resource node runs a registry server, which can be accessed by a registry client running on the application host. An application can use the registry client to query local or distributed running registry servers. When an application requests a resource node from the registry server, the registry server checks node availability; if available, the node is instantiated and goes through its life cycle. Global resource host information is gained through registry information.

    NMM uses a client-server and peer-to-peer communication architecture to communicate between nodes in a network, in which, using a proxy object, invocation and execution are separated. Thus, the user of a proxy object does not need to know the resource host information on which the node is running; only the proxy object requires this information. These communication channels are therefore abstractions for all communication between NMM components, such as input/output binding between jacks of different nodes running on different resource surrogates.

    Service discovery and remote execution are performed by a server registry service. Discovery and registration of nodes is performed statically during initial registry service setup; however, once a server registry is initialised, any added plug-in is dynamically registered with the registry service. Plug-ins dynamically added to an existing network are added to a hierarchy of registry registers; the interfaces between clients and servers in an existing network are called IRegistry and IServerRegistry, a description of which may be found in Appendix NMM.

    At this juncture it is important to underline the difference between a graph description, a node description, and plug-ins, in order to make clear what information is communicated by the registry service. The concepts used for describing entities administered within the registry service are the same as those for querying entities from the registry service. A plug-in is specified by a node description, whereas a complete flow graph is stored within a graph description. Each description type contains all relevant attributes, such as object name; e.g. a node description includes node name, node type, format, and sharing attributes. In addition to attributes, a node description also stores a list of events, called configuration events, used when configuring a plug-in instance.


    Therefore, a list of any possible states, and the configuration attributes for each state, exists as configuration events.

    A node description can be further subdivided into subsets of the overall node description; this allows querying all node descriptions in the registry and returning only those subsets that fulfil the query criteria. The properties in a graph description include the specification of nodes and their connections, the specification of the communication channel, and synchronisation.

    Distributed synchronisation in NMM distinguishes intra-stream and inter-stream synchronisation. Intra-stream synchronisation refers to timings between multiple presentations of the same media stream, i.e. a stream of subsequent video frames. Inter-stream synchronisation refers to synchronising media streams themselves, i.e. synchronising lip-sync for audio and video streams. Each NMM message holds a timestamp that contains entries for time and a stream counter. The time and stream counter entries of the timestamp, together with a global system clock, are the main reference to time when synchronising between media streams passed from one processing task node to another, either locally or remotely on surrogates. Synchronisation between surrogate nodes is governed by a set of synchronisation controllers, or synchronisers, for each task node. These synchronisers realise inter-stream synchronisation by implementing a synchronisation protocol, and provide an interface allowing the application to modify the operation of a corresponding flow graph, i.e. for pausing data processing.

    The overall objective of sink synchronisation is to provide either synchronised playback of the media content, or rendering, for distributed audio/video sinks. An example of distributed synchronisation to a sink application node may be found in Appendix NMM. Here a buffer is simply a collection of messages containing timestamp information. Timestamps are handled by locally running controllers delegated by the synchronising sink nodes; these local controllers deal with intra-stream synchronisation. Controllers decide when to present a particular buffer, from among the buffers of multiple streams, by matching buffer latency and flow graph latency. A buffer requires a certain time interval to reach a node after its timestamp has been set. This timestamp-to-node interval is called the real latency and is expressed as the difference between the arrival time and the time set within the timestamp. Two buffers with corresponding timestamps will have different times for reaching the sink node, i.e. real latency 1 and real latency 2. If one imagines latency as the time from the node where the timestamp was set, called the sync-time, to the sink node where the buffer will be presented, called the presentation-time, then latency = presentation-time - sync-time. Alternatively, if a latency is given, the controller calculates a theoretical latency, or theo-latency, from the sync-time and the given latency. During runtime a controller checks whether real-latency > theo-latency + max-skew, where max-skew is a previously defined tolerance. If the computed real-latency exceeds this value, the buffer is considered too old and may be treated as invalid. However, if real-latency < theo-latency + max-skew, presentation of the buffer will be delayed. In summary, intra-stream synchronisation attempts to maintain constant latency for stream buffers, so that the temporal distance between buffers is the same as that between their corresponding sync-times.

    This differs from inter-stream synchronisation, which attempts to maintain equal latencies across streams. This is achieved by setting the theo-latency of all controllers to the maximum real-latency of all current streams. First, every controller sends the computed real-latency for its first buffer to arrive to the synchroniser. The synchroniser then computes the theo-latency as the maximum of all these latencies and sets this value as the theo-latency for all connected controllers.
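    The latency bookkeeping described above can be summarised in a short sketch. This is a paraphrase of the rules in the text, not NMM code; the numeric values and the max-skew tolerance are illustrative:

```python
# A buffer's real latency is arrival_time - sync_time. Intra-stream rules
# drop buffers that are too old and delay buffers that arrive early;
# inter-stream synchronisation sets every stream's theoretical latency to
# the maximum real latency observed across streams.

def real_latency(arrival_time, sync_time):
    return arrival_time - sync_time

def classify(real_lat, theo_lat, max_skew):
    """Decide what to do with a buffer, per the intra-stream rules above."""
    if real_lat > theo_lat + max_skew:
        return "too-old"          # may be treated as invalid
    if real_lat < theo_lat + max_skew:
        return "delay"            # hold the buffer until its presentation time
    return "present"

# Inter-stream: the synchroniser picks the maximum real latency as theo-latency.
stream_latencies = [0.040, 0.025, 0.060]   # first-buffer real latencies (seconds)
theo_lat = max(stream_latencies)           # 0.060, applied to every controller

print(classify(real_latency(1.100, 1.020), theo_lat, max_skew=0.005))
# "too-old": an 80 ms real latency exceeds 60 ms theo-latency + 5 ms skew
```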

    Search and instantiation is performed in two stages. The first stage is referencing all nodes that match a given node description; here node description refers to a subset of a complete graph description stored in the server registry. The second stage is instantiation of the first node in the list of nodes returned in response to the stage one matching of references; the response consists of a registry identifier and a node identifier. Using these identifiers, a complete flow graph description can be requested using the first responding node. A description of a client registry, and of the registry hierarchy of registers that hold a number of specialised registers for different identifiers and node attributes, may be found in Appendix NMM.

    The registers in the registry hierarchy are accessible via the IRegistry interface; Registry1394 administrates firewire compatible devices, and LocalRegistry provides information for non-specialised available plug-ins. Shown


    in the ClientRegistry is the scope of the information held and the subsequent operations performed on the flow graph. Finally, a code snippet that a developer might use to access the registry service may be found in Appendix NMM. The first part creates a central application object for the application. Here the system server registry is contacted; if contact fails, a local instance is created instead. The second line requests the client registry. The next part, starting with NodeDescription, is an example of requesting a node, specified by node name, from the registry service. Here a graph description is used to request a simple flow graph consisting of three nodes: a source node for reading data from a file, a converter for decoding MPEG audio, and a sink node for outputting uncompressed audio. In the last part of the code snippet, all edges of the flow graph are specified and connected and all nodes activated; finally, the flow graph is started.

    2.2.5 Goyal

    Goyal [Goyal04a] was motivated to develop a cyber foraging system on a widely available platform without

    the requirement of a large middleware layer.

    Goyal developed a lightweight cyber foraging system that differed from Spectra and Chroma because it did not require the use of a common file system such as Coda/Odyssey. Similar to AIDE, Goyal employs virtual machine technology; however, in contrast to AIDE, partitioning is done by the application developer and not by any automated code division method. Multiple virtual surrogates can be created on the same surrogate host. The argument for using virtual machine technology was that independent virtual servers allowed for greater isolation, flexibility, resource control, and clean-up compared to running on real host surrogate machines. Isolation, in terms of no interference between virtual machines. Flexibility, as client applications can install arbitrary software on the virtual machine. Resource control, in that the resources of the physical host can be fairly allocated between multiple virtual machines; this also allows the physical host to compute separate applications from the virtual machines without draining virtual machine resources. Clean-up is automated and simple: when a virtual machine instance shuts down, the allocated disk partition on the host surrogate is restored to its original clean state.

    Service discovery was managed by a separate service discovery server that maintains lists of registered surrogates and their individual resource capabilities, represented in an XML syntax description. When a client requires surrogate resources, the client queries the service discovery server by requesting particular resources. The service discovery server matches the existing listed registered surrogates with the client's resource requests. Matching requests to resources is based on previous profiling of application resource requirements made by the developer.
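    The XML-based matching step might look like the following sketch. The schema, resource names, and surrogate entries are invented; Goyal's actual XML syntax is not specified here, so this only illustrates matching a client request against registered capability descriptions:

```python
# Sketch of resource matching in the style of Goyal's service discovery server.
import xml.etree.ElementTree as ET

# Registered surrogates and their capability descriptions (hypothetical schema).
SURROGATES = {
    "surrogate-a": "<resources><memory mb='512'/><cpu mhz='1800'/></resources>",
    "surrogate-b": "<resources><memory mb='2048'/><cpu mhz='3000'/></resources>",
}

def parse(desc):
    root = ET.fromstring(desc)
    return {"memory": int(root.find("memory").get("mb")),
            "cpu": int(root.find("cpu").get("mhz"))}

def match(request_desc):
    """Return names of registered surrogates satisfying every requested resource."""
    want = parse(request_desc)
    return [name for name, desc in SURROGATES.items()
            if all(parse(desc)[k] >= v for k, v in want.items())]

# A client request, profiled in advance by the application developer:
request = "<resources><memory mb='1024'/><cpu mhz='2000'/></resources>"
print(match(request))  # ['surrogate-b']
```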

    A typical implementation of the Goyal system workflow may be found in Appendix Goyal. Initially, the

    client sends a request to the server to discover a surrogate from its listed registered surrogates. The service

    discovery server replies with an IP and port number of the surrogate manager of a listed registered surrogate.

    With this information the client contacts the surrogate manager with a service start request. The surrogate

    manager determines adequate resources by matching application requirements with available resources, using

    the same XML notation as the initial service discovery request.

    After authenticating the client and allocating the matched resources to a newly started virtual machine, the surrogate manager sends the client a service start response containing the IP of the new virtual machine. During client application and virtual machine interaction, the client invokes an operation on the surrogate by sending a Sub Task Configuration Request to a virtual server manager on the surrogate. A Sub Task Configuration Request from the client includes a URL of the client program to run; the program at that URL includes all the information the virtual surrogate requires to install and run it.

    Authentication between client and virtual surrogate is addressed using a flexible authentication framework that supports multiple authentication mechanisms, specified by the client when first connecting to the surrogate machine during the service start request stage. The different authentication mechanisms available for the client to specify are SSL, TLS, and SSH. Once a client-surrogate session is established, any subsequent data transfer or communication uses the client's public key for authorisation. The client's public key is stored by


    the surrogate and service discovery server for future reference, and future client service discovery requests.

    2.2.6 Slingshot

    The motivation behind Slingshot [Su05a] is to eliminate the bottleneck that can occur when a client application attempts cyber foraging on remote surrogates via a wireless hotspot. Slingshot is a client-surrogate architecture for deploying mobile services at wireless hotspots, based on the concept of continuous replication of application states instantiated on virtual surrogates as the client moves between available virtual surrogate resource services. The Slingshot architecture replicates remote application state on surrogate computers co-located with wireless access points. A first replica of each application is executed on a trusted safe server and acts as a backup if subsequent surrogates fail. A subsequent second replicated application state is co-located on a virtual surrogate in closer proximity than the first, for quicker response times. The client application broadcasts application requests to all replicated states on all virtual surrogates, and responds only to the quickest return from any of them. A database of the state of each replicated application maintains checkpoints that serve as the start point of a new replication instance. Replication in this fashion is used instead of migration of replicated state from surrogate to surrogate because, during migration, processing cannot continue. In Slingshot, processes continue on previous replicated states while new replicated states are being instantiated. Slingshot instantiates a new replica by checkpointing the first replica, migrating its volatile state to a surrogate, and then replaying any operations that occurred after the checkpoint. The workflow of Slingshot may be found in Appendix Slingshot.

    The first safe replicated state surrogate server is called the home server, and it maintains a service database. The service database maintains the current service state of the replicated server on its virtual disk, using SHA-1 values assigned to 4 KB chunks of the latest replicated state. Therefore, at any time the home server has the latest updated replicated state.
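    The chunk-hashing bookkeeping is easy to illustrate. The sketch below shows only the idea of tracking SHA-1 values over 4 KB chunks so that changed chunks can be identified between two state snapshots; the state data is synthetic and the surrounding database machinery is omitted:

```python
# SHA-1 values over 4 KB chunks of replicated state, as in Slingshot's
# service database; comparing two snapshots reveals which chunks changed.
import hashlib

CHUNK = 4096  # 4 KB

def chunk_hashes(state: bytes):
    return [hashlib.sha1(state[i:i + CHUNK]).hexdigest()
            for i in range(0, len(state), CHUNK)]

def changed_chunks(old_hashes, new_hashes):
    """Indices of chunks whose SHA-1 differs between two state snapshots."""
    return [i for i, (a, b) in enumerate(zip(old_hashes, new_hashes)) if a != b]

old_state = b"A" * CHUNK + b"B" * CHUNK
new_state = b"A" * CHUNK + b"C" * CHUNK   # only the second chunk has changed
print(changed_chunks(chunk_hashes(old_state), chunk_hashes(new_state)))  # [1]
```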

    2.2.7 Vivendi

    Vivendi was developed by Balan [Balan07a], who also developed Chroma tactics [Balan03a] and VERSUDS [Balan02a]. The Vivendi system has two main components: one that deals with creating a remote execution tactic file (tactic), and the Chroma runtime system [Balan03a]. Here we only describe the Vivendi partitioning system and the interactions between Vivendi, tactics, and Chroma. Please refer to previous sections for further information on Chroma and tactics.

    Vivendi is written as a little language [Bentley86a] for rapid modification of applications to enable partitioning of application tasks. The motivation for Vivendi is to reduce application development time by reducing the complexity of modifying an application to support cyber foraging at the developer level, allowing even novice and less experienced application developers to develop cyber foraging enabled applications. Essentially, Vivendi requires the developer to preplan which application tasks should be considered for remote execution and to define the critical variables of the task as parameters used to predict the expected resources required to carry out the task, written in the little language as a tactics file. For each critical task variable parameter, the developer includes a fidelity for carrying out the task with satisfactory quality results. The developer also specifies the RPCs that define the actual application computation task a surrogate performs on behalf of the client application. Finally, the developer defines combinations of different RPCs that can carry out the application task within the defined fidelity and quality.

    Chroma complements the Vivendi tactics file by selecting the appropriate tactic and binding RPCs to surrogates. Selection of a tactic is based on Chroma's resource management, prediction, and fidelity selection functions, as depicted in Appendix Vivendi. The Chroma solver module responds to Vivendi stubs generated from the RPCs defined in the tactics file, and predicts the current optimum tactic to use. The solver prediction process uses monitored surrogate resource data, computed using Chroma's utility function, and similar tactic heuristics.


    Vivendi generates two types of stubs, a standard RPC stub and a wrapper stub. The wrapper stub is manually written by the developer and contains the application code methods required for the task defined in the original tactic; it effectively contains the lower level information a surrogate needs to do the task. Using a Vivendi wrapper stub provides a convenient interface between Chroma and the targeted application. An example of a Vivendi tactic file and Vivendi wrapper stub may be found in Appendix Vivendi.

    2.2.8 EyeDentify

    EyeDentify is a smartphone object recognition application developed on the Android OS that uses the Ibis Distributed Deployment System to deploy remote applications onto surrogates, and the Ibis High Performance Programming System for communication.
    The authors developed two versions of the EyeDentify application: one version performed all computation locally, and a second version performed computation on surrogates; the response times for the same computation processes were compared. Results revealed a 60-fold increase in responsiveness using Ibis for cyber foraging on remote surrogates compared with local phone resources.

    The Ibis middleware consists of a number of sub-projects, each of which implements a part of the grid middleware requirements. The left side panel represents the Ibis Deployment System, with JavaGAT as the main component. The right panel represents the Ibis High Performance Programming System, the main component of which is the Ibis Portability Layer (IPL). The combination of JavaGAT and IPL forms the main cyber foraging mechanism for EyeDentify. A graphic representation of the Ibis middleware used can be found in Appendix EyeDentify. JavaGAT has adaptors able to bind to any middleware; the adaptors map the JavaGAT API to middleware calls, including SSH. The EyeDentify Android application used two adaptors, one for client resource access and one for surrogate resource access using SSH. On top of JavaGAT is a deployment library called IbisDeploy (Deploy) that starts distributed applications developed using the Ibis High Performance Programming System. On top of IbisDeploy is a GUI from which remote applications can be started. The deployment procedure of an Ibis application on a remote surrogate involves the following sequence of eight subtasks. 1) Replicate the application, libraries, and input file on the remote surrogate. 2) Start an Ibis server registry process. 3) Form an overlay network. 4) Construct middleware specific job descriptions. 5) Submit the job description to the remote surrogate. 6) Monitor job statuses. 7) Retrieve the output file when the process is completed. 8) Clean up the remote file system.

    Service discovery is performed by the IbisDeploy library when it defines job descriptions and Ibis applications using a namespace concept. An Ibis application description contains a main class and virtual machine options and arguments; a remote surrogate description contains details of how a remote surrogate should be accessed. Both the IbisDeploy library and the GUI are ported to Android. In summary, the application itself is developed on the client and deployed to remote surrogates; therefore the only service software required to run on surrogates is the default Ibis middleware that JavaGAT binds to, and a JVM.

    The EyeDentify application is an object recognition application that has two stages of operation. Stage one is called the learning mode, in which an image of an object is stored in an internal database with a predefined identification profile. In stage two, called the recognition mode, another image is captured and matched to images in the internal database, and the best match between the second image and the database images is then presented. Matching images is performed by learning algorithms that extract features and attributes of the second image, such as colour histograms, shapes, relative size, etc. The recognition phase of the application is very resource consuming; this was the computation that the authors offloaded to surrogates.
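    The colour-histogram part of such matching can be sketched in a few lines. This toy example only illustrates the idea of histogram-based matching; EyeDentify's real feature extraction is far richer, and the "images" here are flat lists of 8-bit grey values with an invented database:

```python
# Toy recognition-mode matching via normalised intensity histograms.

def histogram(pixels, bins=4):
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // 256] += 1     # map 0..255 into `bins` buckets
    total = len(pixels)
    return [h / total for h in hist]   # normalise so image sizes can differ

def distance(h1, h2):
    return sum(abs(a - b) for a, b in zip(h1, h2))

def best_match(query, database):
    """Return the database label whose histogram is closest to the query's."""
    qh = histogram(query)
    return min(database, key=lambda label: distance(qh, histogram(database[label])))

# Hypothetical learning-mode database: label -> stored image pixels.
database = {"mug": [20] * 50 + [220] * 50, "plant": [90] * 100}
print(best_match([25] * 40 + [210] * 60, database))  # mug
```

    The per-pixel loop over full-resolution images is exactly the kind of computation that dominates the recognition phase, which is why it was the part offloaded.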

    2.2.9 DiET

    DiET, developed by Kim and co-workers [Kim09a], is a framework that transforms original Java bytecode from a remote content service provider into progressively smaller versions: one version for surrogates, and an even slimmer version for execution on a client. The motivation for the work is to reduce Java bytecode into serialised


    distributed objects that are usable by surrogates and clients, without the need for major developer modifications to the original Java application. This slimming down of bytecode is done by replacing the main bodies of methods with remote procedure calls (RPCs). A client starts the process by requesting an application to execute from a remote content service provider. The service provider then slims down the application Java bytecode and transfers it to prediscovered surrogates and clients in the pervasive compute environment. The surrogates receive a server bytecode and clients receive a smaller slim bytecode. Since no modification to code functionality takes place, once the client and server have received their versions, the cyber foraging takes place as ATA using the JVM invoked on surrogates. A graphic description of the transfer of server and slim bytecode is shown in Appendix DiET.
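    DiET performs this replacement by rewriting Java bytecode; the Python analogue below only illustrates the idea of keeping a method's signature while replacing its body with a forwarding stub. All names are invented, and the "remote" call is simulated with a local lookup:

```python
# Analogue of DiET's slimming step: the slim client version keeps method
# signatures but replaces bodies with calls forwarded to the server version.

SERVER_METHODS = {}

def server_method(fn):
    """Register the full implementation, as kept in the server bytecode."""
    SERVER_METHODS[fn.__name__] = fn
    return fn

def rpc_stub(name):
    """Return a slim replacement whose body just forwards the call."""
    def stub(*args, **kwargs):
        # In DiET this would be a remote procedure call to a surrogate.
        return SERVER_METHODS[name](*args, **kwargs)
    return stub

@server_method
def render_scene(width, height):
    return f"rendered {width}x{height}"    # stand-in for heavy computation

# The client receives only the stub, not the method body:
slim_render_scene = rpc_stub("render_scene")
print(slim_render_scene(640, 480))  # rendered 640x480
```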

    2.2.10 Instant-X

    Instant-X is a component based middleware platform that provides a generic programming model with an API for essential tasks of multimedia applications with respect to signalling and data transmission. The motivation behind Instant-X is to develop spontaneous communication software, compatible with multimedia encoder/decoder protocols. The work argues that standard multimedia encoder/decoder protocols are limited as communication software. For example, the Java Media Framework [Sun99a] offers basic access to multimedia codecs and RTP data transmission, but does not support further communication mechanisms such as signalling. However, JAIN SIP does support signalling, but requires considerable configuration by the multimedia application developer.

    The thrust of the Instant-X concept is the ability to replace specific protocol implementations without changing the application code of the multimedia application. Instant-X also supports dynamic deployment of unavailable components at runtime. Although not a cyber foraging system per se, Instant-X is included in this section because, implemented with OSGi [OSGi07a] as a component platform, the system demonstrates interesting cyber foraging functionality. The programming model consists of three elements: binding, session and context. A graphic representation of the programming model can be found in Appendix Instant-X. A binding is a local endpoint of an application represented by a URI; it activates the URI and maintains its active status. A session represents a P2P relationship between participants or actors; each participant has a unique URI identifier, such as SIP:[email protected] when using the SIP URI method. The URI identifiers are encapsulated in bindings. A context contains optional parameters required for sessions and bindings, such as permissions. A SIP session may contain multiple RTP sessions for audio and video. The programming model is designed to provide a generic tool for developers, who do not need to worry about the underlying protocols required by their application. The generic API of Instant-X is such that the application does not need to change if the protocol implementation changes. The Instant-X API employs OSGi [OSGi07a] as a Java service-oriented architecture that dynamically discovers collaborative components and adapts to changing device composition across a variety of networks, without the need for a device restart. Instant-X is demonstrated using cloud computing with OSGi. By using the cloud computing paradigm, surrogate discovery and scheduling are deferred to the cloud.
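    The binding/session/context programming model can be sketched as follows. This is an illustrative Python sketch only: all class and method names, and the example SIP URI, are assumptions, not the actual Instant-X API.

```python
# Illustrative sketch of the Instant-X programming model (binding, session,
# context). Names are assumptions, not the real Instant-X API.

class Binding:
    """Local endpoint of an application, represented by a URI."""
    def __init__(self, uri):
        self.uri = uri
        self.active = False

    def activate(self):
        # Activates the URI and maintains its active status.
        self.active = True

class Context:
    """Optional parameters required for sessions and bindings."""
    def __init__(self, **params):
        self.params = params  # e.g. permissions

class Session:
    """P2P relationship between participants, each identified by a URI."""
    def __init__(self, context=None):
        self.context = context or Context()
        self.participants = []   # bindings encapsulating URI identifiers
        self.subsessions = []    # e.g. RTP sessions for audio and video

    def join(self, binding):
        binding.activate()
        self.participants.append(binding)

# Hypothetical usage: a participant joins a session via a SIP URI binding.
binding = Binding("sip:user@example.org")  # illustrative URI
session = Session(Context(permissions="audio"))
session.join(binding)
```

    The point of the sketch is the separation of concerns: the application manipulates bindings, sessions and contexts, while the middleware is free to swap the underlying protocol implementation behind the same API.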

    2.2.11 Scavenger

    Scavenger is a dual-profile task scheduling system, written in Python as a hybrid cyber foraging approach based on Locust [Kristensen07a], consisting of a daemon installed on surrogates and libraries installed on the client. Scavenger is motivated by increasing the effectiveness of heuristic profiling for scheduling, taking into account task complexity and merging task-centric and peer-centric profiles. The system consists of two independent software components: a daemon running on surrogates using Stackless Python, and a library running on the client using standard Python. The libraries on the client contain the mobile code executed on surrogates through the RPC entry point of the surrogate daemon. Libraries can be invoked when the application starts cyber foraging, or automatically without the need to start the application.

    The daemon on the surrogates has a front-end to receive RPCs, and a mobile code environment. The mobile code environment allows dynamic installation and execution of the Python code transferred from the client in an RPC.

    Kristensen argues that mobile code is a necessity for true mobility, because pre-installed tasks on surrogates mean all surrogates everywhere must have all tasks pre-installed. Similarly, using a VM is too heavyweight: a full VM takes too long to instantiate, especially if the user is mobile and out of reach within a few minutes. Kristensen argues that using trusted mobile code is better; if the code is not installed, then the mobile client simply installs it.

    The daemon execution environment spawns a core scheduler on the surrogate that handles the offloaded application tasks for a particular core. When installing the daemon on a surrogate, the number of cores to offer as surrogate cores is user-configurable, in case the surrogate is a machine such as a laptop used by other users. Here the laptop may be used locally for other activities and still serve as a surrogate for cyber foraging in a pervasive compute environment. A high-level view of the Scavenger architecture can be found in Appendix Scavenger.

    Once a surrogate has performed an offloaded task, the task is stored at the surrogate for future use under an automated UID with MD5-sum naming. When invoking a given task, Scavenger first queries whether the task is already installed before installing it from the mobile code. Mobile code therefore does not have to be transferred along with a task whose code is already at the surrogate. This exploits the fact that many transient clients will normally use a certain number of tasks more than others.
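    The install-once behaviour can be sketched as follows: the surrogate derives a UID from an MD5 sum of the task's code and skips the transfer when that UID is already present. This is a minimal sketch under assumed names; Scavenger's actual naming scheme and protocol are not shown here.

```python
import hashlib

class TaskCache:
    """Surrogate-side store of offloaded tasks under MD5-derived UIDs."""
    def __init__(self):
        self._tasks = {}

    @staticmethod
    def uid(code: bytes) -> str:
        # Automated UID: the MD5 sum of the task's mobile code.
        return hashlib.md5(code).hexdigest()

    def is_installed(self, uid: str) -> bool:
        return uid in self._tasks

    def install(self, code: bytes) -> str:
        uid = self.uid(code)
        self._tasks.setdefault(uid, code)
        return uid

# Client side: query by UID first; transfer mobile code only when missing.
def invoke(cache: TaskCache, code: bytes) -> str:
    uid = TaskCache.uid(code)
    if not cache.is_installed(uid):
        cache.install(code)   # code travels with the task only this once
    return uid                # subsequent invocations reuse the cached task
```

    Repeated invocations of the same task on the same surrogate then cost only a UID lookup, which is what makes the scheme pay off for frequently reused tasks.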

    Security and trust measures include blacklisting and whitelisting of imported known standard library modules.

    Surrogate discovery is performed using a presence discovery framework, used by clients to discover surrogates. XML-RPC is used.
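    The client/daemon split over XML-RPC can be sketched with Python's standard xmlrpc modules. The RPC method names, the port, and the install/execute interface are assumptions for illustration; Scavenger's actual daemon interface is not documented here.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# Surrogate daemon: an RPC front-end over a mobile code environment.
def run_daemon(port):
    server = SimpleXMLRPCServer(("127.0.0.1", port), logRequests=False)
    tasks = {}

    def install_task(name, source):
        env = {}
        exec(source, env)          # dynamic installation of transferred code
        tasks[name] = env[name]
        return True

    def execute_task(name, arg):
        return tasks[name](arg)    # RPC entry point for offloaded execution

    server.register_function(install_task)
    server.register_function(execute_task)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

# Client library: transfer the mobile code once, then invoke it remotely.
daemon = run_daemon(8877)
client = ServerProxy("http://127.0.0.1:8877")
client.install_task("double", "def double(x):\n    return 2 * x")
result = client.execute_task("double", 21)
daemon.shutdown()
```

    A real daemon would combine this with the UID-based cache above and the blacklist/whitelist checks before executing transferred code.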

    Scheduling - The main contribution of Scavenger is the dual profiling during scheduling. Each task has two profiles: a task-centric profile, where a globally applicable task weight is stored, and a peer-centric profile, maintained for each (peer, task) pair that has been encountered. The peer-centric profile stores information about how exactly a peer performs with regard to a specific task. These stored peer-centric profiles are used to probabilistically determine how an unknown peer may perform on a particular task. Scheduling is then based on maintaining a history-based peer-centric profile. When profiling information is required during scheduling, the history-based peer-centric profile is read first to see whether that particular peer has been used before, and for which tasks. If the peer is unknown, then the task-centric profile is consulted. The task-centric profile is also history based and contains a weight for each task. The task-centric weighting is calculated as follows:

    Tweight = (Tduration × Pstrength) / Pactivity    (2.1)

    where Tduration is the task duration in seconds, Pstrength is the peer strength from the nbench benchmark1, and Pactivity is the number of tasks running during execution.

    During a task, an expected Pstrength is derived by scaling with Tduration; the derived expected Pstrength is used as a measure to reason about expected task running times on other surrogates. For example, a task with Tduration of 1 second on a surrogate with Pstrength 40 should take approximately 2 seconds on a surrogate with Pstrength 20. Experiments verified a correlation between theoretical and real task duration times on surrogates with varying compute resources carrying out the same task.
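    Equation 2.1 and the scaling step can be sketched numerically. The helper names are illustrative, and the example assumes the second surrogate has half the strength (Pstrength 20), which is consistent with the 1 s to 2 s example in the text.

```python
def task_weight(t_duration, p_strength, p_activity):
    # Equation 2.1: Tweight = (Tduration * Pstrength) / Pactivity
    return (t_duration * p_strength) / p_activity

def expected_duration(t_weight, p_strength, p_activity=1):
    # Scale the globally applicable weight by the target peer's strength
    # to reason about expected running time on another surrogate.
    return (t_weight * p_activity) / p_strength

# A 1-second task on a surrogate of strength 40 (one task running) ...
w = task_weight(1.0, 40, 1)
# ... is expected to take about 2 seconds on a surrogate of strength 20.
t = expected_duration(w, 20)
```

    The weight is thus a peer-independent quantity: dividing it by any peer's benchmark strength recovers an expected duration on that peer.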

    The profiles are two-dimensional, with a second dimension that takes task complexity into account. Task complexity is also considered as a factor in the scheduling process. Task complexity variables such as input size and value will vary. At the development stage, the application developer is asked to provide a description of the input parameters that determine a task's complexity. The developer-specified complexity parameter values and/or sizes are combined and evaluated to yield a single value, which may be, for example, the size of an input file. Using a simple matching algorithm, this value in combination with Tweight is used to update the dual scheduling profiles.

    1 http://www.tux.org/~mayer/linux/bmark.html

    Scavenger does not partition tasks into subtasks; a task described in mobile code cannot be further divided into subtasks that run on different surrogate machines.

    2.3 Summary

    The methods used to address surrogate discovery, trust establishment and task scheduling are shown in Table 2.1. With the exception of Scavenger, previously proposed cyber foraging systems have not supported all four design features. The reason for this may be the non-commercial nature of these systems, which focus on particular aspects of cyber foraging rather than a complete system.

    Table 2.1: Cyber foraging systems reviewed and their adherence to the four fundamental features required of a cyber foraging system as defined by Satyanarayanan [Satyanaranyanan01a] and Balan [Balan07a]

    Figure 2.1: Classification of cyber foraging systems as thin and thick ATA and AAA

    2.4 Discussion

    In this section, the literature and cyber foraging tools described previously are discussed in the context of service/surrogate discovery; the mechanisms of these functionalities, summarised in Table 2.1 in the previous section, are expanded upon.

    1 Networked integrated Multimedia Middleware


    2.4.1 Surrogate Discovery

    Introduction

    Of the challenges that need to be addressed in the fusion of distributed and mobile computing [Satyanaranyanan01a], service discovery is the most critical; without it, transient use of cyber foraging would not be possible. Discovery enables entities to properly discover, configure and communicate with each other [Zhu05a]. Discovery of a specific service in a pervasive compute environment is hampered because it is unreasonable to expect surrogates to have services pre-installed to cater for all transient clients. The ability to dynamically discover services has less administrative overhead than traditional methods, which require prior knowledge of service existence and the manual input of parameters such as computer names, IP addresses and URLs to configure the discovered surrogate service for the client application.

    In this work, we replace the concept of searching for a service with searching for a surrogate. While prior knowledge of a service's existence in fixed, limited networks may seem practical, a pervasive computing environment will have transient client usage and potentially hundreds of transient surrogates and surrogate services. The cyber foraging systems reviewed in this section are not specifically discovery services; they are tools and aids for the developer to include cyber foraging functionality in mobile applications.

    In the work of Zhu [Zhu05a], ten service discovery design features are defined in a unifying taxonomy of terms and definitions, and a comparison is made of the adherence to these terms of some existing discovery protocols. The protocols that Zhu compares are primarily for home and enterprise environments, which do not compare completely with a pervasive compute environment; however, the design approaches are useful guides and reference points for the design of a discovery protocol in a pervasive compute environment. In this discussion we first describe some current discovery protocols, and then make the same comparison of adherence that Zhu made, in Table 2.2, for the discovery afforded by the cyber foraging systems described in the previous section.

    Existing Discovery Protocols

    Previous studies of announcement and discovery have focused mainly on static environments without mobile entities. Examples of static discovery are Jini [Jini03a], Salutation [Salutation01a] and Universal Description, Discovery and Integration [UDDI11a] (UDDI, for Web services). In such static environments an entity can act as both client and server (peer); typically one or more peers take on the role of registrar and maintain a central register. This registrar peer is contacted by peers that wish to announce that they provide a service, and the service is registered by the registrar peer. In return, the registrar peer provides a lease to the service peer, which indicates how long the registrar will continue to announce the service to the network. When a peer wants access to a service, it informs the registrar peer of its service requirements. The registrar then matches the requirements against registered services, and returns either a list of known registered services or information that provides communication details for a registered peer currently providing the required service. This is a centralised approach that does not perform well in a mobile environment, which may have mobile peers that both use and offer services concurrently.
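    The registrar pattern described above can be sketched as follows. This is a deliberately simplified model under assumed names; real protocols such as Jini differ considerably in their wire formats and lease semantics.

```python
import itertools

class Registrar:
    """Central register: peers announce services and receive leases."""
    def __init__(self, lease_seconds=300):
        self.lease_seconds = lease_seconds
        self._services = {}          # service type -> list of provider info
        self._ids = itertools.count()

    def register(self, service_type, provider):
        self._services.setdefault(service_type, []).append(provider)
        # The lease tells the provider how long the registrar will keep
        # announcing this service before the provider must renew it.
        return {"lease_id": next(self._ids), "duration": self.lease_seconds}

    def lookup(self, service_type):
        # Match a client's requirement against registered services and
        # return contact details for the current providers.
        return list(self._services.get(service_type, []))

# A service peer registers; a client peer later looks the service up.
reg = Registrar()
lease = reg.register("printing", {"host": "10.0.0.5", "port": 9100})
providers = reg.lookup("printing")
```

    The centralisation is visible in the sketch: every announcement and every lookup passes through the one registrar, which is exactly the property that scales poorly in a mobile peer environment.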

    An example of a mobile network discovery protocol is the gossip approach taken by Lee [Lee03a], called Konark. Here, mobile peers that offer services announce the services they are aware of only to neighbours whom they believe are not aware of said services. The gossip announcement approach is optimal in a multihop network because of its viral nature of communication. A drawback of the gossip approach is its high overhead in wide networks; it is better suited to single-hop networks such as Mobile Ad hoc Networks (MANETs). In cyber foraging, constraining data transfer to a single hop is impractical.

    Another alternative mobile discovery protocol is Universal Plug and Play (UPnP) [UPnP02a], which is based on the Simple Service Discovery Protocol (SSDP) [Goland99a]. UPnP is not centralised to a single register; rather, when a peer joins a network it advertises a service to control points that represent potential clients for the service. When a potential client (control point) joins the network, it sends a discovery message requesting a service, and the existing service-advertising peer responds to the requesting client by returning a service announcement message. A service announcement message has a limited service description; should the requesting client wish to have more service information, it responds to the service announcement message requesting a more comprehensive service description, which the service-advertising peer duly provides. On receiving a full service description the client may choose to use the service, which it does by communicating with the service peer using the Simple Object Access Protocol (SOAP) [SOAP08a].

    There are two drawbacks to using UPnP for cyber foraging in a pervasive compute environment. Firstly, the discovery process is passive, in that networked peers only advertise once and then wait for a potential client to respond. Secondly, after a service announcement message is received, the client must yet again contact the service peer and request more service description information before deciding whether to use the service. In the context of task scheduling in cyber foraging, a client requires continuous service discovery information from all service surrogates. Continually contacting every surrogate in the pervasive compute environment by hand and receiving multiple service descriptions would incur unfeasible overhead, considering that all peers may be both clients and service surrogates, and therefore all peers may need to send requests to all other peers in the network for continuous monitoring.

    In summary, existing discovery protocols do not completely meet the requirements of cyber foraging in a pervasive compute environment, which demand continuous discovery of mobile clients and surrogates (peers) in a single-hop network, together with continuously available service description updates that may be used for scheduling of partitioned application tasks.

    Summary of Cyber Foraging Discovery Design Features

    Table 2.2: Cyber foraging system tools and their adherence to the fundamental service discovery design features required of a cyber foraging system in a pervasive compute environment, as composed by Zhu [Zhu05a]


    Service and Attribute Naming

    The template-based approach defines a format for surrogate names and attributes, such as how the name must be composed. In addition to a template-based name for the surrogate, common parameters of a surrogate may also be predefined. The advantage of templates and predefined parameter sets is a decrease in ambiguity in communication between entities, both within the system and externally. How strictly the format and composition of a surrogate name and its attributes are specified is dictated by how much use is made of standardised components.

    Initial Communication Method

    While unicast is the most efficient initial communication method, because it only targets specific entities, its drawback is the need for prior knowledge of the entity address. However, the amount of prior knowledge required can be reduced by initially sending multicast UDP messages, from which entities can determine unicast addresses, and then switching from multicast to unicast. Less prior knowledge is then required, and the unicast addresses may be stored for future reference. Alternatively, broadcast may be used in single-hop networks by binding the discovery protocol to the underlying network protocol interface; the disadvantage of this method is that participation is limited to entities with that underlying network communication protocol interface.

    Discovery and Registration

    In the announcement-based approach, clients, surrogates and registration directories listen on a communication channel; when a service announcement is made, a client might learn that a service is available and a directory might register the service's availability. In the query-based approach an entity may receive an immediate response to a specific query; each query is replied to separately.

    Service Discovery Infrastructure

    In the directory-based infrastructure model, dedicated infrastructure components are used: the directory component maintains service information and processes queries and announcements. A directory-based infrastructure can have a flat structure, in which all directories have a peer-to-peer relationship, or a hierarchical structure, in which a directory only communicates with directories that match certain criteria. The non-directory-based infrastructure model has no dedicated infrastructure components; here all services process a query, and reply if the service matches the query. A client may record information from a service announcement for future use.

    Service Information State

    The status of a service can be maintained as soft or hard state. Using soft state, the lifespan of a service is governed by the lease expiration time contained in the service announcement message. Prior to lease expiration, a client or directory may poll the service for validity, or the service may reannounce itself or renew the current lease; otherwise the service expires and is removed from the directory entries of systems using the directory-based infrastructure model. Hard state is used in systems using the non-directory-based infrastructure model; this requires periodic polling of services by the client and directories to update service information.
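    The soft-state mechanism can be sketched with an injected clock: entries that are not renewed before their lease expires are silently dropped from the directory. This is a simplified sketch with assumed names, not any particular protocol's implementation.

```python
class SoftStateDirectory:
    """Directory whose entries expire unless the service renews its lease."""
    def __init__(self, clock):
        self.clock = clock           # injected time source, e.g. time.monotonic
        self._leases = {}            # service name -> expiry time

    def announce(self, name, lease_seconds):
        # An announcement (or re-announcement) sets a fresh expiry time.
        self._leases[name] = self.clock() + lease_seconds

    def active_services(self):
        # Expired services are dropped; no explicit deregistration needed.
        now = self.clock()
        self._leases = {n: t for n, t in self._leases.items() if t > now}
        return sorted(self._leases)

# Simulated timeline using a fake clock.
now = [0.0]
d = SoftStateDirectory(lambda: now[0])
d.announce("transcode", lease_seconds=30)
now[0] = 20.0
d.announce("transcode", lease_seconds=30)   # renewal before expiry
now[0] = 45.0
alive = d.active_services()                 # renewed lease runs until t=50
now[0] = 60.0
gone = d.active_services()                  # lease lapsed, entry removed
```

    The attraction of soft state in a pervasive environment is visible here: a surrogate that disappears without deregistering simply ages out of the directory.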

    Discovery Scope

    Defining discovery scope reduces unnecessary computation on clients, surrogates and directories. Using scope criteria based on network topologies, user roles and context information, or combinations thereof, aids correct definition of the target scope. When including the network topology in the discovery scope criteria, LAN or single-hop wireless network range may be used; here an implicit assumption is that clients, surrogates and directories belong to the same administrative domain. Opting for user-role criteria allows the user to control the target domain; however, this entails the user having prior knowledge of the domain and domain authentication information. Using high-level context information such as temporal, spatial and user-activity criteria is still uncommon; however, including it in the scope criteria lends added granularity to the discovery scope.

    Service Selection

    While discovery scope criteria may limit the number of service matches, a discovery result may still contain several matched services. In such cases the choice of service can be either manual or automatic. Manual selection gives the user complete control of service selection; however, this option assumes the user has the knowledge required to make the optimal choice. Alternatively, automatic selection requires little or no user input. An example of automated service selection is MatchMaking with ClassAds using Condor.
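    Automatic selection can be sketched as constraint matching over advertised attributes, loosely in the spirit of ClassAds matchmaking. This is a toy sketch with invented attribute names, not Condor's actual ClassAds semantics.

```python
def matchmake(requirements, ads):
    """Return the ads satisfying every requirement, best (highest rank) first."""
    def satisfies(ad):
        return all(pred(ad.get(key)) for key, pred in requirements.items())
    candidates = [ad for ad in ads if satisfies(ad)]
    return sorted(candidates, key=lambda ad: ad.get("rank", 0), reverse=True)

# Hypothetical surrogate advertisements and a client's requirements.
ads = [
    {"name": "s1", "memory_mb": 512,  "cores": 2, "rank": 1},
    {"name": "s2", "memory_mb": 2048, "cores": 4, "rank": 3},
    {"name": "s3", "memory_mb": 4096, "cores": 1, "rank": 2},
]
requirements = {"memory_mb": lambda v: v >= 1024, "cores": lambda v: v >= 2}
best = matchmake(requirements, ads)[0]["name"]
```

    The requirements act as hard constraints and the rank as a soft preference, which mirrors the constraint-plus-rank split that makes fully automatic selection workable without user input.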


    Service Invocation

    Once a service has been selected, a client invokes the service, depending on the discovery system used. Service invocation has three facets of information: service location, an underlying communication mechanism, and operations specific to an application. The first level of service invocation is the network address, which provides service location only; here the application is then responsible for defining communication and operations. The second level, in addition to service location, defines the underlying communication mechanism, normally using Remote Procedure Calls (RPCs) and variations thereof. The third level adds, to service location and communication mechanism, a definition of the application operations that are specific to an application domain.

    Service Usage

    Once service usage is granted to a client, the client may explicitly release the resources of a discovered and selected service. An alternative is the lease-based method, in which the client and service negotiate a usage period as a renewable lease time; when the lease expires, the resources are reclaimed and the lease information is deleted. For pervasive compute environments with dynamic service resource availability, the lease-based method is more suitable than explicit release by the client, because a failed client could otherwise block the service's resources.

    Service Status Inquiry

    A client can monitor a service's state either by periodically polling the service or by using service event notification. Service event notification requires the client to register with the service, which then notifies the client if an event of interest occurs. Depending on the system, whichever method requires the least frequent communication should be chosen.

    Discovery Discussion

    Spectra [Flinn02a], Chroma - Tactics [Balan03a] and Vivendi [Balan07a] are examples of RPC-based approaches, in which client applications are partitioned into locally executable code and remotely executable services. The services are then pre-installed on surrogates and may be invoked using RPCs. Data exchange between client and surrogate(s) is via the system-specific Coda file system, which requires the aforementioned pre-installed remote services. This limits these systems to pre-configured pervasive compute environments with specific application support. An advantage of surrogate discovery in a pre-installed environment is low overhead when a surrogate is discovered, as there is no need to determine whether the service is installed.

    The VM approaches of Slingshot [Su05a], AIDE [Messer02a] and Goyal [Goyal04a] add flexibility: the user has control over the environment and may install anything on a surrogate without interfering with other systems. The replication approach taken by Slingshot [Su05a] entails a proxy running on the client application device that broadcasts each service request to all replicas; the first received response is then passed to the application. T