TRANSCRIPT
7/28/2019 Introduction to Cyber Foraging, Tools and Techniques
Introduction to Cyber Foraging, Tools and Techniques
R. Gill
June 30, 2010
Contents

1 Introduction
  1.1 Example Design Brief
    1.1.1 Use Case
    1.1.2 Design Considerations
2 Cyber Foraging
  2.1 Introduction
  2.2 Cyber Foraging Tools
    2.2.1 Spectra
    2.2.2 Chroma Tactics
    2.2.3 AIDE
    2.2.4 Network-Integrated Multimedia Middleware (NMM)
    2.2.5 Goyal
    2.2.6 Slingshot
    2.2.7 Vivendi
    2.2.8 EyeDentify
    2.2.9 DiET
    2.2.10 Instant-X
    2.2.11 Scavenger
  2.3 Summary
  2.4 Discussion
    2.4.1 Surrogate Discovery
3 Appendix
List of Figures

1.1 Cyber foraging architecture showing mobile client in a pervasive compute environment with bi-directional communication to surrogates and remote multimedia content server
2.1 Classification of cyber foraging systems as thin and thick, ATA and AAA
3.1 The Spectra API, from [Flinn02a]
3.2 Simple architecture of Spectra, from [Flinn02a]
3.3 Example of a generated tactic file, from [Balan03a]
3.4 A description of Chroma tactic components, from [Balan03a]
3.5 Overall architecture of AIDE, from [Messer02a]
3.6 A simple NMM flow graph, from [Lohse05a]
3.7 NMM buffering of nodes, from [Lohse05a]
3.8 NMM registry hierarchy, from [Lohse05a]
3.9 NMM code snippet, adapted from [Lohse05a]
3.10 Goyal architecture, from [Goyal04a]
3.11 Instantiating a new application replica in Slingshot, adapted from Su [Su05a]
3.12 An example of a Vivendi tactic file, from [Balan07a]
3.13 An example of a Vivendi file with tactic definition and remote invocation parameters, from [Balan07a]
3.14 The Ibis framework employed in EyeDentify, from [Kemp09b]
3.15 DiET mobile code reduction flowpath, from [Kim09a]
3.16 DiET API showing main working components, from [Kim09a]
3.17 Scavenger client-surrogate interface showing surrogate daemon components, frontend and code execution environment, from [Kristensen09a]
Glossary of Terms

JVM Java Virtual Machine.
Proxy Object An intermediary between the client and an accessible object. The purpose of the proxy object is to monitor the life span of the accessible object and to forward calls to the accessible object only if it has not been destroyed.
RTP Real-time Transport Protocol.
RTPS The Real-Time Publish-Subscribe (RTPS) wire protocol provides two main communication models: the publish-subscribe protocol, which transfers data from publishers to subscribers; and the Composite State Transfer (CST) protocol, which transfers state.
RPC Remote Procedure Call: an inter-process communication that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network), without the programmer explicitly coding the details of the remote interaction.
Overhead Any combination of excess or indirect computation time, memory, bandwidth, or other resources required to attain a particular goal.
Broadcast Transferring a message to all recipients simultaneously.
Unicast Transmitting data to a single destination.
Multicast Delivery of a message or information to a group of destination computers simultaneously in a single transmission; typically uses the User Datagram Protocol (UDP).
SIP Session Initiation Protocol: an IETF-defined signalling protocol, widely used for controlling multimedia communication sessions such as voice and video calls over Internet Protocol (IP).
COTS Commercial Off-The-Shelf software.
1 Introduction

The exponential uptake of mobile computing devices, such as smartphones and PDAs, is problematic with regard to streaming multimedia-rich content to a mobile device. The advantage and attraction of a light, portable mobile computing device is mobility. However, portable power supply technology has not kept pace with the power requirements of the increased CPU and processing capabilities of mobile devices such as smartphones. Consequently, a shortfall exists between available and required power for processing and playing streamed multimedia content. Portability also constrains physical size, so such devices are compute-resource limited and unable to execute applications requiring intensive processing, such as digitising and rendering multimedia-rich content. The CPU, memory and energy overheads of such multimedia applications outstrip the capabilities of thin mobile clients and handheld devices. Although smartphone CPU and memory capacities have increased recently, the drain on battery energy still limits the use of compute-intensive application processes. One method to reduce the power requirement is to offload power-consuming processes to surrogates, an approach called cyber foraging [Satyanaranyanan01a]. Access to compute surrogates is predicted to become ubiquitous in future pervasive compute environments. Implementing cyber foraging requires particular essential capabilities: surrogate (service) discovery, establishing trust, partitioning the application tasks to give to the surrogate, and scheduling surrogate tasks.
In this work, a number of current cyber foraging tools and techniques available to the mobile application developer are described. A typical mobile application design brief is included to provide a context in which cyber foraging might be used.
1.1 Example Design Brief

The example application is initially required to automatically search for and discover available surrogates. When required, the application compares the compute resources of the local entity (the mobile client) and the remote entity (a surrogate) against the minimum compute resources required to complete its multimedia processing tasks. If the local compute resource is less than the minimum required, the application offloads the processing task to the surrogate. The surrogate performs the task and streams the processed multimedia content back to the application. The Use Case scenarios describe human interaction and operation of the application in a pervasive compute environment.
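The offload decision described in the brief can be sketched as a simple comparison. All function names, resource units and values below are hypothetical and purely illustrative; they are not taken from any real framework.

```python
# Hypothetical sketch of the brief's offload decision. Names, units
# (CPU capacity in MIPS) and values are invented for illustration.

def should_offload(local_capacity: float, required_capacity: float) -> bool:
    """Offload when the client cannot meet the task's minimum requirement."""
    return local_capacity < required_capacity

def place_task(task: dict, local: dict, surrogate: dict):
    """Run the task locally if resources suffice, otherwise on the surrogate."""
    if should_offload(local["cpu_mips"], task["min_cpu_mips"]):
        return surrogate, "remote"
    return local, "local"

task = {"name": "transcode", "min_cpu_mips": 2000}   # multimedia processing task
local = {"cpu_mips": 600}                            # constrained mobile client
surrogate = {"cpu_mips": 8000}                       # discovered surrogate
_, placement = place_task(task, local, surrogate)
print(placement)  # -> remote
```

In the brief, the surrogate then performs the task and streams the processed content back to the client application.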
1.1.1 Use Case

This use case is intended to show how the proposed system may be used in practice, and to illustrate what is required of the system to fulfil its goals.

Hubert is in an airport departure lounge waiting for his plane to France. The airport departure lounge is a designated pervasive compute environment. Hubert is an avid football fan and takes out his mobile phone to watch the highlights of the latest match played by his favourite football team. Hubert has recently registered
with a service provider that provides high definition video footage of replays of recent football matches. His
phone has already discovered that it is in a pervasive compute environment. Hubert starts his phone application
to watch the goals scored by his favourite football team in their most recent match. Unknown to Hubert his
mobile phone only has the computing power to process high definition video at the lowest quality and with
intermittent stops and jerks during playback. His phone application detects the shortfall in compute power and
redirects the video from the service provider to surrogate(s) in the departure lounge, which carry out the video processing and stream the video to Hubert's phone, where he watches the footage in high quality.
When Hubert arrives in France, he tries out his new phone language translation application. Again, his phone does not have the computing power to run the application. Hubert looks for a cyber cafe or a public building, such as a library or museum, that normally has a pervasive compute environment. He spots a cyber cafe sign and walks towards it; as he approaches the cafe entrance his phone discovers the surrogates and displays a message that the language translation application is ready to be executed. Outside the cafe, Hubert asks some locals for directions to his hotel using the language translation application on his phone. During this, his phone offloads the translation application processing tasks to the surrogates in the cafe, which process the tasks and return the processed data back to the phone application. Eventually, one of the locals recognises the name of Hubert's hotel and, with the help of the translation application, Hubert is able to understand the route he must take to get to his hotel.
Before setting off to his hotel, Hubert takes a picture of the local who gave him directions, but his mobile phone cannot execute the language translation application and the phone's high-megapixel camera application at the same time. Instead of forcing Hubert to close the language translation application, the phone offloads the camera application processing to the surrogates in the cyber cafe.
1.1.2 Design Considerations
While we use the term mobile client, in this work the scope of mobility is limited to a single pervasive compute environment. The use case describes a cyber foraging architecture similar to networks A and B shown in Figure 1.1. In both A and B, communication channel 1 connects client and surrogates within a pervasive compute environment. Communication channel 1 is required for access between client and surrogate: for surrogate discovery, offloading/distributing application tasks to surrogates, distribution of client and surrogate details, and sending processed task data back to the mobile client. Communication channel 2 is required for access between client and remote multimedia content provider, for remote application execution/invocation and transfer of network data and network entity details. Communication channel 3 enables access between the remote server and the pervasive compute environment, for transfer of content for processing and network details.
Previous approaches to supporting client applications using cyber foraging have taken either a thin client approach or a thick client approach. In network A, previous solutions to run remote applications have used thin client solutions such as VNC [Richardson98a] and SSH [SSH11a], or web based services such as GoToMyPC [Citrix11a]. Thin client solutions have the advantage of not requiring modification to the application, and of ease of use. However, thin client solutions require low network connection latency to be effective, and are consequently reliant on bandwidth. To decrease reliance on latency and bandwidth, thicker client solutions have been used.

A thick client approach is one in which part of the application executes on the mobile client, so that when bandwidth is inadequate a lower quality, degraded version of the application may still run on the client. Mobile clients that run a complete application are the extreme of the thick client approach; these make any attempt at cyber foraging completely redundant. The use case describes an example of both thin and thick client approaches with cyber foraging support: networks A and B are essentially thin and thick client implementations, respectively.
The thick client approach may be classified into two methods. In method one, called application-transparent adaptation (in this work, ATA), existing application APIs on surrogates and remote servers are used to control
Figure 1.1: Cyber foraging architecture showing mobile client in a pervasive compute environment with bi-
directional communication to surrogates and remote multimedia content server
and communicate with the client application. ATA is limited by the application APIs' ability to adapt to external commands. The second method, called application-aware adaptation (in this work, AAA), requires explicit modification of the application to work at runtime. AAA enables greater scope for application adaptability at runtime, and is not limited by prior application adaptation constraints. However, the application-aware method does require manual intervention and access to the application source.
A hybrid of ATA and AAA is an approach in which some code from the client application is transferred to the surrogate during cyber foraging; the application code transferred in this way is called mobile code. If choosing a hybrid approach, the application must balance the proportion of mobile code to transfer to a surrogate, with respect to the whole application code, against the time required to execute mobile code on a surrogate before cyber foraging takes place. In effect, mobile code prepares the surrogate by rewriting the existing surrogate application; this is different from transferring identification and authentication information.
Whichever approach or method is used, the design of a cyber foraging system to support a mobile client application in a pervasive compute environment should consider, as a minimum, the following four design features.

1. Discovery of surrogates - without awareness of the presence of available surrogates, no cyber foraging can take place.

2. Trust establishment between communicating entities - the transient nature of cyber foraging means that initially both client and surrogate are unknown, un-trusted entities that must satisfy some form of trust criteria before any interaction can take place.

3. Partitioning application tasks to offload to surrogates - deciding which application tasks execute locally and which are offloaded for remote execution.

4. Scheduling tasks for offloading to surrogates, and retrieving completed task data back to the client application.
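As a rough illustration, the four design features above can be sketched as a client-side interface. Every class and method name here is hypothetical and the bodies are stubs; none of this is taken from any of the surveyed systems.

```python
# Illustrative skeleton of the four design features; all names are invented
# and the method bodies are stubs standing in for real implementations.

class CyberForagingClient:
    def discover_surrogates(self):
        """1. Discovery: find surrogates advertising a compute service."""
        return ["surrogate-a", "surrogate-b"]           # stubbed result

    def establish_trust(self, surrogate) -> bool:
        """2. Trust: both parties must satisfy some trust criteria."""
        return surrogate in self.discover_surrogates()  # stubbed check

    def partition(self, tasks):
        """3. Partitioning: split tasks into local and offloadable sets."""
        local = [t for t in tasks if not t["offloadable"]]
        remote = [t for t in tasks if t["offloadable"]]
        return local, remote

    def schedule(self, remote_tasks, surrogates):
        """4. Scheduling: assign each offloadable task to a surrogate."""
        return {t["name"]: surrogates[i % len(surrogates)]
                for i, t in enumerate(remote_tasks)}
```

A real system would replace each stub with service discovery, trust negotiation, a partitioning policy and a scheduler, as the surveyed tools in Chapter 2 do in their different ways.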
Human interaction should also be considered: ubiquitous computing ideally requires no conscious human interaction. However, the option of a manual mode that prompts the user to interact prior to offloading mobile code to surrogates has psychological usability benefits regarding privacy. Finally, mobility: although the scope of mobility in this work is confined to a single pervasive compute environment, the user would still be expected to move location within the environment.
2 Cyber Foraging

2.1 Introduction

A decade ago, Satyanarayanan [Satyanaranyanan01a] accurately described and predicted pervasive computing as the next step forward in computing evolution: a hybrid of distributed and mobile computing. The prediction was motivated by the continual advances made in distributed and mobile computing during the preceding decade towards the realisation of Weiser's [Wieser91a] vision of ubiquitous computing. Integrating distributed and mobile computing technology continues to present unique challenges. Cyber foraging is the term used by Satyanarayanan to describe a potential solution for one of the unique challenges pervasive computing must address in order to integrate distributed and mobile computing seamlessly within a pervasive computing environment. The specific challenge cyber foraging seeks to address is to dynamically divide and distribute the application processing tasks of a resource-constrained mobile client to remote surrogate resources, which perform the processing tasks on behalf of the mobile client and return the processed data back to the mobile client application. Each potential remote surrogate hosts some kind of service that the cyber foraging system searches for; consequently, the activity of searching for remote surrogates is called service discovery. Implementing cyber foraging (also described as offloading [Gu04a]) for a mobile client within a pervasive environment requires four fundamental stages of operation: stage one is service discovery of remote surrogate entities; stage two is establishing trust between mobile client and surrogate entities; stage three is partitioning the application processing tasks between local and remote execution [Balan07a]; and stage four is scheduling the partitioned processing tasks to the correct surrogate resource entity [Kristensen10a]. In combination, these four stages define the fundamental features required of a cyber foraging system within a pervasive computing environment.
Intuitively, the speed with which a processing task can be offloaded and processed by a surrogate is paramount. If the cyber foraging process cannot maintain transparency to the user, in the form of uninterrupted application usage, then the cyber foraging system has failed. Factors that affect the offloading and processing of partitioned tasks are called overheads, and play a very important part in cyber foraging research and development. Overheads include those factors that affect overall response time and energy consumption, such as CPU, memory and battery consumption.
This literature review identifies the current range of cyber foraging systems and the extent to which these systems adhere to the four fundamental features that define cyber foraging within a pervasive environment. Firstly, each cyber foraging system is briefly described, and the salient points of each system are then discussed in relation to each of the four design features described earlier (Figure 2.1 presents a tabulated snapshot of overall adherence). Secondly, alternative methods to achieve the functionality of the four design features are discussed.
2.2 Cyber Foraging Tools

Introduction

Essentially, a cyber foraging system is a fluid network that has to juggle what processing code to distribute, where to distribute it, and how to distribute it, in order for an application to run transparently on a resource-constrained entity with minimum QoS. All the systems introduced in this work have, as a minimum, some form of task scheduler, which can range from deciding where a single task should be performed to distributing multiple tasks over multiple surrogates. Tasks may be modelled from data captured by system/network resource monitors, with prediction algorithms performed on the modelled tasks to allocate surrogate resources to the tasks. The range and scope of information included in the monitoring and modelling vary from system to system. The overall system architecture affects interaction and communication between system activities such as monitoring, scheduling, task/code partitioning and data transfer. Finally, the actual communication methods vary between RPC, RTP, or messages, depending on the system and its objectives. All the systems described here are designed as aids to developing applications with cyber foraging and task offloading functionality; they are therefore tools, not applications. The cyber foraging tools vary in maturity and in access to descriptive documentation and technical information; this is reflected in the varying section lengths.
2.2.1 Spectra

Spectra is arguably the forerunner of today's cyber foraging systems. The motivation behind Spectra was to address the uneven conditioning [Satyanaranyanan01a] that occurs in pervasive compute environments.

Spectra is based on predicting an application's future resource requirements by monitoring and then modelling current application resource usage. Monitoring is performed by six resource monitors, each of which monitors a single resource or a related set of resources, namely CPU, network, battery, file cache state, remote CPU and remote cache state on surrogates. These monitors sit within a modular framework shared by Spectra's client and server. Spectra consists of a client running on the application entity and a server running on surrogate entities. Spectra uses services provided by the Coda file system for replication and consistent remote file execution, and Odyssey for defining the fidelity of tasks for distribution. The Coda file system was used because it was felt that network latency and bandwidth were not sufficient for another distributed file system to sustain consistency. Spectra uses energy consumption during operation execution run times as the primary metric for scheduling and partitioning of tasks.

The information provided by the resource monitors gives a snapshot of the surrogate state, the availability of surrogate resources, and the current state of the existing running system. From this, Spectra predicts balance or imbalance between the current running system and future processing tasks; based on this prediction, Spectra indicates the location of available surrogates to the application. The decision to actually use a surrogate for the next task is made by the application.
Predictions of the expected time for processing task execution are made by firstly calculating the data transfer time, dividing the volume of data for transmission by the available bandwidth; and secondly calculating the task processing time from previous heuristic data, such as system logs, which are modelled to predict future usage. These predictions are made by predictors that use the heuristic data to generate the prediction models and update the current log data.
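The two-part prediction described above (transfer time from data size and bandwidth, processing time from logged past executions) can be sketched as follows. This mirrors only the description in the text, with a simple mean as a stand-in predictor; Spectra's real predictors are considerably richer.

```python
# Sketch of the two-part execution time prediction described in the text.
# The mean-of-history predictor is a deliberately simple stand-in.

def predict_transfer_time(data_bytes: float, bandwidth_bps: float) -> float:
    """Transfer time: data volume divided by available bandwidth."""
    return (data_bytes * 8) / bandwidth_bps

def predict_compute_time(past_runs_s: list[float]) -> float:
    """Processing time modelled from logged past execution times."""
    return sum(past_runs_s) / len(past_runs_s)

def predict_remote_time(data_bytes, bandwidth_bps, past_runs_s):
    """Total predicted time = transfer time + predicted processing time."""
    return (predict_transfer_time(data_bytes, bandwidth_bps)
            + predict_compute_time(past_runs_s))

# 1 MB of data over an 8 Mbit/s link, with three logged runs of ~1 s each.
t = predict_remote_time(1_000_000, 8_000_000, [1.0, 1.2, 0.8])
```

Here `t` comes out at 2.0 seconds: 1 second of transfer plus a 1 second mean of the logged runs.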
Scheduling is also based on modelled heuristic data about the surrogates available in the environment, any planned future execution of processes, and the fidelity of execution (fidelity is provided by Odyssey). The volume of data retrieved from the environment affects the accuracy of the heuristic prediction models; this means that in a small environment with few surrogates, scheduling predictions cannot be guaranteed to be optimal. Spectra uses a utility function to evaluate process executions. The function takes into account time for execution, fidelity and energy consumption, and predicts execution time by summing CPU time, data transfer time, time to manage the cache, and time required to ensure consistency of data. Whilst cache management and data consistency times can remain stable, execution time varies between applications. To overcome application-specific execution time variability, each application must provide Spectra's utility function with a similar function indicating the application's requirement or desire for remote execution, e.g. the function 1/T, where T is the predicted execution time; a low value would indicate low desire for local task execution and a potential requirement for remote execution. In this example, Spectra provides information about available compatible surrogates to the application; the decision to perform remote execution is made by the application. Continually evaluating this desire for remote execution and passing it to Spectra's utility function, which returns the surrogate information, carries a non-negligible overhead; therefore only operations with execution times of approximately one second's duration are passed to Spectra's utility function.
The Spectra API is presented in Appendix - Spectra. A register fidelity call identifies and registers start operations. The application then defines a set of possible execution scenarios; these scenarios provide different ways to partition execution between remote surrogate and local entity machines. The set of scenarios also specifies levels of fidelity and input parameters, all of which add to operational complexity. The begin fidelity op call determines how an operation is to be executed and where it is to be executed. In short, Odyssey chooses the fidelity level and Spectra chooses the execution scenario (plan), before any actual execution takes place.

The do local op and do remote op calls mark the start of execution of operations; these calls make RPCs to Spectra's surrogates and local server. As well as starting execution of operations, these calls also serve to continually monitor resource usage at surrogates. The end fidelity op call is made by the application to signal the end of operation execution.

In the background to the execution of operations, a snapshot of resources is generated by the predict avail call, which iterates through Spectra's resource monitors and returns predicted resource availability; this information is used by the register fidelity operation that generates the potential scenarios. Spectra's monitors are started and stopped in tandem with the start and stop of operation execution via the do local op and do remote op calls. Finally, add usage logs the observations made, which may be used as future heuristic data.
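The call ordering described above can be sketched as follows. The call names follow the description in the text, but the Python stubs and the log they build are purely illustrative; this is not Spectra's real API, only a walk-through of the sequence.

```python
# Stub walk-through of the Spectra call sequence described in the text.
# Bodies are placeholders that record the ordering; nothing real is invoked.

log = []

def register_fidelity(op):  log.append(f"register {op}")  # declare op + scenarios
def begin_fidelity_op(op):  log.append(f"plan {op}")      # choose fidelity + plan
def do_remote_op(op):       log.append(f"exec {op}")      # RPC to surrogate, monitored
def end_fidelity_op(op):    log.append(f"end {op}")       # signal completion
def add_usage(op):          log.append(f"log {op}")       # record future heuristic data

register_fidelity("render")   # application registers the operation
begin_fidelity_op("render")   # Odyssey picks fidelity, Spectra picks the plan
do_remote_op("render")        # execution starts; monitors run in tandem
end_fidelity_op("render")     # application signals end of execution
add_usage("render")           # observations logged for later predictions
```

The real API (see the Appendix figure from [Flinn02a]) also offers do local op for local execution and predict avail running in the background.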
2.2.2 Chroma Tactics

Chroma is based on Spectra [Flinn02a], but differs from it in two major ways. Firstly, decisions to execute remotely are not made by the application but by Chroma; the reason for this was to introduce flexibility, so that application developers are not required to specifically interface with Spectra. Secondly, executing operations in parallel is possible with the introduction of remote execution tactics, or tactics, developed by Balan [Balan03a] (who also developed Chroma).
Tactics enable operations to be subdivided in different ways and executed sequentially or in parallel using RPCs. Once the different tactics to execute an operation are defined, Chroma decides which tactic to use, and whether execution shall be local or on a remote surrogate resource. Tactics are defined in a generated tactics file, an example of which is found in the Appendix. A tactics file is in two parts: part one is a sequence of RPC calls describing the available RPCs and their associated IN input and OUT output parameters; part two consists of single tactic definition descriptors, each of which is made up of a sequence of RPCs. RPCs that make up tactic file descriptors can be executed in parallel (within brackets) or sequentially (separated by an & symbol). Each tactic definition is created by the application developer, ideally at the application development stage.
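Based on the two-part structure described above, a tactics file might look roughly like the following sketch. The exact syntax and all names are hypothetical; see the Appendix figure from [Balan03a] for a real example.

```
# Part 1: RPC declarations with IN/OUT parameters (names illustrative)
rpc preprocess (IN rawdata,  OUT features);
rpc recognise  (IN features, OUT text);

# Part 2: tactic definitions built from the RPCs above
tactic simple   = recognise;                    # single RPC
tactic pipeline = preprocess & recognise;       # sequential (& symbol)
tactic race     = (recognise, recognise);       # parallel (within brackets)
```

Chroma's solver would then pick one of these tactics, and a placement for each of its RPCs, according to the available resources.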
The tactic files for each task are collected by a solver, which schedules each task by selecting an appropriate tactic plan for the particular resource availability scenario. A tactic plan specifies which tactic to use and where to execute the related RPCs, taken from the tactic file. In order for Chroma's solver to select an optimal tactic plan, it requires resource usage information, which is supplied by multiple resource predictors and heuristic information similar to Spectra's heuristic prediction models and resource utility function. However, unlike Spectra, Chroma's solver selects an optimal tactic plan based on resource priority and enforces it in a brute force manner. The justification for this is the claim that there are only a small number of ways to subdivide an application task for remote execution on a surrogate. An example of the Chroma architecture showing the workflow using tactics may be found in Appendix - Chroma.
Parallel remote execution of tasks in Chroma is achieved when the pervasive environment is over-resourced for the current application requirements. In such a scenario, the same task RPCs from a tactic plan are executed on different surrogates. Chroma employs three different optimisation techniques, called fastest result, data decomposition, and best fidelity. The fastest result technique performs a tactic at a certain fidelity on multiple surrogates and uses whichever result is returned first; any subsequently returned results are discarded. The fastest result technique increases performance but increases the overall load on surrogates, especially in an environment where multiple application devices are operating simultaneously. The data decomposition technique requires the programmer to explicitly define a function to subdivide input data so that it can be sent to multiple surrogates; the programmer-defined function must include a method for merging the returned data. The best fidelity technique is implemented in the tactic file. Here, Chroma sends different tactics to different surrogates and waits a certain time interval; the returned result with the best fidelity within the time interval is chosen, and all others are discarded.
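The fastest result technique can be sketched with a thread pool that submits the same task to several simulated surrogates and keeps whichever reply arrives first. The surrogate names and delays are invented for illustration; a real system would issue RPCs rather than sleep.

```python
# Sketch of Chroma's "fastest result" optimisation: the same tactic runs on
# several surrogates and the first result wins; later results are discarded.
import concurrent.futures
import time

def run_on_surrogate(name: str, delay_s: float) -> str:
    time.sleep(delay_s)          # stand-in for network + processing time
    return name

surrogates = {"fast": 0.01, "slow": 0.2}   # hypothetical surrogates
with concurrent.futures.ThreadPoolExecutor() as pool:
    futures = [pool.submit(run_on_surrogate, n, d)
               for n, d in surrogates.items()]
    # First completed future wins; the remaining results are simply ignored.
    winner = next(concurrent.futures.as_completed(futures)).result()

print(winner)  # -> fast
```

As the text notes, the price of this redundancy is extra load on the surrogates, since every replica does the full work.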
2.2.3 AIDE

AIDE, developed by Messer [Messer02a], takes the approach of partitioning a service running on a client and offloading it to surrogates. AIDE used a modular distributed platform employing the Java Virtual Machine (JVM); three modules address monitoring the application execution, partitioning tasks, and offloading of components. Dynamically partitioning Java programs and offloading code sections to surrogates is based on memory and processing constraints. AIDE used a graph of the application execution history, and subdivided the graph to represent code sections to offload to surrogates. The granularity is defined by Java's class component in relation to Java's code architecture of objects, classes, and higher-level components such as JavaBeans.

Distributed execution was achieved by modifying the JVM to have hooks instead of unique object references. These hooks took the form of modifications to the JVM that flag object references to remote objects, and then intercept accesses to remote objects. With these modifications to the JVM, the AIDE modules were able to convert remote accesses into RPCs between the JVMs on the client application and the surrogates. Any JVM on either client or surrogate that receives a request used a pool of threads from which to perform RPCs on behalf of other JVMs. Here, threads are not migrated; rather, invocations and data accesses allow placement of objects.
Partitioning of Java code was done using heuristic execution data in the form of an execution graph. Based on the graph MINCUT heuristic, all graph nodes representing classes that cannot be offloaded are partitioned first and stored on the client entity. Following this, each remaining graph node was evaluated using the MINCUT heuristic. The AIDE MINCUT heuristic produced a group of minimum-cut partitions, which were individually evaluated to determine which one satisfies the partitioning policy. The partitioning policy was based on a cost function that returns the historical data transferred between partitions. Partitions were then selected that could be offloaded without detriment to overall network operation or to the use of resources such as memory.
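A toy sketch of this style of partitioning, assuming a tiny invented execution graph: classes are nodes, edge weights record historical data transfer, "UI" is pinned to the client, and "Filter" is assumed too memory-heavy to keep locally. Real AIDE operates over JVM-level execution graphs; this only illustrates the minimum-cut selection:

```python
# Classes are graph nodes; edge weights are bytes historically transferred
# between them. We pick the offload set that minimises data crossing the
# client/surrogate boundary, subject to pinning constraints.
from itertools import combinations

edges = {("UI", "Logic"): 120, ("Logic", "Codec"): 5,
         ("Codec", "Filter"): 300, ("UI", "Filter"): 2}
must_offload = {"Filter"}      # assumed too memory-heavy for the client
rest = {"Logic", "Codec"}      # movable classes; "UI" is pinned to the client

def cut_cost(offloaded: frozenset) -> int:
    """Historical data transferred across the client/surrogate boundary."""
    return sum(w for (a, b), w in edges.items()
               if (a in offloaded) != (b in offloaded))

candidates = [frozenset(set(c) | must_offload)
              for r in range(len(rest) + 1)
              for c in combinations(sorted(rest), r)]
best = min(candidates, key=cut_cost)
print(sorted(best), cut_cost(best))  # ['Codec', 'Filter'] 7
```

Offloading Codec together with Filter keeps the heavy Codec-Filter edge on one side of the cut, which is exactly the kind of choice the cost function in the text favours.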
Resource monitoring during partitioning and application execution was achieved by augmenting JVM code for method invocations, data field accesses, object creation, and object deletion. The monitored information is obtained at the object level and aggregated to class level, to coincide with the graph description. Memory usage within inter-class interactions was also monitored and returned as values represented by graph edges and graph edge parameters. The graph representation of application execution was a fully weighted execution graph, in which nodes represent classes annotated with the memory usage of objects within the class, interactions between class objects, and data transfer between class objects. Graph adaptation and the adaptive partitioning policy used memory usage, tracked via the free memory space reported by the JVM garbage collector. A diagram in Appendix AIDE shows the overall AIDE architecture and the hardware and VM used.
2.2.4 Networked integrated Multimedia Middleware (NMM)
Although not presented as either a cyber foraging system or a remote execution system, NMM is included in
this work because it demonstrates the features required of a cyber foraging system and remote execution of
tasks.
NMM is a component-oriented middleware framework that integrates and configures components in a network. The primary service goal of NMM is to manipulate and render multimedia content within a network of resources prior to delivery to a mobile device.
NMM is modelled on a logical flow graph made up of nodes representing individual multimedia content processing tasks. The flow graph represents the overall task requested by an application; this overall task is further divided into subtasks represented by the node elements of the flow graph. Typically, a requested application task may be the playback, transcoding, or recording of multimedia data. To complete the request, the data must be read from a source; for multiple media streams the data requires demultiplexing, and the individual streams must then be decoded prior to rendering. The application request is thus represented as a chain of subtask node elements that together accomplish the overall task. Each node within a logical flow graph accepts as input, and produces as output, defined data formats via input/output jacks. The format of the data is defined as a tuple of media type and encoding, e.g. audio/mpeg3. As data is passed from node to node through the logical flow graph, the format tuple changes until the data is ready for rendering or whatever the task of the final sink node may be.
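A minimal sketch of the flow-graph idea, with invented node names and format tuples rather than real NMM node types:

```python
# Each node consumes one (media type, encoding) format tuple and produces
# another; the chain order stands in for the input/output jack bindings.
class Node:
    def __init__(self, name, in_fmt, out_fmt, fn):
        self.name, self.in_fmt, self.out_fmt, self.fn = name, in_fmt, out_fmt, fn

    def process(self, fmt, data):
        assert fmt == self.in_fmt, f"{self.name}: unexpected format {fmt}"
        return self.out_fmt, self.fn(data)

# source -> decoder -> sink; the format tuple changes at each hop
graph = [
    Node("file_source", None, ("audio", "mpeg3"), lambda _: b"\x01\x02"),
    Node("mpeg_decoder", ("audio", "mpeg3"), ("audio", "raw"), lambda d: d * 2),
    Node("audio_sink", ("audio", "raw"), None, lambda d: len(d)),
]

fmt, data = None, None
for node in graph:
    fmt, data = node.process(fmt, data)
print(fmt, data)  # None 4 : the sink consumed 4 raw bytes
```

The source has no input jack and the sink no output jack, mirroring the exceptions described below for those node types.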
Jacks act as connectors between nodes, each node having its own input and output jacks. There are exceptions depending on node type: a source node does not have an input jack because it accepts data directly from the multimedia content repository, and therefore produces the data consumed by the flow graph. Another exception is the sink node, which does not have an output jack because it is used to render data or dump it to a hard drive; a sink node consumes the data produced by the source node. The current NMM includes 60 such node types, each defined as a unique multimedia content processing task. The life cycle of a node runs through initialisation, output initialisation, activation, and finally the started state; once started, processing may begin.
The advantage of the NMM flow graph description is that different nodes can run on different resource hosts; furthermore, the NMM application itself can run on a separate host from any processing resource host. Resource node management, including discovery, reservation, and instantiation, is accomplished using a registry service. Each resource host runs a registry server, which can be accessed by a registry client running on the application host. An application can use the registry client to query local or distributed running registry servers. When an application requests a resource node from the registry server, the server checks node availability; if the node is available, it is instantiated and goes through its life cycle. Global resource host information is gained through the registry information.
NMM uses a client-server and peer-to-peer communication architecture to communicate between nodes in a network, in which, using a proxy object, invocation and execution are separated. Thus, the user of a proxy object does not need to know on which resource host the node is running; only the proxy object requires this information. These communication channels are therefore abstractions for all communication between NMM components, such as the input/output binding between jacks of different nodes running on different resource surrogates.
Service discovery and remote execution are performed by a server registry service. Discovery and registration of nodes is performed statically during initial registry service setup; however, once a server registry is initialised, any added plug-in is dynamically registered with the registry service. Plug-ins dynamically added to an existing network are added to a hierarchy of registries; the interfaces between clients and servers in an existing network are called IRegistry and IServerRegistry, and a description may be found in Appendix NMM.
At this juncture it is important to underline the difference between a graph description, a node description, and plug-ins, in order to make clear what information is communicated by the registry service. The concepts used for describing entities administered within the registry service are the same as those for querying entities from it. A plug-in is specified by a node description, whereas a complete flow graph is stored within a graph description. Each description type contains all relevant attributes, such as the object name; for example, a node description includes node name, node type, format, and sharing attributes. In addition to attributes, a node description also stores a list of events, called configuration events, used when configuring a plug-in instance.
Therefore a list of every possible state, and the configuration attributes for that state, exists as configuration events. A node description can be further subdivided into subsets of the overall node description; this allows querying all node descriptions in the registry and returning only those subsets that fulfil the query criteria. The properties in a graph description include the specification of nodes and their connections, the specification of communication channels, and synchronisation.
Distributed synchronisation in NMM distinguishes intra-stream and inter-stream synchronisation. Intra-stream synchronisation refers to timings between multiple presentations of the same media stream, i.e. a stream of subsequent video frames. Inter-stream synchronisation refers to synchronising media streams with each other, i.e. maintaining lip-sync between audio and video streams. Each NMM message holds a timestamp that contains entries for time and a stream counter. These timestamp entries, together with a global system clock, are the main time references when synchronising media streams passed from one processing task node to another, either locally or remotely on surrogates. Synchronisation between surrogate nodes is governed by a set of synchronisation controllers, or synchronisers, for each task node. These synchronisers realise inter-stream synchronisation by implementing a synchronisation protocol, and provide an interface allowing the application to modify the operation of a corresponding flow graph, e.g. for pausing data processing.
The overall objective of sink synchronisation is to provide either synchronised playback of the media content, or rendering, for distributed audio/video sinks. An example of distributed synchronisation at a sink application node may be found in Appendix NMM. Here the buffer is simply a collection of messages containing timestamp information. Timestamps are handled by locally running controllers delegated by the synchronising sink nodes; these local controllers deal with intra-stream synchronisation. Controllers decide when to present a particular buffer, from a number of buffers from multiple streams, by matching buffer latency against flow graph latency. A buffer requires a certain time interval to reach a node after its timestamp has been set. This timestamp-to-node interval is called the real latency and is expressed as the difference between the arrival time and the time set within the timestamp. Two buffers with corresponding timestamps will take different times to reach the sink node, i.e. real-latency 1 and real-latency 2. If one imagines latency as the time from the node where the timestamp was set (the sync-time) to the sink node where the buffer will be presented (the presentation-time), then latency = presentation-time - sync-time. Alternatively, if a latency is given, the controller calculates a theoretical latency, or theo-latency, such that presentation-time = sync-time + theo-latency. During runtime a controller checks whether real-latency > theo-latency + max-skew, where max-skew is a previously defined tolerance. If the computed real-latency exceeds this value, the buffer is considered too old and may be treated as invalid. However, if real-latency < theo-latency + max-skew, presentation of the buffer is delayed. In summary, intra-stream synchronisation attempts to maintain a constant latency for stream buffers, so that the temporal distance between buffers is the same as that between their corresponding sync-times.
This differs from inter-stream synchronisation, which attempts to keep the latencies of the streams equal. This is achieved by setting the theo-latency of all controllers to the maximum real-latency of all current streams. First, every controller sends the real-latency computed for its first arriving buffer to the synchroniser. The synchroniser then computes the theo-latency as the maximum of all these latencies and sets this value as the theo-latency for all connected controllers.
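The latency bookkeeping described above can be sketched as follows; the skew tolerance and stream latencies are invented values, and the present/delay/drop rule is an interpretation of the text rather than NMM's exact protocol:

```python
# Intra-stream: compare a buffer's real latency against theo-latency
# within a tolerance. Inter-stream: the synchroniser sets every
# controller's theo-latency to the maximum observed real latency.
MAX_SKEW = 0.010  # seconds of tolerated drift, an assumed value

def intra_stream_action(real: float, theo: float) -> str:
    if real > theo + MAX_SKEW:
        return "drop"    # buffer is too old, treated as invalid
    if real + MAX_SKEW < theo:
        return "delay"   # buffer is early, hold it back
    return "present"

# Inter-stream equalisation across two invented streams:
stream_latencies = {"audio": 0.040, "video": 0.065}
theo = max(stream_latencies.values())  # 0.065 becomes every controller's target
actions = {s: intra_stream_action(lat, theo)
           for s, lat in stream_latencies.items()}
print(theo, actions)  # 0.065 {'audio': 'delay', 'video': 'present'}
```

Delaying the faster audio stream until it matches the slowest stream's latency is what equalises the latencies, as the text describes.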
Search and instantiation are performed in two stages. The first stage references all nodes that match a given node description; here a node description refers to a subset of a complete graph description stored in the server registry. The second stage is the instantiation of the first node in the list of nodes returned in response to the stage-one matching; the response consists of a registry identifier and a node identifier. Using these identifiers, a complete flow graph description can be requested using the first responding node. A description of a client registry, and of the registry hierarchy that holds a number of specialised registries for different identifiers and node attributes, may be found in Appendix NMM.
The registries in the registry hierarchy are accessible via the IRegistry interface; Registry1394 administrates FireWire-compatible devices, and LocalRegistry provides information on the non-specialised available plug-ins. Shown
in the ClientRegistry is the scope of the information held and the subsequent operations performed on the flow graph. Finally, a code snippet that a developer might use to access the registry service may be found in Appendix NMM. The first part creates a central application object; here the system server registry is contacted, and if contact fails a local instance is created instead. The second line requests the client registry. The next part, starting with NodeDescription, is an example of requesting a node, specified by node name, from the registry service. Here a graph description is used to request a simple flow graph consisting of three nodes: a source node for reading data from a file, a converter for decoding MPEG audio, and a sink node for outputting uncompressed audio. In the last part of the code snippet, all specified edges of the flow graph are connected and all nodes activated; finally, the flow graph is started.
2.2.5 Goyal
Goyal [Goyal04a] was motivated to develop a cyber foraging system on a widely available platform without
the requirement of a large middleware layer.
Goyal developed a lightweight cyber foraging system that differed from Spectra and Chroma in that it did not require use of a common file system such as Coda/Odyssey. Similarly to AIDE, Goyal employs virtual machine technology; however, in contrast to AIDE, partitioning is done by the application developer rather than by any automated code division method. Multiple virtual surrogates can be created on the same surrogate host. The argument for using virtual machine technology was that independent virtual servers allow greater isolation, flexibility, resource control, and clean-up compared with running on real host surrogate machines. Isolation, in that there is no interference between virtual machines. Flexibility, in that client applications can install arbitrary software on the virtual machine. Resource control, in that the resources of the physical host can be fairly allocated between multiple virtual machines; this also allows the physical host to compute separate applications from the virtual machines without draining virtual machine resources. Clean-up is automated and simple: when a virtual machine instance shuts down, the allocated disk partition on the host surrogate is restored to its original clean state.
Service discovery was managed by a separate service discovery server, which maintains lists of registered surrogates and their individual resource capabilities, represented in an XML syntax description. When a client requires surrogate resources, it queries the service discovery server, requesting particular resources. The service discovery server matches the listed registered surrogates against the client's resource requests. Matching requests to resources is based on previous profiling of the application's resource requirements made by the developer.
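A sketch of this matching step, assuming a made-up XML schema (Goyal's actual resource description format is not reproduced here):

```python
# Surrogates register XML resource descriptions; the discovery server
# returns those whose capabilities cover every requested resource.
import xml.etree.ElementTree as ET

registered = {
    "surrogate-1": "<resources><cpu_mhz>800</cpu_mhz><ram_mb>256</ram_mb></resources>",
    "surrogate-2": "<resources><cpu_mhz>2400</cpu_mhz><ram_mb>1024</ram_mb></resources>",
}

def parse(desc: str) -> dict:
    root = ET.fromstring(desc)
    return {child.tag: int(child.text) for child in root}

def match(request: str) -> list:
    """Return surrogates whose capabilities cover every requested resource."""
    need = parse(request)
    return [name for name, desc in registered.items()
            if all(parse(desc).get(k, 0) >= v for k, v in need.items())]

req = "<resources><cpu_mhz>1000</cpu_mhz><ram_mb>512</ram_mb></resources>"
print(match(req))  # ['surrogate-2']
```

The same notation is then reused by the surrogate manager when allocating resources to a virtual machine, as described below.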
A typical implementation of the Goyal system workflow may be found in Appendix Goyal. Initially, the client sends a request to the server to discover a surrogate from its list of registered surrogates. The service discovery server replies with the IP address and port number of the surrogate manager of a listed registered surrogate. With this information the client contacts the surrogate manager with a service start request. The surrogate manager determines adequate resources by matching the application requirements against the available resources, using the same XML notation as the initial service discovery request.

After authenticating the client, and after allocating the matched resources to a newly started virtual machine, the surrogate manager sends the client a service start response containing the IP address of the new virtual machine. During client application and virtual machine interaction, the client invokes an operation on the surrogate by sending a Sub Task Configuration Request to a virtual server manager on the surrogate. A Sub Task Configuration Request from the client includes a URL of the client program to run; the program at that URL includes all the information the virtual surrogate requires to install and run it.
Authentication between client and virtual surrogate is addressed using a flexible authentication framework that supports multiple authentication mechanisms, specified by the client when first connecting to the surrogate machine during the service start request stage. The mechanisms available for the client to specify are SSL, TLS, and SSH. Once a client-surrogate session is established, any subsequent transfer of data or communication uses the client's public key for authorisation. The client's public key is stored by
the surrogate and service discovery server for future reference, and future client service discovery requests.
2.2.6 Slingshot
The motivation behind Slingshot [Su05a] is to eliminate the bottleneck that can occur when a client application attempts cyber foraging on remote surrogates via a wireless hotspot. Slingshot is a client-surrogate architecture for deploying mobile services at wireless hotspots, based on the concept of continuously replicating application state, instantiated on virtual surrogates, as the client moves between available virtual surrogate resource services. The Slingshot architecture replicates remote application state on surrogate computers co-located with wireless access points. A first replica of each application is executed on a trusted safe server and acts as a backup if subsequent surrogates fail. A second replica of the application state is then co-located on a virtual surrogate in closer proximity than the first, for quicker response times. The client application broadcasts application requests to all replicas on all virtual surrogates, and only responds to the quickest return from any of them. A database of the state of each replicated application maintains checkpoints that serve as the starting point for a new replica instance. Replication in this fashion is used instead of migrating replicated state from surrogate to surrogate because processing cannot continue during migration. In Slingshot, processing continues on the previous replicas while new replicas are being instantiated. Slingshot instantiates a new replica by checkpointing the first replica, migrating its volatile state to a surrogate, and then replaying any operations that occurred after the checkpoint. The workflow of Slingshot may be found in Appendix Slingshot.
The first safe replicated-state surrogate server is called the home server, and it maintains a service database. The service database maintains the current service state of the replicated server on its virtual disk using SHA-1 values assigned to 4 KB chunks of the latest replicated state. Therefore, at any time the home server has the latest updated replicated state.
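The chunk-hash bookkeeping can be sketched as follows; the inference that only changed chunks need shipping follows from the SHA-1-per-4 KB design, and the data is synthetic:

```python
# The replica's virtual disk state is divided into 4 KB chunks, each
# indexed by its SHA-1 digest, so a changed chunk is detectable by a
# changed digest.
import hashlib

CHUNK = 4096

def chunk_hashes(state: bytes) -> list:
    return [hashlib.sha1(state[i:i + CHUNK]).hexdigest()
            for i in range(0, len(state), CHUNK)]

old = bytes(3 * CHUNK)          # three zeroed 4 KB chunks
new = bytearray(old)
new[CHUNK] = 0xFF               # mutate one byte inside chunk 1

changed = [i for i, (a, b) in enumerate(zip(chunk_hashes(old),
                                            chunk_hashes(bytes(new))))
           if a != b]
print(changed)  # [1] : only the second chunk's state differs
```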
2.2.7 Vivendi
Vivendi was developed by Balan [Balan07a], who also developed Chroma tactics [Balan03a] and VERSUDS [Balan02a]. The Vivendi system has two main components: one that deals with creating a remote execution tactics file (tactic), and the Chroma runtime system [Balan03a]. Here we only describe the Vivendi partitioning system and the interactions between Vivendi, tactics, and Chroma. Please refer to the previous sections for further information on Chroma and tactics.
Vivendi is a little language [Bentley86a] for the rapid modification of applications to enable the partitioning of application tasks. The motivation for Vivendi is to reduce application development time by reducing the complexity of modifying an application to support cyber foraging at developer level, allowing both novice and less experienced application developers to develop cyber-foraging-enabled applications. Essentially, Vivendi requires the developer to preplan which application tasks should be considered for remote execution, and to define the critical variables of each task as parameters used to predict the resources required to carry out the task, written in the little language as a tactics file. For each critical task variable parameter, the developer includes a fidelity for carrying out the task with satisfactory quality of result. The developer also specifies RPCs that define the actual computation a surrogate performs on behalf of the client application. Finally, the developer defines combinations of different RPCs that can carry out the application task within the defined fidelity and quality.
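A hypothetical tactics file for an invented face-detection task might look roughly like this; the syntax below is purely illustrative, not actual Vivendi notation:

```
APPLICATION face_detect
DEFINE PARAM  image_size                      # critical variable used for resource prediction
DEFINE FIDELITY detail IN {low, high}         # acceptable quality levels
DEFINE RPC detect_faces (IN image, OUT faces) # computation a surrogate performs
DEFINE RPC refine_faces (IN faces, OUT faces)
TACTIC quick    = detect_faces                # RPC combinations that satisfy the task
TACTIC thorough = detect_faces & refine_faces
```

The point of the sketch is the structure: parameters for prediction, fidelities, RPC declarations, and named combinations of RPCs (tactics) among which Chroma later chooses.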
Chroma complements the Vivendi tactics file by selecting the appropriate tactic and binding RPCs to surrogates. Tactic selection is based on Chroma's resource management, prediction, and fidelity selection functions, as depicted in Appendix Vivendi. The Chroma solver module responds to stubs generated from the Vivendi RPCs defined in the tactics file, and predicts the current optimum tactic to use. The solver's prediction process draws on monitored surrogate resource data, computed using Chroma's utility function, and heuristics from similar tactics.
Vivendi generates two types of stubs: a standard RPC stub and a wrapper stub. The wrapper stub is manually written by the developer and contains the application code methods required for the task defined in the original tactic; it effectively contains the lower-level information a surrogate needs to do the task. Using a Vivendi wrapper stub provides a convenient interface between Chroma and the targeted application. An example of a Vivendi tactics file and a Vivendi wrapper stub may be found in Appendix Vivendi.
2.2.8 EyeDentify
EyeDentify is a smartphone object recognition application developed on the Android OS that uses the Ibis Distributed Deployment System to deploy remote applications onto surrogates, and the Ibis High Performance Programming System for communication.
The authors developed two versions of the EyeDentify application: one performed all computation locally, and a second performed computation on surrogates; the response times for the same computation processes were compared. Results revealed a 60-fold increase in responsiveness using Ibis for cyber foraging on remote surrogates compared with local phone resources.
The Ibis middleware consists of a number of sub-projects, each of which implements part of the grid middleware requirements. In the graphic representation of the Ibis middleware found in Appendix EyeDentify, the left-side panel represents the Ibis Deployment System, with JavaGAT as the main component, and the right panel represents the Ibis High Performance Programming System, the main component of which is the Ibis Portability Layer (IPL). The combination of JavaGAT and IPL forms the main cyber foraging mechanism for EyeDentify. JavaGAT has adaptors able to bind to any middleware; the adaptors map the JavaGAT API to middleware calls, including SSH. The EyeDentify Android application used two adaptors, one for client resource access and one for surrogate resource access using SSH. On top of JavaGAT is a deployment library called IbisDeploy (Deploy) that starts distributed applications developed using the Ibis High Performance Programming System. On top of IbisDeploy is a GUI from which remote applications can be started. Deploying an Ibis application on a remote surrogate involves the following sequence of eight subtasks. 1) Replicate the application, libraries, and input file on the remote surrogate. 2) Start an Ibis server registry process. 3) Form an overlay network. 4) Construct middleware-specific job descriptions. 5) Submit the job description to the remote surrogate. 6) Monitor job statuses. 7) Retrieve the output file when the process is completed. 8) Clean up the remote file system.
Service discovery is performed by the IbisDeploy library when it defines job descriptions and the Ibis application using a namespace concept. An Ibis application description contains a main class and virtual machine options and arguments; a remote surrogate description contains details of how a remote surrogate should be accessed. Both the IbisDeploy library and the GUI are ported to Android. In summary, the application itself is developed on the client and deployed to the remote surrogate; therefore the only service software required to run on surrogates is the default Ibis middleware that JavaGAT binds to, and a JVM.
The EyeDentify application is an object recognition application that has two stages of operation. Stage one is called the learning mode, in which an image of an object is stored in an internal database with a predefined identification profile. In stage two, called the recognition mode, another image is captured and matched to the images in the internal database; the best match between the second image and the database images is then presented. Matching images is performed by learning algorithms that extract features and attributes of the second image, such as colour histograms, shapes, relative size, etc. The recognition phase of the application is very resource-consuming; this was the computation that the paper offloaded to surrogates.
2.2.9 DiET
DiET, developed by Kim and co-workers [Kim09a], is a framework that transforms the original Java bytecode from a remote content service provider into progressively smaller versions: one version for surrogates, and an even slimmer version for execution on a client. The motivation for the work is to reduce Java bytecode into serialised
distributed objects that are usable by surrogates and clients, without the need for major developer modifications to the original Java application. This slimming down of bytecode is done by replacing the main bodies of methods with remote procedure calls (RPCs). A client request starts the process by requesting an application to execute from a remote content service provider. The service provider then slims down the application's Java bytecode and transfers it to prediscovered surrogates and clients in the pervasive compute environment. The surrogates receive a server bytecode and the clients receive a smaller slim bytecode. Since no modification to code functionality takes place, once the client and server have received their versions, the cyber foraging takes place using the JVM invoked on surrogates. A graphic description of the transfer of server and slim bytecode is shown in Appendix DiET.
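The slimming idea can be sketched in Python terms (DiET itself operates on Java bytecode); the class names, the example method, and the RPC stand-in are all invented:

```python
# The "slim" client version keeps each class's interface but replaces
# method bodies with RPCs to the full "server" version on a surrogate.
class ServerImage:
    """Full implementation, shipped to the surrogate."""
    def sharpen(self, pixels: list) -> list:
        return [min(255, p * 2) for p in pixels]   # heavy work lives here

def rpc_call(method: str, *args):
    # Stand-in for marshalling plus network transfer to the surrogate JVM.
    return getattr(ServerImage(), method)(*args)

class SlimImage:
    """Client version: same interface, bodies replaced by RPCs."""
    def sharpen(self, pixels: list) -> list:
        return rpc_call("sharpen", pixels)

print(SlimImage().sharpen([10, 200]))  # [20, 255]
```

Because the interface is unchanged, application code calling `sharpen` does not need to know whether it holds the slim or the server version, which is the property the text emphasises.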
2.2.10 Instant-X
Instant-X is a component-based middleware platform that provides a generic programming model with an API for the essential tasks of multimedia applications with respect to signalling and data transmission. The motivation behind Instant-X is to develop spontaneous communication software compatible with multimedia encoder/decoder protocols. The work argues that standard multimedia encoder/decoder protocols are limited as communication software. For example, the Java Media Framework [Sun99a] offers basic access to multimedia codecs and RTP data transmission, but does not support further communication mechanisms such as signalling; JAIN SIP does support signalling, but requires considerable configuration by the multimedia application developer.
The thrust of the Instant-X concept is the ability to replace specific protocol implementations without changing the application code of a multimedia application. Instant-X also supports dynamic deployment of unavailable components at runtime. Although not a cyber foraging system per se, Instant-X is included in this section because, implemented with OSGi [OSGi07a] as a component platform, the system demonstrates interesting cyber foraging functionality. The programming model consists of three elements: binding, session, and context. A graphic representation of the programming model can be found in Appendix Instant-X. A binding is a local endpoint of an application represented by a URI; it activates the URI and maintains the URI's active status. A session represents a P2P relationship between participants, or actors; each participant has a unique URI identification, such as SIP:[email protected] if using the SIP URI method. The URI identifiers are encapsulated in bindings. The context contains optional parameters required for sessions and bindings, such as permissions. A SIP session may contain multiple RTP sessions for audio and video. The programming model is designed to provide a generic tool for developers, who do not need to worry about the underlying protocols required by their application. The generic API of Instant-X is such that the application does not need to change if the protocol implementation changes. The Instant-X API employs OSGi [OSGi07a], a Java service-oriented architecture that dynamically discovers collaborative components and changes the device composition of a variety of networks without the need for a device restart. Instant-X is demonstrated using cloud computing with OSGi; by using the cloud computing paradigm, surrogate discovery and scheduling are deferred to the cloud.
2.2.11 Scavenger
Scavenger is a dual-profile task scheduling system, written in Python as a hybrid cyber foraging approach based on Locust [Kristensen07a], consisting of a daemon installed on surrogates and libraries installed on the client. Scavenger is motivated by increasing the effectiveness of heuristic profiling for scheduling, taking task complexity into account and merging task-centric and peer-centric profiles. The system consists of two independent software components: a daemon running on surrogates using Stackless Python, and a library running on the client using normal Python. The libraries on the client provide the mobile code executed on surrogates through the RPC entry point of the surrogate daemon. Libraries can be invoked when the application starts, or automatically without the need to start the application.
The daemon on the surrogates has a front-end to receive RPCs, and a mobile code environment. The mobile
code environment allows dynamic installation and execution of the Python code transferred from the client in an RPC.
Kristensen argues that mobile code is a necessity for true mobility, because pre-installed tasks on surrogates would mean that all surrogates everywhere must have all tasks pre-installed. Similarly, using a VM is too heavyweight, and it takes too long for a full VM to instantiate, especially if the user is mobile and out of reach within a few minutes. Kristensen argues that using trusted mobile code is better: if the code is not installed, the mobile client simply installs it.
The daemon execution environment spawns a core scheduler on the surrogate that handles the offloaded application tasks for a particular core. When installing the daemon on a surrogate, the number of cores to offer as surrogate cores is user-configurable, for the case in which the surrogate is a machine such as a laptop used by other users. Here the laptop may be used locally for other activities and still serve as a surrogate for cyber foraging in a pervasive compute environment. A high-level view of the Scavenger architecture can be found in Appendix Scavenger.
Once a surrogate has performed an offloaded task, the task code is stored at the surrogate for future use under an automatically generated UID based on an MD5 sum. When invoking a given task, Scavenger first queries whether the task is already installed before installing it from the mobile code. In that case the mobile code does not have to be transferred along with the task invocation, because the task code is already at the surrogate. This exploits the fact that transient clients will normally use a certain number of tasks more than others.
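This install-if-missing caching scheme can be sketched as follows. This is an illustrative sketch, not Scavenger's actual API: the class and function names are our own, and only the idea of naming task code by its MD5 digest is taken from the text.

```python
import hashlib

class SurrogateTaskCache:
    """Hypothetical surrogate-side store of mobile code, keyed by MD5 digest."""

    def __init__(self):
        self._tasks = {}  # digest -> source code

    @staticmethod
    def task_id(code: str) -> str:
        # Task UID derived from an MD5 sum of the mobile code.
        return hashlib.md5(code.encode()).hexdigest()

    def has_task(self, digest: str) -> bool:
        # The client queries this before transferring mobile code.
        return digest in self._tasks

    def install(self, code: str) -> str:
        digest = self.task_id(code)
        self._tasks[digest] = code
        return digest


def offload(cache: SurrogateTaskCache, code: str):
    """Transfer the task code only if the surrogate does not already hold it."""
    digest = SurrogateTaskCache.task_id(code)
    transferred = False
    if not cache.has_task(digest):
        cache.install(code)
        transferred = True
    return digest, transferred
```

On a second invocation of the same task the digest matches and no code is transferred, which is exactly the saving the text describes for frequently used tasks.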
Security and trust measures include black-listing and white-listing of imported known standard library modules.
Surrogate discovery is performed using a presence discovery framework, which clients use to discover surrogates. XML-RPC is used for communication.
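A minimal client/surrogate exchange over XML-RPC can be sketched with the Python standard library. The endpoint name `run_task` and the toy task are assumptions for illustration; Scavenger's actual RPC interface is not specified in the text.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def run_task(name, arg):
    # Stand-in for executing installed mobile code on the surrogate.
    tasks = {"double": lambda x: x * 2}
    return tasks[name](arg)

# Surrogate daemon front-end: bind to an ephemeral port and serve RPCs.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]
server.register_function(run_task, "run_task")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: invoke the surrogate's RPC entry point.
proxy = ServerProxy(f"http://127.0.0.1:{port}")
result = proxy.run_task("double", 21)  # offloaded call returns 42
server.shutdown()
```

The same in-process pattern would, in a real deployment, run the server half on the surrogate and the proxy half inside the client library.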
Scheduling - The main contribution of Scavenger is its dual profiling during scheduling. Each task has two profiles: a task-centric profile, where a globally applicable task weight is stored, and a peer-centric profile, maintained for each (peer, task) pair that has been encountered. The peer-centric profile stores information about how exactly a peer performs with regard to a specific task. These stored peer-centric profiles are used to probabilistically estimate how an unknown peer may perform on a particular task. Scheduling is thus based on maintaining a history-based peer-centric profile. When profiling information is required during scheduling, the history-based peer-centric profile is first consulted to see whether that particular peer has been used before, and for which tasks. If the peer is unknown, the task-centric profile is consulted instead. The task-centric profile is also history based and contains a weight for each task. The task-centric weighting is calculated as follows:
Tweight = (Tduration × Pstrength) / Pactivity    (2.1)
where Tduration is the task duration in seconds, Pstrength is the peer strength according to the nbench benchmark1, and Pactivity is the number of tasks running during execution.
During a task, an expected Pstrength is derived by scaling with Tduration; this derived expected Pstrength is used to reason about the task's expected running times on other surrogates. For example, a task with a Tduration of 1 second on a surrogate with Pstrength 40 should take approximately 2 seconds on a surrogate with Pstrength 20. Experiments verified a correlation between theoretical and real task duration times on surrogates with varying compute resources carrying out the same task.
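The weighting of Equation 2.1 and the strength-based scaling it enables can be written out directly. The variable names mirror the text; the function names themselves are ours.

```python
def task_weight(t_duration: float, p_strength: float, p_activity: int) -> float:
    # Equation 2.1: Tweight = (Tduration * Pstrength) / Pactivity
    return (t_duration * p_strength) / p_activity

def expected_duration(t_duration: float,
                      p_strength_known: float,
                      p_strength_other: float) -> float:
    # A task's expected duration scales inversely with peer strength:
    # halving the strength roughly doubles the running time.
    return t_duration * p_strength_known / p_strength_other

# The worked example from the text: a 1-second task on a surrogate of
# strength 40 should take about 2 seconds on a surrogate of strength 20.
estimate = expected_duration(1.0, 40, 20)  # -> 2.0
```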
The profiles are two dimensional, with a second dimension that takes task complexity into account. Task complexity is also considered as a factor in the scheduling process; task complexity variables such as input size and value will vary. At the development stage, the application developer is asked to provide a description of the input parameters that determine a task's complexity. The developer-specified complexity parameter values and/or sizes are combined and evaluated to yield a single complexity value, which may be, for example, the size of an input file. Using a simple matching algorithm, this complexity value in combination with Tweight is used to update the dual scheduling profiles.
1 http://www.tux.org/~mayer/linux/bmark.html
Scavenger does not partition tasks into subtasks; each task described in mobile code cannot be further
divided into subtasks run on different surrogate machines.
2.3 Summary
The methods used to address surrogate discovery, trust establishment and task scheduling are shown in Table 2.1. With the exception of Scavenger, previously proposed cyber foraging systems have not supported all four design features. The reason for this may be the non-commercial nature of these systems, which focus on particular aspects of cyber foraging rather than a complete system.
Table 2.1: Cyber foraging systems reviewed and their adherence to the four fundamental features required of a cyber foraging system, as defined by Satyanarayanan [Satyanaranyanan01a] and Balan [Balan07a]
Figure 2.1: Classification of cyber foraging systems as thin and thick ATA and AAA
2.4 Discussion
In this section, the literature and cyber foraging tools described previously are discussed in the context of service/surrogate discovery; the mechanisms of these functionalities, summarised in Table 2.1 in the previous section, are expanded upon.
2.4.1 Surrogate Discovery
Introduction
Of the challenges to be addressed in the fusion of distributed and mobile computing [Satyanaranyanan01a], service discovery is the most critical; without it, transient use of cyber foraging would not be possible. Discovery enables entities to properly discover, configure and communicate with each other [Zhu05a]. Discovery of a specific service in a pervasive compute environment is hampered because it is unreasonable to expect surrogates to have services pre-installed to cater for all transient clients. The ability to dynamically discover services has less administrative overhead than traditional methods, which require prior knowledge of a service's existence and manual input of parameters such as computer names, IP addresses and URLs to configure the discovered surrogate service for the client application.
In this work, we replace the concept of searching for a service with searching for a surrogate. While prior knowledge of a service's existence may seem practical in fixed, limited networks, a pervasive computing environment will have transient client usage and potentially hundreds of transient surrogates and surrogate services. The cyber foraging systems reviewed in this section are not specifically discovery services; they are tools and aids for the developer to include cyber foraging functionality in mobile applications.
In the work of Zhu [Zhu05a], ten service discovery design features are defined in a unifying taxonomy of terms and definitions, and a comparison is made of how well some existing discovery protocols adhere to these terms. The protocols that Zhu compares are primarily for home and enterprise environments, which do not compare completely with a pervasive compute environment; however, the design approaches are useful guides and reference points for the design of a discovery protocol in a pervasive compute environment. In this discussion we first describe some current discovery protocols, and then, in Table 2.2, make the same comparison of adherence that Zhu made, applied to the discovery afforded by the cyber foraging systems described in the previous section in a pervasive compute environment.
Existing Discovery Protocols
Previous studies of announcement and discovery have focused mainly on static environments without mobile entities. Three examples of static discovery are Jini [Jini03a], Salutation [Salutation01a] and Universal Description, Discovery and Integration [UDDI11a] (UDDI, for Web services). In such static environments an entity can act as both client and server (peer); typically, one or more peers take on the role of registrar and maintain a central register. This registrar peer is contacted by peers that wish to announce that they provide a service, and the service is registered by the registrar peer. In return, the registrar peer provides a lease to the service peer, which indicates how long the registrar will continue to announce the service to the network. When a peer wants access to a service, it informs the registrar peer of its service requirements. The registrar then matches the requirements with registered services, and returns either a list of known registered services or information that provides communication details for a registered peer currently providing the required service. This is a centralised approach that does not perform well in a mobile environment, which may have mobile peers that both use and offer services concurrently.
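The registrar pattern described above can be sketched as a small in-memory register. This is a toy illustration of the Jini-style flow (register, receive a lease, look up by requirements), not any system's actual API; the class and attribute names are assumptions.

```python
class Registrar:
    """Toy central register: services announce themselves, clients query."""

    def __init__(self, lease_seconds: int = 30):
        self.lease_seconds = lease_seconds
        self._services = {}  # name -> (address, attributes)

    def register(self, name: str, address: str, attributes: dict) -> int:
        # A service peer announces itself; the registrar grants a lease
        # indicating how long it will keep announcing the service.
        self._services[name] = (address, attributes)
        return self.lease_seconds

    def lookup(self, required: dict):
        # Match a client's requirements against registered services and
        # return communication details of every service that satisfies them.
        return [
            (name, addr)
            for name, (addr, attrs) in self._services.items()
            if all(attrs.get(k) == v for k, v in required.items())
        ]
```

The single `_services` dictionary is exactly the centralisation the text criticises: every announcement and every query funnels through one peer, which scales poorly once mobile peers offer and consume services concurrently.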
An example of a mobile network discovery protocol is the gossip approach taken by Lee [Lee03a], called Konark. Here, mobile peers that are offering services announce the services they are aware of only to neighbours whom they believe are not aware of those services. The gossip announcement approach is optimal in a multihop network because of its viral pattern of communication. A drawback of the gossip approach is its high overhead in wide networks, and it is suited to networks such as Mobile Ad hoc Networks (MANETs). In cyber foraging, constraining data transfer to a single hop is impractical.
Another alternative mobile discovery protocol is Universal Plug and Play (UPnP) [UPnP02a], which is based on the Simple Service Discovery Protocol (SSDP) [Goland99a]. UPnP is not centralised around a single register; rather, when a peer joins a network it advertises a service to control points that represent potential clients for the service. When a potential client (control point) joins the network, it sends a discovery message requesting a service,
and the existing service-advertising peer responds to the requesting client by returning a service announcement message. A service announcement message contains only a limited service description; should the requesting client wish to have more service information, it responds to the service announcement message by requesting a more comprehensive service description, which the service-advertising peer duly provides. On receiving a full service description, the client may choose to use the service, which it does by communicating with the service peer using the Simple Object Access Protocol (SOAP) [SOAP08a].
There are two drawbacks to using UPnP for cyber foraging in a pervasive compute environment. Firstly, the discovery process is passive, in that networked peers only advertise once and then wait for a potential client to respond. Secondly, after a service announcement message is received, the client must again contact the service peer and request more service description information before deciding whether to use the service. In the context of task scheduling in cyber foraging, a client requires continuous service discovery information from all service surrogates. Continually contacting every surrogate in the pervasive compute environment, and receiving multiple service descriptions, would incur infeasible overhead, considering that all peers may be both clients and service surrogates; for continuous monitoring, every peer might therefore need to send requests to every other peer in the network.
In summary, existing discovery protocols do not completely meet the requirements of cyber foraging in a pervasive compute environment, which demands continuous discovery of mobile clients and surrogates (peers) in a single-hop network, together with continuously available service description updates that may be used for scheduling of partitioned application tasks.
Summary of Cyber Foraging Discovery Design Features
Table 2.2: Cyber foraging system tools and their adherence to the service discovery design features required of a cyber foraging system in a pervasive compute environment, as composed by Zhu [Zhu05a]
Service and Attribute Naming
The template-based approach defines a format for surrogate names and attributes, such as how the name must be composed. In addition to a template-based name for the surrogate, common parameters of a surrogate may also be predefined. The advantage of templates and predefined parameter sets is the decrease in ambiguity in communication between entities within the system and externally. How strictly the format and composition of a surrogate name and its attributes can be specified depends on the extent to which standardised components are used.
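A template for surrogate names might be enforced with a simple pattern check. The "domain/type/host" composition below is purely an assumption for illustration; the text does not prescribe any particular template.

```python
import re

# Hypothetical name template: lowercase domain, a fixed vocabulary of
# surrogate types, and a hostname. A fixed format like this removes
# ambiguity when entities exchange surrogate names.
NAME_TEMPLATE = re.compile(
    r"^(?P<domain>[a-z]+)/(?P<type>cpu|gpu|storage)/(?P<host>[\w-]+)$"
)

def parse_surrogate_name(name: str) -> dict:
    """Validate a surrogate name against the template and split it into parts."""
    m = NAME_TEMPLATE.match(name)
    if m is None:
        raise ValueError(f"name does not follow template: {name!r}")
    return m.groupdict()
```

Because every entity parses names the same way, a malformed name is rejected at the boundary rather than causing a mismatch deeper in the system.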
Initial Communication Method
While unicast is the most efficient initial communication method because it only targets specific entities, its drawback is the need for prior knowledge of the entity's address. However, the amount of prior knowledge required can be reduced by initially sending multicast UDP messages, from which entities can determine unicast addresses, and then switching from multicast to unicast. Less prior knowledge is then required, and the unicast addresses may be stored for future reference. Alternatively, broadcast may be used in single-hop networks to bind a discovery protocol to the underlying network protocol interface; the disadvantage of this method is that entities are then limited to those with that underlying network communication protocol interface.
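The multicast-then-unicast handshake can be sketched as follows. The multicast group, port and message fields are assumptions chosen for illustration, not values from any protocol in the text.

```python
import json
import socket

# Assumed administratively scoped multicast group for discovery probes.
MCAST_GROUP, MCAST_PORT = "239.255.42.42", 9999

def build_probe(client_id: str, reply_port: int) -> bytes:
    # The probe carries enough information for a responder to reply unicast.
    return json.dumps(
        {"kind": "probe", "client": client_id, "reply_port": reply_port}
    ).encode()

def parse_reply(data: bytes):
    # A responder's reply names its unicast address, which the client
    # stores so later traffic can skip multicast entirely.
    msg = json.loads(data.decode())
    return (msg["host"], msg["port"])

def send_probe(probe: bytes) -> None:
    # One multicast datagram; replies then arrive over plain unicast UDP.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        s.sendto(probe, (MCAST_GROUP, MCAST_PORT))
```

Only the first exchange pays the multicast cost; every stored `(host, port)` pair from `parse_reply` supports direct unicast afterwards, which is the trade-off the paragraph describes.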
Discovery and Registration
In the announcement-based approach, clients, surrogates and registration directories listen on a communication channel; when an announcement that a service is available is made, a client might learn the service is available and a directory might register its availability. In the query-based approach, an entity may receive an immediate response to a specific query; each query is replied to separately.
Service Discovery Infrastructure
In the directory-based infrastructure model, dedicated infrastructure components are used: the directory component maintains service information and processes queries and announcements. A directory-based infrastructure can have a flat structure, in which all directories have a peer-to-peer relationship, or a hierarchical structure, in which a directory only communicates with directories that match certain criteria. The non-directory-based infrastructure model has no dedicated infrastructure components: all services process a query, and reply if the service matches the query. A client may record information from a service announcement for future use.
Service Information State
The status of a service can be maintained as a soft or hard state. With soft state, the lifespan of a service is governed by the lease expiration time contained in the service announcement message. Prior to lease expiration, a client or directory may poll the service for validity, or the service reannounces itself or renews the current lease; otherwise the service expires and is removed from the directory entries of systems using the directory-based infrastructure model. Hard state is used in systems using the non-directory-based infrastructure model; this requires periodic polling of services by the client and directories to update service information.
Discovery Scope
Defining a discovery scope reduces unnecessary computation on clients, surrogates and directories. Scope criteria based on network topology, user roles and context information, or combinations thereof, aid correct scope definition. When including the network topology in the discovery scope criteria, LAN or single-hop wireless network range may be used; an implicit assumption here is that clients, surrogates and directories belong to the same administrative domain. Opting for user-role criteria allows the user to control the target domain; however, this entails the user having prior knowledge of the domain and domain authentication information. High-level context information, such as temporal, spatial and user-activity criteria, is still uncommon, but including it in the scope criteria lends added granularity to the discovery scope.
Service Selection
Although discovery scope criteria may limit the number of service matches, a discovery result may still contain several matched services. In such cases, service selection can be either manual or automatic. Manual selection gives the user complete control of service selection; however, this option assumes the user has the correct knowledge to make the optimal choice. Alternatively, automatic selection requires little or no user input. An example of automated service selection is matchmaking with ClassAds using Condor.
Service Invocation
Once a service has been selected, a client invokes the service, in a manner depending on the discovery system used. Service invocation has three facets of information: service location, an underlying communication mechanism, and operations specific to an application. The first level of service invocation is the network address, which provides service location only; the application is then responsible for defining communication and operations. The second level adds, to the service location, a definition of the underlying communication mechanism, normally Remote Procedure Calls (RPCs) and variations thereof. The third level adds, to the service location and communication mechanism, a definition of the application operations that are specific to an application domain.
Service Usage
Once service usage is granted to a client, the client may explicitly release the resources of a discovered and selected service. An alternative is the lease-based method, in which the client and service negotiate a usage period as a renewable lease time; when the lease expires, the resources are reclaimed and the lease information deleted. For pervasive compute environments with dynamic service resource availability, the lease-based method is more suitable than explicit release by the client, because a client failure could otherwise block the service's resources.
Service Status Inquiry
A client can monitor a service's state either by periodically polling the service or by using service event notification. Service event notification requires the client to register with the service, so that the service notifies the client if an event of interest occurs. Depending on the system, the method that results in the least frequent communication should be chosen.
Discovery Discussion
Spectra [Flinn02a], Chroma - Tactics [Balan03a] and Vivendi [Balan07a] are examples of RPC-based approaches, in which client applications are partitioned into locally executable code and remotely executable services. The services are pre-installed on surrogates and may be invoked using RPCs. Data exchange between client and surrogate(s) is via the system-specific Coda file system, which requires the aforementioned pre-installed remote services. This limits these systems to pre-configured pervasive compute environments with specific application support. An advantage of surrogate discovery in a pre-installed environment is low overhead when a surrogate is discovered, as there is no need to determine whether the service is installed.
The VM approaches of Slingshot [Su05a], AIDE [Messer02a] and Goyal [Goyal04a] add flexibility: the user gains control over the environment and may install anything on a surrogate without interfering with other systems. The replication approach taken by Slingshot [Su05a] entails a proxy running on the client application device that broadcasts each service request to all replicas; the first received response is then passed to the application. T