
An Application Service Provider for a Mobile Computing Environment

Gürhan Küçük

A thesis submitted to the Institute of Graduate Studies in Science and Engineering in partial fulfillment of the

requirements for the degree of Master of Science

in Information and Computer Science

YEDİTEPE UNIVERSITY 1998

Acknowledgements

I would like to express my deepest gratitude to my supervisor, Associate Professor

Şebnem Baydere, for her invaluable guidance, motivation and support during all

stages of this study.

I would especially like to thank Onur Demir, my colleague and partner throughout

the past four years, for his invaluable help and patience. I would also like to thank the

members of the MaROS project Mehmet Can Yıldız and Giray Devlet, and former

MaROS members Nurhan Çetin, Değer Cenk Erdil and Nilüfer Girgin for their

presence and invaluable help. Moreover, thanks to Yeditepe University Engineering

Faculty staff Alper Özpınar, Ayşegül Ergin, Ertan Toprakbastı, İlker Birbil, Mehmet

Ali Özcan, Rana Belen, Tansel Çağlar and Zeynep Yazıcıoğlu for their moral support

and help.

Last but not least, thanks to my mother, Emek Küçük, and my father, İlhan Küçük,

and the rest of my family for their moral support and patience throughout all stages of

this study and my life.

Abstract

The MaROS (Mobile and Relocatable Object System) application development platform is designed specifically for mobile computers. On this platform, registered mobile computers may transfer their applications to the MaROS server. This process is called “relocation”. In this thesis, the design and implementation of the MaROS Notification and Recovery modules are presented.

The MaROS Notification module deals with the transfer and removal of MaROS objects. It is the preparation step for the object relocation process. The MaROS Recovery module deals with system recovery and startup after voluntary shutdown requests. It coordinates the recovery process by maintaining a table called the “Recovery Table”.

Contents

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

1 INTRODUCTION
  1.1 Introduction
  1.2 Mobile and Relocatable Object System (MaROS)
  1.3 Motivation and Aims
  1.4 Background Information
    1.4.1 Mobile Computing
    1.4.2 Mobile Host
    1.4.3 Disconnected Communication
    1.4.4 Objects
    1.4.5 Object Relocation (Object Migration)
    1.4.6 Object Notification
    1.4.7 Object Recovery
  1.5 Thesis Summary

2 PREVIOUS WORK
  2.1 Introduction
  2.2 Rover: A Toolkit for Mobile Information Access
  2.3 ARTEMIS: Advanced Reliable disTributed Environment Middleware System
  2.4 Eden
  2.5 LOCUS
  2.6 Discussion

3 MaROS ENVIRONMENT
  3.1 Introduction
  3.2 The Physical Structure of MaROS
  3.3 The Logical Structure of MaROS
  3.4 The System Agents
    3.4.1 Communication Agent
    3.4.2 Object Manager
    3.4.3 Notification Agent
    3.4.4 Migration Agent
    3.4.5 Recovery Agent
  3.5 Host Registration and Authentication Protocol
  3.6 MaROS Objects
    3.6.1 Object Types
    3.6.2 Object Modes
    3.6.3 Object States
  3.7 Communication Structure
  3.8 System Recovery

4 NOTIFICATION DESIGN
  4.1 Introduction
  4.2 Notification in General
  4.3 Detailed Design
    4.3.1 Peer Entities
    4.3.2 Reserving Ports
    4.3.3 Notifier Tables
      4.3.3.1 Notifier Object Transfer Table (NOTT)
      4.3.3.2 Partial Object Transfer Table (POTT)
      4.3.3.3 Class Dependency Table (CDT) and Class Replica Table (CRT)
      4.3.3.4 Notifier Information Table (NIT)
    4.3.4 Object Transfer (Object Creation)
      4.3.4.1 Full Transfer
      4.3.4.2 Partial Transfer
      4.3.4.3 No Need To Transfer
    4.3.5 Object Deletion

5 SYSTEM RECOVERY DESIGN
  5.1 Introduction
  5.2 Recovery Table (RT) and Recovery Tree Structure
  5.3 Recoverable Objects vs. Unrecoverable Objects
  5.4 System Shutdown
    5.4.1 Creating Image Files
  5.5 System Startup
    5.5.1 Object Manager Startup Process
    5.5.2 Recovery Agent Startup Process
    5.5.3 Startup of the System Agents
    5.5.4 Mutation of SS fields and RA Garbage Collector

6 PILOT SYSTEM IMPLEMENTATION
  6.1 Introduction
  6.2 Pilot System Implementation Language
  6.3 Pilot System Implementation Environment
  6.4 Pilot System Implementation
    6.4.1 Notify Package
      6.4.1.1 Notify.NotificationAgent Class
      6.4.1.2 Notify.NIT Class
      6.4.1.3 Notify.NotifierClass Class
      6.4.1.4 Notify.NOTT and Notify.POTT Classes
    6.4.2 Notify.CDT Package
      6.4.2.1 Notify.CDT.CDT Class
      6.4.2.2 Notify.CDT.CRT Class
    6.4.3 Recovery Package
      6.4.3.1 RecoveryTable Class
      6.4.3.2 RecoveryAgent Class
      6.4.3.3 Recoverable Object Implementation

7 EVALUATION AND FUTURE WORK
  7.1 Introduction
  7.2 Performance Evaluation
    7.2.1 Full Transfer Tests
    7.2.2 No Need Type Object Transfer and Object Deletion Tests
  7.3 Future Work
    7.3.1 Future Work on the Notification Module
    7.3.2 Future Work on the Recovery Module
    7.3.3 Future Work on MaROS

8 CONCLUSION

REFERENCES

BIBLIOGRAPHY

List of Tables

Table 1.1 Characteristics of computer hardware
Table 1.2 Characteristics of network technology
Table 3.1 An instance of the Host Identification Table (HIT)
Table 3.2 Object states
Table 4.1 Sample NOTT and POTT instances
Table 4.2 Sample CDT and CRT instances
Table 4.3 Sample NIT instances
Table 6.1 The format of the MH version of the NIT
Table 6.2 The format of the MSP version of the NIT
Table 6.3 The format of the NOTT
Table 6.4 The format of the POTT
Table 6.5 The format of the CDT
Table 6.6 The format of the CRT
Table 6.7 The format of the Recovery Table
Table 7.1 Transfer results of 505319 bytes object
Table 7.2 The results of the No Need type object transfer tests

List of Figures

Figure 2.1 The Rover toolkit client/server distributed object model
Figure 2.2 Highly reliable distributed environment provided by ARTEMIS
Figure 2.3 Transitions between active and passive representations in Eden
Figure 3.1 The physical structure of MaROS
Figure 3.2 MaROS layers
Figure 3.3 The host authentication process
Figure 4.1 Initial phase of a Notification process
Figure 4.2 A Notification design approach (Initial Phase)
Figure 4.3 Current design of the Notification process (Initial Phase)
Figure 4.4 The comparison of two approaches
Figure 4.5 Scope of the Notifier tables
Figure 4.6 Object transfer (creation) process
Figure 4.7 Message format of a notification request
Figure 4.8 cfinfo field in detail
Figure 4.9 Message format of the message that is sent from HMSP to NotifierMSP
Figure 4.10 pcfinfo field in detail
Figure 4.11 Message format of the message sent from Notifiers to Handlers
Figure 4.12 Object deletion process
Figure 5.1 A sample Recovery Table instance and its corresponding tree structure
Figure 5.2 Setting the SS field
Figure 5.3 The OM realizes the Shutdown Signal
Figure 5.4 Flow of the Shutdown Signal
Figure 5.5 Startup of the Object Manager
Figure 5.6 Startup of the Recovery Agent
Figure 5.7 Code replication solution for the recovery process
Figure 6.1 The example piece of code showing the addition of code states
Figure 6.2 The implementation of check_ShutdownSignal() method
Figure 6.3 An example code part of the saveImage() method
Figure 6.4 An example image file reader code piece
Figure 7.1 Transfer time vs. buffer size graph for 100 Mbit tests
Figure 7.2 Transfer time vs. buffer size graph for 115200 bit tests
Figure 7.3 Transfer time vs. buffer size graph for 19200 bit tests
Figure 7.4 Transfer speed vs. buffer size graph for 100 Mbit tests
Figure 7.5 Transfer speed vs. buffer size graph for 115200 bit tests
Figure 7.6 Transfer speed vs. buffer size graph for 19200 bit tests

List of Abbreviations

AO Authentication Object

CA Communication Agent

CPU Central Processing Unit

CDT Class Dependency Table

CGI Common Gateway Interface

CRT Class Replica Table

DBMS Database Management System

DNS Domain Name Service

FTP File Transfer Protocol

H Handler

HIT Host Identification Table

ID Identifier

JVM Java Virtual Machine

LAN Local Area Network

MA Migration Agent

MaROS Mobile and Relocatable Object System

MASL MaROS Application Support Layer

MCSL MaROS Communication Support Layer

MH Mobile Host

MID Mobile Host Identifier

MMX Multimedia Extension

MRB MaROS Recycle Bin

MSP Mobile Host Service Provider

MUI MaROS User Interface

N Notifier

NA Notification Agent

NIT Notifier Information Table

NOTT Notifier Object Transfer Table

O MaROS Object

OID Object Identifier

OOP Object Oriented Programming

OM Object Manager

OS Operating System

OT Object Table

POTT Partial Object Transfer Table

QRPC Queued Remote Procedure Call

RA Recovery Agent

RDO Relocatable Dynamic Object

RT Recovery Table

SS Shutdown Signal

TCp Turkish Coffee Protocol

TCP Transmission Control Protocol

UDP User Datagram Protocol

VPM Virtual Port Mapper

VPR Virtual Port Reservation

WWW World Wide Web

1 Introduction

1.1 Introduction

Computers and computer networks have opened a new era in world history: the Information Age. Today, we can access a database located in any region of the world as if it were locally available, join multimedia conferences, and shop online. Distributed systems and client/server architecture are the keywords that best describe this age. A distributed system consists of many computers connected to a computer network. In the client/server model, servers are the computers that share their resources with the system, and clients are the computers that use these resources. This scheme is ideal for computer networks with fixed hosts.

At the beginning of this decade, wireless networks started to become popular as sales of portable computers increased. Wireless networks have provided computers with wireless interfaces that allow networked communication even while a user is travelling. The rapid advances in cellular communication technology, wireless LANs, and satellite services have enabled mobile users to access information anywhere and at any time [1].

Wireless networks need everything that classical computer networks need. However, they also need an improvement over the classical client/server model, because reliable transport protocols have been tuned for networks composed of wired links and stationary hosts [2]. Moreover, the classical client/server model assumes that both the clients and the servers are connected to a network via a fixed and continuous connection. Portable computers have deficiencies such as low bandwidth capacity, limited power supply, limited CPU power, and vulnerability to line failures, hand-offs, etc. Table 1.1 presents the characteristics of computer hardware. From the table it is clear that portability comes at the cost of performance.

Characteristic     Server   Workstation   Laptop             Palmtop
Processing power   High     High          Medium             Limited
Storage capacity   High     High          Medium             Limited
Portability        None     Limited       Slightly limited   Full
User interface     Full     Full          Slightly limited   Limited
Reliability        High     Medium        Limited            Limited

Table 1.1: Characteristics of computer hardware

Table 1.2 summarizes the major characteristics of network technology. From this table it is evident that availability comes at the cost of performance.

Characteristic   Fixed WAN/LAN            Dial-up wire                    Dial-up cellular
Bandwidth        High                     Medium                          Low
Reliability      High                     Medium                          Low
Initial cost     High                     Low                             Low
Latency          Low                      Medium                          High
Cost to use      Low                      Medium                          High
Topology         Fixed                    Fixed, but readily changeable   Dynamic
Available at     Outlet in organization   Any phone outlet                Anywhere (theoretically)

Table 1.2: Characteristics of network technology

A new type of client/server model, or application development platform, has to be designed to deal with the inadequacies of portable computers. This new model has to support new ideas such as disconnected communication, object relocation, and system recovery, which are hardly needed in the classical client/server model.

1.2 Mobile and Relocatable Object System (MaROS)

MaROS is a mobile computing environment designed specifically to compensate for the inadequacies of mobile computers [3]. In a mobile computing environment, portable computers are connected to a static network via wireless links. Portable computers in these environments have a limited power supply and limited communication bandwidth. Because the classical client/server model assumes that both the clients and the server (or servers) are connected to a network via a fixed and uninterrupted connection, it is not a good solution for portable computers. Moreover, classical communication protocols, such as TCP, do not take wireless networks into consideration.

MaROS proposes a new type of client/server model and a new type of communication protocol. This model and protocol help mobile hosts to extend their computing environment. The MaROS server, called the MSP (Mobile Host Service Provider), is a fixed and powerful host that has a wireless interface to communicate with mobile hosts (MaROS clients). MaROS clients (MHs, or Mobile Hosts) may transfer their objects to the MSP and execute them there. This approach enables Mobile Hosts to extend their computing environment using fixed and powerful server machines.

1.3 Motivation and Aims

Extending the computing environment of a Mobile Host is one of the major aims of MaROS. With this service, Mobile Hosts may run CPU-bound and bandwidth-bound applications even when they are powered off. However, this service requires the transfer of applications, or parts of applications, to the MSP site. This process is a kind of object synchronization that creates an exact copy of an object at the MSP. The Notification service is primarily concerned with the transfer of MaROS objects. Its additional task is the deletion of transferred objects when they are no longer needed. The object notification process is also a preparation for the object migration process. It simplifies migration by transferring objects when they are created.

The mobile computing environment of MaROS should be efficient and reliable. If the Mobile Host has to be shut down (a very common case for a portable machine) without a Recovery service, many jobs have to be restarted from the beginning or, worst of all, may be lost and never restarted. The Recovery service enables Mobile Hosts to continue their execution after a proper system shutdown and startup process. It provides a much more stable and efficient computing environment for Mobile Host users. There are many research efforts related to recovery in distributed systems. Some of them deal with failure recovery, which covers both software and hardware failures. Many of them use checkpointing algorithms to roll back to a previous state after a failure. The current recovery system of MaROS does not deal with hardware failures. The system tries to recover itself after a shutdown request is given by the user. The recovery module is one of the most important parts of the MaROS system. It enables the MaROS user to exit the system whenever necessary, knowing that at the next startup the system will continue its execution from the exact point where it stopped.

1.4 Background Information

This thesis is mainly concerned with an application development environment for mobile computers. Therefore, the reader needs to know the basic concepts in this area. This section provides general background information on mobile computing environments and object-related issues.

1.4.1 Mobile Computing

With the rapid increase in the number of portable computers and the availability of wireless link interfaces for them, a new computing approach has become feasible: Mobile Computing. The terms Wireless Computing, Ubiquitous Computing, Location-independent Computing and Nomadic Computing are used interchangeably. The portable computers in a mobile computing system are usually called Mobile Hosts.

1.4.2 Mobile Host

A Mobile Host is a portable computer that may connect to a network via its wireless link interface. It can be carried easily; however, it has some important disadvantages. It has to be recharged periodically, since it has a limited power supply. It may be connected to a mobile network via a cellular phone line when on the move. This type of connection involves communication hand-offs when changing cells. Therefore, the communication is not as reliable as it is in fixed networks. Moreover, the communication bandwidth is far lower than that of fixed networks. Furthermore, mobile communication is still expensive. A Mobile Host is also physically vulnerable: it may be dropped and broken, or it may be stolen. All of these handicaps have forced computer scientists to design new types of computing environments that accommodate Mobile Hosts.

1.4.3 Disconnected Communication

A Mobile Application Environment should provide communication primitives that are transparent to the applications. Disconnected communication is part of a new type of communication protocol designed to support mobile hosts. It tries to minimize, or totally remove, the side effects of connecting to a network via a mobile host. By using queuing strategies at both sides of the connection, the client and the server may send data even when there is no connection. The queued data is sent to the target host when the communication link comes up again.
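The queuing strategy described above can be illustrated with a minimal sketch. It is not the MaROS protocol: the class DisconnectedSender, its methods, and the in-memory list that stands in for the remote host are all hypothetical names chosen for this illustration. The only behavior taken from the text is that messages are queued while the link is down and delivered in order once it comes up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of one side of a disconnected-communication channel.
// None of these names come from the MaROS sources.
class DisconnectedSender {
    private final Queue<String> pending = new ArrayDeque<>();   // messages waiting for a link
    private final List<String> delivered = new ArrayList<>();   // stands in for the remote host
    private boolean linkUp = false;

    // Applications call send() without checking the link state.
    void send(String message) {
        pending.add(message);
        flush();
    }

    // Assumed to be called by some link monitor when connectivity changes.
    void setLinkUp(boolean up) {
        linkUp = up;
        flush();
    }

    // Drain the queue in FIFO order while the link is available.
    private void flush() {
        while (linkUp && !pending.isEmpty()) {
            delivered.add(pending.remove());
        }
    }

    int pendingCount() { return pending.size(); }

    List<String> delivered() { return delivered; }

    public static void main(String[] args) {
        DisconnectedSender sender = new DisconnectedSender();
        sender.send("create object 1");
        sender.send("delete object 1");
        System.out.println("queued while link is down: " + sender.pendingCount());
        sender.setLinkUp(true);
        System.out.println("delivered after reconnect: " + sender.delivered());
    }
}
```

Because the application only ever calls send(), the link state stays hidden from it, which is the transparency the paragraph asks for. A real implementation would also need the peer's queue, acknowledgements, and persistence of the queue across shutdowns.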

1.4.4 Objects

The object concept is used in many areas. A car, a TV set, a computer program, and a door knob are all examples of objects. Two characteristics of an object are universal: 1) properties and 2) methods. Each object has its own properties. For instance, a car object has a color, a type, a width, and a length. It also has methods such as drive, stop, and change gear. In the MaROS context, objects are Java threads. They have properties such as recoverability, relocatability, and state. A MaROS object also has many methods; suspendObject, sleepObject, activateObject, and relocateObject are some examples.

1.4.5 Object Relocation (Object Migration)

Sometimes an object may need to be moved from one machine to another. There may be various reasons for this. The host that contains the object may be heavily loaded, and the object may be moved to a less loaded machine. This process is called load balancing. Another reason may be long-running objects. They require uninterrupted execution, which may not always be possible. For instance, a portable computer has a limited power supply and an unreliable communication interface, neither of which suits such objects. In short, Object Relocation is the movement of an object from one host to another. In MaROS terms, it is the movement of a MaROS object from a Mobile Host (MaROS client) to the Mobile Host Service Provider (MaROS server), since MaROS supports one-way object relocation.

1.4.6 Object Notification

In order to support object relocation, a copy of each object should be created at the MSP. These mirror objects should also be deleted when the original object is deleted from the mobile host. This is a kind of object synchronization process, and since the MSP is informed about each step, the process is called notification.
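The synchronization idea can be sketched as follows. This is only an illustration under simplifying assumptions: MirrorStore and MobileHostObjects are hypothetical names, objects are reduced to byte arrays keyed by an integer identifier, and the network between the Mobile Host and the MSP is replaced by direct method calls. The actual Notification design uses its own tables and message formats, which are not modelled here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the MSP side: keeps one mirror per object identifier.
class MirrorStore {
    private final Map<Integer, byte[]> mirrors = new HashMap<>();

    // Notified when an object is created on the Mobile Host.
    void onCreate(int oid, byte[] image) {
        mirrors.put(oid, image.clone()); // keep an independent copy
    }

    // Notified when the object is deleted on the Mobile Host.
    void onDelete(int oid) {
        mirrors.remove(oid);
    }

    boolean hasMirror(int oid) {
        return mirrors.containsKey(oid);
    }
}

// Hypothetical stand-in for the Mobile Host side: every local create or
// delete is reported to the MSP so that both sides stay synchronized.
class MobileHostObjects {
    private final Map<Integer, byte[]> local = new HashMap<>();
    private final MirrorStore msp;

    MobileHostObjects(MirrorStore msp) {
        this.msp = msp;
    }

    void create(int oid, byte[] image) {
        local.put(oid, image);
        msp.onCreate(oid, image); // notification: create the mirror
    }

    void delete(int oid) {
        local.remove(oid);
        msp.onDelete(oid);        // notification: remove the mirror
    }
}
```

For example, after `new MobileHostObjects(store).create(7, data)` the call `store.hasMirror(7)` returns true, and it returns false again once the object is deleted on the Mobile Host.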

1.4.7 Object Recovery

In a mobile computing environment, there are portable computers that are not reliable. The recovery process is used for handling system failures. There are two types of failures: involuntary and voluntary. An involuntary failure may occur at any time, and its source may be a hardware or software problem. Failures of this type are handled by periodically saving a snapshot of the system (checkpointing) and rolling back to a previous stable state at the next system startup. A voluntary failure occurs when the user wants to shut down the system; the system can be signalled before the shutdown occurs. Voluntary system interruptions give the system a chance to save its crucial data. Of course, dealing with involuntary failures is much more difficult.
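A voluntary shutdown can be illustrated with a small sketch in which a job notices a shutdown request, saves its state, and later resumes from the saved point. The class CountingJob, its methods, and the byte-array image are hypothetical and are not the MaROS Recovery module; in particular, the shutdown signal is simulated by a step number passed to run(), and the image is kept in memory rather than written to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical job that can be stopped voluntarily and resumed later.
class CountingJob {
    private final int total; // number of steps in the whole job
    private int nextStep;    // index of the next step to execute

    CountingJob(int total, int nextStep) {
        this.total = total;
        this.nextStep = nextStep;
    }

    // Executes steps until the job ends, or until the simulated shutdown
    // signal is seen just before step shutdownBeforeStep (use -1 for none).
    // Returns the saved image if interrupted, or null if the job finished.
    byte[] run(int shutdownBeforeStep) {
        while (nextStep < total) {
            if (nextStep == shutdownBeforeStep) {
                return saveImage(); // voluntary shutdown: save crucial data first
            }
            nextStep++;             // one unit of work
        }
        return null;
    }

    // Writes the state needed to resume into an in-memory image.
    byte[] saveImage() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(total);
            out.writeInt(nextStep);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the job at the next startup from a saved image.
    static CountingJob restore(byte[] image) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(image))) {
            int total = in.readInt();
            int nextStep = in.readInt();
            return new CountingJob(total, nextStep);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int nextStep() {
        return nextStep;
    }
}
```

Handling involuntary failures would require calling something like saveImage() periodically, since no signal arrives before the failure; that is the checkpointing approach mentioned above and is not shown here.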

1.5 Thesis Summary

MaROS is an application development platform designed specifically for portable computers. These mobile hosts are called MaROS clients. MaROS is a client/server environment and consists of two types of machine: MaROS clients (MHs, for short) and the MaROS server, or Mobile Host Service Provider (MSP). A MaROS client runs MaROS client code and connects to the MSP via a wireless link. The MSP accepts connections from MaROS clients and permits them to relocate their objects.

This thesis is mainly concerned with the design and implementation of the Notification and Recovery modules of MaROS. It also covers the general architecture of MaROS to give the reader a clearer view of the system. In Chapter 2, previous work in this area is surveyed. Chapter 3 covers the overall design of MaROS. In Chapters 4 and 5, the design of the Notification and Recovery services of MaROS is explained in detail. Chapter 6 provides the implementation details of the pilot system. Chapter 7 presents the performance evaluation and additional ideas that may be implemented in the future. Chapter 8 is the final chapter and presents a conclusion for the whole work.

2 Previous Work

2.1 Introduction

There are several operating systems and toolkits focused on mobile and/or distributed computing platforms. In this chapter, some of those projects are discussed in detail, with particular attention to their recovery modules.

2.2 Rover: A Toolkit for Mobile Information Access

This toolkit is one of the projects closest to MaROS. The Rover toolkit was developed at the Laboratory for Computer Science at MIT. It provides mobile

application developers with a set of tools to isolate mobile applications from the

limitations of mobile communication systems. It supports mobile communication by

providing Relocatable Dynamic Objects (RDOs) and Queued Remote Procedure Call

(QRPC). An RDO is an object with a well-defined interface that can be dynamically

loaded into a client computer from a server computer (or vice versa) to reduce client-

server communication requirements. Queued remote procedure call is a

communication system that permits applications to continue to make non-blocking

remote procedure calls [6] even when a host is disconnected: requests and responses

are exchanged upon network reconnection [5].

The Rover toolkit offers applications a uniform distributed object system based on

client/server architecture. Rover applications employ a check-in, check-out model of

data sharing: They import RDOs into their address spaces, invoke methods provided

by the RDOs, and export the RDOs back to servers.

The latest extensions provide the tools for handling a specific class of faults: transient,

recoverable faults. These faults are typically caused by environmental circumstances

(e.g. power glitches, communication link errors or failures, resource exhaustion due to

high system load, etc.) or software errors in rarely used code paths. The extensions do

not address repeatable or non-recoverable failures (e.g. those due to critical design or

implementation errors).

The reliability extensions leverage functionality already provided by the Rover

toolkit: stable logging of each message sent by a client and message retransmission

after communication failures. While the use of stable logging at the client provides

reliable delivery of a message to a server, it does not handle failures at the server [7].

Figure 2.1 shows the Rover toolkit client/server distributed object model. Rover offers

applications client caching and optimistic concurrency control based upon a check-in,

check-out model of data sharing. Client applications use QRPCs to import RDOs

from servers (steps 1 and 2) and to export changed RDOs back to servers (steps 3 and

Figure 2.1: The Rover toolkit client/server distributed object model.

2.3 ARTEMIS: Advanced Reliable disTributed Environment Middleware System

ARTEMIS is a middleware that improves the reliability of application programs executed in a distributed environment, such as three-tier client-server applications or groupware applications, without changing them.

ARTEMIS is implemented as library routines and daemon processes, in a configuration where a backup computer exists for the server computer. ARTEMIS uses checkpoints as its key method for achieving high reliability. It provides a checkpointing protocol that takes consistent checkpoints of distributed processes.

Figure 2.2: Highly reliable distributed environment provided by ARTEMIS.

Figure 2.2 shows the environment provided by ARTEMIS. In this example,

ARTEMIS controls a WWW server, CGI application programs and a DBMS running

in a server computer as well as WWW browsers running in client computers. Under

control of ARTEMIS, even if the primary server computer goes down, all the

processes in the primary server computer can be resumed in the backup server

computer using their checkpoints and replicated files. DBMS can continue to run in

the backup server computer without executing journal recovery processing.

In the ARTEMIS environment, it is not necessary to modify application programs, because ARTEMIS libraries are linked to application programs dynamically, and

they have the same interfaces with and operating systems. ARTEMIS libraries keep

watch on the behavior of the process to which they are linked, and acquire checkpoints of that process [8].

2.4 Eden

The Eden system was developed at the University of Washington in Seattle. The goal of the Eden system was to investigate logically integrated but physically distributed operating systems.

Eden was based on the object model. It is a descendant of Hydra (Wulf et al. 1981). All ‘traditional’ programs and physical and logical resources are represented as objects. There are no pure data objects – Eden objects are supported by active processes. An Eden object may be seen as an instance of an abstract data type. Because there are some differences between Eden’s objects and those of other systems and languages, the designers refer to them as Ejects (for Eden Objects).

The underlying system of Eden is Berkeley UNIX running on VAXes. Each active

Eject executes within a separate UNIX process with its own address space. This

process is managed by the Eden kernel using UNIX facilities.

Ideally, an Eject should be active. However, it is not always active, either because it

or its computer has crashed, or because it has explicitly deactivated itself in order to

economize on the use of system resources. Thus, an Eject has two manifestations: An

active representation (with its system-level process) and a passive representation. The

passive representation consists primarily of a disk file, and only the passive

representation can survive a crash.

An Eject can perform a Checkpoint operation. This operation creates a passive representation, that is, a data structure designed to endure system crashes. This means that the information in a passive representation should be sufficient to enable the Eject to reconstruct its long-term state. Acquiring and releasing active and passive representations are illustrated in Figure 2.3.

Figure 2.3: Transitions between active and passive representations in Eden.

The figure shows that when an Eject is created, only an active representation exists. It

does not have its state saved in permanent store. This implies that if this Eject were to

Deactivate, or if the system were to crash, it would vanish and it could not be invoked

again.

Performing a Checkpoint operation results in the following operations: Opening a

passive representation of an Eject, writing its state in a series of PutData calls, and

completing the passive representation with a call. The Eject then has its state and

identity on permanent store. If this Eject Deactivates or crashes, its active

representation vanishes, but the passive representation remains. If the Eject having a

passive representation is invoked by another Eject, then the kernel reactivates it, that

is, it constructs a new active representation [9].

2.5 LOCUS

LOCUS is a UNIX-compatible, distributed operating system developed by Popek,

Walker and their co-workers at the University of California, Los Angeles. The system

has been in use for several years.

LOCUS’s general goals include making the development of distributed applications

as simple as single machine programming, and realizing the potential that distributed

systems with redundancy have for highly reliable, available operation. The LOCUS

architecture addresses the goals of:

(1) Network transparency – giving all users the illusion of operating on a single

computer. The network is not visible; there is no need to refer to a specific node

of a network;

(2) High reliability and availability – introduced for two general reasons. First,

many applications demand a high level of reliability and availability. Second, the

distributed environment presents new sources of failure, and recovery

mechanisms to deal with them are far more difficult to construct than in

centralized computer systems. LOCUS possesses one very important reliability

feature, namely, it supports automatic replication of stored data, with the degree

of replication indicated by associated reliability profiles; and

(3) Good performance – LOCUS achieves two basic performance characteristics desirable in a distributed system:

(a) Access to local resources in a distributed system should have comparable

performance to access to resources in a centralized system, as if mechanisms

for remote access were not present.

(b) Remote access, of course slower than local access, should be reasonably

comparable to local access [9].

2.6 Discussion

All of the above platforms provide system reliability by utilizing a special system object (agent) or by adding an extension package to the system. Some of the platforms use a replication strategy for a more reliable system [8,9]. The rest of the platforms prefer a checkpointing strategy for system recovery. The checkpointing strategy is divided into two camps. One camp applies the checkpointing operation periodically from a system-wide perspective [7], and the other uses a one-time checkpointing operation over only recoverable objects [8].

The replication strategy requires extensive network traffic, which is not suitable for mobile platforms. In mobile platforms, the most vulnerable machines are the mobile hosts. Keeping a replica for each mobile host is not feasible, because the mobile hosts do not have enough network bandwidth to support this kind of strategy. However, this strategy is very effective when dealing with hardware failures.

On the other hand, the checkpointing strategy may be used effectively by mobile platforms. However, it is not as effective as the replication strategy for hardware-based failures. It is possible to continue execution from the last checkpoint; however, nothing can be done if the checkpoint information is damaged. The second type of checkpointing strategy (one-time checkpointing) cannot deal with hardware failures.

3 MaROS Environment

3.1 Introduction

This thesis covers the design and implementation of two crucial parts of the MaROS environment, the notification and recovery modules. However, before going into detail about these modules, it is necessary to explain the MaROS environment clearly to give an idea of the whole system. The first topic of this chapter is the physical and logical structure of MaROS. Then, the registration and authentication process of the mobile hosts is explained in detail. The object-specific events and the system recovery process are discussed at the end of this chapter.

3.2 The Physical Structure of MaROS

The physical structure of MaROS consists of many portable computers and a fixed host. The portable computers may connect to and disconnect from the fixed host via their wireless network interfaces using cellular phones. The fixed host also has a wireless network interface to communicate with the portable computers. The portable computers run the MaROS client software and are called MaROS clients. The fixed host runs the MaROS server software and is called the MaROS server. In MaROS terms, the MaROS clients are called Mobile Hosts (MHs), and the MaROS server is called the Mobile Host Service Provider (MSP). Figure 3.1 shows the physical structure of MaROS in detail. In the current design and implementation of MaROS, only one MSP is available. The design may be extended with additional MSPs, which would open the way for features such as parallel processing and load balancing that may be implemented in the near future.

Figure 3.1: The physical structure of MaROS.

MaROS is platform independent. It may run on any OS that supports the Java Virtual Machine (JVM). This is a great advantage for both system programmers and application developers. Once Java code is compiled, it may be transferred to any other platform and run there.

3.3 The Logical Structure of MaROS

MaROS uses a layered approach. Each layer may contain one or more modules. Each module is responsible for a specific task, and it may use the services provided by the lower layer modules. The interaction between layers is shown in Figure 3.2.

RA: Recovery Agent MA: Migration Agent OM: Object Manager NA: Notification Agent CA: Communication Agent A, B: MaROS Application

Figure 3.2: MaROS layers.

The lowest layer is a composite layer. It is the combination of the OS kernel and the JVM. The upper layers do not communicate with the kernel directly. They use the lowest layer services via the JVM.

The second layer is called the MaROS Communication Support Layer (MCSL). The Communication Agent (CA) resides in this layer. It provides reliable communication primitives. This layer also supports disconnected communication, which is essential for the mobile hosts. The communication protocol provided by this layer is called the Turkish Coffee Protocol (TCp). The name is probably inspired by Java.

The third layer is the MaROS Application Support Layer (MASL). This layer is very

rich in agents. There are four agents in this layer: Object Manager, Notification

Agent, Migration Agent, and Recovery Agent. These agents provide many services to

the MaROS applications. Even the Communication Agent uses some of these

services. Object specific services such as object creation, deletion, notification,

relocation and system recovery are supported in this layer.

The MaROS applications reside in the top layer. A MaROS application is a Java application that may use the MaROS services provided by the lower layers.

3.4 The System Agents

The software agent concept is one of the latest programming techniques in the computing world. Software agents are software modules with cognitive abilities such as motivation, goal processing, reasoning and autonomy [10]. They are capable of learning and of acting independently of the user to achieve a given goal [11]. In the current implementation of MaROS, the system agents are MaROS applications that are responsible for performing given tasks and providing services to the applications independently of the user. They do not have artificial intelligence.

There are five system agents in MaROS: Communication Agent (CA), Object

Manager (OM), Notification Agent (NA), Migration Agent (MA) and Recovery

Agent (RA). In the current design, the MSP does not support recovery. Therefore, the Recovery Agent is available only for MHs. The other system agents have both MH and MSP versions.

3.4.1 Communication Agent

In MaROS, the Communication Agent is the system agent responsible for the entire communication backbone. It uses a new communication protocol called TCp (Turkish Coffee Protocol) that supports disconnected operations and virtual connections. Some crucial tasks of the Communication Agent are handling disconnected operations, providing non-blocking primitives, and re-establishing existing connections after voluntary shutdowns.

3.4.2 Object Manager

The Object Manager is the system agent responsible for all object-specific operations. It maintains the Object Table, which keeps information about all objects in the local system [12]. It is the creator of the other system agents. It interacts with the other system agents to support notification, migration and recovery operations. Another task of the Object Manager is to provide a system-wide unique identifier for every object.

3.4.3 Notification Agent

When a relocatable object¹ is to be created or deleted, the MSP site, which keeps an exact copy of the object, has to be informed. This process is necessary for object synchronization between the MH and the MSP, and is called notification. In the object creation phase, the copy of the object is automatically created at the MSP site. The Notification Agent uses notifier threads to handle notification requests. The notification process is in effect a preparation for the relocation process. Chapter 4 explains the Notification Agent in detail.

3.4.4 Migration Agent

The Migration Agent handles object relocation requests. Since the Notification Agent automatically creates copies of the relocatable objects at the MSP site, the Migration Agent deals with transferring the parameters, running the object at the MSP site, and retrieving the results when the object execution ends. The logical structure of the Migration Agent is very similar to that of the Notification Agent. It uses migrator objects to handle the relocation requests [13].

3.4.5 Recovery Agent

The mobile hosts may not run forever. They need to be shut down periodically. The task of the Recovery Agent is the recovery of the system after voluntary shutdowns. It detects the shutdown request and coordinates the shutdown process by managing a table called the Recovery Table (RT). This agent is explained in Chapter 5.

¹ Types of objects are explained in Section 3.6.1.

3.5 Host Registration and Authentication Protocol

System security is one of the most important parts of a system. The system should be protected from unauthorized access. MaROS tries to provide system security by registering its users. After the registration process, the system can easily identify and authenticate its clients.

The Host Identification Table (HIT) keeps the information of registered users. If a mobile host user wants to use the MaROS environment, he/she supplies some information to the administrator of the MSP. This information is recorded in the HIT, and the user is given a password for later authentication [12].

NAME | ID | AT | MT | OI | TI | P
gurhan | 00:27:45:10 | 30/08/1998 | 30/08/1998 | Yeditepe Univ. | Compaq35 NB | xxxx
mandrake | 00:26:40:12 | 01/09/1998 | 01/09/1998 | ABC Company | IBM 350L | xxxx
… | … | … | … | … | … | …

Table 3.1: An instance of the Host Identification Table (HIT).

The Host Identification Table contains seven fields:

1. NAME: The name of the mobile host.

2. ID: Network interface number (It is a worldwide unique identifier)

3. AT: Time of the host registration (Addition Time).

4. MT: Time of modification.

5. OI: Mobile host owner information.

6. TI: Technical details of the mobile host.

7. P: Encrypted password of the mobile host.

Authentication is performed at startup by the Authentication Object (AO). In order to join the MaROS environment, the Authentication Object at the mobile host sends a data packet that contains its NAME, ID and password to its peer at the MSP site. The Authentication Object at the MSP site searches the Host Identification Table for the ID of the mobile host. If the ID is found in the table, the password is checked. If the ID is not found in the table or the password is incorrect, the authentication request is discarded. Otherwise, the host is admitted to the MaROS environment by sending a positive acknowledgement. The host authentication process is depicted in Figure 3.3.

AO: Authentication Object HIT: Host Identification Table

Figure 3.3: The host authentication process.
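The MSP-side check described above can be sketched as a table lookup followed by a password comparison. This is only an illustration under stated assumptions: the class HostAuthSketch, its method names, and the choice to compare the password in its stored (encrypted) form are hypothetical, and the thesis does not give the actual code. As in the text, only the ID and the password decide the outcome.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the Authentication Object at the MSP site.
class HostAuthSketch {
    // One HIT row, reduced to the fields used during authentication.
    static final class HitEntry {
        final String name;
        final String encryptedPassword;

        HitEntry(String name, String encryptedPassword) {
            this.name = name;
            this.encryptedPassword = encryptedPassword;
        }
    }

    // The HIT, keyed by the network interface ID of the mobile host.
    private final Map<String, HitEntry> hit = new HashMap<>();

    // Registration is done by the MSP administrator.
    void register(String id, String name, String encryptedPassword) {
        hit.put(id, new HitEntry(name, encryptedPassword));
    }

    // Returns true (positive acknowledgement) only when the ID is registered
    // and the password matches; otherwise the request is discarded.
    boolean authenticate(String name, String id, String encryptedPassword) {
        HitEntry entry = hit.get(id);
        if (entry == null) {
            return false; // unknown ID
        }
        return entry.encryptedPassword.equals(encryptedPassword);
    }
}
```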

3.6 MaROS Objects

An object can be defined as a collection comprising a data structure and a set of operations on this data structure [9]. In MaROS, an object is a program, or a part of one, that can be executed in the MaROS environment.

3.6.1 Object Types

The MaROS environment offers its objects an important capability: the relocation process. However, not every object needs the relocation capability. Therefore, two types of objects are available in MaROS: ordinary and relocatable. The type of an object must be decided at creation time; after an object is created, its type cannot be changed. Ordinary objects do not have the relocation capability, and they may only be run at the host on which they are created. Relocatable objects, on the other hand, are automatically transferred to the MSP site via the notification process. After a successful notification process, the relocatable object may be run either at the MH or at the MSP. However, there is a restriction on relocatable objects: once they start to execute, they may not change their host. Since each site has a copy of the object, another internal property of the object determines where the relocatable object executes: the object mode.

3.6.2 Object Modes

A relocatable object has two copies, one at each site. It is not possible to run both copies at the same time. A relocatable object has two modes: active and passive. Only the copy that is in active mode may be run. Ordinary objects have only one mode, which is always active.

The object may be activated in one of two possible ways:

• By using the activate() call: Ordinary objects and relocatable objects that do not need relocation are activated by this call. They are run in the environment where they are created.

• By using the relocate() call: Only relocatable objects that have been notified may use this call. If this call is used, they are run at the MSP site. The mode of the object at the mobile host is set to passive, and the mode of the copy located at the MSP site is set to active.
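The two activation paths can be sketched for a single relocatable object as follows. The class ObjectModeSketch and its fields are hypothetical, and starting both copies in passive mode before any call is an assumption, since the text does not state the initial mode. The sketch only shows which copy becomes active and which preconditions apply.

```java
// Hypothetical sketch of the two activation paths for one relocatable object.
class ObjectModeSketch {
    enum Mode { ACTIVE, PASSIVE }

    Mode mhCopy = Mode.PASSIVE;   // copy at the mobile host (assumed initial mode)
    Mode mspCopy = Mode.PASSIVE;  // copy at the MSP (assumed initial mode)
    boolean notified = false;     // true once notification has succeeded
    private boolean started = false;

    // activate(): run the object where it was created, i.e. at the MH.
    void activate() {
        if (started) throw new IllegalStateException("object already started");
        started = true;
        mhCopy = Mode.ACTIVE;
        mspCopy = Mode.PASSIVE;
    }

    // relocate(): run the notified copy at the MSP instead.
    void relocate() {
        if (started) throw new IllegalStateException("object already started");
        if (!notified) throw new IllegalStateException("object not notified");
        started = true;
        mhCopy = Mode.PASSIVE;
        mspCopy = Mode.ACTIVE;
    }
}
```

Because both calls refuse to run on an object that has already started, at most one copy is ever active, which matches the restriction that a running object may not change its host.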

3.6.3 Object States

Every object has a state. For instance, a student may be in the studying state, and a worker may be in the working state. A MaROS object may be in one of nine possible states:

• created: This state is the initial state of ordinary objects.

• created_notnotified: This state is the initial state of relocatable objects. The

objects in this state can be activated on the MH. However, the relocate()

primitive may not be used in this state.

• created_notified: After a successful object notification phase, the state of the

object is set to created_notified. By the use of the relocate() primitive, that object

may be run at the MSP site.

• ready: When an object is activated, its state becomes ready. The objects in the

ready state may be suspended or deleted. If the object is a relocatable object, it

cannot be relocated once it is activated.

• sleeping: If the execution of an object is suspended temporarily, its state becomes

sleeping. In this state, the object can be resumed (activated again) or deleted.

• relocating: The relocation process transfers all the input parameters of a

relocatable object to the MSP site. Until the end of the transfer operation, the

state of the object is relocating. The next state may be either relocated or

deleted_notnotified.

• relocated: After the transfer of all the input parameters of the object, the state of

the object is set to relocated. The next state may only be deleted_notnotified.

• finished: This state is special to the relocatable objects. If a relocatable object on

the MSP finishes its execution (not being deleted by a system call), its output

values must be transferred back to the mobile host. During the output transfer

process, the state of the object is set to finished.

• deleted_notnotified: After the end of the output transfer operation or a delete

request from the parent object, the object changes its state to deleted_notnotified.

This state lasts until the end of the successful notification for the object deletion

process, and ends up with the death of the object.
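The transitions named in the list above can be collected into a small lookup table. The sketch below is an illustration, not the MaROS Object Table code: the class ObjectStateSketch and the method canMove are hypothetical. Two transitions are assumptions that the list does not state explicitly: that an object enters finished from ready, and that deleting a ready or sleeping object leads to deleted_notnotified.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the nine object states and their permitted next states.
class ObjectStateSketch {
    enum State {
        CREATED, CREATED_NOTNOTIFIED, CREATED_NOTIFIED, READY, SLEEPING,
        RELOCATING, RELOCATED, FINISHED, DELETED_NOTNOTIFIED
    }

    private static final Map<State, Set<State>> NEXT = new EnumMap<>(State.class);

    static {
        NEXT.put(State.CREATED, EnumSet.of(State.READY));
        NEXT.put(State.CREATED_NOTNOTIFIED, EnumSet.of(State.CREATED_NOTIFIED, State.READY));
        NEXT.put(State.CREATED_NOTIFIED, EnumSet.of(State.RELOCATING));
        // READY -> FINISHED and READY -> DELETED_NOTNOTIFIED are assumptions.
        NEXT.put(State.READY, EnumSet.of(State.SLEEPING, State.FINISHED, State.DELETED_NOTNOTIFIED));
        // SLEEPING -> DELETED_NOTNOTIFIED is an assumption as well.
        NEXT.put(State.SLEEPING, EnumSet.of(State.READY, State.DELETED_NOTNOTIFIED));
        NEXT.put(State.RELOCATING, EnumSet.of(State.RELOCATED, State.DELETED_NOTNOTIFIED));
        NEXT.put(State.RELOCATED, EnumSet.of(State.DELETED_NOTNOTIFIED));
        NEXT.put(State.FINISHED, EnumSet.of(State.DELETED_NOTNOTIFIED));
        NEXT.put(State.DELETED_NOTNOTIFIED, EnumSet.noneOf(State.class)); // object dies here
    }

    // True when the list permits moving directly from one state to the other.
    static boolean canMove(State from, State to) {
        return NEXT.get(from).contains(to);
    }
}
```

For example, canMove(READY, RELOCATING) is false, which reflects the rule that a relocatable object cannot be relocated once it is activated.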

It is not possible to create relocatable objects at the MSP site. That means some of the

states are not used at the MSP site, whereas some of them are not used at the MH site.

Table 3.2 shows all the possible states of a MaROS object depending on its type and

location. More detailed information can be found in [3].

Object States (columns: Relocatable MH, Relocatable MSP, Ordinary MH, Ordinary MSP)
ready √ √ √ √
sleeping √ √ √ √
created √ √ √
created_notnotified √
created_notified √
relocating √
relocated √
finished √
deleted_notnotified √ √

Table 3.2: Object States.

3.7 Communication Structure

The communication design is one of the most important parts of a system like

MaROS. The performance of the system heavily depends on its communication

infrastructure.

Mobile systems mostly operate in a voluntarily disconnected state. Since existing communication primitives are designed for fixed networks, they easily block a mobile host when there is no connection. In MaROS, a new communication protocol, the Turkish Coffee Protocol (TCp), is designed to overcome the problems of the mobile hosts. It runs over UDP and provides virtual ports, non-blocking primitives and message queues to support reliable disconnected operation. TCp uses two system objects to manage queues and virtual ports: the Communication Agent (CA) and the Virtual Port Mapper (VPM).

The CA is the creator and the manager of the outgoing message queue (send queue).

This queue contains the messages that are to be sent to the remote host. The use of

queues prevents the loss of messages in case the mobile host is disconnected from the

system. The other system agents and user programs may get information about the

connection status via the CA.
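The role of the send queue can be illustrated with a minimal sketch. It does not reproduce TCp: the class SendQueueSketch, its method names, and the list that stands in for the network are assumptions made for illustration. The point is that send() never blocks and that queued messages leave in order once the link comes back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch of a send queue that survives disconnection.
class SendQueueSketch {
    private final Queue<String> sendQueue = new ArrayDeque<>();
    private final List<String> delivered = new ArrayList<>(); // stands in for the network
    private boolean connected = false;

    // Non-blocking send: always enqueue, then flush if a link exists.
    void send(String message) {
        sendQueue.add(message);
        flush();
    }

    // Called when the connection status changes.
    void setConnected(boolean up) {
        connected = up;
        flush();
    }

    // Other agents and user programs may query the connection status.
    boolean isConnected() { return connected; }

    int pending() { return sendQueue.size(); }

    List<String> delivered() { return delivered; }

    // Drain the queue in FIFO order while the link is up.
    private void flush() {
        while (connected && !sendQueue.isEmpty()) {
            delivered.add(sendQueue.remove());
        }
    }
}
```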

The VPM is the controller of the virtual ports. The VPM has two main tasks:

1. Virtual Port Assignment: The VPM maps virtual ports to physical ports. When an object requests a port, the VPM allocates a physical port. Then, it maps the port to a virtual port in its table, and returns the virtual port number to the object. The requesting object may ask for any virtual port or for a specific virtual port. The VPM provides services for both types of requests.

2. Virtual Port Reservation: An object may need to reserve a port for its subobjects. System agents use this process frequently to save time. The object reserves a virtual port and is given a key for accessing that port later. Then that object, or any object that knows the port number and the key, may use the reserved port. Section 4.3.2 explains the process in more detail.
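The two VPM tasks can be sketched with two maps, one for active mappings and one for reservations. Everything in this sketch is hypothetical: the class VirtualPortMapperSketch, the starting port numbers, and the counter used as a reservation key are placeholders, and a real implementation would bind actual UDP ports and use keys that cannot be guessed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of virtual port assignment and reservation.
class VirtualPortMapperSketch {
    private final Map<Integer, Integer> virtualToPhysical = new HashMap<>();
    private final Map<Integer, Long> reservations = new HashMap<>();
    private int nextVirtual = 1;      // assumed numbering of virtual ports
    private int nextPhysical = 5000;  // assumed range of physical ports
    private long nextKey = 1;         // placeholder key generator

    private boolean taken(int virtualPort) {
        return virtualToPhysical.containsKey(virtualPort) || reservations.containsKey(virtualPort);
    }

    // Assignment of any free virtual port.
    int assignAny() {
        while (taken(nextVirtual)) nextVirtual++;
        return assign(nextVirtual);
    }

    // Assignment of a specific virtual port.
    int assign(int virtualPort) {
        if (taken(virtualPort)) throw new IllegalStateException("virtual port in use");
        virtualToPhysical.put(virtualPort, nextPhysical++);
        return virtualPort;
    }

    // Reservation: hold the virtual port and hand back a key for later use.
    long reserve(int virtualPort) {
        if (taken(virtualPort)) throw new IllegalStateException("virtual port in use");
        long key = nextKey++;
        reservations.put(virtualPort, key);
        return key;
    }

    // Any object that knows the port number and the key may claim the port.
    int useReserved(int virtualPort, long key) {
        Long expected = reservations.get(virtualPort);
        if (expected == null || expected != key) {
            throw new IllegalArgumentException("unknown reservation or wrong key");
        }
        reservations.remove(virtualPort);
        virtualToPhysical.put(virtualPort, nextPhysical++);
        return virtualPort;
    }

    int physicalPortOf(int virtualPort) { return virtualToPhysical.get(virtualPort); }
}
```

Reserving in advance lets an agent tell its peer which virtual port a subobject will use before that subobject exists, which is the property the reservation mechanism is needed for.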

3.8 System Recovery

The mobile hosts have a limited power supply. They have to be shut down voluntarily when their batteries require recharging. Because of this limitation, a mobile host user is reluctant to run a long-running process. However, when a system shutdown is necessary, a program may signal all the programs in the system to create their recovery files. In MaROS, this is exactly the task of the Recovery Agent. It detects and coordinates a shutdown process. More detailed information may be found in Chapter 5.

4 Notification Design

4.1 Introduction

This chapter briefly explains the design of the Notification process of the MaROS system. In MaROS, there are two sites that need to communicate with each other: the Mobile Host and the Mobile Host Service Provider. The user at the MH may need to transfer his/her MaROS programs (objects) to the MSP site, and may want to delete them after running those programs and retrieving the results. The transfer operation creates copies of the chosen objects at the MSP site, and the deletion operation deletes those copies. In both operations, the MSP site is said to be notified. A special system agent is responsible for the Object Transfer and Object Deletion operations. This agent is called the Notification Agent, and the term Notification covers both Object Transfer and Object Deletion.

4.2 Notification in General

A relocatable object may only be created at the MH site and may only be transferred to the MSP. The initial state of a relocatable object is created_notnotified. In this state, a MaROS user has two possible choices: 1) running the object at the MH site, or 2) running the object at the MSP site. If the first choice is selected, the user may run the object at any time. However, once the object is running, the user may not interrupt its execution and relocate it to the MSP site (in the current design and implementation, the migration of running objects is not supported). If the user selects the second choice, he/she should wait for the object state to change to created_notified. When a relocatable object is created, this state change is initiated automatically by MaROS. The Notification Agent is responsible for transferring the Java class files of the relocatable object to the MSP site. Once the files are transferred, the object may also be run at the MSP site. The Notification Agent receives Notification requests² directly from the Handler (the worker thread of the Object Manager), and informs it whether or not the transfer or deletion job was successful. If the notification job is completed successfully and it is an object transfer, the OM changes the state of the relocatable object to created_notified. To run the object at the MSP site, the relocate() primitive is used. The relocate() primitive activates the Migration Agent, and the MA transfers the command line parameters that are necessary to run the object to the MSP site.

² A Notification request is either an object transfer or an object deletion request.

4.3 Detailed Design

The NA uses the reliable communication primitives of the Turkish Coffee Protocol (TCp). It receives information directly from the Handler thread of the local Object Manager. The information contains the type of the task, since the NA may handle more than one kind of task. After a successful transfer of the object, the user may want to delete the object. It is the NA's responsibility to delete the object-specific files after a delete request arrives.

The Notification Agent is a very busy agent, and it should be available whenever it is needed. Transferring and deleting objects are very time-consuming tasks that require extensive work, so the NA does not perform them itself. If the request is a transfer or delete request, the NA immediately reserves a virtual port from the Virtual Port Mapping service of the Communication Agent and creates a Notifier; if the packet is invalid, the NA ignores it and does nothing. The Notifier is a system object that is responsible for transfer or delete requests. When creating the Notifier, the NA passes all the necessary information to the Notifier as parameters. This information contains the reserved virtual port number, the type of the task, the Object ID and name, and the names of the Java class files to be transferred (if the job type is object transfer). The Notifier uses the reserved virtual port for communicating with other system objects. There may be more than one Notifier running at a time. Each of them is responsible for one specific transfer or delete request.
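The delegation pattern described above can be sketched as follows. This is not the MaROS implementation: the class NotificationAgentSketch, the TaskType enum, and the counter that stands in for the virtual port reservation are all assumptions, and the Notifier body is a placeholder for the real transfer or deletion work. The sketch shows only that the agent returns immediately and that each request gets its own Notifier thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the NA hands each valid request to its own Notifier thread.
class NotificationAgentSketch {
    enum TaskType { TRANSFER, DELETE, INVALID }

    // A Notifier handles exactly one transfer or delete request.
    static final class Notifier implements Runnable {
        final int virtualPort;
        final TaskType type;
        final String objectId;
        volatile boolean done = false;

        Notifier(int virtualPort, TaskType type, String objectId) {
            this.virtualPort = virtualPort;
            this.type = type;
            this.objectId = objectId;
        }

        @Override
        public void run() {
            done = true; // placeholder for the real transfer or deletion work
        }
    }

    private final AtomicInteger nextPort = new AtomicInteger(100); // stands in for the VPM
    private final List<Thread> workers = new ArrayList<>();

    // Returns the created Notifier, or null when the packet is ignored.
    Notifier handle(TaskType type, String objectId) {
        if (type != TaskType.TRANSFER && type != TaskType.DELETE) {
            return null; // invalid packet: do nothing
        }
        Notifier notifier = new Notifier(nextPort.getAndIncrement(), type, objectId);
        Thread worker = new Thread(notifier);
        workers.add(worker);
        worker.start(); // the NA is free to accept the next request
        return notifier;
    }

    // Waits for all Notifiers; used here only to make the sketch observable.
    void awaitAll() {
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }
}
```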

4.3.1 Peer Entities

Four of the system agents are located on both the MSP and the MH. For instance, the Notification Agent at the MH has a peer at the MSP site. In the rest of the text, NAMH refers to the Notification Agent at the Mobile Host, and NAMSP refers to the Notification Agent at the MSP site. When the NAMH creates its Notifier, it also informs its peer of the new incoming request. As depicted in Figure 4.1, the NAMSP reserves a virtual port number and creates a peer Notifier at the MSP site. When creating the Notifier, the NAMSP passes it all the necessary information about the notification request and about its peer Notifier, which is located at the MH site. This information contains the virtual port number of the NotifierMH, the type of the notification request, the Internet address of the Mobile Host, the reserved virtual port number, and the virtual port key for using that port.

O: MaROS object OM: Object Manager H: Handler NA: Notification Agent N: Notifier VPM: Virtual Port Mapper

Figure 4.1: Initial phase of a Notification process

Figure 4.1 shows the initial phase of the Notification process. At this stage, both the

NAMH and the NAMSP assign the given task to the Notifier peers and start waiting for

new requests. In the figure, the numbers on arrows indicate the order of events.

4.3.2 Reserving Ports

The Notification process is a time-critical process, and everything should be

organized in an efficient way. The first priority job of the NAMH is to inform its peer

(NAMSP) as soon as possible. Any latency at this process delays the Notification

process. There are two design choices in the initial phase of the Notification process. The

first approach is simpler than the second one; however, it is not optimal. Figure

4.2 illustrates the first approach.

tnc: Time of Notifier creation at MH tnc’: Time of Notifier creation at MSP ti: Time for informing peer NA tm: Time for informing OMMSP

Figure 4.2: A Notification design approach (Initial Phase).

In this approach, the NAMH creates its Notifier before informing its peer. Although the NAMH

and the NAMSP are on different machines and could carry on their work in parallel,

this approach results in sequential execution: the NAMSP sits idle until the

NotifierMH informs it. In Figures 4.2 and 4.3, tnc denotes the time for the Notifier creation, ti

denotes the time for informing the NAMSP, tnc' denotes the time for the NotifierMSP

creation, and tm denotes the time for informing the OMMSP.

Figure 4.3 illustrates the second and currently used approach. The NAMH immediately

informs the NAMSP. In this case, the two agents work in parallel, which yields a shorter

Notification time than the first approach. Figure 4.4 compares the two approaches.

tnc: Time of Notifier creation at MH tnc’: Time of Notifier creation at MSP ti: Time for informing peer NA tm: Time for informing OMMSP

Figure 4.3: Current design of the Notification process (Initial Phase).

However, in the second approach, the NAMH should inform its peer about its

NotifierMH, which is not created yet. The NAMH should therefore know the virtual port

number of its NotifierMH before that NotifierMH is created. The same holds at the MSP

site for the NAMSP. In effect, the NAMH cannot inform its peer before creating its

Notifier, and it cannot create its Notifier before informing its peer. This deadlock is

solved by a service called Virtual Port Reservation (VPR), provided by the Virtual Port

Mapper (VPM) system object of the communication layer. The VPR is not an NA-specific

service: it is also used by the Handler objects of the Object Manager, and it may

be used by any other user program. The VPR enables the NA to request and reserve a

port number. The NA receives the port number together with a key for using that

reserved port. The key provides security in the reservation process: only the Notifier

holding this key may obtain the reserved port. The reservation adds a small amount of

time to the Notification process, shown in Figure 4.4 as tpr (the time for virtual port

reservation). When creating a connection, the Notifier uses a different method from the

usual one: it passes the virtual port number and the virtual port key to the VPM to

obtain the reserved port.

tnc: Time of Notifier creation at MH tnc’: Time of Notifier creation at MSP ti: Time for informing peer NA tm: Time for informing OMMSP tpr: Time for port reservation

Figure 4.4: The comparison of two approaches.
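The reservation step described above can be sketched in Java as follows. This is a minimal illustration, not the MaROS implementation: the class name, the method names `reserve` and `claim`, the starting port number, and the use of a random `long` as the key are all assumptions made for the example.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a Virtual Port Mapper offering port reservation.
// Names and key format are illustrative assumptions, not the MaROS API.
class VirtualPortMapperSketch {
    /** Result of a reservation: the port number plus the key needed to claim it. */
    static final class Reservation {
        final int port;
        final long key;
        Reservation(int port, long key) { this.port = port; this.key = key; }
    }

    private final Map<Integer, Long> reserved = new HashMap<>(); // port -> key
    private final SecureRandom random = new SecureRandom();
    private int nextPort = 1000; // assumed first virtual port number

    /** Reserves a fresh virtual port and returns it together with its key. */
    synchronized Reservation reserve() {
        int port = nextPort++;
        long key = random.nextLong();
        reserved.put(port, key);
        return new Reservation(port, key);
    }

    /** Hands out a reserved port; succeeds once, and only with the matching key. */
    synchronized boolean claim(int port, long key) {
        Long expected = reserved.get(port);
        if (expected == null || expected != key) {
            return false;
        }
        reserved.remove(port);
        return true;
    }
}

class VprDemo {
    public static void main(String[] args) {
        VirtualPortMapperSketch vpm = new VirtualPortMapperSketch();
        // The NA reserves a port before its Notifier exists ...
        VirtualPortMapperSketch.Reservation r = vpm.reserve();
        // ... so it can already tell its peer which port the Notifier will use.
        System.out.println("peer informed of port " + r.port);
        // Later, the newly created Notifier claims the port with the key.
        System.out.println("claim with key: " + vpm.claim(r.port, r.key));
        System.out.println("second claim: " + vpm.claim(r.port, r.key));
    }
}
```

Because the port number is known before the Notifier is created, informing the peer and creating the Notifier no longer depend on each other.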

4.3.3 Notifier Tables

Notifiers are the worker threads of the Notification Agent. There are five different

tables held and used by Notifiers: Notifier Object Transfer Table (NOTT), Partial

Object Transfer Table (POTT), Class Dependency Table (CDT), Class Replica Table

(CRT), and Notifier Information Table (NIT).

4.3.3.1 Notifier Object Transfer Table (NOTT)

This table holds the names and full paths of the class files to be transferred, and also

the lengths of these files. The NOTT is always used when an object transfer is in progress.

4.3.3.2 Partial Object Transfer Table (POTT)

The POTT is used when a partial transfer operation is in progress. An object may be

partially transferred to the MSP site, due to many factors such as system shutdown,

link failures, etc. When the object transfer operation is interrupted, retransferring all

the already-transferred files at the next startup is not ideal; it wastes time

and system resources. MaROS optimizes the transfer operation by keeping an

additional table called the POTT. The POTT keeps the indices of the partial files in the

NOTT. It additionally keeps the current length of each partial file. This information is

used to resume the transfer of the partial files.

NOTT

idx   Filename with full path   File length
1     /MaROS/test.class         17000
2     /MaROS/sample.class       25000
3     /MaROS/rect.class         1007

POTT

idx   NOTT index   Current length
1     1            4096
2     2            8192

Table 4.1: Sample NOTT and POTT instances.
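As an illustration of how the two tables work together, the sketch below computes how many bytes are still missing for each partial file, using the sample values of Table 4.1. The class and field names are hypothetical, and the thesis does not state that MaROS performs exactly this subtraction; it is shown only to make the relationship between the NOTT and POTT entries concrete.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory versions of the NOTT and the POTT (names are assumptions).
class TransferTablesSketch {
    static final class NottEntry {
        final String path;
        final long length; // full length of the class file
        NottEntry(String path, long length) { this.path = path; this.length = length; }
    }

    static final class PottEntry {
        final int nottIndex;      // 1-based index into the NOTT, as in Table 4.1
        final long currentLength; // bytes already present at the MSP
        PottEntry(int nottIndex, long currentLength) {
            this.nottIndex = nottIndex;
            this.currentLength = currentLength;
        }
    }

    /** Bytes still to be sent for a partial file. */
    static long remaining(List<NottEntry> nott, PottEntry p) {
        NottEntry file = nott.get(p.nottIndex - 1);
        return file.length - p.currentLength;
    }

    static List<NottEntry> sampleNott() {
        List<NottEntry> nott = new ArrayList<>();
        nott.add(new NottEntry("/MaROS/test.class", 17000));
        nott.add(new NottEntry("/MaROS/sample.class", 25000));
        nott.add(new NottEntry("/MaROS/rect.class", 1007));
        return nott;
    }

    public static void main(String[] args) {
        List<NottEntry> nott = sampleNott();
        System.out.println(remaining(nott, new PottEntry(1, 4096))); // 12904
        System.out.println(remaining(nott, new PottEntry(2, 8192))); // 16808
    }
}
```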

4.3.3.3 Class Dependency Table (CDT) and Class Replica Table (CRT)

The CDT and its sub-table, the CRT, are used only by the Notifiers on the MSP.

Figure 4.5 depicts the scope of all tables used by the Notification Agent and its

Notifiers. Each NOTT and POTT is created and used by only one Notifier, and, as the

figure shows, each has a copy at the peer Notifier.

Figure 4.5: Scope of the Notifier tables.

On the other hand, the NIT, the CDT and the CRT are global tables used and updated

by all Notifiers. For this reason, these tables should be synchronized to prevent the

readers-writers problem (a classic synchronization problem).

CDT

OID   # of Classes   File 1   File 2   File 3   …
28    3              1        2        3
29    2              2        3
…     …              …        …        …        …

CRT

idx   Class Name             # of occurrence
1     /gurhan/Circle.class   1
2     /gurhan/Rect.class     2
3     /gurhan/Test.class     2
…     …                      …

Table 4.2: Sample CDT and CRT instances.

In the sample tables above, the task of each table is depicted. The CDT keeps object

records: it knows which object has which classes, and it refers to the CRT for the

class names. For example, in the sample CDT, object Circle has three classes, which are

referenced as the 1st, 2nd, and 3rd entries of the CRT. A class file may be used by more

than one MaROS object. The CRT records how many objects use each class. From the

sample CRT, it is seen that class Rect and class Test are used by two objects, whereas

class Circle is used by only one object. Moreover, all these classes are uploaded from

the mobile host called gurhan.

If an object deletion is necessary, the CDT is searched first and all dependent

classes are located in the CRT. The occurrence number of each of these classes is

decremented by 1 in the CRT. If a class file ends up with an occurrence number less

than 1, the class file may be physically deleted, since no other object is using that

class. (Another design preference is to keep the class file, since a new object may need

it very soon. In this case, however, a ttl (time-to-live) number may need to be

attached to that record. The CRT is then checked periodically, and if any ttl becomes

zero, the corresponding class file is deleted.)
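The deletion rule above is a form of reference counting, which can be sketched as follows. The sketch is a simplification under stated assumptions: the class and method names are hypothetical, and the CDT here stores class names directly instead of CRT indices. The two methods are synchronized because the text describes these tables as shared by all Notifiers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the CDT (object -> classes) and the CRT (class -> use count).
class ClassTablesSketch {
    private final Map<Integer, List<String>> cdt = new HashMap<>();
    private final Map<String, Integer> crt = new HashMap<>();

    /** Records a new object and increments the count of each class it uses. */
    synchronized void addObject(int oid, List<String> classes) {
        cdt.put(oid, new ArrayList<>(classes));
        for (String c : classes) {
            crt.merge(c, 1, Integer::sum);
        }
    }

    /** Removes the object; returns the class files that no object uses any more. */
    synchronized List<String> deleteObject(int oid) {
        List<String> deletable = new ArrayList<>();
        List<String> classes = cdt.remove(oid);
        if (classes == null) {
            return deletable; // unknown object: nothing to do
        }
        for (String c : classes) {
            int count = crt.merge(c, -1, Integer::sum);
            if (count < 1) {
                crt.remove(c);
                deletable.add(c); // safe to delete physically
            }
        }
        return deletable;
    }

    synchronized int occurrences(String className) {
        return crt.getOrDefault(className, 0);
    }
}

class ClassTablesDemo {
    public static void main(String[] args) {
        ClassTablesSketch t = new ClassTablesSketch();
        List<String> first = new ArrayList<>();
        first.add("/gurhan/Circle.class");
        first.add("/gurhan/Rect.class");
        first.add("/gurhan/Test.class");
        List<String> second = new ArrayList<>();
        second.add("/gurhan/Rect.class");
        second.add("/gurhan/Test.class");
        t.addObject(28, first);
        t.addObject(29, second);
        System.out.println(t.deleteObject(29)); // [] : Rect and Test are still in use
        System.out.println(t.deleteObject(28)); // all three class files become deletable
    }
}
```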

4.3.3.4 Notifier Information Table (NIT)

As its name implies, this table holds the information of all Notifiers. There are two

versions of the NIT: The MH version and the MSP version. The MSP version holds

Mobile Host Identifier (MID) as an additional field. Each Notifier records itself in

that table at the beginning of its execution, and deletes itself at the end of its

execution. Like the CDT and the CRT, it is a synchronized table. The NITMH consists of three

fields: the Object ID of the Notifier, the Object ID of the object that the Notifier deals

with, and finally the notification type. The notification type may be Creation or

Deletion. The NITMSP additionally contains an MID field, which is used to identify which objects

belong to which Mobile Hosts. This table is mainly used by the Object Deletion

process (Section 4.3.5).

NITMH

Notifier   OID   Type of Notification
1050       980   C
1067       993   C
1056       765   D
…          …     …

NITMSP

Notifier   OID   MID           Type of Notification
3090       980   00000000001   C
3095       993   00000000001   C
3110       765   00000000001   D
…          …     …             …

Table 4.3: Sample NIT instances.
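A synchronized NIT of the MH kind can be sketched as below. The class, its methods, and the lookup of a Notifier that is currently transferring a given object are illustrative assumptions; the lookup mirrors the way the deletion process is described as consulting the NIT.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Notifier Information Table at the MH (no MID field).
class NitSketch {
    static final char CREATE = 'C';
    static final char DELETE = 'D';

    private static final class Row {
        final int notifierId;
        final int oid;
        final char type;
        Row(int notifierId, int oid, char type) {
            this.notifierId = notifierId;
            this.oid = oid;
            this.type = type;
        }
    }

    private final List<Row> rows = new ArrayList<>();

    /** Called by a Notifier at the beginning of its execution. */
    synchronized void register(int notifierId, int oid, char type) {
        rows.add(new Row(notifierId, oid, type));
    }

    /** Called by a Notifier at the end of its execution. */
    synchronized void unregister(int notifierId) {
        rows.removeIf(r -> r.notifierId == notifierId);
    }

    /** Returns the ID of a Notifier currently transferring the object, or -1 if none. */
    synchronized int findTransferNotifier(int oid) {
        for (Row r : rows) {
            if (r.oid == oid && r.type == CREATE) {
                return r.notifierId;
            }
        }
        return -1;
    }
}

class NitDemo {
    public static void main(String[] args) {
        NitSketch nit = new NitSketch();
        nit.register(1050, 980, NitSketch.CREATE); // first row of the sample NITMH
        nit.register(1056, 765, NitSketch.DELETE);
        System.out.println(nit.findTransferNotifier(980)); // 1050
        nit.unregister(1050);
        System.out.println(nit.findTransferNotifier(980)); // -1
    }
}
```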

4.3.4 Object Transfer (Object Creation)

The object transfer process deals with copying all the Java class files of the object to

the MSP site. It is a unidirectional process; the object transfer is only possible from

the MH to the MSP. This process is automatically initiated by the Object Manager,

when an object is created as relocatable. Figure 4.6 shows the entire object

transfer scenario.

Figure 4.6: Object transfer (creation) process

The Object Manager creates a Handler (3), and the Handler at the Mobile Host (HMH)

sends a Notification request to the Notification Agent (4). The message format of the

request is as follows:

pnum rt oid path ncf cfinfo opts

Figure 4.7: Message format of Notification request.

This message is sent from the HMH to the NAMH. The first field, pnum, contains the

virtual port number of the HMH. This information is used by the NotifierMH to connect

to the HMH. The second field is the request type. As indicated before, a Notification

request may be either a Create (Transfer) or a Delete request. The next field contains

the object identifier of the object that will be notified. The next three fields are only

used when the request is an Object Creation request. The path field contains the Java

CLASSPATH of the object. The field ncf contains the number of class files to be

transferred. The next field, cfinfo, holds the information of the class files. Figure

4.8 shows this part of the message in detail. The final field, opts, is reserved for

future use. It is added to the message format to hold any options that may be

added to the Notification process in the future.

name len cdt owner Opts # …. #

Figure 4.8: cfinfo field in detail.

The sixth field in the Notification request message, cfinfo, contains information about

the class files that the object has. Each file record has five fields, and file records are

separated with # signs. The first field contains the name of the class file. This

information is combined with the path value for locating the class file. The next field

is the length of the file. The third field contains the creation date and time of the file.

This information is planned to be used to detect different versions of the objects. The

fourth field holds the owner of that file in the system. In the current design, there may be

only one user running MaROS on an MH, and this field is set to user maros. The last

field is again reserved for future use, and is currently null.
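The record layout of the cfinfo field can be illustrated with a small parser. Only the # record separator and the five field names come from the text; the comma used between fields, the date format in the sample string, and all class and method names are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the cfinfo field: records end with '#',
// fields inside a record are assumed to be comma-separated.
class CfinfoSketch {
    static final class ClassFileInfo {
        final String name;
        final long length;
        final String created; // creation date and time, kept as text here
        final String owner;
        final String opts;    // reserved for future use, currently empty
        ClassFileInfo(String name, long length, String created, String owner, String opts) {
            this.name = name;
            this.length = length;
            this.created = created;
            this.owner = owner;
            this.opts = opts;
        }
    }

    static List<ClassFileInfo> parse(String cfinfo) {
        List<ClassFileInfo> files = new ArrayList<>();
        for (String record : cfinfo.split("#")) {
            if (record.isEmpty()) {
                continue;
            }
            String[] f = record.split(",", -1); // -1 keeps the empty opts field
            if (f.length != 5) {
                throw new IllegalArgumentException("expected 5 fields: " + record);
            }
            files.add(new ClassFileInfo(f[0], Long.parseLong(f[1]), f[2], f[3], f[4]));
        }
        return files;
    }

    public static void main(String[] args) {
        String sample = "test.class,17000,1998-05-01T12:00,maros,#"
                      + "rect.class,1007,1998-05-02T09:30,maros,#";
        for (ClassFileInfo c : parse(sample)) {
            System.out.println(c.name + " " + c.length + " " + c.owner);
        }
    }
}
```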

The NAMH reserves a virtual port (5) for its (currently non-existent) Notifier and

writes this port to the first field of the message. Then, it adds the MID (Mobile Host

Identifier) and a 48-bit security code³ to the head of the message, forwards it to its

peer (6), and creates a Notifier for dealing with that process (7). The Notifiers use a

file transfer protocol similar to FTP (File Transfer Protocol). Since TCP primitives

are used, the Notifiers do not deal with packet sequencing and error correction.

When the NAMSP receives the request, it reserves a virtual port number (8) for its

(currently non-existent) Notifier and writes this port to the third field (pnum) of the

message; pnum is now the third field because the MID and the security code precede it.

This process is very similar to the job of the NAMH. After replacing the pnum

field in the message, it forwards the message to the OMMSP (10). The OMMSP creates a

Handler (HMSP) for informing the NotifierMSP (11). The HMSP checks the file structure

and creates a message for the NotifierMSP (12).

³ In the current design, this security code is reserved for future work. It does not have any function yet.

tt ncpf pcfinfo opts

Figure 4.9: Message format of the message that is sent from HMSP to NotifierMSP.

This message contains all the necessary information related to the object transfer.

The first data field contains the type of transfer. An object transfer may be one of three

types: Full, Partial, or No Need for transfer, as illustrated in Figure 4.9. When the

HMSP checks the object files in the file system of the MSP, it may find that none of

the object files are present. It then sets the object transfer type

to Full transfer. However, if only some of the object files are in the system, or some of

them are only partially in the system, the object transfer type is set to Partial transfer. In

the third case, all of the object files are already in the system, so there is no

need for the object transfer, and the transfer type is

No Need. A brief explanation of these types follows:

4.3.4.1 Full Transfer

In Full Transfer mode, the HMSP only sends an F (indicating Full transfer) in the

message. When the NotifierMSP receives this message, it knows that all the files

in the NOTT should be transferred (the NOTT is created and filled at the beginning

of the Notifier execution). The NotifierMSP forwards the packet to the NotifierMH (13),

and the NotifierMH starts transferring files one by one (14).

4.3.4.2 Partial Transfer

In Partial transfer mode, the HMSP sends a P (indicating Partial transfer) followed by

the number of partial files (ncpf). The pcfinfo field contains the partial class file information.

Figure 4.10 shows the pcfinfo field in detail. It contains the NOTT indices and the lengths

of those partial files at the MSP site (if a file does not exist, this field is 0). The file

records are again separated with # signs. The NotifierMSP forwards this packet to the

NotifierMH (13). Both of them create their POTT, and the NotifierMH starts transferring

only those partial files (14).

NOTT_index len opts # … #

Figure 4.10: pcfinfo field of the message in Figure 4.9.

4.3.4.3 No Need To Transfer

In this case, the HMSP sends an N (indicating No transfer is needed) in the message.

When the NotifierMSP receives this message, it decides that all the files already exist in

the MSP. It simply forwards this message to the NotifierMH (13).
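The choice between the three transfer types (Full, Partial, and No Need) can be sketched as a comparison of the full file lengths with the lengths found at the MSP. The method below is a hypothetical illustration: the class name, the array-based inputs, and the textual layout of the returned message (spaces between fields, a comma inside each record) are assumptions; only the F, P and N codes, the ncpf count, and the # separator come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of how a Handler at the MSP could pick the transfer type.
class TransferTypeSketch {
    /**
     * fullLengths[i] is the length of NOTT entry i+1; lengthsAtMsp[i] is the
     * length currently found at the MSP (0 if the file does not exist).
     */
    static String classify(long[] fullLengths, long[] lengthsAtMsp) {
        List<String> incomplete = new ArrayList<>();
        boolean anythingPresent = false;
        for (int i = 0; i < fullLengths.length; i++) {
            if (lengthsAtMsp[i] > 0) {
                anythingPresent = true;
            }
            if (lengthsAtMsp[i] < fullLengths[i]) {
                // NOTT index (1-based) and the length already at the MSP
                incomplete.add((i + 1) + "," + lengthsAtMsp[i]);
            }
        }
        if (incomplete.isEmpty()) {
            return "N"; // every file is already complete at the MSP
        }
        if (!anythingPresent) {
            return "F"; // nothing is there yet: transfer everything
        }
        return "P " + incomplete.size() + " " + String.join("#", incomplete) + "#";
    }

    public static void main(String[] args) {
        long[] full = {17000, 25000, 1007};
        System.out.println(classify(full, new long[] {0, 0, 0}));           // F
        System.out.println(classify(full, new long[] {4096, 8192, 1007}));  // P 2 1,4096#2,8192#
        System.out.println(classify(full, new long[] {17000, 25000, 1007})); // N
    }
}
```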

When the transfer operation ends, the NotifierMSP signals the HMSP, and the NotifierMH

signals the HMH for success or failure in the object transfer operation (15). Figure 4.11

displays the format of the message that is sent from the Notifiers to the Handlers.

This message contains a Success or a Fail indicating the result of the Notification

process. The second field contains the object identifier of the object that is just

notified or failed to be notified. If the operation is successfully finished, the HMSP and

the HMH update their tables and change the state of the object from

created_notnotified to created_notified at the MH site.

result oid

Figure 4.11: Message format of the message sent from Notifiers to Handlers.

4.3.5 Object Deletion

After running the object, and obtaining the results, the user may want to delete the

object. In order to delete the object, the deleteObject() primitive of the OM is used.

Figure 4.12 illustrates the object deletion process.

Figure 4.12: Object deletion process

When the OM receives the deletion request (1), it immediately checks the type of the

object (2). If the object is an ordinary (non-relocatable) type object, the OM may

delete the object immediately. Ordinary objects may be deleted regardless of

their state. However, the object may be a relocatable object. In this case, the state of

the object is checked by the OM. The OM creates an HMH (3), and the HMH interacts

with other system agents for a successful object deletion.

The object deletion process is very similar to the object transfer process. The Notifiers are

created at both sites (7, 9), and the OMMSP is informed as it is in object transfer (10).

However, each Notifier checks the Notifier Information Table (NIT) to learn whether

that object has already been transferred (8, 10). If there is a Notifier dealing with the

transfer of that object, it is stopped by the Notifier that is charged with deleting that

object. Meanwhile, the OMMSP creates an HMSP for dealing with this deletion process

(11). The HMSP checks the Object Table (OT) and, if the object is present, removes the

object from the OT (12). Then it informs the NotifierMSP whether the deletion was

successful or not (13). If the object is deleted from the OT successfully, the

NotifierMSP tries to delete all the class files of that object. First, it deletes the object

record from the CDT and the CRT (the deletion of the object from the CDT and the

CRT is explained in Section 4.3.3.3). Then, it sends the result of the deletion process

to the NotifierMH (14). Finally, the NotifierMH simply forwards the packet to

the HMH (15). The first field of this packet contains an S or an F, notifying success or

failure of the deletion process. The second and last field contains the object

identifier of the object, for providing security in the Notification process.

System Recovery Design

5.1 Introduction

System recovery is one of the most crucial parts of MaROS. There are two possible

types of recovery: heavy-weight recovery and light-weight recovery. The first type

deals with all types of unexpected failures, such as system lock-ups,

hardware failures, etc. In the current design and implementation, this type of recovery

is not handled. The second type deals with expected system interruptions,

such as a shutdown request by the user. Since a shutdown request is detected by the

system, a negligible amount of time may be spent backing up some crucial data. This

process enables the system to continue its execution at the next system startup as if

there had been no interruption. Backing up all the crucial data is not

straightforward, and it should be coordinated carefully. MaROS uses

a special agent to control the recovery process: the Recovery Agent (RA). The Recovery

Agent provides a controlled shutdown, which is called System Suspension. When

MaROS is rebooted, all objects continue their execution from the point where

they were suspended. Without the RA, all interrupted processes would have to be restarted

from the beginning, which wastes resources and time.

5.2 Recovery Table (RT) and Recovery Tree Structure

Java does not support signal handling. This deficiency led the MaROS group to

implement their own signal handling backbone. The Recovery Agent and the

Recovery Table are the two main components of the signal handling structure. The

Recovery Agent keeps track of the Recovery Table (RT) for handling recoverable

objects. The RT holds all recoverable objects and their subobjects (if there are any).

This table actually holds a recovery tree structure⁴ as depicted in Figure 5.1. The root

of the tree is the Recovery Agent. Since each system agent controls a crucial part of

the system, all of them are recoverable.

The RA creates and handles a Recovery Tree structure with the help of the RT. If a

MaROS object has subobjects, the programmer should decide whether these

subobjects need recovery or not. For instance, the Notifiers are such subobjects that

rely on recovery. They transfer class files from the MH to the MSP. In case of an

interruption, the whole transfer operation should not have to be restarted from the beginning.

⁴ The Recovery Table does not contain the root (RA) of the Recovery Tree.

Figure 5.1: A sample Recovery Table instance and its corresponding tree structure.

One of the main tasks of the RA is to detect shutdown requests and initiate the

shutdown process. Before a proper shutdown, all recoverable objects should be

signalled and given a chance to write their crucial data to disk. The RA signals all

recoverable objects by traversing the Recovery Tree using the RT. Traversing the

Recovery Tree is a cooperative task. The RA initiates the process, and the OM

continues. All recoverable objects signal their subobjects by using the RT class

methods and wait until the subobjects finish their recovery process. Then, they do

their recovery work and signal their parent object.
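The order in which the tree is traversed (subobjects first, then the object itself) corresponds to a post-order traversal. The sketch below shows only that ordering; it is sequential and recursive, whereas the text describes objects that signal each other through the RT and wait. The class name and the use of a log list in place of real recovery work are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, sequential sketch of the shutdown order in the Recovery Tree.
class RecoveryNodeSketch {
    final String name;
    final List<RecoveryNodeSketch> children = new ArrayList<>();

    RecoveryNodeSketch(String name) {
        this.name = name;
    }

    /** Adds a subobject and returns this node so that calls can be chained. */
    RecoveryNodeSketch add(RecoveryNodeSketch child) {
        children.add(child);
        return this;
    }

    /** Signals all subobjects first, then performs this object's own recovery work. */
    void shutdown(List<String> log) {
        for (RecoveryNodeSketch child : children) {
            child.shutdown(log);
        }
        log.add(name); // stands in for "write crucial data to disk"
    }

    public static void main(String[] args) {
        RecoveryNodeSketch na = new RecoveryNodeSketch("NA")
                .add(new RecoveryNodeSketch("Notifier-1"))
                .add(new RecoveryNodeSketch("Notifier-2"));
        List<String> log = new ArrayList<>();
        na.shutdown(log);
        System.out.println(log); // [Notifier-1, Notifier-2, NA]
    }
}
```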

5.3 Recoverable Objects vs. Unrecoverable Objects

Recoverable Objects are MaROS objects that are recorded in the RT. This table

contains the recoverable objects and their subobjects. This means that if a subobject of an

object is recoverable, the object itself should also be recoverable. The best examples of

recoverable objects are system objects such as the Notification Agent and its Notifiers.

They are all recoverable, and they are recorded in the RT during the system startup

process. A MaROS programmer should plan and decide which of his/her

objects should be recoverable. Unfortunately, choosing the objects that are eligible to

be recoverable is not straightforward, because making an object recoverable has some

drawbacks. First of all, a recoverable object needs extra code, and

this additional code decreases the execution speed slightly. So, if an object executes

for only five minutes, it may not be eligible to be recoverable.

However, if an object performs a very time-consuming calculation, there may be a need for

recovery: after many hours of calculation, it may be necessary to shut down the

system, and the object should be recoverable to survive system shutdowns.

If an object is chosen as recoverable, it is written into the RT by the Object Manager

(or by its parent object if it is a subobject). A recoverable object should be signalled

about any impending shutdown. The object is signalled by setting its SS

(Shutdown Signal) field in the RT from 0 to 1.

Each recoverable object has a record in the RT, with a Shutdown Signal

(SS) field that is initially 0. Recoverable objects should periodically check their SS fields.

An object knows that the system will be shut down soon if its SS field becomes 1.

The shutdown process is hierarchical and decentralized. It is hierarchical because some

objects should wait for other objects to go into the recovery state. It is

decentralized because each recoverable object is responsible for its own

recovery: the RA does not deal with how the recoverable objects recover themselves.

It just coordinates the proper shutdown process.

Periodically checking the SS field requires additional code inside the

recoverable objects. This additional code deals with the object execution states. An object

changes its execution state when it receives data, produces data or sends data. After

changing its execution state, the object should check the RT for its SS field. It is under

the programmer's control to determine these execution states. States may be atomic or

coarse depending on the program behavior. However, one must understand that after

the system startup, the object continues its execution from the last state it

visited. Unfortunately, it loses all calculations made after that state.

This approach has some disadvantages. First of all, it increases the object

code size and the execution time. However, its advantage is obvious: an object

may continue its execution from the point where it was interrupted. Another big

advantage over a signal-based recovery approach is that recovery is more efficient and nearly

optimal. If a shutdown signal arrives between state changes, the object is not aware of the

shutdown event until it changes its state. This approach therefore requires slightly more

shutdown time, but provides more efficient recovery.
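The state-based checking of the SS field can be sketched as a loop that inspects the field after every state change. Everything in this example is an assumption made for illustration: the class name, the use of an `AtomicInteger` to stand in for the SS field of the RT record, and the `signalDuringState` parameter, which simulates a shutdown request arriving while a state is still executing.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical recoverable object that checks its SS field after each state change.
class RecoverableWorkerSketch {
    final AtomicInteger ss = new AtomicInteger(0); // stands in for the SS field in the RT
    int lastCompletedState = 0;                    // what would go into the image file
    boolean suspended = false;

    /**
     * Runs the states after lastCompletedState up to totalStates.
     * signalDuringState simulates a shutdown request arriving in that state
     * (use -1 for no request).
     */
    void run(int totalStates, int signalDuringState) {
        suspended = false;
        for (int state = lastCompletedState + 1; state <= totalStates; state++) {
            if (state == signalDuringState) {
                ss.set(1); // the request arrives while this state is executing
            }
            // ... the actual work of this execution state would happen here ...
            lastCompletedState = state;
            if (ss.get() == 1) {   // state changed: check the SS field
                suspended = true;  // back up crucial data here
                ss.set(0);         // tell the parent that recovery work is done
                return;
            }
        }
    }

    public static void main(String[] args) {
        RecoverableWorkerSketch w = new RecoverableWorkerSketch();
        w.run(5, 2);
        System.out.println(w.suspended + " after state " + w.lastCompletedState); // true after state 2
        w.run(5, -1); // next startup: continue from the last completed state
        System.out.println(w.suspended + " after state " + w.lastCompletedState); // false after state 5
    }
}
```

The object notices the request only at the end of state 2, which is the small extra shutdown delay discussed above.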

5.4 System Shutdown

When the RA receives a shutdown request, it immediately sets the SS field of the OM

to 1. When periodically checking the RT, the OM realizes that its SS field is 1, and it

immediately changes the SS fields of its subobjects to 1 in a hierarchical order. By doing

so, it signals the user objects and its Handlers, and waits until their SS fields all become 0

again. When they all become 0, the OM knows that the user objects have properly finished

their execution.

Figure 5.2: Setting the SS field.

Then it signals the system objects in a hierarchical order. It signals the Notification

Agent and the Migration Agent first. (Since there are no user objects at this stage, it

does not receive any request such as create, delete, or relocate. This means that the

Object Manager does not create new Handlers.) There is a design choice after this

stage. When the NA and the MA are done with the shutdown process (their SS fields

become 0 again), the OM may signal its Handlers, instead of signalling them

simultaneously with the user objects. Handlers are the worker threads of the Object

Manager, and they just send Notification requests to the Notification Agent. Then,

they start waiting for the Notification results from the Notifiers. This design

alternative may optimize the shutdown process. However, in the current design, the

Handlers are signalled before the NA and the MA. The CA is the final object to be

signalled, since it is the lowest-layer agent, responsible for the whole

communication backbone of MaROS. Figure 5.4 illustrates the whole recovery

hierarchy. The numbers indicate the shutdown order. Some of the objects have the

same number, indicating parallel processing.

Figure 5.3: The OM realizes the Shutdown Signal.

Figure 5.4: Flow of the Shutdown Signal

5.4.1 Creating Image Files

As indicated in Section 5.3, the RA uses a decentralized approach for the recovery.

Each recoverable object is responsible for backing up its valuable data before

shutdown. Basically, each object creates a file that stores all the information necessary to

resume the object at the next startup. This file is called an image file. After creating

the file, the recoverable object sets its SS field to 0 again to signal its parent that it

has finished its recovery work.

Each recoverable object is responsible for the content of its image file. The file name is

the MaROS object identifier of the object. For instance, an object with an Object ID of

4 creates an image file named 4. It is ideal to keep all the image files in the same directory.

5.5 System Startup

When starting up the system, the first object to run is the Object Manager

(the creator of all other MaROS objects). It checks whether there is an image file in the

image file directory. If there is not, the normal startup process is initiated. However, if

there is such an image file, the OM immediately enters its recovery state. Each recoverable

object has a recovery state that restores its tables and data. In this state, the object

reads its image file and restores its environment.
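Saving and restoring an image file named after the object identifier can be sketched with the standard `java.nio.file` API. The class, the method names, the temporary directory, and the plain-text content of the image are assumptions for illustration; the thesis does not specify the image file format.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: one image file per recoverable object, named by its object ID.
class ImageFileSketch {
    static Path imagePath(Path imageDir, int oid) {
        return imageDir.resolve(Integer.toString(oid)); // e.g. "4" for Object ID 4
    }

    /** Called before shutdown: stores whatever the object needs to resume. */
    static void save(Path imageDir, int oid, String state) {
        try {
            Files.write(imagePath(imageDir, oid), state.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called at startup: returns the saved state, or null for a normal startup. */
    static String restore(Path imageDir, int oid) {
        Path image = imagePath(imageDir, oid);
        if (!Files.exists(image)) {
            return null;
        }
        try {
            return new String(Files.readAllBytes(image), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a scratch directory that plays the role of the image file directory. */
    static Path newImageDir() {
        try {
            return Files.createTempDirectory("maros-images");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = newImageDir();
        save(dir, 4, "state=2;bytesSent=4096");
        System.out.println(restore(dir, 4)); // state=2;bytesSent=4096
        System.out.println(restore(dir, 5)); // null: no image file, normal startup
    }
}
```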

5.5.1 Object Manager Startup Process

Figure 5.5 illustrates the startup process of the Object Manager. The OM restores its

Object Table and creates the Recovery Agent. Then, it waits for the RA to finish its recovery

work. After its creation, the RA restores its RT and sets all SS fields

to 1. Then, it signals the OM. The OM continues with creating the other system and

user objects. Thereafter, it waits for the SS fields of all objects to become 0. Each

recoverable object checks whether its image file is present. If it is present,

the recoverable object starts by running its recovery state. After the recovery state

finishes, the object sets its SS field to 0. When all the SS fields that the OM is checking

have become 0, the OM sets the SS fields to 1 again, signalling to all of its objects

that the recovery state has completed successfully, and starts its normal execution. When a

recoverable object realizes that its SS field is 1 again, it understands that everything is

OK. It sets its SS field to 0 again and continues its execution.

Figure 5.5: Startup of the Object Manager

5.5.2 Recovery Agent Startup Process

The very first job of the Recovery Agent after its creation is to check its image file. If

its image file does not exist, it continues its normal execution without entering any

recovery state. Otherwise, it enters a recovery state in which it restores its Recovery

Table and its state. Figure 5.6 depicts the startup process of the Recovery Agent. After

restoring the Recovery Table, the RA sets the SS fields of all objects to 1. This

process is one of the key points of the recovery protocol. The Object Manager detects

recovered objects by checking their SS fields.

Figure 5.6: Startup of the Recovery Agent.

5.5.3 Startup of the System Agents

The startup processes of the other system agents are very similar to each other. The

same approach may be used by all of the recoverable objects. In the current design,

each recoverable object checks whether there is a file whose name equals its object

identifier. If there is, the object first restores its tables and variables by reading

that image file. Then, it continues from the state where it left off. The original code is

replicated for each state. An alternative approach would be to use labels for jumping to

the exact desired state. However, Java does not allow this solution, since it does not

support unconditional jump operations. Figure 5.7 depicts the code replication

process.
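For comparison, entering a sequence of states at a saved position can also be expressed in Java with a switch statement whose cases fall through. This is a different technique from the code replication used in MaROS, shown only to illustrate the idea of resuming at a given state; the class name and the three state names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: resume after the last completed state using switch fall-through.
class ResumeSketch {
    /** Returns the states executed when resuming after lastCompletedState. */
    static List<String> resume(int lastCompletedState) {
        List<String> executed = new ArrayList<>();
        switch (lastCompletedState + 1) {
            case 1:
                executed.add("receive data");
                // fall through to the next state
            case 2:
                executed.add("process data");
                // fall through to the next state
            case 3:
                executed.add("send results");
                break;
            default:
                break; // all states were already completed
        }
        return executed;
    }

    public static void main(String[] args) {
        System.out.println(resume(0)); // [receive data, process data, send results]
        System.out.println(resume(1)); // [process data, send results]
        System.out.println(resume(3)); // []
    }
}
```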

Figure 5.7: Code replication solution for the recovery process.

5.5.4 Mutation of SS fields and RA Garbage Collector

The execution of the recoverable objects may finish at any time, and the immediate

removal of those objects from the RT is not a viable solution, since their parent

objects may be blocked on a suddenly dead child object. The parent objects should be

informed in some way. The mutation process of the shutdown signal field sets the SS

field of the dead object to an extraordinary value of 2. This value of the SS field

indicates that the owner of that SS field is dead.

The Recovery Agent runs a garbage collector in order to remove those dead objects

from the Recovery Table. The garbage collector periodically checks the SS field of all

recoverable objects for mutation. In case it finds any, it immediately removes that

object from the list of its parent object record, and from the Recovery Table.
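One pass of such a garbage collector can be sketched as follows. The table layout, the class and method names, and the single synchronized `collect` pass are assumptions; only the SS values 0, 1 and 2 and the removal of dead objects from both the table and the parent's list come from the text.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical Recovery Table with SS fields and a garbage-collector pass.
class RecoveryTableSketch {
    static final int RUNNING = 0;
    static final int SIGNALLED = 1;
    static final int DEAD = 2; // mutated SS value: the owner has finished

    private static final class Row {
        int ss = RUNNING;
        final Set<Integer> children = new HashSet<>();
    }

    private final Map<Integer, Row> table = new HashMap<>();

    /** Registers an object; parentOid may be null, otherwise it must already be registered. */
    synchronized void add(int oid, Integer parentOid) {
        table.put(oid, new Row());
        if (parentOid != null) {
            table.get(parentOid).children.add(oid);
        }
    }

    /** Called when an object finishes: mutate its SS field instead of removing the row. */
    synchronized void markDead(int oid) {
        table.get(oid).ss = DEAD;
    }

    synchronized boolean contains(int oid) {
        return table.containsKey(oid);
    }

    synchronized int childCount(int oid) {
        return table.get(oid).children.size();
    }

    /** One collector pass: drops dead objects from the table and from their parents' lists. */
    synchronized int collect() {
        Set<Integer> dead = new HashSet<>();
        for (Map.Entry<Integer, Row> e : table.entrySet()) {
            if (e.getValue().ss == DEAD) {
                dead.add(e.getKey());
            }
        }
        table.keySet().removeAll(dead);
        for (Row row : table.values()) {
            row.children.removeAll(dead);
        }
        return dead.size();
    }

    public static void main(String[] args) {
        RecoveryTableSketch rt = new RecoveryTableSketch();
        rt.add(1, null); // e.g. the Notification Agent
        rt.add(2, 1);    // its Notifiers
        rt.add(3, 1);
        rt.markDead(2);
        System.out.println(rt.collect());     // 1
        System.out.println(rt.childCount(1)); // 1
    }
}
```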

Pilot System Implementation

6.1 Introduction

This chapter provides detailed information on the pilot implementation of the MaROS

environment. First, the implementation language chosen for MaROS

is discussed briefly. Second, the pilot system implementation environment is

described. Finally, the implementation details of the Notification and Recovery

modules of MaROS are given.

6.2 Pilot System Implementation Language

Java was chosen as the implementation language of the system. It is an

object-oriented programming language very similar to C++. It has many advantages and,

unfortunately, some disadvantages. The choice of Java as a programming language in

MaROS was one of the milestones in the design phase. Java is a very powerful

programming language because:

• Java is platform independent. A compiled object code may be run on any

hardware and OS without any modification or even recompilation. This was one of

the main reasons for choosing Java as the implementation language.

• Java does not support pointers. This property provides system security, since

users cannot corrupt crucial memory locations by using pointer operations.

• Java does not support the disadvantageous properties of other object-oriented

languages. For example, many OOP languages support multiple inheritance,

which can sometimes lead to confusion or unnecessary complications. Java does not.

• Java provides many pre-implemented utility classes such as Hashtable, Stack and

Vector classes. This saves programmers from implementing their

own classes, which keeps programs simple and also enhances code

readability. Of course, programmers may extend these main classes by writing

their own methods.

• Java has a simple Thread package. MaROS is a multithreaded system, and Java is

one of the ideal programming languages for implementing such a system.

Java also has some disadvantages. First of all, it is slower than its counterparts, since

it interprets the compiled byte code at runtime. It does not give full control to the

programmer for the sake of system security. For instance, with the lack of pointers

and process management tools, system programmers may find their work very

frustrating.

Briefly, the choice of Java is a trade-off between system performance and system

portability & security.

6.3 Pilot System Implementation Environment

The MaROS environment consists of one SUN UltraSparc1 and eight Intel PCs. The SUN system runs the Solaris 2.5.1 operating system. Four of the PCs run Windows 95, and the other four run the Turkuaz 1.0.3 GNU/Linux operating system. The UltraSparc1 is used as the MSP, and the Windows 95 machines are used as MHs. The Linux machines are used as local MSPs by each programmer for testing MaROS modules. When new versions of the modules become ready, those modules are transferred to the

6.4 Pilot System Implementation

There are five main modules that form MaROS when combined. These modules are called packages in Java. The five main packages are listed below:

• OM: The MaROS.OM package is the Object Manager of the system. It handles

objects and their operations.

• net: The MaROS.net package is responsible for the communication infrastructure of MaROS.

• Notify: The MaROS.Notify package handles the object notification process.

• Migration: The relocation of the objects is managed by the MaROS.Migration

package.

• Recovery: The system recovery process after voluntary shutdowns is handled by

the MaROS.Recovery package.

There are also additional utility packages in MaROS. This thesis only covers the

Notify and the Recovery packages.

6.4.1 Notify Package

The MaROS.Notify package contains all the applications that are necessary for the object notification process. The NotificationAgent is the main class of this package. The following subsections give an overview of the classes of the Notify package.

6.4.1.1 Notify.NotificationAgent Class

This class is the heart of the notification process. It is implemented as a MaROS thread, and it listens for notification requests coming from the Object Manager. This class also contains the Notifier class and three additional table classes: the Notifier Information Table (NIT), the Notifier Object Transfer Table (NOTT), and the Partial Object Transfer Table (POTT).

There are two versions (MH and MSP) of this class, since there are two types of machines in MaROS. For system recovery, the MH version has additional methods such as check_ShutdownSignal() and saveImage(). Since the job of the Notification Agent is to listen for notification requests and to assign Notifiers to those requests, it contains an infinite loop and Notifier creation code.
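The listening loop and the Notifier creation described above can be illustrated with a minimal sketch. It is not the MaROS code: the names AgentSketch, NotificationRequest and the STOP sentinel are hypothetical, the request encoding ('C' for create, 'D' for delete) is an assumption, and a java.util.concurrent queue stands in for the MaROS communication primitives, which are not shown in this chapter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical request record; the real MaROS message format is not shown here.
final class NotificationRequest {
    final int oid;     // identifier of the object to be notified
    final char type;   // 'C' = create, 'D' = delete (assumed encoding)
    NotificationRequest(int oid, char type) { this.oid = oid; this.type = type; }
}

// Sketch of an agent that waits for requests and starts one Notifier per request.
final class AgentSketch implements Runnable {
    // Sentinel that ends the loop; a real agent would loop until shutdown.
    static final NotificationRequest STOP = new NotificationRequest(-1, 'S');

    private final BlockingQueue<NotificationRequest> inbox = new LinkedBlockingQueue<>();
    private final List<Thread> notifiers = new ArrayList<>();
    final AtomicInteger handled = new AtomicInteger();

    void submit(NotificationRequest r) { inbox.add(r); }

    @Override
    public void run() {
        try {
            while (true) {
                NotificationRequest r = inbox.take();   // blocks until a request arrives
                if (r == STOP) break;
                Thread notifier = new Thread(() -> handle(r), "Notifier-" + r.oid);
                notifiers.add(notifier);
                notifier.start();
            }
            for (Thread t : notifiers) t.join();        // wait for outstanding Notifiers
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void handle(NotificationRequest r) {
        // Placeholder for the transfer or removal work done by a Notifier.
        handled.incrementAndGet();
    }

    // Runs the agent with two requests and returns how many were handled.
    static int runDemo() {
        AgentSketch agent = new AgentSketch();
        Thread agentThread = new Thread(agent, "NotificationAgent");
        agentThread.start();
        agent.submit(new NotificationRequest(100, 'C'));
        agent.submit(new NotificationRequest(101, 'D'));
        agent.submit(STOP);
        try {
            agentThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return agent.handled.get();
    }
}
```

Starting one thread per request keeps the agent free to accept the next request immediately, which matches the role of the agent as a dispatcher rather than a worker.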

6.4.1.2 Notify.NIT Class

The NotificationAgent class maintains the Notifier Information Table (NIT) to keep track of its Notifiers. The NIT is a global table that is used by all the Notifiers. Therefore, all of the NIT methods are synchronized, so that only one thread at a time can execute a method of the table. The NIT has been implemented as a table structure mapped onto arrays.

There are two versions of the table: the MH and the MSP version. In the MSP version, there is an additional field that keeps the MaROS identifier of the mobile host that is the source of the notification request. Tables 6.1 and 6.2 display the formats of the two versions of the NIT.

Field:  Notifier Object Identifier | Object Identifier | Notification Type
Type:   int                        | int               | char

Table 6.1: The format of the MH version of the NIT.

Field:  Notifier Object Identifier | Object Identifier | Notification Type | Mobile Host Identifier
Type:   int                        | int               | char              | 12 chars

Table 6.2: The format of the MSP version of the NIT.

The default table size is 255. That number determines the maximum number of Notifiers that may run simultaneously. It may seem very large for an MH and very small for the MSP. Currently, the use of the array structure seems optimal; however, the use of other data structures may be considered for better scalability in the future.

The methods of the NIT are explained below:

• ClearTable(): It sets the length of the table to 0. All elements in the table are left

untouched. However, they may be overwritten with the new table entries.

• insert (int NOID, int OID, char Type): This method inserts a new item into the

NIT. It accepts three parameters: The Object Identifier of the Notifier (NOID),

Object Identifier of the object which is to be notified (OID), and finally the type of

the notification (Create or Delete).

• delete (int NOID): This method deletes an existing entry from the table.

• get_NOID (int OID, char Type): It returns the Notifier Object Identifier (NOID)

of a given notification request.

• get_tmax (): This method returns the current size of the table.

• set_tmax (int mx): With a given parameter, this method sets the table size. It is

used by the recovery process.

• get_Notifier_OID (int idx), get_ObjectID (int idx), get_Type (int idx): Those

three methods are used by the recovery process to restore the table entries.
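The behaviour of the MH version of the NIT can be sketched as follows. This is only an illustration under stated assumptions: the class name NitSketch is hypothetical, indices are 0-based (the figures later in this chapter iterate from 1), and the return value -1 for a missing entry and the boolean results of insert and delete are assumptions that are not taken from the MaROS source.

```java
// Sketch of an array-backed, synchronized table in the spirit of the MH NIT.
final class NitSketch {
    static final int DEFAULT_SIZE = 255;          // default capacity stated in the text

    private final int[] noid = new int[DEFAULT_SIZE];
    private final int[] oid = new int[DEFAULT_SIZE];
    private final char[] type = new char[DEFAULT_SIZE];
    private int tmax = 0;                          // number of entries in use

    // Resets the length only; old entries stay until they are overwritten.
    synchronized void clearTable() { tmax = 0; }

    synchronized boolean insert(int n, int o, char t) {
        if (tmax == DEFAULT_SIZE) return false;    // table full
        noid[tmax] = n;
        oid[tmax] = o;
        type[tmax] = t;
        tmax++;
        return true;
    }

    synchronized boolean delete(int n) {
        for (int i = 0; i < tmax; i++) {
            if (noid[i] == n) {
                // Move the last entry into the freed slot to keep the table dense.
                tmax--;
                noid[i] = noid[tmax];
                oid[i] = oid[tmax];
                type[i] = type[tmax];
                return true;
            }
        }
        return false;
    }

    // Returns the Notifier identifier for a request, or -1 if none is registered.
    synchronized int getNoid(int o, char t) {
        for (int i = 0; i < tmax; i++) {
            if (oid[i] == o && type[i] == t) return noid[i];
        }
        return -1;
    }

    synchronized int getTmax() { return tmax; }
}
```

Because every method is synchronized on the same table instance, two Notifiers can never interleave inside insert() and delete(), which is the property the text relies on.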

6.4.1.3 Notify.NotifierClass Class

The NotifierClass is the class that is responsible for the object notification process. It accesses the NIT to register or remove the current Notifier instance. It handles the notification request forwarded by the Notification Agent. For the object transfer, two additional tables are used: the NOTT and the POTT.

There are two versions of the NotifierClass, like the NotificationAgent class: the MH and the MSP versions. The MH version contains additional methods and code for the recovery process. As a result, the MH versions of the Notification Agent and the Notifiers run slower than their MSP counterparts for the sake of reliability.

6.4.1.4 Notify.NOTT and Notify.POTT Classes

The Notifier Object Transfer Table (NOTT) and the Partial Object Transfer Table (POTT) are used by the NotifierClass, and they are local to each Notifier. Each Notifier that is responsible for an object transfer manages its own NOTT. The POTT, on the other hand, is created and used only when the object transfer is of the partial type.

Field:  File Names | File Lengths
Type:   String     | long

Table 6.3: The format of the NOTT.

Field:  File Indices | Current File Lengths
Type:   int          | long

Table 6.4: The format of the POTT.

The structures of the NOTT and the POTT are very similar to that of the NIT. The formats of the two tables are shown in Table 6.3 and Table 6.4, respectively. The default maximum table size is 255 for both tables. The methods of the NOTT are explained below:

• ClearTable(): It sets the length of the table to 0. All elements in the table are left

untouched. However, they may be overwritten with the new table entries.

• insert (String Filename, long Filelength): This method inserts a new item into the NOTT. It accepts two parameters: the name and the length of the file. This information is used for the object transfer operation.

• get_tmax (): It returns the current size of the table.

• get_filename (int idx): It returns the filename field of a given index in the NOTT.

• get_filelength (int idx): This method returns the filelength field of a given index

in the NOTT.

The POTT has almost the same methods. The methods that differ are listed below:

• insert (int Fileindex, long currentFilelength): This method inserts a new item

into the POTT. It accepts two parameters. The first parameter is the index of the

file in the NOTT. The second parameter contains the partial file length of the file.

The partial transfer starts from that point.

• get_fileidx (int idx): It returns the fileindex field of a given index in the POTT.

• get_currfilelength (int idx): This method returns the filecurrlength field of a

given index in the POTT.
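The relation between the two tables can be made concrete with a short sketch. It assumes that a POTT entry refers to a NOTT row by index and records how many bytes of that file are already present, so that the partial transfer resumes from that offset. The class and method names (TransferTablesSketch, remainingBytes and so on) are hypothetical, and lists are used in place of the fixed-size arrays of the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class TransferTablesSketch {
    // NOTT row: a file that belongs to the object and its full length in bytes.
    static final class NottRow {
        final String fileName;
        final long fileLength;
        NottRow(String fileName, long fileLength) {
            this.fileName = fileName;
            this.fileLength = fileLength;
        }
    }

    // POTT row: index of a NOTT row and the number of bytes already transferred.
    static final class PottRow {
        final int fileIndex;
        final long currentLength;
        PottRow(int fileIndex, long currentLength) {
            this.fileIndex = fileIndex;
            this.currentLength = currentLength;
        }
    }

    private final List<NottRow> nott = new ArrayList<>();
    private final List<PottRow> pott = new ArrayList<>();

    void insertNott(String fileName, long fileLength) {
        nott.add(new NottRow(fileName, fileLength));
    }

    void insertPott(int fileIndex, long currentLength) {
        pott.add(new PottRow(fileIndex, currentLength));
    }

    // Bytes still to be sent for one POTT entry; the transfer resumes at currentLength.
    long remainingBytes(int pottIndex) {
        PottRow p = pott.get(pottIndex);
        return nott.get(p.fileIndex).fileLength - p.currentLength;
    }

    // Total number of bytes that a partial transfer still has to send.
    long totalRemainingBytes() {
        long sum = 0;
        for (int i = 0; i < pott.size(); i++) sum += remainingBytes(i);
        return sum;
    }
}
```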

6.4.2 Notify.CDT Package

The Class Dependency Table (CDT) package is available in the MSP version of the Notify package. A mobile host user may want to transfer many objects that use shared classes. If a copy of a class file is already available at the MSP site, there is no point in transferring it again. The CDT package prevents retransmission of the same classes by maintaining a table called the Class Dependency Table (CDT). The CDT class uses another class called the Class Replica Table (CRT). Chapter 4 describes the logical structures of these tables in detail. This section explains their physical structure.

6.4.2.1 Notify.CDT.CDT Class

The Class Dependency Table (CDT) has been implemented as an array structure. The default table size is 512. Since the CDT and the CRT are global tables that may be updated by many Notifiers at a time, they are synchronized. The format of the CDT is shown in Table 6.5.

Field:  Object Identifier | Object Name | # of Dependent Classes | Dependent Classes
Type:   int               | String      | int                    | int Vector

Table 6.5: The format of the CDT.

Each MaROS object consists of at least one class file. The CRT contains the names

and the number of occurrences of transferred files at the MSP. The CDT contains the

number of these files and their references to the CRT. The Dependent Classes field is

a Vector structure that contains the indices of these files in the CRT. All of the vector

components are in integer format.

There are a number of methods for the table management:

• insert (int OID, String ObjectName, int DepClassNo): This method inserts a

new element into the table. The dependent classes are inserted by the insertClass()

method.

• insertClass (int index, String ClassName): It inserts the dependent class file

information into the dependent classes vector field. This information contains the

CRT index and the name of the class file.

• delete (int OID): This method removes an entry from the CDT. It also updates

the CRT entries.

• getClassName (int index, int classIndex): It is used for obtaining the file names

from the CDT. This method is mainly used by the Notifiers for the full object

transfer operation.

• printCDT(): It is used for debugging purposes.

6.4.2.2 Notify.CDT.CRT Class

The Class Replica Table (CRT) is used by the CDT. The CRT has been implemented as an array structure. Its default size is 16384 (4000 in hexadecimal format). A hashtable-based implementation is planned for the future. The format of the table is shown in Table 6.6. This table simply keeps the number of occurrences of each transferred class file in the MSP.

Field:  Class Names | How Many Copies
Type:   String      | int

Table 6.6: The format of the CRT.

There are several methods for maintaining the table:

• insert (String ClassName): It inserts a new class name into the table. If that class

name already exists, the How Many Copies field is incremented by 1.

• delete (int index): This method decrements the How Many Copies field by 1. If

its value becomes 0, that record is deleted, and its corresponding class file is

removed from the system.

• getClassName (int index): It returns the name of the class file at a given index.

• printCRT(): It is used for debugging purposes.
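The interplay of the CDT and the CRT amounts to reference counting of class files. The sketch below illustrates that idea only. It is not the array-based implementation described above: hash maps replace the arrays and the index-based references, and the names ClassReplicaSketch and copiesOf are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ClassReplicaSketch {
    // CRT role: class file name -> number of objects at the MSP that use it.
    private final Map<String, Integer> copies = new HashMap<>();
    // CDT role: object identifier -> names of the class files the object depends on.
    private final Map<Integer, List<String>> dependencies = new HashMap<>();

    // Registers an object and returns the class files that must actually be transferred.
    synchronized List<String> insert(int oid, List<String> classNames) {
        List<String> toTransfer = new ArrayList<>();
        for (String name : classNames) {
            int count = copies.merge(name, 1, Integer::sum);
            if (count == 1) toTransfer.add(name);   // first use: no copy at the MSP yet
        }
        dependencies.put(oid, new ArrayList<>(classNames));
        return toTransfer;
    }

    // Removes an object and returns the class files that are no longer needed.
    synchronized List<String> delete(int oid) {
        List<String> removable = new ArrayList<>();
        List<String> names = dependencies.remove(oid);
        if (names == null) return removable;        // unknown object: nothing to do
        for (String name : names) {
            int count = copies.merge(name, -1, Integer::sum);
            if (count == 0) {
                copies.remove(name);                // last user gone: drop the record
                removable.add(name);
            }
        }
        return removable;
    }

    synchronized int copiesOf(String name) {
        return copies.getOrDefault(name, 0);
    }
}
```

With two objects that share one class file, the second insert() reports only the unshared file for transfer, and the shared file becomes removable only after both objects have been deleted.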

6.4.3 Recovery Package

The Recovery package has been designed and implemented to increase the reliability of the mobile hosts. Therefore, this package has only an MH version. There are two classes in the Recovery package: the RecoveryAgent and the RecoveryTable classes. Since there is no signal handling in Java, an alternative approach has been designed. This approach uses a mutually exclusive, shared global table: the Recovery Table. Chapter 5 contains the details of the design of this table. In this section, the implementation of the table is explained.

6.4.3.1 RecoveryTable Class

The Recovery Table is used by the Recovery Agent and all recoverable MaROS objects. It is the signal handling backbone of MaROS. A recoverable object may be signalled by setting its Shutdown Signal (SS) bit to 1 in the Recovery Table. Moreover, a recoverable object may wait for another recoverable object and then continue its work by using special Recovery Table methods. The Recovery Table is a global table, so access to it must be mutually exclusive; all of its methods are synchronized. The format of the table is shown in Table 6.7. The table is a hashtable whose keys are the OIDs of the objects. All hashtable elements are vectors, and each element of a vector is stored in Object format.

Field:  Shutdown Signal | Object Identifier (Hash Key) | OID of SubObject #1 | OID of SubObject #2 | …
Type:   Object          | Object                       | Object              | Object              | …

Table 6.7: The format of the Recovery Table

The RecoveryTable class methods are explained below:

• Insert (Object key): This method inserts the information of a new recoverable object into the RT. In all of the RT methods, key is the object identifier, and Pkey is the object identifier of the parent object.

• Insert_SubObjectID (Object Pkey, Object key): It is used by the Recovery

Agent for recovering the table at the startup.

• Insert_SubObject (Object Pkey, Object key): This method inserts a new

element into RT, and updates its parent record adding the key of the subobject.

• Shutdown_Signal (Object key): It returns either 0 or 1. If this method returns a

1, that means the object with the given key should start its recovery procedure.

• Signal_Object (Object key): An object may signal another object by using this

method. It simply sets the Shutdown Signal (SS) field of the given object to 1.

• Wait_Object (Object key): In the recovery hierarchy, an object may have to wait for another object (e.g. one of its subobjects) before it can go on with its own recovery procedure. This method blocks the calling object until the object with the given key finishes its recovery.

• Signal_All_SubObjects (Object Pkey): This method is the enhanced version of

the Signal_Object() method. The parameter Pkey is the object identifier of the

object with one or many subobjects. It simply calls Signal_Object() method for all

of the subobjects.

• Wait_All_SubObjects (Object Pkey): This method is the enhanced version of the Wait_Object() method. The object whose object identifier equals Pkey waits until all of its subobjects finish their recovery.

• clearShutdownSignal (Object key): It clears the Shutdown Signal (SS) field of

the given object in the Recovery Table.

• mutateShutdownSignal (Object key): This method sets the Shutdown Signal

field of the given object to 2. It is used by the terminating objects as a last call.

The methods Wait_Object() and Wait_All_SubObjects() check the SS field of the

object(s). These objects are removed from the RT by the garbage collector.

• deleteObject (Object Pkey, Object key): This method removes the object whose object identifier equals key. It also removes its entry from the record of its parent object.
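The signalling and waiting methods listed above can be sketched with Java monitors. The sketch rests on one assumption that the text does not state explicitly: a waiting parent is released once the Shutdown Signal field of the subobject reaches the value 2 set by mutateShutdownSignal(). The class name RecoveryTableSketch, the int keys and the demo() helper are hypothetical, and generic collections replace the Hashtable of Vectors.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RecoveryTableSketch {
    static final int RUNNING = 0, SHUTDOWN = 1, FINISHED = 2;   // SS field values

    private final Map<Integer, Integer> signal = new HashMap<>();
    private final Map<Integer, List<Integer>> subObjects = new HashMap<>();

    synchronized void insert(int key) {
        signal.put(key, RUNNING);
        subObjects.put(key, new ArrayList<>());
    }

    synchronized void insertSubObject(int parentKey, int key) {
        insert(key);
        subObjects.get(parentKey).add(key);
    }

    // Returns 1 if the object should start its recovery procedure, otherwise 0.
    synchronized int shutdownSignal(int key) {
        return signal.getOrDefault(key, RUNNING) == SHUTDOWN ? 1 : 0;
    }

    synchronized void signalObject(int key) {
        signal.put(key, SHUTDOWN);
        notifyAll();
    }

    synchronized void signalAllSubObjects(int parentKey) {
        for (int child : subObjects.get(parentKey)) signalObject(child);
    }

    // "Last call" of a terminating object: marks it finished and wakes the waiters.
    synchronized void mutateShutdownSignal(int key) {
        signal.put(key, FINISHED);
        notifyAll();
    }

    synchronized void waitObject(int key) throws InterruptedException {
        while (signal.getOrDefault(key, FINISHED) != FINISHED) wait();
    }

    synchronized void waitAllSubObjects(int parentKey) throws InterruptedException {
        // Copy the list, because wait() releases the monitor while blocked.
        for (int child : new ArrayList<>(subObjects.get(parentKey))) waitObject(child);
    }

    // Demo: a parent signals two subobjects; each polls its signal and then finishes.
    static boolean demo() {
        RecoveryTableSketch rt = new RecoveryTableSketch();
        rt.insert(1);
        rt.insertSubObject(1, 2);
        rt.insertSubObject(1, 3);
        for (int child : new int[] {2, 3}) {
            new Thread(() -> {
                while (rt.shutdownSignal(child) == 0) Thread.yield();   // periodic check
                rt.mutateShutdownSignal(child);
            }).start();
        }
        rt.signalAllSubObjects(1);
        try {
            rt.waitAllSubObjects(1);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The subobject threads poll their own field, as the text requires of recoverable objects, while the parent blocks in wait() instead of polling.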

6.4.3.2 RecoveryAgent Class

The RecoveryAgent class is responsible for coordinating the recovery process. It is one of the main MaROS system agents. It manages the Recovery Table (RT) and enables the system and the user objects to use the available Recovery Table methods.

Another task of the Recovery Agent is the recovery of the Recovery Table itself. When system recovery is in progress during system startup, the Recovery Agent restores the Recovery Table.

The RecoveryAgent class exposes the RecoveryTable class methods, which have already been explained in the previous section. The Recovery Agent also has additional methods for the recovery of the Recovery Table. These methods are explained below:

• Signal (): This method is the controller of the shutdown process. It is triggered by

a shutdown request. Then, it initiates the recovery process.

• saveImage(): It is used for saving the image of the Recovery Table.

6.4.3.3 Recoverable Object Implementation

The current design and implementation of the recovery process in MaROS does not provide a user-transparent interface. In order to make an object recoverable, the programmer should complete the following steps:

• Shutdown Specific Steps:

• The determination of the code states: MaROS code may be divided into several pieces, which are called code states. A code state change may occur when a program sends, receives or updates data. Figure 6.1 displays a piece of code before and after the addition of the code states.

BEFORE:

    String tmpstr = strarr.substring (5);
    strarr=tmpstr;
    lngth = lngth - 5;
    strarr=strarr.substring(0,lngth)+'\0';
    // Get Virtual Port Number from VPM
    try {
        dummy = new TCpClient ();
    }catch(MaROS.net.VPortException vpe) {
        // exception handling
    }
    PortNumber = dummy.reservePort();
    PortKey = dummy.getKey();

AFTER:

    String tmpstr = strarr.substring (5);
    strarr=tmpstr;
    lngth = lngth - 5;
    strarr=strarr.substring(0,lngth)+'\0';
    // ##################################
    // MESSAGE RECEIVED
    // E N T E R I N G S T A T E 1
    STATE = 1;
    // NECESSARY ROLLBACK DATA:
    // - NIT
    // - lngth
    // - strarr (<- OMMH)
    // - HandlerPort
    // Check Shutdown Signal
    if (check_ShutdownSignal() == 1)
        return;
    // ##################################
    // Get Virtual Port Number from VPM
    try {
        dummy = new TCpClient ();
    }catch(MaROS.net.VPortException vpe){
        // exception handling
    }
    PortNumber = dummy.reservePort();
    PortKey = dummy.getKey();

Figure 6.1: The example piece of code showing the addition of code states.

• The addition of the signal checker and the image file creator: Each recoverable object has a record in the Recovery Table. Since there is no signal handling backbone in Java, the objects should periodically check their Shutdown Signal field in the Recovery Table. This check may be done at the code state transitions. At each transition, a method may be called to check the Shutdown Signal field of the recoverable object. By convention, this method is named check_ShutdownSignal(). Figure 6.1 and Figure 6.2 show the use and the implementation of this method, respectively.

    // This method check shutdown signal for this object.
    // It returns 0, if everything is usual
    // Otherwise, it returns 1 indicating shutdown signal has reached
    public static int check_ShutdownSignal() {
        if (RecoveryAgent.Shutdown_Signal(new Integer(MaROSobject.currentObject().getOID() ) ) == 1) {
            saveImage();
            RecoveryAgent.clearShutdownSignal(new Integer (MaROSobject.currentObject().getOID()) );
            return (1);
        }
        return (0);
    } // check_ShutdownSignal()

Figure 6.2: The implementation of check_ShutdownSignal() method.

The next task is to write the image file creator. By convention, the saveImage() method is used for this purpose. This method creates a random access file in the image directory, named after the object identifier of the recoverable object. Then, it saves the necessary tables and variables one by one. The format of the image file is left to the programmer; however, it is customary to place a "^" character between the fields.

    // This method takes image of the current object instance to disk
    // including tables, etc.
    public static void saveImage() {
        // Signal all subobjects
        RecoveryAgent.Signal_All_SubObjects (new Integer(MaROSobject.currentObject().getOID() ));
        // All subobjects signalled
        // Now wait them to finish
        RecoveryAgent.Wait_All_SubObjects (new Integer (MaROSobject.currentObject().getOID() ));
        // All subobjects finished their recovery job
        // Save Image File
        RandomAccessFile imagefile;
        String imagefilename = null;
        // Create file
        imagefilename = SysConst.TempRecoveryPATH+MaROSobject.currentObject().getOID();
        try {
            imagefile = new RandomAccessFile (imagefilename,"rw");
        }catch (IOException ioe){
            // error handling
            return;
        }
        // STATE
        try {
            imagefile.writeByte (STATE);
            imagefile.writeBytes ("^");
        } catch (IOException ioe) {
            // Error handling: Unable to write image file
        }
        // Write data into file
        if (STATE >= 0) {
            // NIT
            try {
                int i;
                int tmax = NIT_instance.get_tmax();
                imagefile.writeInt(tmax);
                for (i=1; i<=tmax; i++) {
                    imagefile.writeInt (NIT_instance.get_Notifier_OID (i));
                    imagefile.writeBytes ("^");
                    imagefile.writeInt (NIT_instance.get_ObjectID (i));
                    imagefile.writeBytes ("^");
                    imagefile.writeChar(NIT_instance.get_Type (i));
                    imagefile.writeBytes ("^");
                }
            } catch (IOException ioe) {
                // Error handling: Unable to write NIT to image file
            }
            // NIT written
        }
        if (STATE >= 1) {
            <CODE CONTINUES>

Figure 6.3: An example code part of the saveImage() method.

• Startup Specific Steps:

• The addition of the image file reader: The code of each recoverable object starts by checking for its image file. If the object has an image file, the file should be read and the last state has to be restored. An example image file reader is shown in Figure 6.4.

    RandomAccessFile imagefile;
    String imagefilename = null;
    byte recovery_state = (byte) 255;
    // Image file name
    imagefilename = SysConst.TempRecoveryPATH+MaROSobject.currentObject().getOID();
    try {
        int mx=0; // Maximum size of the NIT
        int i;
        int _NOID, _OID;
        char _Type;
        imagefile = new RandomAccessFile (imagefilename,"r");
        // There is an image file
        // Get STATE First
        try {
            recovery_state = imagefile.readByte();
            imagefile.readByte();
        } catch (IOException ioe){
            // Error Handling: Unable to read image file
        }
        if (recovery_state >= 0) {
            // recover NIT
            mx = (imagefile.readInt ());
            NIT_instance.set_tmax (mx);
            for (i=1;i<=mx;i++) {
                _NOID = imagefile.readInt ();
                imagefile.readByte();
                _OID = imagefile.readInt ();
                imagefile.readByte();
                _Type = imagefile.readChar();
                imagefile.readByte();
                NotificationAgent.NIT_instance.insert (_NOID, _OID, _Type);
            }
        }
        if (recovery_state >= 1) {
            <CODE CONTINUES>

Figure 6.4: An example image file reader code piece.

• The replication of the original code for each state: In the final step of the recovery, the original code should be replicated for each state. The replicated code at each state enables MaROS objects to continue their execution from that state. An alternative would be to use labels and unconditional jumps; however, Java has no goto statement, and its labels can be used only with break and continue. Figure 5.7 shows the code replication process in detail.
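The shutdown and startup steps above meet in the image file: whatever saveImage() writes, the image file reader must read back in the same order, including the "^" separators. The following sketch shows such a round trip for a state byte and a small table. The names ImageFileSketch, Row and Image are hypothetical, a temporary file replaces the MaROS image directory, and the rows are indexed from 0 rather than from 1.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

final class ImageFileSketch {
    // One table entry, mirroring the three fields of the MH version of the NIT.
    static final class Row {
        final int noid, oid;
        final char type;
        Row(int noid, int oid, char type) { this.noid = noid; this.oid = oid; this.type = type; }
    }

    static final class Image {
        final byte state;
        final List<Row> rows;
        Image(byte state, List<Row> rows) { this.state = state; this.rows = rows; }
    }

    // Writes the state byte and, for state >= 0, the table rows, "^"-separated.
    static void save(File file, byte state, List<Row> rows) throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
            out.setLength(0);                        // discard any older image
            out.writeByte(state);
            out.writeBytes("^");
            if (state >= 0) {
                out.writeInt(rows.size());
                for (Row r : rows) {
                    out.writeInt(r.noid);  out.writeBytes("^");
                    out.writeInt(r.oid);   out.writeBytes("^");
                    out.writeChar(r.type); out.writeBytes("^");
                }
            }
        }
    }

    // Reads the fields back in exactly the order in which save() wrote them.
    static Image load(File file) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(file, "r")) {
            byte state = in.readByte();
            in.readByte();                           // skip the "^" separator
            List<Row> rows = new ArrayList<>();
            if (state >= 0) {
                int count = in.readInt();
                for (int i = 0; i < count; i++) {
                    int noid = in.readInt();   in.readByte();
                    int oid = in.readInt();    in.readByte();
                    char type = in.readChar(); in.readByte();
                    rows.add(new Row(noid, oid, type));
                }
            }
            return new Image(state, rows);
        }
    }

    // Round trip through a temporary file; returns true if the data survives.
    static boolean roundTrip() {
        try {
            File file = File.createTempFile("image", ".tmp");
            file.deleteOnExit();
            List<Row> rows = new ArrayList<>();
            rows.add(new Row(7, 42, 'C'));
            save(file, (byte) 1, rows);
            Image image = load(file);
            Row r = image.rows.get(0);
            return image.state == 1 && image.rows.size() == 1
                    && r.noid == 7 && r.oid == 42 && r.type == 'C';
        } catch (IOException e) {
            return false;
        }
    }
}
```

Since the separator carries no information for fixed-width binary fields, the reader simply skips one byte after each field; any mismatch between the write order and the read order silently corrupts the restored state.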

Evaluation and Future Work

7.1 Introduction

This chapter presents the results and the evaluation of the performance tests for the notification module. Moreover, the future research areas for the system are discussed at the end of the chapter.

7.2 Performance Evaluation

The testing platform consists of two computers: one for the MaROS client and the other for the MaROS server. A Pentium 166MMX machine running the Turkuaz GNU/Linux 0.99 operating system has been set up as the MSP, and a Pentium 200 machine running the Windows 95 operating system has been set up as the MaROS client. In the tests, two types of object transfer (full transfer and no transfer) and the object deletion process have been tested on a 100 Mbit Ethernet connection and on 115200 bit/sec., 24000 bit/sec. and 19200 bit/sec. modem connections. In all the tests, the machines have minimal CPU load, and there is minimal network traffic. Both machines run Java 1.1.6.

7.2.1 Full Transfer Tests

NotifierMH reads the files into a buffer and then sends them to NotifierMSP. The default buffer size is 4096 bytes (4K). The NotifierMSP reconstructs the files by assembling the incoming packets. The default buffer size may be increased or decreased by changing the SysConst.DEF_BUFF_SIZE system constant. In the full transfer tests, the effect of different buffer sizes on the transfer speed has been tested. Since TCp uses an 8K packet size, the tests were run for buffer sizes of 2K, 4K, 6K and 8000 bytes (an 8K buffer is not allowed, since an 8K TCp packet also contains a header). A MaROS object with a size of approximately 500K has been transferred throughout the tests. The results are shown in Table 7.1. The timer is started before the first packet is sent from the NotifierMH to the NotifierMSP, and stopped right after the HandlerMH receives the notification result.
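The role of the buffer size in a full transfer can be illustrated with a chunked copy loop. This is a sketch, not the NotifierMH code: plain java.io streams stand in for the MaROS connection, BufferedTransferSketch and chunksFor are hypothetical names, and only DEF_BUFF_SIZE echoes the constant named in the text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class BufferedTransferSketch {
    static final int DEF_BUFF_SIZE = 4096;   // default buffer size stated in the text

    // Copies everything from in to out in chunks of at most bufferSize bytes.
    // Returns the number of chunks written.
    static int transfer(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        int chunks = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);          // the last chunk may be shorter
            chunks++;
        }
        out.flush();
        return chunks;
    }

    // In-memory demonstration: chunk count for a payload of the given size,
    // or -1 if the copy failed or the sizes do not match.
    static int chunksFor(int payloadSize, int bufferSize) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            int chunks = transfer(new ByteArrayInputStream(new byte[payloadSize]), sink, bufferSize);
            return sink.size() == payloadSize ? chunks : -1;
        } catch (IOException e) {
            return -1;
        }
    }
}
```

For the 505319-byte test object, a 2048-byte buffer needs 247 chunks while an 8000-byte buffer needs 64, so a larger buffer reduces the number of send operations roughly in proportion to its size.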

Buffer Size vs. Transfer Time (ms.) | 100 Mbit | 115200 bit | 19200 bit
2K (2048 bytes) buffer size         | 60040    | 259910     | 432083
8000 bytes buffer size              | 18837    | 186070     | 351723

Table 7.1: Transfer results of 505319 bytes object.
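The transfer times in Table 7.1 can be converted into average transfer speeds by dividing the object size by the elapsed time. The short program below performs this arithmetic for the values listed in the table; the class name ThroughputSketch is hypothetical, and the speeds are expressed in bytes per second as an assumption about the unit of the speed graphs.

```java
final class ThroughputSketch {
    static final long OBJECT_BYTES = 505319;   // object size from Table 7.1

    // Average transfer speed in bytes per second for a time given in milliseconds.
    static double bytesPerSecond(long millis) {
        return OBJECT_BYTES * 1000.0 / millis;
    }

    // Speed-up obtained by moving from the slower to the faster measurement.
    static double speedUp(long slowMillis, long fastMillis) {
        return (double) slowMillis / fastMillis;
    }

    public static void main(String[] args) {
        String[] links = { "100 Mbit", "115200 bit", "19200 bit" };
        long[] with2K = { 60040, 259910, 432083 };      // 2K buffer, ms
        long[] with8000 = { 18837, 186070, 351723 };    // 8000-byte buffer, ms
        for (int i = 0; i < links.length; i++) {
            System.out.printf("%-10s  2K: %8.0f B/s  8000: %8.0f B/s  speed-up: %.2f%n",
                    links[i], bytesPerSecond(with2K[i]), bytesPerSecond(with8000[i]),
                    speedUp(with2K[i], with8000[i]));
        }
    }
}
```

The speed-up from the larger buffer is roughly 3.2 on the Ethernet link, but only about 1.4 and 1.2 on the two modem links, which is the arithmetic behind the barrier effect discussed in this section.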

The graphs of transfer time vs. buffer size are shown in Figure 7.1 through Figure 7.3 for the 100 Mbit Ethernet and the modem tests. These figures show that the performance of the 100 Mbit Ethernet connection increases when the buffer size is increased. On the other hand, the modem tests show that there is a barrier value for the buffer size (Figure 7.3). There is no performance increase in the full transfer operation when this barrier is exceeded. Moreover, the use of larger buffer sizes may drastically decrease the performance as a side effect, since large buffers have to be segmented for TCp encapsulation. To summarize, there is no speed-up when using buffer size values larger than the communication bandwidth of the mobile host.

Figure 7.1: Transfer time vs. buffer size graph for 100 Mbit tests

Figure 7.2: Transfer time vs. buffer size graph for 115200 bit tests

Figure 7.3: Transfer time vs. buffer size graph for 19200 bit tests

The transfer speed vs. buffer size graphs below clearly depict the effect of different buffer size values on the full transfer operation. The transfer speed increases when the buffer size is increased. However, there is a barrier for the buffer size, as seen in Figure 7.6. This barrier value is directly affected by the network bandwidth. For example, the 19200 bit/sec. modem connection provides a maximum of about 6 Kbit/sec. network bandwidth with compression, and this is the barrier for the buffer size value.

Figure 7.4: Transfer speed vs. buffer size graph for 100 Mbit tests

Figure 7.5: Transfer speed vs. buffer size graph for 115200 bit tests

Figure 7.6: Transfer speed vs. buffer size graph for 19200 bit tests

7.2.2 No Need Type Object Transfer and Object Deletion Tests

A No Need Type object transfer is an object transfer operation without a full transfer or a partial transfer. The tests were performed on the same test platform as the full transfer tests. The timer is started right after the notification request is received by the Notification Agent, and it is stopped right after the HandlerMH receives the notification result. In the current code, a five-second synchronization delay is included in these test results. Since there is no object transfer, the buffer size has no effect on the test results. Table 7.2 displays the results of the tests. The modem tests required approximately three more seconds to finish the operation than the 100 Mbit tests. The table also contains the results of the object deletion tests. There is not much time difference between the two kinds of test. These tests show that the bandwidth of the TCp connection has little effect on the No Need Type object transfer and the object deletion operations.

Test                                  | Run #1 (ms.) | Run #2 (ms.) | Run #3 (ms.) | Average (ms.)
24 Kbit modem (No Need Transfer)      | 12300        | 11810        | 12410        | 12173
24 Kbit modem (Object Deletion)       | 11860        | 12300        | 12300        | 12153
100 Mbit ethernet (No Need Transfer)  | 10050        | 9410         | 9120         | 9527
100 Mbit ethernet (Object Deletion)   | 9060         | 10110        | 9170         | 9447

Table 7.2: The results of the No Need type object transfer tests.

7.3 Future Work

This section presents the future research areas related to the Notification and Recovery modules of MaROS. Furthermore, the future work planned for MaROS is discussed at the end of the section.

7.3.1 Future Work on the Notification Module

The Notification Module deals with the transfer and the deletion operations on the relocatable objects. The performance tests have shown that there is no considerable speed-up in the object transfer operation when the buffer size exceeds the communication bandwidth. Since mobile hosts may connect to the MSP at different connection speeds, dynamic buffer size values may be used in order to optimize the transfer operation for each connection.

Object compression is another useful approach for an optimal transfer process.

MaROS objects can be compressed and then be transferred to the MSP. When the

MSP receives the compressed object data, it may decompress and create the

relocatable MaROS object. This process requires additional object compression time;

however, the transfer speed will be improved considerably.
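The compress, transfer and decompress sequence proposed above could be sketched with the standard java.util.zip classes. This is only an illustration of the idea: the choice of GZIP, the class name CompressionSketch and the helper names are assumptions, and the actual gain would depend on how compressible real MaROS objects are.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class CompressionSketch {
    // Compresses the object data on the mobile host side before the transfer.
    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(data);
        }
        return bytes.toByteArray();
    }

    // Restores the original object data on the MSP side after the transfer.
    static byte[] decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) out.write(buffer, 0, n);
            return out.toByteArray();
        }
    }

    // Returns true if the data survives a compress/decompress round trip.
    static boolean roundTrip(byte[] data) {
        try {
            return Arrays.equals(data, decompress(compress(data)));
        } catch (IOException e) {
            return false;
        }
    }

    // Size of the compressed form in bytes, or -1 on failure.
    static int compressedSize(byte[] data) {
        try {
            return compress(data).length;
        } catch (IOException e) {
            return -1;
        }
    }
}
```

Comparing compressedSize() with the original length for representative class files would indicate whether the extra compression time is repaid on a given connection.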

The object deletion process may be modified by adding a new server machine next to the MSP. This machine may be called the MaROS Recycle Bin (MRB), and all the objects that are to be deleted may be moved to the MRB instead of being deleted from the MSP automatically. This approach does not increase the object deletion time very much, since there will be a very fast network connection between the MSP and the MRB.

7.3.2 Future Work on the Recovery Module

Currently, the system is vulnerable to failures such as system lockups and hardware problems. In order to overcome these problems, the Recovery Module should be completely redesigned. Since Java does not provide signal handling primitives, the implementation language may need to be changed. However, this would harm the portability of MaROS.

In the future, the Recovery Module may be made at least semi-transparent to the programmers by providing a programming interface. In the current version of the Recovery Module, a MaROS programmer has to know almost everything about the module in order to write recoverable applications.

7.3.3 Future Work on MaROS

There are many research areas that have not been designed and implemented in the current version of MaROS. Some of these areas are system security, heavy-weight migration, and load balancing on multiple MSPs.

System security is one of the most important issues in a system like MaROS, since there may be many unauthorized attempts to access the system. There is a host registration and authentication protocol; however, in the future, the design of a new agent (Security Agent) should be considered.

In the current design and implementation, the Migration Agent only deals with light-weight object migration. In the future, the migration of running MaROS objects may be implemented. This type of migration may become possible as the bandwidth of wireless connections increases.

Another possible enhancement that may be implemented in the future is the use of

multiple MSPs. The current design may be extended to a distributed system of MSPs

connected via high-speed networks. In this case, MaROS may be optimized by using

techniques such as load balancing, and parallel processing.

In order to increase the system performance, the MaROS threads may communicate using shared memory instead of the MaROS communication primitives. However, all the possible problems, such as starvation and deadlock of the objects, would have to be dealt with in that case.

Finally, the implementation language may be changed in order to increase the overall system performance. However, in this case, the system should be redesigned considering the advantages and disadvantages of the new implementation language. C++ seems to be the ideal alternative. In order to keep the portability feature of MaROS, the Java-based MaROS objects may continue to be used.

Conclusion

The Mobile and Relocatable Object System (MaROS) is an application development

platform especially designed to minimize the problems that arise from the limitations

of mobile computers. The system supports disconnected operations, object relocation,

and recovery of MaROS clients. In this dissertation, the design and the

implementation of the Notification and the Recovery modules have been presented.

The transfer of relocatable objects is initiated automatically by the system. A copy of the relocatable object is created on the MSP site while the object is being created on the mobile host. This process is called notification. Notification simplifies and speeds up the object relocation process. The Notification Agent and the other system agents use worker threads to keep response times for requests short.
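
The role of worker threads in the Notification Agent can be illustrated with a minimal Java sketch. It is not the MaROS implementation: the class NotificationSketch, the map mspCopies, and the methods handleNotification and dispatch are hypothetical names, and a byte array stands in for whatever representation of a relocatable object is actually transferred. The sketch assumes only that each notification request can be served independently, so that a pool of workers can store the MSP-side copies concurrently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class NotificationSketch {
    // Hypothetical stand-in for the object store on the MSP site.
    static final Map<String, byte[]> mspCopies = new ConcurrentHashMap<>();

    // Serves one notification request: keep a copy of the newly created object.
    static void handleNotification(String objectName, byte[] objectImage) {
        mspCopies.put(objectName, objectImage.clone());
    }

    // Hands each request to a worker thread, waits for all of them to finish,
    // and returns the number of copies now held on the MSP side.
    static int dispatch(Map<String, byte[]> created, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (Map.Entry<String, byte[]> e : created.entrySet()) {
            pool.submit(() -> handleNotification(e.getKey(), e.getValue()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", ex);
        }
        return mspCopies.size();
    }

    public static void main(String[] args) {
        // Two objects "created on the mobile host" (illustrative data only).
        Map<String, byte[]> created =
                Map.of("Editor", new byte[] {1, 2}, "Mailer", new byte[] {3});
        System.out.println(dispatch(created, 2) + " copies held on the MSP");
    }
}
```

The fixed-size pool is a design choice of this sketch: it bounds how many requests are served at once. Whether the MaROS agents bound their worker threads in the same way is an assumption.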

System recovery is one of the most important issues in a system like MaROS. In the current design, all recoverable objects can be recovered after voluntary shutdowns. However, the Recovery module is not transparent to programmers: currently, a MaROS programmer must follow a well-defined procedure to code recoverable objects.

The recovery process is hierarchical and decentralized. The Recovery Agent only coordinates the process by signalling the system objects in a hierarchical order. The process is not centralized, since all the system agents and recoverable objects are responsible for their own recovery.
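
The division of labour described above can be sketched in Java as follows. This is an illustration under stated assumptions, not the MaROS code: the Recoverable interface, the Agent class, the coordinate method, and the particular hierarchy built in main are all hypothetical. The point of the sketch is that the coordinator only fixes the order of the signals, while every agent and object performs its own recovery and passes the signal on to the objects it owns.

```java
import java.util.ArrayList;
import java.util.List;

public class RecoverySketch {
    // Hypothetical contract: every recoverable component restores its own state.
    interface Recoverable {
        void recover(List<String> log);
    }

    // A system agent or object that recovers itself, then signals what it owns.
    static final class Agent implements Recoverable {
        private final String name;
        private final List<Recoverable> children = new ArrayList<>();

        Agent(String name) {
            this.name = name;
        }

        Agent add(Recoverable child) {
            children.add(child);
            return this;
        }

        @Override
        public void recover(List<String> log) {
            log.add(name); // the component's own recovery would happen here
            for (Recoverable child : children) {
                child.recover(log); // then the signal is passed down the hierarchy
            }
        }
    }

    // The coordinator decides only the order; it restores no state itself.
    static List<String> coordinate(List<Recoverable> topLevel) {
        List<String> log = new ArrayList<>();
        for (Recoverable r : topLevel) {
            r.recover(log);
        }
        return log;
    }

    public static void main(String[] args) {
        // Illustrative hierarchy; the real ordering of MaROS agents is not shown here.
        Agent notification = new Agent("NotificationAgent").add(new Agent("userObject1"));
        Agent migration = new Agent("MigrationAgent");
        System.out.println(coordinate(List.of(notification, migration)));
        // prints [NotificationAgent, userObject1, MigrationAgent]
    }
}
```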

The current design of the Recovery module does not cover failure recovery, that is, recovery from hardware and OS failures. To deal with that type of recovery, a checkpointing approach should be designed and implemented. Java does not provide low-level primitives for accessing system resources directly; therefore, another programming language may be chosen for the implementation in the future.

Mobile computing is the technology of the future. Many current research projects address mobile computing platforms; their general aim is to improve the performance and functionality of mobile computers. MaROS is one of them, and it aims to provide an application development platform designed especially for mobile computers.

References

[1] M. Faiz, A. Zaslavsky, B. Srinivasan. Revising Replication Strategies for Mobile

Computing Environments

[2] Ramon Caceres, Liviu Iftode. Improving the Performance of Reliable Transport

Protocols in Mobile Computing Environments, Proceedings of the IEEE, special issue

on Mobile Computing Networks, 1994.

[3] Şebnem Baydere et al. MaROS: A Framework For Mobile Application

Development, EURO-PDS'97 International Conference on Distributed and Parallel

Systems, June'97, Barcelona, Spain http://www.yeditepe.edu.tr/MaROS/paper1.ps.Z

[4] Jeppe D. Nielsen. Transactions in Mobile Computing,1995.

[5] Anthony D. Joseph, Alan F. deLespinasse, Joshua A. Tauber, David K. Gifford

and M. Frans Kaashoek, Rover: A Toolkit for Mobile Information Access, Proceedings

of the Fifteenth Symposium on Operating Systems Principles, December 1995.

[6] A.D. Birrell and B.J. Nelson, Implementing Remote Procedure Calls, ACM Trans.

Comp. Syst., 2(1):39-59, Feb. 1984.

[7] Anthony D. Joseph, M. Frans Kaashoek, Building Reliable Mobile-Aware

Applications Using the Rover Toolkit, Wireless Networks Magazine, Vol.3 (1997),

No. 5, October 1997.

[8] Toshio Shirakihara, Hideaki Hirayama, Kiyoko Sato and Tatsunori Kanai,

ARTEMIS: Advanced Reliable disTributed Environment Middleware System,

Proceedings of the International Conference on Parallel and Distributed Processing

Techniques and Applications, July 97.

[9] Andrzej Goscinski, Distributed Operating Systems The Logical Design, 1992,

Addison-Wesley Publishing Company

[10] Ören, T.I., Software Agents: Basic Concepts and Internet Applications,

Bilisim’96, Bildiriler96, 1996.

[11] Wreggit, D.J., Software Agents Using Java, Distributed Processing, 1995.

[12] Yıldız, M.C., Object Naming and Creation in a Mobile System, MSc. thesis,

Yeditepe University,1998.

[13] Demir, O., Object Relocation in a Mobile Computing Environment, MSc. thesis,

Yeditepe University, 1998.

[14] Devlet, G., A Communications Infrastructure for Disconnected Operations in a

Client/Server Computing Environment, MSc. Thesis, Yeditepe University, 1998.

Bibliography

[1] Naughton P., Schmidt H., Java: The Complete Reference, Osborne McGraw-Hill.

[2] Brian N. Bershad and Henry M. Levy. A remote computation facility for a

heterogeneous environment, Computer, 21(5): 50-60, May 1988.

[3] Bruce Walker, Gerald Popek, Robert English, Charles Cline and Greg Thiel, The

LOCUS Distributed Operating System, In Proceedings of the Ninth ACM Symposium

on Operating System Principles, pages 49-70, October 1983.

[4] P. Stanski, An Integrating Architecture for Distributed and Persistent Mobile

Software Agents, PESOS Technical Report, Monash University Department of

Computer Technology, Australia, 1997.

[5] Theimer M.M., Lantz K.A. and Cheriton D.R., Preemptable Remote Execution

Facilities for the V System, In Proceedings of the Tenth ACM Symposium on

Operating Systems Principles, Orcas Island, Washington pp.2-12, 1985.

[6] V. Koudounas, Why Mobile Computing? Where can It be Used?,

http://www-dse.doc.ic.ac.uk/~nd/surprise_96/journal/vol1/vk5/article1.html

[7] J.F. Bartlett, W4 – the Wireless World Wide Web, In Proceedings of IEEE

Workshop on Mobile Computing Systems and Applications, December 1994.

[8] T.F. La Porta, K.K. Sabnani and R.D. Gitlin, Challenges for nomadic computing:

Mobility management and wireless communications, Mobile Networks and

Applications 1(1), 1996.

[9] Object Management Group, Corba Services: Common Object Services

Specification, revised edition, 95-3-31, March 1995.

[10] Object Management Group, The Common Object Request Broker Architecture

and Specification 2.0, July 1995.

[11] G.M. Voelker and B.N. Bershad, Mobisaic: An Information system for a mobile

wireless computing environment, In Proceedings of IEEE Workshop on Mobile

Computing Systems and Applications, December 1994.

[12] R. Want et al., An overview of the ParcTab ubiquitous computing environment,

IEEE Personal Communications Magazine, 2(6), December 1995.

[13] T.F. La Porta et al., Experiences with network-based user agents for mobile

applications, Mobile Networks and Applications, Vol.3 pp.123-141, August 1998.

[14] N. Davies et al., L2imbo: A distributed systems platform for mobile computing,

Mobile Networks and Applications, Vol.3 pp.143-156, August 1998.

[15] W.N. Schilit, A system architecture for context-aware mobile computing, PhD.

Thesis, Department of Computer Science, Columbia University, New York, 1995.

[16] B.D. Noble, M. Price and M. Satyanarayanan, A programming interface for

application-aware adaptation in mobile computing, In Proceedings of MLIC’95,

pp.57-66, Ann Arbor, MI, April 1995.

[17] A. Friday and N. Davies, Distributed systems support for mobile applications, In

Proceedings of IEEE Symposium on Mobile Computing and its Applications, Savoy

Place, London, November 1995.

[18] N. Davies, S. Pink and G.S. Blair, Services to support distributed applications in

a mobile environment, In Proceedings of SDNE’94, pp.84-89, Prague, June 1994.

[19] R. Prakash, M. Singhal, Dependency sequences and hierarchical clocks: Efficient

alternatives to vector clocks for mobile computing systems, Wireless Networks, Vol.3.

pp.349-360, October 1997.

[20] G.H. Forman and J. Zahorjan, The challenges of mobile computing, IEEE

Computer 27(4) pp.38-47, April 1994.

[21] K. Birman and T. Joseph, Reliable communication in the presence of failures,

ACM Transactions on Computer Systems 5(1) pp.47-76, February 1987.

[22] M. Ahamad, P. Dasgupta and R.J. Leblanc, Fault-tolerant atomic computations

in an object-based distributed system, Distributed Computing 4 pp.69-80, 1990.

[23] Sun Microsystems Corporation, Remote Method Invocation for Java,

http://chatsubo.javasoft.com/current/rmi/index.html, July 1996.
