
  • The Design and Implementation of the Clouds Distributed Operating System

    P. Dasgupta, R. C. Chen, S. Menon,

    M. P. Pearson, R. Ananthanarayanan, U. Ramachandran, M. Ahamad,

    R. J. LeBlanc, W. F. Appelbe, J. M. Bernabéu-Aubán, P. W. Hutto,

    M. Y. A. Khalidi, and C. J. Wilkenloh

    Georgia Institute of Technology

    ABSTRACT: Clouds is a native operating system for a distributed environment. The Clouds operating system is built on top of a kernel called Ra. Ra is a second generation kernel derived from our experience with the first version of the Clouds operating system. Ra is a minimal, flexible kernel that provides a framework for implementing a variety of distributed operating systems.

    This paper presents the Clouds paradigm and a brief overview of its first implementation. We then present the details of the Ra kernel, the rationale for its design, and the system services that constitute the Clouds operating system.

    This work was funded by NSF grant CCR-8619886.

    © Computing Systems, Vol. 3, No. 1, Winter 1990

  • 1. Introduction

    Clouds is a distributed operating system built on top of a minimal kernel called Ra. The paradigm supported by Clouds provides an abstraction of storage called objects and an abstraction of execution called threads.

    1.1 Basic Philosophies

    A primary research topic of the Clouds project is the development of a set of techniques that can be used to construct a simple, usable distributed operating system. A distributed operating system should integrate a set of loosely-coupled machines into a centralized work environment. The following are the requirements of such a distributed system:

    . The operating system must integrate a number of computers, both compute servers and data servers, into one operating environment.

    . The system structuring paradigm is important in deciding the appeal of the system. This should be clean, elegant, simple to use and feasible.

    . A simple, efficient, yet effective implementation.

    To attain this end, we proposed the following basic philosophies:

    . We use the much advocated minimalist philosophy towards operating system design. The operating system is divided into several clean, well defined modules and layers. The kernel is one such layer of the operating system and supports only features that cannot be supported elsewhere.


    Modularity in these layers limits the amount of interference (side-effects) and gives rise to easier upgrading and debugging.

    . Three basic mechanisms common to all general-purpose systems are computation, storage and I/O. We use simple primitives to support these: namely, light-weight processes and persistent object memory. Light-weight processes are involved in computation. Persistent object memory serves for all storage needs. The need for user-level disk I/O in our model is eliminated. Terminal I/O is provided as a special case of object access.

    1.2 Clouds Design Objectives

    Clouds is designed to run on a set of general purpose computers (uniprocessors or multiprocessors) that are connected via a local-area network. The major design objectives for Clouds are:

    . Integration of resources through cooperation and location transparency, leading to simple and uniform interfaces for distributed processing.

    . Support for various forms of atomicity and data consistency, including transaction processing, and the ability to tolerate failures.

    . Portability, extensibility and efficient implementation.

    Clouds coalesces a distributed network of computers into an integrated computing environment with the look and feel of a centralized timesharing system. In addition to the integration, the paradigm used for defining and implementing the system structure of the Clouds system is a persistent object/thread model. This model provides threads to support computation and objects to support an abstraction of storage. The model has been augmented to provide support for reliable programs [Chen & Dasgupta 1989; Chen 1990]. The consistency techniques have been designed, but not implemented, and are outside the scope of this paper.

    The rest of this paper is organized as follows. An overview of the Clouds project is provided in Section 2, followed by an overview of the Clouds paradigm in Section 3. The paradigm is the


    common link between the first implementation of Clouds (Clouds v.1) and the current version (Clouds v.2). We present the implementation of Clouds v.1 in Section 4, the structure and implementation of Clouds v.2 in more detail in Section 5, and the reasons behind the Clouds redesign in Section 6. Section 7 presents some details of the Ra implementation and Section 8 describes the current state of the implementation of Clouds on Ra.

    2. Project Overview

    The first version of the Clouds kernel was implemented in 1986. This version is referred to as Clouds v.1 and was used as an experimental testbed by the implementors. This implementation was successful in demonstrating the feasibility of a native operating system supporting the object model. Our experience with Clouds v.1 provided insights into developing better implementations of the object/thread paradigm.

    The lessons learned from the first implementation have been used to redesign the kernel and build a new version of the operating system called Clouds v.2. Clouds v.2 uses an object/thread paradigm which is derived from the object/process/action system [Allchin 1983; Spafford 1986; Wilkes 1987] used by Clouds v.1. However, most of the design and implementation of the system are substantially different. Clouds v.1 was targeted to be a testbed for distributed operating system research. Clouds v.2 is targeted to be a distributed computing platform for research in a wide variety of areas in computer science.

    The present implementation of Clouds supports the persistent object/thread paradigm, a networking protocol, distributed virtual memory support, user-level I/O with virtual terminals on UNIX, a C++ based programming language, and some system management software.

    Work in progress includes user interfaces, consistency support, fault tolerance, language environments, object programming conventions and better programming language support for object typing, instantiation, and inheritance.


  • 3. The Clouds Paradigm

    All data, programs, devices, and resources in Clouds are encapsulated in objects. Objects represent the passive entities in the system. Activity is provided by threads, which execute within objects.

    3.1 Objects

    A Clouds object, at the conceptual level, is a virtual address space. Unlike virtual address spaces in conventional operating systems, a Clouds object is persistent and is not tied to any thread. A Clouds object exists forever and survives system crashes and shutdowns (as does a file) unless explicitly deleted. As will be seen in the following description of objects, Clouds objects are somewhat "heavyweight" and are better suited for storage and execution of large-grained data and programs because invocation and storage of objects bear some non-trivial overhead.
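    The persistence described above — an address space that survives crashes and shutdowns the way a file does — can be approximated on a conventional POSIX host by backing a memory region with a file. This is only an illustrative analogy (Clouds uses its own storage layer, not files); the `PersistentSegment` name and layout are invented for this sketch.

```cpp
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Approximate a persistent object segment by backing memory with a file:
// writes through the mapping survive process exit, much as a Clouds
// object's address space survives shutdowns. (Hypothetical sketch on a
// POSIX host; Clouds itself does not use files.)
struct PersistentSegment {
    void*       base = nullptr;
    std::size_t len  = 0;
    int         fd   = -1;

    bool open(const char* path, std::size_t n) {
        fd = ::open(path, O_RDWR | O_CREAT, 0600);
        if (fd < 0) return false;
        if (ftruncate(fd, static_cast<off_t>(n)) != 0) return false;
        base = mmap(nullptr, n, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        len = n;
        return base != MAP_FAILED;
    }
    void close() {
        if (base && base != MAP_FAILED) {
            msync(base, len, MS_SYNC);   // flush to backing store
            munmap(base, len);
        }
        if (fd >= 0) ::close(fd);
        base = nullptr; fd = -1;
    }
};
```

    Data written through the mapping is visible to a later "invocation" that reopens the same segment, which is the essence of the persistent address space idea.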

    An object consists of a named address space and the contents of the address space. Since it does not contain a process, it is completely passive. Hence, unlike objects in some object based systems, a Clouds object is not associated with any server process. (The first system to use passive objects was Hydra [Wulf et al. 1974; Wulf et al. 1981].) The contents of each virtual address space are protected from outside access so that memory (data) in an object is accessible only by the code in that object and the operating system.

    Each object is an encapsulated address space with entry points at which threads may commence execution. The code that is accessible through an entry point is known as an object operation. Data cannot be transmitted in or out of the object freely, but can be passed as parameters (see the discussion on threads in Section 3.2).

    Each Clouds object has a global system-level name called a sysname, which is a bit-string that is unique over the entire distributed system. Sysnames do not include the current location of the object (objects may migrate). Therefore, the sysname-based naming scheme in Clouds creates a uniform, flat system name space


    for objects, and allows the object mobility needed for load balancing and reconfiguration. User-level names are translated to sysnames using a nameserver.

    Clouds objects are programmed by the user. System utilities can be implemented using Clouds objects. A complete Clouds object can contain user-defined code and data, and system-defined code and data that handles synchronization and recovery. In addition, it contains a volatile heap for temporary memory allocation, and a permanent heap for allocating memory that becomes a part of the data structures in the object. Locks and sysnames of other objects are also a part of the object data space.
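    The naming scheme above — location-independent bit-string sysnames plus a user-level nameserver — can be sketched as follows. The `Sysname` width and the `NameServer` interface are assumptions made for illustration, not the Clouds API.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// A sysname is an uninterpreted bit-string, unique system-wide and
// independent of the object's current location (objects may migrate).
// The 128-bit width is an assumption for this sketch.
using Sysname = std::array<std::uint8_t, 16>;

// A user-level nameserver maps human-readable names onto sysnames,
// giving the uniform, flat system name space the paper describes.
class NameServer {
public:
    void bind(const std::string& name, const Sysname& sn) { table_[name] = sn; }

    std::optional<Sysname> resolve(const std::string& name) const {
        auto it = table_.find(name);
        if (it == table_.end()) return std::nullopt;
        return it->second;
    }
private:
    std::map<std::string, Sysname> table_;
};
```

    Because sysnames carry no location, a lookup result stays valid after the object migrates; only the kernel's sysname-to-node mapping changes.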

    3.2 Threads

    The only form of user activity in the Clouds system is the user thread. A thread can be viewed as a thread of control that runs code in objects, traversing objects and machines as it executes. A thread executes in an object by entering it through one of several entry points; after the execution is complete the thread leaves the object. The code in the object can contain a call to an operation in another object with arguments. When the thread executes this call, it temporarily leaves the calling object, enters the called object and commences execution there. The thread returns to the calling object, with returned results, after the execution in the called object terminates. These arguments/results are strictly data; they may not be addresses. (Note that sysnames are data.) This restriction is necessary as addresses which are meaningful in the context of one object are meaningless in the context of another object. In addition, object invocations can be nested.

    Several threads can simultaneously enter an object and execute concurrently (or in parallel, if the host machine is a multiprocessor). Multiple threads executing in the same object share the contents of the object's address space.

    Unlike processes in conventional operating systems, a thread can execute code in multiple address spaces. Visibility within an address space is limited to that address space. Therefore, a thread cannot access any data outside its current address space. Control transfer between address spaces occurs through object invocation,


    and data transfer between address spaces occurs through parameters.

    3.3 Object/Thread Paradigm

    The structure created by a system composed of objects and threads has several interesting properties. First, all inter-object interfaces are procedural. Object invocations are equivalent to procedure calls on long-lived modules which do not share global data. Machine boundaries are transparent to inter-object procedure calls or invocations. Local invocations and remote invocations are differentiated only by the operating system.

    The storage mechanism used in Clouds differs from those found in conventional operating systems. Conventionally, files are used to store persistent data. Memory is associated with processes and is volatile. The contents of memory associated with a process are lost when the process terminates. Objects in Clouds unify the concepts of persistent storage and memory to create the concept of a persistent address space. This unification makes programming paradigms simpler.

    Although files can be implemented using objects (a file is an object with operations such as read, write, seek, and so on), the need for having files disappears in most situations. Programs do not need to store data in file-like entities, since they can keep the data in the data spaces of objects, structured appropriately. The need for user-level naming of files transforms to the need for user-level naming of objects. Also, Clouds does not provide user-level support for disk I/O, since files are not supported by the operating system. The system creates the illusion of a large memory space that is persistent (non-volatile). Since memory is persistent, the need for files to store persistent data is eliminated.

    In the object/thread paradigm, the need for messages is eliminated. Like files, messages and ports can be easily simulated by an object consisting of a bounded buffer that implements the send and receive operations on the buffer. An arbitrary number of threads may execute concurrently within an object. Thus, memory in objects is inherently sharable memory.
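    The bounded-buffer simulation of messages and ports described above might look like the following single-node sketch. The class name and entry points are hypothetical, and a real Clouds port object would live in persistent object memory rather than process memory.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// A message "port" simulated as a Clouds-style object: a bounded buffer
// whose only entry points are send and receive. Since several threads may
// be active inside the object at once, the shared buffer is protected.
class PortObject {
public:
    explicit PortObject(std::size_t capacity) : cap_(capacity) {}

    void send(int msg) {                       // blocks while the buffer is full
        std::unique_lock<std::mutex> lk(m_);
        not_full_.wait(lk, [&] { return buf_.size() < cap_; });
        buf_.push_back(msg);
        not_empty_.notify_one();
    }

    int receive() {                            // blocks while the buffer is empty
        std::unique_lock<std::mutex> lk(m_);
        not_empty_.wait(lk, [&] { return !buf_.empty(); });
        int msg = buf_.front();
        buf_.pop_front();
        not_full_.notify_one();
        return msg;
    }

private:
    std::size_t cap_;
    std::deque<int> buf_;
    std::mutex m_;
    std::condition_variable not_full_, not_empty_;
};
```

    A sender thread invokes `send` and a receiver invokes `receive`; the buffer state persists between invocations because it is part of the object's data space, which is exactly why message-passing machinery becomes unnecessary.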

    The system therefore looks like a set of persistent address spaces, comprising what we call object memory, which allow


    control to flow through them. Activity is provided by threads moving among the population of objects through invocation (Figures 1 and 2). The flow of data between objects is supported by parameter passing.

    3.4 Objects and Types

    At the operating system level there is only one type of object: the Clouds object. A Clouds object is an address space which may contain one (or more) data segment(s) and one (or more) code segment(s). At the operating system level, the Clouds object has exactly one entry-point. User objects are built using Clouds objects via an appropriate language and compiler. User objects have multiple user-defined entry points and are implemented using the single entry point, operation numbers and a jump table.
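    The dispatch scheme just described — one OS-level entry point, with user operations selected by an operation number indexing a jump table — can be sketched as follows. The signatures are invented for illustration; a real Clouds compiler would emit the table and marshal arbitrary data parameters rather than a single int.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Each user-defined operation is reached through an operation number.
// For this sketch every operation takes and returns plain data (an int).
using Operation = int (*)(int arg);

static int op_double(int x) { return 2 * x; }   // hypothetical operation 0
static int op_negate(int x) { return -x; }      // hypothetical operation 1

// The jump table a compiler would emit for this user object.
static const std::vector<Operation> jump_table = { op_double, op_negate };

// The single OS-level entry point: dispatch on the operation number.
int object_entry(std::size_t op_number, int arg) {
    if (op_number >= jump_table.size())
        throw std::out_of_range("bad operation number");
    return jump_table[op_number](arg);
}
```

    This keeps the kernel's view of an object trivially simple (one entry point) while letting the language level present as many typed entry points as the programmer declares.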

    User objects and their entry points are typed. Static type checking is performed on the object and entry point types. No

    Figure 1: Object Memory in Clouds (distributed object space: non-volatile, virtual memory, with per-object entry points)

    Figure 2: Structure of a Clouds Object (user-defined operations, code segment)

    runtime type checking is done by Clouds. Thus, type checking is implemented completely at the language level and enforced by compilers. A user object may contain several language defined object classes and object instances. These are completely contained within the user object and are not visible to the operating system. Using this object/thread paradigm, programmers can define a set of objects that encapsulate the application at hand.

    Currently, we define user objects in an extended C++ language. The language supports single inheritance on Clouds user objects using the C++ object structuring paradigm.

    4. Clouds v.1

    The first implementation of a kernel for Clouds was finished during 1986 and is described in Spafford [1986] and Pitts [1986]. The kernel was broken up into four subsystems: object management,


    storage management, communications, and action management. A kernel-supported extension of the nested action model of Moss [Moss 1981; Allchin 1983; Kenley 1986] made it possible for the programmer to customize synchronization and recovery mechanisms with a set of locking and commit tools.

    As one of the goals of Clouds was to produce an efficient and usable system, a direct implementation on a bare machine (VAX-11) was preferred to an implementation on top of an existing operating system such as UNIX. Also, most of the functionality needed by Clouds did not exist on UNIX (such as persistent shared memory and threads), and most of the features provided by UNIX were not needed for Clouds. The main goal of the implementation effort was to provide a proof of feasibility of implementing the object/thread model. The kernel, notwithstanding some of its drawbacks, was a successful demonstration of the feasibility of the Clouds approach, both in terms of implementation and use.

    4.1 Objects

    The basic primitives provided by the Clouds v.1 kernel are processes and passive objects. Indeed, one kernel design objective was to support passive objects at the lowest possible level in the kernel [Spafford 1986]. The main mechanism provided by the kernel was object invocation. The system relied heavily on the VAX virtual memory system to provide object support.

    Passive objects in Clouds v.1 are implemented as follows: the VAX virtual address space is divided into three segments by the system architecture, namely the P0, P1 and System segments. The System segment is used to map the kernel code. The P1 segment is used to map the process stack and the P0 segment is used to map the object space. The object space resides on secondary storage and page tables are used to map the object under invocation into the P0 segment directly from disk. Each process has its own pageable process stack.

    The above scheme allows processes to traverse several object spaces, while executing, by simply remapping the P0 segment of a process to contain the appropriate object space upon each invocation (see the next section for further details). Also, several


    processes are allowed to concurrently execute within the same object.

    4.2 Object Invocation

    The basic mechanism used in the Clouds kernel to process an object invocation from a thread t executing in object O1 is the following: t constructs two argument lists, one for transferring arguments to the object being invoked, O2, and the other to receive the output parameters (results) from the invocation of object O2 (see Spafford [1986] for more details). After the construction of the argument lists, thread t enters the kernel through a protected system call or trap. The kernel searches for the object locally and, if found, uses the information in the object descriptor to construct the page mappings for the P0 segment. Then, the kernel saves the state of the thread, copies the arguments into the process space (P1 segment) and sets up the new mappings for the P0 segment. At this point, the contents of O2 are accessible through the mappings of the P0 segment, and t can proceed with the invocation of O2's method.

    On return from the invocation, the thread t also builds an argument list with the return parameters, and then enters the kernel by means of a protected system call. The kernel now saves the parameters in a temporary area, sets up the P0 segment mappings for O1, restores the saved state of t, and copies the return parameters wherever specified by the second argument list constructed by the thread at invocation time.

    If, upon invocation, the kernel cannot find object O2 locally, it tries to find it remotely. To do so, it broadcasts an RPC request. The RPC server in the node that has O2 acknowledges the invocation request, and creates a local slave process to invoke O2 on behalf of t. The slave process then proceeds to invoke O2 locally, and when the invocation completes, it sends the return arguments back to the invoking node. Then the kernel goes on to process the return parameters as described above.
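    The invocation path of this section — search locally, otherwise broadcast an RPC and let a slave process invoke on the caller's behalf — reduces to a lookup with a remote fallback. The sketch below models that control flow only; the registry, signatures, and `rpc` hook are invented, and real Clouds invocation remaps the P0 segment rather than calling through a table.

```cpp
#include <functional>
#include <map>
#include <string>

// One method per object, taking and returning plain data, as in Clouds.
using Method = std::function<int(int)>;

struct Node {
    // Objects resident on this node, keyed by sysname.
    std::map<std::string, Method> local_objects;

    // Hypothetical hook standing in for "broadcast an RPC request and let
    // a slave process on the holding node invoke the object for us".
    std::function<int(const std::string&, int)> rpc;

    int invoke(const std::string& sysname, int arg) {
        auto it = local_objects.find(sysname);
        if (it != local_objects.end())
            return it->second(arg);   // local: map the object and run the method
        return rpc(sysname, arg);     // remote: a slave invokes on our behalf
    }
};
```

    Because the caller's code path is identical in both branches, machine boundaries stay transparent to the invoking thread, which is the property the paper emphasizes.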


  • 4.3 Experiences with Clouds v.1

    Our experiences with Clouds v.1 were gained over a period of 6-8 months while attempting to build a user environment on top of the kernel. During this time, we encountered a number of problems in the kernel that had to be addressed. Unfortunately, while doing so, we discovered that complex module interdependencies within the kernel made it extremely difficult to add necessary functionality or fix non-trivial bugs without introducing new bugs in the process. One reason for this was poor code structure. Also, direct concurrent access and update of data structures by routines in different subsystems within the kernel resulted in synchronization and re-entrancy problems.

    The final problem involved a change of platform. For economic reasons, we wanted to move to Sun workstations. However, we found that VAX dependencies that appeared throughout the kernel (such as programming the VAX bus) made the kernel virtually impossible to port to a new architecture without a complete re-write.

    In hindsight, these problems are not entirely surprising. While portable kernels are desirable, portability was not emphasized during the design and implementation of Clouds v.1. We were more interested in producing a working prototype. Also, Clouds v.1 was implemented in C. Given the lack of support in C for software engineering techniques, it should not have come as a surprise that our first attempt at structuring the internals of a kernel supporting persistent objects resulted in a less than optimal design.

    The basic design and philosophy behind Clouds v.2 is a direct result of the problems we encountered in working with the first kernel. The second kernel is a minimal kernel, designed with flexibility, maintenance, and portability in mind. A complete discussion of the similarities and differences between Clouds v.1 and Ra is contained in Section 6.


  • 5. Clouds v.2

    The structure of Clouds v.2 is different from Clouds v.1. The operating system consists of a minimal kernel called Ra, and a set of system-level objects providing the operating system services. Ra [Bernabéu-Aubán et al. 1988; 1989] supports a set of basic system functions: virtual memory management, light-weight processes, low-level scheduling, and support for extensibility.

    5.1 The Minimal Kernel Approach

    A minimal kernel provides a small set of operations and abstractions that can be effectively used to implement portable operating systems independent of the underlying hardware. The kernel creates a virtual machine that can be used to build operating systems. The minimal kernel idea is similar to the RISC approach used by computer architects and has been effectively used to build message-based operating systems such as V [Cheriton & Zwaenepoel 1983], Accent [Rashid & Robertson 1981], and Amoeba [Tanenbaum & Mullender 1981]. The rule we attempted to follow in our design was: Any service that can be provided outside the kernel without adversely affecting performance should not be included in the kernel.

    As a result, Ra is primarily a sophisticated memory manager and a low-level scheduler. Ra creates a view of memory in terms of segments, windows and virtual spaces. Unlike the Clouds v.1 kernel, Ra does not support objects, invocations, storage management, thread management, device management, network protocols, or any user services. All of these services are built on top of the Ra kernel as modules called system objects. System objects provide other system services (user object management, synchronization, naming, networking, device handling, atomicity and so on) and create the operating system environment. Currently the Ra kernel and all of the essential system objects are operational; the project is now focusing on higher level services.

    There are several advantages to the use of a minimal kernel. The kernel is small, hence easier to build, debug, and maintain. Minimal kernels assist in the separation of mechanisms from


    policy, which is critical in achieving operating system flexibility and modularity [Wulf et al. 1974]. A minimal kernel provides the mechanisms, and the services above the kernel implement policy. The services can often be added, removed or replaced without the need for recompiling the kernel or rebooting the system.

    5.2 The Ra Kernel

    The principal objectives in Ra's design were:

    . Ra should be a small kernel that creates a logical view of the underlying machine.

    . Ra should be easily extensible.

    . One of the possible extensions of Ra should be Clouds.

    . It should be possible to effect an efficient implementation on a variety of architectures.

    In addition to the above, the implementation of Ra should clearly identify and separate the parts that depend on the architecture for which the implementation is being targeted. This should reduce the effort required to port the kernel to different architectures.

    As was previously stated, object invocation in the first version of Clouds was implemented by manipulating the virtual memory mappings. The Clouds v.1 kernel was targeted to support object invocations. Ra is targeted to support the memory management needs of objects. From our experience with the first Clouds kernel, we identified generalizations in the virtual memory management mechanisms that we felt would provide a large degree of flexibility in the design and implementation of new systems.

    Memory is the primary abstraction in the Clouds paradigm. Since objects are implemented as autonomous address spaces, they need to be built out of sharable, pageable segments. These segments should be easily migratable to support distributed memory. These requirements led to the design of the virtual memory architecture of Ra.

    Ra supports a set of primitive abstractions and an extension facility. The three major abstractions are:


    IsiBas. An IsiBa is an abstraction of activity, and is basically a very lightweight process. The IsiBa is simply a schedulable entity consisting of a PCB. An IsiBa has to be provided with a stack segment and a code segment before it is scheduled. Therefore, IsiBas can be used to create user-level processes, threads, kernel processes, daemons and can be used for a host of tasks needing activity.

    Segments. Segments conceptualize persistent memory. A segment is a contiguous block of uninterpreted memory. Segments are explicitly created and persist until destroyed. Each segment has a unique system-wide sysname, and a collection of storage attributes. Segments are stored in a facility called the partition. Ra does not support partitions, but assumes the existence of at least one. Partitions are described later. Ra provides a set of functions to manipulate segments (create, extend, install, page-in/out and so on).

    Virtual Spaces. A virtual space abstracts a fixed-size contiguous region of an IsiBa's virtual address space. Ranges of memory in a virtual space can be associated with (or mapped to) an arbitrary range of memory in a segment. Each such mapping (called a window) also defines the protections (user-level read, read-write, kernel-level read, etc.) on the defined range of memory. Virtual spaces are defined and controlled by a Virtual Space Descriptor, or VSD for short (see Figure 3). Virtual spaces can be associated with an IsiBa. This is called installing a virtual space. Ra is responsible for ensuring that the state of the IsiBa's hardware address space corresponds to that defined by its installed virtual spaces (the virtual memory architecture is described in more detail in Section 5.3). Virtual spaces can be used by higher level routines to implement such things as objects. Ra supplies functions that assemble, install and manipulate virtual spaces.

    Ra is implemented using the C++ language and heavily uses the object oriented programming paradigm provided by C++. The

    1. The term IsiBa comes from early Egyptian: Isi = light, Ba = soul, and was coined by the early designers of the Ra kernel. It is now felt that the term is confusing and we periodically agree to change it but have not been able to agree on a replacement term that is both informative and non-boring. The choice of a replacement term is a recurring topic of animated discussion.


    Figure 3: VSD Defining a Virtual Space (virtual space defined by descriptor)

    functional units in Ra are encapsulated in C++ objects and C++ classes are used to provide structure to the implementation. C++ classes are used, for example, to define base classes (and hence minimal interfaces) for system objects, devices, partitions, virtual memory managers, etc.

    Ra is related to the Mach kernel [Accetta et al. 1986] by some of its virtual memory mechanisms and its approach to portable design [Rashid et al. 1987]. The Choices kernel [Campbell et al. 1987] is built using object-oriented programming techniques and is thus related to the implementation structure of Ra. During the design of Ra, we were also influenced by ideas from Multics [Organick 1972], Hydra [Wulf et al. 1974], and Accent [Rashid & Robertson 1981], as have been other distributed operating systems designed in the tradition of Clouds [Nett et al. 1986; Northcutt 1987].


  • 5.3 Virtual Memory Architecture

    Ra provides a two-level virtual memory architecture similar to Clouds v.1 (similarities and differences are discussed in Section 6). Ra breaks down the machine's virtual address space, hereafter referred to as the hardware address space, into a number of fixed-sized, contiguous, non-overlapping regions. These regions are used to map different types of virtual spaces.

    5.3.1 Virtual Spaces

    Ra supports three types of virtual spaces: O, P, and K (see Figure 4). The contents of each virtual space (of any type, O, P, or K) is described by a set of windows in the virtual space descriptor. Each window w associates protection with a range (x, x+n) of virtual memory controlled by the virtual space, and associates a range (y, y+n) of bytes in a segment s with the virtual memory range (x, x+n). An access of byte x+a in window w will reference byte y+a in segment s. A virtual space V may contain an arbitrary

    Figure 4: Hardware Address Space (O space, P space, and K space; virtual spaces mapped onto segments stored in the partition)

    number of windows mapping non-overlapping ranges within the bounds of the region controlled by the virtual space.
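    The window mapping rule above (byte x+a in window w references byte y+a in segment s) is a simple offset translation, sketched here with invented names:

```cpp
#include <cstddef>
#include <optional>

// A window maps the virtual range [x, x+n) onto bytes [y, y+n) of a
// segment, so virtual byte x+a resolves to segment byte y+a.
struct Window {
    std::size_t x;  // start of the virtual range the window controls
    std::size_t y;  // start of the backing range within the segment
    std::size_t n;  // common length of the two ranges
};

// Translate a virtual address through a window; nullopt if the address
// falls outside the window's range.
std::optional<std::size_t> translate(const Window& w, std::size_t addr) {
    if (addr < w.x || addr >= w.x + w.n) return std::nullopt;
    return w.y + (addr - w.x);   // x+a -> y+a
}
```

    A full VSD would hold several such windows (with protections attached) and try each non-overlapping range in turn.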

    Instances of virtual spaces of type O (O space) are intended to map shared entities (e.g. objects), and spaces of type P (P space) are intended to map process private entities (e.g. stacks). In a single-processor system there can be only one instance of a virtual space of type K (K space). This is used to map the Ra kernel and the system objects. K space descriptors must support additional operations beyond the standard operations defined by the virtual space class. This is handled by making the Kernel Virtual Space Descriptor (KVSD) a derived class (subclass) of the base class VSD.

    Each IsiBa can have one O space, one P space, and one K space currently installed. In the current uniprocessor implementation, every IsiBa has the same K space installed, effectively mapping the kernel into the address space of every IsiBa.

    5.3.2 Shared Memory

    Sharing memory occurs when two virtual addresses refer to the same piece of physical memory. In the Ra kernel, sharing can take place at two levels. Two IsiBas may share memory at the virtual space level by installing the same P or O (or both) virtual space. When they reference memory within the bounds of the shared virtual space, they will automatically reference the same piece of physical memory or backing store.

    Memory may also be shared at the window level. If virtual space V1 has a window w1 (a, a+n) that maps to (x, x+n) within a segment s, and a virtual space V2 has a window w2 (b, b+m) that maps to (y, y+m) within the same segment s, and the two ranges overlap, then the overlapping area of memory will be shared by the two windows w1 and w2. To take a simple case, if w1 and w2 both map to the same range of addresses in the segment s (that is, m = n and x = y), then a reference to address a+z (z ≤ n) in V1 and a reference to address b+z in V2 refer to the same memory (this holds even when V1 and V2 are installed in the same IsiBa). Thus, two IsiBas may share memory by installing the same virtual space. They may also share memory if they have different virtual spaces installed but the virtual spaces map common segments.
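The window-level sharing rule can be stated compactly: two windows share storage exactly when they map overlapping byte ranges of the same segment. A minimal sketch (the struct and helper names are hypothetical):

```cpp
#include <cassert>
#include <algorithm>
#include <cstdint>

// Hypothetical helper: a window identified by its backing segment,
// the starting byte within that segment, and its length.
struct Win {
    uint32_t  seg;       // sysname of segment s
    uintptr_t segStart;  // y: first mapped byte within the segment
    size_t    len;       // n: number of bytes mapped
};

// Two windows share physical storage iff they map the same segment
// and their segment byte ranges (y, y+n) and (y', y'+m) overlap.
bool windowsShare(const Win& w1, const Win& w2) {
    if (w1.seg != w2.seg) return false;
    uintptr_t lo = std::max(w1.segStart, w2.segStart);
    uintptr_t hi = std::min(w1.segStart + w1.len, w2.segStart + w2.len);
    return lo < hi;  // non-empty intersection
}
```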

    5.4 Ra Synchronization Primitives

    The kernel defines a set of primitives to be used for synchronization and concurrency control inside the kernel and system objects. At the lowest level, Ra defines (machine-dependent) interrupt level control facilities. Since Ra is intended to be portable to multiprocessors, spin locks are provided. These low-level facilities are used to construct kernel-level semaphores, memoryless events, and read/write locks.
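A minimal illustration of this layering: a spin lock built from an atomic exchange, and a counting semaphore built on top of it. This is a sketch of the general technique, not Ra's actual primitives; a real kernel semaphore would block the calling IsiBa rather than spin, and tryP is a non-blocking stand-in added here for illustration.

```cpp
#include <atomic>
#include <cassert>

// Lowest layer: a multiprocessor-safe spin lock.
class SpinLock {
    std::atomic<bool> held{false};
public:
    void lock() {
        // Spin until the exchange observes the lock free (false).
        while (held.exchange(true, std::memory_order_acquire)) { /* spin */ }
    }
    void unlock() { held.store(false, std::memory_order_release); }
};

// Next layer: a counting semaphore protected by the spin lock.
class Semaphore {
    SpinLock lk;
    int count;
public:
    explicit Semaphore(int initial) : count(initial) {}
    bool tryP() {                 // non-blocking P, for illustration only
        lk.lock();
        bool ok = count > 0;
        if (ok) --count;
        lk.unlock();
        return ok;
    }
    void V() { lk.lock(); ++count; lk.unlock(); }
};
```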

    5.5 Extensibility

    The design of Ra abstracts a logical machine that provides virtual memory mechanisms and low-level scheduling. Hence, it is unusable by itself, and must be extended with system objects into an operating system such as Clouds. System objects reside in kernel space (K space), not user space (O and P space); they have direct access to data and code in the kernel, and share the same protection and privileges. Hence, system objects are not the same as Clouds objects. Some system objects are called essential system objects because they are essential to run Ra. For example, the Partition is an essential system object: Ra assumes the existence of at least one partition.

    The system objects in Clouds are organized in a hierarchy of classes, ultimately deriving from the SysObj class. Ra defines a system object interface and each system object must adhere to this interface. A system object must have initialize and shutdown methods, can have private memory, and can access kernel data structures through kernel classes.
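The interface contract might be sketched as an abstract base class. Everything beyond the initialize and shutdown methods named above (the TimerService example, the bootAll helper) is hypothetical:

```cpp
#include <cassert>
#include <vector>

// Every system object derives from SysObj and must provide
// initialize and shutdown (the two methods the paper requires).
class SysObj {
public:
    virtual ~SysObj() = default;
    virtual bool initialize() = 0;
    virtual void shutdown() = 0;
};

// A hypothetical system object plugged into the kernel.
class TimerService : public SysObj {
    bool running = false;
public:
    bool initialize() override { running = true; return true; }
    void shutdown() override { running = false; }
    bool isRunning() const { return running; }
};

// The kernel can hold system objects uniformly through the base
// class, which is what lets them be linked in or loaded dynamically.
bool bootAll(std::vector<SysObj*>& objs) {
    for (SysObj* o : objs)
        if (!o->initialize()) return false;
    return true;
}
```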

    Kernel classes are collections of kernel data and procedures to access and manipulate that data. These include the VSD class, sysname class, and cpu class. As viewed from a system object, the kernel is a collection of kernel classes and instances of these classes. Each cpu, for example, is a specific instance of the cpu class and can be manipulated using methods defined by the cpu class.

    The system object interface enforces strict adherence to modularity in the operating system. Clouds has been built by attaching all the relevant system objects to Ra. The system objects can be thought of as plug-in software modules that can be linked in with the kernel or loaded dynamically into a running system.

    5.6 Partitions

    Partitions are repositories for segments. Ra requests segments from a partition when needed and releases them to the partition when their use is over. The partition is responsible for managing segments on secondary storage. A partition must support at least the following operations: ActivateSegment, DeactivateSegment, ReadPage, and WritePage. A segment must be activated before being used, by calling the ActivateSegment operation. This allows the partition to set up any necessary in-memory data structures. DeactivateSegment is called after the kernel has finished using the segment. ReadPage and WritePage are self-explanatory.

    A partition is responsible for maintaining and manipulating segments. Each segment is maintained by exactly one partition, and the segment is said to reside in that partition. The partition in which a segment resides is called its controlling partition. Creation and deletion of segments is performed through their controlling partitions. The partition is responsible for maintaining the information which describes the segment on secondary storage, similar to file-system code in a conventional operating system. To read or write segment pages on secondary storage, the controlling partition of the segment is invoked. The partition is notified that one of its segments will be subject to further activity by activating the segment. Similarly, the partition is told that a segment will not be used in the near future by deactivating the segment.

    In our implementation, a network disk partition stores segments on a UNIX file server. Each segment is stored as a UNIX file. The partition provides pages of segments to Ra over a network. The local Ra kernel is unaware of this mechanism. The network disk partition provides an easy way to create segments on UNIX which are then available to Clouds machines. This partition is also used for paging and swapping activity.

    5.7 Distributed Shared Memory

    Segments residing in a network disk partition cannot be shared by multiple machines. Thus each network disk partition creates a private storage space for each Clouds machine.

    However, to allow distribution, all objects should be accessible by all Clouds machines. This means the segments should be stored by a set of data servers and should be accessible by all compute servers. This is made possible by Distributed Shared Memory partitions (DSM partitions).

    Sharing segments among machines may lead to multiple cached copies of the segments. To ensure the one-copy semantics used by the Clouds paradigm, a coherence controller is necessary. The DSM partition implements a set of protocols which enforce the coherence of shared segments (Figure 5). Each DSM partition knows about a set of segment managers called DSM controllers. Currently, DSM controllers run on UNIX machines and store segments as files. When Clouds accesses a segment (due to the invocation of an object that uses that segment), the DSM partition on the local machine gets the segment from the DSM controller. The DSM controller runs coherence algorithms which ensure that there is only one copy of the segment in the Clouds system [Ramachandran et al. 1989; Khalidi 1989]. This makes memory appear to be one-copy shared and creates the view that all objects reside on all machines.
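The one-copy invariant can be illustrated with a toy controller that tracks, per segment, the single node currently holding it and recalls any previous holder before granting a new request. This is a drastically simplified stand-in for the cited protocols, which operate at page granularity with richer caching states; all names here are invented:

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using Sysname = uint32_t;
using NodeId  = int;       // -1 means "no node holds it"

// Toy DSM controller enforcing at most one cached copy per segment.
class DsmController {
    std::map<Sysname, NodeId> owner;
public:
    // A node requests a segment. Any previous holder is recalled
    // (its copy invalidated) before the new grant, preserving the
    // single-copy invariant. Returns the previous holder, if any.
    NodeId acquire(Sysname seg, NodeId node) {
        auto it = owner.find(seg);
        NodeId prev = (it == owner.end()) ? -1 : it->second;
        owner[seg] = node;   // in a real protocol, prev flushes first
        return prev;
    }
    NodeId holder(Sysname seg) const {
        auto it = owner.find(seg);
        return it == owner.end() ? -1 : it->second;
    }
};
```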

    6. Ra and Clouds v.1

    The design of Ra draws heavily on our experiences with Clouds v.1. Clouds v.1 was essentially a monolithic kernel; expandability and flexibility were not primary design goals.

    As academic and commercial operating systems evolve, new sets of researchers and implementors add to the design and functionality in ways not anticipated by the original designers. While attempting to build a user environment for Clouds v.1, we realized that it was difficult to extend the system. A number of features in Ra attempt to address this problem.

    [Figure 5: DSM Partitions: Clouds/Ra machines, each running a partition client, connected over the network to partition servers.]

    Ra supports a more primitive set of abstractions than Clouds v.1; however, the system object class allows the addition of operating system services not supported by the kernel. System objects allow flexibility and enforce modularity. This structuring technique is worthy of note in that while many systems allow extensions in the form of device drivers, far more fundamental operating system services have been implemented as Ra system objects.

    To support testing of higher-level algorithms, the Ra design also strives to separate mechanisms that implement actions from policies that decide when actions should be taken. Furthermore, the object-oriented structure of the Ra kernel lends itself to easier maintenance than the Clouds v.1 kernel.


    6.1 Multi-threading

    In the Clouds v.1 design, processes were somewhat heavyweight and the kernel was basically passive. This meant that virtually all operating system services were provided through interrupt and fault handlers. Therefore, the entire kernel had to be multi-threaded. This necessitated very delicate handling of interrupt priority levels and synchronization primitives throughout the entire kernel. The problem was compounded by the fact that concurrency-related mistakes usually result in non-reproducible timing-related errors which are very difficult to debug. Unfortunately, in a 22,000-line multi-threaded kernel written from scratch by a handful of people, it is virtually guaranteed that concurrency-related errors will creep in.

    In Ra, IsiBas can be used to implement kernel daemons. Kernel daemons run in kernel space with kernel privileges. These daemons can be used to implement single-threaded servers that are dispatched using interrupts. This reduces the amount of code that has to be multi-threaded and run with interrupts masked. Kernel daemons have been used in a number of places, most notably in the implementation of remote procedure calls, timer services, network protocols, and DSM.
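The kernel-daemon pattern can be sketched as a work queue: interrupt handlers only enqueue jobs (a small, easily protected critical section), and a single-threaded daemon drains the queue later, so the bulk of the code runs single-threaded with interrupts unmasked. The class and method names are invented for illustration:

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Sketch of a kernel daemon. In real Ra the queue insertion would be
// the only code needing interrupt masking; the drain loop runs as an
// ordinary single-threaded IsiBa.
class KernelDaemon {
    std::deque<std::function<void()>> work;
public:
    // Called from interrupt handlers: just enqueue and return.
    void post(std::function<void()> job) {
        work.push_back(std::move(job));
    }
    // Runs in the daemon: process queued work single-threaded.
    int drain() {
        int done = 0;
        while (!work.empty()) {
            work.front()();
            work.pop_front();
            ++done;
        }
        return done;
    }
};
```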

    6.2 Virtual Memory Architecture

    The design and implementation of Ra's virtual memory architecture incorporates a generalization of the virtual memory architecture of Clouds v.1 and portability-oriented structuring ideas found in Mach [Rashid et al. 1987]. Rather than embed support deeply in the kernel for one notion of how objects could be structured, the decision was made to support a flexible, sophisticated virtual memory architecture that could be adapted to more than one object implementation. This allows future developers the freedom to modify the object implementation without having to rewrite kernel internals.

    Ra thus supports the notion of an O space, P space, and K space. By virtue of using the VAX P0, P1, and System segments to contain objects, the process stack, and the kernel, respectively, Clouds v.1 also supported the equivalent of an O, P, and K space.


    However, since support for objects, processes, and actions was designed into the kernel at the lowest possible level, the structuring of each type of space was hard-coded into the kernel. The P0 segment was broken up into windows but the P1 and System segments were not. In Ra, the structure of the Clouds v.1 P0 segment is generalized into the notion of a virtual space and applied in an orthogonal fashion to all three types of virtual spaces. This generalization leads to a number of advantages.

    By structuring the hardware address space as virtual spaces, it is possible in Ra to break the address space up into more than three virtual spaces. While the original designers foresaw only the need for an O, P, and K space, our recent work in memory semantics has convinced us of the need to add a fourth virtual space [Dasgupta & Chen 1990; Chen 1990]. Due to the uniform handling of virtual spaces in the kernel, adding another virtual space will be straightforward.

    The fact that P space can be broken up into windows allows more flexibility in designing and implementing user processes and object invocation. The original Clouds v.1 design placed all non-stack per-process data in P0 space. The current user object implementation could have been implemented similarly. However, it was decided to have all per-process data (stack, heap, parameter passing area) reside in P space instead, which means the entire object space can be shared. This allows all threads executing in the same object to share the same O space virtual memory tables. The Ra kernel is capable of supporting either implementation.

    The K space windows allow the kernel and system objects to use the window class to map portions of segments into well-defined address ranges, operate on the data there, flush the changes out, and unmap the window (freeing it up for further use). This facility is used by the user process controller to set up new processes and freeze/restore them (see Section 8.2).

    Also, the Ra virtual memory system allows for the existence of more than one virtual memory manager. Virtual memory managers handle page faults. Each Ra segment can have a segment type and each segment type can have its own virtual memory manager. When a page fault occurs, the kernel determines the segment being accessed by the fault. If the segment has its own virtual memory manager, the kernel calls that manager. Otherwise, the kernel calls a default virtual memory manager to service the fault. This design allows the complexity of services such as shadow-based recovery to be isolated in its own virtual memory manager.
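The per-segment-type dispatch might look like the following sketch, where unregistered types fall through to the default manager. All names are assumptions and the fault-handling bodies are elided; only the dispatch structure is the point:

```cpp
#include <cassert>
#include <map>

using SegType = int;

// Stand-in for a virtual memory manager; handlePageFault(...) elided.
struct VmManager {
    const char* name;
};

// Kernel-side dispatch: segment types may register their own manager
// (e.g. one implementing shadow-based recovery); everything else is
// serviced by the default manager.
class FaultDispatcher {
    std::map<SegType, VmManager*> managers;
    VmManager* fallback;
public:
    explicit FaultDispatcher(VmManager* def) : fallback(def) {}
    void registerManager(SegType t, VmManager* m) { managers[t] = m; }
    VmManager* managerFor(SegType t) const {
        auto it = managers.find(t);
        return it == managers.end() ? fallback : it->second;
    }
};
```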

    Finally, unlike the Clouds v.1 virtual memory system, the Ra virtual memory system is designed with machine-independent and machine-dependent classes (inspired by Mach). Each machine-independent class that may have to rely extensively on machine-dependent details has a corresponding machine-dependent class that presents an abstraction of the underlying hardware to the rest of the kernel. This isolates the machine dependencies from the rest of the kernel.

    7. Implementation of Ra

    Ra is implemented in C++ [Stroustrup 1986] on the Sun-3/60 architecture. C++ was chosen over C due to its extra support for data abstraction and object-oriented design.

    C++ facilities such as private data and methods, derived classes, and virtual functions make it easier to write modular, layered code and hide the implementation details of one part of the kernel from the rest. This, in turn, reduces hidden interdependencies, which makes it easier to change parts of the system without introducing unforeseen side effects. While all good software designers strive to do this, the language support (and enforcement) provided by C++ has made this a much easier task. Furthermore, the C++ inline function facility makes it possible to write highly layered, modular code without incurring extra function-call overhead.

    Thus, while the kernel internals are quite intricate, well-defined interfaces exist for requesting kernel services. C++ type checking enforces adherence to the defined interfaces and hence prevents system implementors from bypassing those interfaces, but the ability to define optional parameters in class methods enables interfaces to be easily extended while retaining compatibility with existing code.

    Ra is designed to identify and isolate machine dependencies. Like the virtual memory system, Ra is divided into two sets of files containing machine-dependent and machine-independent classes and definitions.

    The implementation of the Ra kernel consists of about 1,000 lines of assembly code and 12,000 lines of C++ code. Approximately 6,000 lines are machine-dependent code while the rest are machine independent. In addition, 17,000 lines of C++ code have been added to the system in the form of system objects.

    8. Clouds v.2 and Ra

    Clouds v.2 consists of the Ra kernel plus a collection of system objects implementing Clouds semantics (see Figures 6 and 7).

    The system objects contain more code than the Ra kernel itself. A complete description of these is beyond the scope of this paper. The system objects currently in operation include: buffer manager, user I/O manager, tty manager, Ethernet driver, Ra Transport Protocol, network disk partition, DSM partition, object manager, thread manager, process controller, RPC controller, and system monitor. Many of the system objects are implemented using kernel daemons. Some of the key functions of the system objects are described below.

    [Figure 6: System Objects and System Space, showing classes inside the Ra kernel and system objects residing in the system area.]

    8.1 Object Management

    An object is implemented in Ra by using a virtual space (O space). The storage of the object is ultimately accessed via the segments mapped by the windows of the virtual space. The sysname of the object is the sysname of a segment containing the windowing information (VSD). Objects are always mapped into the O space when being executed. A Clouds object is a special case of a Ra virtual space.

    The object invocation mechanism is implemented by means of two system objects: the process controller and the object manager. Processes are implemented using segments (to back the per-process stack, for example) and are managed by the process controller. Objects are invoked by threads (implemented using processes). When a thread invokes an object, the process controller is responsible for saving/protecting the state of the current invocation and installing any state required by the new invocation. During this process, the process controller notifies the object manager that an object is about to be invoked (referenced), and the object manager is responsible for ensuring that the control information (virtual space descriptor) for the object exists. The object is installed in the O space of the thread and the thread is placed at the entry point. The physical location of the object is of no consequence, as the DSM partition will page in the segments from the appropriate DSM controller (or server). Remote object invocation is performed through an RPC handler which communicates with an RPC server on a remote machine and asks it to invoke the requested object in a manner similar to the Clouds v.1 object invocation mechanism.

    [Figure 7: The Clouds/Ra Environment, with user objects layered above the kernel's scheduling, segment, interrupt, and virtual space facilities.]
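A local invocation, as described above, amounts to pushing the current invocation state and installing the target object's O space, then unwinding on return. The sketch below stubs out the object manager and models only the save/install/restore discipline; all names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Sysname = uint32_t;  // 0 used here to mean "no object installed"

struct Invocation { Sysname object; };

// Toy process controller: maintains a per-thread stack of invocation
// records so nested invocations unwind correctly on return.
class ProcessController {
    std::vector<Invocation> saved;
    Sysname currentOSpace = 0;
public:
    void invoke(Sysname obj) {
        saved.push_back({currentOSpace});  // save/protect current state
        // The object manager would ensure the VSD for `obj` exists here,
        // with DSM paging in segments regardless of physical location.
        currentOSpace = obj;               // install the object's O space
        // ...thread now enters at the object's entry point...
    }
    void ret() {
        currentOSpace = saved.back().object;  // restore caller's O space
        saved.pop_back();
    }
    Sysname installed() const { return currentOSpace; }
};
```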

    Invoking the object locally and relying on DSM to page in the object across the network is not always more efficient than performing a remote procedure call. If processes on different nodes have the same locality of reference in an object and they invoke the same object using DSM to page the object, DSM will thrash, paging the common pages back and forth between the different nodes. In that case, it might be better to move all the computation to the same node by performing an RPC, so the processes can physically share the memory.

    In addition, if the load on the nodes of the system is not balanced, it may be better to send the computation to a remote node. Notice that the object being invoked need not reside on the chosen node, as DSM can be used to access the object's segments. This provides another mechanism besides process migration that can be used to balance the load on the system. However, using local invocations and DSM is superior to RPC in cases where there is a high degree of locality.

    The policy that decides whether to execute a local invocation or a remote invocation has not been implemented. Currently the user can decide on the invocation mechanism by using appropriate system calls.


    8.2 Thread Management

    Processes are implemented by associating an IsiBa with a virtual space (installed in the P space) which controls per-process memory such as the process stack and the parameter passing areas. A thread is a process if the thread only uses local invocations, and is a collection of processes if the thread has performed a remote invocation. The thread manager keeps track of a thread's RPC calls (if any) and controls creation and termination of threads. However, thread managers do not directly manipulate processes or per-process memory. The process controller defines an abstraction of a process. The thread manager makes requests on the process controller to manipulate processes and to perform object invocations and returns.

    Although the P space of a process may contain more than one segment, the state of a process may be saved into one controlling segment and then later restored. The ability to freeze a process into one segment and remotely activate that segment (and all necessary segments after that) using DSM provides Clouds v.2 with a simple, easy way of performing process migration.

    8.3 Ra Transport Protocol

    The Ra Transport Protocol (RaTP) provides reliable message transactions over the Ethernet [Wilkenloh 1989]. The protocol is designed to be connectionless and is efficient for providing the request-reply form of communication that is common in client-server interactions. Since Clouds supports object invocations using RPC or DSM, this is the type of communication that is encountered in the system.

    RaTP has been implemented both on Ra and on UNIX. In addition to message transactions, RaTP provides interfaces to the RPC and DSM mechanisms. We are currently using RaTP to run the DSM clients on Ra and DSM servers on UNIX file servers.


    8.4 Clouds I/O System

    The Clouds I/O system allows user objects to perform user-level terminal I/O. Note that Clouds does not support, nor does it need, user-level disk or network I/O. The I/O system is handled by a system object. Each user object is provided with two special sysnames of I/O objects (called stdin and stdout, inspired by UNIX). These objects do not actually exist: they are operating-system-level pseudo-objects that support read and write calls. If a user object calls the write routine, the output reaches the (logical) terminal associated with the thread. Each logical terminal is a "text window" on a UNIX workstation.

    When a thread is created, a logical terminal is associated with the thread. The thread carries with it the sysname associated with this text window. Thus, all I/O calls made from object programs reach the user, regardless of where the thread is executing. I/O redirection can be done by simply changing the stdin/stdout sysnames to those of a user object that supports read and write calls. Further I/O will cause that user object to be invoked instead of the Clouds I/O system object.

    8.5 Other Services

    The other system objects used in the current implementation of Clouds include a virtual memory manager, system monitor, buffer manager, Ethernet driver, network disk partition, and DSM partition.

    Devices in Ra conform to a standard interface: a class definition called RaDevice which is used to define device drivers. All device drivers must provide at least the methods defined in the base class RaDevice. They are open(), close(), read(), write(), getmsg(), putmsg(), poll(), and ioctl().
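Given only the method names listed above, the base class might be declared as follows. The signatures and the NullDevice example driver are assumptions made for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the RaDevice interface with the eight required methods.
class RaDevice {
public:
    virtual ~RaDevice() = default;
    virtual int  open()  = 0;
    virtual int  close() = 0;
    virtual long read(void* buf, size_t n)        = 0;
    virtual long write(const void* buf, size_t n) = 0;
    virtual int  getmsg(void* msg)       = 0;
    virtual int  putmsg(const void* msg) = 0;
    virtual int  poll()                  = 0;
    virtual int  ioctl(int cmd, void* arg) = 0;
};

// A trivial null driver: accepts writes, returns end-of-input on reads,
// and rejects message/ioctl operations it does not support.
class NullDevice : public RaDevice {
public:
    int  open()  override { return 0; }
    int  close() override { return 0; }
    long read(void*, size_t) override { return 0; }        // always EOF
    long write(const void*, size_t n) override { return (long)n; }
    int  getmsg(void*) override { return -1; }
    int  putmsg(const void*) override { return 0; }
    int  poll() override { return 0; }
    int  ioctl(int, void*) override { return -1; }
};
```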

    The system monitor provides a low-level monitor/shell capability. The monitor can be used to read and alter values of variables in the kernel or system objects, as well as execute arbitrary methods in the kernel or any system object. The system monitor can also be used for invoking user objects, thus providing a rudimentary shell for Clouds.


    The network disk partition is implemented as a system object. Segments managed by this partition seem to reside on the Clouds node but actually reside on a UNIX file server. Reads, writes, and control messages are shipped to the UNIX system where a server operates on the UNIX files that correspond to the indicated segments (see Figure 8).

    Many services are provided by UNIX programs. These include terminal emulators, which run under SunWindows and on top of RaTP, interfacing through the user I/O system on Ra. The compiler is a modified GNU C++ compiler that generates Clouds segments. The Clouds shell is a UNIX program that is used to invoke Clouds objects using the RPC facility.

    Figure 8: Network Disk and DSM Services


    8.6 Status and Work in Progress

    The facilities described above are operational. Project work is currently focusing on better system-level and user-level support tools for effective use of the Clouds operating system. This includes a user programming environment, a user-level naming system, and development of application programming techniques in the persistent object framework.

    In addition, work is being done in the area of reliability and persistent-object programming, the thrust of the original Clouds system. We have explored memory semantics for programming persistent objects and have developed a set of flexible mechanisms that support customized data consistency [Chen & Dasgupta 1989; Dasgupta & Chen 1990; Chen 1990]. In other fault tolerance research, we have developed a scheme that replicates data as well as computation to guarantee forward progress of computations [Ahamad et al. 1990]. Work is underway to design schemes that exploit multicast communication to make a variety of services (e.g. object location, group communication, commit protocols, replication management, and so on) more efficient [Ahamad & Belkeir 1989; Belkeir & Ahamad 1989].

    9. Concluding Remarks

    Clouds is intended to serve as a base for research in distributed computing at Georgia Tech. The new Clouds kernel, Ra, coupled with system objects, provides an elegant environment for operating systems development. The design and implementation of Ra benefited from the experience gained from the design and implementation of the first Clouds kernel.

    The design and implementation of Clouds v.2 are geared more towards flexibility, portability, and maintenance than the Clouds v.1 kernel. This is reflected in the design of the Ra kernel and its virtual memory architecture, support for lightweight kernel daemons, and the system object facility. Furthermore, the additional freedom allowed by Ra facilitates easier testing of alternative system designs (both mechanisms and algorithms) using Ra as the implementation base.


    We would like to acknowledge all those who have worked on the Clouds project, past and present, for making this design and implementation possible. This includes Jim Allchin, Martin McKendry, Gene Spafford, Dave Pitts, Tom Wilkes, and Henry Strickland for their contributions to Clouds v.1, and Nasr Belkeir, M. Chelliah, Vibby Gottemukkala, Ranjit John, Ajay Mohindra, Gautam Shah, and Monica Skidmore for their contributions to Clouds v.2.


    References

    M. Accetta, R. Baron, W. Bolosky, D. Golub, R. Rashid, A. Tevanian, and M. Young, Mach: A New Kernel Foundation for UNIX Development, Proceedings of the Summer USENIX Conference, July 1986.

    M. Ahamad and N. E. Belkeir, Using Multicast Communication for Dynamic Load Balancing in Local Area Networks, In 14th Annual Conference on Local Computer Networks, October 1989.

    M. Ahamad, P. Dasgupta, and R. J. LeBlanc, Fault-Tolerant Atomic Computations in an Object-based Distributed System, Distributed Computing Journal, April 1990.

    J. E. Allchin, An Architecture for Reliable Decentralized Systems, Ph.D. Thesis, School of Information and Computer Science, Georgia Institute of Technology, 1983. (Available as Technical Report GIT-ICS-83/23.)

    N. Belkeir and M. Ahamad, Low Cost Algorithms for Message Delivery in Dynamic Multicast Groups, In Proceedings of the 9th International Conference on Distributed Computing Systems, June 1989.

    J. M. Bernabéu-Aubán, P. W. Hutto, M. Y. A. Khalidi, M. Ahamad, W. F. Appelbe, P. Dasgupta, R. J. LeBlanc, and U. Ramachandran, Clouds - A Distributed, Object-Based Operating System: Architecture and Kernel Implementation, European UNIX Systems User Group Autumn Conference, October 1988.

    J. M. Bernabéu-Aubán, P. W. Hutto, M. Y. A. Khalidi, M. Ahamad, W. F. Appelbe, P. Dasgupta, R. J. LeBlanc, and U. Ramachandran, The Architecture of Ra: A Kernel for Clouds, Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences, January 1989.

    R. Campbell, G. Johnston, and V. Russo, Choices (Class Hierarchical Open Interface for Custom Embedded Systems), Operating Systems Review, 21(3), July 1987.

    Raymond C. Chen, Consistency Mechanisms and Memory Semantics for Persistent Object-Based Distributed Systems, Ph.D. Thesis, School of Information and Computer Science, Georgia Institute of Technology, 1990. (In progress.)

    R. C. Chen and P. Dasgupta, Linking Consistency with Object/Thread Semantics: An Approach to Robust Computation, In Proceedings of the 9th International Conference on Distributed Computing Systems, June 1989.


    D. Cheriton and W. Zwaenepoel, The Distributed V Kernel and its Performance for Diskless Workstations, Proceedings of the 9th ACM Symposium on Operating System Principles, pages 129-139, October 10-13, 1983.

    P. Dasgupta and R. C. Chen, Memory Semantics for Programming Persistent Objects, 1990. (In progress.)

    Greg Kenley, An Action Management System for a Distributed Operating System, Master's Thesis, School of Information and Computer Science, Georgia Institute of Technology, 1986.

    M. Yousef A. Khalidi, Hardware Support for Distributed Object-based Systems, Ph.D. Thesis, School of Information and Computer Science, Georgia Institute of Technology, 1989. (Available as Technical Report GIT-ICS-89/19.)

    J. Moss, Nested Transactions: An Approach to Reliable Distributed Computing, Technical Report MIT/LCS/TR-260, MIT Laboratory for Computer Science, 1981.

    E. Nett, J. Kaiser, and R. Kroger, Providing Recoverability in a Transaction Oriented Distributed Operating System, 6th International Conference on Distributed Computing Systems, pages 590-597, 1986.

    J. D. Northcutt, Mechanisms for Reliable Distributed Real-Time Operating Systems, Volume 16 of Perspectives in Computing, Academic Press, 1987. Editors: W. Rheinboldt and D. Siewiorek.

    E. E. Organick, The Multics System, MIT Press, 1972.

    David V. Pitts, A Storage Management System for a Reliable Distributed Operating System, Ph.D. Thesis, School of Information and Computer Science, Georgia Institute of Technology, 1986. (Available as Technical Report GIT-ICS-86/21.)

    Umakishore Ramachandran, Mustaque Ahamad, and M. Yousef A. Khalidi, Coherence of Distributed Shared Memory: Unifying Synchronization and Data Transfer, In Eighteenth Annual International Conference on Parallel Processing, August 1989.

    R. Rashid and G. Robertson, Accent: A Communication Oriented Network Operating System Kernel, Proceedings of the 9th Symposium on Operating Systems Principles, pages 64-75, December 1981.

    R. Rashid, A. Tevanian, M. Young, D. Golub, R. Baron, D. Black, W. Bolosky, and J. Chew, Machine-Independent Virtual Memory Management for Paged Uniprocessor and Multiprocessor Architectures, In Proc. 2nd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS II), pages 31-39, October 1987.

    Eugene H. Spafford, Kernel Structures for a Distributed Operating System, Ph.D. Thesis, School of Information and Computer Science, Georgia Institute of Technology, 1986. (Available as Technical Report GIT-ICS-86/16.)

    B. Stroustrup, The C++ Programming Language, Addison-Wesley Publishing Company, 1986.

    A. S. Tanenbaum and S. J. Mullender, An Overview of the Amoeba Distributed Operating System, Operating Systems Review, 15(3):51-64, July 1981.

    Christopher J. Wilkenloh, Design of a Reliable Message Transaction Protocol, Master's Thesis, School of Information and Computer Science, Georgia Institute of Technology, 1989.

    C. Thomas Wilkes, Programming Methodologies for Resilience and Availability, Ph.D. Thesis, School of Information and Computer Science, Georgia Institute of Technology, 1987. (Available as Technical Report GIT-ICS-87/32.)

    W. Wulf, E. Cohen, W. Corwin, A. Jones, R. Levin, C. Pierson, and F. Pollack, Hydra: The Kernel of a Multiprocessor Operating System, Communications of the ACM, 17(6):337-345, June 1974.

    W. A. Wulf, R. Levin, and S. P. Harbison, HYDRA/C.mmp: An Experimental Computer System, McGraw-Hill, Inc., 1981.
