amoeba distributed operating system

Upload: adamridzuan91

Post on 08-Apr-2015

Page 1: Amoeba Distributed Operating System

Amoeba Distributed Operating System

What is Amoeba

Amoeba is a distributed operating system

Developed by Andrew Tanenbaum

Uses timesharing

User logs into the system as a whole, not just his local machine.

When the user runs a program, the system decides which machine (or machines) in the

system should execute it.

This decision is invisible to the user.

Amoeba is a distributed operating system. It combines a large variety of individual machines, connected over a (fast) network, into one large computer. It was originally developed at the Vrije Universiteit in Amsterdam by Andrew Tanenbaum and many others. Amoeba was always designed to be used, so achieving very high performance was deemed essential. Its developers reported it to be among the fastest distributed operating systems of its time.

The original Amoeba sources are distributed under a public license similar to the BSD license.

Amoeba builds upon a traditional microkernel. It supports true, kernel-controlled multithreading and segment-based memory management. All Amoeba components communicate with each other over a standardized RPC (Remote Procedure Call) interface that is simple but very powerful. Whether a client or server thread runs in kernel mode or in user mode, it uses the same RPC interface everywhere. This leads to a very clean and simple OS design that is well suited for beginners.

Because Amoeba was designed from scratch around new concepts, it suffered from a lack of application programs. A POSIX-compliant UNIX emulation was therefore added, which makes porting UNIX programs much easier. With additional changes, a large variety of application programs from the UNIX world now work under Amoeba: X11 with applications, various compilers and language tools (gcc, ocaml, tcl/tk), the bash shell, editors, and many more.

Amoeba is usable, though not without (minor) problems. It is still in a slightly experimental state. FSD-Amoeba is intended for dedicated programmers only, not for end users.

Highlights:

A huge number of kernel extensions and changes

New hardware driver stuff:

o several new network drivers

o full PCI support

o new parallel and serial port support

o virtual console support

o enhanced page protection

100 Mbps Fast Ethernet support

New interrupt handling, national keyboard support, enhanced IOP server for X11

X11 Release 6.4

Bash-2.02, tcl-7.6, tk-4.0, ocaml-3.01, xv-3.10a, ghostscript 4.03 & 6.01, all GNU textutils, gmake, gzip...

New Setup and installation scripts, various new menu driven system administration tools

Crosscompiling environment under Linux

Fireball Documentation Project (FDP)

Andrew Toolkit for the Fireball Documentation Project

History

The history of modern computing can be divided into the following eras:

1970s: Timesharing (1 computer with many users)

1980s: Personal computing (1 computer per user)

1990s: Parallel computing (many computers per user)

In the 1980s, computers could be networked together and files could be shared between users via RPCs.

Parallel computing, from the 1990s to today, is used to share CPU resources among a network of computer systems.

This concept is referred to as distributed computing or parallel computing.

How can we exploit the one-to-many (one user, many computers) configuration?

Amoeba is one answer to this problem.

Developed at the Vrije Universiteit Amsterdam, the Netherlands. Chief designer: Andrew S. Tanenbaum; other developers included Frans Kaashoek, Sape J. Mullender, Robbert van Renesse, Leendert van Doorn, Kees Verstoep, and many more.

First prototype release in 1983 (V1.0); last official release in 1996 (V5.3)

Supports multiple architectures: 68k, i80386, SPARC

Virtual Amoeba Machine and Features in Amoeba

The next step is a virtual machine that supplies Amoeba concepts such as RPC. It runs either natively under Amoeba or under UNIX with the AMUNIX library together with the FLIP protocol module. The VM is derived and built from OCaml. The great advantage is that Amoeba programs written in OCaml and compiled to bytecode can run independently of the underlying OS.

1. Design goals 

The basic design goals of Amoeba are:

Distribution—Connecting together many machines

Parallelism—Allowing individual jobs to use multiple CPUs easily

Transparency—Having the collection of computers act like a single system

Performance—Achieving all of the above in an efficient manner

Transparency: One of the main goals of Amoeba's development was a transparent distributed system that lets users log into the system as a whole. That means hiding the complexities of a distributed system from the users. Amoeba users need not be concerned with the number of processors in the system, nor must they know the location of the other machines or servers (such as the file server).

Distribution: Several machines connected over a network operate as a single system. Amoeba gives its users the illusion of interacting with a single, powerful system.

Parallelism: On an Amoeba system, a single program or command can use multiple processors to increase performance. The user simply requests an operation, and Amoeba decides the best way to execute the request, choosing the processor (or processors) appropriate for it based on the current state of the system. Additionally, special development tools for the Amoeba environment take advantage of the inherent parallelism. For example, Amoeba supports a parallel 'make' program.

Performance: Much effort went into meeting this goal. It is accomplished with a newly developed high-performance network protocol called FLIP (Fast Local Internet Protocol). According to its designers, none of the protocols that existed when FLIP was developed provided adequate support for distributed systems. FLIP provides clean, simple, and efficient communication between distributed nodes.

Developed from scratch: Amoeba is not based on any existing operating system.

Amoeba interacts with the user as a UNIX-like timesharing system.

2. System architecture

Amoeba implements a universal distributed client-server model. In fact, the whole system needs only three functions to do all the work: the transaction call on the client side, and the GetRequest and PutReply functions on the server side.

An Amoeba system consists of four principal components:

1. Workstations

2. Pool Processors

3. Specialized Servers (File server...)

4. WAN Gateways

Objects

Abstract data types with data and behaviors.

Amoeba primarily supports software objects, but hardware objects also exist.

Each object is managed by a server process to which RPCs can be sent. Each RPC specifies the object to be used, the operation to be performed, and any parameters to be passed.

Capabilities

A capability is a 128-bit value that describes an object. It is created and returned to the caller when the object is created.

Subsequent operations on the object require the user to send its capability to the server, both to specify the object and to prove that the user has permission to manipulate it.

Capabilities are cryptographically protected to prevent tampering.

In more detail:

Amoeba is designed as a collection of microkernels, so an Amoeba system consists of many CPUs connected over a network. Each CPU has its own local memory, ranging from 2 MB to several hundred MB. A large number of processors form the so-called processor pool. This group of CPUs can be dynamically allocated as needed by the system and the users. Specialized servers, called run servers, distribute processes fairly across these machines.

Several processor architectures are supported: i80386 (including Pentium), 68k, and SPARC. Today, only the i80386 architecture is significant for building an Amoeba system, because the hardware is cheap.

Workstations allow users to access the Amoeba system. There is typically one workstation per user, and the workstations are mostly diskless; only a workstation kernel must be booted (from floppy, via TFTP, or from flash EEPROM). Amoeba supports the X Window System and UNIX emulation.

At the heart of the Amoeba system are several specialized servers that carry out and synchronize the fundamental operations of the kernel.

Amoeba has a directory server (called SOAP) that is the naming service for all objects used in the system. SOAP provides a way to assign ASCII names to objects so that they are easier for humans to manipulate. Because files are immutable, the directory server can replicate them without fear of their changing.

Amoeba also has a file server (called the Bullet server) that implements a stable, high-speed file service. High speed is achieved by using a large buffer cache. Since files are first created in the cache and are written to disk only when they are closed, every file can be stored contiguously. The underlying idea behind immutable files is to keep the replication mechanism free of race conditions. File server crashes also do not normally result in an inconsistent file system. The Bullet server uses the virtual disk server to perform disk I/O, so the file server can run as a normal user program.

The boot server controls all global system servers (outside the kernel): it starts them, checks and polls them, and restarts them if they crash.

All Amoeba objects (files, programs, memory segments, servers) are protected and described by so-called capabilities (see below).

An example of a processor pool: about 60 to 80 Sun motherboards were built into a rack system at the Vrije Universiteit. Of course, cheap, ordinary IBM PCs can be used as CPU servers, too.

3. The Amoeba Micro-kernel 

Microkernel and Server Architecture

Amoeba is built upon a microkernel architecture.

The microkernel supports the basic process, communications, and object

primitives. It also handles device I/O and memory management.

Each machine in the Amoeba system runs a small identical software program -

called the microkernel.

The function of the kernel is to allow efficient communication between client

processes, which run application programs, and server processes, such as the

Bullet File server or the directory server.  

The microkernel, which is nearly the same on all Amoeba machines, handles:

Low level I/O management

Communication between processes or threads

Low level Memory management

Process and  thread (kernel/user space) management

Server processes (see above) supply the other operating system services and generally run in user mode. This specialization keeps the microkernel small and efficient, increases reliability, and lets as much of the operating system as possible run as user processes. It also provides flexibility, and individual CPUs are not burdened with facilities they do not need.

Threads

Each process has its own address space and contains multiple threads.

These threads have their own stack and program counter, but share the global data and code of

the process.

Remote Procedure Calls

RPC is the basic communication mechanism in Amoeba. Communication consists of a client

thread sending a message to a server thread, then blocking until the server thread sends back a

return message, at which time the client is unblocked.

Amoeba uses stubs to access remote services. A special language, the Amoeba Interface Language (AIL), is used to generate these stubs automatically. The stubs marshal the parameters and hide the communication details from the user.

Process concept:

Amoeba supports the traditional process concept

A process consists of one or more threads

Each thread has its own registers, instruction pointer, and stack, but all threads of a process share the same memory region

Example: file server. Each request is handled by one thread, but all threads use the same cache; they synchronize through mutexes and semaphores

Memory management:

Threads can allocate and deallocate blocks of memory, called segments.

These segments can be read and written, and can be mapped into and out of the

address space of the process.

A process owns at least one segment, but may have many more.

Segments can be used for text, data, stack, or any other purpose the process desires. The operating system doesn't enforce any particular pattern on segment usage.

I/O management:

For each I/O device attached to a machine, there is a device driver in the kernel. The driver manages all I/O for the device.

All drivers are statically linked into the kernel; there is no dynamic module support

Communication with device drivers is mostly performed through the standard message protocol (as in the rest of the system in user space)

Communication:

Two forms of communication are provided:

o Point-to-Point communication

o Group communication

Group Communication

o Amoeba provides a mechanism that allows all receivers in a one-to-many

configuration to receive a transmitted message in the same order. This

simplifies parallel processing and distributed programming problems.
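
The source does not say how the same-order guarantee is achieved. One common technique, assumed here purely for illustration, is a sequencer that stamps every broadcast with a global sequence number, while each receiver delivers messages strictly in stamp order and holds back any that arrive early. All names below (`receiver`, `sequencer_next`, `receiver_accept`) are hypothetical and are not Amoeba calls.

```c
enum { MAX_PENDING = 8 };  /* this toy model handles at most 8 messages */

/* Hypothetical receiver that delivers messages strictly in sequence-number
   order, holding back any that arrive early. */
typedef struct {
    unsigned next_expected;      /* sequence number to deliver next */
    int held[MAX_PENDING];       /* payloads waiting for their turn */
    int present[MAX_PENDING];    /* 1 if held[i] is occupied */
    int delivered[MAX_PENDING];  /* payloads in delivery order */
    int ndelivered;
} receiver;

/* The sequencer stamps each broadcast with the next global number. */
static unsigned sequencer_next(unsigned *counter) { return (*counter)++; }

/* Accepts a stamped message and delivers everything that is now in order. */
static void receiver_accept(receiver *r, unsigned seq, int payload) {
    if (seq < r->next_expected || seq >= MAX_PENDING)
        return;  /* already delivered, or outside this toy model's range */
    r->held[seq] = payload;
    r->present[seq] = 1;
    while (r->next_expected < MAX_PENDING && r->present[r->next_expected]) {
        r->delivered[r->ndelivered++] = r->held[r->next_expected];
        r->next_expected++;
    }
}
```

Two receivers that see the same stamped messages in different arrival orders end up with identical `delivered` arrays, which is the property the text describes.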

4. Amoeba's Object concept

Amoeba

Programs can execute wherever the OS decides.

No concept of host machine.

Objects and Capabilities are used to manage file systems.

Network OS

Programs run locally unless specified.

The user is aware of using a local host machine.

Files are maintained and accessed from local machine unless using a remote file

system.

The central point of the software concept for a server implementation is the object concept.

Each object consists of 

Data and

Operations on this data

Amoeba is organized as a collection of objects (essentially abstract data types), each with some number of operations that processes can perform on it. Operations on an object are performed by sending a message to the object's server. Objects are created by processes and managed by the corresponding server. There are many different object classes:

Files

Directories

Memory segments

Processes

I/O-Devices (Hard drive...)

Terminals

...

Operations on objects are performed with stub procedures. When an object is created, the server returns a capability, which is used to address and protect the object. A typical capability consists of four fields: Port, Object, Rights, and Check.

The Port field identifies the server. The Object field tells which object is being referred to, since a server normally manages several objects. The Rights field specifies which operations are allowed (for example, a capability for a file may be read-only). Since capabilities are managed in user space, the Check field is needed to protect them cryptographically and prevent users from tampering with them.
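
The four fields can be pictured as a C structure. This is only a sketch under assumptions: the text above gives just the 128-bit total, so the field widths used here (48-bit port, 24-bit object, 8-bit rights, 48-bit check) follow the layout described in published Amoeba papers, and the type name, the rights bit names, and the helper functions are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of a 128-bit capability, one byte array per field. */
typedef struct {
    uint8_t port[6];    /* 48 bits: identifies the server */
    uint8_t object[3];  /* 24 bits: which object at that server */
    uint8_t rights;     /*  8 bits: one bit per permitted operation */
    uint8_t check[6];   /* 48 bits: protects the capability against tampering */
} capability;

enum { RIGHT_READ = 0x01, RIGHT_WRITE = 0x02 };  /* illustrative bit names */

/* Returns 1 if every right in `needed` is present in the capability. */
static int cap_allows(const capability *cap, uint8_t needed) {
    return (cap->rights & needed) == needed;
}

/* Builds a read-only example capability with placeholder field values. */
static capability example_read_only_cap(void) {
    capability cap;
    memset(&cap, 0, sizeof cap);
    cap.port[0] = 0x2A;
    cap.object[2] = 7;
    cap.rights = RIGHT_READ;
    return cap;
}
```

A server receiving such a capability would first verify the check field and then test the rights bits, for example with `cap_allows(&cap, RIGHT_WRITE)`, before carrying out the operation. The verification of the check field itself is not shown.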

 

5. Process management 

A process is an object in Amoeba. Information about a process is contained in capabilities and in a data structure called a process descriptor, which is used for process creation, for stunned processes, and for process migration. The process descriptor consists of four components:

The host descriptor gives the requirements for the system on which the process must run, by describing which machines it can run on

The capabilities include the capability of the process, which every client needs, and the capability of a handler, which deals with signals and process exit

The segment component describes the layout of the address space (see below)

The thread component describes each of the threads in the process (see below) and its state information (instruction pointer, stack, ...)

Amoeba supports a simple thread model. When a process starts up, it has at least one thread. The number of threads is dynamic: during execution, the process can create additional threads, and existing threads can terminate. All threads are managed by the kernel. The advantage of this design is that when a thread does an RPC, the kernel can block that thread and schedule another one in the same process if one is ready.

Three methods are provided for thread synchronization:

Mutexes

Semaphores

Signals

A mutex is like a binary semaphore. It can be in one of two states, locked or unlocked. Trying to lock an unlocked mutex causes it to become locked, and the calling thread continues. Trying to lock a mutex that is already locked causes the calling thread to block until another thread unlocks the mutex.

The second way threads can synchronize is by counting semaphores. These are slower than mutexes, but there are times when they are needed. A semaphore cannot be negative. Trying to do a down operation on a semaphore whose value is zero causes the calling thread to block until another thread does an up operation on the semaphore.
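
The state transitions described for mutexes and semaphores can be modeled in a few lines of C. This is a single-threaded sketch, not Amoeba's implementation: instead of actually blocking, the hypothetical `mutex_try_lock` and `sema_try_down` functions return `false` at the point where a real thread would be put to sleep. All type and function names are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model: instead of blocking, the try_*
   functions return false where a real thread would be put to sleep. */
typedef struct { bool locked; } mutex_model;
typedef struct { unsigned count; } sema_model;  /* unsigned: never negative */

/* Locking an unlocked mutex succeeds; locking a locked one would block. */
static bool mutex_try_lock(mutex_model *m) {
    if (m->locked) return false;
    m->locked = true;
    return true;
}

/* Unlocking makes the mutex available to the next locker. */
static void mutex_unlock(mutex_model *m) { m->locked = false; }

/* Down on a positive semaphore decrements it; down on zero would block. */
static bool sema_try_down(sema_model *s) {
    if (s->count == 0) return false;
    s->count--;
    return true;
}

/* Up increments the count, which would release one blocked thread. */
static void sema_up(sema_model *s) { s->count++; }
```

The model makes the difference visible: a mutex has only two states, while a semaphore counts how many down operations may still succeed before a caller has to wait.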

Signals are asynchronous interrupts sent from one thread to another in the same process.

Signals can be raised, caught, or ignored. Asynchronous interrupts between processes use

the stun mechanism.

6. Memory management  

Amoeba supplies a simple memory management based on segments. Each process owns at

least three segments:

1. Text/Code segment

2. Stack segment for the main thread/process

3. Data segment

Each further thread gets its own stack segment, and the process can allocate arbitrary additional data segments.

All segments, including the kernel segments, are page-protected by the underlying MMU.

7. Communication 

The definitions of the Amoeba communication calls are given in the ANSI C language.

All three calls use a Msg data structure, which is a 32-byte header with several fields to

hold capabilities and other items. Note that each request or reply message can consist

of just a header or a header and an additional component.

All processes, the kernel too, communicate with a standardized RPC (Remote procedure

call) interface. There are only three functions to reach this goal:

trans(Msg *requestHeader, char *requestBuffer, int requestSize, Msg *replyHeader, char *replyBuffer, int replySize)

The client sends a request message and receives a reply; the header contains a capability for the object upon which an operation is being requested.

trans(req_header, req_buf, req_size, rep_header, rep_buf, rep_size)

-> do a transaction with a server

get_request(Msg *requestHeader, char *requestBuffer, int requestSize)

The server gets a request from the port specified in the message header.

getreq(req_header, req_buf, req_size)

-> get a client request

put_reply(Msg *replyHeader, char *replyBuffer, int replySize)

The server replies.

putrep(rep_header, rep_buf, rep_size)

-> send a reply to the client

The first function is used by a client to send a message to a server and get the server's reply to that request. The request and reply buffers are generic memory buffers (char). The request and reply headers are simple data structures that describe the request and the capability of the server. On the other side, the server calls the getreq function within an infinite loop. Each time a client sends a message to this server (identified by a server port; see capabilities for details), the getreq function returns with the client data, if any, filled into the request buffer. The request header contains information about the client request.

Because the client expects a reply, the server must send one (with or without reply data) using the putrep function.
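
The request/reply pattern can be sketched as a toy, in-process simulation. Nothing below is the real Amoeba API: `toy_trans`, `toy_getreq`, and `toy_putrep` are hypothetical stand-ins, the header carries only a command and a status instead of the 32-byte structure with capabilities, and the "network" is a single shared buffer. The server step runs inline where a real client would block waiting for the reply.

```c
#include <string.h>

/* Toy stand-ins for the three calls; not the real Amoeba API. The real
   header is a 32-byte structure; this one carries only two fields. */
typedef struct { int command; int status; } toy_header;

enum { CMD_ECHO_UPPER = 1, STATUS_OK = 0, STATUS_BAD_COMMAND = -1 };
enum { BUF_MAX = 64 };

/* One in-memory "wire" slot shared by the client and server side. */
static toy_header wire_hdr;
static char wire_buf[BUF_MAX];
static int wire_len;

/* Server side: fetch the pending request into the caller's buffers. */
static int toy_getreq(toy_header *hdr, char *buf, int size) {
    int n = wire_len < size ? wire_len : size;
    *hdr = wire_hdr;
    memcpy(buf, wire_buf, (size_t)n);
    return n;
}

/* Server side: place the reply on the wire for the waiting client. */
static void toy_putrep(const toy_header *hdr, const char *buf, int size) {
    wire_len = size < BUF_MAX ? size : BUF_MAX;
    wire_hdr = *hdr;
    memcpy(wire_buf, buf, (size_t)wire_len);
}

/* One iteration of the server loop: get a request, handle it, reply. */
static void toy_server_step(void) {
    toy_header hdr;
    char buf[BUF_MAX];
    int n = toy_getreq(&hdr, buf, BUF_MAX);
    if (hdr.command == CMD_ECHO_UPPER) {
        for (int i = 0; i < n; i++)
            if (buf[i] >= 'a' && buf[i] <= 'z') buf[i] -= 'a' - 'A';
        hdr.status = STATUS_OK;
    } else {
        hdr.status = STATUS_BAD_COMMAND;
        n = 0;  /* reply with a header only */
    }
    toy_putrep(&hdr, buf, n);
}

/* Client side: send a request and wait for the reply. Here the server
   step simply runs inline in place of a real blocking wait. */
static int toy_trans(const toy_header *req_hdr, const char *req_buf, int req_size,
                     toy_header *rep_hdr, char *rep_buf, int rep_size) {
    wire_len = req_size < BUF_MAX ? req_size : BUF_MAX;
    wire_hdr = *req_hdr;
    memcpy(wire_buf, req_buf, (size_t)wire_len);
    toy_server_step();
    int n = wire_len < rep_size ? wire_len : rep_size;
    *rep_hdr = wire_hdr;
    memcpy(rep_buf, wire_buf, (size_t)n);
    return n;
}
```

A real server would call its get-request function in an infinite loop and a real client would be suspended inside the transaction call; the sketch keeps only the data flow, including the case where the reply consists of a header without any reply data.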

How to use Amoeba

Amoeba is freeware

It can be loaded on a LAN or University computer system

Editors such as elvis, jove, ed come with the installation package

Compilers such as C, Pascal, Fortran 77, Basic, and Modula 2

Orca - used for Parallel Programming

Applications

UNIX emulation

Parallel make

Traveling salesman

Alpha-beta search

Parallel make

o An Amoeba system contains a processor pool with several processors.

o One application for these processors in a UNIX environment is a parallel

version of make.

o When make discovers that multiple compilations are needed, they are run in

parallel on different processors.

o pmake was developed based on the UNIX make but with additional code to

handle parallelism.

o Many medium-sized files: considerable speedup.

o One large source file and many small ones: the total time can never be smaller than the compilation time of the large one.

o A speedup of about a factor of 4 over sequential make has been

observed in practice on typical makefiles.
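
The two cases in the list can be illustrated with a small scheduling model. This is a sketch under assumptions, not how pmake is implemented: compilations are treated as independent jobs with known durations (dependencies between targets are ignored), and each job is handed to whichever processor is currently least loaded. The function name `parallel_build_time` and the limit `MAX_CPUS` are hypothetical.

```c
#include <stddef.h>

enum { MAX_CPUS = 16 };

/* Greedy schedule: each job, in the order given, goes to the processor that
   is currently least loaded. Returns the finishing time of the busiest
   processor (the makespan), or -1 if ncpus is out of range. */
static int parallel_build_time(const int *job_times, size_t njobs, int ncpus) {
    int load[MAX_CPUS] = {0};
    if (ncpus < 1 || ncpus > MAX_CPUS) return -1;
    for (size_t j = 0; j < njobs; j++) {
        int best = 0;
        for (int c = 1; c < ncpus; c++)
            if (load[c] < load[best]) best = c;
        load[best] += job_times[j];
    }
    int makespan = 0;
    for (int c = 0; c < ncpus; c++)
        if (load[c] > makespan) makespan = load[c];
    return makespan;
}
```

With eight jobs of 10 time units each, four processors finish in 20 units instead of 80. With one job of 60 units and four jobs of 5 units, four processors still need 60 units, because the large compilation alone sets the lower bound.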

The Traveling Salesman

o The computer is given a starting location and a list of cities to be visited. The

idea is to find the shortest path that visits each city exactly once, and then

returns to the starting place.

o Amoeba was programmed to run this application in parallel by having one

pool processor act as coordinator, and the rest as slaves.

o Example: the starting place is London, and the cities to be visited include

New York, Sydney, Nairobi, and Tokyo

o The coordinator might tell the first slave to investigate all paths starting with

London-New York, the second slave to investigate all paths starting with

London-Sydney, the third slave to investigate all paths starting with London-

Nairobi...

o All searches go on in parallel.

o When a slave is finished, it reports back to the coordinator and gets a new

assignment.

o Also, the algorithm can be applied recursively.

o The first slave could allocate a processor to investigate paths starting with

London-New York-Sydney, another processor to investigate London-New

York-Nairobi, and so forth.

o Results show that about 75 percent of the theoretical maximum speedup can

be achieved using this algorithm, the remaining 1/4 being lost to

communication and other overhead
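
The way the coordinator divides the search can be sketched as follows. This is an illustrative model, not the Amoeba program: the assignments run one after another in a single process, whereas on a processor pool each would go to a different processor. The distance table uses made-up placeholder units for five cities (city 0 is the starting place), and the names `search`, `worker_search`, `coordinator`, and `EXAMPLE_DIST` are hypothetical. A "worker" here corresponds to what the text calls a slave.

```c
#include <limits.h>

enum { NCITIES = 5 };  /* city 0 is the starting place */

/* Placeholder distances in arbitrary units, not real geography. */
static const int EXAMPLE_DIST[NCITIES][NCITIES] = {
    {0, 1, 2, 2, 1},
    {1, 0, 1, 2, 2},
    {2, 1, 0, 1, 2},
    {2, 2, 1, 0, 1},
    {1, 2, 2, 1, 0},
};

/* Exhaustively extends a partial tour and returns the length of the shortest
   complete round trip that begins with it. `visited` is a bitmask. */
static int search(const int dist[NCITIES][NCITIES], int current,
                  unsigned visited, int length_so_far) {
    if (visited == (1u << NCITIES) - 1)
        return length_so_far + dist[current][0];  /* return to the start */
    int best = INT_MAX;
    for (int next = 1; next < NCITIES; next++) {
        if (visited & (1u << next)) continue;
        int total = search(dist, next, visited | (1u << next),
                           length_so_far + dist[current][next]);
        if (total < best) best = total;
    }
    return best;
}

/* One work assignment: all tours that start with city 0 -> first_city. */
static int worker_search(const int dist[NCITIES][NCITIES], int first_city) {
    return search(dist, first_city, 1u | (1u << first_city), dist[0][first_city]);
}

/* Coordinator: hands out one assignment per possible first city and keeps
   the shortest result reported back. */
static int coordinator(const int dist[NCITIES][NCITIES]) {
    int best = INT_MAX;
    for (int first = 1; first < NCITIES; first++) {
        int result = worker_search(dist, first);
        if (result < best) best = result;
    }
    return best;
}
```

Each call to `worker_search` is independent of the others, which is what lets the assignments proceed in parallel; applying the same split inside `worker_search` gives the recursive variant mentioned above.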

Significance of points

Amoeba is a distributed operating system which successfully allows users to execute

jobs transparently over multiple CPUs.

It was primarily developed by Andrew Tanenbaum and others at the Vrije Universiteit Amsterdam, the Netherlands.

Its basic design goals are –

Distribution—Connecting together many machines

Parallelism—Allowing individual jobs to use multiple CPUs easily

Transparency—Having the collection of computers act like a single system

Performance—Achieving all of the above in an efficient manner

It is based on a microkernel architecture.

It uses objects to encapsulate data and processes and capabilities to describe the

objects.

The kernel provides just three major system calls

trans(Msg *requestHeader, char *requestBuffer, int requestSize, Msg *replyHeader, char *replyBuffer, int replySize)

get_request(Msg *requestHeader, char *requestBuffer, int requestSize)

put_reply(Msg *replyHeader, char *replyBuffer, int replySize)

It has been shown to speed up several applications, including parallel make, traveling salesman, and alpha-beta search, and it also supports UNIX emulation.

Summary

The Amoeba distributed operating system succeeds in overcoming many of the hurdles faced in distributed computing.

It abstracts away the use of RPCs using stubs and is scalable based on available

CPUs.

Although system updates seem to have stopped, the current version appears to

have reached a stable point in its architectural development.

The programming languages included with the distribution are common to most

programmers and should make code creation easy for Amoeba applications.

Results of application speedup and the fact that the system is freely available

make it worth evaluating at the university level.

References

[1] Based on the article "Amoeba: An Overview of a Distributed Operating System" by Eric W. Lund, March 29, 1998, Rochester Institute of Technology, and on information about Amoeba taken from a web site of the University of Halle. Additional parts are taken from the Amoeba tribute site by Stephen Wagner. Furthermore, the classic: "The Amoeba kernel", Andrew S. Tanenbaum, M.F. Kaashoek.

[2] Tanenbaum, A.S, Sharp, G.J. “The Amoeba Distributed Operating System”

Online: 2006 http://www.cs.vu.nl/pub/amoeba/Intro.pdf

[3] Ramsay, M., Keigel, T., Memmer, H. “Ameoba Distributed Operating

System” Online http://csserver.evansville.edu/~mr56/CS470/Final_Draft.pdf

[4] Coulouris, G. Dollimore, J., Kindberg, T. Distributed Systems – Concepts

and Design, 1994, Online: http://www.cdk3.net/oss/Ed2/Amoeba.pdf

[5] Sharp, G.J.: ‘‘The Design of a Window System for Amoeba,’’ Report IR-

142, Dept. of Math. & Computer Science, Vrije Universiteit, Dec. 1987.

[6] The Amoeba Reference Manual Users Guide Vrije University of

Amsterdam, 1996 Online 2006:

http://www.cs.vu.nl/pub/amoeba/manuals/usr.pdf

[7] Bal, H.E., Renesse R. van, and Tanenbaum, A.S.: ‘‘Implementing

Distributed Algorithms Using Remote Procedure Calls,’’ Proc. 1987 National

Computer Conference, pp. 499-506, June 1987.

[8] Baalbergen, E.H.: ‘‘Parallel and Distributed Compilations in Loosely

Coupled systems,’’ Proc. Workshop on Large Grain Parallelism , Providence,

RI, Oct 1986.

[9] Staveren, H. van, Renesse, R. van, and Tanenbaum, A.S., “The Performance of the Amoeba Distributed Operating System” Online: 2006 https://dare.ubvu.vu.nl/bitstream/1871/2589/1/11008.pdf

[10] Tanenbaum, A.S., et al., “Experiences with the Amoeba Distributed Operating System” Online 2006: http://citeseer.ist.psu.edu/cache/papers/cs/6593/ftp:zSzzSzftp.sys.toronto.eduzSzpubzSzamoebazSz03.pdf/tanenbaum90experiences.pdf

[11] Wikipedia – www.wikipedia.com

[12] Slide finder – www.slidefinder.com

[13] Slide share – www.slideshare.com

[14] document share – www.docshare.com