Chapter 14: Distributed Process Management



  • Process Migration
    – Transferring enough process state to execute it on another machine
    – Motivation
      • Load sharing: can improve performance, but must watch out for communications overhead
      • Communications performance: move interacting processes to the same node, or move a process to where its data is
      • Availability: survive crashes or downtime
      • Utilizing special capabilities: may want or need capabilities available only on a particular node
    – Initiation
      • The O.S. may start a migration for load balancing
      • A process may request migration based on system state

  • Migrating a Process
    – Must destroy the process on the source and create it on the destination
    – The process control block must be updated
    – What to migrate?
      • Eager (all): move the entire address space
      • Precopy: start copying pages while the process runs on the source system; modified pages are copied a second time (see the sketch below)
      • Eager (dirty): transfer only the pages that are in main memory and modified; other pages are transferred on demand
      • Copy-on-reference: copy a page when it is used; very quick initial transfer
      • Flushing: clear pages to disk; pages are then referenced from disk as needed
    – Temporary or permanent transfer?
    – Transfer a thread or the whole process?
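
    The precopy strategy above can be pictured as a loop that keeps re-sending dirtied pages until the dirty set is small, then stops the process for a final copy. Below is a minimal Python sketch under that reading; send_page and dirty_pages are hypothetical stand-ins for the real address-space and transfer machinery, not an actual OS interface.

```python
# Hypothetical sketch of the precopy migration strategy: pages are copied
# while the process keeps running on the source, then pages dirtied during
# each round are re-sent until the remaining dirty set is small enough to
# stop the process and transfer the rest.

def precopy_migrate(address_space, send_page, dirty_pages, threshold=8, max_rounds=5):
    """address_space: iterable of page ids; send_page(pid): transfer one page;
    dirty_pages(): pages modified since the previous call (assumed helpers)."""
    # Round 0: copy the entire address space while the process still runs.
    for page in address_space:
        send_page(page)

    # Re-copy pages dirtied during the previous round, a bounded number of times.
    for _ in range(max_rounds):
        dirty = dirty_pages()
        if len(dirty) <= threshold:
            break
        for page in dirty:
            send_page(page)

    # Stop-and-copy phase: the caller freezes the process and sends what is left.
    return dirty_pages()


# Toy usage with in-memory stand-ins for the real mechanisms.
if __name__ == "__main__":
    pages = list(range(16))
    sent = []
    rounds = iter([[1, 2, 3, 4, 5, 6, 7, 8, 9], [2, 3], [3]])
    leftover = precopy_migrate(pages, sent.append, lambda: next(rounds))
    print("pages sent while running:", len(sent), "final dirty set:", leftover)
```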

  • Migrating a Process
    – Files
      • If exclusive to the process, transfer them with it
      • If shared, set up distributed access
    – Messages and signals
      • Need a way to temporarily store outstanding messages and signals
      • May keep forwarding information so future messages reach the destination
    – Migration scenario (see the sketch below)
      • The process sends a migration request, including part of its image and file information
      • The destination forks a child and passes it that information
      • The child brings over the other necessary information: data, environment, and stack
      • When migration is finished, the originating process is destroyed
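
    The migration scenario above amounts to a short handshake between the source image and a forked child on the destination. The following Python sketch mirrors those steps with plain dictionaries and invented names (DestinationNode, ChildProcess); it is illustrative only, not a real kernel interface.

```python
# Illustrative sketch of the migration scenario above; dictionaries and small
# classes stand in for the real process image and kernel support (assumed names).

class DestinationNode:
    def fork_child(self, partial_image):
        # Destination forks a child and hands it the partial image it received.
        return ChildProcess(partial_image)

class ChildProcess:
    def __init__(self, partial_image):
        self.image = dict(partial_image)

    def pull_remaining(self, source_image):
        # Child brings over the rest of the state: data, environment, stack.
        for part in ("data", "environment", "stack"):
            self.image[part] = source_image[part]

def migrate(source_image, destination):
    # 1. Source sends a migration request with part of its image and file info.
    request = {k: source_image[k] for k in ("code", "files")}
    child = destination.fork_child(request)   # 2-3. destination forks a child, gets the info
    child.pull_remaining(source_image)        # 4. child fetches data/environment/stack
    source_image.clear()                      # 5. originating process is destroyed
    return child

if __name__ == "__main__":
    src = {"code": "...", "files": ["/tmp/x"], "data": [1, 2], "environment": {}, "stack": []}
    new_proc = migrate(src, DestinationNode())
    print(sorted(new_proc.image), "source left:", src)
```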

  • Process Migration
    – Negotiating migration
      • The Starter module on S decides to migrate a process P and sends a message to node D
      • The Starter on D sends a positive reply
      • The kernel on S offers the process to D
      • D may reject it, or send the information on to its Starter
      • The Starter on D replies to the kernel on D
      • D reserves resources and accepts P
    – Eviction
      • A node can evict a process that migrated to it at some earlier time
      • Example: Sprite – each process has a home node; a monitor that sees high load initiates eviction, and the process migrates back to its home node; evicted processes are suspended and their entire address space is transferred to the home node
    – Preemptive/nonpreemptive: migrate a running process, or only at process start?

  • Distributed Global State
    – Mutual exclusion, deadlock, etc. are more complicated because there is no global state
    – There is a time lag between different systems
    – Process/event graph (Fig 14.3)
      • A horizontal line represents time for one process
      • A point corresponds to an event: an internal event, a message send, or a message receive
      • A boxed point represents a snapshot
    – Must deal with messages in transit
    – Clocks also may not be in sync

  • Distributed Global State
    – Snapshot: the state of a process, together with the messages sent and received since the last snapshot
    – Distributed snapshot: a collection of snapshots
      • Inconsistent if it records receiving a message but not sending it
    – Distributed snapshot algorithm (see the sketch below)
      • Assumes reliable, in-order messages
      • Initiated by sending a marker
      • When P receives a marker from Q: record the local state Sp, mark the state of the channel from Q to P as empty, propagate the marker on all outgoing channels, and record incoming messages on the other channels until a marker arrives on each of them
      • Combine the snapshots to obtain the global state
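
    The marker rule above is essentially the Chandy-Lamport snapshot algorithm. Below is a small Python simulation of it under the stated assumptions (reliable, in-order messages); channels are modeled as FIFO queues in a single address space rather than a real network, and the message content is just an integer added to local state.

```python
from collections import deque

MARKER = "MARKER"
channels = {}                      # (src, dst) -> deque of in-flight messages

def send(src, dst, msg):
    channels.setdefault((src, dst), deque()).append(msg)

class Process:
    def __init__(self, name, state, peers):
        self.name, self.state, self.peers = name, state, list(peers)
        self.snapshot = None       # recorded local state
        self.channel_state = {}    # incoming channel -> messages recorded for it
        self.recording = set()     # incoming channels still being recorded

    def start_snapshot(self):
        # Initiator: record local state, record on all incoming channels,
        # and send a marker on every outgoing channel.
        self.snapshot = self.state
        self.recording = set(self.peers)
        for p in self.peers:
            send(self.name, p, MARKER)

    def receive(self, sender, msg):
        if msg == MARKER:
            if self.snapshot is None:
                # First marker: record state, treat the channel it came on as
                # empty, record the other channels, and propagate the marker.
                self.snapshot = self.state
                self.channel_state[sender] = []
                self.recording = {p for p in self.peers if p != sender}
                for p in self.peers:
                    send(self.name, p, MARKER)
            else:
                self.recording.discard(sender)      # stop recording this channel
        else:
            self.state += msg                       # apply an ordinary message
            if sender in self.recording:
                self.channel_state.setdefault(sender, []).append(msg)

def deliver_all(procs):
    # Drain every channel in FIFO order until nothing is left in flight.
    progress = True
    while progress:
        progress = False
        for (src, dst), q in list(channels.items()):
            if q:
                procs[dst].receive(src, q.popleft())
                progress = True

if __name__ == "__main__":
    procs = {"P": Process("P", 10, ["Q"]), "Q": Process("Q", 20, ["P"])}
    send("Q", "P", 7)              # a message already in flight toward P
    procs["P"].start_snapshot()    # snapshot begins before that message lands
    deliver_all(procs)
    for p in procs.values():
        print(p.name, "state:", p.snapshot, "channels:", p.channel_state)
```

    In the toy run, the recorded global state (P = 10, Q = 20, one message worth 7 recorded on the channel into P) is consistent: it accounts for the same total as the states after all deliveries complete.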

  • Distributed Mutual Exclusion
    – Must maintain the normal mutual exclusion rules (Chapters 5 & 6)
    – Centralized: managed by one node
      • Fails if that node fails, and that node is a bottleneck
    – Distributed: all nodes participate
      • Each node has only partial information
      • A node failure doesn't halt the entire system
      • Can't rely on a system-wide clock
    – Lamport's clock algorithm (see the sketch below)
      • Each process keeps an event counter
      • When sending a message, increment the counter and include it with the message
      • When a message arrives, set the counter to 1 + max(counter, message timestamp)
      • Order events by counter, then by process ID
      • May not match real time; consistency is usually sufficient
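
    A minimal sketch of Lamport's counter rule as stated above, assuming integer timestamps with ties broken by process ID; message transport is left out.

```python
# Minimal sketch of Lamport's logical clock rule described above.

class LamportClock:
    def __init__(self, process_id):
        self.process_id = process_id
        self.counter = 0

    def local_event(self):
        self.counter += 1
        return self.counter

    def send(self):
        # Increment the counter and stamp the outgoing message with it.
        self.counter += 1
        return (self.counter, self.process_id)

    def receive(self, message_timestamp):
        # On arrival, set the counter to 1 + max(counter, message timestamp).
        self.counter = 1 + max(self.counter, message_timestamp)
        return self.counter

def ordered_before(a, b):
    # Total order over (counter, process_id) stamps: counter first, then ID.
    return a < b

if __name__ == "__main__":
    p1, p2 = LamportClock(1), LamportClock(2)
    ts = p1.send()                    # P1 sends a message stamped (1, 1)
    p2.receive(ts[0])                 # P2's counter jumps to 2
    print(ts, p2.counter, ordered_before(ts, (p2.counter, 2)))
```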

  • Distributed Queue
    – Assume N nodes, reliable ordered messages, and a fully connected network
    – Assume we are managing one resource
    – Messages: Request, Reply, Release
    – Each site keeps an array with the most recent message received from each node
      • Initialize q[j] = (Release, 0, j)
    – To get the resource, Pi issues (Request, Ti, i)
    – On receiving a message from Pj, save it in q[j]
      • If the message is (Request, Tj, j) and there is no local request outstanding, send (Reply, Ti, i)
    – Pi may enter the critical section when:
      • Pi's request is the earliest in its array
      • All other messages have later timestamps
    – When done, Pi issues (Release, Ti, i)
    – Timestamps and replies guarantee proper mutual exclusion, no deadlock, etc. (see the sketch below)
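
    A sketch of the per-site bookkeeping for this distributed queue, assuming the broadcasting of Request/Reply/Release messages is handled elsewhere; only the array update and the entry test from the list above are shown.

```python
# Per-node bookkeeping for the distributed-queue algorithm above; the network
# layer (broadcasting Request/Reply/Release) is assumed and not shown.

RELEASE, REQUEST, REPLY = "Release", "Request", "Reply"

class Node:
    def __init__(self, my_id, n_nodes):
        self.i = my_id
        # q[j] holds the most recent message seen from node j.
        self.q = [(RELEASE, 0, j) for j in range(n_nodes)]

    def record(self, msg):
        # Save the most recent message from its sender in the array.
        kind, ts, sender = msg
        self.q[sender] = msg

    def request(self, ts):
        # Issue (Request, Ti, i); the caller broadcasts it and we record it locally.
        msg = (REQUEST, ts, self.i)
        self.record(msg)
        return msg

    def can_enter(self):
        # Enter the critical section only if our own Request is the earliest:
        # every other node's latest message must carry a later (timestamp, id).
        kind, my_ts, _ = self.q[self.i]
        if kind != REQUEST:
            return False
        others = [self.q[j] for j in range(len(self.q)) if j != self.i]
        return all((ts, j) > (my_ts, self.i) for _, ts, j in others)

if __name__ == "__main__":
    n0, n1 = Node(0, 2), Node(1, 2)
    req = n0.request(ts=1)            # node 0 asks for the resource at time 1
    n1.record(req)                    # node 1 saves the request ...
    n0.record((REPLY, 2, 1))          # ... and replies with a later timestamp
    print("node 0 may enter:", n0.can_enter())
```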

  • Distributed Mutual Exclusion
    – Revised queue algorithm (see the sketch below)
      • Eliminates Release messages and does not require messages to be delivered in order
      • To get the resource, Pi issues (Request, Ti, i)
      • When Pj receives (Request, Ti, i):
        – If Pj is in its critical section, it defers sending a Reply
        – If Pj is not waiting, it sends (Reply, Tj, j)
        – If Pj is waiting and Pi's request is later than Pj's own request, it records the request and defers the Reply
        – If Pj is waiting and Pi's request is earlier than Pj's own request, it records the request and sends a Reply
      • Pi may enter the critical section when it has received (Reply, Tj, j) from all other processes
      • When leaving the critical section, Pi issues (Reply, Ti, i) to each pending request
    – Token-passing approach
      • Broadcast a request when you want the resource
      • A process can enter the critical section only if it holds the token
      • The token includes the last time each process held it
      • When leaving the critical section, pass the token to the next process whose request time is later than the time recorded for it in the token
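
    A sketch of the receive-side decisions in the revised algorithm (it follows the Ricart-Agrawala pattern); timestamps are plain integers and the actual message transport is assumed, not implemented.

```python
# Receive-side rule of the revised queue algorithm above (Ricart-Agrawala
# style); message transport and the Reply delivery path are assumed.

class RAProcess:
    def __init__(self, pid):
        self.pid = pid
        self.in_cs = False           # currently inside the critical section
        self.my_request = None       # (timestamp, pid) of our pending request
        self.deferred = []           # requesters whose Reply we are holding back
        self.replies_needed = set()  # peers we still need a Reply from

    def request(self, ts, peers):
        self.my_request = (ts, self.pid)
        self.replies_needed = set(peers)
        return ("Request", ts, self.pid)      # caller broadcasts this

    def on_request(self, ts, sender):
        """Return True if we should send a Reply now, False if we defer it."""
        if self.in_cs:
            self.deferred.append(sender)      # in the CS: defer the Reply
            return False
        if self.my_request is None:
            return True                       # not waiting: reply immediately
        if (ts, sender) > self.my_request:
            self.deferred.append(sender)      # their request is later: defer
            return False
        return True                           # their request is earlier: reply

    def on_reply(self, sender):
        self.replies_needed.discard(sender)
        return not self.replies_needed        # all Replies in: may enter the CS

    def release(self):
        # Leaving the CS: send the Replies we deferred while holding it.
        self.in_cs, pending = False, self.deferred
        self.deferred = []
        return pending

if __name__ == "__main__":
    p1, p2 = RAProcess(1), RAProcess(2)
    p1.request(ts=5, peers=[2])
    p2.request(ts=7, peers=[1])
    print(p2.on_request(5, 1))   # True: p1's request is earlier, p2 replies
    print(p1.on_request(7, 2))   # False: p2's request is later, p1 defers
    print(p1.on_reply(2))        # True: p1 has all Replies and may enter the CS
```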

  • Distributed Deadlock
    – Phantom deadlock: a cycle may appear if requests arrive before the release message that breaks the cycle
    – Prevention
      • Impose a linear order on resources
      • Deny hold-and-wait by requiring all resources to be acquired at once
    – Wait-die method (based on timestamps)
      • If the resource is held by a younger process, wait
      • If the resource is held by an older process, die and restart with the same timestamp
      • The process ages, so it will eventually succeed
    – Wound-wait method
      • Immediately kill (wound) younger processes that hold requested resources (both timestamp rules are sketched below)
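
    The two timestamp rules above each reduce to a single comparison. The following sketch encodes them directly, with smaller timestamps meaning older processes; the return strings are just labels for the action taken.

```python
# Sketch of the timestamp-based wait-die and wound-wait decisions above;
# each process keeps the timestamp it was born with (kept across restarts).

def wait_die(requester_ts, holder_ts):
    """Older (smaller timestamp) requester waits; younger requester dies."""
    return "wait" if requester_ts < holder_ts else "die and restart (same timestamp)"

def wound_wait(requester_ts, holder_ts):
    """Older requester wounds (kills) the younger holder; younger requester waits."""
    return "wound holder (holder restarts)" if requester_ts < holder_ts else "wait"

if __name__ == "__main__":
    old, young = 3, 9                       # smaller timestamp = older process
    print("wait-die, old asks young:  ", wait_die(old, young))    # wait
    print("wait-die, young asks old:  ", wait_die(young, old))    # die
    print("wound-wait, old asks young:", wound_wait(old, young))  # wound
    print("wound-wait, young asks old:", wound_wait(young, old))  # wait
```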

  • Distributed Deadlock
    – Avoidance
      • Requires global state
      • Decisions must be made in a critical section
    – Detection
      • Centralized control: one site handles detection; simple, but subject to failure of the central node
      • Hierarchical control: sites are organized as a tree; each node collects information from its children, and deadlocks are detected at a common ancestor of the sites involved
      • Distributed control: all processes participate in detection; may involve considerable overhead

  • Distributed Detection
    – Assume at most one outstanding request per transaction
    – Each object Di has an identifier and a variable Locked_By(Di) recording who holds it
    – Each transaction Tj keeps:
      • Held_By(Tj): the transaction holding the resource wanted by Tj (who I am waiting for)
      • Wait_For(Tj): the head of the list of blocked transactions (who I am ultimately waiting for)
      • Request_Q(Tj): all outstanding requests for objects held by Tj (who is waiting for me)
    – When a transaction requests an object and the request is not granted, it:
      • Sets Held_By() to the holder
      • Adds itself to the Request_Q of the holding transaction
      • Sets Wait_For() to the holder if the holder is not blocked, otherwise to Wait_For(holder)
      • Sends an update message to the transactions in its Request_Q
    – When an update is received, check whether the transaction in Wait_For() is in Request_Q()
      • If it is there, we have a deadlock
      • Otherwise, forward the update message to our Request_Q (see the sketch below)
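
    A sketch of this bookkeeping in Python, with everything in one address space and the update messages folded into direct recursive calls rather than real messages; the driver at the bottom replays the example on the next slide and reports the deadlock on the final request.

```python
# Sketch of the deadlock-detection bookkeeping described above. In the real
# algorithm each transaction holds its own variables and the updates travel
# as messages between sites; here they are ordinary method calls.

class Transaction:
    def __init__(self, name):
        self.name = name
        self.wait_for = None      # who I am ultimately waiting for
        self.held_by = None       # who holds the resource I want
        self.request_q = []       # who is waiting for me

class Detector:
    def __init__(self):
        self.locked_by = {}       # resource -> transaction holding it

    def request(self, t, resource):
        """Return True if this request closes a deadlock cycle."""
        holder = self.locked_by.get(resource)
        if holder is None:
            self.locked_by[resource] = t          # granted immediately
            return False
        # Not granted: record the edge and propagate the wait-for information.
        t.held_by = holder
        holder.request_q.append(t)
        t.wait_for = holder if holder.wait_for is None else holder.wait_for
        return self.update(t, t.wait_for)

    def update(self, t, ultimate):
        # Deadlock if the transaction we are ultimately waiting for is also
        # waiting for us (it appears in our Request_Q); otherwise pass it on.
        if ultimate in t.request_q:
            return True
        t.wait_for = ultimate
        return any(self.update(waiter, ultimate) for waiter in t.request_q)

if __name__ == "__main__":
    t1, t2, t3 = Transaction("T1"), Transaction("T2"), Transaction("T3")
    d = Detector()
    d.request(t1, "R1"); d.request(t2, "R2")      # both granted
    print(d.request(t3, "R2"))                    # T3 waits for T2 -> False
    print(d.request(t2, "R1"))                    # T2 waits for T1 -> False
    print(d.request(t1, "R2"))                    # closes the cycle -> True
```

    After the final request, Wait_For(T2) = T1 while T1 sits in Request_Q(T2), which is exactly the condition the tables on the next slide show.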

  • Detection Example
    – T1 requests R1 (request is granted)
    – T2 requests R2 (request is granted)
    – T3 requests R2 (currently held by T2)
    – T2 requests R1 (currently held by T1)
    – T1 requests R2 (currently held by T2)
    – Wait_For(T2) = T1 is in Request_Q(T2), so we have a deadlock

    After T1 and T2 are granted their requests:

    Trans  Wait_For  Held_By  Request_Q
    T1     Nil       Nil      Nil
    T2     Nil       Nil      Nil
    T3     Nil       Nil      Nil

    After T3 requests R2 (held by T2):

    Trans  Wait_For  Held_By  Request_Q
    T1     Nil       Nil      Nil
    T2     Nil       Nil      T3
    T3     T2        T2       Nil

    After T2 requests R1 (held by T1):

    Trans  Wait_For  Held_By  Request_Q
    T1     Nil       Nil      T2
    T2     T1        T1       T3
    T3     T1        T2       Nil

    After T1 requests R2 (held by T2):

    Trans  Wait_For  Held_By  Request_Q
    T1     T1        T2       T2
    T2     T1        T1       T3, T1
    T3     T1        T2       Nil

  • Messages and Deadlock
    – Look at which process each process is expecting a message from
      • A process may be expecting messages from any of two or more processes
    – Deadlock occurs if no messages are in transit and all potential senders are themselves blocked (Fig 14.16)
    – Deadlock can also occur because message buffers are unavailable
      • Can be prevented with a hierarchical set of buffers: a packet must have traveled k hops to use a level-k buffer (see the sketch below)
      • A process may be suspended if it tries to send a message and no buffers are available
      • Two processes can deadlock if each sends significant amounts of data before receiving
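
    The hierarchical buffer rule above can be sketched as one buffer pool per level, where a packet's hop count determines which levels it may use; the pool sizes in the example are made up for illustration.

```python
# Tiny sketch of the hierarchical (structured) buffer-pool rule above: a packet
# may only occupy a buffer at level k if it has already traveled at least k
# hops, which keeps buffer demands from forming a cycle.

class BufferPools:
    def __init__(self, free_per_level):
        self.free = list(free_per_level)      # free buffers at levels 0..k

    def acquire(self, hops_traveled):
        # Look for a free buffer whose level does not exceed the hop count,
        # preferring the highest eligible level.
        for level in range(min(hops_traveled, len(self.free) - 1), -1, -1):
            if self.free[level] > 0:
                self.free[level] -= 1
                return level
        return None                           # no eligible buffer: sender must wait

if __name__ == "__main__":
    pools = BufferPools([0, 2, 1])            # level 0 exhausted, levels 1-2 free
    print(pools.acquire(hops_traveled=0))     # None: a new packet may only use level 0
    print(pools.acquire(hops_traveled=2))     # 2: a well-traveled packet gets level 2
```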