Parallel Processing with Windows NT Networks
Partha Dasgupta, Arizona State University
The MILAN Project: New York University and Arizona State University
Funding Sources: DARPA/Rome Laboratory, NSF, Intel, and Microsoft
1
http://www.eas.asu.edu/~calypso
2
Partha Dasgupta
Arizona State University
The MILAN Project
New York University, Arizona State University
Funding Sources: DARPA/Rome Laboratory, NSF, Intel, and Microsoft
Parallel Processing with Windows NT Networks
Collaborators: Zvi M. Kedem, Donald McLaughlin, Shantanu Sardesai, Rahul Thombre
3
Chime
Malaxis+
ECLIPSE
CalypsoLinux 1.0
Joint Research of Arizona State University and New York University
4
The Platforms
Calypso
  Language-independent parallel processing
  Shared memory and fault tolerance
Chime
  CC++ based parallel processing
  Shared memory, fault tolerance
Malaxis
  DSM package for Windows NT
  Read/write locking, barriers
Milan
  A metacomputing platform
  Coalesces features from the above systems into a general-purpose computing platform
5
Unix to Windows NT
Port a system program or middleware from Unix to Windows NT.
How? Just change the system calls? Does not work.
Change programming and design styles to be NT-centric:
  no signals in NT
  use structured exception handling (no such thing in Unix)
  use threads (useful)
  integrate with Windows messages or MFC
  remote execution support is weak
Learn NT-centrism, and NT lingo
6
NT Terminology
MSDN is not a network
Developer’s library contains books
Resource Kit is not about resources
Huh?
  SDK, DDK, checked build
  Service Pack
  OSR2
Remote access does not let you execute anything remotely
Use a Share?
  You mean remote mount?
  No, I mean map network drive
Memory can be reserved or committed or both.
Synchronization primitives - never mind...
7
What is Calypso?
Yet another parallel processing system, which runs on a distributed network of microcomputers:
  Shared memory
  Novel execution and memory management strategy
Fault tolerant:
  Machines may stop and start dynamically without affecting the execution
Automatic load balancing:
  Manages slow and fast machines
  Provides near-optimal thread assignments (measured)
Execution strategy hidden from the programmer:
  No message passing, process management, or data partitioning
Low-overhead mechanisms
8
Key Techniques in Calypso
Eager Scheduling
  Manager-worker architecture
  Provides fault-tolerant and load-shared executions with minimal overhead
Two-phase Idempotent Execution Strategy
  Distributed memory management strategy
  Stops side effects due to failures
  Ensures idempotence of results, in spite of duplicate executions
These techniques were developed in previous joint theoretical research
[Figure: one manager connected to several workers]
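A minimal sketch of how a two-phase strategy can keep duplicate or failed executions harmless. The slide does not spell out the mechanism, so this assumes each thread reads from a frozen snapshot of shared memory, buffers its writes, and the manager applies only the first result it receives per thread. All names (`run_thread`, `Manager`, `commit`) are hypothetical.

```python
def run_thread(thread_id, snapshot):
    """Phase 1: compute against a read-only snapshot and buffer the writes."""
    # Hypothetical workload: each thread writes one value derived from x.
    return {f"out{thread_id}": snapshot["x"] * thread_id}


class Manager:
    def __init__(self, memory):
        self.memory = dict(memory)
        self.committed = set()   # thread ids whose results were already applied

    def commit(self, thread_id, updates):
        """Phase 2: apply a thread's buffered writes at most once."""
        if thread_id in self.committed:
            return False         # duplicate execution: result is discarded
        self.memory.update(updates)
        self.committed.add(thread_id)
        return True


manager = Manager({"x": 10})
snapshot = dict(manager.memory)  # frozen view for this parallel step

first = manager.commit(2, run_thread(2, snapshot))
duplicate = manager.commit(2, run_thread(2, snapshot))

# A worker that crashes before reporting leaves no trace in shared memory.
lost_result = run_thread(3, snapshot)
```

Because threads never write to shared memory directly, a thread can be handed to a second worker at any time without changing the final result.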
9
Eager Scheduling
Workers contact the manager for work after finishing their previous assignment, if any
When there is unfinished work, the manager has the option of assigning an unfinished thread to a “willing” worker, regardless of who is already working on that thread
An example of Round Robin Eager Scheduling:
  3 machines: fast, slow and transient
  12 threads of equal length (50 secs)
[Figure: timeline (time axis from 50 to 400) of the threads executed on machines A, B and C, with one worker interrupted and one worker crashed]
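The scheduling rule above can be sketched as a small event-driven simulation. This is a toy model, not the Calypso implementation: the function name, the speed values, and the crash time are assumptions, and a "least often assigned first" policy stands in for the round-robin order of the example.

```python
import heapq


def eager_schedule(num_threads, work, speeds, crash_at=None):
    """Simulate eager scheduling.

    Returns (makespan, done) where done maps each finished thread to the
    worker whose result was accepted first.
    """
    crash_at = crash_at or {}
    done = {}
    assigned = [0] * num_threads   # how often each thread was handed out
    events = []                    # heap of (finish_time, worker, thread)

    def assign(worker, now):
        pending = [t for t in range(num_threads) if t not in done]
        if not pending:
            return
        # Hand out the unfinished thread that was assigned least often,
        # even if another worker is already executing it.
        thread = min(pending, key=lambda t: (assigned[t], t))
        assigned[thread] += 1
        heapq.heappush(events, (now + work / speeds[worker], worker, thread))

    for worker in sorted(speeds):
        assign(worker, 0.0)

    makespan = 0.0
    while events:
        now, worker, thread = heapq.heappop(events)
        if now > crash_at.get(worker, float("inf")):
            continue               # worker crashed: its result never arrives
        if thread not in done:
            done[thread] = worker
            makespan = now
        assign(worker, now)        # worker immediately asks for more work
    return makespan, done


# Fast machine A, slow machine B, and machine C that crashes at t = 120.
makespan, done = eager_schedule(
    12, 50, {"A": 1.0, "B": 0.5, "C": 1.0}, crash_at={"C": 120}
)
```

The thread that C was executing when it crashed is simply handed out again later, so the computation completes without any explicit failure detection.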
10
Chime is a programming system and runtime environment for parallel processing
The first system to incorporate standard parallel language support on a network of workstations:
  Nested parallelism, parallel statements
  Language-defined scoping of variables
  Synchronization support
  Transparent shared memory
Chime supports the “shared memory” constructs of CC++
  Adds fault tolerance…
  Adds load balancing…
  … with low overhead
Chime
A “distributed” cactus stack
11
Chime Software Architecture
[Figure: the manager and one out of many workers. Labels on each side: Application, Runtime System, Application Thread, Controlling Thread, Application Request Protocol]
12
Chime Execution Trace
[Figure: numbered message sequence (steps 1 to 13) between the manager and one out of “many” workers, each with a controlling thread and an application thread. Step labels: Initialize; Start Application; Start Cntrl. Thread; Suspend; Parallel Exec. Request; Send Parallel Task; Start Task; Page Fault; Request Page from Manager; Send Page; Install Page; Compute; Done / Send Dirty Page Diffs; Sq. Step; Resume & Done]
13
Malaxis
A DSM package
  Uses NT threads and memory mapping and protection features
  Uses barrier synchronization, memory XOR-ing and intelligent monitoring of page/lock requests to prevent page shuttling
Programmer support:
  Spawning processes on remote machines
  Mapping shared segments
  Barrier synchronization
  Read and write locks (abstract, advisory)
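The slide does not say how Malaxis uses memory XOR-ing. One plausible reading, shown here purely as an assumption, is that a modified page is XOR-ed against its pristine copy to obtain a diff, so that changes made by different writers to different bytes of the same page can be merged at a barrier. The function names are hypothetical.

```python
def xor_diff(old: bytes, new: bytes) -> bytes:
    """Diff of two equal-sized pages: zero wherever the bytes are unchanged."""
    assert len(old) == len(new)
    return bytes(a ^ b for a, b in zip(old, new))


def apply_diff(page: bytes, diff: bytes) -> bytes:
    """XOR a diff into a page, flipping only the bytes the writer changed."""
    assert len(page) == len(diff)
    return bytes(a ^ b for a, b in zip(page, diff))


base = bytes(8)                                # pristine copy of the page
writer1 = bytes([1, 0, 0, 0, 0, 0, 0, 0])      # writer 1 changed byte 0
writer2 = bytes([0, 0, 0, 0, 0, 0, 0, 7])      # writer 2 changed byte 7

merged = apply_diff(
    apply_diff(base, xor_diff(base, writer1)),
    xor_diff(base, writer2),
)
```

The merge is only meaningful when the writers touched disjoint bytes, which is the usual assumption for programs that synchronize with barriers and locks.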
14
Milan
A metacomputing platform
  Creates a system image of a large computer on a set of workstations
Smart scheduling
  bunching
  job recall
  pre-emption
Shared memory
Fault tolerant
15
Using Windows NT
The needs of our implementations:
  User-level page fault handling
  Getting and setting thread contexts
  Getting and setting stack contents
  Asynchronous notification and exception handling
  Networking support
  Process/thread control
Windows NT provides all of the above
16
Memory Handling
Windows NT memory handling is elegant and powerful (After you understand the terminology)
States of memory:
  committed
  reserved
  guarded
Protection and allocation is done by:
  VirtualAlloc
  VirtualProtect
Access violations generate exceptions
  Needed reprogramming Calypso - for the better
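The control flow of user-level page fault handling can be mimicked in a few lines. This is a toy model in Python, not Win32 code: on NT the pages would be protected with VirtualAlloc/VirtualProtect and the access violation would arrive as a structured exception, whereas here an ordinary Python exception plays that role. The class and attribute names are hypothetical.

```python
class AccessViolation(Exception):
    """Stand-in for the exception raised on access to a protected page."""

    def __init__(self, page):
        super().__init__(f"page {page} is not accessible")
        self.page = page


class WorkerMemory:
    """Toy model of a worker's address space: every page starts inaccessible."""

    def __init__(self, fetch_page):
        self.pages = {}
        self.fetch_page = fetch_page   # stand-in for a request to the manager
        self.faults = 0

    def _raw_read(self, page, offset):
        if page not in self.pages:
            raise AccessViolation(page)
        return self.pages[page][offset]

    def read(self, page, offset):
        try:
            return self._raw_read(page, offset)
        except AccessViolation as fault:
            # Handler: fetch the page, install it, then retry the access.
            self.faults += 1
            self.pages[fault.page] = bytearray(self.fetch_page(fault.page))
            return self._raw_read(page, offset)


manager_pages = {0: b"calypso!", 1: b"chime..."}
mem = WorkerMemory(manager_pages.__getitem__)
first = mem.read(1, 0)    # faults, fetches page 1, then succeeds
second = mem.read(1, 1)   # page already installed: no fault
```

Only pages that are actually touched are ever transferred, which is what makes the shared memory transparent to the application.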
17
Exception Handling
All exceptions are delivered to an exception handler, defined in the current scope of execution.
Great for programmers - nice and structured
Not good for middleware solutions…
How can I execute another person’s code with my exception handlers?
I cannot change the exception handler, from within my exception handler.
In our case, we found reasonable workarounds - but don’t have general solutions to the above problems.
18
Threads
Good, consistent, kernel threads.
  Easy to use
  Works great
  Plethora of synchronization constructs (too many, in fact)
Threads are useful for:
  Threads inside middleware - wow!
  Handling distributed shared memory (callbacks, caching, memory service)
  Process migration - a thread can set up the main process
  Segregating functionality (assign a thread per job)
19
Process and Stack Migration
Migration is used by our system for several purposes:
  Cactus stacks
  Checkpointing
  Pre-emptive scheduling (produces better turnaround times in dynamic environments)
When a thread has to be migrated:
  Another thread suspends it and gets its context
  The context is a checkpoint
  The context is sent to the target machine
  A thread sets the context of a suspended thread with the new context and resumes it
  Stack has to be reset too
IT WORKS
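The migration steps can be illustrated with a toy model in which a "thread" is a tiny stack machine and its context is a program counter plus stack contents. This is only an analogy: on NT the steps would presumably map to SuspendThread, GetThreadContext, SetThreadContext and ResumeThread, none of which are used here. The program format and function name are hypothetical.

```python
import json


def run(ctx, program, steps):
    """Execute up to `steps` instructions, updating the context in place."""
    for _ in range(steps):
        if ctx["pc"] >= len(program):
            break
        op, arg = program[ctx["pc"]]
        if op == "push":
            ctx["stack"].append(arg)
        elif op == "add":
            ctx["stack"].append(ctx["stack"].pop() + ctx["stack"].pop())
        ctx["pc"] += 1
    return ctx


program = [("push", 2), ("push", 3), ("add", None), ("push", 10), ("add", None)]

# Source machine: run partway, then suspend the thread and capture its context.
source_ctx = run({"pc": 0, "stack": []}, program, steps=2)
wire_bytes = json.dumps(source_ctx).encode()   # the context is the checkpoint

# Target machine: install the context into a suspended thread and resume it.
target_ctx = json.loads(wire_bytes.decode())
run(target_ctx, program, steps=len(program))
result = target_ctx["stack"][-1]
```

The same serialized context serves as a checkpoint: resuming from it twice gives the same result, which is why migration and checkpointing share one mechanism.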
20
Other Features
Networking
  Winsock is like sockets, no surprises
Remote execution
  Our approach: use a daemon process
  NT approach: use a starter service
Execution Monitor (GUI)
  External process that controls and displays the state of the distributed computation
21
Performance
Program: Ray Trace, generates a nice picture
Equipment:
  Pentium-90, running Windows NT (Calypso tests)
  Pentium Pro 200, running Windows NT (Chime tests)
Tests conducted
  Speedup
  Speedup in case of mixed speed machines
  Speedup in case of crashing and recovering machines
  Micro-tests (migration, stack creation)
  – Not all tests will be shown now.
22
Calypso Performance
[Figure: execution time (axis 0 to 1,200) and speedup (axis 0 to 6) per machine configuration]
Time:     1037   1042   548   362   290   230
Speedup:  1.0    1.0    1.9   2.9   3.6   4.5
Performance is comparable to Unix systems
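The speedup values in the chart are consistent with dividing the first time by each of the others. A quick check, assuming the first value is the sequential baseline (by analogy with the labels on the Chime chart):

```python
# Execution times as read from the Calypso chart, in plotted order.
times = [1037, 1042, 548, 362, 290, 230]

baseline = times[0]   # assumed to be the sequential run
speedups = [round(baseline / t, 1) for t in times]
```

The second entry (1042 against 1037) suggests that running under Calypso on a single machine costs well under 1% compared with the baseline.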
23
Chime Performance
[Figure: execution time (axis 0 to 700) and speedup (axis 0 to 6) per machine configuration]
Machine:  Sequential   1-P90   2-P90   3-P90   4-P90   5-P90
Time:     584          639     329     174     148     240
Speedup:  1.0          0.9     1.8     3.4     3.9     2.4
Chime has higher network overhead than Calypso
24
In Retrospect
NT has some strong points, things that are better than Unix
  Threads
  Exception handling
  Memory management
  Program development tools
  – (very good, especially the debugger)
  Documentation
A few shortcomings
  no signals
  no remote execution facility
  terrible terminology
25
Status
Operational prototype systems
  Calypso on Windows NT / Windows 95 released
  A prototype of Chime implementing most of the “parallel part” of Compositional C++ on an unreliable network of workstations
Ongoing research
  Distributed scheduling and resource management (for MILAN)
  Quality of service
  Better integration with NT (MFC support, remote services, global scheduling…)
26
Acknowledgements
Co-PI Zvi M. Kedem
Calypso Arash Baratloo, Mehmet Karaul
Calypso NT Donald McLaughlin and Shantanu Sardesai
Chime Shantanu Sardesai
Calypso Linux Arash Baratloo