Introduction, CS-4513 D-term 2008
Introduction
CS-4513 Distributed Computing Systems
(Slides include materials from Operating System Concepts, 7th ed., by Silberschatz, Galvin, & Gagne; Distributed Systems: Principles & Paradigms, 2nd ed., by Tanenbaum and Van Steen; and Modern Operating Systems, 2nd ed., by Tanenbaum)
Outline for Today
• Introduction to CS-4513
• What is “Distributed Computing”?
  – An example of a distributed computation
• Remote Procedure Call
• Assignment of Project #1
CS-4513, Distributed Systems
• Continuation of CS-3013, Operating Systems
  – File Systems
    • No coverage in A- or C-Term CS-3013 (2007-2008)
• Distributed System Topics
  – Remote Procedure Call
  – Naming
  – Security and Encryption
  – Atomic Transactions
  – …
Four Principal Abstractions Implemented by almost all Operating Systems
• Processes and Threads
  • Abstracts notion of “processor”
  • Concurrency and synchronization
• Virtual Memory
  • Address space in which a process “thinks”
  • Physical memory is cache of virtual memory
• Files
  • Named, persistent storage of information
• Sockets and connections
  • Conversations among processes/threads across a network
Four Principal Abstractions Implemented by almost all Operating Systems (where each is covered)
• Processes and Threads: OS course
• Virtual Memory: OS course
• Files: this course
• Sockets and connections: CS-4514
Textbook and Web
• Textbook:
  – Distributed Systems: Principles and Paradigms, Tanenbaum and Van Steen, Prentice-Hall, 2007
• Supplemental:
  – You should own or have access to one of the following from CS-3013
  – Operating System Concepts, 7th ed., by Silberschatz, Galvin, and Gagne, John Wiley and Sons, 2005
  – Modern Operating Systems, 2nd edition, by Andrew S. Tanenbaum, Prentice Hall, 2001
• Course Information:
  – http://www.cs.wpi.edu/~cs4513/d08/
Prerequisites
• Prerequisites:
  – CS-3013, Operating Systems, or equivalent
  – C and C++ programming, esp. “low level” programming
  – Data structures
    • pointers, linked lists, malloc(), free()
  – Unix/Linux user experience and access
Co-Requisite
• CS-4514, Computer Networks
or
• CS-502, Operating Systems (graduate level)
or
• Tutorial by R. Skowyra
  • Sockets
  • Connections
  • OSI 7-layer model
Schedule & Logistics
• Schedule
  – Goddard Hall 227
  – 8:00 – 9:50 AM, Tuesdays & Fridays thru April 29
  – No class on April 15
  – 14 classes total
• Exams
  – Mid-term on ~April 1
  – Final on April 29
• Unannounced Quizzes
  – May occur at any time
  – May be at beginning, middle, or end of class
• Mobile phones, pagers, laptops, and other devices OFF during class
• Two Programming Projects
  – Fossil Lab
  – One individual, one team
• Office Hours
  – Adjunct Office, Fuller 239
  – By appointment, or
  – Normally ½ hour after class
• Teaching Assistants
  – Rick Skowyra
  – Isaac Chanin
• Contacts
  – <Professor’s last name> @ cs.wpi.edu
  – Adjunct office phone: (508) 831-6470 (shared, no messages)
  – cs4513-staff at same domain
Grading
• Grading
  – Exams – 40%
  – Programming Projects – 40%
  – Class participation, homework, & quizzes – 20%
• Unless otherwise noted, assignments are to be completed individually, not in groups
• Late Policy – 10%/day
  – But contact Professor for extenuating circumstances at least one day prior to deadline or exam date
• WPI Academic Honesty policy
Miscellaneous
• Is this course the capstone for a Minor in CS?
• Anyone needing a project for BS & MS credit?
• How many students feel they need a tutorial on networking?
• Scheduling options
Project Work
• Two projects
  • One individual – Remote Procedure Call
  • One team – Choice of Distributed or File System topics
• Fossil Lab
  • Newly refurbished
  • Your accounts
  • Virtual machines
Cloning a Virtual Machine
• Log in using Fossil password
  • Navigate to P drive
  • Open Clonable-SUSE-Linux-10.3
  • Double-click on VMware configuration file
  • Select “Clone this virtual machine”
• Root and “student” password
  • Fossil-B17
• Linked vs. Full clone
  • Linked – about 2-3 gigabytes, tied back to master
  • Full – 8-9 gigabytes, can stand alone
    – Exceeds your quota on Fossil server
Ground Rule
• There are no “stupid” questions.
• It is a waste of your time and the class’s time to proceed when you don’t understand the basic terms.
• If you don’t understand it, someone else probably doesn’t, either.
Instructor: Hugh C. Lauer, Adjunct Professor
• Ph. D. Carnegie-Mellon 1972-73
  – Dissertation “Correctness in Operating Systems”
• Lecturer: University of Newcastle upon Tyne, UK
• Approximately 30 years in industry in USA
• Research topics
  – Operating Systems
  – Proofs of Correctness
  – Computer Architecture
  – Networks and Distributed Computing
  – Real-time networking
  – 3D Volume Rendering
  – Surgical Simulation and Navigation
  – …
Systems Experience
• IBM Corporation
• University of Newcastle
• Systems Development Corporation
• Xerox Corporation (Palo Alto)
• Software Arts, Inc.
• Apollo Computer
• Eastman Kodak Company
• Mitsubishi Electric Research Labs (MERL)
• Real-Time Visualization
  • Founded and spun out from MERL
  • Acquired by TeraRecon, Inc.
• SensAble Technologies, Inc.
• Dimensions Imaging, Inc. (new start-up)
VolumePro™
• Interactive volume rendering of 3D data such as
  • MRI scans
  • CT scans
  • Seismic scans
• Two generations of ASICs, boards, software
  • VolumePro 500 – 1999
  • VolumePro 1000 – 2001
• CTO, Chief Architect of VolumePro 1000
  • 7.5-million gate, high-performance ASIC
  • 10^9 Phong-illuminated samples per second
Operating Systems I have known
• IBSYS (IBM 7090)
• OS/360 (IBM 360)
• TSS/360 (360 mod 67)
• Michigan Terminal System (MTS)
• CP/CMS & VM 370
• MULTICS (GE 645)
• Alto (Xerox PARC)
• Pilot (Xerox STAR)
• CP/M
• MACH
• Apollo DOMAIN
• Unix (System V & BSD)
• Apple Mac (v.1 – v.9)
• MS-DOS
• Windows NT, 2000, XP
• various embedded systems
• Linux
• …
Other
• Two seminal contributions to computer science
  • Duality hypothesis for operating system structures (with Roger Needham)
  • First realization of opaque types in type-safe programming languages (with Ed Satterthwaite)
• 21 US patents issued
  • Computer architecture
  • Software reliability
  • Networks
  • Computer graphics & volume rendering
Class Discussion (laptops closed, please)
What is Distributed Computing?
Distributed System
• Collection of computers that are connected together and (sometimes) interact
• Many independent problems at same time
  • Similar
  • Different
• Or …
  – One very big problem (or a small number)
• Computations that are physically separated
  • Client-server
  • Inherently dispersed computations
Distributed Computing Spectrum
• Many independent computations at same time
  • Similar: e.g., banking & credit card; airline reservations
  • Different: e.g., university computer center; your own PC
• Or …
  – One very big problem (or a few)
• Computations that are physically separated
  • Client-server
  • Inherently dispersed computations
Multiprocessing Distributed Computing (a spectrum)
• Many independent problems at same time
  • Similar: e.g., banking & credit card; airline reservations
  • Different: e.g., university computer center; your own PC
• Or …
  – One very big problem (too big for one computer)
    • Weather modeling; finite element analysis; drug discovery; gene modeling; weapons simulation; etc.
• Computations that are physically separated
  • Client-server
  • Dispersed – routing tables for internet; electric power distribution.
Observation
• Same spectrum applies to multiprocessor systems
  – Much more tightly coupled than traditional “distributed systems”
• Some differences
  – “Multiprocessor systems”
    • Usually under same management, often in same room
    • Very fast communication
  – “Distributed systems”
    • Sometimes not under same management
    • Slower communication
Another Observation (attributed to R. Hamming)
• When you change the operating point of a system by an order of magnitude …
… you introduce qualitative changes in how to approach problems
Observation
• Same spectrum applies to multiprocessor systems
  – Much more tightly coupled than traditional “distributed systems”
• Some differences
  – “Multiprocessor systems”
    • Usually under same management
    • Very fast communication
  – “Distributed systems”
    • Sometimes not under same management
    • Slower communication
So there is a qualitative difference in how we approach these two kinds of systems
Let’s look at an example
• An inherently distributed computation
  – I.e., parts of the computation must occur at physically separate locations
  – Under separate administrations
• Internet routing tables
The Internet
• A vast collection of independent computers
  – ~600 × 10^6
• All connected together
• Any computer can send a message to any other
• Messages broken up into little packets
• Question: how do packets find their way to destinations?
Distributed routing algorithm (simplified example)
• Each node “knows” which networks are directly connected to it.
• Each node maintains table of distant networks
  • [network #, 1st hop, “distance”]
• Adjacent nodes periodically exchange tables
• Update algorithm (for each network in table)
  • If (my distance to network > neighbor’s distance to network + my distance to neighbor), then …
  • … update my table entry for that network so that neighbor is first hop.
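The update rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the course: the table layout (a dict mapping network to a (first hop, distance) pair), the function names, and the three-node topology A, B, C are all made up for the example. It also assumes the new distance is recorded along with the new first hop, which the slide implies but does not state, and it only handles routes getting better, not links going down.

```python
INFINITY = float("inf")

def merge_neighbor_table(my_table, neighbor, link_cost, neighbor_table):
    """Apply the slide's update rule once; return True if my_table changed.

    Tables map network -> (first_hop, distance). A first_hop of None marks
    a directly connected network. (Hypothetical layout for this sketch.)
    """
    changed = False
    for network, (_first_hop, neighbor_distance) in neighbor_table.items():
        my_distance = my_table.get(network, (None, INFINITY))[1]
        # If my distance > neighbor's distance + my distance to neighbor ...
        if my_distance > neighbor_distance + link_cost:
            # ... make the neighbor the first hop and record the new distance.
            my_table[network] = (neighbor, neighbor_distance + link_cost)
            changed = True
    return changed

def exchange_until_stable(tables, links):
    """Let adjacent nodes swap tables until no entry changes; return rounds."""
    rounds = 0
    while True:
        rounds += 1
        # Every node sends the table it had at the start of the round.
        snapshot = {node: dict(table) for node, table in tables.items()}
        changed = False
        for (a, b), cost in links.items():
            changed |= merge_neighbor_table(tables[a], b, cost, snapshot[b])
            changed |= merge_neighbor_table(tables[b], a, cost, snapshot[a])
        if not changed:
            return rounds

# Hypothetical topology: A - B - C in a line, one network at each end.
tables = {
    "A": {"net1": (None, 0)},
    "B": {},
    "C": {"net2": (None, 0)},
}
links = {("A", "B"): 1, ("B", "C"): 1}
rounds = exchange_until_stable(tables, links)
for node in sorted(tables):
    print(node, tables[node])
```

Note that no node ever sees the whole topology: A learns that net2 is reachable through B only because B passed along what it heard from C.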
Distributed routing algorithm (result)
• All nodes in Internet maintain reasonably up-to-date routing tables
• Rapid responses to changes in network topology, congestion, failures, etc.
• Very reliable with no central management!
Characteristic
• The routing algorithm is inherently distributed
• Different parts execute in physically separated locations
• Only nearby nodes “know” whether
  – Neighbors are up or down
  – Networks are congested or not
Big networks
• Network management systems
  • Monitoring health of network (e.g., routing tables)
  • Identifying actual or incipient problems
  • Data and statistics for planning purposes