TRANSCRIPT
October 14, 2013 Sam Siewert
CS A320 Operating Systems for Engineers
Lecture 8 – Review Through MOS Chapter 4 and Material Up to EXAM #1
History of OS and Abstraction
History of Unix and Linux (Multics)
OS as a Resource Manager
– CPU, Processing – Processes, Threads and Kernel Tasks
– Memory – Virtual Memory, Page Cache, Buffer Cache, Swap
– Storage – Spinning Disk Drives, Newer NVM (Non-Volatile Memory)
– I/O Bandwidth and Devices
– Power
– System Calls from User Space to Kernel Space
OS Interfaces – GUI (Graphical User Interface) or CLI (Command Line Interface)
Sam Siewert 2
OS API System Calls
– Library APIs are Different – Collections of Useful Code in User Space Archives (e.g. built with “ar”)
– System Calls Cross from User Space to Kernel Space – Requires a System Call Trap (Interrupt to Kernel) to Execute Kernel Code on Behalf of the Caller in Supervisor Mode
– Used to Interface to all I/O Drivers
– Used to Interface to the Kernel API
Run-Time Environment
– Linker – Static or Dynamic
– Loader – Program File is Put into Execution as a Process
– Scheduler – Ready Queue, Dispatch, Context Switch, Load Balance for SMP
Anatomy of a System Call
User Space (Process) to Kernel Space (Task, or Handler Running in the User Task's Context), Look-Up Table for the Kernel Handler, Return
Multi-Programming
Why Do We Have More Processes than CPUs? Waiting on Disk and Other I/O
Where p = fraction of time a process waits on I/O and n = number of processes, CPU utilization = 1 − p^n
Processing
Linux/Unix Process
– Container for Address Space, I/O Resources (File Descriptors) and 1 or More Threads
– Memory Protection (from Other Bad Code) – User Space
– OS Manages Memory, CPU, and I/O Resources for the Process
– CFS – Completely Fair Scheduler
– Fork(), Execve()
POSIX Threads
– Sequence of Machine Code Instructions (Minimum)
– Stack for Local Variables and Parameters
– Mapped to Kernel Tasks in Linux (NPTL – Native POSIX Threads Library)
– FIFO, RR, and OTHER (CFS – Completely Fair Scheduler)
Linux Uses a Fair Scheduler
Default Scheduler is the CFS – Completely Fair Scheduler
– Each Process Gets a Timeslice at a Frequency
– Frequency is Based on the “Tick”, a Software Counter Driven by an Interrupt
– Some Processes Get Slices (of Pie) More Often than Others – Priority
Use nice --19 (adjustment −19), for example, to set CFS priority high; use nice -19 (adjustment +19) to set it low (a lower nice value means a higher priority; negative adjustments require root)
– POSIX Threads May Be Scheduled in a Library or by an OS Kernel
– For Us, with NPTL, by the OS Kernel
POSIX has RR, OTHER, and FIFO
A Schedule is a State Machine
A Process (Task) Can:
1. Execute
2. Yield the CPU Core
3. Wait in the Ready Queue to Execute
4. Delay, Using the sleep() Call for Example
5. Pend, by Taking an Empty Semaphore for Example
6. Suspend, by Causing an Exception (Divide by 0)
Producer/Consumer - Message Queues
Producer/Consumer Bounded Buffer is the Problem! Message Queues are the Answer
But How Do We Implement One?
– We Need Mutual Exclusion
– We Need Counting Semaphores
What is a Message Queue?
– Atomic Operations for:
1. Enqueue
2. Dequeue
3. Tests and Notification for Is-Empty, Is-Full
4. Blocks on Empty (or Returns Empty Error EAGAIN)
5. Blocks on Full (or Returns Full Error EAGAIN)
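The two ingredients asked about above combine directly: a sketch of a bounded buffer where counting semaphores provide the blocking on empty/full and a mutex makes enqueue/dequeue atomic (all names are my own):

```c
#include <pthread.h>
#include <semaphore.h>

/* Bounded buffer: counting semaphores track free and filled slots
 * (blocking a producer when full, a consumer when empty); a mutex
 * protects the critical section around the indices. */
#define SLOTS 8

static int buf[SLOTS];
static int head, tail;
static sem_t empty_slots, full_slots;    /* counting semaphores */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void bb_init(void)
{
    sem_init(&empty_slots, 0, SLOTS);    /* all slots start empty */
    sem_init(&full_slots, 0, 0);
}

void bb_enqueue(int v)
{
    sem_wait(&empty_slots);              /* block while full */
    pthread_mutex_lock(&lock);
    buf[tail] = v;
    tail = (tail + 1) % SLOTS;
    pthread_mutex_unlock(&lock);
    sem_post(&full_slots);               /* wake a consumer */
}

int bb_dequeue(void)
{
    sem_wait(&full_slots);               /* block while empty */
    pthread_mutex_lock(&lock);
    int v = buf[head];
    head = (head + 1) % SLOTS;
    pthread_mutex_unlock(&lock);
    sem_post(&empty_slots);              /* wake a producer */
    return v;
}
```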
POSIX Message Queue
http://math.uaa.alaska.edu/~ssiewert/a320_code/EXAMPLES/POSIX/
mq_open, mq_send, mq_receive
Name of Message Queue Must be KNOWN Globally
It is a Global Bounded Buffer Where One Service Produces Messages and Another Consumes
Can be Simplex or Duplex
Can Have Priority and Head-of-Queue Features
Blocking
Blocking Indefinitely Can Be Viewed as Failure of a Service, Caused by Need for a Shared Resource that is Unavailable Despite Availability of a CPU Core
Ideally, Eliminate the Potential for Blocking During Service Execution, or Use Timeouts!
If Elimination is Impossible, Then We Want Bounded Blocking (a Known Upper Bound on Blocking Time)
Resource Deadlock (Circular Wait)
A is Holding X and Would Like Y; B is Holding Y and Would Like X
How is This Resolved? A and B Could Block Indefinitely
Each Could Release X or Y and Try Again?
– Can Result in Livelock – They Release, A Grabs X, B Grabs Y, Deadlock, Detection, Release, A Grabs X, B Grabs Y …
Circular Wait Can Evolve over Complex Sets of Tasks and Resources (Hard to Detect or Prevent)
Unbounded Blocking – Detection Most Often with Watch-Dog and Sanity Monitors
[Diagram: Task A holds X and requests Y; Task B holds Y and requests X – a circular wait]
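One standard prevention is to break the circular wait by imposing a single global lock order; a sketch (task and mutex names follow the slide's A, B, X, Y):

```c
#include <pthread.h>

/* Deadlock above needs A and B to take X and Y in opposite orders.
 * If every task acquires in one global order (always X before Y),
 * no cycle of waits can form, so deadlock is impossible. */
static pthread_mutex_t X = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Y = PTHREAD_MUTEX_INITIALIZER;

static int shared_work;

void *task_a(void *arg)
{
    pthread_mutex_lock(&X);       /* same order as task_b ... */
    pthread_mutex_lock(&Y);
    shared_work++;
    pthread_mutex_unlock(&Y);
    pthread_mutex_unlock(&X);
    return arg;
}

void *task_b(void *arg)
{
    pthread_mutex_lock(&X);       /* ... X first, then Y: no cycle */
    pthread_mutex_lock(&Y);
    shared_work++;
    pthread_mutex_unlock(&Y);
    pthread_mutex_unlock(&X);
    return arg;
}
```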
Mutual Exclusion
Critical Section Protects Global Data for Multi-threaded Read/Write Access without Potential for Data Corruption
http://www.ibm.com/developerworks/power/library/pa-soc5/
The Linux Community Sees This as a Non-Issue Because of CFS (But It Is an Issue for FIFO; Priority Ceiling Solutions May Be Preferred)
– “Against Priority Inheritance”, by Victor Yodaiken
Priority Inversion
Problem: A Service Using a Shared Resource May Suffer Unbounded Priority Inversion
– Mutex Protection of a Resource May Result in Unbounded Inversion
– 3 Necessary Conditions for Unbounded Inversion:
 Three or More Services with Unique Priority in the System – High, Medium, Low Priority Sets of Services
 At Least Two Services of Different Priority Share a Resource with Mutex Protection – One or More High and One or More Low Involved
 One or More Services Not Involved in the Mutex Has Priority Between the Two Involved in the Mutex
– What Happens? The Low Priority Service Enters the Mutex and the High Priority Service Blocks on the Mutex; the Medium Priority Services Not Involved in the Mutex Can Interfere with the Low Priority Service for an Indeterminate Amount of Time
– Possible Solution: Priority Inheritance or Priority Ceiling
Priority Inheritance
When a Higher Priority Task is Blocked on a Mutex and a Lower Priority Task is in the Mutex, the Higher Prio Loans Its Prio to the Lower for the Scope of the Mutex
Can Chain – An Even Higher Prio Task Also Blocks and Again Loans Its Even Higher Prio
– As More Tasks Block, More Temporary Prio Transfers Occur
– All Prios Must Ultimately Be Restored
– What is the Limit of Chaining?
What Happens if Mutexes are Nested?
Priority Ceiling
Instead of Chaining, Simply Set the Prio of the Task in the Mutex to Highest Immediately When There is an Inversion
Could be the Highest Prio in the System
– May Over-Amplify
– Simple to Implement
More Precisely, Can Be the Highest Prio of Those Tasks Actually Involved in the Mutex
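The POSIX form of this is the PTHREAD_PRIO_PROTECT protocol, where the mutex itself carries the ceiling; a sketch (the helper name is my own; actually locking such a mutex at its ceiling requires real-time scheduling privileges at run time):

```c
#define _GNU_SOURCE
#include <pthread.h>

/* Priority-ceiling protocol: the ceiling is fixed when the mutex is
 * created, and any task entering the mutex runs at that ceiling
 * immediately. No chaining, at the cost of over-amplification when
 * the ceiling is the system-wide maximum. */
int make_ceiling_mutex(pthread_mutex_t *m, int ceiling)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    int rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_PROTECT);
    if (rc == 0)
        rc = pthread_mutexattr_setprioceiling(&attr, ceiling);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

Compare with PTHREAD_PRIO_INHERIT above: the ceiling is chosen up front rather than discovered by chaining.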
Thread Safety
Thread Safe – Re-entrant Code Can Be Executed by More Than 1 Thread at the Same Time
– Use Stack Only
– Use Thread-Indexed Global Data (Unique Copy per Thread)
– Use Semaphore-Protected Critical Sections
– Use TaskLock-Protected Critical Sections (Not Advised)
– Use InterruptLock-Protected Critical Sections (Not Advised)
API Libraries Should Indicate Thread Safety in Manual Pages
Memory Management
Virtual Memory
– Virtual Address Space Larger than Main Memory Address Space
– Allows for Use of Swap (Pages Spilled to Disk)
– Allows for Page Protection (Read-Only Code Segments), Ownership by Process
– Allows for Creation of a Page Cache
– Requires Page Look-Up Tables (the Page Table, with Recent Entries Cached by the TLB – Translation Look-aside Buffer)
– Allows for Management of Segments in Page-Size Chunks
Compromise Between External and Internal Fragmentation for Memory Management
N = segments in memory, P = page size (bytes)
Worst Case = N × (P − 1) Bytes
Average Case = (N × P) / 2 Bytes
Best Case = 0, for Exact Fit to Page
MMU (Memory Management Unit) – Hardware for Page Table and Page Protection
Paging Continued
Segments are Paged
– Code or Text Segment – Machine Code
– Data Segment – Global Data per Process
 BSS – Uninitialized Global Data
 Data – Statically Initialized Global Data
– Stack Segment – Local Variables and Parameters
– Heap – Malloc
Page Replacement Policy in Page Cache – LRU, Approximate LRU – LFU – Working Set
Paged Segmented Executables The “Norm” today in Linux, Solaris, OS-X, etc. Demand Paging (Dynamic Allocation) Try “file myprog.exe” in Linux Understand the Integration of the Two Concepts and Theory (Internal/External Fragmentation)
File Systems
Name Space for Collections of Bytes
Disk I/O is in Blocks (512 Byte Today, 4K IDEMA Future)
Files Have a Logical Block Size that Maps to Disk Blocks (e.g. Linux 4K, Windows 16K to 64K)
I-Nodes
– Hierarchical (Indirection to Lists of Blocks)
– More Efficient than a Simple Linked List of File Blocks
– Free List (Blocks Available)
Directory Structure for Namespace
Disk Partitioning
Standard Entry Points – Open, Create, Read, Write, Close
Buffer Cache
Linux VFS
[Diagram, © Sam Siewert: user-space Syscalls (Open, Create, Read, Write, Close) enter the kernel VFS layer; VFS dispatches to file systems such as NFS, Ext4, and /proc, backed by the Inode Cache, Directory Cache, Page Cache, and Buffer Cache; below the SW/HW boundary, Device Drivers access the Devices]
Fragmentation Impact
Fragmented Files Waste Space AND Take Longer to Read/Write, Due to Seek and Rotate Delays (e.g. Seek Time + ½ Rotation at 7200 RPM for Each Non-Contiguous Set of Blocks)
General Disk Block I/O Performance
Spinning Disk – Low Random IOPs, ~200MB/sec Sequential BW
SSD – High Random IOPs, Similar BW over a Wider Range of I/O Request Sizes
Free Block Tracking Issues with Fragmentation Again (Internal / External) for Disk Block Size and File System Pages Overhead of Tracking Free Blocks that Are Not Contiguous
Buffer Cache Immediate Read after Write (Common) – Cached Recent and Frequently Used File system Pages – Cached Write-back of Large Updates to Disk, Coalesced for More Sequential Large Write (Disk Optimal) Elevator Algorithm
Disk Arm Motion
Random IOPs are the Problem (200 per Disk Typical), Due to Seek Time + ½ Rotation on Average (Milliseconds)
An Eternity Compared to the Nanoseconds (1s, 10s, 100s) of CPU and Memory Running at GHz Rates
Disk Access Latency Calculations
Ex #1 – An HDD with 3 5cm-diameter platters turns at 3000 RPM; what is the average rotational latency for the drive?
– Latency = (1/2) × (60 sec/min ÷ 3000 rev/min) = 0.01 sec
Ex #2 – An HDD with a 12.7cm-diameter disk spins at 5600 RPM and the average access time is 11 ms; what is the average rotational delay?
– Latency = (1/2) × (60 sec/min × 1000 ms/sec ÷ 5600 rev/min) = 5.357 ms
Seek + Rotate Delay
HDD – 200 IOPs, 200MB/sec
– Typical Max Perf
– Slow Random Access
– Good Sequential
The Reason for Multiprogramming
The Reason for the Buffer Cache
Unix / Linux I-Nodes
First Level of Indirection Handles Small Files (e.g. 64 KB)
Second (Double Indirect) Handles Medium (e.g. 64 MB)
Third (Triple Indirect) Handles Large (e.g. 64 GB)
Day 1: Example Problems [5 – 40%]
[1] Concepts (Ch 1) – System Calls (Diagram), Kernel vs. User Space, OS as Resource Manager, Multi-User
[2] Processing (Ch 2) – Fork(), Execve(), Threading, Degree of Multi-programming Required, Producer-Consumer, Deadlock/Livelock, Unbounded Priority Inversion, CFS vs. FIFO Scheduling
[1] Virtual Memory (Ch 3) – Compute Fragmentation, Virtual-to-Physical Address Translation (Page and Offset Computation), Segmentation, Replacement Policies
[1] Storage and Files (Ch 4) – e.g. Disk Access Latency Calculation Based on Seek and Rotate or Bandwidth and IOPs, I-Node Traversal, Free-List (Linked) vs. Bitmaps
Day 2: Design, Take Home and Program – 60%
#1 – Processes and/or Threading, Producer-Consumer, Synchronization of Threads [Coding] – Design in Class – Implement, Test, Upload at Home
#2 – Memory Management and Use by Processes (in execution) and Programs (at rest) [Analysis] – Answer Based on Theory in Class – Answer Based on Analysis, Upload at Home
#3 – File systems and Block Devices – Answer Based on Knowledge and Theory in Class – Explore, Analyze and Upload at Home