Thread Implementation and Scheduling
CSE451, Andrew Whitaker
Posted 13-Jan-2016

TRANSCRIPT

Page 1:

Thread Implementation and Scheduling

CSE451

Andrew Whitaker

Page 2:

Scheduling: A Few Lectures Ago…

Scheduler decides when to run runnable threads

Programs should make no assumptions about the scheduler; the scheduler is a “black box”

[Figure: the scheduler feeds threads 1–5 to the CPU in some interleaved order, e.g. {4, 2, 3, 5, 1, 3, 4, 5, 2, …}]

Page 3:

This Lecture: Digging into the Black Box

Previously, we glossed over the choice of which process or thread is chosen to run next (“some thread from the ready queue”)

This decision is called scheduling: scheduling is policy; context switching is mechanism

We will focus on CPU scheduling, but we will touch on network and disk scheduling

Page 4:

Scheduling goals

- Maximize CPU utilization
- Maximize throughput (requests/sec)
- Minimize latency
  - Average turnaround time: time from request submission to completion
  - Average response time: time from request submission until first result
- Favor some particular class of requests (priorities)
- Avoid starvation

These goals can conflict!

Page 5:

Context

Goals differ across applications:

- I/O bound: web server, database (maybe)
- CPU bound: long-running calculation
- Interactive system

The ideal scheduler works well across a wide range of scenarios

Page 6:

Preemption

Non-preemptive: once a thread is given the processor, it has it for as long as it wants. A thread can give up the processor with Thread.yield or a blocking I/O operation.

Preemptive: a long-running thread can have the processor taken away. Typically, this is triggered by a timer interrupt.

Page 7:

To Preempt or Not to Preempt?

Non-preemptive schedulers are subject to starvation; a thread that never yields holds the CPU forever:

while (true) { /* do nothing */ }

Non-preemptive schedulers are easier to implement

Non-preemptive scheduling is easier to program: fewer scheduler interleavings => fewer race conditions

Non-preemptive scheduling can perform better: fewer locks => better performance

Page 8:

Preemption in Linux

User-mode code can always be preempted

Linux 2.4 (and before): the kernel was non-preemptive. Any thread/process running in the kernel runs to completion; the kernel is “trusted”

Linux 2.6: the kernel is preemptive, unless the running thread holds a lock

Page 9:

Linux 2.6 Preemption Details

The task_struct contains a preempt_count variable, incremented when grabbing a lock and decremented when releasing a lock

A thread in the kernel can be preempted iff preempt_count == 0
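The counter discipline can be sketched in miniature; this is an illustrative Python model of the rule above (class and method names are invented here, not kernel APIs):

```python
class KernelThread:
    """Models the per-task preempt_count described above (illustrative, not kernel code)."""

    def __init__(self):
        self.preempt_count = 0

    def lock(self):
        # Grabbing a lock increments the count, disabling preemption.
        self.preempt_count += 1

    def unlock(self):
        # Releasing a lock decrements the count.
        self.preempt_count -= 1

    def preemptible(self):
        # A thread in the kernel can be preempted iff preempt_count == 0.
        return self.preempt_count == 0

t = KernelThread()
t.lock()                      # enter a critical section
t.lock()                      # nested lock: still non-preemptible
assert not t.preemptible()
t.unlock()
assert not t.preemptible()    # outer lock still held
t.unlock()
assert t.preemptible()        # all locks released: preemption allowed again
```

The counter (rather than a boolean) is what makes nested critical sections work: preemption is re-enabled only when the outermost lock is released.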

Page 10:

Algorithm #1: FCFS/FIFO

First-come first-served / first-in first-out (FCFS/FIFO): jobs are scheduled in the order that they arrive

Like “real-world” scheduling of people in lines: supermarkets, bank tellers, McD’s, Starbucks…

Typically non-preemptive (no context switching at the supermarket!)

Page 11:

FCFS example

Suppose the duration of A is 5, and the durations of B and C are each 1. What is the average turnaround time for schedules 1 and 2 (assuming all jobs arrive at roughly the same time)?

[Figure: Schedule 1 runs A, then B, then C; Schedule 2 runs the short jobs B and C first, then A]

Schedule 1: (5+6+7)/3 = 18/3 = 6
Schedule 2: (1+2+7)/3 = 10/3 ≈ 3.3
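The arithmetic above can be checked with a tiny sketch (the function name is illustrative), assuming all jobs arrive at time 0 and run back-to-back:

```python
def avg_turnaround(durations):
    """Average turnaround time for jobs run back-to-back in the given order,
    assuming all jobs arrive at time 0."""
    t, total = 0, 0
    for d in durations:
        t += d        # this job completes at cumulative time t
        total += t    # turnaround = completion time - arrival time (0)
    return total / len(durations)

# Schedule 1: A(5), B(1), C(1)  ->  (5+6+7)/3
print(avg_turnaround([5, 1, 1]))   # 6.0
# Schedule 2: B(1), C(1), A(5)  ->  (1+2+7)/3
print(avg_turnaround([1, 1, 5]))   # ≈ 3.33
```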

Page 12:

FCFS Analysis

+ No starvation (assuming tasks terminate)

- Average response time can be lousy: small requests wait behind big ones

- FCFS may result in poor overlap of CPU and I/O activity: I/O devices are left idle during long CPU bursts

Page 13:

FCFS at Amazon: Head-of-Line Blocking

Clients and servers at Amazon communicate over HTTP, a request/reply protocol which does not allow for re-ordering

Problem: short requests can get stuck behind a long request. This is called “head-of-line blocking”

Page 14:

Algorithm #2: Shortest Job First

Choose the job with the smallest service requirement (variant: shortest processing time first)

Provably optimal with respect to average waiting time
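The optimality claim can be sanity-checked by brute force on a small job set; a minimal sketch, assuming all jobs arrive simultaneously (names and job lengths are illustrative):

```python
from itertools import permutations

def avg_waiting(durations):
    # Waiting time of each job = sum of durations of the jobs run before it
    # (all jobs arrive at time 0).
    wait, elapsed = 0, 0
    for d in durations:
        wait += elapsed
        elapsed += d
    return wait / len(durations)

jobs = (5, 1, 1, 3)
sjf = tuple(sorted(jobs))    # shortest job first ordering
best = min(avg_waiting(p) for p in permutations(jobs))
assert avg_waiting(sjf) == best   # no ordering beats SJF
print(avg_waiting(sjf))           # 2.0
```

Brute force over all 24 orderings confirms that sorting by duration achieves the minimum average waiting time on this instance.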

Page 15:

SJF drawbacks

It’s non-preemptive… but there’s a preemptive version, SRPT (Shortest Remaining Processing Time first), that accommodates arrivals

Starvation: short jobs continually crowd out long jobs. Possible solution: aging

How do we know processing times? In practice, we must estimate

Page 16:

Longest Job First?

Why would a company like Amazon favor Longest Job First?

Page 17:

Algorithm #3: Priority

Assign priorities to requests; choose the request with the highest priority to run next. If there is a tie, use another scheduling algorithm to break it (e.g., FCFS)

To implement SJF, set priority = expected length of CPU burst

Abstractly modeled (and usually implemented) as multiple “priority queues”: put a ready request on the queue associated with its priority

Page 18:

Priority drawbacks

How are you going to assign priorities?

Starvation: if there is an endless supply of high-priority jobs, no low-priority job will ever run

Solution: “age” threads over time. Increase priority as a function of accumulated wait time; decrease priority as a function of accumulated processing time. Many ugly heuristics have been explored in this space
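Aging can be sketched as an adjustment to effective priority; this is an illustrative model (the rate constant, tuple layout, and function name are assumptions, not any real scheduler’s):

```python
AGING_RATE = 1  # priority gained per tick of waiting (illustrative value)

def pick_next(ready, now):
    """Pick the job with the best effective priority (lower number = higher
    priority), where waiting jobs gain priority over time.
    Jobs are (base_priority, arrival_time, name) tuples."""
    def effective(job):
        base, arrival, name = job
        # Aging: the longer a job has waited, the better its effective priority.
        return base - AGING_RATE * (now - arrival)
    return min(ready, key=effective)

ready = [(10, 0,   "low-priority, waiting since t=0"),
         (1,  100, "high-priority, just arrived")]
print(pick_next(ready, now=100)[2])   # the aged low-priority job wins
```

At t=100 the low-priority job’s effective priority (10 - 100 = -90) beats the fresh high-priority job’s (1), so the long waiter finally runs: exactly the starvation-avoidance the slide describes.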

Page 19:

Priority Inversion

A low-priority task acquires a resource (e.g., a lock) needed by a high-priority task

Can have catastrophic effects (see Mars Pathfinder)

Solution: priority inheritance, which temporarily “loans” some priority to the lock-holding thread

Page 20:

Algorithm #4: RR

Round-robin scheduling (RR): the ready queue is treated as a circular FIFO queue, and each request is given a time slice, called a quantum

A request executes for the duration of its quantum, or until it blocks

Advantages: simple (no need to set priorities, estimate run times, etc.); no starvation; great for timesharing
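A minimal RR simulation, reusing the A/B/C jobs from the FCFS example (the function name is illustrative):

```python
from collections import deque

def round_robin(jobs, quantum):
    """Simulate round robin; jobs maps name -> service time.
    Returns each job's completion time."""
    ready = deque(jobs.items())   # circular FIFO ready queue
    t, done = 0, {}
    while ready:
        name, remaining = ready.popleft()
        run = min(quantum, remaining)   # run for a quantum, or until the job finishes
        t += run
        if remaining > run:
            ready.append((name, remaining - run))   # back of the queue
        else:
            done[name] = t
    return done

# A needs 5 units, B and C need 1 each; quantum = 1
print(round_robin({"A": 5, "B": 1, "C": 1}, quantum=1))
```

On this instance RR finishes B at 2, C at 3, and A at 7, for an average turnaround of (2+3+7)/3 = 4: better than FCFS schedule 1 (6) but worse than the SJF-like schedule 2 (≈3.3), illustrating the turnaround trade-off discussed on the next slide.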

Page 21:

RR Issues

What do you set the quantum to be? No value is “correct”: if it is small, we context switch often, incurring high overhead; if it is large, response time degrades

Turnaround time can be poor compared to FCFS or SJF

All jobs are treated equally: if I run 100 copies of SETI@home, it degrades your service. Need priorities to fix this…

Page 22:

Combining algorithms

In practice, any real system uses some sort of hybrid approach, with elements of FCFS, SPT, RR, and priority

Example: multi-level queue scheduling. Use a set of queues corresponding to different priorities; priority scheduling across queues, round-robin scheduling within a queue

Problems: How to assign jobs to queues? How to avoid starvation? How to keep the user happy?

Page 23:

UNIX scheduling: Multi-level Feedback Queues

Segregate processes according to their CPU utilization. The highest-priority queue has the shortest CPU quantum, and processes begin in the highest-priority queue

Priority scheduling across queues, RR within: the process with the highest priority always runs first; processes with the same priority are scheduled RR

Processes dynamically change priority: priority increases over time if a process blocks before the end of its quantum, and decreases if it uses its entire quantum

Goal: reward interactive behavior over CPU hogs. Interactive jobs typically have short bursts of CPU
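The demote/promote rule can be sketched as a small policy function; the queue count and quantum values are illustrative, not the parameters of any particular UNIX:

```python
# Hedged sketch of multi-level feedback: queue 0 is highest priority with the
# shortest quantum; a process that burns its whole quantum is demoted, and one
# that blocks early is promoted.
NUM_QUEUES = 3
QUANTUM = [10, 20, 40]   # ms; higher-priority queues get shorter quanta

def new_level(level, used, blocked_early):
    """Compute a process's next queue level after one scheduling round."""
    if blocked_early:
        return max(level - 1, 0)                # reward interactive behavior
    if used >= QUANTUM[level]:
        return min(level + 1, NUM_QUEUES - 1)   # punish CPU hogs
    return level

assert new_level(0, used=10, blocked_early=False) == 1   # used whole quantum: demoted
assert new_level(2, used=5,  blocked_early=True)  == 1   # blocked early: promoted
assert new_level(0, used=3,  blocked_early=True)  == 0   # already at the top
```

An interactive job that always blocks early migrates toward queue 0 (short quantum, high priority), while a CPU-bound job sinks toward the bottom queue: the feedback the slide describes.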

Page 24:

Multi-processor Scheduling

Each processor is scheduled independently. Two implementation choices: a single, global ready queue, or per-processor run queues

Increasingly, per-processor run queues are favored (e.g., Linux 2.6): they promote processor affinity (better cache locality) and decrease demand for (slow) main memory

Page 25:

What Happens When a Queue is Empty?

Tasks migrate from busy processors to idle processors. The book talks about push vs. pull migration

Java 1.6 support: a thread-safe double-ended queue (java.util.Deque). Use a bounded buffer per consumer; if nothing is in a consumer’s queue, steal work from somebody else
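The per-consumer-deque idea can be sketched in Python with collections.deque (the worker layout and function names are illustrative, mirroring the java.util.Deque idea on the slide):

```python
from collections import deque

queues = [deque(), deque()]          # one double-ended queue per worker

def push(worker, task):
    queues[worker].append(task)      # owner pushes and pops at one end...

def next_task(worker):
    if queues[worker]:
        return queues[worker].pop()  # ...LIFO for the owner (good locality)
    # Own queue is empty: steal from the *other* end of a victim's queue,
    # minimizing contention with the victim's own pops.
    for victim in range(len(queues)):
        if victim != worker and queues[victim]:
            return queues[victim].popleft()
    return None

push(0, "t1"); push(0, "t2")
print(next_task(1))   # worker 1's queue is empty, so it steals "t1" from worker 0
```

The two-ended discipline is the point of using a deque: the owner works at one end while thieves take from the other, so busy and idle workers rarely touch the same element.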