Intro to Virtualization (Part 3)
TRANSCRIPT
Wrap-up: Basics of Computing (Software Threads)
As Needed to Understand OS Virtualization
What we know: Abstract view of Computing
Process 2
Process 3
Process 4
Process N
Operating System Kernel
Process 1
System Call Interface
Scheduler
Drivers
File system services
Memory management
CPU management
…
This diagram is accurate, but it can be refined to be more detailed in cases where a process is made up of multiple “software threads”.
Every lcore
Memory
Registers
PC
Program 1
Program 2
Program 3
Kernel
Other Logic
(e.g. ALU)
What we know: Multiprogramming allows each lcore to run multiple programs concurrently
What we know: MultiProcessing + Simultaneous MultiThreading + MultiProgramming:
Logical Core
Logical Core
Logical Core
Logical Core
Socket / CPU
Core
Core
LCORE1
LCORE2
LCORE3
LCORE4
Review: What exactly “is” a running process?
Data
Code
Registers
PC
Other Logic
(e.g. ALU)
Execution state (PC, registers, etc.)
Page Table, which lives in Kernel’s memory, but is loaded into TLB when process runs
Data and Code blocks in Virtual Memory
Memory region
Running (and reading instructions from CODE region)
CPU translates virtual instruction addresses to physical memory addresses via a page table:
Where do we SAVE these things when a process blocks or is context switched out?
Memory
Program 1
Program 2
Program 3
Kernel
Every lcore
Registers
PC
Other Logic
(e.g. ALU)
Execution state (PC, registers, etc.)
Page Table (which is cached in CPU as TLB)
Execution state (PC, Registers) Saved in kernel’s memory
Where do we save the Page Table and Memory Region of Process?
• Page Table “already” lives in the Kernel’s memory structures (though it’s cached into a TLB near/in the CPU when the process is loaded). So all we need to do is make sure any updates that were made to the TLB are written back into the Page Table as well.
• Memory Region of a Process can stay as is, because all processes co-exist in RAM+SWAP at the same time (remember: this is what makes multiprogramming fast).
Somewhere in Kernel’s Memory … Somewhat like this…
Kernel
P1’s Execution
State
P1’s Page Table
P2’s Page Table
P3’s Page Table
P2’s Execution
State
P3’s Execution
State
Has the Program Counter (i.e., the address in code where the program was when “paused”) and registers (i.e., contents of intermediate variables used within the CPU) it last had when it was saved/interrupted/context-switched.
Other stuff in Kernel’s memory
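The per-process bookkeeping sketched above can be pictured as a pair of C structs. This is a hypothetical illustration of the idea only, not the kernel’s actual data structures (Linux’s real ones, e.g. task_struct, are far richer):

```c
#include <stdint.h>

/* Hypothetical sketch of what the kernel keeps per process/thread,
   matching the diagram above. Field names are illustrative. */

struct execution_state {
    uint64_t pc;        /* program counter at save/interrupt time */
    uint64_t regs[16];  /* general-purpose register contents */
};

struct process_entry {
    struct execution_state state;  /* saved into kernel memory on a
                                      context switch */
    uint64_t *page_table;          /* already lives in kernel memory;
                                      cached into the TLB while the
                                      process is running */
};
```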
Multiple Instances of Same Program
• SCENARIO: We have a web server that serves HTTP requests. The process is called httpd.
• Early versions of httpd might have run as a single process, and served http requests to any browser that connected to it.
• But eventually, we wanted a web server to service > 1 connection concurrently. How do we do this?
FORK: Multiple httpd Processes
• Run multiple httpd processes
• 1 to listen for new connections on port 80/443
• For each new connection it accepts on port 80/443, it “forks” (creates another copy of) itself. This creates an essentially identical httpd process.
• E.g., assume 2 browsers have connected to a website running httpd; at one point there will be 3 httpd processes running concurrently (1 listening for new connections, 2 serving the 2 browsers)
• NOTE: This feature is possible with MultiProgramming (does not require software threads) …
Multiple Processes from Same Executable
Core/Processor
Memory
Registers
PC
Program 1
Program 2
HTTPD
Kernel
Other Logic
(e.g. ALU)
HTTPD
HTTPD
Works, but… can we do this more efficiently?
Software Multithreading
• In the previous example, WHY have 3 processes, each with its own copy of the same CODE residing in memory?
• Why not have multiple execution states, where each of these states have program counters that point into the same CODE memory region?
• Analogy: Instead of creating 3 elves that get instructions to build the same toy from 3 separate but IDENTICAL instruction sheets, why not have the 3 elves read off the same instruction sheet?
Software Threads: Different Execution Contexts within the Same Process (i.e., same memory region, page table)
Core/Processor Memory
Registers
PC = 0xE4BA
Program 1
Program 2
HTTPD
Kernel
Other Logic
(e.g. ALU)
Single Thread
Single Thread
Three Threads
QUESTION: Assume the above running software thread gets context switched out exactly at the shown state (same PC). Can we then tell which thread was running? The above picture proves that ONE of the three threads of the HTTPD process is running (because the PC points into HTTPD’s code area). BUT it is still unclear which thread…
ZOOM into Kernel’s Memory Structures
P1’s Execution
State
P1’s Page Table
P2’s Page Table
HTTPD’s Page Table
P2’s Execution
State
HTTPD Thread 1 Execution
State
HTTPD Thread 2 Execution
State
HTTPD Thread 3 Execution
State
PC = 0xE39C
PC = 0xE4BA
PC = 0xE219
Zooming into the
Program Counters of
the HTTPD threads
• 3 software threads of HTTPD process have different execution states
• Regardless, the entire HTTPD process has only one common page table (because the process has only 1 memory region, so only 1 table is needed to do address translations)
DATA REGION of PROCESS’s MEMORY
• Made up of GLOBAL memory that is shared by all software threads of the process
• And also made up of LOCAL memory that is used by each software thread. This local memory is used for many things, including keeping track of “function call state” (really a part of the execution state)
• Even if a process is made up of 1 thread (the default situation), it still has local memory for function call state and global memory, but it was not that important to mention this before the introduction of software threads.
HTTPD’s (1 process with 3 software threads) Memory Region:
Memory
Program 1
Program 2
HTTPD
Kernel
ZOOM
Local Data Memory for Thread 3
Local Data Memory for Thread 2
Local Data Memory for Thread 1
Global Data Memory for Entire Process
Code Memory (instructions) pointed to by PC of all threads of this process
WARM AND FUZZY DIAGRAM THAT IS OFTEN PRESENTED TO EXPLAIN THREADS…
Advantage of Software Threads (vs. multiple processes running the same executable/code)
• Software threads, a.k.a. “LIGHT-WEIGHT” processes
• Less wasted virtual memory (shared CODE and GLOBAL DATA memory regions)
• Share a page table too
– Makes context switches between threads of the same process very fast (no need to load a new page table into the TLB)
• Easier communication (shared GLOBAL DATA allows for faster communication)
• Faster to create a new thread than a new process (fork)
What we know: MultiProcessing + Simultaneous MultiThreading + MultiProgramming + Software MultiThreading
Logical Core
Logical Core
Logical Core
Logical Core
Socket / CPU
Core
Core
LCORE1
LCORE2
LCORE3
LCORE4
NOW we know these purple regions of time are really periods when unique software threads are executing.
OS Scheduling
• NOW we know:
– In modern computers, the operating system is really scheduling:
• Software threads onto logical cores
• Each logical core has:
– 1 software thread running on it
– While several software threads wait on that lcore’s READY QUEUE to run on it
• This diagram really refers to software threads:
Pinning / Isolation
• By default, any software thread can land on the ready queue of any logical core
• To modify this behavior, you can provide RULES that pin software thread(s) or isolate logical core(s)
• Pinning software threads:
– Reduces the set of logical cores that a software thread is allowed to be scheduled onto
– E.g., via the “taskset” command or the sched_setaffinity system call
• Isolation of logical cores:
– Isolating a logical core REMOVES it from the general pool that threads can be scheduled onto, EXCEPT when a thread is specifically pinned to that logical core
– E.g., via the “isolcpus” kernel boot parameter (check /boot/grub*/grub*.cfg)
DEMO
• Sorry, nothing to do with virtualization, but we will use a VM (treating it as a host)
• ps -eL -o cpuid -o tid -o pid -o comm -o stat -o command | egrep -i 'PID|ptsm'
• taskset -p <mask> <tid>
• ssh tpc-e3-12-019 ...