Intro to Virtualization (Part 3)
TRANSCRIPT
Wrap-up: Basics of Computing (Software Threads)
As Needed to Understand OS Virtualization
What we know: Abstract view of Computing
Process 2
Process 3
Process 4
Process N
Operating System Kernel
Process 1
System Call Interface
Scheduler
Drivers
File system services
Memory management
CPU management
…
This diagram is accurate, but it can be refined to be more detailed in cases where a process is made up of multiple “software threads”.
Every lcore
Memory
Registers
PC
Program 1
Program 2
Program 3
Kernel
Other Logic
(e.g. ALU)
What we know: Multiprogramming allows each lcore to run multiple programs concurrently
What we know: MultiProcessing + Simultaneous MultiThreading + MultiProgramming:
Logical Core
Logical Core
Logical Core
Logical Core
Socket / CPU
Core
Core
LCORE1
LCORE2
LCORE3
LCORE4
Review: What exactly “is” a running process?
Data
Code
Registers
PC
Other Logic
(e.g. ALU)
Execution state (PC, registers, etc.)
Page Table, which lives in Kernel’s memory, but is loaded into TLB when process runs
Data and Code blocks in Virtual Memory
Memory region
Running (and reading instructions from CODE region)
CPU translates virtual instruction addresses to physical memory addresses via a page table:
Where do we SAVE these things when a process blocks or is context switched out?
Memory
Program 1
Program 2
Program 3
Kernel
Every lcore
Registers
PC
Other Logic
(e.g. ALU)
Execution state (PC, registers, etc.)
Page Table (which is cached in CPU as TLB)
Execution state (PC, Registers) Saved in kernel’s memory
Where do we save the Page Table and Memory Region of Process?
• Page Table “already” lives in the Kernel’s memory structures (though it’s cached into a TLB near/in the CPU when the process is loaded). So all we need to do is make sure any updates that were made to the TLB are written back into the Page Table as well.
• Memory Region of a Process can stay as is, because all processes co-exist in RAM+SWAP at the same time (remember: this is what makes multiprogramming fast).
Somewhere in Kernel’s Memory … Somewhat like this…
Kernel
P1’s Execution
State
P1’s Page Table
P2’s Page Table
P3’s Page Table
P2’s Execution
State
P3’s Execution
State
Has the Program Counter (i.e., the address in code where the program was when “paused”) and registers (i.e., contents of intermediate variables used within the CPU) it last had when it was saved/interrupted/context-switched.
Other stuff in Kernel’s memory
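The per-process bookkeeping sketched above can be pictured as a pair of C structs. This is a hypothetical illustration of the idea only, not the kernel’s actual data structures (Linux’s real ones, e.g. task_struct, are far richer):

```c
#include <stdint.h>

/* Hypothetical sketch of what the kernel keeps per process/thread,
   matching the diagram above. Field names are illustrative. */

struct execution_state {
    uint64_t pc;        /* program counter at save/interrupt time */
    uint64_t regs[16];  /* general-purpose register contents */
};

struct process_entry {
    struct execution_state state;  /* saved into kernel memory on a
                                      context switch */
    uint64_t *page_table;          /* already lives in kernel memory;
                                      cached into the TLB while the
                                      process is running */
};
```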
Multiple Instances of Same Program
• SCENARIO: We have a web server that serves HTTP requests. The process is called httpd.
• Early versions of httpd might have run as a single process, and served http requests to any browser that connected to it.
• But eventually, we wanted a web server to service > 1 connection concurrently. How do we do this?
FORK: Multiple httpd Processes
• Run multiple httpd processes
• 1 to listen for new connections on port 80/443
• For each new connection it accepts on port 80/443, it “forks” (creates another copy of) itself. This creates an essentially identical httpd process.
• E.g., assume 2 browsers have connected to a website running httpd; at one point there will be 3 httpd processes running concurrently (1 listening for new connections, 2 serving the 2 browsers)
• NOTE: This feature is possible with MultiProgramming (does not require software threads) …
Multiple Processes from Same Executable
Core/Processor
Memory
Registers
PC
Program 1
Program 2
HTTPD
Kernel
Other Logic
(e.g. ALU)
HTTPD
HTTPD
Works, but… can we do this more efficiently?
Software Multithreading
• In the previous example, WHY have 3 processes, each with its own copy of the same CODE residing in memory?
• Why not have multiple execution states, where each of these states have program counters that point into the same CODE memory region?
• Analogy: Instead of creating 3 elves that get instructions to build the same toy from 3 separate but IDENTICAL instruction sheets, why not have the 3 elves read off the same instruction sheet?
Software Threads: Different Execution Contexts within the Same Process (i.e., same memory region, page table)
Core/Processor Memory
Registers
PC = 0xE4BA
Program 1
Program 2
HTTPD
Kernel
Other Logic
(e.g. ALU)
Single Thread
Single Thread
Three Threads
QUESTION: Assume the above running software thread gets context switched out exactly at the shown state (same PC). Can we then tell which thread was running? The above picture proves that ONE of the three threads of the HTTPD process is running (because the PC points into HTTPD’s code area). BUT it is still unclear which thread…
ZOOM into Kernel’s Memory Structures
P1’s Execution
State
P1’s Page Table
P2’s Page Table
HTTPD’s Page Table
P2’s Execution
State
HTTPD Thread 1 Execution
State
HTTPD Thread 2 Execution
State
HTTPD Thread 3 Execution
State
PC = 0xE39C
PC = 0xE4BA
PC = 0xE219
Zooming into the
Program Counters of
the HTTPD threads
• 3 software threads of HTTPD process have different execution states
• Regardless, the entire HTTPD process has only one common page table (because the process has only 1 memory region, so only 1 table is needed to do address translations)
DATA REGION of PROCESS’s MEMORY
• Made up of GLOBAL memory that is shared by all software threads of the process
• And also made up of LOCAL memory that is used by each software thread. This local memory is used for many things, including keeping track of “function call state” (really a part of the execution state)
• Even if a process is made up of 1 thread (the default situation), it still has local memory for function call state and global memory, but it was not that important to mention this before the introduction of software threads.
HTTPD’s (1 process with 3 software threads) Memory Region:
Memory
Program 1
Program 2
HTTPD
Kernel
ZOOM
Local Data Memory for Thread 3
Local Data Memory for Thread 2
Local Data Memory for Thread 1
Global Data Memory for Entire Process
Code Memory (instructions) pointed to by PC of all threads of this process
WARM AND FUZZY DIAGRAM THAT IS OFTEN PRESENTED TO EXPLAIN THREADS…
Advantage of Software Threads (vs. multiple processes running the same executable/code)
• Software threads, a.k.a. “LIGHT-WEIGHT” processes
• Less wasted virtual memory (shared CODE and GLOBAL DATA memory regions)
• Share a page table too
– Makes context switches between threads of the same process very fast (no need to load a new page table into the TLB)
• Easier communication (shared GLOBAL DATA allows for faster communication)
• Faster to create a new thread than a new process (fork)
What we know: MultiProcessing + Simultaneous MultiThreading + MultiProgramming + Software MultiThreading
Logical Core
Logical Core
Logical Core
Logical Core
Socket / CPU
Core
Core
LCORE1
LCORE2
LCORE3
LCORE4
NOW we know these purple regions of time are really periods when unique software threads are executing.
OS Scheduling
• NOW we know:
– In modern computers, the operating system is really scheduling:
• Software threads onto logical cores
• Each logical core has:
– 1 software thread running on it
– While several software threads wait on that lcore’s READY QUEUE to run on it
• This diagram really refers to software threads:
Pinning / Isolation
• By default, any software thread can land on the ready queue of any logical core
• To modify this behavior, you can provide RULES that pin software thread(s) or isolate logical core(s)
• Pinning software threads:
– Reduces the set of logical cores that a software thread is allowed to be scheduled onto
– E.g., via the “taskset” command or the sched_setaffinity system call
• Isolation of logical cores:
– Isolating a logical core REMOVES it from the general pool that threads can be scheduled onto, EXCEPT when a thread is specifically pinned to that logical core
– E.g., via the “isolcpus” kernel boot parameter (check /boot/grub*/grub*.cfg)
DEMO
• Sorry, nothing to do with virtualization, but we will use a VM (treating it as a host)
• ps -eL -o cpuid -o tid -o pid -o comm -o stat -o command | egrep -i 'PID|ptsm'
• taskset -p <mask> <tid>
• ssh tpc-e3-12-019 ...