Post on 19-Dec-2015
Prelude to Multiprocessing
Detecting cpu and system-board capabilities with CPUID and the
MP Configuration Table
CPUID
• Recent Intel processors provide a ‘cpuid’ instruction (opcode 0x0F, 0xA2) to assist software in detecting a CPU’s capabilities
• If it’s implemented, this instruction can be executed in any of the processor modes, and at any of its four privilege levels
• But this ‘cpuid’ instruction might not be implemented (e.g., 8086, 80286, 80386)
Intel x86 EFLAGS register
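Bits 31-16: 0 (bits 31-22) | ID (21) | VIP (20) | VIF (19) | AC (18) | VM (17) | RF (16)
Bits 15-0: 0 (15) | NT (14) | IOPL (13-12) | OF (11) | DF (10) | IF (9) | TF (8) | SF (7) | ZF (6) | 0 (5) | AF (4) | 0 (3) | PF (2) | 1 (1) | CF (0)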
Software can ‘toggle’ the ID-bit (bit #21) in the 32-bit EFLAGS register if the processor is capable of executing the ‘cpuid’ instruction
But what if there’s no EFLAGS?
• The early Intel processors (8086, 80286) did not implement any 32-bit registers
• The FLAGS register was only 16-bits wide
• So there was no ID-bit that software could try to ‘toggle’ (to see if ‘cpuid’ existed)
• How can software be sure that the 32-bit EFLAGS register exists within the CPU?
Detecting 32-bit processors
• There’s a subtle difference in the way the logical shift/rotate instructions work when register CL contains the ‘shift-factor’
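• On the 32-bit processors (80386+) the value in CL is truncated to 5-bits, so a shift-factor of 32 behaves like a shift-factor of 0; the 8086/8088 uses the full value in CL and shifts every bit out of the register
• Note that the 80286 also truncates CL to 5-bits, so the shift test alone separates the 8086/8088 from later CPUs; to tell an 80286 from an 80386, software can also try to set bits 12-15 of FLAGS, which stay clear on an 80286 in real mode
• Software can exploit these distinctions, in order to tell if EFLAGS is implemented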
Detecting EFLAGS
# Here’s a test for the presence of EFLAGS
mov $-1, %ax # a nonzero value
mov $32, %cl # shift-factor of 32
shl %cl, %ax # do logical shift
or %ax, %ax # test result in AX
jnz is32bit # EFLAGS present
jmp is16bit # EFLAGS absent
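The effect of the shift test can be modeled in C without touching real hardware. This is only a sketch: 'model_shl16' and its 'masks_count' parameter are hypothetical names introduced here, where a nonzero 'masks_count' stands for a CPU that truncates CL to 5-bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of 'shl %cl, %ax' on the 16-bit register AX.
   masks_count != 0 models a CPU that keeps only the low 5 bits of CL. */
uint16_t model_shl16(uint16_t value, unsigned count, int masks_count)
{
    assert(count <= 0xFF);                 /* CL is an 8-bit register */
    unsigned n = masks_count ? (count & 0x1F) : count;
    if (n >= 16)
        return 0;                          /* every bit was shifted out */
    return (uint16_t)((uint32_t)value << n);
}
```

With AX = 0xFFFF and CL = 32, the masking model leaves AX unchanged (nonzero), while the non-masking model returns 0, which is the distinction the 'jnz' above relies on.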
Testing for ID-bit ‘toggle’
# Here’s a test for the presence of the CPUID instruction
pushfl # copy EFLAGS contents
pop %eax # to accumulator register
mov %eax, %edx # save a duplicate image
btc $21, %eax # toggle the ID-bit (bit 21)
push %eax # copy revised contents
popfl # back into EFLAGS
pushfl # copy EFLAGS contents
pop %eax # back into accumulator
xor %edx, %eax # do XOR with prior value
bt $21, %eax # did ID-bit get toggled?
jc y_cpuid # yes, can execute ‘cpuid’
jmp n_cpuid # else ‘cpuid’ unimplemented
How does CPUID work?
• Step 1: load value 0 into register EAX
• Step 2: execute ‘cpuid’ instruction
• Step 3: Verify the ‘GenuineIntel’ character-string in registers (EBX, EDX, ECX)
• Step 4: Find the maximum CPUID input-value in the EAX register
• Step 5: Load 1 into EAX and execute ‘cpuid’ again
• Processor model and stepping information is returned in register EAX
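As an illustration of Step 3, the sketch below assembles the vendor string from three register values that are assumed to have been saved after executing ‘cpuid’ with EAX = 0. The function names 'cpuid_vendor' and 'is_genuine_intel' are hypothetical, chosen here for the example; the code does not execute ‘cpuid’ itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the 12 vendor characters into 'out' (13 bytes, NUL-terminated),
   taking the registers in the order EBX, EDX, ECX. */
void cpuid_vendor(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    assert(out != NULL);
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)   /* low byte holds the first character */
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
    out[12] = '\0';
}

/* Returns nonzero when the three registers spell "GenuineIntel". */
int is_genuine_intel(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    char vendor[13];
    cpuid_vendor(ebx, edx, ecx, vendor);
    return strcmp(vendor, "GenuineIntel") == 0;
}
```

For example, EBX = 0x756E6547, EDX = 0x49656E69 and ECX = 0x6C65746E hold the characters "Genu", "ineI" and "ntel".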
Version and Features
Register EAX after ‘cpuid’ with input-value 1:
bits 27-20: Extended Family ID
bits 19-16: Extended Model ID
bits 13-12: Type
bits 11-8: Family ID
bits 7-4: Model
bits 3-0: Stepping ID
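The version fields can be pulled out of EAX with shifts and masks. The following is a minimal sketch; the struct 'cpu_version' and the function 'decode_version' are hypothetical names used only for this illustration, and the EAX value is assumed to have been saved from ‘cpuid’ with input-value 1.

```c
#include <assert.h>
#include <stdint.h>

struct cpu_version {
    unsigned stepping, model, family, type, ext_model, ext_family;
};

/* Split the EAX value into its bit-fields. */
void decode_version(uint32_t eax, struct cpu_version *v)
{
    assert(v != 0);
    v->stepping   =  eax        & 0xF;    /* bits 3-0   */
    v->model      = (eax >> 4)  & 0xF;    /* bits 7-4   */
    v->family     = (eax >> 8)  & 0xF;    /* bits 11-8  */
    v->type       = (eax >> 12) & 0x3;    /* bits 13-12 */
    v->ext_model  = (eax >> 16) & 0xF;    /* bits 19-16 */
    v->ext_family = (eax >> 20) & 0xFF;   /* bits 27-20 */
}
```

For instance, an EAX value of 0x000106A5 decodes to family 6, model 0xA, stepping 5, with an extended model of 1.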
Some Feature Flags in EDX
HTT (bit 28) = HyperThreading Technology (1 = yes, 0 = no)
PGE (bit 13) = Page Global Entries (1 = yes, 0 = no)
APIC (bit 9) = Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
PSE (bit 3) = Page-Size Extensions (1 = yes, 0 = no)
DE (bit 2) = Debugging Extensions (1 = yes, 0 = no)
VME (bit 1) = Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
FPU (bit 0) = Floating-Point Unit on-chip (1 = yes, 0 = no)
Some Feature Flags in ECX
VMX (bit 5) = Virtual Machine Extensions (1 = yes, 0 = no)
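Testing one of these flags is a single shift-and-mask. The sketch below assumes the EDX and ECX values were saved after ‘cpuid’ with input-value 1; the enum names and the helper 'has_feature' are hypothetical, introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions of the feature flags listed above. */
enum { EDX_FPU = 0, EDX_VME = 1, EDX_DE = 2, EDX_PSE = 3,
       EDX_APIC = 9, EDX_PGE = 13, EDX_HTT = 28 };
enum { ECX_VMX = 5 };

/* Returns 1 if the given bit is set in the register value, else 0. */
int has_feature(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return (int)((reg >> bit) & 1u);
}
```

A caller would write, for example, has_feature(edx, EDX_HTT) to check for HyperThreading, or has_feature(ecx, ECX_VMX) to check for Virtual Machine Extensions.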
Multiprocessor Specification
• It’s an industry standard, allowing OS software to use multiple processors in a uniform way
• OS software searches in three regions of the physical address-space below 1-megabyte for a “paragraph-aligned” data-structure of length 16-bytes called the MP Floating Pointer Structure:
– Search in lowest KB of Extended BIOS Data Area
– Search in topmost KB of conventional 640K RAM
– Search in the 128KB ROM-BIOS (0xE0000-0xFFFFF)
MP Floating Pointer Structure
• This structure may contain an ID-number for one of a small number of standard SMP system architectures, or may contain the memory address of a more extensive MP Configuration Table having entries that specify a “customized” system architecture
• The machines in our classroom employ the latter of these two options
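The search for the MP Floating Pointer Structure can be sketched as a scan over a copy of one of the memory regions. The details assumed here come from the MP Specification rather than from these slides: the structure begins with the 4-character signature "_MP_", and its 16 bytes add up to zero (modulo 256). The function name 'find_mp_floating_pointer' is hypothetical, and the buffer is assumed to start on a paragraph boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the byte offset of the first valid structure in buf, or -1. */
long find_mp_floating_pointer(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t off = 0; off + 16 <= len; off += 16) {  /* paragraph steps */
        if (memcmp(buf + off, "_MP_", 4) != 0)
            continue;                    /* signature does not match */
        uint8_t sum = 0;
        for (size_t i = 0; i < 16; i++)
            sum = (uint8_t)(sum + buf[off + i]);
        if (sum == 0)                    /* checksum is valid */
            return (long)off;
    }
    return -1;
}
```

A utility would call this once per region, stopping at the first region that yields a non-negative offset.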
An example record
• The MP Configuration Table will contain a record for each logical processor
Processor entry (five 32-bit words, byte 0 at the right):
word 0: CPU Flags: BP (bit 1), EN (bit 0) | Local-APIC version | Local-APIC ID | Entry Type 0
word 1: CPU signature (stepping, model, family)
word 2: Feature Flags
word 3: reserved (=0)
word 4: reserved (=0)
BP = Bootstrap Processor (1=yes, 0=no), EN = Enabled (1=yes, 0=no)
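A processor record like this one can be decoded from its raw 20 bytes. The sketch below follows the byte order shown in the record (Entry Type in byte 0, Local-APIC ID in byte 1, Local-APIC version in byte 2, CPU Flags in byte 3, then two little-endian 32-bit words); the struct 'mp_cpu_entry' and the function names are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

struct mp_cpu_entry {
    uint8_t  apic_id, apic_version;
    int      enabled, bootstrap;
    uint32_t signature, features;
};

/* Read a little-endian 32-bit value from four bytes. */
uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out if e[] holds a processor entry (type 0), else 0. */
int decode_cpu_entry(const uint8_t e[20], struct mp_cpu_entry *out)
{
    assert(e != 0 && out != 0);
    if (e[0] != 0)
        return 0;                        /* some other entry type */
    out->apic_id      = e[1];
    out->apic_version = e[2];
    out->enabled      = e[3] & 1;        /* EN is bit 0 */
    out->bootstrap    = (e[3] >> 1) & 1; /* BP is bit 1 */
    out->signature    = le32(e + 4);
    out->features     = le32(e + 8);
    return 1;
}
```

The 'signature' and 'features' words use the same layouts as the EAX and EDX values returned by ‘cpuid’ with input-value 1.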
Our ‘mpinfo.cpp’ utility
• We created a Linux utility that will display the system-information contained in the MP Configuration Table (in hex format)
• You can refer to the ‘MP Specification 1.4’ document (online) to interpret this display
• This utility needs a device-driver ‘dram.c’ to be pre-installed (in order that it be able to directly access the system’s memory)
A processor’s Local-APIC
• The purpose of each processor’s APIC is to allow the CPUs in a multiprocessor system to send messages to one another and to manage the delivery of the interrupt-requests from the various peripheral devices to one (or more) of the CPUs in a dynamically programmable way
• Each processor’s Local-APIC has a variety of registers, all ‘memory mapped’ to paragraph-aligned addresses within the 4KB page at physical-address 0xFEE00000
Local-APIC’s register-space
[Figure: the 4GB physical address-space, with RAM starting at 0x00000000 and the Local-APIC’s register page at 0xFEE00000]
Analogies with the PIC
• Among the registers in a Local-APIC are these (which had analogues in the older 8259 PIC’s design):
– IRR: Interrupt Request Register (256-bits)
– ISR: In-Service Register (256-bits)
– TMR: Trigger-Mode Register (256-bits)
• For each of these, its 256-bits are divided among eight 32-bit register addresses
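Because the eight 32-bit pieces sit at paragraph-aligned addresses, finding the bit for a given interrupt vector takes a little arithmetic. The sketch below assumes the register offsets documented in Intel’s manuals (ISR at 0x100, TMR at 0x180, IRR at 0x200), which are not stated in these slides; the names 'lapic_vector_reg' and 'lapic_vector_bit' are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LAPIC_BASE 0xFEE00000u

/* Offsets of the first 32-bit piece of each 256-bit register (assumed). */
enum { LAPIC_ISR = 0x100, LAPIC_TMR = 0x180, LAPIC_IRR = 0x200 };

/* Physical address of the 32-bit piece that holds the bit for 'vector'. */
uint32_t lapic_vector_reg(uint32_t bank, unsigned vector)
{
    assert(vector < 256);
    return LAPIC_BASE + bank + (vector / 32) * 0x10;  /* 16-byte stride */
}

/* Position of the vector's bit within that 32-bit piece. */
unsigned lapic_vector_bit(unsigned vector)
{
    return vector % 32;
}
```

Vector 0x20, for example, maps to bit 0 of the IRR piece at 0xFEE00210.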
New way to do ‘EOI’
• Instead of using a special End-Of-Interrupt command-byte, the Local-APIC contains a dedicated ‘write-only’ register (named the EOI Register) which an Interrupt Handler writes to when it is ready to signal an EOI
# issuing EOI to the Local-APIC
mov $0xFEE00000, %ebx # address of the cpu’s Local-APIC
movl $0, %fs:0xB0(%ebx) # write any value into EOI register
# Here we assume segment-register FS holds the selector for a segment-descriptor
# for a ‘writable’ 4GB-size expand-up data-segment whose base-address equals 0
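The same EOI write can be expressed in C, for code that already has the Local-APIC page mapped. This is only a sketch: the function 'lapic_eoi' is a hypothetical name, and its 'lapic_page' argument is assumed to point at a mapping of the 4KB page at 0xFEE00000 (or, for testing, at an ordinary array standing in for it).

```c
#include <assert.h>
#include <stdint.h>

#define LAPIC_EOI_OFFSET 0xB0u   /* EOI register's offset within the page */

/* Signal End-Of-Interrupt by writing to the EOI register. */
void lapic_eoi(volatile uint32_t *lapic_page)
{
    assert(lapic_page != 0);
    lapic_page[LAPIC_EOI_OFFSET / 4] = 0;   /* any value would do */
}
```

The 'volatile' qualifier keeps the compiler from removing or reordering the store, which matters because the write has a side-effect in the hardware.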
Each CPU has its own timer!
• Four of the Local-APIC registers are used to implement a programmable timer
• It can privately deliver a periodic interrupt (or one-shot interrupt) just to its own CPU
– 0xFEE00320: Timer Vector register
– 0xFEE00380: Initial Count register
– 0xFEE00390: Current Count register
– 0xFEE003E0: Divider Configuration register
Timer’s Local Vector Table
Register at 0xFEE00320:
bits 7-0: Interrupt ID-number
bit 12: BUSY (0=not busy, 1=busy)
bit 16: MASK (0=unmasked, 1=masked)
bit 17: MODE (0=one-shot, 1=periodic)
Timer’s ‘Divide-Configuration’
Register at 0xFEE003E0: bits 31-4 are reserved (=0), and bit 2 is 0
Divider-Value field (bits 3, 1, and 0):
000 = divide by 2
001 = divide by 4
010 = divide by 8
011 = divide by 16
100 = divide by 32
101 = divide by 64
110 = divide by 128
111 = divide by 1
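Since the Divider-Value field skips bit 2, converting between a divisor and the register value needs a small bit shuffle. The following sketch shows one way to do it; the function names 'timer_divide_bits' and 'timer_divisor' are hypothetical, introduced for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Register value for a divisor of 1, 2, 4, ..., 128; -1 if unsupported. */
int timer_divide_bits(unsigned divisor)
{
    unsigned code;                       /* the 3-bit Divider-Value */
    switch (divisor) {
    case 2:   code = 0; break;
    case 4:   code = 1; break;
    case 8:   code = 2; break;
    case 16:  code = 3; break;
    case 32:  code = 4; break;
    case 64:  code = 5; break;
    case 128: code = 6; break;
    case 1:   code = 7; break;
    default:  return -1;
    }
    /* the top bit of the code goes to bit 3; bit 2 stays 0 */
    return (int)(((code & 4) << 1) | (code & 3));
}

/* Divisor selected by a register value. */
unsigned timer_divisor(uint32_t reg)
{
    assert((reg & 0x4) == 0);            /* bit 2 must be 0 */
    unsigned code = ((reg >> 1) & 4) | (reg & 3);
    return code == 7 ? 1u : (2u << code);
}
```

For example, "divide by 32" (field value 100) is written to the register as 0x8, and "divide by 1" (field value 111) as 0xB.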
Initial and Current Counts
Initial Count Register (read/write): 0xFEE00380
Current Count Register (read-only): 0xFEE00390
When the timer is programmed for ‘periodic’ mode, the Current Count counts down at the CPU’s bus-clock rate divided by the Divide Configuration value, generates an interrupt when it reaches zero, and is then automatically reloaded from the Initial Count register
Using the timer’s interrupts
• Set up your desired Initial Count value
• Select your desired Divide Configuration
• Set up the APIC-timer’s LVT register with your desired interrupt-ID number and counting mode (‘periodic’ or ‘one-shot’), and clear the LVT register’s ‘Mask’ bit to initiate the automatic countdown operation
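The three steps above can be sketched in C as three register writes, in the same order. This is an illustration under assumptions, not the course’s own code: 'lapic_timer_start' is a hypothetical name, 'page' is assumed to point at a mapping of the Local-APIC’s 4KB page (or at an array standing in for it), and 'divide_bits' is assumed to be an already-encoded Divide Configuration value.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets of the timer registers within the Local-APIC page. */
enum { TIMER_LVT = 0x320, TIMER_INITIAL = 0x380, TIMER_DIVIDE = 0x3E0 };

void lapic_timer_start(volatile uint32_t *page, uint8_t vector,
                       int periodic, uint32_t divide_bits,
                       uint32_t initial_count)
{
    assert(page != 0);
    page[TIMER_INITIAL / 4] = initial_count;   /* step 1 */
    page[TIMER_DIVIDE / 4]  = divide_bits;     /* step 2 */
    /* step 3: vector in bits 7-0, MODE in bit 17, MASK (bit 16) clear */
    page[TIMER_LVT / 4] = (uint32_t)vector | (periodic ? (1u << 17) : 0u);
}
```

For instance, vector 0x20 in ‘periodic’ mode produces the LVT value 0x00020020.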
In-class exercise #1
• Run the ‘cpuid.cpp’ Linux application (on our course website) to see if the CPUs in our classroom implement HyperThreading (i.e., multiple logical processors in a cpu)
• Then run the ‘mpinfo.cpp’ application, to see if the MP Base Configuration Table has entries for more than one processor
• If both results hold true, then we can write our own multiprocessing software in H235!
In-class exercise #2
• Run the ‘apictick.s’ demo (on our CS 630 website) to observe the APIC’s ‘periodic’ interrupt-handler drawing ‘T’s onscreen
• It executes for ten-milliseconds (the 8254 is used here to create that timed delay)
• Try reprogramming the APIC’s Divider Configuration register, to cut the interrupt frequency in half (or perhaps to double it)