Linux Kernel Programming
Yongguang Zhang
Department of Computer Sciences THE UNIVERSITY OF TEXAS AT AUSTIN
CS 378 (Spring 2003)
Copyright 2003, Yongguang Zhang
Spring 2003 © 2003 Yongguang Zhang 3
Interrupts
• Hardware Interrupt (IRQ):
  – A signal that the hardware sends to get the kernel's attention
  – A limited number of interrupt lines (i386: 16)
  – With APIC (mostly for SMP): 256 lines
• See all interrupts: /proc/interrupts
                 CPU0       CPU1
        0:  129487264          0   IO-APIC-edge  timer
        1:     103346          0   IO-APIC-edge  keyboard
Exceptions
• CPU execution problems (e.g., divide-by-zero, overflow, page fault)
  – Different from a hardware interrupt (IRQ)
• Maximum of 32 exceptions (on i386)
  – 0: divide-by-zero
  – 1: debug
  – 4: overflow
  – 14: page fault
  – ...
Interrupt Descriptor Table
• A vector of interrupt/exception handlers
  – Both types of handlers in one table
• IDT data structure
  – Parameters: include/asm-i386/hw_irq.h
        #define FIRST_EXTERNAL_VECTOR 0x20
        #define SYSCALL_VECTOR 0x80
  – Source code: arch/i386/kernel/traps.c
  – Functions: set_intr_gate(n,addr), set_system_gate(n,addr), set_trap_gate(n,addr)
IDT (i386)
• 0x00-0x1f: exception handlers
  – arch/i386/kernel/traps.c : trap_init()
        set_trap_gate(0,&divide_error);
• 0x20-0xff (except 0x80): IRQ handlers
  – IRQn -> IDT(n+0x20)
  – arch/i386/kernel/i8259.c : init_IRQ()
        set_intr_gate(vector,interrupt[i]);
  – interrupt[0xab] macro: IRQ0xab_interrupt
• 0x80: system call
  – arch/i386/kernel/traps.c : trap_init()
        set_system_gate(SYSCALL_VECTOR,&system_call);
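The vector layout above can be sketched as a small user-space C model. This is not kernel code: the two #define values come from the slide, while NR_VECTORS, the enum, and the function names classify_vector() and irq_to_vector() are made up for illustration. The simple n+0x20 mapping is only assumed to hold for the 16 legacy IRQ lines.

```c
#include <assert.h>

#define FIRST_EXTERNAL_VECTOR 0x20  /* from the slide */
#define SYSCALL_VECTOR        0x80  /* from the slide */
#define NR_VECTORS            256   /* assumed IDT size for this model */

/* Hypothetical labels for the three kinds of IDT entries. */
enum vector_kind { VEC_EXCEPTION, VEC_IRQ, VEC_SYSCALL };

/* Classify an IDT vector according to the layout on the slide. */
enum vector_kind classify_vector(unsigned int vector)
{
    assert(vector < NR_VECTORS);
    if (vector < FIRST_EXTERNAL_VECTOR)
        return VEC_EXCEPTION;       /* 0x00-0x1f */
    if (vector == SYSCALL_VECTOR)
        return VEC_SYSCALL;         /* 0x80 */
    return VEC_IRQ;                 /* everything else from 0x20 up */
}

/* IRQn -> IDT(n+0x20), as on the slide. */
unsigned int irq_to_vector(unsigned int irq)
{
    return irq + FIRST_EXTERNAL_VECTOR;
}
```

For example, irq_to_vector(1) gives 0x21 for the keyboard line, and classify_vector(14) reports an exception (page fault).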
IRQ Firing
• Registers are saved and all local interrupts are blocked
• Invoke the IDT vector at n+0x20 (IRQxxx_interrupt)
• IRQ0xab_interrupt: calls common_interrupt
  – Both are assembly code in include/asm-i386/hw_irq.h
• common_interrupt: calls the C function do_IRQ()
• do_IRQ(struct pt_regs regs):
  – Maintains IRQ status and locks
  – Calls handle_IRQ_event(irq, regs, action)
  – Enables interrupts and checks softirq
IRQ Descriptor
• Data structure to store the status and handlers for each IRQ
  – Data structure: irq_desc_t in include/linux/irq.h
  – Why both IDT and IRQ descriptor? The IRQ descriptor stores more information (e.g., an IRQ list) for Linux use
• IRQ action
  – One for each IRQ handler
  – struct irqaction
• Array of all IRQ descriptors:
        extern irq_desc_t irq_desc [NR_IRQS];
IRQ Descriptor Illustration
[Figure: each entry (0..n) of irq_desc[NR_IRQS] holds status, handler, action, depth, and lock. handler points to a struct hw_interrupt_type (ack(), enable(), disable(), shutdown(), ...). action points to a linked list of struct irqaction (handler(), flags, mask, name, dev_id, next), where handler() is the registered interrupt handler function. More than one irqaction only if SA_SHIRQ.]
Interrupt Handler Routine
• Insert an IRQ action (struct irqaction) into the IRQ's action list
        int setup_irq(unsigned int irq, struct irqaction *new);
• Register an interrupt handler routine
        int request_irq(unsigned int irq, void (*handler)(), unsigned long flags, const char *dev_name, void *dev_id);
• To free:
        void free_irq(unsigned int irq, void *dev_id);
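How registering and freeing might manipulate a per-IRQ action list can be sketched in user-space C. This is a simplified model, not the kernel implementation: the model_ names, the handler signature, and the MODEL_SHIRQ value are all made up for illustration, and the sharing rule (the line's first action and the new action must both carry the share flag) is an assumption about how SA_SHIRQ is checked.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IRQS     16      /* model size: the 16 i386 lines */
#define MODEL_SHIRQ 0x1UL   /* stand-in for SA_SHIRQ; the value is made up */

struct model_irqaction {
    void (*handler)(int irq, void *dev_id);  /* simplified signature */
    unsigned long flags;
    const char *name;
    void *dev_id;
    struct model_irqaction *next;
};

/* One action list head per IRQ line, like the action field of irq_desc[]. */
static struct model_irqaction *action_list[NR_IRQS];

/* Append new_action to the IRQ's list. If the line is already in use,
 * both sides must allow sharing. Returns 0 on success, -1 if busy. */
int model_setup_irq(unsigned int irq, struct model_irqaction *new_action)
{
    struct model_irqaction **p;

    assert(irq < NR_IRQS);
    p = &action_list[irq];
    if (*p != NULL) {
        if (!((*p)->flags & new_action->flags & MODEL_SHIRQ))
            return -1;
        while (*p != NULL)          /* walk to the end of the list */
            p = &(*p)->next;
    }
    new_action->next = NULL;
    *p = new_action;
    return 0;
}

/* Unlink the action registered with this dev_id. Returns 0 if found. */
int model_free_irq(unsigned int irq, void *dev_id)
{
    struct model_irqaction **p;

    assert(irq < NR_IRQS);
    for (p = &action_list[irq]; *p != NULL; p = &(*p)->next) {
        if ((*p)->dev_id == dev_id) {
            *p = (*p)->next;
            return 0;
        }
    }
    return -1;
}

/* Count the handlers currently attached to a line. */
int model_count_actions(unsigned int irq)
{
    const struct model_irqaction *a;
    int n = 0;

    assert(irq < NR_IRQS);
    for (a = action_list[irq]; a != NULL; a = a->next)
        n++;
    return n;
}
```

In this model dev_id is what tells the entries on a shared line apart, which is why freeing takes it as an argument.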
Interrupt Handler Flag
• SA_INTERRUPT
  – Handler will execute with all local interrupts disabled
  – If this is not given, interrupts will be enabled before calling the handlers in handle_IRQ_event()
• SA_SHIRQ
  – This IRQ line can be shared with other devices
  – Can call request_irq() more than once
• SA_SAMPLE_RANDOM
  – Used by the kernel random number generator
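The effect of SA_INTERRUPT on the handler call can be sketched with a user-space model. Everything here is illustrative: local_irqs_enabled stands in for the CPU's interrupt flag, MODEL_INTERRUPT stands in for SA_INTERRUPT with a made-up value, and the model looks only at the first action's flags, which is an assumption about the kernel's behavior.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_INTERRUPT 0x1UL  /* stand-in for SA_INTERRUPT; value is made up */

static int local_irqs_enabled; /* 1 = enabled, 0 = blocked (model of the CPU flag) */

struct model_action {
    void (*handler)(int irq, void *dev_id);  /* simplified signature */
    unsigned long flags;
    void *dev_id;
    struct model_action *next;
};

/* Run every handler on the list. Interrupts are blocked on entry; they are
 * re-enabled first unless the first action carries MODEL_INTERRUPT, and they
 * are blocked again before returning. Returns the number of handlers run. */
int model_handle_irq_event(int irq, struct model_action *action)
{
    int calls = 0;

    assert(action != NULL);
    if (!(action->flags & MODEL_INTERRUPT))
        local_irqs_enabled = 1;
    for (; action != NULL; action = action->next) {
        action->handler(irq, action->dev_id);
        calls++;
    }
    local_irqs_enabled = 0;
    return calls;
}

/* Example handler: records whether interrupts were enabled while it ran. */
void record_irq_state(int irq, void *dev_id)
{
    (void)irq;
    *(int *)dev_id = local_irqs_enabled;
}
```

A handler registered with the flag observes interrupts blocked; one registered without it observes them enabled, so it can itself be interrupted.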
Nested Interrupts/Exceptions
• Without SA_INTERRUPT, an interrupt handler can be further interrupted
• Exceptions cannot be nested, assuming no kernel bugs
  – Except a page fault in a system call (which is an exception after all)
  – In fact, exceptions other than page fault should never be raised in kernel mode
• Exceptions can be interrupted by an interrupt
Illustration 1
[Figure: timeline across user mode and kernel mode, with labels: Exception (system call), Interrupt, Interrupt, Return from interrupt, Return from system call.]
Interrupt Context
• The context in which the interrupt handler runs
• Limitations
  – Cannot sleep
  – Cannot access user space
  – Cannot call the scheduler
  – Can only allocate memory with GFP_ATOMIC
• Interrupt handlers must finish quickly
  – So as not to keep interrupts blocked for long (all local interrupts stay blocked while the handler runs if SA_INTERRUPT is set)
  – How to perform longer tasks within a handler?
Deferred Invocation
• Linux allows “subtasks” (function calls) to be deferred to a “later” time
  – Interrupt handler does only the most critical tasks
  – Registers the rest of the tasks to be executed later
  – Interrupt handler then returns as soon as possible
• Data structure for deferred invocation: tasklet
• Typical scenario:
  – Interrupt handler saves device data to a device-specific buffer, schedules a tasklet, and returns
  – The tasklet runs later to perform whatever other work is required to finish the interrupt handling
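The typical scenario can be sketched as a user-space model of the split between the quick handler and the deferred work. All names here (dev_buf, work_pending, model_irq_handler(), model_deferred_work()) are hypothetical, a plain flag stands in for "tasklet scheduled", and summing the bytes stands in for whatever slow processing a real driver would do.

```c
#include <stddef.h>

#define BUF_SIZE 16             /* hypothetical device buffer size */

static unsigned char dev_buf[BUF_SIZE];
static size_t dev_len;          /* bytes waiting in dev_buf */
static int work_pending;        /* stands in for "tasklet scheduled" */
static unsigned long checksum;  /* result of the slow, deferred work */

/* Quick part: store the byte, note that work is pending, return at once. */
void model_irq_handler(unsigned char byte_from_device)
{
    if (dev_len < BUF_SIZE)
        dev_buf[dev_len++] = byte_from_device;
    work_pending = 1;
}

/* Deferred part: runs later and finishes the processing of the buffer. */
void model_deferred_work(void)
{
    size_t i;

    if (!work_pending)
        return;
    work_pending = 0;
    for (i = 0; i < dev_len; i++)
        checksum += dev_buf[i];
    dev_len = 0;
}
```

Two calls to model_irq_handler() followed by one call to model_deferred_work() process both bytes in a single deferred pass.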
Tasklet
• A Really Really Lightweight Task (Thread)
  – A function call in the interrupt context
  – A kernel routine/function with an argument, which can be scheduled to run later (at a system-determined safe time)
• Bottom Half?
  – A historical mechanism for deferred invocation
  – “Top half”: handler registered by request_irq()
  – In 2.4 kernel: implemented as a tasklet
  – In 2.6 kernel: no more bottom half
Tasklet Properties
• Guaranteed to run once
  – Once scheduled, a tasklet is guaranteed to be executed once after that
  – An already scheduled but not yet executed tasklet can be rescheduled, but will be executed only once
• Once a tasklet starts running, it can be rescheduled to run again later
• Tasklet is strictly serialized (no nesting)
• Different tasklets can run simultaneously on different CPUs
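The "runs once per scheduling" property can be sketched with a single-CPU, user-space model. The model_ names and the plain int used for the state bit are made up for illustration, and passing a pointer through the unsigned long argument assumes a platform where the two have the same size, as on Linux.

```c
#include <stddef.h>

struct model_tasklet {
    struct model_tasklet *next;
    int scheduled;                 /* models the "scheduled" state bit */
    void (*func)(unsigned long);
    unsigned long data;
};

static struct model_tasklet *pending_list;  /* one list, one CPU in this model */

/* Queue t unless it is already waiting: a second call before it runs is a no-op. */
void model_tasklet_schedule(struct model_tasklet *t)
{
    if (t->scheduled)
        return;
    t->scheduled = 1;
    t->next = pending_list;        /* add to the beginning of the list */
    pending_list = t;
}

/* Run everything queued so far. The state bit is cleared before the call,
 * so a tasklet may schedule itself again from inside func(). */
void model_run_tasklets(void)
{
    struct model_tasklet *list = pending_list;

    pending_list = NULL;
    while (list != NULL) {
        struct model_tasklet *t = list;

        list = list->next;
        t->scheduled = 0;
        t->func(t->data);
    }
}

/* Example deferred function: bump the counter whose address is in data. */
void count_run(unsigned long data)
{
    (*(int *)data)++;
}
```

Scheduling the same tasklet twice and then running the list calls count_run() once; scheduling it again afterwards lets it run a second time.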
Tasklet Data Structure
• Deferred function and the argument
• Data Structure: in include/linux/interrupt.h
        struct tasklet_struct
        {
            struct tasklet_struct *next;
            unsigned long state;
            atomic_t count;
            void (*func)(unsigned long);
            unsigned long data;
        };
• Two tasklet lists per CPU (one higher priority)
Tasklet Lists
[Figure: tasklet_vec[NR_CPUS] and tasklet_hi_vec[NR_CPUS], each with entries 0..n. Each entry heads a linked list of tasklets (next, state, count, func(), data), where func() is the deferred function.]
Declaring Tasklet
• Define a function that takes one argument
  – void function(unsigned long data)
• DECLARE_TASKLET(name,function,data)
  – Declares a tasklet (of type struct tasklet_struct) with the given name, function, and (unsigned long) data value (as the argument when function is later called).
• DECLARE_TASKLET_DISABLED(name,function,data)
  – Declares a tasklet but with initial state “disabled”: it can be scheduled but will not be executed until enabled at some future time
Using Tasklet
• To schedule a tasklet to run soon:
        void tasklet_schedule(struct tasklet_struct *t)
        void tasklet_hi_schedule(struct tasklet_struct *t)
  – Both functions: add to the beginning of the corresponding tasklet list, and raise softirq
• Disable or enable a tasklet:
        void tasklet_disable(struct tasklet_struct *t);
        void tasklet_enable(struct tasklet_struct *t);
  – A disabled tasklet can be scheduled but won't run; it will remain in the tasklet list until enabled again
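How a disabled tasklet can stay queued until it is enabled can be sketched in a single-CPU, user-space model. The model_ names are hypothetical, the plain int count stands in for the atomic count field (0 meaning enabled is an assumption of this model), and the pointer passed through unsigned long assumes both have the same size.

```c
#include <stddef.h>

struct model_tasklet {
    struct model_tasklet *next;
    int scheduled;               /* already on the list? */
    int count;                   /* 0 = enabled, >0 = disabled */
    void (*func)(unsigned long);
    unsigned long data;
};

static struct model_tasklet *tasklet_list;

void model_tasklet_disable(struct model_tasklet *t) { t->count++; }
void model_tasklet_enable(struct model_tasklet *t)  { t->count--; }

/* Add to the beginning of the list unless already queued. */
void model_tasklet_schedule(struct model_tasklet *t)
{
    if (t->scheduled)
        return;
    t->scheduled = 1;
    t->next = tasklet_list;
    tasklet_list = t;
}

/* One pass over the list: run enabled tasklets, put disabled ones back.
 * Returns how many disabled tasklets were put back. */
int model_tasklet_action(void)
{
    struct model_tasklet *list = tasklet_list;
    int put_back = 0;

    tasklet_list = NULL;
    while (list != NULL) {
        struct model_tasklet *t = list;

        list = list->next;
        if (t->count == 0) {
            t->scheduled = 0;
            t->func(t->data);
        } else {
            t->next = tasklet_list;   /* still disabled: keep it queued */
            tasklet_list = t;
            put_back++;
        }
    }
    return put_back;
}

/* Example deferred function: bump the counter whose address is in data. */
void count_run(unsigned long data)
{
    (*(int *)data)++;
}
```

A tasklet that starts with count set to 1 behaves like one declared in the disabled state: it can be scheduled right away but only runs on the first pass after model_tasklet_enable().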
Softirq
• “Software IRQ”
  – Mechanism for executing some functions later
  – Used to run the scheduled tasklets
  – Like an exception status: can be raised and checked
  – Unlike a real IRQ: won't interrupt automatically
• 4 types of softirq used
  – TASKLET_SOFTIRQ for tasklet handling
  – HI_SOFTIRQ for high-priority tasklets (and BHs)
  – NET_TX_SOFTIRQ for transmitting packets to NIC
  – NET_RX_SOFTIRQ for receiving packets from NIC
Raising and Checking Softirq
• Softirq must be raised and checked explicitly
• Softirq status per CPU: softirq_pending(cpu)
  – Long integer, one bit per type (so 4 bits are used)
  – Check this macro for any pending softirq
• To raise a softirq: cpu_raise_softirq(cpu,nr)
  – It calls __cpu_raise_softirq(cpu,nr)
  – It then wakes up ksoftirqd
• __cpu_raise_softirq(cpu,nr)
  – Sets the corresponding bit in softirq_pending(cpu)
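The per-CPU pending word can be sketched as a plain bitmask in user-space C. The pending[] array, the model_ function names, and NR_CPUS = 2 are made up for illustration, and the bit numbering is assumed to follow the priority order the slides give (HI first, TASKLET last).

```c
#include <assert.h>

#define NR_CPUS 2   /* model size, chosen arbitrarily */

/* Softirq numbers, assumed to follow the priority order on the slides. */
enum { HI_SOFTIRQ = 0, NET_TX_SOFTIRQ, NET_RX_SOFTIRQ, TASKLET_SOFTIRQ, NR_SOFTIRQS };

static unsigned long pending[NR_CPUS];  /* models softirq_pending(cpu) */

/* Raise: set the bit for softirq nr on the given CPU. */
void model_raise_softirq(int cpu, int nr)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    assert(nr >= 0 && nr < NR_SOFTIRQS);
    pending[cpu] |= 1UL << nr;
}

/* Check: returns 1 if softirq nr is pending on the given CPU, else 0. */
int model_softirq_is_pending(int cpu, int nr)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    assert(nr >= 0 && nr < NR_SOFTIRQS);
    return (int)((pending[cpu] >> nr) & 1UL);
}
```

Raising the same softirq twice leaves a single bit set, so the matching action still runs only once in the next round.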
A “ Later” Time
• Executed by do_softirq() (kernel/softirq.c)
  – At the end of do_IRQ()
  – In the ksoftirqd_CPUn kernel thread
  – Some other places
• Kernel thread ksoftirqd()
  – One thread per CPU, run under nice=19
  – Essentially
        while (softirq_pending(cpu)) {
            do_softirq();
            if (current->need_resched)
                schedule();
        }
do_softirq()
• Won't run if in a nested interrupt, or in another tasklet (serialized)
• Do one round to execute all the deferred functions:
  – Call the action in each softirq_vec[], in the order of HI_, NET_TX, NET_RX, TASKLET_.
  – First action: tasklet_hi_action(): execute all high-priority tasklets one by one
  – Last action: tasklet_action(): execute all normal-priority tasklets one by one
• Check if a new softirq was raised
  – If yes, wake up ksoftirqd to run later
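The "one round, then hand off to ksoftirqd" behavior can be sketched as a single-CPU, user-space model. It follows the simplified description on the slide rather than the actual kernel source: softirq_action[], model_do_softirq(), the wake-up counter, and the two example actions are all made up for illustration.

```c
#include <stddef.h>

enum { HI_SOFTIRQ = 0, NET_TX_SOFTIRQ, NET_RX_SOFTIRQ, TASKLET_SOFTIRQ, NR_SOFTIRQS };

static unsigned long pending;                      /* one CPU in this model */
static void (*softirq_action[NR_SOFTIRQS])(void);  /* one action per softirq */
static int ksoftirqd_wakeups;                      /* counts "wake up ksoftirqd" */

void model_raise(int nr)
{
    pending |= 1UL << nr;
}

/* One round: snapshot and clear the pending bits, run the matching actions
 * from bit 0 (HI) upward, then leave anything raised meanwhile to ksoftirqd. */
void model_do_softirq(void)
{
    unsigned long round = pending;
    int nr;

    pending = 0;
    for (nr = 0; nr < NR_SOFTIRQS; nr++) {
        if (((round >> nr) & 1UL) && softirq_action[nr] != NULL)
            softirq_action[nr]();
    }
    if (pending != 0)
        ksoftirqd_wakeups++;
}

/* Example actions that record the order in which they ran. */
static int order[8];
static int order_len;

void example_hi_action(void)
{
    if (order_len < 8)
        order[order_len++] = HI_SOFTIRQ;
}

void example_tasklet_action(void)
{
    if (order_len < 8)
        order[order_len++] = TASKLET_SOFTIRQ;
    model_raise(HI_SOFTIRQ);   /* new work raised during the round */
}
```

With both example actions installed and both softirqs raised, one call to model_do_softirq() runs the HI action first and the TASKLET action last; the HI bit raised during the round is left pending and counted as a ksoftirqd wake-up.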
Tasklet and Softirq
• tasklet_hi_schedule():
  – Add the tasklet to the beginning of the corresponding tasklet list (tasklet_hi_vec[cpu].list)
  – Raise the corresponding softirq
        cpu_raise_softirq(cpu, HI_SOFTIRQ)
• tasklet_hi_action():
  – Remove each enabled tasklet from the tasklet list and run the deferred function: t->func(t->data);
  – If any disabled tasklet remains in the list, raise the softirq again (not waking ksoftirqd yet, do_softirq() will do that):
        __cpu_raise_softirq(cpu, HI_SOFTIRQ)
Illustration 2
[Figure: timeline across user mode and kernel mode, with labels: Exception (system call), Interrupt, Interrupt, do_IRQ, do_IRQ, Schedule tasklet, do_softirq, Return from interrupt, Return from interrupt, ksoftirqd, Return from system call.]
Limitation of Tasklet
• Still running in interrupt context
  – Cannot sleep
  – Cannot access user space
  – Cannot call the scheduler
  – Can only allocate memory with GFP_ATOMIC
Interrupt Bottom Half
• Backward Compatibility: The old BH structure
  – A list of BH functions (tasklets), one per interrupt
  – Use mark_bh(int nr) to schedule a BH function
  – In include/linux/interrupt.h
        extern struct tasklet_struct bh_task_vec[];
        static inline void mark_bh(int nr)
        {
            tasklet_hi_schedule(bh_task_vec+nr);
        }
Task Queue
• Data Structure for Deferred Execution
  – Extension to tasklet and BH
  – For other types of deferred execution of kernel functions
• Data structure: include/linux/tqueue.h
        struct tq_struct {
            struct list_head list;
            unsigned long sync;
            void (*routine)(void *);
            void *data;
        };
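How such a structure might be queued and run can be sketched in user-space C. This is a model under stated assumptions, not the kernel's task queue code: the model_ names are hypothetical, a simple next pointer replaces the list_head, and treating sync as an "already queued" flag is an assumption about its purpose.

```c
#include <stddef.h>

struct model_tq {
    struct model_tq *next;       /* the slide's struct uses a list_head here */
    unsigned long sync;          /* assumed meaning: nonzero while queued */
    void (*routine)(void *);
    void *data;
};

struct model_task_queue {
    struct model_tq *head;
    struct model_tq *tail;
};

/* Append the entry unless it is already queued. Returns 1 if it was added. */
int model_queue_task(struct model_tq *t, struct model_task_queue *q)
{
    if (t->sync)
        return 0;
    t->sync = 1;
    t->next = NULL;
    if (q->tail != NULL)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
    return 1;
}

/* Detach the whole queue, then call each routine in FIFO order.
 * Returns the number of routines that were called. */
int model_run_task_queue(struct model_task_queue *q)
{
    struct model_tq *t = q->head;
    int n = 0;

    q->head = NULL;
    q->tail = NULL;
    while (t != NULL) {
        struct model_tq *next = t->next;

        t->sync = 0;             /* may be queued again from now on */
        t->routine(t->data);
        t = next;
        n++;
    }
    return n;
}

/* Example routine: bump the counter that data points to. */
void add_one(void *data)
{
    (*(int *)data)++;
}
```

Unlike the tasklet function, the routine here takes a void * argument, so a pointer can be passed without casting through unsigned long.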