Eliminating Receive Livelock in an Interrupt-Driven Kernel
J. C. Mogul and K. K. Ramakrishnan
Presented by I. Kim, 01/04/13
Introduction
• Interrupts vs. polling
– Interrupts are designed for relatively slow I/O devices
• Target applications
– Host-based routing, passive network monitoring, network file systems
– High event rates and protocols without flow-control mechanisms
• Receive livelock
– Interrupt handlers eat up all system resources handling input events
– Starvation of normal kernel and user threads
Requirements
• High throughput
– Maximum Loss Free Receive Rate (MLFRR)
• Low latency and jitter
• Fair allocation of resources
– Packet reception, protocol processing, transmission, and other tasks
• Overall stability
Traditional Interrupt-Driven System (4.2 BSD)
• Packet arrival -> interrupt -> device driver (buffer management + data-link processing) -> software interrupt
• Several queues between the steps
– Overload => queue overflow => drop
• Batch processing of bursty traffic
• Receive livelock, longer latency, and starvation of transmit processing
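The collapse described above can be illustrated with a toy model. This is a minimal sketch, not the authors' analysis: it assumes a single CPU, a fixed per-packet interrupt cost, a fixed per-packet protocol-processing cost, and strict priority for interrupt work. The function names and the cost values are hypothetical.

```python
def delivered_rate(input_rate, intr_cost, proc_cost):
    """Packets/s delivered when interrupt work preempts everything else.

    intr_cost: CPU seconds spent in the interrupt handler per received packet.
    proc_cost: CPU seconds of lower-priority protocol processing per packet.
    Both costs are assumptions of this toy model.
    """
    # Interrupt-level work is served first and may use up to the whole CPU.
    intr_share = min(1.0, input_rate * intr_cost)
    # Only the CPU time left over is available for protocol processing.
    leftover = 1.0 - intr_share
    return min(input_rate, leftover / proc_cost)


def mlfrr(intr_cost, proc_cost):
    """Highest input rate at which this model drops no packets."""
    return 1.0 / (intr_cost + proc_cost)


if __name__ == "__main__":
    intr_cost, proc_cost = 50e-6, 150e-6  # illustrative values, not measured
    for rate in (1000, 5000, 10000, 15000, 20000):
        print(rate, round(delivered_rate(rate, intr_cost, proc_cost)))
```

In this model the delivered rate tracks the input rate up to the MLFRR, then falls as interrupt handling takes a growing share of the CPU, and reaches zero once interrupts alone saturate it.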
Latency induced by interrupt and batch processing
Avoiding Livelock
• Limiting the interrupt rate (adapting between interrupts and polling)
– Packet drop upon reception
• Disable interrupts temporarily and re-enable them later
– High/low watermarks on CPU occupancy
• Polling
– Round-robin through registered devices
• No preemption while processing a packet already received
– Ensure work conservation
– Do no packet processing in the receive interrupt handler
– Remove almost all queues in the IP processing chain
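The interplay of these points can be sketched as a small simulation. This is a sketch under stated assumptions, not the kernel code from the paper: `Device`, `PollingKernel`, and their methods are hypothetical names, and forwarding is reduced to appending to a list.

```python
from collections import deque


class Device:
    """Hypothetical NIC model: a receive ring plus an interrupt-enable flag."""

    def __init__(self, name):
        self.name = name
        self.rx_ring = deque()
        self.intr_enabled = True


class PollingKernel:
    """Toy kernel that switches from interrupts to polling under load."""

    def __init__(self, devices):
        self.devices = list(devices)  # devices registered for polling
        self.poll_pending = False
        self.forwarded = []

    def rx_interrupt(self, dev):
        # The handler does no packet processing: it masks further receive
        # interrupts from this device and requests a run of the poll loop.
        dev.intr_enabled = False
        self.poll_pending = True

    def process_to_completion(self, dev, pkt):
        # Stand-in for IP forwarding. The packet is handled in one step,
        # with no intermediate queue between driver and protocol code.
        self.forwarded.append((dev.name, pkt))

    def poll(self):
        # Round-robin over registered devices, one packet per visit,
        # until every receive ring is empty.
        while self.poll_pending:
            did_work = False
            for dev in self.devices:
                if dev.rx_ring:
                    self.process_to_completion(dev, dev.rx_ring.popleft())
                    did_work = True
            if not did_work:
                # Nothing left to do: return to interrupt-driven operation.
                for dev in self.devices:
                    dev.intr_enabled = True
                self.poll_pending = False


if __name__ == "__main__":
    a, b = Device("a"), Device("b")
    kernel = PollingKernel([a, b])
    a.rx_ring.extend([1, 2, 3])
    b.rx_ring.append(10)
    kernel.rx_interrupt(a)
    kernel.poll()
    print(kernel.forwarded)
```

Taking one packet per device per pass keeps a busy interface from starving a quiet one, and re-enabling interrupts only when all rings are empty keeps latency low when the system is idle.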
Measurements
• Methodology
– IP packet router configuration
– DEC 3000/300 Alpha-based system running Digital UNIX 3.2 (OSF/1): rather slow
– 3000/400 as the packet generator
– 10,000 UDP packets with a 4-byte payload
– Count packets from netstat output (Opkts)
– With and without screend (user-level packet filter)
Receive Livelock Example
Other Scheduling Heuristics
• Quota on packet burst
• Feedback from full queue
• Cycle limit on device driver
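The three heuristics above can be sketched in a few lines. This is an illustrative sketch only: `poll_with_quota`, `CycleLimiter`, and all parameter names are hypothetical, and CPU time is charged by the caller rather than measured.

```python
from collections import deque


def poll_with_quota(rx_ring, out_queue, quota, out_queue_limit):
    """Move at most `quota` packets from rx_ring to out_queue.

    Returns the number of packets moved. Input handling stops early when
    out_queue is full (queue-state feedback), so excess packets stay in
    the receive ring instead of being processed and then dropped.
    """
    moved = 0
    while rx_ring and moved < quota:
        if len(out_queue) >= out_queue_limit:
            break  # downstream queue is full: inhibit further input
        out_queue.append(rx_ring.popleft())
        moved += 1
    return moved


class CycleLimiter:
    """Inhibit input handling once it has used its share of a period."""

    def __init__(self, period, fraction):
        self.budget = period * fraction  # CPU seconds allowed per period
        self.used = 0.0

    def charge(self, cpu_time):
        # Caller reports CPU time spent on packet input.
        self.used += cpu_time

    def input_allowed(self):
        return self.used < self.budget

    def new_period(self):
        # Called at each period boundary, for example from a clock tick.
        self.used = 0.0
```

The quota bounds how long one device can hold the CPU per poll, the queue check keeps input from outrunning a slow consumer, and the cycle limit reserves a share of each period for other work.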
Other performance results
• End-system transport protocols
– Benefits user-level delivery performance
• Promiscuous network monitor
Concluding Remarks
• Related and future work
– Clocked interrupts
– 4.3 BSD terminal I/O
– Lazy Receiver Processing
– Feedback-based scheduling algorithm
– Early packet drop
– Extension to multiprocessor kernels
• This research led to improved performance in the Click project