working kernel linux

Post on 15-Sep-2014


Linux Kernel Development
Second Edition

Robert Love
Ximian Desktop, Novell

Novell Press, 800 East 96th Street, Indianapolis, Indiana, 46240 USA

Russian edition: Williams Publishing House (www.williamspublishing.com), 2006

32.973.26-018.2.75 13 681.3.07 "c" . .. A. "" : [email protected], http://www.williamspublishing.com 115419, , / 783; 03150, , / 152 , . 13 Linux, 2- . : . . . : ".. " 2006. 448 . : . . . . ISBN 5-8459-1085-4 (.) Linux 2.6, , . : , , , , , VFS, , . Linux. , , . , , . 32.973.26-018.2.7 . , , , Novell Press.

Authorized translation from the English language edition published by Novell Press, Copyright 200 by Pearson Education, Inc. All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Novell Press cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark. Russian language edition is published by Williams Publishing House according to the Agreement with R&I Enterprises International, Copyright 2006.

ISBN 5-8459-1085-4 (.) ISBN 0-72-2720-1 (.) "", 2006 2005 by Pearson Education, Inc., 200

Contents

1. Introduction to the Linux Kernel
2. Getting Started with the Kernel (libc, GNU C)
3. Process Management (task structure, threads in Linux)
4. Process Scheduling
5. System Calls (API, POSIX, syscalls)
6. Interrupts and Interrupt Handlers
7. Bottom Halves and Deferring Work (softirq, ksoftirqd)
8. Kernel Synchronization Introduction
9. Kernel Synchronization Methods (BKL)
10. Timers and Time Management (HZ, jiffies, schedule_timeout())
11. Memory Management (kmalloc(), gfp_mask, kfree(), vmalloc(), percpu)
12. The Virtual Filesystem (Unix, VFS, superblock, inode, dentry, file)
13. The Block I/O Layer (bio, noop)
14. The Process Address Space (mm_struct, VMA, find_vma(), find_vma_prev(), find_vma_intersection(), mmap(), do_mmap(), munmap(), do_munmap())
15. The Page Cache and Page Writeback (address_space, pdflush, bdflush, kupdated)
16. Modules ("Hello, World!")
17. kobjects and sysfs (kobject, ktype, kset, kref, sysfs)
18. Debugging (printk(), syslogd, klogd, oops, ksymoops, kallsyms, SysRq, gdb, kgdb, kdb, UID)
19. Portability (char, big-endian, little-endian)
20. Patches, Hacking, and the Community (typedef, ifdef)
Appendix A. Linked Lists
Appendix B. Kernel Random Number Generator
Appendix C. Algorithmic Complexity
Bibliography and Reading List
Index

(Doris) (Helen)

, Linux , , Linux. , , Linux, , . : . , , . , , -. , Linux , , . , Linux, , , . : , , " , " .. (Linus Torvalds). , , , , . ( . , , .) , . .

(Robert Love) , , . : , , , , .. , , , , . , : . , . . , . . ! , , . , , Linux, , , . (Andrew Morton) Open Source Development Labs


-

Linux , , , . . , , - , . ? , , . , . . . . , . , . , Linux. , . , . , . , , , ( ), . , , . , . , . . : , . . , , Linux 1 2.7. 2.6. , , , , 2.6, . , , " " . , . , . , , , .

1 Linux (Linux Kernel Development Summit), 2004 . , .


... , Unix-. , . , , , , . , . Linux , . . . ! ! ! , .

Linux 2.6 2.6.10. " ", . , , .

, Linux. . (API) (, API Linux ). , . , . , , , , . , , , . , , Linux, . , , . , . 7, " ", (bottom half).


( ), , bottom half ( ). , . , , , , , . , . , . , (API). , . , , . , . , , . , , , , , . , , , , . , Linux. . , , . , , . , , . ; .

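The author maintains a website for the book at http://tech9.net/rml/kernel_book/ with information pertaining to the book, including errata, expanded and revised topics, and information on future printings and editions.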


, , ( , ), , , . , , . , (Scott Meyers) , . (Georg Nedeff), , . (Margo Catts). , , . (Adam Belay), (Martin Pool) (Chris Rivera). . , , , . (Zak Brown), . , , , . (Andrea Arcangely), (Alan ), - (Greg Kroah-Hartman), (Daniel Phillips), (David Miller), (Patrick Mochel), (Andrew Morton), (Zwane Mwaikambo), (Nick Piggin) (Linus Torvalds). ( ). . (Paul Amichi), (Keith Barbag), (Dave Eggers), (Richard Erickson), {Nat Friedman), (Dostin Hall), (Joyce Hawkins), (Miguel de Icaza), (Jimmy Krehl), (Doris Love), (Jonathan Love), (Patrick LeClair), (Linda Love), ' (Randy O'Dowd), (Salvatore Ribaudo) , (Chris Rivera), (Joey Shaw), (Jeremy VanDoren) , (Steve Weisberg) (Helen Whinsnant). , . ! , . , , .


.

(Robert Love) Linux . GNOME. Ximian Desktop Novell. Vista Software. , , , () (preemptive kernel), , (VM), . schedutils GNOME. Linux Journal. . , , . , .


, , . , , . , . . Web- . , , , , . , , . . : E-mail: WWW: : : [email protected] http://www.williamspublishing.com 115419, , / 783 03150, , / 152

:

Sams Publishing - www. nowellpress.com. ISBN ( ) .


1 Linux

() Unix . Unix 1969 , (Dennis Ritchie) (Ken Thompson) , , . Unix Multics , Bell Laboratories. Multics, Bell Laboratories Computer Sciences Research Center . 1969 Bell Labs , Unix. PDP-7. 1971 Unix PDP-11, 1973 , , . Unix, Bell Labs, Unix System 6, V6. Unix . , , , , 1977 Bell Labs Unix System III, 1982 AT&T System V1. Unix, , , , . (University of California at Berkeley).1

System IV? , o .

Unix Berkeley Software Distributions (BSD). Unix, 1981 , 3BSD. 4BSD: 4.0BSD, 4.1BSD, 4.2BSD 4.3BSD. Unix , (demand paging) TCP/IP. Unix 4.4BSD, 1993 , . BSD Darwin, Dragonfly BSD, FreeBSD, NetBSD OpenBSD. 1980-1990- , , Unix. AT&T , . Tru64 Digital, HP-UX Hewlett Packard, AIX IBM, DYNIX/ptx Sequent, IRIX SGI, Solaris Sun. Unix , , Unix , . Unix . -, Unix : , Unix- . -, Unix 2. , : open (), read (), w r i t e ( ) , i o c t l () c l o s e (). -, Unix - Unix . Unix f o r k ( ) . , Unix , , , , , , . Unix , , , , , , , TCP/IP. Unix , Unix . Unix , ( ) 2

, , , . , Plan9 ( Unix), .


Unix, . Unix . , , Unix .

: Linux Linux (Linus Torvalds) 1991 , Intel 80386. Unix- . DOS, Microsoft, , " ", . Minix, Unix- , . ( Minix), , Minix. , . , Unix- . , . , Unix-. 1991 . , Linux . Linux , , , . , Linux , . Linux , AMD 86-64, ARM, Compaq Alpha, CRIS, DEC VAX, H8/300, Hitachi SuperH, HP PA-RISC, IBM S/390, Incel IA-64, MIPS, Motorola 68000, PowerPC, SPARC, UltraSPARC v850. , , - . Linux . , Linux {Monta Vista Red Hal), (IBM, Novell) , . Linux Unix, Linux Unix. Linux Unix, Linux API Unix ( POSIX Single Unix Specification), Linux Unix, Unix-, , , , ,


Unix . Linux , ; , , . , Linux , . Linux. Linux, , 3.

, Linux GNU General Public License (GPL) 2.0. . , , 4. Linux . , , , , , (login) (shell). Linux X Windows, - (desktop environment), , , GNOME. Linux . Linux, , Linux. , , , Linux . , Linux .

- , . , , , . , , , . , (boot loader), , . , . , . , . , 3

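3 The free versus open source debate is left to the reader; see http://www.fsf.org and http://www.opensource.org.

4 If you have not read the GNU GPL, you probably should. A copy is in the file COPYING in the kernel source tree, and it is also available online at http://www.fsf.org.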


. , , . (core) . , , , , , , , . . . , , , , ( , kernel-space). , ( , , user-space). , , . () , , . , , (system call) (. 1.1). , , , , , . , , . p r i n t f (). w r i t e () . . , open () , open () . , , , s t r c p y (), , . , , . , , , . , , . . , , Linux, (interrupt). - , , 5.5

, , . - *. .

Figure 1.1. Relationship between applications (1, 2, 3), the kernel, and hardware.

. (interrupt handler), . , , , , . , . , , , . : , . . (interrup context), . , . . , , Linux . .


, . .

Linux Unix API, Unix . Unix . , , . Unix (memory management unit); . Unix. , , : . ( , , , , .) , 1980- . , , . . , , , . , . . . , . , , . . , , . , , , . . (Inter Process Comrrmnication, IPC) , "" IPC. . , . IPC , , , , .


, , , , , . Windows NT, Mach ( Mac OS X) - . Windows NT, Mac OS X , . Linux , .. , . Linux : , ( ). Linux , : , . , Linux , , . .

Linux, , Linux , Unix (, , API Unix). Linux - Unix, ! Linux, Unix. Linux . Linux , . Linux (SMP). Unix SMP, Unix . Linux . Unix, Linux , . Unix Solaris IRIX. Linux (threads): . , . Linux Unix, , , , STREAMS, "" . Linux . , Linux, Linux. - ,


. , Linux "" : , . , Unix, , . , Linux Unix.

Linux Linux -: (stable) (development). - , . . , , . , . Linux (. 1.2.). , , . - (major) , - (minor), - (, revision). , ; , , , . , , 2.6.0 . 2, 6 0. " ", 2.6. 6 ( )
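The naming scheme described here (major, minor, revision, with the minor number selecting the stable or development series) can be sketched in user space. This is only an illustration, assuming the convention that an even minor number marks a stable series and an odd one a development series; `struct kernel_version`, `parse_kernel_version()`, and `is_stable_series()` are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for a parsed "major.minor.revision" triple. */
struct kernel_version {
    int major;
    int minor;
    int revision;
};

/* Parse a string such as "2.6.0"; returns 0 on success, -1 on bad input. */
int parse_kernel_version(const char *s, struct kernel_version *v)
{
    assert(s != NULL && v != NULL);
    if (sscanf(s, "%d.%d.%d", &v->major, &v->minor, &v->revision) != 3)
        return -1;
    return 0;
}

/* Assumed convention: even minor number = stable, odd = development. */
int is_stable_series(const struct kernel_version *v)
{
    return (v->minor % 2) == 0;
}
```

With this sketch, "2.6.0" parses to major 2, minor 6, revision 0 and is classified as stable, while a 2.5 release is classified as development.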

Figure 1.2. Kernel version naming convention, using 2.6.0 as the example.

. , . , . . . , . . ( ) , , . , 2.5 2.6.


- . . , . 2004 Linux 2.6 Linux 2.7. , 2.6 ; , , , . , , , , , . , , 2.6 . , .

2.6.

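When you begin developing code for the Linux kernel, you become a part of the global kernel development community. The main forum for this community is the Linux kernel mailing list (linux-kernel mailing list). Subscription information is available at http://vger.kernel.org. This is a high-traffic list (upward of 300 messages a day), and its readers, who include all the core kernel developers, are not open to dealing with nonsense. The list is, however, a priceless aid during development: it is where you can find testers, receive peer review, and ask questions.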

Linux: , . , . , . , . , Linux . , Linux, "" , , . , Linux, . , . !!


2 Linux

, Linux: , . , Linux, , , . , , , . . .

tar (tarball), http://www.kernel.org. , . kernel.org , , .

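The kernel tarball is distributed in both GNU zip (gzip) and bzip2 formats. bzip2 is the preferred format because it generally compresses better than gzip. The bzip2 tarball is named linux-x.y.z.tar.bz2, where x.y.z is the version of that release of the kernel source. If the tarball is compressed with GNU zip, unpack it with:

    $ tar xvzf linux-x.y.z.tar.gz

If it is compressed with bzip2, unpack it with:

    $ tar xvjf linux-x.y.z.tar.bz2

Either command unpacks the source into the directory linux-x.y.z. The kernel source is typically installed in /usr/src/linux, but that tree should not be used for development. Nor should you need to be root to change the kernel: work in your home directory and use root only to install a new kernel. Even then, leave /usr/src/linux untouched.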

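Throughout the Linux kernel community, patches are the common language of communication. You distribute your code changes as patches and receive code from others as patches. Incremental patches provide an easy way to move from one kernel tree to the next: instead of downloading each large tarball, you apply an incremental patch to go from one version to the next. This saves bandwidth and time. To apply an incremental patch, run the following from inside the kernel source tree:

    $ patch -p1 < ../patch-x.y.z

Generally, a patch to a given version of the kernel is applied against the previous version.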

, . , , . 2.1. , , . COPYING (GNU GPL v2). CREDITS , . MAINTAINERS , . , Makefile .

. , , , , g l i b c . 2.6 , 2.4.34 2

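Table 2.1. Directories in the root of the kernel source tree

    Directory       Description
    arch            Architecture-specific source
    crypto          Crypto API
    Documentation   Kernel source documentation
    drivers         Device drivers
    fs              The VFS and the individual filesystems
    include         Kernel headers
    init            Kernel boot and initialization
    ipc             Interprocess communication code
    kernel          Core subsystems, such as the scheduler
    lib             Helper routines
    mm              Memory management subsystem and the VM
    net             Networking subsystem
    scripts         Scripts used to build the kernel
    security        Linux Security Module
    sound           Sound subsystem
    usr             Early user-space code (called initramfs)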

Linux, , , . . , . , , , . CONFIG_FEATURE. , (Symmetric multiprocessing, SMP) CONFIG SMP. , SMP . , SMP . . c o n f i g , , make xconfig. , , . : (boolean) (instate). yes . , CONFIG_PREEMPT, . yes, no module. module , , (.. , ). .


, , . , , . , . , Linux , Novell Redhat, . . . , , , , . , . :

make config

, yes, no module { ). , , ncurses:make menuconfig

X11:make xconfig

, gtk+make gconfig

, Processor Features ( ) Network Devices ( ). , , . $ make defconfig

, , . ( , i386 ), , . , , , . .config. , , . . 36 2

, : make oldconfig , . , : make , 2.6 make dep , . , bzlmage, . , Makefile, , !

, , , , , make (1): make > "__" , . , , . make > /dev/null, .

make (1) . , . , - (, -). make (1) , . " ", . , . . $ make -jn n , .


. , . $ make -j4 , d i s t c c (1) c c a c h e ( l ) , .

, . . , , . , , ! , x86, grub a r c h / i 3 8 6 / b o o t / b z l m a g e /boot / e t c / g r u b / g r u b . c o n f , . , LILO, / e t c / l i l o . c o n f l i l o (8). . root. $ make modules_install System.map. . .

" " , , . . , . ( , , ), . . . GNU . , . . . 38 2

, SMP, . . , .

l i b c , ( ). , , . , , . , . , l i b / s t r i n g . . < l i n u x / s t r i n g . h > . , , , . , ,

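The header files do not include the familiar printf(). The kernel has no access to printf(), but it does provide printk(). printk() copies the formatted string into the kernel log buffer, which is normally read by the syslog program. Usage is similar to printf():

    printk("Hello world! A string: %s and an integer: %d\n", a_string, an_integer);

One notable difference between printf() and printk() is that printk() allows you to specify a priority flag. syslog uses this flag to decide where to display kernel messages. Here is an example with a priority:

    printk(KERN_ERR "this is an error!\n");

printk() is used throughout this book. Later chapters have more information on printk().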

GNU " " Unix, Linux . , , Linux ANSI . , , -


, gcc (GNU Compiler Collection GNU, , ). ISO C991 GNU . Linux gcc, , Imel , gcc , Linux. - 99, , 99 , . , , ANSI GNU . , .

Ill

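GNU C supports inline functions. An inline function is, as its name suggests, inserted inline into each call site. This eliminates the overhead of function invocation and return (register saving and restoring) and allows for potentially greater optimization, because the compiler can optimize the caller and the called function together. The downside is that code size increases, because the body of the function is copied into every caller, which also costs memory and instruction cache. Kernel developers use inline functions for small, time-critical functions. Making a large function inline, especially one that is called more than once or is not time critical, is frowned upon. An inline function is declared when the keywords static and inline are used as part of the function definition. For example:

    static inline void dog(unsigned long tail_size);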

, . . ( s t a t i c ) , . , . .

1

ISO C99 ISO . 99 . ISO C99 complex.


gcc . , , , . asm(). Limix . , . .

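The gcc compiler has a built-in directive that optimizes conditional branches as either very likely taken or very unlikely taken. The compiler uses the directive to optimize the branch appropriately. The kernel wraps the directive in two easy-to-use macros, likely() and unlikely(). For example, consider an if statement such as:

    if (foo) { /* ... */ }

To mark this branch as very unlikely to be taken:

    /* we predict that foo is nearly always zero ... */
    if (unlikely(foo)) { /* ... */ }

Conversely, to mark a branch as very likely to be taken:

    /* we predict that foo is nearly always nonzero ... */
    if (likely(foo)) { /* ... */ }

Use these directives only when the branch direction is overwhelmingly known in advance, or when you want to optimize one case at the cost of the other. This is an important point: the directives improve performance when the branch is predicted correctly and hurt it when the branch is mispredicted. A very common use of unlikely() and likely() is for error conditions.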
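The two macros can be tried outside the kernel. The sketch below defines user-space stand-ins on top of the GCC/Clang builtin `__builtin_expect`; the macro bodies are modeled on the kernel's but are written here as an assumption, and `sum_buffer()` is a hypothetical function that treats a NULL buffer as the rare error path.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins for the kernel macros (requires GCC or Clang). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: sum an array; a NULL buffer is the rare error case. */
long sum_buffer(const int *buf, size_t len)
{
    long total = 0;
    size_t i;

    if (unlikely(buf == NULL))
        return -1;              /* error path, predicted not taken */

    for (i = 0; i < len; i++)
        total += buf[i];
    return total;
}
```

The hint changes only how the compiler lays out the code; the function returns the same values with or without the macros.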

, . , . oops, Linux 41

. , , NULL , ! , . , , . , .

, . , , . , , . , . , : ; .

"" , . , (, , , , DOS, , ). , , , , . . 86 4 8 . , , 8 32- 16 64-. . . .

(race condition). , , . , .


Linux . , , , . . , , , . Linux . , , , . ( ) - . .

, Linux . , - , , . , , 64- , . .

, : , , , . Linux . , ; , , . . , , , , . , , , . , , , , , , . .


3

- Unix- 1. , , .. , - . , Unix text section ( ). (data section), ; , ; . . , (thread), , . (program counter), . , . Unix- . . , Linux . Linux . : . , , , , , . 4, " ", . , . 11, " ". , .

.

, ; . , , . , , , . , . Linux fork() (, ), . , fork (), (, pannt), (, child).

, . - . exec*() . Linux fork() clone(), . e x i t ( ) . . wait4() 2 , . , (zombie), , wait() waitpid(). (task). Linux . , , .

task structure , task list3 ( ). s t r u c t task_struct, i n c l u d e / l i n u x / s c h e d . h . .

2

w a i t 4 ( ) . Linux w a i t ( ) , w a i t p i d ( ) , w a i t 3 ( ) w a i t 4 ( ) . , .3

t a s k a r r a y ( ). Linux , , task l i s t .i


t a s k _ s t r u c t 1,7 32- . , , , . , , , , , (. 3.1).

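The task_struct structure is allocated via the slab allocator to provide object reuse and cache coloring (see Chapter 11, "Memory Management"). Prior to the 2.6 kernel series, task_struct was stored at the end of the kernel stack of each process. This allowed architectures with few registers, such as x86, to calculate the location of the process descriptor via the stack pointer, without using an extra register to store the location. With the process descriptor now created dynamically via the slab allocator, a new structure, thread_info, lives at the bottom of the stack (for stacks that grow down) or at the top of the stack (for stacks that grow up) 4 (see Figure 3.2).

Figure 3.1. The process descriptor and the task list. Each struct task_struct in the list carries fields such as:

    unsigned long state;
    int prio;
    unsigned long policy;
    struct task_struct *parent;
    struct list_head tasks;
    pid_t pid;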

thread_info , , , .

4


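Figure 3.2. The process descriptor and the kernel stack: current_thread_info() locates struct thread_info at the end of the process kernel stack, and thread_info points to the process's struct task_struct.

The thread_info structure is defined on x86 as follows:

    struct thread_info {
        struct task_struct   *task;
        struct exec_domain   *exec_domain;
        unsigned long        flags;
        unsigned long        status;
        __u32                cpu;
        __s32                preempt_count;
        mm_segment_t         addr_limit;
        struct restart_block restart_block;
        unsigned long        previous_esp;
        __u8                 supervisor_stack[0];
    };

Each task's thread_info structure is allocated at the end of its stack. The task element of thread_info is a pointer to the task's actual task_struct.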

, (process identification, PID). PID

, pid_t5 , int.5

(opaque type) , .


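For compatibility with older Unix and Linux versions, the default maximum PID value is 32768 (the range of a short int). The kernel stores the value as pid inside each process descriptor. The maximum is important because it is essentially the maximum number of processes that may exist concurrently on the system. Although 32768 might be sufficient for a desktop system, large servers may require many more processes. Moreover, the lower the maximum, the sooner the values wrap around, destroying the useful notion that higher values indicate processes that started later. If the system is willing to break compatibility with old applications, the administrator may increase the maximum via /proc/sys/kernel/pid_max.

Inside the kernel, tasks are typically referenced directly by a pointer to their task_struct. In fact, most kernel code that deals with processes works directly with task_struct. Consequently, it is very useful to be able to look up quickly the process descriptor of the currently executing task, which is done via the current macro. This macro must be implemented separately by each architecture. Some architectures save a pointer to the task_struct of the currently running process in a register, allowing efficient access. Other architectures, such as x86 (which has few registers to waste), use the fact that thread_info is stored on the kernel stack to calculate the location of thread_info and subsequently task_struct.

On x86, current is calculated by masking out the 13 least significant bits of the stack pointer to obtain the thread_info structure. This is done by the current_thread_info() function. The assembly is shown here:

    movl $-8192, %eax
    andl %esp, %eax

Finally, current dereferences the task member of thread_info to return the task_struct:

    current_thread_info()->task;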
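The masking step in the assembly above can be reproduced with plain integer arithmetic. The sketch below assumes an 8 KB, 8 KB-aligned stack (the 8192 in the listing); `stack_base_of()` is a hypothetical user-space helper that operates on an address value, not on a real kernel stack.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u   /* assumed stack size: 8 KB, so 13 low bits are masked */

/*
 * Clear the low 13 bits of an address inside the stack to obtain the
 * address where the stack (and, on x86, thread_info) begins.
 * The mask is the same bit pattern as the -8192 in the assembly.
 */
uintptr_t stack_base_of(uintptr_t sp)
{
    return sp & ~((uintptr_t)THREAD_SIZE - 1);
}
```

For example, any address from 0x2000 through 0x3FFF maps back to the base 0x2000.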

PowerPC ( RISC- IBM), c u r r e n t r2. , , 8, . , .


s t a t e (. 3-3). .

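Figure 3.3. Flow chart of process states. fork() creates a task in TASK_RUNNING (ready but not running); the scheduler dispatches it through schedule() and context_switch(), making it TASK_RUNNING (running); a task preempted by a higher priority task returns to the ready state; a task that waits for an event sleeps in TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE (waiting) until the event occurs; a task that exits via do_exit() becomes TASK_ZOMBIE (terminated).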

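Each process on the system is in exactly one of five states, represented by one of five flags:

TASK_RUNNING. The process is runnable; it is either currently running or on a runqueue waiting to run (runqueues are discussed in Chapter 4, "Process Scheduling").

TASK_INTERRUPTIBLE. The process is sleeping (that is, it is blocked), waiting for some condition to exist. When the condition exists, the kernel sets the process's state to TASK_RUNNING. The process also awakes prematurely and becomes runnable if it receives a signal.

TASK_UNINTERRUPTIBLE. This state is identical to TASK_INTERRUPTIBLE except that the process does not wake up and become runnable if it receives a signal. It is used where the process must wait without interruption or when the event is expected to occur quite quickly. Because the task does not respond to signals in this state, TASK_UNINTERRUPTIBLE is used less often than TASK_INTERRUPTIBLE 6.

TASK_ZOMBIE. The task has terminated, but its parent has not yet issued a wait4() system call. The task's process descriptor must remain in case the parent wants to access it. If the parent calls wait4(), the process descriptor is deallocated.

TASK_STOPPED. Process execution has stopped; the task is not running, nor is it eligible to run. This occurs if the task receives the SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal, or if it receives any signal while it is being debugged.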

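Kernel code often needs to change a process's state. The preferred mechanism is:

    set_task_state(task, state); /* set task 'task' to state 'state' */

This function sets the given task to the given state. If applicable, it also provides a memory barrier to force ordering on other processors (this is needed only on SMP systems). Otherwise, it is equivalent to:

    task->state = state;

The call set_current_state(state) is synonymous with set_task_state(current, state).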

. (executable) . . (. 5, " ") , . - "" , ps (1) , D, , SIGKILL. , , , , - .6


, " " . current 1. , . , . . .

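A distinct hierarchy exists between processes in Linux. All processes are descendants of the init process, whose PID is 1. The kernel starts init in the last step of the boot process. The init process, in turn, reads the system initscripts and executes more programs, eventually completing the boot process. Every process on the system has exactly one parent. Likewise, every process has zero or more children. Processes that are all direct children of the same parent are called siblings. The relationship between processes is stored in the process descriptor. Each task_struct has a pointer to the parent's task_struct, named parent, and a list of children, named children. Consequently, given the current process, the process descriptor of its parent can be obtained with:

    struct task_struct *task = current->parent;

Similarly, it is possible to iterate over a process's children with:

    struct task_struct *task;
    struct list_head *list;

    list_for_each(list, &current->children) {
        task = list_entry(list, struct task_struct, sibling);
        /* task now points to one of current's children */
    }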
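The list_for_each() and list_entry() idiom used in this loop can be reproduced in user space. The following is a minimal sketch modeled on the kernel's list interface rather than copied from it: `struct mini_task`, `list_init()`, and `sum_child_pids()` are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its embedded list node. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Visit every node of a circular list, stopping when we return to the head. */
#define list_for_each(pos, head) \
    for ((pos) = (head)->next; (pos) != (head); (pos) = (pos)->next)

/* Hypothetical miniature task: only the fields needed for the walk. */
struct mini_task {
    int pid;
    struct list_head children;  /* head of this task's list of children */
    struct list_head sibling;   /* link in the parent's list of children */
};

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Sum the PIDs of all children, walking the list as the loop above does. */
int sum_child_pids(struct mini_task *parent)
{
    struct list_head *pos;
    int sum = 0;

    list_for_each(pos, &parent->children) {
        struct mini_task *child = list_entry(pos, struct mini_task, sibling);
        sum += child->pid;
    }
    return sum;
}
```

The point of the design is that the list node is embedded in the structure, so one generic list implementation serves any structure type.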

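The init task's process descriptor is statically allocated as init_task. A good example of the relationship between all processes is the fact that the following code always succeeds.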

, 6, " ". , . , .


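    struct task_struct *task;

    for (task = current; task != &init_task; task = task->parent)
        ;
    /* task now points to init */

In fact, you can follow the process hierarchy from any one process in the system to any other. Oftentimes, however, it is desirable simply to iterate over all processes in the system. This is easy because the task list is a circular doubly linked list. To obtain the next task in the list, given any valid task, use:

    list_entry(task->tasks.next, struct task_struct, tasks)

Obtaining the previous task works the same way:

    list_entry(task->tasks.prev, struct task_struct, tasks)

These two routines are provided by the macros next_task(task) and prev_task(task), respectively. Finally, the macro for_each_process(task) iterates over the entire task list. On each iteration, task points to the next task in the list:

    struct task_struct *task;

    for_each_process(task) {
        /* this pointlessly prints the name and PID of each task */
        printk("%s[%d]\n", task->comm, task->pid);
    }

Note that it can be expensive to iterate over every task in a system with many processes; code should have a good reason (and no alternative) before doing so.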

Unix . (spawn). , , . Unix , : fork () exec ( ) 8 .

8 e x e c ( ) e x e c * ( ) . e x e c v e ( ) , e x e c l p ( ) , execle(), execv() execvp().


f o r k ( ) , . PID ( ), PPID ( PID , PID ), , ( ), - exec () . f o r k () e x e c () , .

f o r k ( ) . . Linux fork () (copy-on-write) . (copy-on-write, COW) . . , , , . , , , (read-only). , . , , , , exec () fork (), . , f o r k ( ) , . ( 10 ), . , Unix .
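The visible effect of copy-on-write can be observed from user space: after fork(), a write by the child goes to the child's own copy of the page, so the parent's data is unchanged. The sketch below uses only the POSIX calls fork(), waitpid(), and _exit(); `cow_demo()` is a hypothetical function name, and the example shows the semantics that copy-on-write preserves, not the kernel's page-level mechanism.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that overwrites `value`, then return the parent's view of it.
 * Returns -1 if fork() or waitpid() fails or the child misbehaves.
 */
int cow_demo(void)
{
    int value = 1;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        value = 99;                     /* write hits the child's private copy */
        _exit(value == 99 ? 0 : 1);
    }

    /* Parent: reap the child so that it does not linger as a zombie. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return value;                       /* the parent's copy was never modified */
}
```

If the child had called one of the exec() functions immediately instead of writing, no page would have needed copying at all, which is the case the optimization targets.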

f o r k () Linux f o r k () c l o n e () . , , ( ) . " Linux" . f o r k ( ) , v f o r k ( ) c l o n e d c l o n e () . c l o n e () do_fork ( ) .


do_f ork (), kernel/fork.. , , copy_pracess () . , copy_process (). dup_task_struct (), , thread_info task_struct , . . , . . . . TASK_ UNINTERRUPTIBLE, , . copy_process () copy_f lags (), flags task s t r u c t . PF_SUPERPRIV, , . PF_FORKNOEXEC, , exec (), . get_pid () , PID . , clone (), , , , (namespace). . . ( 4, " "). , . do_fork () . copy_process () , . 9.

9 , , , .


, exec () , , , , .

v f o r k ()

vfork () , fork (), , . , exec () . . 3BSD, fork () . , , vfork () . - Linux 10, . vfork () (, , , exec () ?), , vfork () . vfork () fork (), Linux 2.2.

vfork () clone (), .

copy_process () vfork_done t a s k _ s t r u c t NULL.

do_fvork (), , vfork_done ( ).

, , copy_process () , , vfork_done.

mm_release () ( , ), vfork_done NULL, .

do_fork() .10

Linux. , , 2.6 , .


, , , . , .

Linux . . . (concurrent programming), .

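Linux has a unique implementation of threads. To the Linux kernel, there is no concept of a thread: Linux implements all threads as standard processes. The kernel provides no special scheduling semantics or data structures to represent threads. Instead, a thread is merely a process that shares certain resources with other processes. Each thread has its own task_struct and appears to the kernel as a normal process (one that happens to share resources, such as an address space, with other processes).

This approach differs greatly from operating systems such as Microsoft Windows or Sun Solaris, which have explicit kernel support for threads (sometimes called lightweight processes). The name "lightweight process" sums up the difference in philosophy. To those systems, threads are an abstraction that provides a lighter, quicker unit of execution than the heavy process. To Linux, threads are simply a manner of sharing resources between processes (which are already quite lightweight) 11. For example, assume a process that consists of four threads. On systems with explicit thread support, there might be one process descriptor that in turn points to the four different threads. In Linux, there are simply four processes, and thus four normal task_struct structures, set up to share certain resources.

Threads are created like normal tasks, except that the clone() system call is passed flags corresponding to the specific resources to be shared:

    clone(CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND, 0);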


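The result behaves like a normal fork(), except that the address space, filesystem resources, file descriptors, and signal handlers are shared. In other words, the new task and its parent are what are popularly called threads. In contrast, a normal fork() can be implemented as:

    clone(SIGCHLD, 0);

And vfork() is implemented as:

    clone(CLONE_VFORK | CLONE_VM | SIGCHLD, 0);

The flags provided to clone() help specify the behavior of the new process and detail which resources the parent and child share. Table 3.1 lists the clone flags and their effects.

Table 3.1. clone() flags

    Flag                  Meaning
    CLONE_FILES           Parent and child share open files.
    CLONE_FS              Parent and child share filesystem information.
    CLONE_IDLETASK        Set PID to zero (used only by the idle tasks).
    CLONE_NEWNS           Create a new namespace for the child.
    CLONE_PARENT          Child is to have the same parent as its parent.
    CLONE_PTRACE          Continue tracing the child.
    CLONE_SETTID          Write the TID back to user space.
    CLONE_SETTLS          Create a new thread local storage (TLS) area for the child.
    CLONE_SIGHAND         Parent and child share signal handlers.
    CLONE_SYSVSEM         Parent and child share System V SEM_UNDO semantics.
    CLONE_THREAD          Parent and child are in the same thread group.
    CLONE_VFORK           vfork() was used: the parent sleeps until the child wakes it.
    CLONE_UNTRACED        Do not let the tracing process force CLONE_PTRACE on the child.
    CLONE_STOP            Start the process in the TASK_STOPPED state.
    CLONE_CHILD_CLEARTID  Clear the TID in the child.
    CLONE_CHILD_SETTID    Set the TID in the child.
    CLONE_PARENT_SETTID   Set the TID in the parent.
    CLONE_VM              Parent and child share the address space.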


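It is often useful for the kernel to perform some operations in the background. The kernel accomplishes this via kernel threads: standard processes that exist solely in kernel space. The significant difference between kernel threads and normal processes is that kernel threads do not have an address space (in fact, their mm pointer is NULL). They operate only in kernel space and do not context switch into user space. Kernel threads are, however, schedulable and preemptable, like normal processes. Linux delegates several tasks to kernel threads, most notably pdflush and ksoftirqd. These threads are created at system boot by other kernel threads. Indeed, a kernel thread can be created only by another kernel thread. The interface for spawning a new kernel thread from an existing one is:

    int kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)

The new task is created via the usual clone() system call with the specified flags argument. On return, the parent kernel thread exits with a pointer to the child's task_struct. The child executes the function specified by fn with the given argument arg. A special clone flag, CLONE_KERNEL, specifies the usual flags for kernel threads: CLONE_FS, CLONE_FILES, and CLONE_SIGHAND. Most kernel threads pass this value as their flags parameter. Typically, a kernel thread continues executing its initial function forever (or at least until the system reboots, but with Linux you never know). The initial function usually implements a loop in which the kernel thread wakes up as needed, performs its duties, and then returns to sleep. Specific kernel threads are discussed in more detail in later chapters.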

, . , , , , , , , , "". , e x i t () ( e x i t () main ()). . , , -


. , , d o e x e c (), . PF_EXITING flags task s t r u c t . del_timer_sync (), . , . , (BSD process accounting), acct_process () , . __exit_mm() mm_struct, . ( , ), . exit_sem (). IPC, . __exit_files (), __exit_fs () , exit_namespace () e x i t _ s i g n a l s () , , , . , , . , e x i t c o d e t a s k s t r u c t . e x i t () , - . e x i t n o t i f (), (reparent) , - , i n i t . TASK_ZOMBIE. schedule () (. 4, " "). TASK_ZOMBIE , , . do_exit () k e r n e l / e x i t . . , ( ). (, , ), , TASK_ZOMBIE. , , , thread_inf task_struct.


, .

do_exit () , TASK_ZOMBIE . , . , . , task_struct . wait () ( ) wait4 (). , . PID . , , . , release_task (), . free_uid () . Linux , . , , . unhash_process () - pidhash . (ptrace), , (pirate) . p u t _ t a s k _ s t r u c t () , thread_inf, a , task_struct. , , , .

"" , , - , , , , . : 61

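The solution is to reparent a task's children on exit, either to another process in the current thread group or, if that fails, to the init process. In do_exit(), notify_parent() is invoked, which calls forget_original_parent() to perform the reparenting:

    struct task_struct *p, *reaper = father;
    struct list_head *list;

    if (father->exit_signal != -1)
        reaper = prev_thread(reaper);
    else
        reaper = child_reaper;

    if (reaper == father)
        reaper = child_reaper;

This code sets reaper to another task in the process's thread group. If there is no other task in the thread group, it sets reaper to child_reaper, which is the init process. Now that a suitable new parent for the children has been found, each child needs to be located and reparented to reaper:

    list_for_each(list, &father->children) {
        p = list_entry(list, struct task_struct, sibling);
        reparent_thread(p, reaper, child_reaper);
    }

    list_for_each(list, &father->ptrace_children) {
        p = list_entry(list, struct task_struct, ptrace_list);
        reparent_thread(p, reaper, child_reaper);
    }

This code iterates over two lists: the child list and the ptraced child list, reparenting each child. The rationale behind having both lists is interesting; it is a new feature in the 2.6 kernel. When a task is ptraced, it is temporarily reparented to the debugging process. When the task's real parent exits, however, the task must be reparented along with its siblings. In previous kernels, this resulted in a loop over every process in the system looking for children. The solution is simply to keep a separate list of a process's children that are being ptraced, reducing the search for one's children from every process to just two relatively small lists.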

I n i t wait () , , -, .

. , , . , , Linux , ( t a s k _ s t r u c t t h r e a d _ i n f ) , ( clone () fork ()), ( exec ()), , ( wait ()) ( e x i t ()). , , , , ( ). , , , - .


, . , . (scheduler) , , . , (, , ) , , . (multitasking) , Linux. , , , . , , . , , , - . , , . (runnable). , , , , . , . , . . , . , , . , , ( , , ..). , Linux 100 , .

(multitasking) : (cooperative) (preemptive,

) . Linux, Unix , . , , , . , , (preemption) . , , , . (timesiice) . , . . , , . , Linux , . , , . , , (yielding). : , ; , ; "" , , . , , , . Mac OS 9 . , Unix . Linux 2.5, . 0(1)- (0(1) scheduler) 1 . Linux , . , (1)-, 0(1)-, , , .

1

(1) " ". , , , . , " ", , " ".


(policy) , , . . , .

, - , - (I/0-bound), , (processor-bound). , - . , , , - ( -, - , , , ). , , , . , , -. -, , . , , , , . , . , , . . : X Windows , -. -, . , , , . : ( , low latency) (throughput). , , , , . Unix- , , -, . , -, -


, -. Linux ( ), .. , -, . , , , .

(priority-based). , . , , ( , round-robin), .. . , Linux, . , , , . , . Linux (dynamic priority-based), . , , . , , , -, -. Linux . , , , , . . Linux . nice, -20 19, 0. nice (ic . , ). nice ( ) niie ( ). nice , . nice -20 , nice 19 . nice Unix .


(real-time priority), . 0 99. . Linux POSIX. Unix- .

(timeslice2) , , , . , , . , . , . , , - . , -, , , , , , . , . , , , 20 . Linux , . Linux , . Linux ( 4.1). , Linux . , , . . , . , , 100 , 100 , . 20 .2 timeslice ( ) quantum () processor slice. Linux timeslice.


Figure 4.1. Process timeslice calculation: minimum 5 ms, default 100 ms, maximum 800 ms.

, , , , . , , . , , , . , , . . Linux , . .

, Linux . TASK_RUNNING, . , , , , ( , ). , , .

: . -, ( , , ). , , . . , , , 100%. -


: , . , , . , , . . , , . , . , , . .

Linux. , , , Linux. Linux k e r n e l / s c h e d . c . , 2.5. , . , . (1) -. , , , . SMP-. . SMP- (SMP affinity). , , , , , . . . . (fairness). . , .


, 1-2, , .

.

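The basic data structure in the scheduler is the runqueue. The runqueue is defined in kernel/sched.c 3 as struct runqueue. The runqueue is the list of runnable processes on a given processor; there is one runqueue per processor. Each runnable process is on exactly one runqueue. The runqueue additionally contains per-processor scheduling information. Consequently, the runqueue is the primary scheduling data structure for each processor. Here is the structure; the comments describe each field:

    struct runqueue {
        spinlock_t lock;                          /* spin lock that protects this runqueue */
        unsigned long nr_running;                 /* number of runnable tasks */
        unsigned long nr_switches;                /* context switch count */
        unsigned long expired_timestamp;          /* time of last array swap */
        unsigned long nr_uninterruptible;         /* uninterruptible tasks */
        unsigned long long timestamp_last_tick;   /* last scheduler tick */
        struct task_struct *curr;                 /* currently running task */
        struct task_struct *idle;                 /* this processor's idle task */
        struct mm_struct *prev_mm;                /* mm_struct of last ran task */
        struct prio_array *active;                /* active priority array */
        struct prio_array *expired;               /* the expired priority array */
        struct prio_array arrays[2];              /* the actual priority arrays */
        struct task_struct *migration_thread;     /* migration thread */
        struct list_head migration_queue;         /* migration queue */
        atomic_t nr_iowait;                       /* number of tasks waiting on I/O */
    };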

3

: kernel/sched., include/linux/sched.h? , .


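Because runqueues are the core data structure in the scheduler, a group of macros is used to obtain the runqueue associated with a given processor or process. The macro cpu_rq(processor) returns a pointer to the runqueue associated with the given processor. The macro this_rq() returns the runqueue of the current processor. Finally, the macro task_rq(task) returns a pointer to the runqueue on which the given task is queued. Before a runqueue can be manipulated, it must be locked (locking is discussed in depth in Chapter 8, "Kernel Synchronization Introduction"). Because each runqueue is unique to its processor, it is rare for a processor to want to lock a different processor's runqueue (it does happen, however, as shown later). Locking the runqueue prohibits any changes to it while the lock holder is reading or writing the runqueue's members. The most common way of locking a runqueue is when you want to lock the runqueue on which a specific task runs. In that case, the task_rq_lock() and task_rq_unlock() functions are used:

    struct runqueue *rq;
    unsigned long flags;

    rq = task_rq_lock(task, &flags);
    /* manipulate the task's runqueue, rq */
    task_rq_unlock(rq, &flags);

Alternatively, this_rq_lock() locks the current runqueue and rq_unlock(struct runqueue *rq) unlocks the given runqueue. To avoid deadlock, code that wants to lock multiple runqueues must always obtain the locks in the same order: by ascending runqueue address (again, Chapter 8, "Kernel Synchronization Introduction", offers a full explanation). For example:

    /* to lock ... */
    if (rq1 < rq2) {
        spin_lock(&rq1->lock);
        spin_lock(&rq2->lock);
    } else {
        spin_lock(&rq2->lock);
        spin_lock(&rq1->lock);
    }

    /* manipulate both runqueues ... */

    /* to unlock ... */
    spin_unlock(&rq1->lock);
    spin_unlock(&rq2->lock);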
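The ordering rule (always take the lock at the lower address first) can be exercised in user space. The sketch below is an illustration under stated assumptions: `struct toy_lock` is a hypothetical spin lock built on C11 `atomic_flag`, not the kernel's spinlock_t, and `toy_double_lock()` additionally handles the case where both arguments name the same lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space spin lock; not the kernel's spinlock_t. */
struct toy_lock {
    atomic_flag held;
};

void toy_lock_init(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

void toy_lock_acquire(struct toy_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* spin until the current holder releases the lock */
}

void toy_lock_release(struct toy_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Take two locks in ascending address order, whatever the argument order. */
void toy_double_lock(struct toy_lock *a, struct toy_lock *b)
{
    if (a == b) {
        toy_lock_acquire(a);
    } else if ((uintptr_t)a < (uintptr_t)b) {
        toy_lock_acquire(a);
        toy_lock_acquire(b);
    } else {
        toy_lock_acquire(b);
        toy_lock_acquire(a);
    }
}

void toy_double_unlock(struct toy_lock *a, struct toy_lock *b)
{
    toy_lock_release(a);
    if (a != b)
        toy_lock_release(b);
}
```

Because every caller derives the order from the addresses rather than from its argument order, two threads calling toy_double_lock(x, y) and toy_double_lock(y, x) contend on the same first lock and cannot each hold one lock while waiting for the other.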


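These steps are automated by the double_rq_lock() and double_rq_unlock() functions. The preceding steps then become:

    double_rq_lock(rq1, rq2);
    /* manipulate both runqueues ... */
    double_rq_unlock(rq1, rq2);

A quick example shows why the order of obtaining the locks is important. Deadlock is covered in Chapter 8, "Kernel Synchronization Introduction", and Chapter 9, "Kernel Synchronization Methods", because the problem is not unique to runqueues: nested locks always need to be obtained in the same order. Spin locks prevent multiple tasks from manipulating the runqueues simultaneously. They work like a key to a door. The first task to reach the door grabs the key, enters, and locks the door behind it. If another task reaches the door and finds it locked (because another task is already inside), it must wait for the first task to leave and return the key. This waiting is called spinning, because the task sits in a tight loop, repeatedly checking for the return of the key. Now consider one task that wants to lock the first runqueue and then the second, while another task wants to lock the second runqueue and then the first. Assume the first task succeeds in locking the first runqueue while, simultaneously, the second task succeeds in locking the second runqueue. The first task now tries to lock the second runqueue, and the second task tries to lock the first. Neither succeeds, because the other task holds the lock. Both tasks wait for each other forever: they are deadlocked. If both tasks obtained the locks in the same order, this scenario could not happen. See Chapters 8 and 9 for the full story on locking.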

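Each runqueue contains two priority arrays: the active array and the expired array. Priority arrays are defined in kernel/sched.c as struct prio_array. Priority arrays are the data structures that provide O(1) scheduling. Each priority array contains one queue of runnable processes per priority level. These queues contain lists of the runnable processes at each priority level. The priority arrays also contain a priority bitmap, used to discover efficiently the highest priority runnable task in the system.

    struct prio_array {
        int nr_active;                      /* number of tasks in the queues */
        unsigned long bitmap[BITMAP_SIZE];  /* priority bitmap */
        struct list_head queue[MAX_PRIO];   /* priority queues */
    };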

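MAX_PRIO is the number of priority levels on the system. By default, this is 140. Thus, there is one struct list_head for each priority. BITMAP_SIZE is the size that an array of unsigned long variables must have to provide one bit for each valid priority level. With 140 priorities and 32-bit words, BITMAP_SIZE is 5. Thus, bitmap is an array with five elements and a total of 160 bits. Each priority array contains a bitmap field that has at least one bit for every priority on the system. Initially, all the bits are 0. When a task of a given priority becomes runnable (that is, its state is set to TASK_RUNNING), the corresponding bit in bitmap is set to 1. For example, if a task with priority 7 is runnable, then bit 7 is set. Finding the highest priority task on the system is therefore only a matter of finding the first set bit in the bitmap. Because the number of priorities is static, the time to complete this search is constant and unaffected by the number of running processes on the system. Furthermore, each architecture supported by Linux implements a fast find first set algorithm to search the bitmap quickly. This method is called sched_find_first_bit(). Many architectures provide a find-first-set instruction that operates on a given word 4. On these systems, finding the first set bit is nearly trivial, taking at most a couple of cycles.

Each priority array also contains an array of struct list_head queues, named queue, with one queue for each priority. Each list corresponds to a given priority and contains all the runnable processes of that priority that are on this processor's runqueue. Finding the next task to run is as simple as selecting the next element in the list. Within a given priority, tasks are scheduled round robin. The priority array also contains a counter, nr_active, which is the number of runnable tasks in this priority array.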
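The bitmap lookup described here can be modeled in a few lines of user-space C. This is a sketch, not the kernel's implementation: `struct prio_bitmap` and the `prio_*` helpers are hypothetical names, the word size is taken from the host rather than fixed at 32 bits, and `__builtin_ctzl` (a GCC/Clang builtin) stands in for the architecture's find-first-set instruction.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_PRIO      140
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
/* Words needed to hold one bit per priority (5 with 32-bit words). */
#define BITMAP_SIZE   ((MAX_PRIO + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct prio_bitmap {
    unsigned long bits[BITMAP_SIZE];
};

/* Mark priority `prio` as having at least one runnable task. */
void prio_set(struct prio_bitmap *bm, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    bm->bits[prio / BITS_PER_WORD] |= 1UL << (prio % BITS_PER_WORD);
}

/* Mark priority `prio` as having no runnable tasks. */
void prio_clear(struct prio_bitmap *bm, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    bm->bits[prio / BITS_PER_WORD] &= ~(1UL << (prio % BITS_PER_WORD));
}

/*
 * Return the lowest set bit, which is the highest priority with runnable
 * tasks, or MAX_PRIO if no bit is set. The loop runs over a fixed number
 * of words, so the cost does not depend on how many tasks exist.
 */
int prio_find_first(const struct prio_bitmap *bm)
{
    size_t w;

    for (w = 0; w < BITMAP_SIZE; w++) {
        if (bm->bits[w])
            return (int)(w * BITS_PER_WORD) + __builtin_ctzl(bm->bits[w]);
    }
    return MAX_PRIO;
}
```

Setting bits 120 and 7 and then asking for the first set bit yields 7; clearing bit 7 makes the answer 120.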

( Linux) , .

4

On the x86 architecture, this instruction is called bsfl; on PowerPC, cntlzw is used for this purpose.


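Typically, this recalculation is implemented as a loop over each task, such as:

    for (each task on the system) {
        recalculate priority
        recalculate timeslice
    }

The priority and other attributes of the task are used to determine a new timeslice. This approach has some problems. It can take a long time; worse, it scales O(n) for n tasks on the system. The recalculation must occur under some sort of lock protecting the task list and the individual process descriptors, which results in high lock contention. The nondeterminism of a randomly occurring recalculation of the timeslices is a problem for deterministic real-time programs. Finally, it is simply ugly (a legitimate reason for improving something in the Linux kernel). The new Linux scheduler alleviates the need for a recalculation loop. Instead, it maintains two priority arrays for each processor: an active array and an expired array. The active array contains all the tasks in the associated runqueue that have timeslice left. The expired array contains all the tasks in the associated runqueue that have exhausted their timeslice. When a task's timeslice reaches zero, its timeslice is recalculated before it is moved to the expired array. Recalculating all the timeslices is then as simple as switching the active and expired arrays. Because the arrays are accessed only via pointers, switching them is as fast as swapping two pointers. This is performed in schedule():

    struct prio_array *array = rq->active;

    if (!array->nr_active) {
        rq->active = rq->expired;
        rq->expired = array;
    }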

, O(1)-. , (1)- . .


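The act of picking the next task to run and switching to it is implemented by the schedule() function. This function is called explicitly by kernel code that wants to sleep, and it is invoked whenever a task is to be preempted. The schedule() function runs independently on each processor. Consequently, each processor makes its own decision about which process to run next. The schedule() function is relatively simple for all it must accomplish. The following code determines the highest priority task:

    struct task_struct *prev, *next;
    struct list_head *queue;
    struct prio_array *array;
    int idx;

    prev = current;
    array = rq->active;
    idx = sched_find_first_bit(array->bitmap);
    queue = array->queue + idx;
    next = list_entry(queue->next, struct task_struct, run_list);

First, the active priority array is searched to find the first set bit. This bit corresponds to the highest priority task that is runnable. Next, the scheduler selects the first task in the list at that priority. This is the highest priority runnable task on the system, and it is the task the scheduler will run (see Figure 4.2).

Figure 4.2. The Linux O(1) scheduler algorithm. schedule() calls sched_find_first_set() on a bitmap of 140 bits (bit 0 for priority 0 through bit 139 for priority 139); if the first set bit is bit 7, the scheduler takes the list of runnable tasks for priority 7 and runs the first process in that list.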

prev next , (next). , prev, , next, context_switch (), . . . , , , . -, . . , schedule () . .

, , , . , , - , , . , . , nice. -20 19, 0. 19 , -20 . nice s t a t i c _ p r i o t a s k _ s t r u c t . , , . , prio. . effective_prio () . ic -5 5, . , , nice, 10, , 5. , nice, 10, , , 12. , , , , nice. , , . , , - . (sleep). , -. , 78 4

, . , , -; , . Linux , , , , . sleep_avg t a s k _ s t r u c t . MAX_SLEEP_AVG, 10 . , sleep_avg , , sleep_avg MAX_SLEEP_AVG. , (timer tick) , 0. , , . , , , . , , , : , , . . , , , . , . sleep_avg. , nice, nice . , , , nice ( ). . , . . , , . task_timeslice () . . , . MAX_TIMESLICE, 200 . MIN_TIMESLICE, 10 .


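A task with the default priority (a nice value of zero and no interactivity bonus or penalty) receives a timeslice of 100 ms, as shown in Table 4.1.

Table 4.1. Scheduler timeslices

    Type of task       Nice value   Timeslice duration
    Initially created  parent's     half of the parent's
    Minimum priority   +19          5 ms (MIN_TIMESLICE)
    Default priority   0            100 ms (DEF_TIMESLICE)
    Maximum priority   -20          800 ms (MAX_TIMESLICE)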
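The three rows of Table 4.1 with fixed nice values can be reproduced by a small scaling function. The sketch below is modeled on the way 2.6-era kernels scale the default timeslice by static priority, but the exact formula is an assumption on my part and the function names (`nice_to_static_prio()`, `scale_prio()`, `timeslice_ms()`) are hypothetical; it ignores the interactivity bonus and works in milliseconds rather than timer ticks.

```c
#include <assert.h>

#define MAX_RT_PRIO    100
#define MAX_PRIO       140
#define MAX_USER_PRIO  (MAX_PRIO - MAX_RT_PRIO)  /* 40 nice levels */
#define MIN_TIMESLICE  5                         /* milliseconds */
#define DEF_TIMESLICE  100                       /* milliseconds */

/* Map a nice value in [-20, 19] onto the static priority range [100, 139]. */
int nice_to_static_prio(int nice)
{
    return MAX_RT_PRIO + nice + 20;
}

/* Scale a base timeslice by priority, never going below MIN_TIMESLICE. */
int scale_prio(int base, int prio)
{
    int slice = base * (MAX_PRIO - prio) / (MAX_USER_PRIO / 2);

    return slice > MIN_TIMESLICE ? slice : MIN_TIMESLICE;
}

/* Assumed rule: negative nice values scale from four times the default. */
int timeslice_ms(int nice)
{
    int prio;

    assert(nice >= -20 && nice <= 19);
    prio = nice_to_static_prio(nice);
    if (prio < nice_to_static_prio(0))
        return scale_prio(DEF_TIMESLICE * 4, prio);
    return scale_prio(DEF_TIMESLICE, prio);
}
```

Under these assumptions a nice value of -20 gives 800 ms, 0 gives 100 ms, and +19 gives 5 ms, matching the table.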

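The scheduler provides one additional aid to interactive tasks: if a task is sufficiently interactive, when it exhausts its timeslice it is not inserted into the expired array but is instead reinserted into the active array. Recall that timeslice recalculation is provided by switching the active and expired arrays, which gives O(1) recalculation. On the other hand, an interactive task could become runnable but fail to run again until the array switch occurs, because the task is stuck in the expired array. Reinserting interactive tasks into the active array alleviates this problem. The task does not run immediately; it is scheduled round robin with the other tasks at its priority. The logic is implemented in scheduler_tick(), which is called via the timer interrupt (discussed in Chapter 10, "Timers and Time Management"):

    struct task_struct *task = current;
    struct runqueue *rq = this_rq();

    if (!--task->time_slice) {
        if (!TASK_INTERACTIVE(task) || EXPIRED_STARVING(rq))
            enqueue_task(task, rq->expired);
        else
            enqueue_task(task, rq->active);
    }

First, the code decrements the process's timeslice and checks whether it is now zero. If it is, the task's timeslice has expired and the task must be inserted into an array, so the code checks whether the task is interactive via the TASK_INTERACTIVE() macro. This macro computes whether a task is "interactive enough" based on its nice value. The lower the nice value (the higher the priority), the less interactive a task needs to be. A task with a nice value of 19 can never be interactive enough to be reinserted. Conversely, a task with a nice value of -20 would need to be a heavy processor hog not to be reinserted. A task at the default nice value, zero, needs to be relatively interactive to be reinserted, but that is not too difficult. Next, the EXPIRED_STARVING() macro checks whether there are processes in the expired array that are starving, that is, whether the arrays have not been switched for a relatively long time. If they have not been switched recently, reinserting the current task into the active array would delay the switch further and starve the tasks in the expired array even more. If that is not the case, the process can be inserted into the active array. Otherwise, it is inserted into the expired array, which is the normal practice.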

( , , sleeping, blocked) , . , , "" , , , , . , . , - . , ( 9, " "). -, r e a d () , . . : , (wail queue), s c h e d u l e d . (wake up) : , . , : TASK_INTERRUPTIBLE TASK_UNINTERRUPTIBLE. , TASK_UNINTERRUPTIBLE , TASK_INTERRUPTIBLE . , , , . (wait queue). ,


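Wait queues are represented in the kernel by wait_queue_head_t. They are created statically via DECLARE_WAIT_QUEUE_HEAD() or dynamically via init_waitqueue_head(). Processes put themselves on a wait queue and mark themselves not runnable. When the event associated with the wait queue occurs, the processes on the queue are awakened. It is important to implement sleeping and waking correctly to avoid race conditions. Some simple interfaces for sleeping used to be in wide use. These interfaces, however, have races: it is possible to go to sleep after the condition has become true, in which case the task might sleep indefinitely. Therefore, the recommended method for sleeping in the kernel is a bit more complicated:

    /* 'q' is the wait queue we wish to sleep on */
    DECLARE_WAITQUEUE(wait, current);

    add_wait_queue(q, &wait);
    set_current_state(TASK_INTERRUPTIBLE); /* or TASK_UNINTERRUPTIBLE */
    /* 'condition' is the event that we are waiting for */
    while (!condition)
        schedule();
    set_current_state(TASK_RUNNING);
    remove_wait_queue(q, &wait);

The task performs the following steps to add itself to a wait queue:

1. It creates a wait queue entry via DECLARE_WAITQUEUE().
2. It adds itself to a wait queue via add_wait_queue(). This wait queue awakens the process when the condition for which it is waiting occurs. Of course, code elsewhere must call wake_up() on the queue when the event actually occurs.
3. It changes its state to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE.
4. It tests whether the condition is true. If it is, there is no need to sleep. If it is not, the task calls schedule().
5. When the task awakens, it checks again whether the condition is true. If it is, it exits the loop. Otherwise, it calls schedule() again and repeats.
6. Once the condition is true, the task sets itself to TASK_RUNNING and removes itself from the wait queue via remove_wait_queue().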

, , . , . , schedule () ; , -ERESTARTSYS; . (wake up) wake_up (), , , . try_to_wake_up () , TASK_RUNNING, activate_task () need_resched , , , . , , wakeup () , . , , VFS wake_up () , , . , . , , , , : , , , , (. 4.3).

, Linux . , . , , . - ? , , , , ? , , . .


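Figure 4.3. Sleeping and waking up. The task calls add_wait_queue(), sets its state to TASK_INTERRUPTIBLE, and calls schedule(); schedule() calls deactivate_task(), which removes the task from the runqueue. While the task sleeps, its state is not TASK_RUNNING. When the event occurs (or a signal arrives), try_to_wake_up() sets the task to TASK_RUNNING and calls activate_task() to put it back on a runqueue, and schedule() eventually runs it. The task then calls remove_wait_queue() to remove itself from the wait queue.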

, , . , , , . k e r n e l / s c h e d . l o a d _ b a l a n c e (). . s c h e d u l e (), . 1 , , 200 . l o a d _ b a l a n c e () , , . , , . , load b a l a n c e ( ) s c h e d u l e ( ) , , . , . , , . 4.4.


Figure 4.4. The load balancer. load_balance() pulls a process from the runqueue of processor 1 (20 processes) to the runqueue of processor 2 (15 processes).

load_balance () , , , . load_balance () find_busiest_queue () . . , 25% , , f ind_busiest_queue () NULL load_balance (). . load_balance () , . , , , (.. , not "cache hot"). , , . load_balance () , ( ), , . , , - . , , p u l l _ t a s k () . , . , , load_balance ().


Here is load_balance(), slightly simplified:

    static int load_balance(int this_cpu, runqueue_t *this_rq,
                            struct sched_domain *sd, enum idle_type idle)
    {
        struct sched_group *group;
        runqueue_t *busiest;
        unsigned long imbalance;
        int nr_moved;

        spin_lock(&this_rq->lock);

        group = find_busiest_group(sd, this_cpu, &imbalance, idle);
        if (!group)
            goto out_balanced;

        busiest = find_busiest_queue(group);
        if (!busiest)
            goto out_balanced;

        nr_moved = 0;
        if (busiest->nr_running > 1) {
            double_lock_balance(this_rq, busiest);
            nr_moved = move_tasks(this_rq, this_cpu, busiest,
                                  imbalance, sd, idle);
            spin_unlock(&busiest->lock);
        }
        spin_unlock(&this_rq->lock);

        if (!nr_moved) {
            sd->nr_balance_failed++;

            if (unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2)) {
                int wake = 0;

                spin_lock(&busiest->lock);
                if (!busiest->active_balance) {
                    busiest->active_balance = 1;
                    busiest->push_cpu = this_cpu;
                    wake = 1;
                }
                spin_unlock(&busiest->lock);
                if (wake)
                    wake_up_process(busiest->migration_thread);
                sd->nr_balance_failed = sd->cache_nice_tries;
            }
        } else
            sd->nr_balance_failed = 0;

        sd->balance_interval = sd->min_interval;

        return nr_moved;

    out_balanced:
        spin_unlock(&this_rq->lock);

        if (sd->balance_interval < sd->max_interval)
            sd->balance_interval *= 2;

        return 0;
    }

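Context switching, the switching from one runnable task to another, is handled by the context_switch() function defined in kernel/sched.c. It is called by schedule() when a new process has been selected to run. It does two basic jobs:

It calls switch_mm(), defined in include/asm/mmu_context.h, to switch the virtual memory mapping from the previous process's to that of the new process.

It calls switch_to(), defined in include/asm/system.h, to switch the processor state from the previous process's to the current's. This involves saving and restoring stack information and the processor registers.

The kernel, however, must know when to call schedule(). If it called schedule() only when code explicitly did so, user-space programs could run indefinitely. Instead, the kernel provides the need_resched flag to signify whether a reschedule should be performed (see Table 4.2). This flag is set by scheduler_tick() when a process runs out of timeslice and by try_to_wake_up() when a process with a higher priority than the currently running process is awakened. The kernel checks the flag, sees that it is set, and calls schedule() to switch to a new process. The flag is a message to the kernel that the scheduler should be invoked as soon as possible because another process deserves to run.

Table 4.2. Functions for accessing and manipulating need_resched

    Function                       Purpose
    set_tsk_need_resched(task)     Set the need_resched flag in the given process
    clear_tsk_need_resched(task)   Clear the need_resched flag in the given process
    need_resched()                 Test the value of the need_resched flag; return true if it is set and false otherwise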


, need_resched . , , . , , (- current , ). , 2.2. 2.2 2.4 t a s k _ s t r u c t i n t . 2.6 t h r e a d info. , .

(user preemption) , , need_resched , , . , "" . , , . , need_resched. , , . , entry.S ( , , ). , . . .

Linux, Unix, (, preemptible). . , , - , . , ( ) . 2.6, Linux : , , , . ? , , -


. , , . (SMP-safe), , . , , p r e e m p t _ c o u n t t h r e a d _ i n f . , . . , , n e e d _ r e s c h e d p r e e m p t _ c o u n t . n e e d _ r e s c h e d preempt__count , , . . p r e e m p t _ c o u n t , , . . , , preempt_count . , , , n e e d _ r e s c h e d . , . , 9. , s c h e d u l e () . , , , . , s c h e d u l e (), , . . . . , , s c h e d u l e ( ) . , , , .. ( s c h e d u l e ()).

Linux (real-lime): SCHED_FIFO SCHED_RR. SCHED_OTHER , .. . SCHED_FIFO " " (first-in first-out, FIFO) . SCHED_FIFO SCHED_OTHER. 89

SCHED_FIFO , , . , SCHED_FIFO, (roundrobin). , SCHED_FIFO, , , . SCHED_RR SCHED_FIFO, , , . , SCHED_RR SCHED_FIFO , .. (round-robin) . SCHED_RR, . , . SCHED_FIFO, , SCHED_RR, . . . , , , . Linux (soft real-time). , , . (hard real-time) . Linux . Linux , , . Linux , , Linux . 2.6 . 1 MAX_RT_PRIO 1, MAX_RT_PRIO 100, 1 99. nice SCHED_OTHER, MAX_RT_PRIO (MAX_RT_PRIO+40). , nice -20 +19 100 139.
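The shared priority space described here (real-time priorities below MAX_RT_PRIO, nice values mapped onto 100 through 139) can be written down as a pair of small helpers. This is an illustrative sketch: the function names `is_realtime_prio()` and `prio_to_nice()` are hypothetical, and only the constants MAX_RT_PRIO = 100 and the 40 nice levels come from the text.

```c
#include <assert.h>

#define MAX_RT_PRIO  100                 /* real-time priorities occupy [0, 99] */
#define MAX_PRIO     (MAX_RT_PRIO + 40)  /* nice levels occupy [100, 139] */

/* Nonzero if `prio` lies in the real-time part of the priority space. */
int is_realtime_prio(int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    return prio < MAX_RT_PRIO;
}

/* Convert a priority in [100, 139] back to a nice value in [-20, 19]. */
int prio_to_nice(int prio)
{
    assert(prio >= MAX_RT_PRIO && prio < MAX_PRIO);
    return prio - MAX_RT_PRIO - 20;
}
```

Priority 100 corresponds to a nice value of -20 and priority 139 to +19, while anything below 100 belongs to a real-time task.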


Linux . , , , (yield) . , (man pages), ( , ). . 4.3 . , , 5, " ". 4.3.

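System call                 Description
nice()                      Sets a process's nice value
sched_setscheduler()        Sets a process's scheduling policy
sched_getscheduler()        Gets a process's scheduling policy
sched_setparam()            Sets a process's real-time priority
sched_getparam()            Gets a process's real-time priority
sched_get_priority_max()    Gets the maximum real-time priority
sched_get_priority_min()    Gets the minimum real-time priority
sched_rr_get_interval()     Gets a process's timeslice value
sched_setaffinity()         Sets a process's processor affinity
sched_getaffinity()         Gets a process's processor affinity
sched_yield()               Temporarily yields the processor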

, sched_setscheduler () sched_getcheduler () . , , , . p o l i c y r t _ p r i o r i t y t a s k _ s t r u c t . sched_setparam () sched_getparam () . r t _ p r i o r i t y , sched_param. sched_get_priority_max ()


sched_get_priority__min () . (MAX_USER_RT_PRIO-1), - 1. nice () . root , .. nice . nice () s e t _ u s e r _ n i c e (), s t a t i c _ p r i a prio task_struct.

Linux (processor affinity). , : " ". cpus_allowed t a s k _ s t r u c t . . 1, . sched_setaffinity () . sched_getaffinity () cpus_allowed.

. -, . , . -, , (migration threads) . , , cpus_allowed .

Linux sched_yield () , . ( , ) . , , , , . , . ( ).

4

. Linux s c h e d _ y i e l d () . . , , s c h e d _ y i e l d (). , , y i e l d () , , TASK_RUNNING, s c h e d _ y i e l d ( ) . s c h e d _ y i e l d ().

, ( , ) . , , . , , , , , . , Linux , , . , , ( ) , , , , . . - . NUMA ( ) , NUMA- . (scheduler domain) , ; 2.6 . , Linux. , , .


5

, , , . . , , ( ). , , , , , , . , . . -, . , , , , , . -, . , . , , - , . , , 3, " ". , . Linux , ; . , /, .

, Linux 1 , . Linux.

API, POSIX , (Application Programing Interface, API). , , , , . API , . , , . , , API . Unix- POSIX. POSIX I2, , Unix. Linux POSIX. POSIX API . Unix- API, POSIX, . , POSIX , , Unix, . , , OS Unix, Windows NT, , POSIX. Linux, Unix-, . Unix-, . , , , , , .1

x86 250 ( ). , .2

IEEE, eye-trple-E ( , Institute of Electrical and Electronics Engineers) , , POSIX. : h t t p : / / w w n . i e e e . o r g .


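Figure 5.1. The relationship between applications, the C library, and the kernel with a call to printf(): the application calls printf(), printf() in the C library calls write() in the C library, and write() invokes the write() system call in the kernel.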

API POSIX. , : , , API. , : , , . - , . Unix " , ". , , . , , .

syscall ( syscall Linux) . (inputs), 3 , , . long 4 , . , , , . ( ) . Unix e r r n o . y p e r r o r (). , , . , g e t p i d () , , PID . .

3

"". (.. - ), , , , getpid (), .4

long 64- .

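asmlinkage long sys_getpid(void)
{
        return current->tgid;
}

Note that the definition says nothing about the implementation. The kernel must provide the intended behavior of the system call, but it is free to implement it however it likes as long as the result is correct. This system call is as simple as they come, and there are few other ways to implement it (certainly none simpler) [5]. Two things are worth noting. First, the asmlinkage modifier on the function definition tells the compiler to look only on the stack for this function's arguments; it is required for all system calls. Second, the getpid() system call is defined as sys_getpid() in the kernel. This is the naming convention for all system calls in Linux: system call bar() is implemented in the kernel as function sys_bar().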

Linux (syscall number). . , . . , . , . Linux " " ("not implemented") s y s _ n i _ s y s c a l l (), , , , -ENOSYS, , . " " , . . , s y s _ c a l l _ t a b l e . e n t r y . S . s y s c a l l .

, , g e t p i d () tgid, (thread group ID)? , TGID PID. TGID . getpid () PID.

5


Linux , . . . , .

. , , . , " ". - , , , . , , : (exception) . (system call handler). 8 i n t $0x80. 128, . s y s t e m _ c a l l (). e n t r y . S 6 . , sysenter. , i n t . . , , , , , .

, , . .

x86. , .

6

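Simply entering kernel-space is not sufficient, because there are many system calls and all of them enter the kernel in the same manner. The system call number must therefore be passed to the kernel. On x86 it is passed in the eax register. The system_call() function checks the validity of the given number by comparing it to NR_syscalls. If it is larger than or equal to NR_syscalls, the function returns -ENOSYS. Otherwise, the specified system call is invoked:

call *sys_call_table(,%eax,4)

Because each entry in the system call table is 32 bits (4 bytes), the kernel multiplies the system call number by 4 to find its location in the table (see Figure 5.2).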
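The dispatch step described here (bounds-check the number against NR_syscalls, then index the table) can be sketched in plain C. This is a user-space model under stated assumptions, not kernel code: demo_call_table, demo_dispatch() and the stand-in handlers are hypothetical names, and the table has only three entries.

```c
#include <errno.h>

typedef long (*demo_syscall_fn)(void);

/* Hypothetical stand-in for a real handler such as sys_getpid(). */
long demo_getpid(void)
{
    return 42;
}

/* Stand-in for the "not implemented" handler: always fails with -ENOSYS. */
long demo_ni_syscall(void)
{
    return -ENOSYS;
}

#define DEMO_NR_SYSCALLS 3

/* Unused slots point at the "not implemented" handler. */
static const demo_syscall_fn demo_call_table[DEMO_NR_SYSCALLS] = {
    demo_ni_syscall,
    demo_getpid,
    demo_ni_syscall,
};

/* Validate the number first, then call through the table. */
long demo_dispatch(unsigned long nr)
{
    if (nr >= DEMO_NR_SYSCALLS)
        return -ENOSYS;
    return demo_call_table[nr]();
}
```

Indexing an array of function pointers in C performs the same scaling by the entry size that the assembly instruction spells out explicitly.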

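Figure 5.2. Invoking the system call handler and executing a system call: the application calls read(), the read() wrapper in the C library traps into system_call() in the kernel, and system_call() calls sys_read().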

, . - . : . 86 ebx, ecx, edx, e s i , edi . , , , . . 86 .


Linux . Linux . . . , , Linux. , .. . . ( , , , ) Linux . , , i o c t l ( ) . , ? , . , , . . ? . , , . , . ? . 19, "", . , . Unix: " , ". , , . Unix . , !

, , . , , . , - , . , , , PID . , .


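System calls must carefully verify all their parameters. A system call runs in kernel-space, and if users could pass invalid input into the kernel without restraint, the security and stability of the system would suffer. One of the most important checks is the validity of any pointer the user provides. Before following a pointer into user-space, the kernel must ensure that the pointer points to a region of memory in user-space, that the region is in the address space of the process, and that the memory is marked readable if it is to be read or writable if it is to be written.

The kernel provides two functions that perform these checks and the copy itself. For writing into user-space, copy_to_user() is provided. It takes three parameters: the destination address in the address space of the process, the source pointer in kernel-space, and the size in bytes of the data to copy. For reading from user-space, copy_from_user() is analogous to copy_to_user(). Both functions return the number of bytes they failed to copy on error, and zero on success. It is standard for the system call to return -EFAULT in the case of such an error.

Consider an example system call that uses both copy_from_user() and copy_to_user(). The silly_copy() system call is utterly worthless: it copies data from its first parameter into its second, passing it through kernel-space for no gain. It does, however, illustrate the point.

/*
 * silly_copy - utterly worthless syscall that copies the len bytes from
 * 'src' to 'dst' using the kernel as an intermediary in the copy for no
 * good reason. But it makes for a good example!
 */
asmlinkage long sys_silly_copy(unsigned long *src,
                               unsigned long *dst,
                               unsigned long len)
{
        unsigned long buf;

        /* fail if the kernel word size and the user word size do not match */
        if (len != sizeof(buf))
                return -EINVAL;

        /* copy src, which is in the user's address space, into buf */
        if (copy_from_user(&buf, src, len))
                return -EFAULT;

        /* copy buf into dst, which is in the user's address space */
        if (copy_to_user(dst, &buf, len))
                return -EFAULT;

        /* return the amount of data copied */
        return len;
}

Both copy_from_user() and copy_to_user() may block. This happens, for example, if the page containing the user data is not in physical memory but swapped out to disk. In that case the process sleeps until the page fault handler brings the page back into memory.

A final possible check is for valid permission. Older versions of Linux used suser(), which merely checked whether the user was root. It has been replaced by a finer-grained "capabilities" system. A call to capable() with a valid capability flag returns nonzero if the caller holds the specified capability and zero otherwise. For example, capable(CAP_SYS_NICE) checks whether the caller may modify the nice values of other processes. By default, the superuser possesses all capabilities and non-root users possess none. Here is another worthless system call, this one demonstrating the use of capabilities:

asmlinkage long sys_am_i_popular(void)
{
        /* check whether the user possesses the CAP_SYS_NICE capability */
        if (!capable(CAP_SYS_NICE))
                return -EPERM;

        /* return zero for success */
        return 0;
}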

" " , , . 103

3, " ", . current , , . (, schedule ()), . . , . 6, " ", 7. , , , , , . , , . , . , , 8, " ", 9, " ". system_call (), , .

, . . , ( ). , . , . include/linux/unistd.h. ( 8). - kernel/.7

, , , .8

. , , . . .


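Consider a fictional system call, foo(). First, sys_foo() is added to the system call table. For most architectures the table is located in entry.S and looks like this:

ENTRY(sys_call_table)
        .long sys_restart_syscall    /* 0 */
        .long sys_exit
        .long sys_fork
        .long sys_read
        .long sys_write
        .long sys_open               /* 5 */
        ...
        .long sys_timer_delete
        .long sys_clock_settime
        .long sys_clock_gettime      /* 280 */
        .long sys_clock_getres
        .long sys_clock_nanosleep

The new system call is appended to the tail of this list:

        .long sys_foo

Although it is not specified explicitly, the system call receives the next free number, 283. For each architecture you wish to support, the system call must be added to that architecture's table (it need not have the same number on every architecture). Next, the system call number is added to include/asm/unistd.h, which currently looks like this:

/*
 * This file contains the system call numbers.
 */
#define __NR_restart_syscall      0
#define __NR_exit                 1
#define __NR_fork                 2
#define __NR_read                 3
#define __NR_write                4
#define __NR_open                 5
...
#define __NR_mq_unlink          278
#define __NR_mq_timedsend       279
#define __NR_mq_timedreceive    280
#define __NR_mq_notify          281
#define __NR_mq_getsetattr      282

The following line is then added to the end of the list:

#define __NR_foo                283

Finally, the foo() system call itself is implemented. Because the system call must be compiled into the core kernel image in every configuration, it is placed in kernel/sys.c. In general, put the function wherever it is most relevant; if it were related to scheduling, for example, it could go in sched.c.

/*
 * sys_foo - everyone's favorite system call.
 *
 * Returns the size of the per-process kernel stack.
 */
asmlinkage long sys_foo(void)
{
        return THREAD_SIZE;
}

That is all. Boot this kernel, and user-space can invoke the foo() system call.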

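Generally, the C library provides support for system calls. User applications pull in function prototypes from the standard headers and link against the C library to use a system call (or a library routine that in turn uses it). If you have just written the system call, however, it is doubtful that glibc already supports it. Fortunately, Linux provides a set of macros for wrapping access to system calls. They set up the register contents and issue the trap instruction (int $0x80). The macros are named _syscalln(), where n is between zero and six and corresponds to the number of parameters passed to the system call, because the macro needs to know how many parameters to expect and push into registers. For example, consider the open() system call, declared as

long open(const char *filename, int flags, int mode)

The macro invocation that makes this system call usable without explicit library support would be

#define __NR_open 5
_syscall3(long, open, const char *, filename, int, flags, int, mode)

The application can then simply call open(). Each macro takes 2 + 2*n parameters. The first is the return type of the system call and the second is its name. Next follow the type and the name of each parameter, in the order the system call expects them. The __NR_open define, which lives in <asm/unistd.h>, is the system call number. The macro expands into a C function with inline assembly that pushes the system call number and the parameters into the correct registers and issues the software interrupt that traps into the kernel. Placing this macro in an application is all that is required to use the open() system call.

A macro can be written for the new foo() system call in the same way, followed by a test program:

#define __NR_foo 283
_syscall0(long, foo)

int main()
{
        long stack_size;

        stack_size = foo();
        printf("The kernel stack size is %ld\n", stack_size);

        return 0;
}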
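For comparison, on a current Linux system with glibc a system call can also be invoked by number through the library's generic syscall() function, which takes the call number followed by the arguments. The sketch below assumes Linux and glibc; raw_getpid() is a hypothetical helper name introduced only for illustration, and the example does not claim to reproduce what the _syscalln() macros expand to.

```c
#define _DEFAULT_SOURCE      /* ask glibc to declare syscall() */
#include <sys/syscall.h>     /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>          /* syscall(), getpid() */

/* Hypothetical helper: invoke getpid by system call number. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The value returned by raw_getpid() should match what the ordinary getpid() library wrapper reports for the same process.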

, , . , , . "" "" . "". . Linux . "". , . , " ". , . . " ".

107

. r e a d () w r i t e () , i o c t l () . , , . . sysfs. , . Linux , . , (deprecated) (.. , ). , Linux . 2.3 2.5. .

, (API). , Linux, : , , . , , . ! , . , , , . "" "" .


6

, , . . , , , , . , , . (polling). . , , , , . , . (interrupt).

. , ( , ) , , . , . , . , . , , .

, . . , . , . , . , , , , . , . . , , (interrupt request lines, IRQ lines). . , PC IRQ, 0, , a IRQ, 1, . . , PCI, , . , PCI, . , , . , , "! ! !. (exceptions) . , . , . (, ) , (, - , page fault). , , , . , (, ), (, ). . , 86 . , . , , , .


, , (interrupt handler) -

(interrupt service routine). , , . , , , . - , . Linux , . , , . , , , , (interrupt context), . , , , . , . , , , , . , , , . . , . , , , . , .

, , , , , . , , . (top half) , , . , , ( ) (bottom half). , , . .

111

, 7, " ". . , , . , . : ", ! !. . , , , . , . . , .

, . , ( ), . ./* request_irq: */ int request_irq(unsigned int irq, irqreturn_t (*handler)(int, void *, struct pt_regs *), unsigned long irqflags, const char * devname, void *dev_id);

, irq, . , , , , , , , . (probing) . , handler, , . , . -. i r q r e t u r n _ t . . , i r q f l a g s , .


SA_INTERRUPT. , . , Linux . , , , , . : . , . ( ) , , . , , . SA_SAMPLE_RANDOM. , , , . . , , , . , (, , ) (, , ). , . .. , " ". SA_SHIRQ. , (shared). , , . . . , devname, ASCII-, , . , "keyboard". /proc/irq /proc/interrupts, . , d e v i d , , . ( ), dev_id (cookie), . , . , NULL, , (cookie) ( ISA, , , ).

113

. ( ), , , , . r e q u e s t _ i r q () . , . -EBUSY, , ( , SA_SHIRQ). , r e q u e s t _ i r q () (sleep) , , , , . , request_irq() , . , , r e q u e s t _ i r q ( ) - . . /proc/irq. proc_mkdir () procfs. proc_create () p r o c f s , kmalloc () . 11, " ", kmalloc () . ! .if (request_irq(irqn, my_interrupt, SA_SHIRQ, "my_device", dev)){ printk(KERN_ERR "my_device: cannot register IRQ %d\n", irqn); return -EIO; }

irqn , my_interrupt , , "my_device", dev dev_id. , , . , . . , , .

void free_irq(unsigned int irq, void *dev_id)


, . , , dev_id. , . , dev_id. , , , f r e e _ i r q ( ) . , devoid NULL, , . free_irq() . 6 . 1 . request_irq f r e e _ i r q () () . ,

.static irqreturn_t intr_handler (int irq, void *dev_id, struct pt_regs *regs)

, , request_irq (). , irq, , . , . , 2.0, dev_id, i r q , , , ( ). , dev_id, , , request_irq () . , , , , . , () (device structure) , , , , dev_id . , regs, , , . , .. _ 115

. , . , , . i r q r e t u r n _ t . : IRQ_NONE IRQ_HANDLED. , , , , . , , , . , IRQ_RETVAL (x). , IRQ_HANDLED, , IRQ_NONE. , () . , , IRQ_NONE, . , , i r q r e t u r n _ t , int. , , . 2.6 void. typedef i r q r e t u r n _ t void 2.4 . s t a t i c , . , . , , . , . , , . Linux . , , . , , . , . .

(shared) , . , , . SA_SHIRQ flags request_irq ().


dev_id . , . , , , , . dev_id NULL! , , . , . , , . , , . , , , . request_irq () SH_SHIRQ, , SH_SHIRQ. , 2.6, , "" SA_INTERRUPT. , , . , , . , . , (status register) , . .
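The way handlers on a shared line cooperate can be modelled in user-space C. In this sketch every handler registered on the line is called in turn; each one checks whether its own device raised the interrupt and returns a "none" or "handled" value accordingly. All names (demo_device, demo_irqaction, demo_handler, demo_run_line) are hypothetical, and the irq_pending field stands in for a hardware status register.

```c
#include <stddef.h>

typedef enum { DEMO_IRQ_NONE = 0, DEMO_IRQ_HANDLED = 1 } demo_irqreturn_t;

/* Hypothetical device: irq_pending models a hardware status register. */
struct demo_device {
    int irq_pending;
    int serviced;
};

/* One registered handler on a line, chained to the next one. */
struct demo_irqaction {
    demo_irqreturn_t (*handler)(int irq, void *dev_id);
    void *dev_id;                  /* distinguishes handlers on a shared line */
    struct demo_irqaction *next;
};

/* Handler: quickly decide whether our device caused the interrupt. */
demo_irqreturn_t demo_handler(int irq, void *dev_id)
{
    struct demo_device *dev = dev_id;

    (void)irq;
    if (!dev->irq_pending)
        return DEMO_IRQ_NONE;      /* not ours: leave it to the others */
    dev->irq_pending = 0;
    dev->serviced++;
    return DEMO_IRQ_HANDLED;
}

/* Run every handler on the line; return how many claimed the interrupt. */
int demo_run_line(int irq, struct demo_irqaction *action)
{
    int handled = 0;

    for (; action != NULL; action = action->next)
        if (action->handler(irq, action->dev_id) == DEMO_IRQ_HANDLED)
            handled++;
    return handled;
}
```

Because dev_id is the only thing that tells two registrations of the same function apart, it has to be unique per device in this model, which mirrors the requirement placed on shared handlers.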

, RTC (real-time clock, ), d r i v e r s / c h a r / r t c . . RTC , (PC). , , (alarm) (periodic timer). ( ) - (I/O range). . : , . RTC r t c i n i t () 117

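One of the duties of rtc_init() is to register the interrupt handler:

if (request_irq(RTC_IRQ, rtc_interrupt, SA_INTERRUPT, "rtc", NULL)) {
        printk(KERN_ERR "rtc: cannot register IRQ %d\n", RTC_IRQ);
        return -EIO;
}

The interrupt line is stored in RTC_IRQ, a preprocessor define that specifies the RTC interrupt for a given architecture. On the PC, for example, the RTC is always located at IRQ 8. The second parameter is the interrupt handler, rtc_interrupt, which runs with all interrupts disabled because of the SA_INTERRUPT flag. The fourth parameter gives the driver name, "rtc". Because this device cannot share the interrupt line and the handler has no use for any special value, NULL is passed for dev_id. Finally, the handler itself:

/*
 * A very tiny interrupt handler. It runs with SA_INTERRUPT set,
 * but there is a possibility of conflicting with the set_rtc_mmss()
 * call (the rtc irq and the timer irq can easily run at the same
 * time on two different CPUs). So we need to serialize accesses
 * to the chip with the rtc_lock spinlock that each architecture
 * should implement in the timer code.
 * (See ./arch/XXXX/kernel/time.c for the set_rtc_mmss() function.)
 */
static irqreturn_t rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
        /*
         * Can be an alarm interrupt, update complete interrupt,
         * or a periodic interrupt. We store the status in the
         * low byte and the number of interrupts received since
         * the last read in the remainder of rtc_irq_data.
         */
        spin_lock(&rtc_lock);

        rtc_irq_data += 0x100;
        rtc_irq_data &= ~0xff;
        rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0);

        if (rtc_status & RTC_TIMER_ON)
                mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);

        spin_unlock(&rtc_lock);

        /*
         * Now do the rest of the actions
         */
        spin_lock(&rtc_task_lock);
        if (rtc_callback)
                rtc_callback->func(rtc_callback->private_data);
        spin_unlock(&rtc_task_lock);
        wake_up_interruptible(&rtc_wait);

        kill_fasync(&rtc_async_queue, SIGIO, POLL_IN);

        return IRQ_HANDLED;
}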

, RTC. , -: , rtc_irq_data SMP-, rtc_callback. 9, " ". r t c _ i r q data RTC m o d t i m e r (). 10, " ". , -, (callback), . RTC , , RTC. IRQ_HANDLED, , . , RTC , IRQ_HANDLED.

, . , , , , . current . , , ..


, . c u r r e n t ( , ). , (sleep) , ? . , , , . , . . - (busy loop) . . , (, !). . , . . , . 1. , 8 32- 16 64- . , , . , , . 2.6 , 4 32- . , , . , , . . , , , . , . .

- . , (idle task). 120 6

1

, , Linux . , , . . 6.1 , .

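Figure 6.1. The path an interrupt takes from the hardware through the kernel: the hardware generates an interrupt, the interrupt controller signals the processor, and the processor interrupts the kernel, which runs do_IRQ(). If there is a handler on the line, handle_IRQ_event() runs all interrupt handlers registered on it. In either case control then passes to ret_from_intr() and the kernel returns to the interrupted code.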

. ( ), . . ( , ), , , , , . . , . , . IRQ . ( ). do_IRQ (). , , , .

121

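The do_IRQ() function is declared as

unsigned int do_IRQ(struct pt_regs regs)

Because the C calling convention places function arguments at the top of the stack, the pt_regs structure contains the initial register values that were saved earlier in the assembly entry routine. Because the interrupt value was saved as well, do_IRQ() can extract it. The x86 code is

int irq = regs.orig_eax & 0xff;

After the interrupt line is calculated, do_IRQ() acknowledges the receipt of the interrupt and disables interrupt delivery on the line. On normal PC machines these operations are handled by mask_and_ack_8259A(), which do_IRQ() calls. Next, do_IRQ() ensures that a valid handler is registered on the line, that it is enabled, and that it is not currently executing. If so, it calls handle_IRQ_event() to run the installed interrupt handlers for the line. On x86, handle_IRQ_event() is

int handle_IRQ_event(unsigned int irq, struct pt_regs *regs,
                     struct irqaction *action)
{
        int status = 1;

        if (!(action->flags & SA_INTERRUPT))
                local_irq_enable();

        do {
                status |= action->flags;
                action->handler(irq, action->dev_id, regs);
                action = action->next;
        } while (action);

        if (status & SA_SAMPLE_RANDOM)
                add_interrupt_randomness(irq);

        local_irq_disable();

        return status;
}

First, because the processor disabled interrupts, they are turned back on unless SA_INTERRUPT was specified when the handler was registered. Recall that SA_INTERRUPT means the handler must run with interrupts disabled. Next, each potential handler is executed in a loop. If the line is not shared, the loop terminates after the first iteration; otherwise, all handlers are executed. After that, add_interrupt_randomness() is called if SA_SAMPLE_RANDOM was specified during registration. This function uses the timing of the interrupt to generate entropy for the random number generator, which is described in the appendix on the kernel random number generator.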


( do_IRQ () , ). do_IRQ () , ret_from_intr(). r e t _ f r o m _ i n t r ( ) , , . , ( 4, " ", n e e d _ r e s c h e d ) . (.. ), s c h e d u l e () . (.. ), s c h e d u l e () , p r e e m p t _ c o u n t ( ), s c h e d u l e () , , . 86, , , a r c h / i 3 8 6 / k e r n e l / e n t r y . S , a r c h / i 3 8 6 / k e r n e l / i r q . . .

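/proc/interrupts

procfs is a virtual filesystem that exists only in kernel memory and is typically mounted at /proc. Reading or writing files in procfs invokes kernel functions that simulate reading or writing a real file. A relevant example is the /proc/interrupts file, which is populated with statistics about interrupts on the system. Here is sample output from a uniprocessor PC:

           CPU0
  0:    3602371    XT-PIC  timer
  1:       3048    XT-PIC  i8042
  2:          0    XT-PIC  cascade
  4:    2689466    XT-PIC  uhci-hcd, eth0
  5:          0    XT-PIC  EMU10K1
 12:      85077    XT-PIC  uhci-hcd
 15:      24571    XT-PIC  aic7xxx
NMI:          0
LOC:    3602236
ERR:          0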

. 0-2, 4, 5, 12 15. , , . . , .


, 3 . 6 0 2 . 3 7 1 2 , (EMU10K1) ( , , ). , . XT-PIC PC (PC programmable interrupt controller). I/ APIC IO-APIC-level IO-APIC-edge. , , . dev_name request_irq () , . , 4 , , . , , procfs, fs/proc. , /proc/interrupts, show_interrupts () .

Linux . . . . 6.2 . , , , . , . , . , . Linux , , . . , . 8 9 . .

2

10, " ", , ( HZ) ?


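To disable interrupts locally on the current processor (and only on the current processor) and later re-enable them, do the following:

local_irq_disable();
/* interrupts are disabled .. */
local_irq_enable();

These functions are usually implemented as a single assembly instruction (this depends on the architecture, of course). On x86, local_irq_disable() is a simple cli and local_irq_enable() is a simple sti instruction. For readers unfamiliar with x86, sti and cli are the instructions that set and clear the allow-interrupts flag, respectively. In other words, they enable and disable interrupt delivery on the issuing processor.

The local_irq_disable() routine is dangerous if interrupts were already disabled before it was called. The matching call to local_irq_enable() unconditionally enables interrupts, even though they were off to begin with. What is needed is a way to restore interrupts to their previous state. This is a common concern, because a given code path in the kernel can be reached both with and without interrupts enabled, depending on the call chain. Imagine that the previous snippet is part of a larger function that is called by two other functions, one of which disables interrupts and one of which does not. As the kernel grows in size and complexity, it becomes harder to know every code path leading to a function, so it is much safer to save the state of the interrupt system before disabling it and to restore that state afterwards:

unsigned long flags;

local_irq_save(flags);    /* interrupts are now disabled */
/* ... */
local_irq_restore(flags); /* interrupts are restored to their previous state */

These methods are implemented at least in part as macros, so the flags parameter (which must be defined as an unsigned long) only appears to be passed by value. The parameter holds architecture-specific data describing the state of the interrupt system. Because at least one supported architecture (SPARC) incorporates stack information into the value, flags cannot be passed to another function; it must remain in the same stack frame. For this reason, the call that saves and the call that restores interrupts must occur in the same function.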
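The difference between the unconditional enable and the save/restore pair can be demonstrated with a small user-space model in which a single variable stands in for the processor's interrupt flag. All names here are hypothetical. Unlike the kernel's local_irq_save(), which is a macro that fills in its flags argument, the model's demo_irq_save() is an ordinary function that returns the saved state.

```c
/* Models the processor's interrupt-enable flag: 1 = enabled, 0 = disabled. */
static int demo_irqs_enabled = 1;

void demo_irq_disable(void) { demo_irqs_enabled = 0; }
void demo_irq_enable(void)  { demo_irqs_enabled = 1; }
int  demo_irqs_are_enabled(void) { return demo_irqs_enabled; }

/* Save the current state, then disable. */
unsigned long demo_irq_save(void)
{
    unsigned long flags = (unsigned long)demo_irqs_enabled;

    demo_irqs_enabled = 0;
    return flags;
}

/* Put the flag back exactly as it was when it was saved. */
void demo_irq_restore(unsigned long flags)
{
    demo_irqs_enabled = (int)flags;
}

/* Unsafe if the caller already had interrupts off: it turns them on. */
void demo_critical_naive(void)
{
    demo_irq_disable();
    /* ... critical region ... */
    demo_irq_enable();
}

/* Safe from either context: leaves the flag as the caller had it. */
void demo_critical_safe(void)
{
    unsigned long flags = demo_irq_save();

    /* ... critical region ... */
    demo_irq_restore(flags);
}
```

Calling demo_critical_naive() from a caller that had already disabled interrupts leaves them enabled on return, which is exactly the hazard the save/restore form avoids.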


, . cli () , . , - , , . c l i ( ) , s t i ( ) ; "86-" ( ). 2.5, , , - ( 9, " "). , , , , . , , , c l i ( ) , . c l i () , ( ) . , , c l i ( ) , , , c l i ( ) , .. s t i (). c l i () . -, . , , c l i (). -, . .

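In some cases it is useful to disable only a specific interrupt line for the entire system. This is called masking out an interrupt line. For example, you might want to disable delivery of a device's interrupts before manipulating its state. Linux provides four interfaces for this task:

void disable_irq(unsigned int irq);
void disable_irq_nosync(unsigned int irq);
void enable_irq(unsigned int irq);
void synchronize_irq(unsigned int irq);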

. . , d i s a b l e _ i r q () , , , . , 126 6

, , . d i s a b l e _ i r q _ n o s y n c () . s y n c h r o n i z e _ i r q () , , , , . , .. d i s a b l e _ i r q () d i s a b l e _ i r q _ n o s y n c () e n a b l e _ i r q ( ) . e n a b l e _ i r q () . , d i s a b l e _ i r q () , , e n a b l e _ i r q () . (sleep). ! , (, , , ). , . , . 3. PCI , . d i s a b l e _ i r q () , .
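Calls to disable_irq() and enable_irq() nest: a line that has been disabled twice stays disabled until it has been enabled twice. A minimal user-space model of that counting behavior follows. The structure and function names are hypothetical, and the model ignores the waiting for running handlers that distinguishes disable_irq() from disable_irq_nosync().

```c
/* Hypothetical per-line state: how many disables are outstanding. */
struct demo_irq_line {
    unsigned int depth;
};

/* Each disable adds one level of nesting. */
void demo_disable_irq(struct demo_irq_line *line)
{
    line->depth++;
}

/* Each enable removes one level; the line is live again only at depth 0. */
void demo_enable_irq(struct demo_irq_line *line)
{
    if (line->depth > 0)
        line->depth--;
}

int demo_line_enabled(const struct demo_irq_line *line)
{
    return line->depth == 0;
}
```

Two independent pieces of code can therefore each bracket their work with a disable/enable pair without one of them re-enabling the line under the other.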

(, , ). i r q _ d i s a b l e d (), , , . . , . in_interrupt() in_irq() . , . , . i n _ i r q ( ) , . , ISA, , . - ISA- . PCI , PCI . .3


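More often, you want to check whether you are in process context, that is, to make sure you are not in interrupt context. This is usually because the code wants to do something that can be done only from process context, such as sleep. If in_interrupt() returns zero, the kernel is in process context.

Table 6.2. Interrupt control methods

Function                               Description
local_irq_disable()                    Disables local interrupt delivery
local_irq_enable()                     Enables local interrupt delivery
local_irq_save(unsigned long flags)    Saves the current state of local interrupt delivery and then disables it
local_irq_restore(unsigned long flags) Restores local interrupt delivery to the given state
disable_irq(unsigned int irq)          Disables the given interrupt line and ensures no handler on the line is executing before returning
disable_irq_nosync(unsigned int irq)   Disables the given interrupt line
enable_irq(unsigned int irq)           Enables the given interrupt line
irqs_disabled()                        Returns nonzero if local interrupt delivery is disabled; otherwise returns zero
in_interrupt()                         Returns nonzero if in interrupt context and zero if in process context
in_irq()                               Returns nonzero if currently executing an interrupt handler and zero otherwise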

, ! , , . , . , . , , , . , , , , -


, , . , , . . 6.2 . ( , ), . . . . .


7

, , . , . , , . . ( ). . ( SA_INTERRUPT) . . , . , . , . , , , . , , , . . (top half, )

, . (bottom half).

, . , , (.. ) . . , , . . , . . , , , , . , , , , . , . , , , . . , , . , . , . , ( ) , . . . , : " , ". , , .


, . , , . , , SA_INTERRUPT, ( ). , , . , , , , . . ? , . , , . Linux, . , . ( ) , . , .

, , . , . , . Linux . , . , " ". , Linux 2.6. , , . , , , .

133

Linux , " " ("bottom half"). , . "" , "bottom half ( ). , . 32 . 32- . , .. , . , ; , . (task queue) . . , . , , , , . . , , . , "" , . 2.3 1 (softirq) (tasklet).

, 2. - 32 , , . , , , 3.

1

softirq , " ", , ( ) " ". (. .)2 , . 2.5 - . 3

task (). (softirq).


, . , . . , , , . , . , . , , . , , (software interrupt, softirq). , . , , , "" . 2.5 , . , (work queue). , . , 2.6 : , . , . . , , . , , - . , . , , . 10, " ".

, . . " " ("bottom half") , , . Linux . , , .

135

"soflirq", . "Bottom Half" Linux. "", , " " ("bottom half") . 2.5. : (softirq), . softirq, . . 7.1 . 7 . 1 . 2.5 2.5 2.3 2.3 2.3

, .

(softirq) softirq. . . softirq, softirq . , , k e r n e l / s o f t i r q . .

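Softirqs are statically allocated at compile time. Unlike tasklets, they cannot be registered and destroyed dynamically. Softirqs are represented by the softirq_action structure, which is defined in <linux/interrupt.h>:

/*
 * structure representing a single softirq entry
 */
struct softirq_action {
        void (*action)(struct softirq_action *); /* function to run */
        void *data;                              /* data to pass to the function */
};

A 32-entry array of this structure is declared in kernel/softirq.c:

static struct softirq_action softirq_vec[32];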

. , 32 softirq. , . softirq . 32 4.

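The prototype of a softirq handler, action, looks like this:

void softirq_handler(struct softirq_action *)

When the kernel runs a softirq handler, it calls this action function with a pointer to the corresponding softirq_action structure as its only argument. For example, if my_softirq pointed to an entry in the softirq_vec array, the kernel would invoke the softirq handler as

my_softirq->action(my_softirq)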

, , , data. . data data. softirq. , , softirq, . ( ) .

, . (rise softirq). . . .4

, softirq, .


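Pending softirqs are checked for and executed on return from hardware interrupt code, in the ksoftirqd kernel thread, and in any code that explicitly checks for and executes pending softirqs, such as the networking subsystem. Regardless of how it is invoked, softirq execution takes place in do_softirq(). The function is quite simple: if there are pending softirqs, do_softirq() loops over each one and invokes its handler. Here is a simplified variant of the important part of do_softirq():

u32 pending = softirq_pending(cpu);

if (pending) {
        struct softirq_action *h = softirq_vec;

        softirq_pending(cpu) = 0;

        do {
                if (pending & 1)
                        h->action(h);
                h++;
                pending >>= 1;
        } while (pending);
}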

. . pending , softirq_pending (). 32- . n, . , 5. h softirq_vec. , pending, , h->action(h). h , softirq_vec. , pending, . . , ..5 , . , ( ), .
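The bitmask walk described above can be run in user-space to see how the handler pointer and the pending mask advance together. The sketch below is a simplified model, not the kernel's code: all demo_ names are hypothetical, there is a single global pending mask instead of a per-processor one, and raising a softirq is just setting a bit.

```c
#include <stdint.h>

struct demo_softirq_action {
    void (*action)(struct demo_softirq_action *);
    void *data;
};

static struct demo_softirq_action demo_softirq_vec[32];
static uint32_t demo_pending;      /* bit n set means softirq n is pending */

/* Register a handler and its data at index nr. */
void demo_open_softirq(int nr, void (*action)(struct demo_softirq_action *),
                       void *data)
{
    demo_softirq_vec[nr].action = action;
    demo_softirq_vec[nr].data = data;
}

/* Mark softirq nr as pending. */
void demo_raise_softirq(int nr)
{
    demo_pending |= (uint32_t)1 << nr;
}

/* Run every pending handler once; return how many ran. */
int demo_do_softirq(void)
{
    uint32_t pending = demo_pending;
    struct demo_softirq_action *h = demo_softirq_vec;
    int ran = 0;

    demo_pending = 0;              /* clear the mask before running handlers */
    while (pending) {
        if ((pending & 1) && h->action) {
            h->action(h);
            ran++;
        }
        h++;                       /* next vector entry ...              */
        pending >>= 1;             /* ... matches the next bit in the mask */
    }
    return ran;
}

/* Example handler: counts how often it runs via its data pointer. */
void demo_count_action(struct demo_softirq_action *h)
{
    (*(int *)h->data)++;
}
```

Raising the same softirq twice before it runs sets the same bit twice, so the handler still runs only once on the next pass.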


h , . . , . , . , , , h softirq_vec, pending 32 32 .

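Softirqs are reserved for the most timing-critical and important bottom-half processing on the system. Currently only two subsystems, networking and SCSI, use softirqs directly. In addition, kernel timers and tasklets are built on top of softirqs. Before adding a new softirq, ask yourself why a tasklet is not sufficient.

Softirqs are declared statically at compile time through an enum in <linux/interrupt.h>. The kernel uses the index, which starts at zero, as a relative priority: softirqs with a lower number execute before those with a higher number. Creating a new softirq means adding a new entry to this enum, and the entry should be placed according to the priority you want to give it rather than simply appended to the end. By convention HI_SOFTIRQ is always the first entry and TASKLET_SOFTIRQ is always the last. A new entry probably belongs somewhere after the network entries and before TASKLET_SOFTIRQ. Table 7.2 lists the existing softirq types.

Table 7.2. Softirq types

Softirq            Priority    Description
HI_SOFTIRQ         0           High-priority tasklets
TIMER_SOFTIRQ      1           Timer bottom half
NET_TX_SOFTIRQ     2           Send network packets
NET_RX_SOFTIRQ     3           Receive network packets
SCSI_SOFTIRQ       4           SCSI bottom half
TASKLET_SOFTIRQ    5           Tasklets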


o p e n _ s o f t i r q ( ) , : , - d a t a . , , . open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL); open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL); (sleep). . . , , , . , , , , , ( ). , . . , , . , , ( , ), - , . . , . , .

o p e n _ s o f t i r q () , . , , d o _ s o f t i r q ( ) , r a i s e _ s o f t i r q ( ) . , . raise_softirq(NET_TX_SOFTIRQ); NET_TX_SOFTIRQ. n e t _ t x _ a c t i o n () . , , .


, raise_sof t i r q _ i r q o f f (), ./* * ! */ raise_softirq_irqoff(NET_TX_SOFTIRQ);

. , , . d o _ s o f t i r q ( ) . , . " " " ".

, . , (task). . . , , : . , . , . .

, (softirq). , : HI_SOFTIRQ TASKLET_SOFTIRQ. , HI_SOFTIRQ TASKLET_SOFTIRQ.

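Tasklets are represented by the tasklet_struct structure. Each structure represents a unique tasklet. The structure is declared in <linux/interrupt.h>:

struct tasklet_struct {
        struct tasklet_struct *next;  /* next tasklet in the list */
        unsigned long state;          /* state of the tasklet */
        atomic_t count;               /* reference counter */
        void (*func)(unsigned long);  /* tasklet handler function */
        unsigned long data;           /* argument to the tasklet function */
};

The func member is the tasklet handler (the equivalent of action for a softirq) and receives data as its only argument. The state member is one of zero, TASKLET_STATE_SCHED, or TASKLET_STATE_RUN. TASKLET_STATE_SCHED denotes a tasklet that is scheduled to run, and TASKLET_STATE_RUN denotes a tasklet that is running. As an optimization, TASKLET_STATE_RUN is used only on multiprocessor machines, because a uniprocessor machine always knows whether the tasklet is running (it is either the currently executing code or it is not). The count field is a reference count for the tasklet. If it is nonzero, the tasklet is disabled and cannot run; if it is zero, the tasklet is enabled and can run if it is marked pending.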

(scheduled) ( ) 6 , : t a s k l e t _ v e c ( ) t a s k l e t _ h i _ v e c ( ). t a s k l e t _ s t r u c t . t a s k l e t _ s t r u c t . t a s k l e t _ s c h e d u l e () t a s k l e t _ h i _ s c h e d u l e ( ) , t a s k l e t _ s t r u c t . ( , TASKLET_SOFTIRQ, HI_SOFTIRQ). . t a s k l e t _ h i _ s c h e d u l e (), . , s t a t e TASKLET_STATE_ SCHED. , . . , . , , t a s k l e t _ v e c t a s k l e t _ h i _ v e c , .6

. (softirq) (rise), (lasklet) (schedule)? ? , .


TASKLET_SOFTIRQ I_ SOFTIRQ, d o _ s o f t i r q ( ) . . d o _ s o f t i r q ( ) , . , , , d o _ s o f t i r q () , . TASKLET_SOFTIRQ HI_SOFTIRQ , d o _ s o f t i r q () . , t a s k l e t _ a c t i o n () t a s k l e t _ h i _ a c t i o n () - , . , t a s k l e t _ v e c t a s k l e t _ hi_vec . . ( , , ). . , , TASLET_STATE_RUN. , (, ). , TASLET_STATE_RUN, . c o u n t , , . ( c o u n t ), , . , , ( ) c o u n t . . , TASLET_STATE_RUN s t a t e . , , .
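The roles of the state bits and the count field during scheduling and execution can be followed in a small single-threaded model. This is an illustrative sketch only: the demo_ names are hypothetical, one global list replaces the per-processor lists, plain integer operations replace the atomic ones, and a disabled tasklet is simply put back on the list.

```c
#include <stddef.h>

#define DEMO_STATE_SCHED 1UL   /* scheduled, waiting to run */
#define DEMO_STATE_RUN   2UL   /* currently running */

struct demo_tasklet {
    struct demo_tasklet *next;
    unsigned long state;
    int count;                     /* nonzero means disabled */
    void (*func)(unsigned long);
    unsigned long data;
};

static struct demo_tasklet *demo_tasklet_list;

/* Queue a tasklet unless it is already scheduled; return 1 if queued. */
int demo_tasklet_schedule(struct demo_tasklet *t)
{
    if (t->state & DEMO_STATE_SCHED)
        return 0;                  /* already pending: nothing to do */
    t->state |= DEMO_STATE_SCHED;
    t->next = demo_tasklet_list;
    demo_tasklet_list = t;
    return 1;
}

/* Run every enabled tasklet on the list; return how many ran. */
int demo_tasklet_action(void)
{
    struct demo_tasklet *list = demo_tasklet_list;
    int ran = 0;

    demo_tasklet_list = NULL;      /* take the whole list, leave it empty */
    while (list) {
        struct demo_tasklet *t = list;

        list = list->next;
        if (t->count == 0) {
            t->state &= ~DEMO_STATE_SCHED;
            t->state |= DEMO_STATE_RUN;
            t->func(t->data);
            t->state &= ~DEMO_STATE_RUN;
            ran++;
        } else {
            /* disabled: keep it scheduled and retry on a later pass */
            t->next = demo_tasklet_list;
            demo_tasklet_list = t;
        }
    }
    return ran;
}

/* Example handler: increments the int that data points to. */
void demo_bump(unsigned long data)
{
    (*(int *)data)++;
}
```

Scheduling a tasklet that is already pending is a no-op in this model, so the handler runs once per pass no matter how many times it was scheduled in between.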


, . , TASKLET_SOFTIRQ HI_SOFTIRQ. , . , , , . , ( ). .

, . , , .

. , , ( ) : . ( , ) , < l i n u x / i n t e r r u p t s ,h>; DECLARE_TASKLET(name, func, data) DECLARE_TASKLET_DISABLED(name, func, data); s t r u c t _ t a s k l e t _ s t r u c t (name). , func, d a t a . ( c o u n t ) . , c o u n t , , , . count, , , , . . DECLARE_TASKLET(my_tasklet, my_tasklet_handler, dev); . struct tasklet_struct rny_tasklet = { NULL, 0, ATOMIC_INIT(0), tasklet_handler, d e v ) ; m y _ t a s k l e t , . t a s k l e t _ h a n d l e r . d e v - . , s t r u c t t a s k l e t _ s t r u c t * t , . t a s k l e t _ i n i t ( t , t a s k l e t _ h a n d l e r , dev); /* , */ 144 7

- - . void tasklet_handler(unsigned long data) , (). , , . , (, ), . , , . , (. 8, " " 9, " ").

, t a s k l e t _ s c h e d u l e (), t a s k l e t _ s t r u c t . tasklet_schedule(&my_tasklet) ; /* , my_tasklet */ , . , , , , . , , , . , , . t a s k l e t _ d i s a b l e ( ) . , , . t a s k l e t _ d i s a b l e _ n o s y n c ( ) , , , . , , . t a s k l e t _ e n a b l e () . , , DECLARE_TASKLET_DISAB