adopting and commenting the old kernel source code for ... · 298 • adopting and commenting the...

18

Click here to load reader

Upload: doanthuy

Post on 13-Jul-2018

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

Adopting and Commenting the Old Kernel Source Codefor Education

Jiong ZhaoUniversity of TongJi, Shanghai

[email protected]

Trent JarviUniversity of Denver

[email protected]

Abstract

Dissecting older kernels including their prob-lems can be educational and an entertaining re-view of where we have been. In this session,we examine the older Linux kernel version 0.11and discuss some of our findings. The pri-mary reason for selecting this historical kernelis that we have found that the current kernel’svast quantity of source code is far too complexfor hands-on learning purposes. Since the 0.11kernel has only 14,000 lines of code, we caneasily describe it in detail and perform somemeaningful experiments with a runnable sys-tem effienctly. We then examine several as-pects of the kernel including the memory man-agement, stack usage and other aspects of theLinux kernel. Next we explain several aspectsof using Bochs emulator to perform experi-ments with the Linux 0.11 kernel. Finally, wepresent and describe the structure of the Linuxkernel source including thelib/ directory.

1 Introduction

As Linus once said, if one wants to understandthe details of a software project, one should“RTFSC—Read The F**king Source Code.”The kernel is a complete system, the parts re-late to each other to fulfill the functions of a

operating system. There are many hidden de-tails in the system. If one ignores these details,like a blind men trying to size up the elephantby taking a part for the whole, its hard to under-stand the entire system and is difficult to under-stand the design and implementations of an ac-tual system. Although one may obtain some ofthe operating theory through reading classicalbooks like the “The design of Unix operatingsystem,” [4] the composition and internal rela-tionships in an operating system are not easy tocomprehend. Andrew Tanenbaum, the authorof MINIX[1], once said in his book, “teachingonly theory leaves the student with a lopsidedview of what an operating system is really like.”and “Subjects that really are important, such asI/O and file systems, are generally neglectedbecause there is little theory about them.” Asa result, one may not know the tricks involvedin implementing a real operating system. Onlyafter reading the entire source code of a oper-ating system, may one get a feeling of suddenenlightened about the kernel.

In 1991 Linus made a similar statements[5] af-ter distributing kernel version 0.03, “TheGNUkernel (Hurd) will be free, but is currently notready, and will be too big to understand andlearn.” Likewise, the current Linux kernel istoo large to easily understand. Due to the smallamount of code (only 14,000 lines) as shownin Figure 1, the usability and the consistency

• 297 •

Page 2: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

298 • Adopting and Commenting the Old Kernel Source Code for Education

with the current kernel, it is feasible to chooseLinux 0.11 kernel for students to learn and per-form experiments. The features of the 0.11 ker-nel are so limited, it doesn’t even contain jobcontrol or virtual memory swapping. It can,however, still be run as a complete operatingsystem. As with an introductory book on oper-ating systems, we need not deal with the morecomplicated components such asVFS, ext3,networking and more comprehensive memorymanagement systems in a modern kernel. Asstudents understand of the main concepts con-cerning how an operating system is generallyimplemented, they can learn to understand theadvanced parts for themselves. Thus, both theteaching and learning become more efficientand consume considerably less time. The lowerbarrier to entry for learning can even stimulatemany young people to take part in and involvein the Linux activities.

Comparison of total line counts of Linux kernels

V1.2.13

V1.0

V1.1.52

V0.01

V0.11V0.12

V0.95 V0.96a

V0.97 V0.98

V0.99

V2.0.38

V2.2.20

V2.4.17 V2.6.0

Y axis unit: 1000 Lines

Figure 1: Lines of code in various kernel ver-sions

From teaching experience and student feed-back, we found the most difficult part of study-ing the 0.11 kernel is the memory management.Therefore, in the following sections we mainlydeal with how the 0.11 kernel manages and usesmemory in the protected mode of the IntelIA-32 processor along with the different kinds ofstacks used during the kernel initialization ofeach task.

2 Linux Kernel Architecture

The Linux kernel is composed of five mod-ules: task scheduling, memory management,file system, interprocess communication (IPC)and network. The task scheduling module is re-sponsible for controlling the usage of the pro-cessor for all tasks in the system. The strat-egy used for scheduling is to provide reason-able and fair usage between all tasks in the sys-tem while at the same time insuring the pro-cessing of hardware operations. The memorymanagement module is used to insure that alltasks can share the main memory on the ma-chine and provide the support for virtual mem-ory mechanisms. The file system module isused to support the driving of and storage inperipheral devices. Virtual file system moduleshide the various differences in details of thehardware devices by providing a universal fileinterface for peripheral storage equipment andproviding support for multiple formats of filesystems. TheIPC module is used to providethe means for exchanging messages betweenprocesses. The network interface module pro-vides access to multiple communication stan-dards and supports various types of networkhardware.

The relationship between these modules is il-lustrated in Figure 2. The lines between themindicates the dependences of each other. Thedashed lines and dashed line box indicate thepart not implemented in Linux 0.1x.

The figure shows the scheduling module rela-tionship with all the other modules in the kernelsince they all depend on the schedules providedto suspend and restart their tasks. Generally, amodule may hang when waiting for hardwareoperations, and continue running after the hard-ware operation finishes. The other three mod-ules have like relationships with the schedulemodule for similar reasons.

Page 3: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

2005 Linux Symposium • 299

Figure 2: The relationship between Linux ker-nel modules

The remaining modules have implicit depen-dences with each other. The scheduling subsys-tem needs memory management to adjust thephysical memory space used by each task. TheIPC subsystem requires the memory manage-ment module to support shared memory com-munication mechanisms. Virtual file systemscan also use the network interface to supportthe network file system (NFS). The memorymanagement subsystem may also use the filesystem to support the swapping of memory datablocks.

From the monolithic model structure, we canillustrate the main kernel modules in Figure 3based on the structure of the Linux 0.11 kernelsource code.

3 Memory Usage

In this section, we first describe the usage ofphysical memory in Linux 0.1x kernel. Thenwe explain the memorysegmentation, paging,multitasking and theprotection mechanisms.Finally, we summarize the relationship betweenvirtual, linear, and physical address for the codeand data in the kernel and for each task.

Figure 3: Kernel structure framework

3.1 Physical Memory

In order to use the physical memory of the ma-chine efficiently with Linux 0.1x kernel, thememory is divided into several areas as shownin Figure 4.

Figure 4: The regions of physical memory

As shown in Figure 4, the kernel code anddata occupies the first portion of the physi-cal memory. This is followed by the cacheused for block devices such as hard disks andfloppy drives eliminating the memory spaceused by the adapters andROM BIOS. When a

Page 4: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

300 • Adopting and Commenting the Old Kernel Source Code for Education

task needs data from a block device, it will befirst read into the cache area from the block de-vice. When a task needs to output the data to ablock device, the data is put into the cache areafirst and then is written into the block deviceby the hardware driver in due time. The lastpart of the physical memory is the main areaused dynamically from programs. When kernelcode needs a free memory page, it also needsto make a request from the memory manage-ment subsystem. For a system configured withvirtual RAM disksin physical memory, spacemust be reserved in memory.

Physical memory is normally managed by theprocessor’s memory management mechanismsto provide an efficient means for using the sys-tem resources. The Intel 80X86 CPU providestwo memory management mechanisms: Seg-mentation and paging. The paging mechanismis optional and its use is determined by thesystem programmer. The Linux operating sys-tem uses both memory segmentation and pag-ing mechanism approaches for flexibility andefficiency of memory usage.

3.2 Memory address space

To perform address mapping in the Linux ker-nel, we must first explain the three differentaddress concepts used invirtual or logical ad-dress space, the CPUlinear address space, andthe actualphysicaladdress space. Thevirtualaddresses used in virtual address space are ad-dresses composed of thesegment selectorandoffset in the segment generated by program.Since the two part address can not be used toaccess physical memory directly, this addressis referred to as a virtual address and must useat least one of the address translation mecha-nisms provided by CPU to map into the phys-ical memory space. The virtual address spaceis composed of theglobal address spacead-dressed by the descriptors in global descriptor

table (GDT) and thelocal address spacead-dressed by the local descriptor table (LDT). Theindex part of asegment selectorhas thirteen bitsand one bit for the table index. The Intel 80X86processor can then provide a total of 16384 se-lectors so it can addresses a maximum of 64Tof virtual address space[2]. The logical addressis the offset portion of a virtual address. Some-times this is also referred to as virtual address.

Linear address is the middle portion of addresstranslation from virtual to physical addresses.This address space is addressable by the pro-cessor. A program can use alogical addressor offset in a segment and the base address ofthe segment to get a linear address. Ifpagingis enabled, the linear address can be translatedto produced a physical address. If thepagingisdisabled, then the linear address is actually thesame as physical address. The linear addressspace provided by Intel 80386 is 4 GB.

Physical addressis the address on the proces-sor’s external address bus, and is the final resultof address translation.

The other concept that we examine isvirtualmemory. Virtual memory allows the computerto appear to have more memory than it actu-ally has. This permits programmers to write aprogram larger than the physical memory thatthe system has and allows large projects to beimplemented on a computer with limited re-sources.

3.3 Segmentation and paging mechanisms

In a segmented memory system, the logical ad-dress of a program is automatically mapped ortranslated into the middle 4 GB linear addressspace. Each memory reference refers to thememory in a segment. When programs refer-ence a memory address, a linear address is pro-duced by adding the segment base address with

Page 5: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

2005 Linux Symposium • 301

Figure 5: The translation between virtual orlogical, linear and physical address

the logical address visible to the programmer.If pagingis not enabled, at this time, the linearaddress is sent to the external address bus of theprocessor to access the corresponding physicaladdress directly.

If pagingis enabled on the processor, the linearaddress will be translated by thepagingmech-anism to get the final physical correspondingphysical address. Similar to the segmentation,paging allow us to relocate each memory ref-erence. The basic theory of paging is that theprocessor divides the whole linear space into

pages of 4 KB. When programs request mem-ory, the processor allocates memory in pagesfor the program.

Since Linux 0.1x kernel uses only onepage di-rectory, the mapping function from linear tophysical space is same for the kernel and pro-cesses. To prevent tasks from interfering witheach other and the kernel, they have to occupydifferent ranges in the linear address space. TheLinux 0.1x kernel allocates 64MB of linearspace for each task in the system, the systemcan therefor hold at most 64 simultaneous tasks(64MB * 64 = 4G) before occupying the entireLinear address space as illustrated in Figure 6.

Figure 6: The usage of linear address space inthe Linux 0.1x kernel

3.4 The relationship between virtual, lin-ear and physical address

We have briefly described the memory segmen-tation and paging mechanisms. Now we willexamine the relationship between the kerneland tasks in virtual, linear and physical addressspace. Since the creation oftasks 0and1 arespecial, we’ll explain them separately.

3.4.1 The address range of kernel

For the code and data in the Linux 0.1x ker-nel, the initialization inhead.s has alreadyset the limit for the kernel and data segments to

Page 6: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

302 • Adopting and Commenting the Old Kernel Source Code for Education

be 16MB in size. These two segments over-lap at the same linear address space startingfrom address 0. Thepage directoryandpagetable for kernel space are mapped to 0-16MBin physical memory (the same address range inboth spaces). This is all of the memory thatthe system contains. Since one page table canmanage or map 4MB, the kernel code and dataoccupies four entries in thepage directory. Inother words, there are four secondary page ta-bles with 4MB each. As a result, the addressin the kernel segment is the same in the physi-cal memory. The relationship of these three ad-dress spaces in the kernel is depicted in Figure7.

Figure 7: The relationship of the three addressspaces in a 0.1x kernel

As seen in Figure 7, the Linux 0.1x kernelcan manage at most 16MB of physical mem-ory in 4096 page frames. As explained ear-lier, we know that: (1) the address range ofkernel code and data segments are the same asin the physical memory space. This configura-tion can greatly reduce the initialization oper-ations the kernel must perform. (2)GDT and

Interrupt Descriptor Table (IDT) are in the ker-nel data segment, thus they are located in thesame address in both address spaces. In theexecution of code insetup.s in real mode,we have setup both temporaryGDT andIDT atonce. These are required before entering pro-tected mode. Since they are located by physicaladdress0x90200 and this will be overlappedand used for block device cache, we have torecreateGDT and IDT in head.s after en-tering protected mode. The segment selectorsneed to be reloaded too. Since the locations ofthe two tables do not change after entering pro-tected mode, we do not need to move or recre-ate them again. (3) All tasks excepttask 0needadditional physical memory pages in differentlinear address space locations. They need thememory management module to dynamicallysetup their own mapping entries in thepage di-rectoryandpage table. Although the code andstatic data oftask 1are located in kernel space,we need to obtain new pages to prevent interfer-ence withtask 0. As a result,task 1also needsits own page entries.

While the default manageable physical mem-ory is 16MB, a system need not contain 16MBmemory. A machine with only 4MB or even2MB could run Linux 0.1x smoothly. For a ma-chine with only 4MB, the linear address range4MB to 16MB will be mapped to nonexistentphysical space by the kernel. This does not dis-rupt or crash the kernel. Since the kernel knowsthe exact physical memory size from the initial-ization stage, no pages will be mapped into thisnonexistent physical space. In addition, sincethe kernel has limited the maximum physicalmemory to be 16MB at boot time (inmain()corresponding tostartkernel() ), memoryover 16MB will be left unused. By addingsome page entries for the kernel and chang-ing some of the kernel source, we certainly canmake Linux 0.1x support more physical mem-ory.

Page 7: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

2005 Linux Symposium • 303

3.4.2 The address space relationship fortask 0

Task 0is artificially created or configured andrun by using a special method. The limits of itscode and data segments are set to the 640KB in-cluded in the kernel address space. Nowtask 0can use the kernel page entries directly withoutthe need for creating new entries for itself. Asa result, its segments are overlapped in linearaddress space too. The three space relationshipis shown in Figure 8.

Figure 8: The relationship of three addressspaces for task 0

As task 0 is totally contained in the kernelspace, there is no need to allocate pages fromthe main memory area for it. The kernel stackand the user stack fortask 0are included thekernel space.Task 0still has read and writerights in the stacks since the page entries usedby the kernel space have been initialized to bereadable and writable with user privileges. Inother words, the flags in page entries are set asU/S=1, R/W=1.

3.4.3 The address space relationship fortask 1

Similar totask 0, task 1is also a special case inwhich the code and data segment are includedin kernel module. The main difference is thatwhen forkingtask 1, one free page is allocatedfrom the main memory area to duplicate andstoretask 0’s page table entries fortask 1. Asa result,task 1has its ownpage tableentries inthepage directoryand is located at range from64MB to 128MB (actually 64MB to 64MB +640KB) in linear address space. One additionalpage is allocated for task 1 to store itstaskstructure (PCB)and is used as its kernel modestack. The task’sTask State Segment (TSS)isalso contained in task’s structure as illustratedin Figure 9.

Figure 9: The relationship of the three addressspaces in task 1

Task 1and task 0will share their user stackuser_stack[] (refer tokernel/sched.c , lines 67-72). Thus, the stack space should be‘ ‘clean” beforetask 1uses it to ensure that there

Page 8: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

304 • Adopting and Commenting the Old Kernel Source Code for Education

is no unnecessary data on the stack. When fork-ing task 1, the user stack is shared betweentask0 andtask 1. However whentask 1starts run-ning, the stack operating intask 1would causethe processor to produce a page fault becausethe page entries have been modified to be readonly. The memory management module willtherefor need allocate a free page fortask 1’sstack.

3.4.4 The address space relationship forother tasks

For task 2and higher, the parent istask 1or theinit process. As described earlier, Linux 0.1xcan have 64 tasks running synchronously in thesystem. Now we will detail the address spaceusage for these additional tasks.

Beginning withtask 2, if we designatenr asthe task number, the starting location fortasknr will be at nr * 64MB in linear addressspace.Task 2, for example, begins at address 2* 64MB = 128MB in the linear address space,and the limits of code and data segments are setto 64MB. As a result, the address range occu-pied by task 2is from 128MB to 192MB, andhas 64MB/4MB = 16 entries in the page direc-tory. The code and data segments both mapto the same range in the linear address space.Thus they also overlap with the same addressrange as illustrated in Figure 10.

After task 2has forked, it will call the func-tion execve() to run a shell program such asbash. Just after the creation oftask 2and be-fore callexecve() , task 2is similar totask 1in the three address space relationship for codeand data segments except the address range oc-cupied in linear address space has the rangefrom 128MB to 192MB. Whentask 2’scodecalls execve() to load and run a shell pro-gram, the page entries are copied fromtask 1and corresponding memory pages are freed and

Figure 10: The relationship of the three addressspaces in tasks beginning with task 2

new page entries are set for the shell program.Figure 10 shows this address space relation-ship. The code and data segment fortask 1arereplaced with that of the shell program, and onephysical memory page is allocated for the codeof the shell program. Notice that although thekernel has allocated 64MB linear space fortask2, the operation of allocating actual physicalmemory pages for code and data segments ofthe shell program is delayed until the programis running. This delayed allocation is called de-mand paging.

Beginning with kernel version 0.99.x, the usageof memory address space changed. Each taskcan use the entire 4G linear space by changingthe page directory for each tasks as illustratedin Figure 11. There are even more changes arein current kernels.

Page 9: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

2005 Linux Symposium • 305

Figure 11: The relationship of the three addressspace for tasks in newer kernels

4 Stack Usage

This section describes several different meth-ods used during the processing of kernel boot-ing and during normal task stack operations.Linux 0.1x kernel uses four different kinds ofstacks: the temporary stack used for systembooting and initialization under real addressmode; The kernel initialization stack used afterthe kernel enters protected mode, and the userstack fortask 0after moving into task 0; Thekernel stack of each task used when running inthe kernel and the user stacks for each task ex-cept fortasks 0 and 1.

There are two main reasons for using four dif-ferent stacks (two used only temporarily forbooting) in Linux. First, when entering pro-tected from real mode, the addressing methodused by the processor has changed. Thus thekernel needs to rearrange the stack area. In ad-dition, to solve the protection problems broughtby the new privilege level on processor, weneed to use different stacks for kernel code at

level 0 and for user code at level 3 respectively.When a task runs in the kernel, it uses the ker-nel mode stack pointed by the values inss0andesp0 fields of itsTSSand stores the task’suser stack pointer in this stack. When the con-trol returns to the user code or to level 3, theuser stack pointer will be popped out, and thetask continues to use the user stack.

4.1 Initialization period

When theROM BIOScode boots and loadsthe bootsect into memory at physical address0x7C00 , no stack is used until it is movedto the location0x9000:0 . The stack is thenset at0x9000:0xff00 . (refer to line 61–62 in boot/bootsect.s ). After control istransferred tosetup.s , the stack remains un-changed.

When control is transferred tohead.s , theprocessor runs in protected mode. At this time,the stack is setup at the location ofuser_stack[] in the kernel code segment (line 31in head.s ). The kernel reserves one 4 KBpage for the stack defined at line 67–72 insched.c as illustrated in Figure 12.

This stack area is still used after the controltransfers intoinit/main.c until the execu-tion of move_to_user_mode() to hand thecontrol over totask 0. The above stack is thenused as a user stack fortask 0.

4.2 Task stacks

For the processor privilege levels 0 and 3 usedin Linux, each task has two stacks: kernel modestack and user mode stack used to run kernelcode and user code respectively. Other than theprivilege levels, the main difference is that thesize of kernel mode stack is smaller than thatof the user mode stack. The former is located

Page 10: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

306 • Adopting and Commenting the Old Kernel Source Code for Education

Figure 12: The stack used for kernel code afterentering protected mode

at the bottom in a page coexisting with task’sstructure, and no more than 4KB in size. Thelater can grow down to nearly 64MB in userspace.

As described, each task has its own 64MB log-ical or linear address space except fortask 0and1. When a task was created, the bottom ofits user stack is located close to the end of the64MB space. The top portion of the user spacecontains additional environmental parametersand command line parameters in a backwardsorientation, and then the user stack as illus-trated in Figure 13.

Task code at privilege level 3 uses this stack allof the time. Its corresponding physical memorypage is mapped by paging management codein the kernel. Since Linux utilizes thecopy-on-write[3] method, both the parent and childprocess share the same user stack memory untilone of them perform a write operation on the

Figure 13: User stack in task’s logical space

stack. Then the memory manager will allocateand duplicate the stack page for the task.

Similar to the user stack, each task has its ownkernel mode stackused when operating in thekernel code. This stack is located in the mem-ory to pointed by the values inss0 , esp0fields in task’sTSS. ss0 is the stack segmentselector like thedata selectorin the kernel.esp0 indicates the stack bottom. Whenevercontrol transfers to the kernel code from usercode, the kernel mode stack for the task alwaysstarts fromss0:esp0 , giving the kernel codean empty stack space. The bottom of a task’skernel stack is located at the end of a mem-ory page where the task’s data structure begins.This arrangement is setup by making the privi-lege level 0 stack pointer inTSSpoint to the endof the page occupied by the task’s data struc-ture when forking a new task. Refer to line 93in kernel/fork.c as below:

p->tss.esp0 = PAGE_SIZE+(long)p;p->tss.ss0 = 0x10;

p is the pointer of the new task structure,tssis the structure of the task status segment. Thekernel request a free page to store the task struc-ture pointed byp. The tss structure is a field inthe task structure. The value oftss.ss0 isset to the selector of kernel data segment andthe tss.esp0 is set to point to the end of thepage as illustrated in Figure 14.

As a matter of fact,tss.esp0 points to thebyte outside of the page as depicted in the fig-

Page 11: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

2005 Linux Symposium • 307

Figure 14: The kernel mode stack of a task

ure. This is because the Intel processor de-creases the pointer before storing a value on thestack.

4.3 The stacks used by task 0 and task 1

Both task 0or idle task andtask 1or init taskhave some special properties. Althoughtask 0and task 1have the same code and data seg-ment and 640KB limits, they are mapped intodifferent ranges in linear address space. Thecode and data segments oftask 0begins at ad-dress 0, andtask 1 begins at address 64MBin the linear space. They are both mappedinto the same physical address range from 0to 640KB in kernel space. After calling thefunction move_to_user_mode() , the ker-nel mode stacks oftask 0andtask 1are locatedat the end of the page used for storing theirtask structures. The user stack oftask 0is thesame stack originally used after entering pro-tected mode; the space foruser_stack[]array defined insched.c program. Sincetask1 copiestask 0’s user stack when forking, theyshare the same stack space in physical memory.When task 1starts running, however, a pagefault exception will occur whentask 1writesto its user stack because the page entries fortask 1have been initialized as read-only. Atthis moment, the kernel will allocate a free page

in main memory area for the stack oftask 1inthe exception handler, and map it to the loca-tion of task 1’s user stack in the linear space.From now on,task 1has its own separate userstack page. As a result, the user stack fortask0 should be “clean” beforetask 1uses the userstack to ensure that the page of stack duplica-tion does not contain useless data fortask 1.

The kernel mode stack fortask 0is initializedin its static data structure. Then its user stack isset up by manipulating the contents of the stackoriginally used after entering protected modeand emulating the interrupt return operation us-ing IRET instruction as illustrated in Figure 15.

031

Figure 15: Stack contents while returning fromprivilege level 0 to 3

As we know, changing the privilege level willchange the stack and the old stack pointerswill be stored onto the new stack. To emu-late this case, we first push thetask 0’s stackpointer onto the stack, then the pointer of thenext instruction intask 0. Finally we run theIRET instruction. This causes the privilegelevel change and control to be transferred totask 0. In the Figure 15, the oldSSfield storesthe data selector ofLDT for task 0(0x17) andthe oldESPfield value is not changed since thestack will be used as the user stack fortask 0.The oldCSfield stores the code selector (0x0f)for task 0. The oldEIP points to the next in-struction to be executed. After the manipula-tion, aIRET instruction switches the privileges

Page 12: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

308 • Adopting and Commenting the Old Kernel Source Code for Education

from level 0 to level 3. The kernel begins run-ning in task 0.

4.4 Switch between kernel mode stack anduser mode stack for tasks

In the Linux 0.1x kernel, all interrupts and ex-ceptions handlers are in mode 0 so they belongto the operating system. If an interrupt or ex-ception occurs while the system is running inuser mode, then the interrupt or exception willcause a privilege level change from level 3 tolevel 0. The stack is then switched from theuser mode stack to the kernel mode stack of thetask. The processor will obtain the kernel stackpointersss0 and esp0 from the task’sTSSand store the current user stack pointers intothe task’s kernel stack. After that, the processorpushes the contents of the currentEFLAGSreg-ister and the next instruction pointers onto thestack. Finally, it runs the interrupt or exceptionhandler.

The kernelsystem callis trapped by using asoftware interrupt. Thus anINT 0x80 willcause control to be transferred to the kernelcode. Now the kernel code uses the currenttask’s kernel mode stack. Since the privilegelevel has been changed from level 3 to level 0,the user stack pointer is pushed onto the kernelmode stack, as illustrated in Figure 16.

If a task is running in the kernel code, thenan interrupt or exception never causes a stackswitch operation. Since we are already in thekernel, an interrupt or exception will nevercause a privilege level change. We are using thekernel mode stack of the current task. As a re-sult, the processor simply pushes theEFLAGSand the return pointer onto the stack and startsrunning the interrupt or exception handler.

Figure 16: Switching between the kernel stackand user stack for a task

5 Kernel Source Tree

Linux 0.11 kernel is simplistic so the sourcetree can be listed and described clearly. Sincethe 0.11 kernel source tree only has 14 directo-ries and 102 source files it is easy to find spe-cific files in comparison to searching the muchlarger current kernel trees. The mainlinux/directory contains only one Makefile for build-ing. From the contents of the Makefile we cansee how the kernel image file is built as illus-trated in Figure 17.

Figure 17: Kernel layout and building

There are three assembly files in theboot/directory: bootsect.s , setup.s , andhead.s . These three files had correspondingfiles in the more recent kernel source trees un-til 2.6.x kernel. Thefs/ directory containssource files for implementing aMINIX version

Page 13: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

2005 Linux Symposium • 309

1.0 file system. This file system is a clone of thetraditional UN*X file system and is suitable forsomeone learning to understand how to imple-ment a usable file system. Figure 18 depicts therelationship of each files in thefs/ directory.

Figure 18: File relationships in fs/ directory

The fs/ files can be divided into four types.The first is the block cache manager filebuffer.c . The second is the files concern-ing with low level data operation files suchinode.c . The third is files used to processdata related to char, block devices and regularfiles. The fourth is files used to execute pro-grams or files that are interfaces to user pro-grams.

The kernel/ directory contains three kindsof files as depicted in Figure 19.

The first type is files which deal with hardwareinterrupts and processor exceptions. The sec-ond type is files manipulating system calls from

Figure 19: Files in the kernel/ directory

user programs. The third category is files im-plementing general functions such as schedul-ing and printing messages from the kernel.

Block device drivers for hard disks, floppydisks and ram disks reside in a subdirectoryblk_drv/ in the kernel/ , thus the Linux0.11 kernel supports only three classical blockdevices. Because Linux evolved from a ter-minal emulation program, the serial terminaldriver is also included in this early kernel inaddition to the necessary console character de-vice. Thus, the 0.11 kernel contains at least twotypes of char device drivers as illustrated in Fig-ure 20.

The remaining directories in the kernel sourcetree include, init, mm, tools, andmath . The include/ contains the head filesused by the other kernel source files.init/contains only the kernel startup filemain.c ,in which, all kernel modules are initialized andthe operating system is prepared for use. Themm/ directory contains two memory manage-ment files. They are used to allocate and freepages for the kernel and user programs. Asmentioned, the mm in 0.11 kernel uses demandpaging technology. Themath/ directory onlycontains math source code stubs as 387 emula-

Page 14: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

310 • Adopting and Commenting the Old Kernel Source Code for Education

Figure 20: Character devices in Linux 0.11 ker-nel

tion did not appear until the 0.12 kernel.

6 Experiments with the 0.1x kernel

To facilitate understanding of the Linux 0.11kernel implementation, we have rebuilt arunnable Linux 0.11 system, and designed sev-eral experiments to watch the kernel internalactivities using theBochs PC emulator. Bochsis excellent for debugging operating systems.TheBochssoftware package contains an inter-nal debugging tool, which we can use to ob-serve the dynamic data structures in the kerneland examine the contents of each register on theprocessor.

It is an interesting exercise to install the Linux0.11 system from scratch. It is a good learningexperience to build a root file system image fileunder Bochs.

Modifying and compiling the kernel sourcecode are certainly the most important experi-ments for learning about operating systems. Tofacilitate the process, we provide two environ-ments in which, one can easily compile the ker-nel. One is the originalGNU gcc environment

under Linux 0.11 system in Bochs. The otheris for more recent Linux systems such asRedHat 9 or Fedora. In the former environment,the 0.11 kernel source code needs no modifi-cations to successfully compile. For the laterenvironment one needs to modify a few linesof code to correct syntax errors. For peoplefamiliar with MASM andVC environment un-der windows, we even provide modified 0.11kernel source code that can compile. Offer-ing source code compatible with multiple envi-ronments and providing forums for discussionhelps popularize linux and the linux communitywith new people interested in learning aboutoperating systems:-)

7 Summary

From observing people taking operating systemcourses with the old Linux kernel, we foundthat almost all the students were highly inter-ested in the course. Some of them even startedprogramming their own operating systems.

The 0.11 kernel contains only the basic featuresthat an operating system must have. As a result,there are many important features not imple-mented in 0.11 kernel. We now plan to adopteither the 0.12 or 0.98 kernel for teaching pur-poses to include job control, virtualFS, virtualconsole and even network functions. Due totime limitations in the course, several simpli-fications and careful selection of material willbe needed.

References

[1] Albert S. Woodhull AndrewS. Tanenbaum.OPERATING SYSTEMS:Design and Implementation.Prentice-Hall, Inc., 1997.

Page 15: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

2005 Linux Symposium • 311

[2] Patrick P. Gelsinger John H. Crawford.Programming the 80386. SYBEX Inc.,1987.

[3] Robert Love.Linux Kernel Development.Sams Inc., 2004.

[4] M.J.Bach.The Design of Unix OperatingSystem. Prentice-Hall, Inc., 1986.

[5] Linus Torvalds. LINUX – a free unix-386kernel. October 1991.

Page 16: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

312 • Adopting and Commenting the Old Kernel Source Code for Education

Page 17: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

Proceedings of theLinux Symposium

Volume Two

July 20nd–23th, 2005Ottawa, Ontario

Canada

Page 18: Adopting and Commenting the Old Kernel Source Code for ... · 298 • Adopting and Commenting the Old Kernel Source Code for Education with the current kernel, it is feasible to choose

Conference Organizers

Andrew J. Hutton,Steamballoon, Inc.C. Craig Ross,Linux SymposiumStephanie Donovan,Linux Symposium

Review Committee

Gerrit Huizenga,IBMMatthew Wilcox,HPDirk Hohndel,IntelVal Henson,Sun MicrosystemsJamal Hadi Salimi,ZnyxMatt Domsch,DellAndrew Hutton,Steamballoon, Inc.

Proceedings Formatting Team

John W. Lockhart,Red Hat, Inc.

Authors retain copyright to all submitted papers, but have granted unlimited redistribution rightsto all as a condition of submission.