linux tutorial 8

Upload: ssanagav

Post on 11-Oct-2015


DESCRIPTION

Gary Nutt

TRANSCRIPT

  • 5/21/2018 Linux Tutorial 8


    Synchronization Mechanisms

    Exercise Goal: You will add a new synchronization mechanism to the kernel. This will require you to create a design for a new mechanism and then to add its implementation to the kernel.

    Introduction

    Every modern kernel provides one or more synchronization primitives to enable application programmers to coordinate the behavior of multithreaded software. Classic UNIX systems did not provide mechanisms for fine-grained multithread operation; the primary means for synchronizing processes was pipes (and files). System V UNIX provided semaphores as part of the shared memory implementation; these primitives are defined in POSIX and hence are supported by Linux.

    Blocking a Task

    As a task (a process executing in kernel space) runs, it might have to wait for some condition to become true before it can continue. For example, it might wait for an I/O operation to complete (an interrupt to occur). When a task decides to block itself, it changes its own state by changing the current->state field in its descriptor from TASK_RUNNING to either TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE. Then it enqueues itself on a list of blocked processes and invokes the scheduler to run a different task. A task in the TASK_INTERRUPTIBLE state can be reactivated by a signal or an interrupt, whereas a task in the TASK_UNINTERRUPTIBLE state can be reawakened only if an interrupt occurs and changes something in the machine state so that the task can run again. For example, when a task starts an I/O operation, the device driver code places the task in the TASK_UNINTERRUPTIBLE state. The task can be awakened only by the interrupt that indicates that the I/O operation has completed.

    As mentioned, when a task blocks itself, the kernel code links the process into a list of tasks waiting for the given condition to become true. Thus you can think of each condition on which a task could block as defining a list of tasks. Whenever the corresponding condition becomes true, one or more (depending on the semantics of the condition) of the blocked tasks in the list become unblocked.

    For example, the classic Dijkstra semaphore (see Chapter 8 in Nutt [2000]) is a form of condition that can block a task. A binary semaphore is an unsigned integer variable that has an initial value of 1. For a semaphore, S, its value can be changed only by one of two functions, P(S) or V(S). Think of a semaphore as being like a C++ class or an abstract data type. That is, it has private data to represent the unsigned integer value and two public functions on its interface. The binary semaphore is the simplest form of semaphore; it can take on only values of 0 or 1. By contrast, general semaphores can take on arbitrary non-negative integer values. For a binary semaphore, when P(S) is called and S has the value 1, then S is decremented (changed to 0) and the calling process continues to execute. If S has the value 0, then the calling process is blocked until S is incremented to 1. The V(S) operation increments the semaphore.

    The paradigm for using a semaphore is for a process to call P(S) before it begins to execute a critical section, code that could cause a race condition of the type described in Exercise 6. When the process completes executing the critical section code, it calls V(S) to increment the semaphore. Note that if one process is in a critical section (S will have been set to 0), then another process that attempts to enter its critical section by calling P(S) will be blocked until the first process executes V(S). The P() and V() functions are themselves critical sections, meaning that when they are being executed, they cannot be interrupted.


    You can implement a semaphore by creating a kernel abstract data type that has an unsigned integer variable for the value of the semaphore and the two publicly accessible functions, P() and V(). You also can keep track of the tasks that are blocked on the semaphore by putting each blocked task in a list. If the value of the semaphore is 1, then the list is empty. If one or more tasks are waiting for S, then each of these tasks is in the abstract data type's internal task list. When V(S) is called, exactly one task is removed from the list, thereby leaving the others still blocked. If no tasks are waiting when V(S) is called, then the value of the semaphore is changed to 1 with no further action. More details on how this works are provided after a discussion of kernel wait queues.
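The bookkeeping described above can be sketched in user space. What follows is a minimal single-threaded model, not kernel code: integer task IDs stand in for task descriptors, a fixed-size FIFO array stands in for the blocked-task list, and the names bsem, bsem_init, bsem_P, bsem_V, and MAX_WAITERS are hypothetical. A return value of 0 from bsem_P() models "the caller would block"; a real implementation would also have to make both operations uninterruptible.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 8   /* hypothetical capacity chosen for this sketch */

/* Binary semaphore as an abstract data type: a value (0 or 1) plus a
 * FIFO list of the IDs of the tasks currently blocked on it. */
struct bsem {
    unsigned int value;
    int waiters[MAX_WAITERS];
    size_t head;    /* index of the oldest waiter */
    size_t count;   /* number of blocked tasks    */
};

void bsem_init(struct bsem *s)
{
    s->value = 1;
    s->head = 0;
    s->count = 0;
}

/* P(S): returns 1 if the caller may proceed, 0 if it was queued (blocked). */
int bsem_P(struct bsem *s, int task_id)
{
    if (s->value == 1) {
        s->value = 0;
        return 1;
    }
    assert(s->count < MAX_WAITERS);
    s->waiters[(s->head + s->count) % MAX_WAITERS] = task_id;
    s->count++;
    return 0;
}

/* V(S): returns the ID of the single task that is unblocked,
 * or -1 if no task was waiting (the value then becomes 1). */
int bsem_V(struct bsem *s)
{
    int id;

    if (s->count == 0) {
        s->value = 1;
        return -1;
    }
    id = s->waiters[s->head];
    s->head = (s->head + 1) % MAX_WAITERS;
    s->count--;
    return id;  /* value stays 0: the released task now holds the semaphore */
}
```

In this model, if task 1 calls bsem_P() first and tasks 2 and 3 follow, the first V releases only task 2 and leaves task 3 queued; the value returns to 1 only when V is called with an empty list.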

    Wait Queues

    The situation whereby a task/process must wait for some condition to become true occurs in many places in the kernel. This is another way of saying that the kernel has many critical sections in it. Therefore Linux provides a single, fundamental abstract data structure that can be used to maintain lists of tasks waiting on a condition, called wait queues. A wait queue is a circular list of elements in which each element contains a pointer to a struct task_struct process descriptor. These list implementations are not private; the kernel is implemented as a single monolithic block of code rather than as a collection of modules. However, the intent is that tasks be inserted into a wait queue only by using __add_wait_queue() and removed by using __remove_wait_queue().

    Following are the relevant definitions for a wait queue, taken from include/linux/wait.h for the data structure and macro and from include/linux/sched.h for the functions.

    #define WAIT_QUEUE_HEAD(x) ((struct wait_queue *)((x)-1))

    struct wait_queue {
        struct task_struct * task;
        struct wait_queue * next;
    };

    extern inline void __add_wait_queue(
        struct wait_queue ** p, struct wait_queue * wait)
    {
        wait->next = *p ? : WAIT_QUEUE_HEAD(p);
        *p = wait;
    }


    extern inline void __remove_wait_queue(
        struct wait_queue ** p, struct wait_queue * wait)
    {
        struct wait_queue * next = wait->next;
        struct wait_queue * head = next;
        struct wait_queue * tmp;

        while ((tmp = head->next) != wait) {
            head = tmp;
        }
        head->next = next;
    }

    This code is very compact. It does no error checking and uses a minimum number of instructions to maintain the list. The __add_wait_queue() code simply places an element of type struct wait_queue * at the head of the wait queue (the argument of type struct wait_queue **). To enqueue a task, T, you create a wait queue instance, newInstance, set the task field, newInstance->task, to reference the task, and then call __add_wait_queue(targetWaitQueue, newInstance). The __remove_wait_queue() function works similarly to remove an element from the wait queue. Comments in the source code warn kernel programmers to be very careful about manipulating the lists without using these add and remove functions.
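The same circular-list idea can be tried outside the kernel. The sketch below is a portable user-space variant under stated assumptions: it uses an explicit sentinel node in place of the WAIT_QUEUE_HEAD() pointer trick, an int task ID in place of struct task_struct *, and the hypothetical names wq_node, wq_init, wq_add, wq_remove, and wq_length. Like the kernel code, wq_remove() finds the predecessor by walking around the circle.

```c
#include <assert.h>
#include <stddef.h>

struct wq_node {
    int task_id;            /* stands in for struct task_struct * */
    struct wq_node *next;
};

/* An empty queue is a sentinel whose next pointer refers to itself. */
void wq_init(struct wq_node *head)
{
    head->task_id = -1;
    head->next = head;
}

/* Insert at the head of the queue, as __add_wait_queue() does. */
void wq_add(struct wq_node *head, struct wq_node *wait)
{
    wait->next = head->next;
    head->next = wait;
}

/* Unlink wait. Precondition: wait is currently on some queue. */
void wq_remove(struct wq_node *wait)
{
    struct wq_node *next = wait->next;
    struct wq_node *pred = next;

    assert(next != NULL);
    while (pred->next != wait)   /* walk the circle to the predecessor */
        pred = pred->next;
    pred->next = next;
}

/* Count the elements between the sentinel and itself. */
size_t wq_length(const struct wq_node *head)
{
    size_t n = 0;
    const struct wq_node *it;

    for (it = head->next; it != head; it = it->next)
        n++;
    return n;
}
```

Because the list is singly linked, removal costs a full trip around the circle in the worst case; the kernel code accepts the same cost in exchange for a two-word element.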

    Using Wait Queues

    The code in kernel/sched.c provides another set of functions that use wait queues for common forms of task management. The functions sleep_on() and interruptible_sleep_on() place the calling task on the specified wait queue. The first function sets current->state to TASK_UNINTERRUPTIBLE, and the second sets it to TASK_INTERRUPTIBLE.

    #define SLEEP_ON_VAR \
        unsigned long flags; \
        struct wait_queue wait;

    #define SLEEP_ON_HEAD \
        wait.task = current; \
        write_lock_irqsave(&waitqueue_lock,flags); \
        __add_wait_queue(p, &wait); \
        write_unlock(&waitqueue_lock);

    #define SLEEP_ON_TAIL \
        write_lock_irq(&waitqueue_lock); \
        __remove_wait_queue(p, &wait); \
        write_unlock_irqrestore(&waitqueue_lock,flags);

    void interruptible_sleep_on(struct wait_queue **p)
    {
        SLEEP_ON_VAR
        current->state = TASK_INTERRUPTIBLE;
        SLEEP_ON_HEAD
        schedule();
        SLEEP_ON_TAIL
    }

    void sleep_on(struct wait_queue **p)
    {
        SLEEP_ON_VAR
        current->state = TASK_UNINTERRUPTIBLE;
        SLEEP_ON_HEAD
        schedule();
        SLEEP_ON_TAIL
    }

    These functions are identical, except for the value that they assign to current->state. Notice that they also do not contain many statements (your solution to this exercise will not contain many lines of code either). The SLEEP_ON_HEAD macro adds the task to the wait queue. Each function then calls the scheduler, with the calling task being blocked (not eligible to be scheduled). When some other task awakens the sleeping task with wake_up() or wake_up_interruptible() (discussed next), the newly awakened task removes itself from the wait queue by using the SLEEP_ON_TAIL macro. The context in which the sleep_on() function may be called could allow interruption. Therefore the macros use a lock for the critical section related to manipulating a wait queue.


    Here are the corresponding functions to wake up the task, both based on __wake_up().

    #define wake_up(x) __wake_up((x),TASK_UNINTERRUPTIBLE | \
                                     TASK_INTERRUPTIBLE)
    #define wake_up_interruptible(x) __wake_up((x),TASK_INTERRUPTIBLE)

    void __wake_up(struct wait_queue **q, unsigned int mode)
    {
        struct task_struct *p;
        struct wait_queue *head, *next;

        if (!q)
            goto out;
        /*
         * this is safe to be done before the check because it
         * means no deference, just pointer operations.
         */
        head = WAIT_QUEUE_HEAD(q);

        read_lock(&waitqueue_lock);
        next = *q;
        if (!next)
            goto out_unlock;

        while (next != head) {
            p = next->task;
            next = next->next;
            if (p->state & mode) {
                /*
                 * We can drop the read-lock early if this
                 * is the only/last process.
                 */
                if (next == head) {
                    read_unlock(&waitqueue_lock);
                    wake_up_process(p);
                    goto out;
                }
                wake_up_process(p);
            }
        }
    out_unlock:
        read_unlock(&waitqueue_lock);
    out:
        return;
    }
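The role of the mode argument is easiest to see in isolation. The following is a small user-space model of the scan in __wake_up(), offered as a sketch rather than as kernel code: an array of pointers replaces the wait queue, setting the state to TASK_RUNNING replaces wake_up_process(), and the names sim_task and sim_wake_up are hypothetical. The numeric state values are an assumption chosen so that the two sleeping states are distinct bits that can be combined in a mask.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values: the two sleeping states must be distinct bits. */
#define TASK_RUNNING         0
#define TASK_INTERRUPTIBLE   1
#define TASK_UNINTERRUPTIBLE 2

struct sim_task {
    unsigned int state;
};

/* Wake every queued task whose state matches the mode mask.
 * Returns the number of tasks that were made runnable. */
size_t sim_wake_up(struct sim_task *queue[], size_t n, unsigned int mode)
{
    size_t woken = 0;
    size_t i;

    assert(queue != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (queue[i]->state & mode) {        /* same test as p->state & mode */
            queue[i]->state = TASK_RUNNING;  /* models wake_up_process(p)    */
            woken++;
        }
    }
    return woken;
}
```

Passing TASK_INTERRUPTIBLE alone corresponds to wake_up_interruptible() and leaves uninterruptible sleepers untouched; passing both bits corresponds to wake_up().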

    In Exercise 6, you studied the System V shared memory implementation. In your solution to the exercise problem, you had to change all array references of the form shm_segs[] to a list reference. In the function findkey() in ipc/shm.c, you undoubtedly observed the following code segment.

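    for (id = 0; id <= max_shmid; id++) {
        while ((shp = shm_segs[id]) == IPC_NOID)
            sleep_on (&shm_lock);
        if (shp == IPC_UNUSED)
            continue;
        if (key == shp->u.shm_perm.key)
            return id;
    }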

    And in newseg() in ipc/shm.c, you saw the following complementary code segment.

    found:
        shp = (struct shmid_kernel *) kmalloc ();
        if (!shp) {
            shm_segs[id] = (struct shmid_kernel *) IPC_UNUSED;
            wake_up (&shm_lock);
            return -ENOMEM;
        }

    Now you can understand what is going on in this code. A race condition can occur when processes attempt to find a new key and is detectable by the IPC_NOID value in the shm_segs[id] variable. If the code detects the race, then the loser of the race puts itself to sleep on the shm_lock wait queue, to be awakened in the newseg() function by the wake_up(&shm_lock) function call.


    Problem Statement

    Part A

    Design and implement a new kernel synchronization primitive that will allow multiple processes to block on an event until some other process signals the event. When a process signals the event, all processes that are blocked on the event are unblocked. If no processes are blocked on an event when it is signaled, then the signal has no effect. Implement the following new system calls.

    int evntopen(int);            // Creates an event, returning ID
                                  // arg=0 => new event; otherwise, uses
                                  // an existing event; error accordingly
    int evntclose(int);           // Destroy an event
    int evntwait(int eventNum);   // Blocks thread until the event is
                                  // signaled
    void evntsig(int eventNum);   // Unblocks all waiting threads
                                  // If no processes are blocked,
                                  // ignored
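The key difference from a semaphore is that an event has no memory: a signal that arrives while nobody is waiting is simply lost. A minimal single-threaded model of that rule is sketched below. It is not a kernel implementation; the names sim_event, sim_evntwait, and sim_evntsig are hypothetical, and a counter stands in for the wait queue of blocked tasks.

```c
#include <assert.h>
#include <stddef.h>

/* Model of one event. Only the number of blocked tasks is tracked,
 * because a signal with no waiters leaves no trace. */
struct sim_event {
    int nwaiting;
};

/* Models evntwait(): the caller joins the set of blocked tasks. */
void sim_evntwait(struct sim_event *e)
{
    assert(e != NULL);
    e->nwaiting++;
}

/* Models evntsig(): releases every blocked task at once.
 * Returns how many tasks were released (0 if none were waiting). */
int sim_evntsig(struct sim_event *e)
{
    int released;

    assert(e != NULL);
    released = e->nwaiting;
    e->nwaiting = 0;
    return released;
}
```

In this model, a signal followed by a wait leaves the waiter blocked, whereas a V() followed by a P() on a semaphore would let the caller continue. A test program for the real system calls would want to exercise exactly that case.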

    Part B

    Write an application program to test your new kernel functions. Your test program should show that the kernel functions work for the usual cases, for example, with one process waiting, and also for boundary conditions such as

    • no tasks waiting when evntsig() is called,
    • multiple tasks waiting when evntsig() is called,
    • multiple events open at one time, and
    • processes waiting when evntclose() is called.

    Attacking the Problem

    You are to design a completely new facility to add to the kernel. When you add the facility, you will probably need to create a new implementation file (and change Makefile so that it compiles your new file). You also likely will need to change a few parts of the existing kernel, such as the initialization code. To do this, you should write a new evnt_init() function that can be called when the system is booted:

    int evnt_init(int);   // Initializes events; not a sys call


    You also need to change the system initialization code to call this new evnt_init() function. (The initialization routine has an underscore between evnt and init because it is named by using the internal kernel function naming convention rather than the system call naming convention.) You can use this function to set up any internal structures that you require in your solution.

    Begin by thinking carefully about the data structures that you will need to solve the problem. Your system will need to support several different events, so you need a set of descriptor data structures, one for each event. If you solved Exercise 6, then recall that each shared memory segment had its own descriptor; you will need the analog of a shared memory descriptor for an event. You can decide if you want a list or an array of such descriptors. Then your evnt_init() function can initialize that data structure.

    If you use an array, you nevertheless should dynamically allocate the space for each entry (again, see the shared memory segment descriptors). This will require that you use kmalloc() and kfree(), two functions that allocate kernel memory dynamically.
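One possible shape for the array-of-descriptors design is sketched here as a user-space model. Everything in it is an assumption made for illustration: malloc() and free() stand in for kmalloc() and kfree(), the names evnt_desc, evnt_table, MAX_EVENTS, and the sim_ functions are hypothetical, and event IDs start at 1 so that an argument of 0 can mean "create a new event."

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_EVENTS 16   /* hypothetical limit chosen for this sketch */

struct evnt_desc {
    int id;         /* event ID handed back to callers          */
    int nwaiting;   /* placeholder for the event's wait queue   */
};

/* Array of pointers; each entry is allocated only when the event exists. */
static struct evnt_desc *evnt_table[MAX_EVENTS];

/* Models evnt_init(): mark every slot as unused. */
void sim_evnt_init(void)
{
    int i;

    for (i = 0; i < MAX_EVENTS; i++)
        evnt_table[i] = NULL;
}

/* Models evntopen(): arg == 0 creates an event and returns its ID;
 * any other arg returns that ID if the event exists. Returns -1 on error. */
int sim_evntopen(int arg)
{
    int i;

    if (arg == 0) {
        for (i = 0; i < MAX_EVENTS; i++) {
            if (evnt_table[i] == NULL) {
                struct evnt_desc *d = malloc(sizeof *d);
                if (d == NULL)
                    return -1;
                d->id = i + 1;
                d->nwaiting = 0;
                evnt_table[i] = d;
                return d->id;
            }
        }
        return -1;   /* table full */
    }
    if (arg < 1 || arg > MAX_EVENTS || evnt_table[arg - 1] == NULL)
        return -1;
    return arg;
}

/* Models evntclose(): free the descriptor and release the slot. */
int sim_evntclose(int id)
{
    if (id < 1 || id > MAX_EVENTS || evnt_table[id - 1] == NULL)
        return -1;
    free(evnt_table[id - 1]);
    evnt_table[id - 1] = NULL;
    return 0;
}
```

The model ignores the hard part of the kernel version: deciding what evntclose() should do when tasks are still blocked on the event, and protecting the table against concurrent callers.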

    You might want to work at the level of wait queues and the low-level manipulation routines. Your evntwait() code then might have a skeleton such as the following.

    int evntwait(int eventNum)
    {
        current->state = TASK_INTERRUPTIBLE;
        add_wait_queue();
        schedule();
    }

    See the discussion in this exercise's introduction.

    Similarly, your evntsig() function might have a skeleton such as that shown for the wake_up() implementation.

    However, if you decide to use the interruptible_sleep_on()/sleep_on() and wake_up()/wake_up_interruptible() internal kernel functions, then your evntwait() function might have a skeleton such as this.

    int evntwait(int eventNum)
    {
        interruptible_sleep_on();
    }

    Similarly, your evntsig() function might have a skeleton such as this.

    int evntsig(int eventNum)
    {
        wake_up_interruptible();
    }
