dr a sahu dept of comp sc & engg. iit guwahati. file system, block devices block device...

Post on 18-Dec-2015

Block Device Driver

Dr A Sahu

Dept of Comp Sc & Engg.

IIT Guwahati

Outline

• File System, Block Devices
• Block Device Registration
• Initialization of Sbull
• Block Device Operation
• Request processing

File System & Block Devices

• Block Devices (Disk)
– Sector, inode

• File systems (Operations)
– Read/write, open, close, lseek, type

Block Devices Registration

• Block Devices (Disk)
– Sector, inode

• File systems (Operations)
– Read/write, open, close, lseek, type

What is the VFS ?

• Component in the kernel that handles file-systems, directory and file access.

• Abstracts common tasks of many file-systems.
• Presents the user with a unified interface, via the file-related system calls (open, stat, chmod etc.).
• Filesystem-specific operations: vectors them to the filesystem in charge of the file.
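The "vectoring" in the last bullet can be pictured as a table of function pointers attached to each file. The following is a minimal userspace sketch, not kernel code: the names fs_ops, vfs_file, vfs_stat_size and the fake size functions are all hypothetical and exist only to show the dispatch idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified operation table: one per file system. */
struct fs_ops {
    const char *name;
    int (*stat_size)(const char *path);   /* returns a made-up file size */
};

/* Two pretend file systems with different implementations. */
static int ext2_stat_size(const char *path)    { return (int)strlen(path) * 100; }
static int iso9660_stat_size(const char *path) { return (int)strlen(path) * 10; }

static const struct fs_ops ext2_ops    = { "ext2",    ext2_stat_size };
static const struct fs_ops iso9660_ops = { "iso9660", iso9660_stat_size };

/* A "file" remembers which file system is in charge of it. */
struct vfs_file {
    const char *path;
    const struct fs_ops *ops;
};

/* Unified entry point: the caller never names the file system;
 * the call is vectored through the file's operation table. */
static int vfs_stat_size(const struct vfs_file *f)
{
    assert(f != NULL && f->ops != NULL);
    return f->ops->stat_size(f->path);
}
```

The same call, vfs_stat_size(), ends up in different code depending on the table stored in the file, which is the pattern the real VFS follows with its inode, file and address-space operation tables.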

Mounting a device

• $ mount -t iso9660 -o ro /dev/cdrom /mnt/cdrom

• Steps involved:
– Find the file system (file_systems list).
– Find the VFS inode of the directory that is to be the new file system's mount point.
– Allocate a VFS superblock and call the file system specific read_super function.
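The first and last steps can be sketched in userspace C. This is only a model under stated assumptions: the structures are cut down to a few fields, the mount-point inode lookup is left out entirely, and do_mount_model() and find_filesystem() are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down superblock: only what this model needs. */
struct super_block {
    const char *fs_name;
    int read_only;
};

/* Cut-down file system type, chained into a singly linked list. */
struct file_system_type {
    const char *name;
    int (*read_super)(struct super_block *sb, int read_only);
    struct file_system_type *next;
};

static int iso9660_read_super(struct super_block *sb, int read_only)
{
    sb->fs_name = "iso9660";
    sb->read_only = read_only;
    return 0;
}

static struct file_system_type iso9660_type = { "iso9660", iso9660_read_super, NULL };
static struct file_system_type *file_systems = &iso9660_type;

/* Step 1: find the file system in the file_systems list. */
static struct file_system_type *find_filesystem(const char *name)
{
    struct file_system_type *p;

    assert(name != NULL);
    for (p = file_systems; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}

/* Step 3: fill in a superblock by calling the file system's read_super.
 * Returns 0 on success, -1 if the file system type is unknown. */
static int do_mount_model(const char *type, int read_only, struct super_block *sb)
{
    struct file_system_type *fs = find_filesystem(type);

    if (fs == NULL)
        return -1;
    return fs->read_super(sb, read_only);
}
```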

What will we learn ?

– The details of how a block device works
• ll_rw_block(): triggers the I/O transfer
• __make_request(): makes a request and puts it on the request queue
• task queue: plug/unplug uses this mechanism
• request service routine

– How to write a block device driver
• writing a module
– Init / exit
• implement the necessary operations
– block_device_operations
– request_fn_proc

Common Block Device Operations

• In fs/block_dev.c

struct file_operations def_blk_fops = {
    open:    blkdev_open,
    release: blkdev_close,
    llseek:  block_llseek,
    read:    generic_file_read,
    write:   generic_file_write,
    mmap:    generic_file_mmap,
    fsync:   block_fsync,
    ioctl:   blkdev_ioctl,
};

Block Device Specific Operations

• Additional operations for block device only
• In include/linux/fs.h :

struct block_device_operations {
    int (*open) (struct inode *, struct file *);
    int (*release) (struct inode *, struct file *);
    int (*ioctl) (struct inode *, struct file *, unsigned, unsigned long);
    int (*check_media_change) (kdev_t);
    int (*revalidate) (kdev_t);
};

• In include/linux/blkdev.h :

typedef void (request_fn_proc) (request_queue_t *q);

[Figure: EXT2 read path. Labels: file, page, bh (buffer head), cache search; generic_file_read, do_generic_file_read, generic_readahead, readpage / ext2_readpage via ext2_aops (address space operation table), ext2_get_block (get logical block number), block_read_full_page, submit_bh, bread, ll_rw_block; generic block device layer.]

[Figure: EXT2 write path, prepare_write stage. Labels: file, bh (buffer head), cache search, read request; generic_file_write, prepare_write / ext2_prepare_write via ext2_aops (address space operation table), ext2_get_block (get logical block number), block_prepare_write, ll_rw_block, submit_bh, bread; generic block device layer.]

[Figure: EXT2 write path, commit_write stage. Labels: file, page, dirty bh (buffer head); generic_file_write, commit_write / generic_commit_write via ext2_aops (address space operation table), __block_commit_write, __mark_dirty, balance_dirty, wakeup_bdflush, bdflush, write_some_buffers, submit_bh; generic block device layer.]

Generic Block Device Layer

• Provides common functionality for all block devices in Linux
– Uniform interface (to file system), e.g. bread(), block_prepare_write(), block_read_full_page(), ll_rw_block()
– Buffer management and disk caching
– Block I/O request scheduling

• Generates and queues actual I/O requests in a request queue (per device)
– Individual device driver services this queue (likely interrupt driven)

Request Queue

• Data structure: in include/linux/blkdev.h
• Queue header: type request_queue_t
  typedef struct request_queue request_queue_t;
– queue_head: doubly linked list of pending requests
– request_fn: pointer to the request service routine

• Queue element: struct request
– cmd: read or write
– Number of requested sectors, segments
– bh, bhtail: a list of buffer heads
– Memory area (for I/O transfer)
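To make the shape of this data structure concrete, here is a small userspace model of a queue header with a doubly linked list of requests and a request_fn pointer. It is a sketch under assumptions: the field set is reduced, the list uses a sentinel node rather than the kernel's list_head, and the queue_* helper names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { MODEL_READ = 0, MODEL_WRITE = 1 };

/* Queue element: a reduced "struct request". */
struct request {
    int cmd;                       /* MODEL_READ or MODEL_WRITE */
    unsigned long sector;          /* first sector of the transfer */
    unsigned long nr_sectors;      /* number of sectors requested */
    struct request *prev, *next;   /* links in the pending list */
};

/* Queue header: pending list plus the service routine pointer. */
struct request_queue {
    struct request head;           /* sentinel of a circular doubly linked list */
    void (*request_fn)(struct request_queue *q);
};

static void queue_init(struct request_queue *q,
                       void (*fn)(struct request_queue *))
{
    q->head.prev = q->head.next = &q->head;
    q->request_fn = fn;
}

static int queue_empty(const struct request_queue *q)
{
    return q->head.next == &q->head;
}

/* Append a request at the tail of the pending list. */
static void queue_add_tail(struct request_queue *q, struct request *rq)
{
    assert(rq != NULL);
    rq->prev = q->head.prev;
    rq->next = &q->head;
    q->head.prev->next = rq;
    q->head.prev = rq;
}

/* Remove and return the oldest request, or NULL if the queue is empty. */
static struct request *queue_pop_head(struct request_queue *q)
{
    struct request *rq = q->head.next;

    if (rq == &q->head)
        return NULL;
    rq->prev->next = rq->next;
    rq->next->prev = rq->prev;
    rq->prev = rq->next = NULL;
    return rq;
}
```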

Invoking the Lower Layer

• Generic block device layer
– Generates and queues I/O requests
– If the request queue is initially empty, queues the plug_tq task on the tq_disk task queue

• Asynchronous run of task queue tq_disk
– Run in a few places (e.g., in kswapd)
– Take a request from the queue and call the request_fn function:
• q->request_fn(q);
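The plug/unplug idea can be modelled in a few lines of userspace C. This is a deliberately simplified sketch: requests are just counted, the task queue is a fixed array, and every plug_model_* name is hypothetical. What it shows is that the first request on an empty queue plugs the device and defers servicing, so later requests can accumulate until the task queue is run.

```c
#include <assert.h>
#include <stddef.h>

#define PLUG_MODEL_TQ_MAX 8

/* Reduced queue: requests are only counted, not stored. */
struct plug_model_queue {
    int pending;    /* requests waiting to be serviced */
    int plugged;    /* 1 while servicing is being deferred */
    int serviced;   /* total requests handed to the driver */
};

/* Stand-in for the tq_disk task queue: queues whose unplug is pending. */
static struct plug_model_queue *plug_model_tq_disk[PLUG_MODEL_TQ_MAX];
static int plug_model_tq_len;

/* Stand-in for q->request_fn(q): service everything that is pending. */
static void plug_model_request_fn(struct plug_model_queue *q)
{
    q->serviced += q->pending;
    q->pending = 0;
}

/* Add a request; plug the device if the queue was empty. */
static void plug_model_add_request(struct plug_model_queue *q)
{
    assert(q != NULL);
    if (q->pending == 0 && !q->plugged) {
        assert(plug_model_tq_len < PLUG_MODEL_TQ_MAX);
        q->plugged = 1;
        plug_model_tq_disk[plug_model_tq_len++] = q;
    }
    q->pending++;
}

/* Stand-in for run_task_queue(&tq_disk): unplug each queue and service it. */
static void plug_model_run_tq_disk(void)
{
    int i;

    for (i = 0; i < plug_model_tq_len; i++) {
        struct plug_model_queue *q = plug_model_tq_disk[i];

        q->plugged = 0;
        if (q->pending > 0)
            plug_model_request_fn(q);
    }
    plug_model_tq_len = 0;
}
```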

Request Service Routine

• To service all I/O requests in the queue
• Typical interrupt-driven procedure
– Service the first request in the queue
– Set up hardware so it raises an interrupt when it is done
– Return

• Interrupt handler tasklet
– Remove the just-finished request from the queue
– Re-enter the request service routine (to service the next)
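The hand-off between the service routine and the interrupt handler can be sketched as a small state machine. This is a userspace model only: there is no real hardware or interrupt, irq_model_irq() is simply called by hand to stand for "the device finished", and all irq_model_* names are hypothetical.

```c
#include <assert.h>

#define IRQ_MODEL_QLEN 8

struct irq_model_dev {
    int sectors[IRQ_MODEL_QLEN];  /* pending requests, identified by start sector */
    int head, count;              /* ring buffer position and fill level */
    int busy;                     /* 1 while the "hardware" works on sectors[head] */
    int done;                     /* number of completed requests */
};

/* Request service routine: start the first request, then return.
 * The request stays on the queue until the interrupt reports completion. */
static void irq_model_request_fn(struct irq_model_dev *d)
{
    if (d->busy || d->count == 0)
        return;
    d->busy = 1;   /* "program the hardware"; it will interrupt when done */
}

/* Interrupt handler: remove the finished request, then re-enter
 * the service routine so the next request gets started. */
static void irq_model_irq(struct irq_model_dev *d)
{
    assert(d->busy);
    d->head = (d->head + 1) % IRQ_MODEL_QLEN;
    d->count--;
    d->done++;
    d->busy = 0;
    irq_model_request_fn(d);
}

/* Queue a new request at the tail of the ring. */
static void irq_model_submit(struct irq_model_dev *d, int sector)
{
    assert(d->count < IRQ_MODEL_QLEN);
    d->sectors[(d->head + d->count) % IRQ_MODEL_QLEN] = sector;
    d->count++;
}
```

Note that the service routine never loops over the whole queue here; progress is driven by completions, which is the point of the interrupt-driven procedure.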

Request submission

• ll_rw_block()
• submit_bh()
• generic_make_request()
• __make_request()
– generic_plug_device()
– elevator algorithm
– __get_request_wait()

ll_rw_block()

• void ll_rw_block(int rw, int nr, struct buffer_head * bhs[])
– rw: read/write
– nr: number of buffer_head structures in the array
– bhs: array of buffer_head structures

• Top-level function to submit the I/O request
• Checks whether the requested operation is permitted by the device
– e.g. rejects a write operation on a read-only device
• Checks that the buffer size is a multiple of the sector size of the device
• Locks the buffer and verifies whether the operation is required
– If the dirty bit is not set on the buffer, a write operation is not necessary
– A read operation on a buffer that is already up to date is redundant
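The checks listed above amount to a small decision function. The sketch below models that decision in userspace C; it is not the kernel's ll_rw_block(). Buffer locking is omitted, and llrw_model_bh, llrw_model_needs_io() and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { LLRW_READ = 0, LLRW_WRITE = 1 };

/* Reduced buffer head: only the fields the checks look at. */
struct llrw_model_bh {
    unsigned int size;   /* buffer size in bytes */
    int dirty;           /* buffer differs from what is on disk */
    int uptodate;        /* buffer already holds valid disk contents */
};

/* Returns  1 if the I/O should be submitted,
 *          0 if it can be skipped as unnecessary,
 *         -1 if it must be rejected. */
static int llrw_model_needs_io(int rw, const struct llrw_model_bh *bh,
                               unsigned int sector_size, int device_read_only)
{
    assert(bh != NULL && sector_size > 0);

    /* The buffer size must be a multiple of the device sector size. */
    if (bh->size % sector_size != 0)
        return -1;

    /* Writes are not permitted on a read-only device. */
    if (rw == LLRW_WRITE && device_read_only)
        return -1;

    /* A clean buffer does not need to be written. */
    if (rw == LLRW_WRITE)
        return bh->dirty ? 1 : 0;

    /* Reading a buffer that is already up to date is redundant. */
    return bh->uptodate ? 0 : 1;
}
```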

submit_bh()

• void submit_bh(int rw, struct buffer_head *bh)
• Submits a buffer_head to the block device layer for I/O
• Sets the BH_Req and the BH_Launder flags on the buffer
• Sets the real device and sector values
– count = bh->b_size >> 9;
– bh->b_rdev = bh->b_dev;
– bh->b_rsector = bh->b_blocknr * count;
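The sector arithmetic above is worth working through once: b_size >> 9 divides the buffer size by 512, giving the number of 512-byte sectors per block, and multiplying by the block number converts a block index into a sector index. A minimal standalone version, with the hypothetical helper name sector_model_rsector():

```c
#include <assert.h>

/* Convert a block number to a starting sector number.
 * b_size is the buffer (block) size in bytes and is assumed
 * to be a multiple of 512; blocknr counts blocks of that size. */
static unsigned long sector_model_rsector(unsigned long blocknr, unsigned int b_size)
{
    unsigned int count;

    assert(b_size >= 512 && b_size % 512 == 0);
    count = b_size >> 9;        /* 512-byte sectors per block */
    return blocknr * count;     /* first sector of that block */
}
```

For example, with 1024-byte blocks, block 10 starts at sector 20; with 4096-byte blocks, block 3 starts at sector 24.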

generic_make_request()

• void generic_make_request(int rw, struct buffer_head *bh)
• Hands a buffer head to its device driver for I/O
• Checks that the requested sector is within the range (blk_size[major][minor])
• Gets the request queue of the device and calls make_request_fn to put the buffer in the request queue (in most cases, this handler is __make_request)
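The range check can be sketched as follows. This is a simplified model rather than the kernel's exact test: it assumes the device size is given in 1 KB units (as blk_size entries are in 2.4), so each unit is two 512-byte sectors, and range_model_in_range() is a hypothetical name.

```c
#include <assert.h>

/* Returns 1 if a buffer of b_size bytes starting at 'sector'
 * lies entirely within a device of size_kb kilobytes, else 0. */
static int range_model_in_range(unsigned long size_kb, unsigned long sector,
                                unsigned int b_size)
{
    unsigned long max_sector = size_kb * 2;   /* device length in sectors */
    unsigned long count = b_size >> 9;        /* sectors in this buffer */

    assert(b_size % 512 == 0);
    if (count > max_sector)
        return 0;                             /* buffer larger than the device */
    return sector <= max_sector - count;      /* avoids overflow in sector + count */
}
```

The comparison is written as sector <= max_sector - count instead of sector + count <= max_sector so that a very large sector value cannot wrap around.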

__make_request()

• static int __make_request(request_queue_t * q, int rw, struct buffer_head * bh)
• Inserts the buffer in the request queue
• Plugs the device by calling the plug_device_fn handler of the request queue
– In most cases, this is generic_plug_device()
– Submits the plug_tq to the disk task queue tq_disk

• Enlarges an existing request where possible (elevator algorithm)
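"Enlarging an existing request" means merging a new buffer into a request it is contiguous with, instead of creating a new request. The userspace sketch below shows only that contiguity test for back and front merges; the real elevator also respects limits such as the maximum sectors and segments per request, which are left out here. merge_model_req and merge_model_try_merge() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced request: command plus the sector range it covers. */
struct merge_model_req {
    int cmd;                    /* 0 = read, 1 = write */
    unsigned long sector;       /* first sector */
    unsigned long nr_sectors;   /* length in sectors */
};

/* Try to merge a new buffer (cmd, sector, count) into rq.
 * Returns 1 if rq was enlarged, 0 if a separate request is needed. */
static int merge_model_try_merge(struct merge_model_req *rq, int cmd,
                                 unsigned long sector, unsigned long count)
{
    assert(rq != NULL && count > 0);

    if (rq->cmd != cmd)
        return 0;                            /* never mix reads and writes */

    if (rq->sector + rq->nr_sectors == sector) {
        rq->nr_sectors += count;             /* back merge: buffer follows rq */
        return 1;
    }
    if (sector + count == rq->sector) {
        rq->sector = sector;                 /* front merge: buffer precedes rq */
        rq->nr_sectors += count;
        return 1;
    }
    return 0;
}
```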

__make_request() (cont.)

• __get_request_wait()
– If a new request has to be created and there are no free request objects, it waits on the request queue until it gets a free request object

static struct request *__get_request_wait(request_queue_t *q, int rw)
{
    register struct request *rq;
    DECLARE_WAITQUEUE(wait, current);

    generic_unplug_device(q);
    add_wait_queue(&q->wait_for_request, &wait);
    do {
        set_current_state(TASK_UNINTERRUPTIBLE);
        if (q->rq[rw].count < batch_requests)
            schedule();
        spin_lock_irq(&io_request_lock);
        rq = get_request(q, rw);
        spin_unlock_irq(&io_request_lock);
    } while (rq == NULL);
    remove_wait_queue(&q->wait_for_request, &wait);
    current->state = TASK_RUNNING;
    return rq;
}

Request processing

• __generic_unplug_device()
– Called by __get_request_wait() or generic_unplug_device()
– Marks the queue as unplugged
– Calls the request_fn handler of the request queue

[Figure: request submission path. VFS, submit_bh, generic_make_request, then __make_request (or loop_make_request) in the generic block device layer; I/O requests are placed on the per-device request queues, which hang off the I/O task queue and are serviced by the block device drivers (IDE, SCSI).]

[Figure: request processing path. run_task_queue(tq_disk) runs the I/O task queue; generic_unplug_device in the generic block layer passes the I/O requests on each request queue to the block device drivers (IDE, SCSI).]

Block Device in 2.4 Kernel

• Not completely rid of major/minor numbers yet
– Still keeps queues and device driver related parameters in arrays indexed by major number
• In include/linux/blkdev.h

struct blk_dev_struct {
    request_queue_t request_q;
    queue_proc *queue;
    void *data;
};
struct blk_dev_struct blk_dev[MAX_BLKDEV];
#define BLK_DEFAULT_QUEUE(_MAJOR) &blk_dev[_MAJOR].request_q

Block Device in 2.4 Kernel (2)

• Matrices for device parameters
– Type: int * xxx[MAX_BLKDEV];
– Indexed by major then minor number
– blk_size, blksize_size, hardsect_size, max_readahead, max_sectors, max_segments
• Read ahead parameters
– int read_ahead[] (include/linux/fs.h)
– Indexed by major number

• You will need to set these parameters for any device in your init and open functions
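The "array of pointers indexed by major, then by minor" layout can be modelled as below. This is a userspace sketch with its own table names (param_model_*), a made-up major number and two minors; the fallback argument stands in for whatever default applies when a driver has not installed a per-minor array.

```c
#include <assert.h>
#include <stddef.h>

#define PARAM_MODEL_MAX_BLKDEV 255
#define PARAM_MODEL_MAJOR      42   /* hypothetical major number */
#define PARAM_MODEL_MINORS     2

/* One pointer per major; each points to an array indexed by minor. */
static int *param_model_blk_size[PARAM_MODEL_MAX_BLKDEV];       /* size in KB */
static int *param_model_blksize_size[PARAM_MODEL_MAX_BLKDEV];   /* block size, bytes */
static int *param_model_hardsect_size[PARAM_MODEL_MAX_BLKDEV];  /* sector size, bytes */

/* Per-minor storage owned by the (pretend) driver. */
static int xxx_sizes[PARAM_MODEL_MINORS];
static int xxx_blksizes[PARAM_MODEL_MINORS];
static int xxx_hardsects[PARAM_MODEL_MINORS];

/* What a driver's init function would do: fill in its arrays
 * and hook them into the global tables under its major number. */
static void param_model_setup(int size_kb)
{
    int i;

    for (i = 0; i < PARAM_MODEL_MINORS; i++) {
        xxx_sizes[i] = size_kb;
        xxx_blksizes[i] = 1024;
        xxx_hardsects[i] = 512;
    }
    param_model_blk_size[PARAM_MODEL_MAJOR] = xxx_sizes;
    param_model_blksize_size[PARAM_MODEL_MAJOR] = xxx_blksizes;
    param_model_hardsect_size[PARAM_MODEL_MAJOR] = xxx_hardsects;
}

/* Look up a parameter by major then minor; use 'fallback'
 * if no array has been installed for that major. */
static int param_model_get(int *table[], int major, int minor, int fallback)
{
    assert(major >= 0 && major < PARAM_MODEL_MAX_BLKDEV);
    assert(minor >= 0 && minor < PARAM_MODEL_MINORS);
    if (table[major] == NULL)
        return fallback;
    return table[major][minor];
}
```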

include/linux/blk.h

• Assign a major number for each device driver
• Define macros for each device (by major number)
– MAJOR_NR: major number
– DEVICE_NAME: name of your device
– DEVICE_INTR: device interrupt handler
– DEVICE_REQUEST: request service routine
– DEVICE_NR(): how to calculate the device number from the minor number

• You may have to add a set for each new device driver (major number) you introduce

Skeleton Block Device

• Device operation structure:

static struct block_device_operations xxx_fops = {
    open:               xxx_open,
    release:            xxx_release,
    ioctl:              xxx_ioctl,
    check_media_change: xxx_check_change,
    revalidate:         xxx_revalidate,
    owner:              THIS_MODULE,
};

Skeleton Block Device

• xxx_open()
– MOD_INC_USE_COUNT;

• xxx_release()
– MOD_DEC_USE_COUNT;

• xxx_ioctl(struct inode *inode, struct file *filp, unsigned int cmd, unsigned long arg)
– switch (cmd)
• BLKGETSIZE
• HDIO_GETGEO
• …
– default: return blk_ioctl(inode->i_rdev, cmd, arg);

Skeleton “Init” Function

#define MAJOR_NR XXX_MAJOR

static int __init xxx_init(void)
{
    /* probe the hardware, request irq, … */
    devfs_dir = devfs_mk_dir(NULL, "xxx_dir", NULL);

    /* old way: register_blkdev(MAJOR_NR, "xxx", &xxx_fops); */
    devfs_handle = devfs_register_blk(devfs_dir, "xxx", ......);
    blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), xxx_request);
    read_ahead[MAJOR_NR] = 8;   /* 8 sector (4K) read ahead */
    /* you may also set up these:
       blk_size[MAJOR_NR]
       blksize_size[MAJOR_NR]
       hardsect_size[MAJOR_NR]
    */

    /* rest of the initial setup */
    printk( … );
    return 0;
}

Skeleton “Exit” Function

static void __exit xxx_exit(void)
{
    /* clean up */
    blk_cleanup_queue(BLK_DEFAULT_QUEUE(MAJOR_NR));

    /* old way: unregister_blkdev(MAJOR_NR, "xxx"); */
    devfs_unregister(devfs_handle);
    devfs_unregister(devfs_dir);

    /* clean up */
}

module_init(xxx_init);
module_exit(xxx_exit);

Skeleton “Request” Operation

static void xxx_request(request_queue_t *q)
{
    int status;

    while (1) {
        INIT_REQUEST;   /* a macro; returns when the request queue is empty */
        status = 1;     /* 1 = success, 0 = failure */
        switch (CURRENT->cmd) {
        case READ:
            /* do read request, e.g. memcpy(CURRENT->buffer, mem_block, size); */
            break;
        case WRITE:
            /* do write request, e.g. memcpy(mem_block, CURRENT->buffer, size); */
            break;
        default:
            /* impossible */
            status = 0;
            break;
        }
        end_request(status);   /* when a request is finished, remove it from the queue */
    }
}
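For a memory-backed device, the read and write cases reduce to a bounds check and a memcpy. The userspace sketch below models just that data transfer, outside the kernel: rd_model_disk plays the role of mem_block, the request structure is reduced to four fields, and rd_model_transfer() is a hypothetical helper whose return value follows the same 1 = success, 0 = failure convention as the status passed to end_request().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RD_MODEL_SECTOR_SIZE 512
#define RD_MODEL_NSECTORS    16

enum { RD_READ = 0, RD_WRITE = 1 };

/* Backing store of the pretend RAM disk. */
static unsigned char rd_model_disk[RD_MODEL_NSECTORS * RD_MODEL_SECTOR_SIZE];

/* Reduced request: what to do, where on disk, and the caller's buffer. */
struct rd_model_request {
    int cmd;                    /* RD_READ or RD_WRITE */
    unsigned long sector;       /* first sector */
    unsigned long nr_sectors;   /* number of sectors */
    unsigned char *buffer;      /* memory area for the transfer */
};

/* Service one request. Returns 1 on success, 0 on failure. */
static int rd_model_transfer(const struct rd_model_request *rq)
{
    size_t off, len;

    assert(rq != NULL && rq->buffer != NULL);

    /* Reject transfers that run past the end of the device. */
    if (rq->sector >= RD_MODEL_NSECTORS ||
        rq->nr_sectors > RD_MODEL_NSECTORS - rq->sector)
        return 0;

    off = (size_t)rq->sector * RD_MODEL_SECTOR_SIZE;
    len = (size_t)rq->nr_sectors * RD_MODEL_SECTOR_SIZE;

    switch (rq->cmd) {
    case RD_READ:
        memcpy(rq->buffer, rd_model_disk + off, len);   /* disk -> buffer */
        return 1;
    case RD_WRITE:
        memcpy(rd_model_disk + off, rq->buffer, len);   /* buffer -> disk */
        return 1;
    default:
        return 0;                                       /* unknown command */
    }
}
```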

To Write a Block Device Driver (summarize)

• Write all the device operation functions
– xxx_open(), xxx_release(), xxx_ioctl()...

• Write a request service routine
– xxx_request()

• Write the interrupt handler and related tasklets
• Write module “init” and “exit” functions to
– Register and unregister the device driver
– Set up and clean up the request queue and parameters
– Set up and clean up the interrupt line and handler

Thanks

Ref: Chap 16, LDD 3e, Rubini-Corbet

Wishing u happy diwali
