Multiple Device Driver and Flash FTL
Sarah Diesburg
COP 5641
Introduction
The kernel uses logical remapping layers over storage to hide complexity and add functionality
Two examples
  Multiple device (md) drivers
  Flash Translation Layer (FTL)
The md driver
Provides virtual devices
  Created from one or more independent underlying devices
The basic mechanism to support
  RAIDs
  Full-disk encryption (software)
  LVM
  Secure deletion (TrueErase)
The md driver
File systems are mounted on top of a device mapper virtual device
The virtual device can
  Abstract multiple devices
  Perform encryption
  Other things
[Figure: user/kernel boundary, with Applications in user space and File System and DM in the kernel]
Simple Device Mappers
Linear
  Maps a linear range of a device
Delay
  Delays reads and/or writes and maps them to different devices
Zero
  Provides a block device that always returns zeroed data on reads and silently drops writes
  Similar behavior to /dev/zero, but as a block device instead of a character device
Flakey
  Used for testing only; simulates intermittent, catastrophic device failure
http://lxr.linux.no/#linux+v3.2/Documentation/device-mapper
Loading a device mapper
#!/bin/sh
# Create an identity mapping for a device
echo "0 `blockdev --getsize $1` linear $1 0" \
| dmsetup create identity
Reading the script piece by piece:
  0 (first field): Logical start sector
  `blockdev --getsize $1`: Command to get the number of sectors of a device (like /dev/sda1)
  linear: Type of device mapper device we want. Linear is a one-to-one logical-to-physical sector mapping.
  $1 (after linear): Linear parameter: base device (like /dev/sda1)
  0 (last field): Linear parameter: starting offset within the device
  |: Pipes the table line to dmsetup; acts like the “table_file” parameter
  dmsetup: Manages logical devices that use the device mapper driver. See ‘man dmsetup’ for more information.
  create: We wish to “create” a new logical device mapper device.
  identity: We name the new device “identity”.
Loading a device mapper
Can then mount file system directly on top of virtual device
#!/bin/bash
mount /dev/mapper/identity /mnt
Unloading a device mapper
#!/bin/bash
umount /mnt
dmsetup remove identity
Reading the script line by line:
  umount /mnt: First unmount the file system
  dmsetup remove identity: Then use dmsetup to remove the device called identity
dm-linear.c
Documentation
  http://lxr.linux.no/#linux+v3.2/Documentation/device-mapper/linear.txt
Code
  http://lxr.linux.no/#linux+v3.2/drivers/md/dm-linear.c
dm-linear.c
static struct target_type linear_target = {
	.name = "linear",
	.version = {1, 1, 0},
	.module = THIS_MODULE,
	.ctr = linear_ctr,
	.dtr = linear_dtr,
	.map = linear_map,
	.status = linear_status,
	.ioctl = linear_ioctl,
	.merge = linear_merge,
	.iterate_devices = linear_iterate_devices,
};
linear_map
static int linear_map(struct dm_target *ti, struct bio *bio,
		      union map_info *map_context)
{
	struct linear_c *lc = (struct linear_c *) ti->private;

	/* Point the request at the underlying device. */
	bio->bi_bdev = lc->dev->bdev;
	/* Shift the sector: offset within the target plus the mapping's start. */
	bio->bi_sector = lc->start + (bio->bi_sector - ti->begin);

	return DM_MAPIO_REMAPPED;
}
(Note: this is a simpler function from an earlier kernel version. Version 3.2 does the same, but with a few more helper functions.)
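The sector arithmetic in linear_map can be tried outside the kernel. The following is a minimal user-space sketch, not kernel code: struct linear_sketch and linear_remap are hypothetical names, and the two fields stand in for ti->begin and lc->start.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values linear_map reads. */
struct linear_sketch {
	uint64_t begin; /* first sector of the target in the virtual device (ti->begin) */
	uint64_t start; /* start of the mapping on the underlying device (lc->start) */
};

/* Translate a virtual-device sector into an underlying-device sector. */
static uint64_t linear_remap(const struct linear_sketch *t, uint64_t sector)
{
	assert(sector >= t->begin); /* the sector must fall inside this target */
	return t->start + (sector - t->begin);
}
```

With begin = 0 and start = 0, as in the identity mapping script, every sector maps to itself.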
Memory Technology Device
Different from a character or block device
Exports a special character device with extra ioctls and operations to access flash storage
For raw flash devices (not USB sticks)
  Embedded chips
http://www.linux-mtd.infradead.org/
NAND Flash Characteristics
Flash has different constraints than hard drives or character devices
Exports read, write, and erase operations
NAND Flash Characteristics
Can only write to a freshly-erased location
  If you want to write again to the same physical location, you must first erase the area
Reads and writes are to smaller flash pages
Erasures are performed in flash blocks
  Each block holds many flash pages
NAND Flash Characteristics
Each storage location can be erased only 10K-1M times
Writing is slower than reading
Erasures can be 10x slower than writing
Each NAND page has a small, non-addressable out-of-band (OOB) area to hold state and mapping information
  Accessed by ioctls
NAND Flash Characteristics
We need a way to not wear out the flash and have good performance with a minimum of writes and erases
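The erase-before-write rule can be made concrete with a toy model. This is only a sketch under stated assumptions: struct nand_block, nand_write_page, nand_erase_block, and the block size of 4 pages are all made up for illustration and are not part of the MTD API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGES_PER_BLOCK 4 /* illustrative; real erase blocks hold many more pages */

struct nand_block {
	bool written[PAGES_PER_BLOCK]; /* false = page is freshly erased */
	unsigned erase_count;          /* wear counter for this erase block */
};

/* Program one page; fails if the page has not been erased since its last write. */
static bool nand_write_page(struct nand_block *b, unsigned page)
{
	assert(page < PAGES_PER_BLOCK);
	if (b->written[page])
		return false; /* the whole block must be erased first */
	b->written[page] = true;
	return true;
}

/* Erase works on the whole block: every page becomes writable again. */
static void nand_erase_block(struct nand_block *b)
{
	memset(b->written, 0, sizeof(b->written));
	b->erase_count++;
}
```

Rewriting a single page in place costs a full block erase in this model, and each erase adds to the block's wear count. That cost is what the next layer tries to avoid.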
Flash Translation Layer
The solution is to stack a flash translation layer (FTL) on top of the raw flash device
  Exports a block device
  Takes care of the flash operations of reads, writes, and erases
  Evenly wears writes to all flash locations
  Marks old pages as invalid until they can be erased later
Data Path
[Figure: storage data path. Layers shown: Apps; Virtual file system (VFS); File system (Ext3, JFFS2); Multi-device drivers; FTL; Disk drivers; MTD drivers]
Flash Translation Layer
Rotates the usage of pages
[Figure: the OS issues "write random bits to 1". Mapping table, logical address -> physical address: 0 -> 0, 1 -> 1. Flash pages 0 through 6; two pages hold "data".]
Flash Translation Layer
Overwrites go to new page
[Figure: the OS issues "write random bits to 1". Mapping table, logical address -> physical address: 0 -> 0, 1 -> 2. Flash pages 0 through 6; pages are labeled "data", "data", and "random".]
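The remapping in the two figures can be sketched as a small table-driven model. Everything here is an assumption made for illustration: struct ftl_sketch, ftl_init, and ftl_write are hypothetical names, and the allocator simply takes the lowest-numbered free page, which is not a real wear-leveling policy.

```c
#include <assert.h>

#define NUM_PAGES 7   /* physical pages 0..6, as in the figures */
#define UNMAPPED (-1)

enum page_state { PAGE_FREE, PAGE_VALID, PAGE_INVALID };

struct ftl_sketch {
	int map[NUM_PAGES];               /* logical page -> physical page, or UNMAPPED */
	enum page_state state[NUM_PAGES]; /* state of each physical page */
};

static void ftl_init(struct ftl_sketch *f)
{
	for (int i = 0; i < NUM_PAGES; i++) {
		f->map[i] = UNMAPPED;
		f->state[i] = PAGE_FREE;
	}
}

/*
 * Write a logical page. The data always lands on a free physical page;
 * the old copy, if any, is only marked invalid, not erased.
 * Returns the physical page used, or -1 when no free page is left
 * (a real FTL would garbage-collect at that point).
 */
static int ftl_write(struct ftl_sketch *f, int logical)
{
	assert(logical >= 0 && logical < NUM_PAGES);
	for (int phys = 0; phys < NUM_PAGES; phys++) {
		if (f->state[phys] != PAGE_FREE)
			continue;
		if (f->map[logical] != UNMAPPED)
			f->state[f->map[logical]] = PAGE_INVALID; /* old copy is stale */
		f->state[phys] = PAGE_VALID;
		f->map[logical] = phys;
		return phys;
	}
	return -1;
}
```

Writing logical pages 0 and 1 once each and then writing logical page 1 again reproduces the table in the second figure: logical 1 moves from physical 1 to physical 2, and physical 1 is left invalid until it can be erased.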
FTL Example
INFTL: Inverse NAND Flash Translation Layer
  Open-source FTL in the Linux kernel for DiskOnChip flash
  Somewhat outdated
INFTL
Broken into two files
  inftlmount.c: load/unload functions
  inftlcore.c: flash and wear-leveling operations
http://lxr.linux.no/linux+*/drivers/mtd/inftlmount.c
http://lxr.linux.no/linux+*/drivers/mtd/inftlcore.c
INFTL
Stack-based algorithm to provide the illusion of updates
Each stack (or chain) corresponds to a virtual address with sequentially-addressed pages
INFTL “Chaining”
Chains can grow to any length
Once there are no more freshly-erased erase blocks, some old ones must be garbage-collected
  Chain is “folded” so that all valid data is copied into the top erase block
  Lower erase blocks in the chain are erased and put back into the pool
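Folding can be sketched on a toy chain. This is not the INFTL code: struct erase_block, struct chain, and chain_fold are hypothetical names. The sketch assumes that the top block is the newest one and that, for each page offset, the newest written copy in the chain is the valid one. Returning blocks to the pool is modeled by zeroing them and shrinking the chain to one entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGES_PER_BLOCK 4 /* illustrative sizes */
#define MAX_CHAIN 8

struct erase_block {
	bool used[PAGES_PER_BLOCK]; /* page written since the last erase */
	int data[PAGES_PER_BLOCK];  /* stand-in for the page contents */
};

struct chain {
	struct erase_block blocks[MAX_CHAIN]; /* blocks[len - 1] is the top (newest) */
	int len;
};

/* Fold the chain into its top block; returns how many blocks were freed. */
static int chain_fold(struct chain *c)
{
	assert(c->len >= 1 && c->len <= MAX_CHAIN);
	struct erase_block *top = &c->blocks[c->len - 1];

	for (int page = 0; page < PAGES_PER_BLOCK; page++) {
		/* Walk newest to oldest; stop once the top block holds this page. */
		for (int i = c->len - 2; i >= 0 && !top->used[page]; i--) {
			if (c->blocks[i].used[page]) {
				top->data[page] = c->blocks[i].data[page];
				top->used[page] = true;
			}
		}
	}

	int freed = c->len - 1;
	c->blocks[0] = *top; /* the folded top block is now the whole chain */
	/* "Erase" every other slot, modeling blocks returned to the pool. */
	memset(&c->blocks[1], 0, sizeof(c->blocks[0]) * (MAX_CHAIN - 1));
	c->len = 1;
	return freed;
}
```

Pages are copied only into offsets the top block has not yet written, so the fold itself respects the write-once rule.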
inftlcore.c
static struct mtd_blktrans_ops inftl_tr = {
	.name = "inftl",
	.major = INFTL_MAJOR,
	.part_bits = INFTL_PARTN_BITS,
	.blksize = 512,
	.getgeo = inftl_getgeo,
	.readsect = inftl_readblock,
	.writesect = inftl_writeblock,
	.add_mtd = inftl_add_mtd,
	.remove_dev = inftl_remove_dev,
	.owner = THIS_MODULE,
};
inftl_writeblock
static int inftl_writeblock(struct mtd_blktrans_dev *mbd, unsigned long block,
			    char *buffer)
{
	struct INFTLrecord *inftl = (void *)mbd;
	unsigned int writeEUN;
	/* Byte offset of this sector within its erase unit. */
	unsigned long blockofs = (block * SECTORSIZE) & (inftl->EraseSize - 1);
	size_t retlen;
	struct inftl_oob oob;
	char *p, *pend;

	/* Is block all zero? */
	pend = buffer + SECTORSIZE;
	for (p = buffer; p < pend && !*p; p++)
		;

	if (p < pend) {
		/* Non-zero data: find an erase unit to write into. */
		writeEUN = INFTL_findwriteunit(inftl, block);

		if (writeEUN == BLOCK_NIL) {
			printk(KERN_WARNING "inftl_writeblock(): cannot find "
			       "block to write to\n");
			/*
			 * If we _still_ haven't got a block to use,
			 * we're screwed.
			 */
			return 1;
		}

		/* Mark the sector as used in the out-of-band area. */
		memset(&oob, 0xff, sizeof(struct inftl_oob));
		oob.b.Status = oob.b.Status1 = SECTOR_USED;

		inftl_write(inftl->mbd.mtd,
			    (writeEUN * inftl->EraseSize) + blockofs,
			    SECTORSIZE, &retlen, (char *)buffer, (char *)&oob);
	} else {
		/* All-zero sector: delete the block instead of writing it. */
		INFTL_deleteblock(inftl, block);
	}

	return 0;
}