youjip won - oslab.kaist.ac.kr

Youjip Won

Posted on 18-Mar-2022

inode

A data structure that represents the attributes of a file:

File type: regular file, directory, device file, or 0 (unused)

Creation time, modification time, size

Access permissions

Locations of the data blocks

There is one inode per file.

An inode exists in two forms: the on-disk inode and the in-memory inode.

Functions

iget(), iput(), ialloc(), iupdate()

On-disk inode

struct dinode

The inode structure as stored on disk.

Size: 64 bytes

dinodes are stored consecutively in the inode area of the file system.

i-number: the index of a dinode within the inode area

Inode area

25 blocks

8 inodes per block (512/64)

Total number of inodes = 25 x 8 = 200

[Figure: disk layout, with regions labeled "25 blocks" and "30 blocks"]
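The arithmetic above (8 inodes per block, 200 inodes in 25 blocks) and the mapping from an i-number to a disk location can be sketched as follows. This is a minimal illustration, not the xv6 source: the helper names and the `inodestart` parameter (first block of the inode area) are assumptions made for this sketch.

```c
#include <assert.h>

#define BSIZE       512                    /* block size used on the slide */
#define DINODE_SIZE 64                     /* on-disk inode size from the slide */
#define IPB         (BSIZE / DINODE_SIZE)  /* inodes per block: 8 */

/* Block that holds inode `inum`. `inodestart` is a hypothetical parameter:
   the block number where the inode area begins. */
static unsigned inode_block(unsigned inum, unsigned inodestart)
{
    return inum / IPB + inodestart;
}

/* Index of inode `inum` inside its block. */
static unsigned inode_slot(unsigned inum)
{
    return inum % IPB;
}

/* Number of inodes that fit in an inode area of `nblocks` blocks. */
static unsigned inode_capacity(unsigned nblocks)
{
    assert(nblocks > 0);
    return nblocks * IPB;
}
```

For example, with an inode area that starts at block 32, inode 10 lives in block 33 at slot 2.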

struct dinode

type: file type (directory, file, special file, or unused)

0: unused

major: major device number

minor: minor device number

nlink: the number of directory entries referring to the inode

If nlink is zero, the inode is deleted.

size: file size in bytes

addrs: data block addresses

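struct dinode {
    short type;               // file type (0 = unused)
    short major;              // major device number
    short minor;              // minor device number
    short nlink;              // number of directory entries referring to the inode
    uint size;                // file size in bytes
    uint addrs[NDIRECT+1];    // data block addresses
};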
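The 64-byte size quoted earlier can be checked directly from this layout. The sketch below assumes NDIRECT is 12 and that the last addrs entry names a single indirect block (both as in stock xv6; neither value is stated on the slide), and that short is 2 bytes and uint is 4 bytes. Under those assumptions it also gives the largest file the structure can describe.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int uint;

#define BSIZE     512
#define NDIRECT   12                         /* assumed, as in stock xv6 */
#define NINDIRECT (BSIZE / sizeof(uint))     /* block numbers in one indirect block */
#define MAXFILE   (NDIRECT + NINDIRECT)      /* largest file, in blocks */

struct dinode {
    short type;
    short major;
    short minor;
    short nlink;
    uint  size;
    uint  addrs[NDIRECT + 1];
};

/* 4 shorts (8 bytes) + size (4 bytes) + 13 addresses (52 bytes) = 64 bytes. */
static_assert(sizeof(struct dinode) == 64, "dinode should be 64 bytes");

static size_t dinode_bytes(void)   { return sizeof(struct dinode); }
static size_t max_file_bytes(void) { return MAXFILE * BSIZE; }
```

With these values a file can span at most 12 + 128 = 140 blocks, that is 71,680 bytes.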

in-memory inode

struct inode

The inode data structure cached in memory.

It includes the contents of the on-disk inode.

It also holds additional fields: a reference count, a lock, and so on.

struct inode

dev: device number

inum: inode number (index)

ref: the number of references to this in-memory inode, for example from processes that currently have the file open

If ref is 0, nothing references the inode, so its slot in icache can be reused.

ref is incremented by iget and decremented by iput.

lock: a sleep lock that gives exclusive access while the in-memory inode is used and synchronized with the on-disk inode

struct inode {
    uint dev;                 // device number
    uint inum;                // inode number
    int ref;                  // reference count
    struct sleeplock lock;    // protects everything below
    int valid;                // has the inode been read from disk?

    short type;               // copy of the on-disk inode
    short major;
    short minor;
    short nlink;
    uint size;
    uint addrs[NDIRECT+1];
};

icache

An array of in-memory inodes (NINODE = 50)

It makes inode access faster.

Write-through policy: whenever a field that also lives on disk is modified, the change is written to disk immediately by calling iupdate().

Spinlock for icache

It gives exclusive access to the in-memory portion of the inodes.

It ensures that there is at most one copy of an inode in icache.

[Figure: icache as an array inode[0], inode[1], inode[2], inode[3], ..., inode[49] guarded by one spinlock]

struct {
    struct spinlock lock;
    struct inode inode[NINODE];
} icache;

inode APIs

ialloc: allocate a new inode.

iget: find the inode's entry in the inode cache and return a pointer to it.

itrunc: reduce the size of the inode to 0 and release all allocated data blocks.

iput: decrease the reference count once the caller is done with the inode.

ilock: lock the in-memory inode, and copy the on-disk inode data into it if necessary.

iunlock: release the sleep lock acquired by ilock.

ialloc

Scans the inode structures on disk for a free one.

If it finds one, it

claims it by writing the new type to disk,

loads it into the inode cache,

and returns a pointer to it (via iget).

iget returns the struct inode unlocked, so be sure to lock it (ilock) before using it.

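allocating a new inode

[Figure: the inode array in a disk block, the buffer cache in memory, and icache]

1. bread(): read the block of the inode array into the buffer cache.

2. Mark the free inode as used.

3. log_write(): record that the inode is marked as used.

4. iget(): obtain the inode's entry in icache.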

Code: struct inode* ialloc()

struct inode* ialloc(uint dev, short type){
    int inum;
    struct buf *bp;
    struct dinode *dip;

    for(inum = 1; inum < sb.ninodes; inum++){
        bp = bread(dev, IBLOCK(inum, sb));    // read the block that holds inode inum
        dip = (struct dinode*)bp->data + inum%IPB;
        if(dip->type == 0){                   // a free inode
            memset(dip, 0, sizeof(*dip));
            dip->type = type;
            log_write(bp);                    // mark it as allocated on the disk
            brelse(bp);
            return iget(dev, inum);
        }
        brelse(bp);                           // not free: release the buffer before moving on
    }
    ...
}

The loop reads the blocks of the inode area one inode at a time.

bread() is therefore called several times for the same block. Can we improve it?
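One possible answer to the question above is to loop over blocks rather than over inodes and to examine every slot of a block after a single read. The sketch below is not xv6 code: it models the inode area as an in-memory array, and `fake_bread`, `scan_per_inode`, `scan_per_block` and `fill` are hypothetical names used only to count how many reads each strategy issues.

```c
#include <assert.h>

#define IPB     8      /* inodes per block (512 / 64) */
#define NINODES 200    /* inodes in the inode area, from the slides */

static short type_of[NINODES];   /* fake inode area: 0 means the inode is free */
static int reads;                /* number of simulated bread() calls */

/* Hypothetical stand-in for bread(): returns the types stored in one block. */
static short *fake_bread(int block)
{
    reads++;
    return &type_of[block * IPB];
}

/* Per-inode scan, as on the slide: one read for every inode examined. */
static int scan_per_inode(void)
{
    for (int inum = 1; inum < NINODES; inum++) {
        short *blk = fake_bread(inum / IPB);
        if (blk[inum % IPB] == 0)
            return inum;
    }
    return -1;
}

/* Per-block scan: one read per block, then every slot of that block. */
static int scan_per_block(void)
{
    for (int b = 0; b < NINODES / IPB; b++) {
        short *blk = fake_bread(b);
        for (int s = 0; s < IPB; s++) {
            int inum = b * IPB + s;
            if (inum >= 1 && blk[s] == 0)   /* inode 0 is never handed out */
                return inum;
        }
    }
    return -1;
}

/* Mark inodes 1..n as used and reset the read counter. */
static void fill(int n)
{
    assert(n < NINODES);
    for (int i = 0; i < NINODES; i++)
        type_of[i] = (i >= 1 && i <= n) ? 1 : 0;
    reads = 0;
}
```

With the first 100 inodes in use, the per-inode scan issues 101 reads while the per-block scan issues 13. In a real kernel the repeated reads would usually hit the buffer cache, so the saving is mostly in lookup and locking overhead rather than in disk I/O.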

typical inode operation

Thus a typical sequence is:

ip = iget(dev, inum)

ilock(ip)

... examine and modify ip->xxx ...

iunlock(ip)

iput(ip)

Obtaining the inode pointer: iget (dev, inum)

struct inode* iget(dev, inum)

Returns a pointer to the in-memory copy of the struct inode identified by dev and inum.

Increments the reference count.

Locks icache when it starts and unlocks icache when it finishes.

The reference to the inode stays valid until the matching call to iput() is made; the slot can be recycled once the reference count becomes 0.

If the requested inode is not in icache, iget creates an entry for it with the valid field set to 0.

It does not read the inode from disk.

This separates reserving a slot in icache from reading the associated inode from disk.

struct inode* iget(dev, inum)

static struct inode* iget(uint dev, uint inum)
{
    struct inode *ip, *empty;

    acquire(&icache.lock);
    ...
    empty = 0;
    for(ip = &icache.inode[0]; ip < &icache.inode[NINODE]; ip++){
        if(ip->ref > 0 && ip->dev == dev && ip->inum == inum){
            ip->ref++;
            release(&icache.lock);
            return ip;
        }
        if(empty == 0 && ip->ref == 0)    // remember the empty slot
            empty = ip;
    }
    ...
}

First, check whether the inode is already cached in the in-memory inode cache.

If it is, the reference count is increased by one and the cached entry is returned.

struct inode* iget(dev, inum) (cont.)

static struct inode* iget(uint dev, uint inum)
{
    struct inode *ip, *empty;
    ...
    empty = 0;
    for(ip = &icache.inode[0]; ip < &icache.inode[NINODE]; ip++){
        ...
        if(empty == 0 && ip->ref == 0)
            empty = ip;
    }

    if(empty == 0)                // every slot is in use
        panic("iget: no inodes");

    ip = empty;                   // claim the free slot
    ip->dev = dev;
    ip->inum = inum;
    ip->ref = 1;
    ip->valid = 0;                // not read from disk yet
    release(&icache.lock);
    return ip;
}

If the inode is not in icache, a free slot is allocated for it.
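The lookup-or-reserve behavior of iget, together with the reference counting done by iput, can be tried out in user space with a toy cache. The sketch below is a simplified model and not the xv6 implementation: locking is omitted, the cache has 4 slots instead of 50, a full cache returns NULL instead of panicking, and `toy_inode`, `toy_iget` and `toy_iput` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NINODE 4   /* small cache for the sketch; the slides use 50 */

struct toy_inode {
    unsigned dev, inum;
    int ref;       /* number of in-memory references */
    int valid;     /* 1 once the on-disk copy has been loaded */
};

static struct toy_inode cache[NINODE];

/* Toy iget: find the entry for (dev, inum) or reserve a free slot for it. */
static struct toy_inode *toy_iget(unsigned dev, unsigned inum)
{
    struct toy_inode *empty = NULL;

    for (struct toy_inode *ip = cache; ip < cache + NINODE; ip++) {
        if (ip->ref > 0 && ip->dev == dev && ip->inum == inum) {
            ip->ref++;               /* already cached: share the entry */
            return ip;
        }
        if (empty == NULL && ip->ref == 0)
            empty = ip;              /* remember the first free slot */
    }
    if (empty == NULL)
        return NULL;                 /* cache full (xv6 panics here) */

    empty->dev = dev;
    empty->inum = inum;
    empty->ref = 1;
    empty->valid = 0;                /* contents are loaded later, by ilock */
    return empty;
}

/* Toy iput: only the reference-count part. */
static void toy_iput(struct toy_inode *ip)
{
    assert(ip != NULL && ip->ref > 0);
    ip->ref--;
}
```

Two calls to toy_iget(1, 7) return the same entry with ref equal to 2. After two matching toy_iput calls the slot is free again and is reused by the next request for a different inode.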

ilock and iunlock

xv6 allows only one process at a time to read or write a file's content or metadata.

How can we guarantee exclusive access?

ilock() and iunlock()

They lock and unlock the inode in icache.

The lock is a sleep lock.

ilock(struct inode *ip)

1. Sleep-lock the inode cache entry pointed to by ip.

2. Bring the inode from disk into icache if necessary.

[Figure: (1) bread() reads the disk block that holds the inode array into the buffer cache; (2) the inode is copied from the buffer into the icache entry ip]

ilock

void ilock(struct inode *ip){
    struct buf *bp;
    struct dinode *dip;

    if(ip == 0 || ip->ref < 1)
        panic("ilock");

    acquiresleep(&ip->lock);

    if(ip->valid == 0){           // not loaded yet: read the inode from disk
        bp = bread(ip->dev, IBLOCK(ip->inum, sb));
        dip = (struct dinode*)bp->data + ip->inum%IPB;
        ip->type = dip->type;
        ip->major = dip->major;
        ip->minor = dip->minor;
        ip->nlink = dip->nlink;
        ip->size = dip->size;
        memmove(ip->addrs, dip->addrs, sizeof(ip->addrs));
        brelse(bp);
        ip->valid = 1;
        if(ip->type == 0)
            panic("ilock: no type");
    }
}

If the cached inode is not yet valid, ilock fills in its fields from the on-disk inode.
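The lazy loading performed by ilock (read from disk only on the first lock, reuse the cached copy afterwards) can be illustrated with a small user-space model. This is a sketch under stated assumptions: the sleep lock itself is left out, the disk is a static array, and `disk_inode`, `mem_inode`, `toy_ilock` and `disk_reads` are hypothetical names.

```c
#include <assert.h>

struct disk_inode { short type; unsigned size; };
struct mem_inode  { unsigned inum; int valid; short type; unsigned size; };

/* Fake inode area: inode 1 is a 100-byte entry, inode 2 a 4096-byte entry. */
static struct disk_inode disk[8] = { {0, 0}, {1, 100}, {2, 4096} };
static int disk_reads;            /* number of simulated disk reads */

/* Toy ilock: only the "load on first use" step is modeled. */
static void toy_ilock(struct mem_inode *ip)
{
    assert(ip != 0 && ip->inum < 8);
    if (ip->valid == 0) {
        disk_reads++;
        ip->type = disk[ip->inum].type;
        ip->size = disk[ip->inum].size;
        ip->valid = 1;            /* later calls skip the disk read */
    }
}
```

Locking the same entry twice touches the fake disk only once, which is the point of the valid flag.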

iunlock

void
iunlock(struct inode *ip)
{
    if(ip == 0 || !holdingsleep(&ip->lock) || ip->ref < 1)
        panic("iunlock");

    releasesleep(&ip->lock);
}

iunlock panics if ip is null, if the caller does not hold the lock, or if the reference count is less than 1.

Otherwise it releases the inode's sleep lock.

typical usage

Thus a typical sequence is:

ip = iget(dev, inum)

ilock(ip)

... examine and modify ip->xxx ...

iunlock(ip)

iput(ip)

void iput(struct inode *ip)

[Figure: the inode array in a disk block, the buffer cache in memory, and the icache entry ip]

The inode in the inode cache is written to disk; this is done by iupdate.

void iput(struct inode *ip)

Decrements ref, the reference count of the in-memory inode.

When the reference count reaches 0, the slot in icache can be recycled.

If nlink is 0 (no links) and the caller holds the last reference (ref is 1 before the decrement), iput

frees the inode on disk:

it frees all blocks associated with the inode (itrunc),

sets the in-memory inode type to unused (0), and logs the updated inode to disk (iupdate).
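The condition under which iput frees the on-disk inode can be written as a small predicate. This is an illustrative sketch; `iput_should_free` is a hypothetical helper, not a function of xv6. Note that the check happens before the decrement, so "last reference" means ref equals 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when iput should free the on-disk inode:
   the inode has been loaded, no directory entry names it,
   and the caller holds the only remaining reference. */
static bool iput_should_free(int valid, int nlink, int ref)
{
    assert(ref >= 1);   /* the caller still holds its own reference */
    return valid && nlink == 0 && ref == 1;
}
```

An unlinked file that another process still has open (ref of 2) is therefore kept until the last reference is dropped.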

iput

void iput(struct inode *ip)
{
    acquiresleep(&ip->lock);
    if(ip->valid && ip->nlink == 0){
        acquire(&icache.lock);
        int r = ip->ref;
        release(&icache.lock);
        if(r == 1){
            // No links and no other references: free the inode.
            itrunc(ip);       // free all data blocks of the file
            ip->type = 0;     // 0 means unused inode
            iupdate(ip);      // write the modified inode to disk
            ip->valid = 0;
        }
    }
    releasesleep(&ip->lock);

    acquire(&icache.lock);
    ip->ref--;                // drop the caller's reference
    release(&icache.lock);
}

If nlink is 0 and this is the last reference, the inode is released.

In every case, the reference count of the inode is decreased by one.

itrunc(struct inode *ip)

1. Free the data blocks pointed to by the direct pointers.

2. Free the data blocks pointed to by the entries of the indirect block.

3. Free the indirect block itself.

4. Set the file size to 0.

5. Store the updated inode on disk (iupdate).
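The five steps above can be exercised on an in-memory model. The sketch assumes NDIRECT is 12 and NINDIRECT is 128 (as in stock xv6 with 512-byte blocks), keeps the single indirect block in a static array, and only counts frees instead of updating a block bitmap; `toy_inode`, `toy_bfree` and `toy_itrunc` are hypothetical names.

```c
#include <assert.h>

#define NDIRECT   12    /* assumed, as in stock xv6 */
#define NINDIRECT 128   /* 512-byte block / 4-byte block number */

struct toy_inode {
    unsigned size;
    unsigned addrs[NDIRECT + 1];      /* last entry: indirect block number */
};

static unsigned indirect[NINDIRECT];  /* contents of the one indirect block */
static int freed;                     /* number of simulated bfree() calls */

static void toy_bfree(unsigned b)
{
    assert(b != 0);                   /* block 0 is never a data block here */
    freed++;
}

static void toy_itrunc(struct toy_inode *ip)
{
    for (int i = 0; i < NDIRECT; i++) {           /* 1. direct blocks */
        if (ip->addrs[i]) {
            toy_bfree(ip->addrs[i]);
            ip->addrs[i] = 0;
        }
    }
    if (ip->addrs[NDIRECT]) {
        for (int j = 0; j < NINDIRECT; j++) {     /* 2. blocks listed in the indirect block */
            if (indirect[j]) {
                toy_bfree(indirect[j]);
                indirect[j] = 0;
            }
        }
        toy_bfree(ip->addrs[NDIRECT]);            /* 3. the indirect block itself */
        ip->addrs[NDIRECT] = 0;
    }
    ip->size = 0;                                 /* 4. the file is now empty */
    /* 5. the real code writes the inode back with iupdate(ip) at this point */
}
```

For a file with 12 direct blocks and 3 blocks behind the indirect block, the model frees 12 + 3 + 1 = 16 blocks.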

[Figure, three animation steps: itrunc marks the blocks of the file as free, a few more at each step]

itrunc

static void itrunc(struct inode *ip)
{
    int i, j;
    struct buf *bp;
    uint *a;

    // Free the blocks named by the direct pointers.
    for(i = 0; i < NDIRECT; i++){
        if(ip->addrs[i]){
            bfree(ip->dev, ip->addrs[i]);
            ip->addrs[i] = 0;
        }
    }

    // Free the blocks listed in the indirect block, then the indirect block itself.
    if(ip->addrs[NDIRECT]){
        bp = bread(ip->dev, ip->addrs[NDIRECT]);
        a = (uint*)bp->data;
        for(j = 0; j < NINDIRECT; j++){
            if(a[j])
                bfree(ip->dev, a[j]);
        }
        brelse(bp);
        bfree(ip->dev, ip->addrs[NDIRECT]);
        ip->addrs[NDIRECT] = 0;
    }

    ip->size = 0;
    iupdate(ip);
}

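iupdate(struct inode *ip)

Updates the on-disk inode.

[Figure: the inode array in a disk block, the buffer cache in memory, and the icache entry ip]

1. bread: read the disk block that holds the inode into the buffer cache.

2. copy: copy the fields of the in-memory inode ip into the buffer.

3. log_write: record the modified buffer so that it is written to disk.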

void iupdate(struct inode *ip)

Copies a modified in-memory inode to disk.

Must be called after every change to an ip->xxx field that lives on disk, since the inode cache is write-through.

The caller must hold ip->lock.

void iupdate(struct inode *ip)

void iupdate(struct inode *ip)
{
    struct buf *bp;
    struct dinode *dip;

    bp = bread(ip->dev, IBLOCK(ip->inum, sb));
    dip = (struct dinode*)bp->data + ip->inum%IPB;
    dip->type = ip->type;
    dip->major = ip->major;
    dip->minor = ip->minor;
    dip->nlink = ip->nlink;
    dip->size = ip->size;
    memmove(dip->addrs, ip->addrs, sizeof(ip->addrs));
    log_write(bp);
    brelse(bp);
}

iput and crash

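Putting everything together: filewrite()

int filewrite(struct file *f, char *addr, int n) {
    int r, i = 0;
    // Write a few blocks at a time so that one transaction fits in the log
    // (limit as in the xv6 source; checks on the file type are omitted here).
    int max = ((MAXOPBLOCKS-1-1-2) / 2) * 512;

    while(i < n){
        int n1 = n - i;       // bytes still to write
        if(n1 > max)
            n1 = max;
        begin_op();
        ilock(f->ip);
        if ((r = writei(f->ip, addr + i, f->off, n1)) > 0)
            f->off += r;
        iunlock(f->ip);
        end_op();
        if(r < 0)
            break;
        i += r;
    }
    return i == n ? n : -1;
}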

readi/writei/stati

Now let's look at how the real system calls are implemented using iget/iput/memmove.

readi: read data from the file

writei: write data to the file

stati: get the inode's metadata (stat information)

readi(struct inode *ip, char *dst, uint off, uint n)

Reads n bytes into dst, starting at offset off of ip.

If the inode is a device, the device's read function reads directly into the user buffer.

If the inode represents a file, each block is first read into the buffer cache and then copied to the user buffer.
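The buffered path has to split a request at block boundaries: each iteration copies at most the bytes that remain in the current block. The sketch below models that chunking for a regular file held in an in-memory array of blocks. It is an illustration under stated assumptions, not the xv6 code; `blocks`, `min_u` and `toy_readi` are hypothetical names, and the file size is passed in instead of being read from an inode.

```c
#include <assert.h>
#include <string.h>

#define BSIZE 512
#define NBLK  4

static unsigned char blocks[NBLK][BSIZE];   /* fake file data, block by block */

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }

/* Toy readi for a regular file of `size` bytes stored in blocks[]. */
static int toy_readi(unsigned size, char *dst, unsigned off, unsigned n)
{
    unsigned tot, m;

    assert(size <= NBLK * BSIZE);
    if (off > size || off + n < off)    /* offset past the end, or overflow */
        return -1;
    if (off + n > size)
        n = size - off;                 /* clamp the request to the end of file */

    for (tot = 0; tot < n; tot += m, off += m, dst += m) {
        m = min_u(n - tot, BSIZE - off % BSIZE);   /* stay inside one block */
        memcpy(dst, blocks[off / BSIZE] + off % BSIZE, m);
    }
    return (int)n;
}
```

Reading 8 bytes at offset 508 takes two iterations: 4 bytes from the end of block 0 and 4 bytes from the start of block 1. A request that runs past the end of the file is shortened, and an offset beyond the file size is rejected with -1.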

int
readi(struct inode *ip, char *dst, uint off, uint n)
{
    uint tot, m;
    struct buf *bp;

    // Device: hand the request to the device's read function.
    if(ip->type == T_DEV){
        if(ip->major < 0 || ip->major >= NDEV || !devsw[ip->major].read)
            return -1;
        return devsw[ip->major].read(ip, dst, n);
    }

    if(off > ip->size || off + n < off)
        return -1;
    if(off + n > ip->size)
        n = ip->size - off;

    // File: read each block through the buffer cache, then copy it out.
    for(tot=0; tot<n; tot+=m, off+=m, dst+=m){
        bp = bread(ip->dev, bmap(ip, off/BSIZE));
        m = min(n - tot, BSIZE - off%BSIZE);
        memmove(dst, bp->data + off%BSIZE, m);
        brelse(bp);
    }
    return n;
}

raw device read vs. buffered read

writei(struct inode *ip, char *src, uint off, uint n)

Writes n bytes from src to ip, starting at offset off.

If the inode is a device, the device's write function writes directly from the user buffer.

If the inode represents a file, the data is first written to the buffer cache, and then log_write() is called.
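The buffered write path can be modeled in the same way as the read path, with two additions: every touched block is logged, and the file size grows when the write extends past the old end of the file. The sketch below is an assumption-laden model, not the xv6 code: the file lives in a static array, MAXFILE is reduced to 4 blocks, `logged` merely counts the calls that would go to log_write(), and `toy_writei` and `file_size` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define BSIZE   512
#define MAXFILE 4                        /* tiny limit for the sketch */

static unsigned char blocks[MAXFILE][BSIZE];
static unsigned file_size;               /* plays the role of ip->size */
static int logged;                       /* number of simulated log_write() calls */

static int toy_writei(const char *src, unsigned off, unsigned n)
{
    unsigned tot, m;

    assert(src != 0);
    if (off > file_size || off + n < off)    /* no holes, no overflow */
        return -1;
    if (off + n > MAXFILE * BSIZE)           /* would exceed the maximum file size */
        return -1;

    for (tot = 0; tot < n; tot += m, off += m, src += m) {
        m = BSIZE - off % BSIZE;             /* room left in the current block */
        if (m > n - tot)
            m = n - tot;
        memcpy(blocks[off / BSIZE] + off % BSIZE, src, m);
        logged++;                            /* one log_write per block touched */
    }
    if (n > 0 && off > file_size)
        file_size = off;                     /* the real code also calls iupdate() */
    return (int)n;
}
```

Writing 5 bytes and then 600 more bytes at offset 5 touches three blocks in total and leaves the size at 605. A write that starts beyond the current end of the file is rejected, because the model, like the code on the next slide, does not create holes.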

int
writei(struct inode *ip, char *src, uint off, uint n)
{
    uint tot, m;
    struct buf *bp;

    // Device: hand the request to the device's write function.
    if(ip->type == T_DEV){
        if(ip->major < 0 || ip->major >= NDEV || !devsw[ip->major].write)
            return -1;
        return devsw[ip->major].write(ip, src, n);
    }

    if(off > ip->size || off + n < off)
        return -1;
    if(off + n > MAXFILE*BSIZE)
        return -1;

    // File: modify each block in the buffer cache and log the change.
    for(tot=0; tot<n; tot+=m, off+=m, src+=m){
        bp = bread(ip->dev, bmap(ip, off/BSIZE));
        m = min(n - tot, BSIZE - off%BSIZE);
        memmove(bp->data + off%BSIZE, src, m);
        log_write(bp);
        brelse(bp);
    }

    // The file grew: update the size and write the inode through to disk.
    if(n > 0 && off > ip->size){
        ip->size = off;
        iupdate(ip);
    }
    return n;
}

summary

inode structures: dinode (on disk), inode (in memory)

iget, iput

ilock/iunlock

Updates to the on-disk fields of an inode entry are write-through, via iupdate().

Protection

icache.lock: a spin lock that protects changes to the in-memory fields of the inodes

inode.lock: a sleep lock that synchronizes changes to the in-memory and on-disk inodes