FRASH: Exploiting Storage Class Memory in Hybrid File System for Hierarchical Storage

Jaemin Jung, Youjip Won†

Hanyang University, Seoul, Korea

and

Eunki Kim‡, Hyungjong Shin‡, Byeonggil Jeon‡

Samsung Electronics, Suwon, Korea

In this work, we develop a novel hybrid file system, FRASH, for Storage-Class memory and NAND Flash. Despite the promising physical characteristics of Storage-Class memory, its scale is an order of magnitude smaller than that of current storage devices. This makes it less than desirable for use as an independent storage device. We carefully analyze the in-memory and on-disk file system objects of a log-structured file system, and exploit both the memory and storage aspects of Storage-Class memory to overcome the drawbacks of the current log-structured file system. FRASH provides a hybrid view of Storage-Class memory: it harbors in-memory data structures as well as on-disk structures. It provides non-volatility to key data structures which have been maintained in memory in a legacy log-structured file system. This approach greatly improves the mount latency and effectively resolves the robustness issue. By maintaining on-disk structures in Storage-Class memory, FRASH provides byte-addressability to file system objects and to per-page metadata, and subsequently greatly improves I/O performance compared to the legacy log-structured approach. While Storage-Class memory offers byte granularity, it is still far slower than its DRAM counterpart. We develop a Copy-On-Mount technique to overcome the access latency difference between main memory and Storage-Class memory. Our file system reduces the mount time by 92% and increases file system I/O performance by 16%.

Categories and Subject Descriptors: D.4.2 [Operating System]: Storage Management; D.4.3 [Operating System]: File Systems Management

General Terms: Storage-Class Memory, File System

Additional Key Words and Phrases: Flash Storage, Log-structured File System

1. INTRODUCTION

1.1 Motivation

Storage-Class memory is a next-generation memory device which preserves data without electricity and which can be accessed at byte granularity. There exist several semiconductor technologies for Storage-Class memory devices. These include PRAM (Phase Change RAM), FRAM (Ferro-electric RAM), MRAM (Magnetic RAM), RRAM (Resistive RAM), and Solid Electrolyte [Freitas et al. 2008]. All these technologies are in the inception stage. It is currently too early to determine which of these semiconductor devices will be the most marketable. Once realized to proper

Author's address: Jaemin Jung, Youjip Won, Dept. of Electrical and Computer Engineering, Hanyang University, Seoul, Korea; Eunki Kim, Hyungjong Shin, Byeonggil Jeon, Samsung Electronics, Suwon, Korea.
† Corresponding Author
‡ This work was performed while the authors were graduate students at Hanyang University.

Submitted to ACM Transactions on Storage

Fig. 1. NVRAM Technology Trend: FRAM [Nikkei] and MRAM [NEDO]. (Plot of density in Mbit, on a logarithmic axis from 1 to 10000, against years 2004 to 2014.)

scale, Storage-Class memory is going to resolve most of the technical issues which currently confound storage system administrators, e.g. reliability, heat, power consumption, and speed [Schlack 2004]. However, these devices still leave much to be desired as independent storage devices due to their scale (Fig. 1). The largest FRAM and MRAM devices are 64 Mbit [Kang et al. 2006] and 4 Mbit [Freescale], respectively.

Parallel to the advancement of Storage-Class memory, Flash based storage is now positioned as one of the key constituents of computer systems. Its usage ranges from storage for mobile embedded devices, e.g. MP3 players and portable multimedia players, to storage for enterprise servers. Flash based storage is carefully envisioned as a possible replacement for the legacy hard disk based storage system. While Flash based storage devices effectively address a number of technical issues, Flash still has two fundamental drawbacks: existing data cannot be overwritten in place, and the number of erase cycles is limited. The log-structured file system technique [Rosenblum and Ousterhout 1992] and the FTL (Flash Translation Layer) [Intel] have been proposed to address these issues. The problems with the log-structured file system are its memory requirement and long mount latency. Since FTL is usually implemented in hardware, it consumes more power than the log-structured file system approach. Also, FTL does not perform well under small random write workloads [Kim and Ahn 2008]. The drawbacks of the log-structured file system become more significant as the Flash device becomes larger.

In this work, we exploit the physical characteristics of Storage-Class memory and use it to effectively address the drawbacks of the log-structured file system. We develop a storage system which consists of Storage-Class memory and Flash storage, and develop a hybrid file system, FRASH, for it. Storage-Class memory is byte-addressable, non-volatile and very fast. It can be integrated into the system via a standard DRAM interface or via a high speed I/O interface, e.g. PCI. Storage-Class memory can be accessed through the memory address space or through the file system name space. These characteristics pose an important technical challenge which has not been addressed before. Three key technical issues require elaborate treatment in developing the hybrid file system. First, we need to determine the appropriate hierarchy for each of the file system components. Second, when the storage system consists of multiple hierarchies, the file system objects for each hierarchy need to be tailored to effectively incorporate the physical characteristics of the device. We need to develop appropriate data structures for the file system objects which reside at the Storage-Class memory layer. Third, we need to determine whether to use Storage-Class memory as storage or as memory.

Our work distinguishes itself from existing works and makes significant contributions in a number of aspects. First, different from existing hybrid file systems for byte-addressable NVRAM, FRASH imposes a hybrid view on byte-addressable NVRAM: it uses byte-addressable NVRAM both as storage and as a memory device. As storage, we carefully analyze the access characteristics of the individual fields of the metadata. Based upon these characteristics, we categorize them into two sets, which are maintained in byte-addressable NVRAM and in NAND flash, respectively. The FRASH file system is designed to maintain metadata in byte-addressable NVRAM, effectively exploiting its access characteristics. As memory, byte-addressable NVRAM also harbors in-core data structures which are dynamically constructed, e.g. object and PAT. By giving persistency to in-core data structures, FRASH relieves the overhead of creating and initializing them at the file system mount phase. This approach makes the file system faster and robust against unexpected failure. Second, we address the speed difference between DRAM and byte-addressable NVRAM. Despite its promising physical characteristics, byte-addressable NVRAM is far slower than DRAM. As it currently stands, it is infeasible for byte-addressable NVRAM to replace the role of DRAM. None of the existing works properly addressed this issue. In this work, we propose the Copy-On-Mount technique to address it. Third, few works have implemented a physical hierarchical storage and hybrid file system and performed a comprehensive analysis of the various approaches to using byte-addressable NVRAM in hierarchical storage. In this work, we physically built two other file systems which utilize byte-addressable NVRAM either as a memory device or as a storage device, and performed a comprehensive analysis of three different ways of exploiting byte-addressable NVRAM in hierarchical storage. We update the manuscript as follows.

The notion of hierarchical storage is not a new concept and has been around for more than a couple of decades. There are numerous preceding works that form storage with multiple hierarchies. The hierarchical storage can consist of disk and tape drive [Wilkes et al. 1996; Lau and Lui 1997], fast disk and slow disk [Deshpande and Bunt], NAND flash and hard disk [Kgil et al. 2008], byte-addressable NVRAM and HDD [Miller et al. 2001; Wang et al. 2006], or byte-addressable NVRAM and NAND flash [Kim et al. 2007; Doh et al. 2007; Park et al. 2008]. All these works aim at maximizing performance (access latency and I/O bandwidth) and reliability while minimizing TCO (Total Cost of Ownership) by exploiting the access characteristics of the underlying files.

A significant fraction of file system I/O operations concerns file system metadata, e.g. the superblock, inodes, directory structure, and various bitmaps. These objects are much smaller than a block, e.g. a superblock is about 300 bytes and an inode is 128 bytes.

Recent advances in memory devices which are non-volatile and byte-addressable make it possible to maintain the storage hierarchy at a granularity smaller than a block. A number of works propose to exploit the byte-addressability and non-volatility of the new semiconductor devices in hierarchical storage [Miller et al. 2001; Kim et al. 2007; Doh et al. 2007; Park et al. 2008]. These file systems improve performance by maintaining small objects, e.g. file system metadata, file inodes, attributes, and bitmaps, in the byte-addressable NVRAM layer. Since byte-addressable NVRAM is much faster than existing block devices, e.g. NAND flash and HDD, maintaining frequently accessed objects and small files in byte-addressable NVRAM can improve performance significantly. The objective of this work is to develop a hybrid file system for hierarchical storage which consists of byte-addressable NVRAM and a NAND Flash device. None of the existing works properly exploits the storage aspect and the memory aspect of byte-addressable NVRAM simultaneously in its hybrid file system design. These works proposed either to migrate the on-disk structures onto byte-addressable NVRAM or to maintain some of the in-core structures in byte-addressable NVRAM. We impose a hybrid view on byte-addressable NVRAM, and the file system is designed to properly exploit its physical characteristics. None of the existing works properly incorporates the bandwidth and latency difference between DRAM and byte-addressable NVRAM in maintaining in-core file system objects. Despite many proposals to maintain metadata directly in byte-addressable NVRAM [Doh et al. 2007; Park et al. 2008], we find this approach practically infeasible because of the speed of byte-addressable NVRAM. Byte-addressable NVRAM is far slower than DRAM, and from the performance point of view it is much better to maintain metadata objects in DRAM. Most of the existing works on hierarchical storage with byte-addressable NVRAM focus on using byte-addressable NVRAM to harbor on-disk data structures, e.g. inodes, metadata, and superblocks. For the file system to use these objects properly, it still has to transform them into a memory-friendly format. This procedure takes a significant amount of time, especially when the file system needs to scan multiple objects from the storage device and create summary information in memory. The log-structured file system [Rosenblum and Ousterhout 1992; Manning 2001; jff] is a typical example.

By maintaining in-memory structures in byte-addressable NVRAM, we are able to provide persistency to them. We can reduce the overhead of saving (restoring) the in-memory data structures to (from) the disk. Also, the file system becomes much more robust against unexpected system failure, and the recovery overhead becomes smaller. By maintaining file metadata and page metadata in byte-addressable NVRAM, file access becomes much faster and the number of expensive 'write' operations on the flash device is reduced. Second, we develop a technique to overcome the access latency issue. While byte-addressable NVRAM delivers rich bandwidth and small access latency, it is still far slower than DRAM. In the case of PRAM, read and write are 2 to 3 times and 10 times slower than DRAM, respectively. We develop the Copy-On-Mount technique to fill the performance gap between DRAM and byte-addressable NVRAM. Third, all algorithms and data structures developed in this study are examined via comprehensive physical experiments. We build a hierarchical storage with a 64 Mbit FRAM (the largest one currently available) and NAND flash, and develop the hybrid file system FRASH on Linux 2.4. For comprehensiveness of the test, we developed two other file systems which use FRAM to maintain only in-memory objects and only on-disk objects, respectively.

1.2 Related Works

Reducing the file system mount latency has been an issue for more than a decade. Consumer electronics is one of the typical areas where file system mount latency is critical. A growing number of consumer electronics products are equipped with a microprocessor and a storage device, e.g. cell phones, digital cameras, MP3 players, set-top boxes, and IP TVs. A significant fraction of these devices adopts NAND flash based storage and uses a log-structured file system to manage it. As the size of the flash device increases, the overhead of mounting a flash file system partition becomes more significant, and so does the overhead of file system recovery. There have been a number of works to reduce the file system mount latency on NAND flash devices. [Yim et al. 2005] and [Bityuckiy] used a file system snapshot to expedite the file system mount procedure. These file systems dedicate a certain region of the flash device to the file system snapshot and store the snapshot in a regular fashion. With this technique, it takes more time to unmount the file system. [Park et al. 2006] divide flash memory into two regions: the location information area and the data area. At the mount phase, they construct the main memory structures from the location information area. Even though the location information area reduces the area to scan, the mount time is still proportional to the flash memory size. [Wu et al. 2006] proposed a method for efficient initialization and crash recovery for flash-memory file systems. At the mount phase, it scans a check region which is located at a fixed part of the flash memory. Most NAND flash file systems use the 'page' as the basic unit and maintain metadata for each page. To reduce the overhead of maintaining metadata for individual pages, MNFS [Kim et al. 2009] uses the 'block' as the basic building block. Since MNFS requires only one access to the spare area for each block at the mount phase, the mount time is reduced. MiNVFS [Doh et al. 2007] also improved file system mount speed with byte-addressable NVRAM.

A number of works proposed hybrid file systems for byte-addressable NVRAM and HDDs [Miller et al. 2001; Wang et al. 2006]. Miller et al. proposed to use a byte-addressable NVRAM file system [Miller et al. 2001], in which byte-addressable NVRAM is used as storage for file system metadata, as a write buffer, and as storage for the front parts of files. In the Conquest file system [Wang et al. 2006], the byte-addressable NVRAM layer harbors metadata, small files, and executable files. Conquest proposed to use existing memory management algorithms, e.g. the slab allocator and the buddy algorithm, for byte-addressable NVRAM. In its performance experiment, Conquest used battery-backed DRAM to emulate byte-addressable NVRAM. In reality, byte-addressable NVRAM is two to ten times slower than legacy DRAM, so it is not clear how Conquest will behave under a realistic setting. Another set of works proposed hybrid file systems for byte-addressable NVRAM and NAND flash. These file systems focus on addressing NAND flash file system specific issues using byte-addressable NVRAM [Kim et al. 2007; Doh et al. 2007; Park et al. 2008]. These issues include mount latency, recovery overhead after unexpected system failure, and the overhead of accessing page metadata on the NAND flash device. Kim et al. [Kim et al. 2007] store file system metadata and the spare area of NAND flash memory in FRAM. They do not exploit the memory aspect of byte-addressable NVRAM.

MiNVFS [Doh et al. 2007] and PFFS [Park et al. 2008] store file system metadata in byte-addressable NVRAM and file data in NAND flash memory. They access byte-addressable NVRAM directly during file system operation. This direct access to byte-addressable NVRAM makes the mount latency independent of the file system size. These file systems exhibit significant improvement in mount latency. However, it is practically infeasible to maintain objects directly in byte-addressable NVRAM due to its slow speed. Jung et al. proposed to impose a block device abstraction on NVRAM [Jung et al. 2009]. They suggested that write access to NVRAM could be made reliable by a simple block device abstraction with atomicity support.

Our work distinguishes itself from existing works and makes significant contributions in a number of aspects. First, different from existing hybrid file systems for byte-addressable NVRAM, FRASH imposes a hybrid view on byte-addressable NVRAM: it uses byte-addressable NVRAM both as storage and as a memory device. As storage, byte-addressable NVRAM harbors various metadata for files and for the file system. As memory, byte-addressable NVRAM harbors in-core data structures which are dynamically constructed at the file system mount phase. By giving persistency to in-core data structures, FRASH relieves the overhead of creating and initializing them at the file system mount phase. This approach makes the file system faster and robust against unexpected failure. Existing works do not address the latency characteristics of byte-addressable NVRAM and assume that these devices are as fast as DRAM. Accordingly, these works proposed to maintain in byte-addressable NVRAM various objects which used to reside in main memory. In practice, however, byte-addressable NVRAM is far slower than DRAM (Table I). From the file system's point of view, it is practically infeasible to simply migrate the in-core objects to byte-addressable NVRAM and maintain them there. In our work, we carefully incorporate the latency characteristics of byte-addressable NVRAM and propose a file system technique called Copy-On-Mount to overcome the latency difference between byte-addressable NVRAM and DRAM. In this work, we physically built two other file systems which utilize byte-addressable NVRAM either as a memory device or as a storage device, and performed a comprehensive analysis of three different ways of exploiting byte-addressable NVRAM in hierarchical storage.

The rest of this paper is organized as follows. Section 2 introduces the Flash and byte-addressable NVRAM device technologies. Section 3 deals with the log-structured file system technique for Flash storage. Section 4 explains the technical issues in adopting Storage-Class memory in the Operating System. Section 5 explains the design of the FRASH file system. Section 6 explains the details of the hardware system development for FRASH. Section 7 discusses the results of the performance experiments. Section 8 concludes the paper.

2. NVRAM (NON-VOLATILE RAM) TECHNOLOGY

2.1 Flash Memory

A Flash device is a type of EEPROM which can retain data without power. There are two types of Flash storage: NAND Flash and NOR Flash. The unit cell structures of NOR flash and NAND flash are the same (Fig. 2(a) and Fig. 2(b)). The unit cell

Item               DRAM    FRAM     PRAM      MRAM    NOR        NAND
Byte addressable   YES     YES      YES       YES     Read only  NO
Non-volatile       NO      YES      YES       YES     YES        YES
Read               10 ns   70 ns    68 ns     35 ns   85 ns      15 us
Write              10 ns   70 ns    180 ns    35 ns   6.5 us     200 us
Erase              none    none     none      none    700 ms     2 ms
Power consumption  High    Low      High      Low     High       High
Capacity           High    Low      High      Low     High       Very high
Endurance          10^15   10^15    > 10^7    10^15   100K       100K
Prototype size     -       64 Mbit  512 Mbit  4 Mbit  -          -

Table I. Comparison of Non-volatile RAM Characteristics

Fig. 2. Cell Schematics of NVRAM's: (a) NAND and (b) NOR, each with B/L, W/L and Source lines; (c) FRAM, with B/L, W/L and P/L; (d) PRAM, with B/L, W/L and VSS.

is composed of only one transistor having a floating gate. When the transistor is turned on or off, the data status of the cell is defined as 1 or 0, respectively. The cell array of NOR flash consists of a parallel connection of several unit cells. It provides full address and data buses, allowing random access to any memory location. NOR flash can perform byte-addressable operations and has faster read/write speed than NAND flash. However, because of the byte-addressable cell array structure, NOR flash has slower erase speed and lower capacity than NAND flash.

A cell-string of NAND flash memory generally consists of a serial connection of several unit cells to reduce the cell area. The page, which is generally composed of 512 bytes of data and 16 bytes of spare cells (or 2048 bytes of data and 64 bytes of spare cells), is organized with a number of unit cells in a row. It is the unit for the read/write operation. The block, which is composed of 32 pages (or 64 pages for the 2048-byte page), is the base unit for the erase operation. The erase operation requires high voltage and has longer latency. It sets all the cells of the block to data 1. A unit cell is changed from 1 to 0 when the write data is 0, but there is no change when the write data is 1. NAND flash has faster erase and write times and requires a smaller chip area per cell, thus allowing greater storage density and lower cost per bit than NOR flash. The I/O interface of NAND flash does not provide a random-access external address bus, and therefore read and write operations are also performed in units of a page. From an Operating System's point of view, NAND flash looks similar to other secondary storage devices and is thus very suitable for use in mass-storage devices.
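The program and erase rules described above can be illustrated with a small simulation. This is only a toy model written for this explanation, not part of FRASH or of any driver: the class name `NandBlock` and its methods are hypothetical, the spare area is omitted, and the geometry assumes the 512-byte page and 32-page block mentioned in the text.

```python
PAGE_SIZE = 512          # data bytes per page (small-page NAND, as in the text)
PAGES_PER_BLOCK = 32     # pages per erase block for 512-byte pages


class NandBlock:
    """Toy model of one NAND erase block (hypothetical, spare area omitted)."""

    def __init__(self):
        # An erased block reads back as all 1 bits (0xFF in every byte).
        self.pages = [bytearray(b"\xff" * PAGE_SIZE) for _ in range(PAGES_PER_BLOCK)]

    def program(self, page_no, data):
        """Program one full page: a cell can only change from 1 to 0."""
        if len(data) != PAGE_SIZE:
            raise ValueError("NAND is programmed one full page at a time")
        page = self.pages[page_no]
        for i, byte in enumerate(data):
            page[i] &= byte   # a 0 bit clears the cell, a 1 bit leaves it unchanged

    def read(self, page_no):
        return bytes(self.pages[page_no])

    def erase(self):
        """Erase works on the whole block and resets every cell to 1."""
        for page in self.pages:
            page[:] = b"\xff" * PAGE_SIZE


block = NandBlock()
block.program(0, bytes([0xF0]) * PAGE_SIZE)
block.program(0, bytes([0x0F]) * PAGE_SIZE)   # attempted overwrite without erase
overwritten = block.read(0)                   # 0xF0 & 0x0F == 0x00 in every byte
block.erase()
erased = block.read(0)                        # all 0xFF again
```

The second `program` call does not store `0x0F`; it yields the bitwise AND of the old and new data. This is why existing data cannot be updated in place and a whole block must be erased first.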

The major drawback of a Flash device is the limited number of erase operations (known as endurance, which is typically 100K cycles). This limit is a fundamental property of the floating gate. It is important that all NAND flash cells go through a similar number of erase cycles to maximize the lifetime of the individual cells. Therefore, NAND devices require bad block management. A number of blocks on the flash chip are set aside for storing mapping tables to deal with bad blocks. The error-correcting and detecting checksum will typically correct an error where one bit per 256 bytes (2,048 bits) is incorrect. When this happens, the block is marked bad in a logical block allocation table, its undamaged contents are copied to a new block, and the logical block allocation table is altered accordingly.
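The remapping step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme of any particular device: the class `BadBlockManager`, the use of the last blocks as the reserved pool, and the in-memory `storage` dictionary standing in for the flash array are all hypothetical.

```python
class BadBlockManager:
    """Logical block allocation table with a pool of set-aside spare blocks."""

    def __init__(self, total_blocks, reserved):
        usable = total_blocks - reserved
        self.table = {lb: lb for lb in range(usable)}          # logical -> physical
        self.spares = list(range(usable, total_blocks))        # set-aside blocks
        self.bad = set()                                       # retired physical blocks
        self.storage = {pb: [] for pb in range(total_blocks)}  # stand-in for flash

    def write(self, logical, pages):
        self.storage[self.table[logical]] = list(pages)

    def read(self, logical):
        return self.storage[self.table[logical]]

    def retire(self, logical, undamaged_pages):
        """Mark the current physical block bad and remap the logical block."""
        if not self.spares:
            raise RuntimeError("no spare blocks left")
        old = self.table[logical]
        new = self.spares.pop(0)
        self.storage[new] = list(undamaged_pages)  # copy the surviving contents
        self.bad.add(old)                          # mark the old block bad
        self.table[logical] = new                  # alter the allocation table
        return new


mgr = BadBlockManager(total_blocks=8, reserved=2)
mgr.write(3, ["page-a", "page-b"])
new_block = mgr.retire(3, mgr.read(3))   # an ECC error was reported on block 3
```

After `retire`, readers of logical block 3 are served from the spare block, and physical block 3 is never allocated again.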

2.2 Storage-Class Memory

There are a number of emerging technologies for byte-addressable NVRAM. These include FRAM (Ferro-electric RAM), PCRAM (Phase-change RAM, written PRAM elsewhere in this paper), MRAM (Magneto-resistive RAM), SE (Solid Electrolyte) and RRAM (Resistive RAM) [Freitas et al. 2008].

FRAM (Ferro-electric RAM) [Kang et al. 2006] has ideal characteristics such as low power consumption, fast read/write speed, random access, radiation hardness, and non-volatility. Among MRAM, PRAM, and FRAM, FRAM is the most mature technology, and a small-density device is already commercially available.

The unit cell of FRAM consists of one transistor and one ferro-electric capacitor (FCAP) (Fig. 2(c)); it is known as 1T1C and has the same schematic as DRAM. Since the charge of the FCAP retains its original polarity without power, FRAM can maintain its stored data in the absence of power. Unlike DRAM, FRAM does not need a refresh operation and subsequently consumes less power. A write operation can be performed by forcing a pulse to the FCAP through P/L or B/L for data "0" or data "1", respectively. Since the voltage of P/L and B/L for the write operation is the same as Vcc, FRAM does not need an additional high voltage as NAND flash memory does. This property enables FRAM to perform the write operation in a much faster and simpler way. FRAM design can be very versatile. It can be designed to be compatible with an SRAM interface as well as a DRAM interface. Asynchronous, synchronous, or DDR FRAM can be designed.

PRAM [Raoux et al. 2008] consists of one transistor and one variable resistor (Fig. 2(d)). The variable resistor is made of GST (GeSbTe, Germanium-Antimony-Tellurium) material and acts as the storage element. The resistance of the GST material varies with its crystallization status; it can be converted to a crystalline (low resistance) or to an amorphous (high resistance) structure by forcing current through B/L to Vss. This mechanism is used as the write method of PRAM. Due to this conversion overhead, the write operation of PRAM takes more time and current than the read operation. This is the essential drawback of the PRAM device. The read operation can be performed by sensing the current difference through B/L to Vss. Even though the write is much slower than the read operation, PRAM does not require an erase operation. It is expected that its storage density will soon be able to compete with that of NOR flash, and PRAM is being considered as a future replacement for NOR flash memory. Contrary to PRAM, FRAM has good access characteristics. It is much faster than PRAM, and its read speed and write speed are almost identical.

Table I summarizes the characteristics of Storage-Class memory technologies. The current state of the art in Storage-Class memory technology still leaves much to be desired for storage in a generic computing environment. This is mainly due to the scale of Storage-Class memory devices, which is much smaller than 1% of that of existing Solid State Disks.

3. LOG-STRUCTURED FILE SYSTEM FOR FLASH STORAGE

Fig. 3. On-disk data structure and in-memory data structure in log-structured file system for NAND Flash. (Labels in the figure: Flash Device, holding file metadata pages, file data pages and empty pages; Main Memory, holding Objects with Parent and Physical Address Translation Information.)

The log-structured file system [Rosenblum and Ousterhout 1992] maintains the file system partition as an append-only log. The key idea is to collect small write operations into a single large unit, e.g. a page, and append it to the existing log. The objective of this approach is to minimize the disk overhead (particularly seek) for small writes. In Flash storage, erase takes approximately ten times longer than the write operation (Table I). A number of Flash file systems exploit the log-structured approach [jff; Manning 2001] to address this issue. Fig. 3 illustrates the organization of file system data structures in a log-structured file system for Flash storage. In a log-structured file system, the file system maintains in-memory data structures to keep track of the valid location of each file system block. There are two data structures for this purpose. The first one is the directory structure for all files in a file system partition. The second one is the location of the data blocks of the individual files. A leaf node of the directory tree corresponds to a file. The file structure maintains a tree-like data structure for the pages belonging to it. A leaf node of this tree contains the physical location of the respective page. Fig. 5 illustrates the relationship among the directory, files and data blocks.
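The append-only behaviour and the in-memory location map can be sketched in a few lines. This is a simplified illustration, not the data layout of any of the cited file systems: the class `FlashLog` is hypothetical, a flat dictionary stands in for the per-file tree, and cleaning of stale pages is left out.

```python
class FlashLog:
    """Append-only log of pages plus an in-memory map of valid locations."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages   # physical pages; None means empty
        self.head = 0                     # next free physical page
        self.location = {}                # (file_id, page_id) -> physical page
        self.stale = set()                # physical pages holding superseded data

    def write(self, file_id, page_id, data):
        if self.head >= len(self.pages):
            raise RuntimeError("log full: cleaning (erase) would be needed")
        phys = self.head
        self.pages[phys] = (file_id, page_id, data)   # always append, never overwrite
        self.head += 1
        old = self.location.get((file_id, page_id))
        if old is not None:
            self.stale.add(old)           # the previous copy becomes garbage
        self.location[(file_id, page_id)] = phys
        return phys

    def read(self, file_id, page_id):
        return self.pages[self.location[(file_id, page_id)]][2]


log = FlashLog(num_pages=8)
log.write(1, 1, b"v1")
log.write(1, 1, b"v2")        # an update goes to a new physical page
current = log.read(1, 1)      # the map points at the newest copy
```

Because `location` lives only in memory in this sketch, it would have to be rebuilt from the device after a restart. That rebuild is the source of the memory requirement and mount latency discussed in this paper.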

Fig. 4. Page metadata structure for Flash page. Page Metadata (PM) consists of Block Status Information, Data Status Information, Page ECC, the Page Information Tuple (file_number, file_page_number, file_byte_count, version) and the Page Information Tuple ECC.

Fig. 4 illustrates details of the spare cells for individual pages in one of the log-structured file systems for NAND Flash [jff ]. In this case, the spare cells (or spare area) contain the metadata for the respective page. We use the terms spare area and page metadata interchangeably. The metadata carries information about the respective physical page (block status, data status, ECC of the content of a block) and information related to the content (file id, page id, byte count, version, and ECC). The file id is set to 0 for an invalid page. If the page id is 0, the respective page contains file metadata, e.g. the inode in a Unix file system. Pages belonging to the same file have the same file id. The byte count denotes the number of bytes used in a page. The version, which serves as a serial number, identifies the valid page when two or more copies of a page are alive at the same time because of an exception, e.g. a power failure, during a page update. When a page is updated, the new page is written before the old page is deleted.
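The role of the version field can be illustrated with a minimal sketch. The field names follow the description above, but the record layout, the function name pick_valid, and the rule that the larger version is the newer one (ignoring counter wrap-around) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageMetadata:
    file_id: int      # 0 marks an invalid page
    page_id: int      # 0 means the page holds file metadata
    byte_count: int   # number of bytes used in the page
    version: int      # serial number of this copy


def pick_valid(copies):
    """Return the live copy with the highest version, or None.

    Assumption: a larger version is newer; wrap-around is ignored.
    """
    live = [pm for pm in copies if pm.file_id != 0]
    return max(live, key=lambda pm: pm.version) if live else None


old = PageMetadata(file_id=7, page_id=3, byte_count=512, version=1)
new = PageMetadata(file_id=7, page_id=3, byte_count=512, version=2)
```

If a power failure leaves both old and new on Flash, pick_valid([old, new]) selects new, which mirrors the write-before-delete ordering described in the text.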

Fig. 5. Mapping from file system name space to physical location (FM: File Metadata; PATI: Physical Address Translation Information)

In the mount phase, the file system scans all page metadata and extracts the pages with page id 0 (Fig. 6). A page with id 0 contains the metadata for a file. With this file metadata, the file system builds an in-memory structure for the file object. While scanning the file system partition, the file system also examines the file id in each page's metadata and identifies the pages belonging to each file. Each file object forms a tree of its pages, so a file is represented by its file object together with the tree of its pages. Fig. 6 illustrates the data structure for a file tree.
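The mount-time scan can be sketched as a single pass over the spare areas. This is a simplified illustration: a dictionary replaces the file tree, version handling is omitted, and the function name and the tuple layout of the input are hypothetical.

```python
def mount_scan(spare_areas):
    """Build per-file structures from a full scan of page metadata.

    spare_areas: list of (file_id, page_id) tuples indexed by physical page.
    Returns {file_id: {"metadata_page": phys or None, "pages": {page_id: phys}}}.
    """
    files = {}
    for phys, (file_id, page_id) in enumerate(spare_areas):
        if file_id == 0:
            continue  # invalid or empty page
        obj = files.setdefault(file_id, {"metadata_page": None, "pages": {}})
        if page_id == 0:
            obj["metadata_page"] = phys    # this page holds the file metadata
        else:
            obj["pages"][page_id] = phys   # logical page -> physical page
    return files


flash = [(1, 0), (1, 1), (0, 0), (2, 0), (1, 2)]
files = mount_scan(flash)
```

The loop visits every physical page once, so the cost of the scan grows with the partition size regardless of how many pages are actually in use.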

There are two drawbacks of the log-structured file system: mount latency and memory requirement. A log-structured file system needs to scan the entire file system partition to build the in-memory data structure for a file system snapshot.


Fig. 6. Mounting the file system in Log Structured File System

A log-structured file system also needs to maintain the file system snapshot to map the logical location of a block to its physical location, and it maintains a data structure for the metadata of individual pages in Flash storage. The total size of the per-page metadata corresponds to 3.2% of the file system size. For a storage-scale Flash device, the memory requirement can be prohibitively large.
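To give a feeling for the scale, the 3.2% figure can be applied to two partition sizes. The 128 MByte case matches the Flash card used in the prototype; the 64 GByte case is a hypothetical larger device chosen only for illustration, and the function name is an assumption.

```python
PER_PAGE_METADATA_PERMILLE = 32  # 3.2% of the file system size, from the text


def page_metadata_bytes(partition_bytes):
    """Estimated size of all per-page metadata, using integer arithmetic."""
    return partition_bytes * PER_PAGE_METADATA_PERMILLE // 1000


MIB = 1024 * 1024
small = page_metadata_bytes(128 * MIB)        # prototype-scale partition
large = page_metadata_bytes(64 * 1024 * MIB)  # hypothetical 64 GByte device
```

Under this estimate the per-page metadata occupies a few megabytes for the small partition and on the order of gigabytes for the large one, which is the memory pressure the paragraph refers to.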

4. ISSUES IN EXPLOITING STORAGE-CLASS MEMORY IN FILE SYSTEM DESIGN

The current Operating System paradigm draws a clear line between memory and storage and handles them in very different ways. Memory and storage are accessed via the address space and via the file system name space, respectively. From the Operating System's point of view, memory and storage are very different worlds in various aspects: latency, scale, I/O unit size, etc. Operating Systems use the load-store interface for memory and the read()/write() interface for storage devices. The methods for locating an object and for protecting it against illegal access are also totally different in memory and in storage devices. The advancement of Storage-Class Memory now calls for a redesign of various Operating System techniques, e.g. file system, read/write, and protection, to effectively exploit its physical characteristics.

Storage-Class Memory can be viewed as memory, storage, or both. When Storage-Class Memory is used as storage, it stores information in a persistent manner. The main purpose of this approach is to reduce the access time and to improve the I/O performance. When Storage-Class Memory is used as memory, it stores information which can be derived from the storage and which is dynamically created. The main purpose of maintaining this information in Storage-Class Memory is to reduce the time needed to construct it, e.g. during crash recovery and file system mount. The FRASH file system employs a hybrid approach: Storage-Class Memory in FRASH has both memory characteristics and storage characteristics.


5. FRASH FILE SYSTEM

The objective of this work is to develop a hybrid file system which compensates for the drawbacks of the existing file systems for Flash storage by exploiting the physical characteristics of Storage-Class Memory.

5.1 Maintaining In-memory data structure in Storage-Class Memory

In FRASH, we exploit the non-volatility and byte-addressability of Storage-Class Memory. We carefully identified the objects which are maintained in main memory and placed these data structures in the Storage-Class Memory layer. The key data structures are the Device Structure, Block Information Table, Page Bit Map, File Object and File Tree. The Device Structure is similar to the superblock in a legacy file system. It contains the overall statistics and meta information of the file system partition: page size, block size, number of files, number of free pages, number of allocated pages, etc. The Block Information Table maintains the basic information that the file system needs for each block. The Page Bit Map specifies whether each page is in use. The File Object is similar to the inode in a legacy file system and contains file metadata, which can describe a file, a directory, a symbolic link or a hard link. The File Tree represents the pages belonging to a file. Each file has one file tree associated with it. It is a B+ tree like data structure whose leaf nodes contain pointers to the respective pages of the file. The structure of this tree changes dynamically with the file size.

In maintaining the in-memory data structures at the Storage-Class Memory layer, we partition the Storage-Class Memory region into two parts: a fixed size region and a variable size region. The sizes of the Device Structure, Block Information Table and Page Bit Map are determined by the file system partition size and do not change. The space for file objects and file trees changes dynamically as they are created and deleted. We develop a space manager for Storage-Class Memory, which is responsible for dynamically allocating and deallocating Storage-Class Memory for file objects and file trees. Instead of using the existing memory allocation interface kmalloc(), we develop a new management module, scm_alloc(). To expedite allocation and deallocation, FRASH initializes linked lists of free file objects and file trees in the Storage-Class Memory layer, and scm_alloc() is responsible for maintaining these lists. Fig. 7 schematically illustrates the in-memory data structures in Storage-Class Memory.
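A free-list allocator of the kind described above might look as follows. The text does not give the internals of the space manager, so this is only a sketch under assumptions: the class name ScmPool, the fixed-size slots, and the index-based links are hypothetical, and a Python list stands in for the variable size region.

```python
class ScmPool:
    """Fixed-size slots chained on a free list, in the spirit of a pool allocator."""

    def __init__(self, n_slots):
        # next_free[i] is the slot that follows i on the free list; -1 ends it.
        self.next_free = [i + 1 for i in range(n_slots)]
        if n_slots:
            self.next_free[-1] = -1
        self.head = 0 if n_slots else -1

    def alloc(self):
        """Pop the first free slot in constant time."""
        if self.head == -1:
            raise MemoryError("no free slot in the Storage-Class Memory pool")
        slot = self.head
        self.head = self.next_free[slot]
        self.next_free[slot] = -1
        return slot

    def free(self, slot):
        """Push the slot back onto the head of the free list."""
        self.next_free[slot] = self.head
        self.head = slot


pool = ScmPool(2)
a = pool.alloc()
b = pool.alloc()
pool.free(a)
c = pool.alloc()   # reuses the slot that was just freed
```

Because the free list is pre-initialized, both allocation and deallocation avoid any search, which is the property the pre-built linked lists are meant to provide.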

Maintaining in-memory data structures in Storage-Class Memory has significant advantages. The mount operation becomes an order of magnitude faster because it is no longer necessary to scan the file system partition to build the in-memory data structures. In addition, the file system becomes more robust against system crashes and can recover more quickly.

5.2 Maintaining On-Disk Structure in Storage-Class Memory

The FRASH file system exploits Storage-Class Memory both as memory and as storage. The objective of maintaining in-memory data structures in the Storage-Class Memory layer is to overcome the volatility of DRAM and to relieve the burden of constructing these data structures during the mount phase. This exploits the memory aspect of the Storage-Class Memory device. In the storage aspect, we maintain a fraction of the on-disk structure in the Storage-Class Memory layer. Storage-Class Memory is faster than Flash. According to our experiment, effective read and write speed is 10 times faster in FRAM than in NAND Flash (Table II). However, Storage-Class Memory is an order of magnitude smaller than legacy storage devices, e.g. SSD and HDD, and therefore special care needs to be taken in choosing the objects to store in the Storage-Class Memory layer. We can increase the size of the Storage-Class Memory layer by using multiple chips, but it is still smaller than a modern storage device.

Fig. 7. FRASH: Exploiting Storage Aspect and Memory aspect of Storage-Class Memory. The Flash device holds file metadata and data pages. The NVRAM holds the storage system part (page metadata) and the main memory part (Device Info, Page Bitmap Array, Block Info, Object Info, PAT Info).

FRASH maintains page metadata in Storage-Class Memory. This data structure contains the information on individual pages. File systems for hard disks put great emphasis on clustering the metadata and the respective data, e.g. block groups and cylinder groups [McKusick et al. 1984], to minimize the seek overhead involved in accessing the file system. Maintaining page metadata in the Storage-Class Memory layer brings significant improvement in I/O performance. Details of the analysis are provided in Section 7.

In the FRASH file system, the Storage-Class Memory layer is organized as in Fig. 7. It is partitioned into two parts: in-memory and on-disk. The in-memory region contains the data structures which used to be maintained dynamically in main memory. The on-disk region contains the page metadata for individual pages in Flash storage.

5.3 Copy-On-Mount

Storage-Class Memory is faster than legacy storage devices, e.g. Flash and hard disk, but it is still slower than DRAM (Table I). The access latencies of FRAM and DRAM are 110 nsec and 15 nsec, respectively. Reading and writing in-memory data structures from and to Storage-Class Memory is therefore much slower than doing so in legacy DRAM.


Fig. 8. Copy-On-Mount in FRASH. In-memory data structures are copied from NVRAM to main memory at mount time and copied back at unmount time.

A number of data structures in the Storage-Class Memory layer, e.g. file objects and file trees, need to be accessed to perform I/O operations. As a result, I/O performance actually becomes worse when the in-memory structures are maintained in Storage-Class Memory. We develop a Copy-On-Mount technique to address this issue. The in-memory data structures in Storage-Class Memory are copied into main memory during the mount phase and are regularly synchronized back to Storage-Class Memory. In case of a system crash, FRASH reads the on-disk structure region of Storage-Class Memory, scans the NAND Flash storage and reconstructs the in-memory data structure region in Storage-Class Memory.

There is an important technical concern in maintaining in-memory structures in Storage-Class Memory. Page metadata already resides in Storage-Class Memory, and the in-memory data structures can be derived from page metadata, so maintaining them in the non-volatile region can be regarded as redundant. In fact, an earlier version of FRASH maintained only page metadata in Storage-Class Memory [Kim et al. 2007]. That approach still significantly reduces the mount latency, since the file system scans a much smaller region (Storage-Class Memory) which is much faster than NAND Flash. However, the file system still needs to parse the page metadata and construct the in-memory data structures. Maintaining in-memory data structures in Storage-Class Memory removes the need for scanning, analyzing and rebuilding: FRASH simply memory-copies the image from Storage-Class Memory to the DRAM region. This improves the mount latency by 60% in comparison to scanning the metadata from Storage-Class Memory.
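The Copy-On-Mount flow can be summarized in a short sketch. This is an illustration under stated assumptions rather than the FRASH implementation: the class and method names are hypothetical, a dictionary stands in for the in-memory region of Storage-Class Memory, and crash recovery is not modeled.

```python
import copy


class CopyOnMountFS:
    def __init__(self, scm_image):
        self.scm_image = scm_image   # stands in for the SCM in-memory region
        self.dram = None             # working copy, valid only while mounted

    def mount(self):
        # One bulk copy replaces scanning and rebuilding the structures.
        self.dram = copy.deepcopy(self.scm_image)

    def update(self, key, value):
        # All run-time metadata updates touch the DRAM copy only.
        self.dram[key] = value

    def sync(self):
        # Periodic synchronization writes the DRAM copy back to SCM.
        self.scm_image.clear()
        self.scm_image.update(copy.deepcopy(self.dram))


scm = {"number_of_files": 1}
fs = CopyOnMountFS(scm)
fs.mount()
fs.update("number_of_files", 2)
before_sync = scm["number_of_files"]   # SCM still holds the old value
fs.sync()
after_sync = scm["number_of_files"]
```

The window between update and sync is where the two copies diverge, which is why the text falls back on page metadata and a Flash scan to rebuild the region after a crash.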

6. HARDWARE DEVELOPMENT

6.1 Design

We develop a prototype file system on an embedded board. We use 64 MByte SDRAM, a 64 Mbit FRAM chip and a 128 MByte NAND Flash card for the main memory, Storage-Class Memory layer and Flash storage layer, respectively. The 64 Mbit FRAM chip is the largest available under the current state of the art (as of May 2008). This storage system is built into an SMDK2440 embedded system [Meritech ], which has an ARM 920T microprocessor. Fig. 9 illustrates our hardware setup. FRAM has the same access latency as SRAM: 110 ns asynchronous read/write cycle time, 4Mb x 16 I/O, and 1.8 V operating power. Since the package type of the FRAM is 69FBGA (Fine Pitch Ball Grid Array), we develop a daughter board to attach the FRAM to the memory extension pin of the SMDK2440 board. The SMDK2440 board supports 8 banks, from bank0 to bank7, which are directly managed by the Operating System kernel. We choose bank1 (0x0800 0000) for FRAM. FRASH is developed on Linux 2.4.20. To manage the NAND Flash storage, we use an existing log-structured file system, YAFFS [Manning 2001].

Fig. 9. FRASH Hardware (SMDK2440 board with S3C2440 CPU; 8MB FRAM artwork attached to the memory extension pin)

6.2 ECC issue in Storage-Class Memory

Storage-Class Memory can play a role as storage or as memory. If Storage-Class Memory is used as memory, i.e. the data is preserved in a storage device, corruption of memory data can be cured by rebooting the system and reading the respective value from the storage. On the other hand, if Storage-Class Memory is used as storage, data corruption can result in permanent loss of data. Storage-Class Memory technology aims at achieving an error rate comparable to that of DRAM, since it is basically a memory device. For standard DDR2 memory, the error rate is 100 soft errors during 10 billion device hours. 16 memory chips correspond to one soft error every 30 years [Yegulalp 2007]. This is longer than the lifetime of most computer systems.

There are two issues for ECC in Storage-Class Memory which require elaboration. The first is whether Storage-Class Memory requires hardware ECC. This issue arises from the memory aspect of Storage-Class Memory and is largely governed by the criticality of the system in which it is used. If it is used in a mission-critical system or in servers, ECC should be adopted. Otherwise, hardware ECC can be overkill. The second issue is whether Storage-Class Memory requires software ECC. This issue arises from the storage aspect of Storage-Class Memory. Flash and HDD provide mechanisms to protect the stored data from latent errors. Even though Storage-Class Memory delivers the soft error rate of a memory-class device, it may still be necessary to set aside a certain amount of space in Storage-Class Memory to maintain ECC.

Neither hardware nor software ECC is free. Hardware ECC requires extra circuitry and increases cost. Software ECC entails additional computing overhead and increases the access latency. According to Jeon [Jeon 2008], mount latency decreases to 66% when the Operating System excludes the ECC checking operation in a log-structured file system for NAND Flash. The overall decision on this matter should be based upon the usage and criticality of the target system. What is certain is that Storage-Class Memory delivers a memory-class soft error rate and is much more reliable than legacy Flash storage. We believe that Storage-Class Memory does not require the same level of protection as Flash storage. In this study, we maintain page metadata at the Storage-Class Memory layer and exclude ECC for page metadata.

6.3 Voltage Change and Storage-Class Memory

Storage-Class Memory should be protected against the voltage level transition caused by shutdown of the system. Due to the capacitors in the electric circuit, the voltage level decreases gradually (on the order of msec) when the device is shut down, and it stays within the operating range temporarily until it drops below the threshold value. On the other hand, when the system is shut down, the memory controller sets the memory input voltage to 0, and this takes effect immediately (on the order of picoseconds). Usually, the memory controller enables CEB (Chip Enable Signal) and WEB (Write Enable Signal) by dropping the voltage to 0. This implies that when the system is shut down, there exists a period during which the supply voltage stays in the operating region while the memory controller generates the signals for a write (Fig. 10), so an unexpected value can be written to a memory cell. This does not cause any problem for DRAM or Flash storage. DRAM is volatile, and its contents are reset when the system shuts down. Flash storage (NOR and NAND) requires several bus cycles of sustained command signals to write data, and the capacitors in the system do not maintain the voltage at the operating level for several bus cycles. In Storage-Class Memory, however, it can cause a problem. Particularly in FRAM (or MRAM), a write is performed in a single cycle, so the content at address 0 in FRAM is destroyed in the system shutdown phase and the effect persists.

When a system adopts Storage-Class Memory, the electric circuit needs to be designed so that it does not unexpectedly destroy the data in Storage-Class Memory due to voltage transition. Our board is not designed to handle this, so we use a reset pin to protect the data at address 0 of FRAM.

7. PERFORMANCE EXPERIMENT

7.1 Experiment Setup

The FRASH file system has reached its current form after several phases of refinement. In this section, we present the results we obtained through the course of this study. We compare four different file systems. The first is YAFFS, a legacy log-structured file system for NAND Flash storage [Manning 2001]. The second is the hybrid file system which uses Storage-Class Memory only as a storage layer, harboring a fraction of the NAND Flash contents in the Storage-Class Memory layer [Kim et al. 2007]. We call this file system SAS (Storage-Class Memory As Storage). In the SAS file system, the Storage-Class Memory layer maintains page metadata and file metadata. Recall that when the page id in page metadata is 0, the content of the respective page is file metadata. SAS uses the same format for page metadata and file metadata as in Flash storage. The SAS file system needs to scan the Storage-Class Memory region to build the in-memory structures (Fig. 11). The third file system uses Storage-Class Memory as memory [Shin 2008], and we call it the SAM (Storage-Class Memory As Memory) file system. In the SAM file system, the Storage-Class Memory layer maintains the in-memory objects (device information, Page Information Table, Bit map, file objects and file trees), and the Operating System directly manages Storage-Class Memory. The fourth is the FRASH file system. We examine the performance of the four file systems in terms of mount latency, metadata I/O, and data I/O. We use two widely used benchmark suites in our experiment: LMBENCH [McVoy and Staelin 1996] and IOZONE [http://www.iozone.org ].

Fig. 10. Voltage Level of Input signals to FRAM (signals shown: Command and Signal, Power, Reset Signal)

Fig. 11. Storage Class Memory As Storage in Hybrid File System (PM: Page Metadata)

7.2 Mount Latency

Fig. 12. Mount Latency (msec) of YAFFS, SAS, SAM and FRASH. (a) Under varying file system partition size (MByte). (b) Under varying number of files.

We compare the mount latency of the four file systems under varying file system sizes and under varying numbers of files. Fig. 12(a) illustrates the results under varying file system partition size. In YAFFS, mount latency increases linearly with the size of the file system partition. This is because the Operating System needs to scan the entire file system partition to build the directory structure of the file system objects and the file trees. In the other three file systems, mount latency does not vary much with the file system partition size or the number of files in the partition. Among these three, the SAS approach yields the longest mount latency. However, the difference is not significant: the mount latencies of the SAS and FRASH file systems differ by less than 20 msec. Given that mount latency matters only from the user's point of view, it is unlikely that a human being can perceive a difference of 20 msec. A careful look at the mount latency graphs of FRASH and SAS shows that their mount latency increases with the file system partition size. The reason is as follows. SAS scans the Storage-Class Memory region and constructs the in-memory data structures of the file system from the scanned page metadata and file objects. Copy-On-Mount in FRASH also requires scanning of the Storage-Class Memory region. Therefore, mount latency depends on the file system partition size in both of these file systems. However, since FRASH does not have to initialize the objects in main memory, it has a slightly shorter mount latency than SAS. SAM yields the shortest mount latency of all four file systems. In SAM, there is no scanning of the Storage-Class Memory region; in the mount phase, SAM only initializes various pointers to the appropriate objects in Storage-Class Memory. Therefore, mount latency in SAM is not only the smallest but also remains constant.

We examine the mount latency of each file system by varying the number of files in the file system partition. The partition size is 100 MByte. We vary the number of files in the partition from 0 to 9,000 in increments of 1,000. Fig. 12(b) illustrates the mount latency under a varying number of files. In this experiment, we examine the overhead of initializing the directory structure of the file system and the file trees. YAFFS scans the entire file system and constructs the in-memory structures for the file system directory and file trees. The overhead of building these data structures is proportional to the number of file objects in the partition as well as to the partition size. In SAS and FRASH, the mount latency increases in proportion to the number of files in the system, and the mount latency of FRASH is slightly smaller than that of SAS. SAM has the smallest mount latency, which remains constant regardless of the number of files, because SAM scans neither the Storage-Class Memory region nor the storage. The mount latency of FRASH was 80% - 92% less than that of YAFFS.

The design goal of FRASH is to improve the mount latency as well as the overall file system performance. Existing works [Doh et al. 2007; Park et al. 2008] show greater improvement in mount latency by directly using file system metadata in the NVRAM region without caching it in DRAM. According to our experiment, however, this approach is not practically feasible, since file I/O becomes significantly slower when file system metadata is maintained in byte-addressable NVRAM without caching. We believe that, considering both overall file I/O performance and mount latency, FRASH exhibits superior performance to the preceding works.

7.3 Metadata Operation

Fig. 13. Metadata Operation (LMBENCH), in files/sec for file sizes of 0KByte, 1KByte, 4KByte and 10KByte, for YAFFS, SAS, SAM and FRASH. (a) File Creation. (b) File Deletion.

We examine how effectively each file system manipulates file system metadata. Metadata in our context denotes directory entries, file metadata and various bitmaps. For this purpose, we measure the rate of file creation (creations/sec) and the rate of file deletion (deletions/sec). We use LMBENCH to create 1000 files with each of four file sizes: 0 KByte, 1 KByte, 4 KByte, and 10 KByte. Fig. 13(a) and Fig. 13(b) illustrate the experimental results.

Creating a file involves allocating new file objects, creating directory entries, and updating the page bitmap. In YAFFS, all these operations are initially performed in memory and regularly synchronized to Flash storage. When creating a file with some content, we need to allocate appropriate buffer pages for the content and write the content to the buffer pages. The updated buffer pages are regularly flushed to Flash storage. Let us examine the performance of creating empty files (0 KByte). In SAS, metadata operation performance decreases by 3% compared to YAFFS. In SAS, we do not completely remove the page metadata and file system objects from Flash storage: page metadata and file system objects in main memory are synchronized to both the Storage-Class Memory layer and the Flash storage layer. The overhead of synchronizing to the Storage-Class Memory layer degrades metadata update performance in SAS. Metadata operation performance in SAM is much worse than in YAFFS; it decreases by 30%. In SAM, all updates on metadata are performed directly in Storage-Class Memory.

FRASH yields the best metadata operation performance of all four file systems. There are two main reasons for this. First, FRASH copies the metadata in Storage-Class Memory to main memory when the file system is mounted, and all subsequent metadata operations are performed in the same manner as in YAFFS. Second, page metadata resides in the Storage-Class Memory layer in FRASH, whereas it resides in Flash storage in YAFFS. Synchronizing in-memory data structures to the Storage-Class Memory layer (FRASH) is much faster than synchronizing them to Flash storage. In all four file systems, data pages reside in Flash storage. Creating a larger file means that a larger fraction of the file creation overhead is consumed by updating the file pages in Flash storage. Therefore, as the file size increases, the performance gap between YAFFS and FRASH becomes less significant.

Let us examine the performance of the file deletion operation (Fig. 13(b)). FRASH yields 11% - 16.5% improvement in file deletion speed compared to YAFFS. Deleting a file is faster than creating a file. File creation requires allocation of memory objects and possibly a search of the bitmap to find a proper page for the new data, whereas deleting a file requires neither allocation nor a search for free object slots. Deleting a file involves freeing the file object, the file tree and the pages used by the file. As was the case in file creation, YAFFS slightly outperforms SAS, and SAM exhibits the worst performance.

The results of this experiment show that although state-of-the-art Storage-Class Memory devices are two hundred times faster than NAND Flash in access speed (Table I), they are still much slower than state-of-the-art DRAM with its 15 nsec access latency. Manipulating data directly in Storage-Class Memory takes more time than manipulating it in main memory. Given the trend of technology advancement, we are quite pessimistic that Storage-Class Memory will become faster than DRAM or deliver a better $/byte in the foreseeable future. While Storage-Class Memory delivers byte-addressability and non-volatility, the lack of which has long been the major drawback of Flash and DRAM, respectively, it is not feasible for Storage-Class Memory to position itself as a full substitute for either of them. Rather, we believe that Storage-Class Memory and legacy main memory technology (DRAM, SRAM, etc.) should coexist in a single system so that each can overcome the drawbacks of the other.

7.4 Sequential I/O

We measure the performance of sequential I/O with two benchmark suites: LMBENCH and IOZONE. Fig. 14(a) illustrates the LMBENCH results. For sequential read and write, FRASH outperforms YAFFS by 26% and 3%, respectively. Fig. 14(b) shows the results of the IOZONE benchmark, in which FRASH shows 16% and 23% improvement over YAFFS in read and write operations, respectively.

Fig. 14. Sequential I/O: write and read throughput (MByte/sec) of YAFFS, SAS, SAM and FRASH. (a) LMBENCH. (b) IOZONE.

Among the four file systems tested, SAM exhibits the worst performance in both read and write operations. File system I/O is accompanied by accesses to page metadata and file objects, and the access latency to these objects significantly affects the overall I/O performance. YAFFS, SAS, and FRASH maintain these objects in main memory, whereas SAM maintains them in Storage-Class Memory. Since FRAM is much slower than DRAM, performance degrades significantly in SAM. YAFFS and SAS exhibit similar performance (Fig. 14). In both file systems, file objects, directory structures, and page bitmaps are maintained in DRAM and are regularly synchronized to Flash storage. SAS performs significantly better than YAFFS in mount latency, but in reading and writing actual data blocks the two file systems yield similar performance.

It is interesting to observe that FRASH outperforms SAS and YAFFS. We found that a significant number of accesses touch only the page metadata, and the number of page metadata accesses can be much larger than the number of page accesses. A typical reason is to find the valid page for a given logical block. These accesses refer to the page metadata in the storage. Due to the hardware architecture of Flash storage, reading the page metadata, which is 3.5% of the page size, takes almost as long as reading an entire page (page plus page metadata). Therefore, the access latency to page metadata is an important factor in I/O performance. We measured the time to access page metadata in each file system (Table II). In NAND Flash (YAFFS), a read and a write of page metadata take 25 µsec and 95 µsec, respectively. In FRAM, both read and write take about 2.3 µsec. Reads and writes of page metadata are therefore roughly ten times and forty times faster in FRAM than in NAND Flash, respectively. For this reason, FRASH yields better read/write performance than YAFFS.
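The latency ratios behind this argument can be checked with a few lines of arithmetic. The sketch below uses the page metadata latencies listed in Table II; the function names and the example workload (1000 metadata reads, 100 metadata writes) are hypothetical and serve only to illustrate the calculation.

```python
# Page metadata access latencies in microseconds, as listed in Table II.
FLASH_US = {"read": 25.0, "write": 95.0}
FRAM_US = {"read": 2.3, "write": 2.3}


def speedup(op: str) -> float:
    """Ratio of NAND Flash latency to FRAM latency for one operation."""
    return FLASH_US[op] / FRAM_US[op]


def metadata_time_saved_us(n_reads: int, n_writes: int) -> float:
    """Time saved (in microseconds) when page metadata accesses are served
    from FRAM instead of NAND Flash, for a hypothetical access mix."""
    saved_per_read = FLASH_US["read"] - FRAM_US["read"]
    saved_per_write = FLASH_US["write"] - FRAM_US["write"]
    return n_reads * saved_per_read + n_writes * saved_per_write


if __name__ == "__main__":
    for op in ("read", "write"):
        print(f"{op}: {speedup(op):.1f}x faster in FRAM")
    # Hypothetical workload: 1000 metadata reads and 100 metadata writes.
    print(f"saved: {metadata_time_saved_us(1000, 100):.0f} usec")
```

Because metadata-only accesses can outnumber page accesses, even a modest per-access saving accumulates over a run, which is consistent with the throughput gap observed between FRASH and YAFFS.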

7.5 Random I/O

Operation   Time/access (Flash)   Time/access (FRAM)
Read        25 µsec               2.3 µsec
Write       95 µsec               2.3 µsec

Table II. Page metadata access latency in YAFFS and FRASH

[Fig. 15. Random I/O (IOZONE): throughput (MByte/sec) versus I/O size (8 to 1024 kByte) for YAFFS, SAS, SAM, and FRASH. (a) Random read. (b) Random write.]

We examine the performance of random I/O with the IOZONE benchmark under varying I/O unit sizes. Fig. 15(a) and Fig. 15(b) illustrate the results; the X and Y axes denote the I/O unit size and the respective I/O performance. The performance differences among the four file systems are similar to those in the sequential I/O study. Let us compare the performance of sequential I/O and random I/O. For reads, random throughput is slightly lower than sequential throughput. For writes, the gap becomes more significant. While the sequential write throughput of FRASH is between 800 and 850 Kbytes/sec depending on the I/O unit size, its random write throughput is below 800 Kbytes/sec. In the other designs, sequential write also outperforms random write. When in-place update is not allowed, random writes cause more page invalidation and subsequently more erase operations. Therefore, random write exhibits lower throughput than sequential write.
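The effect of scattered invalidation on erase work can be illustrated with a toy model. The sketch below is not the garbage collector of YAFFS or FRASH; it assumes a hypothetical device of 1024 logical pages laid out in 64-page erase blocks, overwrites each chosen page once out of place, and counts how many blocks then hold invalid pages and how many still-valid pages would have to be copied before those blocks could be erased.

```python
import random


def reclaim_cost(overwrites, logical_pages=1024, pages_per_block=64):
    """Toy out-of-place update model.

    Logical page i initially lives in physical page i. Each overwrite
    invalidates the old copy; the new copy is appended to fresh blocks,
    which this model does not track. `overwrites` must not repeat a page.

    Returns (blocks_to_erase, valid_pages_to_copy) for reclaiming every
    block that holds at least one invalid page.
    """
    invalid_per_block = [0] * (logical_pages // pages_per_block)
    for lpn in overwrites:
        invalid_per_block[lpn // pages_per_block] += 1
    touched = [n for n in invalid_per_block if n > 0]
    # Valid pages left in a touched block must be relocated before erase.
    pages_to_copy = sum(pages_per_block - n for n in touched)
    return len(touched), pages_to_copy


def compare(n_overwrites=256, seed=0):
    """Compare sequential and random overwrites of the same volume."""
    sequential = reclaim_cost(range(n_overwrites))
    rng = random.Random(seed)
    scattered = reclaim_cost(rng.sample(range(1024), n_overwrites))
    return sequential, scattered


if __name__ == "__main__":
    seq, rnd = compare()
    print("sequential (blocks to erase, pages to copy):", seq)
    print("random     (blocks to erase, pages to copy):", rnd)
```

In this model, 256 sequential overwrites invalidate four blocks completely, so they can be erased with no copying, whereas the same number of random overwrites leaves invalid pages spread over many blocks, each of which needs its remaining valid pages relocated before it can be erased.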

8. CONCLUDING REMARKS

In this work, we develop a hybrid file system, FRASH, for Storage-Class Memory and NAND Flash. Once realized at a proper scale, Storage-Class Memory will clearly resolve significant issues in current storage and memory systems. Despite these promising characteristics, for the next few years the scale of storage-class memory devices will be orders of magnitude smaller, e.g., 1/1000, than that of current storage devices. We argue that Storage-Class Memory should be exploited as a new hybrid layer between main memory and storage rather than positioned as a full substitute for memory or storage. Via this approach, Storage-Class Memory can compensate for the physical shortcomings of the two: the volatility of main memory and the block access granularity of storage. The key ingredient in this file system design is how to use Storage-Class Memory in the system hierarchy. It can be mapped onto the main memory address space. In this case, it is possible to provide non-volatility to data stored in the respective address range. On the other hand, Storage-Class Memory can be used as part of the block device. In this case, I/O speed will become faster, and it is possible that an I/O-bound workload becomes a CPU-bound workload. The data structures and objects to be maintained in Storage-Class Memory should be selected very carefully, since Storage-Class Memory is still too small to accommodate all file system objects.
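The memory-mapped usage described above can be sketched in user space. The example below is only an analogy, not FRASH code: an ordinary temporary file stands in for the Storage-Class Memory region, the record layout (object id, size) is hypothetical, and Python's standard `mmap` module provides the mapping. It shows a single 4-byte field being updated in place and surviving after the mapping is closed.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical file-object record: (object id, size in bytes), 8 bytes total.
RECORD = struct.Struct("<II")


def update_size(path: str, slot: int, new_size: int) -> None:
    """Update one 4-byte field in place through a memory mapping."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as region:
        offset = slot * RECORD.size + 4  # skip the object id field
        region[offset:offset + 4] = struct.pack("<I", new_size)
        region.flush()  # push the change to the backing store


def read_record(path: str, slot: int):
    """Read one record back through ordinary file I/O."""
    with open(path, "rb") as f:
        f.seek(slot * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))


def demo():
    """Create four records, update one field, and read the record back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            for obj_id in range(4):
                f.write(RECORD.pack(obj_id, 0))
        update_size(path, 2, 4096)
        return read_record(path, 2)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The point of the analogy is the access granularity: only the four bytes of the size field are written, whereas a block device would require rewriting the whole page that contains the record.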

In this work, we exploit both the memory aspect and the storage aspect of Storage-Class Memory. FRASH provides a hybrid view of Storage-Class Memory: it harbors in-memory data structures as well as on-disk structures for the file system. By maintaining on-disk structures in Storage-Class Memory, FRASH provides byte-addressability to the on-disk file system objects and page metadata. The contribution of the FRASH file system is threefold: (i) mount latency, which has been regarded as a major drawback of the log-structured file system, is decreased by an order of magnitude; (ii) I/O performance improves significantly by migrating on-disk structures to the Storage-Class Memory layer; and (iii) by maintaining the directory snapshot and file tree in Storage-Class Memory, the system becomes more robust against unexpected failure. In summary, we developed a state-of-the-art hybrid file system and showed that storage-class memory can be effectively exploited to resolve various technical issues in existing file systems.

ACKNOWLEDGMENTS

This research was supported by the Korea Science and Engineering Foundation (KOSEF) through a National Research Lab. program at Hanyang University (R0A-2009-0083128). We would like to thank Samsung Electronics for their FRAM sample endowment.

REFERENCES

Bityuckiy, A. JFFS3 design issues.

Deshpande, M. and Bunt, R. Dynamic file management techniques. In Proceedings of SeventhAnnual International Phoenix Conference on Computers and Communications. Scottsdale, AZ,USA.

Doh, I., Choi, J., Lee, D., and Noh, S. 2007. Exploiting non-volatile RAM to enhance flashfile system performance. In Proceedings of the 7th ACM & IEEE international conference onembedded software. Salzburg, Austria, 164–173.

Freescale. ”freescale semiconductor”. http://www.freescale.com.

Freitas, R., Wilcke, W., and Kurdi, B. 2008. Storage class memory, technology and use. InTutorial of 6th USENIX Conference on File and Storage Technologies. San Jose, CA, USA.

http://www.iozone.org. IOZONE.

Intel. Intel corporation, understanding the flash translation layer (ftl) specification.http://www.intel.com/design/flcomp/applnots/29781602.pdf.

Jeon, B. 2008. Boosting up the mount latency of nand flash file system using byte addressablenvram. M.S. thesis, Hanyang University, Seoul, Korea.

Jung, J., Choi, J., Won, Y., and Kang, S. 2009. Shadow block: Imposing block device abstrac-tion on storage class memory. In Proceedings of the Fourth International Workshop on Supportfor Portable Storage (IWSSPS09). Grenoble, France, 67–72.

Kang, Y., Joo, H., Park, J., Kang, S., Kim, J.-H., Oh, S., Kim, H., Kang, J., Jung, J., Choi,D., Lee, E., Lee, S., Jeong, H., and Kim, K. 2006. World smallest 0.34/spl mu/m cob cell 1t1c64mb fram with new sensing architecture and highly reliable mocvd pzt intgration technology.In Symposium on VLSI Technology, 2006. Digest of Technical Papers. 124–125.

Kgil, T., Roberts, D., and Mudge, T. 2008. Improving NAND flash based disk caches. InProceedings of 35th International Symposium on Computer Architecture (ISCA’08). 327–338.

Submitted to ACM Transactions on Storage

Page 24: FRASH: Exploiting Storage Class Memory in Hybrid …...FRASH: Exploiting Storage Class Memory in Hybrid File System for Hierarchical Storage · 3 not been addressed before. Three key

24 · Jaemin Jung et al.

Kim, E., Shin, H., Jeon, B., Han, S., Jung, J., and Won, Y. 2007. FRASH: Hierarchical FileSystem for FRAM and Flash. Lecture note in computer science 4705, 1, 238–251.

Kim, H. and Ahn, S. 2008. BPLRU: A Buffer Management Scheme for Improving Random Writesin Flash Storage. In Proceedings of 6th conference on USENIX Conference on File and StorageTechnologies (FAST’08). San Jose, CA, USA.

Kim, H., Won, Y., and Kang, S. 2009. Embedded NAND Flash File System for Mobile Multi-media Devices. IEEE Transactions on Consumer Electronics 55, 2, 546.

Lau, S. and Lui, J. 1997. Designing a hierarchical multimedia storage server. The ComputerJournal 40, 9, 529–540.

Manning, C. 2001. YAFFS (yet another Flash FileSystem).http://www.alephl.co.uk/armlinux/projects/yaffs/index.html.

McKusick, M., Joy, W., Leffler, S., and Fabry, R. 1984. A fast file system for UNIX. ACMTransactions on Computer Systems (TOCS) 2, 3, 181–197.

McVoy, L. and Staelin, C. 1996. lmbench: Portable tools for performance analysis. In Pro-ceedings of the 1996 annual conference on USENIX Annual Technical Conference. UsenixAssociation, San Diego, California, 23.

Meritech. Meritech, smdk2440 board. http://www.meritech.co.kr/eng/.

Miller, E. L., Brandt, S. A., and Long, D. D. 2001. Hermes: High-performance reliable mram-enabled storage. In Proceedings of the 8th IEEE Workshop on Hot Topics in Operating Systems(HotOS-VIII). 83–87.

NEDO. Nedo japan. http://www.nedo.go.jp/english/.

Nikkei. Nikkei electronics. http://www.nikkeibp.com/.

Park, S., Lee, T., and Chung, K. 2006. A Flash File System to Support Fast Mounting forNAND Flash Memory Based Embedded Systems. Lecture Notes in Computer Science 4017,415–424.

Park, Y., Lim, S., Lee, C., and Park, K. 2008. PFFS: a scalable flash memory file system forthe hybrid architecture of phase-change RAM and NAND flash. In Proceedings of the 2008ACM symposium on Applied computing. Fortaleza, Ceara, Brazil, 1498–1503.

Raoux, S., Burr, G. W., Breitwisch, M. J., Rettner, C. T., Chen, Y. C., Shelby, R. M.,Salinga, M., Krebs, D., Chen, S. H., Lung, H. L., and Lam”, C. H. 2008. Phase-changerandom access memory-a scalable technology. IBM Journal of Research and Development 52, 4,465–479.

Rosenblum, M. and Ousterhout, J. K. 1992. The design and implementation of a log-structuredfile system. ACM Transactions on Computer Systems (TOCS) 10, 1, 26–52.

Schlack, M. 2004. The future of storage: Ibm’s view. searchstorage.com: Storage TechnologyNews. http://searchstorage.com.

Shin, H. 2008. Merging memory address space and block device using byte-addressable nv-ram.M.S. thesis, Hanyang University, Seoul, Korea.

Wang, A.-I. A., Kuenning, G., Reiher, P., and Popek, G. 2006. The conquest file system:Better performance through a disk/persistent-ram hybrid design. ACM Transactions on Storage(TOS) 2, 3, 309–348.

Wilkes, J., Golding, R., Staelin, C., and Sullivan, T. 1996. The HP AutoRAID hierarchicalstorage system. ACM Transactions on Computer Systems (TOCS) 14, 1, 108–136.

Wu, C., Kuo, T., and Chang, L. 2006. The Design of efficient initialization and crash recovery forlog-based file systems over flash memory. ACM Transactions on Storage (TOS) 2, 4, 449–467.

Yegulalp, S. 2007. Ecc memory: A must for servers, not for desktop pcs.http://searchwincomputing.techtarget.com.

Yim, K., Kim, J., and Koh, K. 2005. A fast start-up technique for flash memory based computingsystems. In Proceedings of the 2005 ACM symposium on Applied computing. Santa Fe, NewMexico, 843–849.

Submitted to ACM Transactions on Storage