Design and Performance Evaluation of Networked Storage Architectures
Xubin He ([email protected])
July 25, 2002
Dept. of Electrical and Computer Engineering
University of Rhode Island
July 25, 2002 High Performance Computing Lab (HPCL), URI
Outline
Introduction
STICS: SCSI-To-IP Cache for Storage Area Networks
DRALIC: Distributed RAID & Location Independence Cache
vcRAID: Large Virtual NVRAM Cache for Software RAID
Performance Eval. on Distributed Web Server Architectures
Conclusions
Background
Data storage plays an essential role in today’s fast-growing data-intensive network services.
Online data storage doubles every 9 months.
Storage is approaching more than 50% of IT spending. The storage cost will be up to 75% of the total IT cost in year 2003.
A Server-to-Storage Bottleneck
Source: Brocade
Motivations
How to deploy data over the network efficiently and reliably?
  Disparities between SCSI & IP
  SCSI remote handshaking over IP
  Processor-disk gap growing
  High speed network
  Large client memories
  Cheap disk & RAM, expensive NVRAM
  RAID5 is reliable, but low performance
  E-commerce over the Internet, distributed web servers
Resulting designs: STICS, DRALIC, vcRAID
STICS: SCSI-To-IP Cache for Storage Area Networks
Introducing a New Device: STICS
Whenever there is a disparity, cache helps.
Features of STICS:
  Smooth out disparities between SCSI and IP
  Localize SCSI protocol and filter out unnecessary traffic, reducing bandwidth requirement
  Nonvolatile data caching
  Improve performance, reliability, manageability and scalability over current iSCSI systems
System Overview
[Figure: System overview. A STICS connects to the host via a SCSI interface and connects to other STICS units or NAS via the Internet. Labels: Host 1, Host 2 or Storage, Host M or Storage, STICS 1 to STICS N, NAS, Disks or SAN; links: SCSI, TCP/IP.]
STICS Architecture
[Figure: STICS components: SCSI interface, processor, RAM, log disk, storage device, network interface.]
Internal Cache Structure
[Figure: internal cache structure; labels: Memory, Meta Data, Data Cache, Log Disk, Cache Data.]
Basic Operations
Write
  Write requests from the host via SCSI
  Write requests from another STICS via NIC
Read
  Read requests from the host via SCSI
  Read requests from another STICS via NIC
Destage
  RAM -> log disk
  Log disk -> storage device
Prefetch
  Storage device -> RAM
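The four basic operations can be sketched as a toy model. This is only an illustration under stated assumptions, not the STICS implementation: the class name `SticsSketch`, the block-count RAM limit, and the "flush everything when RAM overflows" policy are all hypothetical choices made to keep the example short.

```python
from collections import OrderedDict


class SticsSketch:
    """Toy model of a RAM -> log disk -> storage hierarchy (hypothetical)."""

    def __init__(self, ram_blocks=4):
        self.ram_blocks = ram_blocks  # assumed RAM capacity, in blocks
        self.ram = OrderedDict()      # block id -> data
        self.dirty = set()            # RAM blocks not yet written to the log
        self.log_disk = []            # append-only (block id, data) records
        self.storage = {}             # backing storage device

    def write(self, block, data):
        """Write request from the host (SCSI) or from a peer (NIC)."""
        self.ram[block] = data
        self.dirty.add(block)
        if len(self.ram) > self.ram_blocks:
            self.destage_ram_to_log()

    def read(self, block):
        """Look in RAM first, then the log disk, then the storage device."""
        if block in self.ram:
            return self.ram[block]
        for logged_block, data in reversed(self.log_disk):  # newest record wins
            if logged_block == block:
                return data
        return self.storage.get(block)

    def destage_ram_to_log(self):
        """Destage step 1: append dirty RAM blocks to the log, then free RAM."""
        for block, data in self.ram.items():
            if block in self.dirty:
                self.log_disk.append((block, data))
        self.ram.clear()
        self.dirty.clear()

    def destage_log_to_storage(self):
        """Destage step 2: replay the log in order into the storage device."""
        for block, data in self.log_disk:
            self.storage[block] = data
        self.log_disk.clear()

    def prefetch(self, block):
        """Prefetch: copy a block from the storage device into RAM."""
        if block in self.ram or any(b == block for b, _ in self.log_disk):
            return  # a copy at least as new is already cached
        if block in self.storage:
            self.ram[block] = self.storage[block]
```

Writes land in RAM and are appended sequentially to the log disk before they reach the storage device, which is the ordering the Destage entries describe. A real device would also need metadata, crash recovery and a smarter eviction policy.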
Web-based Network Management
[Figure: web browser-based manager, connected over HTTP to a servlet and management application, connected over TCP/IP to the local management application.]
Implementation Platform
A STICS block is a PC running Linux.
  OS: Linux with kernel 2.4.2
  Compiler: gcc
  Interfaces: SCSI and IP
Performance Evaluations
Methodology
  iSCSI implementation on Linux by Intel (iSCSI)
  Initial STICS implementation on Linux, in two modes:
    Immediate report (STICS-Imm)
    Report after complete (STICS)
Workloads
  PostMark from Network Appliance: throughput
    Two configurations
      Small: 1000 initial files / 50k transactions / 436MB
      Large: 20k initial files / 100k transactions / 740MB
  EMC trace: response time
    More than 230,000 I/O requests
    Data set size: >900MB
Experimental Settings
[Figure: iSCSI configuration. The host Trout establishes a connection to the target, and the target Squid responds and connects. Squid then exports its hard drive and Trout sees the disks as local. Labels: Host (Trout) with NIC, switch, Target (Squid) with NIC, SCSI and disks; iSCSI commands and data.]
[Figure: STICS configuration. The STICS units cache data from both SCSI and the network. Labels: Host (Trout) with STICS 1, switch, Target (Squid) with STICS 2, SCSI and disks; block data.]
PostMark Results: Throughput
[Figure: Throughput (20k initial files and 100k transactions); x-axis: block size (512, 1024, 2048, 4096 bytes); y-axis: transactions/sec; series: STICS-Imm, STICS, iSCSI.]
[Figure: Throughput (1000 initial files and 50k transactions); same axes and series.]
Ave. improvement   STICS-Imm   STICS
Small set          226%        64%
Large set          318%        97%
Where does the benefit come from?
Network traffic analysis

Number of packets with different sizes (bytes)
         <64   65-127      128-255   255-511   511-1023   >1024
iSCSI    7     1,937,724   91        60        27         1,415,912
STICS    4     431,216     16        30        7          607,827

         Total packets   Small packets (%)   Bytes transferred   Bytes per packet
iSCSI    3,353,821       57.8%               1,914,566,504       571
STICS    1,039,100       41.5%               980,963,821         944
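The derived columns of the traffic analysis can be checked from the per-bin packet counts. The sketch below is only a worked arithmetic example. It assumes that "small packets" means the first two size bins (under 128 bytes), which is not stated on the slide but reproduces the reported percentages.

```python
# Packet counts per size bin, copied from the network traffic table.
packets = {
    "iSCSI": [7, 1_937_724, 91, 60, 27, 1_415_912],
    "STICS": [4, 431_216, 16, 30, 7, 607_827],
}
bytes_transferred = {"iSCSI": 1_914_566_504, "STICS": 980_963_821}


def summarize(name):
    """Recompute total packets, small-packet share and bytes per packet."""
    counts = packets[name]
    total = sum(counts)
    small = counts[0] + counts[1]  # assumption: "small" = first two bins
    return {
        "total_packets": total,
        "small_pct": round(100 * small / total, 1),
        "bytes_per_packet": round(bytes_transferred[name] / total),
    }


for name in packets:
    print(name, summarize(name))
```

Under that assumption the totals, small-packet percentages and bytes-per-packet figures match the table: STICS sends fewer packets overall and a larger share of them are large.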
EMC Trace Results: Response Time
[Figure: Histograms of I/O response times for trace EMC-tel.]
a) STICS with immediate report (2.7 ms)
b) STICS with report after complete (5.71 ms)
c) iSCSI (16.73 ms)
Summary
A novel cache storage device that adds a new dimension to networked storage
Significantly improves the performance of iSCSI
A cost-effective solution for building an efficient SAN over IP
Allows easy manageability, maintainability, and scalability
DRALIC: Distributed RAID and Location Independence Cache
Web Servers
Overhead caused by FS is high
Enterprise web server is expensive
  A Fujitsu server: more than $5 million
PCs are cheap: $1000
  Disks: $160/120GB (IBM Deskstar @ CompUSA)
  DRAM: $100/256MB (@ Crucial.com)
My Solution
Combine or bridge the disk controller and network controller of existing PCs interconnected by a high-speed switch.
Share memory and storage among peers
[Figure: four PCs interconnected by a fast LAN (switch). Each PC has a file system, a DBMS, RAM, and a bridge joining the disk driver (SCSI) and the network driver (NIC). Other labels: RAPID, DSM.]
Preliminary Performance Analysis
T_{DRALIC} = H_{lm} T_{lm} + (1 - H_{lm}) H_{rm} T_{rm} + (1 - H_{lm})(1 - H_{rm}) T_{RAID}
T_{lm} = B / BW_{mem}
T_{rm} = B / BW_{mem} + B / BW_{net} + OH_{net}
[The expression for T_{RAID} is garbled in the transcript; its terms involve B, N, BW_{dsk}, OH_{dsk}, BW_{net} and OH_{net}.]
B: data block size (8KB)
N: number of nodes
H_{lm}: local memory hit ratio
H_{rm}: remote memory hit ratio
T_{lm}: local memory access time
T_{rm}: remote memory access time
T_{RAID}: access time from the distributed RAID
T_{DRALIC}: average response time of the DRALIC system
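The symbol definitions suggest that the average response time is a hit-ratio-weighted sum: local memory hits cost the local access time, local misses that hit in remote memory cost the remote access time, and everything else goes to the distributed RAID. A minimal sketch of that weighting follows. The function name and the sample timings are hypothetical and are not taken from the slides.

```python
def dralic_response_time(h_lm, h_rm, t_lm, t_rm, t_raid):
    """Average response time, weighted by where a request is served.

    h_lm, h_rm: local and remote memory hit ratios, each in [0, 1]
    t_lm, t_rm, t_raid: access times for local memory, remote memory, RAID
    """
    for h in (h_lm, h_rm):
        if not 0.0 <= h <= 1.0:
            raise ValueError("hit ratios must lie in [0, 1]")
    local = h_lm * t_lm
    remote = (1 - h_lm) * h_rm * t_rm
    raid = (1 - h_lm) * (1 - h_rm) * t_raid
    return local + remote + raid


# Hypothetical timings in milliseconds, chosen only for illustration.
print(dralic_response_time(h_lm=0.5, h_rm=0.4, t_lm=0.05, t_rm=0.5, t_raid=10.0))
```

With these made-up inputs the RAID term dominates, which illustrates why raising either hit ratio lowers the average so sharply.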
DRALIC: Nodes influence
[Figure: Average I/O response time vs. number of nodes; x-axis: nodes; y-axis: access time (ms); series: Hlm=0.5, Hlm=0.8.]
Nodes      1        2        4        16       32
Hlm=0.5    6.2903   3.2655   1.491    0.5175   0.3869
Hlm=0.8    2.5525   1.3425   0.6328   0.2434   0.1911
Simulation Results
DRALICSim: a simulator based on socket communication.
Benchmark: PostMark, provided by Network Appliance Inc., which measures performance in terms of transaction rates.
Configurations: 1000 initial files and 50000 transactions (small), 20000/50000 (medium) and 20000/100000 (large)
4 nodes running Windows NT
Simulation Results
[Figure: Throughput (transactions/sec) for the Small, Medium and Large test suites; series: Base, 2 nodes, 3 nodes, 4 nodes.]
Summary
Combination of HBAs and NICs will reduce the overhead.
Share memory and storage among peers
Make use of existing resources
Our simulator shows a performance gain of up to 4.2 with 4 nodes
vcRAID: Large Virtual NVRAM Cache for Software RAID
VC-RAID
Hides the small write penalty of RAID5 by buffering small writes and destaging data back to the RAID with parity computation when disk activity is low.
Combines a small portion of the system RAM and a log disk to form a hierarchical cache.
This hierarchical cache appears to the host as a large nonvolatile RAM.
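The small write penalty comes from RAID5 parity maintenance: updating one block in place requires reading the old data and old parity before writing the new data and new parity, whereas a destage of buffered blocks that cover a whole stripe can compute parity from the buffer alone. The sketch below illustrates only that standard parity arithmetic. The function names are hypothetical and this is not the VC-RAID code.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write_parity(old_data, old_parity, new_data):
    """Read-modify-write: old data and old parity must be read back first."""
    return xor_blocks(old_parity, old_data, new_data)


def full_stripe_parity(buffered_blocks):
    """Full-stripe destage: parity comes from the buffered data alone."""
    return xor_blocks(*buffered_blocks)


# A three-data-block stripe with made-up contents.
stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = full_stripe_parity(stripe)

# Updating block 0 in place yields the same parity as recomputing the stripe.
new_block = b"\x0f\x0f"
updated = small_write_parity(stripe[0], parity, new_block)
assert updated == full_stripe_parity([new_block, stripe[1], stripe[2]])
```

Both paths produce the same parity; the difference is the extra disk reads on the in-place path, which is the cost that buffering and deferred destaging aim to hide.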
Architecture
[Figure: OS kernel with buffer cache in main memory, cache disk, and RAID5.]
Approaches
[Figure: four approaches (a) to (d), each combining a controller, one or more RAM buffers, one or more cache disks, and RAID5.]
Performance Results
Test environment: Gateway G6-400, 64MB RAM, 4MB RAM buffer, 200MB cache disk, 4 SCSI disks forming a disk array.
Benchmarks
  PostMark by Network Appliance
  Untar/copy/remove
Compared to built-in RAID0 and RAID5
Throughput
Series              RAID 0   VC-RAID   RAID 5
Small (1k+50k)      1111     941       561
Medium (20k+50k)    68       63        30
Large (20k+100k)    31       28        16
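To read the throughput table more easily, the short sketch below computes how VC-RAID compares with RAID 5 and RAID 0 for each configuration. It only divides the reported numbers; the dictionary layout and function name are illustrative choices.

```python
# Throughput figures copied from the table (rows: PostMark configurations).
throughput = {
    "Small (1k+50k)": {"RAID 0": 1111, "VC-RAID": 941, "RAID 5": 561},
    "Medium (20k+50k)": {"RAID 0": 68, "VC-RAID": 63, "RAID 5": 30},
    "Large (20k+100k)": {"RAID 0": 31, "VC-RAID": 28, "RAID 5": 16},
}


def vc_raid_ratios(row):
    """Return VC-RAID throughput relative to RAID 5 and to RAID 0."""
    return row["VC-RAID"] / row["RAID 5"], row["VC-RAID"] / row["RAID 0"]


for name, row in throughput.items():
    vs_raid5, vs_raid0 = vc_raid_ratios(row)
    print(f"{name}: {vs_raid5:.2f}x RAID 5, {vs_raid0:.0%} of RAID 0")
```

In every configuration VC-RAID lands well above RAID 5 and somewhat below RAID 0, which has no parity to maintain.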
[Figures: response time (seconds) for untar, remove and copy; systems: VC-RAID, RAID5, RAID0; series: synchronous, asynchronous.]
Summary
Reliable:
  Based on RAID5
  Hard drive is more reliable than RAM
Cost effective:
  Hard drives are much cheaper than RAM
  Implemented in software, no extra hardware needed
Fast: increases the cache size
Performance Eval. on Distributed Web Server Architectures
Observations
E-commerce has grown explosively.
Static web pages that are stored as files are no longer the dominant web accesses.
About 70% of web accesses start CGI, ASP, or Servlet calls to generate dynamic pages.
Web server behaviors and the interaction between web servers and database servers
[Figure: five server configurations (a) to (e), built from web servers (WS), database servers (DBS) and a proxy (WS selector).]
Benchmark and workloads
Workloads
  Static pages
  Light CGI: 20% / 80%
  Heavy CGI: 90% / 10%
  Heavy servlet: 90% / 10%
  Heavy database access: 90% / 10%
  Mixed workload: 7% / 8% / 30% / 55%
Benchmark: WebBench 3.5 (6010 static pages, 300 CGI, 300 simple servlets, 400 DB servlets using JDBC, 2 databases with 15 and 18 tables)
[Figure: Throughput (static pages); x-axis: clients (1 to 100); y-axis: requests/sec; series: 1ws, 2ws, 3ws.]
[Figure: Throughput (light CGI); same axes and series.]
[Figure: Throughput (heavy CGI); same axes and series.]
[Figure 5(a): Throughput (heavy servlet); same axes and series.]
[Figure 7(a): Throughput (heavy database access); x-axis: clients (1 to 100); y-axis: requests/sec; series: 1wsdbs, 1ws1dbs, 2ws1dbs, 2ws2dbs, 3ws2dbs.]
[Figure: Throughput (mixed workload); same axes and series as the heavy database access figure.]
[Figure: CPU time distribution (PC Server3), CGI runs on Apache; x-axis: clients (1 to 55); y-axis: CPU time (%); series: idle, user, kernel.]
[Figure: CPU time distribution (PC Server3), servlet runs on Java web server; same axes and series.]
Conclusions
Summary
STICS couples reliable and high-speed data caching with low-overhead conversion between SCSI and IP.
DRALIC boosts the web server performance by combining disk controller and NIC to reduce FS overhead.
vcRAID presents a reliable and inexpensive solution for data storage.
We carried out an extensive performance study on distributed web server architectures under realistic workloads.
Patents (with Dr. Yang)
STICS: SCSI-To-IP Cache Storage, filed, pending, Serial Number 60/312,471, August 2001
DRALIC: Distributed RAID and Location Independence Cache, filed, pending, May 2001
Publications (Journal)
1. Xubin He, Qing Yang, and Ming Zhang, “STICS: SCSI-To-IP Cache for Storage Area Networks,” Submitted to IEEE Transactions on Parallel and Distributed Systems.
2. Xubin He, Qing Yang, “Performance Evaluation of Distributed Web Server Architectures under E-Commerce Workloads,” Submitted to Journal of Parallel and Distributed Computing.
3. Xubin He, Qing Yang, “On Design and Implementation of a Large Virtual NVRAM Cache for Software RAID,” Special Issue of Journal on Parallel I/O for Cluster Computing, 2002.
Publications (Conference)
1. Xubin He, Qing Yang, and Ming Zhang, “A Caching Strategy to Improve iSCSI Performance,” To appear in IEEE Annual Conference on Local Computer Networks, Nov. 6-8, 2002.
2. Xubin He, Qing Yang, and Ming Zhang, “Introducing SCSI-To-IP Cache for Storage Area Networks,” ICPP’2002, Vancouver, Canada, August 2002.
3. Xubin He, Ming Zhang, Qing Yang, “DRALIC: A Peer-to-Peer Storage Architecture”, Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'2001), 2001.
4. Xubin He, Qing Yang, “Characterizing the Home Pages”, Proc. of the 2nd International Conference on Internet Computing (IC’2001), 2001.
5. Xubin He, Qing Yang, “VC-RAID: A Large Virtual NVRAM Cache for Software Do-it-yourself RAID”, Proc. of the International Symposium on Information Systems and Engineering (ISE'2001), 2001.
6. Xubin He, Qing Yang, “Performance Evaluation of Distributed Web Server Architectures under E-Commerce Workloads”, Proc. of the 1st International Conference on Internet Computing (IC’2000), 2000.
Thank You!
Dr. Qing Yang @ELE
Dr. Jien-Chung Lo @ELE
Dr. Joan Peckham @CS
Dr. Peter Swaszek @ELE
Dr. Lisa DiPippo @CS
And more…
Special thanks to my daughter, Rachel!