Storage Area Network Usage
A UNIX SysAdmin’s View of How A SAN Works
Disk Storage
Embedded Internal Disks within the System Chassis
Directly Attached External Chassis of Disks connected to a Server via a Cable
Directly Attached Shared External Chassis connected to more than one Server via a Cable
Networked Storage: NAS, SAN, others
Disk Storage – 2000-2004
Type      Bus Speed   Distance    Cable Pins
ATA       100 MB/s    18 inches   40
SCSI      320 MB/s    12 m        68 or 80
FC        400 MB/s    10 km       4
SATA-II   300 MB/s    6 m         22
SAS       300 MB/s    10 m        22
Deficiencies of Direct Connect Storage
Single System Bears Entire Cost of Storage: a Small Server in an EMC Shop; a Large Server cannot easily share its unused storage
Manageability: Fragmented and Isolated
Scalability: Limited. What happens when you run out of peripheral bus slots?
Availability: “SCSI Bus Reset”; Failover is a complicated add-on, if available at all
DASD
Direct Access Storage Device
They still call it this in an IBM Mainframe Shop
Basic Limits of Disk Storage Recognized
Latency: Rotation Speed of the disk
Seek Time: Radial Movement of the Read/Write Heads
Buffer Sizes: “Stop sending me data, I can’t write fast enough!”
SCSI
SCSI – Small Computer System Interface
From Shugart’s 1979 SASI implementation (SASI: Shugart Associates System Interface)
Both Hardware and I/O Protocol Standards: Both have evolved over time; Hardware is source of most limitations; I/O Protocol has long-term potential
SCSI - Pro
Device Independence: Mix and match device types on the bus (Disk, Tape, Scanners, etc.)
Overlapping I/O Capability: Multiple read & write commands can be outstanding simultaneously
Ubiquitous
SCSI - Con
Distance vs. Speed: Double the Signaling Rate (Speed: 40, 80, 160, 320 MBps), Halve the Cable Length Limits
Device Count: 16 Maximum. Low Voltage Differential Ultra3 SCSI can support only 16 devices on a 12 meter cable at 160 MBps
Server Access to Data Resources: Hardware changes are disruptive
SCSI – Overcoming the Con
New Hardware & Signaling Platforms
SCSI-3 Introduces Serial SCSI Support:
Fibre Channel
Serial Storage Architecture (SSA): Primarily an IBM implementation
FireWire (IEEE 1394 – Apple fixes SCSI): Attractive in consumer market
Retains SCSI I/O Protocol
Scaling SCSI Devices
Increase Controller Count within Server
Increasing Burden To CPU: Device Overhead
Bus Controllers can be saturated
You can run out of slots
Many Queues, Many Devices: Queuing Theory 101 (check-out line) - undesirable
Scaling SCSI Devices
Use Dedicated External Device Controller
Hides Individual Devices: Provide One Large Virtual Resource
Offloads Device Overhead
One Queue, Many Devices - good
Cost and Benefit: Still borne by one system
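The check-out line comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: all jobs arrive at time 0, the service times are invented, and "many queues" means jobs are dealt round-robin to per-device queues, while "one queue" means the next job goes to whichever device frees up first.

```python
import heapq

def separate_queues(service_times, n_devices):
    """Many queues: jobs are dealt round-robin, each device drains its own queue."""
    total_wait = 0
    for device in range(n_devices):
        clock = 0
        for service in service_times[device::n_devices]:
            total_wait += clock      # job waits for earlier jobs in its own queue
            clock += service
    return total_wait

def shared_queue(service_times, n_devices):
    """One queue: the next job goes to whichever device becomes idle first."""
    free_at = [0] * n_devices        # min-heap of times at which devices go idle
    heapq.heapify(free_at)
    total_wait = 0
    for service in service_times:
        start = heapq.heappop(free_at)
        total_wait += start          # all jobs arrived at time 0
        heapq.heappush(free_at, start + service)
    return total_wait

jobs = [8, 1, 1, 1, 8, 1, 1, 1]      # invented service times, arbitrary units
print("many queues, total wait:", separate_queues(jobs, 2))
print("one queue,   total wait:", shared_queue(jobs, 2))
```

With this workload the two long jobs land in the same per-device queue and hold up the short jobs behind them. A single shared queue lets the idle device keep draining short jobs.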
RAID
Redundant Array of Inexpensive Disks
Combine multiple disks into a single virtual device
How this is implemented determines different strengths:
Storage Capacity
Speed: Fast Read or Fast Write
Resilience in the face of device failure
RAID Functions
Striping: Write consecutive logical byte/blocks on consecutive physical disks
Mirroring: Write the same block on two or more physical disks
Parity Calculation: Given N disks, N-1 consecutive blocks are data blocks, Nth block is for parity
When any of the N-1 data blocks are altered, N-2 XOR calculations are performed on these N-1 blocks
The Data Block(s) and Parity Block are written
Destroy one of these N blocks, and that block can be reconstructed using N-2 XOR calculations on the remaining N-1 blocks
Destroy two or more blocks – reconstruction is not possible
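The parity arithmetic can be tried out directly. The sketch below assumes N = 4 (three data blocks plus one parity block) and uses made-up two-byte blocks; `xor_blocks` is an illustrative helper, not part of any RAID product.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length blocks byte by byte (len(blocks) - 1 XORs per byte)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# N = 4 disks: N-1 = 3 data blocks, and the Nth block holds parity.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)                      # N-2 = 2 XORs per byte position

# Lose data[1]: rebuild it from the remaining N-1 blocks (two data + parity).
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Losing a second block leaves only N-2 blocks, and the XOR of those no longer determines either missing block.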
RAID Function – Pro & Con
Striping: Pro: Increases Spindle Count for Increased Thruput. Con: Does not provide redundancy
Mirroring: Pro: Provides Redundancy without Parity Calculation. Con: Requires at least 100% disk resource overhead
Parity Calculation: Pro: Cuts Disk Resource Overhead to 1/N. Con: Parity calculation is expensive (N-2 calculations are required; if all N-1 data blocks are not in cache, they must be read)
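The capacity side of these trade-offs is simple arithmetic. The function below is a rough sketch, assuming equal-sized disks and expressing usable space as a fraction of raw capacity; the scheme names are labels invented for this example.

```python
def usable_fraction(scheme, n_disks, copies=2):
    """Fraction of raw capacity left for data, assuming equal-sized disks."""
    if scheme == "stripe":
        return 1.0                        # no redundancy, nothing given up
    if scheme == "mirror":
        return 1.0 / copies               # every block stored `copies` times
    if scheme == "parity":
        return (n_disks - 1) / n_disks    # one disk's worth (1/N) holds parity
    raise ValueError(f"unknown scheme: {scheme}")

for scheme in ("stripe", "mirror", "parity"):
    print(scheme, usable_fraction(scheme, n_disks=5))
```

A two-way mirror keeps half the raw space, which is the "100% overhead" case. A five-disk parity set keeps four fifths.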
RAID Types
RAID 0 Stripe with No Parity
RAID 1 Mirror two or more disks
RAID 0+1 Stripe on Inside, Mirror on Outside
RAID 1+0 Mirrors on Inside, Stripe on Outside
RAID 3 Synchronous, Subdivided Block Access; Dedicated Parity Drive
RAID 4 Independent, Whole Block Access; Dedicated Parity Drive
RAID 5 Like RAID 4, but Parity striped across multiple drives
Diagrams: RAID 0, RAID 1, RAID 3, RAID 5, RAID 1+0, RAID 0+1
Breaking the Direct Connection
Now you have high performance RAID
The storage bottleneck has been reduced
You’ve invested $$$ to do it
How do you extend this advantage to N servers without spending N x $$$?
How about using existing networks?
How to Provide Data Over IP
NFS (or CIFS) over a TCP/IP Network: This is Network Attached Storage (NAS)
Overcomes some distance problems
Full Filesystem Semantics are Lacking, such as file locking
Speed and Latency are problems
Security and Integrity are problems as well
IP encapsulation of I/O Protocols: Not yet established in the marketplace; Current speed & security issues
NAS and SAN
NAS – Network Attached Storage: File-oriented access; Multiple Clients, Shared Access to Data
SAN – Storage Area Network: Block-oriented access; Single Server, Exclusive Access to Data
NAS: Network Attached Storage
File Objects and Filesystems: OS Dependent; OS Access & Authentication
Possible Multiple Writers: Require locking protocols
Network Protocol: i.e., IP
“Front-end” Network
SAN: Storage Area Network
Block Oriented Access To Data
Device-like Object is presented
Unique Writer
I/O Protocol: SCSI, HIPPI, IPI
“Back-end” Network
A Storage Area Network
Storage: StorageWorks MA8000 (24), EVA (2); HDS is 2nd Approved Storage Vendor (9980 Enterprise Storage Array – EMC class storage)
Switches: Brocade 12000 (8), 3800 (20), & 2800 (34); 3900’s are being deployed – 32 port
UNIX Servers on the SAN: Solaris (56), IRIX (5), HP-UX (5), Tru64 (1)
Storage Volume Connected to UNIX Servers: 13000 GB as of May, 2003
Windows Servers: Windows 2000 (74), NT 4.0 (16)
SAN Implementations
FibreChannel: FC Signalling Carrying SCSI Commands & Data; Non-Ethernet Network Infrastructure
iSCSI: SCSI Encapsulated By IP; Ethernet Infrastructure
FCIP – FibreChannel over IP: FibreChannel Encapsulated by IP; Extending FibreChannel over WAN Distances; Future Bridge between Ethernet & FibreChannel
iFCP - another gateway implementation
NAS & SAN in the Data Center
FCIP In The Data Center
FibreChannel
How SCSI Limitations are Addressed: Speed, Distance, Device Count, Access
FibreChannel – Speed
266 Mbps – ten years ago
1063 Mbps – common in 1998
2125 Mbps – available today
4 Gbps – near future products; Backward compatible to 1 & 2 Gbps
10 Gbps – 2005? Not backward Compatible with 1/2/4 Gbps; But 10 Gig Ethernet will compete; Remember FDDI & ATM
Why I/O Protocols are Coming to IP
IP Networking is ubiquitous
Gigabit ethernet is here; 10 Gbps ethernet is just becoming available
Don’t have to invest in a second network: Just upgrade the one you have
IP & Ethernet software is well understood
Existing talent pool for vendors to leverage: Developers, not end-user Network Engineers
FibreChannel – Distance
1063 Mbps: 175 m (62.5 um – multi-mode); 500 m (50.0 um – multi-mode); 10 km (9 um – single-mode)
2125 Mbps: 500 m (50.0 um – multi-mode); 2 km (9 um – single-mode)
FibreChannel – A Network
Layer 1 – Physical (Media: fiber, copper): Fibre: 62.5, 50.0, & 9.0 um; Copper: Cat6, Twinax, Coax, other
Layer 2 – Data Link (Network Interface & MAC): WWPN: World Wide Port Name; WWNN: World Wide Node Name
In a single port node, usually WWPN = WWNN; 64-bit device address, Comparable to 48-bit Ethernet device addresses
Layer 3 – Network (IP & SCSI): 24-bit fabric address, Comparable to an IP address
FibreChannel Terminology: Port Types
N_Port Node port – Computer, Disk, or Storage Node
F_Port Fabric port – Found only on a Switch
E_Port Expansion Port – Switch to Switch port
NL_Port Node port with Arbitrated Loop Capabilities
FL_Port Fabric port with Arbitrated Loop Capabilities
G_Port Generic Switch Port: Can act as any of F_Port, E_Port, or FL_Port
FibreChannel - Topology
Point-to-Point
Arbitrated Loop
Fabric
FibreChannel – Point-to-point
Direct Connection of Server and Storage Node
Two N_Ports and One Link
FibreChannel - Arbitrated Loop
Up to 126 Devices in a Loop via NL_Ports
Token-access, Polled Environment (like FDDI)
Wait For Access Increases with Device Count
FibreChannel - Fabric
Arbitrary Topology
Requires At Least One Switch
Up to 15 million ports can be concurrently logged in with the 24-bit address ID.
Dedicated Circuits between Servers & Storage via Switches
Interoperability Issues Increase With Scale
FibreChannel – Device Count
126 devices in Arbitrated Loop
15 Million in a fabric (24-bit addresses)
Bit 0-7: Port or Arbitrated Loop addr
Bit 8-15: Area, identifies FL_Port
Bit 16-23: Domain, address of switch (239 of 256 addresses available)
256 x 256 x 239 = 15,663,104
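The three-byte layout of the fabric address can be unpacked with shifts and masks. This is a small sketch; the sample address 0x010200 is made up, and `split_fc_address` is an illustrative helper rather than any vendor's API.

```python
def split_fc_address(addr):
    """Split a 24-bit fabric address into (domain, area, port) bytes."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("fabric addresses are 24 bits wide")
    domain = (addr >> 16) & 0xFF   # bits 16-23: address of the switch
    area = (addr >> 8) & 0xFF      # bits 8-15: area
    port = addr & 0xFF             # bits 0-7: port or arbitrated loop address
    return domain, area, port

print(split_fc_address(0x010200))

# Address space with 239 usable domain values, as in the arithmetic above.
print(256 * 256 * 239)
```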
FibreChannel Definitions
WWPN
Zone & Zoning
LUN
LUN Masking
FibreChannel - WWPN
World-Wide Port Name
A unique 64-bit hardware address for each FibreChannel Device
Analogous to a 48-bit ethernet hardware address
WWNN - World-Wide Node Name
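Tools commonly display a 64-bit WWPN as eight colon-separated hex bytes, much like an Ethernet MAC address. The helpers below are a sketch of that conversion; the sample value is invented and the function names are illustrative.

```python
def format_wwn(value):
    """Render a 64-bit WWPN/WWNN as eight colon-separated hex bytes."""
    return ":".join(f"{byte:02x}" for byte in value.to_bytes(8, "big"))

def parse_wwn(text):
    """Inverse of format_wwn: colon-separated hex string back to an integer."""
    return int(text.replace(":", ""), 16)

sample = 0x10000000C9123456        # invented value, for illustration only
print(format_wwn(sample))
```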
FibreChannel – Zone & Zoning
Switch-Based Access Control
Analogous to an Ethernet Broadcast Domain
Soft Zone: Zoning based on WWPN of Nodes Connected (Preferred)
Hard Zone: Zoning Based on Port Number on Switch to which the Nodes are Connected
FibreChannel - LUN
Logical Unit
Storage Node Allocates Storage and Assigns a LUN
Appears to the server as a unique device (disk)
FibreChannel – LUN Masking
Storage Node Based Access Control List (ACL)
LUNs and Visible Server Connections (WWPN) are allowed to see each other thru the ACL.
LUNs are Masked from Servers not in the ACL
LUN Security
Host Software
HBA-based firmware or driver configuration
Zoning
LUN Masking
LUN Security
Host-based & HBA
Both these methods rely on correct security implemented at the edges
Most difficult to manage due to large numbers and types of servers
Storage Managers may not be Server Managers
Don’t trust the consumer to manage resources: Trusting the fox to guard the hen house
LUN Security
Zoning
An access control list
Establishes a conduit: A circuit will be constructed thru this
Allows only selected Servers to see a Storage Node
Lessons learned: Implement in parallel with LUN Masking; Segregate OS types into different Zones; Always Promptly Remove Entries For Retired Servers
LUN Security
LUN Masking
The Storage Node’s Access Control List
Sees the Server’s WWPN
Masks all LUNs not allocated to that server
Allows the Server to see only its assigned LUNs
Implement in parallel with Fabric Zoning
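Running zoning and LUN masking in parallel means a server sees a LUN only when both checks pass. The toy model below illustrates that idea. It is not how any particular switch or array stores its configuration, and every WWPN, zone name, and LUN number in it is invented.

```python
# Hypothetical fabric zones: zone name -> member WWPNs (all values invented).
ZONES = {
    "solaris_zone": {"10:00:00:00:c9:00:00:01", "50:00:00:00:00:00:00:a1"},
    "windows_zone": {"10:00:00:00:c9:00:00:02", "50:00:00:00:00:00:00:a1"},
}

# Hypothetical storage-node ACL: LUN number -> server WWPNs allowed to see it.
LUN_MASK = {
    0: {"10:00:00:00:c9:00:00:01"},
    1: {"10:00:00:00:c9:00:00:02"},
    2: {"10:00:00:00:c9:00:00:01"},
}

def in_same_zone(server, storage, zones=ZONES):
    """Switch-side check: do the two ports share at least one zone?"""
    return any(server in members and storage in members
               for members in zones.values())

def visible_luns(server, storage, zones=ZONES, lun_mask=LUN_MASK):
    """A LUN is visible only if zoning AND the storage node's ACL both allow it."""
    if not in_same_zone(server, storage, zones):
        return []
    return sorted(lun for lun, allowed in lun_mask.items() if server in allowed)

array_port = "50:00:00:00:00:00:00:a1"
print(visible_luns("10:00:00:00:c9:00:00:01", array_port))
print(visible_luns("10:00:00:00:c9:00:00:99", array_port))
```

Because both lists are checked, an entry left behind in one list for a retired server is still caught by the other.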
LUN - Persistent Binding
Persistent Binding of LUNs to Server Device IDs
Permanently assign a System SCSI ID to a LUN.
Ensures the Device ID Remains Consistent Across Reconfiguration Reboots
Different HBAs use different binding methods & syntax
Tape Drive Device Changes have been a repeated source of NetBackup Media Server Failure
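The sketch below shows why binding matters, using invented names throughout. Real HBAs implement this in driver or firmware configuration with their own syntax, so this is only a model of the idea: an ID derived from discovery order can move between boots, while an ID looked up in a saved table cannot.

```python
def enumerate_by_discovery(discovered):
    """No binding: IDs simply follow the order in which devices are found."""
    return {device: target_id for target_id, device in enumerate(discovered)}

class BindingTable:
    """Hypothetical persistent table: (storage WWPN, LUN) -> fixed target ID."""

    def __init__(self):
        self._ids = {}

    def target_id(self, wwpn, lun):
        key = (wwpn, lun)
        if key not in self._ids:           # first sighting: assign next free ID
            self._ids[key] = len(self._ids)
        return self._ids[key]              # later sightings: same ID as before

tape = ("wwpn_tape", 0)                    # placeholder names, not real WWPNs
disk = ("wwpn_disk", 0)
boot1 = [disk, tape]
boot2 = [tape, disk]                       # same devices, found in another order

print(enumerate_by_discovery(boot1)[tape], enumerate_by_discovery(boot2)[tape])

table = BindingTable()
first = [table.target_id(*device) for device in boot1]
second = [table.target_id(*device) for device in boot2]
print(first, second)
```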
SAN Performance
Storage Configuration
Fabric Configuration
Server Configuration
SAN - Storage Configuration
More Spindles are Better
Faster Disks are Better
RAID 1+0 vs. RAID 5
“RAID 5 performs poorly compared to RAID 0+1 when both are implemented with software RAID” (Allan Packer, Sun Microsystems, 2002)
Where does RAID 5 underperform RAID 1+0? Random Write
Limit Partition Numbers Within RAIDsets
SAN - Fabric Configuration
Common Switch for Server & Storage: Multiple “hops” reduce performance; Increases Reliability
Large Port-count switches: 32 ports or more; 16 port switches create larger fabrics simply to carry their own overhead
SAN - Server Configuration
Choose The Highest Performance HBA Available: PCI: 64-bit is better than 32-bit; PCI: 66 MHz is better than 33 MHz
Place in the Highest Performance Slot: Choose the widest, fastest slot in the system; Choose an Underutilized Controller
Size LUNs by RAIDset disk size: BAD: LUN sizes smaller than underlying disk size
SAN Resilience
At Least Two Fabrics
Dual Path Server Connections: Each Server N_Port is Connected to a Different Fabric; Circuit Failover upon Switch Failure
Automatic Traffic Rerouting
Hot-Pluggable Disks & Power Supplies
SAN Resilience – Dual Path
Multiple FibreChannel Ports within Server
Active/Passive Links
Most GPRD SAN disruptions have affected single-attached servers
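Active/passive path selection reduces to "use the first healthy path in priority order". The function below is a minimal sketch of that rule with invented fabric names; real multipathing drivers do considerably more, such as health probing and failback.

```python
def select_path(paths):
    """Return the first healthy path in priority order, or None if all are down.

    `paths` is a list of (name, healthy) pairs; the first entry is the active
    path and later entries are passive standbys.
    """
    for name, healthy in paths:
        if healthy:
            return name
    return None

dual = [("fabric_a", True), ("fabric_b", True)]
dual_after_switch_failure = [("fabric_a", False), ("fabric_b", True)]
single_after_switch_failure = [("fabric_a", False)]

print(select_path(dual))
print(select_path(dual_after_switch_failure))
print(select_path(single_after_switch_failure))
```

The single-attached server has nothing to fall back on.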
SAN – Good Housekeeping
Stay Current With OS Drivers & HBA Firmware
Before You Buy a Server’s HBA: Is it supported by the switch & storage vendors?
Coordinate Firmware Upgrades: Storage & Other Server Admin Teams Using SAN
Monitor Disk I/O Statistics: Be Proactive; Identify and Eliminate I/O Problems
SAN Backups – Why We Should
Why We Should: Offload Front-end IP Network (Most Servers are still connected to 100baseT IP); 1 or 2 Gbps FC Links Increase Thruput; Shrink Backup Times
Why We Don’t: Cost. NetBackup Media Server License: starts at $5K list
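A back-of-the-envelope calculation shows how much link speed can shrink a backup window. It assumes nominal line rates with no protocol overhead and a made-up 100 GB backup set, so real transfers will be slower. The 100, 1000, and 2000 Mbps figures stand in for 100baseT and 1 or 2 Gbps FC links.

```python
def transfer_hours(gigabytes, link_mbps):
    """Hours to move `gigabytes` (decimal GB) at a nominal rate of `link_mbps`."""
    bits = gigabytes * 8 * 1000**3
    seconds = bits / (link_mbps * 1e6)
    return seconds / 3600

for label, mbps in (("100baseT", 100), ("1 Gbps FC", 1000), ("2 Gbps FC", 2000)):
    print(f"{label}: {transfer_hours(100, mbps):.2f} hours for 100 GB")
```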
Backup Futures
Incremental Backups: No longer stored on tape; Use “near-line” cheap disk arrays; Several vendors are under current evaluation
Still over IP: 1 Gbps ethernet is commonly available on new servers; 10 Gbps ethernet needed in core
Questions