Overview of Lustre. ECE, U of MN. Changjin Hong (Prof. Tewfik’s group). Monday,...
Post on 22-Dec-2015
Outline
• Reference
• Lustre Cluster
• Lustre System Components
• Distributed Lock Manager
• Object Based Storage
• Conclusion (security issues)
Reference
• Lustre: A SAN File System for Linux
  – http://www/lustre.org/docs/lustre/luswhite.pdf
• Several presentation materials from Dr. Peter J. Braam
A Lustre Cluster
[Figure: Lustre cluster diagram; surviving labels: 10,000’s, 10’s of nodes, 1,000’s]
Key Design Issue : Scalability
• I/O throughput
  – How to avoid bottlenecks
• Metadata scalability
  – How can 10,000’s of nodes work on files in the same folder
• Cluster recovery
  – If something fails, how can transparent recovery happen
• Management
  – Adding, removing, replacing systems; data migration & backup
System Components
Interaction between systems
[Figure: interactions among Client, MDS, and OST; surviving labels:]
• CMD protocol: (directory) metadata handling, inode updates, concurrency
• Pre-allocation, file creation, recovery purposes, file status
• OS protocol: file I/O, allocation of blocks, striping, security enforcement
Client File System
• A directory tree, subdivided into filesets, for cluster-wide Unix file sharing semantics
• CMD protocol
  – Transaction-based
  – Authenticated access
  – Write-behind caching for MD updates with strict data/metadata coherency
Metadata Service (MDS)
• All access to a file is governed by the MDS, which directly or indirectly authorizes access.
• Controls the namespace and manages inodes
• Load-balanced cluster service for scalability (a well-balanced API, a stackable framework for logical MDS, replicated MDS)
• Journaled, batched metadata updates
Object Storage Targets (OST)
• Keeps file data objects
• File I/O service → access to the objects
• Block allocation for data objects is done here, which makes allocation distributed and scalable
• OST s/w modules
  – OBD server, lock server
  – Obj. storage driver, OBD filter
  – Portal API
Distributed Lock Manager
• Adapted from the VAXcluster DLM
• For a generic and rich lock service
• Lock resources: resource database
  – Organize resources in trees
• High performance
  – The node that acquires a resource manages its tree
Big Picture
[Figure: resource tree and namespace. Surviving labels: <namespace> with Name1, Name2, Name3, Name4, ...; Obj.1 to Obj.4; resource manager with resources (R); distributed resource directory / hash function (LDWV) / lock directory; Apps.]
Mechanism in resource dB
• Hash the binary string, take it % N → get h
• Look up the system in the lock directory weight vector (LDWV) at slot [h] → find system K
• Systems
  – may occupy 0, 1 or more slots in the LDWV
  – The number of slots a system occupies is its lock directory weight
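The lookup on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: N is taken to be the number of slots in the LDWV (the slide does not define it), SHA-1 stands in for whatever hash the real lock manager uses, and the node names and helper functions `build_ldwv` and `directory_node` are hypothetical.

```python
import hashlib

def build_ldwv(weights):
    """Build a lock directory weight vector (LDWV).

    Each system appears once per unit of its lock directory weight,
    so a weight of 0 means the system occupies no slots.
    """
    ldwv = []
    for system, weight in weights.items():
        ldwv.extend([system] * weight)
    return ldwv

def directory_node(resource_name, ldwv):
    """Hash the binary resource name, reduce it modulo N, index the LDWV."""
    n = len(ldwv)  # assumption: N is the number of LDWV slots
    digest = hashlib.sha1(resource_name).digest()  # stand-in hash function
    h = int.from_bytes(digest[:8], "big") % n
    return h, ldwv[h]

# Hypothetical cluster: node-a has weight 2, node-b weight 1, node-c weight 0.
weights = {"node-a": 2, "node-b": 1, "node-c": 0}
ldwv = build_ldwv(weights)
h, system = directory_node(b"/fileset1/dir/file", ldwv)
print(f"slot {h} -> {system}")
```

Because a system's share of slots equals its weight, a system with weight 2 is consulted for roughly twice as many resource names as one with weight 1, and a system with weight 0 is never chosen.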
Lustre DLM features
• Low concurrency
  – Want write-back caching
• High concurrency
  – Want load balancing in the cluster
  – Subdivide directories etc. with hashes
  – Want the server of a request to limit lock revocations → ops. on the MD cluster in a client-server RPC model
• Deadlock detection
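The slides do not say how deadlocks are detected. One generic approach, shown here purely as an assumption and not as Lustre's method, is to keep a wait-for graph (who is blocked on whose lock) and search it for a cycle. The client names and the function `find_deadlock` are hypothetical.

```python
def find_deadlock(wait_for):
    """Return one cycle in the wait-for graph as a list of nodes, or None.

    wait_for maps a lock holder or waiter to the nodes it is blocked on.
    Uses a depth-first search with three colours (white, grey, black).
    """
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a cycle is closed.
                return stack[stack.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(wait_for):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

# c1 waits on c2, c2 on c3, c3 on c1: a three-way deadlock.
blocked = {"c1": ["c2"], "c2": ["c3"], "c3": ["c1"]}
print(find_deadlock(blocked))
```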
Object Based Storage
• Object Based Storage Device
  – More intelligent than a block device
• Speaks storage at the “inode level”
  – create, unlink, read, write, getattr, setattr…
  – Iterators, security, almost arbitrary processing
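To make the “inode level” interface concrete, here is a toy in-memory object store exposing the operations named on the slide. It is a sketch only: the class name `ObjectStorageDevice`, the method signatures, and the `size` attribute are assumptions for illustration, and a real device would allocate disk blocks behind this interface.

```python
import itertools

class ObjectStorageDevice:
    """Toy in-memory object store; callers see objects, never blocks."""

    def __init__(self):
        self._data = {}
        self._attrs = {}
        self._ids = itertools.count(1)

    def create(self, obj_id=None):
        """Create an object, optionally with a caller-chosen id."""
        if obj_id is None:
            obj_id = next(i for i in self._ids if i not in self._data)
        elif obj_id in self._data:
            raise FileExistsError(obj_id)
        self._data[obj_id] = bytearray()
        self._attrs[obj_id] = {"size": 0}
        return obj_id

    def unlink(self, obj_id):
        del self._data[obj_id]
        del self._attrs[obj_id]

    def write(self, obj_id, offset, payload):
        buf = self._data[obj_id]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill any gap
        buf[offset:offset + len(payload)] = payload
        self._attrs[obj_id]["size"] = len(buf)
        return len(payload)

    def read(self, obj_id, offset, length):
        return bytes(self._data[obj_id][offset:offset + length])

    def getattr(self, obj_id):
        return dict(self._attrs[obj_id])

    def setattr(self, obj_id, **attrs):
        self._attrs[obj_id].update(attrs)

osd = ObjectStorageDevice()
oid = osd.create()
osd.write(oid, 0, b"hello")
osd.write(oid, 5, b" world")
print(osd.read(oid, 0, 11), osd.getattr(oid))
```

The caller never sees block numbers; allocation is entirely the device's concern, which is the point of the object interface.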
Components of OB Storage
• Storage Object Device Drivers
  – Class drivers: attach driver to interface
    • Targets, clients: remote access
    • Direct drivers: to manage physical storage
    • Logical drivers: for intelligence & storage management
• Object storage application (OSA)
  – (cluster) file systems
  – Advanced storage: parallel I/O, snapshots
  – Specialized apps.: caches, db’s, filesrv
System Interface
• Modules
  – Load the kernel modules to get drivers of a certain type
  – Name devices to be of a certain type
  – Build stacks of devices with assigned types
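The three steps above (get driver types, name devices of a type, stack them) can be mimicked in user space. The sketch below is an analogy only, not the kernel interface: the registry `DRIVER_TYPES`, the decorator `register_driver`, the two driver classes, and `build_stack` are all hypothetical names.

```python
DRIVER_TYPES = {}

def register_driver(type_name):
    """Stand-in for loading a module that provides drivers of one type."""
    def wrap(cls):
        DRIVER_TYPES[type_name] = cls
        return cls
    return wrap

@register_driver("direct")
class DirectDriver:
    """Bottom of a stack: keeps objects in a dict instead of on disk."""
    def __init__(self, lower=None):
        self.objects = {}
    def write(self, obj_id, data):
        self.objects[obj_id] = data
    def read(self, obj_id):
        return self.objects[obj_id]

@register_driver("logging")
class LoggingDriver:
    """A logical driver: records each operation, then forwards it down."""
    def __init__(self, lower):
        self.lower = lower
        self.log = []
    def write(self, obj_id, data):
        self.log.append(("write", obj_id))
        self.lower.write(obj_id, data)
    def read(self, obj_id):
        self.log.append(("read", obj_id))
        return self.lower.read(obj_id)

def build_stack(device_specs):
    """Build a stack from bottom-first (device_name, type_name) pairs."""
    devices, lower = {}, None
    for name, type_name in device_specs:
        lower = DRIVER_TYPES[type_name](lower)  # new device sits on the previous one
        devices[name] = lower
    return lower, devices

top, devices = build_stack([("disk0", "direct"), ("log0", "logging")])
top.write(7, b"payload")
print(top.read(7), devices["log0"].log)
```

Because every driver exposes the same object operations, a logical driver can be inserted into or removed from a stack without the layers above noticing.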
Layering of Object Drivers
Interaction of Obj. Storage s/w modules
Benefits-clustering/SM
• Suitable for use in a SAN file system
• Shared at the level of an individual block
• Obj namespace: divided into obj groups. Being able to create objs w/ given obj id’s is very advantageous. Good for snapshots!
• Hot file migration
Conclusion
• Object Based Storage
  – Processes disk operations at the higher level of individual files and file inodes, rather than at the low-level h/w disk block level
• Security Issues
  – Auxiliary services in the cluster
    • LDAP, PKI, Kerberos
  – Purpose
    • CFS / MDS / OST
      – Authenticate to each other
      – Set up session keys, etc.
• GSS-API for authentication and integrity checks
• Remote DMA
  – The layer must NEVER bypass security processing
  – Requests are processed by a higher-level layer in the networking stack, which checks authentication