IBM Spectrum Scale Fundamentals Workshop for Americas, Part 3: Information Lifecycle Management
Spectrum Scale 4.1 System Administration
Spectrum Scale
Information Lifecycle Management (ILM) Tools
© Copyright IBM Corporation 2015
Unit objectives
After completing this unit, you should be able to:
• Understand the value of Information Lifecycle Management
• Understand how ILM is managed in Spectrum Scale
• Understand Storage Pools & Policy Engine Queries
– File placement/movement policies
– File Analysis Queries
– File management policies
– Working with Filesets
What is ILM?
• You define the policy based on business rules
• Policies apply to ILM (disk/analysis) and HSM (tape movement)
• It's built into Spectrum Scale (no additional licensing)
– Transparently manages file data using a set of rules.
– Allows for dry-run testing (to test your policy before you apply it)
– Not Easy Tier, but it is a set-it-and-forget-it policy engine
IBM Spectrum Scale can help you achieve ILM efficiencies through powerful policy driven,
automated, tiered storage management. The Spectrum Scale ILM toolkit helps you manage
sets of files and pools of storage and enables you to automate the management of that data.
ILM = Information Lifecycle Management
The ability to manage information efficiently through the lifecycle of its value.
The policy engine's main competitive advantage is that it uses the metadata
engine to scan the metadata, rather than walking the file system, to build and
analyze work lists.
How does it help client ILM?
This is an incredibly valuable tool for client file system management.
Types of policies:
File placement policies are used to automatically place newly created files in a
specific file system pool.
Useful for tiering for efficiency or performance.
File management policies are used to manage files during their lifecycle by
moving them to another file system pool, moving them to nearline storage,
copying them to archival storage, changing their replication status, or deleting
them.
Analysis discovery policies can be used without the need to move or manage
data, and simply used for understanding something about the data that you
have.
*These tools are a huge competitive advantage used by all clients
How are ILM policies applied?
Policies are managed by administrator-defined rules.
Characteristics of a policy are as follows:
• A policy can contain any number of rules.
• A policy file is limited to a size of 1 MB.
A policy rule is an SQL-like statement that tells Spectrum Scale what to do with the data for a
file in a specific storage pool, if the file meets specific criteria. A rule can apply to any file
being created or only to files being created within a specific fileset or group of filesets.
Rules specify conditions that, when true, cause the rule to be applied. Conditions that cause
Spectrum Scale to apply a rule are as follows:
• Date and time when the rule is evaluated, that is, the current date and time
• Date and time when the file was last accessed
• Date and time when the file was last modified
• Fileset name
• File name or extension
• File size
• User ID and group ID
Creating a policy
Create a text file for your policy with the following guidelines:
– A policy must contain at least one rule.
– The last placement rule of a policy rule list should be a default rule, so that if no other
placement rules apply to a file, the file will be assigned to a default pool.
Installing a policy
Issue the mmchpolicy command
Changing a policy
Edit the text file containing the policy and issue the mmchpolicy command.
Listing policies
The mmlspolicy command displays policy information for a given file system.
Validating policies
The mmchpolicy -I test command validates but does not install a policy file.
How ILM policies are evaluated by Spectrum Scale
Spectrum Scale evaluates policy rules in order, from first to last, as they appear in the policy.
The first rule that matches determines what is to be done with that file. For example, when a
client creates a file, Spectrum Scale scans the list of rules in the active file-placement policy to
determine which rule applies to the file. When a rule applies to the file, Spectrum Scale stops
processing the rules and assigns the file to the appropriate storage pool. If no rule applies, an
EINVAL error code is returned to the application.
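The first-match behavior described above can be modeled in a few lines of Python. This is only a toy sketch, not Spectrum Scale code: the PlacementRule class, the place function, and the attribute dictionary are hypothetical names invented for illustration, and the rule names and pools are borrowed from the examples in this unit.

```python
import errno
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PlacementRule:
    name: str
    pool: str
    # Predicate over attributes known at creation time; None matches every file.
    matches: Optional[Callable[[dict], bool]] = None


def place(rules, attrs):
    """Return the pool named by the first rule that matches, scanning in order."""
    for rule in rules:
        if rule.matches is None or rule.matches(attrs):
            return rule.pool  # first match wins; later rules are not consulted
    # No rule applied: modeled here as the EINVAL the application would see.
    raise OSError(errno.EINVAL, "no placement rule applies")


# Hypothetical policy: one fileset-specific rule followed by a catch-all rule.
rules = [
    PlacementRule("xxlfiles", "gold", lambda a: a["fileset"] == "xxlfileset"),
    PlacementRule("otherfiles", "silver"),
]
```

With the catch-all rule last, every new file lands in some pool. Dropping it (for example, rules[:1]) makes place raise EINVAL for any file outside xxlfileset, which is why a default placement rule is recommended.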
Several rule types exist:
Placement policies, evaluated at file creation, for example:
– RULE 'xxlfiles' SET POOL 'gold' FOR FILESET ('xxlfileset')
– RULE 'otherfiles' SET POOL 'silver'
Migration policies, evaluated periodically, for example:
– RULE 'cleangold' MIGRATE FROM POOL 'gold' THRESHOLD(90,70) TO POOL 'silver'
– RULE 'cleansilver' WHEN day_of_week()=monday MIGRATE FROM POOL 'silver' TO POOL 'pewter'
WHERE access_age > 30 days
Deletion policies, evaluated periodically, for example:
– RULE 'purgepewter' WHEN day_of_month() = 1 DELETE FROM POOL 'pewter' WHERE
access_age > 365 days
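To make the threshold (90,70) idea in the cleangold rule concrete, here is a rough Python sketch of a threshold-driven migration. It assumes that migration starts once pool occupancy reaches the high percentage and stops once it has dropped to the low percentage, and it assumes (purely for illustration) that the least recently accessed files are moved first. The function name and the tuple layout are hypothetical.

```python
def select_for_migration(files, capacity_kb, high_pct, low_pct):
    """Pick files to move out of a pool.

    files is a list of (name, kb_allocated, access_age_days) tuples.
    Nothing is selected unless occupancy has reached high_pct; otherwise
    files are selected until occupancy would be at or below low_pct.
    """
    used_kb = sum(kb for _name, kb, _age in files)
    if used_kb * 100 < high_pct * capacity_kb:
        return []  # below the high threshold: no migration is triggered
    target_kb = capacity_kb * low_pct / 100
    chosen = []
    # Assumed ordering for this sketch: oldest access first.
    for name, kb, _age in sorted(files, key=lambda f: f[2], reverse=True):
        if used_kb <= target_kb:
            break
        chosen.append(name)
        used_kb -= kb
    return chosen


# Hypothetical pool contents: 950 KB used out of 1000 KB.
gold = [("a.dat", 400, 5), ("b.dat", 300, 40), ("c.dat", 250, 100)]
```

Here select_for_migration(gold, 1000, 90, 70) picks only c.dat, because moving it brings occupancy from 95% down to 70%.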
ILM tools
• Storage pools
– A collection of disks or arrays with similar properties that are managed together as a group.
• File placement policies
– Determine where the file data is placed on creation.
• File management policies
– Migrate or delete files based on business rules.
• Filesets
– Logical subtrees within a file system that act as metadata containers for files.
[Figure: storage pools, placement policies, management policies, and filesets]
What is a storage pool?
• Two types of Storage pools
– Internal
– External
• Internal: A collection of disks or arrays with similar properties
that are managed together as a group.
– Used to:
• Group storage devices and create classes of storage within a file system
• Match the cost of storage to the value of the data
• Improve performance
• Improve reliability.
• External
– An interface to an external application.
Internal storage pool properties
• Every file system has at least a “System” storage pool
– Maximum of 8 storage pools
• The pools are created by mmcrfs, mmadddisk or mmchfs -V
• Only one pool, called the system pool, stores metadata
– The policy file
– May be created as metadataOnly
• All other pools are dataOnly and store user data.
• When a pool is full, the user gets ENOSPC.
• A file system without a valid policy file can only create files in the
system pool.
– An invalid policy file is deleted by mmfsck.
• A storage pool is an extra attribute on the definition of each disk
– Each disk belongs to exactly 1 storage pool.
Storage pool properties
• Only the system pool may contain metadataOnly, dataAndMetadata, or descOnly disks
– mmdeldisk is not allowed to delete the system pool.
• mmchdisk and mmrpldisk are not allowed to change the disk's storage pool
– Changing the pool would require all existing data to be migrated from the disk, just like mmdeldisk.
• mmlsdisk shows the storage pool for each disk.
Defining storage pool properties
• The storage pool is an attribute of each disk and is defined when a
disk is added to the file system
– Disk stanza pool attribute
%nsd:
nsd=NsdName
usage={dataOnly | metadataOnly | dataAndMetadata | descOnly}
failureGroup=FailureGroup
pool=StoragePool
servers=ServerList
device=DiskName
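As an illustration of how the stanza format ties each disk to one pool, the following Python sketch parses stanza text and checks the rule stated earlier that only the system pool may hold metadataOnly, dataAndMetadata, or descOnly disks. The NSD names are made up, the parser handles only the simple one-attribute-per-line layout shown above, and the defaults it applies (pool system, usage dataAndMetadata) are assumptions of this sketch.

```python
SYSTEM_ONLY_USAGES = {"metadataOnly", "dataAndMetadata", "descOnly"}


def parse_nsd_stanzas(text):
    """Parse '%nsd:' stanzas with one key=value attribute per line into dicts."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "%nsd:":
            stanzas.append({})  # start a new disk definition
            continue
        if not stanzas:
            raise ValueError("attribute found before any %nsd: header")
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {line!r}")
        stanzas[-1][key.strip()] = value.strip()
    return stanzas


def pool_usage_ok(stanza):
    """True if the disk's usage is allowed in its pool (defaults are assumed)."""
    pool = stanza.get("pool", "system")
    usage = stanza.get("usage", "dataAndMetadata")
    return pool == "system" or usage not in SYSTEM_ONLY_USAGES


# Hypothetical stanza file: one metadata disk in system, one data disk in pool3.
EXAMPLE = """
%nsd:
  nsd=nsd_meta1
  usage=metadataOnly
  failureGroup=1
  pool=system
%nsd:
  nsd=nsd_data1
  usage=dataOnly
  failureGroup=2
  pool=pool3
"""
```

A check like this could be run on a stanza file before it is handed to mmcrfs or mmadddisk; the commands themselves remain the authority on what is valid.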
Storage pool: mmlsdisk
mmlsdisk gpfs1 -L
disk         driver   sector failure holds    holds                                    storage
name         type       size   group metadata data  status        availability disk id pool         remarks
------------ -------- ------ ------- -------- ----- ------------- ------------ ------- ------------ ---------
nsdb1_1      nsd         512       5 yes      yes   ready         up                 1 system       desc
nsdb1_2      nsd         512       2 yes      yes   ready         up                 2 system       desc
nsdb1_3      nsd         512       1 yes      yes   ready         up                 3 system       desc
nsdb1_4      nsd         512       2 yes      yes   ready         up                 4 system
nsdb2_1      nsd         512       1 yes      yes   ready         up                 5 system
nsdb2_2      nsd         512       2 yes      yes   ready         up                 6 system
nsdb2_3      nsd         512       1 yes      yes   ready         up                 7 system
nsdb2_4      nsd         512       2 yes      yes   ready         up                 8 system
nsdb3_2      nsd         512       1 no       yes   ready         up                 9 pool3
Number of quorum disks: 3
Read quorum value: 2
Write quorum value: 2
Storage pool: mmdf
# mmdf sunalpha
disk disk size failure holds holds free KB free KB
name in KB group metadata data in full blocks in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: system
nsdb2_3 140095488 1 yes yes 139563776 (100%) 288 ( 0%)
nsdb1_3 140095488 1 yes yes 139389696 ( 99%) 2568 ( 0%)
nsdb2_1 140095488 1 yes yes 139512576 (100%) 1448 ( 0%)
nsdb1_4 140095488 2 yes yes 139489536 (100%) 2644 ( 0%)
nsdb2_2 140095488 2 yes yes 139747328 (100%) 1044 ( 0%)
nsdb1_2 140095488 2 yes yes 139490944 (100%) 2632 ( 0%)
nsdb2_4 140095488 2 yes yes 139835904 (100%) 884 ( 0%)
nsdb1_1 140095488 5 yes yes 139537536 (100%) 1388 ( 0%)
------------- -------------------- -------------------
(pool total) 1120763904 1116567296 (100%) 12896 ( 0%)
Disks in storage pool: pool3
nsdb3_2 140095488 1 no yes 140093312 (100%) 124 ( 0%)
------------- -------------------- -------------------
(pool total) 140095488 140093312 (100%) 124 ( 0%)
============= ==================== ===================
(data) 1260859392 1256660608 (100%) 13020 ( 0%)
(metadata) 1120763904 1116567296 (100%) 12896 ( 0%)
============= ==================== ===================
(total) 1260859392 1256660608 (100%) 13020 ( 0%)
Inode Information
-----------------
Number of used inodes: 4524
Number of free inodes: 542730
Number of allocated inodes: 547254
Maximum number of inodes: 547254
Storage pool: mmlsattr
mmlsattr -L /alpha/junk1.p3
file name: /alpha/junk1.p3
metadata replication: 2 max 2
data replication: 2 max 2
flags: exposed,illreplicated,unbalanced
storage pool name: pool3
fileset name: root
snapshot name:
Storage pool creation/deletion
• To create a new storage pool
– Define disks with the new pool (mmadddisk or mmcrfs)
– Install a new policy file.
• To delete an existing storage pool
– Install a policy file that does not include the pool
– Change the storage pool attribute for all files assigned to the pool
– Migrate the data to a new pool (or delete the files)
– Delete the disks (which deletes the storage pool).
External storage pools
• A script-based interface for external applications.
• Can be used for custom applications
• Benefits to external applications
– Speed of Spectrum Scale policy engine
– Scalability of namespace
– High availability of Spectrum Scale
• Supported external applications
– IBM Tivoli Storage Manager HSM (TSM/HSM)
– LTFS
– High Performance Storage System (HPSS)
What are filesets?
• A fileset is a sub-tree of a file system namespace that provides
a means of partitioning the file system to allow administrative
operations at a finer granularity than the entire file system
– In many ways behaves like an independent file system
– Used to define quotas on data blocks and inodes.
• A fileset has a root directory
– All files belonging to the fileset are only accessible via this root
directory
– No hard links between filesets are allowed
– Renames are not allowed to cross fileset boundaries.
• Max of 10,000 total filesets.
Dependent and independent filesets
• Dependent fileset
– Shares inode space
– 10,000 dependent filesets per file system.
• Independent fileset
– Distinct inode space
– 1,000 independent filesets per file system.
Why filesets?
• Filesets play an important role in ILM
– Not directly tied to storage pools, although policies may tie storage
pools to filesets.
• Added administrative control
– Per-fileset quotas add an additional dimension to the existing user and
group quotas.
• Fileset quotas
– Implement tree-based quota requirements.
Fileset properties
• Root fileset is always there.
• At creation a fileset is ‘unlinked’
– It is not visible in the directory space
– The sysadmin can then link the fileset to an arbitrary point within a file
system
– mmlinkfileset is analogous to the file system mount operation.
• Once linked, fileset can be populated via normal means, that is,
by copying and creating files.
• Hard links are not allowed to cross fileset boundaries.
Fileset aging and demise
• Once linked into the file system,
– Fileset root directory looks like a normal directory, except that rmdir
can’t remove it.
– A fileset root dir can be moved around with mv (within the confines of
the parent fileset). It can be unlinked and relinked under a different
fileset.
• If you no longer need a fileset it can be unlinked, at which point
it is unreachable, but all files are still there. An unlinked fileset
can then be deleted.
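The fileset lifecycle described on these slides (created unlinked, then linked, then unlinked again, and only then deleted) can be summarized as a small state machine. The Python below is just a model of that description; the state and action names are invented for the sketch and are not command options.

```python
# Allowed transitions, as described on the slides:
# a new fileset starts "unlinked", and only an unlinked fileset can be deleted.
TRANSITIONS = {
    ("unlinked", "link"): "linked",
    ("linked", "unlink"): "unlinked",
    ("unlinked", "delete"): "deleted",
}


def step(state, action):
    """Return the next lifecycle state, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a fileset that is {state}") from None


def run(actions, state="unlinked"):
    """Apply a sequence of actions starting from a freshly created fileset."""
    for action in actions:
        state = step(state, action)
    return state
```

In this model run(["link", "unlink", "delete"]) ends in the deleted state, while trying to delete a fileset that is still linked raises an error.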
Fileset commands
mmcrfileset
mmlinkfileset
mmunlinkfileset
mmdelfileset
mmlsfileset
mmchfileset
mmlsattr
Special Spectrum Scale file attributes
• These are Spectrum Scale specific attributes of a file
– Do not affect POSIX-compliant file operations
– Additional information for use by Spectrum Scale and accessible by
other applications if needed (TSM for example).
• Attributes are stored in a sparse file
– The i-th block of the xattr file contains attributes for the file with inode
number i.
• Current use of extended attributes includes:
– Storage Pool
– Fileset
– DMAPI
– direct-IO
– Custom extended attributes.
Policy-based management
• Two types of policies
– File placement
– File management.
• File placement policies
– Determine the initial storage pool for each file's data
• The data will be striped across all disks in the selected pool
– Also determine the file's replication factor.
• File management policies
– Determine when a file's data should be migrated
– Determine where the data should go.
Policy rules
• Similar syntax to SQL 92 standard.
• You can have 1MB of rule text.
• Rule order matters
– Rules are evaluated top to bottom
– Once a rule matches processing ends for that file.
• You can use built-in functions. Examples:
– Date – CURRENT_TIMESTAMP, DAYOFWEEK(), DAY(), HOUR()
– String – LOWER(), UPPER(), LENGTH()
– Numeric – INT(), MOD()
Rule syntax: Placement policy
• Syntax
RULE ['RuleName']
SET POOL 'PoolName'
[LIMIT (OccupancyPercentage)]
[REPLICATE (DataReplication)]
[FOR FILESET (FilesetName[,FilesetName]...)]
[WHERE SqlExpression]
• Can be set on attributes you know about a file when it is created
– Name, Location, User.
File Management policy processing
• Batch process.
• Very efficient metadata scans.
• When a batch is executed there are 3 steps:
– Directory Scan
– Rule Evaluation
– File Operations.
• Can operate in parallel over multiple machines.
© Copyright IBM Corporation 2011
[Figure: 1. Scan Files, 2. Apply Rules, 3. Perform File Operations]
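The three steps of a batch run can be pictured as a small pipeline. The Python sketch below is a simplified model only: the record fields, rule tuples, and function names are hypothetical, and a real run scans file system metadata rather than a Python list.

```python
def scan(file_records):
    """Step 1: yield one attribute record per file (stands in for the scan)."""
    yield from file_records


def evaluate(records, rules):
    """Step 2: pair each file with the action of the first rule that matches."""
    for rec in records:
        for action, predicate in rules:
            if predicate(rec):
                yield action, rec["path"]
                break  # first matching rule wins for this file


def perform(work_list, dry_run=True):
    """Step 3: carry out each operation, or only report it in a dry run."""
    prefix = "would " if dry_run else ""
    return [f"{prefix}{action} {path}" for action, path in work_list]


# Hypothetical file records and rules for illustration.
files = [
    {"path": "/gpfs1/a.mpg", "access_age_days": 45},
    {"path": "/gpfs1/b.mpg", "access_age_days": 3},
    {"path": "/gpfs1/c.tmp", "access_age_days": 400},
]
rules = [
    ("DELETE", lambda r: r["access_age_days"] > 365),
    ("MIGRATE", lambda r: r["access_age_days"] > 30),
]
report = perform(evaluate(scan(files), rules))
```

Because each step is a generator feeding the next, the work list is built in a single pass over the scanned records, and files that match no rule (b.mpg here) are simply left alone.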
Rule syntax: Migration policy
• Syntax
RULE ['rule_name'] [WHEN time-boolean-expression]
MIGRATE [FROM POOL 'pool_name_from'
[THRESHOLD(high-occupancy-percentage[,low-occupancy-percentage])]]
[WEIGHT(weight_expression)]
TO POOL 'pool_name'
[LIMIT(occupancy-percentage)]
[REPLICATE(data-replication)]
[FOR FILESET('fileset_name1', 'fileset_name2', ...)]
[WHERE SQL_expression]
• Operates on existing files
– Allows more attributes in rules
• File size, last accessed time
• Can perform the following operations:
– Migration
– Deletes
– Change of replication status
– Reporting
Policy set example
Placement Rules
RULE 'mpg0' SET POOL 'scsi' WHERE UPPER(NAME) LIKE '%.MPG'
RULE 'dbfiles' SET POOL 'premium' FOR FILESET ('db-fileset')
RULE 'devfiles' SET POOL 'normal' WHERE GID = 1100
Migration Rules
RULE 'mpg30' WHEN (DayOfWeek()=1) MIGRATE FROM POOL 'scsi'
TO POOL 'sata'
WHERE UPPER(NAME) LIKE '%.MPG' AND ACCESS_AGE > 30 DAYS
RULE 'mpg90' WHEN (DayOfWeek()=7) MIGRATE FROM POOL 'sata'
TO POOL 'tape'
WHERE LOWER(NAME) LIKE '%.mpg' AND MODIFICATION_AGE > 90 DAYS
Deletion Rule
RULE 'mpg999' WHEN (MonthOfYear()=12 AND DayOfWeek()=1)
DELETE FROM POOL 'tape' WHERE UPPER(NAME) LIKE '%.MPG'
AND CREATION_AGE > 999 DAYS
Exclude Rule
RULE 'xclude1' EXCLUDE WHERE GID=1
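Rules like the ones above normalize the file name with UPPER(NAME) or LOWER(NAME) before applying LIKE, so the pattern has to be written in the same case or it can never match. The Python sketch below imitates standard SQL LIKE semantics (% for any run of characters, _ for exactly one character) to show the effect; sql_like is a hypothetical helper written for this note, not a Spectrum Scale function.

```python
import re


def sql_like(value, pattern):
    """Case-sensitive match of value against a SQL LIKE pattern.

    '%' matches any run of characters and '_' matches exactly one;
    every other character must match literally.
    """
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


name = "Holiday.Mpg"
upper_vs_upper = sql_like(name.upper(), "%.MPG")  # case of pattern matches
upper_vs_lower = sql_like(name.upper(), "%.mpg")  # pattern can never match
lower_vs_lower = sql_like(name.lower(), "%.mpg")  # case of pattern matches
```

Only the first and third comparisons succeed. A rule that upper-cases the name but compares against a lower-case pattern silently selects no files, which is easy to miss without a test run.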
Policy language example using macros
define(east_adjustment,
CASE
WHEN XATTR_FLOAT('user.e',1,-1,'DECIMAL') < 0
THEN 180+(180+XATTR_FLOAT('user.e',1,-1,'DECIMAL'))
ELSE XATTR_FLOAT('user.e',1,-1,'DECIMAL')
END )
define(west_adjustment,
CASE
WHEN XATTR_FLOAT('user.w',1,-1,'DECIMAL') < 0
THEN 180+(180+XATTR_FLOAT('user.w',1,-1,'DECIMAL'))
ELSE XATTR_FLOAT('user.w',1,-1,'DECIMAL')
END )
define(north_adjustment, 90+XATTR_FLOAT('user.n',1,-1,'DECIMAL'))
define(south_adjustment, 90+XATTR_FLOAT('user.s',1,-1,'DECIMAL'))
RULE 'listall' list 'geo_files'
SHOW( varchar(kb_allocated)|| ' ' || fileset_name )
WHERE KB_ALLOCATED > 0
AND FILESET_NAME='master_t1'
AND south_adjustment <= 130.993664
AND north_adjustment >= 126.994021
AND east_adjustment >= 250.964755
AND west_adjustment <= 257.946178
AND DAYS(XATTR('user.t')) >= (DAYS(CURRENT_TIMESTAMP)-90)
[Slide callouts: macros manipulate data; the policy calls macros; the rule queries custom file extended attributes]
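The arithmetic inside the macros is easier to follow outside the policy language. The Python sketch below reproduces it under the assumption (suggested by the attribute names, but not stated on the slide) that user.e and user.w hold longitudes and user.n and user.s hold latitudes in decimal degrees: negative longitudes are shifted into the 0 to 360 range and latitudes into the 0 to 180 range, so the WHERE clause can compare plain non-negative numbers. The function names are hypothetical; the bounds are copied from the rule.

```python
def lon_adjust(value):
    """Mirror of east_adjustment / west_adjustment: negative values wrap upward."""
    return 180 + (180 + value) if value < 0 else value


def lat_adjust(value):
    """Mirror of north_adjustment / south_adjustment: shift by 90."""
    return 90 + value


def in_query_box(north, south, east, west):
    """Same comparisons as the 'listall' rule, using its literal bounds."""
    return (
        lat_adjust(south) <= 130.993664
        and lat_adjust(north) >= 126.994021
        and lon_adjust(east) >= 250.964755
        and lon_adjust(west) <= 257.946178
    )


# Hypothetical tile: latitude 38 to 39, longitude -106 to -104.
hit = in_query_box(north=39.0, south=38.0, east=-104.0, west=-106.0)
# Hypothetical tile further west, outside the queried range.
miss = in_query_box(north=39.0, south=38.0, east=-120.0, west=-122.0)
```

For example, a longitude of -105.0 becomes 255.0 and a latitude of 38.5 becomes 128.5, which is the scale the literal bounds in the rule are expressed in.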
Policy commands: Placement policies
• mmchpolicy device fileName [-I {yes|test}]
– Sets the placement policy
– Policy file is read into memory and passed to the file system manager (sg mgr)
– Rules are validated
– Stored in an internal file and recorded in the file system descriptor (sg desc)
– Rules are broadcast in a message to all nodes
• mmlspolicy device [-L]
– Display the current policy
Display installed policy using mmlspolicy
# mmlspolicy gpfs1
Policy file for file system '/dev/gpfs1':
Installed by root@c35f1n01 on Wed May 30 12:27:01 2013.
First line from original file policyRule was:
rule 'p3' set pool 'pool3' where LOWER(NAME) like '%.p3'
# mmlspolicy gpfs1 -L
rule 'p3' set pool 'pool3' where LOWER(NAME) like '%.p3'
rule 'default' SET POOL 'system' /* when all else fails */
Invoking File Management policies
• Command is mmapplypolicy
• Drives Migration/Deletion Policy: What and when
– Invocation manually or via cron
– Runs on node on which invocation was made
– Multi-threaded
– File system must be mounted and in home cluster
– Usage
mmapplypolicy {Device|Directory} [-A IscanBuckets] [-a IscanThreads]
[-B MaxFiles] [-D yyyy-mm-dd[@hh:mm[:ss]]] [-e] [-f FileListPrefix]
[-g GlobalWorkDirectory] [-I {yes|defer|test|prepare}]
[-i InputFileList] [-L n] [-M name=value...] [-m ThreadLevel]
[-N {all | mount | Node[,Node...] | NodeFile | NodeClass}]
[-n DirThreadLevel] [-P PolicyFile] [-q] [-r FileListPathname...]
[-S SnapshotName] [-s LocalWorkDirectory]
– Some parameters
-I Allows you to test
-g Shared directory for temporary data
-m Number of threads to do processing
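When mmapplypolicy is driven from cron or a wrapper script, it can help to assemble the command line in one place and default to a test run. The Python sketch below only builds the argument list from options shown in the usage text above (-P, -I, -N, -g); it does not run anything. The function name, device name, and paths are hypothetical.

```python
VALID_MODES = {"yes", "defer", "test", "prepare"}


def mmapplypolicy_argv(target, policy_file, mode="test", nodes=None,
                       global_work_dir=None):
    """Build an mmapplypolicy argument list; defaults to a '-I test' dry run."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported -I mode: {mode!r}")
    argv = ["mmapplypolicy", target, "-P", policy_file, "-I", mode]
    if nodes:
        argv += ["-N", ",".join(nodes)]  # comma-separated node list
    if global_work_dir:
        argv += ["-g", global_work_dir]  # shared directory for temporary data
    return argv


# Hypothetical invocation: dry run against device gpfs1.
dry_run = mmapplypolicy_argv("gpfs1", "/tmp/policy.rules")
```

The resulting list could be handed to subprocess.run on a node where the file system is mounted; switching mode to "yes" is then an explicit, visible decision rather than the default.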
DMAPI
• Data Management API
• Enable DMAPI at the file system level
-z {yes|no}
• Requires DMAPI listener to mount file system
• Use for:
– Auto retrieval for offline data
– Custom applications
Review
• Spectrum Scale ILM tools implement business rules
• Filesets allow you to organize data
• Storage pools provide grouping of storage
• File Placement Policies assign data to pools on file creation
• File management policies automate migration/deletion/replication/reporting
Exercise 3
Pools and Policies
Exercise
Unit summary
Having completed this unit, you should be able to describe:
• Information Lifecycle Management
• Storage pools
• File placement policies
• File management policies
• Filesets