Page 1: SE Installation and Configuration (Disk Pool Manager)

www.epikh.eu

The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How)

Nabil Talhaoui (talhaoui@cnrst.ma)

Joint EPIKH/EUMEDGRID-Support Event in Rabat, Morocco, 03.06.2011

Page 2:

• Overview of Grid Data Management
• DPM Overview
• DPM Installation
• Troubleshooting

Page 3: Grid Overview

High Performance Computing (HPC) can be summed up in two main challenges:

• CPU power
• Storage systems

The Grid offers a solution to both. This talk analyzes the Grid design of the storage system.

Page 4: Overview

• Assumptions:
  – Users and programs produce and require data
  – The lowest granularity of the data is the file level (we deal with files rather than data objects or tables)
  – Data = files

• Files:
  – Mostly write once, read many
  – Located in Storage Elements (SEs)
  – Several replicas of one file at different sites
  – Accessible by Grid users and applications from “anywhere”

• Also…
  – The WMS can send (small amounts of) data to/from jobs: Input and Output Sandbox
  – Files may be copied between local filesystems (WNs, UIs) and the Grid (SEs)

Page 5: Overview

• The Data Management System (DMS) is the subsystem of the gLite middleware that takes care of data manipulation for both the other Grid services and user applications.

• The DMS provides all the operations that users can perform on data:
  – Creating files/directories
  – Renaming files/directories
  – Deleting files/directories
  – Moving files/directories
  – Listing directories
  – Creating symbolic links
  – Etc.

Page 6: DMS – Objectives

• The DMS provides two main capabilities:
  – File Management
  – Metadata Management

• A file is the simplest way to organize data

• Metadata are “attributes” that describe other data

File Management:
  – storage (save, copy, read, list, …)
  – placement (replica, transfer, …)
  – security (access control, …)

Metadata Management:
  – cataloguing
  – secure database access
  – database schema virtualization

Page 7: Data Management Services

The Data Management System is composed of three main modules: Storage Element, Catalog and File Transfer Service.

• Storage Element – common interface to storage
  – Storage Resource Manager: Castor, dCache, DPM, StoRM, …
  – POSIX-I/O: gLite-I/O, rfio, dcap, xrootd
  – Access protocols: gsiftp, https, rfio, …

• Catalogs – keep track of where data is stored
  – File Catalog
  – Replica Catalog
  – File Authorization Service
  – Metadata Catalog

• File Transfer – scheduled reliable file transfer
  – Data Scheduler (only designs exist so far)
  – File Transfer Service (manages physical transfer): gLite FTS and glite-url-copy; Globus RFT, Stork
  – File Placement Service (FTS and catalog interaction in a transactional way): gLite FPS

Page 8: Overview

• The Storage Element is the service that saves/loads files to/from local storage. This local storage can be either a disk or a large storage system.

• Functions:
  – File storage
  – Storage resources administration interface
  – Storage space administration

• gLite 3.2 data access protocols:
  – File Transfer: GSIFTP (GridFTP)
  – File I/O (remote file access): gsidcap, insecure RFIO, secure RFIO (gsirfio)

Page 9: SE Types

• Classic SE:
  – GridFTP server
  – Insecure RFIO daemon (rfiod): file access limited to the LAN
  – Single disk or disk array
  – No quota management
  – Not supported anymore

• Mass Storage Systems (Castor)
  – Files migrated between front-end disk and back-end tape storage hierarchies
  – GridFTP server
  – Insecure RFIO (Castor)
  – Provide an SRM interface with all its benefits

• Disk pool managers (dCache and gLite DPM)
  – Manage distributed storage servers in a centralized way
  – Physical disks or arrays are combined into a common (virtual) file system
  – Disks can be dynamically added to the pool
  – GridFTP server
  – Secure remote access protocols (gsidcap for dCache, gsirfio for DPM)
  – SRM interface

• StoRM
  – Solution best suited to cope with large storage (> or >> 100 TB)
  – Takes full advantage of parallel filesystems (GPFS, Lustre)
  – SRM v2.2 interface

Page 10: Overview

Storage Resource Managers (SRMs) are middleware components whose function is to provide dynamic space allocation and file management on shared distributed storage systems.

This effort supports the mission of providing the technology needed to manage the rapidly growing distributed data volumes that result from faster and larger computational facilities.

Page 11: SRM (Storage Resource Manager)

[Diagram: a user facing four storage systems (dCache, Castor, gLite DPM, StoRM), with the SRM in between.]

You as a user need to know all the systems!!!

SRM:
  – I talk to them on your behalf
  – I will even allocate space for your files
  – And I will use transfer protocols to send your files there

Page 12: Disk Pool Manager Overview

• The Disk Pool Manager (DPM) is a lightweight solution for disk storage management that offers the SRM interfaces.

• It may act as a replacement for the obsolete classic SE, with the following advantages:
  – SRM interface (both v1.1 and v2.2)
  – Better scalability: DPM can manage 100+ TB, distributing the load over several servers
  – High performance
  – Lightweight management

• The DPM head node has to have one filesystem in the pool; an arbitrary number of disk servers can then be added with YAIM.

• The DPM disk servers can have multiple filesystems in the pool.

• The DPM head node also hosts the DPM and DPNS databases, as well as the SRM web service interfaces.

Page 13: Disk Pool Manager Overview

Page 14: DPM architecture

[Diagram: clients (CLI, C API, SRM-enabled client, etc.) talk to the DPM head node, while data transfer goes directly to the DPM disk servers. Namespace labels in the figure: /dpm, /domain, /home, /vo.]

• DPM Name Server
  – Namespace
  – Authorization
  – Physical files location

• DPM Server
  – Requests queuing and processing
  – Space management

• SRM Servers (v1.1, v2.1, v2.2)

• Disk Servers
  – Physical files

• Direct data transfer from/to disk server (no bottleneck)

Page 15: DPM architecture

• Usually the DPM head node hosts:
  – SRM server (srmv1 and/or srmv2): receives the SRM requests and passes them to the DPM server
  – DPM server: keeps track of all the requests
  – DPM name server (DPNS): handles the namespace for all the files under DPM control
  – DPM RFIO server: handles the transfers for the RFIO protocol
  – DPM GridFTP server: handles the transfers for the GridFTP protocol

Page 16: Installing DPM

Page 17: Installing pre-requisites /1

• Start from a fresh install of SLC 5.X (this tutorial uses x86_64)
• The installation will pull in all dependencies, including:
  – other necessary gLite modules
  – external dependencies

Page 18: Installing pre-requisites /2

• We need a dedicated partition for the storage area

• Check the partitions:

# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda1   9.7G  820M  8.4G   9%    /
/dev/sda2   19G   33M   18G    1%    /storage
none        125M  0     125M   0%    /dev/shm

Page 19: Adding Disk

FOR THIS TUTORIAL, ADD A DISK TO THE VIRTUAL MACHINE

• Edit the VM settings before starting the VM
• Add a second disk (SCSI); 20 GB is enough for the tutorial
• Start the virtual machine
• Log in
• fdisk -l (to check that the disk exists)
• fdisk /dev/sdb (create a primary partition)
  – new partition: n, p, 1, Enter, Enter
  – print and write: p, w
• mkfs -t ext3 /dev/sdb1 (the filesystem type must match the ext3 entry in /etc/fstab)
• mkdir /storage
• mount /dev/sdb1 /storage
• Edit /etc/fstab so that the disk is mounted at boot:
  – /dev/sdb1 /storage ext3 defaults 1 2
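The /etc/fstab step is easy to get wrong by appending the same entry twice. The following is a minimal sketch, assuming a POSIX shell; `fstab_entry` and `ensure_fstab_entry` are hypothetical helper names introduced here for illustration, not part of gLite or YAIM.

```shell
# Print an fstab line in the format used on the slide.
fstab_entry() {
  # $1 = device, $2 = mount point, $3 = filesystem type
  printf '%s %s %s defaults 1 2\n' "$1" "$2" "$3"
}

# Append the entry to the given fstab file unless the device is already listed.
ensure_fstab_entry() {
  # $1 = fstab file, $2 = device, $3 = mount point, $4 = filesystem type
  if grep -q "^$2[[:space:]]" "$1"; then
    echo "$2 already present in $1"
  else
    fstab_entry "$2" "$3" "$4" >> "$1"
    echo "$2 added to $1"
  fi
}
```

As root, a call such as `ensure_fstab_entry /etc/fstab /dev/sdb1 /storage ext3` followed by `mount -a` should be enough to confirm that the entry is usable before rebooting.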

Page 20: Repository settings

• cd /etc/yum.repos.d/

• Specify the mrepo host:
  export MREPO=http://repo.magrid.ma/yumrepo/glite32

• Define the list of repositories:
  REPOS="dag lcg-CA glite-SE_dpm_mysql glite-SE_dpm_disk"

• Fetch the repository files:
  for name in $REPOS; do wget $MREPO/$name.repo -O /etc/yum.repos.d/$name.repo; done

Page 21: Installing pre-requisites /3

• Time synchronization among all gLite nodes is mandatory.
• So install ntp:
  – # yum install ntp
  – You can check ntpd’s status

Page 22: Installing pre-requisites /4

• Check the FQDN (fully qualified domain name) hostname
  – Ensure that the hostnames of your machines are correctly set. Run the command:
    # hostname -f
  – If your hostname is incorrect, edit the file /etc/sysconfig/network, set the HOSTNAME variable, then restart the network service
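The hostname check can be scripted so that a short (non-qualified) name is caught before YAIM runs. This is a sketch only: `is_fqdn` and `check_fqdn` are hypothetical helper names, and the test is purely syntactic (at least two dot-separated labels), so it does not prove that the name resolves in DNS.

```shell
# Return success if the argument looks like a fully qualified domain name:
# at least two dot-separated labels and no empty label.
is_fqdn() {
  case "$1" in
    ""|.*|*.|*..*) return 1 ;;
    *.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Apply the check to the output of `hostname -f` on this machine.
check_fqdn() {
  name=$(hostname -f 2>/dev/null)
  if is_fqdn "$name"; then
    echo "FQDN looks fine: $name"
  else
    echo "hostname -f returned '$name'; fix HOSTNAME in /etc/sysconfig/network" >&2
    return 1
  fi
}
```

Running `check_fqdn` on each node before the configuration step gives a quick yes/no answer.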

Page 23: Installation

• # yum clean all
• # yum update

• Install the CAs:
  # yum install lcg-CA

• Install MySQL:
  # yum install mysql-server
  # yum install mysql-devel

• Install the metapackages (yum install <metapackage>):
  # yum install glite-SE_dpm_mysql
  # yum install glite-SE_dpm_disk

Page 24: Installation – Host certificate

• Copy the host certificate and key, located in /root/ as pcXXcert.pem and pcXXkey.pem, to /etc/grid-security as hostcert.pem and hostkey.pem.

• Change the file permissions:
  – # chmod 644 /etc/grid-security/hostcert.pem
  – # chmod 400 /etc/grid-security/hostkey.pem
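Wrong permissions on the host key are a common cause of service start-up failures, so it can help to verify them after the chmod step. A minimal sketch, assuming GNU `stat` (its `-c` option is not available on BSD systems); `file_mode` and `check_mode` are hypothetical helper names.

```shell
# Print the octal permission bits of a file (GNU stat).
file_mode() {
  stat -c '%a' "$1"
}

# Compare a file's mode with the expected one and report the result.
check_mode() {
  # $1 = file, $2 = expected octal mode
  actual=$(file_mode "$1") || return 1
  if [ "$actual" = "$2" ]; then
    echo "OK   $1 ($actual)"
  else
    echo "BAD  $1 is $actual, expected $2"
    return 1
  fi
}
```

For the files on this slide the calls would be `check_mode /etc/grid-security/hostcert.pem 644` and `check_mode /etc/grid-security/hostkey.pem 400`.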

Page 25: DPM Configuration

• Copy the site-info.def template to your reference directory for the installation (e.g. /root/sitedir):
  # cp /opt/glite/yaim/examples/siteinfo/site-info.def /root/sitedir/my-site-info.def

• Copy the directory ‘services’ to the same location:
  # cp -r /opt/glite/yaim/examples/siteinfo/services /root/sitedir/.
  # ls /root/sitedir/
  my-site-info.def services
  # ls /root/sitedir/services
  glite-se_dpm_disk glite-se_dpm_mysql

• Edit the site-info.def file
• A good syntax test for your site configuration file is to source it manually once you have finished editing:
  # source my-site-info.def
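Sourcing the file directly also loads all of its variables into the current session. The sketch below, which assumes `bash` is installed, parses the file first with `bash -n` and then sources it in a subshell so nothing leaks out; `check_site_info` is a hypothetical helper name. It only detects syntax-level problems and a failing final command, not wrong values.

```shell
# Check a yaim-style configuration file without touching the current shell.
check_site_info() {
  # $1 = path to the configuration file
  bash -n "$1" || { echo "syntax error in $1"; return 1; }
  # Source in a subshell so that variables do not leak into this session.
  ( . "$1" ) >/dev/null 2>&1 || { echo "sourcing $1 failed"; return 1; }
  echo "$1 parsed cleanly"
}
```

A typical call would be `check_site_info /root/sitedir/my-site-info.def`; an unbalanced quote, for instance, is reported by `bash -n` with the offending line number.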

Page 26: site.def

• MY_DOMAIN=mydomainname   # your domain name (check it)

• MYSQL_PASSWORD=passwd_root   # the MySQL root password

• VOS="eumed"   # the VO we want …

• ALL_VOMS_VOS="eumed"

Page 27: Support for eumed VO

• WMS_HOST=wms-01.eumedgrid.eu
• LB_HOST="wms-01.eumedgrid.eu:9000"
• LFC_HOST=lfc.ulakbim.gov.tr
• BDII_HOST=wms-01.eumedgrid.eu

VOS="eumed"   # add here the VOs you want to support

VO_EUMED_SW_DIR=$VO_SW_DIR/eumed

VO_EUMED_DEFAULT_SE=$SE_HOST

VO_EUMED_STORAGE_DIR=$CLASSIC_STORAGE_DIR/eumed

VO_EUMED_VOMS_SERVERS="'vomss://voms2.cnaf.infn.it:8443/voms/eumed?/eumed' 'vomss://voms-02.pd.infn.it:8443/voms/eumed?/eumed'"

VO_EUMED_VOMSES="'eumed voms2.cnaf.infn.it 15016 /C=IT/O=INFN/OU=Host/L=CNAF/CN=voms2.cnaf.infn.it eumed' 'eumed voms-02.pd.infn.it 15016 /C=IT/O=INFN/OU=Host/L=Padova/CN=voms-02.pd.infn.it eumed'"

VO_EUMED_VOMS_CA_DN="'/C=IT/O=INFN/CN=INFN CA' '/C=IT/O=INFN/CN=INFN CA'"

VO_EUMED_WMS_HOSTS="prod-wms-01.pd.infn.it wms.ulakbim.gov.tr wms-01.eumedgrid.eu"

Page 28: Support for eumed VO

Add the eumed pool accounts to /opt/glite/yaim/examples/users.conf in the following format:

UID:LOGIN:GID:GROUP:VO:FLAG:

Example:
  – 3101:eumed001:2418:eumed:eumed::
  – 3102:eumed002:2418:eumed:eumed::
  – 3103:eumed003:2418:eumed:eumed::
  – 3104:eumed004:2418:eumed:eumed::
  – 3105:eumed005:2418:eumed:eumed::
  – …

Add the following lines to /opt/glite/yaim/examples/groups.conf:
  – "/eumed/ROLE=SoftwareManager":::sgm:
  – "/eumed"::::
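Typing pool accounts by hand invites off-by-one UIDs, so the lines can be generated instead. The following is a sketch, assuming a POSIX shell; `gen_pool_accounts` is a hypothetical helper, and the UID and GID values are simply the ones from the example above. It leaves the FLAG field empty, as in the example.

```shell
# Print users.conf lines in the UID:LOGIN:GID:GROUP:VO:FLAG: format.
gen_pool_accounts() {
  # $1 = VO name, $2 = first UID, $3 = GID, $4 = number of accounts
  vo=$1; first_uid=$2; gid=$3; count=$4
  i=1
  while [ "$i" -le "$count" ]; do
    # LOGIN is the VO name followed by a three-digit counter.
    printf '%d:%s%03d:%d:%s:%s::\n' \
      "$((first_uid + i - 1))" "$vo" "$i" "$gid" "$vo" "$vo"
    i=$((i + 1))
  done
}
```

For example, `gen_pool_accounts eumed 3101 2418 5` prints the five lines shown on the slide; the output can be reviewed and then appended to users.conf.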

Page 29: site.def (DPM)

• In the files glite-se_dpm_disk and glite-se_dpm_mysql, set the variables:

DPM_HOST=<your host>.$MY_DOMAIN   # FQDN of the DPM head node
DPM_DB_USER=dpmmgr   # the user for our database
DPM_DB_PASSWORD=mysql_pass   # MySQL password
DPMFSIZE=200M   # the space reserved by default for a file stored in the DPM
DPMPOOL=Permanent   # the name and type of the pool including file system (e.g. Permanent)
DPM_FILESYSTEMS="$DPM_HOST:/storage"   # the filesystems that are part of the pool
DPM_DB_HOST=$DPM_HOST
DPM_INFO_PASS=the-dpminfo-db-user-pwd
SE_GRIDFTP_LOGFILE=/var/log/dpm-gsiftp/dpm-gsiftp.log

Page 30: Firewall configuration

• The following ports have to be open:
  – DPM server: port 5015/tcp must be open at least locally at your site (incoming access can be allowed as well)
  – DPNS server: port 5010/tcp must be open at least locally at your site (incoming access can be allowed as well)
  – SRM servers: ports 8443/tcp (SRMv1) and 8444/tcp (SRMv2) must be open to the outside world (incoming access)
  – RFIO server: port 5001/tcp must be open to the outside world (incoming access) if your site wants to allow direct RFIO access from outside
  – GridFTP server: control port 2811/tcp and data ports 40000-45000/tcp (or any range specified by GLOBUS_TCP_PORT_RANGE) must be open to the outside world (incoming access)

• FOR THIS TUTORIAL JUST STOP IPTABLES:
  # service iptables stop
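On a production node the firewall would normally stay up, with the ports listed above opened explicitly. The sketch below only prints candidate `iptables` commands so they can be reviewed first; `dpm_firewall_rules` is a hypothetical helper. It assumes the default INPUT chain, and it opens every port to all sources, which is broader than the slide requires for 5015 and 5010.

```shell
# Print (not apply) iptables rules for the DPM ports listed on the slide.
dpm_firewall_rules() {
  # $1 = GridFTP data port range in iptables syntax (default 40000:45000)
  range=${1:-40000:45000}
  # DPM, DPNS, SRMv1, SRMv2, RFIO, GridFTP control
  for port in 5015 5010 8443 8444 5001 2811; do
    echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
  done
  # GridFTP data channel
  echo "iptables -A INPUT -p tcp --dport $range -j ACCEPT"
}
```

After checking the output, the rules could be applied as root with `dpm_firewall_rules | sh`. Note that `-A` appends to the chain, so on a host whose INPUT chain already ends with a reject rule the new rules would need to be inserted earlier.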

Page 31: Middleware Configuration

• /opt/glite/yaim/bin/yaim -c -s <your-site-info.def> -n glite-SE_dpm_mysql

• /opt/glite/yaim/bin/yaim -c -s <your-site-info.def> -n glite-SE_dpm_disk

If you want to install the disks on another machine, run on that machine:

/opt/glite/yaim/bin/yaim -c -s site-info.def -n glite-SE_dpm_disk

Then run (on the dpm_mysql machine):

dpm-addfs --poolname Permanent --server diskserverhostname --fs /storage2

Page 32:

• After configuration, remember to manually run the script /etc/cron.monthly/create-default-dirs-DPM.sh, as suggested by the yaim log. This script creates the VO storage directories and sets the correct permissions on them; it will then be run monthly via cron.

Just skip it for this training!

Page 33: DPM Server Testing

Page 34: Testing DPM

• A simple test to check whether the DPM server is correctly exporting the filesystem:
  – /opt/lcg/bin/dpm-qryconf

Page 35: Post configuration

• Log in to the UI (ui01.magrid.ma) and set the variables:
  export DPM_HOST=pcXX.magrid.ma
  export DPNS_HOST=pcXX.magrid.ma

• Execute the following commands:
  – dpm-qryconf
  – dpns-ls /
  – dpns-mkdir
  – dpns-rm

# globus-url-copy file:/tmp/myfile gsiftp://yourdpmhost/dpm/magrid/home/eumed/testfile

# uberftp yourdpmhost.domain (check whether this connection works!)

Then try to really copy a file using globus-url-copy.

Page 36:

• Other commands to build the namespace:
  – dpns-mkdir
  – dpns-chmod
  – dpns-chown
  – dpns-setacl

• And commands to add pools and filesystems:
  – dpm-addfs
  – dpm-addpool

Page 37: MySQL

• The critical point of DPM is the database (MySQL)
• At a production site, take appropriate precautions to back up the database.
• If you lose your database, you lose all your data!!!
• Consider taking a full backup of the machine or using MySQL replication (http://dev.mysql.com/doc/refman/5.0/en/replication-howto.html)
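Besides a full machine backup or replication, a logical dump of the two DPM databases is a simple extra safety net. This is a sketch under stated assumptions: `dump_name` and `backup_dpm` are hypothetical helpers, the target directory is arbitrary, and the database user is assumed to have read access to both `dpm_db` and `cns_db`. The `--single-transaction` option only gives a consistent snapshot for transactional (InnoDB) tables.

```shell
# Build a dump file name from a directory and a date stamp.
dump_name() {
  # $1 = backup directory, $2 = date stamp (e.g. 20110603)
  printf '%s/dpm-%s.sql.gz\n' "$1" "$2"
}

# Dump the DPM and DPNS databases into a compressed, dated file.
backup_dpm() {
  # $1 = backup directory, $2 = MySQL user (mysqldump prompts for the password)
  out=$(dump_name "$1" "$(date +%Y%m%d)")
  mysqldump -u "$2" -p --single-transaction --databases dpm_db cns_db \
    | gzip > "$out"
  echo "dump written to $out"
}
```

A call such as `backup_dpm /var/backup dpmmgr` could be run by hand; because the exit status of a pipeline is that of `gzip`, the resulting file should still be inspected before it is trusted as a backup.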

Page 38: mysql DB

• And take a look at the MySQL databases:

# mysql -p -u dpmmgr
Enter password: *****
mysql> show databases;
+----------+
| Database |
+----------+
| cns_db   |
| dpm_db   |
| mysql    |
| test     |
+----------+

mysql> connect dpm_db;
mysql> show tables;
+------------------+
| Tables_in_dpm_db |
+------------------+
| dpm_copy_filereq |
| dpm_fs           |
| dpm_get_filereq  |
| dpm_pending_req  |
| dpm_pool         |
| dpm_put_filereq  |
| dpm_req          |
| dpm_space_reserv |
| dpm_unique_id    |
| schema_version   |
+------------------+

mysql> connect cns_db;
mysql> show tables;
+--------------------+
| Tables_in_cns_db   |
+--------------------+
| Cns_class_metadata |
| Cns_file_metadata  |
| Cns_file_replica   |
| Cns_groupinfo      |
| Cns_symlinks       |
| Cns_unique_gid     |
| Cns_unique_id      |
| Cns_unique_uid     |
| Cns_user_metadata  |
| Cns_userinfo       |
| schema_version     |
+--------------------+

Page 39: Log-files

• If you have a problem, try to analyze your log files:
  /var/log/dpns/log
  /var/log/dpm/log
  /var/log/dpm-gsiftp/dpm-gsiftp.log
  /var/log/srmv1/log
  /var/log/srmv2/log
  /var/log/srmv2.2/log
  /var/log/rfio/log

SE_GRIDFTP_LOGFILE=/var/log/globus-gridftp.log

(Files can be in a different location depending on the version of the packages installed)
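With this many log files, a first pass that counts suspicious lines per file shows where to look. A minimal sketch, assuming a POSIX shell and `grep`; `scan_logs` is a hypothetical helper, and the search pattern is up to you since the log formats differ between services.

```shell
# Count case-insensitive matches of a pattern in each log file given.
scan_logs() {
  # $1 = pattern, remaining arguments = log files
  pattern=$1; shift
  for f in "$@"; do
    if [ -r "$f" ]; then
      # grep -c prints 0 and exits non-zero when nothing matches.
      n=$(grep -ci -- "$pattern" "$f" || true)
      echo "$f: $n"
    else
      echo "$f: not readable"
    fi
  done
}
```

For example, `scan_logs error /var/log/dpns/log /var/log/dpm/log /var/log/srmv2.2/log` prints one count per file; files missing on your installation are reported as not readable rather than aborting the scan.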

Page 40: Reference

http://www.gridpp.ac.uk/wiki/Disk_Pool_Manager

https://twiki.cern.ch/twiki/bin/view/LCG/DpmGeneralDescription

http://igrelease.forge.cnaf.infn.it/doku.php?id=doc:guides:install-3_2
