nov 5-6, 2005 see-grid banjaluka training session introduction to linux clusters and grids design...
TRANSCRIPT
Nov 5-6, 2005
Acad
em
ic a
nd E
ducat ional Gr id Init iat ive o
f Serbia
A E G I S
SEE-GRID Banjaluka Training Session
Introduction to Linux Clusters Introduction to Linux Clusters and Gridsand Grids
Design and Basic Services of Design and Basic Services of LCG Grid MiddlewareLCG Grid Middleware
SEE-GRID Infrastructure SEE-GRID Infrastructure OverviewOverviewAntun BalažSCL, Institute of Physics
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Linux ClustersLinux Clusters
Commodity hardware become available in the last 10 years
Local network 100-1000 Mbps easily deployed
Linux mature and widely available Software available and even
standardized - MPI
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Science and technology Science and technology are team sportsare team sports
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Unifying concept: GridUnifying concept: Grid
Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations.
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
What types of problems is What types of problems is the Grid intended to the Grid intended to address?address?
Too hard to keep track of authentication data (ID/password) across institutions
Too hard to monitor system and application status across institutions
Too many ways to submit jobs
Too many ways to store & access files/data
Too many ways to keep track of data
Too easy to leave “dangling” resources lying around (robustness)
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
RequirementsRequirements Security Monitoring/Discovery Computing/Processing Power Moving and Managing Data Managing Systems System Packaging/Distribution
What end users need?What end users need?Secure, reliable, on-demand access to data, software, people, and other resources (ideally all via a Web Browser!)
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Set of basic Grid Set of basic Grid servicesservices
Job submission/management File transfer (individual, queued) Database access Data management (replication,
metadata) Monitoring/Indexing system
information
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Multi-institution Multi-institution issuesissues
Trust Mismatch
Mechanism Mismatch
CertificationAuthority
CertificationAuthority
Domain A
Server X Server Y
PolicyAuthority
PolicyAuthority
Task
Domain B
Sub-Domain A1 Sub-Domain B1
No Cross-
Domain Trust
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Why Grid security is hardWhy Grid security is hard
Resources being used may be valuable & the problems being solved sensitive- Both users and resources need to be careful
Dynamic formation and management of virtual organizations- Large, dynamic, unpredictable…
VO Resources and users are often located in distinct administrative domains- Can’t assume cross-organizational trust agreements- Different mechanisms & credentials
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Why Grid security is hard Why Grid security is hard 22 Interactions are not just client/server,
but service-to-service on behalf of the user- Requires delegation of rights by user to service- Services may be dynamically instantiated
Standardization of interfaces to allow for discovery, negotiation and use
Implementation must be broadly available & applicable- Standard, well-tested, well-understood protocols; integrated with wide variety of tools
Policy from sites, VO, users need to be combined- Varying formats
Want to hide as much as possible from applications!
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Grid solution: use of VOsGrid solution: use of VOs
Certification
Domain A
Server X Server Y
PolicyAuthority
PolicyAuthority
TaskDomain B
Sub-Domain A1
GSI
CertificationAuthority
Sub-Domain B1
Authority
FederationService
VirtualOrganization
Domain
No Cross-
Domain Trust
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Effective policy governing Effective policy governing access within a access within a collaborationcollaboration
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Use delegation to establish Use delegation to establish dynamic distributed systemdynamic distributed system
ComputingCenter
VO
Rights
ComputingCenter
Service
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
GSI implementationGSI implementation
ComputeCenter
SSL/WS-Securitywith ProxyCertificates
VO
RightsVO
Users
Services (runningon user’s behalf)
Rights’’
Rights’
Access
Local Policyon VO identityor attributeauthority
CAS or VOMSissuing SAMLor X.509 ACs
Authz Callout
KCA
MyProxy
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
““Logging on” to the GridLogging on” to the Grid
To run programs, authenticate to Grid:voms-proxy-init –voms VONAMEEnter PEM pass phrase: ***************
Creates a temporary, local, short-lived proxy credential for use by our computations
Delegation = remote creation of a (second level) proxy credential, which allows remote process to authenticate on behalf of the user
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
MiddlewareMiddleware LCG: Large Hadron Collider Computing Grid LCG infrastructure running LCG-2 is “EGEE-0” In parallel producing new web-service-oriented
middleware (“gLite”), which will replace LCG-2 as production facility this year
Globus 2 based Web services based
EGEE-2EGEE-1LCG-2LCG-1
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
User view of the GridUser view of the Grid
User Interface
Grid services
User Interface
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
What really happensWhat really happensReplicaReplicaCatalogueCatalogue
Logging &Logging &Book-keepingBook-keeping
ResourceResourceBrokerBroker
StorageStorageElementElement
ComputingComputingElementElement
Information Information ServiceService
Job Status
DataSets info
Auth.&Auth.
Job
Su
bm
it Even
t
Job
Q
uery
Job S
tatu
s
Input“sandbox”
Input “sand
box”
+Broker Info
Outp
ut “san
dbox”
Output“sandbox”
Pu
blis
h
SE & CE info
User User interfaceinterface
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Workload Management Workload Management System (WMS)System (WMS)
Distributed scheduling multiple UI’s where you can submit your job multiple RB’s from where the job can be sent
to a CE multiple CE’s where the job can be put in a
queuing system
Distributed resource management multiple information systems that monitor the
state of the grid Information from SE, CE, sites
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Authentication and Authentication and AuthorizationAuthorization Authentication
User obtains certificate from CA Connects to UI by ssh Downloads certificate Invokes Proxy server Single logon – to UI - then Secure Socket
Layer with proxy identifies user to other nodes
Authorization - currently User joins Virtual Organisation VO negotiates access to Grid nodes and
resources (CE, SE) Authorization tested by CE, SE: gridmapfile
maps user to local account
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
User Interface (UI)User Interface (UI) UI is the user’s interface to the Grid -
Command-line interface to Proxy server Job operations
To submit a job Monitor its status Retrieve output
Data operations Upload file to SE Create replica Discover replicas
Other grid services To run a job user creates a JDL (Job
Description Language) file
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Computing Element Computing Element (CE)(CE)
Homogeneous set of worker nodes
Grid gate node
Local resource management system:Condor / PBS / LSF master
Gatekeeper
Job request
Info system
Logging
gridmapfile
I.S.
Logging
A CE is a grid batch queuewith a “grid gate” front-end:
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Storage Element (SE)Storage Element (SE) Storage elements hold files: write once, read many Replica files can be held on different SE:
“close” to CE; share load on SE Replica Catalogue - what replicas exist for a file? Replica Location Service - where are they?
Local Info
EventLogging
gridmapfile
GridFTP
Disk arrays or tapesDisk arrays or tapes
Info system
Logging
Gatekeeper
File transfer Requests
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Resource BrokerResource Broker Run the Workload Management System
To accept job submissions Dispatch jobs to appropriate Compute Element (CE) Allow users
To get information about their status To retrieve their output
A configuration file on each UI node determines which RB node(s) will be used
When a user submits a job, JDL options are to: Specify CE Allow RB to choose CE (using optional tags to define
requirements) Specify SE (then RB finds “nearest” appropriate CE,
after interrogating Replica Location Service)
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Logging and Logging and BookkeepingBookkeeping
Who did what and when? What’s happening to my job? Usually runs on RB node
Information SystemInformation System Receives periodic (~5 min) updates from CE, SE Used by RB node to determine resources to be
used by a job Currently BDII is used
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
What have we learn so What have we learn so far?far?
Grid structure is complicated but hidden from end-users, enabling all the comfort they need
Users just need to join the VO and obtain certificates: we already have the SEE-GRID VO!
Use of Grid is then just as easy as the use of a computer cluster
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
SEE-GRID OverviewSEE-GRID Overview
SEE-GRID is EU FP6 project, involving 11 partners from 11 European countries:Greece, Switzerland, Bulgaria, Romania, Turkey, Hungary, Albania, Bosnia and Herzegovina, FYR of Macedonia, Serbia and Montenegro, Croatia
Each partner collaborates with one or more 3rd parties
Project started in May 2004, lasts 2 years, SEE-GRID-2 on its way
http://www.see-grid.org/
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
SEE-GRID Objectives (1)SEE-GRID Objectives (1)
Human network in the area of grid computing eScience and eInfrastructures
Integrate incubating and existing National Grid infrastructures in all SEE-GRID countries
Ease the digital divide and bring SEE Grid communities closer to the rest of the continent
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
SEE-GRID Objectives (2)SEE-GRID Objectives (2)
Establish a dialogue at the level of policy developments for research and education networking and provide input to the agenda of national governments and funding bodies
Promote awareness in the region regarding Grid developments through dissemination conferences, training material and demonstrations for hands-on experience
Migrate and test Grid middleware components and APIs developed by pan-European and national Grid efforts in the regional infrastructure
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
SEE-GRID Objectives (3)SEE-GRID Objectives (3)
Deploy (adapt if necessary) and test Grid applications developed by EGEE
Demonstrate an additional Grid application of regional interest
Integrate available pilot Resource Centres of Albania, Bosnia-Herzegovina, Croatia, FYR of Macedonia, Serbia-Montenegro and Turkey into the EGEE-compatible infrastructure
Expand the operations and support centre of the EGEE SE Europe Federation to cater for the operations in the above countries
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
SEE-GRID Infrastructure SEE-GRID Infrastructure Overview (1)Overview (1)
At least one SEE-GRID site per country, (currently 15+1!), each deploying CE, SE, MON, UI, and a number of WNs
SEE-GRID regional services: SEE-GRID CA (Greece) RB and BDII (Turkey + Serbia and
Montenegro) VOMS (Croatia) R-GMA (Bulgaria) SFTs and GridICE (FYR of Macedonia) P-GRADE portal (Hungary) MYProxy (Greece + Serbia and Montegro) LFC (Serbia and Montenegro)
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
SEE-GRID Infrastructure SEE-GRID Infrastructure Overview (2)Overview (2)
SEE-GRID applications: SE4SEE (Turkey) VIVE (Serbia and Montenegro)
Technical Forum (Hungary) SEE-GRID Web site and WIKI
(Greece) Infrastructure mailing list:
[email protected] Strong human network
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Hands-on PlanHands-on Plan Hands-on I: UI Installation and Configuration Hands-on II: Certificates, Proxies, Test Jobs
TOMORROW: Hands-on III: Composing the site-info.def file Hands-on IV: UI/CE Installation and
Configuration Hands-on V: SE/MON Installation and
Configuration Hands-on VI: WNs Installation and
Configuration Hands-on VII: Testing and SEE-GRID Tuning
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Hands-on II: Hands-on II: Certificates, Proxies, Certificates, Proxies, Test JobsTest Jobs
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Grid CertificatesGrid Certificates
Each user must have a valid X.509 certificate issued by a recognized Certification Authority (CA)
Before doing any Grid operation, user must log in to User Interface (UI) machine and create a proxy certificate.
A proxy certificate is a delegated user credential that authenticates the user in every secure interaction, and has a limited lifetime: in fact, it prevents having to use one's own certificate, which could compromise its safety
voms-proxy-init –voms VONAME Voms-proxy-info; voms-proxy-destroy
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Job Submission (1)Job Submission (1)
User have to create a file describing the submitted job in Job Description Language (JDL)
User submits jobs to Resource Broker (RB)
JDL for simple test job:
[antun@ce antun]$ cat test.jdlExecutable = "/bin/hostname";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out","std.err"};
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Job Submission (2)Job Submission (2)
edg-job-list-match test.jdl edg-job-submit test.jdl edg-job-status JobID edg-job-cancel JobID edg-job-get-output JobID edg-job-get-logging-info JobID Bypassing RB:
globus-job-run CE command
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Using myproxy serverUsing myproxy server
Myproxy server is used for Very long jobs (that normal proxy
may be expired) Getting proxy on other machines
than UI (typical for portals) myproxy-init –s MYPROXYSERVER myproxy-get-delegation myproxy-info myproxy-destroy
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
In a nutshellIn a nutshell
voms-proxy-init –voms VONAME edg-job-submit job.jdl edg-job-status JobID edg-job-get-output JobID
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Monitoring, SEE-GRID Monitoring, SEE-GRID SFTs and GridICE (1)SFTs and GridICE (1)
Qstat, showq, pbsnodes on CE Ldapsearch of GIISes:
ldapsearch -x -h <CE_or_SE> -p 2135 -b mds-vo-name=local,o=grid
ldapsearch -x -h <CE> -p 2135 -b mds-vo-name=<site-giis-name>,o=grid
ldapsearch -x -h <BDII> -p 2170 -b o=grid
Useful entries: GlueCEUniqueID, GlueSEUniqueID, GlueSEName, GlueCESEBindSEUniqueID
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Monitoring, SEE-GRID Monitoring, SEE-GRID SFTs and GridICE (2)SFTs and GridICE (2)
For some grid components there are custom checking tools, e.g. rgma-client-check
ps on all nodes – do not forget about excellent ps!
Submitting test jobs SEE-GRID GStat
http://goc.grid.sinica.edu.tw/gstat/seegrid/
SEE-GRID Banjaluka Training Session
A E G I S
Nov 5-6, 2005
Monitoring, SEE-GRID Monitoring, SEE-GRID SFTs and GridICE (3)SFTs and GridICE (3)
SEE-GRID SFTshttp://grid-se.marnet.net.mk/sft/lastreport.cgi
SEE-GRID GridICEhttp://grid-se.ii.edu.mk/gridice/site/site.php
Real Time Monitor http://gridportal.hep.ph.ic.ac.uk/rtm/