Posted on 21-Dec-2015
Introduction to ARSC Systems and Services

Derek Bastille
[email protected]
907-450-8643
User Consultant/Group Lead
Outline

• About ARSC
• ARSC Compute Systems
• ARSC Storage Systems
• Available Software
• Account Information
• Utilization / Allocations
• Questions
About ARSC

• We are not a government site.
  – Owned and operated by the University of Alaska Fairbanks.
  – Focus on Arctic and other Polar regions
    • Involved in the International Polar Year
  – Host a mix of HPCMP and non-DoD users
  – On-site staff and faculty perform original research in fields like Oceanography, Space Weather Physics, Vulcanology and Large Text Retrieval
About ARSC

• Part of the HPCMP as an Allocated Distributed Center
  – Participated in various Technology Insertion initiatives
    • New Cray system acquired as part of TI-08
  – Allocate 70% of available cycles to HPCMP projects and users
  – Locally allocate remaining 30% to non-DoD Universities and other government agencies
  – Connectivity is primarily via DREN OC12
  – Host Service Academy cadets during the summer along with other academic interns
About ARSC

• An Open Research Center
  – All ARSC systems are 'open research'
  – Only unclassified and non-sensitive data
  – Can host US citizens or foreign nationals who are NAC-less
  – Also host undergraduate and graduate courses through the University of Alaska
    • Happy to work with other Universities for student and class system usage
About ARSC

• Wide variety of Projects
  – Computational Technology Areas

  CTA  Name
  ---  ----------------------------------------------------
  OTH  Other
  CSM  Computational Structural Mechanics
  CFD  Computational Fluid Dynamics
  CCM  Computational Chemistry and Materials Science
  CEA  Computational Electromagnetics and Acoustics
  CWO  Climate/Weather/Ocean Modeling and Simulation
  SIP  Signal/Image Processing
  FMS  Forces Modeling and Simulation/C4I
  EQM  Environmental Quality Modeling and Simulation
  CEN  Computational Electronics and Nanoelectronics
  IMT  Integrated Modeling and Test Environments
  SAP  Space/Astrophysics
CTA Comparison for CY2007
[Chart: relative usage broken down by CTA]
About ARSC

[Chart: number of projects, number of jobs and CPU hours by CTA]
ARSC SystemsARSC Systems
Iceberg Iceberg [AK6][AK6]
IBM Power4 (800 core)
5 TFlops peak92 p655+ nodes
736 1.5 Ghz CPUs
2 p690 nodes64 1.7 Ghz CPUs
25 TB DiskWill be retired on
18 July, 2008
ARSC Systems

• Midnight [AK8]
  – SuSE Linux 9.3 Enterprise
  – All nodes have 4 GB per core
  – 358 X2200 Sun Fire nodes
    • 2 dual-core 2.6 GHz Opterons/node
  – 55 X4600 Sun Fire nodes
    • 8 dual-core 2.6 GHz Opterons/node
  – Voltaire Infiniband switch
  – PBSPro
  – 68 TB Lustre Filesystem
ARSC Systems

• Pingo
  – Cray XT5
  – 3,456 2.6 GHz Opteron cores
    • 4 GB per core
    • 13.5 TB total memory
  – 432 Nodes
  – 31.8 TFlops peak
  – SeaStar interconnect
  – 150 TB storage
  – Working towards FY2009 availability (October 2008)
ARSC Systems - Storage

• Seawolf / Nanook
  – SunFire 6800
  – 8 900 MHz CPUs
    • 16 GB total memory
  – 20 TB local (seawolf)
  – 10 TB local (nanook)
  – Fibre Channel to STK silo
  – $ARCHIVE NFS mounted

• Storage Tek Silo
  – SL8500
  – > 3 PB theoretical capacity
  – STK T10000 & T9940 drives
ARSC Systems - Data Analysis
• Discovery Lab
  – MD Flying Flex
  – Multichannel audio and video
  – Located on UAF campus
  – Other Linux / OSX workstations available for post-production, data analysis, animation and rendering

• Access Grid Nodes
  – Collaborations with many UAF departments
ARSC Systems - Software

• All the usual suspects
  – Matlab, ABAQUS, NCAR, Fluent, etc.
  – GNU tools, various libraries, etc.
  – Several HPCMP Consolidated Software Initiative packages and tools
• Several compilers on Midnight
  – Pathscale, GNU, Sun Studio

www.arsc.edu/support/resources/software.phtml
Access Policies

• Similar access policies to other HPCMP centers
  – All logins to ARSC systems are via kerberized clients
    • ssh, scp, kftp, krlogin
    • ARSC issues SecurID cards for the ARSC.EDU Kerberos realm
    • Starting to implement PKI infrastructure
      – PKI is still a moving target for HPCMP at this point
  – All ARSC systems undergo regular HPCMP CSA checks and the DITSCAP/DIACAP process
Access Policies

• 'Open Center Access'
  – Only HPCMP center to be Open Access for all systems
  – National Agency Checks not required
  – Nominal restrictions on Foreign Nationals
    • Must apply from within the US
    • Must not be in the TDOTS
    • Must provide valid passport & entry status
  – Information Assurance Awareness training is required
Access Policies

• Security Policies
  www.arsc.edu/support/policy/secpolicy.html
  – Dot file permissions and some contents routinely checked by scripts
  – Kerberos passphrases expire every 180 days
  – Accounts placed in an 'inactive' status after 180 days of not logging in
  – Please ask us if you have any questions
Application Process - DoD

• HPCMP users need to use pIE and work with their S/AAA
• ARSC has a cross-realm trust with other MSRCs, so principals such as HPCMP.HPC.MIL can be used

We are assuming that most UH researchers will be applying as Non-DoD accounts
Application Process - Non-DoD

• Non-DoD users and projects are handled internally by ARSC
• www.arsc.edu/support/accounts/acquire.html
  – Application forms and procedures
  – ARSC will issue and send the SecurID cards
• Allocations based on federal FY (1 Oct - 30 Sep)
• Granting of resources is dependent on how much of the 30% allocation remains
  – Preference given to UA researchers and affiliates and/or Arctic related science
Application Process - Non-DoD

• You may apply for a project if you are a qualified faculty member or researcher
  – Students cannot be a Principal Investigator
    • Faculty sponsor is required, but the sponsor does not need to be an actual 'user' of the systems
    • Students are then added to the project as users
  – PIs are requested to provide a short annual report outlining project progress and any published results
• Allocations of time are granted to projects
  – Start-up accounts have a nominal allocation
  – Production projects have allocations based on need and availability
Application Process - Non-DoD

• Users apply for access as part of the Project
• PIs will need to email approval before we add any user to a project
• ARSC will mail a SecurID card (US Express mail) once the account has been created
• A few things are needed to activate the account
  – Signed Account Agreement and SecurID receipt
  – IAA training completion certificate
  – Citizenship/ID verification
    • See: www.arsc.edu/support/accounts/acquire.html#proof_citizenship
ARSC Systems - Utilization

• Job usage compiled/uploaded to local database daily
• Allocation changes posted twice daily
• PIs will be automatically notified when their project exceeds 90% of its allocation and when it runs out of allocation
• Users can check usage by invoking show_usage
  – show_usage -s for all allocated systems
  – More detailed reports available upon request
ARSC Systems - Utilization

To: <the PI>
From: ARSC Accounts <[email protected]>
Subject: ARSC: midnight Project Utilization and Allocation Summary

Consolidated CPU Utilization Report
================================================================================
FY: 2008    ARSC System: midnight    ARSC Group ID: <GROUP>
Primary Investigator: <PI Name>

Cumulative usage summary for October 1, 2007 through 15 Mar 2008.

             Foreground  Background       Total
             ----------  ----------  ----------
Allocation    150000.00
Hours Used    126432.97        2.59   126435.56
                                      ==========
Remaining      23567.03 ( 15.71%)

NOTE: In order to monitor the usage of your project on a regular basis,
you can invoke the show_usage command on any allocated ARSC system.

If you have any questions about your allocation and/or usage, please contact us.

Regards,
  ARSC HPC Accounts
  [email] [email protected]
  [voice] 907-450-8602
  [fax] 907-450-8601
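Judging from the numbers in the sample report, "Remaining" appears to be the allocation minus the foreground hours only (the 2.59 background hours are not charged), with the percentage taken against the full allocation. A quick sketch of that arithmetic, using the sample figures:

```shell
# Recompute the "Remaining" line from the sample report above.
# Assumption (inferred from the numbers shown): only foreground hours
# count against the allocation; background hours are not charged.
allocation=150000.00
fg_used=126432.97

awk -v alloc="$allocation" -v used="$fg_used" 'BEGIN {
    remaining = alloc - used
    printf "Remaining %10.2f (%6.2f%%)\n", remaining, 100 * remaining / alloc
}'
# prints: Remaining   23567.03 ( 15.71%)
```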
ARSC Systems - Queues

• Invoke news queues on iceberg to see current queues
  – Load Leveler used for scheduling
    http://www.arsc.edu/support/howtos/usingloadleveler.html

Name       MaxJobCPU     MaxProcCPU    Free   Max    Description
           d+hh:mm:ss    d+hh:mm:ss    Slots  Slots
---------  ------------  ------------  -----  -----  ---------------------
data       00:35:00      00:35:00      14     14     12 hours, 500mb, network nodes
debug      1+08:05:00    1+08:05:00    16     32     01 hours, 4 nodes, debug
p690       21+08:05:00   21+08:05:00   64     64     08 hours, 240gb, 64 cpu
single     56+00:05:00   56+00:05:00   113    704    168 hours, 12gb, 8 cpu
bkg        85+08:05:00   85+08:05:00   100    704    08 hours, 12gb, 256 cpu
standard   170+16:05:00  170+16:05:00  113    704    16 hours, 12gb, 256 cpu
challenge  768+00:05:00  768+00:05:00  113    704    48 hours, 12gb, 384 cpu
special    unlimited     unlimited     113    736    48 hours, no limits
cobaltadm  unlimited     unlimited     3      4      cobalt license checking
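Jobs reach these classes through LoadLeveler job command files. A minimal sketch for the standard class, with illustrative values only (the program name, task counts and limits here are placeholders, not ARSC defaults; see the usingloadleveler.html page for the site recipe):

```shell
#!/bin/sh
# Sketch of a LoadLeveler job file for iceberg's "standard" class.
# All names and resource numbers below are illustrative placeholders.
#@ job_name        = example
#@ job_type        = parallel
#@ class           = standard
#@ node            = 2
#@ tasks_per_node  = 8
#@ wall_clock_limit = 4:00:00
#@ output          = example.$(jobid).out
#@ error           = example.$(jobid).err
#@ queue

# Under LoadLeveler/POE the parallel executable is launched directly.
./myprog
```

Submit with llsubmit and monitor with llq.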
ARSC Systems - Queues

• Invoke news queues on Midnight to see current queues
  – PBS Pro used for scheduling

Queue       Min Procs  Max Procs  Max Walltime  Notes
----------  ---------  ---------  ------------  -------
standard            1         16      84:00:00  See (A)
                   17        256      16:00:00
                  257        512      12:00:00
challenge           1         16      96:00:00  See (B)
                   17        256      96:00:00  See (C)
                  257        512      12:00:00
background          1        512      12:00:00
debug               1         32      00:30:00  See (D)
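Midnight jobs are submitted to these queues as PBS Pro scripts. A minimal sketch with illustrative values (the program name and resource counts are placeholders, and the exact resource-request syntax varies by PBS Pro version):

```shell
#!/bin/sh
# Sketch of a PBS Pro job script for midnight's "standard" queue.
# "myprog" and the resource counts are illustrative placeholders.
#PBS -N example
#PBS -q standard
#PBS -l select=4:ncpus=4
#PBS -l walltime=8:00:00
#PBS -j oe

# PBS starts the job in $HOME; move to the submission directory.
cd "$PBS_O_WORKDIR"
mpirun -np 16 ./myprog
```

A 16-processor request like this one fits the first standard tier (up to 84:00:00 of walltime); asking for 17 or more processors drops the limit to 16:00:00 per the table above.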
ARSC Systems - Help

• Each system has a Getting Started…
  www.arsc.edu/support/howtos/usingsun.html
  www.arsc.edu/support/howtos/usingp6x.html
• The HPC News email newsletter has many great tips and suggestions
  www.arsc.edu/support/news/HPCnews.shtml
• Help Desk Consultants are quite talented and able to help with a variety of issues
Contact Information

ARSC Help Desk
Mon - Fri, 08:00 - 17:00 AK
907-450-8602
[email protected]
www.arsc.edu/support/support.html