A PRACTICAL INTRODUCTION TO THE ANSELM SUPERCOMPUTER
Infrastructure, access and user support
David Hrbáč
2013-09-27
• Intro
• What is a supercomputer
• Infrastructure
• Access to cluster
• Support
• Log-in
http://www.it4i.cz/files/hrbac.pdf
http://www.it4i.cz/files/hrbac.pptx
Why Anselm
• 6000 name suggestions
• The very first coal mine in the region
• The very first to have a steam engine
• Anselm of Canterbury
Early days
Future - Hal
What is a supercomputer
• Bunch of computers
• Having a lot of CPU power
• Having a lot of RAM
• Local storage
• Shared storage
• High-speed interconnect
• Message Passing Interface
Supercomputer
Supercomputer ?!?
Supercomputer ?!?
Anselm HW
• 209 compute nodes
• 3344 cores
• 15TB RAM
• 300TB /home
• 135TB /scratch
• Bull Extreme Computing Linux (RHEL clone)
Type of Nodes
• 180 compute nodes
• 23 GPU accelerated nodes
• 4 MIC accelerated nodes
• 2 fat nodes
General Node
• 180 nodes
• 2880 cores in total
• two Intel Sandy Bridge E5-2665, 8-core, 2.4GHz processors per node
• 64 GB of physical memory per node
• one 500GB SATA 2.5" 7.2 krpm HDD per node
• bullx B510 blade servers
• cn[1-180]
GPU Accelerated Nodes
• 23 nodes
• 368 cores in total
• two Intel Sandy Bridge E5-2470, 8-core, 2.3GHz processors per node
• 96 GB of physical memory per node
• one 500GB SATA 2.5" 7.2 krpm HDD per node
• GPU accelerator 1x NVIDIA Tesla Kepler K20 per node
• bullx B515 blade servers
• cn[181-203]
MIC Accelerated Nodes
Intel Many Integrated Core Architecture
• 4 nodes
• 64 cores in total
• two Intel Sandy Bridge E5-2470, 8-core, 2.3GHz processors per node
• 96 GB of physical memory per node
• one 500GB SATA 2.5" 7.2 krpm HDD per node
• MIC accelerator 1x Intel Xeon Phi 5110P per node
• bullx B515 blade servers
• cn[204-207]
Fat Node
• 2 nodes
• 32 cores in total
• two Intel Sandy Bridge E5-2665, 8-core, 2.4GHz processors per node
• 512 GB of physical memory per node
• two 300GB SAS 3.5" 15krpm HDD (RAID1) per node
• two 100GB SLC SSD per node
• bullx R423-E3 servers
• cn[208-209]
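The per-type figures can be cross-checked against the totals quoted on the Anselm HW slide (209 nodes, 3344 cores, 15TB RAM). A minimal sketch in POSIX shell: the helper name `anselm_totals` is made up for illustration, and all node counts, core counts and memory sizes are taken from the slides.

```shell
#!/bin/sh
# Cross-check the Anselm totals from the per-node-type figures on the slides.
# anselm_totals is an illustrative helper, not an IT4I tool.
anselm_totals() {
    cores_per_node=16   # two 8-core processors per node, for every node type

    # node counts: general + GPU + MIC + fat
    nodes=$((180 + 23 + 4 + 2))
    cores=$((nodes * cores_per_node))

    # memory in GB: 64 per general node, 96 per GPU/MIC node, 512 per fat node
    ram_gb=$((180 * 64 + 23 * 96 + 4 * 96 + 2 * 512))

    echo "nodes=$nodes cores=$cores ram_gb=$ram_gb"
}

anselm_totals
```

This prints `nodes=209 cores=3344 ram_gb=15136`, which matches the quoted node and core counts and is roughly the 15TB of RAM.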
Storage
• 300TB /home
• 135TB /scratch
• InfiniBand 40 Gb/s
– Native 3600 MB/s
– Over TCP 1700 MB/s
• Ethernet
– 114 MB/s
• LustreFS
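To get a feel for what these bandwidths mean in practice, the sketch below estimates a best-case transfer time as size divided by bandwidth. The helper `transfer_seconds` and the 100 GB example size are assumptions for illustration. The rates are the ones listed on the slide, and real transfers will be slower.

```shell
#!/bin/sh
# Best-case transfer time: data size divided by peak bandwidth.
# transfer_seconds is an illustrative helper; real throughput will be lower.
transfer_seconds() {
    size_mb=$1      # amount of data in MB
    rate_mb_s=$2    # bandwidth in MB/s
    # integer ceiling division
    echo $(( (size_mb + rate_mb_s - 1) / rate_mb_s ))
}

example_mb=100000   # 100 GB, an arbitrary example size
echo "InfiniBand native:   $(transfer_seconds "$example_mb" 3600) s"
echo "InfiniBand over TCP: $(transfer_seconds "$example_mb" 1700) s"
echo "Ethernet:            $(transfer_seconds "$example_mb" 114) s"
```

For 100 GB this gives about 28 s, 59 s and 878 s respectively, so the choice of interconnect matters far more than small tuning steps.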
Lustre File System
• Clustered
• OSS – object storage server
• MDS – meta-data server
• Limits in petabytes
• Parallel - striped
Stripes
• Stripe count
– Parallel access
– Mind the script processes
– Stripe per gigabyte
• lfs setstripe|getstripe
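The "stripe per gigabyte" rule of thumb can be turned into a stripe count for `lfs setstripe`. This is a sketch that assumes the rule means one stripe per started gigabyte of expected file size. The helpers `stripe_count_for_mb` and `print_setstripe` and the directory `/scratch/mydir` are hypothetical, and the command is only printed, not executed.

```shell
#!/bin/sh
# One stripe per started gigabyte of expected file size, minimum one stripe.
# Both helpers are illustrative, not part of Lustre or IT4I tooling.
stripe_count_for_mb() {
    size_mb=$1
    count=$(( (size_mb + 1023) / 1024 ))   # ceiling of size in GB
    if [ "$count" -lt 1 ]; then
        count=1
    fi
    echo "$count"
}

# Print (do not run) the command that would set the stripe count on a directory.
print_setstripe() {
    dir=$1
    size_mb=$2
    echo "lfs setstripe -c $(stripe_count_for_mb "$size_mb") $dir"
}

print_setstripe /scratch/mydir 8192
```

The example prints `lfs setstripe -c 8 /scratch/mydir`. Running `lfs getstripe` on the same path afterwards shows the layout in effect. The number of object storage targets in the filesystem limits how many stripes are actually useful.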
Quotas
• /home – 250GB
• /scratch – no quota
• lfs quota -u hrb33 /home
Access to Anselm
• Internal Access Call - 4x a year
– 3rd round
• Open Access Call – 2x a year
– 2nd round
Proposals
• Proposals undergoing evaluation
– Scientific
– Technical
– Economic
• Primary Investigator
– List of collaborators
Login Credentials
• Personal certificate
• Signed request
• Credentials encrypted
– Login
– Password
– SSH keys
– Passphrase for the key
Credentials lifetime
• Active project(s) or affiliation to IT4Innovations
• Deleted 1 year after the last project
• Announcement
– 3 months before the removal
– 1 month before the removal
– 1 week before the removal
Support
• Bug tracking and trouble ticketing system
• Documentation
• IT4I internal command line tools
• IT4I web applications
• IT4I Android application
• End-user courses
Documentation
• https://support.it4i.cz/docs/anselm-cluster-documentation/
• Still evolving
• Changes almost every day
IT4I internal command line tools
• It4free
• Rspbs
• License allocation
• Internal in-house scripts
– Automation to handle the credentials
– Cluster automation
– PBS accounting
IT4I web applications
• Internal information system
– Project management
– Project accounting
– User management
• Cluster monitoring
IT4I Android application
• Internal tool
• Considering the release to end-users
• Features
– News
– Graphs
• Feature requests
– Accounting
– Support
– Node allocation
– Job status
Log-in to Anselm, finally!
• SSH protocol
• Via anselm.it4i.cz
– login1.anselm.it4i.cz
– login2.anselm.it4i.cz
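A host alias in `~/.ssh/config` saves typing the full host name and key path on every login. The sketch below only prints such a stanza. The helper `anselm_ssh_config`, the alias `anselm`, the login `your_login` and the key path are placeholders, to be replaced with the credentials issued for the account.

```shell
#!/bin/sh
# Print an ~/.ssh/config stanza for Anselm.
# The alias "anselm", the login and the key path are placeholders.
anselm_ssh_config() {
    login=$1
    key=$2
    cat <<EOF
Host anselm
    HostName anselm.it4i.cz
    User $login
    IdentityFile $key
EOF
}

anselm_ssh_config your_login '~/.ssh/id_rsa'
```

Once the printed stanza is appended to `~/.ssh/config`, `ssh anselm` works as a short form of `ssh your_login@anselm.it4i.cz`.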
VNC
• ssh anselm -L 5961:localhost:5961
• Remmina
• vncviewer 127.0.0.1:5961
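A VNC server for display N conventionally listens on TCP port 5900+N, so port 5961 corresponds to display :61. The sketch below builds the tunnel and viewer commands for a given display number. The helpers `vnc_port` and `vnc_tunnel_cmd` are hypothetical, the host alias `anselm` is the one used on the slide, and the commands are only printed.

```shell
#!/bin/sh
# Map a VNC display number to its TCP port (5900 + display).
vnc_port() {
    echo $((5900 + $1))
}

# Print the SSH command that forwards the local port to the same port on Anselm.
vnc_tunnel_cmd() {
    port=$(vnc_port "$1")
    echo "ssh anselm -L $port:localhost:$port"
}

vnc_tunnel_cmd 61
echo "vncviewer 127.0.0.1:$(vnc_port 61)"
```

For display 61 this reproduces the two commands from the slide. While the SSH session stays open, the viewer connects to the local end of the tunnel.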
Links
• https://support.it4i.cz/docs/anselm-cluster-documentation/
• https://support.it4i.cz/
• https://www.it4i.cz/
Questions
Thank you.