Platform OCS 5 Technical Training
What we’ll cover
Day 1:
Module 1: Concepts & Terminology
Module 2: Node Provisioning
Module 3: Basic Administration
Module 4: HPC and Workload Management
What we’ll cover
Day 2:
Module 1: Installer Node Setup & Configuration
Module 2: Compute Node Installation & Customization
Module 3: The OCS 5 “Survival Kit”
Module 4: Case Studies / Lab Activities
Day 1, Module 1
Platform OCS 5 Concepts & Terminology
Module Objectives
Upon completion of this module, you will be able to:
Understand the term High Performance Computing
Describe a “Beowulf cluster”
Understand Platform OCS 5 concepts and terminology
Know what Kusu is and its relation to Platform OCS 5
Recognize key commands that will be explored in future modules
HPC: Who is doing HPC?
.com .gov .edu
HPC for .gov : Energy Research
Premier applied science laboratory that is part of the National Nuclear Security Administration (NNSA) within the Department of Energy (DOE)
#1, #11 on top500.org
IBM BlueGene/L eServer
212,992 processors
73,728 GB memory
478,200 GFlops (Rmax)
596,378 GFlops (Rpeak)
http://top500.org/site/systems/2556 - as of November 2007
Photo courtesy of Lawrence Livermore National Laboratory
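The two GFlops figures on this slide can be related with a line of arithmetic: Rmax is the measured Linpack performance and Rpeak is the theoretical peak, so their ratio is the fraction of peak the machine sustains. A minimal sketch using only the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted on the slide (top500.org, November 2007).
rmax_gflops = 478_200    # measured Linpack performance
rpeak_gflops = 596_378   # theoretical peak performance
processors = 212_992

# Fraction of theoretical peak actually sustained on Linpack.
efficiency = rmax_gflops / rpeak_gflops

# Average sustained performance contributed by each processor.
gflops_per_processor = rmax_gflops / processors

print(f"Linpack efficiency: {efficiency:.1%}")
print(f"Sustained GFlops per processor: {gflops_per_processor:.2f}")
```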
HPC for .gov : merging black holes
Largest astrophysical calculation ever performed on a NASA supercomputer
SGI Altix system running Linux: 20 nodes, 512 CPUs per node
Total: 10,240 processors
http://www.nasa.gov/centers/goddard/universe/gwave.html
HPC for .edu : the bluebrain project
Understand brain function and dysfunction through detailed simulations
Objective : replicate neocortical column of a rat (10,000 neurons)
IBM Blue Gene: 8,000 CPUs, MPI
http://bluebrain.epfl.ch/page18699.html
First comprehensive attempt to reverse-engineer the mammalian brain
HPC for .com : GOOGLE
150M queries/day (2000/second)
8.0B documents in the index
100,000 Linux systems in data centers around the world
15 TFlops
1000 TB total
Eigenvalue problem, transition probability matrix, Markov chain
http://en.wikipedia.org/wiki/Markov_chain
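The keywords on this slide describe ranking as an eigenvalue problem: the long-run visit probabilities of a Markov chain are the eigenvector (eigenvalue 1) of its transition probability matrix. The sketch below finds that vector by power iteration on a three-page toy graph. The matrix values are invented for illustration, there is no damping factor, and this is not a description of Google's actual implementation.

```python
def stationary_distribution(P, iterations=100):
    """Power iteration: repeatedly apply the transition matrix P
    (P[i][j] = probability of moving from page i to page j)."""
    n = len(P)
    rank = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(iterations):
        rank = [sum(rank[i] * P[i][j] for i in range(n)) for j in range(n)]
    return rank

# Hypothetical link graph: page 0 links to 1; page 1 links to 0 and 2;
# page 2 links to 0. Each row sums to 1.
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
]

rank = stationary_distribution(P)
print([round(r, 3) for r in rank])
```

On a real index the matrix has billions of rows and is extremely sparse, which is why the computation is spread over a cluster.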
HPC for .edu (2) : SETI@home
Largest distributed computation project in existence
Running on 500,000 PCs, ~1000 CPU Years per day
Distributes Datasets from Arecibo Radio Telescope
Results sent back and combined
http://setiathome.berkeley.edu/
Why is HPC so important ?
Because money matters!
Airlines: system-wide logistics optimization. Savings: approx. $100 million per airline per year*.
Automotive design: CAD-CAM, crash testing, structural integrity and aerodynamics. Savings: approx. $1 billion per company per year*.
Semiconductor industry: device electronics simulation and logic validation. Savings: approx. $1 billion per company per year*.
Securities industry: mortgage risk simulation. Savings: approx. $15 billion per year for U.S. home mortgages*.
Solve new classes of problems
More refined models
Larger models
* source: http://www.cs.utk.edu/~dongarra/WEB-PAGES/SPRING-2005/Lect01-overview.pdf
Different systems for solving different problems
Different architectures have evolved to solve different specialized problems in HPC
“SMP-like” systems
Clusters with high-speed interconnects (parallel)
Clusters with standard interconnects (Monte Carlo)
Hybrid systems (ClearSpeed, Cell BE, FPGA, GPUs)
…
HPC: what is OpenMP?
OpenMP (Open Multi-Processing) is an Application Programming Interface (API) that supports multi-platform shared memory multiprocessing programming in C/C++ and Fortran.
It consists of a set of compiler directives, library routines, and environment variables (ex: OMP_NUM_THREADS) that influence run-time behavior.
GCC 4.2 supports OpenMP
Keywords: SMP, fat nodes
http://www.openmp.org/
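The shape of a shared-memory parallel loop can be sketched without a compiler. The Python below is only an analogy, not OpenMP: it reads OMP_NUM_THREADS the way an OpenMP runtime would, splits a loop over one shared list among threads, and combines the partial results, much as a C loop marked with "#pragma omp parallel for reduction(+:total)" would. Python threads do not deliver the speedup real OpenMP does; the helper names thread_count and parallel_sum are invented for this sketch.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def thread_count():
    """Use OMP_NUM_THREADS if it is a positive integer, else the CPU count."""
    value = os.environ.get("OMP_NUM_THREADS", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    return os.cpu_count() or 1

def parallel_sum(data, num_threads):
    """Split the index range into chunks; every thread reads the same shared list."""
    chunk = max(1, (len(data) + num_threads - 1) // num_threads)

    def partial(start):
        return sum(data[start:start + chunk])

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # The final sum() plays the role of the reduction step.
        return sum(pool.map(partial, range(0, len(data), chunk)))

data = list(range(1_000_000))
total = parallel_sum(data, thread_count())
print(total)
```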
HPC: what is MPI ?
MPI stands for Message Passing Interface
Library specification designed to support parallel computing in a distributed environment
2 standards: MPI-1 and MPI-2
Several implementations (Open Source, ISV)
Keywords: distributed memory, Beowulf cluster
http://www.mpi-forum.org
HPC: what is PVM?
PVM stands for Parallel Virtual Machine
PVM is a software package that permits a heterogeneous collection of Unix and/or Windows computers hooked together by a network to be used as a single large parallel computer
Not widely used but still actively developed
Keyword: virtual machine
http://www.csm.ornl.gov/pvm/
Cluster: What is a cluster?
Cluster: independent computers combined into a unified system through software and networking.
Typically used for High Availability (HA) for greater reliability or High Performance Computing (HPC) to provide greater computational power than a single computer can provide.
Cluster: What is a Beowulf cluster?
Beowulf clusters: scalable performance clusters based on commodity hardware with an Open Source software infrastructure.
Class I clusters are built entirely using commodity hardware and software
Class II clusters may use specialized hardware to achieve higher performance.
http://www.beowulf.org/
Cluster: What is a Beowulf cluster?
NASA: Project Columbia
Beowulf cluster
42.7 teraflops, built in 120 days
Tunghai University, Taiwan
Beowulf parallel testbed
17 compute nodes
Class I Class II
Cluster: stateful vs. stateless nodes?
Stateful: each system can be modified locally
Stateless: no state exists on single computers; all state is centralized
Manageable complexity (no or limited growing entropy)
Scalability
Needs 100% automatic configuration
As we’ll see shortly, OCS 5 has rich capabilities in this area: nodes can be stateful, but administrators can realize the administrative benefits of managing nodes as if they were stateless
Cluster: clustering suites
OCS-like: OSCAR, xCAT, Warewulf
Single System Image (cluster seen as a unique machine): Scyld (bproc), Clustermatic, Mosix / OpenMosix, Kerrighed
Cluster: what kinds of clusters can OCS 5 handle?
Beowulf clusters, Class I/II
Only x86, x86_64 based
Red Hat / CentOS or Fedora Nodes*
Extensible to other OS environments
Interoperable with other Architectures
What is Project Kusu?
‘Kusu’ is an Open Source provisioning, cluster file management and repository management toolkit.
‘Kusu’ is the first completely Open Source project created by Platform – http://www.osgdc.org
‘Kusu’ provides a technology foundation for Platform OCS 5
What is Kusu?
Kusu Island is located to the south of the main island of Singapore, off the Straits of Singapore. The name means "Tortoise Island" or "Turtle Island" in Chinese; the island is also known as Peak Island or Pulau Tembakul in Malay. From 2 tiny outcrops on a reef, the island was enlarged and transformed into an island holiday resort of 85,000 square metres. The island is 5.6 km south of the main island of Singapore.
What is Kusu?
Legend has it that a magical tortoise turned itself into an island to save two shipwrecked sailors - a Malay and a Chinese. The two men gave thanks, each according to his own belief system, the former by building a Muslim kramat (keramat) (shrine), and the latter by establishing a Taoist shrine. Each year during the ninth lunar month (which falls between September and November according to the Lunar Calendar), thousands of devotees flock to the island for their annual Kusu Pilgrimage to pay homage for good health, peace, happiness, good luck and prosperity.
What is Platform OCS 5?
Project Kusu forms the cluster toolkit foundation for Platform OCS 5.
Kusu is a key part of Platform OCS 5.
OCS 5 is the complete software stack for HPC clusters.
The software stack is all of the individual software components that must be installed and configured in a cluster so that the end user or customer can run their applications.
Platform OCS 5 is a hybrid software stack.*
Many components are Open Source (Platform Lava, MPI, Linux)
Some components are freeware
Some components are commercial (Platform LSF HPC)
Red Hat HPC contains only the Open Source components of Platform OCS.
What is Red Hat HPC?
Red Hat® HPC is a solution based on Platform OCS 5
Red Hat HPC contains all of the Open Source components of OCS 5
Red Hat HPC is integrated with RHN
A Red Hat HPC channel exists on RHN
Customers must subscribe to the Red Hat HPC channel.
Red Hat HPC is installed using yum
Before an install proceeds the software checks the network and disk configuration to ensure the machine meets minimum requirements.
What is Red Hat HPC?
Support for Red Hat HPC
Red Hat is the first line of support: customers call Red Hat
Red Hat will escalate to Platform when needed
There is no ‘hand off’ of customers; Red Hat and Platform jointly support customers until problem resolution
Patches and Security Updates for Red Hat HPC
Red Hat builds the software from source at Red Hat
Platform releases patches to Red Hat.
Platform or Red Hat may identify security problems; patches are either pushed upstream to Platform or downstream to Red Hat.
Red Hat will use the source to identify issues and resolve them.
Overview of the OCS 5 framework
[Diagram: OCS 5 framework layers]
Network Hardware: Infiniband, Myrinet, Ethernet, Gig-E, BMC
Basic Cluster Services: DHCP, NFS, NTP, DNS, HTTP, IPMI, LDAP, NIS, MySQL, PFS
Kusu Cluster: cfm, addhost, genconfig, driverpatch, repoman, repopatch, nghosts, ngedit, buildimage, buildinitrd, kitops, buildkit
Cluster Middleware: SOAM, OFED, Lava/LSF, Portals, MPI, Compilers
Admin Applications: Cluster Monitoring, Cluster Reporting, Cluster Management, Workload Management
User Applications: Reservoir, Tools, Seismic, CFD, BI, FEA, Bio, Other
Compute, Network & Storage Resources
Linux® Operating System
Challenges Uniquely Addressed by OCS 5
Key Platform OCS 5 Advantages
Deploy & manage site specific node types
Easily maintain systems at current patch levels
Perform low-risk “trial installs” of packages & OS environments
Support diskless clusters using images
Install patches or rpms without reprovisioning
Dynamically change node configurations
Synchronize key files across all nodes
Self monitoring / notification of problems
Scale to hundreds of hosts / multiple clusters
Platform OCS 5: Key Concepts & Definitions
Key Platform OCS 5/Kusu concepts:
Installer Node/Primary Installer Node
Node Group
Kit
Component
Snapshot
Repository
We’ll now explain these essential/fundamental concepts and will cover them in detail in future modules
Platform OCS 5 Installer Nodes
Installer Nodes are: Nodes that install the Cluster
DHCP – for nodes in the cluster, TFTP – to initiate remote install, HTTP – send full OS install to nodes
Provide packages for installing nodes:
OS Repositories (package collections)
Kits: package collections of applications and operating systems
Provide the Configuration Management for the nodes:
Kit installation
Cluster File Management, file replication and update
Maintain the SQL based cluster database (kusudb)
Primary Installer Nodes (note: not implemented in OCS 5)
Installer nodes can be arranged in a hierarchical manner
One Installer node (usually the first one) is the Primary Installer Node
All other Installer Nodes synchronize their configuration and database with the primary installer node. (not in OCS 5.0-.1)
OCS 5 Node Groups
A Node group is a template that defines the properties of cluster nodes:
Repository to use (defines OS)
Install to perform: Package Based / Image Based / Diskless
Components to install (components are packaged in kits): Platform and 3rd party applications
Disk partitioning scheme (including LVM)
Network configurations: multiple network types allowed, including Infiniband interfaces
OS packages
Custom scripts
Cluster node naming scheme
Custom driver modules to load
Kernel parameters
Kernel and initrd to boot: Xen or regular
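One way to picture a node group as a template is a record with one field per property on this slide. The sketch below is a hypothetical Python model, not Kusu's actual schema: the class, field names, default values and example names (such as "rhel5" and "compute-#NN") are all invented for illustration.

```python
from dataclasses import dataclass, replace

INSTALL_TYPES = ("package", "image", "diskless")

@dataclass(frozen=True)
class NodeGroup:
    """Hypothetical model of the node group properties listed on the slide."""
    name: str
    repository: str              # defines the OS
    install_type: str            # one of INSTALL_TYPES
    components: tuple = ()       # components come from kits
    partitions: tuple = ()       # disk partitioning scheme
    networks: tuple = ()         # e.g. Ethernet and Infiniband interfaces
    packages: tuple = ()         # extra OS packages
    scripts: tuple = ()          # custom scripts
    name_format: str = "node-#NN"
    modules: tuple = ()          # custom driver modules to load
    kernel_params: str = ""
    kernel: str = ""
    initrd: str = ""

    def __post_init__(self):
        if self.install_type not in INSTALL_TYPES:
            raise ValueError(f"unknown install type: {self.install_type!r}")

compute = NodeGroup(
    name="compute",
    repository="rhel5",
    install_type="package",
    components=("base-node", "lava-compute"),
    networks=("eth0", "ib0"),
    name_format="compute-#NN",
)

# A test group is the same template pointed at a different repository.
compute_test = replace(compute, name="compute-test", repository="rhel5-snapshot")
print(compute_test)
```

Every node assigned to a group inherits the same values, which is what lets stateful nodes be managed as if they were stateless.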
Platform OCS 5 Kits
Kits are pre-packaged applications or services that once added to a Platform OCS cluster can be automatically installed and configured onto cluster nodes.
[Diagram: Kit A contains Components A-1, A-2, … A-k; each component contains RPMs (RPM 1, RPM 2, RPM 3, RPM 4, RPM 5, … RPM N). Kit B is shown alongside Kit A.]
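The kit diagram describes a three-level hierarchy: a kit holds components, and a component holds RPMs. A minimal Python sketch of that structure follows. The class names, the helper rpms_to_install, and the assignment of particular RPMs to particular components are assumptions made for illustration; the slide does not say which RPM belongs to which component.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Component:
    name: str
    rpms: tuple

@dataclass(frozen=True)
class Kit:
    name: str
    components: tuple

def rpms_to_install(kit, selected):
    """Collect the RPMs of the selected components, without duplicates,
    in the order the kit lists them."""
    rpms = []
    for component in kit.components:
        if component.name in selected:
            for rpm in component.rpms:
                if rpm not in rpms:
                    rpms.append(rpm)
    return rpms

kit_a = Kit(
    name="Kit A",
    components=(
        Component("A-1", ("RPM 1", "RPM 2")),
        Component("A-2", ("RPM 2", "RPM 3")),   # shares RPM 2 with A-1
        Component("A-k", ("RPM N",)),
    ),
)

# A node group that selects two of the kit's three components.
print(rpms_to_install(kit_a, {"A-1", "A-2"}))
```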
Platform OCS 5 Repositories
Platform OCS 5 can use Red Hat®, Fedora or CentOS based repositories
A single Installer node can manage multiple repositories – this means that many different OS versions can be installed in a single cluster if desired.
Multiple repositories with the same OS and version are not supported
OCS 5 can take standard OS media and create a repository from the media.
A snapshot can be made from any repository, allowing the Administrator to modify the snapshot without affecting the original.
The Platform OCS 5 Database
Platform OCS 5 adheres to the following design principles:
All OCS cluster configuration is in the database.
All tools and GUIs modify the database tables.
Any configuration files required to run the cluster are generated from the database, such as: hosts, DNS, dhcpd, pdsh files etc.
All Platform OCS cluster tools retrieve configuration from the database.
So obviously the database is a key component of Platform OCS!
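The principle that configuration files are generated from the database, never edited by hand, can be sketched in a few lines. This is an illustration only: it uses Python's built-in sqlite3 as a stand-in for the MySQL cluster database, and the "nodes" table, its columns, the host names and the generate_hosts function are hypothetical, not the real kusudb schema or the real behavior of genconfig.

```python
import sqlite3

def generate_hosts(conn):
    """Render hosts-file text from whatever the database currently contains."""
    rows = conn.execute("SELECT ip, name FROM nodes ORDER BY name").fetchall()
    lines = ["127.0.0.1\tlocalhost"]
    lines += [f"{ip}\t{name}" for ip, name in rows]
    return "\n".join(lines) + "\n"

# Stand-in cluster database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, ip TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO nodes (name, ip) VALUES (?, ?)",
    [("compute-00", "10.1.0.1"), ("compute-01", "10.1.0.2")],
)
hosts_before = generate_hosts(conn)

# Adding a node changes only the database; the file is then regenerated.
conn.execute("INSERT INTO nodes (name, ip) VALUES (?, ?)", ("compute-02", "10.1.0.3"))
hosts_after = generate_hosts(conn)
print(hosts_after)
```

Because the file is derived data, it can be regenerated at any time and stays consistent with every other file built from the same tables.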
The Platform OCS 5 Database
Platform OCS 5 Administration tools
These are key administration tools we’ll examine in future modules:
addhost – add or remove hosts to/from cluster
driverpatch – installs new drivers into the initrd
kitops – used to create and manage kits
buildkit – used to build kits
buildimage – used to create disk images
buildinitrd – used to create the initial ram disk for diskless & imaged nodes
boothost – create PXE config files for booting
genconfig – generate config files from database
repoman – repository management tool
repopatch – patch packages into the repository
nghosts – assigns nodes to node groups
Platform OCS 5 Administration tools
ngedit – change the properties of node groups
netedit – used to manage the networks in a cluster
cfmsync – signal nodes to update files/components
Platform OCS 5 Kits
OCS 5 comes with the following kits which we’ll also examine in future modules:
Base kit
HPC kit
Platform Lava kit
Platform LSF HPC kit
Nagios kit
OFED kit
Intel® Software Tools kit
Cacti kit
Ntop kit
Thank You!
Creating and using an OCS 5 Repository
[Diagram labels: OCS 5 RHEL 5 repo; OCS 5 Fedora Core repo; Kit Fedora; /depot/repos]
1. Administrator inserts Fedora Core CDs/DVD into the Installer Node
2. Kitops is used to add Fedora Core to the Installer Node
3. Repoman is used to create a Fedora Core Repo from the Fedora Core Kit.
4. Ngedit is used to create a node group and associate the new repo.
5. Addhost is then used to add new nodes using the repository
6. Node is PXE booted
OCS 5 Repository Snapshots
Updating a cluster can be risky.
Kernel drivers make it impossible to update the kernel.
Some update packages have dependencies on kernel versions
Sometimes updates ‘regress’ functionality
Platform OCS 5 provides repository snapshots
A snapshot is a full copy of an existing repository.
A new repository directory is created and symbolic links are created from the snapshot directory to the original kits.
Administrators then update the snapshot using repopatch
Admins assign the repository to a ‘test’ node group and provision some machines to test the patches.
If all goes well the new patches can be merged into the original “production” repository by adding the update kit to a repository
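The snapshot mechanism described above (a new directory whose entries are symbolic links to the original kits) can be sketched with the standard library. This is a toy model in a temporary directory, not the real repoman or repopatch behavior: the function snapshot_repo, the directory layout and the kit names "base", "hpc" and "updates" are all assumptions. It assumes a platform where unprivileged symlinks are allowed, such as Linux.

```python
import os
import tempfile

def snapshot_repo(repo_dir, snapshot_dir):
    """Create snapshot_dir with one symlink per entry of repo_dir,
    each pointing at the same original kit the repository uses."""
    os.makedirs(snapshot_dir)
    for entry in sorted(os.listdir(repo_dir)):
        target = os.path.realpath(os.path.join(repo_dir, entry))
        os.symlink(target, os.path.join(snapshot_dir, entry))

with tempfile.TemporaryDirectory() as depot:
    kits = os.path.join(depot, "kits")
    production = os.path.join(depot, "repos", "production")
    snapshot = os.path.join(depot, "repos", "snapshot")

    # A production repository made of links to two kits.
    os.makedirs(production)
    for kit in ("base", "hpc"):
        os.makedirs(os.path.join(kits, kit))
        os.symlink(os.path.join(kits, kit), os.path.join(production, kit))

    snapshot_repo(production, snapshot)

    # Patching touches only the snapshot; production is left alone.
    os.makedirs(os.path.join(snapshot, "updates"))

    production_entries = sorted(os.listdir(production))
    snapshot_entries = sorted(os.listdir(snapshot))

print(production_entries)
print(snapshot_entries)
```

Linking instead of copying keeps the snapshot cheap, while the test node group sees the extra updates and the production node group does not.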
The Platform OCS 5 Database
The SQL database tables that hold state are accessible to Kusu database administrators via the command line or via popular web-based MySQL administration tools such as phpMyAdmin.
Kits, Repositories and Node Groups
Kits are added to OCS 5 using ‘kitops’. After a Kit is added it must be assigned to a repository and then a node group.
[Diagram labels: OCS 5 Kit depot /depot/kits; OCS 5 Repository /depot/repo/…; Kit A]
1. # kitops –add A
2. repoman adds kit to repository
3. repository refreshed to add new Kit
4a. Kit components automatically assigned to node group defined in the kit.
OR
4b. Use ‘ngedit’ to assign kit components to node groups
5. Nodes in node group are re-provisioned or updated with the new Kit