Running clusters on a Shoestring
Fermilab
SC 2007
Poor man’s cluster tool-kit
- PXE boot: cluster installation
- rgang: run commands on worker nodes
- IPMI: Intelligent Platform Management Interface
Installation: PXE boot
PXE booting is a way to boot a computer entirely from the network, without any form of storage on the destination computer (RAM aside).
PXE booting allows us to mass-install an OS on the worker nodes. The worker node OS images are stored on the local hard disk of a BOOTP-TFTP server.
Installation: PXE boot
[Diagram: BOOTP-TFTP server holding the root image, boot image, and DOS image; arrows labeled "BOOTP-TFTP request", "response", and "tomsrtbt and kernel response".]
It takes 14 minutes to install 80 Opteron 270 nodes, using an Intel Xeon BOOTP-TFTP server, over a 100 Mbps network.
clusterInstallScript partitions the local disk, remote-copies the images and unzips them into the disk partitions, configures the network, and reboots the node.
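The order of operations described for clusterInstallScript can be sketched as a dry-run plan. This is only an illustration, not the actual script: the function name `install_plan`, the server name, device names, image paths, the `/tmp/partition.layout` file, and the `configure-network` placeholder are all assumptions, and the choice of `sfdisk`, `rcp`, `gunzip`, and `dd` is a guess at one plausible way to carry out each step.

```python
import shlex


def install_plan(server, disk, images):
    """Return the ordered shell commands for one node. Nothing is executed.

    `images` maps a partition device to a gzip-compressed image path on the
    server. Every name used here is hypothetical.
    """
    q = shlex.quote
    # Step 1: partition the local disk from a layout file (assumed to exist).
    plan = [f"sfdisk {q(disk)} < /tmp/partition.layout"]
    for partition, image in images.items():
        local = "/tmp/" + image.rsplit("/", 1)[-1]
        # Step 2: remote-copy the image from the BOOTP-TFTP server.
        plan.append(f"rcp {q(server + ':' + image)} {q(local)}")
        # Step 3: unzip the image straight into its partition.
        plan.append(f"gunzip -c {q(local)} | dd of={q(partition)}")
    # Step 4: site-specific network setup (placeholder, not a real command).
    plan.append("configure-network")
    # Step 5: reboot into the freshly installed OS.
    plan.append("reboot")
    return plan


if __name__ == "__main__":
    for line in install_plan("bootsrv", "/dev/sda",
                             {"/dev/sda1": "/images/root.img.gz"}):
        print(line)
```

Returning the plan as text rather than running it keeps the sketch safe to try on any machine.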
Deployment: rgang
rgang allows one to execute commands on or distribute files to many nodes.
rgang forks separate rsh or ssh children, which execute in parallel. Once every child has returned or timed out, rgang displays the node responses in sorted order.
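The fan-out pattern described above (start one remote shell per node, wait for all of them or time out, then report in sorted order) can be sketched as follows. This is not rgang's code: it uses a thread pool instead of forked children, and the names `ssh_runner` and `run_on_nodes` are made up for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_runner(node, command, timeout):
    """Run `command` on `node` over ssh; return (exit status, output)."""
    try:
        done = subprocess.run(
            ["ssh", node, command],
            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
            text=True, timeout=timeout,
        )
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired:
        # A node that does not answer in time is reported, not fatal.
        return None, "timed out"


def run_on_nodes(nodes, command, runner=ssh_runner, timeout=30.0):
    """Run `command` on all nodes in parallel; results sorted by node name."""
    if not nodes:
        return []
    with ThreadPoolExecutor(max_workers=min(len(nodes), 64)) as pool:
        futures = {n: pool.submit(runner, n, command, timeout) for n in nodes}
        # Collect only after every child has returned or timed out.
        return [(n, *futures[n].result()) for n in sorted(futures)]
```

Passing the runner as a parameter lets the sketch be exercised without any real worker nodes, for example with a function that just echoes the node name.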
To scale to clusters of a thousand or more nodes, rgang can use a tree structure, selected with the nway switch. When invoked this way, rgang uses rsh or ssh to spawn copies of itself on multiple nodes, and these copies in turn spawn additional copies.
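To see why the tree helps, here is a small sketch of one way such a fan-out could be laid out. The splitting rule is an assumption made for illustration (the node list is cut into up to nway contiguous groups, and the first node of each group relays to the rest of its group); the slides do not say how rgang actually divides the list. `split_groups`, `build_tree`, `depth`, and `size` are hypothetical names.

```python
def split_groups(nodes, nway):
    """Cut `nodes` into up to `nway` contiguous groups of near-equal size."""
    k = min(nway, len(nodes))
    base, extra = divmod(len(nodes), k)
    groups, start = [], 0
    for i in range(k):
        end = start + base + (1 if i < extra else 0)
        groups.append(nodes[start:end])
        start = end
    return groups


def build_tree(nodes, nway):
    """Map each relay node to the subtree of nodes it is responsible for."""
    if nway < 1:
        raise ValueError("nway must be at least 1")
    if not nodes:
        return {}
    # The first node of each group relays to the remainder of that group.
    return {g[0]: build_tree(g[1:], nway) for g in split_groups(nodes, nway)}


def depth(tree):
    """Number of relay hops from the head node to the deepest node."""
    return 1 + max(depth(sub) for sub in tree.values()) if tree else 0


def size(tree):
    """Total number of nodes reached by the tree."""
    return sum(1 + size(sub) for sub in tree.values())
```

Under this splitting rule, 600 nodes with nway=2 are reached in 9 hops, and the head node itself only ever opens two connections.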
Deployment: rgang
rgang took 350 seconds to transfer a 1 GB file from the head node to 600 nodes, using the nway=2 option over a 100 Mbps network. That is an aggregate transfer rate of roughly 13 Gbps for the entire cluster.
[Diagram: the head node distributing the 1 GB file through a tree with nway = 2.]
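The aggregate figure can be checked with a few lines of arithmetic. The sketch assumes 1 GB means 10^9 bytes; `aggregate_gbps` is a name chosen here for illustration.

```python
def aggregate_gbps(nodes, file_gigabytes, seconds):
    """Total bits delivered to all nodes, divided by wall-clock time."""
    bits = nodes * file_gigabytes * 1e9 * 8  # assumes 1 GB = 10**9 bytes
    return bits / seconds / 1e9


# 600 nodes x 1 GB x 8 bits per byte = 4800 Gb, delivered in 350 s.
print(round(aggregate_gbps(600, 1, 350), 1))  # 13.7
```

The result is far above the 100 Mbps of any single link, which is possible only because many node-to-node transfers run at the same time in the tree.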
Management: IPMI
IPMI is an open standard for monitoring, logging, recovery, inventory, and control of hardware that is implemented independent of the main CPU, BIOS, and OS.
The Baseboard Management Controller (BMC) is the brain behind platform management and its primary purpose is to handle the autonomous sensor monitoring and event logging features.
ipmitool provides a simple command-line in-band and out-of-band interface to the BMC.
Management: IPMI
- In-band server management (OS running): local interface to the BMC (SSIF, KCS, SMBus)
- Out-of-band server management (OS not loaded, OS booting, or OS unresponsive): Ethernet to the BMC
- Available through the BMC: power cycle/on/off, CPU temperature, system temperature, CPU/chassis fan speed, sensor event logs
We power on, power off, or reset more than a thousand worker nodes in seconds.
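A minimal sketch of out-of-band power control across many BMCs, assuming ipmitool is installed on the head node and each BMC is reachable over the network. The function names, the worker count, and the 10-second timeout are illustrative choices, not details from the slides. The `-E` flag makes ipmitool read the password from the `IPMI_PASSWORD` environment variable, and `-I lanplus` selects the IPMI v2.0 LAN interface (older BMCs may need `-I lan` instead).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

POWER_ACTIONS = {"status", "on", "off", "cycle", "reset"}


def power_argv(bmc_host, user, action):
    """Build the ipmitool command line for one BMC."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E",
            "chassis", "power", action]


def power_all(bmc_hosts, user, action, run=subprocess.run, workers=64):
    """Send one power action to every BMC in parallel.

    Returns {host: exit status}, with None for a BMC that timed out.
    """
    def one(host):
        try:
            done = run(power_argv(host, user, action),
                       capture_output=True, text=True, timeout=10)
            return host, done.returncode
        except subprocess.TimeoutExpired:
            return host, None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(one, bmc_hosts))
```

Because the requests go to the BMCs rather than to the operating system, the same call works on nodes whose OS is hung or not yet booted.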