Michigan Grid Testbed Report

Shawn McKee

University of Michigan

UTA US ATLAS Testbed Meeting

April 4, 2002


4/4/2002 Shawn McKee - University of Michigan - UTA ATLAS Grid Mtg

Michigan Grid Testbed Layout


Grid Machine Details

Machine   CPU           Memory    Disk                 Network                       OS/kernel
atgrid    2 x 800 MHz   1024 MB   4 x 36 GB (RAID 5)   100 Mb/s, 1000 Mb/s fiber     RH 6.2/2.4.16
linat01   2 x 450 MHz   512 MB    2 x 9 GB             100 Mb/s, 1000 Mb/s fiber     RH 6.2/2.4.16
linat02   2 x 800 MHz   768 MB    18 GB                100 Mb/s, 1000 Mb/s copper    RH 6.2/2.4.16
linat03   2 x 800 MHz   768 MB    4 x 18 GB            100 Mb/s, 1000 Mb/s copper    RH 6.2/2.4.16
linat04   2 x 800 MHz   512 MB    2 x 18 GB            100 Mb/s, 1000 Mb/s copper    (being rebuilt)
linat05   2 x 550 MHz   512 MB    35 GB                100 Mb/s, 1000 Mb/s fiber     (being added)
linat06   2 x 800 MHz   768 MB    450 GB (RAID 5)      100 Mb/s, 1000 Mb/s fiber     RH 7.1/2.4.16


Grid Related Activities at UM

• Network monitoring and testing

• Security related tools and configuration

• Crash-dump testing for Linux

• Web100 testing

• MGRID initiative (sent to UM Administration)

• MJPEG video boxes for videoconferencing

• UM is now an “unsponsored” NSF SURA Network Middleware Initiative Testbed site

• Authenticated QoS signaling and testing


Web100 Experience

• We upgraded the kernels on many of our nodes to 2.4.16 and then applied the Web100 patches (alpha release)

• The goal is to provide network tuning and debugging info and tools by instrumenting low level code in the TCP stack and kernel

• Our experience has been mixed:

– Nodes with patches crash every ~24-36 hours

– Application monitoring tools don’t all work

– Difficult to have a non-expert get anything meaningful from the tools

• Recommendation is to wait for a real release!


Iperf/Network Testing

http://atgrid.physics.lsa.umich.edu/~cricket/cricket/grapher.cgi

• We have been working on automated network testing and monitoring

• Perl scripts have been used to run Iperf tests from LINAT01 (gatekeeper) to the gatekeeper at each of the other testbed sites, using Globus.

• We track UDP/TCP bandwidth, packet loss, jitter, and buffer sizes for each “direction” between each pair of sites.

• Results are recorded by Cricket and are available as plots for various time-frames

• Problems: Globus job submissions at certain sites, automating restart of the Perl scripts, and “zombie” processes accumulating…needs better exception handling.

• We separately use Cricket to monitor:

– Round-trip times and packet losses using Ping

– Testbed node details (load avg, cpu usage, disk usage, processes) using SNMP

– Switch and router statistics using SNMP

• Long term goal is to deploy hardware (monitors and beacons) on the testbed.
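One way to approach the exception-handling problem noted above is sketched here in Python rather than the Perl actually used on the testbed. It is only an illustration under stated assumptions: the site names, the `ProbeResult` type, and the helpers `directed_pairs` and `run_probe` are hypothetical, and the probe command is a stand-in, not the real Iperf or Globus invocation. The sketch enumerates every direction between each pair of sites and runs each probe with a timeout, so a hung child is killed and reaped instead of lingering as a zombie.

```python
import itertools
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class ProbeResult:
    """Outcome of one probe in one direction (hypothetical record type)."""
    source: str
    target: str
    ok: bool
    output: str = ""
    error: Optional[str] = None


def directed_pairs(sites: Sequence[str]) -> List[Tuple[str, str]]:
    """Return every (source, target) direction between distinct sites."""
    return list(itertools.permutations(sites, 2))


def run_probe(source: str, target: str, command: Sequence[str],
              timeout_s: float) -> ProbeResult:
    """Run one probe command and always return a result, never raise.

    subprocess.run kills and waits for the child when the timeout
    expires, so a hung probe does not stay behind as a zombie.
    """
    try:
        done = subprocess.run(list(command), capture_output=True,
                              text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return ProbeResult(source, target, False,
                           error=f"timed out after {timeout_s} s")
    except OSError as exc:
        # The command could not be started at all (missing binary, etc.).
        return ProbeResult(source, target, False,
                           error=f"could not start: {exc}")
    if done.returncode != 0:
        return ProbeResult(source, target, False, done.stdout,
                           f"exit status {done.returncode}")
    return ProbeResult(source, target, True, done.stdout)


if __name__ == "__main__":
    # Placeholder site names and a stand-in command (assumptions).
    sites = ["site-a", "site-b", "site-c"]
    stand_in = [sys.executable, "-c", "print('probe ok')"]
    for src, dst in directed_pairs(sites):
        result = run_probe(src, dst, stand_in, timeout_s=10.0)
        print(src, "->", dst, "ok" if result.ok else result.error)
```

Because every failure mode comes back as a `ProbeResult`, a driver loop like the one at the bottom can log the error and move on to the next pair rather than stopping and needing a manual restart.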


MGRID (Michigan Grid Research and Infrastructure Development)

• Various colleges and units at UM are very interested in grids and grid technology

• We have proposed formation of an MGRID center, funded by the University

• Size is to be 3 FTEs plus a director with initial funding for three years

• The MGRID Center is a cooperative center of faculty and staff from participating units with a central core of technical staff, who together will carry out the grid development and deployment activities at the UM.

• US ATLAS grids would be a focus of such a center…we should find out about MGRID by July 2002


NMI Testbed

Michigan has been selected as an “unsponsored” NMI testbed member. Goals are to:

• Develop and release a first version of GRIDS and Middleware software

• Develop security and directory architectures, mechanisms and best practices for campus integration

• Put in place associated support and training mechanisms

• Develop partnership agreements with external groups focused on adoption of software

• Put in place a communication and outreach plan

• Develop a community repository of NMI software and best practices


NMI GRIDS

NMI-GRIDS components:

• Globus Toolkit 2.0 (Resource discovery and management, authenticated access to and scheduling of distributed resources, coordinated performance of selected distributed resources to function as a dynamically configured "single" resource.)

• GRAM 1.5

• MDS 2.2

• GPT v.?

• GridFTP

• Condor-G

• Network Weather Service

• All services should accept X.509 credentials for authentication and access control.

Much the same type of tools we already are using on our testbed


NMI EDIT (Enterprise and Desktop Integration Technologies)

NMI-EDIT components: The deliverables anticipated from NMI-EDIT for NMI Release 1 are of four types:

• Code - Code is being developed, adapted or identified for desktops (e.g. KX.509, openH.323, SIP clients) and for enterprise use (such as Metamerge connectors, Shibboleth modules for Apache, etc.). Code releases are generally clients, modules, plug-ins and connectors, rather than stand-alone executables.

• Objects - Objects include data and metadata standards for directories, certificates, and for use with applications such as video. Examples include eduPerson and eduOrg objectclasses, S/MIME certificate profiles, video objectclasses, etc.

• Documents - This includes white papers, conventions and best practices, and formal policies. There is an implied progression in that the basic development of a new core middleware area results in a white paper (scenarios and alternatives) intended to promote an architectural consensus as well as to inform researchers and campuses. The white paper in turn leads to deployments, which require conventions, best practices and requisite policies. The various core middleware areas being worked on within Release 1 include PKI, directories, account management, and video.

• Services - “Within the net” operations are needed to register unique names and keys for organizations, services, etc. Roots and bridges for security and directory activities must be provided.


Authenticated QoS Work

• We have been working with CITI (Andy Adamson) at UM on issues related to QoS (Quality of Service)

• This is a critical issue for grids and any applications which require certain levels of performance from the underlying network

• A secure signaling protocol has been developed and tested…it is being moved into the GSI (Globus Security Infrastructure)

• A “Grid Portal” application is planned to provide web based secure access to grids.

• Our US ATLAS testbed could be a testing ground for such an application, if there is interest.


Network Connectivity Diagram


Future UM Network Layout


Future Additions to the UM Grid

• We have been working closely with others on campus in grid related activities

• Tom Hacker (CAC/Visible Human) has asked us to install VDT 1.0 on two different installations on campus with significant compute resources.

• We hope to test how we can use and access shared resources as part of the US ATLAS grid testbed

• Primary issue is finding a block of time to complete the install and testing…


Hardware Resources – Arbor Lakes

• Linux Cluster

– 100 processor Beowulf cluster, equipment donated by Intel Corporation

– Dual 800 MHz Pentium III, 1 GB RAM per node (512 MB per processor)

– 30 GB hard drive per node

– Interconnect is Gigabit Ethernet

Cluster diagram labels:

• 80 GB NFS fileserver node

• “Master Node” for login, text and job submission

• Gigabit Ethernet interconnect

• 42 TB Tivoli Mass Storage System via NFS

• Computation nodes: Intel copper Gigabit Ethernet adapter, 2 processors, 1 GB RAM, 30 GB hard drive


Hardware Resources – Media Union

• AMD Linux Cluster

– 100 AMD 1800+ processors

– 2 per node

– 1 GB RAM per node (512 MB per processor)

– Interconnected with Mgnnect

– Red Hat Linux

– Distributed Architecture


To Do…

• Install VDT 1.0, first at Arbor Lakes and Media Union, then upgrading our site

• Get network details at each site documented

• Start gigabit level testing to selected sites

• Get crash dumps to actually work

• Document and provide best practices on WWW site for networking (HENP+NSF?) and grid related software…

• Determine how to leverage NMI testbed tools for USATLAS…
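For the gigabit-level testing item, one common sizing rule (not stated in the slides) is the bandwidth-delay product: a single TCP stream can fill a path only if its socket buffer holds roughly bandwidth × round-trip time. The sketch below works that arithmetic for the 100 Mb/s and 1000 Mb/s interfaces listed for the testbed machines. The 40 ms round-trip time is a made-up illustrative value, not a measurement from the testbed, and `bdp_bytes` is a hypothetical helper name.

```python
def bdp_bytes(bandwidth_mbps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes for a link speed in Mb/s and an RTT in ms."""
    # Bits that can be in flight during one round trip.
    bits_in_flight = bandwidth_mbps * 1_000_000 * rtt_ms // 1000
    return bits_in_flight // 8


if __name__ == "__main__":
    assumed_rtt_ms = 40  # illustrative assumption, not a measured value
    for mbps in (100, 1000):
        print(f"{mbps} Mb/s at {assumed_rtt_ms} ms RTT: "
              f"{bdp_bytes(mbps, assumed_rtt_ms)} bytes")
```

Under that assumed round-trip time, the gigabit path needs a buffer ten times larger than the 100 Mb/s path, which is one reason the buffer sizes are worth recording alongside the bandwidth results.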