1
SIGCSE 2008 Technical Symposium on Computer Science Education
Friday, March 14, 2008
Grid Computing at the Undergraduate Level: Can We Do It?
Panel
Jens Mache, Lewis & Clark College
Portland, Oregon
Thomas Feilhauer, University of Applied Sciences
Dornbirn, Austria
Barry Wilkinson, University of North Carolina Charlotte
(Moderator)
Amy Apon, University of Arkansas
Fayetteville
2
Thomas Feilhauer, University of Applied Sciences
Dornbirn, Austria
3
Grid Computing Course at FHV
Course Web page: http://www2.staff.fh-vorarlberg.ac.at/~tf/grid/
Senior-level course taught in the last (6th) semester of the computer science bachelor program
Prerequisites:
– all students need to have:
• knowledge of network protocols
• experience with object-oriented programming
• good working knowledge of Java
• basics of client/server programming (Web apps)
• fundamental knowledge of XML
– most students have (in addition to the above) knowledge of:
• RPC/RMI
• JNDI (naming & directory service)
• CORBA
• JavaEE
• Java Web services (Apache Axis)
Students have to work in a Linux environment; they shouldn't be afraid of Linux
4
How did we proceed?
Web services
– standards: WSDL, SOAP
– tools: Apache Axis
State in Web services
– define "resource"
– standards: WSRF, WS-Addressing, WS-ResourceProperties, WS-ResourceLifetime, WS-Notification
Frameworks and tools for Grid applications
– GT4 Java WS core
– scheduler: Condor
– database access: OGSA-DAI
– gLite (EGEE)
5
Problems faced
Most of the problems are not specific to teaching Grid computing, but to developing apps within the Grid environment in general
Lots of specifications & standards for underlying technologies
Lots of (mechanical) steps need to be performed to get a first program running
– dependencies between the steps
– lots of different command line tools for code generation & deployment
– lots of different files to maintain and keep consistent
• WSDL file, Java files, WSDD file, JNDI deployment file, Ant file
• dependencies & redundancies are error-prone
Existing tutorials on GT4
– explanations often oversimplified
– students tend to rush through the examples
– students miss out on understanding code and tool interaction
– positive experiences with the tutorials on Condor and OGSA-DAI
Need appropriate IDEs
– e.g. Introduce, gEclipse
– under development
6
Steps in developing a GT4 App.
1. Define service's interface in WSDL
– adapt template WSDL file for resources and services
2. Optionally: Use WSDL2Java to generate framework classes for service implementation
3. Resource implementation
4. Resource Home implementation
5. Service implementation
6. Provide WSDD Deployment Descriptor
7. Provide JNDI deployment file
8. Implement client
9. Adapt Ant build file build.xml
10. Build service using Ant
11. Deploy service
12. Invoke service using the client
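Steps 3 to 5 split the work between a resource (the state), a resource home (creating and finding resources by key) and a service (stateless operations that name the resource they act on). The following is a minimal plain-Java sketch of that division of labor only. It uses no Globus classes, and every class and method name in it is hypothetical; the real GT4 interfaces, the WSDL, and the deployment files are not shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java illustration of the resource / resource home / service split. */
class WsResourceSketch {

    /** Step 3: the resource holds the state (here a single integer value). */
    static class MathResource {
        private int value;
        int getValue() { return value; }
        void setValue(int value) { this.value = value; }
    }

    /** Step 4: the resource home creates resources and finds them by key. */
    static class MathResourceHome {
        private final Map<Integer, MathResource> resources = new HashMap<>();
        private int nextKey = 1;

        int create() {
            int key = nextKey++;
            resources.put(key, new MathResource());
            return key;
        }

        MathResource find(int key) {
            MathResource resource = resources.get(key);
            if (resource == null) {
                throw new IllegalArgumentException("no resource with key " + key);
            }
            return resource;
        }

        void destroy(int key) { resources.remove(key); }
    }

    /** Step 5: the service keeps no state; each call names a resource key. */
    static class MathService {
        private final MathResourceHome home;
        MathService(MathResourceHome home) { this.home = home; }

        int createResource() { return home.create(); }

        void add(int key, int amount) {
            MathResource resource = home.find(key);
            resource.setValue(resource.getValue() + amount);
        }

        int getValue(int key) { return home.find(key).getValue(); }

        void destroyResource(int key) { home.destroy(key); }
    }

    public static void main(String[] args) {
        MathService service = new MathService(new MathResourceHome());
        int first = service.createResource();
        int second = service.createResource();
        service.add(first, 5);
        service.add(second, 10);
        // Each key addresses its own state, although the service object is shared.
        System.out.println("first = " + service.getValue(first));
        System.out.println("second = " + service.getValue(second));
    }
}
```

A singleton resource corresponds to a home that only ever holds one entry; the multiple-resource case is the one sketched here.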
7
Experiences
Very high motivation among students
– elective course
– attractive and relevant topic
Good mixture of theory and practice
– lots of examples
– start with simple examples, stepwise extended to more complex ones
• singleton resource
• multiple resources
• finding a resource by querying resource properties
• destroying resources
Concentrate on the relevant parts of the specifications
Provide templates for configuration and build files available from tutorials, e.g. by Sotomayor
Use communication (collaboration) or sequence diagrams to explain relationships and message flow between different objects
Explain and discuss the code step-by-step
All students passed the (first) exam
8
Jens Mache, Lewis & Clark College
Portland, Oregon
9
200/300-level course that covers grid and network programming
Assignment 1: web concepts (HTTP)
Assignments 2-4: sockets (Java)
Assignments 5+6: RMI
Assignment 7: web services (Apache Axis)
Assignment 8: grid “math” service
Assignment 9: grid “sticky note” service
Mini-project: bigger example, e.g. “File buy”
10
Steps in the “math” assignment
0. Setting up the environment
1. Defining the interface in WSDL
2. Implementing the service in Java
3. Configuring the deployment in WSDD
4. Build the Math service (Create a GAR file)
5. Deploy the Math service
6. Write and compile the client
7. Start the container and execute the client
All of the above steps are mostly done for you!
8. Add functionality to the service
11
“Math” assignment
Write .wsdl & .java
Compile & deploy
Re-start container
Write client
Compile & execute
12
“Sticky note” assignment
1. Getting Started: Deploy a Service
2. State Management Part I: Create Resources
3. Lifetime Management Part I: Destroy Resources
4. State Management Part II: Add a Resource Property
5. Aggregating Resources: Register with a Local Index
6. Building a VO: Register with a Community Index
7. Lifetime Management Part II: Lease-based Model
8. Notification: Resource as Notification Producer
9. Discovery: Find a Resource
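Step 7 replaces explicit destruction with leases: a resource lives until its termination time unless a client renews it. The plain-Java sketch below illustrates only that idea. It is not the GT4 or WS-ResourceLifetime API; the class and method names are hypothetical, and time is passed in as a plain number so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Illustration of lease-based lifetime: resources expire unless renewed. */
class LeaseSketch {

    /** Maps a resource key to its termination time in milliseconds. */
    private final Map<String, Long> terminationTimes = new HashMap<>();

    /** Creates a resource whose lease ends leaseMillis after now. */
    void create(String key, long now, long leaseMillis) {
        terminationTimes.put(key, now + leaseMillis);
    }

    /** Extends a live lease; returns false if the key is unknown or expired. */
    boolean renew(String key, long now, long leaseMillis) {
        Long end = terminationTimes.get(key);
        if (end == null || end <= now) {
            return false;
        }
        terminationTimes.put(key, now + leaseMillis);
        return true;
    }

    /** Removes every resource whose lease has ended; returns how many. */
    int sweep(long now) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = terminationTimes.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() <= now) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean exists(String key) {
        return terminationTimes.containsKey(key);
    }

    public static void main(String[] args) {
        LeaseSketch notes = new LeaseSketch();
        notes.create("note1", 0, 1000);
        notes.create("note2", 0, 1000);
        notes.renew("note1", 500, 1000);   // note1 now lives until t = 1500
        int removed = notes.sweep(1200);   // note2 was never renewed
        System.out.println("removed: " + removed);
        System.out.println("note1 exists: " + notes.exists("note1"));
    }
}
```

The design choice worth discussing with students is that a crashed or forgetful client no longer leaves resources behind; they disappear when the lease runs out.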
13
Recommendations
Cover network programming in Java and RMI
– introduces important concepts (stub compilation and interfaces versus implementations)
Cover web services, XML and WSDL.
Cover the basics of certificates
– at least step-by-step, and with theoretical background if possible
– typically, one cannot even start a grid service without certificates
Do not underestimate the time and effort required to set up the required software.
– A viable alternative to one server shared by all students is installing a stand-alone container on individual student computers.
Follow a basic grid service exercise with a second more advanced grid exercise.
14
Prerequisites
the client/server paradigm
XML
web services
network security ?
network programming in Java
Unlike the prerequisites for cluster computing: algorithms, message passing in C or Fortran
15
Amy Apon, University of Arkansas
Fayetteville
16
University of Arkansas: Teaching Grid Computing to Beginning Programmers
Our beginning programming class is taken by both computer science majors and advanced students in science and engineering courses
Course is taught in C and includes a weekly lab
We wanted to introduce grid computing as a research tool to the students in this class
This meant teaching grid computing to freshmen computer science students
17
The Ultimate Target Grid Platform: GPN Grid
GPNGrid was developed as a virtual organization within the Open Science Grid
Open Science Grid uses Condor for workload management
18
The Actual Student Platform – a Condor pool on our local cluster
We configured Condor on a small cluster of about 30 computers, with a single submit node that the students logged in to
Condor is based on the idea of a ClassAd
universe = vanilla
executable = fire
arguments = $(PROCESS)
output = fire_$(PROCESS).out
error = fire_$(PROCESS).error
log = fire.log
queue 5
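To see what `queue 5` does with `$(PROCESS)`, the following Java sketch mimics the substitution outside of Condor. It is an illustration only, not Condor code: the class and method names are hypothetical, and it relies on Condor numbering the jobs of one submission from 0 upward.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: mimics how $(PROCESS) is replaced for each queued job. */
class SubmitExpansion {

    /** Returns the template with every $(PROCESS) replaced by the job number. */
    static String expand(String template, int process) {
        return template.replace("$(PROCESS)", Integer.toString(process));
    }

    /** Expands a template once per job, numbering jobs 0 .. count-1. */
    static List<String> expandAll(String template, int count) {
        List<String> names = new ArrayList<>();
        for (int p = 0; p < count; p++) {
            names.add(expand(template, p));
        }
        return names;
    }

    public static void main(String[] args) {
        // "queue 5" in the submit description creates five jobs.
        for (String name : expandAll("fire_$(PROCESS).out", 5)) {
            System.out.println(name);
        }
    }
}
```

Each of the five jobs therefore gets its own argument and its own output and error files, while all of them share the single log file.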
19
First Attempt: Fall 2005
First, a one hour lecture was given on Condor concepts, including how to write a ClassAd
Then, Condor was used by the students in one hour of the last lab of the semester
Students were given substantial code for an application they could run in Condor: the Game of Life
Students completed the implementation
20
First Attempt: Fall 2005
Then, a scientific question was posed:
“Given a set of input configuration files, which of these will still have living cells after 20 generations of the simulation?”
Answering the question required running the program a lot of times – a great application for grid computing!
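The question for a single configuration can be sketched in Java as below. This is not the code handed out in the lab (that course is taught in C); it assumes the standard Game of Life rules on a bounded grid, and the text format in which `*` marks a living cell is a hypothetical stand-in for the input configuration files.

```java
/** Minimal Game of Life on a bounded grid; cells beyond the edge count as dead. */
class LifeCheck {

    /** Computes the next generation using the standard rules. */
    static boolean[][] step(boolean[][] grid) {
        int rows = grid.length, cols = grid[0].length;
        boolean[][] next = new boolean[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int n = liveNeighbors(grid, r, c);
                // A living cell survives with 2 or 3 neighbors; a dead cell is born with 3.
                next[r][c] = grid[r][c] ? (n == 2 || n == 3) : (n == 3);
            }
        }
        return next;
    }

    /** Counts the living cells among the up to eight neighbors of (r, c). */
    static int liveNeighbors(boolean[][] grid, int r, int c) {
        int count = 0;
        for (int dr = -1; dr <= 1; dr++) {
            for (int dc = -1; dc <= 1; dc++) {
                if (dr == 0 && dc == 0) continue;
                int rr = r + dr, cc = c + dc;
                if (rr >= 0 && rr < grid.length && cc >= 0 && cc < grid[0].length
                        && grid[rr][cc]) {
                    count++;
                }
            }
        }
        return count;
    }

    /** True if at least one cell is alive after the given number of generations. */
    static boolean aliveAfter(boolean[][] grid, int generations) {
        for (int g = 0; g < generations; g++) {
            grid = step(grid);
        }
        for (boolean[] row : grid) {
            for (boolean cell : row) {
                if (cell) return true;
            }
        }
        return false;
    }

    /** Parses equal-length rows where '*' marks a living cell (hypothetical format). */
    static boolean[][] parse(String... rows) {
        boolean[][] grid = new boolean[rows.length][rows[0].length()];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < rows[r].length(); c++) {
                grid[r][c] = rows[r].charAt(c) == '*';
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        boolean[][] blinker = parse(".....", "..*..", "..*..", "..*..", ".....");
        System.out.println("alive after 20 generations: " + aliveAfter(blinker, 20));
    }
}
```

Running one such check per input file is what makes the exercise a natural fit for batch submission: the runs are independent of each other.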
21
First Attempt: Mostly failure
Several concepts were more difficult than we expected:
The batch submission process
Using the computer to solve a scientific problem
Understanding the distributed nature of the application – a failure of the submit machine caused a lot of frustration and many students did not complete the exercise!
22
Second Attempt: Spring 2006
A new application, a fire simulation, was developed that did not require input files
23
Second Attempt: Spring 2006
Again, a scientific question was posed:
“What percentage of the forest will burn with a given probability of a neighbor tree catching on fire?”
[http://www.shodor.org]
• Students were asked to use the grid to run the application many times and graph the results
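A minimal Java sketch of such an experiment follows. It is not the course's fire program; it assumes a square forest in which the center tree is ignited and each burning tree gets one chance to ignite each of its four unburned neighbors with the given probability. The class name, the grid size and the seed are illustrative choices.

```java
import java.util.ArrayDeque;
import java.util.Random;

/** Sketch of a forest fire simulation with a fixed spread probability. */
class FireSim {

    /** Runs one fire on a size x size forest; returns the percentage of trees burned. */
    static double percentBurned(int size, double spreadProbability, Random rng) {
        boolean[][] burned = new boolean[size][size];
        ArrayDeque<int[]> burning = new ArrayDeque<>();
        int mid = size / 2;
        burned[mid][mid] = true;                  // the fire starts at the center tree
        burning.add(new int[] {mid, mid});
        int count = 1;
        int[][] directions = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!burning.isEmpty()) {
            int[] tree = burning.poll();
            for (int[] d : directions) {
                int r = tree[0] + d[0], c = tree[1] + d[1];
                if (r < 0 || r >= size || c < 0 || c >= size || burned[r][c]) {
                    continue;
                }
                if (rng.nextDouble() < spreadProbability) {
                    burned[r][c] = true;
                    burning.add(new int[] {r, c});
                    count++;
                }
            }
        }
        return 100.0 * count / (size * size);
    }

    /** Averages the burned percentage over several independent runs. */
    static double averagePercentBurned(int size, double spreadProbability, int runs, long seed) {
        Random rng = new Random(seed);
        double total = 0;
        for (int i = 0; i < runs; i++) {
            total += percentBurned(size, spreadProbability, rng);
        }
        return total / runs;
    }

    public static void main(String[] args) {
        // One line per probability; the columns can be pasted into a spreadsheet and graphed.
        for (int i = 0; i <= 10; i++) {
            double p = i / 10.0;
            System.out.printf("%.1f %.2f%n", p, averagePercentBurned(21, p, 100, 42L));
        }
    }
}
```

On a grid, each probability (or each repetition) would become its own job, with the results collected and graphed afterwards.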
24
Second Attempt: Only partial success
Again, the results were not completely satisfactory
Students could perform the mechanics of submitting a Condor application, and use Excel to graph the results
They still did not seem to understand the distributed nature of the application
Grid computing seemed to get in the way of understanding the science
25
Third Attempt: In two parts
Fall 2006: We had students do a homework assignment to learn the computational science concepts only – write a program to calculate the heat distribution in a room
This was the last homework assignment of the semester
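One common way to set up such a program is sketched below in Java. The details of the actual homework are not given here, so this is an assumption: the room is a grid with fixed wall temperatures, and each interior cell is repeatedly replaced by the average of its four neighbors (Jacobi iteration) until the values stop changing. All names and temperatures are illustrative.

```java
import java.util.Arrays;

/** Sketch: steady-state heat distribution in a room by Jacobi iteration. */
class HeatRoom {

    /** Builds a grid (at least 3 x 3) with one hot wall (row 0); everything else starts at otherWalls. */
    static double[][] room(int rows, int cols, double hotWall, double otherWalls) {
        double[][] t = new double[rows][cols];
        for (double[] row : t) {
            Arrays.fill(row, otherWalls);
        }
        Arrays.fill(t[0], hotWall);
        return t;
    }

    /**
     * Replaces each interior cell by the average of its four neighbors until no
     * cell changes by more than tolerance. Boundary cells keep their values.
     * Returns the number of sweeps performed.
     */
    static int relax(double[][] t, double tolerance, int maxSweeps) {
        int rows = t.length, cols = t[0].length;
        double[][] next = new double[rows][cols];
        for (int sweep = 1; sweep <= maxSweeps; sweep++) {
            double maxChange = 0;
            for (int r = 1; r < rows - 1; r++) {
                for (int c = 1; c < cols - 1; c++) {
                    next[r][c] = 0.25 * (t[r - 1][c] + t[r + 1][c] + t[r][c - 1] + t[r][c + 1]);
                    maxChange = Math.max(maxChange, Math.abs(next[r][c] - t[r][c]));
                }
            }
            // Copy the new interior values back; the walls are never overwritten.
            for (int r = 1; r < rows - 1; r++) {
                System.arraycopy(next[r], 1, t[r], 1, cols - 2);
            }
            if (maxChange <= tolerance) {
                return sweep;
            }
        }
        return maxSweeps;
    }

    public static void main(String[] args) {
        double[][] t = room(10, 10, 100.0, 0.0);
        int sweeps = relax(t, 1e-6, 100000);
        System.out.println("sweeps: " + sweeps);
        System.out.println("temperature near the hot wall: " + t[1][5]);
        System.out.println("temperature near the far wall: " + t[8][5]);
    }
}
```

The program involves no grid concepts at all, which matches the intent of teaching the computational science idea on its own first.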
26
Third Attempt: In two parts
Spring 2007: In a special studies course, build on the computational concepts
Several assignments were given:
– The use of Unix tools such as cat, sort, and gnuplot
– Complete the fire simulation from Spring 2006
– Study Condor and ClassAds
– Finally, pose a scientific question: “What percentage of the forest will burn with a given probability of a neighbor tree catching on fire?”
28
University of Arkansas Conclusions
Grid computing can be taught to beginning students, but not in the first semester
The infrastructure must be absolutely flawless for this to succeed
29
University of Arkansas Conclusions
Prerequisites to teaching Grid computing include:
– Background in computational concepts and the idea of using the computer to answer a scientific question
– The concept of batch submission
– Basic use of command line Unix tools if command line tools are used, or a portal
30
University of Arkansas Conclusions
Grid computing can be useful to undergraduate science and engineering majors
Curriculum at this level needs to focus on running applications, accessing data, and synthesizing results from the grid computation
31
Barry Wilkinson, University of North Carolina Charlotte
(Moderator)
32
North Carolina State-wide undergraduate course
Taught jointly: UNC-Charlotte and UNC Wilmington.
First taught 2004. Again in 2005 and 2007.
Uses North Carolina’s televideo network NCREN, which connects universities and colleges across the state.
Distributed computing resources at several universities form Grid computing platform.
14 Universities and colleges participated in total.
33
Participating Sites
[Map of North Carolina and neighboring states showing the participating sites; © World Sites Atlas (sitesatlas.com)]
Western Carolina University
UNC Greensboro
Appalachian State University
UNC Asheville
Winston-Salem State University
UNC Chapel Hill
NC State University
NC Central University
Lenoir Rhyne College
UNC Wilmington
Elon University
UNC Pembroke
UNC Charlotte
Wake Tech. Comm. College
34
Undergraduate Grid computing courses
Often take bottom-up approach
– Starting with client-server concepts, creating Web and Grid services, and then progressing through underlying Globus middleware, security mechanisms, and job submission, all using a Linux command-line interface.
Need to raise level to top-down approach
– Introduce students to production Grid tools such as portals, application portlets, workflow tools, and how to Grid-enable applications.
35
Grid Computing platform
A Grid computing platform is needed to teach Grid computing in a realistic setting
Problems arise when many students try to do Grid computing assignments on a Grid or centralized server.
36
Aspects of new North Carolina Grid Course
Now starts with a GridSphere Grid portal to access resources.
Moves to command line assignments later.
Leads to assignment for developing portlets within Grid portal.
Students use their own computers for some assignments.
Student final projects
37
Programming Assignments (Spring 2007)
Assignment 1: Using grid computing portal
Assignment 2: Using the grid through a command line
Assignment 3: Using a scheduler (Condor-G)
Assignment 4: Installing GT4 core. Creating, deploying, and testing a GT4 Grid service.
Assignment 5: Installing and using GridNexus workflow editor to create and execute workflows
Assignment 6: Installing GridSphere and implementing a portlet within GridSphere portal
Assignment 7: MPI assignment on grid
Mini-project: Developing grid computing assignment
Assignments 4, 5, and 6 require students to install significant software packages on their computer.
38
Avoiding problems
It requires immense work to prepare for a hands-on distributed Grid computing course.
It is critical that all assignments are fully tested prior to the start of class, and that all computer systems are reliable and software is maintained.
Assignments went much more smoothly when students were required to use personal computers where possible.