
SIGCSE 2008 Technical Symposium on Computer Science Education

Friday, March 14, 2008

Panel: Grid Computing at the Undergraduate Level: Can We Do It?

Jens Mache, Lewis & Clark College, Portland, Oregon
Thomas Feilhauer, University of Applied Sciences, Dornbirn, Austria
Barry Wilkinson, University of North Carolina Charlotte (Moderator)
Amy Apon, University of Arkansas, Fayetteville


Thomas Feilhauer, University of Applied Sciences, Dornbirn, Austria

Grid Computing Course at FHV

Course Web page: http://www2.staff.fh-vorarlberg.ac.at/~tf/grid/

Senior-level course taught in the last (6th) semester of the computer science bachelor program

Students have to work in a Linux environment, so they should not be afraid of Linux

Prerequisites:
– all students need to have:
  • knowledge of network protocols
  • experience with object-oriented programming
  • good working knowledge of Java
  • basics of client/server programming (Web apps)
  • fundamental knowledge of XML
– most students have (in addition to the above) knowledge of:
  • RPC/RMI
  • JNDI (naming & directory service)
  • CORBA
  • JavaEE
  • Java Web services (Apache Axis)

How did we proceed?

Web services
– standards: WSDL, SOAP
– tools: Apache Axis

State in Web services
– define "resource"
– standards: WSRF
  • WS-Addressing, WS-ResourceProperties, WS-ResourceLifetime, WS-Notification

Frameworks and tools for Grid applications
– GT4 Java WS core
– scheduler: Condor
– database access: OGSA-DAI
– gLite (EGEE)

Problems faced

Most of the problems are not specific to teaching Grid computing, but to developing applications within the Grid environment in general

Lots of specifications & standards for underlying technologies

Lots of (mechanical) steps need to be performed to get the first program running
– dependencies between the steps
– lots of different command line tools for code generation & deployment
– lots of different files to maintain and keep consistent
  • WSDL file, Java files, WSDD file, JNDI deployment file, Ant file
  • dependencies & redundancies are error prone

Existing tutorials on GT4
– explanations often oversimplified
– students tend to rush through the examples
– students miss out on understanding code and tool interaction
– positive experiences with the tutorials on Condor and OGSA-DAI

Need appropriate IDEs
– e.g. Introduce, gEclipse
– under development

Steps in developing a GT4 App.

1. Define service's interface in WSDL
   – adapt template WSDL file for resources and services
2. Optionally: Use WSDL2Java to generate framework classes for service implementation
3. Resource implementation
4. Resource Home implementation
5. Service implementation
6. Provide WSDD Deployment Descriptor
7. Provide JNDI deployment file
8. Implement client
9. Adapt Ant build file build.xml
10. Build service using Ant
11. Deploy service
12. Invoke service using the client

Experiences

Very high motivation among students
– elective course
– attractive and relevant topic

Good mixture of theory and practice
– lots of examples
– start with simple examples, stepwise extended to more complex ones
  • singleton resource
  • multiple resources
  • finding a resource by querying resource properties
  • destroying resources

Concentrate on the relevant parts of the specifications

Provide templates for configuration and build files
– available from tutorials, e.g. by Sotomayor

Use communication (collaboration) or sequence diagrams to explain relationships and message flow between different objects

Explain and discuss the code step-by-step

All students passed the (first) exam
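
The progression of examples named on this slide (a single resource, multiple resources, finding a resource by querying resource properties, destroying resources) can be illustrated without any Grid middleware. The following is a minimal in-memory sketch in Python, not GT4 or WSRF code. The class `ResourceHome` and its methods `create`, `get`, `find_by_property`, and `destroy` are hypothetical names chosen only for illustration.

```python
import uuid


class ResourceHome:
    """Hypothetical in-memory stand-in for a resource home (not the GT4 API)."""

    def __init__(self):
        self._resources = {}  # resource key -> dict of resource properties

    def create(self, **properties):
        """Create a resource and return the key that identifies it."""
        key = str(uuid.uuid4())
        self._resources[key] = dict(properties)
        return key

    def get(self, key):
        """Look up a resource by key; raises KeyError if it was destroyed."""
        return self._resources[key]

    def find_by_property(self, name, value):
        """Return the keys of all resources whose property `name` equals `value`."""
        return [k for k, props in self._resources.items() if props.get(name) == value]

    def destroy(self, key):
        """Remove a resource immediately."""
        del self._resources[key]


home = ResourceHome()
a = home.create(owner="alice", value=0)
b = home.create(owner="bob", value=10)
home.get(a)["value"] += 5                 # state is kept between calls
matches = home.find_by_property("owner", "bob")
home.destroy(b)
remaining = home.find_by_property("owner", "bob")
```

In this sketch the key plays the role that a resource identifier plays for a stateful service: the client holds the key, and the state lives on the service side.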


Jens Mache, Lewis & Clark College, Portland, Oregon

200/300-level course that covers grid and network programming

Assignment 1: web concepts (HTTP)
Assignments 2-4: sockets (Java)
Assignments 5+6: RMI
Assignment 7: web services (Apache Axis)
Assignment 8: grid “math” service
Assignment 9: grid “sticky note” service
Mini-project: bigger example, e.g. “File buy”

Steps in the “math” assignment

0. Setting up the environment
1. Defining the interface in WSDL
2. Implementing the service in Java
3. Configuring the deployment in WSDD
4. Build the Math service (Create a GAR file)
5. Deploy the Math service
6. Write and compile the client
7. Start the container and execute the client

All of the above steps are mostly done for you!

8. Add functionality to the service

“Math” assignment

Write .wsdl & .java

Compile & deploy

Re-start container

Write client

Compile & execute

“Sticky note” assignment

1. Getting Started: Deploy a Service
2. State Management Part I: Create Resources
3. Lifetime Management Part I: Destroy Resources
4. State Management Part II: Add a Resource Property
5. Aggregating Resources: Register with a Local Index
6. Building a VO: Register with a Community Index
7. Lifetime Management Part II: Lease-based Model
8. Notification: Resource as Notification Producer
9. Discovery: Find a Resource
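
Step 7 contrasts with step 3: instead of destroying a resource explicitly, a lease-based model gives each resource a termination time that the client must keep extending. The sketch below shows the idea in plain Python with an explicit clock value passed in, so it does not depend on real time. It is not Globus code; `LeasedResources` and its methods are hypothetical names.

```python
class LeasedResources:
    """Hypothetical lease table: each resource has a termination time."""

    def __init__(self):
        self._termination = {}  # resource name -> termination time (seconds)

    def create(self, name, now, lease_seconds):
        self._termination[name] = now + lease_seconds

    def renew(self, name, now, lease_seconds):
        """Extend the lease; fails if the resource is unknown or already expired."""
        if name not in self._termination or self._termination[name] <= now:
            raise KeyError(name)
        self._termination[name] = now + lease_seconds

    def sweep(self, now):
        """Destroy every resource whose lease has run out; return their names."""
        expired = sorted(n for n, t in self._termination.items() if t <= now)
        for n in expired:
            del self._termination[n]
        return expired

    def alive(self):
        return sorted(self._termination)


notes = LeasedResources()
notes.create("note-1", now=0, lease_seconds=60)
notes.create("note-2", now=0, lease_seconds=60)
notes.renew("note-1", now=50, lease_seconds=60)   # note-1 now lives until t=110
expired_at_70 = notes.sweep(now=70)               # note-2 was never renewed
```

The design point is that a client that crashes or forgets about a resource does not leave it behind forever: without renewal, the resource is removed at the next sweep.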

Recommendations

Cover network programming in Java and RMI
– introduces important concepts (stub compilation and interfaces versus implementations)

Cover web services, XML and WSDL.

Cover the basics of certificates
– at least step-by-step, and with theoretical background if possible
– typically, one cannot even start a grid service without certificates

Do not underestimate the time and effort required to set up the required software.
– A viable alternative to one server shared by all students is installing a stand-alone container on individual student computers.

Follow a basic grid service exercise with a second, more advanced grid exercise.

Prerequisites

– the client/server paradigm
– XML
– web services
– network security ?
– network programming in Java

Unlike the prerequisites for cluster computing: algorithms, message passing in C or Fortran


Amy Apon, University of Arkansas, Fayetteville

University of Arkansas: Teaching Grid Computing to Beginning Programmers

Our beginning programming class is taken by both computer science majors and advanced students in science and engineering courses

Course is taught in C and includes a weekly lab

We wanted to introduce grid computing as a research tool to the students in this class

This meant teaching grid computing to freshmen computer science students

The Ultimate Target Grid Platform: GPN Grid

GPNGrid was developed as a virtual organization within the Open Science Grid

Open Science Grid uses Condor for workload management

The Actual Student Platform – a Condor pool on our local cluster

We configured Condor on a small cluster of about 30 computers, with a single submit node that the students logged in to

Condor is based on the idea of a ClassAd

universe = vanilla
executable = fire
arguments = $(PROCESS)
output = fire_$(PROCESS).out
error = fire_$(PROCESS).error
log = fire.log
queue 5
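
With `queue 5`, Condor queues five jobs from this one description and substitutes the process number, 0 through 4, wherever `$(PROCESS)` appears. The Python sketch below only mimics that substitution to show which arguments and file names result. It is an illustration, not how Condor is implemented, and the helper `expand` is a hypothetical name.

```python
SUBMIT = {
    "executable": "fire",
    "arguments": "$(PROCESS)",
    "output": "fire_$(PROCESS).out",
    "error": "fire_$(PROCESS).error",
    "log": "fire.log",
}


def expand(submit, count):
    """Return one dict per queued job with $(PROCESS) replaced by 0..count-1."""
    return [
        {key: value.replace("$(PROCESS)", str(process)) for key, value in submit.items()}
        for process in range(count)
    ]


jobs = expand(SUBMIT, 5)
for job in jobs:
    print(job["executable"], job["arguments"], "->", job["output"])
```

Each job therefore writes to its own output and error file, while all five share the single log file `fire.log`.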

First Attempt: Fall 2005

First, a one hour lecture was given on Condor concepts, including how to write a ClassAd

Then, Condor was used by the students in one hour of the last lab of the semester

Students were given substantial code for an application they could run in Condor: the Game of Life

Students completed the implementation

Then, a scientific question was posed:

“Given a set of input configuration files, which of these will still have living cells after 20 generations of the simulation?”

Answering the question required running the program a lot of times – a great application for grid computing!
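
For a single configuration, the question reduces to running 20 generations and checking whether any cell is still alive. The students' program was written in C and is not shown in the slides, so the following Python sketch is only an illustration under stated assumptions: the standard Game of Life rules, a finite grid whose outside cells are always dead, and a configuration given as a set of live cell coordinates. The function names are hypothetical.

```python
def step(live, rows, cols):
    """Advance one generation; `live` is a set of (row, col) live cells."""
    counts = {}
    for r, c in live:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (dr, dc) != (0, 0):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        counts[(nr, nc)] = counts.get((nr, nc), 0) + 1
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


def alive_after(live, rows, cols, generations=20):
    """True if any cell is still alive after the given number of generations."""
    for _ in range(generations):
        live = step(live, rows, cols)
    return len(live) > 0


block = {(1, 1), (1, 2), (2, 1), (2, 2)}   # a 2 x 2 block never changes
lonely = {(1, 1), (3, 3)}                  # isolated cells die in one generation
```

Because every configuration is checked independently of the others, each one can be submitted as its own batch job.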

First Attempt: Mostly failure

Several concepts were more difficult than we expected:
– The batch submission process
– Using the computer to solve a scientific problem
– Understanding the distributed nature of the application (a failure of the submit machine caused a lot of frustration, and many students did not complete the exercise!)

Second Attempt: Spring 2006

A new application, a fire simulation, was developed that did not require input files

Again, a scientific question was posed:

“What percentage of the forest will burn with a given probability of a neighbor tree catching on fire?”

[http://www.shodor.org]

Students were asked to use the grid to run the application many times and graph the results
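
The slides do not show the fire simulation itself, which was written in C. The Python sketch below is one plausible reading of the question, under these assumptions: the forest is a square grid with a tree in every cell, the fire starts at the centre tree, and each burning tree gets one chance to ignite each unburned neighbour to its north, south, east, and west with the given probability. The function names are hypothetical.

```python
import random


def percent_burned(size, prob, rng):
    """Burn a size x size forest starting at the centre; return % of trees burned."""
    burned = {(size // 2, size // 2)}
    frontier = [(size // 2, size // 2)]
    while frontier:
        r, c = frontier.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < size and 0 <= nc < size
            if inside and (nr, nc) not in burned and rng.random() < prob:
                burned.add((nr, nc))
                frontier.append((nr, nc))
    return 100.0 * len(burned) / (size * size)


def average_burn(size, prob, runs, seed=0):
    """Average the burned percentage over several independent runs."""
    rng = random.Random(seed)
    return sum(percent_burned(size, prob, rng) for _ in range(runs)) / runs
```

A single run is random, so one result says little. Averaging many runs per probability is what makes the problem a natural fit for submitting many independent jobs.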

Second Attempt: Only partial success

Again, the results were not completely satisfactory

Students could perform the mechanics of submitting a Condor application, and use Excel to graph the results

They still did not seem to understand the distributed nature of the application

Grid computing seemed to get in the way of understanding the science

Third Attempt: In two parts

Fall 2006: We had students do a homework assignment to learn the computational science concepts only – write a program to calculate the heat distribution in a room

This was the last homework assignment of the semester
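
The slides do not say how the heat distribution was to be computed. One common classroom approach, assumed here, is Jacobi iteration on a 2D grid: the walls are held at fixed temperatures and each interior cell is repeatedly replaced by the mean of its four neighbours. The grid size and wall temperatures in this Python sketch are made-up values for illustration.

```python
def relax(grid, iterations):
    """Jacobi iteration: each interior cell becomes the mean of its 4 neighbours."""
    rows, cols = len(grid), len(grid[0])
    for _ in range(iterations):
        new = [row[:] for row in grid]          # boundary cells are copied unchanged
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                new[r][c] = (grid[r - 1][c] + grid[r + 1][c]
                             + grid[r][c - 1] + grid[r][c + 1]) / 4.0
        grid = new
    return grid


# 5 x 5 room: top wall held at 100 degrees, other walls at 0 (assumed values).
room = [[100.0] * 5] + [[0.0] * 5 for _ in range(4)]
steady = relax(room, 200)
```

After enough iterations the interior settles into a smooth gradient from the hot wall to the cold walls.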

Spring 2007: In a special studies course, build on the computational concepts

Several assignments were given:
– The use of Unix tools such as cat, sort, and gnuplot
– Complete the fire simulation from Spring 2006
– Study Condor and ClassAds
– Finally, pose a scientific question: “What percentage of the forest will burn with a given probability of a neighbor tree catching on fire?”

Third Attempt: Success

Use Condor to run over 10,000 simulations, graph the results
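
With thousands of runs, the results have to be collected from many per-job output files before they can be graphed. The slides do not describe the output format, so the Python sketch below assumes a hypothetical one: each `fire_N.out` file holds lines of the form `probability percent_burned`. The demo files it writes contain made-up placeholder values whose only purpose is to exercise the function.

```python
import glob
import os
import tempfile
from collections import defaultdict


def summarize(directory):
    """Average the burned percentage for each probability found in fire_*.out."""
    totals = defaultdict(lambda: [0.0, 0])          # prob -> [sum, count]
    for path in glob.glob(os.path.join(directory, "fire_*.out")):
        with open(path) as handle:
            for line in handle:
                prob, burned = map(float, line.split())
                totals[prob][0] += burned
                totals[prob][1] += 1
    return {prob: s / n for prob, (s, n) in sorted(totals.items())}


# Placeholder demo files (made-up values, not results from the course).
demo = tempfile.mkdtemp()
for i, text in enumerate(["0.3 10.0\n", "0.3 20.0\n", "0.6 80.0\n"]):
    with open(os.path.join(demo, f"fire_{i}.out"), "w") as handle:
        handle.write(text)
summary = summarize(demo)
```

The resulting table of one averaged value per probability is the kind of data that can be handed to a plotting tool.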

University of Arkansas Conclusions

Grid computing can be taught to beginning students, but not in the first semester

The infrastructure must be absolutely flawless for this to succeed

Prerequisites to teaching Grid computing include:
– Background in computational concepts and the idea of using the computer to answer a scientific question
– The concept of batch submission
– Basic use of command line Unix tools if command line tools are used, or a portal

Grid computing can be useful to undergraduate science and engineering majors

Curriculum at this level needs to focus on running applications, accessing data, and synthesizing results from the grid computation


Barry Wilkinson, University of North Carolina Charlotte (Moderator)

North Carolina State-wide undergraduate course

Taught jointly by UNC Charlotte and UNC Wilmington.

First taught in 2004, and again in 2005 and 2007.

Uses North Carolina’s televideo network NCREN, which connects universities and colleges across the state.

Distributed computing resources at several universities form the Grid computing platform.

14 universities and colleges participated in total.

Participating Sites

– Western Carolina University
– UNC Greensboro
– Appalachian State University
– UNC Asheville
– Winston-Salem State University
– UNC Chapel Hill
– NC State University
– NC Central University
– Lenoir Rhyne College
– UNC Wilmington
– Elon University
– UNC Pembroke
– UNC Charlotte
– Wake Tech. Comm. College

[Map of North Carolina showing the participating sites; © World Sites Atlas (sitesatlas.com)]

Undergraduate Grid computing courses

Often take bottom-up approach
– Starting with client-server concepts, creating Web and Grid services, and then progressing through underlying Globus middleware, security mechanisms, and job submission, all using a Linux command-line interface.

Need to raise level to top-down approach
– Introduce students to production Grid tools such as portals, application portlets, workflow tools, and how to Grid-enable applications.

Grid Computing platform

A Grid computing platform is needed to teach Grid computing in a realistic setting

Problems arise when many students try to do Grid computing assignments on a Grid or centralized server.

Aspects of new North Carolina Grid Course

Now starts with a GridSphere Grid portal to access resources.

Moves to command line assignments later.

Leads to assignment for developing portlets within Grid portal.

Students use their own computers for some assignments.

Student final projects

Programming Assignments (Spring 2007)

Assignment 1: Using grid computing portal
Assignment 2: Using the grid through a command line.
Assignment 3: Using a scheduler (Condor-G)
Assignment 4: Installing GT4 core. Creating, deploying, and testing a GT4 Grid service.
Assignment 5: Installing and using GridNexus workflow editor to create and execute workflows.
Assignment 6: Installing GridSphere and implementing a portlet within the GridSphere portal.
Assignment 7: MPI assignment on grid
Mini-project: Developing grid computing assignment

Assignments 4, 5, and 6 require students to install significant software packages on their computer.

Avoiding problems

It requires immense work to prepare for a hands-on distributed Grid computing course.

It is critical that all assignments are fully tested prior to the start of class and that all computer systems are reliable and the software is maintained.

Assignments went much more smoothly when students were required to use personal computers where possible.