DPEAS Training Session April 19, 2005
DPEAS Training Session
Dr. Andrew S. Jones, Mr. Phil Shott, and Mr. John Forsythe
Cooperative Institute for Research in the Atmosphere (CIRA)
Colorado State University (CSU)
Fort Collins, CO
April 19, 2005 / Suitland, MD
What is it?
• Data processing system for “large” data analysis tasks using common PCs
• Features:
  – Parallel implementation
  – Web-based documentation and monitoring
  – Incorporates a Fortran interpreter for input tasks
  – Virtualized I/O subsystem (only memory-resident data structures are needed; data algorithms now function like a model)
  – Able to fail over to redundant hardware
  – Extensible User Module
• Error Analysis code is still under development
• Implemented on Windows NT/2000/2003 OS
What Does it Do?
• Global merge capabilities for numerous data sets
• Current system in operational use for 7+ years at CIRA
• Simplifies
  – Powerful abstraction layers allow anyone to write parallel code
  – Virtual I/O subsystem reduces end-user code complexities
  – Users interact using a language most already know
• Easily Scales
  – Limited process “cross-talk” improves scaling behavior
  – Tests have shown that a 2000-machine cluster is physically feasible.
  – Basically… just add hardware.
10+ Data Types Are Currently Supported
• Reads and writes HDF-EOS natively
• GOES IMAGER (McIDAS)
• NOAA AVHRR GAC and LAC (McIDAS)
• NOAA AMSU-A and B (HDF-EOS)
• DMSP SSM/I (Byte Stream)
• DMSP SSM/T-2 (NGDC OIS)
• DMSP OLS (NGDC OIS)
• TRMM TMI and VIRS (HDF)
• User extensible… (your format here)
The Big Picture
• DPEAS can run in what is called “failover mode”
  – This means that if hardware fails on one cluster of machines, the cluster can migrate to another cluster configuration automatically
  – This is an optional advanced capability
• When first learning about DPEAS, focus on the single-node capabilities, then focus on the parallel capabilities if needed
PC Compiler Basics
• The compiler documentation for the MS Visual Studio C compiler and the Compaq/Intel Visual Fortran compiler is extensive.
• The following is a quick-start “how-to”…
Creating a New Project (taken from online help)
Defining Your Project
To define your project, you need to:
1. Create the project
2. Populate the project with files
3. Choose a configuration
4. Define build options, including project settings
5. Build (compile and link) the project
To create a new project:
1. Click the File menu and select New. A dialog box opens that has the following tabs:
   • Files
   • Projects
   • Workspaces
   • Other Documents
2. The Projects tab displays various project types. Click the type of Fortran project to be created. If you have other Visual tools installed, make sure you select a Fortran project type. You can set the Create New Workspace check box to create a new Workspace.
3. Specify the project name and location.
4. Click OK to create the new project. Depending on the type of project being created, one or more dialog boxes may appear, allowing you either to create the project without source files or to create a template-like source file.
5. If a saved Fortran environment exists for the Fortran project type being created, you can also import a Fortran environment to provide default project settings for the new project (see Saving and Using the Project Setting Environment for Different Projects).
6. This action creates a project workspace and one project. It also leaves the project workspace open.
To discontinue using this project workspace, click Close Workspace from the File menu.
To open the project workspace later, in the File menu, click either Open Workspace or Recent Workspaces.
Project type: Win32 Console Application, almost always
The project builds in the Release or Debug directory; you can change the configuration in the workspace
How do I create a program?
• On Windows, you create a project (almost always a Win32 console application) by pointing and clicking. This creates a makefile, which the user shouldn’t have to work with.
• The makefile describes the files to build, their dependencies, compiler settings, etc.
• On Unix / Linux, you edit a makefile directly, including compiler settings [might have evolved…]
Files in Project
Important Compiler / Linker Settings
The Developer Studio Environment and Debugger is Powerful!
Features
• “Live” cursor
• Set / remove breakpoints, run to cursor
• Drag and drop variable view in watch windows
• Exception settings
• Visual array viewer!
• All point and click
The editor can also be useful for non-coding editing (e.g. deleting a column of text)
If you get confusing assembly language windows, make sure the correct context is selected
Watch window
DPEAS Code Specifics
• The DPEAS code base is large
• All modules are placed into a structured directory layout
• User code generally stays at a “high level” and is isolated from the rest of the DPEAS system modules
• Low-level library modifications are rarely needed; this encourages reuse of code
13DPEAS Training Session April 19, 2005
Module ContextGUIs
Command Shell Interpreter
Internet InformationServices
Web Browser
Other Applications
DPEAS Fortran Interpreter
DPEAS HDF-EOSVirtual I/O Subsystem
Analysis Modules User Modules
DPEASSystemState
Batch Job Client
TranslationModules
OutputModules
Operating System (Windows 2000)
Explorer Command Line
DPEAS Data Processing Engine
Sp
awn
Su
bta
skDPEAS Input Script
Command Line Script
DP
EA
S S
ub
task
Batch JobService
This is DPEAS
How to Use DPEAS with the MS Visual Studio Environment
Active Configuration Selector
Build Tools
I’ve found that using “Build | Update All Dependencies…” helps the Visual Studio makefile perform better.
I use it at the start of each programming session.
How Do I Modify My DPEAS Parallel Configuration?
• All DPEAS parallelism configuration is performed through ASCII text files called “resource files”.
• The resource files are located at: “.\configuration\resource\*.txt”.
• You can update resource files while DPEAS is running.
• Your resource files are your means of controlling DPEAS parallelism behavior.
• What is the default configuration? By default, all DPEAS configurations are capable of submitting parallel jobs; however, as a safety precaution, only your local machine has a resource file initially created.
Frequently Asked Configuration Questions
• I want more power: To use additional machines, create a resource file for each machine. You can use a copy of your local machine’s resource file as a template.
• I want more control: The resource files also allow you to specify hardware capabilities (or “not to exceed” limits) and scheduling preferences.
• Sometimes, such as when debugging, I don’t want to run in parallel mode: To disable parallel capabilities, use the statement “CALL DPE_SLAVE” in your DPEAS input scripts.
Frequently Asked Usage Questions
• Can I interleave DOS or Unix shell commands within the DPEAS input script file? No, DPEAS only understands Fortran syntax. You must place DOS commands either before or after the DPEAS command line.
Things to Remember!
• Start with simple-small-quick examples
  – Simple/Idealized cases
  – Minimal number of files
  – Small array sizes
• Disable parallel execution
  – Add the statement “Call DPE_SLAVE” to your DPEAS input script file to turn off DPEAS parallelism
• Use the debugger! It’s easy and it can show you what your code is really doing
  – Use breakpoints to stop inside your own code
  – Verify that your code segment was entered
  – Verify that your code segment was exited
More Debugging Suggestions
• Additional useful DPEAS statements
  – Use “Call DPE_WRITE_DATA_STRUCTURE” to list the contents of the virtual I/O data structure
  – Use “Call DPE_WRITE_VARIABLES” to list the contents of the DPE Fortran interpreter variables
• Caution about modifying the DPEAS internals
  – Please do not modify any existing DPEAS modules other than “user_module.f90”
  – DPEAS is designed to hide its complexity
  – DPEAS contains over 260,000 lines of integrated code
  – Of course, you can add as many new modules as you need
How to Run in Debug Mode with the MS Visual Studio Environment
• Select “Win32 Debug” in the active configuration selector dialog box
• Set a breakpoint in the source code using the Build toolbar’s “hand” (F9)
• Start execution in debug mode: select “Go” (F5) from the Build toolbar
• Wait until the program reaches one of your breakpoints
• Examine variables, set watches, etc.
• Stop the Debugger or wait for the program to exit on its own
How to Run in Release Mode with the MS Visual Studio Environment
• Select “Win32 Release” in the active configuration selector dialog box
• Start execution in release mode: select “Execute Program” (Ctrl+F5) from the Build toolbar
• The program’s command line window will remain open after the program exits
• Close the command line window when done
How to Run DPEAS in Release Mode with the Batch Job Server Client
• Verify that BJS is installed and that you have appropriate BJS user privileges
• At a minimum, your user account should belong to the following local user groups on each machine on which you intend to run DPEAS in parallel mode:
  – “Batch Users”
  – “Batch Job Dir Users”
• Submit the file “.com\DPEAS release.bat” with an argument containing the path of the DPEAS input script file relative to the DPEAS executable
How to Run DPEAS in Parallel Mode
• Run as before, but
  1. Remove or comment out any “call DPE_SLAVE” statements in your DPEAS input script file
  2. Create DPEAS resource files for the computers you wish to run DPEAS on
The resource files are named: “.\configuration\resource\<computername>.txt”
Feel free to copy an existing resource file as a template and then use Notepad to edit the file contents as appropriate
Security is handled at the network domain level. The resource files only inform DPEAS of potential resources that are available; they do not grant access to those resources.
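The copy-a-template step can be scripted when many machines are involved. The following is a minimal sketch in Python, not part of DPEAS: it assumes only the naming rule from the slide (one “<computername>.txt” file per machine under “configuration\resource”). The function names and the machine names used with them are hypothetical, and the contents of the resource file are copied unchanged, so each copy still needs to be edited by hand.

```python
import shutil
from pathlib import Path

# Directory from the slide, taken relative to the DPEAS working directory (assumption).
RESOURCE_DIR = Path("configuration") / "resource"


def resource_file_for(computer_name: str, root: Path = Path(".")) -> Path:
    """Return <root>/configuration/resource/<computername>.txt."""
    return root / RESOURCE_DIR / f"{computer_name}.txt"


def add_machines(template_machine: str, new_machines: list[str],
                 root: Path = Path(".")) -> list[Path]:
    """Copy the template machine's resource file once per new machine.

    Files that already exist are left untouched so hand edits are never
    overwritten. Returns the list of files that were created.
    """
    template = resource_file_for(template_machine, root)
    created = []
    for name in new_machines:
        target = resource_file_for(name, root)
        if not target.exists():
            shutil.copyfile(template, target)
            created.append(target)
    return created
```

For example, `add_machines("MYPC", ["NODE01", "NODE02"])` would create two new files next to “MYPC.txt”. Because DPEAS accepts resource file updates while it is running, the copies can be edited afterwards without restarting anything.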
How to Monitor Parallel Mode DPEAS Runs with the Batch Job Server Client
DPEAS-submitted jobs are named: “DPE_AAAAA_NNNNN”
The name shows the job’s pedigree: “Instance” and “Iteration”
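When a job list contains many entries, the naming pattern makes it easy to pick out DPEAS jobs. The sketch below is illustrative Python rather than DPEAS or BJS code. It assumes only that a job name is “DPE_”, a field, an underscore, and a second field. The slide does not say which field is the instance and which is the iteration, so the two fields are returned under the placeholder names from the slide.

```python
import re

# Assumed layout: DPE_<AAAAA>_<NNNNN>; field widths and character sets are not checked.
JOB_NAME = re.compile(r"^DPE_(?P<aaaaa>[^_]+)_(?P<nnnnn>[^_]+)$")


def parse_job_name(name: str):
    """Split a DPEAS job name into its two pedigree fields, or return None."""
    match = JOB_NAME.match(name)
    if match is None:
        return None
    return {"aaaaa": match.group("aaaaa"), "nnnnn": match.group("nnnnn")}


def dpeas_jobs(job_names):
    """Keep only the names that follow the DPEAS naming pattern."""
    return [name for name in job_names if parse_job_name(name) is not None]
```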
An example of a DPEAS input script file
How DPEAS Starts
Program Start
DPEAS Initialization
Interpreting DPEAS script declarations
Interpreting DPEAS script executable statements
How DPEAS Ends
Program End
DPEAS Summary
Interpreting DPEAS script executable statements
How Are Spawned Input Scripts and Jobs Created?
• All spawned DPEAS jobs run DPEAS input scripts that the data processing engine generates from the Master DPEAS input script (The examples shown previously were examples of DPEAS machine-generated code)
• This is automated within DPEAS, and the user code goes along for the free ride since it is part of the DPEAS executable (it’s like meeting a friendly virus which helps to spread your code along with it)
What Does DPEAS Parallelism Look Like?
Do loop contents are sent to other resources in parallel
The new jobs run the same “DPEAS.exe”, but execute only the subtask operations
Completed jobs allow additional jobs to start
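The scheduling idea on this slide can be imitated on a single machine. The following Python sketch is an analogy only: threads in one process stand in for spawned “DPEAS.exe” jobs on other machines, and `run_subtask` and `run_do_loop` are hypothetical names. How DPEAS and the Batch Job Server actually dispatch work is not shown in these slides.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subtask(iteration: int) -> int:
    """Stand-in for one loop iteration executed as a spawned subtask."""
    return iteration * iteration


def run_do_loop(first: int, last: int, max_jobs: int) -> list[int]:
    """Run iterations first..last (inclusive, like a Fortran DO loop).

    At most max_jobs iterations run at once. A queued iteration starts as
    soon as a running one finishes, which mirrors "completed jobs allow
    additional jobs to start".
    """
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        # map() returns results in loop order even if jobs finish out of order.
        return list(pool.map(run_subtask, range(first, last + 1)))
```

Calling `run_do_loop(1, 5, max_jobs=2)` runs five iterations with never more than two in flight. In DPEAS the corresponding limit comes from the resource files rather than from a function argument.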
DPEAS Error Handling Behaviors
• DPEAS attempts to handle abnormal terminations and will exit with the appropriate status automatically
• The following error levels are recognized:
Error Level      Behavior
Success          DPEAS continues
Informational    DPEAS continues
Warnings         DPEAS continues
Errors           DPEAS conditionally terminates (depending on where it is and what it is doing)
Fatal Errors     DPEAS terminates
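The table amounts to a small decision rule, sketched below in Python for illustration. The numeric values of the levels and the `context_requires_stop` flag are assumptions. The slides say only that termination on an “Errors” level depends on where DPEAS is and what it is doing, without giving the actual conditions.

```python
from enum import IntEnum


class ErrorLevel(IntEnum):
    """The five levels from the table; the numeric values are assumed."""
    SUCCESS = 0
    INFORMATIONAL = 1
    WARNING = 2
    ERROR = 3
    FATAL = 4


def should_terminate(level: ErrorLevel, context_requires_stop: bool = False) -> bool:
    """Return True when processing should stop for the given level.

    context_requires_stop is a placeholder for the unspecified conditions
    under which an ERROR becomes terminal.
    """
    if level == ErrorLevel.FATAL:
        return True
    if level == ErrorLevel.ERROR:
        return context_requires_stop
    return False
```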
How to Abort Parallel Mode DPEAS Runs with the Batch Job Server Client
To abort a parallel mode job manually:
1. Cancel the Master job
2. Allow Spawned jobs to run and complete normally (i.e., “do nothing” – my personal favorite), or
3. Cancel Spawned jobs manually (this works even if they were submitted by another user)
   a) Examine the Spawned job names, e.g. “DPE_AAAAA_nnnnn”
   b) Use the BJS Client to connect to the specified machine
   c) Cancel the remaining spawned jobs (remember, they may have finished by the time you get to them)
   d) If you cancel another user’s job, their master job will continue to wait until the canceled job is manually restarted on another system, so please communicate with each other if you do this.
How to Restart Spawned DPEAS Subtasks from a Parallel Mode DPEAS Run with the Batch Job Server Client
How do I modify DPEAS?
• All user routines should interface through the module: “user_module.f90”
• Each DPEAS user routine added requires a wrapper to interface correctly to the DPEAS interpreter
• All virtual I/O data interfaces are through the DPEAS libraries
• The principal DPEAS library statements are:
  – generic function “found”
  – generic subroutine “allocate_hdfeos”
  – generic overloaded operator “=”
The 3 Programming Steps to Add a User Routine to DPEAS
1. Insert a program “hook”
   The program hook makes the main DPEAS program aware of the existence of your wrapper routine.
2. Create a wrapper routine
   The wrapper routine tells the DPEAS Fortran interpreter how to parse and interact with your application subroutine arguments.
3. Create an application routine
   The application routine performs the “real” work. You can do anything you want within the application routine.
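The division of labor among the three pieces can be seen in a toy interpreter. The Python sketch below is an analogy, not DPEAS code, which is Fortran and lives in “user_module.f90”. Every name in it (`scale_values`, `USER_ROUTINES`, `interpret_call`) is hypothetical, and the statement parsing is deliberately simplistic. The definitions appear in reverse step order only because Python needs a name defined before it is referenced.

```python
# Step 3: the application routine does the "real" work on typed arguments.
def scale_values(values, factor):
    return [v * factor for v in values]


# Step 2: the wrapper turns raw script arguments into what the application needs.
def scale_values_wrapper(raw_args, variables):
    name, factor_text = raw_args
    return scale_values(variables[name], float(factor_text))


# Step 1: the hook makes the interpreter aware that the wrapper exists.
USER_ROUTINES = {"SCALE_VALUES": scale_values_wrapper}


def interpret_call(statement, variables):
    """Interpret one statement of the form 'CALL NAME(arg1, arg2)'."""
    head, _, arg_text = statement.strip().partition("(")
    keyword, routine = head.split()
    if keyword.upper() != "CALL":
        raise ValueError("only CALL statements are handled in this sketch")
    raw_args = [arg.strip() for arg in arg_text.rstrip(")").split(",")]
    return USER_ROUTINES[routine.upper()](raw_args, variables)
```

With this in place, `interpret_call("CALL SCALE_VALUES(tb, 2.0)", {"tb": [1.0, 2.5]})` looks up the wrapper through the hook, and the wrapper hands typed arguments to the application routine. Adding another routine touches only these three places, never the interpreter itself.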
How does the “User_Module.f90” relate to my DPEAS Input Scripts?
[Figure: flow diagram. Labels: User_Module.f90 (Program Hook, Wrapper Routine, Application Routine); Ordinary Fortran Compiler; Compile; Interpret; DPEAS Input Script; “DPEAS.exe” Interprets DPEAS Input Script; Automated Parallelization Using Self-Replication; Subtask DPEAS Input Script; “DPEAS.exe” Interprets DPEAS Input Script (subtask); End; Return to Master.]
User Example: The user’s program hook
2 lines of code
User Example: The user’s wrapper routine
4 lines of executable code
User Example: The user’s application routine
Declarations (1 of 2)
User Example: The user’s application routine
Declarations (2 of 2)
Pointers to the virtual I/O data structures
Define arrays as pointers
User Example: The user’s application routine
Using existing Virtual I/O data structures
One function is used to find ALL virtual I/O data structure pointers
User Example: The user’s application routine
Creating new Virtual I/O data structures
One subroutine is used to allocate ALL virtual I/O data structures
User Example: The user’s application routine
Using the virtual I/O data via pointers
1. Find each MW channel
2. Allocate a new output array data structure
Your science code looks like this
User Example: Usage of the new user routine in a DPEAS input script file
User Example: The results: Complete integration
The new user routine is now fully integrated into DPEAS
User Example: The output HDF-EOS file
User Example: The output image representation
150 GHz Effective Emissivity
Calculated from: GOES-08 IMAGER and NOAA-15 AMSU-B
User Example: Summary
• Creates 2 new routines:
  – Wrapper routine
  – Application routine
• Requires 25 lines of executable code:
  – 2 – Program hook
  – 4 – Wrapper routine
  – 19 – Application routine
    • 2 – Variable assignments
    • 3 – Science algorithm
    • 14 – Virtual I/O library calls (using only 2 Virtual I/O library routines)
Small overhead for gaining massive parallelism capabilities!
User Example: How complex would the user routine be, if written without the Virtual I/O library?
• Creates 2 new routines:
  – Wrapper routine
  – Application routine
• Requires 59 lines of executable code:
  – 2 – Program hook
  – 4 – Wrapper routine
  – 53 – Application routine
    • 2 – Variable assignments
    • 3 – Science algorithm
    • 48 – HDF-EOS library calls (using 26 HDF-EOS library routines)
Answer: Without the DPEAS Virtual I/O library there would be:
24 additional I/O routines called by the user (+1200%)
34 additional lines of user code (+136%, i.e. 236% of the original 25 lines)
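The comparison can be checked directly from the two summaries: 2 versus 26 library routines, and 25 versus 59 lines of executable code. The short Python calculation below is only a worked check of that arithmetic, and the variable names are illustrative. Measured as an increase over the Virtual I/O version, the routine count grows by 1200% and the line count by 136%, which puts the longer version at 236% of the original length.

```python
def percent_increase(before, after):
    """Increase from before to after, as a percentage of before."""
    return 100.0 * (after - before) / before


routines_with_vio, routines_without_vio = 2, 26
lines_with_vio, lines_without_vio = 25, 59

extra_routines = routines_without_vio - routines_with_vio  # 24 additional routines
extra_lines = lines_without_vio - lines_with_vio            # 34 additional lines

routine_increase = percent_increase(routines_with_vio, routines_without_vio)  # 1200.0
line_increase = percent_increase(lines_with_vio, lines_without_vio)           # 136.0
line_ratio = 100.0 * lines_without_vio / lines_with_vio                       # 236.0
```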
User Example: Conclusions
• Implementation Insights
  – Minimal amount of end-user code is required
  – The effort and resources involved are small (The DPEAS program recompiled in < 30 s on the user’s desktop)
• Virtual I/O Insights
  – The DPEAS virtual I/O access method is less complex than traditional HDF-EOS file access methods
• End user’s perspective
  – End users are protected from technical data format issues
  – End users can develop higher quality code by leveraging shared robust common modules
  – Scalability is greatly enhanced with little end user effort