sg246649

    ibm.com/redbooks

    Grid Computing in Research and Education

    Luis Ferreira, Fabiano Lucchese

    Tomoari Yasuda, Chin Yau Lee

    Carlos Alexandre Queiroz

    Elton Minetto, Antonio Mungioli

    Grid in Research Institutions

    Grid in Universities

    Examples

    http://www.redbooks.ibm.com/

    Grid Computing in Research and Education

    April 2005

    International Technical Support Organization

    SG24-6649-00

    Copyright International Business Machines Corporation 2005. All rights reserved.

    Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

    First Edition (April 2005)

    This edition applies to the capabilities of the IBM, ISV, and open source products used to build a grid computing solution.

    Note: Before using this information and the product it supports, read the information in "Notices" on page xiii.

    Copyright IBM Corp. 2005. All rights reserved. iii

    Contents

    Figures
    Tables
    Notices
      Trademarks
    Preface
      The team that wrote this redbook
      Become a published author
      Comments welcome

    Part 1. Introduction

    Chapter 1. Introduction to grid concepts
      1.1 Beginning of the grid concept
        1.1.1 Research and education on grid context
      1.2 Applicability
        1.2.1 Why use grids in research and education?
        1.2.2 Leveraging research activities with grids
        1.2.3 Leveraging educational activities with grids
      1.3 What will the future bring?
        1.3.1 What exists today
        1.3.2 What is the potential for grids
        1.3.3 What is likely to happen

    Chapter 2. How to implement a grid
      2.1 Introduction
        2.1.1 The main difficulties
        2.1.2 Approaches
      2.2 Basic requirements
        2.2.1 Hardware requirements
        2.2.2 Software requirements
        2.2.3 Human-resource requirements
      2.3 Setting up grid environments
        2.3.1 Defining the architecture
        2.3.2 Hardware setup
        2.3.3 Software setup
      2.4 Setting up grid applications

        2.4.1 Deploying an application
        2.4.2 Making application data available
      2.5 Maintaining grids
        2.5.1 Grid platform administration tasks
        2.5.2 Grid application administration tasks

    Part 2. Grid by examples

    Chapter 3. Introducing the examples
      3.1 What you will find in these chapters

    Chapter 4. Scientific simulation
      4.1 Introduction
        4.1.1 Business context
        4.1.2 Business needs
      4.2 Case analysis
        4.2.1 Requirements
        4.2.2 Use-cases
      4.3 Case design
        4.3.1 Component model diagram
        4.3.2 Component model description
        4.3.3 Architectural decisions and product selection
      4.4 Implementation
      4.5 Conclusion

    Chapter 5. Medical images
      5.1 Introduction
        5.1.1 Business context
        5.1.2 Business needs
      5.2 Case analysis
        5.2.1 Requirements
        5.2.2 Use-cases
      5.3 Case design
        5.3.1 Component model diagram
        5.3.2 Component model description
        5.3.3 Architectural decisions and product selection
      5.4 Implementation
      5.5 Conclusion

    Chapter 6. Computer-Aided Drug Discovery
      6.1 Introduction
        6.1.1 Business context
        6.1.2 Business needs
      6.2 Case analysis

        6.2.1 Requirements
        6.2.2 Use-cases
      6.3 Case design
        6.3.1 Component model diagram
        6.3.2 Component model description
        6.3.3 Architectural decisions and product selection
      6.4 Implementation
      6.5 Conclusion

    Chapter 7. Big Science
      7.1 Introduction
        7.1.1 Business context
        7.1.2 Business needs
      7.2 Case analysis
        7.2.1 Requirements
        7.2.2 Use-cases
      7.3 Case design
        7.3.1 Component model diagram
        7.3.2 Component model description
        7.3.3 Architectural decisions and product selection
      7.4 Implementation
      7.5 Conclusion

    Chapter 8. e-Learning
      8.1 Introduction
        8.1.1 Business context
        8.1.2 Business needs
      8.2 Case analysis
        8.2.1 Requirements
        8.2.2 Use-cases
      8.3 Case design
        8.3.1 Component model diagram
        8.3.2 Component model description
        8.3.3 Architectural decisions and product selection
      8.4 Implementation
      8.5 Conclusion

    Chapter 9. Visualization
      9.1 Introduction
        9.1.1 Business context
        9.1.2 Business needs
      9.2 Case analysis
        9.2.1 Requirements
        9.2.2 Use-cases

      9.3 Case design
        9.3.1 Component model diagram
        9.3.2 Component model description
        9.3.3 Architectural decisions and product selection
      9.4 Implementation
      9.5 Conclusion

    Chapter 10. Microprocessor design
      10.1 Introduction
        10.1.1 Business context
        10.1.2 Business needs
      10.2 Case analysis
        10.2.1 Requirements
        10.2.2 Use-cases
      10.3 Case design
        10.3.1 Component model diagram
        10.3.2 Component model description
        10.3.3 Architectural decisions and product selection
      10.4 Implementation
      10.5 Conclusion

    Part 3. Appendixes

    Appendix A. TeraGrid
      Introduction
      Organization
      Beneficiaries
      How to join

    Appendix B. Research oriented grid
      Introduction
      Business requirements
      High level design
      Products used
      Conclusion

    Glossary

    Related publications
      IBM Redbooks
      Other publications
      Online resources
      How to get IBM Redbooks
      Help from IBM

    Index

    Figures

    1-1 Heterogeneous and independent computing resources
    2-1 How a grid should expand
    4-1 Use-cases diagram
    4-2 Component model diagram
    5-1 Use-cases diagram
    5-2 Component model diagram
    6-1 Use-cases diagram
    6-2 Component model diagram
    7-1 Use-cases diagram
    7-2 Software component architecture
    7-3 Component model diagram
    8-1 Use-cases diagram
    8-2 e-learning framework schema
    8-3 Software components architecture
    9-1 A user's point of view
    9-2 Use-case diagram
    9-3 Diagram model
    10-1 Use-case diagram
    10-2 Component model diagram
    A-1 TeraGrid overview
    A-2 Layers diagram
    A-3 Typical connection between sites and the TeraGrid backplane
    B-1 Virtual environment
    B-2 Virtualization organization
    B-3 High level component diagram
    B-4 Globus Toolkit and meta-scheduler
    B-5 Submitting a job through Community Scheduler Framework
    B-6 Job sequencer and gridport
    B-7 Overall architecture diagram
    B-8 Workflow of a research working on the grid

    Tables

    1-1 Types of grid that drive the grid solution for each area (shaded cells)
    3-1 Examples of grid computing
    4-1 A typical product selection
    4-2 A typical product selection
    5-1 Architectural decisions and product selection
    6-1 Architectural decisions and product selection
    7-1 Architectural decisions and product selection
    8-1 Architectural decisions and product selection
    9-1 Architectural decisions and product selection
    10-1 A typical product selection

    Notices

    This information was developed for products and services offered in the U.S.A.

    IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

    IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:
    IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.

    The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

    This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

    Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

    IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

    Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

    This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

    COPYRIGHT LICENSE:
    This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.

    Trademarks

    The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:

    AFS, AIX, DB2, DFS, Eserver, eServer, ibm.com, IBM, Lotus, OS/2, OS/390, POWER4, pSeries, Redbooks, Redbooks (logo), TCS, Tivoli, WebSphere, xSeries, zSeries

    The following terms are trademarks of other companies:

    Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

    Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

    Intel, Intel Inside (logos), MMX, and Pentium are trademarks of Intel Corporation in the United States, other countries, or both.

    UNIX is a registered trademark of The Open Group in the United States and other countries.

    Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

    Other company, product, and service names may be trademarks or service marks of others.

    Preface

    This IBM Redbook, Grid Computing in Research and Education, belongs to a series of documents related to grid computing that IBM is presenting to the community to enrich the IT industry and all its players: customers, industry leaders, emerging enterprises, universities, and producers of technology. The book is intended mainly for IT architects and others who are responsible for analyzing the capabilities to build into a grid solution.

    The book is organized into the following parts.

    Part 1, "Introduction" on page 1

    In this part of the book, we present the basics of what the grid concept is, and why and how it can be applied to the research and education fields. After reading this part, the reader can expect to have a concise yet comprehensive view of how researchers, professors, teachers, and R&D professionals in general might benefit from this brand-new field of the computing industry.

    Part 2, "Grid by examples" on page 31

    In this part of the book, we present a collection of examples based on real-world grid implementations that have been accomplished in the research and educational world. They aim to show the multiple aspects of such implementations and to illustrate the previously presented concepts in more concrete terms. Here is a list of the examples:

    "Scientific simulation" on page 37

    Presents a computational grid implementation that supports the execution of complex system simulations in the areas of physics, chemistry, and biology.

    "Medical images" on page 47

    Presents a joint use of data grid and computational grid in a medical-image storage and processing framework.

    "Computer-Aided Drug Discovery" on page 57

    Presents a computational grid implementation to support the area of Computer-Aided Drug Discovery (CADD).

    "Big Science" on page 67

    Presents an implementation of a data and computational grid to support government-sponsored laboratory projects (also known as big science).

    "e-Learning" on page 79

    Presents a network grid implementation supporting an e-learning infrastructure that embraces many of the requirements for exchanging information in the educational and research fields.

    "Visualization" on page 93

    Presents a grid implementation to support the field of advanced scientific visualization.

    "Microprocessor design" on page 103

    Presents a computational grid implementation that helps to reduce the microprocessor design cycle and also allows the design centers to share their resources more efficiently.

    Part 3, "Appendixes" on page 111

    Describes the TeraGrid project, a cyber-infrastructure that aims to solve the problem of emerging terascale applications. Also provides a hypothetical example of a research-oriented grid involving multiple schedulers and multiple different components and services.

    The team that wrote this redbook

    This redbook was produced by a team of specialists from around the world working at the International Technical Support Organization, Austin Center.

    Luis Ferreira, also known as Luix, is a Senior Software Engineer at the International Technical Support Organization, Austin Center, working on Linux and grid computing projects. He has 20 years of experience with UNIX-like operating systems in design, architecture, and implementation, and holds a Master of Science degree in Systems Engineering from Universidade Federal do Rio de Janeiro in Brazil. Before joining the ITSO, Luis worked at Tivoli Systems as a Certified Tivoli Consultant, at IBM Brazil as a Certified IT Specialist, and at Cobra Computadores as a kernel developer and operating systems designer.

    Fabiano Lucchese is the business director of Sparsi Computing in Grid (http://www.sparsi.com) and works as a grid computing consultant in a number of nationwide projects. In 1994, Fabiano was admitted to the Computer Engineering undergraduate course of the State University of Campinas, Brazil, and in mid-1997, he moved to France to finish his undergraduate studies at the Central School of Lyon. Also in France, he pursued graduate-level studies in Industrial Automation. Back in Brazil, he joined Unisoma Mathematics for Productivity, where he worked as a software engineer on the development of image processing and optimization systems. From 2000 to 2002, he was a graduate student at the Faculty of Electrical and Computer Engineering of the State University of Campinas, where he earned a Master of Science degree in Computer Engineering for developing a task scheduling algorithm for balancing processing loads on heterogeneous grids. Fabiano has also taken part in the publishing of the IBM Redbook, Grid Services Programming and Application Enablement, SG24-6100-00.

    Tomoari Yasuda is an IBM Certified IT Specialist for Distributed Computing in IBM Japan. After getting a Master's degree in Mechanical Engineering at the graduate school of Keio University, he joined IBM and worked for digital media customers in Japan for 3 years as a consultant and a developer with the WebSphere family. He has a deep knowledge of the digital media industry. Since then, he has focused on offering new solutions to several cross-industry customers. In 2004, he was certified in IBM Grid Computing Technical Sales, and has been in charge of technical sales support for grid computing.

    Chin Yau Lee works as an Advisory Technical Specialist in grid computing for IBM ASEAN/South Asia. He holds an Honours degree in Computing and Information System from the University of Staffordshire. He has been using Linux since 1996 and had a few years of experience as a UNIX and Linux engineer before joining IBM. His areas of expertise include High Performance Linux and UNIX, UNIX Systems Administration, High Availability solutions, Internet based solutions, and grid computing architectures, which he has been actively working on for the last 4 years. He is also an IBM Certified Advanced Technical Expert on AIX, a Sun Certified System/Network Administrator, and a Red Hat Certified Engineer. He is also a co-author of the IBM Redbook, Deploying Linux on IBM eServer pSeries clusters, SG24-7014-00.

    Carlos Alexandre Queiroz is an independent consultant working for Alex Microsystems. He has been working with grid computing, JINI, and J2EE technologies since 2000. Currently, he is earning a Master's degree at Universidade de São Paulo as a Distributed Systems and Network Specialist. He has published articles at several congresses, such as middleware2003, SBRC, grid computing, and parallel applications events. Carlos is an active developer of the Web site, http://gsd.ime.usp.br/integrade.


    Elton Minetto is a professor at Universidade Comunitária Regional de Chapecó, Brazil, teaching programming, networking, and operating systems courses. He also works as a System Analyst and Network Administrator in the same institution, supporting Linux, Oracle, PHP, Java, and Python. Elton holds a Bachelor's degree in Computer Science from Universidade Comunitária Regional de Chapecó, and a Lato Sensu graduate degree in Computer Sciences from UNOESC/UFSC, Brazil. Elton is an active member of the open software community, collaborating on various projects.

    Antonio Saverio Rincon Mungioli is an electrical engineer and professor at Escola de Engenharia Mauá, São Paulo, Brazil. He also works as a System Analyst in the computing center at Universidade de São Paulo, and as a Technical Consultant for IBM Business Partners in Brazil. Antonio holds a Master of Science degree from Escola Politécnica of Universidade de São Paulo, Brazil.

    Acknowledgements

    Thanks to the following people for their contributions:

    Joanne Luedtke, Lupe Brown, Cheryl Pecchia, Arzu Gucer, Chris Blatchley, Wade Wallace, Ella Buslovich, Yvonne Lyon
    International Technical Support Organization, IBM

    Tony White
    Worldwide Grid Computing Technical Sales Business Unit Executive, IBM

    Ronald Watkins
    Worldwide Grid Computing Business Development Executive, Public Sector, IBM

    Chris McMahon
    Americas Sales Executive, Grid Computing, Higher Education, IBM

    Dr. Martin F. Maldonado
    Sr. Technical Architect, Grid Computing, Higher Education and Research, IBM

    Joe Catani
    Grid Computing in Higher Education, Public Sector, IBM

    Lori Southworth
    Market Manager, Education Industry, IBM

    Al Hamid
    Executive IT Architect and STSM, Grid/OSS Worldwide Leader, BCS, IBM

    Chris Reech, Jeff Mausolf
    IBM Global Services / e-Technology Center, Grid Computing Initiative, IBM


    Nina Wilner
    Grid Technology - IT Technical Architect LifeSciences, IBM

    Elizabeth B Davis
    Education Client Representative, IBM

    Wolfgang Roesner
    Verification Tools, eCLipz Verification, IBM

    John Reysa
    Processor Simulation and Infrastructure, IBM

    Ross Aiken
    HPC Technical Solutions Architect, IBM

    Nam Keung
    Senior Technical Consultant, IBM

    Lee B Wilson
    Technical Sales Specialist, IBM

    Takanori Seki
    Distinguished Engineer, IBM Japan

    Ryuhichi Nakata
    ICP TS - Higher Education Industry, IBM Japan

    Hideyuki Yokoyama
    EBO Support Technical Competency, IBM Japan

    Shu Shimizu
    Tokyo Research Laboratory, IBM Japan

    Naritoh Yamada, Michitaka Kamimura
    LifeSciences, IBM Japan

    Yoshihiko Itoh
    GEO Sales Lead, Grid Business AP, IBM Japan

    Fumiki Negishi
    Grid Computing Business, IBM Japan

    Stephen Chu
    Grid Computing Executive, IBM China

    Al Min Zhu
    University Relations, IBM China

    Jian Jiong Zhuang
    IBM China


    Jing Hui Li
    IBM China

    Li Yang Zhou
    Grid Computing, IBM China

    Linda Lin
    IT Architect, IBM China

    Jean-Yves Girard
    Grid Computing Specialist, IBM France

    Yann Guerin
    EMEA Grid Computing TSM, IBM France

    Sebastien Fibra
    IT Specialist, IBM France

    Jean-Pierre Prost
    EMEA Design Center for on demand business, IBM France

    Dr. Luigi Brochard
    Distinguished Engineer, IBM Deep Computing, IBM France

    Mariano Batista
    IT Architect, IBM Argentina

    Ruth Harada
    Alliances Manager, IBM Brasil

    Katia Pessanha
    Universities Alliances Manager, IBM Brasil

    Jose Carlos Duarte Goncalves
    Executive IT Architect, IBM Brasil

    Joao Marques dos Santos
    Account manager, Public Sector, IBM Brasil

    Luiz Roberto Rocha
    Grid Computing Technical Sales, IBM Brasil

    Joao Almeida
    IT Specialist, IBM Portugal

    Srikrishnan Sundararajan
    IBM India Software Labs

    Clive Harris
    Senior Architect, IBM UK


    John Easton
    Senior Consulting IT Specialist, IBM UK

    Dr. Victor Alessandrini
    IDRIS - CNRS - DEISA

    Gisele S. Craveiro, Rogerio Iope, Liria Sata, Sérgio Kofuji
    Universidade de São Paulo, Brasil

    Edward Walker, Ph.D., Tina Romanella de Marquez, Chris Hempel
    Texas Advanced Computing Center, The University of Texas at Austin

    Trish L. Barker, Karen Green
    National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

    Alex Tropsha and his team
    Director Molecular Modeling lab, University of North Carolina at Chapel Hill

    Terry O'Brien, Dr. Anne Aldous, Scott Oloff
    University of North Carolina Project - IBM

    Madhu Gombar
    LS Solutions Architect, Healthcare/Life Sciences Solutions Development, provided a case study on the pilot engagement conducted in the cheminformatics arena with the molecular modeling lab of UNC-Chapel Hill, NC. This was accompanied by a multimedia, interactive Flash demo that she developed to highlight the application of IBM middleware in drug discovery.

    Thanks to the following institutions for their contributions:

    TACC - Texas Advanced Computing Center
    University of Texas at Austin

    MML - Molecular Modeling Laboratory
    University of North Carolina at Chapel Hill

    NCSA - National Center for Supercomputing Applications
    University of Illinois at Urbana-Champaign

    Computational Science Research Center in Hosei University
    Japan

    National Institute of Advanced Industrial Science and Technology, AIST
    Japan

    Advanced Center for Computing and Communication, RIKEN
    Japan


    Chinese Ministry of Education - MOE
    China

    Peking University
    China

    Tsinghua University
    China

    Huazhong University of Science & Technology
    China

    Shanghai Jiao Tong University
    China

    Xi'an Jiao Tong University
    China

    Southeast University
    China

    Northeastern University
    China

    Sun Yat-Sen University
    China

    South China University of Technology
    China

    Shandong University
    China

    Beijing University of Aeronautics and Astronautics
    China

    National University of Defense Technology
    China

    Become a published author

    Join us for a two- to six-week residency program! Help write an IBM Redbook dealing with specific products or solutions, while getting hands-on experience with leading-edge technologies. You'll team with IBM technical professionals, Business Partners and/or customers.

    Your efforts will help increase product acceptance and customer satisfaction. As a bonus, you'll develop a network of contacts in IBM development labs, and increase your productivity and marketability.


    Find out more about the residency program, browse the residency index, and apply online at:

    ibm.com/redbooks/residencies.html

    Comments welcome

    Your comments are important to us!

    We want our Redbooks to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways:

    Use the online Contact us review redbook form found at:

    ibm.com/redbooks

    Send your comments in an e-mail to:

    [email protected]

    Mail your comments to:

    IBM Corporation, International Technical Support Organization
    Dept. JN9B Building 003 Internal Zip 2834
    11400 Burnet Road
    Austin, Texas 78758-3493


    Part 1 Introduction

    This part of the book includes the following chapters:

    Chapter 1, Introduction to grid concepts on page 3

    Chapter 2, How to implement a grid on page 15


    Chapter 1. Introduction to grid concepts

    In this chapter we discuss the following topics:

    Research and education on the grid concept

    The applicability of the grid in research and education environments

    Some thoughts about what the future of grid computing concepts might bring to research and education institutions and to society as a whole


    1.1 Beginning of the grid concept

    The term grid, coined in the mid-1990s in the academic world, was originally proposed to denote a distributed computing system that would provide computing services on demand, just like conventional power and water grids do.

    During the last few years, as the technology evolved and the grid concept started being explored in commercial endeavors, some slight but meaningful changes have been made to its original definition. A definition that is now widely accepted states that a grid is a system that:

    coordinates resources that are not subject to centralized control...

    ... using standard, open, general-purpose interfaces and protocols...

    ... to deliver non-trivial qualities of service

    For more information, refer to What is the Grid? A Three Point Checklist by I. Foster in GRID Today, July 20, 2002.
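
    The three points of this definition can be read as a simple checklist. The following Python sketch is only an illustration of that reading: the SystemProfile class, its field names, and the two example systems are hypothetical and are not taken from any grid toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical description of a distributed system (illustrative only)."""
    name: str
    decentralized_control: bool    # resources not subject to centralized control
    open_standard_protocols: bool  # standard, open, general-purpose interfaces
    nontrivial_qos: bool           # delivers non-trivial qualities of service


def unmet_criteria(system: SystemProfile) -> list[str]:
    """Return the checklist points that the system does not satisfy."""
    checks = {
        "coordinates resources without centralized control": system.decentralized_control,
        "uses standard, open, general-purpose protocols": system.open_standard_protocols,
        "delivers non-trivial qualities of service": system.nontrivial_qos,
    }
    return [point for point, satisfied in checks.items() if not satisfied]


def is_grid(system: SystemProfile) -> bool:
    """A system qualifies as a grid only if all three points hold."""
    return not unmet_criteria(system)


# Example values are assumptions chosen for illustration.
cluster = SystemProfile("departmental cluster", False, True, True)
campus_grid = SystemProfile("campus grid", True, True, True)

print(is_grid(cluster))       # False
print(is_grid(campus_grid))   # True
print(unmet_criteria(cluster))
```

    In this sketch, a centrally managed cluster fails the first point even when it uses open protocols and offers good quality of service, which is the distinction the checklist is meant to draw.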

    Most of the current interest in the grid concept derives from the fact that, stated as it is, a grid can be regarded as a technology with no boundaries. In fact, if one can integrate all of one's computing resources, no matter what they are, in a single virtual computing environment, such a system would make possible:

    The effective use of computing resources that would otherwise remain idle most of the time

    The execution of complex and computing-demanding tasks that would normally require large-scale computing resources

    Just as Web technologies have changed the way that information is shared all over the world, grid computing aims at being the next technological revolution, integrating and making available not only information, but also computing resources such as computing power and data-storage capacity.

    Figure 1-1 illustrates the way that a grid can be built from computing resources that are interconnected by the Internet but are otherwise unrelated to each other.


    Figure 1-1 Heterogeneous and independent computing resources

    In the next section we describe the types of grids available.

    Types of grids

    Ideally, a grid should provide full-scale integration of heterogeneous computing resources of any type: processing units, storage units, communication units, and so on. However, as the technology hasn't yet reached maturity, real-world grid implementations are more specialized and generally focus on the integration of certain types of resources. As a result, we now have different types of grids, which we describe as follows:

    Computational grid

    A computational grid has processing power as the main computing resource shared among its nodes. This is the most common type of grid, and it has been used for high-performance computing to tackle processing-demanding tasks.


    Data grid

    Just as a computational grid has processing power as the main computing resource shared among its nodes, a data grid has data storage capacity as its main shared resource. Such a grid can be regarded as a massive data storage system built up from portions of a large number of storage devices.

    Network grid

    This is known as either a network grid or a delivery grid. Its main purpose is to provide fault-tolerant and high-performance communication services. Each grid node works as a data router between two communication points, providing data caching and other facilities to speed up communication between those points. In this sense, the WWW can be regarded as an embryonic communication grid that does not (yet) satisfy the third requirement of the grid definition [see 1.1, Beginning of the grid concept on page 4].

    Despite grids being a new field of research and development, there are a number of bibliographic references that comprehensively describe the concept of grid computing and its applicability. See Related publications on page 137 for a comprehensive list of such references.

    1.1.1 Research and education on grid context

    Research and education are, at their core, knowledge-oriented activities. Therefore, we can say that this redbook discusses how knowledge-oriented activities might be leveraged by the use of (potentially) massive computing resources made available through large-scale distributed systems that adhere to the formal grid definition.

    Note: There is no clear boundary between the types of grid. Every computational grid has a data and network component; likewise for a data grid and a network grid. As such, there really is just one sort of grid, which is biased towards one or more of these considerations.
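
    One way to picture this note is to describe a grid by how strongly it emphasizes each resource type, rather than by a single label. The following Python sketch assumes a hypothetical GridProfile class with made-up weights; it illustrates the idea only and does not describe any real deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridProfile:
    """Relative emphasis of a grid on each resource type (hypothetical weights)."""
    computational: float
    data: float
    network: float

    def normalized(self) -> dict[str, float]:
        """Scale the three weights so that they sum to 1."""
        total = self.computational + self.data + self.network
        if total <= 0:
            raise ValueError("at least one weight must be positive")
        return {
            "computational": self.computational / total,
            "data": self.data / total,
            "network": self.network / total,
        }

    def dominant_types(self, tolerance: float = 0.05) -> list[str]:
        """Resource types whose share is within `tolerance` of the largest share."""
        shares = self.normalized()
        top = max(shares.values())
        return [kind for kind, share in shares.items() if top - share <= tolerance]


# Illustrative numbers only.
simulation_grid = GridProfile(computational=7, data=2, network=1)
balanced_grid = GridProfile(computational=1, data=1, network=1)

print(simulation_grid.dominant_types())  # ['computational']
print(balanced_grid.dominant_types())    # ['computational', 'data', 'network']
```

    Under this view, calling something a computational grid simply means that its computational share dominates, while the data and network shares are still present.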


    Knowledge-oriented activities are performed in a variety of environments: schools, high schools, universities, research institutes, large corporations, and so on. On the other hand, grid implementations only make sense in environments where a meaningful number of computing resources can be integrated to form a higher-performance system, which tends to be rather restrictive. In this book, we consider the implementation of grid systems in environments that have a rather large number of computing resources and that can benefit greatly from grid technologies.

    Each type of grid may be more or less suitable for each type of institution. The following list presents some comments about what may be best in each case:

    Universities

    Here, all grid types may be used for leveraging research and educational activities; a computational grid or a data grid would probably be focused on research, while a network grid would better fit educational purposes.

    Research institutes

    Just as for universities, it is easy to see how research activities performed in institutes can benefit from a computational grid or a data grid. A network grid might be useful in some particular cases, as shown in Part 2, Grid by examples on page 31.

    Schools

    In the case of grade schools through high schools, these institutions would probably invest in a network grid for leveraging their educational activities.

    In the next section we discuss some issues related to the applicability of grid computing in research and education.

    1.2 Applicability

    This section presents a brief discussion on which types of research and educational activities could benefit from grid computing technologies.

    1.2.1 Why use grids in research and education?

    In 1.1, Beginning of the grid concept on page 4, we presented what a grid is, but we haven't gone into detail on what it can do. It is not difficult to figure out how useful a high-performance computing infrastructure can be, but that is not the whole story. The fact is that such an infrastructure can be built up from computing resources that are already available, which is the reason why grids are so appealing.


    Briefly stated, a computational grid provides high-performance computing; a data grid provides large storage capacity; and a network grid provides high-throughput communication that may be useful for a variety of applications, such as virtual conferences. Having this in mind, we can list the main reasons for using grid computing as follows:

    Improve efficiency/reduce costs
    Exploit under-utilized resources
    Enable collaborations
    Virtual resources and virtual organizations (VO)
    Increase capacity and productivity
    Parallel processing capacity
    Support heterogeneous systems
    Provide reliability/availability
    Access to additional resources
    Resource balancing
    Reduce time to results

    When these reasons are viewed in the light of scientific research, it is easy to understand why scientists are so keen on grids: they believe that the use of grids will transform the practice of their science. As stated in Needs Assessment Workshop for Grid Techniques in Introductory Physics Classroom Projects, by Bardeen et al, grids are a tool for:

    1. Sharing the costs and burdens of immense computing needs

    2. Supporting the participation of scientists worldwide in large collaborations in particle physics, astronomy, cosmology, fusion and nuclear physics, medicine, and life science.

    Grids have the potential to change how people work together to make scientific discoveries.

    On the side of education, it is important to note that grids can play a major role because, according to QuarkNet Cosmic Ray Studies and the Grid: Probing Extensive Showers by Bardeen et al, grids represent:

    An opportunity for a new style of collaborative learning
    An aid to online posters and discussions with students at other schools
    An easy way to present and review results
    An easy way to conduct peer-to-peer discussions
    A rapporteur of presentations and discussions
    A single portal to distributed resources

    Distance Education and Higher Education are the fields most directly affected by grid applications in education.


    1.2.2 Leveraging research activities with grids

    As grid computing makes low-cost high-performance computing (HPC) infrastructures available, the best candidates for using such infrastructures are applications that require high computational power, large storage capacity, or fast and high-throughput networking.

    Table 1-1 shows examples of research areas that typically make use of high-performance infrastructures and the type of grid that primarily drives their needs. As mentioned before, every computational grid has a data and network component; likewise for a data grid and a network grid.

    Table 1-1 Types of grid that drive the grid solution for each area (shaded cells)

    Performing meteorological forecasts, calculating the aerodynamic behavior of an airplane, assembling the genome of an organism, analyzing the elementary particles in an accelerator, virtualizing computing resources, and data-mining several terabytes of data: these actions all need extensive calculations and handling of enormous amounts of data. This is the perfect scenario for grid computing technologies.

    In Part 2, Grid by examples on page 31, a number of examples are analyzed and, for each one, a graph is used to represent the proportion of computing, data, and communication features that a specific implementation has.

    Research area Computational Data Network

    High energy physics * *

    Environmental studies (*) * *

    Biology and genetics * *

    Chemistry * *

    Materials *

    Astrophysics *

    Astronautics and aerospace *

    Automotive *

    Economics analysis *

    Medicine imagery * * *

    Remote access to experimental apparatus * *


    1.2.3 Leveraging educational activities with grids

    When discussing grid computing in education, one is probably talking about a data grid or a network grid. Here are some of the reasons:

    Data grid

    The reason that a data grid can be used to leverage educational activities is easily understood when one considers how important a role the WWW has been playing in education. A data grid would provide the same sort of service, which is making information readily available worldwide, while satisfying specific quality of service (QoS) requirements.

    Network grid

    Such a grid makes it possible for data-intensive streaming applications to be executed on non-specialized communication networks. This allows audio and video applications to take place, such as remote learning sessions, and even collaborative sessions in which a large number of parties can participate.

    Thus, as stated in the article ITR: Distance Collaboration - Education and Training on the Access Grid, by Morton et al, grid application in education represents an ambitious venture in a direction that substantially increases the ability of groups to cooperate and achieve a sense of collaborative community even though they are distributed across the planet.... Using one of the ... collaborative tools in the realm of information technology, ... the project investigators will launch an initiative to advance the state of the art (in a social and technical sense) in geographically distributed project oriented collaborations.

    An interesting scenario that can be drawn from these ideas is, as presented in the Needs Assessment Workshop for Grid Techniques in Introductory Physics Classroom Projects, by Bardeen et al, the one in which educators become interested in and excited about the potential that grid tools and techniques bring to data-based classroom projects and, as a result, use the grid as a hosting environment in which inquiry-based projects are standards-based, visually appealing, use common tools and data formats, allow for levels and scale of use, and provide support materials for educators and students.

    In such a scenario, some teachers will come to the projects as experienced users with a great deal of knowledge about the research and experience with inquiry-based learning using online resources. Others will be emerging or beginning users. Most classroom users will analyze data from the Web tools. Some will be interested in learning what's under the hood, exploring grid portals, and a few will become developers of grid skins or transforms.

    Another interesting example, described in GRASP-Grid Accessed Data and Computational Scientific Portal, by Sharly, shows how to develop a scientific portal for a learning community whose members can access Computational Servers, Streaming Servers, Digital Library, and Course Materials from different servers spread across the globe, or use Mathematical packages, Computer Aided Design (CAD), Simulation packages, and so on.

    1.3 What will the future bring?

    In this section, we analyze the new perspectives that grid technologies bring toresearch and education, and present what has already been accomplished.

    1.3.1 What exists todayAs stated before, grid computing is still in its early years of existence. Althoughon the one hand, the concepts that define the grid philosophy are convergingtoward something more consistent, on the other hand there is still a lot to bedone in terms of the software and (why not?) the hardware infrastructure.

    As one can see in Part 2, "Grid by examples" on page 31 of this redbook, a number of research and education oriented grid computing implementations are already in place all over the world. These implementations provide grid services delivered according to rather strict QoS requirements. However, the level of integration reached can still be considered embryonic: we are still talking about several unconnected grids instead of a single global grid.

    Nevertheless, the grid culture is already becoming a part of the lives of today's researchers and educators. Just as the World Wide Web has changed the way that society deals with information, researchers and educators now expect the grid to change the way that they deal with computing resources and, ultimately, how they deal with knowledge.

    And last but not least, it is important to mention that, so far, the computational grid has reached a much higher level of maturity than the other types of grid. This may derive from the fact that computing science has always been concerned with computing activities, for obvious reasons, and that distributed computing research and development has been on the road for more than 30 years now. The data grid and the network grid are taking off mainly due to recent database and Web technology breakthroughs, and still have a rather long way to go until full-scale integration.

    1.3.2 What is the potential for grids


    Thanks to the WWW revolution, networked desktop computers can be found everywhere today. Having a computer and, what is most surprising, having this computer connected to the Internet, is something that few people can afford to reject. Government, industry, universities, and other research and educational institutions rely on every sort of computer to perform their daily activities, and this computer-based society is growing bigger every day.

    In addition to this, we know that as time goes by and technology evolves, the computers and network connections get faster and more reliable. As a result, the global computing pool becomes more powerful and more strongly coupled, leading to systems that can handle large amounts of data in shorter periods of time.

    These factors can be summarized in a few words under the grid perspective: A global grid infrastructure is evolving to be readily available in the near future.

    Knowing that this infrastructure will be somehow available, we should analyze the potential applicability of grid technologies. As seen in 1.2, "Applicability" on page 7, there are a great number of research and educational areas that could benefit from grid technology; having them in mind, we can say that the potential for grid technology applications depends on the following facts:

    After the World Wide Web, the grid has been regarded as the next natural step towards the evolution of information technology.

    The forthcoming scientific breakthroughs are likely to be brought by the power unleashed by grid computing.

    Such a powerful computing infrastructure can embrace existing and brand new paradigms of application execution.

    Researchers and educators do believe that the grid will come to reality and are looking forward to using it.

    For all these reasons, we believe that grid computing will form part of an inexorable future, changing the way that research, education, and even ordinary or everyday tasks are performed.

    1.3.3 What is likely to happen


    Throughout this chapter, establishing a grid computing infrastructure has been an abstract concept with which we have been dealing in a rather superficial way; we have a reasonably good idea of what such a structure is capable of doing, but we have not gone into the details of how to set up a grid. That is the subject of Chapter 2, "How to implement a grid" on page 15, but one thing that is not covered in that chapter is: How will people interact with the grid? Answering this question might give us a clue about the future paths that this technology will follow, which is exactly what this section is about.

    As the technology evolves and computers come out-of-the-box with grid-enabled software, connecting a computer to the grid will probably be as easy as connecting a data cable to the proper outlet. When starting up a machine that is online on the grid, logging onto the operating system might register users onto the grid, which is the environment where their personal computing tasks will all be performed: reading their e-mail, editing documents, managing files, browsing the Web, and so on. So far, so good, but exactly what is the role that the grid is playing in this scenario?

    Unlike what normally happens in an application-server-oriented architecture, where all the computing-intensive tasks are centralized in a server that is accessed by a number of dumb terminals, in a grid, every computing resource has to play its part in the overall job. Thus, every computer should be regarded as a source of computing power, data storage, and even data routing, from a networking perspective.

    Having this in mind, when we get back to the scenario depicted above, we understand that plugging a computer into the grid means not only getting access to the grid but also making more resources available to the grid. The amount of resources that are granted to a certain user may be proportional to the amount of resources that he donates to the grid, but his computing tasks are no longer dependent on his personal computing infrastructure!

    It is very easy and fun to figure out the implications of such a fully integrated computing environment:

    As long as a user is connected to the grid, he/she will be able to use the same working environment, no matter how or where this connection is established.

    Computing power, storage capacity, and data throughput will be available as commodities on the grid, and a user will be able to use them on demand.

    Once the application code is being executed on the grid, its level of availability and fault tolerance can be arbitrarily high.

    The same applies to the data stored on the grid: its level of availability and fault tolerance can be arbitrarily high.

    Here are some other interesting issues regarding the future of grid computing:


    The grid expansion may embrace multiple media types; thus, radios, televisions, and phone networks will also be available as a grid service.

    Personal and home-based offices will become a reality; this may change the way that small and large corporations are conceived.

    These are some of the possibilities that might arise from the grid world, and there is no doubt that they will definitely change the way that we deal with information in our personal and professional activities.


    Chapter 2. How to implement a grid

    In this chapter we discuss the following topics:

    Information on how the ideas presented in "Introduction to grid concepts" on page 3 might be implemented in a real-world computational environment

    How a grid computing infrastructure can be correctly set up

    In a practical sense, how this chapter can be regarded as a bridge between the concepts and the examples presented in "Grid by examples" on page 31


    2.1 Introduction


    Knowing what a grid is and what it can do for you is essential when you plan to use this technology to tackle your most demanding computational problems. However, when going through the process of implementing a grid computing environment, there are many other issues that arise and that may require special attention.

    This chapter offers a brief discussion on how to implement a grid computing environment and, as such, it covers the following topics:

    Basic requirements for setting up a grid computing environment

    How to set up an initial grid

    How to maintain and expand the grid

    The following topics are not covered:

    Which software or hardware, in particular, should be used for implementing grid environments

    Which companies can best provide grid implementation services

    More information about the various grid topics can be found in the bibliographic references presented in "Related publications" on page 137.

    2.1.1 The main difficulties

    According to the definition presented in 1.1, "Beginning of the grid concept" on page 4 and reproduced here, a grid is a system that:

    Coordinates resources that are not subject to centralized control...

    Using standard, open, general-purpose protocols and interfaces...

    To deliver nontrivial qualities of service

    This definition suggests a very important aspect of a grid system: its unbounded distributed nature. Like any system that is not subject to centralized control, a grid is made up of nodes that might be physically distributed across a world-wide area and that interact with each other by means of open and general-purpose protocols and interfaces. Another important aspect of a grid system is its heterogeneity, which is a natural consequence of its distributed nature.

    Setting up a computational environment that is physically distributed across a potentially wide area and that integrates heterogeneous computing resources may cause severe headaches. Apart from the technical issues, many human-related issues, such as political interests and personal preferences, make it very difficult to accomplish this task.

    In the future, we expect that, as the technology evolves and the grid concept becomes part of common sense, implementing a grid will be as simple as installing a certain software application on a number of computers. Before getting there, grid implementors should be aware of the traps into which they may fall.

    2.1.2 Approaches

    There are two basic engineering approaches that we have chosen to adopt when implementing a grid environment: bottom-up implementation and incremental growing.

    Bottom-up implementation

    In order to understand what is meant by bottom-up, a system such as a grid should be regarded as having multiple levels of abstraction. In this case, we are considering that the lowest level of abstraction is the one that takes into consideration the details about the hardware that makes up the grid. As the level of abstraction increases, we move our focus to the software layer and, at last, the human factor layer.

    Having these things in mind, we are able to state that performing a bottom-up implementation implies making sure that everything in a certain layer of abstraction is working properly before moving to an upper layer. This may sound quite obvious, but it is not! There are very specific conditions in which a layer has to work and we will try to depict these conditions here.

    Incremental growing

    The bottom-up implementation philosophy refers to the way that a group of nodes should be set up to become part of the grid, but it does not address the way that a set of nodes should be integrated into the grid. In these circumstances, the order in which nodes are set up does matter, and for this reason, we recommend the adoption of an incremental growing philosophy.

    The combination of these two ideas can be represented by the diagram in Figure 2-1.


    Figure 2-1 How a grid should expand

    In this figure, each tank represents a group of grid nodes, and the level of water inside a tank represents the level of abstraction in which we are working to set these nodes up for the grid. This figure suggests that:

    A group of nodes should be integrated into the grid only after the previous nodes have been fully integrated.

    The order in which nodes are integrated into the grid depends on the underlying physical and logical structures that connect them.

    In terms of this figure, implementing a grid is the same as filling interconnected tanks with water. In the following sections, we show how this interconnection should be accomplished and into which tank the water should be poured.
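The idea that the integration order follows the structure connecting the groups of nodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of any grid platform: the `links` dictionary, the group names, and the `seed` group (the first tank to be filled) are all hypothetical, and a breadth-first walk is just one simple way to make sure a group is reached only through groups that are already integrated.

```python
from collections import deque


def integration_order(links, seed):
    """Return node groups in an order in which they could join the grid.

    links: dict mapping each group to the groups it is directly linked to
           (a hypothetical description of the connecting structure).
    seed:  the group where the grid starts (the first tank to be filled).
    """
    order = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        group = queue.popleft()
        order.append(group)  # integrate this group before any of its new neighbours
        for neighbour in sorted(links.get(group, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


# Hypothetical example: four groups of nodes and the links between them.
links = {
    "campus-core": ["physics-lab", "library"],
    "physics-lab": ["campus-core", "remote-site"],
    "library": ["campus-core"],
    "remote-site": ["physics-lab"],
}
print(integration_order(links, "campus-core"))
# -> ['campus-core', 'library', 'physics-lab', 'remote-site']
```

Groups that cannot be reached from the seed never appear in the result, which is a quick way to spot nodes that the grid cannot yet grow into.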

    2.2 Basic requirements

    In this section we analyze the requirements that must be satisfied for a grid to be implemented.

    2.2.1 Hardware requirements


    A grid environment is made up of computing resources. A computing resource, which in normal conditions is simply a computer, can be regarded as a source of computing power and data storage capacity. The basic hardware requirements that must be satisfied by any grid implementation are as follows:

    Every computing resource must have enough computing power and data storage capacity to properly run the grid platform.

    The computing resources do not need to be directly connected to each other. A resource only needs to know some entity that connects it to the grid; such an entity could be an internal scheduler, a data server, and so forth.

    Computing resources can be indirectly connected, through routers, gateways, hubs, switches, bridges, and wireless connections, by which a data packet can be sent from one computing resource to another.

    Depending on the type of grid that we intend to implement, there are additional hardware requirements that have to be satisfied:

    For a computational grid:

    The overall computing power of the grid, calculated as the sum of the computing power of its nodes, gives you a clue as to how powerful a grid is, but this is no guarantee of performance at all. The efficiency of a grid will largely depend on the application that it executes.

    The overall performance of a grid also largely depends on the quality of the communication links that interconnect the nodes. Under the best conditions, the time spent on data exchange during the execution of an application should be negligible compared to the time spent on processing this data.

    For a data grid:

    The overall data storage capacity of a grid is the sum of the storage capacity made available for the grid in all its nodes. Apart from this capacity, each node should have enough room to house the grid platform and to let the computer users perform their daily activities.

    Note: By "directly or indirectly connected" we mean that there is a physical path, which includes cables, routers, gateways, hubs, switches, bridges, and wireless connections, by which a data packet can be sent from one computing resource to another.

    The performance of a data grid heavily depends on its communication links, but it is very difficult to express the grid performance as a function of the quality of its links. A worst-case estimate can be found by calculating the time that it takes to exchange a data record between the two nodes for which the communication has the worst performance.

    For a network grid:

    The hardware requirements for these grids are even more difficult to determine due to the on-demand nature of their functionality. As a rule of thumb, the average data throughput provided by such grids between two points can be estimated as the average data throughput of the best communication path between these nodes.
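The three rules of thumb above (summed computing power, worst-pair transfer time, and best-path throughput) can be sketched in Python as follows. This is a rough illustration only: the function names, the units (megabytes and Mbit/s), and the sample figures are assumptions introduced here, link latency is ignored, and the throughput of a path is taken to be that of its slowest link.

```python
import heapq


def total_computing_power(node_power):
    """Computational grid: sum of the computing power of all nodes."""
    return sum(node_power.values())


def worst_case_transfer_time(record_size_mb, pair_throughput_mbps):
    """Data grid: seconds to move one record between the worst-connected pair."""
    slowest = min(pair_throughput_mbps.values())
    return record_size_mb * 8 / slowest  # megabytes -> megabits


def best_path_throughput(links, source, target):
    """Network grid: throughput of the best path (largest bottleneck link)."""
    best = {source: float("inf")}
    heap = [(-float("inf"), source)]  # max-heap on path throughput
    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if node == target:
            return width
        if width < best.get(node, 0):
            continue  # stale heap entry
        for neighbour, capacity in links.get(node, {}).items():
            candidate = min(width, capacity)
            if candidate > best.get(neighbour, 0):
                best[neighbour] = candidate
                heapq.heappush(heap, (-candidate, neighbour))
    return 0.0  # no path at all


# Hypothetical figures, for illustration only.
power = {"n1": 2.0, "n2": 3.5}
pairs = {("A", "B"): 100, ("A", "C"): 10}
links = {
    "A": {"B": 100, "C": 10},
    "B": {"A": 100, "D": 50},
    "C": {"A": 10, "D": 1000},
    "D": {"B": 50, "C": 1000},
}
print(total_computing_power(power))          # -> 5.5
print(worst_case_transfer_time(100, pairs))  # -> 80.0
print(best_path_throughput(links, "A", "D")) # -> 50
```

As the text warns, none of these numbers guarantees performance; they only bound what a given application might achieve on a given set of links.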

    2.2.2 Software requirements

    These are the basic software requirements that must be satisfied by any grid implementation:

    There must be interoperability among the grid platforms of all the computing resources.

    Network software must be properly configured to allow direct or indirect communication between any pair of computing resources. In other words, there must exist at least one logical path by which two computing resources can exchange data.
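The second requirement, that at least one logical path exists between any pair of computing resources, can be checked with a simple graph walk. The sketch below is an assumption-laden illustration: the resource names and the `links` dictionary are hypothetical, and links are assumed to be listed in both directions.

```python
def reachable_from(start, links):
    """Return every resource that has a logical path from `start`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in links.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def isolated_resources(resources, links):
    """Return the resources with no logical path to the first resource."""
    if not resources:
        return set()
    return set(resources) - reachable_from(resources[0], links)


# Hypothetical example: "d" has not been configured to reach the others.
resources = ["a", "b", "c", "d"]
links = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(isolated_resources(resources, links))  # -> {'d'}
```

An empty result means every pair of resources can exchange data, directly or indirectly.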

    These are requirements that must be met prior to the installation of a grid platform, but there are some important requirements that must be met by the grid platform itself. To administer a grid, which is, as stated earlier, a widely distributed computing environment, comprehensive administration tools are imperative. When choosing a grid platform, the availability of such tools should be carefully checked, as they must provide facilities for:

    Easily installing the platform on a computing resource for the first time; this means that the platform should be available through some sort of on-line network, such as the Internet, or through commonly-used storage media, such as CD-ROMs. It also means that the installation itself should be straightforward, requiring few and simple steps.

    Remotely and automatically upgrading the grid platform and the code for its applications; it is impossible to rely on manual software upgrades when talking about dozens of computers (not to say hundreds or thousands).

    Remotely monitoring the computing resources; the grid platform must provide real-time information about the state of its computing resources, such as if they are working properly or if they have failed, how efficiently they are executing application tasks, and so on.

    Storing logging information about all the activities performed on the platform; historical information about the grid performance is essential when tuning applications. For this purpose, the grid platform should provide a way for developers to analyze this information.

    Controlling access to the platform; for obvious reasons, there must be a way to control the access to the platform.

    Securing the data exchanged within the platform; application developers will not put their applications to run on the grid if they are not assured that sensitive data can be secured.
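As a small illustration of the remote monitoring facility in the list above, the following Python sketch classifies resources as working or failed. It assumes, purely for the sake of the example, that each resource periodically reports a heartbeat timestamp; the `last_heartbeat` mapping, the node names, and the `timeout_s` threshold are hypothetical and do not describe any particular grid platform.

```python
def resource_status(last_heartbeat, now, timeout_s=60.0):
    """Classify each resource from the time of its last heartbeat.

    last_heartbeat: dict mapping resource name -> timestamp in seconds
                    (assumed to be reported by the resource itself).
    now:            current timestamp in seconds.
    timeout_s:      silence longer than this marks a resource as failed.
    """
    return {
        name: "working" if now - seen_at <= timeout_s else "failed"
        for name, seen_at in last_heartbeat.items()
    }


# Hypothetical heartbeats: node-2 has been silent for 110 seconds.
last_heartbeat = {"node-1": 1000.0, "node-2": 900.0}
print(resource_status(last_heartbeat, now=1010.0))
# -> {'node-1': 'working', 'node-2': 'failed'}
```

A real platform would also record these observations over time, which is what makes the historical analysis mentioned in the logging item possible.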

    Once these requirements are satisfied, we can move to the next level of abstraction: the human-resource requirements.

    2.2.3 Human-resource requirements

    Nowadays, grid systems depend on human resources much more than we would like them to. Besides the high-level administrative tasks, traditionally assigned to a specialized analyst, there are several tasks, such as software installation, that might somehow be performed by non-specialized people. This section presents some basic rules that have been learned from grid implementations regarding human-resource factors:

    There must be at least one analyst who will be responsible for the higher-level administrative tasks. Roughly speaking, these tasks include:

    Upgrading the grid platform

    Managing applications (installing new applications, starting them up, interrupting, cancelling, etc.)

    Managing user accounts

    Monitoring the grid and generating reports

    There must be at least one analyst who will be responsible for the technical support of the grid environment. This analyst's activities comprise:

    Installing the grid software onto the computing resources or helping people do so

    Helping and guiding network administrators to properly configure their environments so that their networked resources can join the grid

    Fixing reported failures on the grid

    Answering technical questions that users and developers may pose

    Maintaining the grid Web portal

    In addition to these two analyst roles, it is recommended that:

    There is one analyst able to help application developers to develop and test their applications.

    Finally, here is an important remark concerning the human factor:

    The grid software execution should be as transparent as possible when performed on ordinary desktop computers; users tend to interrupt every running program that they do not recognize as useful or that they believe to be a source of overhead; two generally good options are screen-savers and system services.

    2.3 Setting up grid environments

    In this section we present the basic steps for setting up a grid environment and discuss some of the issues that commonly arise during these implementations.

    2.3.1 Defining the architecture

    At first, we must clearly distinguish the concepts of logical and physical architecture:

    Physical architecture: This is the architecture defined by the way that the computing resources are physically connected to each other as far as communication links, gateways, bridges, hubs, and routers are concerned.

    Logical architecture: This is the architecture defined by the way that the computing resources are logically connected to each other as far as software configurations are concerned.

    We are assuming here that in a grid implementation, most of the integrated computing resources will not be dedicated to the grid. This means that they are already set in place and made part of a physical architecture that was previously defined for other purposes. In general, a grid implementation does not address complex physical architecture issues and, more important, it does not depend on such issues being resolved.

    Defining the logical architecture of a grid implementation (for example, separating computing resources into grid groups) is something we expect to be transparent in the future. Future generation grid platforms will hopefully be able to automatically map a given physical architecture into the best possible logical architecture by performing network tests and benchmarks. As these platforms are still to come, there are some simple rules that might be useful when implementing a grid:

    Computing resources that are interconnected by a high-speed network and that are physically close to each other are the best candidates for building up a logical grid group; such a group would work very similarly to a cluster of computers, exchanging data among themselves at high rates and with other groups at low rates.

    If a logical group has been set for computers that are directly connected to each other by a high-speed network, then they will have a local computer that will bridge all the inter-group data exchange. The natural choice for this particular computer is the one that has the role of network gateway, for efficiency and security reasons.

    Setting logical links between logical groups of computing resources is referred to as defining the high-level architecture of the grid; as groups are not expected to exchange huge amounts of data, the performance of the communication links should not be a concern as long as there are no converging points of communication and/or coordination; a master-slave high-level architecture has this drawback.

    Inter-group links have to be stable. If stability cannot be assured, dynamic high-level architectures should be considered. Dynamic architectures depend a lot more on the grid platform and, although they can offer flexibility and robustness, they are harder to maintain and have not yet reached maturity in terms of standardization.
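The first of these rules, grouping resources that share a high-speed network, can be sketched with a small union-find routine. This is only one possible reading of the rule: physical proximity is not modelled, the `min_mbps` threshold and the resource names are hypothetical, and a future grid platform would presumably derive the link speeds from its own network tests.

```python
def logical_groups(resources, links, min_mbps=1000):
    """Group resources joined, directly or transitively, by high-speed links.

    links: dict mapping a pair (a, b) of resources to the link speed in Mbit/s.
    """
    parent = {r: r for r in resources}

    def find(r):
        # Follow parent pointers to the group representative (path halving).
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for (a, b), mbps in links.items():
        if mbps >= min_mbps:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for r in resources:
        groups.setdefault(find(r), []).append(r)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical example: two fast local networks joined by a slow link.
resources = ["a1", "a2", "b1", "b2"]
links = {("a1", "a2"): 1000, ("b1", "b2"): 10000, ("a2", "b1"): 10}
print(logical_groups(resources, links))
# -> [['a1', 'a2'], ['b1', 'b2']]
```

The slow link between `a2` and `b1` is left out of both groups; in the terms used above, it is an inter-group link that belongs to the high-level architecture.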

    Having these rules in mind, one should still remember the diagram in Figure 2-1, "How a grid should expand" on page 18, when planning the architecture of the grid. This means:

    Defining the point from where the grid will expand is crucial; this is the place where the administrative infrastructure of the grid will be located and where the fault-tolerant parts of the system will be installed. In general, this place must have fast and stable downlinks and uplinks.

    Defining the directions in which the grid will expand is crucial as well; the grid growth should never compromise its performance because, in theory, it is infinitely scalable.

    2.3.2 Hardware setup

    As we are not focusing on physical architecture issues, hardware installation is not treated in detail in this chapter. However, we present some important notes that should be taken into account when setting up the st