ogsa/gt3 evaluation at cern

May 25, 2022 OGSA/GT3 evaluation 1 OGSA/GT3 evaluation at CERN Activity Report M. Lamanna (CERN), R. Brito Da Rocha (European Data Grid), Tao-Sheng Chen (Academia Sinica Taipei), A. Demichev (Moscow State Univ. MSU), D. Foster (CERN), V. Kalyaev (MSU), A. Kryukov (MSU), V. Pose (JINR Dubna), C. Wang (Academia Sinica Taipei)


Page 1: OGSA/GT3 evaluation at CERN

April 21, 2023 OGSA/GT3 evaluation 1

OGSA/GT3 evaluation at CERN

Activity Report

M. Lamanna (CERN), R. Brito Da Rocha (European Data Grid), Tao-Sheng Chen (Academia Sinica Taipei), A. Demichev (Moscow State Univ. MSU), D. Foster (CERN), V. Kalyaev (MSU), A. Kryukov (MSU), V. Pose (JINR Dubna), C. Wang (Academia Sinica Taipei)

Page 2: OGSA/GT3 evaluation at CERN

Table of Contents

• Introduction
• Activity highlights
  – GT3 ToolKit Experience
  – GT3 Performance studies
  – Integration of Existing Codes/Services
• Conclusions

Page 3: OGSA/GT3 evaluation at CERN

Motivation

• The promise of the web services framework
• Standard environment for Grid applications
• New projects are looking to OGSA as the solution to all problems…
• Here and now
• Globus release of the new toolkit in May 2003.
  – OGSI framework and some grid services
  – GT3 out on July the 1st (new major release, 3.2, in 1H04)
• To provide input to the EGEE middleware activity
• OGSA/GT3 project
  – Primary objectives of the OGSA/GT3 evaluation:
    • Understand the GT3 offering and its “quality”
    • Learn how to create new services in this framework
    • Study how to leverage existing developments in an OGSA context
    • Create local know-how on promising technologies
  – Most people are at CERN only for short periods
    • Variable geometry approach
    • 75% of the people are not always at CERN
    • Open to new collaborators

Page 4: OGSA/GT3 evaluation at CERN

What does GT3 offer? (NOW)

• The first OGSI implementation (July 2003: 3.0.x)
  – The toolkit itself
    • Build new services and extend existing ones
  – Security Infrastructure
    • GSI (Grid Security Infrastructure)
  – Services
    • GRAM (GT2 implementation wrapped up as a Grid service)
    • IS (Index Service; new GT3 implementation)
    • RFT (Reliable File Transfer; it uses Globus FTP)
• Explore these three lines

Page 5: OGSA/GT3 evaluation at CERN

Grid Service Development

• Grid Services
  – Extended Web Services complying with the OGSI specification
• Converge on a standard system

[Diagram labels: COMPLEMENTARY OGSI IMPL., GRID SERVICES, WEB SERVICES ENGINE, GRID CONTAINER, HOSTING ENVIRONMENT]

Page 6: OGSA/GT3 evaluation at CERN

Grid Service Development

• What we get:
  – From Web Services
    • Interoperability
      – standard for message creation and definition: XML
      – standard for protocol-independent message passing: SOAP
      – standard for service definition: WSDL
        » choice on hosting environment is left to the service provider
    • Service Oriented Design approach
  – From OGSI
    • Stateful Services (Service Data)
    • Other common features on independent services
      – Different from GT2 where nothing is common between services apart from GSI
      – Straightforward development: common framework for service usage and management

Page 7: OGSA/GT3 evaluation at CERN

Grid Service Development

• Current options:
  – Hosting Environments:
    • J2EE Application Server – Jakarta Tomcat, GT3 Standalone Container, Websphere, …
    • Microsoft .NET Platform
  – OGSI implementations:
    • J2EE Servers: Globus Toolkit 3
    • Microsoft .NET: OGSI.NET (Virginia Univ.); MS.NETGrid (EPCC)
    • Others are appearing
      – Any environment with an existing implementation of a Web Services engine is one single step away from providing Grid Services
      – Ex: OGSI::Lite (Perl), pyGridWare (Python)

Page 8: OGSA/GT3 evaluation at CERN

Designing Grid Services

• Important concepts when designing Grid Services:
  – Factories and Instances
• Factories create instances and respond to instance creation requests by clients
• Instances respond to clients' service-specific interaction requests
• Advantages:
  – Workload balancing between pools of instances
  – User dependent instances
• Disadvantages:
  – Instance creation overhead

[Diagram: CLIENT, FACTORY and INSTANCE, with two numbered interactions (1, 2)]
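The factory/instance split described on this slide can be sketched in a few lines of Python. This is only an illustration of the pattern, not GT3 code: the class names `DummyFactory` and `DummyInstance`, the handle format and all method names are hypothetical.

```python
import itertools
import time


class DummyInstance:
    """A stateful service instance created on behalf of one client (hypothetical)."""

    def __init__(self, handle, owner):
        self.handle = handle
        self.owner = owner
        self.calls = 0  # per-instance state, kept between requests

    def echo(self, message):
        self.calls += 1
        return message

    def get_time(self):
        self.calls += 1
        return time.time()


class DummyFactory:
    """Creates instances on request and keeps track of the live ones."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.instances = {}

    def create_service(self, owner):
        # Creation request: the client asks the factory for a new instance.
        handle = f"instance-{next(self._ids)}"
        self.instances[handle] = DummyInstance(handle, owner)
        return handle

    def lookup(self, handle):
        return self.instances[handle]

    def destroy(self, handle):
        # The instance is removed once the client no longer needs it.
        del self.instances[handle]


factory = DummyFactory()
handle = factory.create_service(owner="alice")
instance = factory.lookup(handle)   # service-specific interaction goes to the instance
reply = instance.echo("hello")
factory.destroy(handle)
```

Every client pays for one `create_service` call before it can do useful work, which is the instance creation overhead listed under the disadvantages.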

Page 9: OGSA/GT3 evaluation at CERN

Designing Grid Services

• Approach:
  – Service Data, Subscriptions and Notifications
• Each Grid Service has its own Service Data Set, a collection of Service Data Elements (SDEs)
• Every SDE has a set of associated values concerning its validity in time – goodFrom, goodUntil, availableUntil
• A service or client may declare interest in an SDE by issuing a Subscription
• Service Data flows by means of Notifications – normally when a change occurs or the value lifetime has expired

[Diagram: GRID SERVICE A (SDE A1, SDE A2) and GRID SERVICE B (SDE B1); 1 - SUBSCRIPTION; 2,.. - NOTIFICATIONS]
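A minimal sketch of Service Data with subscriptions and notifications, assuming a purely in-process model. The names `ServiceDataElement`, `GridService`, `subscribe` and `set_sde` are hypothetical and are not the OGSI or GT3 API; only the three validity fields come from the slide. Notifications triggered by an expired lifetime are not modelled here.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ServiceDataElement:
    """One named value plus its validity window (seconds since the epoch)."""
    name: str
    value: Any
    good_from: float
    good_until: float
    available_until: float


@dataclass
class GridService:
    """Holds a Service Data Set and notifies subscribers when an SDE changes."""
    name: str
    sdes: Dict[str, ServiceDataElement] = field(default_factory=dict)
    subscribers: Dict[str, List[Callable[[ServiceDataElement], None]]] = field(default_factory=dict)

    def subscribe(self, sde_name, sink):
        # A subscription registers interest in one SDE.
        self.subscribers.setdefault(sde_name, []).append(sink)

    def set_sde(self, sde_name, value, lifetime=60.0):
        # Updating an SDE pushes a notification to every subscribed sink.
        now = time.time()
        sde = ServiceDataElement(sde_name, value, now, now + lifetime, now + lifetime)
        self.sdes[sde_name] = sde
        for sink in self.subscribers.get(sde_name, []):
            sink(sde)


received = []
service_a = GridService("A")
service_a.subscribe("A1", lambda sde: received.append((sde.name, sde.value)))
service_a.set_sde("A1", "load=0.3")
service_a.set_sde("A2", "disk=80%")   # nobody subscribed to A2, so no notification
service_a.set_sde("A1", "load=0.7")
```

Only the two updates of `A1` reach the subscriber; the update of `A2` changes the Service Data Set without generating traffic.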

Page 10: OGSA/GT3 evaluation at CERN

TestBeds

• First hand experience on Globus Toolkit 3
  – This can be achieved only by using it!
• The main tools are prototypes, with the following common features:
  – Small
  – Working (with limited functionality)
  – No architectural ambition
  – Engineering approach
    • Mapping of functionality – prototype functions
• GT3 TestBed
  – 4 CERN machines + 1 in Moscow
  – Focus on GT3 basic functionality and performances
    • Performance tests also use some high performance machines and Lxplus (CERN Linux interactive service)
• AliEn TestBed
  – 3 CERN TestBed machines
• “Architecture” TestBed
  – Focus on the complexity of future possible architectures
  – Deployment use cases

Page 11: OGSA/GT3 evaluation at CERN

Example: GT3 Test Bed

• Resource broker and L&B (Custom service)
  – Surprisingly fast to set up
• A few computing elements (GT3-GRAM, with modifications)
  – 2 PC boxes in the CERN Computing Centre
  – In a second phase, one PC located in Moscow was added
  – Some problems (solved) in data stage-in/stage-out
  – See GRAM comments in the performance part
• Information service (GT3-IS)
  – Native GT3 service
  – In this TestBed it talks only with other services

[Diagram labels: ToolKit, GT3 Services, Security]

Page 12: OGSA/GT3 evaluation at CERN

GT3 TestBed coverage…

PortType            Operation                     OGSA/GT3 evaluation
GridService         FindServiceData               GT3 TestBed RB uses it to retrieve data from IS; IS performance tests (C client)
                    SetTerminationTime            Not used yet (directly)
                    Destroy                       Everywhere, e.g. GRAM
NotificationSource  SubscribeToNotificationTopic  IS performance tests (data sources)
NotificationSink    DeliverNotification           IS performance tests (listener)
Registry            RegisterService               Code examples
                    UnRegisterService             Code examples
Factory             CreateService                 Via GRAM (first tests); specific tests using DummyService
HandleMap           FindByHandle                  Not used yet

Slide callouts:
• Every service must implement this PortType
• Modelling activity of this type of service starting

Page 13: OGSA/GT3 evaluation at CERN

Prototypes developed within the project

• Performance Prototypes
  – Dummy Service
  – Dummy Secure Service
  – Dummy Service with Service Data
  – Dummy Service with Notifications
  – Dummy Service + Index Service
  – Index Listener
• Higher Level Prototyping
  – File Catalog Service
  – Metadata Catalog Service
  – Storage Element Service
  – Workload Management Service
  – Computing Element
  – Authentication and Authorization
• Implementation of deployment use cases
  – Remote installation (via dedicated custom services)
  – Remote management of different versions of a service
• Globus 3 “components” tests
  – GRAM tests
  – Index Service tests
  – Reliable File Transfer tests
  – GSI (Security) tests

Page 14: OGSA/GT3 evaluation at CERN

Globus Toolkit 3 Overview

• GT3 is the first complete implementation of the OGSI specification
  – The development process is much easier when compared with GT2.
    • Steep learning curve should be taken into account!
    • New approach to service design and implementation
  – Deployment Tools (not yet complete)
• Backward compatibility:
  – All GT2 components are shipped with the GT3 full bundle
  – Others are completely independent implementations (e.g. MDS2 and MDS3)
• A large user community is being built
• Incomplete documentation
  – But not so bad...
  – Getting much better now (tutorials, etc...)
• Several bugs found in these exercises
  • Core implementation related: due to the short lifetime of the framework
  • From tools deployed with the framework: hard to solve (e.g. Axis)
  • From the outside: easy to solve (e.g. Tomcat)
• GT2 GRAM, with an OGSI-compliant but complex architecture behind it
  – Risk of losing past experience (gained within the EDG and LCG projects)
  – Confirmed by performance tests (see next slides)

Page 15: OGSA/GT3 evaluation at CERN

GT3 performance measurements (highlights)

• Goal:
  – explore GT3 under heavy load/concurrency:
    • maximal throughput/rate of GT3 services
    • see the limiting factors
  – Other dimensions to be explored:
    • Hardware
    • Hosting environment
    • Implementations
    • Security options
• Highlights from:
  – GRAM
  – DummyService (test custom service)
  – Security
  – IndexService
  – RFT

Page 16: OGSA/GT3 evaluation at CERN

GT3 GRAM performance

• Results: service node
  – Saturation throughput for job submission on the service node: 3.8 jobs/minute with an average CPU user+system usage of 62%

Comments:
• Very slow!
• Scalability issues for (heavily used) servers
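To put the quoted saturation rate in perspective, the short calculation below converts it into time per job and jobs per day. It uses only the 3.8 jobs/minute figure from the slide and assumes the rate could be sustained continuously, which the slide does not claim.

```python
JOBS_PER_MINUTE = 3.8  # saturation throughput quoted on the slide

seconds_per_job = 60.0 / JOBS_PER_MINUTE
jobs_per_day = JOBS_PER_MINUTE * 60 * 24  # assumes the rate is sustained for 24 hours

print(f"{seconds_per_job:.1f} s per job, {jobs_per_day:.0f} jobs per day")
```

That is roughly 16 seconds of service-node time per submitted job, or about 5,500 submissions per day for one node.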

Page 17: OGSA/GT3 evaluation at CERN

DummyService performance

• Setup (1)
  – each DummyService client executes the following steps:
    1. calls DummyServiceFactory to create a DummyService instance
    2. executes 2 simple methods (echo and getTime) on the DummyService instance
    3. calls DummyService instance to destroy itself
  – up to 1000 clients talking to the DummyService were run simultaneously on up to 45 client nodes (lxplus)
  – with and without authentication via GSI message level security used according to guides and tutorials at www.globus.org
  – grid service node hardware: 2 * Intel Pentium III 600MHz processors, 256MB RAM
• Setup (2)
  – same as Setup (1), but step 2. consists of 100 cycles, each of them calling the 2 simple methods (echo and getTime) on the DummyService instance
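The client workflow of both setups can be sketched as a small load generator. This is an in-process stand-in, not the actual test harness: `LocalDummyService`, `run_client` and `measure` are hypothetical names, there is no network or GSI involved, and counting two method calls per cycle is an assumption about how the rates were computed.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class LocalDummyService:
    """In-process stand-in for the remote factory and its instances (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 0
        self.live = set()

    def create(self):
        with self._lock:
            self._next_id += 1
            self.live.add(self._next_id)
            return self._next_id

    def echo(self, instance_id, message):
        return message

    def get_time(self, instance_id):
        return time.time()

    def destroy(self, instance_id):
        with self._lock:
            self.live.discard(instance_id)


def run_client(service, cycles):
    """One client: create an instance, call the two methods `cycles` times, destroy it."""
    instance_id = service.create()          # step 1
    for _ in range(cycles):                 # step 2 (cycles=1 for Setup 1, 100 for Setup 2)
        service.echo(instance_id, "ping")
        service.get_time(instance_id)
    service.destroy(instance_id)            # step 3
    return 2 * cycles                       # method calls made by this client


def measure(n_clients, cycles, workers=16):
    """Run `n_clients` clients on a thread pool and report the achieved rates."""
    service = LocalDummyService()
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        calls = sum(pool.map(lambda _: run_client(service, cycles), range(n_clients)))
    elapsed = time.perf_counter() - start
    return {
        "services_per_s": n_clients / elapsed,
        "method_calls_per_s": calls / elapsed,
        "calls": calls,
        "leaked_instances": len(service.live),
    }


result = measure(n_clients=100, cycles=100)
```

With `cycles=1` the run is dominated by instance creation and destruction, so `services_per_s` is the figure of interest; with `cycles=100` the same code stresses the method-call path and `method_calls_per_s` becomes the relevant rate.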

Page 18: OGSA/GT3 evaluation at CERN

DummyService performance

• The security overhead needs further investigation (OGSA/GT3 group and Globus)
• The results depend on the type of security. In all tested cases, the penalty is a few times a factor of 10
• Some problems have been identified by the Globus team and we will test the new version as soon as it becomes available

setup  authentication  service container  saturation throughput  average CPU u+s usage, %
1      no              GT3 standalone     41 services/s          89
1      yes             GT3 standalone     1.3 services/s         88
1      no              Tomcat             60 services/s          89
1      yes             Tomcat             1.2 services/s         88
2      no              GT3 standalone     300 method calls/s     96
2      yes             GT3 standalone     10 method calls/s      72
2      no              Tomcat             290 method calls/s     96
2      yes             Tomcat             13 method calls/s      79
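The size of the authentication penalty can be read directly off the table by dividing each unauthenticated throughput by its authenticated counterpart. The snippet below does only that arithmetic on the figures from the slide; the dictionary layout is an illustration choice.

```python
# (setup, container) -> (throughput without authentication, throughput with authentication)
measurements = {
    (1, "GT3 standalone"): (41, 1.3),    # services/s
    (1, "Tomcat"): (60, 1.2),            # services/s
    (2, "GT3 standalone"): (300, 10),    # method calls/s
    (2, "Tomcat"): (290, 13),            # method calls/s
}

# Slowdown factor introduced by switching authentication on.
penalty = {key: round(no_auth / auth, 1) for key, (no_auth, auth) in measurements.items()}

for (setup, container), factor in penalty.items():
    print(f"setup {setup}, {container}: {factor}x slower with authentication")
```

The factors range from about 22 to 50, which matches the statement that the penalty is a few times a factor of 10.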

Page 19: OGSA/GT3 evaluation at CERN

IndexService performance

• Push setup: IndexService acting as a notification source (pushing data)
  – multiple notification sinks subscribe to the IndexService "Host" Service Data Element (SDE), and are notified about each update of "Host" SDE, happening at a fixed rate
  – no security
  – grid service node hardware: 2 * Intel Pentium III 600MHz processors, 256MB RAM
• Pull setup: IndexService responding to findServiceData requests (pulling data)
  – multiple ogsi-find-service-data clients are run sequentially and in parallel asking for IndexService "Host" Service Data Element
  – no security
  – grid service node hardware: 2 * Intel Pentium III 600MHz processors, 256MB RAM

Page 20: OGSA/GT3 evaluation at CERN

IndexService performance

• Results
• No security

setup  service container  saturation throughput  average CPU u+s usage, %
push   GT3 standalone     10-15 notifications/s  81 – 87
push   Tomcat             -
pull   GT3 standalone     200 requests/s         88
pull   Tomcat             200 requests/s         90

Page 21: OGSA/GT3 evaluation at CERN

Index Service (Information service building block) in push mode

• Complex setup (here ~400 client processes distributed on several computers)
• Modelling activity required as design tool
• These tests are being used to validate the first simulation prototypes

Page 22: OGSA/GT3 evaluation at CERN

Reliable File Transfer Service

• Emphasis on reliability. Solves problems such as:
  – dropped connections,
  – machine reboots,
  – temporary network outages, etc
• Functionality: OK
• Main problem: resource hog
• Comprehensive report submitted to Globus
• Fix found by the GT3 team
  – Functionality tests on the new version OK (it ships with 3.2)
  – We agreed to test it in detail
• Open chapters
  – gridFTP performances (the RFT “engine”)
    • WU-FTP and the new globusFTP
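The reliability goal listed on this slide (surviving dropped connections and outages) is commonly met by checkpointing progress and resuming from the last checkpoint. The sketch below illustrates that general idea only; it is not how RFT is implemented, and `FlakySource`, `reliable_transfer` and the `state` dictionary are hypothetical.

```python
class FlakySource:
    """Hypothetical data source that drops the connection on selected reads."""

    def __init__(self, data, fail_on_reads):
        self.data = data
        self.fail_on_reads = set(fail_on_reads)
        self.reads = 0

    def read(self, offset, size):
        self.reads += 1
        if self.reads in self.fail_on_reads:
            raise ConnectionError("connection dropped")
        return self.data[offset:offset + size]


def reliable_transfer(source, total_size, state, dest, chunk_size=4, max_retries=5):
    """Copy `total_size` bytes into `dest`, resuming from the offset saved in `state`."""
    retries = 0
    while state["offset"] < total_size:
        try:
            chunk = source.read(state["offset"], chunk_size)
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            continue  # retry from the last checkpoint, not from the beginning
        dest.extend(chunk)
        state["offset"] += len(chunk)  # checkpoint after each chunk is written
    return retries


payload = b"0123456789abcdef"
source = FlakySource(payload, fail_on_reads={2, 3})
state = {"offset": 0}   # surviving a reboot would require keeping this in persistent storage
dest = bytearray()
retries = reliable_transfer(source, len(payload), state, dest)
```

Two reads fail in this run, yet the copy completes without re-sending the bytes already written, because each retry starts from the saved offset.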

Page 23: OGSA/GT3 evaluation at CERN

Integration

• The Grid is mainly concerned with interoperability among heterogeneous grid components
• Heterogeneous Grid environments
  – AliEn (Alice Environment; LHC ALICE experiment)
    • Should provide first-hand experience within LCG
• Heterogeneous Grid technologies (non GT3)
  – OGSI.NET, MS.NETGrid (.NET environment)
  – Unicore, others…
  – Discussion with some teams at GGF9; to be restarted end of November (after SC2003)
• Necessary to validate GT3 itself!

Page 24: OGSA/GT3 evaluation at CERN

Do not lose the “past” experience

• Assessment of OGSA/GT3
• Strategy defined in coordination with embryonic EGEE teams (last August)
  – EDG
    • Major issues so far: GRAM and Information Service
  – LCG deployment
    • Major issues so far: GRAM, Information Services and configuration issues
  – VDT
    • CondorG/GT3 becoming available
    • Agreement to use it in our test
  – eScience gap analysis (Geoffrey Fox report)
    • Used to inform the original evaluation plan

Page 25: OGSA/GT3 evaluation at CERN

Relationship building: Globus Toolkit 3

• Contact with the GT3 team
  – Few formalities, working relationship
  – We acknowledge the good relationship with the GT3 team
• Status of the interactions:
  – Access to unreleased software; agreed mechanisms to discuss and give feedback
  – Job Gatekeeper (GRAM)
    • Feedback
    • More priority on performances inside the GT3 team since
  – Reliable File Transfer (RFT)
    • Issues (high CPU consumption) confirmed. Fix available
    • Access to the experimental trunk for verification
  – Index Server (IS)
    • Several issues being discussed; 3.2 IS available
  – Security (GSI)
    • Issues being discussed; fix available soon

Page 26: OGSA/GT3 evaluation at CERN

Conclusions

• Much progress has been made in a short time…
• Generally impressed with GT3 and the overall concept
• Some major issues around the performance of the hosting environments and the factories
  – Continue working closely with Globus
• Continue to validate this approach and prototype interfaces and services in a GT3 context
  – Be able to hand over a consistent picture to EGEE
• Continue to take into account use cases / valuable information from any source
  – EDG, LCG,
  – HepCAL, ARDA
  – …