Batch processing with WebSphere

© 2010 IBM Corporation Batch processing with WebSphere Sridhar Sudarsan, Chief Architect, Batch processing strategy [email protected] 4th March 2011

Upload: ibm-software-polska

Post on 20-Aug-2015


Page 1: Batch processing with WebSphere

© 2010 IBM Corporation

Batch processing with WebSphere

Sridhar Sudarsan, Chief Architect, Batch processing strategy

[email protected]

4th March 2011

Page 2: Batch processing with WebSphere


Outline and objectives

WebSphere batch solutions overview

– Architecture

– Components

– Topology

WebSphere Batch offerings

– WebSphere Feature Pack for Modern Batch

– WebSphere Compute Grid

Summary

Page 3: Batch processing with WebSphere


WebSphere Extended Deployment (XD) is now 3 separate products

Software to virtualize, control, and turbo-charge your application infrastructure

– Infrastructure Optimization: Intelligent Workload Management, Virtualization (VIRTUAL ENTERPRISE)

– Data Fabrics & Caching (EXTREME SCALE)

– Innovative Application Patterns, like Java batch, beyond OLTP (COMPUTE GRID)

– Automatic Sense & Respond Management (VIRTUAL ENTERPRISE)

Page 4: Batch processing with WebSphere


What is Compute Grid (CG)? -- J(2)EE View

Set of binaries that get deployed to WebSphere Application Server Network Deployment (WAS ND) nodes within a cell

Those nodes are then CG-enabled, and become the potential “Grid Execution Environment (GEE)” (or “Long Running Execution Environment (LREE)”)

Java developers use the CG framework to code the batch application and deploy it as a typical .ear file

ND admins manage the .ear like any other WAS ND application (same console, same skills)

Some additional components, and a Job Management Console, of which more later.

Page 5: Batch processing with WebSphere


Batch Applications Need Batch Middleware

Application Support and Batch Middleware components:

– Batch Application: business and custom data access logic

– Batch Framework: library of common data access and utilities

– Job Control Language: declarative job definition (XML based)

– Batch Container: runtime engine for batch applications

– Job Scheduler: job dispatcher, operational control point

– PJM, WLM, HA: parallelization, WLM, and availability

– Logging/archival: manager for job history and output

– Resource Mgmt: rule-based CPU and file limits

– Security: security for jobs and job operations

Page 6: Batch processing with WebSphere


Fundamental Concept -- WAS Provides the Foundation ...

... and Java EE solutions can leverage that foundation to provide additional functionality such as batch processing:

[Diagram layers: OS Platform; WAS (multiple JVM model; Web and EJB containers; JDBC, JCA, JMS, MQ; security management; transaction management; thread management; integration with WLM); Batch Execution Platform Solution (Batch Feature Pack, WebSphere Compute Grid); your batch business function]

Several things going on here:

– You are relieved of having to code custom middleware functionality. You stay focused on core business imperatives.

– The IBM batch developers are relieved of having to re-invent WAS platform functionality. This is important -- it helps maintain currency of specification support and keeps product service aligned. It allows for focused attention on batch functionality development and not on lower-level foundational issues.

– This is what allows the mixing of OLTP and batch while maintaining SLAs for both, especially when deployed to z/OS or co-deployed with WVE on distributed platforms. WAS z/OS has very rich workload classification and system resource management functionality. The IBM Java batch solutions ride on top of that existing capability.

Page 7: Batch processing with WebSphere


Let's look at the basic WebSphere Batch runtime

[Diagram: a WAS server hosting the job dispatcher, with xJCL and the Job Repository; WAS Server 1 hosting the Batch Container, the Batch App and Job #1; the App Data store]

Job Scheduler/Dispatcher (JS)

– The job entry point to Compute Grid

– Job life-cycle management (Submit, Stop, Cancel, etc) and monitoring

– Dispatches workload to either the PJM or GEE

– Hosts the Job Management Console (JMC)


Grid Endpoints (GEE)

– Executes the actual business logic of the batch job

– Hosts the programming model

xJCL

– XML descriptor for the job

– Allows variable substitution

Dispatcher interfaces

– Command window

– EJB call

– JMC

Page 8: Batch processing with WebSphere


WebSphere Compute Grid: Job step

WebSphere Compute Grid enables Java as a language for batch workloads on the mainframe or in distributed environments. It creates an infrastructure for batch and OLTP processing that can share business logic to lower costs, eliminate the batch window, and deliver high availability.

A simplified batch job

Input → Batch Job Step (Map Data to Object → Transform Object → Map Object to Data) → Output

Input and output data stream types shown: Fixed Block Dataset, Variable Block Dataset, JDBC, JDBC w/ Batching, File, iBATIS. More to come…
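The three stages of the simplified job step (map data to object, transform object, map object to data) can be sketched in plain Java. This is only an illustration: the `Account` class, the comma-separated record layout and the 1% interest rule are assumptions made up for the example, not part of the product or the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of the slide's pipeline: map data to object, transform, map object to data. */
final class SimpleBatchStep {

    /** Hypothetical domain object; the fields are assumptions for illustration. */
    static final class Account {
        final String id;
        final long balanceCents;

        Account(String id, long balanceCents) {
            this.id = id;
            this.balanceCents = balanceCents;
        }
    }

    /** Map data to object: parse one "id,balance" input record. */
    static Account mapToObject(String line) {
        String[] parts = line.split(",");
        return new Account(parts[0].trim(), Long.parseLong(parts[1].trim()));
    }

    /** Transform object: the business logic (here, add 1% interest, rounded down). */
    static Account transform(Account in) {
        return new Account(in.id, in.balanceCents + in.balanceCents / 100);
    }

    /** Map object to data: format the object as an output record. */
    static String mapToData(Account out) {
        return out.id + "," + out.balanceCents;
    }

    /** Runs the three stages over every input record, preserving order. */
    static List<String> run(List<String> input) {
        List<String> output = new ArrayList<>();
        for (String line : input) {
            output.add(mapToData(transform(mapToObject(line))));
        }
        return output;
    }

    public static void main(String[] args) {
        // prints [A-1,10100, B-2,252]
        System.out.println(run(Arrays.asList("A-1, 10000", "B-2, 250")));
    }
}
```

Keeping the business rule in `transform` and the record handling in the two mapping methods mirrors the separation the slide draws between the job step and its input and output streams.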

Page 9: Batch processing with WebSphere


The Batch Programming Model

Functions and class libraries supplied with the Feature Pack and Compute Grid

– Job Control: xJCL -- very much like traditional JCL, except it is coded in XML. Equivalents to JOB cards, DD statements, STEPs, etc.

– Batch Controller Bean: part of the Batch Container code supplied by IBM

– Batch Data Streams: provides data input and output services for the job steps

– Checkpoint Algorithms: service to programmatically determine and handle checkpointing

– Results and Return Codes: services to determine, manipulate and act upon return codes, both at the application and system level

– Job Step Control: invoking and coordinating processing between steps

[Diagram: the Batch Container hosting the Batch App (POJO) with Step 1, Step 2, ... Step n]

– Development Libraries: RAD or Eclipse

– WAS Runtime Interfaces: JDBC, JCA, Security, Transaction, Logging, Deployment, etc.

Page 10: Batch processing with WebSphere


The anatomy of a transactional batch application – batch job step

Batch Programming Model

Lifecycle: the Batch Container calls the job step in this order:

1. setProperties(Properties p)

2. createJobStep()

3. processJobStep()

4. destroyJobStep()

Compute Grid makes it easy for developers to create transactional batch applications by allowing them to use a streamlined POJO model and to focus on business logic and not on the batch infrastructure.
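The four lifecycle calls can be illustrated with a small, self-contained sketch. Only the four method names come from the slide; the `JobStep` interface below is a hypothetical stand-in (not the product's interface), and the return types, the `record.count` property and the tiny `runStep` driver that plays the container's role are all assumptions.

```java
import java.util.Properties;

final class JobStepLifecycleDemo {

    /** Hypothetical stand-in for the job step contract; only the method names come from the slide. */
    interface JobStep {
        void setProperties(Properties p);
        void createJobStep();
        boolean processJobStep(); // assumption: true = call again, false = step complete
        int destroyJobStep();     // assumption: returns the step return code
    }

    /** Example step that processes a number of records taken from its properties. */
    static final class CountingStep implements JobStep {
        private Properties props;
        private int limit;
        private int processed;

        @Override public void setProperties(Properties p) { this.props = p; }

        @Override public void createJobStep() {
            // "record.count" is a made-up property name for this sketch
            limit = Integer.parseInt(props.getProperty("record.count", "0"));
            processed = 0;
        }

        @Override public boolean processJobStep() {
            if (processed >= limit) return false;
            processed++;               // one unit of business logic per call
            return processed < limit;
        }

        @Override public int destroyJobStep() { return 0; }

        int processed() { return processed; }
    }

    /** Minimal stand-in for the container: drives the lifecycle in the slide's order. */
    static int runStep(JobStep step, Properties props) {
        step.setProperties(props);       // 1
        step.createJobStep();            // 2
        while (step.processJobStep()) {  // 3: repeated until the step reports completion
            // nothing to do between iterations in this sketch
        }
        return step.destroyJobStep();    // 4
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("record.count", "3");
        CountingStep step = new CountingStep();
        int rc = runStep(step, p);
        System.out.println("rc=" + rc + ", processed=" + step.processed()); // rc=0, processed=3
    }
}
```

The point of the design is that the step class holds only business logic, while the loop, ordering and cleanup belong to the caller.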

Page 11: Batch processing with WebSphere


WebSphere Batch makes it easy for developers to encapsulate input/output data streams using POJOs that optionally support checkpoint/restart semantics.

Checkpoint & Restart with Batch Data Streams

Job Start: the Batch Container calls

1. open()

2. positionAtInitialCheckpoint()

3. externalizeCheckpoint()

4. close()

Job Restart: the Batch Container calls

1. open()

2. internalizeCheckpoint()

3. positionAtCurrentCheckpoint()

4. externalizeCheckpoint()

5. close()
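The start and restart sequences can be sketched with an in-memory stream. The method names follow the slide, but the signatures, the use of a string token holding a record index, and the `ListInputStream` class itself are assumptions for illustration, not the product's batch data stream interface.

```java
import java.util.Arrays;
import java.util.List;

final class CheckpointDemo {

    /** In-memory input stream; method names mirror the slide, signatures are assumptions. */
    static final class ListInputStream {
        private final List<String> records;
        private int restartIndex; // position restored from a checkpoint token
        private int cursor;       // index of the next record to read
        private boolean open;

        ListInputStream(List<String> records) { this.records = records; }

        void open() { open = true; }
        void close() { open = false; }

        /** First run: start at the beginning of the data. */
        void positionAtInitialCheckpoint() { cursor = 0; }

        /** Restart: accept the token previously produced by externalizeCheckpoint(). */
        void internalizeCheckpoint(String token) { restartIndex = Integer.parseInt(token); }

        /** Restart: move to the position restored by internalizeCheckpoint(). */
        void positionAtCurrentCheckpoint() { cursor = restartIndex; }

        /** At each checkpoint: serialize the current position. */
        String externalizeCheckpoint() { return Integer.toString(cursor); }

        boolean hasNext() { return open && cursor < records.size(); }
        String next() { return records.get(cursor++); }
    }

    /** Reads two records, checkpoints, "fails", then restarts and returns the remaining records. */
    static String runWithRestart(List<String> data) {
        // Job start: open, position at the initial checkpoint, process, checkpoint, close.
        ListInputStream first = new ListInputStream(data);
        first.open();
        first.positionAtInitialCheckpoint();
        first.next();
        first.next();
        String token = first.externalizeCheckpoint();
        first.close(); // simulate the job stopping after the checkpoint

        // Job restart: open, internalize, position at the current checkpoint, continue.
        ListInputStream second = new ListInputStream(data);
        second.open();
        second.internalizeCheckpoint(token);
        second.positionAtCurrentCheckpoint();
        StringBuilder rest = new StringBuilder();
        while (second.hasNext()) {
            rest.append(second.next()).append(' ');
        }
        second.close();
        return rest.toString().trim();
    }

    public static void main(String[] args) {
        // prints r2 r3 r4
        System.out.println(runWithRestart(Arrays.asList("r0", "r1", "r2", "r3", "r4")));
    }
}
```

Because the stream, not the job step, knows how to save and restore its position, the same step code runs unchanged on a first run and on a restart.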

Page 12: Batch processing with WebSphere


xJCL -- The Job Control Definition

Not JCL (no // and no column issues) ... but amazingly similar in concepts:

<?xml version="1.0" encoding="UTF-8" ?>
<job name="name" ... >
  <jndi-name>batch_controller_bean_jndi</jndi-name>
  <substitution-props>
    <prop name="property_name" value="value" />
  </substitution-props>
  <job-step name="name">
    <classname>package.class</classname>
    <checkpoint-algorithm-ref name="chkpt"/>
    <results-ref name="jobsum"/>
    <batch-data-streams>
      <bds>
        <logical-name>input_stream</logical-name>
        <props>
          <prop name="name" value="value"/>
        </props>
      </bds>
    </batch-data-streams>
  </job-step>
  <job-step ...
</job>

Annotations on the slide:

– <job>: roughly analogous to the JOB card

– <job-step>: a job step

– <classname>: like the EXEC PGM= statement in JCL

– <batch-data-streams>: similar to DD statements

A brief sampling ... many things not shown.

Do you see the similarity to traditional JCL?
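To make the idea of substitution properties concrete, here is a minimal sketch of how placeholder values might be resolved in an xJCL fragment before the job runs. The `${name}` placeholder syntax, the `resolve` helper and the `inputDir` property are assumptions for illustration; this is not the product's resolver.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SubstitutionDemo {

    // Matches ${name}; group 1 captures the property name.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with its value from props; unknown names are left untouched. */
    static String resolve(String text, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement keeps '$' and '\' in values from being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("inputDir", "/tmp/in"); // hypothetical substitution property
        String xjclLine = "<prop name=\"FILENAME\" value=\"${inputDir}/data.txt\"/>";
        // prints <prop name="FILENAME" value="/tmp/in/data.txt"/>
        System.out.println(resolve(xjclLine, props));
    }
}
```

Resolving values at submission time is what lets one job definition be reused with different inputs, much as symbolic parameters do in traditional JCL.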

Page 13: Batch processing with WebSphere


Parallel job manager (PJM)

PJM breaks large batch jobs into smaller partitions for parallel execution

– Installed as a system application

– Can be installed to a single server or a cluster

– Provides out of the box and custom SPIs to implement

PJM is the target application of a parallel job

– PJM does not process batch data streams

– It submits or restarts sub jobs under the control of step properties which identify the sub job in the job repository and the count of sub jobs to process.

– A parallel job is submitted using the xJCL for its ‘top-level’ job that specifies these details.

PJM is a sub-job manager, where a sub-job is:

– An instance of a regular batch job that can be bounded by substitution properties specified in its xJCL.

– Submitted to the job scheduler by the PJM.

– Aggregated by the PJM into one logical top level job for status, result code, life cycle management
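The partition, submit and aggregate flow described above can be sketched with standard Java concurrency utilities. This is not the PJM or its SPIs: the `Partition` class stands in for the per-sub-job substitution properties, a thread pool stands in for the job scheduler and batch containers, and aggregating to the highest return code is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelJobSketch {

    /** Inclusive key range for one sub-job; plays the role of its substitution properties. */
    static final class Partition {
        final long first;
        final long last;

        Partition(long first, long last) {
            this.first = first;
            this.last = last;
        }
    }

    /** Splits [first, last] into at most count contiguous, near-equal partitions. */
    static List<Partition> split(long first, long last, int count) {
        List<Partition> parts = new ArrayList<>();
        long total = last - first + 1;
        long base = total / count;
        long extra = total % count; // the first 'extra' partitions get one more key
        long start = first;
        for (int i = 0; i < count && start <= last; i++) {
            long size = base + (i < extra ? 1 : 0);
            parts.add(new Partition(start, start + size - 1));
            start += size;
        }
        return parts;
    }

    /** Placeholder sub-job body: returns 0 for a valid range, 8 otherwise. */
    static int processRange(Partition p) {
        return (p.first <= p.last) ? 0 : 8;
    }

    /** Runs every partition as a sub-job and aggregates to the highest return code. */
    static int runAll(List<Partition> parts, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (Partition p : parts) {
                Callable<Integer> subJob = () -> processRange(p);
                results.add(pool.submit(subJob));
            }
            int rc = 0;
            for (Future<Integer> f : results) {
                rc = Math.max(rc, f.get());
            }
            return rc;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("sub-job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Partition> parts = split(1, 10, 3);
        for (Partition p : parts) {
            System.out.println(p.first + ".." + p.last); // 1..4, 5..7, 8..10
        }
        System.out.println("rc=" + runAll(parts, 2)); // rc=0
    }
}
```

The top-level caller sees one result for the whole job, which corresponds to the PJM presenting the sub-jobs as a single logical job.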

Page 14: Batch processing with WebSphere


Compute Grid runtime components with PJM

[Diagram: a WAS server hosting the Job Scheduler, with xJCL and the Job Repository (Sub Job Name); a WAS server whose Batch Container hosts the Parallel Job Manager with its Parameterizer, Logical TX Synchronization, SubJob Analyzer and SubJob Collector SPIs; WAS Server 1 through WAS Server N, each with a Batch Container, a Batch App and a SubJob Collector SPI, running SubJob #1 through SubJob #N within a logical transaction scope]

Page 15: Batch processing with WebSphere


Logical Deployment

[Diagram: jobs flow to the Job Scheduler from a Workload Scheduler (e.g. TWS) through the Workload Connector, from the Jobs Console, and from Online Applications through APIs such as public submit(Job j) { _sched.submit(j); }. The Job Scheduler and the Parallel Job Manager pass jobs to Batch Containers organized per line of business.]

Page 16: Batch processing with WebSphere


Physical Deployment - Distributed

[Diagram: a WAS ND cell containing the Job Scheduler, the PJM and groups of Batch Containers per line of business; jobs arrive from an enterprise scheduler like TWS and from Online Applications]

Page 17: Batch processing with WebSphere


Admin & Configuration with WAS admin console

Page 18: Batch processing with WebSphere


Integrated operational control

Provides an operational infrastructure for job life cycle

Integrates with existing enterprise schedulers such as Tivoli Workload Scheduler

Provides log management and integrates with archiving and auditing

Provides resource usage monitoring

Integrates with existing security and disaster recovery procedures

Configures as a highly available component

[Diagram: bulk processing capabilities and the Compute Grid components that provide them]

– Bulk application container: WCG Batch Container

– Information storage and data access management services: "File" data access, queue based data access, in-memory data access, custom data access

– Infrastructure services: WCG Batch Framework

– Bulk application development: environment for creating and migrating bulk applications (WCG Eclipse Plugin)

– System management and operations: manage, monitor and secure bulk processes

– Analytics for scheduling, check-pointing, resource management

– WCG Scheduler Gateway

– Bulk partner services; business process and event services

– Scheduler services, invocation and scheduling optimization: resource brokering, split and parallelize, pace, throttle

– Invocation services: ad hoc, planned

Page 19: Batch processing with WebSphere


Job Management Console – View jobs

Page 20: Batch processing with WebSphere


Job management console: Job schedules

• Save a job definition

– xJCL

– Schedule

• Date and time

• Repeating

• Manage schedules

– View details

– Cancel

Page 21: Batch processing with WebSphere


Benefits of running WebSphere Compute Grid on z/OS

Page 22: Batch processing with WebSphere


Essential Story – Exploitation of Lower Level Benefits

[Diagram: layered stack]

– WebSphere Compute Grid: function common across all platforms

– WebSphere Application Server z/OS: function common across all platforms

– z/OS and Parallel Sysplex: WLM, RRS, SMF, SAF, Shared Data, CoLocation

– System z: Inherent Reliability, zAAPs

Arrow labels in the diagram: Awareness of WAS z/OS; Exploit WAS z/OS; Function Specific to z/OS; Exploit Platform

Page 23: Batch processing with WebSphere


zAAPs – Providing a Java Cost Advantage on z/OS

Java workload offloaded to zAAP processors

Completely transparent to Java applications, including batch

Benefits:

• MIPS related to Java on zAAPs are not counted towards other software monthly license charges

• Frees GPs to do traditional z/OS work, such as CICS, DB2 and IMS

Page 24: Batch processing with WebSphere


RRS – Sysplex Wide Global Transaction Syncpoint Coordinator

Very fast and reliable

Excels at TX rollback when needed

Page 25: Batch processing with WebSphere


WLM Classification – Prioritize Work

Prioritize Compute Grid Relative to Other Tasks within the z/OS System

Prioritize Batch Jobs Relative to Other Batch Jobs within Compute Grid

[Diagram: within WebSphere Compute Grid z/OS, a Higher Priority Job, a Medium Priority Job and a Lower Priority Job, ranging from relatively more system resources to relatively fewer]

This is an example of WAS z/OS exploiting WLM.

Page 26: Batch processing with WebSphere


SMF – Accounting Information ... Very Efficient, Very Fast

[Diagram: WebSphere Compute Grid z/OS, WAS z/OS, RMF, DB2, CICS, MQ and other z/OS subsystems and facilities use the z/OS SMF interface, with memory buffers and SMF data sets feeding data analysis tools]

SMF record Type 120, Subtype 20:

– Job identifier

– Job submitter

– Final job state

– Server

– Node

– Accounting information

– Job start time

– Last update time

– CPU consumed

Uses:

• Chargeback

• Performance and Tuning

• Capacity Planning
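As an illustration of the chargeback use case, the sketch below totals CPU consumed per accounting code from job usage records. It does not read or parse real SMF data: the `JobUsage` class simply borrows a few field names from the list above, and the sample values and the millisecond unit are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ChargebackSketch {

    /** Hypothetical usage record; fields borrow names from the slide, not the SMF layout. */
    static final class JobUsage {
        final String jobId;
        final String submitter;
        final String accountingInfo;
        final long cpuMillis; // assumption: CPU consumed expressed in milliseconds

        JobUsage(String jobId, String submitter, String accountingInfo, long cpuMillis) {
            this.jobId = jobId;
            this.submitter = submitter;
            this.accountingInfo = accountingInfo;
            this.cpuMillis = cpuMillis;
        }
    }

    /** Sums CPU consumed per accounting code, sorted by code. */
    static Map<String, Long> cpuByAccount(List<JobUsage> records) {
        Map<String, Long> totals = new TreeMap<>();
        for (JobUsage r : records) {
            totals.merge(r.accountingInfo, r.cpuMillis, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<JobUsage> records = Arrays.asList(
                new JobUsage("JOB1", "alice", "FIN", 200),
                new JobUsage("JOB2", "bob", "FIN", 150),
                new JobUsage("JOB3", "carol", "HR", 40));
        System.out.println(cpuByAccount(records)); // {FIN=350, HR=40}
    }
}
```

The same grouping could be keyed on submitter, server or node to support the performance and capacity planning uses.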

Page 27: Batch processing with WebSphere


Parallel Sysplex – Availability and Scalability

[Diagram: three z/OS instances, each running WAS z/OS + Compute Grid alongside DB2, CICS, IMS and MQ, with local data caches and centralized shared data structures with integrated data locking and update]

Direct value to Compute Grid and your batch processes:

– Proven scalability: near linear up to 32 nodes in the Sysplex

– Availability: this provides the foundation for a highly available architecture

– Parallel jobs: excellent platform on which to use Compute Grid’s Parallel Job Manager

Page 28: Batch processing with WebSphere


SAF – Centralized Security

Centralized SAF Security Repository

• Userids and Groups

• EJBROLE Role Enforcement

• Digital Certificates and Keyrings

• Much more related to WAS z/OS Security

• Extensive auditing

Proven secure; centralization enables tighter control

Page 29: Batch processing with WebSphere


WebSphere Batch control by external workload scheduler (e.g. Control-M, etc)

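External Scheduler Integration on z/OS

[Diagram: Tivoli Workload Scheduler submits and monitors a job in JES; the WSGrid step submits and monitors the xJCL job in the WebSphere Batch Job Scheduler through MQ messages; the Job Scheduler runs it in the WAS Batch App]

JCL job in JES:

//JOB1 JOB ‘…’
//STEP1 PGM=IDCAMS
//STEP2 PGM=WSGRID,
//WGJOB DD *
<job … >…</job>

xJCL job in WebSphere Batch:

<job name="JOB1" …
<job-step name="STEP2"> …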

• JCL/xJCL jobs have synchronized lifecycle

• xJCL job restartable from JCL job

• xJCL job log piped to JCL job, written to SYSOUT dataset

• xJCL job RC is step RC in JCL job

Page 30: Batch processing with WebSphere


WSGrid JCL Example

Page 31: Batch processing with WebSphere


WSGrid JCL Job Output (SYSPRINT DD – Top of File)

Page 32: Batch processing with WebSphere


WSGrid JCL Job Output (SYSPRINT DD – Bottom of File)

Page 33: Batch processing with WebSphere


Revisit the Picture from a Higher Perspective

Just to reinforce the key concepts ...

[Diagram: System Platform; WebSphere Application Server Runtime; AppServer JVM with the Batch Container (your batch applications deployed into the batch-enabled server or cluster); Eclipse or RAD with Batch Data Stream development and framework classes; xJCL job definition file; Job Scheduler; Job Console reached by browser, Web Services, EJB IIOP; other batch container end points]

– Scheduler dispatches to end points based on knowledge of the environment

– Batch platform (not just a programming framework), built on WAS

– Avoid custom middleware: IBM supplies middleware, you focus on your business batch requirements

– Platform exploitation below the open standard specification line

– If WAS z/OS, then we have the whole "Why WAS z/OS" story to tell. If it applies to OLTP, it applies to batch as well.

Page 34: Batch processing with WebSphere


Feature-set Options

– WebSphere App Server with the WebSphere Batch Feature Pack: Job Scheduler, Batch Container, Batch Toolkit

– WebSphere Compute Grid product: Job Scheduler, Batch Container, Batch Toolkit, Parallel Job Manager, Enterprise Connectors, Advanced Operations Pack

Start with the Feature Pack; grow into Compute Grid!

Page 35: Batch processing with WebSphere


Features and QoS guidance for choosing the optimal deployment option for batch workloads

Deployment options:

1. WAS Base with FeP for Modern Batch

2. WAS on z/OS, WAS ND with FeP for Modern Batch

3. WebSphere Compute Grid

Checked for all three options:

– Common batch container, development tools to develop batch applications, operational commands to manage batch job life cycle

– Container managed checkpoint/restart capabilities

– Job management console

– Application Execution Platform

– Basic Scheduler/Job dispatcher

– System managed job logs

Checked for two options (presumably options 2 and 3):

– High availability and clustering of Batch Job Scheduler/Job Dispatcher

– Multi-site disaster recovery for batch platform

– Integration with WLM on z/OS

Checked for one option (presumably WebSphere Compute Grid):

– Interoperability between Java and COBOL on z/OS

– Non-disruptive batch application update/endpoint quiesce

– Job usage accounting, including SMF integration on z/OS

– Job classes and workload classification

– Integrated “Parallel Job Manager” for job parallelization across multiple JVMs

– Enterprise Scheduler connectors

– Enterprise Monitoring capabilities

– Disaster Recovery with operational state transfer

– Integration with VE for goal oriented job placement

Page 36: Batch processing with WebSphere


Summary

WebSphere Batch solutions create the separation of concerns between business and application logic and the batch infrastructure.

WebSphere Batch solutions provide an environment and infrastructure for running mixed workloads in Java efficiently

WebSphere Batch solutions are strategically important and a fundamental component of IBM’s Batch infrastructure leadership

– WebSphere Compute Grid provides market leading capabilities for development to accelerate time-to-value for clients

– WebSphere Compute Grid is production ready with many customers running mission critical Batch workloads

Page 37: Batch processing with WebSphere

Back Office Operation Center – New Assets Overview and Insights

Page 38: Batch processing with WebSphere


Comparison of JZOS and WebSphere Compute Grid (WCG) – Where they are the same …

Java Batch Execution

– Both JZOS and WCG provide an environment to execute Java batch programs

JES/JCL Jobs

– Both JZOS and WCG workload can be described/submitted/run through JES/JCL

Control-M Scheduling

– Both JZOS and WCG workload can be directly scheduled/controlled by TWS

Managed Job Restart

– Both JZOS and WCG workload can be restarted through TWS

SMF Usage Recording

– Both JZOS and WCG workload can be measured with SMF records

Page 39: Batch processing with WebSphere


Comparison of JZOS and WCG – Where they differ …

Feature | JZOS | WCG

Transactionality | Local transaction mode only | Local transaction mode (1PC); RRS transaction mode (2PC); XA transaction mode (2PC)

Service Integration | Remote calls only | Remote calls; local, optimized calls (co-location)

Inter-language | Java/COBOL interoperability, but NO connection sharing | Java/COBOL interoperability with DB2 connection sharing

Java Services | J2SE, JZOS | J2SE, JZOS, J2EE, WS*

Environment | JES-managed Batch Initiator | JES-managed Batch Initiator + WebSphere Application Server

JVM LifeCycle | Disposable JVM | Reusable JVM (operational efficiency)

Checkpoints | Application-managed | System-managed (operational optimization)

Parallelization | Ad-hoc or roll-your-own | System-managed (operational control)