apache sling - distributed eventing, discovery, and jobs (adaptto 2013)

46
APACHE SLING & FRIENDS TECH MEETUP BERLIN, 23-25 SEPTEMBER 2013 Distributed Eventing and Jobs Carsten Ziegeler | Adobe Research Switzerland 1

Upload: carsten-ziegeler

Post on 10-May-2015

1.001 views

Category:

Technology


1 download

DESCRIPTION

Presentation from adaptTo 2013 in Berlin about Apache Sling's discovery and job handling modules

TRANSCRIPT

Page 1: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

APACHE SLING & FRIENDS TECH MEETUP BERLIN, 23-25 SEPTEMBER 2013

Distributed Eventing and Jobs Carsten Ziegeler | Adobe Research Switzerland

1

Page 2: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

About

2

§  RnD Team at Adobe Research Switzerland §  Co-founder Adobe Granite

§  OSGi Core Platform and Enterprise Expert Groups

§  Member of the ASF §  Current PMC Chair of Apache Sling §  Apache Sling, Felix, ACE

§  Conference Speaker §  Technical Reviewer §  Article/Book Author

§  [email protected] §  @cziegeler

Page 3: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Overview

3

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g  

Page 4: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Overview

4

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g  

Page 5: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

OSGi Event Admin Publish Subscribe Model

5

OSGi  Event  Admin  Component  A  publish

deliver

Component  X  

Component  Y  

Page 6: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

OSGi Event Admin Publish Subscribe Model

6

§  OSGi event is a data object with §  Topic (hierarchical namespace) §  Properties (key-value-pairs)

§  Resource Event §  Topic:

org/apache/sling/api/resource/Resource/ADDED §  Properties: path, resource type etc.

Page 7: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

OSGi Event Admin Publish Subscribe Model

7

§  Publisher creates event object §  Sends event through EventAdmin service §  Either sync or async delivery

§  Subscriber is an OSGi service (EventHandler) §  Service registration properties §  Interested topic(s)

§  org/apache/sling/api/resource/Resource/*  

§  Additional filters (optional) §  (path=/libs/*)  

Page 8: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

OSGi Event Admin Publish Subscribe Model

8

§  Immediate delivery to available subscribers

§  No guarantee of delivery §  No distributed delivery

Page 9: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

OSGi Event Admin Publish Subscribe Model

9

§  Immediate delivery to available subscribers

§  No guarantee of delivery §  No distributed delivery

Discovery  Sling   Job  Distribu3on  

Page 10: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Overview

10

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g  

Page 11: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Overview

11

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g  

Page 12: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Installation Scenarios

12

Clustered  JCR  

Single  Instance  

JCR  

Instance  1  

Instance  2  

Instance  3  

Page 13: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Topologies I Apache Sling Discovery

13

Clustered  JCR  JCR  

ID  :  A   ID  :  X   ID  :  42  ID  :  1  

Single  Instance  

Instance  1  

Instance  2  

Instance  3  

§  Instance: Unique Id (Sling ID)

Page 14: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Topologies I Apache Sling Discovery

14

§  Instance: Unique Id (Sling ID) §  Cluster: Unique Id and leader

                     

Cluster  99  

                     

             Cluster  35  

Clustered  JCR  JCR  

ID  :  A   ID  :  X   ID  :  42  ID  :  1  

Single  Instance  

Instance  1  

Instance  2  

Instance  3  

Leader   Leader  

Page 15: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Topologies I Apache Sling Discovery

15

§  Topology: Set of clusters

                     

Cluster  99  

                     

             Cluster  35  

Clustered  JCR  JCR  

ID  :  A   ID  :  X   ID  :  42  ID  :  1  

Single  Instance  

Instance  1  

Instance  2  

Instance  3  

Leader   Leader  

Topology  Topology  

Page 16: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

                     

Cluster  99  

                     

             Cluster  35  

Topologies I Apache Sling Discovery

16

Clustered  JCR  JCR  

ID  :  A   ID  :  X   ID  :  42  ID  :  1  

Single  Instance  

Instance  1  

Instance  2  

Instance  3  

Leader   Leader  

Topology  

§  Topology: Set of clusters

Page 17: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Topologies II Apache Sling Discovery

17

§  Instance §  Sling ID §  Optional:

Name and description §  Belongs to a cluster

§  Might  be  the  cluster  leader  

§  Additional distributed properties §  Extensible  through  own  services  

(PropertyProvider)  §  E.g.  data  center,  region  or  enabled  job  topics  

                 

                 

Cluster  99  

ID  :  42  

Instance  3  

Topology  

Leader  

Page 18: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Topologies II Apache Sling Discovery

18

§  Cluster §  Elects (stable) leader §  Stable instance ordering

                 

                 

Cluster  99  

ID  :  42  

Instance  3  

Topology  

Leader  

Page 19: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Topologies II Apache Sling Discovery

19

§  TopologyEventListener §  Receives events on

topology changes §  Topology is changing §  Topology changed §  Properties changed

                 

                 

Cluster  99  

ID  :  42  

Instance  3  

Topology  

Leader  

Page 20: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Overview

20

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g  

Page 21: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Handling I Apache Sling Job Handling

21

§  Job : Guaranteed processing, exactly once §  Exactly one job consumer

§  Started by client code, e.g. for replication, workflow... §  Job topic §  Payload is a serializable map

Page 22: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Handling I Apache Sling Job Handling

22

§  Sling Job Manager handles and distributes jobs §  Delivers job to a job consumer… §  …and waits for response §  Retry and failover

§  Notification listeners (fail, retry, success)

Page 23: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Starting / Processing a Job I Apache Sling Job Handling

23

public interface JobConsumer { String PROPERTY_TOPICS = "job.topics"; enum JobResult { OK, FAILED, CANCEL, ASYNC } JobResult process(Job job);}  

public interface JobManager { Job addJob(String topic, String optionalName, Map<String, Object> properties); …}  

Star3ng  a  job  

Processing  a  job  

Note:  Star3ng/processing  of  jobs  through  Event  Admin  is  deprecated  but  s3ll  supported  

Page 24: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Starting / Processing a Job II Apache Sling Job Handling

24

@Component@Service(value={JobConsumer.class})@Property(name=JobConsumer.PROPERTY_TOPICS, value="org/apache/sling/jobs/backup")public class BackupJobConsumer implements JobConsumer { @Override public JobResult process(final Job job) { // do backup return JobResult.OK; }}

Page 25: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Handling I Apache Sling Job Handling

25

• New jobs are immediately persisted •  Jobs are “pushed” to the processing

instance •  Processing instances use different

queues •  Associated with job topic(s) •  Main queue •  0..n custom queues

Page 26: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Queue I Apache Sling Job Handling

26

• Queue is configurable • Queue is started on demand in own

thread •  And stopped if unused for some time

Page 27: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Queue II Apache Sling Job Handling

27

• Queue Types •  Ordered queue •  Parallel queues: Plain and Topic Round

Robin

Page 28: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Queue III Apache Sling Job Handling

28

•  Limit for parallel threads per queue • Number of retries (-1 = endless) • Retry delay •  Thread priority

Page 29: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Additional Configurations Apache Sling Job Handling

29

•  Job Manager Configuration = Main Queue Configuration •  Maximum parallel jobs (15) •  Retries (10) •  Retry Delay

•  Eventing Thread Pool Configuration •  Used by all queues •  Pool size (35) = Maximum parallel jobs for

a single instance

Page 30: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Monitoring – Web Console Apache Sling Job Handling

30

Page 31: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Monitoring – Web Console Apache Sling Job Handling

31

Page 32: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Distribution I Apache Sling Job Distribution

32

•  Each instance determines enabled job topics •  Derived from Job Consumers (new API

required) •  Can be whitelisted/blacklisted (in Job

Consumer Manager) •  Announced through Topology

Page 33: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Distribution II Apache Sling Job Distribution

33

•  Job Distribution depends on enabled job topics and queue type •  Potential set of instances derived from

topology (enabled job topics) •  Ordered: processing on leader only, one job

after the other •  Parallel: Round robin distribution on all

potential instances §  Local  cluster  instances  have  preference  

Page 34: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Distribution III Apache Sling Job Distribution

34

•  Failover •  Instance crash: leader redistributes jobs to

available instances §  Leader  change  taken  into  account  

• On enabled job topics changes: potential redistribution

Page 35: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Sling Job Distribution

35

Topology  

Instance  Sling  ID:  1  

Job  Manager  

Instance  Sling  ID:  2  

Job  Manager  

Instance  Sling  ID:  3  

Job  Manager  

Instance  Sling  ID:  4  

Job  Manager  

Job  Consumer  Topic:  A  

Job  Consumer  Topic:  B  

Job  Consumer  Topic:  C  

Page 36: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Sling Job Distribution

36

Instance  Sling  ID:  1  

Job  Manager  

Instance  Sling  ID:  2  

Job  Manager  

Instance  Sling  ID:  3  

Job  Manager  

Instance  Sling  ID:  4  

Job  Manager  A  

Job  Consumer  Topic:  A  

Job  Consumer  Topic:  B  

A:2  

Job  Consumer  Topic:  C  Job  

Topology  

Page 37: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Sling Job Distribution

37

Instance  Sling  ID:  1  

Job  Manager  

Instance  Sling  ID:  2  

Job  Manager  

Instance  Sling  ID:  3  

Job  Manager  

Instance  Sling  ID:  4  

Job  Manager  B  

Job  Consumer  Topic:  A  

Job  Consumer  Topic:  B  

B:3  

Job  Consumer  Topic:  C  Job  

Topology  

Page 38: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Sling Job Distribution

38

Instance  Sling  ID:  1  

Job  Manager  

Instance  Sling  ID:  2  

Job  Manager  

Instance  Sling  ID:  3  

Job  Manager  

Instance  Sling  ID:  4  

Job  Manager  C  

Job  Consumer  Topic:  A  

Job  Consumer  Topic:  B  

C:4  

Job  Consumer  Topic:  C  Job  

?  

Topology  

Page 39: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Discovery and Eventing

39

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g  

Page 40: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Discovery and Eventing – What’s Next?

40

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g  

New  OSGi  Specifica-on    •  Distributed  Even3ng  •  Cloud  Compu3ng  

Page 41: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Discovery and Eventing – What’s Next?

41

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g  

Job  Distribu-on    •  Improved  load  balancing  •  Pull  based  distribu3on  

Page 42: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

One More Thing…

42

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g  

Page 43: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Job Progress Tracking Apache Sling Job Processing

43

§  Jobs can inform about §  Progress (percentage) §  ETA

§  Additional informational messages §  All information is persisted

NEW  

Page 44: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Improved Failure Handling I Apache Sling Job Processing

44

§  Currently, no history of jobs §  Immediately removed once

§  Job succeeds §  Job is cancelled

§  What happened? §  What did go wrong?

Page 45: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Improved Failure Handling II Apache Sling Job Processing

45

§  Cancelled jobs are kept §  With a reason and log §  Can be retried

§  Successful jobs can be kept §  With a message and log

NEW  

Page 46: Apache Sling - Distributed Eventing, Discovery, and Jobs (adaptTo 2013)

Discovery and Eventing

46

Discovery  Sling   Job  Distribu3on  

Felix   OSGi  Event  Admin  

From

 Even3

ng  to

 Job  Processin

g