gearman: a job server made for scale

A Job Server to Scale By Mike Willbanks Software Engineering Manager CaringBridge MinneBar April 7, 2012

Upload: mike-willbanks

Post on 17-May-2015

TRANSCRIPT

Page 1: Gearman: A Job Server made for Scale

A Job Server to Scale

By Mike Willbanks

Software Engineering Manager

CaringBridge

MinneBar April 7, 2012

Page 2: Gearman: A Job Server made for Scale

•Talk

Slides will be online later!

•Me

Software Engineering Manager at CaringBridge

MNPHP Organizer

Open Source Contributor (Zend Framework and various others)

Where you can find me:

• Twitter: mwillbanks

• G+: Mike Willbanks

• IRC (freenode): mwillbanks

• Blog: http://blog.digitalstruct.com

• GitHub: https://github.com/mwillbanks

Housekeeping…

Page 3: Gearman: A Job Server made for Scale

• What is Gearman

Yeah yeah…

• Main Concepts

How it really works

• Quick Start

Get it up and running and start playing.

• The Details

How can it be a tech talk without details?

• Some use cases

How you might use it.

• Questions

Although you can bring them up at any time!

Agenda

Page 4: Gearman: A Job Server made for Scale

What is Gearman?

Official Statement

What the hell it means

Visual understanding

Platforms

Page 5: Gearman: A Job Server made for Scale

“Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages.”

Official Statement

Page 6: Gearman: A Job Server made for Scale

•Gearman consists of a daemon, client and worker

At the core, they are simply small programs.

•The daemon handles the negotiation of work

It sits between the workers and the clients.

•The worker does the work

•The client requests work to be done

What The Hell? Tell me!
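
To make the three roles concrete, here is a toy in-process model in Python. It is only a sketch: `ToyJobServer`, `count_lines` and the `"wc"` function name are made up for illustration, and a real gearmand talks to clients and workers over the network rather than through an in-memory queue.

```python
import queue
import threading


class ToyJobServer:
    """Stands in for the daemon: it only routes jobs, it never runs them."""

    def __init__(self):
        self._queues = {}  # function name -> queue of (workload, reply queue)

    def _queue_for(self, function_name):
        return self._queues.setdefault(function_name, queue.Queue())

    def submit(self, function_name, workload):
        """Client side: hand over a workload and wait for the result."""
        reply = queue.Queue(maxsize=1)
        self._queue_for(function_name).put((workload, reply))
        return reply.get(timeout=5)

    def work(self, function_name, handler, max_jobs):
        """Worker side: take jobs for one function and send results back."""
        jobs = self._queue_for(function_name)
        for _ in range(max_jobs):
            workload, reply = jobs.get()
            reply.put(handler(workload))


def count_lines(workload):
    # The actual work: count the lines in the workload string.
    return str(len(workload.splitlines()))


server = ToyJobServer()
worker = threading.Thread(
    target=server.work, args=("wc", count_lines, 1), daemon=True
)
worker.start()
result = server.submit("wc", "one\ntwo\nthree\n")
worker.join(timeout=5)
print(result)
```

The client never calls `count_lines` directly; it only knows the function name, which is what lets the worker live in another process, machine or language.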

Page 7: Gearman: A Job Server made for Scale

In Pictures

Page 8: Gearman: A Job Server made for Scale

•Gearman works on Linux

•API implementations available

PHP

Perl

Java

Ruby

Python

Platforms

Page 9: Gearman: A Job Server made for Scale

Main Concepts

Client -> Daemon -> Worker communication

Distributed Model

Page 10: Gearman: A Job Server made for Scale

Client -> Daemon -> Worker communication
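
On the wire, a client request is a small binary packet. The sketch below builds a SUBMIT_JOB request the way I understand the published Gearman protocol: a 4-byte magic, a big-endian packet type, a big-endian body length, then NUL-separated arguments. Treat the constants as assumptions to check against the protocol document; `build_submit_job` is a hypothetical helper, not part of any Gearman library.

```python
import struct

REQUEST_MAGIC = b"\x00REQ"  # requests start with a NUL byte followed by "REQ"
SUBMIT_JOB = 7              # assumed packet type for a normal foreground job


def build_submit_job(function_name, unique_id, workload):
    """Frame a SUBMIT_JOB request: header, then NUL-separated arguments."""
    body = b"\x00".join([function_name, unique_id, workload])
    header = REQUEST_MAGIC + struct.pack(">II", SUBMIT_JOB, len(body))
    return header + body


# An empty unique id leaves it to the server to treat the job as distinct.
packet = build_submit_job(b"wc", b"", b"hello\n")
print(packet)
```

In practice you would let a client library do this framing for you; the point is only that the daemon sees a function name and an opaque workload.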

Page 11: Gearman: A Job Server made for Scale

Distributed Model

Page 12: Gearman: A Job Server made for Scale

Quick Start

Installation

Simple Bash Example

PHP Related (sorry, I’m all about the PHP)

Page 13: Gearman: A Job Server made for Scale

•Head to gearman.org

•Click Download

•Click on the LaunchPad download

•Download the source tarball

•Unpack the tarball

• ./configure && make && make install

•Bam! You’re off!

For more advanced configuration see ./configure --help

•Starting

gearmand -d

Installation

Page 14: Gearman: A Job Server made for Scale

•Starting the Daemon

gearmand -d

•Worker – command line style

gearman -w -f wc -- wc -l

•Client – command line style

gearman -f wc < /etc/passwd

•Check it!

Simple Bash Example
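
What the command-line worker does with each job can be sketched in Python: the workload goes to the command's standard input, and whatever the command prints becomes the result. This assumes a `wc` binary on the PATH; `run_command_worker` is a hypothetical helper, not part of the gearman tool.

```python
import subprocess


def run_command_worker(command, workload):
    """Feed the workload to the command on stdin and return its stdout."""
    completed = subprocess.run(
        command, input=workload, stdout=subprocess.PIPE, check=True
    )
    return completed.stdout


# Stand-in for the client piping /etc/passwd to the "wc" function.
workload = b"root\ndaemon\nbin\n"
output = run_command_worker(["wc", "-l"], workload)
line_count = int(output.decode().split()[0])
print(line_count)
```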

Page 15: Gearman: A Job Server made for Scale

PHP Style

Page 16: Gearman: A Job Server made for Scale

•So, you know… we all like to talk about ourselves…

Yes, I wrote a layer on top of Zend Framework called Zend_Gearman; wow, unique.

https://github.com/mwillbanks/Zend_Gearman

PHP – Zend Framework

Page 17: Gearman: A Job Server made for Scale

The Details

Persistence

Workers

Monitoring

Page 18: Gearman: A Job Server made for Scale

•Gearman by default is an in-memory queue

Leaving this as the default is ideal; however, it does not work in all environments.

•Persistent Queues

libdrizzle

libsqlite3

libmemcached

Postgres

TokyoCabinet

MySQL

Redis

Persistence
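
The difference a persistent queue makes can be shown with a few lines of Python and SQLite. This is a sketch of the idea only: the `SqliteJobQueue` class and its table layout are invented here and are not the schema gearmand's queue plugins use.

```python
import os
import sqlite3
import tempfile


class SqliteJobQueue:
    """A job queue whose contents survive a restart of the process."""

    def __init__(self, path):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "function_name TEXT NOT NULL, "
            "workload TEXT NOT NULL)"
        )
        self._db.commit()

    def enqueue(self, function_name, workload):
        with self._db:  # commits on success
            self._db.execute(
                "INSERT INTO jobs (function_name, workload) VALUES (?, ?)",
                (function_name, workload),
            )

    def dequeue(self, function_name):
        """Return the oldest workload for the function, or None if empty."""
        with self._db:
            row = self._db.execute(
                "SELECT id, workload FROM jobs "
                "WHERE function_name = ? ORDER BY id LIMIT 1",
                (function_name,),
            ).fetchone()
            if row is None:
                return None
            self._db.execute("DELETE FROM jobs WHERE id = ?", (row[0],))
            return row[1]

    def close(self):
        self._db.close()


db_path = os.path.join(tempfile.mkdtemp(), "queue.db")

first = SqliteJobQueue(db_path)
first.enqueue("resize", "photo-1.jpg")
first.close()  # simulate the server going down with a job still queued

second = SqliteJobQueue(db_path)  # simulate the server coming back up
recovered = second.dequeue("resize")
empty = second.dequeue("resize")
second.close()
print(recovered, empty)
```

With the default in-memory queue, the job submitted before the restart would simply be gone.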

Page 19: Gearman: A Job Server made for Scale

•Persistent queues require specific configuration during the compilation of gearman.

•Additionally, arguments to the gearman daemon need to be passed to talk to the specific persistence layer.

•Each persistence layer is actually built as a plugin to gearmand

http://bazaar.launchpad.net/~tangent-org/gearmand/trunk/files/head:/libgearman-server/plugins/queue/

Getting Up and Running with Persistence

Page 20: Gearman: A Job Server made for Scale

Configuration Options

Page 21: Gearman: A Job Server made for Scale

•Clients send work to the gearmand server

This is called the workload; it can be anything that can become a string.

Utilize an open format; it will make life easier if you choose to use a different language for processing.

• XML, JSON, etc.

• Yes, you can serialize objects if you want to, though it is not recommended.

Clients
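
As a small illustration of the open-format advice, here is a workload encoded as JSON in Python. The field names (`photo_id`, `sizes`) are made up; the point is that the result is a plain string any worker language can parse.

```python
import json


def encode_workload(job):
    """Turn a job description into a string workload."""
    return json.dumps(job, sort_keys=True)


def decode_workload(workload):
    """What a worker in any language would do on the other side."""
    return json.loads(workload)


workload = encode_workload({"photo_id": 42, "sizes": [100, 400]})
job = decode_workload(workload)
print(workload)
```

A language-specific serialized object would tie every worker to the client's language and class definitions, which is why it is the less attractive option.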

Page 22: Gearman: A Job Server made for Scale

•Workers are the dudes in the factory doing all the work

•Generally they will run as a daemon in the background

•Workers register a function that they perform

They should ONLY be doing a single task.

This makes them far easier to manage.

•The worker does the work and “can” return results

If you are doing the work asynchronously, you generally do not return the result.

For synchronous work, you will return the result.

Workers

Page 23: Gearman: A Job Server made for Scale

•Utilizing the Database

If you keep a database connection

• Must have the ability to reconnect to the database.

• Watch for connection timeouts

•Handling Memory Leaks

Watch the amount of memory in use; when you detect a leak, kill the worker.

•Request Languages

PHP, for instance, sometimes slows down after hundreds of executions; kill the worker off if you know this will happen.

Workers – special notes
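
One way to act on these notes is to have the worker retire itself after a fixed number of jobs or once its memory footprint passes a limit, and let a process supervisor start a fresh one. The Python sketch below assumes a Unix system (the `resource` module) and arbitrary limits; `should_retire` and `run_worker` are hypothetical names.

```python
import resource

MAX_JOBS = 3               # assumed limit, tune for your workload
MAX_RSS_KB = 512 * 1024    # assumed limit; ru_maxrss is kilobytes on Linux


def peak_rss_kb():
    """Peak resident set size of this process as reported by the OS."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def should_retire(jobs_done, rss_kb, max_jobs=MAX_JOBS, max_rss_kb=MAX_RSS_KB):
    """Retire when either the job budget or the memory budget is spent."""
    return jobs_done >= max_jobs or rss_kb >= max_rss_kb


def run_worker(jobs, handler, read_rss=peak_rss_kb):
    """Process jobs until a limit is hit, then return so the process can exit."""
    done = 0
    for job in jobs:
        handler(job)
        done += 1
        if should_retire(done, read_rss()):
            break
    return done


# read_rss is injected here only to keep the demonstration deterministic.
processed = run_worker(range(10), lambda job: job * 2, read_rss=lambda: 0)
print(processed)
```

Exiting cleanly and being restarted is usually simpler than trying to hunt down every leak inside a long-running process.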

Page 24: Gearman: A Job Server made for Scale

•Workers sometimes have issues and die, or you need to boot them back up after a restart

Utilizing a service to watch your workers and ensure they are always running is a GOOD thing.

•Supervisord

Can watch processes, restart them if they die or get killed

Can manage multiple processes of the same program

Can start and stop your workers.

•When running workers, BE SURE to handle termination signals such as SIGTERM; SIGKILL cannot be caught.

Keeping the Daemon Running
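
Handling a termination signal usually means finishing the job in hand and then leaving the loop. A minimal Python sketch, assuming the worker runs in the main thread and that the supervisor stops it with SIGTERM; `GracefulWorker` is a made-up class for illustration.

```python
import os
import signal


class GracefulWorker:
    """Finishes the current job, then stops once SIGTERM has been received."""

    def __init__(self):
        self.stop_requested = False
        signal.signal(signal.SIGTERM, self._request_stop)

    def _request_stop(self, signum, frame):
        # Only record the request; the loop decides when it is safe to stop.
        self.stop_requested = True

    def run(self, jobs, handler):
        done = 0
        for job in jobs:
            if self.stop_requested:
                break
            handler(job)
            done += 1
        return done


def handler(job):
    if job == 2:
        # Simulate the supervisor asking us to stop in the middle of job 2.
        os.kill(os.getpid(), signal.SIGTERM)


worker = GracefulWorker()
completed = worker.run(range(5), handler)
print(completed)
```

Job 2 still completes; the worker simply declines to pick up job 3.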

Page 25: Gearman: A Job Server made for Scale

Supervisord Example

Page 26: Gearman: A Job Server made for Scale

•Until recently you were writing something against the gearman socket interface…

telnet on port 4730

Write “STATUS”

• Gives you the registered functions, number of workers and items in the queue.

•Gearman Monitor – PHP Project

NOTE: I’ve never actually tried this, BUT it is referenced on gearman.org so it must be doing something!

https://github.com/yugene/Gearman-Monitor

Monitoring
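
If you do end up scripting against the socket interface, the status response is easy to parse. The Python sketch below assumes the response is one tab-separated line per function (name, jobs in queue, jobs running, available workers) terminated by a line containing a single dot; check that against your gearmand version. `parse_status` and `fetch_status` are hypothetical helpers.

```python
import socket


def parse_status(text):
    """Parse a status response into a dict keyed by function name."""
    functions = {}
    for line in text.splitlines():
        if line == ".":
            break  # end-of-response marker
        if not line:
            continue
        name, total, running, workers = line.split("\t")
        functions[name] = {
            "total": int(total),
            "running": int(running),
            "workers": int(workers),
        }
    return functions


def fetch_status(host="127.0.0.1", port=4730):
    """Ask a running gearmand for its status (not exercised in this sketch)."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(b"status\n")
        data = b""
        while not data.endswith(b".\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_status(data.decode())


sample = "wc\t3\t1\t2\nresize\t0\t0\t4\n.\n"
status = parse_status(sample)
print(status)
```

A queue total that keeps growing while the worker count stays flat is the usual sign that you need more workers.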

Page 27: Gearman: A Job Server made for Scale

Use Cases

Email

Photos

Log Analysis / Aggregation

Page 28: Gearman: A Job Server made for Scale

• If you resize images on your web server:

Web servers should serve, not process images.

Images require a lot of memory AND processing power

• They are best processed on their own!

•Processing in the Background

Generally this will require a change to your workflow and checking the status with XHR to see if the job has been completed.

• This allows you to process them as you have resources available.

• Have enough workers to process them “quickly enough”

Images
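
The workflow change looks roughly like this: the upload request returns a job handle immediately, and the page polls a status endpoint until the job is done. The Python sketch below uses a thread pool as a stand-in for Gearman and only computes the target dimensions instead of touching pixels; `BackgroundJobs` and `resize` are invented for illustration.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


class BackgroundJobs:
    """Hands out job handles and reports job state when polled."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._futures = {}
        self._ids = itertools.count(1)

    def submit(self, handler, workload):
        """Called by the upload request: returns right away with a handle."""
        handle = "job-%d" % next(self._ids)
        self._futures[handle] = self._pool.submit(handler, workload)
        return handle

    def status(self, handle):
        """Called by the status endpoint that the XHR polls."""
        future = self._futures[handle]
        if not future.done():
            return {"state": "pending"}
        if future.exception() is not None:
            return {"state": "failed"}
        return {"state": "done", "result": future.result()}

    def wait_all(self):
        self._pool.shutdown(wait=True)


def resize(workload):
    # Stand-in for real image processing: compute the scaled dimensions.
    width, height = workload["size"]
    scale = workload["max_width"] / width
    return {"size": [workload["max_width"], round(height * scale)]}


jobs = BackgroundJobs()
handle = jobs.submit(resize, {"size": [4000, 3000], "max_width": 800})
jobs.wait_all()  # a real page would poll status(handle) instead
final = jobs.status(handle)
print(handle, final)
```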

Page 29: Gearman: A Job Server made for Scale

Image Processing Example

Page 30: Gearman: A Job Server made for Scale

•Sending email and/or generating templates and processing variables can take up time, time that is better spent getting the user to the next page.

•Immediate feedback on the mail doesn’t really make a difference, so it is a great thing to send to the background.

Email
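
A rough sketch of that split in Python: the request handler only queues the recipient and the template variables, and the worker does the rendering and sending later. The list used as a queue, the `WELCOME` template and the `send` callback are all stand-ins; nothing here talks to Gearman or a mail server.

```python
import json
from string import Template

WELCOME = Template("Hi $name, welcome to $site!")


def queue_email(outbox, to, variables):
    """Request side: queue the job and return to the user immediately."""
    outbox.append(json.dumps({"to": to, "vars": variables}))


def email_worker(workload, send):
    """Worker side: render the template and hand the message to a mailer."""
    job = json.loads(workload)
    body = WELCOME.substitute(job["vars"])
    send(job["to"], body)


outbox = []  # stands in for the job queue
sent = []    # stands in for the mail server

queue_email(outbox, "pat@example.com", {"name": "Pat", "site": "Example"})
for workload in outbox:
    email_worker(workload, lambda to, body: sent.append((to, body)))
print(sent)
```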

Page 31: Gearman: A Job Server made for Scale

Email Example

Page 32: Gearman: A Job Server made for Scale

•Get all of your logs to a single place

•Process the logs to produce analytical data

• Impression / Click Tracking

•Why run a cron over your logs nightly?

Real-time data is where it is at!

Log Analysis / Aggregation
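
Instead of a nightly cron, each log event can be sent as a job and counted as it arrives. A minimal Python sketch of the aggregation step, assuming every event is a JSON workload with made-up `ad` and `type` fields:

```python
import json
from collections import Counter


def aggregate(workloads):
    """Count events per (ad, type) pair as workloads arrive."""
    totals = Counter()
    for workload in workloads:
        event = json.loads(workload)
        totals[(event["ad"], event["type"])] += 1
    return totals


events = [
    json.dumps({"ad": "banner-1", "type": "impression"}),
    json.dumps({"ad": "banner-1", "type": "impression"}),
    json.dumps({"ad": "banner-1", "type": "click"}),
    json.dumps({"ad": "banner-2", "type": "impression"}),
]
totals = aggregate(events)
print(totals)
```

In a real setup the counters would live somewhere shared, such as a database, so that several workers can update them.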

Page 33: Gearman: A Job Server made for Scale

Log Analysis / Aggregation

Page 34: Gearman: A Job Server made for Scale

Questions?

These slides will be posted to SlideShare & SpeakerDeck.

Slideshare: http://www.slideshare.net/mwillbanks

SpeakerDeck: http://speakerdeck.com/u/mwillbanks

Twitter: mwillbanks

G+: Mike Willbanks

IRC (freenode): mwillbanks

Blog: http://blog.digitalstruct.com

GitHub: https://github.com/mwillbanks