Gearman - Job Queue


Upload: diego-lewin

Post on 15-Jan-2015


DESCRIPTION

Presentation about the Gearman job queue and examples using PHP

TRANSCRIPT

Page 1: Gearman - Job Queue

Gearman

'The manager'

“since it dispatches jobs to be done, but does not do anything useful itself.”

Page 2: Gearman - Job Queue

This presentation draws on material from:

http://www.slideshare.net/pcdinh/gearman-and-asynchronous-processing-in-php-applications-6135047

http://assets.en.oreilly.com/1/event/45/The%20Gearman%20Cookbook%20Presentation.pdf

http://www.gearman.org

http://nz.php.net/manual/en/book.gearman.php

and others.

Page 3: Gearman - Job Queue

Scalable Solutions

More Hardware

Caching

Precalculated Data

Load Balancing

Multi-tier application

Job Queue

Page 4: Gearman - Job Queue

History

Created by Danga Interactive.

The same company that developed Memcached.

Original implementation in Perl (2005).

Rewritten in C in 2008 by Brian Aker.

PHP extension by James Luedke.

Page 5: Gearman - Job Queue

Used by

Digg: 45+ Servers, 400K jobs/day

Yahoo: 60+ servers, 6M jobs/day

And many others.

Page 6: Gearman - Job Queue

Installing

Compiling:

tar xzf gearmand-X.Y.tar.gz
cd gearmand-X.Y
./configure
make
make install

Starting server:

$ gearmand -d

PECL extension:

tar xzf gearman-X.Y.tgz
cd gearman-X.Y
phpize
./configure
make
make install

Add to php.ini:

extension="gearman.so"
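After installing, it can help to confirm that PHP actually sees the extension. The following is a minimal sketch using only standard PHP functions; the helper name gearmanStatus() is made up for this example and is not part of Gearman.

```php
<?php
// Hypothetical helper: reports whether the gearman extension is available.
function gearmanStatus(): string
{
    if (!extension_loaded('gearman')) {
        return 'gearman extension is NOT loaded; check the extension= line in php.ini';
    }
    // phpversion() with an extension name returns that extension's version string.
    return 'gearman extension loaded, version ' . phpversion('gearman');
}

echo gearmanStatus(), PHP_EOL;
```

Running this from the command line and from the web server is worthwhile, since the two often read different php.ini files.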

Page 7: Gearman - Job Queue

Terminology

Client: Creates jobs to be run and sends them to a job server.

Worker: Runs the jobs handed to it by the job server.

Job Server: Manages the job queue, passing jobs from clients to workers.
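The three roles can be sketched in a few lines of PHP. This is an illustration only: the function name "reverse", the helper names, and the server address 127.0.0.1:4730 (Gearman's default port) are assumptions, and the worker and client would normally live in separate scripts.

```php
<?php
// Pure job logic, kept separate from Gearman so it can be tested on its own.
function reverseText(string $text): string
{
    return strrev($text);
}

// Worker: registers the hypothetical "reverse" function and waits for jobs.
function runWorker(): void
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730); // assumed job server address
    $worker->addFunction('reverse', function (GearmanJob $job): string {
        // The return value is sent back to the client as the job result.
        return reverseText($job->workload());
    });
    while ($worker->work()) {
        if ($worker->returnCode() !== GEARMAN_SUCCESS) {
            break;
        }
    }
}

// Client: sends one job to the job server and waits for the result.
function runClient(string $text): string
{
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730); // assumed job server address
    return (string) $client->doNormal('reverse', $text);
}
```

With gearmand running, runWorker() would be started in one terminal and runClient('gearman') called from another script.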

Page 8: Gearman - Job Queue

Gearman is...

“A massively distributed, massively fault tolerant fork mechanism.”

- Joe Stump, SimpleGeo

Page 9: Gearman - Job Queue

Features

Open Source.

Simple & Fast.

Multi-language.

Flexible application design.

Load Balancing.

No single point of failure.

Page 10: Gearman - Job Queue

[Diagram: clients, two job servers, and workers.]
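A setup with more than one job server can be sketched from the client side as follows. The host names are placeholders and the helper functions are hypothetical; GearmanClient::addServers() takes a comma-separated list of host:port entries, so the client has an alternative when one server is unavailable.

```php
<?php
// Hypothetical helper: builds the "host:port,host:port" string for addServers().
function serverList(array $hosts, int $port = 4730): string
{
    return implode(',', array_map(
        fn(string $host): string => "{$host}:{$port}",
        $hosts
    ));
}

// Creates a client that knows about every job server in $hosts.
function connectClient(array $hosts): GearmanClient
{
    $client = new GearmanClient();
    $client->addServers(serverList($hosts));
    return $client;
}

// Example with placeholder host names:
// $client = connectClient(['job1.example.com', 'job2.example.com']);
```

Workers can register with the same list through GearmanWorker::addServers().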

Page 11: Gearman - Job Queue

Queue Options

Memory

Memcached

MySQL/Drizzle

PostgreSQL

SQLite

Tokyo Cabinet

Page 12: Gearman - Job Queue

Foreground (synchronous)

or

Background (asynchronous)
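The difference can be sketched from the client side. This assumes a worker has registered the function "updateCompetitorPrice" and that $client is already connected; describeStatus() and demo() are hypothetical helpers. doNormal() is the current name for the foreground call (do() is deprecated since pecl/gearman 1.0.0).

```php
<?php
// Hypothetical helper: turns the array returned by GearmanClient::jobStatus()
// (known, running, numerator, denominator) into a readable string.
function describeStatus(array $status): string
{
    [$known, $running, $numerator, $denominator] = $status;
    if (!$known) {
        return 'unknown or finished';
    }
    if (!$running) {
        return 'queued';
    }
    return "running ({$numerator}/{$denominator})";
}

function demo(GearmanClient $client, string $barcode): void
{
    // Foreground: blocks until a worker has finished and returns its result.
    $result = $client->doNormal('updateCompetitorPrice', $barcode);
    echo "result: {$result}", PHP_EOL;

    // Background: returns a job handle immediately; the result is not sent back.
    $handle = $client->doBackground('updateCompetitorPrice', $barcode);
    echo describeStatus($client->jobStatus($handle)), PHP_EOL;
}
```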

Page 13: Gearman - Job Queue

Gearman Client

$client = Fishpond_Controller_Front::getResource('gearman')
    ->getGearmanClient();

// Background (asynchronous): returns a job handle immediately.
$client->doBackground("updateCompetitorPrice", $this->_barcode);

// Foreground (synchronous): waits for the worker and returns its result.
$client->do("updateCompetitorPrice", $this->_barcode);

GearmanClient::do() - Run a single task and return a result (deprecated since pecl/gearman 1.0.0; use doNormal())

GearmanClient::doLow() - Run a single low priority task

GearmanClient::doBackground() - Run a task in the background

GearmanClient::doHighBackground() - Run a high priority task in the background

GearmanClient::doLowBackground() - Run a low priority task in the background

Page 14: Gearman - Job Queue

Strategies

Scatter / Gather.

Map / Reduce.

Asynchronous Queues.

Pipeline Processing.

Page 15: Gearman - Job Queue

Scatter / Gather

[Diagram: the client fans out to four tasks: Product Detail, Price Calculation, Image Resize, and Recommendations.]

Page 16: Gearman - Job Queue

$client = Fishpond_Controller_Front::getResource('gearman')
    ->getGearmanClient();

// Callback to be notified when each task finishes.
// Set it first so that it applies to the tasks added below.
$client->setCompleteCallback(array($this, "complete"));

// Add the Gearman tasks.
$client->addTask("getProductDetail", $barcode);
$client->addTask("getPrice", $barcode);
// serialize() takes a single value, so the arguments are wrapped in an array.
$client->addTask("resizeImage", serialize(array($barcode, 100, 100)));
$client->addTask("getRecomendations", $barcode);

// Run all tasks in parallel.
$client->runTasks();

/**
 * Callback invoked when a task is complete.
 */
public function complete($task)
{
    $data = $task->data();
}
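The "gather" half of the pattern needs somewhere to put the results as they arrive. One possible sketch is a small collector object whose complete() method is used as the complete callback; the class ResultCollector is hypothetical and not part of the slides or of the Gearman extension.

```php
<?php
// Hypothetical collector: gathers task results keyed by Gearman function name.
class ResultCollector
{
    private array $results = [];

    // Stores one result; kept separate so it can be used without Gearman.
    public function store(string $functionName, string $data): void
    {
        $this->results[$functionName] = $data;
    }

    // Usable as a callback:
    // $client->setCompleteCallback([$collector, 'complete']);
    public function complete(GearmanTask $task): void
    {
        $this->store($task->functionName(), $task->data());
    }

    // Returns everything gathered so far.
    public function all(): array
    {
        return $this->results;
    }
}
```

After runTasks() returns, $collector->all() would hold one entry per function name, for example "getPrice" and "getProductDetail".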

Page 17: Gearman - Job Queue

Task Methods

GearmanClient::addTaskHigh() - Add a high priority task to run in parallel

GearmanClient::addTaskLow() - Add a low priority task to run in parallel

GearmanClient::addTaskBackground() - Add a background task to be run in parallel

GearmanClient::addTaskHighBackground() - Add a high priority background task to be run in parallel

GearmanClient::addTaskLowBackground() - Add a low priority background task to be run in parallel

GearmanClient::runTasks() - Run a list of tasks in parallel

Page 18: Gearman - Job Queue

Client Callbacks

GearmanClient::setDataCallback() - Callback function when there is a data packet for a task

GearmanClient::setCompleteCallback() - Set a function to be called on task completion

GearmanClient::setCreatedCallback() - Set a callback for when a task is queued.

GearmanClient::setExceptionCallback() - Set a callback for worker exceptions.

GearmanClient::setFailCallback() - Set callback for job failure.

GearmanClient::setStatusCallback() - Set a callback for collecting task status.

GearmanClient::setWarningCallback() - Set a callback for worker warnings.

GearmanClient::setWorkloadCallback() - Set a callback for accepting incremental data updates
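As an illustration of how two of these callbacks might be wired up, the sketch below reports progress and failures. It assumes the worker reports progress with GearmanJob::sendStatus(); percentDone() and registerCallbacks() are hypothetical helpers.

```php
<?php
// Hypothetical helper: converts a numerator/denominator pair to a whole percentage.
function percentDone(int $numerator, int $denominator): int
{
    return $denominator > 0 ? intdiv($numerator * 100, $denominator) : 0;
}

// Registers a status and a fail callback on an existing client.
// Callbacks are registered before tasks are added.
function registerCallbacks(GearmanClient $client): void
{
    $client->setStatusCallback(function (GearmanTask $task): void {
        $percent = percentDone(
            (int) $task->taskNumerator(),
            (int) $task->taskDenominator()
        );
        echo $task->functionName(), ": {$percent}%", PHP_EOL;
    });

    $client->setFailCallback(function (GearmanTask $task): void {
        echo $task->functionName(), ' failed', PHP_EOL;
    });
}
```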

Page 19: Gearman - Job Queue

Scatter / Gather

Tasks run concurrently on different workers.

Total time is roughly that of the longest-running task.

There must be enough workers available.
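The effect of the number of workers on total time can be estimated with a little arithmetic. The sketch below is a simplified model, not a measurement: it assumes tasks are handed out in order to whichever worker becomes free first and ignores network and queueing overhead. estimateWallTime() is a hypothetical helper.

```php
<?php
// Hypothetical estimate of wall-clock time for a batch of tasks.
// $durations: running time of each task; $workers: number of available workers.
function estimateWallTime(array $durations, int $workers): float
{
    if ($workers < 1) {
        throw new InvalidArgumentException('At least one worker is required.');
    }
    // $freeAt[i] is the time at which worker i finishes its current task.
    $freeAt = array_fill(0, $workers, 0.0);
    foreach ($durations as $duration) {
        sort($freeAt);            // the earliest-free worker moves to index 0
        $freeAt[0] += $duration;  // and takes the next task
    }
    return max($freeAt);
}

$tasks = [3, 1, 2, 4]; // example durations in seconds
echo estimateWallTime($tasks, 4), PHP_EOL; // enough workers: the longest task
echo estimateWallTime($tasks, 2), PHP_EOL; // too few workers: tasks wait
echo estimateWallTime($tasks, 1), PHP_EOL; // one worker: the sum of all tasks
```

With four workers the estimate equals the longest task (4 seconds); with one worker it is the sum (10 seconds), which is why enough workers must be available.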

Page 20: Gearman - Job Queue

Map/Reduce

[Diagram: the client submits Task T, which is split into Tasks T-0, T-1, T-2, and T-3; a subtask is split further into Tasks T-00, T-01, and T-02.]
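A word count is a common way to illustrate this split-and-combine idea. The sketch below is an assumption-laden example, not the presenter's code: it assumes a worker has registered a function named "countWords" that returns json_encode(countWords($job->workload())), and all helper names are hypothetical.

```php
<?php
// Map step (what each worker would run): count the words in one chunk of text.
function countWords(string $chunk): array
{
    return array_count_values(str_word_count(strtolower($chunk), 1));
}

// Reduce step: merge one chunk's counts into the running total.
function mergeCounts(array $total, array $partial): array
{
    foreach ($partial as $word => $count) {
        $total[$word] = ($total[$word] ?? 0) + $count;
    }
    return $total;
}

// Client glue: one task per chunk, results merged as they complete.
function distributedWordCount(GearmanClient $client, array $chunks): array
{
    $total = [];
    $client->setCompleteCallback(function (GearmanTask $task) use (&$total): void {
        // Assumes the worker replied with a JSON object of word => count.
        $total = mergeCounts($total, json_decode($task->data(), true));
    });
    foreach ($chunks as $chunk) {
        $client->addTask('countWords', $chunk);
    }
    $client->runTasks();
    return $total;
}
```

A worker that receives a large chunk could itself act as a client and split the chunk again, which gives the second level of subtasks.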

Page 21: Gearman - Job Queue

Asynchronous Queues

Not everything needs immediate processing.

Competitor pricing. Emails. Whole price engine. Logging. Etc.

Example:

$gearmanClient = Fishpond_Controller_Front::getResource('gearman')->getGearmanClient();

$gearmanClient->doBackground("updateCompetitorPrice", $this->_barcode);
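On the other side of the queue, a worker picks the job up later. The sketch below shows what such a worker might look like; the barcode validation rule, the helper names, and the server address are assumptions, and the actual price lookup is not shown in the slides, so it is left as a comment.

```php
<?php
// Hypothetical validation: treats 8 to 14 digits as a plausible barcode.
function isValidBarcode(string $barcode): bool
{
    return preg_match('/^\d{8,14}$/', $barcode) === 1;
}

function runPriceWorker(): void
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730); // assumed job server address
    $worker->addFunction('updateCompetitorPrice', function (GearmanJob $job): void {
        $barcode = $job->workload();
        if (!isValidBarcode($barcode)) {
            $job->sendFail(); // mark the job as failed
            return;
        }
        // A background job has no client waiting for a return value, so the
        // worker has to store its result itself (database, cache, ...).
        // The actual competitor price lookup would go here.
    });
    while ($worker->work()) {
        if ($worker->returnCode() !== GEARMAN_SUCCESS) {
            break;
        }
    }
}
```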

Page 22: Gearman - Job Queue

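Logging

<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/
    # The gearman command-line tool acts as the client: with -n it submits
    # one job per log line to the function named with -f.
    CustomLog "| gearman -n -f logger" common
</VirtualHost>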
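The log lines submitted by the web server still need a worker to consume them. A minimal sketch of such a worker in PHP follows; it assumes the Gearman function is named "logger", and the helper names, log path, and server address are placeholders.

```php
<?php
// Appends one line to the log file, normalising the trailing newline.
function appendLogLine(string $path, string $line): void
{
    file_put_contents(
        $path,
        rtrim($line, "\r\n") . PHP_EOL,
        FILE_APPEND | LOCK_EX
    );
}

// Worker: writes every job's workload (one access-log line) to $path.
function runLogWorker(string $path): void
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730); // assumed job server address
    $worker->addFunction('logger', function (GearmanJob $job) use ($path): void {
        appendLogLine($path, $job->workload());
    });
    while ($worker->work()) {
        if ($worker->returnCode() !== GEARMAN_SUCCESS) {
            break;
        }
    }
}

// Example with a placeholder path: runLogWorker('/var/log/gearman-access.log');
```

Because several web servers can feed the same function, a single worker like this can collect their logs in one place.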

Page 23: Gearman - Job Queue

Pipeline Processing

[Diagram: the client submits Task T, which passes through Worker Operation 1, Worker Operation 2, and Worker Operation 3 to produce the output.]
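One way to build such a pipeline is to let each worker submit its output as a background job for the next stage. The sketch below assumes three made-up stages named operation1 to operation3 with trivial string operations; all names and the server address are hypothetical.

```php
<?php
// Hypothetical three-stage pipeline: stage name => operation on the data.
function pipelineStages(): array
{
    return [
        'operation1' => fn(string $s): string => trim($s),
        'operation2' => fn(string $s): string => strtolower($s),
        'operation3' => fn(string $s): string => ucfirst($s),
    ];
}

// Returns the stage that follows $stage, or null if $stage is the last one.
function nextStage(string $stage): ?string
{
    $names = array_keys(pipelineStages());
    $index = array_search($stage, $names, true);
    return $index === false ? null : ($names[$index + 1] ?? null);
}

// Worker for one stage: applies its operation, then hands the output on.
function runStageWorker(string $stage): void
{
    $operation = pipelineStages()[$stage];
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730); // assumed job server address
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);
    $worker->addFunction($stage, function (GearmanJob $job) use ($operation, $stage, $client): void {
        $output = $operation($job->workload());
        $next = nextStage($stage);
        if ($next !== null) {
            $client->doBackground($next, $output); // pass to the next stage
        } else {
            echo $output, PHP_EOL;                 // final output
        }
    });
    while ($worker->work()) {
        if ($worker->returnCode() !== GEARMAN_SUCCESS) {
            break;
        }
    }
}
```

The client starts the pipeline with a single doBackground('operation1', $data) call; each stage can be scaled independently by starting more workers for it.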

Page 24: Gearman - Job Queue

Questions?