life in a queue - using message queue with django

Life in a Queue Tareque Hossain Education Technology

Upload: tareque-hossain

Post on 10-May-2015


DESCRIPTION

Brief introduction to message queues and how they are relevant in web applications. How to tell if your web application could benefit from a message queue. Common examples of tasks that could benefit from message queues. Choosing a broker/protocol. What broker/protocol PBS Education chose and why. Message queue solution architecture. Brief introduction to celery/carrot. Writing a message queue task using celery. How to invoke a message queue task. What happens when you invoke a task (architecture walkthrough). How to write tasks efficiently. Things that are good to know when writing tasks (things we experienced at PBS Education).

TRANSCRIPT

Page 1: Life in a Queue - Using Message Queue with django

Life in a Queue Tareque Hossain Education  Technology

Page 2: Life in a Queue - Using Message Queue with django

What is Message Queue?

•  Message Queues are:
   o  Communication buffers
   o  Between independent sender & receiver processes
   o  Asynchronous
      •  Time of sending not necessarily the same as receiving

•  In context of Web Applications:
   o  Sender: Web application servers
   o  Receiver: Background worker processes
   o  Queue items: Tasks that the web server doesn’t have time/resources to do

Page 3: Life in a Queue - Using Message Queue with django
Page 4: Life in a Queue - Using Message Queue with django

Inside a Message Queue

[Diagram: Web App Servers, an Enqueue Manager, a Message Queue Broker holding queues Q1 and Q2 with tasks T1 to T7, a Dequeue Manager, and Worker Servers]

Page 5: Life in a Queue - Using Message Queue with django

How does it work?

•  Say a web application server has a task it doesn’t have time to do
•  It puts the task in the message queue
•  Other web servers can access the same queue(s) and put tasks there
•  Queues are FIFO (First In First Out)
•  Workers are greedy and they all watch the queues for tasks
•  Workers asynchronously pick up the first available task on the queue when they are ready
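The flow above can be sketched with nothing but the Python standard library. This is only an in-process illustration, not how a real broker works: `run_workers` and the squaring "task" are hypothetical, and threads stand in for separate worker servers.

```python
import queue
import threading


def run_workers(tasks, num_workers=3):
    """Push tasks onto one shared FIFO queue and let greedy workers drain it."""
    task_queue = queue.Queue()          # FIFO buffer between sender and receivers
    results = []
    results_lock = threading.Lock()

    def worker(name):
        while True:
            task = task_queue.get()     # blocks until a task is available
            if task is None:            # sentinel: no more work for this worker
                task_queue.task_done()
                break
            with results_lock:
                results.append((name, task, task * task))
            task_queue.task_done()

    threads = [
        threading.Thread(target=worker, args=("worker-%d" % i,))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()

    for task in tasks:                  # the "web server" side: enqueue and move on
        task_queue.put(task)
    for _ in threads:                   # one sentinel per worker
        task_queue.put(None)

    task_queue.join()                   # wait until every item has been processed
    for t in threads:
        t.join()
    return results
```

Which worker picks up which task is not deterministic; only the order in which tasks leave the queue is FIFO.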

Page 6: Life in a Queue - Using Message Queue with django

Do I need Message Queues?

•  Message Queues are useful in certain situations

•  General guidelines:
   o  Does your web application take more than a few seconds to generate a response?
   o  Are you using a lot of cron jobs to process data in the background?
   o  Do you wish you could distribute the processing of the data generated by your application among many servers?

Page 7: Life in a Queue - Using Message Queue with django

Wait, I’ve heard Asynchronous before!

•  Yes. AJAX is an asynchronous communication method between client & server

•  Some of the response time issues can be solved:
   o  With AJAX responses that continually enhance the initial response
   o  Only if the AJAX responses also complete within a reasonable amount of time

•  You need Message Queues when:
   o  Long processing times can’t be avoided in generating responses
   o  You want application data to be continuously processed in the background and readily available when requested

Page 8: Life in a Queue - Using Message Queue with django

MQ Tasks: Processing User Uploads

•  Resize uploaded image to generate different resolutions of images, avatars, gallery snapshots

•  Reformat videos to match your player requirements

•  YouTube, Facebook, Slideshare are good examples

Page 9: Life in a Queue - Using Message Queue with django

MQ Tasks: Generate Reports

•  Generating reports from large amounts of data
   o  Reports that contain graphical charts
   o  Multiple reports that cross reference each other

Page 10: Life in a Queue - Using Message Queue with django

MQ Tasks: 3rd Party Integrations

•  Bulk processing of 3rd party service requests
   o  Refund hundreds of transactions using Paypal
   o  Any kind of data synchronization
   o  Aggregation of RSS/other feeds

[Diagram: Social Network Feed Aggregator]

Page 11: Life in a Queue - Using Message Queue with django

MQ Tasks: Cron Jobs

•  Any cron job that is not time sensitive
   o  Asynchronous behavior of message queue doesn’t guarantee execution of tasks on the dot
   o  Jobs in cron that should be done as soon as resources become available are good candidates

Page 12: Life in a Queue - Using Message Queue with django

Message Queue Solution Stack

[Diagram: Web Application Server (Task Management Subsystem, Message Queue Protocol Library), Message Queue Broker, Queue Worker (Task Management Subsystem, Message Queue Protocol Library)]

Page 13: Life in a Queue - Using Message Queue with django

Protocol/Broker Choices

•  AMQP (Advanced Message Queuing Protocol)
   o  Brokers: RabbitMQ, Apache Qpid, Apache ActiveMQ, OpenAMQ, StormMQ

•  JMS (Java Message Service)
   o  Brokers: Apache Qpid, Apache ActiveMQ, OpenJMS, Open Message Queue

•  STOMP (Streaming Text Orientated Messaging Protocol)
   o  Brokers: Apache ActiveMQ, STOMPServer, CoilMQ

Page 14: Life in a Queue - Using Message Queue with django

OMG That’s too much!

•  Yeah. I agree.
•  Read great research details at Second Life dev site
   o  http://wiki.secondlife.com/wiki/Message_Queue_Evaluation_Notes

•  Let’s simplify. How do we choose?
   o  How is the exception handling and recovery?
   o  Is maintenance relatively low?
   o  How easy is deployment?
   o  Are the queues persistent?
   o  How is the community support?
   o  What language is it written in? How compatible is that with our current systems?
   o  How detailed is the documentation?

Page 15: Life in a Queue - Using Message Queue with django

Choice of PBS Education

•  We chose AMQP & RabbitMQ
•  Why?
   o  We don’t expect message volumes as high as 1M or more at a time
   o  RabbitMQ is free to use
   o  The documentation is decent
   o  There is decent clustering support, even though we never needed clustering
   o  We didn’t want to lose queues or messages upon broker crash/restart
   o  We develop applications using Python/django and setting up an AMQP backend using celery/kombu was easy

Page 16: Life in a Queue - Using Message Queue with django

Message Queue Solution Stack

[Diagram: Web Application Server (Celery, PyAMQPlib/Kombu), RabbitMQ, Queue Worker (Celery, PyAMQPlib/Kombu)]

Page 17: Life in a Queue - Using Message Queue with django

Celery? Kombu? Yummy.

•  django made web development using Python a piece of cake

•  Celery & Kombu make using a message queue in your django/Python applications a piece of cake

•  Kombu
   o  AMQP based messaging framework for Python, powered by PyAMQPlib
   o  Provides fundamentals for creating queues, configuring the broker, and sending and receiving messages

•  Celery
   o  Distributed task queue management application

Page 18: Life in a Queue - Using Message Queue with django

Celery Backends

•  Celery is very, very powerful
•  You can use celery to emulate message queue brokers using a DB backend for broker
   o  Involves polling & is less efficient than AMQP
   o  Use for local development

•  Bundled broker backends
   o  amqplib, pika, redis, beanstalk, sqlalchemy, django, mongodb, couchdb

•  The broker backend is different from the task & task result store backend
   o  The result store is used by celery to store the results of a task, or its errors if it failed
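The broker/result-store distinction can be made concrete with a settings fragment. This is a sketch only: it assumes the older uppercase Celery setting names from the era of these slides and a local RabbitMQ with its default guest account, so check both against the Celery version you actually run.

```python
# settings.py fragment (assumed older-style Celery setting names).

# Broker backend: where task messages are queued until a worker picks them up.
BROKER_URL = "amqp://guest:guest@localhost:5672//"

# Result store backend: where celery keeps task results and failure details.
# It is configured separately and does not have to match the broker.
CELERY_RESULT_BACKEND = "amqp"
```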

Page 19: Life in a Queue - Using Message Queue with django

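A Problem with a View

•  What is wrong with this view?

    from django.shortcuts import render_to_response
    from django.template import RequestContext

    def create_report(request):
        # ... code for extracting parameters from request ...
        # ... code for generating report from lots of data ...
        # The response is held up until the whole report has been built.
        return render_to_response('profiles/index.html', {
            'report': report,
        }, context_instance=RequestContext(request))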

Page 20: Life in a Queue - Using Message Queue with django

A Problem with a View

Page 21: Life in a Queue - Using Message Queue with django

Let’s Write a Celery Task

•  Writing celery tasks was never any more difficult than this:

    import celery

    @celery.task()
    def generate_report(*args, **kwargs):
        # ... code for generating report ...
        report.save()

Page 22: Life in a Queue - Using Message Queue with django

Let’s Write a Celery Task II

•  If you want to customize your tasks, inherit from the base Task object

    from celery.task.base import Task

    class GenerateReport(Task):
        def __init__(self, *args, **kwargs):
            # ... custom init code ...
            super(GenerateReport, self).__init__(*args, **kwargs)

        def run(self, *args, **kwargs):
            # ... code for generating report ...
            report.save()

Page 23: Life in a Queue - Using Message Queue with django

Issuing a task

•  After writing a task, we issue the task from within a request in the following way:

    from django.contrib import messages
    from django.core.urlresolvers import reverse  # django.urls on newer Django
    from django.http import HttpResponseRedirect

    def create_report(request):
        # ... code for extracting parameters from request ...
        generate_report.delay(**params)
        # or
        GenerateReport.delay(**params)
        messages.success(request, 'You will receive an email when report generation is complete.')
        return HttpResponseRedirect(reverse('reports_index'))

Page 24: Life in a Queue - Using Message Queue with django

What happens when you issue tasks?

[Diagram: Application Server (Request Handler, Celery), Broker (Queue), and three Workers, each running Celery]

Page 25: Life in a Queue - Using Message Queue with django

Understanding Queue Routing

•  A broker contains multiple virtual hosts
•  Each virtual host contains multiple exchanges
•  Messages are sent to exchanges
   o  Exchanges are hubs that connect to a set of queues
•  An exchange routes messages to one or more queues

[Diagram: VHost, Exchange, Queue]

Page 26: Life in a Queue - Using Message Queue with django

Understanding Queue Routing

•  In Celery configurations:
   o  binding_key binds a task namespace to a queue
   o  exchange defines the name of an exchange
   o  routing_key defines which queue a message should be directed to under a certain exchange
   o  exchange_type = 'direct' routes for exact routing keys
   o  exchange_type = 'topic' routes for namespaced & wildcard routing keys
      •  * (matches a single word)
      •  # (matches zero or more words)
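The wildcard rules for topic exchanges can be sketched as a small matcher. This is an illustration of the matching semantics described above, not code taken from RabbitMQ or Celery; `topic_matches` is a hypothetical helper, and the keys in the usage lines are examples only.

```python
def topic_matches(binding_key, routing_key):
    """Return True if a topic-style binding key matches a routing key.

    Keys are dot-separated words; '*' matches exactly one word and
    '#' matches zero or more words.
    """
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p, w):
        if p == len(pattern):
            # Pattern exhausted: succeed only if every word was consumed.
            return w == len(words)
        if pattern[p] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(p + 1, k) for k in range(w, len(words) + 1))
        if w == len(words):
            return False
        if pattern[p] == "*" or pattern[p] == words[w]:
            return match(p + 1, w + 1)
        return False

    return match(0, 0)
```

For example, `topic_matches("feed.#", "feed.rss.update")` and `topic_matches("feed.#", "feed")` both hold, while `topic_matches("image.*", "image.compress.png")` does not because `*` covers only one word.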

Page 27: Life in a Queue - Using Message Queue with django

Example Celery Config for Routing

    CELERY_DEFAULT_QUEUE = "default"
    CELERY_QUEUES = {
        "feed_tasks": {
            "binding_key": "feed.#",
        },
        "regular_tasks": {
            "binding_key": "task.#",
        },
        "image_tasks": {
            "binding_key": "image.compress",
            "exchange": "mediatasks",
            "exchange_type": "direct",
        },
    }
    CELERY_DEFAULT_EXCHANGE = "tasks"
    CELERY_DEFAULT_EXCHANGE_TYPE = "topic"
    CELERY_DEFAULT_ROUTING_KEY = "task.default"

Page 28: Life in a Queue - Using Message Queue with django

Quick Tips

    # Set expiration for a task, in seconds
    mytask.apply_async(args=[10, 10], expires=60)

    # Route a task
    mytask.apply_async(
        args=[filename],
        routing_key="video.compress"
    )
    # Or define task mapping in CELERY_ROUTES setting

    # Revoke a task using the task instance
    result = mytask.apply_async(args=[2, 2], countdown=120)
    result.revoke()
    # Or save the task ID (result.task_id) somewhere
    from celery.task.control import revoke
    revoke(task_id)
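The CELERY_ROUTES alternative mentioned above might look like the fragment below. It assumes the older dict form of the setting; the task path `videos.tasks.compress_video` and the queue name `video_tasks` are hypothetical and would need a matching entry in CELERY_QUEUES.

```python
# settings.py fragment: route a task by name instead of at each call site.
# Task path and queue name are made up for illustration.
CELERY_ROUTES = {
    "videos.tasks.compress_video": {
        "queue": "video_tasks",
        "routing_key": "video.compress",
    },
}
```

Keeping the mapping in settings means call sites can use a plain `delay()` and routing can change without touching view code.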

Page 29: Life in a Queue - Using Message Queue with django

Quick Tips

•  Execute task as a blocking call using:

    generate_report.apply(kwargs=params, **options)

•  Avoid issuing tasks inside an asynchronous task that waits on children data (blocking)
   o  Write re-usable pieces of code that can be called as functions instead of called as tasks
   o  If necessary, use the callback + subtask feature of celery

•  Ignore results if you don’t need them
   o  If your asynchronous task doesn’t return anything

    @celery.task(ignore_result=True)

Page 30: Life in a Queue - Using Message Queue with django

Good to know

•  Do check whether your task parameters are serializable
   o  WSGI request objects are not serializable
   o  Don’t pass request as a parameter for your task

•  Don’t pass unnecessary data in task parameters
   o  They have to be stored until task is complete
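A quick way to see the serialization point is to try encoding the parameters before sending them. The sketch below uses JSON from the standard library as a stand-in for whichever serializer your setup uses; `FakeRequest` and `is_serializable` are hypothetical names, not part of Django or Celery.

```python
import json


class FakeRequest(object):
    """Hypothetical stand-in for a WSGI/Django request object."""

    def __init__(self, user_id):
        self.user_id = user_id


def is_serializable(params):
    """Return True if params can be encoded as JSON."""
    try:
        json.dumps(params)
        return True
    except (TypeError, ValueError):
        return False


request = FakeRequest(user_id=42)

# Passing the request object itself cannot be serialized.
bad_params = {"request": request}

# Passing only the small, plain values the task needs works.
good_params = {"user_id": request.user_id, "report_type": "monthly"}
```

Extracting plain values in the view also keeps the message small, which matters because parameters are stored until the task completes.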

Page 31: Life in a Queue - Using Message Queue with django

Good to know

•  Avoid starvation of tasks using multiple queues
   o  If really long video re-formatting tasks are processed in the same queue as relatively quicker thumbnail generation tasks, the latter may starve
   o  Only available when using AMQP broker backend

•  Use celerybeat for time sensitive repeated tasks
   o  Can replace time sensitive cron jobs related to your web application
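The starvation point above can be illustrated with simple arithmetic. This is a toy model under stated assumptions: one worker per queue, strictly FIFO processing, and made-up durations of 600 seconds per video job and 2 seconds per thumbnail.

```python
def completion_times(durations):
    """Finish time of each task when one worker drains a FIFO queue in order."""
    clock = 0
    finished = []
    for duration in durations:
        clock += duration
        finished.append(clock)
    return finished


VIDEO, THUMB = 600, 2   # hypothetical task durations in seconds

# One shared queue: three thumbnails are stuck behind two video jobs.
shared = completion_times([VIDEO, VIDEO, THUMB, THUMB, THUMB])

# Separate queues, each with its own worker.
video_queue = completion_times([VIDEO, VIDEO])
thumb_queue = completion_times([THUMB, THUMB, THUMB])
```

In this model the first thumbnail finishes at 1202 seconds on the shared queue but at 2 seconds on its own queue, while the video jobs finish at the same times either way.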

Page 32: Life in a Queue - Using Message Queue with django

Q & A

•  Slides available at:
   o  http://www.slideshare.net/tarequeh

•  Extensive guides & documentation available at:
   o  http://ask.github.com/celery/