Asynchronous RPC with Gevent and RabbitMQ
Alexander Mokrov
What this talk covers
Some limitations of Celery
How to work around them
Gevent
RabbitMQ (some of its features)
A model for asynchronous RPC will be proposed
An example of building an application on a Celery workflow
get flour
bake pie
get meat
seal pie
create dough
order pie
get milk
get eggs
...
entry point 2
entry point 1
entry point 3 service n
app
service 2
service 1
sommelier
winery
chateau
Service of degustation 3
app
Service of degustation 2
Service of degustation 1
Wine tasting with gevent and RabbitMQ
entry task
service task 1
DB
callback task 1
service task n
callback task n
What we would like to get
entry task
service task 1
service task 2
service task n
long task services
time
Celery AsyncResult
async_result = task.apply_async()
print(async_result.ready())
False
result = async_result.wait()
long task service task
persistent queue
exclusive queue
reply_to=amq.gen-E6...
correlation_id
Request
correlation_id
Response reply_to=amq.gen-E6...
RabbitMQ RPC
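The reply_to / correlation_id exchange on this slide can be sketched without a broker. The block below is only an in-memory stand-in built on the standard library: the two queue.Queue objects, the REPLY_TO name, and the helpers send_request and serve_one are hypothetical names for illustration and do not talk to RabbitMQ.

```python
import queue
import uuid

# In-memory stand-ins for broker queues (hypothetical; no RabbitMQ involved).
service_queue = queue.Queue()   # plays the persistent service queue
reply_queue = queue.Queue()     # plays the client's exclusive reply queue
REPLY_TO = "amq.gen-example"    # made-up queue name for illustration


def send_request(body):
    """Client side: tag the request so the reply can be matched later."""
    correlation_id = str(uuid.uuid4())
    service_queue.put({"reply_to": REPLY_TO,
                       "correlation_id": correlation_id,
                       "body": body})
    return correlation_id


def serve_one():
    """Service side: copy the correlation_id of the request into the reply."""
    request = service_queue.get()
    assert request["reply_to"] == REPLY_TO  # only one reply queue in this sketch
    reply_queue.put({"correlation_id": request["correlation_id"],
                     "body": request["body"].upper()})


first = send_request("merlot")
second = send_request("riesling")
serve_one()
serve_one()

# Match each reply to the request that caused it.
replies = {}
while not reply_queue.empty():
    reply = reply_queue.get()
    replies[reply["correlation_id"]] = reply["body"]

print(replies[first], replies[second])
```

The client never relies on the order of replies; it looks them up by correlation_id, which is what lets one reply queue serve many outstanding requests.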
greenlet
greenlet
long task
service response listener
service service service
reply_to exclusive queue
services queues
Workers
● solo
● prefork
● eventlet
● gevent
Gevent
gevent is a concurrency library based around libev. It provides a clean API for a variety of concurrency and network related tasks.
Greenlet
The primary pattern used in gevent is the Greenlet, a lightweight coroutine provided to Python as a C extension module. Greenlets all run inside of the OS process for the main program but are scheduled cooperatively.
Only one greenlet is ever running at any given time.
Spin-off of Stackless, a version of CPython that supports micro-threads called “tasklets”. Tasklets run pseudo-concurrently (typically in a single or a few OS-level threads) and are synchronized with data exchanges on “channels”.
It's a coroutine
Event loop
import gevent

def foo():
    print('Running in foo')
    gevent.sleep(0)
    print('Explicit context switch to foo')

def bar():
    print('Explicit context to bar')
    gevent.sleep()
    print('Implicit context switch to bar')

gevent.joinall([
    gevent.spawn(foo),
    gevent.spawn(bar),
])

Running in foo
Explicit context to bar
Explicit context switch to foo
Implicit context switch to bar
import random
import gevent

def task(pid):
    # Sleep for a random 0-2 ms so the completion order varies.
    gevent.sleep(random.randint(0, 2) * 0.001)
    print('Task %s done' % pid)

def synchronous():
    for i in range(1, 8):
        task(i)

def asynchronous():
    threads = [gevent.spawn(task, i) for i in range(8)]
    gevent.joinall(threads)

Synchronous:
Task 1 done
Task 2 done
Task 3 done
Task 4 done
Task 5 done
Task 6 done
Task 7 done
Asynchronous:
Task 1 done
Task 5 done
Task 6 done
Task 2 done
Task 4 done
Task 7 done
Task 0 done
Task 3 done
import time

def echo(i):
    time.sleep(0.001)
    return i

# Non Deterministic Process Pool
from multiprocessing.pool import Pool

p = Pool(10)
run1 = [a for a in p.imap_unordered(echo, range(10))]
run2 = [a for a in p.imap_unordered(echo, range(10))]
run3 = [a for a in p.imap_unordered(echo, range(10))]
run4 = [a for a in p.imap_unordered(echo, range(10))]
print(run1 == run2 == run3 == run4)
False

# Deterministic Gevent Pool
from gevent.pool import Pool

p = Pool(10)
run1 = [a for a in p.imap_unordered(echo, range(10))]
run2 = [a for a in p.imap_unordered(echo, range(10))]
run3 = [a for a in p.imap_unordered(echo, range(10))]
run4 = [a for a in p.imap_unordered(echo, range(10))]
print(run1 == run2 == run3 == run4)
True
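Spawning Greenlets
import gevent
from gevent import Greenlet

# Assumed definition (not shown on the slide): this foo takes two arguments,
# sleeps for n seconds, then prints the message.
def foo(message, n):
    gevent.sleep(n)
    print(message)

thread1 = Greenlet.spawn(foo, "message", 1)
thread2 = gevent.spawn(foo, "message", 2)
thread3 = gevent.spawn(lambda x: (x+1), 2)
threads = [thread1, thread2, thread3]
# Block until all threads complete.
gevent.joinall(threads)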
class MyGreenlet(Greenlet):
    def __init__(self, message, n):
        Greenlet.__init__(self)
        self.message = message
        self.n = n

    def _run(self):
        print(self.message)
        gevent.sleep(self.n)

g = MyGreenlet("Hi there!", 3)
g.start()
g.join()
Greenlet State
started -- Boolean, indicates whether the Greenlet has been started
ready() -- Boolean, indicates whether the Greenlet has halted
successful() -- Boolean, indicates whether the Greenlet has halted and not thrown an exception
value -- arbitrary, the value returned by the Greenlet
exception -- exception, uncaught exception instance thrown inside the greenlet
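The state attributes listed above can be observed on a greenlet that returns normally and on one that raises. This is a minimal sketch, assuming gevent is installed; the functions win and fail are illustrative names, not part of the talk's application.

```python
import gevent


def win():
    return "You win!"


def fail():
    raise ValueError("You fail at failing.")


winner = gevent.spawn(win)
loser = gevent.spawn(fail)

# joinall() waits for both greenlets. By default it does not re-raise
# exceptions from them; gevent reports the failure on stderr instead.
gevent.joinall([winner, loser])

print(winner.ready(), loser.ready())            # both have halted
print(winner.successful(), loser.successful())  # only the winner succeeded
print(winner.value, loser.value)                # return value, or None on failure
print(repr(loser.exception))                    # the uncaught exception instance
```

Checking successful() or exception after a join is how a caller finds out about a failure, since the exception is not propagated to the joining greenlet by default.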
greenlet
greenlet
long task
service response listener
subscribe(task_id)
Timeouts
import gevent
from gevent import Timeout

seconds = 10
timeout = Timeout(seconds)
timeout.start()

def wait():
    gevent.sleep(10)

try:
    gevent.spawn(wait).join()
except Timeout:
    print('Could not complete')

time_to_wait = 5  # seconds

class TooLong(Exception):
    pass

with Timeout(time_to_wait, TooLong):
    gevent.sleep(10)
class Queue(maxsize=None, items=None)
empty()
full()
get(block=True, timeout=None)
get_nowait()
next()
peek(block=True, timeout=None)
peek_nowait()
put(item, block=True, timeout=None)
put_nowait(item)
qsize()
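A short sketch of how put() and get(timeout=...) from the listing above behave between two greenlets, assuming gevent is installed. The names producer, consumer, results, and received are chosen for illustration only.

```python
import gevent
from gevent.queue import Queue, Empty

results = Queue()
received = []


def producer():
    for i in range(3):
        results.put(i)
        gevent.sleep(0)  # yield so the consumer can run


def consumer():
    while True:
        try:
            # Wait up to 0.1 s for the next item, then give up.
            item = results.get(timeout=0.1)
        except Empty:
            received.append("timeout")
            break
        received.append(item)


gevent.joinall([gevent.spawn(producer), gevent.spawn(consumer)])
print(received)
```

A blocking get() only suspends the calling greenlet, so other greenlets keep running while the consumer waits; the timeout turns a missing item into an Empty exception instead of an indefinite wait.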
greenlet
greenlet
greenlet
service results dispatcher
gevent.queues
task_id
reply_to, results_queue
Events
Groups and Pools
Locks and Semaphores
Subprocess
Thread Locals
Actors
Monkey patching
guerrilla patch
gorilla patch
monkey patch
import socket
print(socket.socket)
from gevent import monkey
monkey.patch_socket()
print("After monkey patch")
print(socket.socket)
import select
print(select.select)
monkey.patch_select()
print("After monkey patch")
print(select.select)
<class 'socket.socket'>
After monkey patch
<class 'gevent._socket3.socket'>
<built-in function select>
After monkey patch
<function select at 0x7ff7e111c378>
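The practical effect of patching is that blocking standard-library calls start yielding to other greenlets. The sketch below illustrates this with monkey.patch_time(), which replaces time.sleep; it assumes gevent is installed, and the worker function and events list are illustrative names.

```python
import time

import gevent
from gevent import monkey

monkey.patch_time()  # time.sleep now yields to the gevent hub

events = []


def worker(name, delay):
    events.append(name + " start")
    time.sleep(delay)  # cooperative after patching
    events.append(name + " end")


gevent.joinall([
    gevent.spawn(worker, "a", 0.01),
    gevent.spawn(worker, "b", 0.03),
])
print(events)
```

Worker b starts before worker a finishes, which would not happen with an unpatched time.sleep, because the first greenlet would block the whole thread until its sleep ended.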
Stack layout for a greenlet

               |     ^^^       |
               |  older data   |
               |               |
  stack_stop . |_______________|
        .      |               |
        .      | greenlet data |
        .      |   in stack    |
        .    * |_______________| . .  _____________  stack_copy + stack_saved
        .      |               |     |             |
        .      |     data      |     |greenlet data|
        .      |   unrelated   |     |    saved    |
        .      |      to       |     |   in heap   |
 stack_start . |     this      | . . |_____________| stack_copy
               |   greenlet    |
               |               |
               |  newer data   |
               |     vvv       |
greenlet
greenlet
greenlet
service results dispatcher
service service service
reply_to exclusive queue
services queues
subscribe
gevent.queues
Service Result Dispatcher
greenlet
greenlet
greenlet
service results dispatcher
reply_to exclusive queue
reply_to, results_queue
gevent.queues
task_id
task_id, reply_to
class ServiceResultsDispatcher(Greenlet):
    def __init__(self):
        …
        self.reply_to = None
        Greenlet.__init__(self)

    def create_connection(self):
        ...
        # Exclusive queue with a broker-generated name; service replies arrive here.
        result = self.channel.queue_declare(exclusive=True)
        self.reply_to = result.method.queue

    def subscribe(self, task_id):
        # One gevent queue per task; the task waits on it for service replies.
        service_results_queue = gevent.queue.Queue()
        self.service_results[task_id] = service_results_queue
        return service_results_queue, self.reply_to

    def unsubscribe(self, task_id):
        self.service_results.pop(task_id, None)

    def _run(self):
        while True:
            try:
                for method_frame, properties, body in self.channel.consume(self.reply_to, no_ack=True):
                    # Route each reply to the subscribed task by correlation_id.
                    if properties.correlation_id in self.service_results:
                        self.service_results[properties.correlation_id].put_nowait((method_frame, properties, body))
            except ...
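The subscribe / route / unsubscribe cycle of the dispatcher can be tried without a broker. The sketch below is a simplified in-memory stand-in, assuming gevent is installed: InMemoryDispatcher, its replies queue (playing the exclusive reply_to queue), the task function, and the None stop signal are all hypothetical and are not the talk's implementation.

```python
import gevent
from gevent.queue import Queue, Empty


class InMemoryDispatcher:
    """Broker-free stand-in for the dispatcher: routes replies by task_id."""

    def __init__(self):
        self.replies = Queue()     # plays the exclusive reply_to queue
        self.service_results = {}  # task_id -> per-task gevent queue

    def subscribe(self, task_id):
        results_queue = Queue()
        self.service_results[task_id] = results_queue
        return results_queue

    def unsubscribe(self, task_id):
        self.service_results.pop(task_id, None)

    def run(self):
        while True:
            message = self.replies.get()
            if message is None:  # stop signal used only by this sketch
                break
            correlation_id, body = message
            # Replies for unknown or already unsubscribed tasks are dropped.
            if correlation_id in self.service_results:
                self.service_results[correlation_id].put_nowait(body)


def task(dispatcher, task_id, timeout=0.5):
    results_queue = dispatcher.subscribe(task_id)
    try:
        # A real task would publish a request to a service queue here;
        # this sketch fakes the service reply directly.
        dispatcher.replies.put((task_id, "result for " + task_id))
        return results_queue.get(timeout=timeout)
    except Empty:
        return None
    finally:
        dispatcher.unsubscribe(task_id)


dispatcher = InMemoryDispatcher()
listener = gevent.spawn(dispatcher.run)
tasks = [gevent.spawn(task, dispatcher, "task-%d" % i) for i in range(3)]
gevent.joinall(tasks)
dispatcher.replies.put(None)
listener.join()
print([t.value for t in tasks])
```

One listener greenlet owns the shared reply queue, while each task blocks only on its own gevent queue, so a slow service delays a single task rather than the whole worker.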
Greenlet task
greenlet
greenlet
greenlet
service results dispatcher
services queues
subscribe
gevent.queues
# Fragment of the task greenlet. The break implies an enclosing loop that is not shown.
# Empty is gevent.queue.Empty.
self.results_queue, self.reply_to = self.service_publisher.subscribe(self.task_id)
self.channel.basic_publish(exchange='',
                           routing_key=service_queue,
                           properties=BasicProperties(
                               reply_to=self.reply_to,
                               correlation_id=self.task_id
                           ),
                           body=request)
try:
    method_frame, properties, body = self.results_queue.get(
        block=True, timeout=self.timeout)
except Empty:
    logger.info('timeout')
    break
else:
    logger.info('body = {}'.format(body))
Services
response = channel.basic_publish(
    exchange='',
    routing_key=props.reply_to,
    properties=BasicProperties(correlation_id=request.task_id),
    body=response
)
Alternatives?
Why gevent?
1. Built-in support in Celery (little effort required)
2. I wanted this talk to focus on gevent specifically. Nothing prevents reworking this for asyncio, for example.
Conclusion
Links
http://www.gevent.org
http://sdiehl.github.io/gevent-tutorial/
https://github.com/python-greenlet/greenlet
https://www.rabbitmq.com/
http://www.celeryproject.org/
Thank you for your attention!