ErLounge/SF Bay 2010.1.12
Welcome
hello_world.erl
wello_horld.erl
CoTweet Robot: Task
• Pull updates from Twitter API:
• n * 10,000 channels ( n is going up -- \o/ )
• Each channel update requires 4 HTTP requests on average
• = n million HTTP requests / hr
• Minimize latency:
• Ideally < 300s between updates
• <blink>Survivable</blink> !
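A back-of-the-envelope check of the request volume, as a sketch only. It assumes (the slides do not say this) that every channel is polled exactly once per 300 s window; the module and function names are made up for illustration.

```erlang
-module(robot_math).
-export([requests_per_hour/1]).

%% Assumptions (illustrative, not from the slides):
%%   - each channel is polled once every 300 seconds
%%   - each poll costs 4 HTTP requests (the slide's average)
requests_per_hour(N) ->
    Channels = N * 10000,
    PollsPerHour = 3600 div 300,      % 12 polls per channel per hour
    Channels * PollsPerHour * 4.
```

Under those assumptions `robot_math:requests_per_hour(1)` is 480000, about half a million requests per hour per unit of n, which is the same order of magnitude as the slide's "n million".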
Primordial Ooze: Ruby
Note: not a real robot
Like any tool, Rails is great for some things...
Robot Evolution: I
• Main process: receive/after loop:
service_loop(SleepyTimeSecs) ->
    spawn(?MODULE, do_work, []),
    %% receive with no clauses: just sleep until the timeout (in ms)
    receive after SleepyTimeSecs * 1000 -> ok end,
    service_loop(SleepyTimeSecs).
• Reads user info from database
• For every N records:
• spawn() a worker to process bucket:
• Retrieve updates from Twitter API
• Insert into database (returning ID)
• Insert (using ID) into memcached
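The bucket-per-worker steps above could be sketched roughly like this. Everything here is hypothetical: the module name, `buckets/2`, and the three placeholder functions merely stand in for the real Twitter, database and memcached calls, which the slides do not show.

```erlang
-module(bucket_sketch).
-export([process_all/2, do_bucket/1, buckets/2]).

%% Split a list of records into buckets of at most N elements.
buckets(_N, []) -> [];
buckets(N, Records) when N > 0 ->
    {Bucket, Rest} = take(N, Records, []),
    [Bucket | buckets(N, Rest)].

take(0, Rest, Acc) -> {lists:reverse(Acc), Rest};
take(_, [], Acc) -> {lists:reverse(Acc), []};
take(K, [H | T], Acc) -> take(K - 1, T, [H | Acc]).

%% Spawn one worker per bucket; returns the worker pids.
process_all(N, Records) ->
    [spawn(?MODULE, do_bucket, [B]) || B <- buckets(N, Records)].

%% Worker: the three steps from the slide, using placeholder functions.
do_bucket(Users) ->
    lists:foreach(
      fun(User) ->
              Updates = fetch_updates(User),            % Twitter API (stub)
              [cache(insert_db(U), U) || U <- Updates]  % db insert returns the ID
      end, Users).

%% Placeholders only: replace with real HTTP, database and memcached calls.
fetch_updates(_User) -> [].
insert_db(_Update) -> 0.
cache(_Id, _Update) -> ok.
```

Note that a bare `spawn/3` like this gives the parent no way to notice a crashed worker.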
spawn() responsibly.
• I <3 supervisors
• process_flag(trap_exit, true) + PID = spawn_link(M, F, [A]).
• {PID,Ref} = spawn_monitor(M,F,[A]).
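A minimal sketch of what the parent sees with each approach. The module name and the deliberately crashing worker are made up for illustration; in a real application a supervisor would normally do this bookkeeping.

```erlang
-module(spawn_sketch).
-export([linked/0, monitored/0, crash/0]).

%% A worker that always fails, to show what the parent receives.
crash() -> exit(boom).

%% Link + trap_exit: the exit arrives as a message instead of killing us.
%% Note: this leaves trap_exit switched on in the calling process.
linked() ->
    process_flag(trap_exit, true),
    Pid = spawn_link(?MODULE, crash, []),
    receive
        {'EXIT', Pid, Reason} -> {worker_exited, Reason}
    after 1000 -> timeout
    end.

%% Monitor: one-way; the parent gets a 'DOWN' message and is never killed.
monitored() ->
    {Pid, Ref} = spawn_monitor(?MODULE, crash, []),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> {worker_down, Reason}
    after 1000 -> timeout
    end.
```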
Single Assignment != Stateless
• gen_server, gen_event, gen_fsm: State
-record(state, {foo, bar}).
handle_call({set_foo, V}, _From, State) ->
    {reply, ok, State#state{foo = V}};
Single Assignment != Stateless
• ets
• Many concurrent reads? No problem.
start() -> ets:new(state, [named_table, public]).
set_foo(V) -> ets:insert(state, {foo,V}).
• Many concurrent writes?
• May need to Serialize through a gen_server.
handle_call({set_foo, V}, _From, State) ->
    ets:insert(state, {foo, V}),
    {reply, ok, State};
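Put together, serialized writes with direct reads might look like the sketch below. The module and table names are hypothetical, and the table is `protected` rather than `public` as on the slide; that is my choice, so that only the owning server can write.

```erlang
-module(state_srv).
-behaviour(gen_server).
-export([start_link/0, set_foo/1, get_foo/0]).
-export([init/1, handle_call/3, handle_cast/2]).

-define(TAB, state_srv_tab).

start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Writes go through the server, so they are applied one at a time.
set_foo(V) -> gen_server:call(?MODULE, {set_foo, V}).

%% Reads hit the table directly from the calling process.
get_foo() ->
    case ets:lookup(?TAB, foo) of
        [{foo, V}] -> {ok, V};
        [] -> undefined
    end.

init([]) ->
    %% protected: any process may read, only the owner (this server) writes
    _ = ets:new(?TAB, [named_table, protected]),
    {ok, nostate}.

handle_call({set_foo, V}, _From, State) ->
    true = ets:insert(?TAB, {foo, V}),
    {reply, ok, State}.

handle_cast(_Msg, State) -> {noreply, State}.
```

The table dies with its owner, so a crash of this server also loses the state unless it is rebuilt on restart.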
Single Assignment != Stateless
• gen_server, gen_event, gen_fsm: State
• ets: app-wide
• serialize writes through a gen_server
• mnesia: persistent
• process dictionary: put(K,V) / get(K)
• Tiny, transient, process-level
• (think: private)
Robot Evolution: II
• All state held in ets
• Robot comprises many ‘services’:
• Feeder: poll db as needed, update ets
• Scheduler: add tasks to work queue
• Business logic:
• even work load, limit resource(s)
• Dispatcher: service queue for workers
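One way the dispatcher's work queue could look, as a sketch under assumptions: the slides keep all state in ets, whereas this simplified version keeps the queue in the server's own state, and the module and function names are invented.

```erlang
-module(dispatcher_sketch).
-behaviour(gen_server).
-export([start_link/0, add_task/1, next_task/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Scheduler side: enqueue a task (any term).
add_task(Task) -> gen_server:cast(?MODULE, {add, Task}).

%% Worker side: take the oldest task, or 'empty' if there is none.
next_task() -> gen_server:call(?MODULE, next).

init([]) -> {ok, queue:new()}.

handle_cast({add, Task}, Q) -> {noreply, queue:in(Task, Q)}.

handle_call(next, _From, Q) ->
    case queue:out(Q) of
        {{value, Task}, Q1} -> {reply, {ok, Task}, Q1};
        {empty, Q1} -> {reply, empty, Q1}
    end.
```

Workers pull with `next_task/0` when they are free, which is one simple way to even out the load.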
Hm, our “workers” may block waiting for the result (i.e., ID) of an insert from DB . . .
Needs more OTP.
Robot Evolution: III
• gen_event as simplest MUX ever
• no time for downtime: hot deployment
gen_event
• Extraordinarily easy to multiplex:
• Create an event manager: gen_event
• Add handlers
• Fire an event
• Not just for logging . . .
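The three steps above, sketched with one handler module installed twice. The module name, the `db`/`cache` tags and the `{update, 42}` event are all made up for illustration; real handlers would do the database or memcached work in `handle_event/2`.

```erlang
-module(mux_handler).
-behaviour(gen_event).
-export([demo/0]).
-export([init/1, handle_event/2, handle_call/2]).

%% Each handler instance remembers a tag and the events it has seen.
init(Tag) -> {ok, {Tag, []}}.

handle_event(Event, {Tag, Seen}) ->
    {ok, {Tag, [Event | Seen]}}.

%% Let callers ask a handler what it has received so far.
handle_call(seen, {Tag, Seen} = State) ->
    {ok, {Tag, lists:reverse(Seen)}, State}.

demo() ->
    {ok, Mgr} = gen_event:start_link(),                       % 1. event manager
    ok = gen_event:add_handler(Mgr, {?MODULE, db}, db),       % 2. add handlers
    ok = gen_event:add_handler(Mgr, {?MODULE, cache}, cache),
    ok = gen_event:sync_notify(Mgr, {update, 42}),            % 3. fire an event
    Result = [gen_event:call(Mgr, {?MODULE, Id}, seen) || Id <- [db, cache]],
    ok = gen_event:stop(Mgr),
    Result.
```

Every installed handler receives every event, which is what makes the manager act as a multiplexer.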
Deployment
-module(my_app).
-export([reload_modules/0]).
-define(APP_MODS, [my_app, my_service]).
reload_modules() ->
    lists:map(fun(M) -> code:load_file(M) end, ?APP_MODS).
See: http://www.erlang.org/cgi-bin/ezmlm-cgi/4/36236
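A slightly more defensive variant, as a sketch rather than the speaker's method. `code:load_file/1` returns `{error, not_purged}` when an old version of the module is still around, so this version calls `code:soft_purge/1` first. The module name `reload_sketch` is hypothetical.

```erlang
-module(reload_sketch).
-export([reload/1]).

%% Reload each module, returning {Module, Result} pairs.
reload(Mods) ->
    [{M, reload_one(M)} || M <- Mods].

reload_one(M) ->
    %% soft_purge/1 removes old code only if no process still runs it.
    case code:soft_purge(M) of
        true -> code:load_file(M);          % {module, M} | {error, Reason}
        false -> {error, old_code_in_use}
    end.
```

Returning the per-module results makes it easy to see which modules failed to load instead of silently ignoring errors.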
Open Source
S. Mestrom with K. Newby
“Standing on the shoulders of giants...”
-- Bob Ippolito
Open Source
• plists (parallel list operations)
• lhttpc (much lighter than inets)
• rfc_4627 (utf8 JSON decoding)
• epgsql (PostgreSQL driver)
• erlang_twitter (Twitter API via xmerl)
• merle (erlang memcached adapter)
• mochi (congrats to MochiMedia!)
Thanks
• SocialMedia -- Ken Thom
• CoTweet
• You
ErLounge
• Grab another drink
• Share your stories