ErLounge/SF Bay 2010.1.12

Welcome

hello_world.erl

wello_horld.erl

CoTweet Robot: Task

• Pull updates from Twitter API:

• n * 10,000 channels ( n is going up -- \o/ )

• Each channel update requires 4 HTTP requests on average

= n million HTTP requests / hr

• Minimize latency:

• Ideally < 300s between updates

• <blink>Survivable</blink> !

Primordial Ooze: Ruby

Note: not a real robot

Like any tool, Rails is great for some things...

Robot Evolution: I

• Main process: receive/after loop:

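service_loop() ->
    spawn(?MODULE, do_work, []),
    %% No message patterns: this receive just sleeps until the timeout.
    %% SleepyTimeMS is assumed to be bound elsewhere (macro or argument).
    receive
    after SleepyTimeMS * 1000 ->
        ok
    end,
    service_loop().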

• Reads user info from database

• For every N records:

• spawn() a worker to process bucket:

• Retrieve updates from Twitter API

• Insert into database (returning ID)

• Insert (using ID) into memcached
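The bucket-and-worker step above could be sketched roughly as follows. This is a minimal sketch, not the CoTweet code: the module name `robot_v1`, the functions `chunk/2` and `run_once/2`, and the stubbed `process_bucket/1` are all hypothetical, and the real Twitter, database, and memcached calls are replaced by a placeholder.

```erlang
-module(robot_v1).
-export([run_once/2, chunk/2]).

%% Split a list into buckets of at most N elements.
chunk([], _N) -> [];
chunk(List, N) when is_integer(N), N >= 1, length(List) =< N -> [List];
chunk(List, N) when is_integer(N), N >= 1 ->
    {Bucket, Rest} = lists:split(N, List),
    [Bucket | chunk(Rest, N)].

%% Spawn one worker per bucket, then collect one result per worker,
%% in bucket order (the receive matches on each worker's pid).
run_once(Records, N) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {done, self(), process_bucket(B)} end)
            || B <- chunk(Records, N)],
    [receive {done, Pid, Count} -> Count end || Pid <- Pids].

%% Placeholder (assumption): stands in for "retrieve updates, insert
%% into database, insert into memcached". Here it only counts records.
process_bucket(Bucket) ->
    length(Bucket).
```

Note that with a bare `spawn/1`, a worker that crashes never sends its `done` message, so the collecting `receive` would wait forever.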

spawn() responsibly.

• I <3 supervisors

• trap_exit + PID = spawn_link(M, F, [A]).

• {PID,Ref} = spawn_monitor(M,F,[A]).
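A small sketch of the two options, for illustration only. The module name `spawn_demo` and both function names are hypothetical, and the fun-based `spawn_monitor/1` and `spawn_link/1` are used instead of the M, F, A forms on the slide so the example stays self-contained.

```erlang
-module(spawn_demo).
-export([monitored/1, linked/1]).

%% Run Fun in a monitored process and return its exit reason.
%% A monitor is one-way: the caller gets a 'DOWN' message and is
%% never killed by the worker's exit.
monitored(Fun) ->
    {Pid, Ref} = spawn_monitor(Fun),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    end.

%% Run Fun in a linked process while trapping exits, and return its
%% exit reason. With trap_exit set, the exit signal arrives as an
%% {'EXIT', Pid, Reason} message instead of killing the caller.
linked(Fun) ->
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(Fun),
    Reason = receive {'EXIT', Pid, R} -> R end,
    process_flag(trap_exit, Old),   % restore the previous setting
    Reason.
```

For example, `spawn_demo:monitored(fun() -> exit(boom) end)` returns `boom`, while a fun that simply returns yields `normal`.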

Single Assignment != Stateless

• gen_server, gen_event, gen_fsm: State

-record(state, {foo, bar}).

handle_call({set_foo, V}, _From, State) ->
    {reply, ok, State#state{foo = V}};
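For context, a complete module around that clause might look like the sketch below. The module name `foo_server` and the `get_foo/0` call are assumptions added for illustration. Only the callbacks that recent OTP releases require are defined; older releases also expect `handle_info/2`, `terminate/2`, and `code_change/3`.

```erlang
-module(foo_server).
-behaviour(gen_server).

-export([start_link/0, set_foo/1, get_foo/0]).
-export([init/1, handle_call/3, handle_cast/2]).

-record(state, {foo, bar}).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

set_foo(V) -> gen_server:call(?MODULE, {set_foo, V}).
get_foo()  -> gen_server:call(?MODULE, get_foo).

%% The "mutable" state is just the value threaded through callbacks.
init([]) ->
    {ok, #state{}}.

%% Each clause returns a new state; nothing is assigned twice.
handle_call({set_foo, V}, _From, State) ->
    {reply, ok, State#state{foo = V}};
handle_call(get_foo, _From, State) ->
    {reply, State#state.foo, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```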

• ets

• Many concurrent reads? No problem.

start() -> ets:new(state, [named_table, public]).

set_foo(V) -> ets:insert(state, {foo,V}).

• Many concurrent writes?

• May need to serialize through a gen_server.

handle_call({set_foo, V}, _From, State) ->
    ets:insert(state, {foo, V}),
    {reply, ok, State};
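One way to arrange this, sketched under assumptions: the gen_server owns the table, all writes go through `gen_server:call/2`, and readers hit ets directly. The module name `state_table` and `get_foo/0` are hypothetical. The table is created `protected` rather than `public` here, since only the owner writes.

```erlang
-module(state_table).
-behaviour(gen_server).

-export([start_link/0, set_foo/1, get_foo/0]).
-export([init/1, handle_call/3, handle_cast/2]).

-define(TABLE, state).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Writes are serialized: every caller queues up on the server.
set_foo(V) ->
    gen_server:call(?MODULE, {set_foo, V}).

%% Reads bypass the server and go straight to the table.
get_foo() ->
    case ets:lookup(?TABLE, foo) of
        [{foo, V}] -> V;
        []         -> undefined
    end.

%% The server process owns the table; 'protected' lets any process
%% read but only the owner write.
init([]) ->
    ?TABLE = ets:new(?TABLE, [named_table, protected,
                              {read_concurrency, true}]),
    {ok, nostate}.

handle_call({set_foo, V}, _From, State) ->
    true = ets:insert(?TABLE, {foo, V}),
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

The table disappears when its owner exits, so in a real system the owner would sit under a supervisor.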

Single Assignment != Stateless

• gen_server, gen_event, gen_fsm: State

• ets: app-wide

• serialize writes through a gen_server

• mnesia: persistent

• process dictionary: put(K,V) / get(K)

• Tiny, transient, process-level

• (think: private):

Robot Evolution: II

• All state held in ets

• Robot comprises many ‘services’:

• Feeder: poll db as needed, update ets

• Scheduler: add tasks to work queue

• Business logic:

• even work load, limit resource(s)

• Dispatcher: service queue for workers

Hm, our “workers” may block waiting for the result (i.e., ID) of an insert from the DB...

Needs more OTP.

Robot Evolution: III

• gen_event as simplest MUX ever

• no time for downtime: hot deployment

gen_event

• Extraordinarily easy to multiplex:

• Create an event manager: gen_event

• Add handlers

• Fire an event

• Not just for logging . . .
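The three steps can be sketched in one small module. This is illustrative only: the module name `mux_handler`, the `demo/0` function, and the choice to forward each event to a pid are assumptions, not the handlers the robot used.

```erlang
-module(mux_handler).
-behaviour(gen_event).

-export([demo/0]).
-export([init/1, handle_event/2, handle_call/2]).

%% Each handler instance remembers a tag and a pid to forward to.
init({Tag, Pid}) ->
    {ok, {Tag, Pid}}.

%% Called once per installed handler for every event fired.
handle_event(Event, {Tag, Pid} = State) ->
    Pid ! {Tag, Event},
    {ok, State}.

handle_call(_Request, State) ->
    {ok, ok, State}.

demo() ->
    %% 1. Create an event manager.
    {ok, Mgr} = gen_event:start_link(),
    %% 2. Add handlers: the same module installed twice, under
    %%    different ids.
    ok = gen_event:add_handler(Mgr, {?MODULE, a}, {a, self()}),
    ok = gen_event:add_handler(Mgr, {?MODULE, b}, {b, self()}),
    %% 3. Fire one event; every handler receives it.
    ok = gen_event:notify(Mgr, hello),
    A = receive {a, E1} -> E1 end,
    B = receive {b, E2} -> E2 end,
    ok = gen_event:stop(Mgr),
    {A, B}.
```

Calling `mux_handler:demo()` returns `{hello, hello}`: a single `notify/2` reached both handlers.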

Deployment

-module(my_app).
-export([reload_modules/0]).

-define(APP_MODS, [my_app, my_service]).

reload_modules() ->
    lists:map(fun(M) -> code:load_file(M) end, ?APP_MODS).

See: http://www.erlang.org/cgi-bin/ezmlm-cgi/4/36236
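A possible variation, sketched as an assumption rather than as what the robot did. As far as the `code` module documentation describes it, `code:load_file/1` returns `{error, not_purged}` when an old version of the module is still loaded, so repeated reloads may need a purge first. The module name `reloader` and the `old_code_in_use` reason are hypothetical.

```erlang
-module(reloader).
-export([reload/1]).

%% Reload each module in the list and report a per-module result.
reload(Modules) ->
    [{M, reload_module(M)} || M <- Modules].

reload_module(M) ->
    %% soft_purge/1 removes old code only if no process still runs it.
    case code:soft_purge(M) of
        true ->
            case code:load_file(M) of
                {module, M}     -> ok;
                {error, Reason} -> {error, Reason}
            end;
        false ->
            %% Hypothetical reason: some process is still executing
            %% the old version, so it is left alone.
            {error, old_code_in_use}
    end.
```

`code:soft_purge/1` is the cautious choice here; `code:purge/1` would instead kill any process still running the old code.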

Open Source

S. Mestrom with K. Newby

“Standing on the shoulders of giants...”

-- Bob Ippolito

• plists (parallel list operations)

• lhttpc (much lighter than inets)

• rfc_4627 (utf8 JSON decoding)

• epgsql (PostgreSQL driver)

• erlang_twitter (Twitter API via xmerl)

• merle (erlang memcached adapter)

• mochi (congrats to MochiMedia!)

Thanks

• SocialMedia -- Ken Thom

• CoTweet

• You

ErLounge

• Grab another drink

• Share your stories