![Page 1: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/1.jpg)
Developing Amazon’s Dynamo in POE and Erlang – Prologue of “From POE to Erlang”
takemaru
![Page 2: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/2.jpg)
What is Dynamo?
Developed by Amazon CTO Werner Vogels and colleagues
![Page 3: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/3.jpg)
What is Dynamo?
A key-value store (hash table)
A far more powerful version of this:
    my %hash = (
        key1 => "value1",
        key2 => "value2",
    );
Or, perhaps more accurately, a memcached that does not lose its data
![Page 4: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/4.jpg)
What is Dynamo?
A key-value store (hash table)
Scales out easily (to hundreds of machines, for example)
![Page 5: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/5.jpg)
What is Dynamo?
A key-value store (hash table)
Fault tolerant (survives not only machine failures but also rack failures)
![Page 6: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/6.jpg)
What is Dynamo?
A key-value store (hash table)
Stable response time (latency)
< 300 ms, even when a machine fails or a network failure occurs
![Page 7: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/7.jpg)
What is Dynamo?
A key-value store (hash table)
Always readable and writable (no stalls caused by locks)
That kind of thing simply does not happen
![Page 8: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/8.jpg)
What is Dynamo?
A key-value store (hash table)
Well suited to storing large numbers of small items
[Figure: Dynamo compared with Bigtable/GFS]
![Page 9: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/9.jpg)
Kai – Yet Another Amazon’s Dynamo
Now, about an open-source implementation of Dynamo
First in POE
Then in Erlang
It can be used through the memcache protocol
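As a rough illustration of what "usable through the memcache protocol" means on the wire, the sketch below builds the request bytes for the standard memcached text commands `set` and `get`. The helper names `build_set` and `build_get` are hypothetical, and the slides do not say which commands Kai supports, so treat this as an assumption about a typical client rather than Kai's actual interface.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached text-protocol 'set' request.

    Format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
    """
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Build a memcached text-protocol 'get' request: get <key>\r\n"""
    return f"get {key}\r\n".encode("ascii")
```

Any existing memcached client library produces these same bytes, which is why a store that speaks this protocol can be used without a dedicated client.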
![Page 10: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/10.jpg)
Example of operation
A client sends a request
[Diagram: a client and six Dynamo nodes]
![Page 11: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/11.jpg)
Example of operation
The request is forwarded to the 3 nodes that hold replicas
The nodes are chosen by consistent hashing
[Diagram: a client and six Dynamo nodes]
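Choosing the three replica nodes by consistent hashing can be sketched as follows. This is a minimal illustration, not Kai's code: the `Ring` class, the use of SHA-256, and the number of virtual nodes are all assumptions made for the example; only the name `choose_nodes` echoes the pseudocode on later slides.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Any stable hash works; SHA-256 is an arbitrary choice for this sketch.
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class Ring:
    def __init__(self, nodes, vnodes=64):
        # Place each physical node at several points (virtual nodes) on the ring.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]
        self._node_count = len(set(nodes))

    def choose_nodes(self, key, n=3):
        """Walk clockwise from the key's position and collect n distinct nodes."""
        n = min(n, self._node_count)
        if n <= 0:
            return []
        start = bisect.bisect_right(self._hashes, _hash(key))
        chosen = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == n:
                    break
        return chosen
```

The same key always maps to the same nodes, and adding or removing one node only moves the keys adjacent to its points on the ring, which is what makes scaling out cheap.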
![Page 12: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/12.jpg)
Example of operation
The first node responds
[Diagram: a client and six Dynamo nodes]
![Page 13: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/13.jpg)
Example of operation
On the second response, reply to the client
Version checks and similar steps run at this point
The reply is also sent if the wait times out
[Diagram: a client and six Dynamo nodes]
![Page 14: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/14.jpg)
Example of operation
Responses that arrive late are still processed
Version repair and similar steps run at this point
[Diagram: a client and six Dynamo nodes]
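The version check and the later repair can be sketched like this. The slides do not say how versions are represented, so the example assumes a plain integer version per reply for brevity (Dynamo itself is described as using vector clocks). `Reply`, `newest`, and `stale_nodes` are hypothetical names.

```python
from typing import NamedTuple


class Reply(NamedTuple):
    node: str
    version: int  # assumption: a simple counter, larger means newer
    value: str


def newest(replies):
    """Version check: pick the reply carrying the highest version."""
    return max(replies, key=lambda r: r.version)


def stale_nodes(replies):
    """Version repair: list the nodes whose copy is older than the newest one."""
    latest = newest(replies)
    return [r.node for r in replies if r.version < latest.version]
```

With the first two replies in hand, `newest` decides what goes back to the client; once the third reply arrives late, `stale_nodes` over all three tells the coordinator which replicas need the newer value written back.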
![Page 15: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/15.jpg)
POE (event-driven) implementation
Implement one function per event
[Diagram: a client and six Dynamo nodes; events Receive from Client, Receive from Node, and Timeout share State]
Pseudocode:
    my $State;
    sub recv_from_client;
    sub recv_from_node;
    sub timeout;
    sub another_error;
The actual implementation was slightly different, but this is how one would normally go about it
![Page 16: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/16.jpg)
POE (event-driven) implementation
A client sends a request
[Diagram: a client and six Dynamo nodes; events Receive from Client, Receive from Node, and Timeout share State]
    sub recv_from_client {
        # Read the request from the client socket
        $cli_sock->recv(my $req, 1024);
        # (the body continues on the next slide)
![Page 17: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/17.jpg)
POE (event-driven) implementation
Forward the request to the 3 nodes that hold replicas
[Diagram: a client and six Dynamo nodes; events Receive from Client, Receive from Node, and Timeout share State]
        # Continuation of recv_from_client
        state_init($cli_sock);
        my @nodes = choose_nodes($req);
        for my $node (@nodes) {
            my $sock = IO::Socket::INET->new(…);
            $sock->write($req);
            # Fire 'recv_from_node' when this node's socket becomes readable
            $poe->kernel->select_read($sock, 'recv_from_node', $cli_sock);
        }
    }
![Page 18: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/18.jpg)
POE (event-driven) implementation
The first node responds
[Diagram: a client and six Dynamo nodes; events Receive from Client, Receive from Node, and Timeout share State]
    sub recv_from_node {
        # The client socket was passed along as an extra select_read argument
        my $cli_sock = $poe->args->[2];
        $node_sock->read(my $res, 1024);
        stat_add_res($cli_sock, $res);
        my $count = stat_count_res($cli_sock);
        if ($count == 2) {
            # (the body continues on the next slide)
![Page 19: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/19.jpg)
POE (event-driven) implementation
On the second response, reply to the client
The reply is also sent if the wait times out
[Diagram: a client and six Dynamo nodes; events Receive from Client, Receive from Node, and Timeout share State]
    sub recv_from_node {
        # The client socket was passed along as an extra select_read argument
        my $cli_sock = $poe->args->[2];
        $node_sock->read(my $res, 1024);
        stat_add_res($cli_sock, $res);
        my $count = stat_count_res($cli_sock);
        if ($count == 2) {
            # Second response: deduplicate and answer the client
            my $res = stat_uniq_res($cli_sock);
            $cli_sock->write($res);
        }
    }
![Page 20: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/20.jpg)
POE (event-driven) implementation
The code is organized per event (that is, along the progression of time)
[Diagram: Client, Receive from Client, Receive from Node, Node; the handlers Receive from Client, Receive from Node, and Timeout share State]
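To make the shape of the event-driven version concrete, here is a small sketch with the event loop and sockets stripped out. It assumes a hypothetical `Coordinator` class whose methods stand in for the per-event handlers, and a `sent` list that stands in for writing to the client socket. The point is that every handler has to look up and update the shared per-client state.

```python
class Coordinator:
    def __init__(self, quorum=2):
        self.quorum = quorum
        self.state = {}  # client id -> {"replies": [...], "answered": bool}
        self.sent = []   # (client id, replies) pairs "written" back to clients

    def recv_from_client(self, client, request, nodes):
        # Event 1: a request arrives; set up state and forward to the replicas.
        self.state[client] = {"replies": [], "answered": False}
        return [(node, request) for node in nodes]

    def recv_from_node(self, client, reply):
        # Event 2: a replica answers; reply to the client at the quorum.
        entry = self.state[client]
        entry["replies"].append(reply)
        if len(entry["replies"]) >= self.quorum:
            self._answer(client)

    def timeout(self, client):
        # Event 3: the timer fires; reply with whatever has arrived so far.
        self._answer(client)

    def _answer(self, client):
        entry = self.state[client]
        if not entry["answered"]:
            entry["answered"] = True
            self.sent.append((client, list(entry["replies"])))
```

One request's logic is spread over three handlers, and the `answered` flag exists only because the handlers cannot see each other's progress except through the shared state.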
![Page 21: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/21.jpg)
Erlang implementation
The code is organized per process
[Diagram: a client and six Dynamo nodes; Process for Client]
![Page 22: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/22.jpg)
Erlang implementation
A client sends a request
[Diagram: a client and six Dynamo nodes; Process for Client, Processes for Nodes]
    recv(Sock) ->
        {ok, Req} = gen_tcp:recv(Sock, 0),
        %% Spawn one process per replica node; each reports back to self()
        lists:foreach(
            fun(Node) -> spawn(?MODULE, map, [Node, Req, self()]) end,
            choose_nodes(Req)
        ),
        Res = gather(2, []),
        %% (the body continues on a later slide)
![Page 23: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/23.jpg)
Erlang implementation
Forward the request to the 3 nodes that hold replicas
[Diagram: a client and six Dynamo nodes; Process for Client, Processes for Nodes]
    map(Node, Req, Parent) ->
        {ok, Sock} = gen_tcp:connect(Node, …),
        gen_tcp:send(Sock, Req),
        %% (the body continues on the next slide)
![Page 24: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/24.jpg)
Erlang implementation
The first node responds
[Diagram: a client and six Dynamo nodes; Process for Client, Processes for Nodes]
    map(Node, Req, Parent) ->
        {ok, Sock} = gen_tcp:connect(Node, …),
        gen_tcp:send(Sock, Req),
        receive
            %% Pass the node's response on to the client process
            {tcp, Sock, Res} -> Parent ! Res;
            %% (remaining clauses are not shown on the slide)
![Page 25: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/25.jpg)
Erlang implementation
On the second response, reply to the client
The reply is also sent if the wait times out
[Diagram: a client and six Dynamo nodes; Process for Client, Processes for Nodes]
    map(Node, Req, Parent) ->
        {ok, Sock} = gen_tcp:connect(Node, …),
        gen_tcp:send(Sock, Req),
        receive
            %% Pass the node's response on to the client process
            {tcp, Sock, Res} -> Parent ! Res;
            %% (remaining clauses are not shown on the slide)
![Page 26: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/26.jpg)
Erlang implementation
On the second response, reply to the client
The reply is also sent if the wait times out
[Diagram: a client and six Dynamo nodes; Process for Client, Processes for Nodes]
        %% Tail of recv/1: wait for two responses, then answer the client
        Res = gather(2, []),
        gen_tcp:send(Sock, Res).

    gather(0, Acc) ->
        Acc;
    gather(N, Acc) ->
        receive
            Res -> gather(N-1, uniq([Res|Acc]))
        after 100 -> % timeout: return what has arrived so far
            Acc
        end.
![Page 27: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/27.jpg)
Erlang implementation
The code is organized per process
[Diagram: Client, Process for Client, Process for Node, Node]
The exchange with the client and the exchanges with the nodes can each be kept together in their own process
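For readers who do not know Erlang, the process-per-task structure can be approximated with threads and a queue. This is only an analogy under stated assumptions: Python threads are far heavier than Erlang processes, `fetch` is a hypothetical stand-in for the network call to a node, and the deduplication done by `uniq` in the Erlang code is left out.

```python
import queue
import threading


def map_node(node, request, parent, fetch):
    # One worker per replica node: ask the node, then message the parent.
    parent.put(fetch(node, request))


def gather(n, inbox, timeout=0.1):
    # Collect up to n replies; give up on the first wait that times out.
    acc = []
    while len(acc) < n:
        try:
            acc.append(inbox.get(timeout=timeout))
        except queue.Empty:
            break
    return acc


def handle_request(request, nodes, fetch, quorum=2, timeout=0.1):
    # The whole client exchange reads top to bottom in one place.
    inbox = queue.Queue()
    for node in nodes:
        threading.Thread(
            target=map_node, args=(node, request, inbox, fetch), daemon=True
        ).start()
    return gather(quorum, inbox, timeout)
```

Unlike the event-driven sketch, no shared state table is needed: each worker only sends a message, and the waiting logic, including the timeout, lives in a single function.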
![Page 28: Amazon's Dynamo in POE and Erlang @ YAPC::Asia 2008 LT](https://reader033.vdocuments.net/reader033/viewer/2022052622/5591c3411a28ab37408b45e0/html5/thumbnails/28.jpg)
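Summary
Erlang is great
You can write straightforward code that follows the system's behavior (its operational model)
None of the mutual-exclusion problems of multithreading
Because it uses message passing rather than shared memory
Process creation and messaging are cheap
Reportedly it can run tens of millions of processes
That said, it may not be a good fit for anything other than distributed systems
Basic operations such as file and string handling are weak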