erts diagram

8
what is a stack in IT? We're talking about a silo of dependencies, like a tower, with the end-user sitting at the top. Material science and E&E engineering are the foundations near the bottom, which give us hardware. Hardware is hardly a firm foundation... it changes very often, like the currents in the sea, and so we have software operating systems like Windows, Linux, and Android to deal with the frequent changes - giving us a longer term "ground floor", or "floating platform", upon which to base applications that provide value to end-users of IT. Most recently, distributed computing is becoming very trendy - and you must be aware of salespeak like "cloud computing" and "data centres". Unfortunately, although some significant work has been done, at the research level, on operating systems that tie many separate computers together, there are no dominant solutions in this space - no solutions that have a level of acceptance that is as widespread as any of the aforementioned operating systems. The most popular solutions are currently a thin layer, called hypervisors, that fit between operating systems designed to work on single computers, and the many separate computers that need to be made to work together. The operating system remains a "single machine" model, and so it almost forces all application development to implement its own way of dealing with the coordination of work that needs to be done among many computers. So let's say now, that you're a programmer who is quite flexible, and can learn to speak any language you like, and you need to write an application that runs on many computers at the same time - an easy example for this crowd is an application that needs to serve 100,000 complex webpages per second. You might ask yourself, "which is the hardware or software machine that already knows how to deal with most of the knitty-gritty of getting this job done?" Once you had an answer, you might then ask what language that software or hardware speaks, and then you would learn that language, so that you could communicate with the hardware or software that already knows how to get the job done. It turns out that Ericsson has continuously developed and maintained for over 20 years, a piece of software that is very clever at solving this problem problem, of getting many pieces of hardware to work together on performing a single job that helps some end-user. Ericsson calls this clever machine the "BEAM emulator", and it calls the framework for working with the BEAM emulator, "the Open Telecoms Platform (OTP)". So BEAM is this smart machine, and OTP is a pattern that BEAM uses to think about distributed computing problems. Now how do we humans talk to BEAM? We would normally use a language called Erlang. This is why it is valuable to learn Erlang. (Zooming out for a second, let's remember that languages don't do work - people, or machines, do work, and we only use languages to communicate with machines, as if they are little people.) The rest of today's talk is going to briefly cover three topics: 1) how BEAM gets distributed computing done, with the help of OTP, 2) how the Erlang language works, and 3) how Erlang or any other language is made comprehensible to BEAM (technically: I'm talking about BEAM as a compilation target). "stack" I'm an end-user! Hardware: the shifting sands Operating Systems: a plea for sanity Applications: I like to use it. Virtualisation: you mean we still have to use these mofos? (now: yes) designers falala! fala.. er.. developers FU !! finance / ops we R FU-ed. investors FFUU !!!!

Upload: yangjerng-hwa

Post on 09-Jul-2015

143 views

Category:

Technology


1 download

DESCRIPTION

2012 08-24 - talk at WebCamp KL

TRANSCRIPT

Page 1: ERTS diagram

what is a stack in IT? We're talking about a silo of dependencies, like a tower, with the end-user sitting at the top. Material science and E&E engineering are the foundations near the bottom, which give us hardware. Hardware is hardly a firm foundation... it changes very often, like the currents in the sea, and so we have software operating systems like Windows, Linux, and Android to deal with the frequent changes - giving us a longer term "ground floor", or "floating platform", upon which to base applications that provide value to end-users of IT. Most recently, distributed computing is becoming very trendy - and you must be aware of salespeak like "cloud computing" and "data centres". Unfortunately, although some significant work has been done, at the research level, on operating systems that tie many separate computers together, there are no dominant solutions in this space - no solutions that have a level of acceptance that is as widespread as any of the aforementioned operating systems. The most popular solutions are currently a thin layer, called hypervisors, that fit between operating systems designed to work on single computers, and the many separate computers that need to be made to work together. The operating system remains a "single machine" model, and so it almost forces all application development to implement its own way of dealing with the coordination of work that needs to be done among many computers. So let's say now, that you're a programmer who is quite flexible, and can learn to speak any language you like, and you need to write an application that runs on many computers at the same time - an easy example for this crowd is an application that needs to serve 100,000 complex webpages per second. You might ask yourself, "which is the hardware or software machine that already knows how to deal with most of the knitty-gritty of getting this job done?" Once you had an answer, you might then ask what language that software or hardware speaks, and then you would learn that language, so that you could communicate with the hardware or software that already knows how to get the job done. It turns out that Ericsson has continuously developed and maintained for over 20 years, a piece of software that is very clever at solving this problem problem, of getting many pieces of hardware to work together on performing a single job that helps some end-user. Ericsson calls this clever machine the "BEAM emulator", and it calls the framework for working with the BEAM emulator, "the Open Telecoms Platform (OTP)". So BEAM is this smart machine, and OTP is a pattern that BEAM uses to think about distributed computing problems. Now how do we humans talk to BEAM? We would normally use a language called Erlang. This is why it is valuable to learn Erlang. (Zooming out for a second, let's remember that languages don't do work - people, or machines, do work, and we only use languages to communicate with machines, as if they are little people.) The rest of today's talk is going to briefly cover three topics: 1) how BEAM gets distributed computing done, with the help of OTP, 2) how the Erlang language works, and 3) how Erlang or any other language is made comprehensible to BEAM (technically: I'm talking about BEAM as a compilation target).

"stack"I'm an end-user!

Hardware: the shifting sands

Operating Systems: a plea for sanity

Applications: I like to use it.

Virtualisation: you mean we still have to use these mofos? (now: yes)

designersfalala!

fala.. er..

developers

FU !!

finance / ops

we R FU-ed.

investors

FFUU !!!!

Page 2: ERTS diagram

what is a stack in IT? We're talking about a silo of dependencies, like a tower, with the end-user sitting at the top. Material science and E&E engineering are the foundations near the bottom, which give us hardware. Hardware is hardly a firm foundation... it changes very often, like the currents in the sea, and so we have software operating systems like Windows, Linux, and Android to deal with the frequent changes - giving us a longer term "ground floor", or "floating platform", upon which to base applications that provide value to end-users of IT. Most recently, distributed computing is becoming very trendy - and you must be aware of salespeak like "cloud computing" and "data centres". Unfortunately, although some significant work has been done, at the research level, on operating systems that tie many separate computers together, there are no dominant solutions in this space - no solutions that have a level of acceptance that is as widespread as any of the aforementioned operating systems. The most popular solutions are currently a thin layer, called hypervisors, that fit between operating systems designed to work on single computers, and the many separate computers that need to be made to work together. The operating system remains a "single machine" model, and so it almost forces all application development to implement its own way of dealing with the coordination of work that needs to be done among many computers. So let's say now, that you're a programmer who is quite flexible, and can learn to speak any language you like, and you need to write an application that runs on many computers at the same time - an easy example for this crowd is an application that needs to serve 100,000 complex webpages per second. You might ask yourself, "which is the hardware or software machine that already knows how to deal with most of the knitty-gritty of getting this job done?" Once you had an answer, you might then ask what language that software or hardware speaks, and then you would learn that language, so that you could communicate with the hardware or software that already knows how to get the job done. It turns out that Ericsson has continuously developed and maintained for over 20 years, a piece of software that is very clever at solving this problem problem, of getting many pieces of hardware to work together on performing a single job that helps some end-user. Ericsson calls this clever machine the "BEAM emulator", and it calls the framework for working with the BEAM emulator, "the Open Telecoms Platform (OTP)". So BEAM is this smart machine, and OTP is a pattern that BEAM uses to think about distributed computing problems. Now how do we humans talk to BEAM? We would normally use a language called Erlang. This is why it is valuable to learn Erlang. (Zooming out for a second, let's remember that languages don't do work - people, or machines, do work, and we only use languages to communicate with machines, as if they are little people.) The rest of today's talk is going to briefly cover three topics: 1) how BEAM gets distributed computing done, with the help of OTP, 2) how the Erlang language works, and 3) how Erlang or any other language is made comprehensible to BEAM (technically: I'm talking about BEAM as a compilation target).

re-stack?I'm an end-user!

Applications: I like to use it.

A magic box, please?(your fav. language / platform)

Page 3: ERTS diagram

what is a stack in IT? We're talking about a silo of dependencies, like a tower, with the end-user sitting at the top. Material science and E&E engineering are the foundations near the bottom, which give us hardware. Hardware is hardly a firm foundation... it changes very often, like the currents in the sea, and so we have software operating systems like Windows, Linux, and Android to deal with the frequent changes - giving us a longer term "ground floor", or "floating platform", upon which to base applications that provide value to end-users of IT. Most recently, distributed computing is becoming very trendy - and you must be aware of salespeak like "cloud computing" and "data centres". Unfortunately, although some significant work has been done, at the research level, on operating systems that tie many separate computers together, there are no dominant solutions in this space - no solutions that have a level of acceptance that is as widespread as any of the aforementioned operating systems. The most popular solutions are currently a thin layer, called hypervisors, that fit between operating systems designed to work on single computers, and the many separate computers that need to be made to work together. The operating system remains a "single machine" model, and so it almost forces all application development to implement its own way of dealing with the coordination of work that needs to be done among many computers. So let's say now, that you're a programmer who is quite flexible, and can learn to speak any language you like, and you need to write an application that runs on many computers at the same time - an easy example for this crowd is an application that needs to serve 100,000 complex webpages per second. You might ask yourself, "which is the hardware or software machine that already knows how to deal with most of the knitty-gritty of getting this job done?" Once you had an answer, you might then ask what language that software or hardware speaks, and then you would learn that language, so that you could communicate with the hardware or software that already knows how to get the job done. It turns out that Ericsson has continuously developed and maintained for over 20 years, a piece of software that is very clever at solving this problem problem, of getting many pieces of hardware to work together on performing a single job that helps some end-user. Ericsson calls this clever machine the "BEAM emulator", and it calls the framework for working with the BEAM emulator, "the Open Telecoms Platform (OTP)". So BEAM is this smart machine, and OTP is a pattern that BEAM uses to think about distributed computing problems. Now how do we humans talk to BEAM? We would normally use a language called Erlang. This is why it is valuable to learn Erlang. (Zooming out for a second, let's remember that languages don't do work - people, or machines, do work, and we only use languages to communicate with machines, as if they are little people.) The rest of today's talk is going to briefly cover three topics: 1) how BEAM gets distributed computing done, with the help of OTP, 2) how the Erlang language works, and 3) how Erlang or any other language is made comprehensible to BEAM (technically: I'm talking about BEAM as a compilation target).

Erlang/OTP?

I'm an end-user!

Applications: I like to use it.

A candidate for something that doesn't exist yet?

Page 4: ERTS diagram

OS: Hardware Abstraction

(logical cores)

(RAM address space)

a=>4b=>?c=>7etc.HEAP

|v| |s|<=

|f||k||_|

STACK

An OS Process assigned this

space...

Other OS Processes...

(hardware cores)

(hardware RAM)

1. An Erlang "Node a.k.a. Emulator a.k.a. VM" runs as an OS process.

2. Internally it can manage '00s, '000s, or '000,000s of Erlang processes.

3. Erlangprocessesare spawnedjust by passingfunctions into spawn().

4. Each Erlangprocess hasits ownheap andstack (implemented in the OS process's heap).

5. Erlang processes can talk to each other via message passing (messages are copied between Erlang processes' mailboxes).

6. Erlang processes can be organisedinto supervision trees which providenodes with a framework for communalliving and dying (very biological).

7. Architectural difference from e.g.

Node.js:

Preemptive Multitasking of Erlang Processes(time sharing)

One Scheduler per logical core

Other Erlang nodes... Other OS processes...

Page 5: ERTS diagram

-module(dummy_module).-compile(native).-compile(export_all).

dummy_function()-> A = lala, B = "lala", C = [$l,97,$l,$a], D = <<"lala">>, E = <<$l,$a,$l,97>>,

F = 437947750258039476268479386, G = (3.142 + 2), % 5.142 H = "la" ++ [$l] ++ [97], % "lala" [1,2,3] == [ 1 | [2,3] ], % true I = {a,b,{d,e}}, J = lists:reverse([1,2,3]), % [3,2,1] K = fun()->{ok,[A,B,G]} end, K.

% Then elsewhere, running % X = dummy_module:dummy_function(),% X().% would return% {ok, [lala,"lala",5.14199999... 5]}

% declares the module name% targets MIs instead of byte codes% declaring the interface

%function_name()->% VariableA = atom,% VariableB = "string",% VariableC = ditto, with less sugar,% VariableD = <<"binary">>,% VariableE = ditto, with less sugar,

% VariableF = integer of any size,% Dynamic typing: floats + ints% Dynamic typing: strings% list [ head | tail ] syntax

% VariableI = 3-tuple% VariableJ = module:function() call% VariableK = 1st-class fn, closure% dummy_function() can return the closure

% . ends dummy_function's declaration

language layer; basic data types, and syntax - don't try to remember it right now

POPULAR: "patten matching" basically means that you can use wild-cards in switch-cases, even for calling different branches of a function.

Page 6: ERTS diagram

compilation : script, to abstract format... which goes into the module "compile"

Erlang has "leex" and "yecc" which do similar work, and output "Abstract Formatted Erlang", which is fairly regular/readable, and probably a good compilation target for beginners (like me).

There's also a lower-level format called Core Erlang.

Page 7: ERTS diagram

compilation : byte code and/or native code... what comes out of the "compile" module

LLVM backend is experimental, but FYI :)

Page 8: ERTS diagram

Further reading (top picks):

- http://LearnYouSomeErlang.com (easy, prosaic, illustrated intro to everything)

Search for it (or I can Dropbox the PDFs to you):

- Vorreuter's presentation on Hacking Erlang (on compiling custom/other languages to Erlang's Abstract Format)

- Zhang's 2011 thesis: Erlang on many cores (for a readable description of BEAM internals)

- Erl-llvm thesis with the Greek cover page (more on lower-level & native compilation)

- A study of ETS table implementation and performance (alternativs to Erlang's built-in in-memory key-value-store architecture/algorithms)

- Erlang-embedded whitepaper (on getting a 3MB disk footprint, 16MB runtime)