the haxl project at facebook - haskell · the haxl project at facebook simon marlow jon coens louis...
TRANSCRIPT
![Page 1: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/1.jpg)
The Haxl Project at Facebook
Simon MarlowJon Coens
Louis BrandyJon Purdy& others
![Page 2: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/2.jpg)
BusinessLogic
service API
Databases
Other back-end services
![Page 3: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/3.jpg)
Use case: fighting spam
BusinessLogic
Databases
Other back-end services
www (PHP)
Is this thing spam?
YES/NO
![Page 4: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/4.jpg)
Use case: fighting spam
BusinessLogic
Databases
Other back-end services
www (PHP)
Site-integrity engineers
push new rules hundreds of times per day
![Page 5: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/5.jpg)
Data dependencies in a computation database
thrift
memcache
![Page 6: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/6.jpg)
Code wants to be structured hierarchically
• abstraction• modularity
database
thrift
memcache
![Page 7: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/7.jpg)
Code wants to be structured hierarchically
• abstraction• modularity
database
thrift
memcache
![Page 8: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/8.jpg)
Code wants to be structured hierarchically
• abstraction• modularity
database
thrift
memcache
![Page 9: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/9.jpg)
Code wants to be structured hierarchically
• abstraction• modularity
database
thrift
memcache
![Page 10: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/10.jpg)
Execution wants to be structured horizontally
• Overlap multiple requests• Batch requests to the same data source• Cache multiple requests for the same data
database
thrift
memcache
![Page 11: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/11.jpg)
• Furthermore, each data source has different characteristics• Batch request API?
• Sync or async API?
• Set up a new connection for each request, or keep a pool of connections around?
• Want to abstract away from all of this in the business logic layer
![Page 12: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/12.jpg)
But we know how to do this!
![Page 13: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/13.jpg)
But we know how to do this!
• Concurrency.• Threads let us keep our abstractions & modularity while
executing things at the same time.
• Caching/batching can be implemented as a service in the process• as we do with the IO manager in GHC
![Page 14: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/14.jpg)
But we know how to do this!
• Concurrency.• Threads let us keep our abstractions & modularity while
executing things at the same time.
• Caching/batching can be implemented as a service in the process• as we do with the IO manager in GHC
• But concurrency (the programing model) isn’t what we want here.
![Page 15: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/15.jpg)
But we know how to do this!
• Concurrency.• Threads let us keep our abstractions & modularity while
executing things at the same time.
• Caching/batching can be implemented as a service in the process• as we do with the IO manager in GHC
• But concurrency (the programing model) isn’t what we want here.
• Example...
![Page 16: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/16.jpg)
• x and y are Facebook users
• suppose we want to compute the number of friends that x and y have in common
• simplest way to write this:
length (intersect (friendsOf x) (friendsOf y))
![Page 17: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/17.jpg)
Brief detour: TAO• TAO implements Facebook’s data model
• most important data source we need to deal with
• Data is a graph• Nodes are “objects”, identified by 64-bit ID• Edges are “assocs” (directed; a pair of 64-bit IDs)
• Objects and assocs have a type• object fields determined by the type
• Basic operations:• Get the object with a given ID• Get the assocs of a given type from a given ID
User A
User B
User C
User D
FRIENDS
![Page 18: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/18.jpg)
• Back to our example
length (intersect (friendsOf x) (friendsOf y))
• (friendsOf x) makes a request to TAO to get all the IDs for which there is an assoc of type FRIEND (x,_).
• TAO has a multi-get API; very important that we submit (friendsOf x) and (friendsOf y) as a single operation.
![Page 19: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/19.jpg)
Using concurrency
• This:
length (intersect (friendsOf x) (friendsOf y))
![Page 20: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/20.jpg)
Using concurrency
• This:
• Becomes this:
length (intersect (friendsOf x) (friendsOf y))
dom1 <- newEmptyMVarm2 <- newEmptyMVarforkIO (friendsOf x >>= putMVar m1)forkIO (friendsOf y >>= putMVar m2)fx <- takeMVar m1fy <- takeMVar m2return (length (intersect fx fy))
![Page 21: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/21.jpg)
![Page 22: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/22.jpg)
• Using the async package:
doax <- async (friendsOf x)ay <- async (friendsOf y)fx <- wait axfy <- wait ayreturn (length (intersect fx fy))
![Page 23: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/23.jpg)
![Page 24: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/24.jpg)
• Using Control.Concurrent.Async.concurrently:
do(fx,fy) <- concurrently (friendsOf x) (friendsOf y)return (length (intersect fx fy))
![Page 25: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/25.jpg)
![Page 26: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/26.jpg)
Why not concurrency?
• friendsOf x and friendsOf y are • obviously independent• obviously both needed• “pure”
![Page 27: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/27.jpg)
Why not concurrency?
• friendsOf x and friendsOf y are • obviously independent• obviously both needed• “pure”
• Caching is not just an optimisation:• if friendsOf x is requested twice, we must get the same
answer both times• caching is a requirement
![Page 28: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/28.jpg)
Why not concurrency?
• friendsOf x and friendsOf y are • obviously independent• obviously both needed• “pure”
• Caching is not just an optimisation:• if friendsOf x is requested twice, we must get the same
answer both times• caching is a requirement
• we don’t want the programmer to have to ask for concurrency here
![Page 29: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/29.jpg)
• Could we use unsafePerformIO?
• we could do caching this way, but not concurrency. Execution will stop at the first data fetch.
length (intersect (friendsOf x) (friendsOf y))
friendsOf = unsafePerformIO ( .. )
![Page 30: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/30.jpg)
Central problem
• Reorder execution of an expression to perform data fetching optimally.
• The programming model has no side effects (other than reading)
![Page 31: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/31.jpg)
What we would like to do:
• explore the expression along all branches to get a set of data fetches
![Page 32: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/32.jpg)
What we would like to do:
• submit the data fetches
![Page 33: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/33.jpg)
What we would like to do:
• wait for the responses
![Page 34: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/34.jpg)
What we would like to do:
• now the computation is unblocked along multiple paths• ... explore again• collect the next batch of data fetches• and so on
Round 0
Round 1
Round 2
![Page 35: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/35.jpg)
• Facebook’s existing solution to this problem: FXL
• Lets you write
• And optimises the data fetching correctly.
• But it’s an interpreter, and works with an explicit representation of the computation graph.
Length(Intersect(FriendsOf(X),FriendsOf(Y)))
![Page 36: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/36.jpg)
• We want to run compiled code for efficiency
• And take advantage of Haskell• high quality implementation
• great libraries for writing business logic etc.
• So, how can we implement the right data fetching behaviour in a Haskell DSL?
![Page 37: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/37.jpg)
Start with a concurrency monad
newtype Haxl a = Haxl { unHaxl :: Result a }
data Result a = Done a| Blocked (Haxl a)
instance Monad Haxl wherereturn a = Haxl (Done a)m >>= k = Haxl $case unHaxl m of
Done a -> unHaxl (k a)Blocked r -> Blocked (r >>= k)
![Page 38: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/38.jpg)
• Old idea:• Scholz 1995 A Concurrency Monad Based on Constructor
Primitives
• Claessen’s A Poor Man’s Concurrency Monad (JFP 1999)
• Called the “Resumption Monad” in Harrison’s Cheap (But Functional) Threads (JFP 2004)
• Used to good effect in Li & Zdancewic’s PLDI’07 paper Combining events and threads for scalable network services implementation and evaluation of monadic, application-level concurrency primitives
• Nowadays it’s a Free Monad
![Page 39: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/39.jpg)
• The concurrency monad lets us run a computation until it blocks, do something, then resume it
• But we need to know what it blocked on...
• Could add some info to the Blocked constructor
![Page 40: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/40.jpg)
newtype Haxl a = Haxl { unHaxl :: IO (Result a) }
data Result a = Done a| Blocked (Haxl a)
instance Monad Haxl wherereturn a = Haxl (return (Done a))m >>= k = Haxl $ doa <- unHaxl mcase a of
Done a -> unHaxl (k a)Blocked r -> return (Blocked (r >>= k))
dataFetch :: Request a -> Haxl adataFetch r = doaddRequest rHaxl (return (Blocked (Haxl (getResult r))))
• We choose to put it in IO instead:
![Page 41: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/41.jpg)
• Ok so far, but we still get blocked at the first data fetch.
numCommonFriends x y = dofx <- friendsOf xfy <- friendsOf yreturn (length (intersect fx fy))
Blocked here
![Page 42: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/42.jpg)
• To explore multiple branches, we need to use Applicative
instance Applicative Haxl wherepure = returnHaxl f <*> Haxl a = Haxl $ dor <- fcase r ofDone f' -> dora <- a case ra ofDone a' -> return (Done (f' a'))Blocked a' -> return (Blocked (f' <$> a'))
Blocked f' -> dora <- acase ra ofDone a' -> return (Blocked (f' <*> return a'))Blocked a' -> return (Blocked (f' <*> a'))
![Page 43: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/43.jpg)
• This is precisely the advantage of Applicative over Monad: • Applicative allows exploration of the structure of the
computation
• Our example is now written:
• Or:
numCommonFriends x y = length <$> (intersect <$> friendsOf x <*> friendsOf y)
numCommonFriends x y = length <$> common (friendsOf x) (friendsOf y)where common = liftA2 intersect
![Page 44: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/44.jpg)
• Note that we still have the Monad!
• The Monad allows us to make decisions based on values when we need to.
• Batching will not explore the then/else branches• exactly what we want.
dofs <- friendsOf xif simon `elem` fs
then ...else ...
Blocked here
![Page 45: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/45.jpg)
• But it does mean the programmer should use Applicative composition to get batching.
• This is suboptimal:
• So our plan is to guide programmers away from this by using a combination of:• APIs that discourage it (e.g. providing lifted operations)
• Code analysis and automatic feedback
• Profiling
dofx <- friendsOf xfy <- friendsOf yreturn (length (intersect fx fy))
![Page 46: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/46.jpg)
• We really want bulk operations to benefit from batching.
• But this doesn’t work: mapM uses Monad rather than Applicative composition.
• This is why traverse exists:
• So in our library, we make mapM = traverse
• Also: sequence = sequenceA
traverse :: (Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b)
friendsOfFriends id = concat <$> (mapM friendsOf =<< friendsOf id)
![Page 47: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/47.jpg)
---- Return the names of all the friends of the user that are-- members of the Functional Programming group.--fpFriends :: Id -> Haxl [Text]fpFriends id = do
fs <- friendsOf idfp <- filterM (memberOfGroup functionalProgramming) fsmapM getName fp
![Page 48: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/48.jpg)
---- Return the names of all the friends of the user that are-- members of the Functional Programming group.--fpFriends :: Id -> Haxl [Text]fpFriends id = do
fs <- friendsOf idfp <- filterM (memberOfGroup functionalProgramming) fsmapM getName fp
friendsOf
memberOfGroup
getName
![Page 49: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/49.jpg)
•But we need a different filterM. The standard one:
filterM :: (Monad m) => (a -> m Bool) -> [a] -> m [a]filterM _ [] = return []filterM p (x:xs) = do
flg <- p xys <- filterM p xsreturn (if flg then x:ys else ys)
![Page 50: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/50.jpg)
•But we need a different filterM. The standard one:
•and the one that batches the data fetches correctly:
filterM :: (Applicative f, Monad f) => (a -> f Bool) -> [a] -> f [a]filterM pred xs = dobools <- mapM pred xsreturn [ x | (x,True) <- zip xs bools ]
filterM :: (Monad m) => (a -> m Bool) -> [a] -> m [a]filterM _ [] = return []filterM p (x:xs) = do
flg <- p xys <- filterM p xsreturn (if flg then x:ys else ys)
![Page 51: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/51.jpg)
• Let’s write the same example in a slightly different way:
nameOfFPFriend :: Id -> Haxl (Maybe Text)nameOfFPFriend id = dob <- memberOfGroup functionalProgramming idif b then Just <$> getName id
else return Nothing
fpFriends2 :: Id -> Haxl [Text]fpFriends2 id = dofs <- friendsOf idname_maybes <- mapM nameOfFPFriend fsreturn (catMaybes name_maybes)
![Page 52: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/52.jpg)
friendsOf
memberOfGroup
getName
•Before there were obviously 3 rounds.
![Page 53: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/53.jpg)
friendsOf
memberOfGroup
getName
•Before there were obviously 3 rounds.•But in the new version, we packaged up two fetches in nameOfFPFriend
nameOfFPFriend
![Page 54: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/54.jpg)
friendsOf
memberOfGroup
getName
•Before there were obviously 3 rounds.•But in the new version, we packaged up two fetches in nameOfFPFriend•The monad still executes everything in 3 rounds.
nameOfFPFriend
![Page 55: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/55.jpg)
•But maybe we don’t like all this mapM/filterM stuff
![Page 56: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/56.jpg)
•But maybe we don’t like all this mapM/filterM stuff•A query-style API might be nicer.
![Page 57: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/57.jpg)
•But maybe we don’t like all this mapM/filterM stuff•A query-style API might be nicer.
newtype HaxlQuery a = HaxlQuery { query :: Haxl [a] }-- basically ListT Haxl, but ListT doesn’t work!-- instances of Monad, MonadPlus
selectFrom :: Haxl [a] -> HaxlQuery aselectFrom = HaxlQuery
liftH :: Haxl a -> HaxlQuery aliftH m = HaxlQuery (return <$> m)
suchThat :: Haxl Bool -> HaxlQuery ()suchThat m = liftH m >>= guard
![Page 58: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/58.jpg)
•But maybe we don’t like all this mapM/filterM stuff•A query-style API might be nicer.
fpFriends3 :: Id -> Haxl [Text]fpFriends3 id = query [ name | f <- selectFrom (friendsOf id),
_ <- suchThat (memberOfFPGroup f)name <- liftH (getName f) ]
newtype HaxlQuery a = HaxlQuery { query :: Haxl [a] }-- basically ListT Haxl, but ListT doesn’t work!-- instances of Monad, MonadPlus
selectFrom :: Haxl [a] -> HaxlQuery aselectFrom = HaxlQuery
liftH :: Haxl a -> HaxlQuery aliftH m = HaxlQuery (return <$> m)
suchThat :: Haxl Bool -> HaxlQuery ()suchThat m = liftH m >>= guard
![Page 59: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/59.jpg)
DSL tricks
instance Num a => Num (Haxl a) where(+) = liftA2 (+)(-) = liftA2 (-)-- etc.
simonFriends id =filterM (isCalled “Simon”) =<< friendsOf id
numCoolFriends id = length <$> fpFriends id + length <$> simonFriends id
duplicate data fetching between the two arguments
![Page 60: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/60.jpg)
Implementation
• DataSource abstraction
• Replaying requests
• Scaling
• Hot-code swapping
• Experience
• Status etc.
![Page 61: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/61.jpg)
Data Source Abstraction
• We want to structure the system like this:
• Core code includes the monad, caching support etc.
• Core is generic: no data sources built-in
Core
TAO
Memcache
other service...
Data sources
![Page 62: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/62.jpg)
How do we arrange this?
• Three ways that a data source interacts with core:• issuing a data fetch request
• persistent state
• fetching the data
• Package this up in a type class
• Let’s look at requests first...
class DataSource req where...
parameterised over the type of
requests
![Page 63: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/63.jpg)
Example Request type
• Core has a single way to issue a request
• Note how the result type matches up.
data ExampleReq a whereCountAardvarks :: String -> ExampleReq IntListWombats :: Id -> ExampleReq [Id]deriving Typeable
it’s a GADT, where the type parameter is the type of the result of this request
dataFetch :: DataSource req => req a -> Haxl a
![Page 64: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/64.jpg)
Fetching requests
• It is Core’s job to keep track of requests submitted via dataFetch
• When the computation is blocked, we have to fetch the batch of requests from the data source
class DataSource req wherefetch :: [BlockedFetch req] -> IO ()
data BlockedFetch req= forall a . BlockedFetch (req a) (MVar a)
![Page 65: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/65.jpg)
How do we store the requests?
• Core doesn’t know which data sources are in use
• Need to use dynamic typing (Typeable) to implement the store of pending requests and the cache.
class (Typeable1 req, Hashable1 req, Eq1 req) => DataSource req where ...
class Eq1 req whereeq1 :: req a -> req a -> Bool
unfortunate that we need this
![Page 66: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/66.jpg)
Data Source State
• Core keeps track of the state for each data source
• Passes it to fetch
class (Typeable1 req, Hashable1 req, Eq1 req) => DataSource req where
data DataState req
fetch :: DataState req-> [BlockedFetch req]-> IO ()
The state for a data source is an associated datatype
![Page 67: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/67.jpg)
• Clean data source abstraction
• Means that we can plug in any set of data sources at runtime• e.g. mock data sources for testing
• core code can be built & tested independently
![Page 68: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/68.jpg)
Replayability
• The Haxl monad and the type system give us:• Guarantee of no side effects, except via dataFetch
• Guarantee that everything is cached
• The ability to replay requests...
![Page 69: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/69.jpg)
user code
![Page 70: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/70.jpg)
user code Haxl Core
![Page 71: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/71.jpg)
user code Haxl Core data sources
![Page 72: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/72.jpg)
user code Haxl Core data sources
Cache
![Page 73: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/73.jpg)
user code Haxl Core data sources
Cache
![Page 74: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/74.jpg)
user code Haxl Core data sources
Cache
![Page 75: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/75.jpg)
user code Haxl Core data sources
Cache
![Page 76: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/76.jpg)
user code Haxl Core data sources
Cache
• The data sources change over time
• But if we persist the cache, we can re-run the user code and get the same results
• Great for • testing• fault diagnosis• profiling
![Page 77: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/77.jpg)
Scaling• Each server has lots of cores, pounded by requests
from other boxes constantly.
![Page 78: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/78.jpg)
Hot code swapping
• 1-2K machines, new code pushed many times per day
• Use GHC’s built-in linker• Had to modify it to unload code
• GC detects when it is safe to release old code
• We can swap in new code while requests are still running on the old code
![Page 79: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/79.jpg)
Haskell Extensions?
![Page 80: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/80.jpg)
Haskell Extensions?
• Other people on the team are Haskell beginners
![Page 81: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/81.jpg)
Haskell Extensions?
• Other people on the team are Haskell beginners
• Goal: keep the code simple, use few extensions
![Page 82: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/82.jpg)
Haskell Extensions?
• Other people on the team are Haskell beginners
• Goal: keep the code simple, use few extensions
{-# LANGUAGE ExistentialQuantification,TypeFamilies,GADTs,RankNTypes,ScopedTypeVariables,DeriveDataTypeable,StandaloneDeriving,MultiParamTypeClasses,FlexibleInstances
#-}
![Page 83: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/83.jpg)
Haskell Extensions?
• Other people on the team are Haskell beginners
• Goal: keep the code simple, use few extensions
• But we’re doing ok.
{-# LANGUAGE ExistentialQuantification,TypeFamilies,GADTs,RankNTypes,ScopedTypeVariables,DeriveDataTypeable,StandaloneDeriving,MultiParamTypeClasses,FlexibleInstances
#-}
![Page 84: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/84.jpg)
Status
• Prototyped most features (including hot code swapping & scaling)
• Core is written
• We have a few data sources, more in the works
• Busy hooking it up to the infrastructure
• Can play around with the system in GHCi, including data sources
![Page 85: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/85.jpg)
Key points
• Haxl Monad:• Implicit concurrency with Applicative
• Automatic batching of data fetches
• Haskell’s type system guarantees the user can’t break the rules• ... and gives us guaranteed replayability
• Clean separation of core from datasources
• Scaling & hot-code swapping
![Page 86: The Haxl Project at Facebook - Haskell · The Haxl Project at Facebook Simon Marlow Jon Coens Louis Brandy Jon Purdy & others. Business Logic service API Databases Other back-end](https://reader031.vdocuments.net/reader031/viewer/2022041301/5e118f3852f4b57499074a1b/html5/thumbnails/86.jpg)
Questions?