NEScala 2016 Roundup
About
● A media company - Creative Content Everyone Can Afford
● Since 2011, we have saved our customers more than $5B
● We reached 1M clips of inventory + $1M marketplace sales faster than any company in history
● 15 engineers (60+ employees in total)
About Me
● Used Scala for about 5 years
● Not using Scala day to day for the past 9 months
● Scala developer, not sorcerer
● Data Handyman @ Videoblocks - We Heart Data
● codingphilosophy.com
NEScala 2016
● On 3/4 - 3/5
● Held since 2011
● In Philadelphia this year (last year in Boston)
● 3/4: conference - 1 track, 11 talks
● 3/5: unconference - 2-5 tracks, 26 talks
● Pretty academic, as usual
● $60 - an order of magnitude cheaper, and better
You have to be there!
Scalaz-Stream Masterclass
● https://github.com/runarorama/ircz
● monad is just a fancy way of saying "accumulate the effects"
● monoid is just a fancy way of saying "collect the elements"
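To make the "collect the elements" reading concrete, here is a minimal sketch of a monoid in plain Scala. The Monoid trait, its instances, and collectAll are hand-rolled for illustration; they are not the scalaz definitions used in the talk.

```scala
// A minimal, hand-rolled Monoid (an assumption for illustration, not scalaz's).
trait Monoid[A] {
  def empty: A
  def combine(x: A, y: A): A
}

object Monoid {
  // Integers under addition.
  implicit val intAddition: Monoid[Int] = new Monoid[Int] {
    def empty: Int = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  // Lists under concatenation.
  implicit def listConcat[T]: Monoid[List[T]] = new Monoid[List[T]] {
    def empty: List[T] = Nil
    def combine(x: List[T], y: List[T]): List[T] = x ++ y
  }

  // "Collect the elements": fold a list down with whichever Monoid is in scope.
  def collectAll[A](xs: List[A])(implicit m: Monoid[A]): A =
    xs.foldLeft(m.empty)(m.combine)
}
```

The same collectAll works for sums and for concatenation; only the instance picked up by the compiler changes.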
Spark RDD
● Compile-time type safe
● API is based on Scala collections (good and bad)
def collect(): Array[T]
def collect[U](f: PartialFunction[T,U])(implicit arg0: ClassTag[U]): RDD[U]
val rdd = sc.textFile("hdfs:/user/bmc/wikipedia-pagecounts.gz")
// Keep only well-formed lines; convert the count so reduceByKey adds numbers, not strings.
val parsedRDD = rdd.flatMap { line =>
  line.split("""\s+""") match {
    case Array(project, _, numRequests, _) => Some((project, numRequests.toLong))
    case _ => None
  }
}
parsedRDD.filter { case (project, numRequests) => project == "en" }.
  reduceByKey(_ + _).
  take(100).
  foreach { case (project, requests) => println(s"$project: $requests") }
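Because the RDD API mirrors Scala collections, the same pipeline shape can be tried on a plain List without a cluster. This is only a sketch: the sample lines are made up, and since collections have no reduceByKey, groupBy plus sum stands in for it.

```scala
object PageCounts {
  // Made-up sample lines in a "project page numRequests bytes" layout (an assumption).
  val lines: List[String] = List(
    "en Main_Page 42 1000",
    "en Scala 8 200",
    "fr Accueil 5 300",
    "malformed line"
  )

  // Same shape as the RDD flatMap: keep well-formed lines, drop the rest.
  val parsed: List[(String, Long)] = lines.flatMap { line =>
    line.split("""\s+""") match {
      case Array(project, _, numRequests, _) => Some((project, numRequests.toLong))
      case _ => None
    }
  }

  // Collections have no reduceByKey; groupBy plus sum plays that role here.
  val totals: Map[String, Long] =
    parsed.groupBy(_._1).map { case (project, rows) => project -> rows.map(_._2).sum }
}
```

The "bad" side of the shared API shows up here too: names such as collect mean different things on an RDD than on a List.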
Spark DataFrame
● DSL
● Optimized by Catalyst
● Loses type safety
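import sqlContext.implicits._              // toDF and the $"..." column syntax
import org.apache.spark.sql.functions.sum

val df = parsedRDD.toDF("project", "numRequests")
// Group by a column that exists in df; a misspelled name would only fail at runtime.
df.groupBy($"project").
  agg(sum($"numRequests").as("count")).
  limit(100).
  show(100)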
scala> :type df.collect()
Array[org.apache.spark.sql.Row]
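The cost of getting untyped rows back can be sketched without Spark. Below, UntypedRow is a hypothetical stand-in for a row addressed by column name, and PageCount is a made-up case class; neither is a Spark type.

```scala
import scala.util.Try

object RowVsCaseClass {
  // Hypothetical stand-in for an untyped row: column name -> value of any type.
  type UntypedRow = Map[String, Any]
  val row: UntypedRow = Map("project" -> "en", "numRequests" -> 42L)

  // A misspelled column name compiles fine and only fails when it runs.
  val badLookup: Try[Any] = Try(row("numRequest"))

  // With a case class, the same typo would be rejected by the compiler.
  final case class PageCount(project: String, numRequests: Long)
  val typed: PageCount = PageCount("en", 42L)
  val requests: Long = typed.numRequests
}
```

This is the trade-off the slides describe: the untyped form is flexible enough for a DSL and an optimizer, but mistakes move from compile time to run time.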
Spark Dataset
● DSL + RDD API
● Type safety
● Optimized
● Uses Tungsten: faster (20x) and smaller (4x)
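import org.apache.spark.sql.Dataset
import sqlContext.implicits._   // encoders needed by df.as[Person]

val df = sqlContext.read.json("people.json")
case class Person(name: String, age: Long)
val ds: Dataset[Person] = df.as[Person]
ds.groupBy(_.name).count()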
experimental
Macros for Mortals
● illTyped from Shapeless
● AST is pretty hard to deal with
● Quasiquotes: val tree = q"class C"
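A small sketch of why quasiquotes help. It assumes Scala 2 with scala-reflect on the classpath (quasiquotes are a Scala 2 feature), and uses the runtime universe instead of a macro context so it can be tried in a REPL. QuasiquoteSketch and className are names made up for this example.

```scala
// Assumes Scala 2 with scala-reflect available; not valid in Scala 3.
import scala.reflect.runtime.universe._

object QuasiquoteSketch {
  // The quasiquote from the slide builds the AST for an empty class named C.
  val tree: Tree = q"class C"

  // Taking the same tree apart by hand means matching on raw AST nodes.
  val className: String = tree match {
    case ClassDef(_, name, _, _) => name.toString
    case _                       => "<not a class>"
  }
}
```

Building the tree is one short line, while inspecting it already requires knowing the shape of ClassDef, which is the "AST is pretty hard to deal with" point.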
Logic Programming in Scala
https://github.com/stewSquared/ukanren
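For a feel of what a microKanren-style core looks like, here is a minimal sketch in plain Scala. It is written independently for this roundup and is not the ukanren repository's API: every name is an assumption, terms are limited to variables and constants, and results are strict Lists, whereas a real implementation would use lazy streams to handle infinite answers.

```scala
object MicroKanren {
  sealed trait Term
  final case class LVar(id: Int) extends Term      // logic variable
  final case class Const(value: Any) extends Term  // ground value

  type Subst = Map[LVar, Term]
  final case class State(subst: Subst, nextId: Int)
  // A goal maps one state to every state in which the goal holds.
  type Goal = State => List[State]

  // Follow bindings until reaching a constant or an unbound variable.
  @annotation.tailrec
  def walk(t: Term, s: Subst): Term = t match {
    case v: LVar if s.contains(v) => walk(s(v), s)
    case other                    => other
  }

  // Extend the substitution so that a and b become equal, if possible.
  def unify(a: Term, b: Term, s: Subst): Option[Subst] =
    (walk(a, s), walk(b, s)) match {
      case (x, y) if x == y => Some(s)
      case (v: LVar, y)     => Some(s + (v -> y))
      case (x, v: LVar)     => Some(s + (v -> x))
      case _                => None
    }

  def eqv(a: Term, b: Term): Goal =
    st => unify(a, b, st.subst).map(s => st.copy(subst = s)).toList

  // Introduce a new variable that no earlier goal has seen.
  def fresh(f: LVar => Goal): Goal =
    st => f(LVar(st.nextId))(st.copy(nextId = st.nextId + 1))

  def disj(g1: Goal, g2: Goal): Goal = st => g1(st) ++ g2(st)     // logical or
  def conj(g1: Goal, g2: Goal): Goal = st => g1(st).flatMap(g2)   // logical and

  // Run a goal over one query variable and report what it can be bound to.
  def run(f: LVar => Goal): List[Term] = {
    val q = LVar(0)
    f(q)(State(Map.empty, 1)).map(st => walk(q, st.subst))
  }
}
```

For example, run(q => disj(eqv(q, Const(1)), eqv(q, Const(2)))) yields both constants, while the same two goals under conj yield no answers because they contradict each other.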
Monadic Logging?
● Monad to accumulate effects (log)
● Write the logs of the same thread together
● Decide with logic whether to write the log
○ if (latency > 10ms) { mlog.toStdOut }
● Maybe too complicated, and defeats the purpose of logging?
○ Async + JVM crash: how do we know what's going on?
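The idea can be sketched with a hand-rolled Writer-style type. Logged, tell, handle, and linesToEmit are hypothetical names for this example, not the API shown in the talk, and the latency is passed in as a parameter to keep the sketch deterministic.

```scala
// A value plus the log lines produced while computing it (hand-rolled, for illustration).
final case class Logged[A](value: A, log: Vector[String]) {
  def map[B](f: A => B): Logged[B] = Logged(f(value), log)
  def flatMap[B](f: A => Logged[B]): Logged[B] = {
    val next = f(value)
    Logged(next.value, log ++ next.log) // accumulate the effect: append the log
  }
}

object Logged {
  def pure[A](a: A): Logged[A] = Logged(a, Vector.empty)
  def tell(line: String): Logged[Unit] = Logged((), Vector(line))
}

object Request {
  // Hypothetical handler: log lines travel with the result instead of being printed.
  def handle(latencyMs: Long): Logged[String] =
    for {
      _ <- Logged.tell("request received")
      _ <- Logged.tell(s"handled in ${latencyMs}ms")
    } yield "ok"

  // Decide afterwards, with ordinary logic, whether the log is worth emitting.
  def linesToEmit(latencyMs: Long, thresholdMs: Long = 10): Vector[String] = {
    val result = handle(latencyMs)
    if (latencyMs > thresholdMs) result.log else Vector.empty
  }
}
```

The sketch also shows the concern raised on the slide: nothing is written until the computation finishes, so a crash in the middle loses the accumulated lines.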
CBT: Community Build Tool
● https://github.com/cvogt/cbt
● Side project from Chris Vogt
● Builds are written in plain Scala code instead of SBT's DSL (black magic)
Modularity
● We shouldn't need to mock if we modularize our code correctly
● https://adriaanm.github.io/files/higher.pdf
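One way to read the "no mocks" claim, as a rough sketch: keep the logic pure and pass effectful dependencies in as ordinary functions. Pricing, total, checkout, and loadPrices are made-up names for this example and do not come from the talk.

```scala
object Pricing {
  // Pure core: everything it needs arrives as plain arguments.
  def total(prices: List[BigDecimal], taxRate: BigDecimal): BigDecimal =
    prices.sum * (BigDecimal(1) + taxRate)

  // Thin shell: the effectful lookup is just a function parameter (hypothetical signature).
  def checkout(loadPrices: String => List[BigDecimal], taxRate: BigDecimal)(cartId: String): BigDecimal =
    total(loadPrices(cartId), taxRate)
}
```

A test can then call Pricing.checkout(_ => List(BigDecimal(10), BigDecimal(5)), BigDecimal("0.1"))("cart-1") with a plain lambda in place of the real lookup, so no mocking library is involved.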
Data Pipeline with Akka Stream
● WebSocket as emitter
● Kafka as queue
● Akka Stream as processing engine
Type Parameter vs. Type Member
● Sometimes the compiler has issues with type parameters
○ trait Foo[F <: Foo[F]]
trait Foo[+T]
trait Foo { type T <: Exception }
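The two declarations above can be compared side by side in a small sketch. BoxP, BoxM, and describe are hypothetical names chosen so both styles can live in one file.

```scala
object TypeParamVsMember {
  // Type-parameter style: the element type is part of the trait's signature.
  trait BoxP[+T] { def value: T }

  // Type-member style: the type is an abstract member, bounded as on the slide.
  trait BoxM {
    type T <: Exception
    def value: T
  }

  val p: BoxP[IllegalStateException] = new BoxP[IllegalStateException] {
    val value = new IllegalStateException("param")
  }

  // Covariance (+T) lets a BoxP of a subtype be used as a BoxP[Exception].
  val widened: BoxP[Exception] = p

  // The member can be pinned down with a refinement when callers need it.
  val m: BoxM { type T = IllegalStateException } = new BoxM {
    type T = IllegalStateException
    val value = new IllegalStateException("member")
  }

  // Code that does not care about T can mention BoxM with no type argument at all.
  def describe(b: BoxM): String = b.value.getMessage
}
```

With a type member, signatures such as describe stay free of type arguments, which is one reason to prefer members when the parameterized form gives the compiler trouble.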
Rapture - Guava of Scala
● A single point of failure is GREAT in code
● Config through imports
● Rapture JSON supports 8 JSON backends
● Syntax errors in HTML or JSON are checked by the compiler
● Use Gitter to report bugs to Jon
SI-2712
Scala Issue: Implement higher-order unification for type constructor inference.
object Test {
  def meh[M[_], A](x: M[A]): M[A] = x
  meh{(x: Int) => x} // should solve ?M = [X] X => X and ?A = Int ...
}
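A common way around the inference gap is to name the partially applied type constructor and pass the type arguments explicitly. The sketch below assumes nothing beyond the meh signature from the issue; Si2712Sketch and IntTo are hypothetical names.

```scala
object Si2712Sketch {
  def meh[M[_], A](x: M[A]): M[A] = x

  // Hypothetical alias: fix the argument type, leaving a one-parameter type constructor.
  type IntTo[X] = Int => X

  // With M and A spelled out, the compiler has nothing left to unify.
  val f: Int => Int = meh[IntTo, Int]((x: Int) => x + 1)
}
```

The alias does by hand what the issue asks the compiler to do automatically, namely view Function1[Int, Int] as some M[A].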
Resources
● Runar's blog: http://blog.higher-order.com/
● Slides of Brendan's talk: https://speakerdeck.com/bwmcadams/nescala-16-scala-macros-for-mortals-or-how-i-learned-to-stop-worrying-and-mumbling-wtf
● Logic programming: https://github.com/stewSquared/ukanren
● Akka stream: https://github.com/pkinsky/akka-streams-example
● Type member vs type parameter: http://www.artima.com/weblogs/viewpost.jsp?thread=270195
● Good blog on category theory: http://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/
● Unconference grid: https://goo.gl/ei5ijy
● Rapture (the Guava of Scala): https://github.com/propensive/rapture
● Another JSON library, built with shapeless: https://github.com/travisbrown/circe
● Typesafe RPC: https://github.com/lihaoyi/autowire
● SI-2712: https://issues.scala-lang.org/browse/SI-2712