lessons learnt with lambdas and streams in jdk 8 8...lambda expressions in jdk 8 § old style,...
TRANSCRIPT
© Copyright Azul Systems 2016
© Copyright Azul Systems 2015
@speakjava azul.com
Lessons Learnt With Lambdas and Streams
in JDK 8
Simon Ritter Deputy CTO, Azul Systems
1
© Copyright Azul Systems 2016
A clever man learns from his mistakes...
...a wise man learns from other people’s
© Copyright Azul Systems 2016
Lambdas And Streams Primer
© Copyright Azul Systems 2016
Lambda Expressions In JDK 8
§ Old style, anonymous inner classes
§ New style, using a Lambda expression
§ Can be used wherever the type is a functional interface
4
Simplified Parameterised Behaviour
newThread(newRunnable{publicvoidrun(){doSomeStuff();}}).start();
newThread(()->doSomeStuff()).start();
© Copyright Azul Systems 2016
Functional Interface Definition § Is an interface § Must have only one abstract method
– In JDK 7 this would mean only one method (like ActionListener)
§ JDK 8 introduced default methods – Adding multiple inheritance of types to Java – These are, by definition, not abstract
§ JDK 8 also now allows interfaces to have static methods § @FunctionalInterface to have the compiler check
5
© Copyright Azul Systems 2016
Stream Overview § A stream pipeline consists of three types of things
– A source – Zero or more intermediate operations – A terminal operation
§ Producing a result or a side-effect
inttotal=transactions.stream().filter(t->t.getBuyer().getCity().equals(“London”)).mapToInt(Transaction::getPrice).sum();
Source
Intermediate operation Terminal operation
© Copyright Azul Systems 2016
Stream Terminal Operations § The pipeline is only evaluated when the terminal operation
is called – All operations can execute sequentially or in parallel – Intermediate operations can be merged
§ Avoiding multiple redundant passes on data § Short-circuit operations (e.g. findFirst) § Lazy evaluation
– Stream characteristics help identify optimisations § DISTINT stream passed to distinct() is a no-op
© Copyright Azul Systems 2016
Lambda Expressions And Delayed Execution
© Copyright Azul Systems 2016
Performance Impact For Logging § Heisenberg’s uncertainty principle
§ Setting log level to INFO still has a performance impact § Since Logger determines whether to log the message the
parameter must be evaluated even when not used
9
logger.finest(getSomeStatusData());
Always executed
© Copyright Azul Systems 2016
Supplier<T> § Represents a supplier of results § All relevant logging methods now have a version that takes
a Supplier
§ Pass a description of how to create the log message – Not the message
§ If the Logger doesn’t need the value it doesn’t invoke the Lambda
§ Can be used for other conditional activities 10
logger.finest(getSomeStatusData()); logger.finest(()->getSomeStatusData());
© Copyright Azul Systems 2016
Avoiding Loops In Streams
© Copyright Azul Systems 2016
Functional v. Imperative § For functional programming you should not modify state § Java supports closures over values, not closures over
variables § But state is really useful…
12
© Copyright Azul Systems 2016
Counting Methods That Return Streams
13
Still Thinking Imperatively
Set<String>sourceKeySet=streamReturningMethodMap.keySet();LongAddersourceCount=newLongAdder();sourceKeySet.stream().forEach(c->sourceCount.add(streamReturningMethodMap.get(c).size()));
© Copyright Azul Systems 2016
Counting Methods That Return Streams
14
Functional Way
sourceKeySet.stream().mapToInt(c->streamReturningMethodMap.get(c).size()).sum();
© Copyright Azul Systems 2016
Printing And Counting
15
Still Thinking Imperatively
LongAddernewMethodCount=newLongAdder();functionalParameterMethodMap.get(c).stream().forEach(m->{output.println(m);if(isNewMethod(c,m))newMethodCount.increment();});
© Copyright Azul Systems 2016
Printing And Counting
16
More Functional, But Not Pure Functional
intcount=functionalParameterMethodMap.get(c).stream().mapToInt(m->{intnewMethod=0;output.println(m);if(isNewMethod(c,m))newMethod=1;returnnewMethod}).sum();
There is still state being modified in the Lambda
© Copyright Azul Systems 2016
Printing And Counting
17
Even More Functional, But Still Not Pure Functional
intcount=functionalParameterMethodMap.get(nameOfClass).stream().peek(method->output.println(method)).mapToInt(m->isNewMethod(nameOfClass,m)?1:0).sum();
Strictly speaking printing is a side effect, which is not purely functional
© Copyright Azul Systems 2016
The Art Of Reduction (Or The Need to Think Differently)
© Copyright Azul Systems 2016
A Simple Problem § Find the length of the longest line in a file § Hint: BufferedReader has a new method, lines(), that
returns a Stream
19
BufferedReaderreader=...intlongest=reader.lines().mapToInt(String::length).max().getAsInt();
© Copyright Azul Systems 2016
Another Simple Problem § Find the length of the longest line in a file
20
© Copyright Azul Systems 2016
Another Simple Problem § Find the length of the longest line in a file
21
© Copyright Azul Systems 2016
Naïve Stream Solution
§ That works, so job done, right? § Not really. Big files will take a long time and a lot of
resources § Must be a better approach
22
Stringlongest=reader.lines().sort((x,y)->y.length()-x.length()).findFirst().get();
© Copyright Azul Systems 2016
External Iteration Solution
§ Simple, but inherently serial § Not thread safe due to mutable state
23
Stringlongest="";while((Strings=reader.readLine())!=null)if(s.length()>longest.length())longest=s;
© Copyright Azul Systems 2016
Recursive Approach
24
StringfindLongestString(Stringlongest,BufferedReaderreader){Stringnext=reader.readLine();if(next==null)returnlongest;if(next.length()>longest.length())longest=next;returnfindLongestString(longest,reader);}
© Copyright Azul Systems 2016
Recursion: Solving The Problem
§ No explicit loop, no mutable state, we’re all good now, right? § Unfortunately not:
– larger data sets will generate an OOM exception – Too many stack frames
25
Stringlongest=findLongestString("",reader);
© Copyright Azul Systems 2016
A Better Stream Solution § Stream API uses the well known filter-map-reduce pattern § For this problem we do not need to filter or map, just
reduce
Optional<T>reduce(BinaryOperator<T>accumulator)
§ BinaryOperator is a subclass of BiFunction– Rapply(Tt,Uu)
§ For BinaryOperator all types are the same – Tapply(Tx,Ty)
26
© Copyright Azul Systems 2016
A Better Stream Solution § The key is to find the right accumulator
– The accumulator takes a partial result and the next element, and returns a new partial result
– In essence it does the same as our recursive solution – But without all the stack frames
27
© Copyright Azul Systems 2016
A Better Stream Solution § Use the recursive approach as an accululator for a
reduction
28
StringlongestLine=reader.lines().reduce((x,y)->{if(x.length()>y.length())returnx;returny;}).get();
© Copyright Azul Systems 2016
A Better Stream Solution § Use the recursive approach as an accululator for a
reduction
29
StringlongestLine=reader.lines().reduce((x,y)->{if(x.length()>y.length())returnx;returny;}).get();
x in effect maintains state for us, by providing the partial result, which is the longest string found so far
© Copyright Azul Systems 2016
The Simplest Stream Solution § Use a specialised form of max()§ One that takes a Comparator as a parameter
§ comparingInt() is a static method on Comparator Comparator<T>comparingInt(ToIntFunction<?extendsT>keyExtractor)
30
reader.lines().max(comparingInt(String::length)).get();
© Copyright Azul Systems 2016
Parallel Streams: Myths And Realities
© Copyright Azul Systems 2016 32
© Copyright Azul Systems 2016
Serial And Parallel Streams § Syntactically very simple to change from serial to parallel
33
intsum=list.stream().filter(w->w.getColour()==RED).mapToInt(w->w.getWeight()).sum();
intsum=list.parallelStream().filter(w->w.getColour()==RED).mapToInt(w->w.getWeight()).sum();
© Copyright Azul Systems 2016
Serial And Parallel Streams § Syntactically very simple to change from serial to parallel
§ Use serial() or parallel() to change mid-stream
34
intsum=list.parallelStream().filter(w->w.getColour()==RED).serial().mapToInt(w->w.getWeight()).parallel().sum();
© Copyright Azul Systems 2016
Serial And Parallel Streams § Syntactically very simple to change from serial to parallel
§ Use serial() or parallel() to change mid-stream § Whole stream is processed serially or in parallel
– Last call wins § Operations should be stateless and independent
35
intsum=list.parallelStream().filter(w->w.getColour()==RED).serial().mapToInt(w->w.getWeight()).parallel().sum();
© Copyright Azul Systems 2016
Common ForkJoinPool § Created when JVM starts up § Default number of threads is equal to CPUs reported by OS
– Runtime.getRuntime().availableProcessors() § Can be changed manually
36
-Djava.util.concurrent.ForkJoinPool.common.parallelism=n
© Copyright Azul Systems 2016
Common ForkJoinPool
37
Set<String>workers=newConcurrentSet<String>();intsum=list.parallelStream().peek(n->workers.add(Thread.currentThread().getName())).filter(w->w.getColour()==RED).mapToInt(w->w.getWeight()).sum();System.out.println(”Workerthreadcount=”+workers.size());
-Djava.util.concurrent.ForkJoinPool.common.parallelism=4
Worker thread count = 5
© Copyright Azul Systems 2016
Common ForkJoinPool
38
Set<String>workers=newConcurrentSet<String>();...worker.stream().forEach(System.out::println);
-Djava.util.concurrent.ForkJoinPool.common.parallelism=4
ForkJoinPool.commonPool-worker-0 ForkJoinPool.commonPool-worker-1 ForkJoinPool.commonPool-worker-2 ForkJoinPool.commonPool-worker-3 main
© Copyright Azul Systems 2016
Parallel Stream Threads § Invoking thread is also used as worker § Parallel stream invokes ForkJoin synchronously
– Blocks until work is finished § Other application threads using parallel stream will be
affected § Beware IO blocking actions inside stream operartions § Do NOT nest parallel streams
© Copyright Azul Systems 2016
Parallel Stream Custom Pool Hack
40
ForkJoinPoolcustomPool=newForkJoinPool(4);ForkJoinTask<Integer>hackTask=customPool.submit(()->{returnlist.parallelStream().peek(w->workers.add(Thread.currentThread().getName())).filter(w->w.getColour()==RED).mapToInt(w->w.getWeight()).sum();});intsum=hackTask.get();
Worker thread count = 4
© Copyright Azul Systems 2016
Is A Parallel Stream Faster? § Using a parallel stream guarantees more work
– Setting up fork-join framework is overhead § This might complete more quickly § Depends on several factors
– How many elements in the stream (N) – How long each element takes to process (T) – Parallel streams improve with N x T – Operation types
§ map(), filter() good § sort(), distinct() not so good
41
© Copyright Azul Systems 2016
Lambdas And Streams And JDK 9
© Copyright Azul Systems 2016
Additional APIs § Optional now has a stream() method
– Returns a stream of one element or an empty stream § Collectors.flatMapping()
– Returns a Collector that converts a stream from one type to another by applying a flat mapping function
43
© Copyright Azul Systems 2016
Additional Stream Sources § java.util.regex.Matcher § java.util.Scanner (results and tokens) § java.net.NetworkInterface§ java.security.PermissionCollection
44
© Copyright Azul Systems 2016
Parallel Support For Files.lines() § Memory map file for UTF-8, ISO 8859-1, US-ASCII
– Character sets where line feeds easily identifiable § Efficient splitting of mapped memory region § Divides approximately in half
– To nearest line feed
45
© Copyright Azul Systems 2016
Parallel Lines Performance
46
© Copyright Azul Systems 2016
Stream dropWhile § Stream<T>dropWhile(Predicate<?superT>p)§ Ignore elements from stream until Predicate matches § Unordered stream still needs consideration
thermalReader.lines().mapToInt(i->Integer.parseInt(i)).dropWhile(i->i<56).forEach(System.out::println);
© Copyright Azul Systems 2016
Stream takeWhile § Stream<T>takeWhile(Predicate<?superT>p)§ Select elements from stream until Predicate matches § Unordered stream needs consideration
thermalReader.lines().mapToInt(i->Integer.parseInt(i)).takeWhile(i->i<56).forEach(System.out::println);
© Copyright Azul Systems 2016
Conclusions
© Copyright Azul Systems 2016
Conclusions § Lambdas and Stream are a very powerful combination § Does require developers to think differently
– Avoid loops, even non-obvious ones! – Reductions
§ Be careful with parallel streams § More to come in JDK 9 (and 10)
§ Join the Zulu.org community – www.zulu.org
50
© Copyright Azul Systems 2016
© Copyright Azul Systems 2015
@speakjava azul.com
Q & A
Simon Ritter Deputy CTO, Azul Systems
51