Java 8 Stream API: A different way to process collections
DESCRIPTION
A look at one of the features of Java 8 hidden behind the lambdas: a different way to iterate over collections. You'll never see collections the same way again. These are the slides I used in my talk at the "Tech Thursday" by Oracle in June in Madrid.
TRANSCRIPT
Java 8 Stream API
A different way to process collections
David Gómez G.
@dgomezg
Streams? What’s that?
A Stream is…
A convenient way to iterate over collections declaratively.
    List<Integer> numbers = new ArrayList<Integer>();
    for (int i = 0; i < 100; i++) {
        numbers.add(i);
    }

    // Imperative version: explicit loop and manual accumulation
    List<Integer> evenNumbers = new ArrayList<>();
    for (int i : numbers) {
        if (i % 2 == 0) {
            evenNumbers.add(i);
        }
    }
@dgomezg
A Stream is…
A convenient way to iterate over collections declaratively.
    List<Integer> numbers = new ArrayList<Integer>();
    for (int i = 0; i < 100; i++) {
        numbers.add(i);
    }

    // Declarative version: toList() is statically imported from java.util.stream.Collectors
    List<Integer> evenNumbers = numbers.stream()
        .filter(n -> n % 2 == 0)
        .collect(toList());
So… Streams are collections?
Not really.

    Collections                  Streams
    Sequence of elements         Sequence of elements
    Computed at construction     Computed at iteration
    In-memory data structure     Traversable only once
    External iteration           Internal iteration
    Finite size                  Possibly infinite size
Iterating a Collection
    List<Integer> evenNumbers = new ArrayList<>();
    for (int i : numbers) {
        if (i % 2 == 0) {
            evenNumbers.add(i);
        }
    }
External Iteration
- Use a for-each loop or an Iterator
- Very verbose
Parallelism by manually using Threads
- Concurrency is hard to get right!
- Lots of contention and error-prone
- Thread-safety
Iterating a Stream
    List<Integer> evenNumbers = numbers.stream()
        .filter(n -> n % 2 == 0)
        .collect(toList());
Internal Iteration
- No manual Iterator handling
- Concise
- Fluent API: chain sequence processing
Elements computed only when needed
Iterating a Stream (in parallel)
    List<Integer> evenNumbers = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .collect(toList());
Easy Parallelism
- Concurrency is hard to get right!
- Uses ForkJoin
- Process steps should be
  - stateless
  - independent
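As a rough illustration of the "stateless and independent" requirement, the sketch below runs the same filter sequentially and in parallel and checks that both give the same list. The class and method names are invented for this example, and `IntStream.range` is used only to avoid filling the list by hand.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelEvens {
    // Hypothetical helper: filters 0..99 either sequentially or in parallel.
    static List<Integer> evens(boolean parallel) {
        IntStream numbers = IntStream.range(0, 100);
        if (parallel) {
            numbers = numbers.parallel();
        }
        return numbers
                .filter(n -> n % 2 == 0)   // stateless: the result depends only on n
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Both runs should agree because the filter keeps no shared state.
        System.out.println(evens(false).equals(evens(true)));
    }
}
```

Because the predicate touches nothing outside its argument, the framework is free to split the work across threads without changing the result.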
Lambdas & Method References
@FunctionalInterface
An interface with exactly one abstract method.
It could have default methods, though!
    @FunctionalInterface
    public interface Predicate<T> {
        boolean test(T t);

        default Predicate<T> negate() {
            return (t) -> !test(t);
        }
    }
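A small sketch of how the single abstract method and a default method work together: the lambda supplies `test`, and `negate()` comes for free from `java.util.function.Predicate`. The class and constant names are made up for illustration.

```java
import java.util.function.Predicate;

class PredicateNegateDemo {
    // The lambda body becomes the implementation of Predicate.test(T).
    static final Predicate<Integer> IS_EVEN = n -> n % 2 == 0;

    // negate() is a default method, so no extra code is needed for it.
    static final Predicate<Integer> IS_ODD = IS_EVEN.negate();

    public static void main(String[] args) {
        System.out.println(IS_EVEN.test(4));
        System.out.println(IS_ODD.test(4));
    }
}
```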
Lambda Types
Based on the abstract method signature from the @FunctionalInterface: (Arguments) -> <return type>

    @FunctionalInterface
    public interface Predicate<T> {
        boolean test(T t);
    }
T -> boolean

    @FunctionalInterface
    public interface Runnable {
        void run();
    }
() -> void

    @FunctionalInterface
    public interface Supplier<T> {
        T get();
    }
() -> T

    @FunctionalInterface
    public interface BiFunction<T, U, R> {
        R apply(T t, U u);
    }
(T, U) -> R

    @FunctionalInterface
    public interface Comparator<T> {
        int compare(T o1, T o2);
    }
(T, T) -> int
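To make the shapes above concrete, here is a sketch with one lambda per interface. All names (`IS_EMPTY`, `PREFIX`, and so on) are invented for this example; only the JDK interfaces themselves come from the slides.

```java
import java.util.Comparator;
import java.util.function.BiFunction;
import java.util.function.Predicate;
import java.util.function.Supplier;

class LambdaTypesDemo {
    // T -> boolean
    static final Predicate<String> IS_EMPTY = s -> s.isEmpty();

    // () -> T
    static final Supplier<String> GREETING = () -> "hello";

    // (T, U) -> R
    static final BiFunction<String, Integer, String> PREFIX = (s, n) -> s.substring(0, n);

    // (T, T) -> int
    static final Comparator<String> BY_LENGTH = (a, b) -> Integer.compare(a.length(), b.length());

    // () -> void: the lambda returns nothing, it only has a side effect.
    static boolean runnableRan() {
        boolean[] ran = {false};
        Runnable task = () -> ran[0] = true;
        task.run();
        return ran[0];
    }

    public static void main(String[] args) {
        System.out.println(IS_EMPTY.test(""));
        System.out.println(GREETING.get());
        System.out.println(PREFIX.apply("stream", 3));
        System.out.println(BY_LENGTH.compare("a", "bb"));
        System.out.println(runnableRan());
    }
}
```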
Method References
Allow using a method name as a lambda.
Usually better readability.
Syntax: <TargetReference>::<MethodName>
TargetReference: Instance or Class

Method References
    Lambda                                   Method Reference
    phoneCall -> phoneCall.getContact()      PhoneCall::getContact
    () -> Thread.currentThread()             Thread::currentThread
    (str, c) -> str.indexOf(c)               String::indexOf
    (String s) -> System.out.println(s)      System.out::println
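The equivalence between a lambda and its method reference can be checked directly. The sketch below pairs a few of them using only JDK types; the class and constant names are hypothetical, and `String::indexOf` is bound to the `indexOf(String)` overload by the declared `BiFunction` type.

```java
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

class MethodReferenceDemo {
    // Unbound instance method: the first argument becomes the receiver.
    static final Function<String, Integer> LENGTH_LAMBDA = s -> s.length();
    static final Function<String, Integer> LENGTH_REF = String::length;

    static final BiFunction<String, String, Integer> INDEX_LAMBDA = (str, c) -> str.indexOf(c);
    static final BiFunction<String, String, Integer> INDEX_REF = String::indexOf;

    // Static method reference.
    static final Supplier<Thread> CURRENT_THREAD = Thread::currentThread;

    // Method reference on a specific instance (System.out).
    static final Consumer<String> PRINTER = System.out::println;

    public static void main(String[] args) {
        PRINTER.accept("length: " + LENGTH_REF.apply("stream"));
        PRINTER.accept("index of 'e': " + INDEX_REF.apply("stream", "e"));
        PRINTER.accept("thread: " + CURRENT_THREAD.get().getName());
    }
}
```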
From Collections to Streams
Characteristics of a Stream
• Interface to a sequence of elements
• Focused on processing (not on storage)
• Elements computed on demand (or extracted from source)
• Can be traversed only once
• Internal iteration
• Parallel support
• Could be infinite
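"Can be traversed only once" is easy to verify: a second terminal operation on the same Stream throws an `IllegalStateException`. A minimal sketch, with an invented class name:

```java
import java.util.Arrays;
import java.util.stream.Stream;

class SingleUseStream {
    // Returns true if traversing the same Stream a second time is rejected.
    static boolean secondTraversalFails() {
        Stream<Integer> stream = Arrays.asList(1, 2, 3).stream();
        stream.forEach(n -> { });          // first traversal consumes the Stream
        try {
            stream.forEach(n -> { });      // second traversal is not allowed
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTraversalFails());
    }
}
```

To process the data again, ask the source collection for a new Stream.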
Anatomy of a Stream
[Diagram: a Source feeds a chain of Intermediate Operations (filter, map, order, function) that ends in a Final operation; together they form the pipeline.]
Anatomy of Stream Iteration
1. Start from the DataSource (usually a collection) and create the Stream
    List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    Stream<Integer> numbersStream = numbers.stream();
Anatomy of Stream Iteration
2. Add a chain of intermediate Operations (Stream Pipeline)
    Stream<Integer> numbersStream = numbers.stream()
        .filter(new Predicate<Integer>() {
            @Override
            public boolean test(Integer number) {
                return number % 2 == 0;
            }
        })
        .map(new Function<Integer, Integer>() {
            @Override
            public Integer apply(Integer number) {
                return number * 2;
            }
        });

Anatomy of Stream Iteration
2. Add a chain of intermediate Operations (Stream Pipeline) - Better using lambdas
    Stream<Integer> numbersStream = numbers.stream()
        .filter(number -> number % 2 == 0)
        .map(number -> number * 2);
Anatomy of Stream Iteration
3. Close with a Terminal Operation
    // The result is a List, not a Stream
    List<Integer> doubledEvens = numbers.stream()
        .filter(number -> number % 2 == 0)
        .map(number -> number * 2)
        .collect(Collectors.toList());
• The terminal operation triggers Stream iteration.
• Before that, nothing is computed.
• Depending on the terminal operation, the Stream may or may not be fully traversed.
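The three bullets above can be observed by counting how often the filter predicate runs. This is only a sketch: the counter and the class and method names are invented for the demonstration, and the pipeline is the same filter-and-double chain from the slides.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPipelineDemo {
    static final List<Integer> NUMBERS = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    // Builds the pipeline but never calls a terminal operation.
    static int callsBeforeTerminalOperation() {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> pipeline = NUMBERS.stream()
                .filter(n -> { calls.incrementAndGet(); return n % 2 == 0; })
                .map(n -> n * 2);
        return calls.get();
    }

    // collect() has to visit every element.
    static int callsForCollect() {
        AtomicInteger calls = new AtomicInteger();
        NUMBERS.stream()
                .filter(n -> { calls.incrementAndGet(); return n % 2 == 0; })
                .map(n -> n * 2)
                .collect(Collectors.toList());
        return calls.get();
    }

    // findFirst() can stop as soon as one element passes the filter.
    static int callsForFindFirst() {
        AtomicInteger calls = new AtomicInteger();
        NUMBERS.stream()
                .filter(n -> { calls.incrementAndGet(); return n % 2 == 0; })
                .findFirst();
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(callsBeforeTerminalOperation());
        System.out.println(callsForCollect());
        System.out.println(callsForFindFirst());
    }
}
```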
Stream operations
Operation Types
Intermediate operations
• Always return a Stream
• Chain as many as needed (Pipeline)
• Guide processing of data
• Do not start processing
• Can be Stateless or Stateful
Terminal operations
• Can return an object, a collection, or void
• Start the pipeline process
• After their execution, the Stream cannot be revisited

Intermediate Operations
    // T -> boolean
    Stream<T> filter(Predicate<? super T> predicate);

    // T -> R
    <R> Stream<R> map(Function<? super T, ? extends R> mapper);

    // (T, T) -> int
    Stream<T> sorted(Comparator<? super T> comparator);
    Stream<T> sorted();

    // T -> void
    Stream<T> peek(Consumer<? super T> action);

    Stream<T> distinct();
    Stream<T> limit(long maxSize);
    Stream<T> skip(long n);
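A short sketch chaining several of the intermediate operations listed above. The input values and the class name are arbitrary choices for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IntermediateOpsDemo {
    static List<Integer> pipeline() {
        return Stream.of(5, 3, 5, 1, 4, 2, 3)
                .distinct()   // stateful: drops the repeated 5 and 3
                .sorted()     // stateful: natural ascending order
                .skip(1)      // drops the smallest element
                .limit(3)     // keeps the next three
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(pipeline()); // prints [2, 3, 4]
    }
}
```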
Final Operations
    Object[] toArray();

    // T -> void
    void forEach(Consumer<? super T> action);

    <R, A> R collect(Collector<? super T, A, R> collector);

    // Commonly used collectors
    java.util.stream.Collectors.toList();
    java.util.stream.Collectors.toSet();
    java.util.stream.Collectors.toMap(keyMapper, valueMapper);
    java.util.stream.Collectors.joining(CharSequence);
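The collectors listed above could be used along these lines. The word list is borrowed from a later slide; the class and method names are invented for the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CollectorsDemo {
    static final List<String> WORDS = Arrays.asList("Using", "Stream", "API", "From", "Java8");

    // joining: concatenates the elements with a delimiter.
    static String joined() {
        return WORDS.stream().collect(Collectors.joining(" "));
    }

    // toSet: duplicates (two words of length 5) collapse into one entry.
    static Set<Integer> lengths() {
        return WORDS.stream().map(String::length).collect(Collectors.toSet());
    }

    // toMap: needs one function for the key and one for the value.
    static Map<String, Integer> lengthByWord() {
        return WORDS.stream().collect(Collectors.toMap(word -> word, String::length));
    }

    public static void main(String[] args) {
        System.out.println(joined());
        System.out.println(lengths());
        System.out.println(lengthByWord());
    }
}
```

Note that `toMap` throws an `IllegalStateException` on duplicate keys unless a merge function is supplied, which is why the words themselves (all distinct) are used as keys here.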
Final Operations (II)
    // (T, T) -> T
    Optional<T> reduce(BinaryOperator<T> accumulator);

    // (T, T) -> int
    Optional<T> min(Comparator<? super T> comparator);

    // (T, T) -> int
    Optional<T> max(Comparator<? super T> comparator);

    long count();
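A sketch of these reductions on the numbers 1 to 10, including the empty-stream case that explains why `reduce`, `min` and `max` return an `Optional`. Class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

class ReductionDemo {
    static final List<Integer> NUMBERS = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    static Optional<Integer> sum() {
        return NUMBERS.stream().reduce(Integer::sum);
    }

    static Optional<Integer> max() {
        return NUMBERS.stream().max(Comparator.naturalOrder());
    }

    static long count() {
        return NUMBERS.stream().count();
    }

    // With no elements there is nothing to combine, so the Optional is empty.
    static Optional<Integer> sumOfNothing() {
        return Stream.<Integer>empty().reduce(Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum());
        System.out.println(max());
        System.out.println(count());
        System.out.println(sumOfNothing().isPresent());
    }
}
```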
Final Operations (and III)
    // T -> boolean
    boolean anyMatch(Predicate<? super T> predicate);
    boolean allMatch(Predicate<? super T> predicate);
    boolean noneMatch(Predicate<? super T> predicate);
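The three matchers can be sketched as follows; the last method shows the perhaps surprising result of `allMatch` on an empty Stream. Names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class MatchDemo {
    static final List<Integer> NUMBERS = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    static boolean anyAboveNine() {
        return NUMBERS.stream().anyMatch(n -> n > 9);
    }

    static boolean allPositive() {
        return NUMBERS.stream().allMatch(n -> n > 0);
    }

    static boolean noneAboveTen() {
        return NUMBERS.stream().noneMatch(n -> n > 10);
    }

    // allMatch is vacuously true when there are no elements to test.
    static boolean allMatchOnEmpty() {
        return Stream.<Integer>empty().allMatch(n -> n > 100);
    }

    public static void main(String[] args) {
        System.out.println(anyAboveNine());
        System.out.println(allPositive());
        System.out.println(noneAboveTen());
        System.out.println(allMatchOnEmpty());
    }
}
```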
Usage examples - Context
    public class Contact {
        private final String name;
        private final String city;
        private final String phoneNumber;
        private final LocalDate birth;

        public int getAge() {
            return Period.between(birth, LocalDate.now())
                .getYears();
        }
        // Constructor and getters omitted
    }

Usage examples - Context
    public class PhoneCall {
        private final Contact contact;
        private final LocalDate time;
        private final Duration duration;

        // Constructor and getters omitted
    }

    Contact me = new Contact("dgomezg", "Madrid", "555 55 55 55", LocalDate.of(1975, Month.MARCH, 26));
    Contact martin = new Contact("Martin", "Santiago", "666 66 66 66", LocalDate.of(1978, Month.JANUARY, 17));
    Contact roberto = new Contact("Roberto", "Santiago", "111 11 11 11", LocalDate.of(1973, Month.MAY, 11));
    Contact heinz = new Contact("Heinz", "Chania", "444 44 44 44", LocalDate.of(1972, Month.APRIL, 29));
    Contact michael = new Contact("michael", "Munich", "222 22 22 22", LocalDate.of(1976, Month.DECEMBER, 8));

    List<PhoneCall> phoneCallLog = Arrays.asList(
        new PhoneCall(heinz, LocalDate.of(2014, Month.MAY, 28), Duration.ofSeconds(125)),
        new PhoneCall(martin, LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(5)),
        new PhoneCall(roberto, LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(12)),
        new PhoneCall(michael, LocalDate.of(2014, Month.MAY, 28), Duration.ofMinutes(3)),
        new PhoneCall(michael, LocalDate.of(2014, Month.MAY, 29), Duration.ofSeconds(90)),
        new PhoneCall(heinz, LocalDate.of(2014, Month.MAY, 30), Duration.ofSeconds(365)),
        new PhoneCall(heinz, LocalDate.of(2014, Month.JUNE, 1), Duration.ofMinutes(7)),
        new PhoneCall(martin, LocalDate.of(2014, Month.JUNE, 2), Duration.ofSeconds(315)));
People I phoned in June
    phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.JUNE)
        .map(phoneCall -> phoneCall.getContact().getName())
        .distinct()
        .forEach(System.out::println);
Seconds I talked in May
    Long total = phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
        .map(PhoneCall::getDuration)
        .collect(summingLong(Duration::getSeconds));

Seconds I talked in May (using reduce)
    // Reducing a Stream<Duration> yields an Optional<Duration>, not an Optional<Long>
    Optional<Duration> total = phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
        .map(PhoneCall::getDuration)
        .reduce(Duration::plus);
    total.ifPresent(duration -> System.out.println(duration.getSeconds()));
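Both ways of adding up the talk time can be compared in a self-contained sketch. To keep it short, it works directly on the six May durations from the sample log instead of on `PhoneCall` objects; the class and method names are made up for the example.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class TalkTimeDemo {
    // Durations of the six May calls in the sample log, without the PhoneCall wrapper.
    static final List<Duration> MAY_CALLS = Arrays.asList(
            Duration.ofSeconds(125), Duration.ofMinutes(5), Duration.ofMinutes(12),
            Duration.ofMinutes(3), Duration.ofSeconds(90), Duration.ofSeconds(365));

    // Variant 1: a collector that sums a long extracted from each element.
    static long totalWithCollector() {
        return MAY_CALLS.stream().collect(Collectors.summingLong(Duration::getSeconds));
    }

    // Variant 2: combine the Duration objects themselves; empty input gives an empty Optional.
    static Optional<Duration> totalWithReduce() {
        return MAY_CALLS.stream().reduce(Duration::plus);
    }

    public static void main(String[] args) {
        System.out.println(totalWithCollector());
        totalWithReduce().ifPresent(total -> System.out.println(total.getSeconds()));
    }
}
```

Both variants should report 1780 seconds for this data.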
Did I phone Paris?
    boolean phonedToParis = phoneCallLog.stream()
        .anyMatch(phoneCall -> "Paris".equals(phoneCall.getContact().getCity()));
Give me the 3 longest phone calls
    phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
        // Longest first: reverse the ascending order of durations
        .sorted(comparing(PhoneCall::getDuration).reversed())
        .limit(3)
        .forEach(System.out::println);

Give me the 3 shortest ones
    phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
        // Shortest first: ascending order of durations
        .sorted(comparing(PhoneCall::getDuration))
        .limit(3)
        .forEach(System.out::println);
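The direction of the sort decides whether `limit(3)` keeps the longest or the shortest calls. The sketch below checks this on the six May durations from the sample log, again without the `PhoneCall` wrapper; the class and method names are invented.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class CallRankingDemo {
    static final List<Duration> MAY_CALLS = Arrays.asList(
            Duration.ofSeconds(125), Duration.ofMinutes(5), Duration.ofMinutes(12),
            Duration.ofMinutes(3), Duration.ofSeconds(90), Duration.ofSeconds(365));

    // Descending order first, so limit(3) keeps the longest calls.
    static List<Long> longestThree() {
        return MAY_CALLS.stream()
                .sorted(Comparator.comparing(Duration::getSeconds).reversed())
                .limit(3)
                .map(Duration::getSeconds)
                .collect(Collectors.toList());
    }

    // Ascending order first, so limit(3) keeps the shortest calls.
    static List<Long> shortestThree() {
        return MAY_CALLS.stream()
                .sorted(Comparator.comparing(Duration::getSeconds))
                .limit(3)
                .map(Duration::getSeconds)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(longestThree());  // prints [720, 365, 300]
        System.out.println(shortestThree()); // prints [90, 125, 180]
    }
}
```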
Creating Streams
Streams can be created from
- Collections
- Directly from values
- Generators (infinite Streams)
- Resources (like files)
- Stream ranges
From collections
use stream()
    List<Integer> numbers = new ArrayList<>();
    for (int i = 0; i < 10_000_000; i++) {
        numbers.add((int) Math.round(Math.random() * 100));
    }

    Stream<Integer> numbersStream = numbers.stream();
or parallelStream()
    Stream<Integer> numbersStream = numbers.parallelStream();
Directly from Values & ranges
    Stream.of("Using", "Stream", "API", "From", "Java8");
can be converted into a parallel Stream
    Stream.of("Using", "Stream", "API", "From", "Java8")
        .parallel();
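The slide title mentions ranges without showing one. In the JDK they live on the primitive streams, for example `IntStream.range` (end exclusive) and `IntStream.rangeClosed` (end inclusive). A small sketch with invented class and method names:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class RangeDemo {
    // range(0, 5): the upper bound is excluded.
    static List<Integer> halfOpen() {
        return IntStream.range(0, 5).boxed().collect(Collectors.toList());
    }

    // rangeClosed(1, 5): the upper bound is included.
    static int closedSum() {
        return IntStream.rangeClosed(1, 5).sum();
    }

    public static void main(String[] args) {
        System.out.println(halfOpen());  // prints [0, 1, 2, 3, 4]
        System.out.println(closedSum()); // prints 15
    }
}
```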
Generators - Functions
    Stream<Integer> integers = Stream.iterate(0, number -> number + 2);
This is an infinite Stream! It will never be exhausted!
    // Typed as Stream<int[]> so that t[0] compiles in the map step
    Stream<int[]> fibonacci = Stream.iterate(new int[]{0, 1}, t -> new int[]{t[1], t[0] + t[1]});
    fibonacci.limit(10)
        .map(t -> t[0])
        .forEach(System.out::println);
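An infinite Stream only becomes usable once something bounds it, typically `limit`. The sketch below wraps the two generators from the slide in methods that return a finite list; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GeneratorDemo {
    // Even numbers starting at 0; limit() makes the infinite Stream finite.
    static List<Integer> firstEvens(int howMany) {
        return Stream.iterate(0, number -> number + 2)
                .limit(howMany)
                .collect(Collectors.toList());
    }

    // Each element is a pair {current, next}; only the first slot is kept.
    static List<Integer> firstFibonacci(int howMany) {
        return Stream.iterate(new int[]{0, 1}, t -> new int[]{t[1], t[0] + t[1]})
                .limit(howMany)
                .map(t -> t[0])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvens(5));      // prints [0, 2, 4, 6, 8]
        System.out.println(firstFibonacci(10)); // prints [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
    }
}
```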
From Resources (Files)
    Stream<String> fileContent = Files.lines(Paths.get("readme.txt"));
Count all distinct words in a file
    long distinctWords = Files.lines(Paths.get("readme.txt"))
        .flatMap(line -> Arrays.stream(line.split(" ")))
        .distinct()
        .count();
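A Stream returned by `Files.lines` holds an open file, so it is usually wrapped in try-with-resources. The sketch below does that and runs the word count against a temporary file it creates itself; the file content, the extra empty-word filter and all names are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class DistinctWordCount {
    static long countDistinctWords(Path file) throws IOException {
        // try-with-resources closes the underlying file when the Stream is done.
        try (Stream<String> lines = Files.lines(file)) {
            return lines
                    .flatMap(line -> Arrays.stream(line.split(" ")))
                    .filter(word -> !word.isEmpty())   // ignore empty tokens from blank lines
                    .distinct()
                    .count();
        }
    }

    // Hypothetical helper: writes a two-line sample file and counts its words.
    static long countInSampleFile() {
        try {
            Path sample = Files.createTempFile("readme", ".txt");
            try {
                Files.write(sample, Arrays.asList("to be or not", "to be"));
                return countDistinctWords(sample);
            } finally {
                Files.deleteIfExists(sample);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countInSampleFile());
    }
}
```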
Parallelism
Parallel Streams
use stream()
    List<Integer> numbers = new ArrayList<>();
    for (int i = 0; i < 10_000_000; i++) {
        numbers.add((int) Math.round(Math.random() * 100));
    }

    // This will use just a single thread
    Stream<Integer> numbersStream = numbers.stream();
or parallelStream()
    // Runs on the common ForkJoinPool, sized by default from the number of available processors
    Stream<Integer> numbersStream = numbers.parallelStream();
Let’s test it
use stream()
    for (int i = 0; i < 100; i++) {
        long start = System.currentTimeMillis();
        List<Integer> even = numbers.stream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
        System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start, Thread.activeCount());
    }
Output:
    5001983 elements computed in 828 msecs with 2 threads
    5001983 elements computed in 843 msecs with 2 threads
    5001983 elements computed in 675 msecs with 2 threads
    5001983 elements computed in 795 msecs with 2 threads
Let’s test it
use parallelStream()
    for (int i = 0; i < 100; i++) {
        long start = System.currentTimeMillis();
        List<Integer> even = numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
        System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start, Thread.activeCount());
    }
Output:
    4999299 elements computed in 225 msecs with 9 threads
    4999299 elements computed in 230 msecs with 9 threads
    4999299 elements computed in 250 msecs with 9 threads
Enough for now. But this is just the beginning.
Thank You.
www.adictosaltrabajlo.com