jdk/src/share/classes/java/util/stream/package-info.java
author psandoz
Wed, 17 Apr 2013 11:34:31 +0200
changeset 17168 b7d3500f2516
parent 17167 87067e3340d3
child 18156 edb590d448c5
permissions -rw-r--r--
8011426: java.util collection Spliterator implementations Summary: Spliterator implementations for collection classes in java.util. Reviewed-by: mduigou, briangoetz Contributed-by: Doug Lea <dl@cs.oswego.edu>, Paul Sandoz <paul.sandoz@oracle.com>
17167
87067e3340d3 8008682: Inital Streams public API
briangoetz
/*
 * Copyright (c) 2012, 2013, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

/**
 * <h1>java.util.stream</h1>
 *
 * Classes to support functional-style operations on streams of values, as in the following:
 *
 * <pre>{@code
 *     int sumOfWeights = blocks.stream().filter(b -> b.getColor() == RED)
 *                                       .mapToInt(b -> b.getWeight())
 *                                       .sum();
 * }</pre>
 *
 * <p>Here we use {@code blocks}, which might be a {@code Collection}, as a source for a stream,
 * and then perform a filter-map-reduce ({@code sum()} is an example of a <a href="package-summary.html#Reduction">reduction</a>
 * operation) on the stream to obtain the sum of the weights of the red blocks.
 *
 * <p>The key abstraction used in this approach is {@link java.util.stream.Stream}, as well as its primitive
 * specializations {@link java.util.stream.IntStream}, {@link java.util.stream.LongStream},
 * and {@link java.util.stream.DoubleStream}.  Streams differ from Collections in several ways:
 *
 * <ul>
 *     <li>No storage.  A stream is not a data structure that stores elements; instead, it
 *     carries values from a source (which could be a data structure, a generator, an IO channel, etc.)
 *     through a pipeline of computational operations.</li>
 *     <li>Functional in nature.  An operation on a stream produces a result, but does not modify
 *     its underlying data source.  For example, filtering a {@code Stream} produces a new {@code Stream},
 *     rather than removing elements from the underlying source.</li>
 *     <li>Laziness-seeking.  Many stream operations, such as filtering, mapping, or duplicate removal,
 *     can be implemented lazily, exposing opportunities for optimization.  (For example, "find the first
 *     {@code String} matching a pattern" need not examine all the input strings.)  Stream operations
 *     are divided into intermediate ({@code Stream}-producing) operations and terminal (value-producing)
 *     operations; all intermediate operations are lazy.</li>
 *     <li>Possibly unbounded.  While collections have a finite size, streams need not.  Operations
 *     such as {@code limit(n)} or {@code findFirst()} can allow computations on infinite streams
 *     to complete in finite time.</li>
 * </ul>
 *
 * <h2><a name="StreamPipelines">Stream pipelines</a></h2>
 *
 * <p>Streams are used to create <em>pipelines</em> of <a href="package-summary.html#StreamOps">operations</a>.  A
 * complete stream pipeline has several components: a source (which may be a {@code Collection},
 * an array, a generator function, or an IO channel); zero or more <em>intermediate operations</em>
 * such as {@code Stream.filter} or {@code Stream.map}; and a <em>terminal operation</em> such
 * as {@code Stream.forEach} or {@code Stream.reduce}.  Stream operations may take as parameters
 * <em>function values</em> (often lambda expressions, but possibly method references
 * or objects) that parameterize the behavior of the operation, such as a {@code Predicate}
 * passed to the {@code Stream.filter} method.
 *
 * <p>Intermediate operations return a new {@code Stream}.  They are lazy; executing an
 * intermediate operation such as {@link java.util.stream.Stream#filter Stream.filter} does
 * not actually perform any filtering, instead creating a new {@code Stream} that, when
 * traversed, contains the elements of the initial {@code Stream} that match the
 * given {@code Predicate}.  Consuming elements from the stream source does not
 * begin until the terminal operation is executed.
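 *
 * <p>The deferred evaluation described above can be observed with a small sketch.  It is
 * written against the API as released in Java SE 8, which may differ from this changeset;
 * the class name {@code LazyFilterDemo} and the call counter are illustrative only, and the
 * side-effecting predicate is used purely to make the timing visible:
 *
 * <pre>{@code
 * import java.util.Arrays;
 * import java.util.List;
 * import java.util.concurrent.atomic.AtomicInteger;
 * import java.util.stream.Stream;
 *
 * public class LazyFilterDemo {
 *     public static void main(String[] args) {
 *         List<String> words = Arrays.asList("a", "bb", "ccc");
 *         AtomicInteger calls = new AtomicInteger();
 *
 *         // Building the pipeline does not invoke the predicate.
 *         Stream<String> longWords = words.stream().filter(w -> {
 *             calls.incrementAndGet();   // demonstration only; see the note on statelessness
 *             return w.length() > 1;
 *         });
 *         System.out.println(calls.get());            // prints 0
 *
 *         // The terminal operation triggers traversal of the source.
 *         long matches = longWords.count();
 *         System.out.println(matches + " " + calls.get());   // prints 2 3
 *     }
 * }
 * }</pre>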
 *
 * <p>Terminal operations consume the {@code Stream} and produce a result or a side-effect.
 * After a terminal operation is performed, the stream can no longer be used and you must
 * return to the data source, or select a new data source, to get a new stream. For example,
 * obtaining the sum of weights of all red blocks, and then of all blue blocks, requires a
 * filter-map-reduce on two different streams:
 * <pre>{@code
 *     int sumOfRedWeights  = blocks.stream().filter(b -> b.getColor() == RED)
 *                                           .mapToInt(b -> b.getWeight())
 *                                           .sum();
 *     int sumOfBlueWeights = blocks.stream().filter(b -> b.getColor() == BLUE)
 *                                           .mapToInt(b -> b.getWeight())
 *                                           .sum();
 * }</pre>
 *
 * <p>However, there are other techniques that allow you to obtain both results in a single
 * pass if multiple traversal is impractical or inefficient.  TODO provide link
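 *
 * <p>One possible single-pass technique, offered as a sketch rather than as the approach the
 * note above intends to link to, is to group the blocks by color and sum the weights within
 * each group.  The example assumes the {@code Collectors.groupingBy} and
 * {@code Collectors.summingInt} methods as released in Java SE 8 (the collector API in this
 * changeset may differ); {@code SumByColor}, {@code Color} and {@code Block} are hypothetical
 * stand-ins for the types used in the examples above:
 *
 * <pre>{@code
 * import java.util.Arrays;
 * import java.util.List;
 * import java.util.Map;
 * import java.util.stream.Collectors;
 *
 * public class SumByColor {
 *     enum Color { RED, BLUE }
 *
 *     // Hypothetical stand-in for the Block type used in the examples above.
 *     static final class Block {
 *         final Color color;
 *         final int weight;
 *         Block(Color color, int weight) { this.color = color; this.weight = weight; }
 *         Color getColor() { return color; }
 *         int getWeight() { return weight; }
 *     }
 *
 *     public static void main(String[] args) {
 *         List<Block> blocks = Arrays.asList(
 *             new Block(Color.RED, 3), new Block(Color.BLUE, 5), new Block(Color.RED, 4));
 *
 *         // One traversal: group by color, then sum the weights within each group.
 *         Map<Color, Integer> sums = blocks.stream()
 *             .collect(Collectors.groupingBy(Block::getColor,
 *                                            Collectors.summingInt(Block::getWeight)));
 *
 *         System.out.println(sums.get(Color.RED) + " " + sums.get(Color.BLUE));   // prints 7 5
 *     }
 * }
 * }</pre>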
 *
 * <h3><a name="StreamOps">Stream operations</a></h3>
 *
 * <p>Intermediate stream operations (such as {@code filter} or {@code sorted}) always produce a
 * new {@code Stream}, and are always <em>lazy</em>.  Executing a lazy operation does not
 * trigger processing of the stream contents; all processing is deferred until the terminal
 * operation commences.  Processing streams lazily allows for significant efficiencies; in a
 * pipeline such as the filter-map-sum example above, filtering, mapping, and addition can be
 * fused into a single pass, with minimal intermediate state.  Laziness also enables us to avoid
 * examining all the data when it is not necessary; for operations such as "find the first
 * string longer than 1000 characters", one need not examine all the input strings, just enough
 * to find one that has the desired characteristics.  (This behavior becomes even more important
 * when the input stream is infinite and not merely large.)
 *
 * <p>Intermediate operations are further divided into <em>stateless</em> and <em>stateful</em>
 * operations.  Stateless operations retain no state from previously seen values when processing
 * a new value; examples of stateless intermediate operations include {@code filter} and
 * {@code map}.  Stateful operations may incorporate state from previously seen elements in
 * processing new values; examples of stateful intermediate operations include {@code distinct}
 * and {@code sorted}.  Stateful operations may need to process the entire input before
 * producing a result; for example, one cannot produce any results from sorting a stream until
 * one has seen all elements of the stream.  As a result, under parallel computation, some
 * pipelines containing stateful intermediate operations have to be executed in multiple passes.
 * Pipelines containing exclusively stateless intermediate operations can be processed in a
 * single pass, whether sequential or parallel.
 *
 * <p>Further, some operations are deemed <em>short-circuiting</em> operations.  An intermediate
 * operation is short-circuiting if, when presented with infinite input, it may produce a
 * finite stream as a result.  A terminal operation is short-circuiting if, when presented with
 * infinite input, it may terminate in finite time.  (Having a short-circuiting operation is a
 * necessary, but not sufficient, condition for the processing of an infinite stream to
 * terminate normally in finite time.)
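 *
 * <p>A minimal sketch of both kinds of short-circuiting operation on an infinite source,
 * assuming the {@code Stream.iterate}, {@code limit} and {@code findFirst} methods as released
 * in Java SE 8 (the class name {@code ShortCircuitDemo} is illustrative):
 *
 * <pre>{@code
 * import java.util.Optional;
 * import java.util.stream.Stream;
 *
 * public class ShortCircuitDemo {
 *     public static void main(String[] args) {
 *         // Infinite source 1, 2, 4, 8, ...; findFirst is a short-circuiting terminal operation.
 *         Optional<Integer> first = Stream.iterate(1, x -> x * 2)
 *             .filter(x -> x > 1000)
 *             .findFirst();
 *         System.out.println(first.get());   // prints 1024
 *
 *         // Infinite source 1, 2, 3, ...; limit is a short-circuiting intermediate operation.
 *         long n = Stream.iterate(1, x -> x + 1)
 *             .limit(5)
 *             .count();
 *         System.out.println(n);             // prints 5
 *     }
 * }
 * }</pre>
 *
 * <p>By contrast, a pipeline such as
 * {@code Stream.generate(() -> 1).filter(x -> x > 1).findFirst()} contains a short-circuiting
 * operation yet never returns, because no element ever satisfies the predicate.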
 *
 * <p>Terminal operations (such as {@code forEach} or {@code findFirst}) are always eager
 * (they execute completely before returning), and produce a non-{@code Stream} result, such
 * as a primitive value or a {@code Collection}, or have side-effects.
 *
 * <h3>Parallelism</h3>
 *
 * <p>By recasting aggregate operations as a pipeline of operations on a stream of values, many
 * aggregate operations can be more easily parallelized.  A {@code Stream} can execute either
 * in serial or in parallel.  When streams are created, they are either created as sequential
 * or parallel streams; the parallel-ness of streams can also be switched by the
 * {@link java.util.stream.Stream#sequential()} and {@link java.util.stream.Stream#parallel()}
 * operations.  The {@code Stream} implementations in the JDK create serial streams unless
 * parallelism is explicitly requested.  For example, {@code Collection} has methods
 * {@link java.util.Collection#stream} and {@link java.util.Collection#parallelStream},
 * which produce sequential and parallel streams respectively; other stream-bearing methods
 * such as {@link java.util.stream.Streams#intRange(int, int)} produce sequential
 * streams but these can be efficiently parallelized by calling {@code parallel()} on the
 * result. The set of operations on serial and parallel streams is identical. To execute the
 * "sum of weights of blocks" query in parallel, we would do:
 *
 * <pre>{@code
 *     int sumOfWeights = blocks.parallelStream().filter(b -> b.getColor() == RED)
 *                                               .mapToInt(b -> b.getWeight())
 *                                               .sum();
 * }</pre>
 *
 * <p>The only difference between the serial and parallel versions of this example code is
 * the creation of the initial {@code Stream}.  Whether a {@code Stream} will execute in serial
 * or parallel can be determined by the {@code Stream#isParallel} method.  When the terminal
 * operation is initiated, the entire stream pipeline is executed either sequentially or in
 * parallel, determined by the last operation that affected the stream's serial-parallel
 * orientation (which could be the stream source, or the {@code sequential()} or
 * {@code parallel()} methods).
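 *
 * <p>The "last operation wins" rule can be illustrated with a short sketch, assuming the
 * {@code isParallel()}, {@code parallel()} and {@code sequential()} methods as released in
 * Java SE 8 (the class name {@code ParallelFlagDemo} is illustrative):
 *
 * <pre>{@code
 * import java.util.Arrays;
 * import java.util.List;
 * import java.util.stream.Stream;
 *
 * public class ParallelFlagDemo {
 *     public static void main(String[] args) {
 *         List<Integer> values = Arrays.asList(1, 2, 3, 4);
 *
 *         // Collection.stream() creates a sequential stream.
 *         System.out.println(values.stream().isParallel());              // prints false
 *
 *         // parallel() switches the orientation of the whole pipeline.
 *         System.out.println(values.stream().parallel().isParallel());   // prints true
 *
 *         // The last call that affects the orientation determines how the pipeline runs.
 *         Stream<Integer> s = values.stream().parallel().map(x -> x * 2).sequential();
 *         System.out.println(s.isParallel());                            // prints false
 *
 *         // The result of a stateless pipeline is the same either way.
 *         int sum = values.stream().parallel().mapToInt(x -> x).sum();
 *         System.out.println(sum);                                       // prints 10
 *     }
 * }
 * }</pre>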
 *
 * <p>In order for the results of parallel operations to be deterministic and consistent with
 * their serial equivalent, the function values passed into the various stream operations should
 * be <a href="#Non-Interference"><em>stateless</em></a>.
 *
 * <h3><a name="Ordering">Ordering</a></h3>
 *
 * <p>Streams may or may not have an <em>encounter order</em>.  An encounter
 * order specifies the order in which elements are provided by the stream to the
 * operations pipeline.  Whether or not there is an encounter order depends on
 * the source, the intermediate operations, and the terminal operation.
 * Certain stream sources (such as {@code List} or arrays) are intrinsically
 * ordered, whereas others (such as {@code HashSet}) are not.  Some intermediate
 * operations may impose an encounter order on an otherwise unordered stream,
 * such as {@link java.util.stream.Stream#sorted()}, and others may render an
 * ordered stream unordered (such as {@link java.util.stream.Stream#unordered()}).
 * Some terminal operations may ignore encounter order, such as
 * {@link java.util.stream.Stream#forEach}.
 *
 * <p>If a Stream is ordered, most operations are constrained to operate on the
 * elements in their encounter order; if the source of a stream is a {@code List}
 * containing {@code [1, 2, 3]}, then the result of executing {@code map(x -> x*2)}
 * must be {@code [2, 4, 6]}.  However, if the source has no defined encounter
 * order, then any of the six permutations of the values {@code [2, 4, 6]} would
 * be a valid result. Many operations can still be efficiently parallelized even
 * under ordering constraints.
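 *
 * <p>The {@code [1, 2, 3]} example above can be sketched as follows, assuming the
 * {@code Collectors.toList()} collector as released in Java SE 8 (the class name
 * {@code EncounterOrderDemo} is illustrative).  Because the source is a {@code List}, the
 * encounter order is preserved even when the pipeline runs in parallel:
 *
 * <pre>{@code
 * import java.util.Arrays;
 * import java.util.List;
 * import java.util.stream.Collectors;
 *
 * public class EncounterOrderDemo {
 *     public static void main(String[] args) {
 *         List<Integer> source = Arrays.asList(1, 2, 3);
 *
 *         // Ordered source, order-preserving map, order-respecting collector.
 *         List<Integer> doubled = source.parallelStream()
 *             .map(x -> x * 2)
 *             .collect(Collectors.toList());
 *
 *         System.out.println(doubled);   // prints [2, 4, 6]
 *     }
 * }
 * }</pre>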
 *
 * <p>For sequential streams, ordering is only relevant to the determinism
 * of operations performed repeatedly on the same source.  (An {@code ArrayList}
 * is constrained to iterate elements in order; a {@code HashSet} is not, and
 * repeated iteration might produce a different order.)
 *
 * <p>For parallel streams, relaxing the ordering constraint can enable
 * optimized implementations of some operations.  For example, duplicate
 * filtration on an ordered stream must completely process the first partition
 * before it can return any elements from a subsequent partition, even if those
 * elements are available earlier.  On the other hand, without the constraint of
 * ordering, duplicate filtration can be done more efficiently by using
 * a shared {@code ConcurrentHashSet}.  There will be cases where the stream
 * is structurally ordered (the source is ordered and the intermediate
 * operations are order-preserving), but the user does not particularly care
 * about the encounter order.  In some cases, explicitly de-ordering the stream
 * with the {@link java.util.stream.Stream#unordered()} method may result in
 * improved parallel performance for some stateful or terminal operations.
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   206
 *
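 * <p>As a minimal sketch of such explicit de-ordering (the {@code words} list
 * below is a hypothetical input, not part of this package), the following
 * removes duplicates without asking the implementation to preserve encounter
 * order.  Whether this is actually faster depends on the source, the data
 * size, and the implementation:
 * <pre>{@code
 *     List<String> words = Arrays.asList("b", "a", "b", "c", "a");
 *     Set<String> unique = words.parallelStream()
 *                               .unordered()
 *                               .distinct()
 *                               .collect(Collectors.toSet());
 *     // unique contains "a", "b" and "c"
 * }</pre>
 *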
 * <h3><a name="Non-Interference">Non-interference</a></h3>
 *
 * The {@code java.util.stream} package enables you to execute possibly-parallel
 * bulk-data operations over a variety of data sources, including even non-thread-safe
 * collections such as {@code ArrayList}.  This is possible only if we can
 * prevent <em>interference</em> with the data source during the execution of a
 * stream pipeline.  (Execution begins when the terminal operation is invoked, and ends
 * when the terminal operation completes.)  For most data sources, preventing interference
 * means ensuring that the data source is <em>not modified at all</em> during the execution
 * of the stream pipeline.  (Some data sources, such as concurrent collections, are
 * specifically designed to handle concurrent modification.)
 *
 * <p>Accordingly, lambda expressions (or other objects implementing the appropriate functional
 * interface) passed to stream methods should never modify the stream's data source.  An
 * implementation is said to <em>interfere</em> with the data source if it modifies, or causes
 * to be modified, the stream's data source.  The need for non-interference applies to all
 * pipelines, not just parallel ones.  Unless the stream source is concurrent, modifying a
 * stream's data source during execution of a stream pipeline can cause exceptions, incorrect
 * answers, or nonconformant results.
 *
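 * <p>To make the rule concrete, here is a hedged sketch (the {@code names} list
 * and the {@code extras} list are hypothetical).  The commented-out pipeline
 * interferes with its source, because the lambda modifies the list being
 * traversed; with an {@code ArrayList} source this typically surfaces as a
 * {@code ConcurrentModificationException}, although the exact failure is not
 * guaranteed.  The second pipeline leaves the source alone and applies the
 * change only after the terminal operation has completed:
 * <pre>{@code
 *     List<String> names = new ArrayList<>(Arrays.asList("ann", "bob"));
 *
 *     // Interferes: the source is modified while the pipeline is executing.
 *     // names.stream().forEach(n -> names.add(n.toUpperCase()));
 *
 *     // Does not interfere: results are gathered into a separate list first.
 *     List<String> extras = names.stream()
 *                                .map(String::toUpperCase)
 *                                .collect(Collectors.toList());
 *     names.addAll(extras);   // names is now [ann, bob, ANN, BOB]
 * }</pre>
 *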
 * <p>Further, results may be nondeterministic or incorrect if the lambda expressions passed to
 * stream operations are <em>stateful</em>.  A stateful lambda (or other object implementing the
 * appropriate functional interface) is one whose result depends on any state which might change
 * during the execution of the stream pipeline.  An example of a stateful lambda is:
 * <pre>{@code
 *     Set<Integer> seen = Collections.synchronizedSet(new HashSet<>());
 *     stream.parallel().map(e -> { if (seen.add(e)) return 0; else return e; })...
 * }</pre>
 * Here, if the mapping operation is performed in parallel, the results for the same input
 * could vary from run to run, due to thread scheduling differences, whereas with a stateless
 * lambda expression, the results would always be the same.
 *
 * <h3>Side-effects</h3>
 *
 * <h2><a name="Reduction">Reduction operations</a></h2>
 *
 * A <em>reduction</em> operation takes a stream of elements and processes them in a way
 * that reduces to a single value or summary description, such as finding the sum or maximum
 * of a set of numbers.  (In more complex scenarios, the reduction operation might need to
 * extract data from the elements before reducing that data to a single value, such as
 * finding the sum of weights of a set of blocks.  This would require extracting the weight
 * from each block before summing up the weights.)
 *
 * <p>Of course, such operations can be readily implemented as simple sequential loops, as in:
 * <pre>{@code
 *    int sum = 0;
 *    for (int x : numbers) {
 *       sum += x;
 *    }
 * }</pre>
 * However, there may be a significant advantage to preferring a {@link java.util.stream.Stream#reduce reduce operation}
 * over a mutative accumulation such as the above -- a properly constructed reduce operation is
 * inherently parallelizable so long as the
 * {@link java.util.function.BinaryOperator reduction operator}
 * has the right characteristics. Specifically, the operator must be
 * <a href="#Associativity">associative</a>.  For example, given a
 * stream of numbers for which we want to find the sum, we can write:
 * <pre>{@code
 *    int sum = numbers.reduce(0, (x,y) -> x+y);
 * }</pre>
 * or more succinctly:
 * <pre>{@code
 *    int sum = numbers.reduce(0, Integer::sum);
 * }</pre>
 *
 * <p>(The primitive specializations of {@link java.util.stream.Stream}, such as
 * {@link java.util.stream.IntStream}, even have convenience methods for common reductions,
 * such as {@link java.util.stream.IntStream#sum() sum} and {@link java.util.stream.IntStream#max() max},
 * which are implemented as simple wrappers around reduce.)
 *
 * <p>Reduction parallelizes well since the implementation of {@code reduce} can operate on
 * subsets of the stream in parallel, and then combine the intermediate results to get the final
 * correct answer.  Even if you were to use a parallelizable form of the
 * {@link java.util.stream.Stream#forEach(Consumer) forEach()} method
 * in place of the original for-each loop above, you would still have to provide thread-safe
 * updates to the shared accumulating variable {@code sum}, and the required synchronization
 * would likely eliminate any performance gain from parallelism. Using a {@code reduce} method
 * instead removes all of the burden of parallelizing the reduction operation, and the library
 * can provide an efficient parallel implementation with no additional synchronization needed.
 *
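 * <p>The contrast can be sketched as follows.  This is an illustration only: the
 * {@code numbers} list is a hypothetical input, and
 * {@code java.util.concurrent.atomic.AtomicInteger} is just one way to make the
 * shared update thread-safe.  Both forms compute the same sum, but only the first
 * relies on a shared, contended accumulator:
 * <pre>{@code
 *     List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
 *
 *     // forEach with a shared accumulator: every element contends on one variable.
 *     AtomicInteger shared = new AtomicInteger();
 *     numbers.parallelStream().forEach(shared::addAndGet);
 *     int sumViaForEach = shared.get();
 *
 *     // reduce: partial sums are computed independently and then combined.
 *     int sumViaReduce = numbers.parallelStream().reduce(0, Integer::sum);
 *
 *     // both sumViaForEach and sumViaReduce are 15
 * }</pre>
 *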
 * <p>The "blocks" example shown earlier illustrates how reduction combines with other operations
 * to replace for loops with bulk operations.  If {@code blocks} is a collection of {@code Block}
 * objects, which have a {@code getWeight} method, we can find the weight of the heaviest block with:
 * <pre>{@code
 *     OptionalInt heaviest = blocks.stream()
 *                                  .mapToInt(Block::getWeight)
 *                                  .reduce(Integer::max);
 * }</pre>
 *
 * <p>In its more general form, a {@code reduce} operation on elements of type {@code <T>}
 * yielding a result of type {@code <U>} requires three parameters:
 * <pre>{@code
 * <U> U reduce(U identity,
 *              BiFunction<U, ? super T, U> accumulator,
 *              BinaryOperator<U> combiner);
 * }</pre>
 * Here, the <em>identity</em> element is both an initial seed for the reduction, and a default
 * result if there are no elements. The <em>accumulator</em> function takes a partial result and
 * the next element, and produces a new partial result. The <em>combiner</em> function combines
 * the partial results of two accumulators to produce a new partial result, and eventually the
 * final result.
 *
 * <p>This form is a generalization of the two-argument form, and is also a generalization of
 * the map-reduce construct illustrated above.  If we wanted to re-cast the simple {@code sum}
 * example using the more general form, {@code 0} would be the identity element, while
 * {@code Integer::sum} would be both the accumulator and combiner. For the sum-of-weights
 * example, this could be re-cast as:
 * <pre>{@code
 *     int sumOfWeights = blocks.stream().reduce(0,
 *                                               (sum, b) -> sum + b.getWeight(),
 *                                               Integer::sum);
 * }</pre>
 * though the map-reduce form is more readable and generally preferable.  The generalized form
 * is provided for cases where significant work can be optimized away by combining mapping and
 * reducing into a single function.
 *
 * <p>More formally, the {@code identity} value must be an <em>identity</em> for the combiner
 * function. This means that for all {@code u}, {@code combiner.apply(identity, u)} is equal
 * to {@code u}. Additionally, the {@code combiner} function must be
 * <a href="#Associativity">associative</a> and must be compatible with the {@code accumulator}
 * function; for all {@code u} and {@code t}, the following must hold:
 * <pre>{@code
 *     combiner.apply(u, accumulator.apply(identity, t)) == accumulator.apply(u, t)
 * }</pre>
 *
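 * <p>These conditions can be spot-checked for a concrete case.  The sketch below
 * uses hypothetical functions that sum string lengths; passing checks for a few
 * sample values illustrates the rules but does not prove them for all inputs:
 * <pre>{@code
 *     Integer identity = 0;
 *     BiFunction<Integer, String, Integer> accumulator = (sum, s) -> sum + s.length();
 *     BinaryOperator<Integer> combiner = Integer::sum;
 *
 *     Integer u = 7;
 *     String t = "abc";
 *
 *     // identity rule: combining with the identity leaves u unchanged
 *     boolean identityHolds = combiner.apply(identity, u).equals(u);
 *
 *     // compatibility rule: folding t in directly equals combining with a
 *     // partial result built from the identity
 *     boolean compatible = combiner.apply(u, accumulator.apply(identity, t))
 *                                  .equals(accumulator.apply(u, t));
 *
 *     // both identityHolds and compatible are true for these values
 * }</pre>
 *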
 * <h3><a name="MutableReduction">Mutable Reduction</a></h3>
 *
 * A <em>mutable</em> reduction operation is similar to an ordinary reduction, in that it reduces
 * a stream of values to a single value, but instead of producing a distinct single-valued result, it
 * mutates a general <em>result container</em>, such as a {@code Collection} or {@code StringBuilder},
 * as it processes the elements in the stream.
 *
 * <p>For example, if we wanted to take a stream of strings and concatenate them into a single
 * long string, we <em>could</em> achieve this with ordinary reduction:
 * <pre>{@code
 *     String concatenated = strings.reduce("", String::concat);
 * }</pre>
 *
 * We would get the desired result, and it would even work in parallel.  However, we might not
 * be happy about the performance!  Such an implementation would do a great deal of string
 * copying, and the run time would be <em>O(n^2)</em> in the number of elements.  A more
 * performant approach would be to accumulate the results into a {@link java.lang.StringBuilder}, which
 * is a mutable container for accumulating strings.  We can use the same technique to
 * parallelize mutable reduction as we do with ordinary reduction.
 *
 * <p>The mutable reduction operation is called {@link java.util.stream.Stream#collect(Collector) collect()}, as it
 * collects together the desired results into a result container such as {@code StringBuilder}.
 * A {@code collect} operation requires three things: a factory function which will construct
 * new instances of the result container, an accumulating function that will update a result
 * container by incorporating a new element, and a combining function that can take two
 * result containers and merge their contents.  The form of this is very similar to the general
 * form of ordinary reduction:
 * <pre>{@code
 * <R> R collect(Supplier<R> resultFactory,
 *               BiConsumer<R, ? super T> accumulator,
 *               BiConsumer<R, R> combiner);
 * }</pre>
 * As with {@code reduce()}, the benefit of expressing {@code collect} in this abstract way is
 * that it is directly amenable to parallelization: we can accumulate partial results in parallel
 * and then combine them.  For example, to collect the String representations of the elements
 * in a stream into an {@code ArrayList}, we could write the obvious sequential for-each form:
 * <pre>{@code
 *     ArrayList<String> strings = new ArrayList<>();
 *     for (T element : stream) {
 *         strings.add(element.toString());
 *     }
 * }</pre>
 * Or we could use a parallelizable collect form:
 * <pre>{@code
 *     ArrayList<String> strings = stream.collect(() -> new ArrayList<>(),
 *                                                (c, e) -> c.add(e.toString()),
 *                                                (c1, c2) -> c1.addAll(c2));
 * }</pre>
 * or, noting that we have buried a mapping operation inside the accumulator function, more
 * succinctly as:
 * <pre>{@code
 *     ArrayList<String> strings = stream.map(Object::toString)
 *                                       .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
 * }</pre>
 * Here, our supplier is just the {@link java.util.ArrayList#ArrayList() ArrayList constructor}, the
 * accumulator adds the stringified element to an {@code ArrayList}, and the combiner simply
 * uses {@link java.util.ArrayList#addAll addAll} to copy the strings from one container into the other.
 *
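 * <p>The same three-function form also covers the string concatenation case
 * discussed earlier.  As a hedged sketch (the {@code strings} list is a
 * hypothetical input), a {@code StringBuilder} can serve as the result container:
 * <pre>{@code
 *     List<String> strings = Arrays.asList("a", "b", "c");
 *     String concatenated = strings.stream()
 *                                  .collect(StringBuilder::new,
 *                                           StringBuilder::append,
 *                                           StringBuilder::append)
 *                                  .toString();
 *     // concatenated is "abc"
 * }</pre>
 * Each container is appended to in place, which avoids the repeated copying of
 * intermediate strings that the {@code String::concat} reduction incurs.
 *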
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   390
 * <p>As with the regular reduction operation, the ability to parallelize only comes if an
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   391
 * <a href="package-summary.html#Associativity">associativity</a> condition is met. The {@code combiner} is associative
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   392
 * if for result containers {@code r1}, {@code r2}, and {@code r3}:
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   393
 * <pre>{@code
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   394
 *    combiner.accept(r1, r2);
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   395
 *    combiner.accept(r1, r3);
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   396
 * }</pre>
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   397
 * is equivalent to
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   398
 * <pre>{@code
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   399
 *    combiner.accept(r2, r3);
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   400
 *    combiner.accept(r1, r2);
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   401
 * }</pre>
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   402
 * where equivalence means that {@code r1} is left in the same state (according to the meaning
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   403
 * of {@link java.lang.Object#equals equals} for the element types). Similarly, the {@code resultFactory}
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   404
 * must act as an <em>identity</em> with respect to the {@code combiner} so that for any result
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
diff changeset
   405
 * container {@code r}:
87067e3340d3 8008682: Inital Streams public API
briangoetz
parents:
 * <pre>{@code
 *     combiner.accept(r, resultFactory.get());
 * }</pre>
 * does not modify the state of {@code r} (again according to the meaning of
 * {@link java.lang.Object#equals equals}). Finally, the {@code accumulator} and {@code combiner} must be
 * compatible such that for a result container {@code r} and element {@code t}:
 * <pre>{@code
 *    r2 = resultFactory.get();
 *    accumulator.accept(r2, t);
 *    combiner.accept(r, r2);
 * }</pre>
 * is equivalent to:
 * <pre>{@code
 *    accumulator.accept(r,t);
 * }</pre>
 * where equivalence means that {@code r} is left in the same state (again according to the
 * meaning of {@link java.lang.Object#equals equals}).
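 *
 * <p>The two rules above can be spot-checked for a concrete container. The following is only a
 * minimal sketch: the class name and the {@code isCompatible} and {@code check} helpers are
 * hypothetical and not part of this package, and a passing check for one sample state does not
 * prove the rules in general. It compares an appending combiner with a prepending one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

public class CollectCompatibilityCheck {
    // Hypothetical helper (not part of java.util.stream): checks both rules
    // for one sample container state and one element.
    static <T, R> boolean isCompatible(Supplier<R> resultFactory,
                                       BiConsumer<R, T> accumulator,
                                       BiConsumer<R, R> combiner,
                                       Supplier<R> sample, T t) {
        // Rule 1: combining with a fresh, empty container must not change r.
        R r = sample.get();
        combiner.accept(r, resultFactory.get());
        boolean emptyIsNeutral = r.equals(sample.get());

        // Rule 2: accumulate into a fresh container, then combine ...
        R viaCombiner = sample.get();
        R r2 = resultFactory.get();
        accumulator.accept(r2, t);
        combiner.accept(viaCombiner, r2);

        // ... must leave r in the same state as accumulating directly.
        R direct = sample.get();
        accumulator.accept(direct, t);

        return emptyIsNeutral && viaCombiner.equals(direct);
    }

    // Runs the check for a List<String> container with the given combiner.
    static boolean check(BiConsumer<List<String>, List<String>> combiner) {
        Supplier<List<String>> resultFactory = ArrayList::new;
        BiConsumer<List<String>, String> accumulator = List::add;
        Supplier<List<String>> sample = () -> new ArrayList<>(Arrays.asList("a", "b"));
        return isCompatible(resultFactory, accumulator, combiner, sample, "c");
    }

    public static void main(String[] args) {
        System.out.println(check(List::addAll));             // appending combiner: true
        System.out.println(check((a, b) -> a.addAll(0, b))); // prepending combiner: false
    }
}
```

 * The prepending combiner fails the second rule because it yields {@code [c, a, b]} where direct
 * accumulation yields {@code [a, b, c]}.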
 *
 * <p>The three aspects of {@code collect} (supplier, accumulator, and combiner) are often very
 * tightly coupled, so it is convenient to introduce the notion of a {@link java.util.stream.Collector} as
 * an object that embodies all three aspects. There is a {@link java.util.stream.Stream#collect(Collector) collect}
 * method that simply takes a {@code Collector} and returns the resulting container.
 * The above example for collecting strings into a {@code List} can be rewritten using a
 * standard {@code Collector} as:
 * <pre>{@code
 *     List<String> strings = stream.map(Object::toString)
 *                                  .collect(Collectors.toList());
 * }</pre>
 *
 * <h3><a name="ConcurrentReduction">Reduction, Concurrency, and Ordering</a></h3>
 *
 * With some complex reduction operations, for example a collect that produces a
 * {@code Map}, such as:
 * <pre>{@code
 *     Map<Buyer, List<Transaction>> salesByBuyer
 *         = txns.parallelStream()
 *               .collect(Collectors.groupingBy(Transaction::getBuyer));
 * }</pre>
 * (where {@link java.util.stream.Collectors#groupingBy} is a utility function
 * that returns a {@link java.util.stream.Collector} for grouping sets of elements based on some key)
 * it may actually be counterproductive to perform the operation in parallel.
 * This is because the combining step (merging one {@code Map} into another by key)
 * can be expensive for some {@code Map} implementations.
 *
 * <p>Suppose, however, that the result container used in this reduction
 * was a concurrently modifiable collection -- such as a
 * {@link java.util.concurrent.ConcurrentHashMap ConcurrentHashMap}. In that case,
 * the parallel invocations of the accumulator could actually deposit their results
 * concurrently into the same shared result container, eliminating the need for the combiner to
 * merge distinct result containers. This potentially provides a boost
 * to the parallel execution performance. We call this a <em>concurrent</em> reduction.
 *
 * <p>A {@link java.util.stream.Collector} that supports concurrent reduction is marked with the
 * {@link java.util.stream.Collector.Characteristics#CONCURRENT} characteristic.
 * Having a concurrent collector is a necessary condition for performing a
 * concurrent reduction, but that alone is not sufficient. If you imagine multiple
 * accumulators depositing results into a shared container, the order in which
 * results are deposited is non-deterministic. Consequently, a concurrent reduction
 * is only possible if ordering is not important for the stream being processed.
 * The {@link java.util.stream.Stream#collect(Collector)}
 * implementation will only perform a concurrent reduction if all of the following hold:
 * <ul>
 * <li>The stream is parallel;</li>
 * <li>The collector has the
 * {@link java.util.stream.Collector.Characteristics#CONCURRENT} characteristic;
 * and</li>
 * <li>Either the stream is unordered, or the collector has the
 * {@link java.util.stream.Collector.Characteristics#UNORDERED} characteristic.</li>
 * </ul>
 * For example:
 * <pre>{@code
 *     Map<Buyer, List<Transaction>> salesByBuyer
 *         = txns.parallelStream()
 *               .unordered()
 *               .collect(Collectors.groupingByConcurrent(Transaction::getBuyer));
 * }</pre>
 * (where {@link java.util.stream.Collectors#groupingByConcurrent} is the concurrent companion
 * to {@code groupingBy}).
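 *
 * <p>A self-contained variant of the example above might look as follows. This is a sketch under
 * stated assumptions: {@code Txn} is a hypothetical stand-in for {@code Transaction}, buyers are
 * plain strings, and the sample data is invented for illustration. It also queries the collector's
 * characteristics to confirm that it advertises {@code CONCURRENT}.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ConcurrentGroupingSketch {
    // Hypothetical stand-in for the Transaction type used in the text.
    static final class Txn {
        final String buyer;
        final int amount;
        Txn(String buyer, int amount) { this.buyer = buyer; this.amount = amount; }
        String getBuyer() { return buyer; }
    }

    // Parallel, unordered stream plus a concurrent collector: all accumulator
    // calls may deposit into one shared ConcurrentMap.
    static ConcurrentMap<String, List<Txn>> salesByBuyer(List<Txn> txns) {
        return txns.parallelStream()
                   .unordered()
                   .collect(Collectors.groupingByConcurrent(Txn::getBuyer));
    }

    // Number of transactions grouped under the given buyer (sample data only).
    static int countFor(String buyer) {
        List<Txn> txns = Arrays.asList(
                new Txn("alice", 10), new Txn("bob", 5), new Txn("alice", 7));
        return salesByBuyer(txns).get(buyer).size();
    }

    // Reports whether the collector carries the CONCURRENT characteristic.
    static boolean isConcurrentCollector() {
        Collector<Txn, ?, ConcurrentMap<String, List<Txn>>> collector =
                Collectors.groupingByConcurrent(Txn::getBuyer);
        return collector.characteristics().contains(Collector.Characteristics.CONCURRENT);
    }

    public static void main(String[] args) {
        System.out.println(countFor("alice"));
        System.out.println(isConcurrentCollector());
    }
}
```

 * Only the group sizes are predictable here; the order of transactions inside each list is not.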
 *
 * <p>Note that if it is important that the elements for a given key appear in the
 * order they appear in the source, then we cannot use a concurrent reduction,
 * as ordering is one of the casualties of concurrent insertion.  We would then
 * be constrained to implement either a sequential reduction or a merge-based
 * parallel reduction.
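 *
 * <p>For contrast, a merge-based parallel reduction with the non-concurrent {@code groupingBy}
 * keeps the elements of each key in encounter order when the source is ordered. The sketch below
 * groups integers by parity; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class OrderedGroupingSketch {
    // Groups 1..n by parity on a parallel, ordered stream and returns the evens.
    static List<Integer> evens(int n) {
        List<Integer> source = IntStream.rangeClosed(1, n)
                                        .boxed()
                                        .collect(Collectors.toList());
        Map<Boolean, List<Integer>> byParity = source.parallelStream()
                .collect(Collectors.groupingBy(i -> i % 2 == 0));
        return byParity.get(true);
    }

    public static void main(String[] args) {
        System.out.println(evens(10)); // [2, 4, 6, 8, 10]
    }
}
```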
 *
 * <h2><a name="Associativity">Associativity</a></h2>
 *
 * An operator or function {@code op} is <em>associative</em> if the following holds:
 * <pre>{@code
 *     (a op b) op c == a op (b op c)
 * }</pre>
 * The importance of this to parallel evaluation can be seen if we expand this to four terms:
 * <pre>{@code
 *     a op b op c op d == (a op b) op (c op d)
 * }</pre>
 * So we can evaluate {@code (a op b)} in parallel with {@code (c op d)} and then invoke {@code op} on
 * the results.
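 *
 * <p>As a small illustration, addition satisfies this law while subtraction does not, which is why
 * only the former is a safe argument to a parallel {@code reduce}. The helper names below are
 * hypothetical, and checking one triple of values illustrates the law without proving it.

```java
import java.util.function.IntBinaryOperator;
import java.util.stream.IntStream;

public class AssociativitySketch {
    // Tests (a op b) op c == a op (b op c) for one triple of values.
    static boolean associativeFor(IntBinaryOperator op, int a, int b, int c) {
        return op.applyAsInt(op.applyAsInt(a, b), c)
            == op.applyAsInt(a, op.applyAsInt(b, c));
    }

    // Sums 1..n on a parallel stream; the grouping of partial sums is left
    // to the implementation, which is safe because addition is associative.
    static int parallelSum(int n) {
        return IntStream.rangeClosed(1, n).parallel().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(associativeFor(Integer::sum, 1, 2, 3));    // true
        System.out.println(associativeFor((x, y) -> x - y, 1, 2, 3)); // false: -4 vs 2
        System.out.println(parallelSum(100));                         // 5050
    }
}
```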
 * TODO what does associative mean for mutative combining functions?
 * FIXME: we described mutative associativity above.
 *
 * <h2><a name="StreamSources">Stream sources</a></h2>
 * TODO where does this section go?
 *
 * XXX - change this section to cover stream construction, gradually introducing more
 *       complex ways to construct
 *     - construction from Collection
 *     - construction from Iterator
 *     - construction from array
 *     - construction from generators
 *     - construction from spliterator
 *
 * XXX - the following is a quite low-level but important aspect of stream construction
 *
 * <p>A pipeline is initially constructed from a spliterator (see {@link java.util.Spliterator}) supplied by a stream source.
 * The spliterator covers elements of the source and provides element traversal operations
 * for a possibly-parallel computation.  See methods on {@link java.util.stream.Streams} for construction
 * of pipelines using spliterators.
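 *
 * <p>A minimal sketch of building a pipeline directly from a spliterator follows. It assumes the
 * {@code java.util.stream.StreamSupport} class of released JDK 8 builds rather than the
 * {@code Streams} class referenced above; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

public class SpliteratorSourceSketch {
    // Builds a pipeline over the given spliterator and upper-cases each element.
    static List<String> upperCase(Spliterator<String> spliterator, boolean parallel) {
        return StreamSupport.stream(spliterator, parallel)
                            .map(String::toUpperCase)
                            .collect(Collectors.toList());
    }

    static List<String> demo() {
        return upperCase(Arrays.asList("a", "b", "c").spliterator(), false);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [A, B, C]
    }
}
```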
 *
 * <p>A source may directly supply a spliterator.  If so, the spliterator is traversed, split, or queried
 * for estimated size after, and never before, the terminal operation commences. It is strongly recommended
 * that the spliterator report a characteristic of {@code IMMUTABLE} or {@code CONCURRENT}, or be
 * <em>late-binding</em> and not bind to the elements it covers until traversed, split or queried for
 * estimated size.
 *
 * <p>If a source cannot directly supply a recommended spliterator then it may indirectly supply a spliterator
 * using a {@code Supplier}.  The spliterator is obtained from the supplier after, and never before, the terminal
 * operation of the stream pipeline commences.
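 *
 * <p>The supplier-based route can be sketched as below, again assuming the
 * {@code StreamSupport.stream(Supplier, int, boolean)} overload of released JDK 8 builds. The
 * class and method names are hypothetical. Because the supplier is only invoked once the terminal
 * operation starts, an element added to the source beforehand is still seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class SuppliedSpliteratorSketch {
    static String joinAfterLateModification() {
        List<String> source = new ArrayList<>(Arrays.asList("one", "two"));

        // The supplier is not invoked here, only when the terminal operation starts.
        Supplier<Spliterator<String>> supplier = source::spliterator;

        // The declared characteristics must match those of the supplied spliterator.
        int characteristics = Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED;
        Stream<String> stream = StreamSupport.stream(supplier, characteristics, false);

        source.add("three"); // modification before the terminal operation
        return stream.collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(joinAfterLateModification()); // one two three
    }
}
```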
 *
 * <p>Such requirements significantly reduce the scope of potential interference to the interval starting
 * with the commencement of the terminal operation and ending with the production of a result or side-effect.  See
 * <a href="package-summary.html#Non-Interference">Non-Interference</a> for
 * more details.
 *
 * XXX - move the following to the non-interference section
 *
 * <p>A source can be modified before the terminal operation commences and those modifications will be reflected in
 * the covered elements.  Afterwards, and depending on the properties of the source, further modifications
 * might not be reflected and a {@code ConcurrentModificationException} may be thrown.
 *
 * <p>For example, consider the following code:
 * <pre>{@code
 *     List<String> l = new ArrayList<>(Arrays.asList("one", "two"));
 *     Stream<String> sl = l.stream();
 *     l.add("three");
 *     String s = sl.collect(toStringJoiner(" ")).toString();
 * }</pre>
 * First a list is created consisting of two strings: "one" and "two". Then a stream is created from that list.
 * Next the list is modified by adding a third string: "three".  Finally the elements of the stream are collected
 * and joined together.  Since the list was modified before the terminal {@code collect} operation commenced,
 * the result will be the string "one two three". However, if the list is modified after the terminal operation
 * commences, as in:
 * <pre>{@code
 *     List<String> l = new ArrayList<>(Arrays.asList("one", "two"));
 *     Stream<String> sl = l.stream();
 *     String s = sl.peek(e -> l.add("BAD LAMBDA")).collect(toStringJoiner(" ")).toString();
 * }</pre>
 * then a {@code ConcurrentModificationException} will be thrown since the {@code peek} operation will attempt
 * to add the string "BAD LAMBDA" to the list after the terminal operation has commenced.
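 *
 * <p>Both cases can be tried with the sketch below. It substitutes {@code Collectors.joining},
 * the joining collector of released JDK 8 builds, for {@code toStringJoiner}; that substitution
 * and the class and method names are assumptions made for illustration. Fail-fast detection is
 * best-effort in general; the second method relies on the behavior of {@code ArrayList}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InterferenceSketch {
    // Modification before the terminal operation: reflected in the result.
    static String modifyBeforeTerminal() {
        List<String> l = new ArrayList<>(Arrays.asList("one", "two"));
        Stream<String> sl = l.stream();
        l.add("three");
        return sl.collect(Collectors.joining(" "));
    }

    // Modification during the terminal operation: ArrayList detects the
    // interference and the traversal fails.
    static boolean modifyDuringTerminalThrows() {
        List<String> l = new ArrayList<>(Arrays.asList("one", "two"));
        try {
            l.stream().peek(e -> l.add("BAD LAMBDA")).collect(Collectors.joining(" "));
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modifyBeforeTerminal());       // one two three
        System.out.println(modifyDuringTerminalThrows()); // true
    }
}
```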
 */

package java.util.stream;