Explanation of the first example snippet
The problem comes into play when performing parallel processing.
//double the even values and put that into a list.
List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 1, 2, 3, 4, 5);
List<Integer> doubleOfEven = new ArrayList<>();
numbers.stream()
.filter(e -> e % 2 == 0)
.map(e -> e * 2)
.forEach(e -> doubleOfEven.add(e)); // <--- Unnecessary use of side-effects!
This snippet uses side-effects unnecessarily. Not all side-effects are bad if used correctly, but when it comes to streams you must provide behaviour that is safe to execute concurrently on different pieces of the input, i.e. code that doesn't access shared mutable data to do its work.
The line:
.forEach(e -> doubleOfEven.add(e)); // Unnecessary use of side-effects!
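unnecessarily uses side-effects: it modifies doubleOfEven from inside the pipeline. When the stream is executed in parallel, several threads call add on the same ArrayList, which is not thread-safe, so the result can be incorrect.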
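To make the failure mode concrete, here is a minimal sketch that runs the same filter/map pipeline in parallel, once with the shared ArrayList and once with collect. The class name SharedMutationDemo, the method names, and the input size of 100,000 are assumptions chosen for illustration; they are not part of the original snippets. The unsafe variant is non-deterministic by nature, so its result can differ from run to run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SharedMutationDemo {

    // Unsafe: worker threads all call add() on the same unsynchronized ArrayList.
    // Returns the number of elements that ended up in the list, or -1 if the
    // concurrent modification made the pipeline fail with an exception.
    static int unsafeCount(int n) {
        List<Integer> doubleOfEven = new ArrayList<>();
        try {
            IntStream.range(0, n).boxed().parallel()
                    .filter(e -> e % 2 == 0)
                    .map(e -> e * 2)
                    .forEach(e -> doubleOfEven.add(e)); // shared mutable state
        } catch (RuntimeException ex) {
            // e.g. ArrayIndexOutOfBoundsException from a racing internal resize
            return -1;
        }
        return doubleOfEven.size();
    }

    // Safe: collect() accumulates into per-thread containers and merges them.
    static int safeCount(int n) {
        List<Integer> doubleOfEven = IntStream.range(0, n).boxed().parallel()
                .filter(e -> e % 2 == 0)
                .map(e -> e * 2)
                .collect(Collectors.toList());
        return doubleOfEven.size();
    }

    public static void main(String[] args) {
        int n = 100_000; // hypothetical input size, large enough to split work
        System.out.println("expected: " + (n / 2));
        System.out.println("unsafe:   " + unsafeCount(n)); // varies between runs
        System.out.println("safe:     " + safeCount(n));
    }
}
```

On a multi-core machine the unsafe line will often report fewer elements than expected, or -1, while the safe line always matches the expected count.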
A while back I read a blog post by Henrik Eichenhardt explaining why shared mutable state is the root of all evil.
Here is a short line of reasoning, extracted from that post, as to why shared mutability is not good:
non-determinism = parallel processing + mutable state
This equation basically means that both parallel processing and
mutable state combined result in non-deterministic program behaviour.
If you just do parallel processing and have only immutable state
everything is fine and it is easy to reason about programs. On the
other hand if you want to do parallel processing with mutable data you
need to synchronize the access to the mutable variables which
essentially renders these sections of the program single threaded. This is not really new but I haven't seen this concept expressed so elegantly. A non-deterministic program is broken.
The blog post goes on to explain in detail why parallel programs without proper synchronization are broken; you can find it at the appended link.
Explanation of the second example snippet
List<Integer> doubleOfEven2 =
numbers.stream()
.filter(e -> e % 2 == 0)
.map(e -> e * 2)
.collect(toList()); // No side-effects!
This performs a collect reduction operation on the elements of the stream using a Collector.
This is much safer, more efficient, and more amenable to parallelization.
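As a rough illustration of why collect is amenable to parallelization, the sketch below runs the second snippet on a parallel stream, and also spells the reduction out with the three-argument form of collect (a supplier, an accumulator, and a combiner). The class name CollectDemo and the method names are assumptions for illustration only. Each worker thread fills its own ArrayList, and the partial lists are merged afterwards, so no list is ever shared between threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CollectDemo {

    // Same pipeline as the second snippet, executed in parallel.
    static List<Integer> doubleOfEven(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(e -> e % 2 == 0)
                .map(e -> e * 2)
                .collect(Collectors.toList());
    }

    // The same reduction written out by hand:
    //   ArrayList::new    creates a fresh container for each chunk of input,
    //   ArrayList::add    puts one element into that chunk's container,
    //   ArrayList::addAll merges two partial containers.
    static List<Integer> doubleOfEvenExplicit(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(e -> e % 2 == 0)
                .map(e -> e * 2)
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 1, 2, 3, 4, 5);
        System.out.println(doubleOfEven(numbers));         // [4, 8, 4, 8]
        System.out.println(doubleOfEvenExplicit(numbers)); // [4, 8, 4, 8]
    }
}
```

Because the source list is ordered, both variants keep the encounter order even when run in parallel.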