parallel processing - Visualization of Java Stream parallelization

Question

Welcome To Ask or Share your Answers For Others

parallel processing - Visualization of Java Stream parallelization

asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

parallel processing - Visualization of Java Stream parallelization

Often it's not very clear how exactly the parallel stream splits the input into chunks and in which order the chunks are joined. Is there any way to visualize the whole procedure for any stream source to better understand what's going on? Suppose I created a stream like this:

Stream<Integer> stream = IntStream.range(0, 100).boxed().parallel();

I want to see some tree-like structure:

             [0..99]
         _____/   \_____
        |               |
     [0..49]         [50..99]
    __/   \__        __/  \__
   |         |      |        |
[0..24]  [25..49] [50..74] [75..99]

Which means that the whole input range [0..99] is split to [0..49] and [50..99] ranges which in turn split further. Of course such diagram should reflect the real work of Stream API, so if I perform some real operation with such stream the splitting should be performed in the same way.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-17T02:51:18+0000

Current Stream API implementation uses collector combiner to combine the intermediate results in exactly the same way as they were previously split. Also the splitting strategy depends on the source and common pool parallelism level, but does not depend on exact reduction operation used (the same for reduce, collect, forEach, count, etc.). Relying on this it's not very difficult to create the visualizing collector:

public static Collector<Object, ?, List<String>> parallelVisualize() {
    class Range {
        private String first, last;
        private Range left, right;

        void accept(Object obj) {
            if (first == null)
                first = obj.toString();
            else
                last = obj.toString();
        }

        Range combine(Range that) {
            Range p = new Range();
            p.first = first == null ? that.first : first;
            p.last = Stream
                    .of(that.last, that.first, this.last, this.first)
                    .filter(Objects::nonNull).findFirst().orElse(null);
            p.left = this;
            p.right = that;
            return p;
        }

        String pad(String s, int left, int len) {
            if (len == s.length())
                return s;
            char[] result = new char[len];
            Arrays.fill(result, ' ');
            s.getChars(0, s.length(), result, left);
            return new String(result);
        }

        public List<String> finish() {
            String cur = toString();
            if (left == null) {
                return Collections.singletonList(cur);
            }
            List<String> l = left.finish();
            List<String> r = right.finish();
            int len1 = l.get(0).length();
            int len2 = r.get(0).length();
            int totalLen = len1 + len2 + 1;
            int leftAdd = 0;
            if (cur.length() < totalLen) {
                cur = pad(cur, (totalLen - cur.length()) / 2, totalLen);
            } else {
                leftAdd = (cur.length() - totalLen) / 2;
                totalLen = cur.length();
            }
            List<String> result = new ArrayList<>();
            result.add(cur);

            char[] dashes = new char[totalLen];
            Arrays.fill(dashes, ' ');
            Arrays.fill(dashes, len1 / 2 + leftAdd + 1, len1 + len2 / 2 + 1
                    + leftAdd, '_');
            int mid = totalLen / 2;
            dashes[mid] = '/';
            dashes[mid + 1] = '\';
            result.add(new String(dashes));

            Arrays.fill(dashes, ' ');
            dashes[len1 / 2 + leftAdd] = '|';
            dashes[len1 + len2 / 2 + 1 + leftAdd] = '|';
            result.add(new String(dashes));

            int maxSize = Math.max(l.size(), r.size());
            for (int i = 0; i < maxSize; i++) {
                String lstr = l.size() > i ? l.get(i) : String.format("%"
                        + len1 + "s", "");
                String rstr = r.size() > i ? r.get(i) : String.format("%"
                        + len2 + "s", "");
                result.add(pad(lstr + " " + rstr, leftAdd, totalLen));
            }
            return result;
        }

        public String toString() {
            if (first == null)
                return "(empty)";
            else if (last == null)
                return "[" + first + "]";
            return "[" + first + ".." + last + "]";
        }
    }
    return Collector.of(Range::new, Range::accept, Range::combine,
            Range::finish);
}

Here's some interesting results obtained with this collector using 4-core machine (results will differ on machine with different number of availableProcessors()).

Splitting of simple range:

IntStream.range(0, 100)
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);

Even split to 16 tasks:

                                                                  [0..99]                                                                   
                                   ___________________________________/\________________________________                                    
                                  |                                                                     |                                   
                              [0..49]                                                               [50..99]                                
                 _________________/\______________                                     _________________/\________________                  
                |                                 |                                   |                                   |                 
            [0..24]                           [25..49]                            [50..74]                            [75..99]              
        ________/\_____                   ________/\_______                   ________/\_______                   ________/\_______         
       |               |                 |                 |                 |                 |                 |                 |        
   [0..11]         [12..24]          [25..36]          [37..49]          [50..61]          [62..74]          [75..86]          [87..99]     
    ___/\_          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___    
   |      |        |        |        |        |        |        |        |        |        |        |        |        |        |        |   
[0..5] [6..11] [12..17] [18..24] [25..30] [31..36] [37..42] [43..49] [50..55] [56..61] [62..67] [68..74] [75..80] [81..86] [87..92] [93..99]

Split of two streams concatenation:

IntStream
        .concat(IntStream.range(0, 10), IntStream.range(10, 100))
        .boxed().parallel().collect(parallelVisualize())
        .forEach(System.out::println);

As you can see, first split un-concatenates the streams:

                                                                           [0..99]                                                                           
       _______________________________________________________________________/\_____                                                                        
      |                                                                              |                                                                       
   [0..9]                                                                        [10..99]                                                                    
    __/\__                                        ___________________________________/\__________________________________                                    
   |      |                                      |                                                                       |                                   
[0..4] [5..9]                                [10..54]                                                                [55..99]                                
                                _________________/\________________                                     _________________/\________________                  
                               |                                   |                                   |                                   |                 
                           [10..31]                            [32..54]                            [55..76]                            [77..99]              
                       ________/\_______                   ________/\_______                   ________/\_______                   ________/\_______         
                      |                 |                 |                 |                 |                 |                 |                 |        
                  [10..20]          [21..31]          [32..42]          [43..54]          [55..65]          [66..76]          [77..87]          [88..99]     
                   ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___          ___/\___    
                  |        |        |        |        |        |        |        |        |        |        |        |        |        |        |        |   
              [10..14] [15..20] [21..25] [26..31] [32..36] [37..42] [43..48] [49..54] [55..59] [60..65] [66..70] [71..76] [77..81] [82..87] [88..93] [94..99]

Split of two stream concatenation where intermediate operation (boxed()) was performed before concatenation:

Stream.concat(IntStream.range(0, 50).boxed().parallel(), IntStream.range(50, 100).boxed())
        .collect(parallelVisualize())
        .forEach(System.out::println);

If one of input streams was not turned into parallel mode before concatenation, it refuses to split at all:

                                   [0..99]                                   
                                   ___/\_________________________________    
                                  |                                      |   
                              [0..49]                                [50..99]
                 _________________/\______________                           
                |                                 |                          
            [0..24]                           [25..49]                       
        ________/\_____                   ________/\_______                  
       |               |                 |                 |                 
   [0..11]         [12..24]          [25..36]          [37..49]              
    ___/\_          ___/\___          ___/\___          ___/\___             
   |      |        |        |        |        |        |        |            
[0..5] [6..11] [12..17] [18..24] [25..30] [31..36] [37..42] [43..49]

Split of flatmapping:

Stream.of(0, 50)
        .flatMap(start -> IntStream.range(start, start+50).boxed().parallel())
        .parallel().collect(parallelVisualize())
        .forEach(System.out::println);

Flat-map never parallelizes inside nested streams:

    [0..99]     
    ____/\__    
   |        |   
[0..49] [50..99]

Stream from unknown-sized iterator of 7000 elements (see this answer for context):

StreamSupport
        .stream(Spliterators.spliteratorUnknownSize(
                IntStream.range(0, 7000).iterator(),
                Spliterator.ORDERED), true)
        .collect(parallelVisualize()).forEach(System.out::println);

The splitting is really bad, everybody waits for biggest part [3072..6143]:

                       [0..6999]                        
     _______________________/\___                       
    |                            |                      
[0..1023]                  [1024..6999]                 
                 ________________/\____                 
                |                      |                
          [1024..3071]           [3072..6999]           
                              _________/\_____          
                             |                |         
                       [3072..6143]     [6144..

Categories

parallel processing - Visualization of Java Stream parallelization

parallel processing - Visualization of Java Stream parallelization

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags