
java - Does the JVM have the ability to detect opportunities for parallelization?

The Java HotSpot VM can optimize sequential code very well. But with the advent of multi-core computers, could the information available at runtime be used to detect opportunities to parallelize code on the fly, for example to detect that software pipelining is possible in a loop, and similar things?

Has any interesting work ever been done on this topic? Or has it turned out to be a research dead end, or a halting-problem-like issue that is simply too hard to solve?


1 Answer


I think the current guarantees of the Java memory model make it quite hard to do much, if any, automatic parallelization at the compiler or VM level. The Java language has no semantics to guarantee that any data structure is even effectively immutable, or that any particular statement is pure and free of side-effects, so the compiler would have to figure these out automatically in order to parallelize. Some elementary opportunities would be possible to infer in the compiler, but the general case would be left to the runtime, since dynamic loading and binding could introduce new mutations that didn't exist at compile-time.

Consider the following code:

for (int i = 0; i < array.length; i++) {
    array[i] = expensiveComputation(array[i]);
}

This would be trivial to parallelize if expensiveComputation were a pure function whose output depends only on its argument, and if we could guarantee that array wasn't changed during the loop. (Strictly speaking we are changing it by assigning array[i] = ..., but in this particular case expensiveComputation(array[i]) always reads the element before it is overwritten, so it's fine here, assuming array is local and not referenced from anywhere else.)
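
For illustration, here is a minimal sketch of what the explicitly parallelized version could look like using Java's parallel streams, assuming expensiveComputation really is pure and array is confined to this thread (both assumptions that an auto-parallelizer would have to prove, not just hope for):

import java.util.stream.IntStream;

// Each index is written by exactly one task, and expensiveComputation
// only reads the element it is about to replace, so this is race-free
// under the stated assumptions.
IntStream.range(0, array.length)
         .parallel()
         .forEach(i -> array[i] = expensiveComputation(array[i]));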

Furthermore, if we change the loop like this:

for (int i = 0; i < array.length; i++) {
    array[i] = expensiveComputation(array, i);
    // expensiveComputation has the whole array at its disposal!
    // It could read or write values anywhere in it!
}

then parallelization is not trivial any more even if expensiveComputation is pure and doesn't alter its argument, because the parallel threads would be changing the contents of array while others are reading it! The parallelizer would have to figure out which parts of the array expensiveComputation is referring to under various conditions, and synchronize accordingly.
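
To make that concrete, here is a hypothetical variant of expensiveComputation that merely reads a neighbouring element. It writes nothing itself, yet the loop above still can't be blindly parallelized: whether a call sees the neighbour's old or already-updated value now depends on thread scheduling, so the result becomes nondeterministic.

// Hypothetical: pure in the sense of writing nothing, but it reads
// array[(i + 1) % array.length]. Sequentially the outcome is fixed by
// the loop order; in parallel, another task may have overwritten the
// neighbour before we read it - a classic race on the result.
static int expensiveComputation(int[] array, int i) {
    return array[i] + array[(i + 1) % array.length];
}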

Perhaps it wouldn't be outright impossible to detect all the mutations and side effects that may be going on and take them into account when parallelizing, but it would be very hard, and probably infeasible in practice. This is why parallelization, and verifying that everything still works correctly, is the programmer's headache in Java.

Functional languages (e.g. Clojure on the JVM) are one promising answer to this problem. Pure, side-effect-free functions together with persistent ("effectively immutable") data structures potentially allow implicit or almost implicit parallelization. Let's double each element of a vector:

(map #(* 2 %) [1 2 3 4 5])
(pmap #(* 2 %) [1 2 3 4 5])  ; The same thing, done in parallel.

This is transparent because of two things:

  1. The function #(* 2 %) is pure: it takes a value in and gives a value out, and that's it. It doesn't change anything, and its output depends only on its argument.
  2. The vector [1 2 3 4 5] is immutable: no matter who's looking at it, or when, it's the same.

It's possible to write pure functions in Java, but 2), immutability, is the Achilles' heel here. There are no immutable arrays in Java. To be pedantic, nothing is truly immutable in Java, because even final fields can be changed using reflection. Therefore no guarantee can be made that the output (or input!) of a computation won't be changed by parallelization, so automatic parallelization is generally infeasible.
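
As a small, concrete illustration of the array point: final in Java freezes the reference, never the contents, so even a final array stays mutable:

final int[] xs = {1, 2, 3};
xs[0] = 42;          // legal: 'final' only forbids reassigning xs itself
// xs = new int[3];  // this, by contrast, would not compile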

The dumb "doubling elements" example extends to arbitrarily complex processing, thanks to immutability:

(defn expensivefunction [v x]
  (/ (reduce * v) x))


(let [v [1 2 3 4 5]]
  (map (partial expensivefunction v) v)) ; pmap would work equally well here!
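
For comparison, a rough Java analogue of the same computation is sketched below, using the structurally immutable List.of (Java 9+) and a parallel stream. The crucial difference: here the programmer, not the VM, is vouching that the lambda is pure and that the list won't change.

import java.util.List;
import java.util.stream.Collectors;

List<Integer> v = List.of(1, 2, 3, 4, 5);            // immutable list
int product = v.stream().reduce(1, (a, b) -> a * b); // 120
List<Integer> result = v.parallelStream()             // like pmap
        .map(x -> product / x)
        .collect(Collectors.toList());
// result: [120, 60, 40, 30, 24]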
