I'm not familiar with all the stuff you refer to, but notice that in the y==2 case, in the first bit of code, x is not written to at all (or read, for that matter). In the second bit of code, it is written twice. This is more of a difference than just writing once vs. writing twice (at least, it is in existing threading models such as pthreads). Also, storing a value which would not otherwise be stored at all is more of a difference than just storing once vs. storing twice. For both these reasons, you don't want compilers just replacing a no-op with tmp = x; x = 17; x = tmp;
.
Suppose thread A wants to assume that no other thread modifies x. It's reasonable to want it to be allowed to expect that if y is 2, and it writes a value to x, then reads it back, it will get back the value it has written. But if thread B is concurrently executing your second bit of code, then thread A could write to x and later read it, and get back the original value, because thread B saved "before" the write and restored "after" it. Or it could get back 17, because thread B stored 17 "after" the write, and stored tmp back again "after" thread A reads. Thread A can do whatever synchronisation it likes, and it won't help, because thread B isn't synchronised. The reason it's not synchronised (in the y==2 case) is that it's not using x. So the concept of whether a particular bit of code "uses x" is important to the threading model, which means compilers can't be allowed to change code to use x when it "shouldn't".
In short, if the transformation you propose were allowed, introducing a spurious write, then it would never be possible to analyse a bit of code and conclude that it does not modify x (or any other memory location). There are a number of convenient idioms which would therefore be impossible, such as sharing immutable data between threads without synchronisation.
So, although I'm not familiar with C++0x's definition of "data race", I assume that it includes some conditions where programmers are allowed to assume that an object is not written to, and that this transformation would violate those conditions. I speculate that if y==2, then your original code, together with concurrent code: x = 42; x = 1; z = x
in another thread, is not defined to be a data race. Or at least if it is a data race, it's not one which permits z to end up with value either 17, or 42.
Consider that in this program, the value 2 in y might be used to indicate, "there are other threads running: don't modify x, because we aren't synchronised here, so that would introduce a data race". Perhaps the reason there's no synchronisation at all, is that in all other cases of y, there are no other threads running with access to x. It seems reasonable to me that C++0x would want to support code like this:
if (single_threaded) {
x = 17;
} else {
sendMessageThatSafelySetsXTo(17);
}
Clearly then, you don't want that transformed to:
tmp = x;
x = 17;
if (!single_threaded) {
x = tmp;
sendMessageThatSafelySetsXTo(17);
}
Which is basically the same transformation as in your example, but with only 2 cases, instead of there being enough to make it look like a good code-size optimisation.