About the description of i = ++i + 1
I gather that the subtle explanation is that
(1) the expression ++i
returns an lvalue but +
takes prvalues as operands, so a conversion from lvalue to prvalue must be performed;
Probably, see CWG active issue 1642.
this involves obtaining the
current value of that lvalue (rather than one more than the old value
of i
) and must therefore be sequenced after the side effect from the
increment (i.e., updating i
)
The sequencing here is defined for the increment (indirectly, via +=
, see (a)):
The side effect of ++
(the modification of i
) is sequenced before the value computation of the whole expression ++i
. The latter refers to computing the result of ++i
, not to loading the value of i
.
(2) the LHS of the assignment is also an
lvalue, so its value evaluation does not involve fetching the current
value of i
;
while this value computation is unsequenced w.r.t. the
value computation of the RHS, this poses no problem
I don't think that's properly defined in the Standard, but I'd agree.
(3) the value
computation of the assignment itself involves updating i
(again),
The value computation of i = expr
is only required when you use the result, e.g. int x = (i = expr);
or (i = expr) = 42;
. The value computation itself does not modify i
.
The modification of i
in the expression i = expr
that happens because of the =
is called the side effect of =
. This side effect is sequenced before value computation of i = expr
-- or rather the value computation of i = expr
is sequenced after the side effect of the assignment in i = expr
.
In general, the value computations of the operands of an expression are sequenced before the side effect of that expression, of course.
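To make the difference between the side effect of = and its value computation concrete, here is a minimal sketch. The function name is hypothetical and only serves the illustration. It uses the result of an assignment once as a value and once as an lvalue bound to a reference.

```cpp
#include <cassert>

// Hypothetical helper: checks that the result of a simple assignment
// is an lvalue referring to the left operand, observed after the store.
bool assignment_yields_lvalue_of_lhs() {
    int i = 0;
    int x = (i = 5);    // the value computation of (i = 5) is needed here
    int& r = (i = 7);   // the result is an lvalue, so a reference can bind to it
    // x saw the value stored by the first assignment; r names i itself.
    return x == 5 && &r == &i && i == 7;
}
```

In a plain expression statement such as i = 5; the result is discarded, so only the side effect matters.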
but is sequenced after the value computation of its RHS, and hence after
the previous update to i
; no problem.
The side effect of the assignment i = expr
is sequenced after the value computation of the operands i
(A) and expr
of the assignment.
The expr
in this case is a +
-expression: expr1 + 1
. The value computation of this expression is sequenced after the value computations of its operands expr1
and 1
.
The expr1
here is ++i
. The value computation of ++i
is sequenced after the side effect of ++i
(the modification of i
) (B)
That's why i = ++i + 1
is safe: There's a chain of sequenced before between the value computation in (A) and the side effect on the same variable in (B).
(a) The Standard defines ++expr
in terms of expr += 1
, which is defined as expr = expr + 1
with expr
being evaluated only once.
For this expr = expr + 1
, we therefore have only one value computation of expr
. The side effect of =
is sequenced before the value computation of the whole expr = expr + 1
, and it's sequenced after the value computation of the operands expr
(LHS) and expr + 1
(RHS).
This corresponds to my claim that for ++expr
, the side effect is sequenced before the value computation of ++expr
.
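The "evaluated only once" part of (a) can be observed with an operand that counts its own evaluations. This is only a sketch: the names calls, value, get and the checking function are made up for the illustration and do not come from the Standard.

```cpp
#include <cassert>

int calls = 0;   // hypothetical counter: how often the operand was evaluated
int value = 0;   // the object that gets incremented

// Hypothetical operand expression with an observable evaluation.
int& get() { ++calls; return value; }

// Checks that ++expr evaluates expr once and yields an lvalue for the
// already-updated object.
bool preincrement_evaluates_operand_once() {
    calls = 0;
    value = 0;
    int& r = ++get();   // behaves like get() += 1, with get() evaluated once
    return calls == 1 && value == 1 && &r == &value;
}
```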
About i += ++i + 1
Does the value computation of i += ++i + 1
involve undefined behavior?
Since the
LHS of +=
is still an lvalue (and its RHS still a prvalue), the same
reasoning as above applies as far as (1) and (2) are concerned;
as for
(3) the value computation of the +=
operator now must both fetch the
current value of i
, and then (obviously sequenced after it, even if
the standard does not say so explicitly, or otherwise the execution of
such operators would always invoke undefined behavior) perform the
addition of the RHS and store the result back into i
.
I think here's the problem: The addition of i
in the LHS of i +=
to the result of ++i + 1
requires knowing the value of i
- a value computation (which can mean loading the value of i
). This value computation is unsequenced with respect to the modification performed by ++i
. This is essentially what you say in your alternative description, following the rewrite mandated by the Standard i += expr
-> i = i + expr
. Here, the value computation of i
within i + expr
is unsequenced with respect to the value computation of expr
, and nothing sequences it relative to the side effect of ++i inside expr either. That's where you get UB.
Please note that a value computation can produce one of two kinds of result: the "address" of an object, or the value of an object. In an expression i = 42
, the value computation of the lhs "produces the address" of i
; that is, the compiler needs to figure out where to store the rhs (under the rules of observable behaviour of the abstract machine). In an expression i + 42
, the value computation of i
produces the value. In the above paragraph, I was referring to the second kind, hence [intro.execution]p15 applies:
If a side effect on a scalar object is unsequenced relative to either
another side effect on the same scalar object or a value computation
using the value of the same scalar object, the behavior is undefined.
Another approach for i += ++i + 1
the value computation of the +=
operator now must both fetch the
current value of i
, and then [...] perform the addition of the RHS
The RHS being ++i + 1
. Computing the result of this expression (the value computation) is unsequenced with respect to the value computation of i
from the LHS. So the word "then" in this sentence is misleading: Of course, it must first load i
and then add the result of the RHS to it. But there's no order between the side effect of the RHS and the value computation to get the value of the LHS. For example, you could get for the LHS either the old or the new value of i
, as modified by the RHS.
In general, an unsequenced store and load of the same object is the single-threaded analogue of a data race, and like a data race it leads to Undefined Behaviour.
Addressing the addendum
using a fictive |||
operator to designate unsequenced evaluations, one might try to define E op= F;
(with int operands for simplicity) as equivalent to { int& L=E ||| int R=F; L = L + R; }
, but then the example no longer has UB.
Let E
be i
and F
be ++i
(we don't need the + 1
). Then, for i = ++i
int* lhs_address;
int lhs_value;
int* rhs_address;
int rhs_value;
( lhs_address = &i)
||| (i = i+1, rhs_address = &i, rhs_value = *rhs_address);
*lhs_address = rhs_value;
On the other hand, for i += ++i
( lhs_address = &i, lhs_value = *lhs_address)
||| (i = i+1, rhs_address = &i, rhs_value = *rhs_address);
int total_value = lhs_value + rhs_value;
*lhs_address = total_value;
This is intended to represent my understanding of the sequencing guarantees. Note that the ,
operator sequences all value computations and side effects of the LHS before those of the RHS. Parentheses do not affect sequencing. In the second case, i += ++i
, we have a modification of i
unsequenced wrt an lvalue-to-rvalue conversion of i
=> UB.
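The two pseudo-code fragments can be turned into a runnable model by picking, for each expression, both possible orders of the two unsequenced halves. This is only a sketch under the assumption that an unsequenced pair behaves like one of its two sequential orders; real undefined behaviour is not limited to these outcomes. All function names are hypothetical.

```cpp
#include <cassert>

// Model of i = ++i, left half first.
int model_simple_lhs_first(int i) {
    int* lhs_address = &i;          // left operand: only identifies the object
    i = i + 1;                      // side effect of ++i
    int rhs_value = i;              // value of ++i
    *lhs_address = rhs_value;       // side effect of =
    return i;
}

// Model of i = ++i, right half first.
int model_simple_rhs_first(int i) {
    i = i + 1;
    int rhs_value = i;
    int* lhs_address = &i;
    *lhs_address = rhs_value;
    return i;
}

// Model of i += ++i, left half first: the load sees the old value.
int model_compound_lhs_first(int i) {
    int* lhs_address = &i;
    int lhs_value = *lhs_address;   // load of i for the left operand
    i = i + 1;                      // side effect of ++i
    int rhs_value = i;
    *lhs_address = lhs_value + rhs_value;
    return i;
}

// Model of i += ++i, right half first: the load sees the new value.
int model_compound_rhs_first(int i) {
    i = i + 1;
    int rhs_value = i;
    int* lhs_address = &i;
    int lhs_value = *lhs_address;
    *lhs_address = lhs_value + rhs_value;
    return i;
}
```

Starting from 0, both models of i = ++i end with the same value, because the left half never reads i. The two models of i += ++i disagree, which is exactly the load racing with the increment.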
The standard does not treat compound assignments as second-class primitives for which no separate definition of semantics is necessary.
I would say that's a redundancy. The rewrite from E1 op= E2
to E1 = E1 op E2
also includes which expression types and value categories are required (on the rhs, 5.17/1 says something about the lhs), what happens to pointer types, the required conversions etc. The sad thing is that the sentence about "With respect to an.." in 5.17/1 is not in 5.17/7 as an exception to that equivalence.
In any case, I think we should compare the guarantees and requirements for compound assignment vs. simple assignment plus the operator, and see if there's any contradiction.
Once we put that "With respect to an.." also in the list of exceptions in 5.17/7, I don't think there's a contradiction.
As it turns out, as you can see in the discussion of Marc van Leeuwen's answer, this sentence leads to the following interesting observation:
int i; // global
int& f() { return ++i; }
int main() {
    i = i + f(); // (A)
    i += f(); // (B)
}
It seems that (A) has two possible outcomes, since the evaluation of the body of f
is indeterminately sequenced with the value computation of the i
in i + f()
.
In (B), on the other hand, the evaluation of the body of f()
is sequenced before the value computation of i
, since +=
must be seen as a single operation, and f()
certainly needs to be evaluated before the assignment of +=
.
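A small harness for experimenting with the two statements might look like the following. The wrappers run_a and run_b are hypothetical names added for the sketch; i and f are the ones from the example. Starting from i == 0, (A) can produce 1 (i read before the call) or 2 (i read after the call). For (B), the reasoning above gives 2, and C++17 additionally sequences the right operand of an assignment before the left, which leads to the same result.

```cpp
#include <cassert>

int i;                          // global, as in the example above
int& f() { return ++i; }

// Hypothetical wrapper for statement (A), starting from i == 0.
int run_a() {
    i = 0;
    i = i + f();                // call to f is indeterminately sequenced with the read of i
    return i;
}

// Hypothetical wrapper for statement (B), starting from i == 0.
int run_b() {
    i = 0;
    i += f();                   // f runs before the compound assignment reads i
    return i;
}
```

Neither statement has undefined behaviour, because a function call is indeterminately sequenced rather than unsequenced with respect to the rest of the expression.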