Your concern about the lack of a rigorous definition of the active member of a union is shared by (at least some of) the members of the standardization committee - see the latest note (dated May 2015) in the description of active issue 1116:
We never say what the active member of a union is, how it can be changed, and so on. [...]
I think we can expect some sort of clarification in future versions of the working draft. That note also indicates that the best we have so far is the note in the paragraph you quoted in your question, [9.5p4].
That being said, let's look at your other questions.
First of all, there are no anonymous structs in standard C++ (only anonymous unions); struct {char a,b,c,d;}; will give you warnings if compiled with reasonably strict options (-std=c++1z -Wall -Wextra -pedantic for Clang and GCC, for example). Going forward, I'll assume we have a declaration like struct { char a, b, c, d; } s; and everything else is adjusted accordingly.
The implicitly defaulted default constructor in your first example doesn't perform any initialization according to [12.6.2p9.2]:
In a non-delegating constructor, if a given potentially constructed
subobject is not designated by a mem-initializer-id (including the
case where there is no mem-initializer-list because the constructor
has no ctor-initializer), then
(9.1) - if the entity is a non-static data member that has a brace-or-equal-initializer and either
(9.1.1) - the constructor’s class is a union (9.5), and no other variant member of that union is designated by a mem-initializer-id or
(9.1.2) - the constructor’s class is not a union, and, if the entity is a member of an anonymous union, no other member of that union is designated by a mem-initializer-id,
the entity is initialized as specified in 8.5;
(9.2) - otherwise, if the entity is an anonymous union or a variant member (9.5), no initialization is performed;
(9.3) - otherwise, the entity is default-initialized (8.5).
I suppose we could say that f has no active member after its default constructor has finished executing, but I don't know of any standard wording that clearly indicates that. What can be said in practice is that it makes no sense to attempt to read the value of any of f's members, since they're indeterminate.
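To make the practical consequence concrete, here is a minimal sketch that writes a member before reading it. The function name write_then_read is made up for illustration, and the sketch relies on the common understanding (not spelled out in N4527) that assigning to a member of trivial type makes it the one holding a value.

```cpp
#include <cassert>

// Same shape as the union in the question, with the struct given a name (s).
union Foo
{
    struct { char a, b, c, d; } s;
    char array[4];
    int integer;
};

// Hypothetical helper: stores into a member before reading it back.
// Reading any member right after `Foo f;` would yield an indeterminate value.
int write_then_read()
{
    Foo f;            // implicitly defaulted constructor: no member is initialized
    f.integer = 42;   // give integer a value before any read
    assert(f.integer == 42);
    return f.integer;
}
```

Calling write_then_read() returns the value just stored; nothing is ever read from f before the assignment.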
In your next example, you're using aggregate initialization, which is reasonably well-defined for unions according to [8.5.1p16]:
When a union is initialized with a brace-enclosed initializer, the
braces shall only contain an initializer-clause for the first
non-static data member of the union. [ Example:
union u { int a; const char* b; };
u a = { 1 };
u b = a;
u c = 1; // error
u d = { 0, "asdf" }; // error
u e = { "asdf" }; // error
— end example ]
That, together with brace elision for the initialization of the nested struct, as specified in [8.5.1p12], makes the struct the active member. It answers your next question as well: you can only initialize the first union member using that syntax.
Your next question:
If I want to activate one or the other union member, should I provide a constructor activating it?
Yes, or a brace-or-equal-initializer for exactly one member according to [12.6.2p9.1.1] quoted above; something like this:
union Foo
{
    struct { char a, b, c, d; } s;
    char array[4];
    int integer = 7;
};

Foo f;
After the above, the active member will be integer. All of the above should also answer your question about #2 (the members are not already constructed when we reach the body of the constructor - #2 is fine as well).
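For the constructor route, a sketch along these lines should work. The constructor signatures and the helper functions default_integer and first_char are my own choices for illustration, not something taken from your code.

```cpp
#include <cassert>

union Foo
{
    struct { char a, b, c, d; } s;
    char array[4];
    int integer;

    // Each constructor names exactly one member in its mem-initializer-list,
    // so that member is the one that gets initialized.
    Foo() : integer(7) {}
    explicit Foo(char first) : array{first} {}   // remaining elements become 0
};

// Hypothetical helpers that construct a Foo and read the initialized member.
int default_integer()
{
    Foo f;
    return f.integer;
}

char first_char(char c)
{
    Foo f(c);
    assert(f.array[1] == 0);   // elements without an initializer are zeroed
    return f.array[0];
}
```

Note that once Foo has user-provided constructors it is no longer an aggregate, so the brace-elision forms discussed here stop applying to it.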
Wrapping up, both Foo{} and Foo{1} perform aggregate initialization; they're interpreted as Foo{{}} and Foo{{1}}, respectively (because of brace elision), and initialize the struct; the first one sets all the struct members to 0 and the second one sets the first member to 1 and the rest to 0, according to [8.5.1p7].
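A small sketch to check those values, assuming the adjusted declaration with the named struct member s. The helper names (s_equals and the three checks) are made up for this example.

```cpp
#include <cassert>

union Foo
{
    struct { char a, b, c, d; } s;
    char array[4];
    int integer;
};

// Hypothetical helper: true when all four chars of s match the expected values.
bool s_equals(const Foo& f, char a, char b, char c, char d)
{
    return f.s.a == a && f.s.b == b && f.s.c == c && f.s.d == d;
}

// Foo{} initializes the first member, s, with every char set to 0.
bool empty_braces_zero_s()   { Foo f{};    return s_equals(f, 0, 0, 0, 0); }

// Foo{1} relies on brace elision: 1 goes to s.a, the rest are 0.
bool one_sets_first_member() { Foo f{1};   return s_equals(f, 1, 0, 0, 0); }

// Foo{{1}} spells out the inner braces and gives the same result.
bool explicit_braces_match() { Foo f{{1}}; return s_equals(f, 1, 0, 0, 0); }
```

Clang may emit a -Wmissing-braces warning for the Foo{1} form; that is a style diagnostic about the elided braces, not an error.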
All standard quotes are from the current working draft, N4527.
Paper N4430, which deals with somewhat related issues, but hasn't been integrated into the working draft yet, provides a definition for active member:
In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended ([basic.life]).
This effectively passes the buck to the definition of lifetime in [3.8], which also has a few issues open against it, including the aforementioned issue 1116, so I think we'll have to wait for several such issues to be resolved in order to have a complete and consistent definition. The definition of lifetime as it currently stands doesn't seem to be quite ready.
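Under the N4430 wording, beginning the lifetime of a different member is what would make it active. The sketch below does that with placement new, which is the mechanism the note in [9.5p4] points to; the function name switch_to_integer is hypothetical, and the claim that integer is active afterwards is my reading of the N4430 definition rather than anything N4527 states outright.

```cpp
#include <cassert>
#include <new>

union Foo
{
    struct { char a, b, c, d; } s;
    char array[4];
    int integer;
};

// Hypothetical helper: starts with s initialized, then begins the lifetime
// of an int in the same storage.
int switch_to_integer()
{
    Foo f{1};                           // aggregate initialization: s is initialized
    int* p = new (&f.integer) int(5);   // placement new creates the int object
    assert(p == &f.integer);            // the new object lives at integer's address
    return f.integer;
}
```

All members here are trivially destructible, so no explicit destructor call is needed before reusing the storage; with non-trivial members you would have to destroy the old one first.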