Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
78 views
in Technique[技术] by (71.8m points)

c++ - Reshaping a 1-d array to a multidimensional array

Taking into consideration the entire C++11 standard, is it possible for any conforming implementation to succeed the first assertion below but fail the latter?

#include <cassert>

int main(int, char**)
{  
    const int I = 5, J = 4, K = 3;
    const int N = I * J * K;

    int arr1d[N] = {0};
    int (&arr3d)[I][J][K] = reinterpret_cast<int (&)[I][J][K]>(arr1d);
    assert(static_cast<void*>(arr1d) ==
           static_cast<void*>(arr3d)); // is this necessary?

    arr3d[3][2][1] = 1;
    assert(arr1d[3 * (J * K) + 2 * K + 1] == 1); // UB?
}

If not, is this technically UB or not, and does that answer change if the first assertion is removed (is reinterpret_cast guaranteed to preserve addresses here?)? Also, what if the reshaping is done in the opposite direction (3d to 1d) or from a 6x35 array to a 10x21 array?

EDIT: If the answer is that this is UB because of the reinterpret_cast, is there some other strictly compliant way of reshaping (e.g., via static_cast to/from an intermediate void *)?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Update 2021-03-20:

This same question was asked on Reddit recently and it was pointed out that my original answer is flawed because it does not take into account this aliasing rule:

If a program attempts to access the stored value of an object through a glvalue whose type is not similar to one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object, or
  • a char, unsigned char, or std?::?byte type.

Under the rules for similarity, these two array types are not similar for any of the above cases and therefore it is technically undefined behaviour to access the 1D array through the 3D array. (This is definitely one of those situations where, in practice, it will almost certainly work with most compilers/targets)

Note that the references in the original answer refer to an older C++11 draft standard

Original answer:

reinterpret_cast of references

The standard states that an lvalue of type T1 can be reinterpret_cast to a reference to T2 if a pointer to T1 can be reinterpret_cast to a pointer to T2 (§5.2.10/11):

An lvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast.

So we need to determine if a int(*)[N] can be converted to an int(*)[I][J][K].

reinterpret_cast of pointers

A pointer to T1 can be reinterpret_cast to a pointer to T2 if both T1 and T2 are standard-layout types and T2 has no stricter alignment requirements than T1 (§5.2.10/7):

When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout types (3.9) and the alignment requirements of T2 are no stricter than those of T1, or if either type is void.

  1. Are int[N] and int[I][J][K] standard-layout types?

    int is a scalar type and arrays of scalar types are considered to be standard-layout types (§3.9/9).

    Scalar types, standard-layout class types (Clause 9), arrays of such types and cv-qualified versions of these types (3.9.3) are collectively called standard-layout types.

  2. Does int[I][J][K] have no stricter alignment requirements than int[N].

    The result of the alignof operator gives the alignment requirement of a complete object type (§3.11/2).

    The result of the alignof operator reflects the alignment requirement of the type in the complete-object case.

    Since the two arrays here are not subobjects of any other object, they are complete objects. Applying alignof to an array gives the alignment requirement of the element type (§5.3.6/3):

    When alignof is applied to an array type, the result shall be the alignment of the element type.

    So both array types have the same alignment requirement.

That makes the reinterpret_cast valid and equivalent to:

int (&arr3d)[I][J][K] = *reinterpret_cast<int (*)[I][J][K]>(&arr1d);

where * and & are the built-in operators, which is then equivalent to:

int (&arr3d)[I][J][K] = *static_cast<int (*)[I][J][K]>(static_cast<void*>(&arr1d));

static_cast through void*

The static_cast to void* is allowed by the standard conversions (§4.10/2):

A prvalue of type “pointer to cv T,” where T is an object type, can be converted to a prvalue of type “pointer to cv void”. The result of converting a “pointer to cv T” to a “pointer to cv void” points to the start of the storage location where the object of type T resides, as if the object is a most derived object (1.8) of type T (that is, not a base class subobject).

The static_cast to int(*)[I][J][K] is then allowed (§5.2.9/13):

A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T,” where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1.

So the cast is fine! But are we okay to access objects through the new array reference?

Accessing array elements

Performing array subscripting on an array like arr3d[E2] is equivalent to *((E1)+(E2)) (§5.2.1/1). Let's consider the following array subscripting:

arr3d[3][2][1]

Firstly, arr3d[3] is equivalent to *((arr3d)+(3)). The lvalue arr3d undergoes array-to-pointer conversion to give a int(*)[2][1]. There is no requirement that the underlying array must be of the correct type to do this conversion. The pointers value is then accessed (which is fine by §3.10) and then the value 3 is added to it. This pointer arithmetic is also fine (§5.7/5):

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

This this pointer is dereferenced to give an int[2][1]. This undergoes the same process for the next two subscripts, resulting in the final int lvalue at the appropriate array index. It is an lvalue due to the result of * (§5.3.1/1):

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.

It is then perfectly fine to access the actual int object through this lvalue because the lvalue is of type int too (§3.10/10):

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object
  • [...]

So unless I've missed something. I'd say this program is well-defined.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...