Definition
General array slicing can be implemented (whether or not built into the language) by referencing every array through a dope vector or descriptor — a record that contains the address of the first array element, and then the range of each index and the corresponding coefficient in the indexing formula. This technique also allows immediate array transposition, index reversal, subsampling, etc. For languages like C, where the indices always start at zero, the dope vector of an array with d indices has at least 1 + 2d parameters.
http://en.wikipedia.org/wiki/Array_slicing#Details
That's a dense paragraph, but it's actually all in there. So we need a data structure like this:
struct {
TYPE *data; //address of first array element
int rank; //number of dimensions
int *dims; //size of each dimension
int *weight; //corresponding coefficient in the indexing formula
};
Where TYPE
is whatever the element type is, the field of the matrices. For simplicity and concreteness, we'll just use int
. For my own purposes, I've devised an encoding of various types into integer handles so int
does the job for me, YMMV.
typedef struct arr {
int rank;
int *dims;
int *weight;
int *data;
} *arr;
All of the pointer members can be assigned locations within the
same allocated block of memory as the struct itself (which we will
call the header). But by replacing the earlier use of offsets
and struct-hackery, the implementation of algorithms can be made
independent of that actual memory layout within (or without) the
block.
The basic memory layout for a self-contained array object is
rank dims weight data
dims[0] dims[1] ... dims[rank-1]
weight[0] weight[1] ... weight[rank-1]
data[0] data[1] ... data[ product(dims)-1 ]
An indirect array sharing data (whole array or 1 or more row-slices)
will have the following memory layout
rank dims weight data
dims[0] dims[1] ... dims[rank-1]
weight[0] weight[1] ... weight[rank-1]
//no data! it's somewhere else!
And an indirect array containing an orthogonal slice along
another axis will have the same layout as a basic indirect array,
but with dims and weight suitably modified.
The access formula for an element with indices (i0 i1 ... iN)
is
a->data[ i0*a->weight[0] + i1*a->weight[1] + ...
+ iN*a->weight[N] ]
, assuming each index i[j] is between [ 0 ... dims[j] ).
In the weight
vector for a normally laid-out row-major array, each element is the product of all lower dimensions.
for (int i=0; i<rank; i++)
weight[i] = product(dims[i+1 .. rank-1]);
So for a 3×4×5 array, the metadata would be
{ .rank=3, .dims=(int[]){3,4,5}, .weight=(int[]){4*5, 5, 1} }
or for a 7×6×5×4×3×2 array, the metadata would be
{ .rank=6, .dims={7,6,5,4,3,2}, .weight={720, 120, 24, 6, 2, 1} }
Construction
So, to create one of these, we need the same helper function from the previous question to compute the size from a list of dimensions.
/* multiply together rank integers in dims array */
int productdims(int rank, int *dims){
int i,z=1;
for(i=0; i<rank; i++)
z *= dims[i];
return z;
}
To allocate, simply malloc
enough memory and set the pointers to the appropriate places.
/* create array given rank and int[] dims */
arr arraya(int rank, int dims[]){
int datasz;
int i;
int x;
arr z;
datasz=productdims(rank,dims);
z=malloc(sizeof(struct arr)
+ (rank+rank+datasz)*sizeof(int));
z->rank = rank;
z->dims = z + 1;
z->weight = z->dims + rank;
z->data = z->weight + rank;
memmove(z->dims,dims,rank*sizeof(int));
for(x=1, i=rank-1; i>=0; i--){
z->weight[i] = x;
x *= z->dims[i];
}
return z;
}
And using the same trick from the previous answer, we can make a variable-argument interface to make usage easier.
/* load rank integers from va_list into int[] dims */
void loaddimsv(int rank, int dims[], va_list ap){
int i;
for (i=0; i<rank; i++){
dims[i]=va_arg(ap,int);
}
}
/* create a new array with specified rank and dimensions */
arr (array)(int rank, ...){
va_list ap;
//int *dims=calloc(rank,sizeof(int));
int dims[rank];
int i;
int x;
arr z;
va_start(ap,rank);
loaddimsv(rank,dims,ap);
va_end(ap);
z = arraya(rank,dims);
//free(dims);
return z;
}
And even automatically produce the rank argument by counting the other arguments using the awesome powers of ppnarg.
#define array(...) (array)(PP_NARG(__VA_ARGS__),__VA_ARGS__) /* create a new array with specified dimensions */
Now constructing one of these is very easy.
arr a = array(2,3,4); // create a dynamic [2][3][4] array
Accessing elements
An element is retrieved with a function call to elema
which multiplies each index by the corresponding weight, sums them, and indexes the data
pointer. We return a pointer to the element so it can be read or modified by the caller.
/* access element of a indexed by int[] */
int *elema(arr a, int *ind){
int idx = 0;
int i;
for (i=0; i<a->rank; i++){
idx += ind[i] * a->weight[i];
}
return a->data + idx;
}
The same varargs trick can make an easier interface elem(a,i,j,k)
.
Axial Slices
To take a slice, first we need a way of specifying which dimensions to extract and which to collapse. If we just need to select a single index or all elements from a dimension, then the slice
function can take rank int
s as arguments and interpret -1 as selecting the whole dimension or 0..dimsi-1 as selecting a single index.
/* take a computed slice of a following spec[] instructions
if spec[i] >= 0 and spec[i] < a->rank, then spec[i] selects
that index from dimension i.
if spec[i] == -1, then spec[i] selects the entire dimension i.
*/
arr slicea(arr a, int spec[]){
int i,j;
int rank;
for (i=0,rank=0; i<a->rank; i++)
rank+=spec[i]==-1;
int dims[rank];
int weight[rank];
for (i=0,j=0; i<rank; i++,j++){
while (spec[j]!=-1) j++;
if (j>=a->rank) break;
dims[i] = a->dims[j];
weight[i] = a->weight[j];
}
arr z = casta(a->data, rank, dims);
memcpy(z->weight,weight,rank*sizeof(int));
for (j=0; j<a->rank; j++){
if (spec[j]!=-1)
z->data += spec[j] * a->weight[j];
}
return z;
}
All the dimensions and weights corresponding to the -1s in the argument array are collected and used to create a new array header. All arguments >= 0 are multiplied by their associated weight and added to the data
pointer, offsetting the pointer to the correct element.
In terms of the array access formula, we're treating it as a polynomial.
offset = constant + sum_i=0,n( weight[i] * index[i] )
So for any dimension from which we're selecting a single element (+ all lower dimensions), we factor-out the now-constant index and add it to the constant term in the formula (which in our C representation is the data
pointer itself).
The helper function casta
creates the new array header with shared data
. slicea
of course changes the weight values, but by calculating weights itself, casta
becomes a more generally usable function. It can even be used to construct a dynamic array structure that operates directly on a regular C-style multidimensional array, thus casting.
/* create an array header to access existing data in multidimensional layout */
arr casta(int *data, int rank, int dims[]){
int i,x;
arr z=malloc(sizeof(struct arr)
+ (rank+rank)*sizeof(int));
z->rank = rank;
z->dims = z + 1;
z->weight = z->dims + rank;
z->data = data;
memmove(z->dims,dims,rank*sizeof(int));
for(x=1, i=rank-1; i>=0; i--){
z->weight[i] = x;
x *= z->dims[i];
}
return z;
}
Transposes
The dope vector can also be used to implement transposes. The order of the dimensions (and indices) can be changed.
Remember that this is not a normal 'transpose' like everybody else
does. We don't rearrange the data at all. This is more
properly called a 'dope-vector pseudo-transpose'.
Instead of changing the data, we just change the
constants in the access formula, rearranging the
coefficients of the polynomial. In a real sense, this
is just an application of the commutativity and
associativity of addition.
So for concreteness, assume the data is arranged
sequentially starting at hypothetical address 500.
500: 0
501: 1
502: 2
503: 3
504: 4
505: 5
506: 6
if a is rank 2, dims {1, 7), weight (7, 1), then the
sum of the indices multiplied by the associated weights
added to the initial pointer (500) yield the appropriate
addresses for each element
a[0][0] == *(500+0*7+0*1)
a[0][1] == *(500+0*7+1*1)
a[0][2] == *(500+0*7+2*1)
a[0][3] == *(500+0*7+3*1)
a[0][4] == *(500+0*7+4*1)
a[0][5] == *(500+0*7+5*1)
a[0][6] == *(500+0*7+6*1)
So the dope-vector pseudo-transpose rearranges the
weights and dimensions to match the new ordering of
indices, but the sum remains the same. The initial
pointer remains the same. The data does not move.
b[0][0] == *(500+0*1+0*7)
b[1][0] == *(500+1*1+0*7)
b[2][0] == *(500+2*1+0*7)
b[3][0] == *(500+3*1+0*7)
b[4][0] == *(500+4*1+0*7)
b[5][0] == *(500+5*1+0*7)
b[6][0] == *(500+6*1+0*7)
Inner Product (aka Matrix Multiplication)
So, by using general slices or transpose+"row"-slices (which are easier), generalized inner product can be implemented.
First we need the two helper functions for applying a binary operation to two vectors producing a vector result, and reducing a vector with a binary operation producing a scalar result.
As in the previous question we'll pass in the operator, so the same function can be used with many different operators. For the style here, I'm passing operators as single characters, so there's already an indirect mapping from C operators to our function's operators. This is an <a