Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - What's the most efficient way to load and extract 32 bit integer values from a 128 bit SSE vector?

I'm trying to optimize my code using SSE intrinsics, but I don't know of a good way to extract the integer values from a vector once the SSE operations have produced the result I want.

Does anyone know of a good way to do this? I'm programming in C and my compiler is gcc version 4.3.2.

Thanks for all your help.



1 Answer


It depends on what you can assume about the minimum level of SSE support that you have.

Going all the way back to SSE2, you have _mm_extract_epi16 (PEXTRW), which can be used to extract any 16 bit element from a 128 bit vector. You would need to call this twice to get the two halves of a 32 bit element, then combine them with a shift and an OR.
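As a rough sketch of the SSE2 route, the helper below rebuilds 32 bit element 2 from its two 16 bit halves. The function names are made up for illustration, and the element index is fixed because _mm_extract_epi16 needs a compile-time constant selector. The second helper is an alternative not mentioned above: it shifts the wanted element down to position 0 with _mm_srli_si128 (PSRLDQ) and reads it with _mm_cvtsi128_si32 (MOVD), both of which are also SSE2.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: 32 bit element 2 is made of 16 bit elements 4 and 5. */
static inline int32_t extract2_epi32_sse2(__m128i v)
{
    /* Mask to 16 bits so only the extracted half contributes. */
    uint32_t lo = (uint32_t)_mm_extract_epi16(v, 4) & 0xFFFFu;
    uint32_t hi = (uint32_t)_mm_extract_epi16(v, 5) & 0xFFFFu;
    return (int32_t)(lo | (hi << 16));
}

/* Hypothetical alternative: shift right by 8 bytes so element 2 lands in
   position 0, then move the low 32 bits to a scalar. */
static inline int32_t extract2_epi32_shift(__m128i v)
{
    return _mm_cvtsi128_si32(_mm_srli_si128(v, 8));
}
```

Which of the two is faster probably depends on the CPU and the surrounding code, so it is worth measuring both.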

In more recent versions of SSE (SSE4.1 and later) you have _mm_extract_epi32 (PEXTRD) which can extract a 32 bit element in one instruction.
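A minimal sketch of the SSE4.1 version follows. The function name is made up for illustration, and the target attribute is a GCC/Clang extension that I am assuming is available in your toolchain; with an older gcc you would drop the attribute and compile the file with -msse4.1 instead. As with PEXTRW, the element index has to be a compile-time constant.

```c
#include <smmintrin.h>  /* SSE4.1 intrinsics */
#include <stdint.h>

/* Hypothetical helper: read 32 bit element 3 with a single PEXTRD.
   The attribute enables SSE4.1 for this one function only. */
static __attribute__((target("sse4.1"))) int32_t extract3_epi32_sse41(__m128i v)
{
    return _mm_extract_epi32(v, 3);
}
```

Code like this must only run on CPUs that actually support SSE4.1, so a fallback for SSE2-only machines may still be needed.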

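Alternatively, if this is not inside a performance-critical loop, you can just use a union and read the elements through the array member, e.g.

#include <emmintrin.h>  /* __m128i */
#include <stdint.h>     /* int32_t */

typedef union
{
    __m128i v;      /* the whole 128 bit vector */
    int32_t a[4];   /* the same 16 bytes viewed as four 32 bit elements */
} U32;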
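Here is one way the union might be used, as a sketch only: the helper names are invented for this example. The second function shows a related option that is not in the answer above, namely storing the vector to a plain array with _mm_storeu_si128 (SSE2), which is handy when you want all four values anyway.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Same union as in the answer: one vector viewed as four 32 bit elements. */
typedef union
{
    __m128i v;
    int32_t a[4];
} U32;

/* Hypothetical helper: write the vector member, read the array member. */
static inline int32_t sum_elements_union(__m128i v)
{
    U32 u;
    u.v = v;
    return u.a[0] + u.a[1] + u.a[2] + u.a[3];
}

/* Hypothetical helper: copy all four elements to memory with an
   unaligned store, so out[] needs no special alignment. */
static inline void store_elements(__m128i v, int32_t out[4])
{
    _mm_storeu_si128((__m128i *)out, v);
}
```

Element 0 is the lowest 32 bits of the vector in both cases, i.e. the last argument to _mm_set_epi32.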


...