I was doing a quick performance test on a block of code
void ConvertToFloat( const std::vector< short >& audioBlock,
std::vector< float >& out )
{
const float rcpShortMax = 1.0f / (float)SHRT_MAX;
out.resize( audioBlock.size() );
for( size_t i = 0; i < audioBlock.size(); i++ )
{
out[i] = (float)audioBlock[i] * rcpShortMax;
}
}
I was happy with the speed up over the original very naive implementation it takes just over 1 msec to process 65536 audio samples.
However just for fun I tried the following
void ConvertToFloat( const std::vector< short >& audioBlock,
std::vector< float >& out )
{
const float rcpShortMax = 1.0f / (float)SHRT_MAX;
out.reserve( audioBlock.size() );
for( size_t i = 0; i < audioBlock.size(); i++ )
{
out.push_back( (float)audioBlock[i] * rcpShortMax );
}
}
Now I fully expected this to give exactly the same performance as the original code. However suddenly the loop is now taking 900usec (i.e. it's 100usec faster than the other implementation).
Can anyone explain why this would give better performance? Does resize()
initialize the newly allocated vector where reserve just allocates but does not construct? This is the only thing I can think of.
PS this was tested on a single core 2Ghz AMD Turion 64 ML-37.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…