Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
298 views
in Technique[技术] by (71.8m points)

c++ - What the heque is going on with the memory overhead of std::deque?

I am working on an external sorting algorithm that uses std::queue and must carefully constrain its memory usage. I have noticed that during the merge phase (which uses several std::queues of fixed length), my memory usage increases to about 2.5X what I expected. Since std::queue by default uses std::deque as its underlying container, I ran some tests on std::deque to determine its memory overhead. Here are the results, running on VC++ 9, in release mode, with a 64-bit process:

When adding 100,000,000 chars to a std::deque, the memory usage grows to 252,216K. Note that 100M chars (1 byte) should occupy 97,656K, so this is an overhead of 154,560K.

I repeated the test with doubles (8 bytes) and saw memory grow to 1,976,676K, while 100M doubles should occupy 781,250K, for an overhead of 1,195,426K!!

Now I understand that std::deque is normally implemented as a linked list of "chunks." If this is true, then why is the overhead proportional to the element size (because of course the pointer size should be fixed at 8 bytes)? And why is it so danged huge?

Can anybody shed some light on why std::deque uses so much danged memory? I'm thinking I should switch my std::queue underlying containers to std::vector as there is no overhead (given that the appropriate size is reserveed). I'm thinking the benefits of std::deque are largely negated by the fact that it has such a huge overhead (resulting in cache misses, page faults, etc.), and that the cost of copying std::vector elements may be less, given that the overall memory usage is so much lower. Is this just a bad implementation of std::deque by Microsoft?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Look at the code for _DEQUESIZ (number of elements per block):

#define _DEQUESIZ   (sizeof (_Ty) <= 1 ? 16 
    : sizeof (_Ty) <= 2 ? 8 
    : sizeof (_Ty) <= 4 ? 4 
    : sizeof (_Ty) <= 8 ? 2 : 1)    /* elements per block (a power of 2) */

It gets smaller if the element is larger. Only for elements larger than 8 bytes will you get the expected behavior (percentual decrease of overhead with increase of element size).


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...