Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c++ - What causes std::sort() to access address out of range

I understand that to use std::sort(), the comparison function must impose a strict weak ordering; otherwise the program may crash by accessing an out-of-bounds address. (https://gcc.gnu.org/ml/gcc-bugs/2013-12/msg00333.html)

However, why would std::sort() access an out-of-bounds address when the comparison function is not a strict weak ordering? What is it trying to compare?

Also, I wonder whether there are other pitfalls in the STL that I should be aware of.



1 Answer


The first thing is that calling the algorithm with a comparator that does not comply with the requirements is undefined behavior and anything goes...
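To make "a comparator that does not comply" concrete, here is a minimal sketch that tests one of the strict weak ordering requirements, irreflexivity (comp(x, x) must be false). The names is_irreflexive, strict_less and less_or_equal are hypothetical and exist only for this illustration. Passing the check does not prove a comparator is valid, since the other requirements are not tested.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns true if comp(x, x) is false for every sample.
// Irreflexivity is only ONE requirement of a strict weak ordering.
template <typename T, typename Compare>
bool is_irreflexive(const std::vector<T>& samples, Compare comp) {
    assert(!samples.empty());  // an empty sample set would prove nothing
    for (const T& x : samples) {
        if (comp(x, x)) {
            return false;  // x compares "less than" itself: not a strict ordering
        }
    }
    return true;
}

// A valid comparator: a < a is always false.
inline bool strict_less(int a, int b) { return a < b; }

// An invalid comparator: a <= a is always true, so irreflexivity is violated.
inline bool less_or_equal(int a, int b) { return a <= b; }
```

Using less_or_equal with std::sort is the classic way to end up in the situation described in the question.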

But other than that, I assume that you are interested in knowing what type of implementation might end up accessing out of bounds if the comparator is bad. Should the implementation not check the bounds before accessing the elements in the first place, i.e. before calling the comparator?

The answer is performance, and this is just one of the possible things that could lead to this type of issue. There are different implementations of sorting algorithms, but more often than not std::sort is built on top of a variant of quicksort (introsort) that falls back to a different sorting algorithm, typically heapsort, to avoid quicksort's worst-case performance.
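The overall shape of such a hybrid can be sketched as follows. This is an illustrative sketch only, not the code of any particular standard library: the names intro_sort and intro_sort_impl, the middle-element pivot, the depth budget of roughly 2*log2(n), and the use of heapsort as the fallback are all assumptions made for the example. It sorts with operator< and requires random-access iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Quicksort with a recursion budget; when the budget runs out, fall back
// to heapsort so the worst case stays O(n log n).
template <typename It>
void intro_sort_impl(It first, It last, int depth_budget) {
    if (last - first < 2) return;  // 0 or 1 element: already sorted
    if (depth_budget == 0) {
        std::make_heap(first, last);  // fallback: heapsort
        std::sort_heap(first, last);
        return;
    }
    auto pivot = *(first + (last - first) / 2);  // copy of the pivot value
    // Three-way split: [first, lt) < pivot, [lt, gt) == pivot, [gt, last) > pivot.
    It lt = std::partition(first, last,
                           [&](const auto& x) { return x < pivot; });
    It gt = std::partition(lt, last,
                           [&](const auto& x) { return !(pivot < x); });
    intro_sort_impl(first, lt, depth_budget - 1);
    intro_sort_impl(gt, last, depth_budget - 1);
}

template <typename It>
void intro_sort(It first, It last) {
    assert(first <= last);  // requires a valid random-access range
    int budget = 0;
    for (auto n = last - first; n > 1; n /= 2) budget += 2;  // about 2*log2(n)
    intro_sort_impl(first, last, budget);
}
```

Because the pivot itself always lands in the middle group, both recursive calls work on strictly smaller ranges, so the sketch terminates for any input when operator< is a valid ordering.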

The implementation of quicksort selects a pivot and then partitions the input around the pivot, then independently sorts both sides. There are different strategies for selection of the pivot, but a common one is the median of three: the algorithm gets the values of the first, last and middle element, selects the median of the three and uses that as the pivot value.
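The median-of-three selection can be sketched like this. The function name median_of_three and its index-based interface are hypothetical, chosen for readability; real implementations usually work on iterators and may move the elements around while selecting.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index (lo, mid or hi) holding the median of v[lo], v[mid], v[hi],
// using only the comparator, as a sort implementation would.
template <typename T, typename Compare>
std::size_t median_of_three(const std::vector<T>& v, std::size_t lo,
                            std::size_t hi, Compare comp) {
    assert(lo <= hi && hi < v.size());
    const std::size_t mid = lo + (hi - lo) / 2;
    if (comp(v[lo], v[mid])) {
        if (comp(v[mid], v[hi])) return mid;  // lo < mid < hi
        if (comp(v[lo], v[hi])) return hi;    // lo < hi <= mid
        return lo;                            // hi <= lo < mid
    }
    // From here on: mid <= lo
    if (comp(v[lo], v[hi])) return lo;        // mid <= lo < hi
    if (comp(v[mid], v[hi])) return hi;       // mid < hi <= lo
    return mid;                               // hi <= mid <= lo
}
```

Note that even this small helper reasons about the three values by chaining comparator results, which is only sound when the comparator is a strict weak ordering.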

Conceptually, partition walks from the left until it finds an element that is not smaller than the pivot, then walks from the right until it finds an element that is smaller than the pivot. If the two cursors meet, the partition is complete. If two out-of-place elements are found, their values are swapped and the process continues in the range delimited by the two cursors. The loop walking from the left to find the element to swap would look like:

while (pos < end && value(pos) < pivot) { ++pos; }
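Fleshed out into a complete function, a bounds-checked partition along the lines described above might look like the following sketch. The name checked_partition and the vector-based interface are assumptions for illustration; the pivot is taken by value so that swapping elements cannot change it.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Partitions v so that elements "smaller than pivot" come first.
// Returns the index of the first element that is not smaller than pivot.
// Every element access is guarded by an explicit left < right check.
template <typename T, typename Compare>
std::size_t checked_partition(std::vector<T>& v, T pivot, Compare comp) {
    std::size_t left = 0;           // everything before left is < pivot
    std::size_t right = v.size();   // everything from right on is >= pivot
    while (true) {
        // Walk from the left past elements smaller than the pivot.
        while (left < right && comp(v[left], pivot)) ++left;
        // Walk from the right past elements not smaller than the pivot.
        while (left < right && !comp(v[right - 1], pivot)) --right;
        if (left == right) return left;  // cursors met: partition complete
        assert(left < right - 1);        // two distinct out-of-place elements
        std::swap(v[left], v[right - 1]);
        ++left;
        --right;
    }
}
```

For example, partitioning {7, 2, 9, 4, 1, 8} around the value 5 with operator< returns 3, with 1, 2 and 4 occupying the first three positions in some order. The left < right checks are what keep the cursors inside the range regardless of the data.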

While in general partition cannot assume that the value of the pivot is in the range, quicksort knows that it is; after all, it selected the pivot from the elements in the range. A common optimization in this case is to swap the median into the last element of the range, where it acts as a sentinel. This guarantees that value(pos) < pivot becomes false before pos == end (worst case: at pos == end - 1, where value(pos) is the pivot itself). The implication is that we can drop the check for the end of the range and use an unchecked_partition (pick your choice of name) with a simpler, faster condition:

while (/*pos < end &&*/ value(pos) < pivot) ++pos;

All perfectly good, except that < is really spelled comparator(value(pos), pivot). If the comparator is incorrectly implemented, you might end up with comparator(pivot, pivot) == true; the sentinel then fails to stop the loop and the cursor runs out of bounds.
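The effect can be observed safely with a small simulation. The sketch below keeps a bounds check, but only so that the demonstration itself never reads out of range; the function name sentinel_scan is hypothetical. It places the pivot in the last element, as described above, and reports where the left-to-right scan stops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simulates "while (comp(v[pos], pivot)) ++pos;" on a range whose LAST
// element is the pivot (the sentinel).
// The pos < v.size() test is NOT part of the optimized loop; it is here only
// so that this demo never performs an out-of-bounds read.
// Returns the stopping index, or v.size() if the real unchecked loop would
// have walked past the end of the range.
template <typename T, typename Compare>
std::size_t sentinel_scan(const std::vector<T>& v, Compare comp) {
    assert(!v.empty());
    const T& pivot = v.back();  // sentinel: the pivot sits in the last slot
    std::size_t pos = 0;
    while (pos < v.size() && comp(v[pos], pivot)) ++pos;
    return pos;
}
```

With v = {1, 2, 3, 5} and a comparator implementing a < b, the scan stops at index 3, the sentinel. With a comparator implementing a <= b, the sentinel compares "less than" itself, the scan returns 4 (v.size()), and the unchecked version of the loop would have continued reading beyond the array.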

Note that this is just one example of optimization of the algorithm that can remove bounds check for performance: assuming a valid order, it is impossible to walk out of the array in the above loop if quicksort set the pivot to the last element before calling this modified partition.

Back to the question:

Should the implementation not check the bounds before accessing the elements in the first place? i.e. before calling the comparator

No, not if it removed the bounds check after proving that it cannot walk out of the array; but that proof is built on the premise that the comparator is valid.

