The standard more or less requires using buckets for collision
resolution, which means that the actual lookup time will
probably be linear with respect to the number of elements in the
bucket, regardless of whether the element is present or not.
It's possible to make it O(lg N), but it's not usually done,
because the number of elements in the bucket should be small,
if the hash table is being used correctly.
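If you want to check how well your elements are actually
distributed, the container exposes its bucket structure directly.
The following is only a quick illustration; the map's contents
here are made up:

    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <unordered_map>

    int main()
    {
        std::unordered_map<std::string, int> m{
            {"one", 1}, {"two", 2}, {"three", 3}, {"four", 4}};

        // Print the length of each chain; with a decent hash and a
        // reasonable load factor, most buckets hold 0 or 1 elements.
        for (std::size_t i = 0; i != m.bucket_count(); ++i) {
            std::cout << "bucket " << i << ": "
                      << m.bucket_size(i) << " element(s)\n";
        }
    }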
To ensure that the number of elements in a bucket is small, you
must ensure that the hashing function is effective. What
effective means depends on the types and values being hashed.
(The MS implementation uses FNV, which is one of the best
generic hashes around, but if you have special knowledge of the
actual data you'll be seeing, you might be able to do better.)
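As a rough sketch of what such special knowledge might look like:
suppose the keys are five-digit postal codes. The ZipCode type and
its hash below are invented purely for illustration; the point is
that a small, dense key domain can make the identity function a
perfectly good hash:

    #include <cstddef>
    #include <string>
    #include <unordered_map>

    struct ZipCode
    {
        int value;   // assumed to always be in [0, 99999]
        bool operator==(ZipCode other) const { return value == other.value; }
    };

    struct ZipCodeHash
    {
        std::size_t operator()(ZipCode z) const
        {
            // The domain is so small and dense that no generic
            // mixing (FNV or otherwise) is needed.
            return static_cast<std::size_t>(z.value);
        }
    };

    std::unordered_map<ZipCode, std::string, ZipCodeHash> directory;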
Another thing which can help reduce the number of elements per
bucket is to force more buckets or use a smaller load factor.
For the first, you can pass the minimum initial number of
buckets as an argument to the constructor. If you know the
total number of elements that will be in the map, you can
control the load factor this way. You can also force a minimum
number of buckets once the table has been filled, by calling
rehash. Otherwise, there is a function
std::unordered_map<>::max_load_factor which you can use. It is
not guaranteed to do anything, but in any reasonable
implementation, it will. Note that if you use it on an already
filled unordered_map, you'll probably have to call
unordered_map<>::rehash afterwards.
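Putting the two knobs together, something along these lines
should behave the same on any reasonable implementation; the
element count of 10000 is just an assumed figure:

    #include <string>
    #include <unordered_map>

    int main()
    {
        // Reserve buckets up front if you know roughly how many
        // elements you'll insert; this keeps the load factor low
        // from the start.
        std::unordered_map<int, std::string> m(10000);

        // Or cap the load factor after the fact, then force the
        // table to honor it: rehash(0) picks a bucket count of at
        // least size() / max_load_factor().
        m.max_load_factor(0.5f);
        m.rehash(0);
    }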
(There are several things I don't understand about the standard
unordered_map: why the load factor is a float, instead of
double; why it's not required to have an effect; and why it
doesn't automatically call rehash for you.)