Using a power-of-two table size effectively masks out the top bits of the hash code: only the low bits select the bucket. A hash function whose values differ mainly in those top bits will therefore perform particularly badly in this scenario.
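To make that concrete, here is a minimal sketch (the capacity and the two hash codes are made up for illustration): with a power-of-two capacity the bucket index is just hash & (capacity - 1), so two hash codes that differ only in their upper bits end up in the same bucket.

```java
public class PowerOfTwoIndexDemo {
    public static void main(String[] args) {
        int capacity = 16; // power of two, so capacity - 1 == 0b1111
        // Two hash codes that differ only in their upper bits...
        int h1 = 0x12340005;
        int h2 = 0xABCD0005;
        // ...land in the same bucket, because the mask keeps only the low 4 bits.
        System.out.println(h1 & (capacity - 1)); // 5
        System.out.println(h2 & (capacity - 1)); // 5
    }
}
```

The mask is simply a cheaper replacement for a modulo operation, which is exactly why power-of-two sizes are attractive in the first place.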
Java's HashMap mitigates this by mistrusting the object's hashCode() implementation and applying a second level of hashing to its result:

"Applies a supplemental hash function to a given hashCode, which defends against poor quality hash functions. This is critical because HashMap uses power-of-two length hash tables, that otherwise encounter collisions for hashCodes that do not differ in lower bits."
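For reference, the second-level hash from the Java 6 HashMap sources looked like the sketch below (later JDKs use a simpler spread such as h ^ (h >>> 16), so treat the exact constants as version-specific):

```java
public class SupplementalHashDemo {
    // Supplemental hash as it appeared in the Java 6 HashMap sources:
    // XORing shifted copies of the value lets the high bits influence
    // the low bits that the power-of-two mask actually uses.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    public static void main(String[] args) {
        int capacity = 16;
        // The two hash codes that collided in the previous sketch
        // now land in different buckets (4 and 3).
        System.out.println(hash(0x12340005) & (capacity - 1));
        System.out.println(hash(0xABCD0005) & (capacity - 1));
    }
}
```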
If you have a good hash function, or do something similar to what HashMap does, it does not matter whether the table size is a prime number, a power of two, or anything else.
If, on the other hand, the hash function is of unknown or poor quality, then using a prime number would be a safer bet. It does, however, make dynamically sized tables trickier to implement, because you suddenly need to be able to produce prime numbers instead of just multiplying the size by a constant factor.
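As a rough sketch of what that growth step might look like (the class and method names here are purely illustrative, not from any particular library), one could index with a modulo and lean on BigInteger.nextProbablePrime to find the next capacity instead of simply doubling:

```java
import java.math.BigInteger;

public class PrimeSizedTable {
    // Hypothetical growth step for a prime-sized table: double the
    // capacity, then search upward for the next prime.
    static int nextCapacity(int current) {
        return BigInteger.valueOf(current * 2L).nextProbablePrime().intValueExact();
    }

    static int indexFor(int hashCode, int capacity) {
        // Math.floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(hashCode, capacity);
    }

    public static void main(String[] args) {
        int capacity = 11;
        System.out.println(indexFor("example".hashCode(), capacity));
        capacity = nextCapacity(capacity);
        System.out.println(capacity); // 23
    }
}
```

The prime search is cheap in absolute terms, but it is one more moving part compared with the power-of-two scheme, where growing is a single shift and indexing a single mask.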