algorithm - find if two arrays contain the same set of integers without extra space and faster than NlogN

asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

I came across this post, which reports the following interview question:

Given two arrays of numbers, find if each of the two arrays have the same set of integers ? Suggest an algo which can run faster than NlogN without extra space?

The best that I can think of is the following:

(a) sort each array, and then (b) have two pointers moving along the two arrays and check if you find different values ... but step (a) has already NlogN complexity :(
(a) scan the shortest array and put its values into a map, and then (b) scan the second array and check if you find a value that is not in the map ... here we have linear complexity, but we use extra space

... so, I can't think of a solution for this question.

Ideas?

Thank you for all the answers. I feel many of them are right, but I decided to choose ruslik's one, because it gives an interesting option that I did not think about.

1 Answer

深蓝 · Answer 1 · 2021-10-17T01:04:05+0000

You can try a probabilistic approach: choose a commutative function for accumulation (e.g., addition or XOR) and a parametrized hash function, then compare the accumulated hashes of the two arrays.

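/* Commutative accumulator, e.g. a + b or a ^ b. */
unsigned addition(unsigned a, unsigned b);
/* Hash of n; h_type selects one member of a family of hash functions. */
unsigned hash(int n, int h_type);

/* Order-independent hash of the first num elements of a. */
unsigned hash_set(const int *a, int num, int h_type) {
    unsigned rez = 0;
    for (int i = 0; i < num; i++)
        rez = addition(rez, hash(a[i], h_type));
    return rez;
}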

The number of hash functions you need to try before the probability of a false positive drops below a given threshold does not depend on the number of elements, so the whole check runs in linear time.
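To make the idea concrete, here is a minimal sketch of the repeated check. Everything in it beyond the accumulate-and-compare structure is an assumption for illustration: the names mix_hash, fingerprint and probably_same are hypothetical, the mixer is a splitmix64-style finalizer picked only because it scrambles bits well, and the seed plays the role of h_type. It also assumes the arrays are compared as multisets (duplicates count), so arrays of different lengths are reported as different.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical seeded hash: a splitmix64-style bit mixer.
 * The seed selects one member of the hash family (like h_type). */
static uint64_t mix_hash(int value, uint64_t seed)
{
    uint64_t x = (uint64_t)(int64_t)value + seed * 0x9E3779B97F4A7C15ULL;
    x ^= x >> 30;
    x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 27;
    x *= 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return x;
}

/* Sum of the element hashes. Unsigned addition wraps modulo 2^64 and is
 * commutative, so the result does not depend on element order. */
static uint64_t fingerprint(const int *a, size_t n, uint64_t seed)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += mix_hash(a[i], seed);
    return acc;
}

/* Returns 0 if the arrays surely differ as multisets, 1 if every round
 * agreed (equal with high probability, not with certainty). */
int probably_same(const int *a, size_t na, const int *b, size_t nb,
                  unsigned rounds)
{
    if (na != nb)
        return 0;
    for (unsigned r = 0; r < rounds; r++) {
        uint64_t seed = (uint64_t)r + 1;
        if (fingerprint(a, na, seed) != fingerprint(b, nb, seed))
            return 0;
    }
    return 1;
}
```

Each round is one pass over both arrays with a constant amount of state, so the cost is O(rounds * n) time and O(1) extra space. A mismatch in any round is conclusive; agreement in all rounds is only probabilistic evidence.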

EDIT: In the general case the probability that the two sets are the same is very small, so this O(n) check with several hash functions can be used as a prefilter: it decides as fast as possible whether the arrays are surely different or whether they might be equivalent, in which case a slow deterministic method should be used. The average complexity will be O(n), but the worst case will have the complexity of the deterministic method.
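The prefilter-then-verify flow described in the EDIT could look roughly like the sketch below. The deterministic fallback chosen here, sorting both arrays in place with qsort and comparing them element by element, is an assumption (the answer does not name a specific method), as are the helper names seeded_hash, hashed_sum, cmp_int and same_multiset and the choice of two hash rounds. Note that the fallback reorders the caller's arrays and again treats them as multisets.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical seeded hash (splitmix64-style mixer), for illustration. */
static uint64_t seeded_hash(int value, uint64_t seed)
{
    uint64_t x = (uint64_t)(int64_t)value + seed * 0x9E3779B97F4A7C15ULL;
    x ^= x >> 30;
    x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 27;
    x *= 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return x;
}

/* Order-independent sum of element hashes (wraps modulo 2^64). */
static uint64_t hashed_sum(const int *a, size_t n, uint64_t seed)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += seeded_hash(a[i], seed);
    return acc;
}

/* Three-way comparison for qsort that cannot overflow. */
static int cmp_int(const void *p, const void *q)
{
    int x = *(const int *)p;
    int y = *(const int *)q;
    return (x > y) - (x < y);
}

/* Returns 1 if a and b hold the same multiset of values, 0 otherwise.
 * WARNING: may sort both arrays in place. */
int same_multiset(int *a, size_t na, int *b, size_t nb)
{
    if (na != nb)
        return 0;

    /* Prefilter: any hash mismatch proves the arrays differ, in O(n). */
    for (uint64_t seed = 1; seed <= 2; seed++)
        if (hashed_sum(a, na, seed) != hashed_sum(b, nb, seed))
            return 0;

    /* Hashes agree, so confirm deterministically: sort and compare. */
    qsort(a, na, sizeof *a, cmp_int);
    qsort(b, nb, sizeof *b, cmp_int);
    for (size_t i = 0; i < na; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

When the inputs usually differ, almost every call returns from the prefilter loop, which is where the O(n) average comes from. The sort only runs when the hashes agree, so equal inputs still pay the full cost of the deterministic method.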
