Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
437 views
in Technique[技术] by (71.8m points)

iequalitycomparer - writing a custom comparer for linq groupby

Again this sample is a very simplified version of my actual problem involving a custom comparer for linq grouping. What have I done wrong?

The code below produces the result below (1.2, 0), (4.1, 0), (4.1, 0), (1.1, 0),

however I was expecting the following since 1.1 and 1.2 are < 1.0 apart. (1.2, 0), (1.1, 0), (4.1, 0), (4.1, 0),

class Program
{
    static void Main(string[] args)
    {
        IEnumerable<Point> points = new List<Point> { 
            new Point(1.1, 0.0)
            , new Point(4.1, 0.0) 
            , new Point(1.2, 0.0)
            , new Point(4.1, 0.0)
        };

        foreach (var group in points.GroupBy(p => p, new PointComparer()))
        {
            foreach (var num in group)
                Console.Write(num.ToString() + ", ");

            Console.WriteLine();
        }

        Console.ReadLine();
    }
}

class PointComparer : IEqualityComparer<Point>
{
    public bool Equals(Point a, Point b)
    {
        return Math.Abs(a.X - b.X) < 1.0;
    }

    public int GetHashCode(Point point)
    {
        return point.X.GetHashCode()
            ^ point.Y.GetHashCode();
    }
}

class Point
{
    public double X;
    public double Y;

    public Point(double p1, double p2)
    {
        X = p1;
        Y = p2;
    }

    public override string ToString()
    {
        return "(" + X + ", " + Y + ")";
    }
}
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The grouping algorithm (and I think all LINQ methods) using an equality comparer always first compares hash codes and only executes Equals if two hash codes are equal. You can see that if you add tracing statements in the equality comparer:

class PointComparer : IEqualityComparer<Point>
{
    public bool Equals(Point a, Point b)
    {
        Console.WriteLine("Equals: point {0} - point {1}", a, b);
        return Math.Abs(a.X - b.X) < 1.0;
    }

    public int GetHashCode(Point point)
    {
        Console.WriteLine("HashCode: {0}", point);
        return point.X.GetHashCode()
            ^ point.Y.GetHashCode();
    }
}

Which results in:

HashCode: (1.1, 0)
HashCode: (4.1, 0)
HashCode: (1.2, 0)
HashCode: (4.1, 0)
Equals: point (4.1, 0) - point (4.1, 0)
(1.1, 0), 
(4.1, 0), (4.1, 0), 
(1.2, 0), 

Only for the two points with equal hash codes Equals was executed.

Now you could trick the comparison by always returning 0 as hash code. If you do that, the output will be:

HashCode: (1.1, 0)
HashCode: (4.1, 0)
Equals: point (1.1, 0) - point (4.1, 0)
HashCode: (1.2, 0)
Equals: point (4.1, 0) - point (1.2, 0)
Equals: point (1.1, 0) - point (1.2, 0)
HashCode: (4.1, 0)
Equals: point (4.1, 0) - point (4.1, 0)
(1.1, 0), (1.2, 0), 
(4.1, 0), (4.1, 0), 

Now for each pair Equals was executed, and you've got your grouping.

But...

What is "equal"? If you add another point (2.1, 0.0), which points do you want in one group? Using the symbol for the fuzzy equality, we have -

1.1 ≈ 1.2
1.2 ≈ 2.1

but

1.1 !≈ 2.1

This means that 1.1 and 2.1 will never be in one group (their Equals never passes) and that it depends on the order of the points whether 1.1 or 2.1 are grouped with 1.2.

So you're on a slippery slope here. Clustering points by proximity is far from trivial. You're entering the realm of cluster analysis.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...