Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
147 views
in Technique[技术] by (71.8m points)

Linked List - remove duplicates algorithm in C#/Java

I am studying Data Structures and Algorithms in C#/Java. After encountering a solution to the problem of Linked List duplicate removal, I have been struggling to understand it.

The solution is the one proposed by the renowned book Cracking the coding Interview (5th edition, page 208).

void RemoveDuplicates_HashSet(Node n)
{
    HashSet<object> set = new HashSet<object>();

    Node previous = null;
    while (n != null)
    {
        if (set.Contains(n.Data))       // Condition 1
            previous.Next = n.Next;
        else                            // Condition 2
        {
            set.Add(n.Data);
            previous = n;
        }
            
        n = n.Next;
    }
}

Running the code with the following linked list A->B->A->B:

// Creating test Singly LinkedList
Node n = new Node("A");
n.Next = new Node("B");
n.Next.Next = new Node("A");
n.Next.Next.Next = new Node("B");

RemoveDuplicates_HashSet(n);

Works perfectly fine: the value of n after the method is A->B.

By following the code with a debugger, I can see that what happens in the method loop is the following:

| Pass | HashSet | n          | previous   | Comment                  |
| ---- | ------- | ---------- | ---------- | ------------------------ |
| –    | –       | A->B->A->B | null       |                          |
| 1    | A       | B->A->B    | A->B->A->B | Condition 2 is triggered |
| 2    | A,B     | A->B       | B->A->B    | Condition 2 is triggered |
| 3    | A,B     | B          | B->B       | Condition 1 is triggered |
| 4    | A,B     | null       | B          | Condition 1 is triggered |

I fail to understand how this actually results in several ways:

  1. Where/how exactly are duplicates dropped from n? I understand that HashSet contains only unique elements, and it will therefore detect if an element was encountered already, however I still cannot see how the algorithm works in its entirety.

  2. How is it that the values pointed to by n are updated to be A->B? Where is it that, given that essentially the loop is simply iterating over the Linked List doing n = n.Next, n is actually updated with the final value A->B? I understand that the list is passed by reference, but I cannot see how it is actually modified.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Where/how exactly are duplicates dropped from n?

A property of a Set if, that it may not contain duplicate elements.

if (set.Contains(n.Data))
            previous.Next = n.Next;
else
{
            set.Add(n.Data);
            previous = n;
}
n = n.Next;

Here you first check, if the data of the current node n is contained in the set. If yes, the successor of the previous node is the next node of the current node n. So:

[previous]->[n]->[next]
[previous]->[next]

Otherwise, the data is added to the set, and you proceed with the next node.

How is it that the values pointed to by n are updated to be A->B? Where is it that, given that essentially the loop is simply iterating over the Linked List doing n = n.Next, n is actually updated with the final value A->B?

I really don't understand this question. Do you mean "Why is the list modified?" If yes, then because it is passed (a bit simplified ) by reference. So you just dont do a copy, like with primitve types (int,long, ...), but you modify the object itself.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

2.1m questions

2.1m answers

60 comments

57.0k users

...