I am studying Data Structures and Algorithms in C#/Java. After encountering a solution to the problem of Linked List duplicate removal, I have been struggling to understand it.
The solution is the one proposed by the renowned book Cracking the coding Interview (5th edition, page 208).
void RemoveDuplicates_HashSet(Node n)
{
HashSet<object> set = new HashSet<object>();
Node previous = null;
while (n != null)
{
if (set.Contains(n.Data)) // Condition 1
previous.Next = n.Next;
else // Condition 2
{
set.Add(n.Data);
previous = n;
}
n = n.Next;
}
}
Running the code with the following linked list A->B->A->B
:
// Creating test Singly LinkedList
Node n = new Node("A");
n.Next = new Node("B");
n.Next.Next = new Node("A");
n.Next.Next.Next = new Node("B");
RemoveDuplicates_HashSet(n);
Works perfectly fine: the value of n
after the method is A->B
.
By following the code with a debugger, I can see that what happens in the method loop is the following:
| Pass | HashSet | n | previous | Comment |
| ---- | ------- | ---------- | ---------- | ------------------------ |
| – | – | A->B->A->B | null | |
| 1 | A | B->A->B | A->B->A->B | Condition 2 is triggered |
| 2 | A,B | A->B | B->A->B | Condition 2 is triggered |
| 3 | A,B | B | B->B | Condition 1 is triggered |
| 4 | A,B | null | B | Condition 1 is triggered |
I fail to understand how this actually results in several ways:
Where/how exactly are duplicates dropped from n
? I understand that HashSet contains only unique elements, and it will therefore detect if an element was encountered already, however I still cannot see how the algorithm works in its entirety.
How is it that the values pointed to by n
are updated to be A->B
? Where is it that, given that essentially the loop is simply iterating over the Linked List doing n = n.Next
, n
is actually updated with the final value A->B
? I understand that the list is passed by reference, but I cannot see how it is actually modified.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…