Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Fast ways to avoid duplicates in a List<> in C#

My C# program generates random strings from a given pattern and stores them in a list. Since no duplicates are allowed, I'm doing it like this:

List<string> myList = new List<string>();
for (int i = 0; i < total; i++) {
  string random_string = GetRandomString(pattern);
  if (!myList.Contains(random_string)) myList.Add(random_string);
}

As you can imagine, this works fine for several hundred entries. But I now need to generate several million strings, and with each added string the duplicate check gets slower and slower.

Are there any faster ways to avoid duplicates?

Question from: https://stackoverflow.com/questions/17278593/fast-ways-to-avoid-duplicates-in-a-list-in-c-sharp


1 Answer


Use a data structure that can determine much more efficiently whether an item exists, namely a HashSet<T>. A lookup in a HashSet<T> takes constant time on average, regardless of the number of items in the set, whereas List<T>.Contains has to scan the list, so each check gets slower as the list grows.
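As a rough sketch of that suggestion, the loop from the question could be rewritten around a HashSet<string>. The question does not show GetRandomString, so the version below is a hypothetical stand-in that replaces each '#' in the pattern with a random digit; the class and method names are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class UniqueStrings
{
    private static readonly Random Rng = new Random();

    // Hypothetical stand-in for the question's GetRandomString(pattern):
    // every '#' in the pattern becomes a random digit, other characters are kept.
    public static string GetRandomString(string pattern)
    {
        char[] chars = pattern.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            if (chars[i] == '#') chars[i] = (char)('0' + Rng.Next(10));
        }
        return new string(chars);
    }

    // Same loop as in the question, but backed by a HashSet<string>.
    public static HashSet<string> Generate(string pattern, int total)
    {
        var seen = new HashSet<string>();
        for (int i = 0; i < total; i++)
        {
            // Add leaves the set unchanged (and returns false) if the
            // string is already present, so no separate Contains call is needed.
            seen.Add(GetRandomString(pattern));
        }
        return seen;
    }
}
```

As in the original loop, this makes `total` attempts, so the set may end up with fewer than `total` strings when duplicates are generated.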

If you really need the items in a List, or you need the resulting list to preserve the order in which the items were generated, you can store the data in both a List and a HashSet, adding each item to both collections only if it is not already in the HashSet.
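The two-collection approach might look roughly like the following. This is a minimal sketch, not a definitive implementation: the `next` delegate is a hypothetical hook for whatever produces the strings (in the question it would be `() => GetRandomString(pattern)`), and the class name is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class OrderedUniqueStrings
{
    // 'next' is a hypothetical hook for the string generator.
    public static List<string> Generate(Func<string> next, int total)
    {
        var seen = new HashSet<string>();   // fast membership checks
        var ordered = new List<string>();   // keeps generation order

        for (int i = 0; i < total; i++)
        {
            string candidate = next();
            // HashSet<T>.Add returns true only when the item was not
            // already in the set, so it doubles as the duplicate check.
            if (seen.Add(candidate))
            {
                ordered.Add(candidate);
            }
        }
        return ordered;
    }
}
```

Like the loop in the question, this makes `total` attempts and can return fewer than `total` items. If exactly `total` unique strings are required, looping while `ordered.Count < total` is one option, provided the pattern can actually produce that many distinct strings.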

