If I have a Python list that has many duplicates, and I want to iterate through each item but not through the duplicates, is it best to use a set (as in set(mylist)), or to find another way to create a list without duplicates? I was thinking of just looping through the list and checking for duplicates, but I figured that's what set() does when it's initialized.
So if mylist = [3,1,5,2,4,4,1,4,2,5,1,3] and I really just want to loop through [1,2,3,4,5] (order doesn't matter), should I use set(mylist) or something else?
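To make the options concrete, here is a minimal sketch of three common ways to iterate over the distinct values of the example list. The variable names are illustrative only, and the dict.fromkeys variant assumes Python 3.7 or later, where dicts keep insertion order.

```python
mylist = [3, 1, 5, 2, 4, 4, 1, 4, 2, 5, 1, 3]

# set() drops duplicates; the iteration order of a set is not guaranteed.
unique_any_order = set(mylist)

# sorted() gives a deterministic ascending order if that is ever needed.
unique_sorted = sorted(unique_any_order)

# dict.fromkeys() keeps the first-seen order (assumes Python 3.7+).
unique_first_seen = list(dict.fromkeys(mylist))

for value in unique_any_order:
    print(value)
```

If order really doesn't matter, iterating over set(mylist) directly is the shortest of the three; the other two only matter when a particular order is wanted.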
An alternative is possible in the last example: since the list contains every integer between its min and max value, I could loop through range(min(mylist), max(mylist) + 1) (the + 1 is needed because range excludes its upper bound) or through set(mylist). Should I generally try to avoid using set in this case? Also, would finding the min and max be slower than just creating the set?
In the case in the last example, the set is faster:
from numpy.random import randint

# 1,000,000 integer ids drawn from 1..1000 inclusive.
# randint replaces the deprecated random_integers; its upper bound is exclusive.
ids = randint(1, 1001, size=10**6)

def set_loop(mylist):
    idlist = []
    for id in set(mylist):
        idlist.append(id)
    return idlist

def list_loop(mylist):
    idlist = []
    # + 1 so that the largest value is included
    for id in range(min(mylist), max(mylist) + 1):
        idlist.append(id)
    return idlist

%timeit set_loop(ids)
# 1 loops, best of 3: 232 ms per loop
%timeit list_loop(ids)
# 1 loops, best of 3: 408 ms per loop
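Since %timeit only works inside IPython, the comparison can also be sketched with the standard-library timeit module. This is a rough sketch under stated assumptions, not a reproduction of the numbers above: the helper names unique_via_set and unique_via_range are made up for illustration, the data is a plain Python list of 100,000 values rather than a NumPy array of 1,000,000, and timings will vary by machine.

```python
import random
import timeit

# Assumption: 100,000 values in 1..1000, smaller than the data above so the
# sketch runs quickly; plain Python ints instead of a NumPy array.
random.seed(0)
ids = [random.randint(1, 1000) for _ in range(100_000)]

def unique_via_set(values):
    # Hypothetical helper: distinct values in arbitrary order.
    return list(set(values))

def unique_via_range(values):
    # Hypothetical helper: only equivalent to the set version when every
    # integer between min and max is present in the input.
    return list(range(min(values), max(values) + 1))

runs = 10
set_time = timeit.timeit(lambda: unique_via_set(ids), number=runs)
range_time = timeit.timeit(lambda: unique_via_range(ids), number=runs)

print(f"set:   {set_time / runs * 1e3:.2f} ms per call")
print(f"range: {range_time / runs * 1e3:.2f} ms per call")
```

Both approaches have to scan the whole input at least once (set builds a hash table in one pass, while min and max each make a full pass), so which one wins depends on the data type and size.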
question from:
https://stackoverflow.com/questions/15102052/better-faster-to-loop-through-set-or-list