Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
238 views
in Technique[技术] by (71.8m points)

Creating a Python list from a map using dictionary get() as a function vs creating a dictionary get() list with a for loop

I was very surprised that a seemingly easier and less taxing function was less efficient.

I ran a test on two methods, with a 10000 a-j (randomized) characters long string as word. The old function is a, the new one is b:

def a(word):
dictionary = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6, "g": 7, "h": 8, "i": 9, "j": 10,}
word = list(map(dictionary.get, list(word)))
return word

def b(word):
dictionary = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6, "g": 7, "h": 8, "i": 9, "j": 10,}
word = [dictionary.get(l, l) for l in word]
return word

map is supposed to have a tiny speed advantage vs a for loop: https://stackoverflow.com/a/1247490/12494235. However, in my case, the advantage is significant. Many more operations are being executed in function b, and it consistently takes 0.001-0.003 seconds longer. Also, converting the map iterable into a list should slow down a.

A - creating a list from a map using dictionary get() as a function

         160 function calls (159 primitive calls) in 0.002 seconds

   Ordered by: call count
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       12    0.000    0.000    0.000    0.000 {method 'rstrip' of 'str' objects}
        7    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
        6    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:222(_verbose_message)
        6    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
        ...

B - creating a dictionary get() list with a for loop

         10161 function calls (10160 primitive calls) in 0.004 seconds

   Ordered by: call count
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10002    0.001    0.000    0.001    0.000 {method 'get' of 'dict' objects}
       12    0.000    0.000    0.000    0.000 {method 'rstrip' of 'str' objects}
        7    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
        6    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:222(_verbose_message)
        6    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
        ...

How is that even possible? Did I do something wrong when profiling the two? Why is a for loop so inefficient?


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

for example, if you have, arr1 and arr2, both length n = 10^9.

def A:
   for i in range(n):
      arr1[i] = 1
      arr2[i] = 2

def B:
   for i in range(n):
      arr1[i] = 1

   for i in range(n):
      arr2[i] = 2

you should think that A should be faster than B, but it isn't... because in the case of A, the compiler needs to jump from arr1 ptr to arr2 ptr in every iter. and even B has 2 for, but the operation is called only for arr1 at first, and then only arr2 - it's much efficient. so everything depends on how the method is implemented and how the compiler works, or what object is being processed, and how it is saved in memory. hope you can understand why for loop should be so inefficient or not...


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...