Hi, Habr. When working with sequences, the question of how to create them comes up often. Many of us are used to list comprehensions, while books insist on using the built-in map.
In this article, we will look at both approaches to working with sequences, compare their performance, and determine which approach fits which situation.
A list comprehension is a list-building construct built into Python. It has only one job: to build a list. It builds the list from any iterable, transforming and optionally filtering the incoming values.
An example of a simple list comprehension that generates the squares of the numbers from 0 to 9:
squares = [x*x for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
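The filtering mentioned above is done with an optional if clause at the end of the comprehension. A minimal sketch (the name even_squares is only for illustration):

```python
# Transform and filter in one expression:
# squares of the even numbers from 0 to 9.
even_squares = [x * x for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]
```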
map is a built-in function. It takes a function as its first argument and an iterable as its second. It returns a lazy iterator (a map object) in Python 3.x or a list in Python 2.x. This article uses Python 3.
An example of calling the map function to generate a list of squares of numbers from 0 to 9:
squares = list(map(lambda x: x*x, range(10)))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
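In Python 3 the object returned by map is lazy, which is why the example wraps it in list(). A short sketch of what that means in practice (the variable names first, second and rest are illustrative):

```python
from collections.abc import Iterator

squares = map(lambda x: x * x, range(10))

# In Python 3 the result is a lazy iterator, not a list.
print(isinstance(squares, Iterator))  # True

# Values are computed only when they are requested.
first = next(squares)   # 0
second = next(squares)  # 1

# list() consumes whatever is left; after that the iterator is exhausted.
rest = list(squares)    # [4, 9, 16, 25, 36, 49, 64, 81]
print(first, second, rest)
print(list(squares))    # []
```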
Build without functions
As an experiment, we will compute the squares of the numbers from 0 to 9,999,999:
python -m timeit -r 10 "[x*x for x in range(10000000)]"
python -m timeit -r 10 "list(map(lambda x: x*x, range(10000000)))"
1 loop, best of 10: 833 msec per loop
1 loop, best of 10: 1.22 sec per loop
As you can see, the list comprehension takes about 32% less time. Disassembling the code does not give a complete answer, because map is a built-in that "hides the details of its work". Most likely the difference comes from calling the lambda function on every element, with the squaring done inside that call. The list comprehension only has to evaluate the expression itself.
Build with functions
For comparison, we will again compute the squares of numbers, but this time the calculation is done inside a function:
python -m timeit -r 10 -s "def pow2(x): return x*x" "[pow2(x) for x in range(10000000)]"
python -m timeit -r 10 -s "def pow2(x): return x*x" "list(map(pow2, range(10000000)))"
1 loop, best of 10: 1.41 sec per loop
1 loop, best of 10: 1.21 sec per loop
This time the situation is reversed: map took about 14% less time. In this example both approaches are in the same position, since both must call a function to compute the square. However, the internal optimizations of map let it show better results.
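The same four measurements can also be run from a script with the timeit module. The sketch below is only one possible setup: N, the repeat counts and the labels in variants are assumptions chosen so that it finishes quickly, and the absolute numbers will differ between machines and Python versions.

```python
import timeit

N = 100_000  # far smaller than the 10,000,000 used above, to keep the run short

def pow2(x):
    return x * x

# The four variants from the two experiments.
variants = {
    "comprehension, inline expression": lambda: [x * x for x in range(N)],
    "map with lambda": lambda: list(map(lambda x: x * x, range(N))),
    "comprehension calling pow2": lambda: [pow2(x) for x in range(N)],
    "map with pow2": lambda: list(map(pow2, range(N))),
}

# Best of 3 repeats, 5 runs each.
timings = {
    name: min(timeit.repeat(func, number=5, repeat=3))
    for name, func in variants.items()
}

for name, seconds in timings.items():
    print(f"{name:35s} {seconds:.4f} s")
```

Taking the minimum over the repeats mirrors the "best of" figure that the command-line tool reports.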
What to choose?
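The rule for choosing the right method follows from the two experiments above: if the computation can be written inline as an expression, use a list comprehension; if a function has to be called anyway, use map.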
There may be exceptions to this rule, but in most cases it will help you make the right choice!
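One way to read the two experiments above, sketched with hypothetical data (words, numbers and the result names are not from the measurements):

```python
words = ["habr", "python", "map"]
numbers = [1, 2, 3]

# The computation is a plain expression:
# a list comprehension avoids a function call per item.
doubled = [n * 2 for n in numbers]

# A ready-made function already exists: map can apply it directly.
upper = list(map(str.upper, words))

print(doubled)  # [2, 4, 6]
print(upper)    # ['HABR', 'PYTHON', 'MAP']
```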
Is map "safer"?
Why do so many people urge the use of map? The fact is that in some cases map really is safer than a list comprehension.
symbols = ['a', 'b', 'c']
values = [1, 2, 3]
for x in symbols:
    print(x)
    squared = [x*x for x in values]
    print(x)
The output of the program will be as follows:
a 3 b 3 c 3
Now let's rewrite the same code using map:
symbols = ['a', 'b', 'c']
values = [1, 2, 3]
for x in symbols:
    print(x)
    squared = map(lambda x: x*x, values)
    print(x)
a a b b c c
The most observant readers may already have noticed from the way map is used that this is Python 2. Indeed, Python 2 had this kind of problem: the loop variable of a list comprehension leaked into the enclosing scope and overwrote x. In Python 3 this has been fixed and is no longer relevant.
In Python 3, both examples above print the same result. It may seem like a silly mistake that you would never make, but it can happen when you simply move a block of code with an inner loop from somewhere else. Such a mistake can cost you a lot of time and nerves to track down.
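A quick way to confirm the Python 3 behaviour is to record the loop variable before and after the comprehension. This is a sketch based on the example above; the names seen and before are added only for the check:

```python
symbols = ['a', 'b', 'c']
values = [1, 2, 3]

seen = []
for x in symbols:
    before = x
    squared = [x * x for x in values]  # this x is local to the comprehension
    seen.append((before, x))           # the loop's x is unchanged in Python 3

print(seen)  # [('a', 'a'), ('b', 'b'), ('c', 'c')]
```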
Comparison showed that each of the methods is good in its situation.
- If you do not need all the computed values at once (or may not need them at all), choose map. You then request values from the iterator as needed and save a large amount of memory (this applies to Python 3; in Python 2 it makes no sense, since map returns a list).
- If you need all the values at once and the computation can be written without calling a function, choose a list comprehension. As the experiments showed, it has a significant performance advantage.
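The memory point in the first item can be illustrated with a rough sketch. The size N and the use of itertools.islice to take the first few values are assumptions made for the example, and sys.getsizeof measures only the container itself, not the integers it refers to:

```python
import sys
from itertools import islice

N = 1_000_000

lazy = map(lambda x: x * x, range(N))  # nothing is computed yet
eager = [x * x for x in range(N)]      # all N values are computed and stored

# Take only the first five values; the rest are never calculated.
head = list(islice(lazy, 5))
print(head)  # [0, 1, 4, 9, 16]

# The map object stays tiny, while the list grows with N.
print(sys.getsizeof(lazy), "bytes for the map object")
print(sys.getsizeof(eager), "bytes for the list")
```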
PS: If I missed something, I’m happy to discuss it with you in the comments.