Most array operations execute significantly faster than corresponding list operations. To demonstrate, we’ll use the IPython %timeit magic command, which times the average duration of operations. Note that the times displayed on your system may vary from what we show here.
Timing the Creation of a List Containing Results of 6,000,000 Die Rolls
We’ve demonstrated rolling a six-sided die 6,000,000 times. Here, let’s use the random module’s randrange function with a list comprehension to create a list of six million die rolls and time the operation using %timeit. Note that we used the line-continuation character (\) to split the statement in snippet [2] over two lines:
In [1]: import random
In [2]: %timeit rolls_list = \
   ...: [random.randrange(1, 7) for i in range(0, 6_000_000)]
6.29 s ± 119 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
By default, %timeit executes a statement in a loop, and it runs the loop seven times. If you do not indicate the number of loops, %timeit chooses an appropriate value. In our testing, operations that on average took more than 500 milliseconds iterated only once, and operations that took fewer than 500 milliseconds iterated 10 times or more.
After executing the statement, %timeit displays the statement’s average execution time, as well as the standard deviation of all the executions. On average, %timeit indicates that it took 6.29 seconds (s) to create the list with a standard deviation of 119 milliseconds (ms). In total, running the statement seven times took about 44 seconds.
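Outside IPython, the standard library's timeit module offers a comparable way to choose a loop count automatically. The following is a minimal sketch of that idea, not a description of how %timeit itself is implemented; the scaled-down workload of 60,000 rolls is an assumption made so the example finishes quickly, and the 0.2-second threshold belongs to timeit.Timer.autorange, not to the 500-millisecond behavior observed above.

```python
import random
import timeit

# Hypothetical, scaled-down workload: 60,000 die rolls instead of 6,000,000.
timer = timeit.Timer(
    stmt="[random.randrange(1, 7) for _ in range(60_000)]",
    globals={"random": random},
)

# autorange() tries increasing loop counts (1, 2, 5, 10, 20, ...) until one
# batch of loops takes at least 0.2 seconds, then returns that loop count
# together with the total time the batch took.
loops, elapsed = timer.autorange()

print(f"{loops} loop(s) took {elapsed:.3f} s in total")
print(f"about {elapsed / loops * 1000:.1f} ms per loop")
```

A slow statement ends up with a small loop count and a fast statement with a large one, which is the same trade-off the loop counts in the %timeit outputs reflect.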
Timing the Creation of an array Containing Results of 6,000,000 Die Rolls
Now, let’s use the randint function from the numpy.random module to create an array of 6,000,000 die rolls:
In [3]: import numpy as np
In [4]: %timeit rolls_array = np.random.randint(1, 7, 6_000_000)
72.4 ms ± 635 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
On average, %timeit indicates that it took only 72.4 milliseconds with a standard deviation of 635 microseconds (μs) to create the array. Because %timeit ran the statement 70 times (7 runs of 10 loops each), the snippet took about 5 seconds in total on our computer. Per execution, creating the array took about 1/87th of the time the list comprehension in snippet [2] took, so the operation is nearly two orders of magnitude faster with array!
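To try the comparison outside IPython, the two approaches can be placed side by side in an ordinary script. This is a rough sketch under a few assumptions: NumPy is installed, the roll count is scaled down to 600,000 so it runs quickly, and each approach is timed only once with time.perf_counter, so the numbers are far noisier than %timeit's averages.

```python
import random
import time

import numpy as np

ROLLS = 600_000  # scaled down from 6,000,000 so the sketch runs quickly

# Time the list comprehension once.
start = time.perf_counter()
rolls_list = [random.randrange(1, 7) for _ in range(ROLLS)]
list_seconds = time.perf_counter() - start

# Time the NumPy call once.
start = time.perf_counter()
rolls_array = np.random.randint(1, 7, ROLLS)
array_seconds = time.perf_counter() - start

# Both calls exclude the upper bound 7, so every roll lies in the range 1-6.
print(min(rolls_list), max(rolls_list))
print(rolls_array.min(), rolls_array.max())
print(f"list: {list_seconds:.4f} s, array: {array_seconds:.4f} s")
```

The exact times depend on the machine and on what else it is doing, so a single pair of measurements like this should be read as an indication only.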
60,000,000 and 600,000,000 Die Rolls
Now, let’s create an array of 60,000,000 die rolls:
In [5]: %timeit rolls_array = np.random.randint(1, 7, 60_000_000)
873 ms ± 29.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
On average, it took only 873 milliseconds to create the array. Finally, let’s do 600,000,000 die rolls:
In [6]: %timeit rolls_array = np.random.randint(1, 7, 600_000_000)
10.1 s ± 232 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
It took about 10 seconds to create 600,000,000 elements with NumPy vs. about 6 seconds to create only 6,000,000 elements with a list comprehension. Based on these timing studies, you can see clearly why arrays are preferred over lists for compute-intensive operations. In the data science case studies, we’ll enter the performance-intensive worlds of big data and AI.
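One way to put the four measurements on a common scale is to convert each into rolls generated per second. The sketch below does only that arithmetic; the element counts and mean times are copied from the %timeit outputs in this section, and the dictionary labels are our own.

```python
# (number of rolls, mean seconds per loop), copied from the %timeit outputs
timings = {
    "list, 6,000,000 rolls": (6_000_000, 6.29),
    "array, 6,000,000 rolls": (6_000_000, 72.4e-3),
    "array, 60,000,000 rolls": (60_000_000, 873e-3),
    "array, 600,000,000 rolls": (600_000_000, 10.1),
}

# Rolls generated per second for each measurement.
throughput = {
    label: count / seconds for label, (count, seconds) in timings.items()
}

for label, rate in throughput.items():
    print(f"{label}: {rate / 1e6:.1f} million rolls per second")

# Ratio of array throughput to list throughput at the same problem size.
speedup = throughput["array, 6,000,000 rolls"] / throughput["list, 6,000,000 rolls"]
print(f"array speedup at 6,000,000 rolls: about {speedup:.0f}x")
```

By these figures, the list comprehension produces roughly one million rolls per second, while the array version produces tens of millions per second at every size measured, a speedup of about 87x at 6,000,000 rolls.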
Customizing the %timeit Iterations
The number of iterations within each %timeit loop and the number of loops are customizable with the -n and -r options. The following executes snippet [4]’s statement three times per loop and runs the loop twice:
In [7]: %timeit -n3 -r2 rolls_array = np.random.randint(1, 7, 6_000_000)
85.5 ms ± 5.32 ms per loop (mean ± std. dev. of 2 runs, 3 loops each)
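In a plain Python script, the same two settings correspond roughly to the number and repeat arguments of the standard library's timeit.repeat function. The sketch below is an approximation of the %timeit report rather than its actual implementation: the scaled-down statement is hypothetical, and the use of the population standard deviation is our assumption about what %timeit reports.

```python
import random
import statistics
import timeit

LOOPS = 3  # plays the role of -n3: executions per run
RUNS = 2   # plays the role of -r2: number of runs

# Hypothetical, scaled-down statement so the sketch finishes quickly.
totals = timeit.repeat(
    stmt="[random.randrange(1, 7) for _ in range(60_000)]",
    globals={"random": random},
    number=LOOPS,
    repeat=RUNS,
)

# Each entry in totals is the time for LOOPS executions,
# so divide by LOOPS to get a per-loop time for each run.
per_loop = [total / LOOPS for total in totals]
mean = statistics.mean(per_loop)
std_dev = statistics.pstdev(per_loop)  # assumption: population std. dev.

print(f"{mean * 1000:.1f} ms ± {std_dev * 1000:.2f} ms per loop "
      f"(mean ± std. dev. of {RUNS} runs, {LOOPS} loops each)")
```

With only two runs, the standard deviation is a very coarse estimate, which is one reason to keep the default of seven runs unless a statement is especially slow.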
Other IPython Magics
IPython provides dozens of magics for a variety of tasks. For a complete list, see the IPython magics documentation. Here are a few helpful ones:
%load to read code into IPython from a local file or URL.
%save to save snippets to a file.
%run to execute a .py file from IPython.
%precision to change the default floating-point precision for IPython outputs.
%cd to change directories without having to exit IPython first.
%edit to launch an external editor—handy if you need to modify more complex snippets.
%history to view a list of all snippets and commands you’ve executed in the current IPython session.