Sunday, December 9, 2018

Profiling in Python

Sometimes a program runs very slowly or consumes far more memory than we expect, the problem is mostly due to our choice of algorithms or data structures, or due to our doing an inefficient implementation. Irrespective of reason for the problem, it is best to find out precisely where the problem lies rather than just inspecting our code and trying to optimize it as this can cause us to introduce bugs or to speed up parts of our program that actually have no effect on the program’s overall performance because the improvements are not in places where the interpreter spends most of its time.

For e.g. which of the Python functions range or xrange is going to be faster. It turns out that knowing which one is faster may never be an issue in your program, and that the time gained by one of the functions above might not be worth the time you spend researching that, or arguing about it with your colleague.

Profiling a Python program is doing a dynamic analysis that measures the execution time of the program and everything that compose it. That means measuring the time spent in each of its functions. This will give you data about where your program is spending time, and what area might be worth optimizing.

Profiling is a form of program analysis. It can help us answer questions such as:

    how much memory is the program using?
    how much time is spent on functions?
    how much time is spent on statements?


Python’s standard library provides two modules that are particularly useful when we want to investigate the performance of our code. One of these is the timeit module—this is useful for timing small pieces of Python code,and can be used, for example, to compare the performance of two or more implementations of a particular function or method. The other is the cProfile module which can be used to profile a program’s performance—it provides a detailed breakdown of call counts and times and so can be used to find performance bottlenecks.


The timeit module


This is suitable for testing the time of short code (a few short lines).

The syntax:

import timeit

timeit.timeit(stmt='pass', setup='pass', timer=<default timer>, number=1000000)

timeit.repeat(stmt='pass', setup='pass', timer=<default timer>, repeat=3, number=1000000)


stmt: this is the statement that is being timed. It is a string; it is enclosed by quotes and can include newlines. This statement can be run or repeated a number of times.


setup: this is the setup statement - to setup variables for example. The setup is run only once - at setup.


timer: usually the default timer for the platform you are using (Windows or Unix)


repeat: how many times should the statement be repeated in this test. All of the responses are returned in a list.


number: controls how many times to run the statement; the default is 1,000,000.


Note: It is not required to include the keywords of the positional arguments like stmt and setup. You do need repeat or number.

To understand the timeit module, we will look at a small example. Suppose we have three functions, function_a(), function_b(), and function_c(), all of which perform the same computation, but each using a different algorithm.

If we put all these functions into a module (or import them), we can run them using the timeit module to see how they compare. Here is the code that we would use at the end of the module:


if __name__ == "__main__":

      repeats = 1000

     for function in ("function_a", "function_b", "function_c"):

         t = timeit.Timer("{0}(X, Y)".format(function),

                "from __main__ import {0}, X, Y".format(function))

         sec = t.timeit(repeats) / repeats

         print("{function}() {sec:.6f} sec".format(**locals()))


The first argument given to the timeit.Timer() constructor is the code we want to execute and time, in the form of a string. Here, the first time around the loop, the string is "function_a(X, Y)". The second argument is optional; again it is a string to be executed, this time before the code to be timed so as to provide some setup. Here we have imported from the __main__ (i.e., this) module the function we want to test, plus two variables that are passed as input data (X and Y), and that are available as global variables in the module. We could just as easily have imported the function and data from a different module.

When the timeit.Timer object’s timeit() method is called, it will first execute the constructor’s second argument—if there was one—to set things up, and then it will execute the constructor’s first argument—and time how long the execution takes. The timeit.Timer.timeit() method’s return value is the time taken in seconds, as a float. By default, the timeit() method repeats 1 million times and returns the total seconds for all these executions, but in this particular case we needed only 1000 repeats to give us useful results, so we specified the repeat count explicitly. After timing each function we divide the total by the number of repeats to get its mean (average) execution time and print the function’s name and execution time on the console.

function_a() 0.001618 sec

function_b() 0.012786 sec

function_c() 0.003248 sec

In this example, function_a() is clearly the fastest—at least with the input data that was used. In some situations—for example, where performance can vary considerably depending on the input data—we might have to test each function with multiple sets of input data to cover a representative set of cases
and then compare the total or average execution times.

It isn’t always convenient to instrument our code to get timings, and so the timeit module provides a way of timing code from the command line. For example, to time function_a() from the MyModule.py module, we would enter the following in the console: python3 -m timeit -n 1000 -s "from MyModule import function_a, X, Y" "function_a(X, Y)". (As usual, for Windows, we must replace python3 with something like C:\Python31\python.exe.) The -m option is for the Python interpreter and tells it to load the specified module (in this case timeit) and the other options are handled by the timeit module. The -n option specifies the repetition count, the -s option specifies the setup, and the last argument is the code to execute and time. After the command has finished it prints its results on the console, for example:


1000 loops, best of 3: 1.41 msec per loop


We can easily then repeat the timing for the other two functions so that we can compare them all.

Let's look at a few more examples:

import timeit

setup = "c= {'Quiz1': 90, 'Quiz2':91, 'Test1':89, 'Quiz3':87, 'Test2':90, 'Quiz4':89, 'Final':88}"

stmt = """c['Final']=c['Final'] + 3 """

timeit.repeat(stmt=stmt, setup=setup, repeat=10, number=1)


This is testing how long it will take to add 3 points to the Final mark.

First the setup, written as a string: the initial marks are set up in a dictionary; each quiz or test is a keyword.

Then the statement, written as a string. This adds 3 to the Final mark.

Lastly, the timeit. We are repeating this 10 times - so we can see the various times it takes to run this code; it is returned as a list.


[0.002605915069580078, 0.0019431114196777344, 0.0020720958709716797, 0.0030710697174072266, 0.0030739307403564453, 0.00308990478515625, 0.0024268627166748047 , 0.002237081527709961, 0.0023369789123535156, 0.0021729469299316406]


We can use timeit with a function too:


def add(x, y):  

    return x + y


import timeit

timeit.repeat("add(4,5)", "from __main__ import add", repeat=3, number=1000)


First define your function, as normally.

Then in the timeit code,
  • include the function in quotes.
  • add setup = "from __main__ import funcname" . Notice the double underscores around the  word main __main__ .
  

The cProfile module

Python provides a C module called cProfile which has a reasonable overhead and offers a good enough feature set. The profiler gives the total running time, tells the function call frequency and much more data.The cProfile module (or the profile module—we will refer to them both as the cProfile module) can also be used to compare the performance of functions and methods. And unlike the timeit module that just provides raw timings, the cProfile module shows precisely what is being called and how long each call takes.

Let's see a simple example:

import math


print( math.factorial(1024) )

print( math.sin(1024) )


Instead of executing it the traditional way, run python like this:

python -m cProfile domath.py

You will see something like this:

3 function calls in 0.003 seconds

Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)

        1    0.001    0.001    0.003    0.003 factorial.py:1(<module>)

        1    0.001    0.001    0.001    0.001 {math.factorial}

        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


To visualize the profiling data we can use these steps:

python -m cProfile -o out.profile domath.py

sudo easy_install SquareMap RunSnakeRun

runsnake out.profile


A window will show up:





A lot of time is spent in the factorial function, the sinus function hardly takes any time (the sinus function running time is the small block on the right). We can iterate over each function and show the running time as percentage.

Lets us see one more example:

>>> import cProfile

>>> cProfile.run('2 + 2')

         2 function calls in 0.000 seconds


The output will be:

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)

        1    0.000    0.000    0.000    0.000 <string>:1(<module>)

        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


Though you can also run a script with it, which turns out to be handy:


$ python -m cProfile -s cumtime lwn2pocket.py

   72270 function calls (70640 primitive calls) in 4.481 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)

        1    0.004    0.004    4.481    4.481 lwn2pocket.py:2(<module>)

        1    0.001    0.001    4.296    4.296 lwn2pocket.py:51(main)

        3    0.000    0.000    4.286    1.429 api.py:17(request)

        3    0.000    0.000    4.268    1.423 sessions.py:386(request)

      4/3    0.000    0.000    3.816    1.272 sessions.py:539(send)

        4    0.000    0.000    2.965    0.741 adapters.py:323(send)

        4    0.000    0.000    2.962    0.740 connectionpool.py:421(urlopen)

        4    0.000    0.000    2.961    0.740 connectionpool.py:317(_make_request)

        2    0.000    0.000    2.675    1.338 api.py:98(post)

       30    0.000    0.000    1.621    0.054 ssl.py:727(recv)

       30    0.000    0.000    1.621    0.054 ssl.py:610(read)

       30    1.621    0.054    1.621    0.054 {method 'read' of '_ssl._SSLSocket' objects}

        1    0.000    0.000    1.611    1.611 api.py:58(get)

        4    0.000    0.000    1.572    0.393 httplib.py:1095(getresponse)

        4    0.000    0.000    1.572    0.393 httplib.py:446(begin)

       60    0.000    0.000    1.571    0.026 socket.py:410(readline)

        4    0.000    0.000    1.571    0.393 httplib.py:407(_read_status)

        1    0.000    0.000    1.462    1.462 pocket.py:44(wrapped)

        1    0.000    0.000    1.462    1.462 pocket.py:152(make_request)

        1    0.000    0.000    1.462    1.462 pocket.py:139(_make_request)

        1    0.000    0.000    1.459    1.459 pocket.py:134(_post_request)

[…]


This prints out all the function called, with the time spend in each and the number of times they have been called.

Using the timeit and cProfile modules we can identify areas of our code that might be taking more time than expected, and using the cProfile module, we can find out exactly where the time is being taken.

Here our discussion about Profiling comes to an end. So till we meet next keep practicing and learning Python as Python is easy to learn!













Share:

0 comments:

Post a Comment