Monday, December 10, 2018

Logging

Logging is a great way to understand what’s happening in our program and in what order its happening. Python’s logging module makes it easy to create a record of custom messages that we write. These log messages will describe when the program execution has reached the logging function call and list any variables we have specified at that point in time. On the other hand, a missing log message indicates a part of the code was skipped and never executed.

Whenever we put a print() statement in our code to output some variable’s value while our program is running, we’ve used a form of logging to debug our code.

The logging module 

To enable the logging module to display log messages on our screen as our program runs, copy the following to the top of your program:

import logging
logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s
- %(message)s')


When Python logs an event, it creates a LogRecord object that holds information about that event. The logging module’s basicConfig() function lets us specify what details about the LogRecord object we want to see and how we want those details displayed.

Let's understand logging with the help of a simple example of factorial. The code shown below has a bug and we will also enter several log messages to  figure out what is going wrong. See the code below:

import logging
logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s- %(message)s')
logging.debug('Start of program')

def factorial(n):
   
    logging.debug('Start of factorial(%s%%)' % (n))
    value = 1
   
    for num in range(n+1):
       
        value*=num
        logging.debug('num is ' + str(num) + ', total is ' + str(value))
               
     logging.debug('End of factorial(%s%%)' % (n))
     return value
   
print(factorial(5))
logging.debug('End of the program')  


The logging.debug() function is used when we want to print log information. This debug() function will call basicConfig(), and a line of information will be printed. This information will be in the format we specified in basicConfig() and will include the messages we passed to debug(). The
print(factorial(5)) call is part of the original program, so the result is displayed even if logging messages are disabled.

The output of this program looks like this: 





The factorial() function is returning 0 as the factorial of 5, which isn’t right. The for loop should be multiplying the value in total by the numbers from 1 to 5. But the log messages displayed by logging.debug() show that the num variable is starting at 0 instead of 1. Since zero times anything is zero, the rest of the iterations also have the wrong value.  Logging messages can help us figure out when things started to go wrong.

Thus we no that we have to do some changes in the range() so that it starts from 1 rather than zero. Let's change for num in range(n+1) to for num in range(1,n+1) and run the program again. The output should be similar to the this:



Now he factorial(5) call correctly returns 120. The log messages showed what was going on inside the loop, which led straight to the bug. We can see that the logging.debug() calls printed out not just the strings passed to them but also a timestamp and the word DEBUG.

One might wonder why not use print() calls rather than typing import logging and logging.basicConfig(level=logging.DEBUG, format= '%(asctime)s - %(levelname)s - %(message)s').

Once we’re done debugging, we’ll end up spending a lot of time removing print() calls from your code for each log message and might even accidentally remove some print() calls that were being used for nonlog messages.  We can use log messages in our program as per need and can always disable them later by adding a single logging.disable(logging.CRITICAL) call. Unlike print(), the logging module makes it easy to switch between showing and hiding log messages.

Log messages are intended for the programmer, not the user. The user won’t care about the contents of some dictionary value you need to see to help with debugging; use a log message for something like that. For messages that the user will want to see, like File not found or Invalid input, please
enter a number, we should use a print() call. We don’t want to deprive the user of useful information after we’ve disabled log messages. Thus, the logging module offers multiple benefits, including:
  • Multi-threading support
  • Categorization via different levels of logging
  • Flexibility and configurability
  • Separation of the how from the what
This last point, the actual separation of the 'what' we log from the 'how' we log enables the collaboration between different parts of the software. As an example, it allows the developer of a framework or library to add logs and let the sysadmin or person in charge of the runtime configuration decide what should be logged at a later point.

Logging levels

A log can contain different kind of messages and each message has its own level of importance. These levels, in Python, are defined in order of verbosity (from the most verbose to the less verbose) as follows:
  • DEBUG: Detailed information, typically of interest only when diagnosing problems.
  • INFO: Confirmation that things are working as expected.
  • WARNING: An indication that something unexpected happened, or indicative of some problem in the near future (e.g. ‘disk space low’). The software is still working as expected.
  • ERROR: Due to a more serious problem, the software has not been able to perform some function.
  • CRITICAL: A serious error, indicating that the program itself may be unable to continue running.
So, each time we want to write something in our log we have to decide what kind of message we are writing to the log, and hence what level it should have.

The corresponding methods for each level can be called as shown in the following example:

import logging

logging.debug('This is a debug message')
logging.info('This is an info message')
logging.warning('This is a warning message')
logging.error('This is an error message')
logging.critical('This is a critical message')

The output of the above program would look like this:

WARNING:root:This is a warning message
ERROR:root:This is an error message
CRITICAL:root:This is a critical message

The output shows the severity level before each message along with root, which is the name the logging module gives to its default logger. (Loggers are discussed in detail in later sections.) This format, which shows the level, name, and message separated by a colon (:), is the default output format that can be configured to include things like timestamp, line number, and other details.

Notice that the debug() and info() messages didn’t get logged. This is because, by default, the logging module logs the messages with a severity level of WARNING or above. You can change that by configuring the logging module to log events of all levels if you want. You can also define your own severity levels by changing configurations, but it is generally not recommended as it can cause confusion with logs of some third-party libraries that you might be using.

The benefit of logging levels is that you can change what priority of logging message you want to see. Passing logging.DEBUG to the basicConfig() function’s level keyword argument will show messages from all the logging levels (DEBUG being the lowest level). But after developing your program some more, you may be interested only in errors. In that case, you can set basicConfig()’s level argument to logging.ERROR. This will show only ERROR and CRITICAL messages and skip the DEBUG, INFO, and WARNING messages.

Logging module is packed with several features. It has several constants, classes, and methods. The items with all caps are constant, the capitalize items are classes and the items which start with lowercase letters are methods.

There are several logger objects offered by the module itself.

Logger.info(msg) : This will log a message with level INFO on this logger.
Logger.warning(msg) : This will log a message with level WARNING on this logger.
Logger.error(msg) : This will log a message with level ERROR on this logger.
Logger.critical(msg) : This will log a message with level CRITICAL on this logger.
Logger.log(lvl,msg) : This will Logs a message with integer level lvl on this logger.
Logger.exception(msg) : This will log a message with level ERROR on this logger.
Logger.setLevel(lvl) : This function sets the threshold of this logger to lvl. This means that all the messages below this level will be ignored.
Logger.addFilter(filt) : This adds a specific filter filt to the to this logger.
Logger.removeFilter(filt) : This removes a specific filter filt to the to this logger.
Logger.filter(record) : This method applies the logger’s filter to the record provided and returns True if record is to be processed. Else, it will return False.
Logger.addHandler(hdlr) : This adds a specific handler hdlr to the to this logger.
Logger.removeHandler(hdlr) : This removes a specific handler hdlr to the to this logger.
Logger.hasHandlers() : This checks if the logger has any handler configured or not.
The second thing that we have to know is how to configure a logger in Python. It’s really easy actually, we just need three components:

a logger: the object that our code will use to log
at least one handler: the object that will handle the writing of our log to the target device
a formatter: an object that define the format of our log for the handler

Configuring our loggers

The logging module beautifully separates the responsibility of each of its parts. It provides you with a default logger that allows you to get started without needing to do much configuration. Loggers are the objects a developer usually interacts with. They are the main API that we use to indicate 'what' we want to log.

Given an instance of a Logger, we can categorize and ask for messages to be emitted without worrying about how or where it will be emitted. There are multiple ways to configure the logging stack:

1. basicConfig()

We can use the basicConfig(**kwargs) method to configure the logging. Some of the commonly used parameters for basicConfig() are the following:

  • level: The root logger will be set to the specified severity level.
  • filename: This specifies the file.
  • filemode: If filename is given, the file is opened in this mode. The default is a, which means append.
  • format: This is the format of the log message.

By using the level parameter, you can set what level of log messages you want to record. This can be done by passing one of the constants available in the class, and this would enable all logging calls at or above that level to be logged. Here’s an example:

import logging

logging.basicConfig(level=logging.DEBUG)
logging.debug('This will get logged')

The output will be:


DEBUG:root:This will get logged

All events at or above DEBUG level will now get logged.

Similarly, for logging to a file rather than the console, filename and filemode can be used, and you can decide the format of the message using format. The following example shows the usage of all three:

import logging

logging.basicConfig(filename='app.log', filemode='w', format='%(name)s - %(levelname)s - %(message)s')
logging.warning('This will get logged to a file')

The output will be:

root - ERROR - This will get logged to a file

The message will look like this but will be written to a file named app.log instead of the console. The filemode is set to w, which means the log file is opened in “write mode” each time basicConfig() is called, and each run of the program will rewrite the file. The default configuration for filemode is a, which is append.

We can customize the root logger even further by using more parameters for basicConfig(). It should be noted that calling basicConfig() to configure the root logger works only if the root logger has not been configured before. Basically, this function can only be called once.

debug(), info(), warning(), error(), and critical() also call basicConfig() without arguments automatically if it has not been called before. This means that after the first time one of the above functions is called, you can no longer configure the root logger because they would have called the basicConfig() function internally.

The default setting in basicConfig() is to set the logger to write to the console in the following format:

ERROR:root:This is an error message

2. dictConfig()

The configuration for all elements and how to connect them can be specified as a dictionary. The dictionary should have different sections for loggers, handlers and formatters, as well as with some basic global parameters.

As an example see:

config = {
    'disable_existing_loggers': False,
    'version': 1,
    'formatters': {
        'short': {
            'format': '%(asctime)s %(levelname)s %(name)s: %(message)s'
        },
    },
    'handlers': {
        'console': {
            'level': 'INFO',
            'formatter': 'short',
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        '': {
            'handlers': ['console'],
            'level': 'ERROR',
        },
        'plugins': {
            'handlers': ['console'],
            'level': 'INFO',
            'propagate': False
        }
    },
}
import logging.config
logging.config.dictConfig(config)

When invoked, dictConfig will disable all existing loggers, unless 'disable_existing_loggers' is set to False. This is usually the desired behavior, as many modules declare a global logger that will be instantiated at import time, before dictConfig is called.

Formatting the Output

While you can pass any variable that can be represented as a string from your program as a message to your logs, there are some basic elements that are already a part of the LogRecord and can be easily added to the output format. If you want to log the process ID along with the level and message, you can do something like this:

import logging

logging.basicConfig(format='%(process)d-%(levelname)s-%(message)s')
logging.warning('This is a Warning')

The output is as shown:

18472-WARNING-This is a Warning

format can take a string with LogRecord attributes in any arrangement you like. The entire list of available attributes can be found here.

Here’s another example where you can add the date and time info:

import logging

logging.basicConfig(format='%(asctime)s - %(message)s', level=logging.INFO)
logging.info('Admin logged in')

The output is as shown:

2018-07-11 20:12:06,288 - Admin logged in

%(asctime)s adds the time of creation of the LogRecord. The format can be changed using the datefmt attribute, which uses the same formatting language as the formatting functions in the datetime module, such as time.strftime():

import logging

logging.basicConfig(format='%(asctime)s - %(message)s', datefmt='%d-%b-%y %H:%M:%S')
logging.warning('Admin logged out')

The output is as shown:

12-Jul-18 20:53:19 - Admin logged out

Logging Variable Data

In most cases, you would want to include dynamic information from your application in the logs. You have seen that the logging methods take a string as an argument, and it might seem natural to format a string with variable data in a separate line and pass it to the log method. But this can actually be done directly by using a format string for the message and appending the variable data as arguments. 

Here’s an example:

import logging

name = 'John'

logging.error('%s raised an error', name)

The output is as shown:

ERROR:root:John raised an error

The arguments passed to the method would be included as variable data in the message.

While you can use any formatting style, the f-strings introduced in Python 3.6 are an awesome way to format strings as they can help keep the formatting short and easy to read:

import logging

name = 'John'

logging.error(f'{name} raised an error')

The output is as shown:

ERROR:root:John raised an error

Capturing Stack Traces

The logging module also allows you to capture the full stack traces in an application. Exception information can be captured if the exc_info parameter is passed as True, and the logging functions are called like this:

import logging

a = 5
b = 0

try:
  c = a / b
except Exception as e:
  logging.error("Exception occurred", exc_info=True)

The output is as shown:

ERROR:root:Exception occurred
Traceback (most recent call last):
  File "exceptions.py", line 6, in <module>
    c = a / b
ZeroDivisionError: division by zero
[Finished in 0.2s]

If exc_info is not set to True, the output of the above program would not tell us anything about the exception, which, in a real-world scenario, might not be as simple as a ZeroDivisionError. Imagine trying to debug an error in a complicated codebase with a log that shows only this:

ERROR:root:Exception occurred

So if you’re logging from an exception handler, use the logging.exception() method, which logs a message with level ERROR and adds exception information to the message. To put it more simply, calling logging.exception() is like calling logging.error(exc_info=True). But since this method always dumps exception information, it should only be called from an exception handler. Take a look at this example:

import logging

a = 5
b = 0
try:
  c = a / b
except Exception as e:
  logging.exception("Exception occurred")

The output is as shown:

ERROR:root:Exception occurred
Traceback (most recent call last):
  File "exceptions.py", line 6, in <module>
    c = a / b
ZeroDivisionError: division by zero
[Finished in 0.2s]

Using logging.exception() would show a log at the level of ERROR. If you don’t want that, you can call any of the other logging methods from debug() to critical() and pass the exc_info parameter as True.

So far, we have seen the default logger named root, which is used by the logging module whenever its functions are called directly like this: logging.debug(). You can (and should) define your own logger by creating an object of the Logger class, especially if your application has multiple modules. Let’s have a look at some of the classes and functions in the module.

The most commonly used classes defined in the logging module are the following:

Logger: This is the class whose objects will be used in the application code directly to call the functions.

LogRecord: Loggers automatically create LogRecord objects that have all the information related to the event being logged, like the name of the logger, the function, the line number, the message, and more.

Handler: Handlers send the LogRecord to the required output destination, like the console or a file. Handler is a base for subclasses like StreamHandler, FileHandler, SMTPHandler, HTTPHandler, and more. These subclasses send the logging outputs to corresponding destinations, like sys.stdout or a disk file.

Formatter: This is where you specify the format of the output by specifying a string format that lists out the attributes that the output should contain.

Out of these, we mostly deal with the objects of the Logger class, which are instantiated using the module-level function logging.getLogger(name). Multiple calls to getLogger() with the same name will return a reference to the same Logger object, which saves us from passing the logger objects to every part where it’s needed. Here’s an example:

import logging

logger = logging.getLogger('example_logger')
logger.warning('This is a warning')

The output is as shown:

This is a warning

This creates a custom logger named example_logger, but unlike the root logger, the name of a custom logger is not part of the default output format and has to be added to the configuration. Configuring it to a format to show the name of the logger would give an output like this:

WARNING:example_logger:This is a warning

Again, unlike the root logger, a custom logger can’t be configured using basicConfig(). You have to configure it using Handlers and Formatters.

Using Handlers

Handlers come into the picture when you want to configure your own loggers and send the logs to multiple places when they are generated. Handlers send the log messages to configured destinations like the standard output stream or a file or over HTTP or to your email via SMTP.
The standard logging module already comes with multiple built-in handlers like:

  • Multiple file handlers (TimeRotated, SizeRotated, Watched) that can write to files
  • StreamHandler can target a stream like stdout or stderr
  • SMTPHandler allows us to send log records via email
  • With SocketHandler, we can send LogRecords to a streaming socket
  • Others like SyslogHandler, NTEventHandler, HTTPHandler or MemoryHandler


A logger that you create can have more than one handler, which means you can set it up to be saved to a log file and also send it over email.

Like loggers, you can also set the severity level in handlers. This is useful if you want to set multiple handlers for the same logger but want different severity levels for each of them. For example, you may want logs with level WARNING and above to be logged to the console, but everything with level ERROR and above should also be saved to a file. Here’s a program logging_example.py that does that:

import logging

# Create a custom logger
logger = logging.getLogger(__name__)

# Create handlers
c_handler = logging.StreamHandler()
f_handler = logging.FileHandler('file.log')
c_handler.setLevel(logging.WARNING)
f_handler.setLevel(logging.ERROR)

# Create formatters and add it to handlers
c_format = logging.Formatter('%(name)s - %(levelname)s - %(message)s')
f_format = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
c_handler.setFormatter(c_format)
f_handler.setFormatter(f_format)

# Add handlers to the logger
logger.addHandler(c_handler)
logger.addHandler(f_handler)

logger.warning('This is a warning')
logger.error('This is an error')

The output is as shown:

__main__ - WARNING - This is a warning
__main__ - ERROR - This is an error

Here, logger.warning() is creating a LogRecord that holds all the information of the event and passing it to all the Handlers that it has: c_handler and f_handler.

c_handler is a StreamHandler with level WARNING and takes the info from the LogRecord to generate an output in the format specified and prints it to the console. f_handler is a FileHandler with level ERROR, and it ignores this LogRecord as its level is WARNING.

When logger.error() is called, c_handler behaves exactly as before, and f_handler gets a LogRecord at the level of ERROR, so it proceeds to generate an output just like c_handler, but instead of printing it to console, it writes it to the specified file in this format:

2018-08-03 16:12:21,723 - __main__ - ERROR - This is an error

The name of the logger corresponding to the __name__ variable is logged as __main__, which is the name Python assigns to the module where execution starts. If this file is imported by some other module, then the __name__ variable would correspond to its name logging_example. Here is the program explaining how it looks:

import logging_example

The output is as shown:

logging_example - WARNING - This is a warning
logging_example - ERROR - This is an error


Buffering logs

Sometimes, we wish we could have had access to debug logs when an error happened. This is feasible by creating a buffered handler that will log the last debug messages once an error occurred. See the following code as a non-curated example:

import logging
import logging.handlers

class SmartBufferHandler(logging.handlers.MemoryHandler):
    def __init__(self, num_buffered, *args, **kwargs):
        kwargs["capacity"] = num_buffered + 2  # +2 one for current, one for prepop
        super().__init__(*args, **kwargs)

    def emit(self, record):
        if len(self.buffer) == self.capacity - 1:
            self.buffer.pop(0)
        super().emit(record)

handler = SmartBufferHandler(num_buffered=2, target=logging.StreamHandler(), flushLevel=logging.ERROR)
logger = logging.getLogger(__name__)
logger.setLevel("DEBUG")
logger.addHandler(handler)

logger.error("Hello1")
logger.debug("Hello2")  # This line won't be logged
logger.debug("Hello3")
logger.debug("Hello4")
logger.error("Hello5")  # As error will flush the buffered logs, the two last debugs will be logged


Send log to a file

We can also configure the logging module to save the log messages in a file, instead of the STDERR by adding the filename parameter to the basicConfig method call:

logging.basicConfig(level = logging.INFO, filename = "my.log")

This will create the log file if it does not exists and append the content of the file, if it already existed.

This means the file can grow in size quite fast so a better idea might be to use some kind of a log-file rotation tool or to create log files for limited time periods. E.g. One file per day. For this we can use the time.strftime method:

import logging
import time

logging.basicConfig(level = logging.INFO, filename = time.strftime("my-%Y-%m-%d.log"))

logging.debug("debug") 
logging.info("info") 
logging.warning("warning") 
logging.error("error")
logging.critical("critical")

This will use files in the following format: my-2018-06-12.log. Please don't use British or American date formats here. Those are really confusing and break the sorting order.

Alternatively you could change it by adding filemode = 'w' that will change the mode of operation from "append" to "write" and will overwrite the file every time we run our application. This usually is not very useful as this mean we lose all the old logs.

Formatting the log message, including date

We can replace the default log message format using the format parameter and some keywords representing the various fields supplied by the logging module:

import logging

logging.basicConfig( format = '%(asctime)s  %(levelname)-10s %(processName)s  %(name)s %(message)s')

logging.debug("debug") 
logging.info("info") 
logging.warning("warning") 
logging.error("error")
logging.critical("critical")

The output will look like this:

2018-06-12 08:46:22,942  WARNING    MainProcess  root warning
2018-06-12 08:46:22,942  ERROR      MainProcess  root error
2018-06-12 08:46:22,942  CRITICAL   MainProcess  root critical

Change the date format in the log

Though I don't really recommend to change the date format, you can do that by passing the datefmt parameter with a string using the strftime tags.

import logging

logging.basicConfig( format = '%(asctime)s  %(levelname)-10s %(processName)s  %(name)s %(message)s', datefmt =  "%Y-%m-%d-%H-%M-%S")

logging.debug("debug") 
logging.info("info") 
logging.warning("warning") 
logging.error("error")
logging.critical("critical")

The output will then change to:

2018-06-12-08-51-30  WARNING    MainProcess  root warning
2018-06-12-08-51-30  ERROR      MainProcess  root error
2018-06-12-08-51-30  CRITICAL   MainProcess  root critical

Disabling Logging

After you’ve debugged your program, you probably don’t want all these log messages cluttering the screen. The logging.disable() function disables these so that you don’t have to go into your program and remove all the logging calls by hand. You simply pass logging.disable() a logging level, and it
will suppress all log messages at that level or lower. So if you want to disable logging entirely, just add logging.disable(logging.CRITICAL) to your program. For example:

>>> import logging
>>> logging.basicConfig(level=logging.INFO, format=' %(asctime)s -
%(levelname)s - %(message)s')
>>> logging.critical('Critical error! Critical error!')
2018-12-10 11:10:48,054 - CRITICAL - Critical error! Critical error!
>>> logging.disable(logging.CRITICAL)
>>> logging.critical('Critical error! Critical error!')
>>> logging.error('Error! Error!')

Since logging.disable() will disable all messages after it, you will probably want to add it near the import logging line of code in your program. This way, you can easily find it to comment out or uncomment that call to enable or disable logging messages as needed.

Here we end today's discussion, so till we meet next keep practicing and learning Python as Python is easy to learn!

























Share:

0 comments:

Post a Comment