Sunday, March 27, 2022

Python data containers

Python supports several data types, both numeric types and collections. A numeric variable, such as an integer or a floating-point number, is created by assigning a value to a name, and the assigned value determines the variable's type. A specific constructor (for example, int() or float()) can also be used to create a value of a particular type. Container data types can be defined in the same two ways: by writing the values in the appropriate literal format or by calling the constructor for that collection type. We will study five different container data types: strings, lists, tuples, dictionaries, and sets.
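As a quick illustration of both approaches, the sketch below creates values by plain assignment and through constructors. The variable names are made up for this example.

```python
# The type is inferred from the assigned value.
count = 42      # int
ratio = 3.14    # float

# The same types created through their constructors.
count_from_text = int("42")   # parses the text "42" into an int
ratio_from_int = float(3)     # converts the int 3 into the float 3.0

# Containers: literal syntax versus the constructor.
colors_literal = ["red", "green"]
colors_built = list(("red", "green"))  # builds a list from a tuple

print(type(count).__name__, type(ratio).__name__)
print(count_from_text, ratio_from_int)
print(colors_literal == colors_built)
```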

Strings

Strings are not, strictly speaking, a container data type in the way lists or dictionaries are, since they cannot hold arbitrary objects. Still, the string data type is worth discussing here because it is used so widely in Python programming and because it is implemented as an immutable sequence of Unicode code points. Being a sequence (a kind of collection) makes it a natural candidate for this post.
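To make the "sequence of Unicode code points" idea concrete, here is a small sketch using a made-up sample word. It shows that a string supports the usual sequence operations and that each element corresponds to one code point.

```python
word = "h\u00e9llo"  # sample text; \u00e9 is the single code point for "é"

length = len(word)                      # counts code points, not bytes
first, last = word[0], word[-1]         # indexing works as on any sequence
code_points = [ord(ch) for ch in word]  # iteration yields one character at a time
has_accent = "\u00e9" in word           # membership test

print(length, first, last)
print(code_points)
print(has_accent)
```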

String objects are immutable in Python. Because a string can never change after it is created, concurrent code in which multiple functions or threads read the same string object will always see the same value, with no locking required. Mutable objects offer no such guarantee unless access to them is synchronized. Immutability also makes strings hashable, which is why they are popular as keys for the dictionary data type and as elements of the set data type. The drawback of immutability is that even a small change to an existing string requires creating a new string object.
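The following sketch, with illustrative names only, shows what immutability looks like in practice: in-place modification fails, a "change" produces a new object, and strings work as dictionary keys and set elements.

```python
greeting = "hello"

# Trying to modify a string in place raises TypeError.
try:
    greeting[0] = "H"
except TypeError as exc:
    error_message = str(exc)

# A "change" always builds a new string; the original is untouched.
capitalized = "H" + greeting[1:]

# Strings are hashable, so they can be dictionary keys and set elements.
word_lengths = {"hello": 5, "hi": 2}
unique_words = {"hello", "hi", "hello"}  # the duplicate is dropped

print(error_message)
print(greeting, capitalized)
print(word_lengths["hello"], len(unique_words))
```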

String literals can be enclosed in matching single quotes (for example, 'blah'), double quotes (for example, "blah blah"), or triple single or double quotes (for example, """none""" or '''none'''). It is also worth mentioning that string objects are handled differently in Python 3 versus Python 2. In Python 3, a string object can hold only text, as a sequence of Unicode code points, whereas in Python 2 the same type could hold text as well as byte data. In Python 3, byte data is handled by the separate bytes data type.
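A short sketch of the quoting styles and of the Python 3 split between text and bytes follows. The sample values are arbitrary.

```python
# All three quoting styles produce the same kind of object.
single = 'blah'
double = "blah"
triple = """blah"""

# Triple-quoted literals may span several lines.
multi_line = '''first line
second line'''

# In Python 3, text (str) and binary data (bytes) are distinct types.
text = "abc"
data = b"abc"

print(single == double == triple)
print(multi_line.splitlines())
print(type(text).__name__, type(data).__name__)
print(text[0], data[0])  # indexing str gives a character, bytes gives an int
```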

Separating text from bytes in Python 3 makes string handling cleaner, but at the cost of data portability: the Unicode text in a string cannot be saved to disk or sent to a remote location on the network without first being converted into a binary format. This conversion means encoding the string data into a byte sequence, which can be achieved in one of the following ways:

• Using the str.encode(encoding, errors) method: This method is available on every string object and takes two optional arguments: the codec to use (UTF-8 is the default) and the error-handling scheme (strict by default, which raises an exception for characters that cannot be encoded).

• Converting to the bytes data type: A string object can be converted to the bytes data type by passing the string instance to the bytes constructor along with the encoding scheme and, optionally, the error-handling scheme. Unlike str.encode, the constructor requires the encoding to be given explicitly when its argument is a string.
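Both approaches can be sketched as follows. The sample text and the choice of codecs are only for illustration.

```python
text = "caf\u00e9"  # sample text ending in "é"

# str.encode: UTF-8 is used when no codec is given.
encoded_default = text.encode()
encoded_latin1 = text.encode("latin-1")

# The errors argument controls what happens to unencodable characters.
encoded_ascii = text.encode("ascii", errors="replace")  # "é" becomes "?"
try:
    text.encode("ascii")  # the default errors="strict" raises instead
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# bytes constructor: the encoding must be supplied for a str argument.
via_constructor = bytes(text, "utf-8")

# Decoding with the same codec restores the original text.
round_trip = encoded_default.decode("utf-8")

print(encoded_default, encoded_latin1, encoded_ascii)
print(via_constructor == encoded_default, round_trip == text, strict_failed)
```

The byte sequences differ by codec: UTF-8 needs two bytes for "é", while Latin-1 needs one, so the receiving side has to decode with the same codec that was used for encoding.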

The details of the methods and attributes available on any string object can be found in the official Python documentation for the Python release you are using.

In the next post, we will discuss lists.
