Sunday, August 16, 2020

The Series


First, let’s re-create our example dataframe scientists. If we use the loc attribute to subset a single row of our scientists dataframe, we will get a Series object back.

# create our example dataframe
# with a row index label
import pandas as pd

scientists = pd.DataFrame(
    data={'Occupation': ['Chemist', 'Statistician'],
          'Born': ['1920-07-25', '1876-06-13'],
          'Died': ['1958-04-16', '1937-10-16'],
          'Age': [37, 61]},
    index=['Rosaline Franklin', 'William Gosset'],
    columns=['Occupation', 'Born', 'Died', 'Age'])
print(scientists)

 

Now we select a scientist by the row index label.

# select by row index label
first_row = scientists.loc['William Gosset']
print(type(first_row))

Output - <class 'pandas.core.series.Series'>

print(first_row)

 

When a series is printed (i.e., the string representation), the index is printed as the first “column,” and the values are printed as the second “column.” There are many attributes and methods associated with a Series object. Two examples of attributes are index and values.

print(first_row.index)
Output - Index(['Occupation', 'Born', 'Died', 'Age'], dtype='object')

print(first_row.values)
Output - ['Statistician' '1876-06-13' '1937-10-16' 61]
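
As a small sketch of how index and values line up, the snippet below builds the same row directly as a Series (the variable names row, labels, and pairs are only for illustration) and pairs each label with its value by position.

```python
import pandas as pd

# build the row directly as a Series, with labels as the index
row = pd.Series(['Statistician', '1876-06-13', '1937-10-16', 61],
                index=['Occupation', 'Born', 'Died', 'Age'],
                name='William Gosset')

# index holds the labels, values holds the data, in the same order
labels = list(row.index)
pairs = dict(zip(row.index, row.values))

print(labels)
print(pairs)
print(row.name)
```

The name of the Series is the row label that was selected, which is why it appears at the bottom when a row pulled out with loc is printed.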
 

An example of a Series method is keys, which is an alias for the index attribute.

print(first_row.keys())
Output - Index(['Occupation', 'Born', 'Died', 'Age'], dtype='object') 

By now, you might have questions about the syntax for index, values, and keys. Attributes can be thought of as properties of an object (in this example, our object is a Series). Methods can be thought of as some calculation or operation that is performed, so they are called with round parentheses, (). The subsetting accessors loc and iloc are attributes (the older ix accessor was deprecated and then removed in pandas 1.0). This is why their syntax does not rely on round parentheses but rather on square brackets, [ ], for subsetting. Since keys is a method, if we wanted to get the first key (which is also the first index label), we would put the square brackets after the method call.

# get the first index using an attribute
print(first_row.index[0])
Output - Occupation

# get the first index using a method
print(first_row.keys()[0])
Output - Occupation
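
To make the difference between attributes and methods concrete, here is a minimal sketch on a small, made-up Series (row and the other variable names are just for illustration). It uses Python's built-in callable to check which names need round parentheses.

```python
import pandas as pd

row = pd.Series(['Statistician', 61], index=['Occupation', 'Age'])

# index is an attribute: access it without parentheses, then subset
first_label_attr = row.index[0]

# keys is a method: call it first, then subset the result
first_label_method = row.keys()[0]

# loc and iloc are attributes that are subset with square brackets
by_label = row.loc['Age']
by_position = row.iloc[0]

print(first_label_attr, first_label_method)
print(by_label, by_position)

# a method can be called; the index attribute is plain data and cannot
print(callable(row.keys), callable(row.index))
```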

The Pandas data structure known as Series is very similar to the numpy.ndarray. As a result, many methods and functions that operate on an ndarray will also operate on a Series. A Series may sometimes be referred to as a “vector.”
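
As a rough illustration of that similarity, the sketch below passes a Series of ages (made up from the Age column above; the names ages, roots, and arr are only for illustration) to a couple of NumPy functions. Element-wise functions such as np.sqrt return a Series that keeps the original index.

```python
import numpy as np
import pandas as pd

ages = pd.Series([37, 61], index=['Rosaline Franklin', 'William Gosset'])

# a NumPy reduction works on a Series just as it does on an ndarray
mean_age = np.mean(ages)

# an element-wise NumPy function returns a Series with the same index
roots = np.sqrt(ages)

# the underlying data can be pulled out as a plain ndarray
arr = ages.to_numpy()

print(mean_age)
print(roots)
print(type(arr))
```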

We will continue our discussion of Series in the next post.
