Tuesday, December 17, 2019

Data Manipulation

By this point, you know how to draw summaries from your data. Next, you should learn how to slice, select, and extract data from your DataFrame. I mentioned earlier that DataFrames and Series share many similarities, especially in the methods you can call on them. Their attributes differ, however, so make sure you are using the right attribute for the object you have, or you will run into attribute errors.
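To make the point about attributes concrete, here is a minimal sketch. The DataFrame below is a made-up stand-in for squad_df (the team names and values are assumptions for illustration only), and columns and name are just two examples of attributes that exist on one object but not the other.

```python
import pandas as pd

# Hypothetical stand-in for squad_df; teams and values are invented.
squad_df = pd.DataFrame(
    {"position": [1, 2, 3], "earnings": [100, 90, 80]},
    index=["Arsenal", "Everton", "Liverpool"],
)
position = pd.Series([1, 2, 3], name="position")

# A DataFrame has a columns attribute; a Series has a name attribute.
print(list(squad_df.columns))
print(position.name)

# Asking a Series for a DataFrame-only attribute raises AttributeError.
try:
    position.columns
    raised = False
except AttributeError as err:
    raised = True
    print("AttributeError:", err)
```

If you are unsure which object you are holding, calling type() on it first is a quick way to check.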


To extract a column, you use square brackets as shown below:

position_col = squad_df['position']
type(position_col)


You will get the output below:

pandas.core.series.Series

The result is a Series. If you need the column returned as a DataFrame instead, pass a list of column names inside the square brackets, as shown below:

position_col = squad_df[['position']]
type(position_col)


You will get the output below:

pandas.core.frame.DataFrame
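The two forms can be compared side by side. This is a small sketch using a made-up stand-in for squad_df (its teams and values are assumptions, not the data used in this post); it prints the type and shape of each result so the difference is visible.

```python
import pandas as pd

# Hypothetical stand-in for squad_df; teams and values are invented.
squad_df = pd.DataFrame(
    {"position": [1, 2, 3], "earnings": [100, 90, 80]},
    index=["Arsenal", "Everton", "Liverpool"],
)

as_series = squad_df["position"]    # a single name returns a Series
as_frame = squad_df[["position"]]   # a list of names returns a DataFrame

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

The Series is one-dimensional, while the DataFrame keeps two dimensions even though it holds a single column.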

What you passed inside the square brackets is a simple list of column names. To select more columns, add another name to this list as follows:

subset = squad_df[['position', 'earnings']]
subset.head()


The output shows the first five rows of the subset, which contains only the position and earnings columns.




Next, we will look at how to retrieve data from your DataFrame by row. You can do this in either of the following ways:

● By label, using .loc
● By integer position, using .iloc

Since the DataFrame is still indexed by team, we pass the name of the team to .loc as shown below:

eve = squad_df.loc["Everton"]
eve


Another option is to pass the integer position of the Everton row to .iloc as shown below:

eve = squad_df.iloc[1]
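As a quick check that both approaches reach the same row, the sketch below uses a made-up stand-in for squad_df in which Everton happens to be the second row (the teams and values are assumptions for illustration). It also uses Index.get_loc to look up the integer position of a label instead of counting rows by hand.

```python
import pandas as pd

# Hypothetical stand-in for squad_df; Everton is deliberately placed second.
squad_df = pd.DataFrame(
    {"position": [1, 2, 3], "earnings": [100, 90, 80]},
    index=["Arsenal", "Everton", "Liverpool"],
)

by_label = squad_df.loc["Everton"]     # select the row by its index label
by_position = squad_df.iloc[1]         # select the row by integer position

# Look up the integer position that belongs to a label.
everton_pos = squad_df.index.get_loc("Everton")

print(everton_pos)
print(by_label.equals(by_position))
```

Each result is a Series whose index is made up of the column names of the DataFrame.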

Slicing with .iloc works the same way as slicing lists in Python: the row at the end index is not included in the result.
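The slicing behaviour can be sketched as follows, again with a made-up stand-in for squad_df (teams and values are assumptions). The example also shows a label slice with .loc for contrast, because a .loc slice includes its end label.

```python
import pandas as pd

# Hypothetical stand-in for squad_df; teams and values are invented.
squad_df = pd.DataFrame(
    {"position": [1, 2, 3], "earnings": [100, 90, 80]},
    index=["Arsenal", "Everton", "Liverpool"],
)
teams = ["Arsenal", "Everton", "Liverpool"]

list_slice = teams[0:2]                          # plain Python list slice
iloc_slice = squad_df.iloc[0:2]                  # rows 0 and 1; row 2 is left out
loc_slice = squad_df.loc["Arsenal":"Everton"]    # both end labels are included

print(list_slice)
print(list(iloc_slice.index))
print(list(loc_slice.index))
```

All three results contain Arsenal and Everton, but for different reasons: the positional slices stop before index 2, while the label slice stops at Everton and keeps it.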