Monday, August 24, 2020

Excel for data storage and Other Data Output Types

Why Pandas is a Better Data Analysis Tool Than Excel

Excel has its own scripting language if you absolutely have to work in it. This will allow you to work with data in a more predictable and reproducible manner.

a.Series

The Series data structure does not have an explicit to_excel method. If you have a Series that needs to be exported to an Excel file, one option is to convert the Series into a one-column DataFrame. 

# convert the Series into a DataFrame
# before saving it to an Excel file
names_df = names.to_frame()
import xlwt # this needs to be installed
# xls file
names_df.to_excel('../output/scientists_names_series_df.xls')
import openpyxl # this needs to be installed
# newer xlsx file
names_df.to_excel('../output/scientists_names_series_df.xlsx')

b.DataFrame

From the preceding example, you can see how to export a DataFrame to an Excel file. The documentation  http://pandas.pydata.org/pandasdocs/stable/generated/ pandas.DataFrame.to_excel.html shows several ways to further fine-tune the output. For example, you can output data to a specific “sheet” using the sheet_name parameter.

# saving a DataFrame into Excel format
scientists.to_excel('../output/scientists_df.xlsx',
sheet_name='scientists',
index=False)

Feather Format to Interface With R

The format called “feather” is used to save a binary object that can also be loaded into the R language. The main benefit of this approach is that it is faster than writing and reading a CSV file between the languages. The general rule of thumb for using this data format is to use it only as an intermediate data format, and to not use the feather format for long-term storage. That is, use it in your code only to pass in data into R; do not use it to save a final version of your data.

The feather formatter is installed via conda install -c conda-forge feather-format or pip install feather-format. You can use the to_feather method on a dataframe to save the feather object. Not every dataframe can be converted into a feather object. For example, our current data set contains a column of date values, which at the time of this writing is not supported by feather.

Other Data Output Types

There are many ways Pandas can export and import data. Indeed, to_pickle, to_csv, and to_excel, and
to_feather are only some of the data formats that can make their way into Pandas DataFrames. Table below, lists some of these other output formats.

                                            DataFrame Export Methods

 

 


Share:

0 comments:

Post a Comment