Pandas - 16 (Reading and Writing Data on Microsoft Excel Files) ~ Python is easy to learn

pandas provides specific functions for Excel spreadsheet format. We have seen that the I/O API provides two functions to this purpose:

• to_excel()
• read_excel()

The read_excel() function can read Excel 2003 (.xls) files and Excel 2007 (.xlsx) files. This is possible due to the integration of the internal module xlrd. You may have to install xlrd module before using these functions (pip install xlrd)

Let's create an excel document in our working directory and the content in sheet 1 and sheet 2 should be:

sheet 1

white	red	green	black
a	12	23	17	18
b	22	16	19	18
c	14	23	22	21

sheet 2

yellow	purple	blue	orange
A	11	16	44	22
B	20	22	23	44
C	30	31	37	32

To read the data contained in the XLS file and convert it into a dataframe, we only have to use the read_excel() function. See the following program :

import pandas as pd
import numpy as np

print('default\n')
print(pd.read_excel('mydata10.xlsx'))
print('\nUsing index for Sheet1\n')
print(pd.read_excel('mydata10.xlsx',0))
print('\nUsing name for Sheet1\n')
print(pd.read_excel('mydata10.xlsx','Sheet1'))
print('\nUsing name for Sheet2\n')
print(pd.read_excel('mydata10.xlsx','Sheet2'))
print('\nUsing index for Sheet2\n')
print(pd.read_excel('mydata10.xlsx',1))

By default, the returned dataframe is composed of the data tabulated in the first spreadsheets. To load the data in the second spreadsheet, we must specify the name of the sheet Sheet2 or the number of the sheet (index) [1] just as the second argument. The output of the program is shown below:

default

Unnamed: 0 white red green black
0                  a     12    23     17     18
1            b     22   16     19     18
2                  c     14   23     22     21

Using index for Sheet1

Unnamed: 0 white red green black
0         a     12   23     17     18
1         b     22   16     19     18
2                       c     14   23     22     21

Using name for Sheet1

Unnamed: 0 white red green black
0                  a     12   23     17     18
1                  b     22   16     19     18
2            c     14   23     22     21

Using name for Sheet2

Unnamed: 0 yellow purple blue orange
0           A      11      16       44      22
1                 B      20      22       23      44
2                 C      30      31       37      32

Using index for Sheet2

Unnamed: 0 yellow purple blue orange
0                 A      11      16        44      22
1                 B      20      22        23      44
2                 C      30      31        37      32
------------------
(program exited with code: 0)

Press any key to continue . . .

Now let's convert a dataframe into a spreadsheet on Excel. See the following program :

import pandas as pd
import numpy as np

frame = pd.DataFrame(np.random.random((4,4)),
                     index = ['exp1','exp2','exp3','exp4'],
                     columns = ['Jan2015','Fab2015','Mar2015','Apr2005'])

frame.to_excel('mydata11.xlsx')

print(pd.read_excel('mydata11.xlsx'))

In our program we first create a dataframe. Then using the to_excel() we convert our created dataframe into a spreadsheet whose name is passed as an argument to the to_excel('mydata11.xlsx') . Finally we read the newly created excel file using read_excel().

The output of the program is shown below:

Unnamed: 0       Jan2015    Fab2015   Mar2015   Apr2005
0                 exp1 0.802199 0.669568 0.667242 0.938849
1         exp2 0.326955 0.125371 0.909584 0.350017
2                 exp3 0.667992 0.707551 0.187137 0.244859
3                 exp4 0.139591 0.473944 0.945452 0.814224
------------------
(program exited with code: 0)

Press any key to continue . . .

Here I am ending today’s post. Until we meet again keep practicing and learning Python, as Python is easy to learn!

Python is easy to learn

Tuesday, April 16, 2019

Pandas - 16 (Reading and Writing Data on Microsoft Excel Files)

0 comments:

Post a Comment