CSV,
“comma-separated values”, files are simplified spreadsheets stored as plaintext
files. Python’s csv module contains parameters and method to parse CSV files.
Each line in a CSV file represents a row in the spreadsheet, and commas separate
the cells in the row. CSV files are widely supported by many types of programs,
can be viewed in text editors (including IDLE’s file editor), and are a
straightforward way to represent spreadsheet data.
CSV
files are just text files of comma-separated values. Not every comma in a CSV
file represents the boundary between two cells. CSV files also have their own
set of escape characters to llow commas and other characters to be included as part of the values.
Let’s start using the csv module for reading and
writing
CSV files.
Reading data from a CSV file
To
read data from a CSV file with the csv
module we first create a Reader object which allows us to iterate over lines in the CSV
file. Let’s create a CSV file fruits.csv
as shown below:
20/12/2018
13:34,Apples,73
21/12/2018
3:41,Guava,85
22/12/2018
12:46,Pears,14
23/12/2018
8:59,Oranges,52
24/12/2018
2:07,Grapes,152
25/12/2018
18:10,Bananas,23
26/12/2018
2:40,Pineapple,98
Now
we’ll read data from this file with the help of the following program:
import
csv
myFile
= open('fruits.csv')
myReader
= csv.reader(myFile)
myData
= list(myReader)
print(myData)
The
output of the above program is shown below:
[['20/12/2018 13:34', 'Apples', '73'],
['21/12/2018 3:41', 'Guava', '85'], ['22/12/2018 12:46', 'Pears', '14'],
['23/12/2018 8:59', 'Oranges', '52'], ['24/12/2018 2:07', 'Grapes', '152'],
['25/12/2018 18:10', 'Bananas', '23'], ['26/12/2018 2:40', 'Pineapple', '98']]
------------------
(program exited with code: 0)
Press any key to continue . . .
The
CSV module comes with Python, so no installation is required and we can import
it in to our programs. To read a CSV file with the csv module, first open it using the open() function and store it in a File
object (myFile). Next we pass myFile
file object to the csv.reader() function which returns a Reader
object. In order to access the
values in the Reader object we convert it to a plain Python list by passing it
to list() and store the list in a
variable myData. Finally we print the CSV
file as a list of lists as shown in the output file.
We
can access the value at a particular row and column with the expression myData[row][col], here row
is the index of one of the lists in myData, and col
is the index of the item you want
from that list. Let’s modify our program to print the value at a particular row
and column. See the code below:
import
csv
myFile
= open('fruits.csv')
myReader
= csv.reader(myFile)
myData
= list(myReader)
print(myData[0][0])
print(myData[0][1])
print(myData[0][2])
print(myData[1][1])
print(myData[1][2])
print(myData[2][1])
The
output of the above program is shown below:
20/12/2018 13:34
Apples
73
Guava
85
Pears
------------------
(program exited with code: 0)
Press any key to continue . .
.
myData [0][0] goes into the first list and gives us the first string, myData [0][2] goes into the first list and gives us the third string,
and so on. This approach is suitable for CSV files with small amount of data,
so for large CSV files, we can use the Reader
object in a for loop and print the data in CSV file. See the code below:
import
csv
myFile
= open('fruits.csv')
myReader
= csv.reader(myFile)
for
row in myReader:
print('Row #' +
str(myReader.line_num) + ' ' + str(row))
The
output of the above program is shown below:
Row
#1 ['20/12/2018 13:34', 'Apples', '73']
Row
#2 ['21/12/2018 3:41', 'Guava', '85']
Row
#3 ['22/12/2018 12:46', 'Pears', '14']
Row
#4 ['23/12/2018 8:59', 'Oranges', '52']
Row
#5 ['24/12/2018 2:07', 'Grapes', '152']
Row
#6 ['25/12/2018 18:10', 'Bananas', '23']
Row
#7 ['26/12/2018 2:40', 'Pineapple', '98']
------------------
(program
exited with code: 0)
Press
any key to continue . . .
Each
row in a Reader object is a list of values, with each value representing a
cell. Thus we loop through the rows in the Reader
object and using the print() function we print the number of the current row and the
contents
of the row. To get the row number, use the Reader
object’s line_num variable, which contains the number of the current line.
Write data to a CSV file
To
write data to a CSV file we require a Writer
object which can be created using the
csv.writer() function. See the following program to create a CSV file
and write data to this file:
import
csv
newFile
= open('breakfast.csv', 'w', newline='')
myWriter
= csv.writer(newFile)
myWriter.writerow(['bread',
'eggs', 'bacon', 'ham'])
myWriter.writerow(['scambled
eggs', 'toast', 'juice', 'sandwich'])
newFile.close()
First
we create an empty file and open it in write mode using the open() function. This
will
create
the object we can then pass to csv.writer()
to create a Writer object. On Windows, we also need to pass a blank string
for the open() function’s newline
keyword argument. The writerow() method for Writer
objects takes a list argument. Each
value in the list is placed in its own cell in the output CSV file. The
return
value of writerow() is the number of characters written to the file for that
row (including newline characters).
When
we run this program a new CSV file, 'breakfast.csv', is created in your program directory which
contains the data we wrote through the program.
The
next program can be used to read the content of 'breakfast.csv' file:
import
csv
myFile
= open('breakfast.csv')
myReader
= csv.reader(myFile)
for
row in myReader:
print('Row #' +
str(myReader.line_num) + ' ' + str(row))
The
output of the above program is shown below:
Row #1 ['bread', 'eggs', 'bacon',
'ham']
Row #2 ['scambled eggs', 'toast',
'juice', 'sandwich']
------------------
(program exited with code: 0)
Press any key to continue . . .
In
case you want to separate cells with a tab character instead of a comma and you
want the rows to be double-spaced , the following program can be used:
import
csv
newFile
= open('breakfast1.tsv', 'w', newline='')
myWriter
= csv.writer(newFile,delimiter='\t', lineterminator='\n\n')
myWriter.writerow(['bread',
'eggs', 'bacon', 'ham'])
myWriter.writerow(['scambled
eggs', 'toast', 'juice', 'sandwich'])
myWriter.writerow(['fried
eggs', 'bread', 'juice', 'salad'])
newFile.close()
This
program changes the delimiter and line terminator characters in our file. The delimiter is
the character that appears between cells on a row. By default, the delimiter
for a CSV file is a comma. The line
terminator is the character that comes at the
end of a row. By default, the line terminator is a newline. You can change
characters to different values by using the delimiter
and lineterminator
keyword arguments with csv.writer(). We’re using the file extension .tsv, for
tab-separated values.
Passing
delimeter='\t' and lineterminator='\n\n' changes
the character between cells to a tab and the character between rows to two
newlines. We then call writerow() three times to give us three rows. Open the 'breakfast1.tsv'
file and notice that our cells are separated by tabs.
0 comments:
Post a Comment