Open and Save CSV quickly with Pandas – Best Practice

How do you easily open and save a CSV with the Pandas library? Here you will find some of the most used lines of code in the data field.

For this tutorial, we will use the happiness.csv file, which is available on GitHub.

But first, the essential prerequisite is to import Pandas:

import pandas as pd

And we can start!

Open a csv

Classic method

The classic method is simply to call the read_csv() function with the path of the csv file:

df = pd.read_csv('path/happiness.csv')

Column method

If we want to load only some of the columns of the csv, we can tell pandas directly in the read_csv() function, with the usecols parameter as below:

df = pd.read_csv('path/happiness.csv', usecols=['Gender','Mean','N='])
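To see the effect of usecols without downloading anything, here is a minimal sketch on a small in-memory stand-in for happiness.csv. Only the column names 'Gender', 'Mean' and 'N=' come from the line above; the 'Country' column and every value are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for happiness.csv: the 'Country' column and all
# values are made up; only 'Gender', 'Mean' and 'N=' appear in the tutorial.
csv_text = """Country,Gender,Mean,N=
A,Male,7.1,100
A,Female,7.3,120
B,Male,6.4,90
"""

# read_csv accepts a path or any file-like object, so StringIO works here.
df_subset = pd.read_csv(io.StringIO(csv_text), usecols=["Gender", "Mean", "N="])

print(df_subset.columns.tolist())  # ['Gender', 'Mean', 'N=']
print(df_subset.shape)             # (3, 3)
```

The 'Country' column is never loaded, which saves memory on wide files.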

Separator method

And finally, a vital option when you have csv files saved with a separator other than the comma, such as ‘;’ or a tab.

In this example our csv uses the comma ‘,’ as separator, which is also the default value of sep:

df = pd.read_csv('path/happiness.csv', sep=',')
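The sep parameter matters most when the file does not use commas. The sketch below assumes a semicolon-separated file (the data is invented) and compares reading it with the default separator and with sep=';'.

```python
import io

import pandas as pd

# Invented semicolon-separated data, standing in for a csv exported
# with ';' as separator.
semicolon_text = "Gender;Mean;N=\nMale;7.1;100\nFemale;7.3;120\n"

# With the default sep=',' each line is read as one single column.
df_wrong = pd.read_csv(io.StringIO(semicolon_text))

# Passing sep=';' splits the three columns correctly.
df_right = pd.read_csv(io.StringIO(semicolon_text), sep=";")

print(df_wrong.shape)   # (2, 1)
print(df_right.shape)   # (2, 3)
```

A DataFrame with a single column whose name contains the separator is a typical sign that sep needs to be set.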

Now we know how to open a csv, let’s move on to how to create one!

Save a csv

Classic method

To save a csv from a DataFrame, you simply have to call its to_csv() method with the path and the name of the desired file:

df.to_csv('path/new_happiness.csv')

Recommended method

The method we recommend at Inside Machine Learning is to use the index parameter and to give it the value False:

df.to_csv('path/new_happiness.csv', index=False)

If you don’t do this, the default value is True. The csv created will then contain the columns and values of each row, plus an extra unnamed first column holding the DataFrame index. When you reload that file, read_csv() builds a fresh index and keeps the saved one as a regular column, usually named Unnamed: 0.

In short, index=False avoids ending up with two columns indicating the index of each row when the csv is read back!
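The difference is easy to check with a round trip. This is a minimal sketch on an invented two-row DataFrame; it writes the csv to a string instead of a file, which to_csv() does when no path is given.

```python
import io

import pandas as pd

# Small invented DataFrame used only for the demonstration.
df = pd.DataFrame({"Gender": ["Male", "Female"], "Mean": [7.1, 7.3]})

# Without a path, to_csv returns the csv content as a string.
with_index = df.to_csv()                # default: index=True
without_index = df.to_csv(index=False)

reloaded_with = pd.read_csv(io.StringIO(with_index))
reloaded_without = pd.read_csv(io.StringIO(without_index))

print(reloaded_with.columns.tolist())     # ['Unnamed: 0', 'Gender', 'Mean']
print(reloaded_without.columns.tolist())  # ['Gender', 'Mean']
```

If you already have a file saved with its index, read_csv('path/file.csv', index_col=0) is one way to load that first column back as the index instead of as data.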

Compressed method

To finish, another method exists for large DataFrames: the compressed method.

We just have to add the compression parameter to the call and give our file a .zip extension instead of .csv:

df.to_csv('path/new_happiness.zip', index=False, compression='zip')

Note that several values are accepted for compression: ‘gzip’, ‘bz2’, ‘zip’ and ‘xz’ each name a format explicitly, while ‘infer’ (the default) picks the format from the file extension.
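A compressed file can be read back with read_csv() directly. The sketch below assumes an invented two-row DataFrame and a temporary directory in place of 'path/'; it relies on read_csv() also defaulting to compression='infer', so the .zip extension is enough for pandas to decompress the file.

```python
import os
import tempfile

import pandas as pd

# Small invented DataFrame used only for the demonstration.
df = pd.DataFrame({"Gender": ["Male", "Female"], "Mean": [7.1, 7.3]})

# A temporary directory stands in for 'path/' so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    zip_path = os.path.join(tmp_dir, "new_happiness.zip")
    df.to_csv(zip_path, index=False, compression="zip")

    # No compression argument needed: it is inferred from the .zip extension.
    df_back = pd.read_csv(zip_path)

print(df_back.shape)  # (2, 2)
```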

That’s all for this tutorial. We hope you will find it useful 😉

Are you interested in happiness.csv? We use it with a technique to boost Machine Learning in this article… a must-read!

Tom Keldenich

Data Engineer & passionate about Artificial Intelligence !

Founder of the website Inside Machine Learning
