How do you easily open and save a CSV with the Pandas library? Here you will find some of the most frequently used lines of code in the data field.
For this tutorial, we will use the happiness.csv file, which is available at this GitHub link.
The one essential prerequisite is to import Pandas:
import pandas as pd
And we can start!
Open a csv
Classic method
The classic method is simply to call the read_csv() function with the path to the csv file:
df = pd.read_csv('path/happiness.csv')
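A quick way to check that a file loaded as expected is to look at its shape and first rows. The sketch below is only an illustration: it reads a small csv from memory so that it runs without any file on disk, and the values in it are made-up placeholders, not the real contents of happiness.csv.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a real csv file on disk.
csv_text = "Gender,Mean,N=\nMale,7.1,120\nFemale,7.4,130\n"

# read_csv() accepts a path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.head())  # first rows of the DataFrame
```

With a real file, you would pass its path instead of the StringIO object.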
Column method
If we want to load only some of the columns of the csv, we can tell pandas directly in the read_csv() function, with the usecols parameter as below:
df = pd.read_csv('path/happiness.csv', usecols=['Gender','Mean','N='])
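To see the effect of usecols without needing the file, here is a minimal sketch on an in-memory csv. The data and the extra Country column are hypothetical and only serve the illustration.

```python
import io
import pandas as pd

# Hypothetical four-column csv; only two columns are of interest.
csv_text = "Gender,Mean,N=,Country\nMale,7.1,120,FR\nFemale,7.4,130,FR\n"

# Only the columns listed in usecols are loaded into the DataFrame.
subset = pd.read_csv(io.StringIO(csv_text), usecols=['Gender', 'Mean'])

print(list(subset.columns))
print(subset.shape)
```

Note that the columns come back in the order they have in the file, not in the order given in the usecols list.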
Separator method
And finally, a vital method when your csv files are saved with different separators, such as ‘.’ or ‘;’, among many others.
In this example, our csv uses the comma ‘,’ as its separator (which is also the default value of sep):
df = pd.read_csv('path/happiness.csv', sep=',')
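The sep parameter matters most when the separator is not a comma. The sketch below uses a hypothetical semicolon-separated csv held in memory to compare the default behaviour with an explicit sep=';'.

```python
import io
import pandas as pd

# Hypothetical csv that uses ';' instead of ',' as its separator.
csv_text = "Gender;Mean;N=\nMale;7.1;120\nFemale;7.4;130\n"

# With the default separator (','), each line ends up in a single column.
wrong = pd.read_csv(io.StringIO(csv_text))

# With the matching separator, the three columns are parsed correctly.
right = pd.read_csv(io.StringIO(csv_text), sep=';')

print(wrong.shape)
print(right.shape)
```

A DataFrame with a single column after loading is often a sign that the separator does not match the file.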
Now that we know how to open a csv, let’s move on to how to save one!
Save a csv
Classic method
To save a DataFrame as a csv, simply call the to_csv() method with the path and name of the desired file:
df.to_csv('path/new_happiness.csv')
Recommended method
The method we recommend at Inside Machine Learning is to use the index parameter and set it to False:
df.to_csv('path/new_happiness.csv', index=False)
If you don’t do this, the default value is True. The csv will then contain the columns and values of each row, plus an extra first column holding the DataFrame’s index. When you read that file back with read_csv(), pandas creates a fresh index and the saved one shows up as an additional column.
In short, index=False avoids ending up with two columns indicating the index of each row when the csv is loaded again!
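The difference is easy to see on a round trip. The following sketch, which uses a small hypothetical DataFrame and in-memory buffers instead of files, saves the same data with and without the index and reads both versions back.

```python
import io
import pandas as pd

# Hypothetical DataFrame used only for this illustration.
df = pd.DataFrame({'Gender': ['Male', 'Female'], 'Mean': [7.1, 7.4]})

# Save with the default index=True, then read the result back.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# Save with index=False, then read the result back.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_without_index = pd.read_csv(without_index)

print(list(reloaded_with_index.columns))     # includes an 'Unnamed: 0' column
print(list(reloaded_without_index.columns))  # only the original columns
```

The 'Unnamed: 0' column is the old index, which was written without a header name.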
Compressed method
To finish, another method exists for large DataFrames: the compressed method.
We just have to add the compression parameter to the call and give our file a .zip extension instead of .csv:
df.to_csv('path/new_happiness.zip', index=False, compression='zip')
Note that compression accepts several values: ‘gzip’, ‘bz2’, ‘zip’, ‘xz’, as well as ‘infer’ (the default), which picks the format from the file extension.
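A compressed file can be read back directly with read_csv(), whose compression parameter also defaults to 'infer'. Here is a minimal sketch using gzip, with a hypothetical DataFrame and a temporary directory so that nothing is left on disk; the file name is only an example.

```python
import os
import tempfile
import pandas as pd

# Hypothetical DataFrame used only for this illustration.
df = pd.DataFrame({'Gender': ['Male', 'Female'], 'Mean': [7.1, 7.4]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'new_happiness.csv.gz')

    # Write a gzip-compressed csv.
    df.to_csv(path, index=False, compression='gzip')

    # The .gz extension lets read_csv() detect the compression by itself.
    reloaded = pd.read_csv(path)

print(reloaded)
```

The same pattern should work with the other formats by changing the extension and the compression value accordingly.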
That’s all for this tutorial. We hope you will find it useful 😉
Interested in happiness.csv? We use it in this article to demonstrate a technique for boosting Machine Learning: a must-read!
Sources:
- Pandas documentation
- Photo by Chris Curry on Unsplash