Author: Udit Vashisht


Python Pandas Tutorial - How to set index of a Python Pandas Dataframe?


Python Pandas is one of the most popular and widely downloaded Python libraries. In our previous post, we gave a detailed introduction to Python Pandas and showed how to install it on MacOS, Windows, Linux, etc. In this post, we will learn how to set the index of a Pandas Dataframe.

Python Pandas Tutorial - Setting index of a Python Pandas’ dataframe

For this tutorial, we will use Stack Overflow’s developer survey data for 2019. You can download the data from here. Let’s start coding. We will create a Pandas Dataframe from the CSV file using pandas.read_csv() and print its first five rows.

#python-pandas-tutorial.py

import pandas as pd
df = pd.read_csv('data/survey_results_public.csv')
print(df.head())  # head() shows only the first five rows

Output

   Respondent             ...                              SurveyEase
0           1             ...              Neither easy nor difficult
1           2             ...              Neither easy nor difficult
2           3             ...              Neither easy nor difficult
3           4             ...                                    Easy
4           5             ...                                    Easy

[5 rows x 85 columns]

If you look closely at the output, the unnamed first column starting at 0 is the index of the dataframe. Since we have not set an index explicitly, pandas has assigned the default index, which runs from 0 to n-1 for a dataframe with n rows. We can inspect the index as follows:

#python-pandas-tutorial.py

print(df.index)

Output

RangeIndex(start=0, stop=88883, step=1)
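
The default index can also be seen without downloading the survey file. The following is a minimal sketch that uses a hypothetical three-row dataframe (the name `small` and its values are made up for illustration and are not part of the survey data):

```python
import pandas as pd

# Hypothetical three-row dataframe standing in for the survey data.
small = pd.DataFrame(
    {"Respondent": [1, 2, 3], "SurveyEase": ["Easy", "Easy", "Difficult"]}
)

# No index was supplied, so pandas assigns a RangeIndex from 0 to n-1.
print(small.index)        # RangeIndex(start=0, stop=3, step=1)
print(list(small.index))  # [0, 1, 2]
```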

Since this dataframe already has a column ‘Respondent’ with unique values, we can use that column as the index of the dataframe with the following code.

#python-pandas-tutorial.py

df.set_index('Respondent')

Note that set_index() does not modify the dataframe it is called on; by default it returns a new dataframe with the new index. A Jupyter notebook displays that returned dataframe, which makes it look as if the change worked, but printing df again will show the old index. To make the change permanent, either assign the result back (df = df.set_index('Respondent')) or pass ‘inplace=True’ as an argument to the method.

#python-pandas-tutorial.py

df.set_index('Respondent', inplace=True)
print(df.head())

Output

Respondent             ...                              SurveyEase
         1             ...              Neither easy nor difficult
         2             ...              Neither easy nor difficult
         3             ...              Neither easy nor difficult
         4             ...                                    Easy
         5             ...                                    Easy

[5 rows x 84 columns]
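
The difference between the returned dataframe and the original one can be checked on a small example. This is only a sketch on a hypothetical three-row dataframe (`small` and `indexed` are illustrative names, not part of the tutorial's script). It also shows why the column count drops by one: the column that becomes the index is no longer counted as a regular column.

```python
import pandas as pd

# Hypothetical three-row dataframe standing in for the survey data.
small = pd.DataFrame(
    {"Respondent": [1, 2, 3], "SurveyEase": ["Easy", "Easy", "Difficult"]}
)

# set_index() returns a new dataframe; `small` itself is left unchanged.
indexed = small.set_index("Respondent")
print(small.index.name)    # None
print(indexed.index.name)  # Respondent

# Assigning the result back has the same effect as inplace=True.
small = small.set_index("Respondent")
print(list(small.columns))  # ['SurveyEase']
```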

Resetting the index of a Pandas dataframe

If you have set the index by mistake, you can use reset_index() to undo it. This moves the current index back into a regular column and restores the default integer index. As with set_index(), you will have to pass ‘inplace=True’ (or assign the result back) for the change to persist.

#python-pandas-tutorial.py

df.reset_index(inplace=True)
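
As a rough illustration of what reset_index() does, the sketch below again uses a hypothetical three-row dataframe (the names `small`, `restored` and `dropped` are made up). It also shows the `drop=True` option, which discards the old index instead of turning it back into a column.

```python
import pandas as pd

# Hypothetical dataframe that already uses 'Respondent' as its index.
small = pd.DataFrame(
    {"Respondent": [1, 2, 3], "SurveyEase": ["Easy", "Easy", "Difficult"]}
).set_index("Respondent")

# reset_index() turns 'Respondent' back into a regular column
# and restores the default RangeIndex.
restored = small.reset_index()
print(list(restored.columns))  # ['Respondent', 'SurveyEase']
print(list(restored.index))    # [0, 1, 2]

# drop=True throws the old index away instead of keeping it as a column.
dropped = small.reset_index(drop=True)
print(list(dropped.columns))   # ['SurveyEase']
```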

Setting the index of a Pandas dataframe while reading in the CSV file

Alternatively, if you already know which column of the CSV file should serve as the index, you can set it while reading the file by passing index_col to pandas.read_csv():

#python-pandas-tutorial.py

df = pd.read_csv('data/survey_results_public.csv', index_col='Respondent')

This sets the ‘Respondent’ column as the index of the dataframe. If you have any doubts, feel free to leave a comment.
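
To try index_col without the survey file, here is a small sketch that reads a hypothetical CSV from an in-memory string (the CSV text and variable names are assumptions made for illustration). index_col accepts either a column name or a column position.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for survey_results_public.csv.
csv_text = "Respondent,SurveyEase\n1,Easy\n2,Easy\n3,Difficult\n"

# Select the index column by name...
from_name = pd.read_csv(io.StringIO(csv_text), index_col="Respondent")
# ...or by position (0 is the first column of the file).
from_pos = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(from_name.index.name)        # Respondent
print(list(from_name.columns))     # ['SurveyEase']
print(from_name.equals(from_pos))  # True
```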

You can also read more about the data structures of Python Pandas, i.e. Pandas Series and Pandas Dataframes, here.

We also have a video series on Python Pandas Tips and Tricks on our YouTube channel.

Python Video Tutorials - How to set index of a pandas dataframe


