Introduction to Pandas
Pandas (the Python Data Analysis Library) is an open-source library that provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. It is also widely used for time series analysis. The name is derived from “panel data”, an econometrics term for data sets that include observations over multiple time periods for the same individuals (Wikipedia).
Data scientists like to use pandas for data analysis because:
- It handles missing data easily and efficiently.
- It is fast, because it is built on top of NumPy and relies on NumPy's optimized array operations.
- It makes it easy to manipulate data with operations such as merging, concatenating, and reshaping.
- It works smoothly and efficiently with time series data.
- It provides Series and DataFrame objects for handling one-dimensional and two-dimensional (tabular) data.
- It can read data from various formats such as text, CSV, and Excel files and present it in tabular (DataFrame) form.
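The points above can be illustrated with a short sketch. It is only a minimal example, assuming pandas and NumPy are already installed; the data, column names, and variable names (`prices`, `cities`, `sales`, and so on) are made up for illustration.

```python
import io

import numpy as np
import pandas as pd

# A Series is a one-dimensional labeled array; np.nan marks a missing value.
prices = pd.Series([10.0, np.nan, 12.5], index=["a", "b", "c"], name="price")

# Aggregations skip missing values by default, and fillna replaces them.
mean_price = prices.mean()
filled = prices.fillna(mean_price)

# A DataFrame is a two-dimensional table. Here it is read from CSV text
# held in memory (hypothetical data) instead of a file on disk.
csv_text = io.StringIO("id,city\n1,Paris\n2,Oslo\n3,Lima\n")
cities = pd.read_csv(csv_text)

sales = pd.DataFrame({"id": [1, 2, 2], "amount": [100, 250, 50]})

# merge joins two tables on a shared key column (inner join by default),
# so the city with no sales rows is dropped.
merged = cities.merge(sales, on="id")

# Time series: index a Series by dates and resample it into 2-day sums.
dates = pd.date_range("2024-01-01", periods=6, freq="D")
daily = pd.Series(range(6), index=dates)
two_day = daily.resample("2D").sum()

print(filled)
print(merged)
print(two_day)
```

In practice `pd.read_csv` would usually be given a file path; the in-memory text is used here only to keep the example self-contained.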
How to install pandas?
Installing with Anaconda
Pandas is built on top of NumPy and depends on a few other packages (SciPy is only an optional dependency), so installing it by hand means taking care of those dependencies as well, which can be a little difficult for novice users. Hence, the simplest way to install pandas (as also recommended by the official website) is to install it with Anaconda. You can check the installation guide for Anaconda from here.
Installing with Miniconda
The disadvantage of the above method is that it installs the hundreds of packages included with Anaconda. To avoid this and have more control over which packages are installed, you can use Miniconda.
Use this installer to set up Miniconda, which will install conda for you. Once that is done, you can run the following command to install pandas:
conda install pandas
Installing from PyPI
Pandas can also be installed from PyPI using the following command:
pip install pandas
Installing using a Linux distribution’s package
You can also install pandas for Python 3 on Linux using the commands in the following table:
|Distribution|Status|Repository Link|Install method|
|---|---|---|---|
|Debian|stable|official Debian repository|sudo apt-get install python3-pandas|
|Debian & Ubuntu|unstable (latest packages)|NeuroDebian|sudo apt-get install python3-pandas|
|Ubuntu|stable|official Ubuntu repository|sudo apt-get install python3-pandas|
|openSUSE|stable|openSUSE repository|zypper in python3-pandas|
|Fedora|stable|official Fedora repository|dnf install python3-pandas|
|CentOS/RHEL|stable|EPEL repository|yum install python3-pandas|
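Whichever installation method you use, you can confirm that pandas is importable with a short Python script. The following is only a minimal smoke test, not part of the official instructions; the names `check` and `total` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Report the installed versions of pandas and its NumPy dependency.
print("pandas version:", pd.__version__)
print("NumPy version:", np.__version__)

# Build a tiny DataFrame and run one operation as a basic sanity check.
check = pd.DataFrame({"x": [1, 2, 3]})
total = int(check["x"].sum())
print("sum of x:", total)
```

If the import fails with a `ModuleNotFoundError`, pandas was most likely installed into a different Python environment than the one running the script.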