Author: Udit Vashisht


Matplotlib Tutorial in Python | Chapter 2 | Extracting Data from CSVs and plotting Bar Charts.


Matplotlib Tutorial in Python

Chapter 2 | Extracting Data from CSVs and plotting Bar Charts

In the last chapter, we learned to draw simple plots in Matplotlib and to customize them. In this chapter, we will learn to extract data from external sources such as CSV files and plot it as Matplotlib bar charts.

Matplotlib Tutorials in Python - Creating Simple Bar Charts

Just like plt.plot(), you can simply use plt.bar() to create a Matplotlib bar chart.

# matplotlib_barchart_tutorial.py

import matplotlib.pyplot as plt
plt.style.use('ggplot')

ages = [12, 13, 14, 15, 16, 17, 18, 19, 20, 21]

total_population = [27877307, 24280683, 25258169, 25899454, 24592293, 21217467, 27958147, 20859088, 28882735, 19978972]

plt.bar(ages, total_population, color='b', label="Total Population")
plt.legend()
plt.xlabel("Age")
plt.ylabel("Total Population")
plt.title("Age-wise population of India")
plt.show()

Just as with plt.plot(), we can pass the color argument to plt.bar().

matplotlib_tutorial_barchart_simple.png

Matplotlib Tutorials in Python - Adding Line Plots to Bar Charts

We can add line plots to overlap Matplotlib Bar Charts.

# matplotlib_barchart_tutorial.py

male_population = [14637892, 12563775, 13165128, 13739746, 13027935, 11349449, 15020851, 10844415, 14892165, 10532278]

female_population = [13239415, 11716908, 12093041, 12159708, 11564358, 9868018, 12937296, 10014673, 13990570, 9446694]

plt.plot(ages, male_population, color='g', linestyle='--', marker='o', label="Male Population")
plt.plot(ages, female_population, color='r', linestyle='-', marker='^', label="Female Population")

matplotlib_tutorial_barchart_simple_.png

Matplotlib Tutorials in Python - Adding more Bar Charts to the Matplotlib Plot

We can add more than one bar chart to a Matplotlib plot by calling plt.bar() once for each data series.

matplotlib_tutorial_barchart_mutliple_.png

In the above plot, for every age, female population < male population < total population, so the overlapping bars can still be told apart. If the series are not ordered this consistently, some bars will hide others and the plot will be hard to read.
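The snippet for this step is not shown above, so here is a minimal sketch of what calling plt.bar() three times might look like. It reuses the lists defined earlier. The drawing order (largest series first) and the Agg backend line are assumptions made so that the example runs without a display; this is not necessarily the exact code behind the image.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

ages = [12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
total_population = [27877307, 24280683, 25258169, 25899454, 24592293,
                    21217467, 27958147, 20859088, 28882735, 19978972]
male_population = [14637892, 12563775, 13165128, 13739746, 13027935,
                   11349449, 15020851, 10844415, 14892165, 10532278]
female_population = [13239415, 11716908, 12093041, 12159708, 11564358,
                     9868018, 12937296, 10014673, 13990570, 9446694]

# Every plt.bar() call draws on the same axes, so later bars cover earlier ones.
# Drawing the largest series first keeps the smaller series visible.
total_bars = plt.bar(ages, total_population, color='b', label="Total Population")
male_bars = plt.bar(ages, male_population, color='g', label="Male Population")
female_bars = plt.bar(ages, female_population, color='r', label="Female Population")

plt.legend()
plt.xlabel("Age")
plt.ylabel("Population")
plt.title("Age-wise population of India")
# In an interactive session, drop the Agg line and finish with plt.show().
```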

Matplotlib Tutorials in Python - Stacking bars side-by-side in Matplotlib Bar Chart

To overcome the problem mentioned above, we will use a small trick to place the bars side by side for better readability. For this we will be using NumPy. NumPy is installed automatically along with Matplotlib; if it is missing, you can install it with:-

pip install numpy

Then, we will add the following code to display the bars side by side in the Matplotlib bar chart:-

# matplotlib_barchart_tutorial.py

import numpy as np
age_indexes = np.arange(len(ages))
width = .30

plt.bar(age_indexes + width, total_population, width=width, color='b', label="Total Population")
plt.bar(age_indexes, male_population, width=width, color='g', label="Male Population")
plt.bar(age_indexes - width, female_population, width=width, color='r', label="Female Population")

plt.xticks(ticks=age_indexes, labels=ages)

Let me walk you through the code step by step:-

  1. We have imported numpy as np.
  2. Then, we have created an array with one index per item in ages (0 to 9) by using numpy.arange() together with len().
  3. Then we have defined the width of each bar; the same value is used to offset the bars to the left and right of the central bar.
  4. In plt.bar(), we have added width to shift the total_population bars to the right of the male_population bars and subtracted width to shift the female_population bars to the left. Also, we have set width=0.30 for every bar.
  5. Finally, since the x-axis would otherwise show age_indexes (i.e. 0 to len(ages) - 1), we have used plt.xticks() to label those positions with ages.
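To make the offsets concrete, here is a small sketch that only computes the x-positions used above, without plotting anything. The names female_x, male_x and total_x are introduced here for illustration and do not appear in the tutorial code.

```python
import numpy as np

ages = [12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
age_indexes = np.arange(len(ages))  # one integer position per age: 0, 1, ..., 9
width = 0.30

# Illustrative names (not in the tutorial code): centre of each bar in a group
female_x = age_indexes - width  # shifted left of the tick
male_x = age_indexes            # centred on the tick
total_x = age_indexes + width   # shifted right of the tick

# Print the bar centres for the first three ages
for age, f, m, t in zip(ages[:3], female_x, male_x, total_x):
    print(f"age {age}: female at {f:.2f}, male at {m:.2f}, total at {t:.2f}")
```

Because three bars of width 0.30 take up 0.90 of the unit distance between ticks, neighbouring groups stay separated by a gap of about 0.10.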

matplotlib_tutorial_barchart_mutliple_side.png

Matplotlib Tutorials in Python - Extracting data from CSV and plotting it as Matplotlib Bar Chart

In this section, we will parse data from a csv file using csv module and some other modules of python and then plot it as a Matplotlib Bar Chart using plt.bar() in matplotlib.

For this we will be using the data of services provided by various RPOs in India. I have downloaded the csv file from here and renamed it to ‘data.csv’.

We will be using the built-in csv module to parse the data from the file. First of all, let us have a look at what the data looks like:-

# matplotlib_barchart_tutorial.py

import csv
with open('../data/data.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    row = next(csv_reader)
    print(row)

# Output

OrderedDict([('ServiceName', 'Applications Received - Scheme wise'), ('RpoName', 'RPO Ahmedabad'), ('SchemeType', 'Normal'), ('LastWeekCount', '14196'), ('LastMonthCount', '68775'), ('YearTillDate', '447831'), ('Date', '2019-08-25 04:22:41.094411')])
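The printed type depends on the Python version: since Python 3.8, csv.DictReader yields plain dict rows rather than OrderedDict, so the same code may print {...} instead. The sketch below repeats the step on an in-memory copy of the header and the single row shown above, using io.StringIO as a stand-in for the real file (an assumption made so that the example runs without data.csv).

```python
import csv
import io

# In-memory stand-in for data.csv: the header and the one row printed above
sample_csv = io.StringIO(
    "ServiceName,RpoName,SchemeType,LastWeekCount,LastMonthCount,YearTillDate,Date\n"
    "Applications Received - Scheme wise,RPO Ahmedabad,Normal,14196,68775,447831,"
    "2019-08-25 04:22:41.094411\n"
)

csv_reader = csv.DictReader(sample_csv)  # the first line supplies the keys
row = next(csv_reader)                   # first data row, keyed by column name

print(row["SchemeType"], row["YearTillDate"])
# Every value is read as a string, so counts need int() before any arithmetic
print(type(row["YearTillDate"]).__name__)
```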

We will group the rows by SchemeType and read the number of applications processed under each scheme from YearTillDate, summing that number across all RPOs (RpoName). To do this, we create an empty Python dictionary called schemes_dict and fill in its keys and values in a loop.

# matplotlib_barchart_tutorial.py

schemes_dict = {}   #empty dictionary
with open('../data/data.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        if row['SchemeType'] not in schemes_dict.keys():
            schemes_dict[row['SchemeType']] = int(row['YearTillDate'])
        else:
            schemes_dict[row['SchemeType']] += int(row['YearTillDate'])

This will give us the following dictionary.

# matplotlib_barchart_tutorial.py

print (schemes_dict)

# Output
{'Normal': 6839295, 'Tatkaal': 258603, 'FEMALE': 2586870, 'MALE': 4510888, 'TRANSGENDER': 249, 'Count of Applications': 7098007, '10TH PASS AND ABOVE': 2603471, '5TH PASS OR LESS': 33944, 'BETWEEN 6TH AND 9TH STANDARD': 38233, 'GRADUATE AND ABOVE': 2073126, 'Between_18_to_35': 3644579, 'Between_36_to_60': 1989386, 'GreaterThan60': 500636, 'LessThan18': 963406, 'Challan': 209902400, 'Credit/ Debit Card': 5000286450, 'Online': 3408535500, 'FRESH': 4519001, 'PCC': 333621, 'REISSUE': 2239541, 'No Verification': 931889, 'Post Verification': 1005761, 'Pre Verification': 4333366, 'More than 21 Days': 837586, 'Within 21 Days': 4210361}
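The same running totals can also be built with collections.defaultdict, which removes the explicit key check. This is an alternative sketch, not the tutorial's code. The rows below are made-up sample values (only the column names come from the real file), fed through io.StringIO so that the example runs without data.csv.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only; column names match the real file
sample_csv = io.StringIO(
    "RpoName,SchemeType,YearTillDate\n"
    "RPO A,Normal,100\n"
    "RPO A,Tatkaal,10\n"
    "RPO B,Normal,250\n"
    "RPO B,Tatkaal,5\n"
)

totals = defaultdict(int)  # a missing key starts at 0
for row in csv.DictReader(sample_csv):
    totals[row["SchemeType"]] += int(row["YearTillDate"])

print(dict(totals))  # {'Normal': 350, 'Tatkaal': 15}
```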

Now, the values in schemes_dict span a very wide range, so we will sort the items by value in descending order.

# matplotlib_barchart_tutorial.py

sorted_schemes = sorted(schemes_dict.items(), key=lambda kv: -kv[1])
print(sorted_schemes)

# Output

[('Credit/ Debit Card', 5000286450), ('Online', 3408535500), ('Challan', 209902400), ('Count of Applications', 7098007), ('Normal', 6839295), ('FRESH', 4519001), ('MALE', 4510888), ('Pre Verification', 4333366), ('Within 21 Days', 4210361), ('Between_18_to_35', 3644579), ('10TH PASS AND ABOVE', 2603471), ('FEMALE', 2586870), ('REISSUE', 2239541), ('GRADUATE AND ABOVE', 2073126), ('Between_36_to_60', 1989386), ('Post Verification', 1005761), ('LessThan18', 963406), ('No Verification', 931889), ('More than 21 Days', 837586), ('GreaterThan60', 500636), ('PCC', 333621), ('Tatkaal', 258603), ('BETWEEN 6TH AND 9TH STANDARD', 38233), ('5TH PASS OR LESS', 33944), ('TRANSGENDER', 249)]

Here we have used the sorted() function on the items of the dict and defined the sort key as the negated second element of each item (i.e. the value) using a lambda function, which gives descending order. The output is a list of tuples, each holding a scheme and its total. The first three or four values are far higher than the rest, so we will take 10 items starting from the fifth item to get a readable bar chart.
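Negating the value in the key is one way to sort in descending order; passing reverse=True to sorted() gives the same result here and may be easier to read. A small sketch using three entries taken from the dictionary above:

```python
# Three entries copied from schemes_dict above
subset = {'Normal': 6839295, 'Tatkaal': 258603, 'FEMALE': 2586870}

by_negated_value = sorted(subset.items(), key=lambda kv: -kv[1])
by_reverse_flag = sorted(subset.items(), key=lambda kv: kv[1], reverse=True)

print(by_negated_value)
# [('Normal', 6839295), ('FEMALE', 2586870), ('Tatkaal', 258603)]
print(by_negated_value == by_reverse_flag)  # True
```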

# matplotlib_barchart_tutorial.py

scheme = []
total_number = []
for item in sorted_schemes[4:14]:
    scheme.append(item[0])
    total_number.append(item[1])

So, here we have created two lists of ten items each: the first contains the scheme types and the second contains the total number of applications processed under each scheme. Now we will plot this data.
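As an aside, the loop above can also be replaced by zip(*...), which splits a list of pairs into two tuples in one step. This is only a sketch with three pairs copied from the sorted output; the loop works just as well, and plt.bar() accepts tuples as readily as lists.

```python
# Fifth to seventh items of sorted_schemes, copied from the output above
pairs = [('Normal', 6839295), ('FRESH', 4519001), ('MALE', 4510888)]

# zip(*pairs) transposes the pairs: first all the names, then all the totals
scheme, total_number = zip(*pairs)

print(scheme)        # ('Normal', 'FRESH', 'MALE')
print(total_number)  # (6839295, 4519001, 4510888)
```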

# matplotlib_barchart_tutorial.py

import matplotlib.pyplot as plt
import csv

schemes_dict = {}
with open('../data/data.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        if row['SchemeType'] not in schemes_dict.keys():
            schemes_dict[row['SchemeType']] = int(row['YearTillDate'])
        else:
            schemes_dict[row['SchemeType']] += int(row['YearTillDate'])
sorted_schemes = sorted(schemes_dict.items(), key=lambda kv: -kv[1])

scheme = []
total_number = []
for item in sorted_schemes[4:14]:  # ten items, starting from the fifth
    scheme.append(item[0])
    total_number.append(item[1])

plt.bar(scheme, total_number)
plt.xlabel("Scheme")
plt.ylabel("Total Applications")
plt.title("Scheme-wise Passport Applications processed in India")
plt.show()

We will get the following Matplotlib plot:-

matplotlib_tutorial_barchart_csv.png

Here, the x-axis labels are not legible, so let us fix that. We will use the rotation argument of plt.xticks() to rotate the labels to vertical, i.e. 90 degrees (you can use any angle; play around with it). Then we will use plt.subplots_adjust() to make room for the labels by setting the left and bottom margins.

# matplotlib_barchart_tutorial.py
plt.xticks(rotation=90)
plt.subplots_adjust(left=.15, bottom=.4)

matplotlib_tutorial_barchart_csv_.png

Let us use plt.tight_layout() to adjust the margins automatically:-

# matplotlib_barchart_tutorial.py

plt.tight_layout()
plt.show()

matplotlib_tutorial_barchart_csv__.png
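Since any angle can be used, slanted labels are a common alternative to fully vertical text. The sketch below rotates the labels by 45 degrees and right-aligns them (ha='right') so that each label ends under its bar. The three bars are copied from the sorted output, and the Agg backend line is an assumption so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Three items copied from the sorted output above
scheme = ['Normal', 'FRESH', 'MALE']
total_number = [6839295, 4519001, 4510888]

plt.bar(scheme, total_number)
plt.xticks(rotation=45, ha='right')  # slant the labels and anchor their right end at the tick
plt.tight_layout()                   # make room for the rotated labels
# In an interactive session, drop the Agg line and finish with plt.show().
```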

Creating horizontal barchart in Matplotlib

Creating a horizontal bar chart is as easy as using plt.barh() instead of plt.bar() and swapping the labels passed to plt.xlabel() and plt.ylabel().

# matplotlib_barchart_tutorial.py

plt.barh(scheme, total_number)
plt.xlabel("Total Applications")
plt.ylabel("Scheme")
plt.title("Scheme-wise Passport Applications processed in India")
plt.xticks(rotation=90)
plt.subplots_adjust(left=.15, bottom=.4)
plt.tight_layout()
plt.show()

matplotlib_tutorial_barchart_csv___.png
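plt.barh() places the first item at the bottom, so with data sorted in descending order the largest total ends up at the bottom of the chart. If you would rather read the chart from the top down, one option (not part of the code above) is to invert the y-axis. A minimal sketch with three items copied from the sorted output, again assuming the Agg backend so that it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Three items copied from the sorted output above
scheme = ['Normal', 'FRESH', 'MALE']
total_number = [6839295, 4519001, 4510888]

plt.barh(scheme, total_number)
plt.gca().invert_yaxis()  # the first item ('Normal') now appears at the top
plt.xlabel("Total Applications")
plt.ylabel("Scheme")
plt.tight_layout()
# In an interactive session, drop the Agg line and finish with plt.show().
```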

If you liked our tutorial, there are various ways to support us; the easiest is to share this post. You can also follow us on Facebook, Twitter and YouTube.

If you have any questions, you can leave a comment below.

In the next chapter, we will learn to draw pie charts with Matplotlib in Python.


Table of Contents of Matplotlib Tutorials for Python

Matplotlib Tutorial in Python | Chapter 1 | Introduction

Matplotlib Tutorial in Python | Chapter 2 | Extracting Data from CSVs and plotting Bar Charts

Pie Charts in Python | Matplotlib Tutorial in Python | Chapter 3

Matplotlib Stack Plots/Bars | Matplotlib Tutorial in Python | Chapter 4

Filling Area on Line Plots | Matplotlib Tutorial in Python | Chapter 5

Python Histograms | Matplotlib Tutorial in Python | Chapter 6

Scatter Plotting in Python | Matplotlib Tutorial | Chapter 7

Plot Time Series in Python | Matplotlib Tutorial | Chapter 8

Python Realtime Plotting | Matplotlib Tutorial | Chapter 9

Matplotlib Subplot in Python | Matplotlib Tutorial | Chapter 10

Python Candlestick Chart | Matplotlib Tutorial | Chapter 11
