Author: Udit Vashisht

Time Series Analysis with Pandas


Python’s pandas library is frequently used to import, manage, and analyze datasets in a variety of formats. In this article, we’ll use it to analyze Amazon’s stock prices and perform some basic time series operations.
Stock markets play an important role in the economy of a country. Governments, private sector companies, and central banks keep a close eye on fluctuations in the market as they have much to gain or lose from it. Due to the volatile nature of the stock market, analyzing stock prices is tricky, and this is where Python comes in. With built-in tools and external libraries, Python makes it much easier to load, clean, and analyze complex stock market data.


We’ll be analyzing stock data with Python 3, pandas and Matplotlib. To fully benefit from this article, you should be familiar with the basics of pandas as well as the plotting library called Matplotlib.

Time series data

Time series data is a sequence of data points in chronological order that is used by businesses to analyze past data and make future predictions. These data points are a set of observations at specified times and equal intervals, typically with a datetime index and corresponding value. Common examples of time series data in our day-to-day lives include:

  • Measuring weather temperatures
  • Measuring the number of taxi rides per month
  • Predicting a company’s stock prices for the next day

Variations of time series data

  • Trend Variation: moves up or down in a reasonably predictable pattern over a long period of time.
  • Seasonality Variation: regular and periodic; repeats itself over a specific period, such as a day, week, month, season, etc.
  • Cyclical Variation: corresponds with business or economic ‘boom-bust’ cycles, or is cyclical in some other form.
  • Random Variation: erratic or residual; doesn’t fall under any of the above three classifications.

Here are the four variations of time series data visualized:
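As a rough illustration of these four components, the sketch below builds a synthetic daily series with NumPy. Every number in it (slope, amplitudes, cycle lengths, noise level) is an arbitrary assumption chosen for readability and has nothing to do with the Amazon data.

```python
import numpy as np
import pandas as pd

# Hypothetical daily index covering three years (not the Amazon data)
idx = pd.date_range("2015-01-01", periods=3 * 365, freq="D")
t = np.arange(len(idx))

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

trend = 0.05 * t                                    # slow upward drift
seasonal = 5 * np.sin(2 * np.pi * t / 365)          # repeats once per year
cyclical = 3 * np.sin(2 * np.pi * t / (365 * 1.5))  # longer, slower swing
noise = rng.normal(0, 1, len(idx))                  # random variation

components = pd.DataFrame(
    {"trend": trend, "seasonal": seasonal, "cyclical": cyclical, "random": noise},
    index=idx,
)
# The observed series is modelled here as the simple sum of the four parts
components["combined"] = components.sum(axis=1)
print(components.head())
```

If Matplotlib is installed, calling components.plot(subplots=True) should draw each component in its own panel.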

Importing stock data and necessary Python libraries

To demonstrate the use of pandas for stock analysis, we will be using Amazon stock prices from 2013 to 2018. We’re pulling the data from Quandl, a company offering a Python API for sourcing a la carte market data. A CSV file of the data in this article can be downloaded from the article’s repository.

Fire up the editor of your choice and type in the following code to import the libraries and data that correspond to this article.

Example code for this article may be found at the Kite Blog repository on Github.

# Importing required modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Settings for pretty nice plots
plt.style.use('fivethirtyeight')

# Reading in the data
data = pd.read_csv('amazon_stock.csv')

A first look at Amazon’s stock prices

Let’s look at the first few rows of the dataset:

# Inspecting the data
data.head()


Let’s get rid of the first two columns as they don’t add any value to the dataset.

data.drop(columns=['None', 'ticker'], inplace=True)


Let us now look at the datatypes of the various columns.

data.info()


It appears that the Date column is being treated as a string rather than as dates. To fix this, we’ll use the pandas to_datetime() function, which converts its argument to datetime values.

# Convert string to datetime64 (vectorized over the whole column)
data['Date'] = pd.to_datetime(data['Date'])

Lastly, we want to make sure that the Date column is the index column.

data.set_index('Date', inplace=True)
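The two steps above can also be folded into the read itself. The following is a minimal sketch using a hypothetical three-row CSV with toy values in place of amazon_stock.csv; it compares converting after loading with letting read_csv parse the dates and set the index.

```python
import io
import pandas as pd

# Hypothetical three-row CSV standing in for amazon_stock.csv (toy values)
csv_text = """Date,Open,Close
2018-03-27,12.0,11.5
2018-03-26,11.0,12.5
2018-03-23,10.0,10.5
"""

raw = pd.read_csv(io.StringIO(csv_text))
print(raw["Date"].dtype)  # not a datetime dtype yet

# Option 1: convert after loading, as in the article
converted = raw.copy()
converted["Date"] = pd.to_datetime(converted["Date"])
converted = converted.set_index("Date")

# Option 2: let read_csv parse the column and use it as the index
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

print(type(parsed.index).__name__)
```

Either route should end with a DatetimeIndex; the second simply saves two lines when the date column is known in advance.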


Now that our data has been converted into the desired format, let’s take a look at its columns for further analysis.

  • The Open and Close columns indicate the opening and closing price of the stocks on a particular day.
  • The High and Low columns provide the highest and the lowest price for the stock on a particular day, respectively.
  • The Volume column tells us the total volume of stocks traded on a particular day.

The Adj_Close column represents the adjusted closing price, or the stock’s closing price on any given day of trading, amended to include any distributions and/or corporate actions occurring any time before the next day’s open. The adjusted closing price is often used when examining or performing a detailed analysis of historical returns.

data['Adj_Close'].plot(figsize=(16,8),title='Adjusted Closing Price')


Interestingly, it appears that Amazon had a more or less steady increase in its stock price over the 2013-2018 window. We’ll now use pandas to analyze and manipulate this data to gain insights.

Pandas for time series analysis

As pandas was developed in the context of financial modeling, it contains a comprehensive set of tools for working with dates, times, and time-indexed data. Let’s look at the main pandas data structures for working with time series data.

Manipulating datetime

Python’s basic tools for working with dates and times reside in the built-in datetime module. In pandas, a single point in time is represented as a pandas.Timestamp, and we can use the pd.to_datetime() function to create one from strings in a wide variety of date/time formats. Because pandas.Timestamp is a subclass of datetime.datetime, standard datetime objects can be used with pandas in most places where a Timestamp is expected.

from datetime import datetime
my_year = 2019
my_month = 4
my_day = 21
my_hour = 10
my_minute = 5
my_second = 30

We can now create a datetime object from the above attributes and use it freely with pandas.

test_date = datetime(my_year, my_month, my_day)
test_date

# datetime.datetime(2019, 4, 21, 0, 0)

For the purposes of analyzing our particular data, we have selected only the day, month and year, but we could also include more details like hour, minute and second if necessary.

test_date = datetime(my_year, my_month, my_day, my_hour, my_minute, my_second)
print('The day is : ', test_date.day)
print('The hour is : ', test_date.hour)
print('The month is : ', test_date.month)

# Output

The day is :  21
The hour is :  10
The month is :  4
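To connect the standard-library datetime with pandas, here is a small sketch showing pd.to_datetime() parsing a few string layouts and a Timestamp comparing equal to the matching datetime. The example strings are made up for illustration.

```python
from datetime import datetime
import pandas as pd

# Parse strings in a few different layouts into pandas Timestamps
ts_iso = pd.to_datetime("2019-04-21 10:05:30")
ts_text = pd.to_datetime("April 21, 2019")
ts_custom = pd.to_datetime("21/04/2019", format="%d/%m/%Y")

# A Timestamp built from a datetime compares equal to it
py_dt = datetime(2019, 4, 21, 10, 5, 30)
print(pd.Timestamp(py_dt) == ts_iso)
print(isinstance(ts_iso, datetime))  # Timestamp subclasses datetime
print(ts_text.day, ts_text.month, ts_text.year)
```

Passing an explicit format, as in the third call, is usually the safer choice when day and month could be confused.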

For our stock price dataset, the type of the index column is DatetimeIndex. We can use pandas to obtain the maximum and minimum dates in the data.

print(data.index.max())
print(data.index.min())


# Output

2018-03-27 00:00:00
2013-01-02 00:00:00

We can also find the index positions of the earliest and latest dates as follows:

# Earliest date index location
data.index.argmin()

# Latest date location
data.index.argmax()
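A tiny sketch of what these two calls return, using a hypothetical four-entry index ordered newest first (which is how the article's data appears to be ordered, since the latest date sits at the top):

```python
import pandas as pd

# Hypothetical index sorted newest-first (illustrative dates only)
idx = pd.DatetimeIndex(["2018-03-27", "2018-03-26", "2013-01-03", "2013-01-02"])

earliest_pos = idx.argmin()  # integer position of the earliest date
latest_pos = idx.argmax()    # integer position of the latest date

print(earliest_pos, idx[earliest_pos])
print(latest_pos, idx[latest_pos])
```

Both methods return integer positions, not dates, so they can be passed straight to iloc.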

Time resampling

Examining stock price data for every single day isn’t of much use to financial institutions, who are more interested in spotting market trends. To make it easier, we use a process called time resampling to aggregate data into a defined time period, such as by month or by quarter. Institutions can then see an overview of stock prices and make decisions according to these trends.

The pandas library has a resample() method for resampling such time series data. It is similar to the groupby method, as it essentially groups rows by a time span. A resample() call looks like this:

data.resample(rule = 'A').mean()

To summarize:

  • data.resample() is used to resample the stock data.
  • ‘A’ is the offset alias for year-end frequency, the interval at which we want to resample the data (pandas 2.2 and later prefer the equivalent alias ‘YE’).
  • mean() indicates that we want the average stock price during this period.

The output looks like this, with average stock data displayed for December 31st of each year.
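To make the comparison with groupby concrete, here is a minimal sketch on hypothetical toy prices. It passes the pd.offsets.YearEnd() offset object as the rule instead of a string alias, an assumption made only so that the same code runs regardless of which alias spelling the installed pandas version accepts.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices over two calendar years (toy values)
idx = pd.date_range("2016-01-01", "2017-12-31", freq="D")
prices = pd.Series(np.arange(len(idx), dtype=float), index=idx, name="price")

# Year-end resampling: one row per year, labeled December 31st
yearly = prices.resample(pd.offsets.YearEnd()).mean()

# The same aggregation expressed as a groupby on the calendar year
by_year = prices.groupby(prices.index.year).mean()

print(yearly)
print(by_year)
```

The two results should hold the same means; the difference is the label, a year-end date for resample and a plain year number for groupby.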


A complete list of the offset values can be found in the pandas documentation.


We can also use time resampling to plot charts for specific columns.

data['Adj_Close'].resample('A').mean().plot(kind='bar',figsize = (10,4))
plt.title('Yearly Mean Adj Close Price for Amazon')


The above bar plot shows Amazon’s average adjusted closing price for each year in our data set, with each bar labeled by its year-end date.

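Similarly, we can plot the monthly maximum opening price:

data['Open'].resample('M').max().plot(kind='bar',figsize = (10,4))
plt.title('Monthly Max Opening Price for Amazon')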


Time shifting

Sometimes, we may need to shift or move the data forward or backwards in time. This shifting is done along a time index by the desired number of time-frequency increments.

Here is the original dataset before any time shifts.

data.head()


Forward Shifting

To shift our data forward, we pass the desired number of periods (or increments) to the shift() method; the number must be positive in this case.

data.shift(1).head()


Here we will move our data forward by one period or index, which means that all values which earlier corresponded to row N will now belong to row N+1. Here is the output:


Backwards shifting

To shift our data backwards, the number of periods (or increments) must be negative.

data.shift(-1).head()



The opening amount corresponding to 2018-03-27 is now 1530, whereas originally it was 1572.40.
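The effect of both directions is easier to see on a very small frame. This sketch uses a hypothetical three-row frame with toy values, ordered newest first like the article's data:

```python
import pandas as pd

# Hypothetical frame, newest row first (toy values, not real prices)
idx = pd.to_datetime(["2018-03-27", "2018-03-26", "2018-03-23"])
toy = pd.DataFrame({"Open": [3.0, 2.0, 1.0]}, index=idx)

forward = toy.shift(1)    # each value moves down one row; first row becomes NaN
backward = toy.shift(-1)  # each value moves up one row; last row becomes NaN

print(forward)
print(backward)
```

In both cases the index stays where it is and only the values move, which is why one row at the edge ends up empty.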

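Shifting based on a time string code

We can also use an offset alias from the pandas offset table for time shifting. For that, we pass the periods and freq parameters to the pandas shift() method. The periods parameter defines the number of steps to shift, while the freq parameter denotes the size of those steps. When freq is given, the index is shifted rather than the values. (Older pandas releases offered tshift() for this; it was deprecated and later removed, so shift() with freq is used here.)

Let’s say we want to shift the data three months forward:

# 'M' is the month-end alias; pandas 2.2 and later prefer 'ME'
data.shift(periods=3, freq='M').head()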

We would get the following as an output:
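As a small, version-independent sketch of the same idea, the example below passes the pd.offsets.MonthEnd() offset object instead of a string alias. The two-row frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a month-end index (toy values)
idx = pd.to_datetime(["2018-01-31", "2018-02-28"])
toy = pd.DataFrame({"Open": [1.0, 2.0]}, index=idx)

# Passing freq moves the index labels; the values stay attached to their rows
shifted = toy.shift(periods=3, freq=pd.offsets.MonthEnd())

print(shifted)
```

Unlike a plain shift(3), no NaN rows appear here, because only the dates move, each by three month-ends.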


Rolling windows

Time series data can be noisy due to high fluctuations in the market. As a result, it becomes difficult to gauge a trend or pattern in the data. Here is a visualization of Amazon’s adjusted close price over the years, where we can see such noise:

data['Adj_Close'].plot(figsize = (16,8))


As we’re looking at daily data, there’s quite a bit of noise present. It would be nice if we could average this out by a week, which is where a rolling mean comes in. A rolling mean, or moving average, is a transformation method which helps average out noise from data. It works by sliding a fixed-size window over the data and aggregating the values in each window with a function such as mean(), median(), count(), etc. For this example, we’ll use a rolling mean for 7 days.

data.rolling(7).mean().head(10)

Here is the output:


The first six values have all become blank as there wasn’t enough data to actually fill them when using a window of seven days.
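The blank leading values can be reproduced on a toy series. This is a minimal sketch with ten hypothetical daily values; the min_periods variant is an optional extra that the article itself does not use.

```python
import numpy as np
import pandas as pd

# Hypothetical ten-day series with values 1.0 through 10.0
idx = pd.date_range("2018-01-01", periods=10, freq="D")
toy = pd.Series(np.arange(1.0, 11.0), index=idx)

# 7-row window: the first six results are NaN because the window is not full
weekly_mean = toy.rolling(window=7).mean()
print(weekly_mean)

# With min_periods=1 the early windows use whatever data is available
partial = toy.rolling(window=7, min_periods=1).mean()
print(partial)
```

Note that an integer window counts rows, not calendar days, so on stock data a window of 7 spans seven trading days.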

So, what are the key benefits of calculating a moving average or using this rolling mean method? The smoothed series is much less noisy and reflects the underlying trend better than the raw data. Let’s plot this out. First, we’ll plot the original data, followed by the rolling mean over 30 days.

data['Open'].plot()
data.rolling(window=30).mean()['Open'].plot(figsize=(16, 6))


The orange line is the original open price data. The blue line represents the 30-day rolling window, and has less noise than the orange line. Something to keep in mind is that once we run this code, the first 29 days aren’t going to have the blue line because there wasn’t enough data to actually calculate that rolling mean.


Python’s pandas library is a powerful, comprehensive library with a wide variety of inbuilt functions for analyzing time series data. In this article, we saw how pandas can be used for wrangling and visualizing time series data.

We also performed tasks like time resampling, time shifting and rolling with stock data. These are usually the first steps in analyzing any time series data. Going forward, we could use this data to perform a basic financial analysis by calculating the daily percentage change in stocks to get an idea about the volatility of stock prices. Another way we could use this data would be to predict Amazon’s stock prices for the next few days by employing machine learning techniques. This would be especially helpful from the shareholder’s point of view.
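As a pointer for the daily percentage change mentioned above, here is a brief sketch on hypothetical toy values. It assumes, as with the article's data, that rows arrive newest first and therefore sorts the index before computing changes.

```python
import pandas as pd

# Hypothetical adjusted closing prices, newest row first (toy values)
idx = pd.to_datetime(["2018-03-27", "2018-03-26", "2018-03-23"])
toy = pd.DataFrame({"Adj_Close": [110.0, 100.0, 80.0]}, index=idx)

# pct_change compares each row with the one before it, so order oldest-first
ordered = toy.sort_index()
daily_return = ordered["Adj_Close"].pct_change()

print(daily_return)
print(daily_return.std())  # one rough gauge of volatility
```

The standard deviation of these returns is only one simple proxy for volatility; a fuller analysis would likely look at longer windows.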


Disclaimer: This article is reposted on this website under a content collaboration agreement with Kite’s editorial team. It was originally written by Parul Pandey for Kite.
