With global stock markets reeling from the uncertainty around the current COVID-19 pandemic, I thought it might be interesting to see how we can pull some stock market data into Python for further analysis.
In this first post, we look at how to import daily trading data into Python and display the data for a chosen stock.
Before getting started, it is assumed that you are already familiar with Python and have both Python and its package manager, pip, installed.
To follow this project, some additional Python packages will also need to be installed.
Using pandas allows us to easily interact with the data using two-dimensional structures called dataframes, whilst pandas-datareader is used for obtaining Yahoo stock market data.
The above packages (including dependencies) can be installed using the following two commands:
pip install pandas
pip install pandas-datareader
With the packages now installed, we are ready to get started.
First, we import the modules from the installed packages, as well as the built-in datetime module:
import datetime as dt
import pandas as pd
import pandas_datareader as pdr
As we will soon see, the datetime module is needed to create date objects, which are used to extract stock prices in a specific date range.
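As a quick illustration of what date objects give us, here is a minimal sketch using only the standard library. The end date is an arbitrary fixed value chosen for the example, not something the project requires.

```python
import datetime as dt

# Two fixed points in time (the end date is an arbitrary example value)
start = dt.datetime(2019, 1, 1)
end = dt.datetime(2020, 4, 9)

# Subtracting two datetimes gives a timedelta
span = end - start
print(span.days)

# Date objects can be compared directly, which is what makes range filters possible
print(start <= dt.datetime(2020, 4, 2) <= end)

# ISO formatting is handy when printing or logging a range
print(start.date().isoformat(), "to", end.date().isoformat())
```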
Next, we set up the variables that describe the data we want to import: the ticker symbol and the date range in which we are interested. The stock symbols used are as per Yahoo Finance, so depending on which global market data you want to access, the ticker symbols may differ slightly from those used in your local market. (For example, in Australia the ticker symbol for Coles supermarket group is COL, but on Yahoo it is COL.AX.)
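If you find yourself converting local symbols often, a small helper can do it. The sketch below is purely illustrative: to_yahoo_symbol is a hypothetical function of my own, not part of pandas-datareader, and the only suffix it has been checked against is the .AX example above. Always confirm the actual symbol on Yahoo Finance.

```python
def to_yahoo_symbol(local_symbol, exchange_suffix=""):
    """Hypothetical helper: append an exchange suffix to a local ticker symbol."""
    symbol = local_symbol.strip().upper()
    # Only add the suffix if one was given and it is not already present
    if exchange_suffix and not symbol.endswith(exchange_suffix):
        symbol += exchange_suffix
    return symbol


print(to_yahoo_symbol("COL", ".AX"))     # COL.AX
print(to_yahoo_symbol("col.ax", ".AX"))  # COL.AX
```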
Let’s use the Australian All Ordinaries index as our target, starting from 1 January 2019 up to the current date:
# target stock details
stock_pick = '^AORD'
start_date = dt.datetime(2019, 1, 1)
end_date = dt.date.today()
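Note that start_date above is a datetime while end_date is a plain date. Python will not order-compare the two types directly (it raises a TypeError), so if you want to sanity-check the range before requesting any data, one option is to reduce both to plain dates first. The helpers as_date and check_date_range below are hypothetical names of my own, offered as a sketch rather than a required step.

```python
import datetime as dt


def as_date(value):
    """Reduce a datetime to a plain date; leave plain dates unchanged."""
    # datetime is a subclass of date, so test for datetime first
    if isinstance(value, dt.datetime):
        return value.date()
    return value


def check_date_range(start, end):
    """Raise ValueError if the range is reversed; return it as plain dates."""
    start, end = as_date(start), as_date(end)
    if start > end:
        raise ValueError(f"start date {start} is after end date {end}")
    return start, end


start_date = dt.datetime(2019, 1, 1)  # a datetime, as in the post
end_date = dt.date.today()            # a plain date, as in the post
print(check_date_range(start_date, end_date))
```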
To grab the data and place it in a dataframe, we pass the above variables to the DataReader function in pandas_datareader and specify the data source as 'yahoo':
# get stock data
df = pdr.DataReader(stock_pick, 'yahoo', start_date, end_date)
To check that our data imported correctly, we can view the last 5 rows of data, using the tail method:
# print stock data
print(df.tail())
Viewing the output from the above command, you should see that the stock data is indexed by date and has columns for open, high, low, close, volume and adjusted close. (The data may be truncated as below, depending on your terminal width.)
                   High          Low  ...        Volume  Adj Close
Date                                  ...
2020-04-02  5282.600098  5063.500000  ...  1.548106e+09  5106.8999
2020-04-06  5338.000000  5106.899902  ...  1.273476e+09  5323.6000
2020-04-07  5464.200195  5237.000000  ...  1.523458e+09  5301.2998
2020-04-08  5368.000000  5176.000000  ...  1.507547e+09  5258.7998
2020-04-09  5439.399902  5258.799805  ...  1.363194e+09  5439.3999
It looks like everything has imported successfully and we are now ready to start working with the data.
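If you would rather not rely on eyeballing the output, a few programmatic checks can confirm the same thing. The sketch below runs offline: it rebuilds a small stand-in dataframe from the five rows printed above (only the columns visible in the truncated output), and check_prices is a hypothetical helper of my own rather than anything provided by pandas. With live data you would pass df in place of sample.

```python
import datetime as dt
import pandas as pd

# Stand-in for the downloaded data, built from the rows printed above
sample = pd.DataFrame(
    {
        "High": [5282.600098, 5338.000000, 5464.200195, 5368.000000, 5439.399902],
        "Low": [5063.500000, 5106.899902, 5237.000000, 5176.000000, 5258.799805],
        "Volume": [1.548106e09, 1.273476e09, 1.523458e09, 1.507547e09, 1.363194e09],
        "Adj Close": [5106.8999, 5323.6000, 5301.2998, 5258.7998, 5439.3999],
    },
    index=pd.to_datetime(
        ["2020-04-02", "2020-04-06", "2020-04-07", "2020-04-08", "2020-04-09"]
    ).rename("Date"),
)


def check_prices(df, start, end, required_columns):
    """Return a list of problems found; an empty list means the checks passed."""
    if df.empty:
        return ["no rows returned"]
    problems = []
    missing = [col for col in required_columns if col not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    # All rows should fall inside the requested date range
    if df.index.min() < pd.Timestamp(start) or df.index.max() > pd.Timestamp(end):
        problems.append("rows fall outside the requested date range")
    # A day's high should never be below its low
    if {"High", "Low"}.issubset(df.columns) and (df["High"] < df["Low"]).any():
        problems.append("some rows have High below Low")
    return problems


problems = check_prices(
    sample,
    start=dt.datetime(2019, 1, 1),
    end=dt.datetime(2020, 4, 9),
    required_columns=["High", "Low", "Volume", "Adj Close"],
)
print(problems or "all checks passed")
```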
The code for the above can be downloaded from my GitHub page.
In the next post, we will look at creating a standard chart using the data.