Working with stock market data using Python: Part 3

In this series, we have been covering how to interact with market trading data from Yahoo Finance using Python. In the two previous posts, we covered how to connect to daily market trading data using pandas-datareader and pandas, as well as how to graph historic market data using matplotlib. Other than the daily trading data, we can also extract financial statistics for the ticker, such as fifty two week high and low, market capitalisation, as well as per share statistics, such as price earnings ratio, earnings per share, dividend yield and the price to book ratio, to name but a few.

No further Python packages need to be installed, as we have already installed pandas-datareader and pandas in the first post in this series.

To extract the per share statistics, we will be using YahooQuotesReader, which is part of pandas-datareader.

As in previous posts, we start with importing the modules that we need:

import pandas as pd 
import pandas_datareader.data as data

Next, we choose a target ticker, in this case Coles Group Limited, and pass the ticker symbol to YahooQuotesReader, storing the result in a dataframe, called ticker_stats.

# get the per share stats for the ticker
symbol = 'COL.AX'
ticker_stats = data.YahooQuotesReader(symbol).read()

The per share statistics that we will focus on are price earnings ratio, price book ratio and dividend yield, which we extract using the relevant column names in the pandas dataframe, and then print out the result.

# list some example per share stats for the ticker
pe = ticker_stats['trailingPE'].values[0]
pb = ticker_stats['priceToBook'].values[0]
dy = ticker_stats['trailingAnnualDividendYield'].values[0]
print('TickerttP/EttP/BttDivY')
print('{}tt{:.2f}t{:.4f}t{:.4f}'.format(symbol, pe, pb, dy))

The output of the above code should be similar to the following (values will be based on data at time of executing the script):

Ticker    P/E     P/B      DivY
COL.AX    18.16   8.5487   0.0336

In order to get a complete list of all available statistics for the ticker, all the column names of the dataframe can be displayed using the command:

print ticker_stats.columns

So that’s all very interesting and all, but what if we want to look at the per share statistics for a range of stocks? Or maybe we want to filter the list of tickers for those that match a certain target criteria?

In the next example, building on what we have learnt above, we sift through the list of Australia’s top 200 stocks (ASX 200) and look for those stocks that meet out target criteria.

A file containing a list of the stocks in the ASX200 index can be accessed at https://www.asx200list.com. You may need to delete the first row of the downloaded CSV file, so that the first row of data contains the column name headers, in order to play nicely with pandas.

Using the ASX200 data from the CSV file, we loop through the ticker symbols and extract the data for each ticker. We use the same three target statistics as in the previous example, but are only interested in tickers with a price earnings ratio below 10, price book ratio below 1 and a dividend yield above 5%. Where the criteria are met, the result is printed to standard out.

Since we will be making multiple calls to YahooFinance when we are testing our code and as we loop through the list of tickers, we need to bear in mind that when using the public Yahoo API without authentication, we are limited to 2 000 requests per hour, or up to a total of 48 000 requests per day, based on our IP address.

First up, let’s import the necessary Python modules and read in the CSV file to a pandas dataframe. The CSV file is assumed to be in the same directory as the Python script.

import pandas as pd 
import pandas_datareader.data as data
# asx200 data from https://www.asx200list.com
# The fieldname we are interested in is "Code"
asx200 = pd.read_csv('asx200.csv')

Next, we create a simple header for the data that will be printed to standard out.

# print header
print('TickertP/EttP/BttDivY')

We then loop through the Code column in the asx200 dataframe, get the data for each ticker symbol, compare the result to our target criteria, and print the successful candidates.

# retrieve the data for each ticker symbol and analyse the result
for code in asx200['Code'].values:
	# add '.AX' as suffix to ASX200 symbol codes for YahooFinance compatibility
	ticker_stats = data.YahooQuotesReader(code + '.AX').read()
	
	try:
		pe = ticker_stats['trailingPE'].values[0]
		pb = ticker_stats['priceToBook'].values[0]
		dy = ticker_stats['trailingAnnualDividendYield'].values[0]

		# scan for P/E ratio < 10, P/B ratio < than 1, and DY > 5% 
		if pe < 10 and pb < 1 and dy > 0.05:
			print('{}\t\t{:.2f}\t{:.4f}\t{:.4f}'.format(code, pe, pb, dy))

	except KeyError:
		# ignore tickers that don't have all of the targeted ticker stats
		pass

There is quite a lot going on in the above code, so let’s dig a little deeper into the details.

When we pass the ticker symbol to YahooQuotesReader, we add a suffix '.AX', in order to be compatible with the ticker symbols used on Yahoo Finance. As before, we store the result in a dataframe, called ticker_stats.

We create variables for price earnings ratio, price book ratio and dividend yield, and assign the applicable values from ticker_stats.

Not all stocks in the ASX200 have a complete set of per share statistics data on Yahoo Finance, which will cause a KeyError when attempting to access the applicable column in ticket_stats. We therefore use try and except blocks to handle this situation, effectively ignoring these stocks.

The if statement has multiple conditions to test for the target criteria (price earnings ratio below 10, price book ratio below 1 and a dividend yield above 5%).

The for loop iterates through each stock symbol and repeats the above process on each iteration.

Successful results are printed to standard out. The output at the time of testing this script was as follows (values will be based on data at time of executing the script):

Ticker P/E  P/B  DivY
ABP  9.42 0.7029 0.0776
ANZ  7.93 0.7471 0.0999
BOQ  8.67 0.5063 0.0661
BEN  9.80 0.5122 0.1142
CGF  5.96 0.6622 0.0829
DXS  6.18 0.7711 0.0571
EHE  9.89 0.4520 0.1023
FLT  4.61 0.6552 0.1120
GEM  5.73 0.4301 0.1346
GPT  7.91 0.6363 0.0720
GOZ  5.53 0.7831 0.0815
LLC  7.59 0.8881 0.0583
MGR  7.98 0.7887 0.0596
NAB  9.59 0.8151 0.1056
NHC  7.67 0.6303 0.1053
ORG  8.42 0.6216 0.0636
SCG  9.05 0.4505 0.1124
SFR  7.93 0.9509 0.0507
SXL  2.06 0.2359 0.3478
URW  2.49 0.1430 2.8466
WBC  8.09 0.8159 0.1139
WEB  7.69 0.5450 0.0987
WHC  7.25 0.5379 0.0829

The above examples are only a small illustration of what can be achieved by using Python to analyse the financial data available from Yahoo Finance. There certainly is a lot of scope to build more complicated models for analysis.

The full code for both the first and second examples above can be found on my Github page.

Sheldon Barry

A place to organise my thoughts

Working with stock market data using Python: Part 3

Leave a Reply Cancel reply