pyAFL - a Python library for collecting AFL statistics 📊

By Ram Parameswaran

Tags: web data scraping, sports analytics, data analytics, Hacktoberfest, Australian rules football


Last year I participated in an inter-office AFL (Australian football) tipping competition. I have always been interested in sports analytics, and it turns out that AFL is one of the most data-rich sports in the world. So this was a perfect opportunity to practice some Python machine learning, and to hopefully win the office sweepstakes!

My machine learning algorithm used historical data on team performance (point difference) to estimate the relative "strength" of any two teams. For example, let's say I want to compare Team A and Team B this week. My algorithm analysed all recent match-ups between Team A and Team B, and recorded which team won and by how many points. It then analysed all match-ups of Team A and of Team B against a common opponent (Team X). It continued in this manner to produce a number of metrics, each of which ranked the teams in some numerical order. The machine learning algorithm then found the weighting of these ordinal metrics that most accurately predicted the "stronger" team.
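To make the idea concrete, here is a minimal sketch of the first two metrics (head-to-head margin and common-opponent margin) combined with a weighting. It is not my original code: the match results are invented toy values, the function names are hypothetical, and the weights are fixed by hand rather than learned from data as described above.

```python
from collections import defaultdict

# Toy results, invented purely for illustration:
# (home team, away team, home score, away score).
MATCHES = [
    ("A", "B", 90, 80),
    ("B", "A", 70, 75),
    ("A", "X", 100, 60),
    ("B", "X", 85, 80),
]


def margins_by_pair(matches):
    """Map (team, opponent) to the list of margins from team's point of view."""
    margins = defaultdict(list)
    for home, away, home_score, away_score in matches:
        margins[(home, away)].append(home_score - away_score)
        margins[(away, home)].append(away_score - home_score)
    return margins


def mean(values):
    """Arithmetic mean, or 0.0 when there is no data."""
    return sum(values) / len(values) if values else 0.0


def head_to_head(margins, a, b):
    """Average margin of team a over team b in their direct match-ups."""
    return mean(margins.get((a, b), []))


def common_opponent(margins, a, b):
    """Average of (a's margin vs X) minus (b's margin vs X) over shared opponents X."""
    opponents_a = {opp for (team, opp) in margins if team == a}
    opponents_b = {opp for (team, opp) in margins if team == b}
    shared = (opponents_a & opponents_b) - {a, b}
    diffs = [mean(margins[(a, x)]) - mean(margins[(b, x)]) for x in sorted(shared)]
    return mean(diffs)


def predict_stronger(margins, a, b, weights=(0.6, 0.4)):
    """Weighted sum of the two metrics; a positive score favours a, otherwise b.

    The weights here are hand-picked assumptions, not fitted values.
    """
    score = (weights[0] * head_to_head(margins, a, b)
             + weights[1] * common_opponent(margins, a, b))
    return (a if score > 0 else b), score


if __name__ == "__main__":
    table = margins_by_pair(MATCHES)
    print(predict_stronger(table, "A", "B"))
```

In a real model the weights would be fitted against past results, which is the part the machine learning step handled.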

The algorithm looked great. It was simple in concept and in execution. The only problem ... it wasn't very good at predicting the winning team... And so I finished in last place in the inter-office tipping competition that year 😅.

The problem was that my approach was too simple. The basic (only) unit of comparison was a Team. But teams are a sum of their players. And players' performances vary over time and are dependent on a number of factors: their opponents, their confidence, their performance in previous matches, their level of exhaustion. But data on individual players is big and messy.

What I needed was a convenient way to collect data in a format which Python can easily manipulate. Enter pyAFL.


pyAFL is my attempt to create a Python wrapper around a painstakingly curated dataset of AFL statistics. My aim is to make it easier for Pythonistas out there to get their hands on AFL data, and start concentrating on the analytics - and hopefully do better than me! 😃

The pyAFL API allows you to query historical data on Players and Teams. The data is scraped from the web and converted into structured Python objects and pandas DataFrames. These can easily be used in scikit-learn or any other Python analytics tool.


Request caching - All data is scraped from the web, and we don't want to make thousands of requests to the source site every time we test and run our analytics script. So instead, I've implemented request caching in pyAFL. Whenever a web request is made, the result is cached locally and retrieved from the cache on each successive request 👍.
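For readers curious about what request caching looks like, here is a minimal sketch of the general technique using only the standard library. It is not pyAFL's actual implementation: the `CachedFetcher` class, the `fetch_from_web` helper and the example URL are all hypothetical names chosen for illustration.

```python
import hashlib
import tempfile
import urllib.request
from pathlib import Path


def fetch_from_web(url):
    """Download a page over the network (the slow, expensive path)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


class CachedFetcher:
    """Return the cached copy of a page when one exists; otherwise fetch and store it."""

    def __init__(self, cache_dir, fetch=fetch_from_web):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def _path_for(self, url):
        # Hash the URL so that any URL maps to a safe file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.html"

    def get(self, url):
        path = self._path_for(url)
        if path.exists():
            # Cache hit: no network request is made.
            return path.read_text(encoding="utf-8")
        # Cache miss: fetch once, then store the body for next time.
        body = self.fetch(url)
        path.write_text(body, encoding="utf-8")
        return body


if __name__ == "__main__":
    calls = []

    def fake_fetch(url):
        # Stand-in for the network so the demo runs offline.
        calls.append(url)
        return "<html>stats</html>"

    with tempfile.TemporaryDirectory() as tmp:
        fetcher = CachedFetcher(tmp, fetch=fake_fetch)
        fetcher.get("https://example.com/stats")
        fetcher.get("https://example.com/stats")
        print(f"network calls made: {len(calls)}")
```

This sketch never expires cached pages, which suits historical results that rarely change; a cache for live data would also need an expiry rule.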

Contributions welcome - This project is open source and a great opportunity for developers participating in Hacktoberfest this year. There are a tonne of improvements to be made, so if you're interested, check out the repo. If you're new to open-source software, the project also has a crash course on how to make your first pull request!

about Ram Parameswaran

Django & React Developer, Data Geek and Maker-Enthusiast.
Ram builds web applications that help businesses operate more efficiently.