I thought I’d take a look at team-level pass completion statistics in this season’s Premier League. Here are the numbers, considering only open-play passes made with players’ feet. The x-axis is the team’s own pass completion percentage; the y-axis shows the opposition’s.
Statistical analysis of the defensive side of football is notoriously difficult. Tackle and interception counts are famously deceptive. One of the best ways of measuring a defence is by the effect it has on the opposing attack.
One fairly simple way of doing this is to look at opponents’ pass completion percentage. The more pressure the defence exerts on the ball, the harder it is for the opposition to pass the ball successfully.
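As a rough illustration of the idea, here is a minimal Python sketch that turns pass counts into an opponents' pass completion percentage and ranks teams by it. The team names, the `PassTotals` type and all of the numbers are made up for the example; they are not real Premier League figures.

```python
from dataclasses import dataclass


@dataclass
class PassTotals:
    """Open-play passes attempted and completed AGAINST a team."""
    attempted: int
    completed: int


def completion_pct(totals: PassTotals) -> float:
    """Pass completion as a percentage; 0.0 if no passes were attempted."""
    if totals.attempted == 0:
        return 0.0
    return 100.0 * totals.completed / totals.attempted


# Invented example data: passes made by each team's opponents.
passes_against = {
    "Team A": PassTotals(attempted=400, completed=300),
    "Team B": PassTotals(attempted=500, completed=425),
}

opp_pct = {team: completion_pct(t) for team, t in passes_against.items()}

# Lower opposition completion suggests more pressure on the ball.
ranked = sorted(opp_pct, key=opp_pct.get)
for team in ranked:
    print(f"{team}: opponents complete {opp_pct[team]:.1f}% of passes")
```

Sorting in ascending order puts the teams that disrupt opposition passing the most at the top of the list.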
Previously I created an Expected Goals model based on logistic regression.
I wanted to improve this model. Rather than add new features and work out how to include them in the regression equations, I decided it would be simpler to let a machine learning algorithm do that for me, so I converted the model to use a neural network.
Previously I built a very simple expected goals model based on four buckets for shots – six yard box, penalty area, outside the area, penalties. This is an improvement on pure shot numbers but still fairly crude. Here I describe my attempts to refine the model.
Aim: To produce time series from a simple Expected Goals model to help analyse the progress of football matches.
Expected Goals is a derived statistic that estimates the number of goals a team would score on average from its opportunities. It has become so widespread it now features on Match of the Day.
Unlike observed statistics like goals or shots, it depends on a model. These models can be incredibly sophisticated – see Michael Caley’s excellent work.
A couple of years ago I tried creating my own much simpler version. You can read about it here. I was mainly investigating using the data for prediction. As on Match of the Day, expected goals are most commonly used for analysis, i.e. describing what happened after the event.
I wanted to build a slightly modified version for analysing games myself. As far as categories go, out go headers (these are now lumped together with other shots), in come penalties and shots in the 6 yard box. So now the four categories of shot used in the model are:
- shots in the 6 yard box
- shots in the rest of the penalty box
- shots outside the penalty box
- penalties
Still pretty simple – unfortunately time and a lack of finer-grained data prevent me from going much further.
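A bucket model like this can be sketched in a few lines of Python. The sketch assumes each bucket's value is simply goals divided by shots in historical data, which may not be exactly how the model here was calibrated. The bucket names, the `calibrate` and `expected_goals` helpers and all the counts are invented for illustration.

```python
def calibrate(history):
    """history maps bucket -> (shots, goals); returns bucket -> goals per shot."""
    return {bucket: goals / shots for bucket, (shots, goals) in history.items()}


def expected_goals(shots, rates):
    """Sum the per-bucket conversion rate over a list of shot buckets."""
    return sum(rates[bucket] for bucket in shots)


# Invented historical counts (shots, goals), NOT real calibration data.
history = {
    "six_yard_box": (100, 30),
    "penalty_box": (500, 60),
    "outside_box": (400, 12),
    "penalty": (20, 15),
}
rates = calibrate(history)

# One team's shots in a hypothetical match, labelled by bucket.
match_shots = ["penalty_box", "outside_box", "outside_box", "six_yard_box"]
total = expected_goals(match_shots, rates)
print(f"Expected goals: {total:.2f}")
```

Each shot contributes its bucket's historical conversion rate, so the match total is the number of goals an average team would expect from those chances.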
Expected Goals over time
However, what I have added is the ability to record how cumulative expected goals build up over the course of a match.
Here is one example from last week’s Premier League:
The graph captures the ups and downs of a roller-coaster match quite well. Liverpool were slow to start, then dominated after half-time. They probably should have sealed the win but a late surge from Watford was enough to share the points.
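For anyone curious how a series like this can be built, here is a minimal Python sketch. It assumes each shot has already been given a minute and an expected goals value; the `cumulative_xg` helper and the shot data are hypothetical and do not come from the match above.

```python
from itertools import accumulate


def cumulative_xg(shots):
    """shots: list of (minute, xg) pairs -> list of (minute, running total)."""
    ordered = sorted(shots)  # chronological order by minute
    totals = accumulate(xg for _, xg in ordered)
    return [(minute, total) for (minute, _), total in zip(ordered, totals)]


# Invented shots for one team: (minute, expected goals value of the shot).
home_shots = [(55, 0.12), (12, 0.03), (70, 0.30)]

series = cumulative_xg(home_shots)
for minute, total in series:
    print(f"{minute:>2}': {total:.2f}")
```

Plotting each team's series as a step chart against match minute gives the kind of graph described above, with steep sections where a team created a run of chances.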
I will try to publish more examples here and on my Spurs blog and use them as a tool in my own analysis of games. I hope to refine the model over time. Maybe put headers back, or split the locations into finer buckets. Unfortunately getting the data is the main barrier.
Note: the model was calibrated from Premier League data from the last 6 seasons from WhoScored.com.
Follow me on Twitter @ABPSpurs
In my last post I looked at adjusting shot ratios using some simple categories and created an improved metric that I called PGR.
But what are the current numbers for Premier League teams?