Scorlibo
25-05-2013, 11:13 AM
Stats divide the football public. When first introduced to discussions of players and teams a number of years ago, they were nothing more than a count of the number of times a player touched the ball and the number of goals they kicked - caveman stats. It became clear that some quite ordinary players could find a lot of the ball. Subsequently stats came to be viewed with scant respect.
These days there's far more to it, but most still treat stats as though they're 'double plus ungood'. I have been a supporter of the stats innovations Champion Data has brought to the public, and I want to demystify a couple of the ranking systems it uses - because Champion Data, despite its abundance of talented mathematicians, apparently spends no money on PR, leaving its supporters out in the cold to fight off the wolves.
SuperCoach Scores
I'm sure many here play SuperCoach, and probably know that its scoring system is superior to Dream Team's (though maybe only because that's the line the Herald Sun has been spouting for years now).
Before SuperCoach scores were SuperCoach scores, they were 'Official AFL Rankings'. They weren't intended to be fantasy scores, but I suppose when News Limited bought the rights to them, that's what they became. But enough backstory; here's how they work:
Around 70 stats are recorded, covering both quantitative and qualitative measures. Examples of quantitative stats are possessions, disposals, tackles, frees for, frees against, goals and marks. Qualitative stats describe the quality of those quantitative stats, e.g. contested and uncontested possessions, effective and ineffective disposals, contested and uncontested marks. The description often involves even finer classification; for example, a contested possession can be any of:
- loose ball get (ball in dispute, no immediate physical pressure)
- hard ball get (ball in dispute, immediate physical pressure)
- contested mark (mark whilst in contact with the opposition)
Uncontested possessions:
- gather (no immediate physical pressure, team has control of the ball)
- handball receive
- uncontested mark
Disposal (either effective or ineffective, additive descriptions):
- rebound 50
- inside 50
- clearance
- score assist
- goal
- and more..
Ineffective disposal:
- ineffective (disposed to a contest, no uncontested possession)
- clanger (disposed to give the opposition uncontested possession)
Also recorded are many stats pertaining to defenders, such as:
- spoils
- knock ons
- tackles
- pressure acts
- marks from opposition kicks
- rushed behinds
- smothers
Points are allocated to each statistic according to how strongly it correlates with winning games. For example, goals become the most significant statistic in this analysis because teams almost invariably win when they kick more goals than the opposition. Each stat is also tested against the other stats to eliminate biases. For example, the raw correlation between behinds and winning is very high, but once the positive relationship between goals and behinds is accounted for, behinds lose most of their value. I'm not completely sure how this is done mathematically, but anyone can understand the need to do so.
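To make the idea concrete, here's a rough Python sketch (my own illustration, not Champion Data's actual method or numbers) of how you might derive stat weights from historical games: regress the final margin on per-game stat differentials. On their own, behinds correlate strongly with winning, but once goals are in the model their weight collapses towards their true scoreboard value, which is the bias-testing described above.

```python
import numpy as np

# Hypothetical data, purely to illustrate the idea - not Champion Data's numbers.
# Each row is one historical game: stat differentials (team minus opposition) and the margin.
rng = np.random.default_rng(0)
n_games = 500
goals_diff = rng.normal(0, 4, n_games)
behinds_diff = 0.5 * goals_diff + rng.normal(0, 3, n_games)   # behinds tend to track goals
tackles_diff = rng.normal(0, 10, n_games)
margin = 6 * goals_diff + behinds_diff + rng.normal(0, 5, n_games)

# Raw correlation: on their own, behinds look strongly related to winning...
print(round(np.corrcoef(behinds_diff, margin)[0, 1], 2))

# ...but regressed alongside goals, their weight collapses, which is the
# "testing each stat against the others" idea.
X = np.column_stack([goals_diff, behinds_diff, tackles_diff])
weights, *_ = np.linalg.lstsq(X, margin, rcond=None)
print(dict(zip(["goals", "behinds", "tackles"], weights.round(2))))
```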
Each stat is given a value relative to the other stats, positive or negative, and inflated or deflated according to how many points need to be given out each game. In every game of AFL, 3,300 SuperCoach points are handed out, and points are scaled to meet that total. Points are also scaled in line with the change in the probability of winning at any stage of the game. For example, if it's the last quarter and one team is 50 points up, then it's unlikely players from either team will get many points from that quarter, because the probabilities aren't fluctuating very much. On the other hand, in a very close last quarter - as those who play SuperCoach will know - player scores can be scaled up and down by as much as 50 points, because the probabilities of each team winning and losing are changing so rapidly. I personally only see the merit in this when the margin is very large and the game time is reduced to 'junk time'.
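A minimal sketch of the scaling-to-3,300 step, assuming you already have raw weighted-stat scores for every player in the match (the function name and the raw numbers are mine, for illustration only):

```python
def scale_to_game_total(raw_scores, game_total=3300):
    """Rescale raw weighted-stat scores so the player scores in a match sum to game_total."""
    raw_sum = sum(raw_scores.values())
    return {player: raw * game_total / raw_sum for player, raw in raw_scores.items()}

# Hypothetical raw scores for three of the 44 players on the ground
raw = {"Player A": 96.0, "Player B": 81.5, "Player C": 54.0}
scaled = scale_to_game_total(raw)
print(scaled)                # every raw score multiplied by the same factor
print(sum(scaled.values()))  # 3300.0 - the fixed per-game total
```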
So SuperCoach scores essentially act as an ideal summary of a comprehensive stats sheet, and until very recently they were the best statistical rankings out there.
Official AFL Ratings
http://www.afl.com.au/stats/player-ratings/ratings-hub
These newly implemented ratings measure players in terms of scoreboard impact. This is done by assessing the change in the probability of the next score, for either team, from one situation to another.
Example 1:
Situation - centre bounce.
Team 1 goal - 30% (rough estimations of probabilities)
Team 2 goal - 30%
Team 1 behind - 20%
Team 2 behind - 20%
Average outcome for Team 1 = (0.3*6+0.2*1)-(0.3*6+0.2*1) = 0
Changed situation - Player A from Team 1 has won contested possession at the centre bounce.
Team 1 goal - 35%
Team 2 goal - 25%
Team 1 behind - 23%
Team 2 behind - 17%
Average outcome for Team 1 = (0.35*6+0.23*1)-(0.25*6+0.17*1) = 0.66
Resulting scoreboard impact for Player A = +0.66
Example 2:
Situation - Player A from Team 1 with uncontested possession on the wing, no pressure.
Team 1 goal - 50%
Team 2 goal - 10%
Team 1 behind - 35%
Team 2 behind - 5%
Average outcome for Team 1 = (0.5*6+0.35*1)-(0.1*6+0.05*1) = 2.7
Changed situation - Player B from Team 2 takes an uncontested mark from Player A's kick.
Team 1 goal - 25%
Team 2 goal - 35%
Team 1 behind - 17%
Team 2 behind - 23%
Average outcome for Team 1 = (0.25*6+0.17*1)-(0.35*6+0.23*1) = -0.66
Resulting scoreboard impact for Player A = -3.36
Example 3:
Situation - Marking contest in Team 1's forward 50, 10 metres out.
Team 1 goal - 65%
Team 2 goal - 10%
Team 1 behind - 20%
Team 2 behind - 5%
Average outcome for Team 1 = (0.65*6+0.2*1)-(0.1*6+0.05*1) = 3.45
Changed situation - Player B from Team 2 spoils the ball and rushes a behind.
Team 1 goal - 0%
Team 2 goal - 0%
Team 1 behind - 100%
Team 2 behind - 0%
Average outcome for Team 1 = (0*6+1*1)-(0*6+0*1) = 1
Resulting scoreboard impact for Player B = +2.45
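All three examples boil down to the same calculation: the expected points value of the next score for Team 1, before and after the play. Here's a minimal Python sketch that reproduces the numbers above (using the same rough probability estimates, not Champion Data's):

```python
def expected_next_score(p_goal_1, p_behind_1, p_goal_2, p_behind_2):
    """Expected points value of the next score, from Team 1's perspective."""
    return (p_goal_1 * 6 + p_behind_1 * 1) - (p_goal_2 * 6 + p_behind_2 * 1)

# Example 1: centre bounce, then Player A (Team 1) wins the contested possession
before = expected_next_score(0.30, 0.20, 0.30, 0.20)    # 0.00
after = expected_next_score(0.35, 0.23, 0.25, 0.17)     # 0.66
print(round(after - before, 2))                         # +0.66 to Player A

# Example 2: Team 1 uncontested on the wing, then Player B (Team 2) marks the kick
before = expected_next_score(0.50, 0.35, 0.10, 0.05)    # 2.70
after = expected_next_score(0.25, 0.17, 0.35, 0.23)     # -0.66
print(round(after - before, 2))                         # -3.36 against Player A

# Example 3: marking contest 10 m out, then Player B (Team 2) spoils and rushes a behind
before = expected_next_score(0.65, 0.20, 0.10, 0.05)    # 3.45
after = expected_next_score(0.00, 1.00, 0.00, 0.00)     # 1.00
print(round(before - after, 2))                         # +2.45 to Player B (he plays for Team 2, so the sign flips)
```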
You can watch a video on all the possible game situation changes here: http://www.afl.com.au/stats/player-ratings/ratings-explained
Also on that page is an in-depth PDF account of the rating system.
These probabilities come from years of recording game situations together with the next score that followed.
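Conceptually, that recording exercise amounts to grouping historical events by game situation and taking the empirical frequency of each next-score outcome. A hypothetical counting sketch (the situation and outcome labels are mine, not Champion Data's):

```python
from collections import Counter, defaultdict

def next_score_probabilities(events):
    """events: (situation, next_score) pairs, e.g. ("centre_bounce", "team_1_goal").
    Returns empirical next-score probabilities for each recorded situation."""
    counts = defaultdict(Counter)
    for situation, next_score in events:
        counts[situation][next_score] += 1
    return {situation: {outcome: n / sum(tally.values()) for outcome, n in tally.items()}
            for situation, tally in counts.items()}

# e.g. next_score_probabilities(history)["centre_bounce"]
# -> {"team_1_goal": 0.30, "team_2_goal": 0.30, "team_1_behind": 0.20, "team_2_behind": 0.20}
```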
The advantage of this system is obvious: you can see each player's real scoreboard impact on the game. The classification system in place is already quite sophisticated, recording the state of play and the degree of pressure the player is under when disposing of the ball. Players who interrupt scoring plays are probably rewarded more than those who continue them, meaning defenders get due reward for defensive acts.
The main drawback of this new system is that Champion Data doesn't have access to players' GPS data at this stage, so short of employing another 40-odd watchers at each game to record where every player is on the ground, there's no way of knowing the distribution of players. This becomes important when giving value to gut running, midfield accountability, and defenders keeping their opponent close and thus nullifying them as an option. The percentages also change, for example, between an open forward line and a crowded one, but the current system can only describe the situation around the ball.
Even so, this is a great innovation in player ratings. It's in effect the same as personally going through a game piece by piece, assigning each play a certain value and adding those values to a total, except that the system is possibly more exact in assigning those values. I have no trouble placing a lot of faith in this system, and would hold it in higher esteem than my own view of the game.
When talking of stats more generally, their role in football discussions seems misunderstood. As gogriff posted in another thread, talking about Crossy's omission:
Stats are the evidence supporting a claim. I could say to you "Cross wins more of the ball", "Cross wins more clearances" or "Cross gets more contested possessions", but unless I can support that with stats then it has no factual basis.
Stats are facts, you've just got to understand exactly what the facts mean.