Football Reference 101 — Finding your way through a gold mine

Open your laptop, open your browser, open fbref.com

Ninad Barbadikar
7 min read · Jun 16, 2021

Hey everyone, how’s it going?

Hope all of you have been safe and well and if you’re going through something, I sincerely hope that things get better for you soon.

Today’s blog post is all about Football Reference, aka FBref.com.

For years now, Football Reference has been the no. 1 playground for people of all ages to play around with numbers in football, and rightly so. The makers of fbref.com deserve huge credit for creating such a useful resource for so many of us, whether we are just starting out or are slightly more experienced in using football data to gain insights.

For anyone just starting out in the world of football data, all the numbers can seem very exciting and the world can seem full of endless possibilities. But I think it’s important to channel all of that energy in the right direction and try to understand what exactly it is that you’re working with.

So without any further ado, let’s get straight into it. Let’s look at player shooting data, which can be found under the squad and player stats tab on any of the competition pages.

Note: I have only used Premier League data for this article.

Scroll down to the players data table and this is what you generally see.

Data for Premier League players

You’ll notice that the entire table is split into different sections, so let me try to elaborate on some of the metrics which are interesting.

Minutes played (Min)

Minutes played over the season is a very useful indicator of a player’s availability throughout the campaign, and it is also a good indicator of sample size. For example, if you’re looking to compare two strikers, it’s better to compare them when they have played roughly similar amounts of minutes. Over a large enough number of minutes, the quality of opposition each player has faced tends to even out, and the effect of outlier performances shrinks.

So it’s not a good idea to compare Dele Alli’s numbers with anyone else in the league because, for whatever reason (we know the reasons), he’s only played 620 minutes.

This is why it’s a good idea to use per90 numbers when comparing players with different amounts of minutes played. Alternatively, you could also set a threshold of matches played. Say, for example, 10 or 15 matches played over the course of a season.
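Here is a minimal Python sketch of both ideas: converting a season total to a per90 rate and dropping players below a playing-time threshold. The player names and numbers are made up for illustration, and the 900-minute cut-off (about ten full matches) is my assumption, not an FBref rule.

```python
def per90(total: float, minutes: int) -> float:
    """Scale a season total to a rate per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total * 90 / minutes


# Made-up numbers for illustration only: (name, minutes, shots)
players = [
    ("Player A", 2700, 90),
    ("Player B", 620, 31),
    ("Player C", 1800, 40),
]

MIN_MINUTES = 900  # assumed cut-off, about ten full matches

# Keep only players above the threshold, then convert to per90.
qualified = {
    name: per90(shots, minutes)
    for name, minutes, shots in players
    if minutes >= MIN_MINUTES
}

for name, rate in qualified.items():
    print(f"{name}: {rate:.2f} shots per 90")
```

Player B has the highest raw rate of the three, but with only 620 minutes the sample is too small to trust, so the filter leaves them out.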

Moving on.

Under the performance column, you have some of the more familiar figures broadly used by everyone. Goals, everybody loves them. Strikers live for them. Goalkeepers hate to concede them. And analysts love to break them down.

Goals overall or Goals per90 is usually a decent indicator for output depending on the position you’re looking at. For strikers, goals are important, and being able to deliver goals on a regular basis is even more so.

But even if you’re not scoring the goals, it’s important to try. And for that, we have:

Shots overall or shots per90

Taking shots during a game is important. As a striker, you want to consistently test the opposition keeper, and taking shots is the best way to do that. Unsettling opposition defenses, winning set-pieces, testing goalkeepers: shots do the lot.

Shots per90 is usually a good indicator of individual player tendencies. Whether a player is a high-volume shooter who places importance on getting several shots away, or a low-volume shooter who takes fewer shots but tries to be more clinical with them, shots per90 allows you to make that distinction.

Of course, the number of shots a player is able to take also depends hugely on his/her team’s ability to generate those chances for him/her.

As you can see from the graphic, Harry Kane is a good example of a high-volume striker as opposed to his partner in crime Son Heung-min, who is a low-volume shooter with greater accuracy.

Shots on Target% is the share of a player’s shots that are on target. It’s self-explanatory in some ways, but it is another useful indicator for understanding different strikers. From the above graph, you can see how some strikers take a high or low volume of shots and how many of them are on or off target.

Goals per shot is a very interesting metric to look at. Usually, the players who rank high on this metric are very efficient with their shots and are likely to be low-volume shooters. Some of the world’s best and most efficient strikers rank very highly in this regard. Comparing it with shots per90 shows, once again, the difference in the nature of strikers.

Two good examples of players doing well on this metric are Son from Spurs again and Alexandre Lacazette from Arsenal. Both are low-volume shooters compared to the rest of the league and rank fairly highly on goals per shot overall.
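To make the two ratios above concrete, here is a small Python sketch that computes shots on target % and goals per shot for a high-volume and a low-volume shooter. The season totals are invented, and the simple goals-divided-by-shots formula is my assumption about how to compute the ratio by hand rather than FBref’s exact definition.

```python
def shots_on_target_pct(shots_on_target: int, shots: int) -> float:
    """Share of shots that were on target, as a percentage."""
    return 100 * shots_on_target / shots if shots else 0.0


def goals_per_shot(goals: int, shots: int) -> float:
    """Goals scored for every shot taken."""
    return goals / shots if shots else 0.0


# Made-up season totals: (shots, shots on target, goals)
high_volume = (130, 52, 20)
low_volume = (60, 30, 15)

for label, (shots, on_target, goals) in [
    ("high-volume", high_volume),
    ("low-volume", low_volume),
]:
    print(
        f"{label}: SoT% = {shots_on_target_pct(on_target, shots):.1f}, "
        f"G/Sh = {goals_per_shot(goals, shots):.2f}"
    )
```

In this invented example the high-volume shooter scores more goals overall, but the low-volume shooter gets more out of each shot.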

Average distance (Dist) of all shots taken

The average distance of all shots taken is another key metric that is very interesting to put into the context of strikers. Ideally, you want your striker to be taking shots closer to goal. Whilst there are other metrics you can use to judge this further, the average distance of shots is a very good way of looking at the tendencies of players. It tells you whether a player enjoys shooting more often from range or prefers taking shots from within the penalty area (18-yard box). It also adds to the discussion about their efficiency.

As you can see, strikers like Edinson Cavani and Dominic Calvert-Lewin, who enjoy taking shots from good positions closer to goal, rank nicely on this metric.

This here is interesting because once again, this is a metric that allows us to differentiate between players and their shooting tendencies. The Y-axis on this one is flipped because I want the players doing well on this metric to be placed higher up on the scatter.
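If you have shot-by-shot distances, the average shot distance discussed above is just a mean. The sketch below also adds a rough share of shots taken from 18 yards or closer. The distances are invented and assumed to be in yards, and "within 18 yards of goal" is only an approximation of "inside the box", since the penalty area is a rectangle rather than a circle around the goal.

```python
from statistics import mean

BOX_DEPTH_YARDS = 18  # the penalty area extends 18 yards from the goal line


def average_distance(distances: list[float]) -> float:
    """Mean distance from goal across all shots."""
    return mean(distances)


def share_within_box_depth(distances: list[float]) -> float:
    """Fraction of shots taken from 18 yards or closer (a rough proxy)."""
    close = sum(1 for d in distances if d <= BOX_DEPTH_YARDS)
    return close / len(distances)


# Made-up shot distances in yards
poacher = [6, 8, 10, 12, 9]
long_range = [20, 25, 12, 30, 18]

for label, shots in [("poacher", poacher), ("long-range", long_range)]:
    print(
        f"{label}: avg distance = {average_distance(shots):.1f} yds, "
        f"share within 18 yds = {share_within_box_depth(shots):.0%}"
    )
```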

Expected Goals aka xG

Expected Goals is the apple of so many eyes and the scourge of many, many more. With the passage of time, xG has been welcomed more and more in the football industry, and people at the highest level in this field place a great deal of importance on this metric.

Even more important, I think, is the non-penalty Expected Goals metric (npxG). It’s a great way of understanding the quality of chances a player is able to find himself/herself in during the course of a match.

Expected goals is a measure of chance quality. Each shot is given a value between 0 and 1 that estimates how likely it is to be scored, and adding up those values gives a player’s xG tally.

Looking at the above graph, you can clearly differentiate between players who are scoring high on their npxG tally due to a high/low volume of shots.

The best strikers in the world consistently rank high on this metric because a.) they are taking their shots from very good positions, or b.) they are placed in a system where they are given the freedom to take several low-quality shots that boost their xG tally for the game and thus the season overall.

Non-penalty Expected Goals per Shot (npxG/Sh)

This is derived from the Expected Goals metric. npxG per shot is interesting because it tells you about the average quality of the shots a player takes, and that’s important to note as well.

This metric is also heavily influenced by the volume of shots taken. For example, a striker who is a high-volume shooter may not necessarily rank very high on this metric because their xG tally has come from a number of low-quality shots.
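The two routes to a high npxG tally can be shown with a short Python sketch. It assumes you have a per-shot xG value for every shot, which the FBref season tables do not give you directly. The `Shot` class, the shot values, and the 0.76 penalty xG are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    xg: float                 # estimated probability of scoring, 0 to 1
    is_penalty: bool = False


def npxg(shots: list[Shot]) -> float:
    """Sum of per-shot xG, leaving penalties out."""
    return sum(s.xg for s in shots if not s.is_penalty)


# Route a.) a few shots from very good positions
few_good = [Shot(0.30) for _ in range(10)]
# Route b.) many low-quality shots
many_poor = [Shot(0.10) for _ in range(30)]
# A penalty (assumed xG of 0.76) does not change the npxG tally
with_penalty = few_good + [Shot(0.76, is_penalty=True)]

print(f"few good shots:  {npxg(few_good):.2f} npxG from {len(few_good)} shots")
print(f"many poor shots: {npxg(many_poor):.2f} npxG from {len(many_poor)} shots")
print(f"with a penalty:  {npxg(with_penalty):.2f} npxG")
```

Both invented players end up on roughly the same npxG, which is why the total on its own cannot tell you how a striker got there.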

With npxG/Shot of 0.21, Edinson Cavani’s average shot quality is the highest among Premier League strikers with at least 10 matches played this season.
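A ranking like the one above can be sketched in a few lines of Python: filter by a minimum number of matches, then sort by npxG per shot. The players and their totals are made up, and the `Striker` tuple is a hypothetical structure, not FBref’s export format.

```python
from typing import NamedTuple


class Striker(NamedTuple):
    name: str
    matches: int
    npxg: float
    shots: int  # non-penalty shots


def npxg_per_shot(player: Striker) -> float:
    """Average quality of a player's non-penalty shots."""
    return player.npxg / player.shots if player.shots else 0.0


# Made-up season totals
strikers = [
    Striker("Player A", 30, 15.0, 120),  # high volume, lower quality per shot
    Striker("Player B", 20, 9.0, 45),    # low volume, higher quality per shot
    Striker("Player C", 4, 1.5, 5),      # tiny sample, filtered out below
]

MIN_MATCHES = 10  # same kind of threshold as mentioned in the text

ranked = sorted(
    (p for p in strikers if p.matches >= MIN_MATCHES),
    key=npxg_per_shot,
    reverse=True,
)

for p in ranked:
    print(f"{p.name}: {npxg_per_shot(p):.2f} npxG/Sh")
```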

Lastly,

Non-Penalty Goals minus Non-Penalty Expected Goals (np:G-xG)

As the name of the metric suggests, this is essentially the difference between a player’s actual non-penalty goals tally and their non-penalty expected goals tally. It’s commonly used to measure whether a player is over-performing or under-performing relative to their npxG.
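As a quick illustration, the calculation is a single subtraction. In the Python sketch below, the 0.5-goal tolerance used to call a player "roughly in line" is my own arbitrary choice, and the input numbers are invented.

```python
def np_g_minus_xg(np_goals: int, npxg: float) -> float:
    """Non-penalty goals minus non-penalty expected goals."""
    return np_goals - npxg


def describe(diff: float, tolerance: float = 0.5) -> str:
    """Label the difference; the tolerance is an arbitrary assumption."""
    if diff > tolerance:
        return "over-performing"
    if diff < -tolerance:
        return "under-performing"
    return "roughly in line"


# Made-up example: 15 non-penalty goals from 11.5 npxG
diff = np_g_minus_xg(15, 11.5)
print(f"np:G-xG = {diff:+.1f} ({describe(diff)})")
```

A positive number means the player has scored more than the chances suggested, and a negative number means fewer.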

This is a brief glance at the over/under performance of some Premier League forwards for this season

Under/overperformance on xG has been a topic discussed for a while in analytics circles. Generally, if you’re looking to explore this particular aspect of players, it’s better to do it over larger sample sizes spanning multiple seasons. Why? Because it allows us to eliminate outliers in terms of performances, smooth out the numbers, and analyze long-term trends.
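Here is a small Python sketch of why pooling seasons helps. The three seasons of numbers are invented: the single-season differences swing quite a bit, while the combined figure is much closer to zero.

```python
# Made-up seasons for one player: (non-penalty goals, npxG)
seasons = [
    (18, 13.5),
    (10, 14.0),
    (15, 14.5),
]

# Season-by-season over/under-performance
per_season = [goals - npxg for goals, npxg in seasons]

# Pool all seasons before taking the difference
total_goals = sum(goals for goals, _ in seasons)
total_npxg = sum(npxg for _, npxg in seasons)
combined = total_goals - total_npxg

for i, diff in enumerate(per_season, start=1):
    print(f"season {i}: np:G-xG = {diff:+.1f}")
print(f"all seasons: np:G-xG = {combined:+.1f} over {total_npxg:.1f} npxG")
```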

That’s about it, I think.

Understanding the difference between all of these metrics and how to use them is important. It helps you work out why a particular player is doing well on a certain metric, and whether that’s down to an individual tendency or a system-related thing. I hope I was able to help you achieve that understanding!

This brings us to the end of this piece.

Thanks for taking the time to read this piece. I have tried to translate my understanding and experience using the aforementioned metrics, and I hope that I made some sort of sense through it all.

If you enjoyed this one at all, I might explore making this a series, so I’d definitely appreciate any and all feedback I can get.

Once again, a big thanks to Fbref and Statsbomb for providing the data used in this piece.

Take care and stay safe.
