
Maths IA First Draft


Football Analysis: using expected goals (xG) to measure team performances across the top three European leagues from last season
IB Mathematics IA AI SL

Candidate Name:

Candidate Number

Table of Contents
Introduction
Brief Introduction of xG
Method of Data Collection
Research Question
Results
  The Relationship Between xG and GF
    Scatter plot of xG and GF
    Pearson Test on xG and GF
    Spearman's Ranking on xG and GF
    Summary
  The Relationship Between xG and GA
    Scatter plot of xG and GA
    Pearson Test on xG and GA
    Spearman's Ranking on xG and GA
    Summary
  The Relationship Between xG and W
    Scatter Plot of xG and W
    Pearson Test on xG and W
    Spearman's Ranking on xG and W
    Summary
  The Relationship Between xG and L
    Scatter Plot of xG and L
    Pearson Test on xG and L
    Spearman's Ranking on xG and L
    Summary
  The Relationship Between xG and Pts
    Scatter Plot of xG and Pts
    Pearson Test on xG and Pts
    Spearman's Ranking on xG and Pts
    Summary
Conclusion
Reflection
Bibliography

Brief Introduction of xG
xG is a metric created in 2012 by Opta's Sam Green to measure how good a scoring chance is.
It does this by estimating the likelihood that a player scores, based on information from
similar chances in past seasons. Before the 22/23 season, Darwin Núñez was predicted to have
an xG of 0.72 (Sky Sports Premier League 2023), because of his good performance in 21/22,
when he had an xG of 0.87 (FootyStats 2024). Erling Haaland, by contrast, was predicted to
have an xG of 0.70 for 22/23, meaning he was predicted to score less than Núñez (Sky Sports
Premier League 2023). However, by the end of the 22/23 season Haaland had an xG of 0.82,
whereas Núñez had 0.44 (FootyStats 2024). This implied that Haaland had exceeded the average
number of goals he was expected to score, whereas Núñez had failed to meet the prediction
for 22/23 (Sky Sports Premier League 2023). This shows that expected goals may not always be
a reliable indicator, because external factors affect team performance in a match.

Expected goals (xG) has been questioned as a team performance indicator, in particular for
how accurately it shows how teams will perform within their respective leagues. On one hand,
xG can give moderately accurate predictions of match-day performance. For instance, it can
be 66% accurate for home results and 58% accurate for away results, which suggests some
reliability in predicting team performance (Football XG 2024). However, xG has trouble
staying accurate on a consistent basis. For example, during the 19/20 season the xG results
suggested that Manchester City would win the Premier League by 13 points. Despite this,
Liverpool won the Premier League title, in spite of being predicted from the last two
seasons to score fewer than 39 points (Macinnes 2020). This shows the limitations of xG as a
reliable measure of team performance: it cannot account for other factors, such as the
quality of the players and the tactical side of the team.

Aim: To investigate the correlations between xG and the five statistics of team performance.

Introduction
Team performance is currently measured using five statistics: GF (Goals For), GA (Goals
Against), W (Wins), L (Losses) and Pts (Points). Goals For is the number of goals a team
scores against its opponents in a season. Goals Against is the number of goals a team
concedes to its opponents in a season. Wins is the number of matches a team wins in a
season, and Losses is the number of matches it loses. Lastly, Points are the values a team
accumulates that determine its league placement in a season, where a win = 3 points, a
draw = 1 point and a loss = 0 points. xG only became mainstream in 2017 (Willams 2020).

The relationship between each statistic and xG will be measured using the Pearson
correlation coefficient and Spearman's rank correlation. The results of the tests for each
factor of team performance will then be compared. The data is presented as scatter plots.
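The points rule described above can be sketched in a few lines of Python. The function name `league_points` and the example record are hypothetical and are not taken from the data set; this is only an illustration of the 3-1-0 system.

```python
def league_points(wins: int, draws: int, losses: int) -> int:
    """Points under the 3-1-0 system: win = 3, draw = 1, loss = 0."""
    return 3 * wins + 1 * draws + 0 * losses


# Hypothetical 38-match record, for illustration only.
print(league_points(wins=20, draws=10, losses=8))  # 70
```

Because losses contribute nothing, two teams with the same number of wins can still be separated by their draws.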

I chose this topic because I am passionate about football and interested in whether xG can
give reliable, accurate predictions that football fans can use to assess how well their
teams will do in their respective leagues. Furthermore, I am interested in unpacking the xG
debate, because xG is controversial: it is criticised as an unreliable indicator of team
performance, yet it is used as a marker despite its inaccuracies. Hence, my maths IA looks
to settle the debate by examining the validity of xG in football analysis.

Method of Data Collection


I collected the data from FBref.com, from which I selected the league tables listed below.
This source is reliable because its data comes from Opta, a company whose football data is
accurate, reliable and trusted worldwide (source):

• 2022-2023 La Liga (FBref.com, 2023a)

• 2022-2023 Premier League (FBref.com, 2023b)

• 2022-2023 Serie A (FBref.com, 2023c)

I chose xG, GF, GA, W, L and Pts because they are obvious determiners of team performance.
The data was collated into an Excel spreadsheet. I compared xG with each of GF, GA, W, L and
Pts, since these factors relate to team performance and can also influence a team's league
standing. Comparing them with xG therefore helps to show whether these factors are good for
determining team performance.

Scatter plot graph creation

Step 1: Insert a scatter plot in Excel.
Step 2: Select the data needed for the scatter plot.
Step 3: Generate the data in the scatter plot.

Pearson Correlation Coefficient Calculation

Step 1: Input the data sets for xG and the indicators of team performance into the GDC.
Step 2: Label the x axis and y axis to do a linear regression stat calculation on the scatter graph.
Step 3: Go to menu --> stat calculation --> linear regression (mx+b).

Spearman's Rank Correlation Calculation

Step 1: Sort the data for xG and the team performance indicators from smallest to biggest.
Step 2: Input the data once the sorting of xG and the team performance indicators is done.
Step 3: Go to menu --> stat calculation --> linear regression (mx+b).
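
The two calculations above can also be sketched without a GDC. The Python below is a minimal illustration, not the procedure actually used in this IA: the function names and the toy data are hypothetical, and it assumes that each team's xG rank stays paired with that same team's rank for the other statistic, which is how Spearman's rank correlation is normally defined.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from smallest (1) to largest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the paired ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Toy data for five hypothetical teams (not from the FBref data set).
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 1, 4, 3, 5]
print(round(spearman_rho(toy_x, toy_y), 3))  # 0.8
```

Ranking each list separately but keeping the pairs in team order is the design choice that matters here: if both lists were sorted independently before the regression, almost any two data sets would line up and give a value close to 1.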

Research Question
Does xG determine team performance within the five factors that influence it?

Hypothesis: xG is a key indicator for all five statistics in determining team performance.

Results
The Relationship Between xG and GF

Scatter plot of xG and GF

The graph looks like it shows a strong positive correlation.

Pearson Test on xG and GF:


Pearson Test Results
RegEqn: m*x + b
m = 0.720945
b = 14.144
r = 0.9095
r² = 0.8272
Linear equation: y = 0.7209x + 14.144

The results suggest that xG and GF have a strong positive linear relationship (r = 0.9095).
This suggests that xG increases when GF increases. Furthermore, the r² value (0.8272) shows
that the line accounts for about 83% of the variation in the data.
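
As a worked check, the r² value follows directly from the r value in the GDC output above. Reading r² as the share of the variation accounted for by the line is the usual interpretation and is assumed here rather than stated in the source.

```latex
r^2 = (0.9095)^2 \approx 0.8272
\quad\Rightarrow\quad \text{about } 83\% \text{ of the variation in the data}
```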

Spearmans Ranking on xG and GF:


Spearman Rank Results
RegEqn: m*x + b
m = 0.9965
b = 0.1722
r = 0.9991
r² = 0.9983
Linear equation: y = 0.9965x + 0.1722


This indicates that xG and GF have a strong positive linear relationship, as the r value
(0.9991) is close to 1. This implies that when xG increases, GF also increases. Further, the
r² value shows a 99% fit in the data, strongly supporting this answer.
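
For reference, Spearman's rank correlation is commonly written with the formula below, which holds when there are no tied ranks. Here d_i is the difference between a team's xG rank and its GF rank and n is the number of teams; these symbols are introduced for illustration, and the source does not confirm that the GDC procedure above is equivalent to this formula.

```latex
r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^{2}}{n\,(n^{2} - 1)}
```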

Summary
The relationship between xG and GF suggests a strong positive correlation. This shows that
xG is an accurate determiner of team performance, as a higher xG means more goals scored,
which indicates better team performance.

Looking at the top three leagues in Europe, there seems to be a correlation between xG and
goals scored, as well as team performance, but there are some exceptions.

Ranking | xG | GF | League Standing
1 | Inter (68.0) | Napoli (77) | Napoli
2 | Napoli (64.7) | Inter (71) | Lazio
3 | Milan (58.8) | Atalanta (66) | Inter
Table 1: Comparison of the Top Three rankings of xG, GF and League Standing in Serie A

Of the three teams with the highest xG, Inter and Napoli scored the most goals, and they
finished third and first in the league respectively. This suggests some correlation, with xG
a good determiner of GF and therefore of team performance. However, Milan, despite being in
the top three for xG, are not in the top three for league standing or GF, which suggests
that other factors may affect team performance.

Ranking | xG | GF | League Standing
1 | Barcelona (75.5) | Real Madrid (75) | Barcelona
2 | Real Madrid (75.5) | Barcelona (70) | Real Madrid
3 | Atlético Madrid (61.9) | Atlético Madrid (70) | Atlético Madrid
Table 2: Comparison of the Top Three rankings of xG, GF and League Standing in La Liga

The results in La Liga indicate that xG is a good determiner of GF and therefore of team
performance. The three teams with the highest xG were also the top three in league standing
and GF. The La Liga results therefore strongly support the hypothesis that xG determines
team performance, based on GF.

Ranking | xG | GF | League Standing
1 | Manchester City (78.6) | Manchester City (94) | Manchester City
2 | Brighton (73.3) | Arsenal (88) | Arsenal
3 | Newcastle Utd (71.9) | Liverpool (75) | Manchester Utd
Table 3: Comparison of the Top Three rankings of xG, GF and League Standing in the Premier League

The results in the EPL indicate a low correlation. Of the three teams with the highest xG,
only Manchester City placed within the top three for GF and league standing. In City's case
xG does predict team performance: the higher the xG, the more goals scored and the higher
the league standing. However, Brighton and Newcastle Utd, despite being in the top three
for xG, are not in the top three for league standing or GF, indicating that xG is not an
accurate measure of team performance.

Overall, xG and GF seem to have played a bigger role in determining team performance in La
Liga, where the more goals a team scores, the more likely it is to perform well. This is
seen in Barcelona and Atlético Madrid being joint second for most goals scored, which
allowed them to finish within the top three in La Liga. This could imply that La Liga teams
are more expansive in their playstyle, with an emphasis on attacking tactics to improve
their league performance. However, for the Premier League and Serie A, the number of goals
a team scores doesn't seem to correlate strongly with its league standing, implying that
factors other than GF influence team performance.
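
The top-three comparisons in Tables 1 to 3 can be summarised by counting how many of each league's top three xG teams also finished in the top three of the league. The sketch below uses only the team names from those tables; the variable and function names are hypothetical.

```python
# Top three teams by xG in each league (Tables 1 to 3).
top3_xg = {
    "Serie A": {"Inter", "Napoli", "Milan"},
    "La Liga": {"Barcelona", "Real Madrid", "Atlético Madrid"},
    "Premier League": {"Manchester City", "Brighton", "Newcastle Utd"},
}

# Top three teams by league standing in each league (Tables 1 to 3).
top3_standing = {
    "Serie A": {"Napoli", "Lazio", "Inter"},
    "La Liga": {"Barcelona", "Real Madrid", "Atlético Madrid"},
    "Premier League": {"Manchester City", "Arsenal", "Manchester Utd"},
}


def overlap_counts(first, second):
    """Number of teams that appear in both top-three sets, per league."""
    return {league: len(first[league] & second[league]) for league in first}


print(overlap_counts(top3_xg, top3_standing))
# {'Serie A': 2, 'La Liga': 3, 'Premier League': 1}
```

An overlap of 3 means xG picked out the whole top three, while an overlap of 1 means it identified only the champion.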

The Relationship Between xG and GA

Scatter plot of xG and GA

The graph looks like it shows a moderate negative correlation.

Pearson Test on xG and GA:


Pearson Test Results
RegEqn: m*x + b
m = -0.5675
b = 78.969
r = -0.5940
r² = 0.3528
Linear equation: y = -0.5675x + 78.969

The r value (-0.5940) shows a moderate negative linear relationship between xG and GA,
suggesting that as xG increases, GA decreases. Furthermore, the r² value (0.3528) shows a
fit of about 35%, suggesting that the data only moderately supports the answer that xG
affects GA.
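
To illustrate what a moderate fit means in practice, the regression line can be evaluated for a single team. The sketch below assumes that x is GA and y is xG, which is inferred from the size of the intercept and is not stated in the source; the function name `predict_xg` is hypothetical. Sampdoria's values (GA 71, xG 34.1) are those given in Table 4.

```python
def predict_xg(goals_against: float,
               slope: float = -0.5675,
               intercept: float = 78.969) -> float:
    """xG predicted by the fitted line y = m*x + b (assumes x = GA, y = xG)."""
    return slope * goals_against + intercept


# Sampdoria conceded 71 goals and had an xG of 34.1 (Table 4).
predicted = predict_xg(71)
residual = 34.1 - predicted
print(round(predicted, 2))  # about 38.68
print(round(residual, 2))   # about -4.58, i.e. the actual xG is below the line
```

A gap of several expected goals for a single team is consistent with an r² of only about 0.35.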

Spearman's Ranking on xG and GA

Spearman Rank Results
RegEqn: m*x + b
m = 0.999
b = 0.00212
r = 0.9991
r² = 0.997
Linear equation: y = 0.999x + 0.00212

The r value (0.9991) implies a strong positive linear relationship between the ranks of xG
and GA. This implies that when xG is high, the GA rank will increase, which means that
goals against will decrease. Furthermore, the r² value (0.997) suggests that the data
strongly supports this answer, with a fit of 99%.

Summary
The relationship between xG and GA shows a moderate negative correlation. This implies that
xG and GA have some relationship with each other, but that xG is not a strong determiner of
team performance with regard to GA.

There seems to be a negative correlation, suggesting that a higher xG means a lower GA.
However, there may be exceptions when looking at the top three leagues in Europe.

Ranking | xG | GA | League Standing
1 | Sampdoria (34.1) | Sampdoria (71) | Sampdoria
2 | Hellas Verona (35.8) | Cremonese (69) | Cremonese
3 | Lecce (36.1) | Salernitana (62) | Hellas Verona
Table 4: Comparison of the Top Three Lowest Ranking Teams in terms of xG, GA and League Standing in Serie A

There seems to be a negative correlation between xG and GA in representing team
performance. This is shown by Lecce, who have the highest xG of the three lowest-ranked
teams for xG and are not among the three teams with the most goals against. This suggests
that a higher xG means a lower GA, translating to better team performance. It therefore
shows that xG is a determiner of team performance for GA, supporting hypothesis 4.

Ranking | xG | GA | League Standing
1 | Mallorca (35.2) | Espanyol (69) | Elche
2 | Getafe (36.7) | Elche (67) | Espanyol
3 | Elche (37.5) | Almería (65) | Valladolid
Table 5: Comparison of the Top Three Lowest Ranking Teams in terms of xG, GA and League Standing in La Liga

The findings suggest little correlation between xG and GA in determining team performance
in La Liga. This is shown by Elche, which had the highest xG of the three lowest-xG teams,
yet ended up with the second-highest GA and finished within the bottom three. This could
imply that xG is a flawed indicator of team performance with regard to GA in La Liga.
Consequently, it suggests that xG is of limited use in real-life situations for accurately
measuring team achievement.

Ranking | xG | GA | League Standing
1 | Wolves (36.8) | Leeds United (78) | Leicester City
2 | Southampton (37.8) | Southampton (73) | Leeds United
3 | Bournemouth (38.5) | Bournemouth (71) | Southampton
Table 6: Comparison of the Top Three Lowest Ranking Teams in terms of xG, GA and League Standing in the Premier League

The results show some negative correlation between xG and GA, suggesting that they are
somewhat linked in influencing team performance. This is evident in Southampton, which had
the second-lowest xG, finished within the bottom three and had the second-highest GA.
However, there are outliers: Wolves had the lowest xG, yet they neither had the highest GA
nor finished within the bottom three. This suggests that xG is unable to accurately assess
team performance in terms of GA. It thus rejects the idea that having the highest xG will
give a team the lowest GA, which would allow for better team performance and league
standing.

Overall, this implies that GA and xG play somewhat of a role in measuring team performance.
In Serie A, Lecce had the highest xG of the three lowest-ranked teams and were not among the
three teams with the highest GA. This could suggest the importance of defensive tactics
within the league for maintaining team performance. However, in La Liga and the Premier
League, GA and xG don't seem to have played much of a role in judging team performance. The
data shows little correlation in La Liga and only some negative correlation in the Premier
League, suggesting the importance of factors other than xG and GA in determining team
performance.

The Relationship Between xG and W

Scatter Plot of xG and W

The relationship between xG and W looks like a moderate positive one. This indicates that
xG somewhat determines the number of wins a team gets, and therefore team performance.

Pearson Test on xG and W:


Pearson Test Results
RegEqn: m*x + b
m = 1.630
b = 26.993
r = 0.8404
r² = 0.7063
Linear equation: y = 1.63x + 26.99

The r value (0.84) suggests a strong positive linear relationship between xG and wins. This
means that the higher the xG, the more wins a team gets, which affects team performance.
The r² value (0.71) suggests that the data gathered supports this.

Spearmans Ranking on xG and W


Spearman Rank Results
RegEqn: m*x + b
m = 0.999
b = 0.00212
r = 0.9991
r² = 0.997
Linear equation: y = 1.00623x ± 0.0557

The r value (0.996) implies a strong positive linear relationship between xG and wins. This
could suggest that the higher the xG, the better a team's chances of winning its matches.
Furthermore, the r² value implies that the data strongly supports this answer, with a fit
of 99%.

Summary:
The relationship between xG and W shows a moderate positive correlation. This indicates
that xG somewhat determines the number of wins a team gets, implying that expected goals
can determine team performance based on wins.

Ranking | xG | W | League Standing
1 | Inter (68.0) | Napoli (28) | Napoli
2 | Napoli (64.7) | Inter (23) | Lazio
3 | Milan (58.8) | Lazio (22) | Inter
Table 7: Comparison of the Top Three rankings of xG, W and League Standing in Serie A

This shows a positive correlation, suggesting that xG and the number of wins a team gets
can determine team performance. This is shown by Inter and Napoli having the highest xG and
being in the top three for number of wins. However, Milan is an outlier, as it is in the
top three neither for league standing nor for wins. This suggests that other factors, such
as draws, could have influenced a team's league standing by a one-point difference, which
could have affected team performance as measured by league standing.

Ranking | xG | W | League Standing
1 | Barcelona (75.5) | Barcelona (28) | Barcelona
2 | Real Madrid (75.5) | Real Madrid (24) | Real Madrid
3 | Atlético Madrid (61.9) | Atlético Madrid (23) | Atlético Madrid
Table 8: Comparison of the Top Three rankings of xG, W and League Standing in La Liga

This shows a positive correlation, suggesting that xG can predict how well a team will do
based on its number of wins. This is exemplified by Atlético Madrid, Real Madrid and
Barcelona placing within the top three for xG, wins and league standing. This implies that
xG is an accurate measure of team performance, as the higher the xG, the more wins a team
gets. Consequently, it supports the hypothesis that xG can predict team performance based
on the number of wins in La Liga, which allowed these teams to finish in the top three.

Ranking | xG | W | League Standing
1 | Manchester City (78.6) | Manchester City (28) | Manchester City
2 | Brighton (73.3) | Arsenal (26) | Arsenal
3 | Newcastle Utd (71.9) | Manchester Utd (23) | Manchester Utd
Table 9: Comparison of the Top Three rankings of xG, W and League Standing in the Premier League

This shows some positive correlation between xG and the number of wins in determining
league standing. For example, Brighton had the second-highest xG, yet they were not second
in the league or in wins. However, Manchester City had the highest xG and the most wins,
and finished 1st in the league. This could suggest that xG can be inconsistent as a measure
of team performance, which can make it unreliable. As a result, xG becomes difficult to
apply to real life, as its unpredictability can lead to overestimations or underestimations
of a team's performance in a football season.

Overall, Serie A and the Premier League suggest that winning is a more important factor
than xG, as factors such as draws can influence league standings by one point. However, La
Liga seems to be the outlier, where xG seems to be more important than wins for placing
higher in the league standing. This suggests that, across the top three European leagues,
wins are mostly a key indicator of team performance and of how well teams do in the league
standings.

The Relationship Between xG and L

Scatter Plot of xG and L

The graph below shows a negative correlation, implying that the higher the xG, the lower
the number of losses a team gets.

Pearson Test on XG and L:


Pearson Test Results
RegEqn: m*x + b
m = -1.853
b = 77.165
r = -0.7835
r² = 0.6139
Linear equation: y = -1.853x + 77.165

The r value (-0.7835) shows a strong negative linear relationship between xG and losses
(L). This could imply that xG is a good determiner of the number of losses a team suffers.
The r² value (0.6139) suggests that the data supports this answer, with a fit of about 61%,
and that a higher xG means a lower number of losses.

Spearmans Ranking on xG and L


Spearman Rank Results
RegEqn: m*x + b
m = 0.993
b = 0.06809
r = 0.997
r² = 0.994
Linear equation: y = 0.993x + 0.6809

The r value (0.997) of the ranked data set suggests a strong linear relationship between
the ranks, which is read here as negative: with an increase in xG, there would be a
decrease in the number of losses a team faces. Further, the r² value suggests that the data
set strongly supports this answer, as it has a 99% fit.

Summary
The relationship between xG and L (the losses a football team suffers during a season)
shows a strong negative correlation. This suggests that xG is a reliable indicator of the
number of losses a team suffers during a football season.

Ranking | xG | L | League Standing
1 | Sampdoria (34.1) | Sampdoria (25) | Sampdoria
2 | Hellas Verona (35.8) | Hellas Verona (21) | Cremonese
3 | Lecce (36.1) | Cremonese (21) | Hellas Verona
Table 10: Comparison of the Top Three Bottom Teams in terms of rankings based on xG, L and League Standing in Serie A

There seems to be a negative correlation, suggesting that xG does determine team
performance in terms of the number of losses a team gets. For example, Sampdoria, ranked
lowest for xG, has the highest number of losses and is one of the bottom-performing teams.
This suggests that the lower the xG, the more losses a team gets, meaning that it is likely
to perform badly in Serie A.

Ranking | xG | L | League Standing
1 | Mallorca (35.2) | Elche (23) | Elche
2 | Getafe (36.7) | Valladolid (20) | Espanyol
3 | Elche (37.5) | Almería (19) | Valladolid
Table 11: Comparison of the Top Three Bottom Teams in terms of rankings based on xG, L and League Standing in La Liga

The results show some correlation between xG and L, but one that doesn't reflect team
performance. This is shown by Elche, which had the highest xG of the three lowest-xG teams;
despite this, it was placed last in La Liga and had the most losses. This implies that a
higher xG does not guarantee that a team does better in the league standing through a lower
loss rate, leading to hypothesis 2 being rejected.

Ranking | xG | L | League Standing
1 | Wolves (36.8) | Southampton (25) | Leicester City
2 | Southampton (37.8) | Leicester City (22) | Leeds United
3 | Bournemouth (38.5) | Bournemouth (21) | Southampton
Table 12: Comparison of the Top Three Bottom Teams in terms of rankings based on xG, L and League Standing in the Premier League

The findings suggest little correlation between xG and the number of losses. This is
exemplified by Bournemouth, which had the highest xG of the three lowest-xG teams; it was
in the top three for most losses but did not finish in the bottom three. Meanwhile, Leeds
United and Leicester City finished in the bottom three despite not being among the three
lowest teams for xG, and Leeds was not among the three teams with the most losses either.
This shows a lack of transferability of xG across different situations, as xG is unable to
determine team performance in the Premier League, thus rejecting hypothesis 3.

Overall, xG and L do seem to play a role in determining team performance, depending on the
context of each of the top three European leagues. Serie A indicates that losses play a
role in team performance: the higher the xG, the more likely the team is to perform better,
which translates to fewer losses. Despite this, the Premier League and La Liga suggest that
xG and L do not play a role in deciding team performance, which points to the limitations
of these measures. For instance, in La Liga losses appear insignificant because other
factors, such as wins and draws, could have played a role in influencing team performance
and league standing.

The Relationship Between xG and Pts

Scatter Plot of xG and Pts

The graph shows a moderate positive correlation between xG and Pts. This implies that the
higher the xG, the more points a team gets, so xG can influence how many points a football
team earns, which determines its success in its national league.

Pearson Test on xG and Pts:


Pearson Test Results
RegEqn: m*x + b
m = 0.5961
b = 19.329
r = 0.8430
r² = 0.7106
Linear equation: y = 0.5961x + 19.329

The r value (0.84) shows a strong positive linear relationship between xG and Pts. This
could suggest that xG can influence the number of points teams get in a season. Further,
the r² value of 0.71 suggests that the data supports this answer.

Spearmans Ranking on xG and Pts


Spearman Rank Results
RegEqn: m*x + b
m = 0.999194
b = 0.0024579
r = 0.999555
r² = 0.999111
Linear equation: y = 0.99194x + 0.0024579

The r value (0.99) implies a strong positive linear relationship, where an increase in xG
means an increase in points. Further, the r² value of 0.999 suggests that the data set
strongly supports this answer.

Summary
The relationship between expected goals and Pts shows a moderate positive correlation. This
implies that xG influences how many points a football team gets, which determines team
performance in terms of relegation, Champions League places and winning the league title.

Ranking | xG | Pts | League Standing
1 | Inter (68.0) | Napoli (90) | Napoli
2 | Napoli (64.7) | Lazio (74) | Lazio
3 | Milan (58.8) | Inter (72) | Inter
Table 13: Comparison of the Top Three rankings based on xG, Pts and League Standing in Serie A

This indicates a moderate positive correlation, with xG a determiner of team performance
based on Pts. This is shown by Inter and Napoli, two of the teams with the highest xG,
placing in the top three for points and league standing. The exception is Milan, which,
despite being within the top three for xG, is in the top three neither for points nor for
league standing. This leads to hypothesis 3 being accepted, meaning that having more points
leads to a better league standing.

Ranking | xG | Pts | League Standing
1 | Barcelona (75.5) | Barcelona (88) | Barcelona
2 | Real Madrid (75.5) | Real Madrid (78) | Real Madrid
3 | Atlético Madrid (61.9) | Atlético Madrid (77) | Atlético Madrid
Table 14: Comparison of the Top Three rankings based on xG, Pts and League Standing in La Liga

The results indicate a strong positive correlation between xG and Pts. Barcelona, Real
Madrid and Atlético Madrid placed within the top three for xG, and they also earned the
most points and finished in the top three of the league standing. This supports hypothesis
3, that xG is a key indicator of the points earned as a measure of team performance.

Ranking | xG | Pts | League Standing
1 | Manchester City (78.6) | Manchester City (89) | Manchester City
2 | Brighton (73.3) | Arsenal (84) | Arsenal
3 | Newcastle Utd (71.9) | Manchester Utd (75) | Manchester Utd
Table 15: Comparison of the Top Three rankings based on xG, Pts and League Standing in the Premier League

This shows a weak positive correlation, as xG is only partly able to indicate the number of
points a team earns, and so its team performance. This is evident in Newcastle and
Brighton, which, despite having among the highest xG, are in the top three neither for
points nor for league standing. However, there is an exception where xG does determine team
performance: Manchester City placed first for xG, first for points and 1st in the league
standing. This signifies that xG can be an indicator of team performance based on Pts
earned, but with anomalies that could undermine it as a measure.

Overall, xG and Pts can play a role in judging team performance. La Liga in particular shows a strong positive correlation, which suggests that xG and Pts are both accurate measures of team performance in that league. However, the Premier League and Serie A do not show correlations strong enough to suggest that xG determines team performance. In these leagues Pts appears to be the sole determiner, as exemplified by Napoli and Manchester City, whose league-high points totals placed them in the top three of their league standings. This implies that the relationship between xG and Pts lacks generalisability and validity as a way of determining team performance across the top three European leagues.
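
The top-three comparison in Tables 13 to 15 can be checked with a short script. The sketch below is illustrative only: it uses just the team names listed in those tables, and the names `top_three` and `overlap` are introduced here for the example rather than taken from the investigation itself.

```python
# Top-three teams by xG and by Pts, copied from Tables 13 to 15.
top_three = {
    "Serie A": {
        "xG": ["Inter", "Napoli", "Milan"],
        "Pts": ["Napoli", "Lazio", "Inter"],
    },
    "La Liga": {
        "xG": ["Barcelona", "Real Madrid", "Atlético Madrid"],
        "Pts": ["Barcelona", "Real Madrid", "Atlético Madrid"],
    },
    "Premier League": {
        "xG": ["Manchester City", "Brighton", "Newcastle Utd"],
        "Pts": ["Manchester City", "Arsenal", "Manchester Utd"],
    },
}


def overlap(first, second):
    """Count the teams that appear in both top-three lists."""
    return len(set(first) & set(second))


# Number of xG top-three teams that are also in the Pts top three.
overlaps = {
    league: overlap(ranks["xG"], ranks["Pts"])
    for league, ranks in top_three.items()
}

for league, shared in overlaps.items():
    print(f"{league}: {shared} of 3 teams shared between xG and Pts")
```

Counting shared teams in this way gives 2 of 3 for Serie A, 3 of 3 for La Liga and 1 of 3 for the Premier League.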

Conclusion
In conclusion, the results suggest that xG can determine team performance, but with varying results. For instance, xG works well in La Liga, where it is associated with team performance based on GF, W and Pts. This implies that La Liga teams depend on more expansive playstyles for a better league standing and team performance. However, xG cannot be generalised as a measure of team performance across the top three European leagues. In Serie A, xG predicted only 2/3 of the top teams for W, Pts and GF. This breaks the stereotype of Serie A's dependence on defending and goalkeeping, and indicates that there is an attacking component to assessing team performance. Similarly, in the Premier League xG predicted only 1/3 of the top teams in these three factors, implying a more well-rounded approach to determining team performance that emphasises the need for both a better attack and a better defence.

Reflection
Spearman's rank correlation was used to investigate whether xG showed a positive or negative correlation with the team performance indicators. Pearson's test was used to identify the relationship between xG and those indicators, in order to recognise whether xG can determine team performance.

However, Spearman's ranking can be unreliable because it only captures monotonic relationships, so it is not suitable for data in which the relationship changes direction. Hence, the results gathered from Spearman's ranking may not be fully reliable. Furthermore, a limitation of the Pearson coefficient is the possibility of a so-called spurious correlation, which can make two factors look related when they are not and so make the findings unreliable (Ghouse et al. 2024).
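
As a rough illustration of how the two coefficients are computed, the sketch below implements both in plain Python. It is a minimal example under stated assumptions, not the calculation used in this investigation: the six (xG, Pts) pairs are only those teams for which both values happen to appear in Tables 13 to 15, so the printed numbers will not match the coefficients reported for the full leagues. The function names are introduced here for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Barcelona, Real Madrid, Atlético Madrid, Inter, Napoli, Manchester City
# (illustrative subset taken from Tables 13 to 15, not the full dataset).
xg = [75.5, 75.5, 61.9, 68.0, 64.7, 78.6]
pts = [88, 78, 77, 72, 90, 89]

print("Pearson r:", round(pearson(xg, pts), 3))
print("Spearman r_s:", round(spearman(xg, pts), 3))
```

Because Spearman's coefficient is computed on ranks, the two tied xG values (75.5) are given the same average rank before the correlation is taken.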

Moreover, to improve the accuracy of my investigation, more seasons and leagues are needed to accurately assess team performance based on xG. This could be done by including two more seasons and additional leagues such as the Bundesliga and Ligue 1. Another way to improve the study would be to avoid data affected by external issues. For example, COVID-19 led to football games being suspended on 13th March 2020, which would make team performance data from that season unreliable for this IA (Premier League 2020). Lastly, an alternative statistical test such as ANOVA would allow hypothesis testing between different group means, to determine whether there is a significant difference in how well xG determines a specific team performance indicator in a specific league (Bevans 2024).
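
To show what such a test would involve, the sketch below computes the F-statistic of a one-way ANOVA in plain Python. It is an illustration only: the three groups are the top-three xG values from Tables 13 to 15, which is far too little data for a meaningful test, and the function name `one_way_anova_f` is introduced here for the example. A full analysis would use every team in each league and compare the F-statistic with a critical value or p-value.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F-statistic for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation between the group means, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the values inside each group around that group's mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Top-three xG values per league from Tables 13 to 15 (illustrative subset).
serie_a = [68.0, 64.7, 58.8]
la_liga = [75.5, 75.5, 61.9]
premier_league = [78.6, 73.3, 71.9]

f_stat = one_way_anova_f([serie_a, la_liga, premier_league])
print("F-statistic:", round(f_stat, 3))
```

A larger F-statistic means the league means differ more relative to the spread within each league. The null hypothesis is that all league means are equal.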

Bibliography
Bevans, R. (2024). One-way Anova test | when and how to use it (with examples) [online].
Available from: https://www.scribbr.com/statistics/one-way-anova/#:~:text=The%20null%20hypothesis%20(H0,use%20a%20t%20test%20instead
[accessed 26 July 2024].

FBref.com. (2023a). 2022-2023 La Liga stats [online]. Available from:
https://fbref.com/en/comps/12/2022-2023/2022-2023-La-Liga-Stats [accessed 16 April 2024].

FBref.com. (2023b). 2022-2023 Premier League stats [online]. Available from:
https://fbref.com/en/comps/9/2022-2023/2022-2023-Premier-League-Stats#all_league_structure [accessed 16 April 2024].

FBref.com. (2023c). 2022-2023 Serie A stats [online]. Available from:
https://fbref.com/en/comps/11/2022-2023/2022-2023-Serie-A-Stats [accessed 16 April 2024].

Footy Stats. (2024). Darwin Nunez stats – Goals, xG, assists & career stats | FootyStats
[online]. Available from: https://footystats.org/players/uruguay/darwin-nunez [accessed
21 July 2024].

Footy Stats. (2024). Erling Haaland stats – Goals, xG, assists & career stats | FootyStats
[online]. Available from: https://footystats.org/players/norway/erling-haaland [accessed
21 July 2024].

Football XG. (2024). What are expected Goals (xG)? [online]. Available from:
https://footballxg.com/what_are_expected_goals/#:~:text=23%2B00%3A00-,So%20how%20much%20better%20is%20expected%20goals%3F,worse%20on%20the%20home%20results
[accessed 21 July 2024].

Ghouse, G., Rehman, A.U. & Bhatti, M.I. (2024). Understanding of causes of spurious
associations: Problems and prospects. J Stat Theory Appl 23, 44–66.
https://doi.org/10.1007/s44199-024-00072-0

Macinnes, P. (2020). ‘It is beyond the model’: Have Liverpool exposed the limits of
xG? [online]. Available from: https://www.theguardian.com/football/2020/aug/09/liverpool-xg-jurgen-klopp
[accessed 21 July 2024].

Premier League. (2020). How has the COVID-19 Pandemic affected premier league
matches? [online]. Available from: https://www.premierleague.com/news/1682374
[accessed 26 July 2024].

Sky Sports Premier League. (2023). Darwin Nunez tops the Premier League’s average XG
per game this season! [online]. Available from:
https://x.com/SkySportsPL/status/1633482957385023491?lang=en [accessed 21 July 2024].

Whitmore, J. (2023). What is expected goals (xG)? [online]. Available from:
https://theanalyst.com/eu/2023/08/what-is-expected-goals-xg/ [accessed 21 July 2024].

https://thesefootballtimes.co/2020/04/08/the-roots-of-expected-goals-xg-and-its-journey-from-nerd-nonsense-to-the-mainstream/
