Can AI Predict Player Performance in New Team Environments?

Using Fine-Tuned LEMs to Assess Soccer Players' Impact Across Various Contexts.

Feb 06, 2025

1 Introduction

Decision-making in sports is difficult, particularly when substantial financial investments in players or managers are at stake. While the use of data to support these decisions has grown considerably in recent decades, its application often lacks a comprehensive and nuanced approach. One example is the evaluation of compatibility between a specific player and a team. Despite the proliferation of "AI-based solutions" that claim to assess this compatibility, there is a notable absence of published validation studies for such tools. Many existing solutions appear to rely on linear paradigms, yet the complexity of the problem suggests that non-linear methodologies are needed.

This gap is particularly pressing given that approximately 50% of soccer transfers fail, for a variety of reasons (see Table 1). While some of these reasons, such as adaptability issues, may lie beyond the control of decision-makers, others, like alignment with playing style or positional compatibility, can potentially be addressed through improved frameworks.

“**Table 1.** Why transfers fail according to Ian Graham, compiled by Tom Worville. \*Reasons that can be addressed in our framework.”

The overarching aim of this study is to evaluate a player’s potential impact on a soccer team, aligning with various existing research efforts. Numerous methods have emerged to assess player and event contributions based on game state value, often referred to as possession value. Examples include the Valuing Actions by Estimating Probabilities (VAEP) model, which evaluates the likelihood of a goal resulting from subsequent actions, and the expected threat (xT) metric, which assigns values to specific areas of the pitch based on their scoring potential. These frameworks provide substantial insights but often fail to account for contextual influences—a critical limitation when addressing the question: How might these values change if a player were to join a different team?

Contrasting with conventional approaches, this study introduces the application of Large Events Models (LEMs). LEMs operate on principles akin to Large Language Models (LLMs), predicting the next element in a sequence based on the existing context. While LLMs focus on predicting subsequent words, LEMs predict the next event in a sequence, with the current game state shaping this prediction. Importantly, both LEMs and LLMs allow the next predicted element to alter the context, enabling dynamic simulations and coherent scenario generation.

This approach uniquely addresses a key challenge: LLMs often excel in creative tasks but falter in reasoning tasks, such as generating concise player recommendations for a team. By simulating diverse contexts and predicting player behavior within those contexts, LEMs facilitate informed decision-making based on multiple Key Performance Indicators (KPIs). These KPIs include anticipated contributions to team metrics such as points, shots, key passes, crosses, and set-piece goals.

To validate this approach, LEMs were fine-tuned to learn and simulate team and player behaviors across various scenarios. Teams from the 2017/18 English Premier League (EPL) season were analyzed, showcasing LEMs' capability to capture distinct team behaviors. A comparative analysis was conducted to evaluate the hypothetical impact of acquiring Lionel Messi or Cristiano Ronaldo for each Premier League team, highlighting the methodology’s strengths and areas for improvement.

While the availability of public data imposes constraints, the methodology proposed herein offers enhanced predictive depth compared to existing approaches. By leveraging LEMs, this framework provides a comprehensive analysis of a player’s potential impact upon joining a new team. The method moves beyond simplistic numerical summations of player attributes, enabling a multidimensional exploration of recruitment outcomes. This enriches the knowledge base and bolsters the effectiveness of decision-making processes in player acquisitions.

2 Background

2.1 Large Events Models

Large Events Models (LEMs) extend the framework of Large Language Models (LLMs) to soccer by adopting their sequential prediction mechanism. Where LLMs predict the next word in a sequence by analyzing the prior context, LEMs predict the next soccer event by examining the current game state, which encompasses the match score, prior events, and other contextual information. These models iteratively update the game state with each predicted event, enabling comprehensive simulations of soccer matches that are grounded in real data.

“**Figure 1.** LEMs and LLMs work on the same principle: given the initial context that contains all current information, it can forecast the next token. This token then updates the context iteratively until an exit criterion is met. For LLMs, the tokens are words. For LEMs, the tokens are events.”

The sequential nature of LEMs provides advantages over traditional methods, enabling them to model complex interactions within soccer events. For example, they can predict basic metrics like expected goals or simulate entire matches. Furthermore, smaller-scale LEMs can focus on specific scenarios, such as key events within games, allowing for targeted analysis.

The phased approach employed in LEMs ensures accurate forecasting of interrelated variables. First, the model predicts the type of the next event. Subsequently, it forecasts the event's accuracy and whether it results in a goal. Lastly, the model predicts additional attributes, including spatial location, time elapsed, and team involvement.
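
The three phases can be sketched as a chain of conditional sampling steps. This is a minimal illustration only: the stage functions below are stubs with placeholder probabilities, and every name in it (`predict_type`, `predict_outcome`, `predict_attributes`, the `EVENT_TYPES` subset) is a hypothetical stand-in rather than the authors' implementation.

```python
import random

# Illustrative subset of event types; not the full taxonomy of the dataset.
EVENT_TYPES = ["pass", "shot", "duel", "foul"]


def predict_type(state, rng):
    """Phase 1: sample the type of the next event (stub: uniform choice)."""
    return rng.choice(EVENT_TYPES)


def predict_outcome(state, event_type, rng):
    """Phase 2: sample accuracy and goal flags, conditioned on the event type."""
    is_accurate = rng.random() < 0.8  # placeholder probability
    # In this toy version only an accurate shot can become a goal.
    is_goal = event_type == "shot" and is_accurate and rng.random() < 0.1
    return is_accurate, is_goal


def predict_attributes(state, event_type, is_accurate, is_goal, rng):
    """Phase 3: sample location, elapsed time, and the team involved."""
    return {
        "x": rng.random(),
        "y": rng.random(),
        "seconds_elapsed": rng.uniform(0.0, 10.0),
        "is_home_team": rng.random() < 0.5,
    }


def predict_next_event(state, rng):
    """Run the three phases in order, each conditioned on the previous ones."""
    event_type = predict_type(state, rng)
    is_accurate, is_goal = predict_outcome(state, event_type, rng)
    attributes = predict_attributes(state, event_type, is_accurate, is_goal, rng)
    return {
        "event_type": event_type,
        "is_accurate": is_accurate,
        "is_goal": is_goal,
        **attributes,
    }
```

Ordering the phases this way lets later predictions depend on earlier ones, for example a goal flag that is only possible once the event type is known to be a shot.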

The model inputs consist of detailed event features:

Event Type: Encodes event types like passes or shots with a categorical variable.
Period and Minute: Binary and normalized continuous variables, respectively, representing the match segment and time.
Spatial Coordinates (X, Y): The position of the event on the field, normalized to a standard range of 0–1 to maintain consistency across matches.
IsHomeTeam: Indicates whether the home team executed the action.
IsAccurate and IsGoal: Binary variables denoting action success and goal outcomes.
Team Scores: Normalized scores of the home and away teams at the event time.
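
As an illustration, the features above could be packed into a numeric vector along the following lines. This is a sketch under assumptions: the `Event` class, the integer encoding of event types, and the scaling constants `MINUTE_SCALE` and `SCORE_SCALE` are hypothetical choices, and raw coordinates are assumed to arrive on a 0–100 scale.

```python
from dataclasses import dataclass

# Hypothetical encoding and scaling choices, for illustration only.
EVENT_TYPE_IDS = {"pass": 0, "shot": 1, "duel": 2, "foul": 3}
MINUTE_SCALE = 60.0  # assumed upper bound on minutes played within a period
SCORE_SCALE = 10.0   # assumed upper bound on goals scored by one team


@dataclass
class Event:
    event_type: str
    period: int         # 1 = first half, 2 = second half
    minute: float       # minutes elapsed within the period
    x: float            # raw coordinate, assumed to be on a 0-100 scale
    y: float
    is_home_team: bool
    is_accurate: bool
    is_goal: bool
    home_score: int
    away_score: int

    def to_vector(self) -> list:
        """Encode the event as a flat list of floats, mostly in [0, 1]."""
        return [
            float(EVENT_TYPE_IDS[self.event_type]),    # categorical id
            float(self.period - 1),                    # binary period flag
            min(self.minute / MINUTE_SCALE, 1.0),      # normalized minute
            self.x / 100.0,                            # normalized X
            self.y / 100.0,                            # normalized Y
            float(self.is_home_team),
            float(self.is_accurate),
            float(self.is_goal),
            min(self.home_score / SCORE_SCALE, 1.0),   # normalized home score
            min(self.away_score / SCORE_SCALE, 1.0),   # normalized away score
        ]
```

For example, `Event("shot", 2, 30.0, 90.0, 50.0, True, True, True, 1, 0).to_vector()` yields a ten-element vector that a model could consume directly.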

LEMs produce probabilistic outputs for all variables, enabling diverse simulation scenarios. Once an event is predicted, its context updates the game state, serving as the input for subsequent predictions. This iterative process creates a robust simulation framework that supports detailed player and team analysis.
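
The iterative loop can be sketched as follows. The `stub_next_event` function is a hypothetical stand-in for a trained LEM, its probabilities are placeholders, and the state layout (`clock`, `home_score`, `away_score`, `history`) is an assumption made for this example.

```python
import random


def stub_next_event(state, rng):
    """Stand-in for a trained LEM: samples one event given the game state."""
    return {
        "is_home_team": rng.random() < 0.5,
        "is_goal": rng.random() < 0.002,           # placeholder goal probability
        "seconds_elapsed": rng.uniform(1.0, 5.0),  # time consumed by the event
    }


def simulate_match(next_event, rng, match_seconds=90 * 60):
    """Generate events one at a time, feeding each back into the game state."""
    state = {"clock": 0.0, "home_score": 0, "away_score": 0, "history": []}
    while state["clock"] < match_seconds:  # exit criterion: match time is over
        event = next_event(state, rng)
        state["clock"] += event["seconds_elapsed"]
        if event["is_goal"]:
            key = "home_score" if event["is_home_team"] else "away_score"
            state[key] += 1
        state["history"].append(event)  # the new event becomes part of the context
    return state


def home_points(state):
    """Convert a final score into league points for the home team."""
    if state["home_score"] > state["away_score"]:
        return 3
    if state["home_score"] == state["away_score"]:
        return 1
    return 0
```

Running `simulate_match` many times with different random seeds and averaging `home_points` gives an expected-points estimate of the kind analyzed later in the article.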

LEMs provide significant benefits, such as creating sophisticated player performance metrics and enabling game-level insights by identifying patterns across simulations. They are highly adaptable, applicable to various levels of soccer and capable of integrating with tracking data. However, a limitation lies in their generic nature, as they simulate average matches based on training data from all teams. This limitation is addressed in subsequent sections by fine-tuning LEMs for specific contexts.

3 Methodology

3.1 Data

Event data forms the backbone of soccer analysis, offering granular information on match events such as passes, shots, and goals, with attributes like event type, spatial coordinates, and outcomes. This study employs the publicly available Wyscout dataset, focusing on matches from top-tier leagues during the 2017/18 season. The dataset includes approximately 1,446 games across Ligue 1, Bundesliga, Serie A, Premier League, and La Liga. The primary training data was drawn from Ligue 1 and Bundesliga, while Serie A data was used for validation, and data from Premier League and La Liga contributed to fine-tuning.
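
The league-based split described above might be expressed as a simple lookup. The record layout (a dictionary with a `league` key) and the split names are assumptions for this sketch, not the format of the Wyscout files.

```python
# Role of each league in the pipeline, as described in the text.
SPLIT_BY_LEAGUE = {
    "Ligue 1": "train",
    "Bundesliga": "train",
    "Serie A": "validation",
    "Premier League": "fine_tune",
    "La Liga": "fine_tune",
}


def split_matches(matches):
    """Group match records by the role their league plays in the pipeline."""
    splits = {"train": [], "validation": [], "fine_tune": []}
    for match in matches:
        splits[SPLIT_BY_LEAGUE[match["league"]]].append(match)
    return splits
```

Splitting by league rather than at random keeps the fine-tuning leagues entirely unseen during general training.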

3.2 Fine-Tuning LEMs to Learn Different Contexts

Fine-tuning involves adapting a general LEM to specific contexts by further training it on specialized datasets. For example, a LEM trained on league-wide data can be fine-tuned to reflect the unique behavior of a particular team or player. This process entails selecting relevant subsets of data, such as games played at home or specific players’ contributions, and using these subsets to refine the model.

“**Figure 2.** This figure depicts the two-stage process of developing a fine-tuned LEM model: first, we use a large dataset to build a LEM, then we fine-tune it using specific data about our target”

Four primary fine-tuning contexts were explored:

Team Behavior: Models a team’s performance compared to league averages.
Player Impact: Evaluates how individual players influence game outcomes.
Player Addition: Assesses the hypothetical addition of a new player to a team.
Player Replacement: Analyzes the effect of replacing an existing player with another.

“**Table 2.** The types of fine-tuning that we can perform on LEMs. The type "Team" focuses on replicating a single team’s behavior against the league average. The type "Player" focuses on estimating the impact of the player on the average team of the league. "Player Addition" includes data from a new player in the context of the new team. "Player Replacement" does the same but excludes data from the player being replaced. In the "Team Face-off" type, we have the data of two teams: the home team and the away team.”

These fine-tuned models enable nuanced analysis by isolating specific influences, such as a player’s impact or tactical variations, thereby providing actionable insights for recruitment or strategy.

3.3 Player Addition vs. Player Replacement

The distinction between Player Addition and Player Replacement lies in how the data is manipulated. Player Addition evaluates a new player’s hypothetical contribution to a team while retaining data from existing squad members. In contrast, Player Replacement actively removes data from a player being substituted, ensuring the model focuses exclusively on the incoming player’s impact. This approach ensures an unbiased evaluation of the replacement’s effect on the team.
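
One way to picture the difference is as two filters over the same pool of events. The sketch below assumes each event is a dictionary with hypothetical `team` and `player` keys; the source does not spell out how the subsets are actually built, so treat this as an interpretation.

```python
def player_addition_subset(events, team, new_player):
    """Keep the team's own events plus every event by the incoming player."""
    return [e for e in events if e["team"] == team or e["player"] == new_player]


def player_replacement_subset(events, team, new_player, replaced_player):
    """Same as addition, but drop every event by the player being replaced."""
    return [
        e
        for e in player_addition_subset(events, team, new_player)
        if e["player"] != replaced_player
    ]
```

The replacement subset is always a subset of the addition subset, which mirrors the description above: the only change is the removal of the outgoing player's data.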

“**Table 3.** The list of players that are replaced in the Player Replacement fine-tuning”

3.4 Parameter Tuning

The fine-tuning process maintained many of the parameters from the general LEM training but introduced several refinements:

Reduced learning rate to improve training stability.
Batch size optimized based on dataset size to balance computational efficiency with model accuracy.
Simulation counts set to 2,500 per model to minimize variability while maintaining computational feasibility.

“**Figure 3.** This figure illustrates that the outcome variability originates from the model’s training and fine-tuning rather than the simulation process. At the threshold of 3000 simulations, a sequence of 10 consecutive wins is required to alter the expected points by a mere +0.01. Additionally, the figure shows that the error distribution follows a normal curve, indicating that, over an extended period, the average error in simulations is expected to converge towards zero.”

These adjustments aimed to enhance model performance while managing computational constraints. Simulations demonstrated that error distribution follows a normal curve, ensuring reliable predictions over extended periods.
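
The sensitivity figure quoted in the Figure 3 caption can be checked with simple arithmetic. The sketch below assumes one reading of that caption, namely that some simulated losses (0 points) are flipped into wins (3 points) and the average is recomputed; the function name is hypothetical.

```python
def mean_shift_from_flipped_results(n_simulations, n_flipped, points_gained=3):
    """Change in average points when n_flipped losses become wins."""
    return n_flipped * points_gained / n_simulations


# 10 flipped results out of 3000 simulations: 30 / 3000 points per game.
shift_3000 = mean_shift_from_flipped_results(3000, 10)
# The same 10 flipped results out of 2,500 simulations: 30 / 2500 points per game.
shift_2500 = mean_shift_from_flipped_results(2500, 10)
```

Under this reading, `shift_3000` works out to 0.01 points per game, matching the caption, and the shift grows only slightly at 2,500 simulations.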

3.5 Limitations

This study focused exclusively on home game data, both to limit computational cost and because the methodology adapts straightforwardly to away games. Future work can replicate the analysis for away games by modifying specific data parameters. Additionally, computational constraints limited the optimization of hyperparameters, highlighting the need for more resources to refine the methodology further.

4 Experiments

4.1 Simulating the Premier League

The fine-tuned LEM models were evaluated using the 2017/18 Premier League data to simulate team performances and compare the predicted league standings against actual outcomes. Predictions included full-league performance as well as home-game-specific results, as the model was fine-tuned on home-game data. The accuracy of these predictions was assessed using the average displacement metric, which measures the deviation between predicted and actual standings.

“**Table 5.** Comparing the forecasts from the finetuned LEM models against the end-of-season (EoS) tables. The average displacement (Avg. Disp.) is the average number of team positions from the actual position. We compare against the full and home tables, with the home table providing less displacement since all models are finetuned using home game data.”

The model showed high accuracy for top-performing teams, with Manchester City and Liverpool meeting expectations by securing first and second positions, respectively. These results reflect Manchester City’s record-breaking season and Liverpool’s Champions League final appearance, even though the latter was not captured within the dataset. However, discrepancies became more pronounced for mid- and lower-tier teams. Notably, in the top six, the average displacement was limited to 1.3 positions, whereas for the entire league, it rose to 3.4. This variability underscores the competitive nature of the Premier League, where small changes in match outcomes can significantly shift team standings. For example, a single additional win for a team could result in a positional change of up to four places, emphasizing the sensitivity of league dynamics.
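
The average displacement metric is straightforward to compute from two ordered tables. The sketch below follows the definition given in the Table 5 caption; the function name and the placeholder team labels in the usage note are illustrative.

```python
def average_displacement(predicted, actual):
    """Mean absolute difference between predicted and actual table positions.

    Both arguments are lists of team names ordered from first to last place
    and must contain the same teams.
    """
    actual_position = {team: i for i, team in enumerate(actual)}
    total = sum(
        abs(i - actual_position[team]) for i, team in enumerate(predicted)
    )
    return total / len(predicted)
```

For a toy four-team table, `average_displacement(["A", "B", "C", "D"], ["A", "C", "B", "D"])` returns 0.5, since two teams are each one place off and two are exactly right.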

4.2 Cristiano Ronaldo vs. Lionel Messi

The hypothetical addition of Cristiano Ronaldo and Lionel Messi to various Premier League teams was explored using fine-tuned LEMs, focusing on their impact on home-game performance. Violin plots illustrated the distributions of expected points for teams under three scenarios: baseline (without the players), with Ronaldo, and with Messi. These distributions revealed the differences in mean performance, variability, and the overall shape of expected outcomes under the three conditions.

“**Figure 4.** The expected impact of adding Cristiano Ronaldo or Lionel Messi on the teams in the EPL. The figure presents the violin plots of the simulations using the fine-tuned models.”

The analysis highlighted significant variability in how teams would benefit from these star players. Teams like Tottenham showed a clear increase in average home points when either Ronaldo or Messi was hypothetically added. However, for Manchester City, no improvement was observed, which aligns with the team’s already optimized performance during the 2017/18 season. This finding suggests that integrating a star player into a well-oiled system might disrupt rather than enhance team dynamics.

“**Figure 5.** The expected impact of replacing a player for Cristiano Ronaldo or Lionel Messi on the teams in the EPL.”

Team-specific contexts played a crucial role in determining the players’ impacts. For instance, Messi had a greater positive effect on Watford than Ronaldo, primarily due to Messi’s superiority over Watford’s key players. Conversely, Ronaldo outperformed Messi at Leicester, a result tied to the comparative differences between the incoming stars and the players they would hypothetically replace. These findings underscore the importance of considering the existing team dynamics when evaluating player impact.

Interestingly, both players generally reduced the variance in teams’ performance distributions. Lower variance can be beneficial, especially for teams aiming to avoid relegation, as it minimizes the likelihood of poor outcomes even if it slightly reduces the average result. Moreover, the study hinted at the possibility of decomposing player impacts by combining individual player effect distributions from separate simulations, potentially streamlining the simulation process.
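
The trade-off between mean and variance can be made concrete with a small summary function. The numbers in the usage note are made-up toy values chosen to illustrate the point, not results from the study, and `summarize` with its `floor` threshold is a hypothetical helper.

```python
import statistics


def summarize(points_per_game, floor):
    """Mean, spread, and share of simulations that fall below a points floor."""
    below = sum(1 for p in points_per_game if p < floor)
    return {
        "mean": statistics.mean(points_per_game),
        "stdev": statistics.pstdev(points_per_game),
        "below_floor": below / len(points_per_game),
    }


# Toy simulation outcomes (invented for illustration only).
baseline = [0.6, 1.0, 1.5, 2.0, 2.4]
with_player = [1.2, 1.3, 1.4, 1.5, 1.6]

baseline_summary = summarize(baseline, floor=1.1)
with_player_summary = summarize(with_player, floor=1.1)
```

In this toy case the second distribution has a slightly lower mean but a much smaller spread, and none of its outcomes fall below the floor, which is the kind of profile a team fighting relegation might prefer.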

4.3 The Importance of Context

The role of context was further investigated by fine-tuning LEMs using data specific to individual players. When evaluating the hypothetical addition of top (Illarramendi) and bottom (Casemiro) performers, the context significantly influenced their impacts. While Illarramendi consistently outperformed Casemiro, the difference in their contributions was smaller than initially suggested when placed in specific Premier League teams. This discrepancy highlighted the importance of contextual alignment between players and teams, demonstrating that generalizing player performance across different contexts can lead to misleading conclusions.

“**Figure 6.** The impact on the baseline of the top 10 players of the 2017/2018 season, according to Sofascore.”

By contrasting the results of general context models with those fine-tuned to specific teams, the experiments revealed how context-specific analysis could refine evaluations of player impact. Metrics derived from generalized models often overestimated or underestimated player contributions when actual team contexts were considered.

“**Figure 7.** The expected impact of adding Casemiro or Illarramendi on the teams in the EPL.”

5 Discussion

The experiments confirmed the utility of LEMs as a tool for analyzing players within the context of specific team dynamics. By capturing team playstyles and simulating event sequences, LEMs provide a powerful framework for evaluating players across a variety of metrics. While the current analysis focused on Points Per Game, the methodology extends to any event-based metric, enabling a broader understanding of player performance.

However, challenges remain in isolating individual player impacts from team dynamics. Current LEMs rely on event data, which inherently reflects a player’s interaction within a team context. This introduces limitations, as some data points, such as passes received, are influenced by teammates’ actions, making it difficult to attribute outcomes exclusively to a single player. Additionally, LEMs cannot fully model how a player might alter a team’s overall playstyle, as they evaluate players within the constraints of existing team dynamics.

Future advancements could include larger datasets for training, incorporating multi-season data for validation, and exploring advanced deep learning architectures like Transformers. Extending the context window to include more events and leveraging more computational resources could also enhance the model’s predictive capabilities. These improvements would enable more robust simulations, providing deeper insights into the complex interactions between players and teams.

6 Conclusion

This study highlights the potential of Large Events Models (LEMs) in soccer analytics, particularly for analyzing team playstyles and player impacts. LEMs successfully predicted outcomes for optimized teams like Manchester City while capturing variability for less consistent teams. Star players like Ronaldo and Messi improved predictability but had limited impact on already high-performing teams. While LEMs offer valuable insights for recruitment and performance analysis, limitations remain in isolating individual influence and predicting playstyle changes. With further advancements, LEMs hold promise for broader applications in sports analytics.

Mendes-Neves, T., Meireles, L., & Mendes-Moreira, J. (2024). Estimating Player Performance in Different Contexts Using Fine-tuned Large Events Models. arXiv preprint arXiv:2402.06815. https://arxiv.org/abs/2402.06815v2