Towards Data Science

Who Will Win the 2026 World Cup?


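TL;DR · AI summary

This article builds an interpretable, defensible forecast of the 2026 World Cup in three stages (Elo ratings, Poisson goal distributions, and Monte Carlo simulation) and uses it to show the value of transparent pipelines in data science.

Key points

  • Uses the World Football Elo rating as the only feature for team strength, which simplifies the model while keeping it auditable.
  • Maps the Elo difference to each team's Poisson rate λ, producing a full scoreline probability matrix for every match.
  • Runs 10,000 simulations to obtain quantifiable results such as each team's title probability and its chances of reaching each knockout stage.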

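Outline

  1. The author explains the motivation for, and the value of, building a defensible World Cup forecast.

  2. § Step 1: Rate every team with Elo

    Introduces the Elo formula, its parameters, and the assumption that team strength compresses into a single number.

  3. Shows how to derive a Poisson rate λ for each team and build the scoreline probability matrix.

  4. Describes the simulation procedure, the resulting statistics, and sensitivity checks on the key assumptions.

  5. Stresses that a transparent pipeline carries over to forecasting in other domains.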

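Mind map

  • 2026 World Cup forecasting model
    • Elo ratings
      • Formula and parameters
      • Assumption: strength is stable in the short run
    • Poisson distribution
      • λ calculation
      • Scoreline probability matrix
    • Monte Carlo simulation
      • 10,000 simulations
      • Result statistics
      • Sensitivity checks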

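Highlights

  • Elo compresses a team's strength into a single number and assumes that strength stays constant in the short run.

    Step 1

  • The Poisson model assumes goals occur independently at a constant average rate, which suits rare events within a match.

    Step 2

  • With 10,000 simulations, the model outputs quantifiable metrics such as each team's title probability and its chances of reaching each knockout stage.

    Step 4

#Elo #Poisson #Monte Carlo #Data Science #Football Analytics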

The 2026 World Cup kicks off on June 11 with 48 teams, 104 matches, and the usual avalanche of hot takes. I wanted a forecast I could actually defend: not just a cool machine learning model with nice results, but a model where every number traces back to an explicit assumption I could argue about.

This article builds that forecast from scratch. It is deliberately simple: rate every team, convert each matchup into a goal distribution, and simulate the whole tournament tens of thousands of times.

This may sound very football-specific, but pretty much everything in this article, from the methodology to the way we interpret results, is universal to data science. Swap “teams” for sales reps, delivery dates, server loads, or churn cohorts and the same three steps give you a defensible forecast instead of a point estimate.

The real transferable skill here is building a pipeline where every number traces back to an assumption you can argue about, rather than one a black box machine learning model hides from you.

In our soccer case, this means no tracking data, no deep learning, and nothing you couldn’t rebuild in an afternoon. But don’t stop reading here! The point isn’t sophistication. It’s about having a transparent pipeline that forces you to confront the very modeling choices that black boxes hide. We’ll build our model in three steps and interrogate the assumptions at each one.

Step 1: Rate every team with Elo

You can’t forecast a match without a number for how good each side is. The cleanest off-the-shelf option for national teams is the _World Football Elo rating_, an adaptation of Arpad Elo’s chess system.

Elo is a single self-correcting equation. Each team carries a rating R. Before a match, the expected score of team A against team B (on a 0–1 scale, where 1 is a win) is a logistic function of the rating difference:

$$E_A = \frac{1}{1 + 10^{-(R_A - R_B)/400}}$$

After the match, you nudge the rating toward what actually happened:

$$R_A' = R_A + K\,(S_A - E_A)$$

where S_A is the realized result (1 win, 0.5 draw, 0 loss) and K controls how fast ratings move. The football variant adds two wrinkles that matter: K scales with the _margin of victory_ (a 4–0 moves ratings more than a 1–0), and it weights competitive matches above friendlies. The constant 400 is a scale choice: it’s what makes a 400-point gap correspond to roughly a 10:1 favorite (E ≈ 0.91).
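
To make the two equations concrete, here is a minimal sketch of the expected-score and update steps. The function names and the constant `k=40.0` are assumptions for illustration; the real World Football Elo schedule varies K by match type and margin of victory, which this sketch leaves out. The two ratings are Spain's and Germany's from the article's pre-tournament snapshot.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of team A against team B on a 0-1 scale."""
    return 1.0 / (1.0 + 10 ** (-(r_a - r_b) / 400.0))


def update_rating(r_a: float, r_b: float, s_a: float, k: float = 40.0) -> float:
    """Return A's new rating after result s_a (1 win, 0.5 draw, 0 loss).

    k = 40 is an illustrative constant, not the World Football Elo schedule.
    """
    return r_a + k * (s_a - expected_score(r_a, r_b))


# Spain (2155) against Germany (1925)
e_spain = expected_score(2155, 1925)
print(f"Expected score for Spain: {e_spain:.3f}")
print(f"Spain after a win:  {update_rating(2155, 1925, 1.0):.1f}")
print(f"Spain after a loss: {update_rating(2155, 1925, 0.0):.1f}")
```

Because the favorite was already expected to win, a win moves its rating far less than a loss does, which is the self-correcting behavior described above.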

For the model, we only need the current ratings, stored as a dictionary. I’m using the pre-tournament snapshot from early June 2026, taken from a freely reusable Kaggle dataset that compiles these ratings:

```python
# World Football Elo Ratings, pre-tournament snapshot (early June 2026).
# Source: "2026 FIFA World Cup — Historical Elo Ratings" (Kaggle, CC BY-SA 4.0),
# compiling data from World Football Elo Ratings (eloratings.net).
ELO = {
    "Spain": 2155, "Argentina": 2113, "France": 2062,
    "England": 2020, "Brazil": 1988, "Portugal": 1984,
    "Colombia": 1977, "Netherlands": 1944, "Germany": 1925,
    # ... all 48 qualified teams
}
```

Assumption check: Elo compresses everything — form, squad quality, fatigue — into one number and assumes a team’s strength is roughly stationary in the short run. That’s a strong simplification, but it’s an honest, auditable one, and Elo is hard to beat as a single feature.

Step 2: Turn a rating gap into a goal distribution

A rating difference gives us a win _probability_, but to simulate a tournament we want _scorelines_ — they drive goal difference, group tiebreakers, and the texture of the thing. The standard move in soccer analytics is to model each team’s goals as a _Poisson process._

The Poisson distribution gives the probability of observing k events when events occur independently at a constant average rate λ:

$$P(k \text{ goals}) = \frac{\lambda^k e^{-\lambda}}{k!}$$

Goals fit this well empirically: they’re discrete, relatively rare, and roughly memoryless within a match. If we treat the two teams’ goal counts as independent Poisson variables with means λ_home and λ_away, the full scoreline distribution is just the outer product of their two pmfs, and we can read off win/draw/loss probabilities by summing the appropriate cells:

```python
from scipy.stats import poisson
import numpy as np

def match_probs(lam_home, lam_away, max_goals=10):
    h = poisson.pmf(np.arange(max_goals + 1), lam_home)
    a = poisson.pmf(np.arange(max_goals + 1), lam_away)
    grid = np.outer(h, a)             # grid[i, j] = P(home i, away j)
    p_home = np.tril(grid, -1).sum()  # home goals > away goals
    p_draw = np.trace(grid)
    p_away = np.triu(grid, 1).sum()
    return p_home, p_draw, p_away
```

Assumption check: the independence assumption is convenient but imperfect. Real scorelines show correlation and an excess of low-scoring draws (0–0, 1–1). The standard fix is the _Dixon–Coles_ adjustment, which adds a low-score correction term and a time-decay weighting on historical matches. We’re skipping it here for clarity; it’s a natural upgrade and exactly the kind of refinement my upcoming book’s Poisson chapter walks through.
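
For readers curious what the low-score correction looks like, here is a rough sketch of the Dixon–Coles multiplier applied to the four cells 0–0, 0–1, 1–0, and 1–1. It is not part of the article's model. The value `RHO = -0.1` is purely illustrative (in practice it is fitted to match data), the helper names are made up, and the time-decay weighting is left out. The sketch uses only the standard library so it runs without SciPy.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events at rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def dc_tau(i: int, j: int, lam_home: float, lam_away: float, rho: float) -> float:
    """Dixon-Coles multiplier for the four low-score cells; 1 elsewhere."""
    if i == 0 and j == 0:
        return 1.0 - lam_home * lam_away * rho
    if i == 0 and j == 1:
        return 1.0 + lam_home * rho
    if i == 1 and j == 0:
        return 1.0 + lam_away * rho
    if i == 1 and j == 1:
        return 1.0 - rho
    return 1.0


def score_grid(lam_home, lam_away, rho=0.0, max_goals=10):
    """grid[i][j] = P(home scores i, away scores j); rho=0 is plain Poisson."""
    return [
        [
            poisson_pmf(i, lam_home)
            * poisson_pmf(j, lam_away)
            * dc_tau(i, j, lam_home, lam_away, rho)
            for j in range(max_goals + 1)
        ]
        for i in range(max_goals + 1)
    ]


def draw_prob(grid):
    """Sum of the diagonal cells (equal scores)."""
    return sum(grid[i][i] for i in range(len(grid)))


RHO = -0.1  # illustrative value only, not an estimate
plain = score_grid(1.35, 1.35)
adjusted = score_grid(1.35, 1.35, rho=RHO)
print(f"P(draw), independent Poisson: {draw_prob(plain):.3f}")
print(f"P(draw), with correction:     {draw_prob(adjusted):.3f}")
```

With a negative `rho`, probability mass moves from 1–0 and 0–1 onto 0–0 and 1–1 while the grid total stays the same, which is how the correction inflates low-scoring draws.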

Step 3: Connect ratings to goals

We need λ_home and λ_away as a function of the Elo gap. A robust piece of soccer-modeling folklore is that a ~400-point Elo edge is worth roughly one goal of supremacy. So we split a baseline of ~2.7 total goals (a typical international average) between the teams according to their rating difference:

```python
GOALS_BASE = 2.7
GOALS_PER_400_ELO = 1.0

def lambdas(elo_a, elo_b):
    diff = (elo_a - elo_b) / 400.0 * GOALS_PER_400_ELO
    la = max(0.15, GOALS_BASE / 2 + diff / 2)
    lb = max(0.15, GOALS_BASE / 2 - diff / 2)
    return la, lb
```

The floor at 0.15 keeps even a massive underdog from being assigned a non-physical negative scoring rate. A more principled version fits log(λ) = β₀ + β₁·Δrating as a Poisson GLM on real match data; the linear-supremacy heuristic above is the back-of-envelope version and lands in the same place for the favorites.
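
As a worked example, the sketch below chains the rating-to-goals mapping into win/draw/loss probabilities for one matchup, Spain (2155) against Germany (1925) from the snapshot. `lambdas` is repeated from the article so the block runs on its own; `poisson_pmf` and `outcome_probs` are standard-library stand-ins I am assuming for illustration in place of the SciPy/NumPy version.

```python
import math

GOALS_BASE = 2.7
GOALS_PER_400_ELO = 1.0


def lambdas(elo_a, elo_b):
    """Split the baseline goal total according to the Elo gap."""
    diff = (elo_a - elo_b) / 400.0 * GOALS_PER_400_ELO
    la = max(0.15, GOALS_BASE / 2 + diff / 2)
    lb = max(0.15, GOALS_BASE / 2 - diff / 2)
    return la, lb


def poisson_pmf(k, lam):
    """Poisson probability of exactly k goals at rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def outcome_probs(lam_a, lam_b, max_goals=10):
    """Win/draw/loss for team A from independent Poisson goal counts."""
    p_a = p_draw = p_b = 0.0
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            p = poisson_pmf(i, lam_a) * poisson_pmf(j, lam_b)
            if i > j:
                p_a += p
            elif i == j:
                p_draw += p
            else:
                p_b += p
    return p_a, p_draw, p_b


la, lb = lambdas(2155, 1925)  # 230-point gap -> 0.575 goals of supremacy
p_win, p_draw, p_loss = outcome_probs(la, lb)
print(f"lambda Spain = {la:.4f}, lambda Germany = {lb:.4f}")
print(f"Spain win {p_win:.3f}, draw {p_draw:.3f}, Germany win {p_loss:.3f}")
```

One thing the example makes visible: the two rates sum to 2.7 only while neither side hits the 0.15 floor. For a gap above roughly 480 points the underdog is clamped and the implied goal total drifts upward.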

Step 4: Simulate the tournament 10,000 times

A single simulation isn’t a forecast, it’s just one possible 2026. The forecast is the _distribution_ over thousands of them. So we run the entire bracket and tally how often each team wins.

The 2026 format is new and worth stating precisely: 48 teams in 12 groups of four, where the top two from each group _plus the eight best third-placed teams_ advance to a 32-team single-elimination knockout.

That third-place rule is quite a combinatorial wrinkle because you can’t decide who advances until every group is done. Thus, the simulation tracks points and goal difference for all four teams in each group, ranks the third-placed teams across groups, and takes the best eight. In the knockout rounds a draw goes to penalties, which we model as a near-coin-flip nudged slightly toward the stronger side.
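
The cross-group ranking of third-placed teams can be sketched as a single sort. The `Row` type, the function name, and the demo numbers are all hypothetical; the article mentions points and goal difference, and using goals scored as a third key is my assumption. Any further tiebreakers are left out.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    team: str
    group: str
    points: int
    goal_diff: int
    goals_for: int


def best_third_placed(thirds: List[Row], n_advance: int = 8) -> List[Row]:
    """Rank third-placed teams across groups and keep the top n_advance.

    Sort key: points, then goal difference, then goals scored (assumed).
    """
    ranked = sorted(
        thirds,
        key=lambda r: (r.points, r.goal_diff, r.goals_for),
        reverse=True,
    )
    return ranked[:n_advance]


# Made-up third-place records for groups A-L, for illustration only.
demo = [
    Row("Third A", "A", 4, 1, 3), Row("Third B", "B", 3, 0, 2),
    Row("Third C", "C", 3, -1, 4), Row("Third D", "D", 4, 0, 2),
    Row("Third E", "E", 2, -2, 1), Row("Third F", "F", 3, -1, 2),
    Row("Third G", "G", 6, 1, 4), Row("Third H", "H", 1, -3, 1),
    Row("Third I", "I", 3, 0, 3), Row("Third J", "J", 4, 1, 5),
    Row("Third K", "K", 2, -1, 2), Row("Third L", "L", 3, -2, 3),
]

qualifiers = best_third_placed(demo)
print([r.group for r in qualifiers])
```

The function needs all twelve group tables as input, which is why the simulation cannot start the knockout stage until every group has finished.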

```python
N = 10_000
title = {t: 0 for t in ELO}

for _ in range(N):
    champion = simulate_one_tournament()  # groups -> R32 -> ... -> final
    title[champion] += 1

probs = {t: title[t] / N for t in ELO}
```
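
The article does not show `simulate_one_tournament`, so here is a toy stand-in that covers only the knockout mechanics: a four-team bracket, Poisson scorelines, and a shootout modeled as a near-coin-flip. The names `sample_poisson`, `knockout_winner`, and `simulate_knockout`, the `pen_edge=0.05` nudge, and the bracket order are my assumptions. Extra time and the group stage are skipped.

```python
import math
import random

GOALS_BASE = 2.7


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def knockout_winner(a, b, elo, rng, pen_edge=0.05):
    """Play one knockout match; a level score goes straight to penalties."""
    diff = (elo[a] - elo[b]) / 400.0
    la = max(0.15, GOALS_BASE / 2 + diff / 2)
    lb = max(0.15, GOALS_BASE / 2 - diff / 2)
    ga, gb = sample_poisson(la, rng), sample_poisson(lb, rng)
    if ga != gb:
        return a if ga > gb else b
    # Shootout: coin flip nudged by pen_edge toward the higher-rated side.
    if elo[a] > elo[b]:
        p_a = 0.5 + pen_edge
    elif elo[a] < elo[b]:
        p_a = 0.5 - pen_edge
    else:
        p_a = 0.5
    return a if rng.random() < p_a else b


def simulate_knockout(teams, elo, rng):
    """Single elimination: adjacent teams meet until one remains."""
    alive = list(teams)
    while len(alive) > 1:
        alive = [
            knockout_winner(alive[i], alive[i + 1], elo, rng)
            for i in range(0, len(alive), 2)
        ]
    return alive[0]


ELO = {"Spain": 2155, "Argentina": 2113, "France": 2062, "England": 2020}
rng = random.Random(2026)  # fixed seed so reruns give the same tallies
N = 10_000
title = {t: 0 for t in ELO}
for _ in range(N):
    title[simulate_knockout(list(ELO), ELO, rng)] += 1
probs = {t: title[t] / N for t in ELO}
print(probs)
```

The full tournament version would run the twelve groups first and feed the 32 qualifiers into the same elimination loop.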

Why 10,000? Because a simulated probability is itself an estimate with sampling error. A title probability p estimated from N independent tournaments has a standard error of sqrt(p(1-p)/N). For a 15% favorite at N = 10,000, that’s about 0.36 percentage points — tight enough that the ranking is stable and the top numbers won’t wobble between runs. Drop to N = 500 and the standard error quadruples-and-then-some to ~1.6 points, enough to reshuffle the midfield. Vectorizing the simulation (drawing all N tournaments as array operations rather than a Python loop) makes 20,000+ runs essentially free.
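
The standard-error figures quoted above are easy to reproduce. A minimal check, with `title_prob_se` as a name chosen here for illustration:

```python
import math


def title_prob_se(p: float, n: int) -> float:
    """Standard error of a probability estimated from n independent runs."""
    return math.sqrt(p * (1.0 - p) / n)


for n in (500, 10_000, 20_000):
    se = title_prob_se(0.15, n)
    print(f"N = {n:>6,}: standard error = {100 * se:.2f} percentage points")
```

Since the error shrinks with the square root of N, going from 500 to 10,000 runs (20 times more) cuts it by a factor of about 4.5, and doubling again to 20,000 only buys a further factor of about 1.4.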

What the model says

| Team | Win probability |
| --- | --- |
| Spain | 16.0% |
| Argentina | 11.9% |
| France | 7.9% |
| England | 7.0% |
| Brazil | 5.4% |
| Netherlands | 4.7% |
| Portugal | 4.3% |
| Germany | 3.7% |

_Table 1: Possible World Cup Outcomes, according to model. Source: author._

Two things stand out. First, the favorite sits around _15%, not 50%._ Even the best team in the world is far more likely _not_ to win a 48-team tournament than to win it, a direct consequence of Poisson variance in a low-scoring sport compounded over three group games and five win-or-go-home knockout matches.

Second, these numbers land remarkably close to the forecasts published by far more elaborate statistical models, the kind built on years of match data and dozens of features. That’s reassuring: a transparent Elo-plus-Poisson pipeline recovers most of what a heavyweight forecasting system produces, because both are ultimately doing the same thing: mapping team strength onto outcome probabilities.

What it gets right, and what it leaves out

The model is honest about being simple, and each simplification is a labeled dial you can turn:

  • Neutral venue. Every match is treated as neutral; the hosts (USA, Mexico, Canada) get no boost. Adding a home-advantage term (~+50–100 Elo, historically worth a third of a goal) is a one-line change.
  • Static ratings. Elo is frozen at kickoff; the model doesn’t update as the tournament unfolds. Re-rating after each round would sharpen the later-round forecasts.
  • Independent Poisson goals. No Dixon–Coles low-score correction, no explicit draw inflation.
  • Seeded bracket. I use a seeded knockout rather than FIFA’s exact Round-of-32 map. For title odds of the top teams this barely moves the needle, but it matters for specific paths.

Each of those is the topic of a chapter in the book I coauthored, _Soccer Analytics with Machine Learning_ (O’Reilly, 2026): the Poisson goal model and its extensions in Chapter 6, team ratings in Chapter 8, and turning probabilities into betting decisions in Chapter 9. This article is the toy version of that pipeline — and a toy you can actually run in an afternoon.
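
As an illustration of how small these dials are, the neutral-venue assumption from the list above could be relaxed roughly like this. `HOME_ADV_ELO = 75` is simply the midpoint of the ~50–100 Elo range quoted there, `lambdas_with_home` is a hypothetical name, and the demo ratings are made up. The sketch also gives hosts the bonus in every match, regardless of which of the three countries stages it, which is a simplification.

```python
GOALS_BASE = 2.7
GOALS_PER_400_ELO = 1.0
HOME_ADV_ELO = 75.0  # assumption: midpoint of the ~50-100 Elo range
HOSTS = {"USA", "Mexico", "Canada"}


def lambdas_with_home(team_a, team_b, elo):
    """Like lambdas(), but host nations get an Elo bonus before the split."""
    r_a = elo[team_a] + (HOME_ADV_ELO if team_a in HOSTS else 0.0)
    r_b = elo[team_b] + (HOME_ADV_ELO if team_b in HOSTS else 0.0)
    diff = (r_a - r_b) / 400.0 * GOALS_PER_400_ELO
    la = max(0.15, GOALS_BASE / 2 + diff / 2)
    lb = max(0.15, GOALS_BASE / 2 - diff / 2)
    return la, lb


# Made-up, equal ratings so the only difference is the host bonus.
demo_elo = {"USA": 1800, "Visitor": 1800, "Other": 1800}
la, lb = lambdas_with_home("USA", "Visitor", demo_elo)
print(f"Host: {la:.4f} expected goals, visitor: {lb:.4f}")
print(f"Goal supremacy from the bonus alone: {la - lb:.4f}")
```

Under the 400-points-per-goal mapping, a 75-point bonus is worth just under a fifth of a goal of supremacy per match.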

Try it yourself

Many more examples can be found in the book’s GitHub repository — clone it, drop in today’s Elo ratings, and you have your own World Cup forecast faster than you can prompt Claude.

In another article, you’ll see how I rebuild this structure with _eleven_ different models, fit it on real match data, and watch FIFA crown four different champions.

For now, my model says Spain. The tournament starts June 11. We’ll find out together.

_Ari Joury is a co-author of_ Soccer Analytics with Machine Learning _(O’Reilly, 2026)._
