FIFA World Cup 2018: A not-so-artificially-intelligent predictor

JUN 20, 2018:
It's been close to a week since FIFA World Cup 2018 started. We have already seen almost as many upsets as there were fancy advanced models pre-tournament trying to predict the outcomes. 
There has been a flurry of articles and scholarly papers using Artificial Intelligence and Machine Learning to predict the tournament results this year. Included in that list are the usual suspects such as FiveThirtyEight and bookmakers, as well as unlikely participants such as Goldman Sachs and Cornell University (maybe not that unlikely).

Inspired by these articles, some heated arguments, a few cups of coffee, and with my trusted Microsoft Excel, I set about creating my own not-so-intelligent predictor. It's a fairly simple model that runs Monte Carlo simulations, and uses some hard-coded inputs on pre-tournament form, such as:
  • FIFA Ranking Points
  • Football ELO Ratings
  • Goals Scored, Goals Conceded and Undefeated streaks since World Cup 2014
  • Number of Ballon d'Or winners in the team
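For anyone who prefers code to spreadsheets, here is a minimal Python sketch of the Monte Carlo idea. It is not my Excel model: it assumes the inputs above have already been blended into a single Elo-like strength score per team, and the team names, the numbers in `STRENGTH`, and the function names are all hypothetical placeholders.

```python
import random
from collections import Counter

# Hypothetical strength scores on an Elo-like scale.
# Placeholders for illustration only, not the inputs used in the post.
STRENGTH = {"Team A": 2000, "Team B": 1900, "Team C": 1800, "Team D": 1700}


def win_probability(strength_a, strength_b):
    """Elo-style logistic curve: chance that side A beats side B."""
    return 1.0 / (1.0 + 10 ** ((strength_b - strength_a) / 400.0))


def play_knockout(teams, rng):
    """Play one single-elimination bracket and return the champion.

    Assumes the number of teams is a power of two and that adjacent
    entries in the list meet each other in the first round.
    """
    alive = list(teams)
    while len(alive) > 1:
        next_round = []
        for i in range(0, len(alive), 2):
            a, b = alive[i], alive[i + 1]
            p = win_probability(STRENGTH[a], STRENGTH[b])
            next_round.append(a if rng.random() < p else b)
        alive = next_round
    return alive[0]


def title_odds(teams, runs=10000, seed=2018):
    """Repeat the bracket many times and count how often each team wins."""
    rng = random.Random(seed)
    wins = Counter(play_knockout(teams, rng) for _ in range(runs))
    return {team: wins[team] / runs for team in teams}


odds = title_odds(["Team A", "Team D", "Team B", "Team C"])
for team, share in sorted(odds.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {share:.1%}")
```

Coding in a real result simply means replacing the random draw for that match with the known winner before re-running the simulation.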

I started on this passion project on Monday night and ran some vanilla simulations to see how different the results are from the other advanced models (spoiler alert: they are not really). And maybe not everything in the world needs a machine learning tadka.

If I had been living under a rock over the past week and didn't know the results of the matches already played, based on the model I would have said Brazil has the best chance to win the World Cup by defeating Germany in the Final. Overall probability for Brazil is 12% (which is significantly higher than the 3% chance they would have had if all matches were decided by coin tosses).
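The coin-toss baseline is just arithmetic. Assuming every match and every tie-breaker is a fair coin toss, all 32 teams are interchangeable, so each has a 1-in-32 chance. The variable names below are mine, purely for illustration.

```python
from fractions import Fraction

TEAMS = 32  # size of the World Cup 2018 field

# With fair coin tosses everywhere, every team is equally likely to win.
coin_toss_chance = Fraction(1, TEAMS)

# Brazil's pre-tournament title probability from the model, as quoted above.
model_chance = Fraction(12, 100)

# How many times more likely than the coin-toss baseline?
ratio = model_chance / coin_toss_chance

print(f"coin toss: {float(coin_toss_chance):.1%}")
print(f"model vs coin toss: {float(ratio):.2f}x")
```

So the model rates Brazil at a little under four times the chance of a team picked at random.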

The knockout stage would have probably looked like this:

The really interesting part comes in when you start coding in the results as the matches happen, and see how the draw changes. After Match Day 01, which ended with the Poland vs Senegal match on June 19th, the draw looks quite different for two reasons.

1. Germany's loss to Mexico means that they are more likely to finish second in their group, and that leads to a R16 showdown with Brazil (the Final comes early, yay).
2. Argentina's draw with Iceland, coupled with Croatia's win over Nigeria means Argentina are also likely to finish second in their group and end up facing France in R16. 

Both these results combined mean the bottom half of the draw may completely open up. There is a not-so-unrealistic path to the Semis for England, Switzerland, Mexico and Senegal. Brazil is still most likely to win, but their probability has come down to 10% because they now have a more difficult match-up with Germany in R16.



As we head into the final round of group games, here is what the knock-out grid looks like. I have made an update to the model to take into account tournament form. This is done by maintaining a live track of ELO-like ratings based on match results.
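A live ELO-like track can be sketched with the textbook Elo update. This is a minimal Python illustration rather than the exact formula in my spreadsheet: the K-factor of 40 is an assumed value, goal difference is ignored, and the function names are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Expected result for side A (1 = win, 0.5 = draw, 0 = loss)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=40.0):
    """Return both ratings after one match.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k is an assumed K-factor; a larger k makes form move faster.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two evenly rated sides: the winner takes half of k from the loser.
print(update_ratings(1800, 1800, 1.0))

# An upset moves more points than an expected result.
print(update_ratings(1700, 1900, 1.0))
```

Because the points a team gains depend on how unexpected the result was, a win over a strong opponent moves tournament form far more than a routine win.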

Some of the groups are very close to call. 
  • In group D, Croatia have qualified, but are not guaranteed top spot. Argentina has a 50% chance of qualifying, followed by Nigeria at 35% and Iceland at 15%.
  • In group F, anyone can qualify. Mexico have the best odds at 80%, followed by Germany at 65%, Sweden at 45% and Korea at 10%. Mexico - Germany is the most likely 1-2.
  • Group G is practically a coin toss in who finishes first. The model gives a slight edge to England.
    As we had already seen, the bottom half of the draw is much more favorable. It will be interesting to see if the two teams actively try to finish second.
    Fun fact: If the Belgium-England game ends in a draw, the winner will be based on Fair Play points. England currently have one fewer yellow card than Belgium.
  • In Group H too, anyone other than Poland can qualify. Japan has the best odds of qualifying, followed by Senegal and Colombia.
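The Fair Play tie-break from the Group G fun fact can be sketched in a few lines of Python. The deductions are the ones in the 2018 tournament regulations as I understand them; the card counts are hypothetical and only respect the "one fewer yellow card" detail, and the function names are my own.

```python
# Deductions per card type under the 2018 fair-play tie-breaker
# (as I understand the regulations).
DEDUCTIONS = {
    "yellow": -1,
    "second_yellow": -3,          # indirect red card
    "direct_red": -4,
    "yellow_and_direct_red": -5,
}


def fair_play_points(cards):
    """Total fair-play points for one team (zero is best)."""
    return sum(DEDUCTIONS[kind] * count for kind, count in cards.items())


def fair_play_winner(cards_by_team):
    """Team with the best fair-play score, or None if still level."""
    scores = {team: fair_play_points(c) for team, c in cards_by_team.items()}
    best = max(scores.values())
    leaders = [team for team, score in scores.items() if score == best]
    return leaders[0] if len(leaders) == 1 else None


# Hypothetical card counts: England one yellow card fewer than Belgium.
cards = {"England": {"yellow": 2}, "Belgium": {"yellow": 3}}
winner = fair_play_winner(cards)
print(winner)
```

If the two sides were still level after this, the function returns `None`, which stands in for the drawing of lots.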

Overall, Brazil is still the favorite to win. Depending on how the 1/2 rank plays out across the groups, any of Portugal, Spain, England or Belgium can make the final.
Assuming we flip the order of group winners in Groups B and G, then the draw looks something like this:

I'll update the odds at the end of Match Day 03 at which time the R16 matches will be locked in.


A super late post again. Part of the tardiness was caused by the results that I was getting. Based on some of the results in Round 3, the model ends up picking Belgium as the eventual winner. So, I was making sure there were no errors. I did find one (I had coded the result of the BRA-SUI match incorrectly), but it did not change the model results.

Numerically, though, I can see why Belgium are favourites - they beat England in Groups, and gained a lot of ELO points. As a result their tournament form is assessed to be higher than that of Brazil - which isn't that much of a surprise. But even if I were a betting man (and I am not), I would not bet all my money on Belgium winning it.

This is what the model churned out in the end. Other than the Belgium predictions, I am fairly happy with the results so far. This is in spite of the fact that two out of four predictions have already been proven wrong. But Spain really shouldn't have lost to Russia. And I really believe Mexico have a better chance of beating Brazil than the 18% that FiveThirtyEight has given them.

UPDATE JULY 5th 2018: END OF R16

Finally a post on time, before the first QF kicks off tomorrow between Uruguay and France.

Spain's upset loss against Russia means a lot of churn in the bottom half of the table, leading to Croatia making the final. In the top half, a close game for Belgium and a comfortable win for Brazil mean things are a little bit closer than the last run of the model. Belgium are still picked to go through and win it all, if only by a whisker against Brazil.
Overall Belgium have the highest probability to win the cup, followed by Brazil, Croatia, England, and France - in that order.

