
FIFA World Cup 2018: A not-so-artificially-intelligent predictor

JUN 20, 2018:
It's been close to a week since FIFA World Cup 2018 started. We have already seen almost as many upsets as there were fancy advanced models pre-tournament trying to predict the outcomes. 
There has been a flurry of articles and scholarly papers using Artificial Intelligence and Machine Learning to predict the tournament results this year. Included in that list are the usual suspects such as FiveThirtyEight and bookmakers, as well as unlikely participants such as Goldman Sachs and Cornell University (maybe not that unlikely).

Inspired by these articles, some heated arguments, a few cups of coffee, and with my trusted Microsoft Excel, I set about creating my own not-so-intelligent predictor. It's a fairly simple model that runs Monte Carlo simulations and uses some hard-coded inputs on pre-tournament form, such as:
  • FIFA Ranking Points
  • Football ELO Ratings
  • Goals Scored, Goals Conceded and Undefeated streaks since World Cup 2014
  • Number of Ballon d'Or winners in the team
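For anyone who would rather see the idea in code than in Excel, here is a minimal Python sketch of a Monte Carlo run over a knockout bracket. It is not the actual spreadsheet: the ratings are made up, a single rating per team stands in for the blend of inputs listed above, and the Elo-style win-probability formula and all function names are assumptions for illustration.

```python
import random
from collections import Counter

def win_probability(rating_a, rating_b, scale=400.0):
    """Elo-style chance that team A beats team B (assumed formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def simulate_knockout(teams, ratings, rng):
    """Play one single-elimination bracket; teams are paired in list order."""
    alive = list(teams)
    while len(alive) > 1:
        next_round = []
        for i in range(0, len(alive), 2):
            a, b = alive[i], alive[i + 1]
            p = win_probability(ratings[a], ratings[b])
            next_round.append(a if rng.random() < p else b)
        alive = next_round
    return alive[0]

def title_odds(teams, ratings, runs=10000, seed=0):
    """Repeat the bracket many times and count how often each team wins."""
    rng = random.Random(seed)
    wins = Counter(simulate_knockout(teams, ratings, rng) for _ in range(runs))
    return {t: wins[t] / runs for t in teams}

# Made-up ratings for four hypothetical teams
ratings = {"A": 2100, "B": 2000, "C": 1900, "D": 1800}
odds = title_odds(list(ratings), ratings)
```

Each run picks winners at random, weighted by rating, and the share of runs a team wins becomes its estimated probability of lifting the cup.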

I started on this passion project on Monday night and ran some vanilla simulations to see how different the results are from the other advanced models (spoiler alert: they are not really). And maybe not everything in the world needs a machine learning tadka.

If I had been living under a rock over the past week and didn't know the results of the matches already played, based on the model I would have said Brazil has the best chance to win the World Cup by defeating Germany in the Final. The overall probability for Brazil is 12% (which is significantly higher than the roughly 3% chance, 1 in 32, they would have had if all matches were decided by coin tosses).

The knockout stage would have probably looked like this:

The really interesting part comes in when you start coding in the results as the matches happen, and see how the draw changes. After Match Day 01, which ended with the Poland vs Senegal match on June 19th, the draw looks quite different for two reasons.

1. Germany's loss to Mexico means that they are more likely to finish second in their group, and that leads to an R16 showdown with Brazil (the Final comes early, yay).
2. Argentina's draw with Iceland, coupled with Croatia's win over Nigeria means Argentina are also likely to finish second in their group and end up facing France in R16. 

Both these results combined mean the bottom half of the draw may completely open up. There is a not-so-unrealistic path to the Semis for England, Switzerland, Mexico and Senegal. Brazil is still most likely to win, but their probability has come down to 10% because they now have a more difficult match-up with Germany in R16.
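Coding in results as they happen can be sketched like this in Python: a match that has already been played always returns its real outcome, and only the remaining matches are drawn at random. The outcome probabilities and the function name are hypothetical; the two recorded results are the ones mentioned above.

```python
import random

def match_outcome(home, away, probs, known_results, rng):
    """Return 'H', 'D' or 'A'; a recorded result always overrides the random draw."""
    if (home, away) in known_results:
        return known_results[(home, away)]
    return rng.choices(["H", "D", "A"], weights=probs)[0]

# Results already in the books ('A' = second-named team won, 'D' = draw)
known_results = {("GER", "MEX"): "A", ("ARG", "ISL"): "D"}

rng = random.Random(42)
probs = (0.6, 0.25, 0.15)  # made-up home win / draw / away win weights

# A played match is the same in every simulation run...
played = [match_outcome("GER", "MEX", probs, known_results, rng) for _ in range(1000)]
# ...while an unplayed one still varies from run to run
unplayed = [match_outcome("GER", "SWE", probs, known_results, rng) for _ in range(1000)]
```

Because the played matches no longer vary, every simulated group table starts from the real points, which is what shifts the likely knockout pairings.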



As we head into the final round of group games, here is what the knock-out grid looks like. I have made an update to the model to take into account tournament form. This is done by maintaining a live track of ELO-like ratings based on match results.
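A live rating track of this kind could look roughly like the sketch below, which uses the textbook Elo update. The K-factor of 40, the 400-point scale and the starting ratings are assumptions, not the values in my spreadsheet.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Expected result for team A on a 0..1 scale (standard Elo formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def update_ratings(rating_a, rating_b, score_a, k=40.0):
    """Return both new ratings; score_a is 1 for a win, 0.5 for a draw, 0 for a loss."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Hypothetical example: a 2000-rated favourite loses to an 1850-rated underdog
new_fav, new_dog = update_ratings(2000.0, 1850.0, 0.0)
```

The bigger the surprise, the bigger the swing: an underdog win moves more points than a favourite winning as expected.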

Some of the groups are very close to call. 
  • In group D, Croatia have qualified, but are not guaranteed top spot. Argentina has a 50% chance of qualifying, followed by Nigeria at 35% and Iceland at 15%.
  • In group F, anyone can qualify. Mexico have the best odds at 80%, followed by Germany at 65%, Sweden at 45% and Korea at 10%. Mexico - Germany is the most likely 1-2.
  • Group G is practically a coin toss in who finishes first. The model gives a slight edge to England.
    As we had already seen, the bottom half of the draw is much more favorable. It will be interesting to see if the two teams actively try to finish second.
    Fun fact: If the Belgium-England game ends in a draw, the winner will be based on Fair Play points. England currently have one fewer yellow card than Belgium.
  • In Group H too, anyone other than Poland can qualify. Japan has the best odds of qualifying, followed by Senegal and Colombia.

Overall, Brazil is still the favorite to win. Depending on how the 1/2 rank plays out across the groups, any of Portugal, Spain, England or Belgium can make the finals.
Assuming we flip the order of group winners in Groups B and G, then the draw looks something like this:

I'll update the odds at the end of Match Day 03 at which time the R16 matches will be locked in.


A super late post again. Part of the tardiness was caused by the results that I was getting. Based on some of the results in Round 3, the model ends up picking Belgium as the eventual winner. So, I was making sure there were no errors. I did find one (I had coded the result of the BRA-SUI match incorrectly), but it did not change the model results.

Numerically, though, I can see why Belgium are favourites - they beat England in Groups, and gained a lot of ELO points. As a result their tournament form is assessed to be higher than that of Brazil - which isn't that much of a surprise. But even if I were a betting man (and I am not), I would not bet all my money on Belgium winning it.

This is what the model churned out in the end. Other than the Belgium predictions, I am fairly happy with the results so far. This is in spite of the fact that two out of four predictions have already been proven wrong. But Spain really shouldn't have lost to Russia. And I really believe Mexico have a better chance of beating Brazil than the 18% that FiveThirtyEight has given them.

UPDATE JULY 5th 2018: END OF R16

Finally a post on time, before the first QF kicks off tomorrow between Uruguay and France.

Spain's upset loss against Russia means a lot of churn in the bottom half of the table, leading to Croatia making the final. In the top half, a close game for Belgium and a comfortable win for Brazil mean things are a little bit closer than in the last run of the model. Belgium are still picked to go through and win it all, if only by a whisker against Brazil.
Overall Belgium have the highest probability to win the cup, followed by Brazil, Croatia, England, and France - in that order.

