
FIFA World Cup 2018: A not-so-artificially-intelligent predictor

JUN 20, 2018:
It's been close to a week since FIFA World Cup 2018 started. We have already seen almost as many upsets as there were fancy advanced models pre-tournament trying to predict the outcomes. 
There has been a flurry of articles and scholarly papers using Artificial Intelligence and Machine Learning to predict the tournament results this year. Included in that list are the usual suspects such as FiveThirtyEight and bookmakers, as well as unlikely participants such as Goldman Sachs and Cornell University (maybe not that unlikely).

Inspired by these articles, some heated arguments, a few cups of coffee, and with my trusted Microsoft Excel, I set about creating my own not-so-intelligent predictor. It's a fairly simple model that runs Monte Carlo simulations, and uses some hard-coded inputs on pre-tournament form, such as:
  • FIFA Ranking Points
  • Football ELO Ratings
  • Goals Scored, Goals Conceded and Undefeated streaks since World Cup 2014
  • Number of Ballon d'Or winners in the team
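For readers who prefer code to spreadsheets, here is a minimal Python sketch of the Monte Carlo idea. It is not the Excel model itself: it assumes each team's inputs have already been boiled down to a single strength number, borrows the standard Elo logistic formula for the win probability, and ignores draws and the group stage. The team names and ratings are made up purely for illustration.

```python
import random

def win_probability(rating_a, rating_b):
    """Elo-style chance that team A beats team B (draws ignored; an assumption)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def simulate_knockout(teams, ratings, rng):
    """Play one single-elimination bracket; adjacent teams in the list meet first."""
    alive = list(teams)
    while len(alive) > 1:
        next_round = []
        for i in range(0, len(alive), 2):
            a, b = alive[i], alive[i + 1]
            a_wins = rng.random() < win_probability(ratings[a], ratings[b])
            next_round.append(a if a_wins else b)
        alive = next_round
    return alive[0]

def title_odds(teams, ratings, runs=10000, seed=0):
    """Repeat the bracket many times and report how often each team wins it."""
    rng = random.Random(seed)
    wins = {team: 0 for team in teams}
    for _ in range(runs):
        wins[simulate_knockout(teams, ratings, rng)] += 1
    return {team: wins[team] / runs for team in teams}

# Hypothetical four-team bracket with invented strength numbers.
teams = ["A", "B", "C", "D"]
ratings = {"A": 1600, "B": 1500, "C": 1500, "D": 1400}
odds = title_odds(teams, ratings)
```

Coding in a real result then just means replacing the random draw for that match with the known winner and rerunning the simulations.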

I started on this passion project on Monday night and ran some vanilla simulations to see how different the results are from the other advanced models (spoiler alert: they are not really). And maybe not everything in the world needs a machine learning tadka.

RESULTS (PRE TOURNAMENT PREDICTIONS)
If I had been living under a rock over the past week and didn't know the results of the matches already played, based on the model I would have said Brazil has the best chance to win the World Cup by defeating Germany in the Final. The overall probability for Brazil is 12% (which is significantly higher than the roughly 3% chance they would have had if all matches were decided by coin tosses).
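The coin-toss baseline is easy to check by hand. A quick sketch, assuming a 32-team field where every team is equally likely to get out of its group and every knockout match is a 50/50:

```python
from fractions import Fraction

TEAMS = 32

# By symmetry, every team has the same chance of lifting the cup.
coin_toss_title_chance = Fraction(1, TEAMS)

# Same number the long way: 2 of 4 teams leave each group,
# then four knockout wins (R16, QF, SF, Final) at 1/2 each.
advance_from_group = Fraction(2, 4)
win_four_knockouts = Fraction(1, 2) ** 4
long_way = advance_from_group * win_four_knockouts

# The 12% figure quoted above for Brazil, compared with the baseline.
model_chance = Fraction(12, 100)
ratio = model_chance / coin_toss_title_chance

print(float(coin_toss_title_chance))  # 0.03125
print(float(ratio))                   # 3.84
```

So the 12% for Brazil is a little under four times what pure luck would give them.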

The knockout stage would have probably looked like this:


THE FUN STUFF: MATCH DAY 01
The really interesting part comes in when you start coding in the results as the matches happen, and see how the draw changes. After Match Day 01, which ended with the Poland vs Senegal match on June 19th, the draw looks quite different for two reasons.


1. Germany's loss to Mexico means that they are more likely to finish second in their group, and that leads to a R16 showdown with Brazil (the Final comes early, yay). 
2. Argentina's draw with Iceland, coupled with Croatia's win over Nigeria means Argentina are also likely to finish second in their group and end up facing France in R16. 

Both these results combined mean the bottom half of the draw may completely open up. There is a not-so-unrealistic path to the Semis for England, Switzerland, Mexico and Senegal. Brazil is still most likely to win, but their probability has come down to 10% because they now have a more difficult match-up with Germany in R16.


-----

UPDATE JUNE 25 2018: END OF MATCH DAY 02

As we head into the final round of group games, here is what the knock-out grid looks like. I have made an update to the model to take into account tournament form. This is done by maintaining a live track of ELO-like ratings based on match results.
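An ELO-like live rating can be sketched in a few lines of Python. This uses the textbook Elo update and is not necessarily what my spreadsheet does cell for cell; the K-factor of 20 and the starting ratings are illustrative assumptions.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_ratings(rating_a, rating_b, score_a, k=20.0):
    """Return both new ratings after a match.

    score_a is 1 for an A win, 0.5 for a draw, 0 for an A loss.
    k (assumed value) controls how strongly one result moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Two hypothetical, evenly rated teams; A wins.
new_a, new_b = update_ratings(1500.0, 1500.0, score_a=1.0)
print(new_a, new_b)  # 1510.0 1490.0
```

Beating a higher-rated side moves the numbers more than beating a weaker one, so a team's tournament form can swing quickly after an upset.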


Some of the groups are too close to call. 
  • In group D, Croatia have qualified, but are not guaranteed top spot. Argentina has a 50% chance of qualifying, followed by Nigeria at 35% and Iceland at 15%.
  • In group F, anyone can qualify. Mexico have the best odds at 80%, followed by Germany at 65%, Sweden at 45% and Korea at 10%. Mexico - Germany is the most likely 1-2.
  • Group G is practically a coin toss in who finishes first. The model gives a slight edge to England.
    As we had already seen, the bottom half of the draw is much more favorable. It will be interesting to see if the two teams actively try to finish second.
    Fun fact: If the Belgium-England game ends in a draw, the winner will be based on Fair Play points. England currently have one fewer yellow card than Belgium.
  • In Group H too, anyone other than Poland can qualify. Japan has the best odds of qualifying, followed by Senegal and Colombia.

Overall, Brazil is still the favorite to win. Depending on how the 1/2 rank plays out across the groups, any of Portugal, Spain, England or Belgium can make the finals.
Assuming we flip the order of group winners in Groups B and G, then the draw looks something like this:


I'll update the odds at the end of Match Day 03, at which time the R16 matches will be locked in.

_______
UPDATE JULY 2nd 2018: END OF MATCH DAY 03

A super late post again. A part of the tardiness was caused by the results that I was getting. Based on some of the results in Round 3, the model ends up picking Belgium as the eventual winner. So, I was making sure there were no errors. I did find one (I had coded the result of BRA-SUI match incorrectly), but it did not change model results.

Numerically, though, I can see why Belgium are favourites - they beat England in Groups, and gained a lot of ELO points. As a result their tournament form is assessed to be higher than that of Brazil - which isn't that much of a surprise. But even if I were a betting man (and I am not), I would not bet all my money on Belgium winning it.


This is what the model churned out in the end. Other than the Belgium predictions, I am fairly happy with the results so far. This is in spite of the fact that two out of four predictions have already been proven wrong. But Spain really shouldn't have lost to Russia. And I really believe Mexico have a better chance of beating Brazil than the 18% that FiveThirtyEight has given them.

___
UPDATE JULY 5th 2018: END OF R16

Finally a post on time, before the first QF kicks off tomorrow between Uruguay and France.

Spain's upset loss against Russia means a lot of churn in the bottom half of the draw, leading to Croatia making the final. In the top half, a close game for Belgium and a comfortable win for Brazil mean things are a little bit closer than in the last run of the model. Belgium are still picked to go through and win it all, if only by a whisker against Brazil.
Overall Belgium have the highest probability to win the cup, followed by Brazil, Croatia, England, and France - in that order.





