FIFA World Cup Analysis From 1930 To 2018

INTRODUCTION

My team has been able to keep up with our weekly tasks. It has really been an amazing journey, learning to make information easy to understand at a glance.

The task for the week was to analyse the football World Cup using data from 1930 to 2018. The World Cups dataset holds information about every World Cup in history. The World Cup is a global football competition contested every four years by the football-playing nations of the world. We were to scrape, clean and visualize the data using Power BI. During this process we used both Microsoft Excel and Power BI.

The task took us through the following processes:

  • Data scraping
  • Data cleaning
  • Data visualisation

Data Scraping: We sourced our data from kaggle.com in CSV format. While sourcing the data, we focused on information that would meet the task requirements. We found three datasets: one had the World Cup data from 1930 to 2014, another had the 2018 World Cup data, and the third was a summary of the World Cup data from 1930 to 2018.

20220626_141503.jpg

Data Cleaning: We imported the datasets that met the task requirements into Microsoft Excel for cleaning. For the World Cup dataset from 1930 to 2014, we changed the data type of each column to fit its content. In the stage column, we replaced stages 1 to 6 with Groups A to F to tally with the 2018 dataset.
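We did this relabelling by hand in Excel. For readers who prefer code, the same idea can be sketched in Python. This is only an illustration: the `Stage` column name, the "Stage 1" label format and the sample rows are assumptions, not the exact values in the Kaggle file.

```python
# Hypothetical labels: the real dataset's wording may differ.
def relabel_stage(stage: str) -> str:
    """Map 'Stage 1'..'Stage 6' to 'Group A'..'Group F'; leave other stages unchanged."""
    prefix, _, number = stage.partition(" ")
    if prefix == "Stage" and number.isdigit() and 1 <= int(number) <= 6:
        return "Group " + "ABCDEF"[int(number) - 1]
    return stage


# Made-up sample rows, for illustration only.
rows = [{"Stage": "Stage 1"}, {"Stage": "Stage 6"}, {"Stage": "Final"}]
cleaned = [{**row, "Stage": relabel_stage(row["Stage"])} for row in rows]
```

Knockout stages such as "Final" pass through untouched, so only the group-stage labels change.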

We corrected some names in the stadium and city columns. We deleted the win condition column because it had many empty cells. We also deleted the Halftime goals column and the Round ID column so that the columns matched the 2018 dataset exactly and the two could be appended.

For the 2018 World Cup dataset, we also changed the data types and split the datetime column into a date column and a time column, as in the main dataset. In the World Cup summary dataset, we changed the data types and filled the missing cells with the correct content. Finally, we loaded the files into Power BI to join the 1930 to 2014 dataset with the 2018 dataset, since they now had the same columns. We joined them using the Append Queries as New option. We then went further and deleted the referee and assistant columns.
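The column clean-up and append steps were done in Excel and Power BI, but they can be sketched in Python to show the logic. The column names, the datetime format and the two sample rows below are assumptions made for the sketch; the real files may use different names and formats.

```python
from datetime import datetime

# Hypothetical column names standing in for the columns we deleted.
COLUMNS_TO_DROP = {"Win conditions", "Half-time goals", "RoundID"}


def drop_columns(row: dict) -> dict:
    """Return a copy of the row without the unwanted columns."""
    return {key: value for key, value in row.items() if key not in COLUMNS_TO_DROP}


def split_datetime(row: dict, fmt: str = "%Y-%m-%d %H:%M") -> dict:
    """Replace a combined 'Datetime' field with separate 'Date' and 'Time' fields."""
    out = dict(row)
    parsed = datetime.strptime(out.pop("Datetime"), fmt)
    out["Date"] = parsed.date().isoformat()
    out["Time"] = parsed.strftime("%H:%M")
    return out


# Illustrative sample rows only, not the full datasets.
matches_1930_2014 = [
    {"Year": 1930, "Date": "1930-07-13", "Time": "15:00",
     "Home Team": "France", "Away Team": "Mexico",
     "RoundID": 201, "Win conditions": ""},
]
matches_2018 = [
    {"Year": 2018, "Datetime": "2018-06-14 18:00",
     "Home Team": "Russia", "Away Team": "Saudi Arabia"},
]

# Once both tables share the same columns, appending is simple concatenation.
combined = ([drop_columns(r) for r in matches_1930_2014]
            + [split_datetime(r) for r in matches_2018])
```

The key point is the same as in Power BI: an append only works cleanly when both tables end up with the same set of columns.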

IMG-20220626-WA0000.jpg

Data Visualization: After the transformation, we selected Close & Apply to load the cleaned data. We then created our visuals in the report area.

The visuals showed the following:

  • Number of World Cup titles won
  • Which countries have won the cup
  • Stadium with the highest average attendance
  • Home team goals and away team goals per World Cup
  • Attendance, matches played, qualified teams and goals scored per World Cup
  • Matches with the highest attendance, by country
  • Number of goals scored by country
  • Match outcome by home and away teams
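Power BI computes these aggregations for us behind each visual. As a rough sketch of what two of them involve (goals scored by country and match outcome), here is a small Python example. The team names, column names and scores are made up for illustration and do not come from the dataset.

```python
from collections import Counter

# Made-up sample rows with hypothetical column names.
matches = [
    {"Home Team": "A", "Away Team": "B", "Home Goals": 2, "Away Goals": 1},
    {"Home Team": "B", "Away Team": "C", "Home Goals": 0, "Away Goals": 0},
    {"Home Team": "C", "Away Team": "A", "Home Goals": 1, "Away Goals": 3},
]

goals_by_country = Counter()
outcomes = Counter()

for match in matches:
    # A country's total counts goals scored both as home team and as away team.
    goals_by_country[match["Home Team"]] += match["Home Goals"]
    goals_by_country[match["Away Team"]] += match["Away Goals"]

    # Classify each match from the home team's point of view.
    if match["Home Goals"] > match["Away Goals"]:
        outcomes["Home win"] += 1
    elif match["Home Goals"] < match["Away Goals"]:
        outcomes["Away win"] += 1
    else:
        outcomes["Draw"] += 1

top_scorer, top_goals = goals_by_country.most_common(1)[0]
```

Counting goals from both the home and away columns matters: summing only one of them would undercount every country.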

FINAL RESULT

IMG_20220626_142310_962.png

Conclusion

  • A total of 21 World Cup titles were won from 1930 to 2018.
  • Brazil has won the most World Cup titles, with 5 trophies.
  • Brazil has also scored the most goals from 1930 to 2018, with 259 goals.
  • The Maracanã stadium has the highest average attendance.
  • Germany has played the most matches and also recorded the highest total attendance from 1930 to 2018.
