Published April 8, 2025 | Version v2
Dataset Open

Simulated Premier League player statistics dataset (2007/08 – 2023/24)

Description

This dataset was generated as part of Practical Exercise 1 of the Data Typology and Lifecycle course, within the UOC's Master's in Data Science.

The project demonstrates an automated scraper, built with Python and Selenium, that extracts historical Premier League player statistics from the 2007/08 season through 2023/24.
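
The season range covered by the dataset can be enumerated programmatically. The following is a minimal sketch, assuming seasons are labelled in the "YYYY/YY" form used in the title of this record; the helper name season_labels is hypothetical and is not taken from the original scraper.

```python
def season_labels(first_start_year, last_start_year):
    """Return labels such as '2007/08' for each season in the inclusive range."""
    labels = []
    for year in range(first_start_year, last_start_year + 1):
        # A season that starts in `year` ends in the following calendar year.
        labels.append(f"{year}/{(year + 1) % 100:02d}")
    return labels


seasons = season_labels(2007, 2023)
print(len(seasons), seasons[0], seasons[-1])  # prints: 17 2007/08 2023/24
```

The range spans 17 seasons, which is a quick sanity check to run against the distinct season values found in the CSV.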

This file contains simulated data.
To avoid potential conflicts with intellectual property or privacy rights, the original personal and sports data has been replaced with automatically generated fictitious values. Although the data is masked, the dataset is intended primarily for private use. The structure, format, and statistical consistency have been maintained for educational and demonstration purposes.

The original scraper dynamically accessed the official Premier League website (https://www.premierleague.com/stats) to extract information such as:

  • Player name
  • Position
  • Nationality
  • Date of birth
  • Height
  • Season
  • Club

Seasonal statistics:

  • Goals
  • Goal assist
  • Clean sheet
  • Appearances
  • Mins played
  • Yellow cards
  • Red cards
  • Total pass

This simulated dataset retains that structure but does not contain any real data.
It can be used as a basis for testing, data analysis training, or documentation of the scraping process.
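
As an illustration of the kind of analysis exercise the file supports, the sketch below sums one statistic per group using only the Python standard library. The column headers "Season" and "Goals" are assumptions based on the field list above; the exact header names in the CSV are not stated in this record and should be checked first. The sample rows are invented for the example and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def total_by_group(csv_file, group_column, value_column):
    """Sum a numeric column per distinct value of another column.

    `csv_file` is any text file object whose first row is a header.
    Rows with a missing or non-numeric value are skipped.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        try:
            group = row[group_column]
            value = float(row[value_column])
        except (KeyError, TypeError, ValueError):
            continue  # column absent, cell empty, or not a number
        totals[group] += value
    return dict(totals)


# Invented sample with assumed header names, for demonstration only.
sample = io.StringIO("Season,Goals\n2007/08,3\n2007/08,5\n2008/09,\n2008/09,2\n")
print(total_by_group(sample, "Season", "Goals"))
# prints: {'2007/08': 8.0, '2008/09': 2.0}
```

To run it on the published file, pass open("simulated_seasons_2007_2024_stats.csv", newline="", encoding="utf-8") instead of the sample; the UTF-8 encoding is also an assumption.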

Files

simulated_seasons_2007_2024_stats.csv (839.4 kB)
md5:8eea844be279cf3a130f1c6da97c8c08
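
The md5 value listed for the file can be used to confirm that a download is intact. The following is a minimal sketch using Python's hashlib; the function names and the assumption that the file sits in the current working directory are illustrative only.

```python
import hashlib

# Checksum published in this record for simulated_seasons_2007_2024_stats.csv.
EXPECTED_MD5 = "8eea844be279cf3a130f1c6da97c8c08"


def md5_hex(binary_file, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a binary file object."""
    digest = hashlib.md5()
    # Read in chunks so the whole file is never held in memory at once.
    for chunk in iter(lambda: binary_file.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected MD5 digest."""
    with open(path, "rb") as handle:
        return md5_hex(handle) == expected
```

Calling matches_published_checksum("simulated_seasons_2007_2024_stats.csv") should return True for an unmodified copy of the file. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.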

Additional details

Software

Programming language
Python console