Published September 17, 2023 | Version v.5
Dataset Open

Animal_Species_Synthetic_Dataset

Description

This dataset contains structured animal observation records designed for machine learning classification tasks. The data includes various animal characteristics along with geographic observation information.

The dataset has been cleaned and processed to improve quality and usability. Preprocessing included handling missing values, removing duplicate entries, correcting inconsistent formats, and standardizing categorical values. Longitude and latitude values were left unmodified. The target variable animal_class contains two categories and is noticeably imbalanced: the majority of records belong to the mammal class.
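The cleaning steps listed above can be illustrated with a small pandas sketch. This is not the pipeline that produced the dataset: every column name except animal_class is hypothetical, and the fill strategy (an "unknown" label for categorical gaps, the median for numeric gaps) is an assumption, since the description does not say how missing values were handled. The coordinate columns are skipped to mirror the statement that longitude and latitude were left unmodified.

```python
import pandas as pd

# Hypothetical coordinate column names; the record only says "longitude and latitude".
COORD_COLS = ["latitude", "longitude"]


def clean_records(df, categorical_cols):
    """Return a cleaned copy of df; coordinate columns are never altered."""
    out = df.copy()
    for col in categorical_cols:
        # Standardize categorical values: trim whitespace, lowercase.
        out[col] = out[col].str.strip().str.lower()
        # Assumed strategy: label missing categories explicitly.
        out[col] = out[col].fillna("unknown")
    numeric_cols = [
        c for c in out.select_dtypes(include="number").columns
        if c not in COORD_COLS
    ]
    for col in numeric_cols:
        # Assumed strategy: fill numeric gaps with the column median.
        out[col] = out[col].fillna(out[col].median())
    # Drop duplicates last, so rows that differ only in formatting collapse.
    return out.drop_duplicates().reset_index(drop=True)


# Toy data for illustration only; not taken from the dataset.
raw = pd.DataFrame({
    "animal_class": ["Mammal", " mammal", "Bird", "mammal "],
    "diet": ["Herbivore", "herbivore", None, "herbivore"],
    "weight_kg": [12.0, 12.0, None, 12.0],
    "latitude": [10.5, 10.5, -3.25, 10.5],
    "longitude": [20.0, 20.0, 40.75, 20.0],
})
cleaned = clean_records(raw, ["animal_class", "diet"])
```

Standardizing before deduplicating matters here: the three mammal rows only become identical once case and whitespace are normalized.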

This dataset can be used for:

  • Animal classification modeling

  • Studying class imbalance effects

  • Data preprocessing and cleaning practice

  • Feature encoding and transformation tasks

The dataset is suitable for supervised learning algorithms such as Logistic Regression, Decision Trees, Random Forest, SVM, and other classification models.
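Because animal_class is imbalanced, it may help to weight classes when training any of these models. The sketch below computes inverse-frequency weights, n_samples / (n_classes * class_count), which is the heuristic scikit-learn documents for class_weight="balanced". The label list is toy data, and "other" is a hypothetical name for the second category, which the description does not name.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Toy labels for illustration; the real class proportions are not stated.
labels = ["mammal"] * 8 + ["other"] * 2
weights = balanced_class_weights(labels)
# The rarer class receives the larger weight.
```

A dictionary of this shape can typically be passed as a class-weight argument to classifiers that support one, or used to derive per-sample weights.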

Files

enhanced_animal_dataset.csv (554.3 kB)
md5:3964c74f5b78be0c3c8e0d922c11d90b
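The md5 checksum listed for the file can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard hashlib module; the function names md5_of_file and verify are illustrative, and the local path passed to them is whatever location the file was saved to.

```python
import hashlib

# Checksum as listed on this record for enhanced_animal_dataset.csv.
EXPECTED_MD5 = "3964c74f5b78be0c3c8e0d922c11d90b"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """True if the file at path matches the expected md5 digest."""
    return md5_of_file(path) == expected
```

After downloading, calling verify("enhanced_animal_dataset.csv") should return True if the file matches the listed checksum. md5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.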