Published April 16, 2022 | Version v2
Dataset Open

Improving The Diagnosis of Thyroid Cancer by Machine Learning and Clinical Data

  • 1. Loyola University Chicago
  • 2. The George Washington University
  • 3. Shengjing Hospital of China Medical University

Description

This repository contains the dataset used in the paper "Improving The Diagnosis of Thyroid Cancer by Machine Learning and Clinical Data", published in Scientific Reports. See the publication for full details. The dataset contains 1232 nodules from 724 patients. Each row represents one nodule, and each column is a variable describing a characteristic of the patient or the nodule. The variables are summarized below.

  • id: the unique identifier of the patient who carries the nodule; a patient with several nodules appears in several rows
  • age: the age of the patient
  • FT3: free triiodothyronine test result
  • FT4: free thyroxine test result
  • TSH: thyroid-stimulating hormone test result
  • TPO: thyroid peroxidase antibody test result
  • TGAb: thyroglobulin antibody test result
  • site: the nodule location, 0: right, 1: left, 2: isthmus
  • echo_pattern: thyroid echogenicity, 0: even, 1: uneven
  • multifocality: if multiple nodules exist in one location, 0: no, 1: yes
  • size: the nodule size in cm
  • shape: the nodule shape, 0: regular, 1: irregular
  • margin: the clarity of nodule margin, 0: clear, 1: unclear
  • calcification: the nodule calcification, 0: absent, 1: present
  • echo_strength: the nodule echogenicity, 0: none, 1: isoechoic, 2: medium-echogenic, 3: hyperechogenic, 4: hypoechogenic
  • blood_flow: the nodule blood flow, 0: normal, 1: enriched
  • composition: the nodule composition, 0: cystic, 1: mixed, 2: solid
  • multilateral: if nodules occur in more than one location, 0: no, 1: yes
  • mal: the nodule malignancy, 0: benign, 1: malignant
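
The coding scheme above can be turned into a small lookup table for reading the file. The following is a minimal sketch, not the authors' code. It assumes that the CSV header uses exactly the variable names listed above and that coded values are stored as numbers; the names `CODEBOOK`, `decode_row`, `read_nodules`, and `summarize` are hypothetical helpers introduced for illustration.

```python
import csv
from collections import Counter

# Code-to-label maps transcribed from the variable list above.
CODEBOOK = {
    "site": {0: "right", 1: "left", 2: "isthmus"},
    "echo_pattern": {0: "even", 1: "uneven"},
    "multifocality": {0: "no", 1: "yes"},
    "shape": {0: "regular", 1: "irregular"},
    "margin": {0: "clear", 1: "unclear"},
    "calcification": {0: "absent", 1: "present"},
    "echo_strength": {
        0: "none",
        1: "isoechoic",
        2: "medium-echogenic",
        3: "hyperechogenic",
        4: "hypoechogenic",
    },
    "blood_flow": {0: "normal", 1: "enriched"},
    "composition": {0: "cystic", 1: "mixed", 2: "solid"},
    "multilateral": {0: "no", 1: "yes"},
    "mal": {0: "benign", 1: "malignant"},
}


def decode_row(row):
    """Return a copy of one CSV row with coded columns replaced by labels."""
    decoded = dict(row)
    for column, labels in CODEBOOK.items():
        value = row.get(column)
        if value in (None, ""):
            continue  # column absent or cell empty: leave untouched
        # Assumption: codes are numeric strings such as "1" or "1.0".
        # An unlisted code raises KeyError so that it is not silently hidden.
        decoded[column] = labels[int(float(value))]
    return decoded


def read_nodules(handle):
    """Read nodules from an open text file object, one dict per nodule."""
    return [decode_row(row) for row in csv.DictReader(handle)]


def summarize(rows):
    """Count nodules, distinct patients, and malignancy labels."""
    return {
        "nodules": len(rows),
        "patients": len({row["id"] for row in rows}),
        "malignancy": Counter(row["mal"] for row in rows),
    }
```

To use it, open the file with `open("thyroid_clean.csv", newline="")` and pass the handle to `read_nodules`. If the file matches the description, `summarize` should report 1232 nodules and 724 patients. Because one patient can contribute several rows, any train/test split is probably best done by `id` rather than by row, so that nodules from the same patient do not land on both sides.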

Files

thyroid_clean.csv

Size: 75.9 kB
md5:f6935d19ddb6eff0d9d9ee950dd09203
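
The MD5 checksum listed for the file can be used to confirm that a download is complete and unmodified. The sketch below uses Python's standard `hashlib` module; the function names `md5_of_file` and `is_intact` are hypothetical, and the local path passed to them is an assumption about where the file was saved.

```python
import hashlib

# Checksum copied from the file listing above.
EXPECTED_MD5 = "f6935d19ddb6eff0d9d9ee950dd09203"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the whole file never has to sit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `is_intact("thyroid_clean.csv")` returns `True` only when the local copy hashes to the listed value. MD5 is adequate here as a check against accidental corruption, not as a security guarantee.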

Additional details

References

  • Xi, M.N., Wang, L., and Yang, C. (2022). Improving the Diagnosis of Thyroid Cancer by Machine Learning and Clinical Data. Scientific Reports 12, 1143