There is a newer version of the record available.

Published March 27, 2022 | Version v1
Dataset Open

Improving The Diagnosis of Thyroid Cancer by Machine Learning and Clinical Data

  • 1. Loyola University Chicago
  • 2. The George Washington University
  • 3. Shengjing Hospital of China Medical University

Description

This repository contains the dataset used in the paper "Improving The Diagnosis of Thyroid Cancer by Machine Learning and Clinical Data". See our preprint for full details. The dataset contains 1232 nodules from 724 patients. Each row represents one nodule, and each column is one variable describing the patient or the nodule. The variables are summarized below.

id: the unique identifier of the patient who carries the nodule (a patient with several nodules appears in several rows)

age: the age of the patient

FT3: free triiodothyronine test result

FT4: free thyroxine test result

TSH: thyroid-stimulating hormone test result

TPO: thyroid peroxidase antibody test result

TGAb: thyroglobulin antibody test result

site: the nodule location, 0: right, 1: left, 2: isthmus

echo_pattern: thyroid echogenicity, 0: even, 1: uneven

multifocality: whether multiple nodules exist in one location, 0: no, 1: yes

size: the nodule size in cm

shape: the nodule shape, 0: regular, 1: irregular

margin: the clarity of nodule margin, 0: clear, 1: unclear

calcification: the nodule calcification, 0: absent, 1: present

echo_strength: the nodule echogenicity, 0: none, 1: isoechoic, 2: medium-echogenic, 3: hyperechogenic, 4: hypoechogenic

blood_flow: the nodule blood flow, 0: normal, 1: enriched

composition: the nodule composition, 0: cystic, 1: mixed, 2: solid

multilateral: whether nodules occur in more than one location, 0: no, 1: yes

mal: the nodule malignancy, 0: benign, 1: malignant
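
The coding scheme above can be written down as a lookup table so that the integer codes are replaced by readable labels when the data is loaded. The following is a minimal sketch, not the authors' code. It assumes the data file is a CSV with a header row whose column names match the variable names listed above; the record does not state the file format, so that is an assumption. The helper names `decode_row`, `load_nodules`, and `nodules_per_patient` are hypothetical.

```python
import csv
from collections import Counter

# Code-to-label maps transcribed from the variable descriptions above.
CODEBOOK = {
    "site": {0: "right", 1: "left", 2: "isthmus"},
    "echo_pattern": {0: "even", 1: "uneven"},
    "multifocality": {0: "no", 1: "yes"},
    "shape": {0: "regular", 1: "irregular"},
    "margin": {0: "clear", 1: "unclear"},
    "calcification": {0: "absent", 1: "present"},
    "echo_strength": {
        0: "none",
        1: "isoechoic",
        2: "medium-echogenic",
        3: "hyperechogenic",
        4: "hypoechogenic",
    },
    "blood_flow": {0: "normal", 1: "enriched"},
    "composition": {0: "cystic", 1: "mixed", 2: "solid"},
    "multilateral": {0: "no", 1: "yes"},
    "mal": {0: "benign", 1: "malignant"},
}


def decode_row(row):
    """Return a copy of one row with coded columns replaced by labels.

    Empty or absent cells are left as they are. A code that is not in
    CODEBOOK raises KeyError, so unexpected values are not hidden.
    """
    decoded = dict(row)
    for column, labels in CODEBOOK.items():
        value = row.get(column)
        if value is None or value.strip() == "":
            continue
        # int(float(...)) accepts both "1" and "1.0" style cells.
        decoded[column] = labels[int(float(value))]
    return decoded


def load_nodules(text_stream):
    """Read nodule rows from an open text stream (assumed CSV with header)."""
    return [decode_row(row) for row in csv.DictReader(text_stream)]


def nodules_per_patient(rows):
    """Count rows (nodules) for each patient id."""
    return Counter(row["id"] for row in rows)


# Example call, with a placeholder path that is not part of this record:
# with open("path/to/downloaded_file.csv", newline="") as handle:
#     rows = load_nodules(handle)
#     counts = nodules_per_patient(rows)
```

Because one patient can contribute several rows, counting rows per `id` is a quick way to check a download against the 1232 nodules and 724 patients stated in the description, and to keep all nodules of one patient together when splitting the data.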

Files

Files (282.1 kB)

md5:ff4556d9947c1f2c4b9676aea6f499a8 (282.1 kB)
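
The MD5 digest listed above can be used to confirm that a downloaded copy is intact. The sketch below uses Python's standard `hashlib` module. The function names and the file path are placeholders, since the file name is not shown in this listing.

```python
import hashlib

# Digest copied from the Files listing of this record.
EXPECTED_MD5 = "ff4556d9947c1f2c4b9676aea6f499a8"


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Check a local file (hypothetical path) against EXPECTED_MD5."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle) == EXPECTED_MD5


# Example call, with a placeholder path that is not part of this record:
# print(matches_published_checksum("path/to/downloaded_file"))
```

MD5 is suitable here only as an integrity check against accidental corruption during download, which is how the record uses it.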