Published October 2, 2020 | Version v1
Dataset Open

Publicly available medical text data with authentic quality

  • 1. University of Tsukuba

Description

This dataset consists of publicly available medical text records (progress notes) written in Japanese.

The notes are pseudo progress notes rather than real patient records, so any researcher can use the dataset without privacy concerns.

License: CC BY-NC 4.0

crowd.zip: 9,756 pseudo progress notes written by crowd workers

crowd_evaluated.zip: 83 pseudo progress notes with authentic quality written by crowd workers

MD.zip: 19 pseudo progress notes written by medical doctors
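As a starting point for working with the archives, the following minimal sketch lists the files contained in a downloaded zip. The internal layout of the archives (file names, folders, text encoding) is not described on this page, so the code makes no assumption about it and only enumerates members. The function name `list_archive_members` and the local path `MD.zip` are illustrative, not part of the dataset.

```python
import zipfile
from pathlib import Path


def list_archive_members(zip_path):
    """Return the names of all regular files (not directories) in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [info.filename for info in archive.infolist() if not info.is_dir()]


if __name__ == "__main__":
    # Hypothetical local path: assumes MD.zip was downloaded to the working directory.
    path = Path("MD.zip")
    if path.exists():
        members = list_archive_members(path)
        print(f"{path.name}: {len(members)} files")
        # Show the first few member names to inspect the archive layout.
        for name in members[:5]:
            print(" ", name)
```

Comparing the number of members against the note counts listed above may help confirm how the notes are stored (for example, one file per note or one file for all notes), but that relationship is not documented here.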

 

Reference:

Kagawa, R., Baba, Y., & Tsurushima, H. (2021, December). A practical and universal framework for generating publicly available medical notes of authentic quality via the power of crowds. In 2021 IEEE International Conference on Big Data (Big Data) (pp. 3534-3543). IEEE.

http://hdl.handle.net/2241/0002002333

Supplemental files for the paper are available at: https://github.com/rinabouk/HMData2021

Files

Files (8.7 MB)

Size      Checksum
8.6 MB    md5:10d6ea03dd033511a080775b7e4d0d33
55.4 kB   md5:dab159079af645e0ed7a0073bdfaadd6
5.4 kB    md5:d2c5d00db8a34be1810c63e6fd45831a
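The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_checksum` are illustrative, and the listing does not state which checksum belongs to which archive, so the pairing in the usage example is an assumption to be checked against the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:10d6ea...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("crowd.zip", "md5:10d6ea03dd033511a080775b7e4d0d33")` would return `True` only if the local file is byte-identical to the archive with that checksum. Pairing `crowd.zip` with this checksum is an assumption based on file size.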
