Published May 4, 2026 | Version v2.1.0
Software
Open
metasyn
Description
Synthetic data is a promising tool for improving the accessibility of datasets that are too sensitive to be shared publicly. To this end, we introduce metasyn, a Python package for generating synthetic data from tabular datasets. Unlike existing synthetic data generation software, metasyn is built on a simple generative model that omits multivariate information. This choice makes the model transparent and auditable, keeps information leakage to a minimum, and allows privacy guarantees to be added through a plug-in system. While the analytical validity of the generated data is thus intentionally limited, its potential uses are broad, including exploratory analyses, code development and testing, and external communication and teaching.
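To make the idea of a generative model that omits multivariate information more concrete, here is a minimal sketch using only the Python standard library. It is not metasyn's API or implementation: the function names `fit_marginals` and `synthesize`, the choice of a normal distribution for numeric columns, and the toy data are all assumptions made for illustration. Each column is summarised on its own and later sampled on its own, so no relationship between columns is stored or reproduced.

```python
import json
import random
import statistics


def fit_marginals(rows):
    """Fit one independent model per column (hypothetical helper, not metasyn's API).

    Numeric columns are summarised by mean and standard deviation,
    all other columns by category counts. Nothing about how columns
    relate to each other is recorded.
    """
    columns = {name: [row[name] for row in rows] for name in rows[0]}
    model = {}
    for name, values in columns.items():
        if all(isinstance(v, (int, float)) for v in values):
            # Assumption: a normal distribution is an adequate summary.
            model[name] = {
                "type": "normal",
                "mean": statistics.mean(values),
                "stdev": statistics.stdev(values),  # needs at least 2 values
            }
        else:
            counts = {}
            for v in values:
                counts[v] = counts.get(v, 0) + 1
            model[name] = {
                "type": "categorical",
                "categories": list(counts),
                "weights": list(counts.values()),
            }
    return model


def synthesize(model, n, seed=None):
    """Draw n synthetic rows, sampling every column independently."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for name, spec in model.items():
            if spec["type"] == "normal":
                row[name] = rng.gauss(spec["mean"], spec["stdev"])
            else:
                row[name] = rng.choices(
                    spec["categories"], weights=spec["weights"]
                )[0]
        rows.append(row)
    return rows


# Toy "sensitive" data, invented for this example.
real = [
    {"age": 34, "city": "Utrecht"},
    {"age": 51, "city": "Utrecht"},
    {"age": 28, "city": "Leiden"},
    {"age": 45, "city": "Utrecht"},
]

model = fit_marginals(real)
synthetic = synthesize(model, 5, seed=1)

# The fitted model is a small, human-readable summary that can be audited.
print(json.dumps(model, indent=2))
```

Because the fitted model is only a handful of per-column parameters, it can be inspected before anything is shared, which is the sense in which such a model is auditable. The cost is the one the description already names: any association between `age` and `city` in the real data is absent from the synthetic rows.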
Files
| Name | Size | Checksum |
|---|---|---|
| sodascience/metasyn-v2.1.0.zip | 3.1 MB | md5:2f2824773b2cb958c647104da512bf9e |
Additional details
Related works
- Is supplement to
  - Software: https://github.com/sodascience/metasyn/tree/v2.1.0 (URL)
Software
- Repository URL
  - https://github.com/sodascience/metasyn