Zenodo

NCCR Catalysis Zenodo Curation Policy

Must

At least one author is affiliated with NCCR Catalysis.
Contact information for at least one NCCR Catalysis author is provided through an ORCID identifier.
The content of the submitted record (generally a dataset) must be accessible for review. In line with SNSF regulations, the NCCR Catalysis Internal Rules and Regulations prohibit embargoes and require open access for the article itself. Nonetheless, the dataset can be submitted for review before the article is published, with a request that the dataset not be published until the related article is published.
The description of the submitted work is sufficiently detailed. Mere references to external articles or to other external resources are not sufficient descriptions.
The submitted work includes at least a clearly identifiable README file, typically in the root directory. This is not required for works consisting of one single document (e.g., publications, posters, or presentation slides). The README file must include the record DOI or URL so that the downloaded folder can be traced back to its source. The README file should be selected to be previewed in the record.
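A quick pre-submission check for this requirement could look like the following Python sketch. It assumes the README sits in the root folder under a name starting with "README", and it uses a loose pattern for Zenodo DOIs (10.5281/zenodo.<number>) and Zenodo record URLs. The function names and the pattern are illustrative assumptions, not part of this policy or of any Zenodo tooling.

```python
import re
from pathlib import Path
from typing import Optional

# Loose pattern for a Zenodo DOI or record URL (an assumption for this sketch,
# not an official validation rule).
ZENODO_REFERENCE = re.compile(
    r"10\.5281/zenodo\.\d+|zenodo\.org/records?/\d+", re.IGNORECASE
)


def find_readme(root: Path) -> Optional[Path]:
    """Return the first file in `root` whose name starts with 'readme', if any."""
    for entry in sorted(root.iterdir()):
        if entry.is_file() and entry.name.lower().startswith("readme"):
            return entry
    return None


def readme_cites_record(readme: Path) -> bool:
    """Check whether the README text contains a Zenodo DOI or record URL."""
    text = readme.read_text(encoding="utf-8", errors="replace")
    return ZENODO_REFERENCE.search(text) is not None
```

Running find_readme on the dataset folder and then readme_cites_record on the result flags the two most common omissions (no README, or a README without the record identifier) before the record reaches the reviewer.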
The main DOI has been assigned by Zenodo. Entering an existing DOI as the main identifier is allowed only if the submitted work is an exact copy of a digital object that has already received its DOI on another platform. For example, supplementary data to a journal article should NOT re-use the journal article DOI.
The main title is as human-readable as that of a conventional publication: filenames or coded expressions are discouraged. The main title must not be the same as that of the article it refers to. When applicable, use one of the following types of title, or an analogous one:
- Dataset for [related work title]
- Compound characterization for [related work title]
- Code for [related work title]

Where they exist, related works (e.g., articles, source code, other datasets) are specified in the "Related works" field and, where available, identified by their respective DOIs.
Any personal and sensitive data has been anonymized.
The submitted work has been cleaned up (e.g., there are no temporary files, no unnecessary empty files or folders, no superfluous file versions, etc.).
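Part of this clean-up can be automated. The Python sketch below walks a dataset folder and lists temporary files, empty files, and empty folders. The list of name patterns is an illustrative assumption (the policy does not define one), and superfluous file versions still need a manual look.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Dict, List

# Name patterns treated as temporary or system files. This list is illustrative
# (an assumption for this sketch); the policy does not define one.
TEMP_PATTERNS = ("*.tmp", "*.bak", "*~", "~$*", ".DS_Store", "Thumbs.db")


def find_clutter(root: Path) -> Dict[str, List[Path]]:
    """Collect temporary files, empty files, and empty folders below `root`."""
    clutter: Dict[str, List[Path]] = {
        "temporary": [],
        "empty_files": [],
        "empty_dirs": [],
    }
    for path in sorted(root.rglob("*")):
        if path.is_dir():
            # A folder with no entries at all counts as empty.
            if not any(path.iterdir()):
                clutter["empty_dirs"].append(path)
        elif any(fnmatch(path.name, pattern) for pattern in TEMP_PATTERNS):
            clutter["temporary"].append(path)
        elif path.stat().st_size == 0:
            clutter["empty_files"].append(path)
    return clutter
```

The function only reports candidates and deletes nothing, so the author decides what is actually superfluous.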
The correct NCCR Catalysis phase is acknowledged through a statement including the grant number in "Funding" if there is at least some cash financing, or as a “contributor” with the role of “sponsor” if the financing is solely in-kind. Authors who do not know whether they are financed in cash or in-kind can ask their group's administrative assistant. Searching for the phase by name does not return the correct phase among the top results; it is easier to search by phase ID. The IDs are listed below for convenience:
- Phase I: 180544
- Phase II: 225147

Where applicable, sources from which the work is derived are specified in the “Related works” field.
A license is chosen and indicated through Zenodo’s appropriate field.
At least a minimal description is provided in the "Description" field.
Molecule information is present, preferably as .mol files.
Files are available in open formats, or in any case formats that can be opened and processed with free software. If proprietary formats are present, the work also includes versions of the files converted to open formats, with the least possible loss of information.
If your laboratory is unable to perform this conversion, the Data Steward should ask the Data Officer for help.
The Data Officer, with the help of the Data Stewards, will update the list of accepted and recommended formats below:

Data type	Recommended	Accepted
Nuclear magnetic resonance (NMR)	JCAMP-DX (.jdx)	.fid, Bruker
Infrared/UV-vis/Raman spectroscopy	JCAMP-DX (.jdx)	.csv
X-ray diffraction	.cif
Transmission electron microscopy (TEM)	.tiff
Gas chromatography with flame ionization detection (GC-FID)	.gcd
Gas chromatography-mass spectrometry (GC-MS)	.xms
Liquid chromatography-mass spectrometry (LC-MS)	.raw
Electron paramagnetic resonance (EPR)	.dsc, .dta

Recommended

The record (dataset, code repository, etc.) is submitted directly within the community, rather than created as a standalone record and then added to the community.
This allows the Data Steward to edit fields if necessary, facilitating the review process.
Permissive licenses are preferred. Suggested licenses are CC0, CC-BY-4.0, or CC-BY-SA-4.0 for data, and MIT, BSD, Apache, or LGPL for code.
All authors are identified by their ORCID.
If related grants require an acknowledgement, they are listed using “Funding/Grants” fields.
The README file is a plain text file, avoiding proprietary formats such as MS Word whenever possible. Format the README so that it is easy to read (e.g., separate important pieces of information with blank lines, rather than putting all the information in one long paragraph).
The README file contains clear and detailed information about the work creation (authors, time, place, methodologies, …). Dates are in standardised formats. Suggested format: W3C/ISO 8601 date standard, which specifies the international standard notation of YYYY-MM-DD or YYYY-MM-DDThh:mm:ss.
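As an illustration of the suggested notation, the Python standard library produces both forms directly. The dates below are made-up example values, and is_iso_date is a hypothetical helper for checking dates typed by hand into a README.

```python
import re
from datetime import date, datetime

# Example values are made up for illustration.
measured_on = date(2024, 3, 7)
started_at = datetime(2024, 3, 7, 14, 5, 9)

print(measured_on.isoformat())  # 2024-03-07
print(started_at.isoformat())   # 2024-03-07T14:05:09


def is_iso_date(text: str) -> bool:
    """Return True only for a valid calendar date written as YYYY-MM-DD."""
    # The pattern enforces zero-padded fields; fromisoformat rejects
    # impossible dates such as 2024-02-30.
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True
```

Because the year comes first and all fields are zero-padded, dates in this notation also sort correctly as plain text, which helps when they appear in file names.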
The README file contains clear and detailed information about the content (e.g., file organization and naming, formats, relevant standards) and about sharing and access.
If the submission is related to a PhD thesis, the supervisor is specified/mentioned.
Keywords are entered as separate entries in the “Keywords and subjects” field.
The information in the dataset is as self-contained as possible, i.e., access to the publication is not necessary to understand the data. This is particularly true for characterization of compounds.
The data and metadata are as structured and standardized as possible. For instance: one folder per compound and one file per spectrum, or data grouped in a consistent way into formats such as HDF5.

Nice to have

README files are available for logical “clusters” of data. In many cases, it will be appropriate to create one document for a dataset that has multiple, related, similarly formatted files, or files that are logically grouped together for use (e.g., a collection of MATLAB scripts). Sometimes it may make sense to create a README file for a single data file.
README files are formatted identically and present the information in the same order, using the same terminology.