# ISA-XLSX format, version 3.0.0-draft.1

For detail on ISA framework terminology, please read the [ISA Abstract Model specification](https://isa-specs.readthedocs.io/en/latest/isamodel.html).

This document describes the ISA Abstract Model reference implementation specified in the ISA-XLSX format. The XLSX format uses the SpreadsheetML markup language and schema to represent a spreadsheet document. Conceptually, using the terminology of the SpreadsheetML specification [ISO/IEC 29500-1](https://www.loc.gov/preservation/digital/formats/fdd/fdd000398.shtml#:~:text=The%20XLSX%20format%20uses%20the,a%20rectangular%20grid%20of%20cells.), the document comprises one or more worksheets in a workbook.

**Table of contents**

- [Investigation File](#investigation-file)
- [Study File](#study-file)
- [Assay File](#assay-file)
- [Run File](#run-file)
- [Workflow File](#workflow-file)
- [Datamap File](#datamap-file)
- [Top-level metadata sheets](#top-level-metadata-sheets)
  - [Ontology Source Reference section](#ontology-source-reference-section)
  - [INVESTIGATION section](#investigation-section)
  - [STUDY section](#study-section)
  - [ASSAY section](#assay-section)
  - [RUN section](#run-section)
  - [WORKFLOW section](#workflow-section)
  - [Multiple Values](#multiple-values)
- [Annotation Table sheets](#annotation-table-sheets)
  - [Inputs and Outputs](#inputs-and-outputs)
  - [Protocol Columns](#protocol-columns)
  - [Ontology Annotations](#ontology-annotations)
  - [Unit](#unit)
  - [Characteristics](#characteristics)
  - [Factors](#factors)
  - [Components](#components)
  - [Parameters](#parameters)
  - [Comments](#comments)
  - [Examples](#examples-1)
- [Datamap Table sheets](#datamap-table-sheets)
  - [Data](#data-column)
  - [Explication](#explication-column)
  - [Unit](#unit-column)
  - [Object Type](#object-type-column)
  - [Label](#label-column)
  - [Description](#description-column)
  - [Generated By](#generated-by-column)
  - [Comment](#comments-1)
  - [Examples](#examples-2)


Below we provide the schemas and the content rules for valid ISA-XLSX documents. 

ISA-XLSX uses three types of files to capture the experimental metadata:
  - Investigation file
  - Study file
  - Assay file

The Investigation file contains all the information needed to understand the overall goals and means used in an experiment; experimental steps (or sequences of events) are described in the Study and in the Assay file(s). For each Investigation file there may be one or more Studies defined with a corresponding Study file; for each Study there may be one or more Assays defined with corresponding Assay files; one assay file may be registered in different studies.

In order to facilitate identification of ISA-XLSX component files, specific naming patterns MUST be followed:

- `isa.investigation.xlsx` for identifying the [Investigation file](#investigation-file)
- `isa.study.xlsx` for identifying [Study file(s)](#study-file)
- `isa.assay.xlsx` for identifying [Assay file(s)](#assay-file)
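
The naming patterns above lend themselves to a simple lookup. The following is a minimal sketch, not part of the specification: the function name `isa_file_kind` and the returned strings are hypothetical, and it assumes that file names are compared case-sensitively.

```python
from pathlib import PurePosixPath
from typing import Optional

# File names fixed by the specification, mapped to an illustrative kind label.
ISA_FILE_KINDS = {
    "isa.investigation.xlsx": "investigation",
    "isa.study.xlsx": "study",
    "isa.assay.xlsx": "assay",
}


def isa_file_kind(path: str) -> Optional[str]:
    """Return the kind of ISA-XLSX file a path names, or None if it matches no pattern."""
    # Only the final path component is compared; the comparison is
    # case-sensitive (an assumption of this sketch).
    return ISA_FILE_KINDS.get(PurePosixPath(path).name)
```

Under these assumptions, `isa_file_kind("studies/HeatstressExperiment/isa.study.xlsx")` yields `"study"`, while any other file name yields `None`.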

Sheets described in this specification MUST follow one of the two given formats:

- [`Top-level metadata sheets`](#top-level-metadata-sheets) for listing top-level metadata
- [`Annotation Table sheets`](#annotation-table-sheets) for describing experimental workflows

Sheets which do not follow either of these two formats are considered additional payload and are ignored in this specification.

All labels are case-sensitive.

Dates SHOULD be supplied in the [ISO8601](http://www.iso.org/iso/home/standards/iso8601.htm) format.

For maximal portability, file names SHOULD contain only the ASCII characters
`A-Za-z0-9._!#$%&+,;=@^(){}'[]`. Space is excluded because many utilities
do not accept spaces in file paths, and non-English alphabetic characters cannot be guaranteed
to be supported in all locales. It is recommended to avoid the shell metacharacters
`(){}'[]$."`.
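
The portability rule can be checked with a regular expression. This is a sketch under stated assumptions: `check_file_name` is a hypothetical helper, and the discouraged set used here leaves out the period, since the mandated names such as `isa.study.xlsx` contain it.

```python
import re

# Characters the specification lists as portable in file names.
PORTABLE_NAME = re.compile(r"[A-Za-z0-9._!#$%&+,;=@^(){}'\[\]]+")

# Shell metacharacters that are allowed but discouraged. The period is
# deliberately omitted in this sketch (assumption, see lead-in).
DISCOURAGED = set("(){}'[]$")


def check_file_name(name: str) -> tuple[bool, list[str]]:
    """Return (is_portable, sorted discouraged characters found) for one file name."""
    portable = PORTABLE_NAME.fullmatch(name) is not None
    found = sorted(set(name) & DISCOURAGED)
    return portable, found
```

A name containing a space or a non-ASCII letter is reported as not portable; a name such as `run(1).xlsx` is portable but flagged for its parentheses.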

# Investigation File

The `Investigation file` fulfils four needs:

1. to declare key entities, such as factors and protocols, which may be referenced in the other files
2. to track provenance of the used terminologies (controlled vocabularies or ontologies), where applicable
3. to relate Assay files to Studies
4. to select those Studies that are considered part of the investigation.

The `Investigation File` MUST contain one [`Top-Level Metadata sheet`](#top-level-metadata-sheets). This sheet MUST be named `isa_investigation` and MUST contain the following sections:
 
- [`ONTOLOGY SOURCE REFERENCE`](#ontology-source-reference)
- [`INVESTIGATION`](#investigation)
- [`INVESTIGATION PUBLICATIONS`](#investigation-publications)
- [`INVESTIGATION CONTACTS`](#investigation-contacts)

Additionally, it MAY contain the following sections:

- [`STUDY`](#study-section)
- [`STUDY DESIGN DESCRIPTORS`](#study-design-descriptors)
- [`STUDY PUBLICATIONS`](#study-publications)
- [`STUDY FACTORS`](#study-factors)
- [`STUDY ASSAYS`](#study-assays)
- [`STUDY PROTOCOLS`](#study-protocols)
- [`STUDY CONTACTS`](#study-contacts)
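
The mandatory sections of the `isa_investigation` sheet can be checked by scanning the first column for their headers. The sketch below assumes the first-column cells are already available as a list; the helper name `missing_sections` is hypothetical, and stripping surrounding whitespace is an assumption of this sketch rather than a rule of the specification.

```python
REQUIRED_INVESTIGATION_SECTIONS = [
    "ONTOLOGY SOURCE REFERENCE",
    "INVESTIGATION",
    "INVESTIGATION PUBLICATIONS",
    "INVESTIGATION CONTACTS",
]


def missing_sections(first_column, required=REQUIRED_INVESTIGATION_SECTIONS):
    """List the required section headers that do not occur in the first column."""
    # Empty cells (None) are ignored; labels are compared case-sensitively.
    present = {str(cell).strip() for cell in first_column if cell is not None}
    return [name for name in required if name not in present]
```

An empty result means all four mandatory headers were found. The optional `STUDY` sections are not checked here.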
  
The `Investigation File` implements the [`Investigation`](https://isa-specs.readthedocs.io/en/latest/isamodel.html#investigation) graph from the ISA Abstract Model.

# Study File

The `Study` represents a set of logically connected experiments. A `Study File` contains contextualising information for one or more `Assays`: metadata about the study design, the study factors used, and the study protocols. Similarly to the Investigation, it also includes the title and description of the study as well as related people and scholarly publications. In addition, it details the sample collection process needed to perform the connected `Assays`.

The `Study File` MUST contain one [`Top-Level Metadata sheet`](#top-level-metadata-sheets). This sheet MUST be named `isa_study` and MUST contain the following sections:

- [`STUDY`](#study-section)
- [`STUDY DESIGN DESCRIPTORS`](#study-design-descriptors)
- [`STUDY PUBLICATIONS`](#study-publications)
- [`STUDY CONTACTS`](#study-contacts)

Additionally, it MAY contain the following sections:

- [`STUDY FACTORS`](#study-factors)
- [`STUDY ASSAYS`](#study-assays)
- [`STUDY PROTOCOLS`](#study-protocols)

Additionally, the `Study File` SHOULD contain one or more [`Annotation Table sheet(s)`](#annotation-table-sheets), which MAY record provenance of biological samples, from source material through a collection process to sample material.

Therefore, the main entities of the `Study File` should be `Sources` and `Samples`.

The `Study File` implements the [`Study`](https://isa-specs.readthedocs.io/en/latest/isamodel.html#study) graph from the ISA Abstract Model.

# Assay File

The `Assay` represents one experimental measurement. An `Assay File` contains metadata about the assay design, information about the people performing the experiment, and most importantly, details about the preparation and/or execution of the experimental measurement.

The `Assay File` MUST contain one [`Top-Level Metadata sheet`](#top-level-metadata-sheets). This sheet MUST be named `isa_assay` and MUST contain the following sections:

- [`ASSAY`](#assay-section)
- [`ASSAY PERFORMERS`](#assay-performers)

Additionally, the `Assay File` SHOULD contain one or more [`Annotation Table sheet(s)`](#annotation-table-sheets), which MAY record preparation of biological samples, measurement of these samples and basic computations performed on the resulting data.

Therefore, the main entities of the `Assay File` should be `Samples` and `Data`.

The `Assay File` implements the [`Assay`](https://isa-specs.readthedocs.io/en/latest/isamodel.html#assay) graph from the ISA Abstract Model.

# Run File

The `Run` represents the application of a `workflow`, i.e. the execution of a computational tool that has orchestrated the execution of other tools. A `Run File` contains metadata about the executed workflows, information about the people creating, picking and parametrizing these workflow executions, and most importantly, details about the input data, parametrization and the result data.

The `Run File` MUST contain one [`Top-Level Metadata sheet`](#top-level-metadata-sheets). This sheet MUST be named `isa_run` and MUST contain the following sections:

- [`RUN`](#run-section)
- [`RUN PERFORMERS`](#run-performers)

Additionally, the `Run File` SHOULD contain one or more [`Annotation Table sheet(s)`](#annotation-table-sheets), which MAY record provenance of data through application of computational workflows and the parameters that describe these computations.

Therefore, the main entities of the `Run File` should be `Data`.

# Workflow File

The `Workflow` represents the prospective orchestration of other tools or workflows. A `Workflow File` contains contextualizing information about the workflow.

The `Workflow File` MUST contain one [`Top-Level Metadata sheet`](#top-level-metadata-sheets). This sheet MUST be named `isa_workflow` and MUST contain the following sections:

- [`WORKFLOW`](#workflow-section)
- [`WORKFLOW CONTACTS`](#workflow-contacts)

# Datamap File

The `Datamap` represents a set of explanations about the `data` entities defined in `assays` and `studies`.

The `Datamap File` MUST contain at least one [`Datamap table sheet`](#datamap-table-sheets). The names of these worksheets SHOULD reflect the data entities they describe.

Therefore, the main entities of the `Datamap File` should be `Data`.

The `Datamap File` acts as an extension of the `data` nodes defined in the [`Study and Assay graphs section`](https://isa-specs.readthedocs.io/en/latest/isamodel.html#study-and-assay-graphs) from the ISA Abstract Model.

# Top-level metadata sheets

The purpose of top-level metadata sheets is aggregating and listing top-level metadata. Each sheet consists of sections consisting of a section header and key-value fields. Section headers MUST be completely written in upper case (e.g. STUDY), field headers MUST have the first letter of each word in upper case (e.g. Study Identifier); with the exception of the referencing label (REF).

In the following sections, examples of each section block are given beside the specification of each section.
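
The layout described above (an upper-case section header followed by key-value rows) can be read into a nested mapping. This is a minimal sketch, not a reference parser: `split_sections` is a hypothetical name, rows are assumed to be sequences of cell values with the label in the first column, and a header is recognised by the assumption that its first cell is entirely upper case with no values beside it.

```python
def split_sections(rows):
    """Group rows of a top-level metadata sheet into {section header: {label: [values]}}."""
    sections = {}
    current = None
    for row in rows:
        cells = ["" if cell is None else str(cell) for cell in row]
        if not cells or not cells[0]:
            continue  # skip empty rows
        label, values = cells[0], cells[1:]
        # Assumption: an all-upper-case first cell without values starts a section.
        if label == label.upper() and not any(values):
            current = sections.setdefault(label, {})
        elif current is not None:
            current[label] = values
    return sections
```

The rows could come from any XLSX reader; with openpyxl, for instance, `worksheet.iter_rows(values_only=True)` yields tuples of cell values that fit this function.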

> ### ATTENTION
> Rows in which the first character in the first column is Unicode
> [U+0023](http://www.fileformat.info/info/unicode/char/0023/index.htm) (the `#` character) MUST be interpreted as
> comments, where reference implementation parsers SHOULD ignore those lines entirely.

> Rows with the label `Comment[<comment name>]` can also appear within any of the section blocks. Where these appear, the comment name must be unique within the context of a single block (e.g. you cannot have multiple occurrences of `Comment[external DB REF]` within `STUDY ASSAYS`). Also, the value cells MUST match the number of values indicated by the rest of the section in context.
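
Both rules in the note above can be sketched in a few lines. The helper names `drop_comment_lines` and `duplicate_comment_names` are hypothetical, and rows are assumed to be lists of cell values with the label in the first position.

```python
import re
from collections import Counter

# Matches labels of the form Comment[<comment name>].
COMMENT_LABEL = re.compile(r"Comment\[(.+)\]")


def drop_comment_lines(rows):
    """Remove rows whose first cell starts with '#' (U+0023)."""
    return [
        row for row in rows
        if not (row and isinstance(row[0], str) and row[0].startswith("#"))
    ]


def duplicate_comment_names(block_rows):
    """Return comment names that occur more than once within one section block."""
    names = []
    for row in block_rows:
        match = COMMENT_LABEL.fullmatch(str(row[0])) if row else None
        if match:
            names.append(match.group(1))
    return sorted(name for name, count in Counter(names).items() if count > 1)
```

`duplicate_comment_names` is meant to be called once per section block, since uniqueness is only required within a single block.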

## Ontology Source Reference section

The Ontology Source section of the Investigation file is used to declare Ontology Sources used elsewhere in the ISA-XLSX
files within the context of an Investigation.

Where a row label in a [`Top-level metadata sheet`](#top-level-metadata-sheets) is suffixed with `Term Source REF`, the value of the cell SHOULD match one of the `Term Source Name` values declared in this section.

Where a column in an [`Annotation table sheet`](#annotation-table-sheets) is labelled with `Term Source REF`, the value
of the cell SHOULD match one of the `Term Source Name` values declared in this section.
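
A consistency check for this rule might look as follows. It is a sketch with a hypothetical helper name, `undeclared_term_sources`; because the rule is a SHOULD, the result is better treated as a list of warnings than as errors.

```python
def undeclared_term_sources(declared_names, term_source_refs):
    """Return the Term Source REF values that match no declared Term Source Name."""
    declared = set(declared_names)
    # Empty cells are skipped: a free-text term carries no Term Source REF.
    return sorted({ref for ref in term_source_refs if ref and ref not in declared})
```

For example, with `CHEBI`, `EFO` and `OBI` declared, the references `["OBI", "", "PSO"]` would produce `["PSO"]`.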

This section implements a list of `Ontology Source` from the ISA Abstract Model.

This section MUST contain zero or more values.

### ONTOLOGY SOURCE REFERENCE

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                   | Datatype                  | Description                                                                                                                                                                     |
|-------------------------|---------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Term Source Name        | String                    | The name of the source of a term; i.e. the source controlled vocabulary or ontology. These names will be used in all corresponding Term Source REF fields that occur elsewhere. |
| Term Source File        | String (file name or URI) | A file name or a URI of an official resource.                                                                                                                                   |
| Term Source Version     | String                    | The version number of the Term Source to support terms tracking.                                                                                                                |
| Term Source Description | String                    | Use for disambiguating resources when homologous prefixes have been used.                                                                                                       |

**Example**

For example, the `ONTOLOGY SOURCE REFERENCE` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:

|                    |       |       |       |             |      |
|--------------------|-------|-------|-------|-------------|------|
| ONTOLOGY SOURCE REFERENCE | 
| Term Source Name  | CHEBI | EFO | OBI | NCBITAXON | PATO |
| Term Source File  | [http://data.bioontology.org/ontologies/CHEBI](http://data.bioontology.org/ontologies/CHEBI) | [http://data.bioontology.org/ontologies/EFO](http://data.bioontology.org/ontologies/EFO) | [http://data.bioontology.org/ontologies/OBI](http://data.bioontology.org/ontologies/OBI) | [http://data.bioontology.org/ontologies/NCBITAXON](http://data.bioontology.org/ontologies/NCBITAXON) | [http://data.bioontology.org/ontologies/PATO](http://data.bioontology.org/ontologies/PATO) |
| Term Source Version | 78  | 111 | 21  | 2         | 160 |
| Term Source Description | Chemical Entities of Biological Interest Ontology | Experimental Factor Ontology | Ontology for Biomedical Investigations | National Center for Biotechnology Information (NCBI) Organismal Classification | Phenotypic Quality Ontology |


## INVESTIGATION section

This section is organized in several subsections, described in detail below.

This section implements an `Investigation` from the ISA Abstract Model.

### INVESTIGATION

This section MUST contain zero or one values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                             | Datatype                                    | Description                                                                                  |
|-----------------------------------|---------------------------------------------|----------------------------------------------------------------------------------------------|
| Investigation Identifier          | String                                      | A mandatory identifier or an accession number provided by a repository. This SHOULD be locally unique. A value MUST be given for this label. |
| Investigation Title               | String                                      | A mandatory concise name given to the investigation. A value MUST be given for this label.   |
| Investigation Description         | String                                      | A mandatory textual description of the investigation. A value MUST be given for this label.  |
| Investigation Submission Date     | String formatted as ISO8601 date YYYY-MM-DD | The date on which the investigation was reported to the repository.                          |
| Investigation Public Release Date | String formatted as ISO8601 date YYYY-MM-DD | The date on which the investigation was released publicly.                                   |

**Example**

For example, the `INVESTIGATION` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:



|                              |                         |
|------------------------------|-------------------------|
| INVESTIGATION |
| Investigation Identifier     | ChlamyHeatstress                 |
| Investigation Title         | Systems-wide investigation of responses to moderate and acute high temperatures in the green alga Chlamydomonas reinhardtii. |
| Investigation Description   | Algae cultures were grown mixotrophically (TAP). After 24h of 35°C/40°C the cells were shifted back to room temperature for 48h. 'omics samples were taken. |
| Investigation Submission Date | 2022-05-13              |
| Investigation Public Release Date |            |
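
The constraints of the `INVESTIGATION` section (three mandatory values, two optional dates in `YYYY-MM-DD` form) can be validated as sketched below. The helpers `is_iso_date` and `check_investigation` and the wording of the messages are hypothetical; the section is assumed to be available as a mapping from label to a single string value.

```python
import re
from datetime import date

MANDATORY = [
    "Investigation Identifier",
    "Investigation Title",
    "Investigation Description",
]
DATE_LABELS = [
    "Investigation Submission Date",
    "Investigation Public Release Date",
]
ISO_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_iso_date(value):
    """True if value has the form YYYY-MM-DD and names a real calendar date."""
    match = ISO_DATE.fullmatch(value)
    if match is None:
        return False
    try:
        date(*(int(part) for part in match.groups()))
    except ValueError:
        return False
    return True


def check_investigation(fields):
    """Return a list of problems found in a {label: value} mapping."""
    problems = []
    for label in MANDATORY:
        if not (fields.get(label) or "").strip():
            problems.append(f"{label}: a value MUST be given")
    for label in DATE_LABELS:
        value = (fields.get(label) or "").strip()
        if value and not is_iso_date(value):  # empty dates are allowed
            problems.append(f"{label}: not an ISO8601 date (YYYY-MM-DD)")
    return problems
```

Applied to the example above, the check returns an empty list: the three mandatory values are present, the submission date is well formed, and the empty release date is accepted.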


### INVESTIGATION PUBLICATIONS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                                                  | Datatype                                                                                           | Description                                                                                                                                                                                |
|--------------------------------------------------------|----------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Investigation Publication PubMed ID                                | String formatted as valid PubMed ID                                                                | The PubMed IDs of the described publication(s) associated with this investigation.                                                                              |
| Investigation Publication DOI                          | String formatted as valid DOI                                                                      | A Digital Object Identifier (DOI) for that publication (where available).                                                                                 |
| Investigation Publication Author List                  | String                                                                                      | The list of authors associated with that publication.                                                                                |
| Investigation Publication Title                        | String                                                                                             | The title of the publication associated with the investigation.                                                                          |
| Investigation Publication Status                       | String, or Ontology Annotation by providing accompanying Term Accession Number and Term Source REF | A term describing the status of that publication (e.g. submitted, in preparation, published). |
| Investigation Publication Status Term Accession Number | String or URI                                                                                      | The accession number from the Term Source associated with the selected term.                                                                                       |
| Investigation Publication Status Term Source REF       | String                                                                                             | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |

**Example**

For example, the `INVESTIGATION PUBLICATIONS` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:


|                                        |                  |
|----------------------------------------|------------------|
| INVESTIGATION PUBLICATIONS |
| Investigation Publication PubMed ID    | PMC9106746         |
| Investigation Publication DOI          | 10.1038/s42003-022-03359-z |
| Investigation Publication Author List  | Ningning Zhang, Erin M. Mattoon, Will McHargue, Benedikt Venn, David Zimmer, Kresti Pecani, Jooyeon Jeong, Cheyenne M. Anderson, Chen Chen, Jeffrey C. Berry, Ming Xia, Shin-Cheng Tzeng, Eric Becker, Leila Pazouki, Bradley Evans, Fred Cross, Jianlin Cheng, Kirk J. Czymmek, Michael Schroda, Timo Mühlhaus & Ru Zhang |
| Investigation Publication Title        | Systems-wide analysis revealed shared and unique responses to moderate and acute high temperatures in the green alga Chlamydomonas reinhardtii |
| Investigation Publication Status       | published |
| Investigation Publication Status Term Accession Number | http://purl.org/spar/pso/published |
| Investigation Publication Status Term Source REF | PSO |

### INVESTIGATION CONTACTS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                                            | Datatype                                                                                    | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|--------------------------------------------------|---------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Investigation Person Last Name                   | String                                                                                      | The last name of a person associated with the investigation.                                                                              |
| Investigation Person First Name                  | String                                                                                      | The first name of a person associated with the investigation.                                                                             |
| Investigation Person Mid Initials                | String                                                                                      | The middle initials of a person associated with the investigation.                                                                              |
| Investigation Person Email                       | String formatted as email                                                                   | The email address of a person associated with the investigation.                                                                              |
| Investigation Person Phone                       | String                                                                                      | The telephone number of a person associated with the investigation.                                                                              |
| Investigation Person Fax                         | String                                                                                      | The fax number of a person associated with the investigation.                                                                              |
| Investigation Person Address                     | String                                                                                      | The address of a person associated with the investigation.                                                                              |
| Investigation Person Affiliation                 | String                                                                                      | The organization affiliation for a person associated with the investigation.                                                                              |
| Investigation Person Roles                       | String or Ontology Annotation if accompanied by Term Accession Numbers and Term Source REFs | Term to classify the role(s) performed by this person in the context of the investigation, which means that the roles reported here need not correspond to roles held within their affiliated organization. Multiple annotations or values attached to one person can be provided by using a semicolon (“;”, Unicode U+003B) as a separator (e.g.: submitter;funder;sponsor). The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. Refer to section [Multiple values](#multiple-values) on how to encode multiple values in one field and match term sources. |
| Investigation Person Roles Term Accession Number | String                                                                                      | The accession number from the Term Source associated with the selected term.                                                                                       |
| Investigation Person Roles Term Source REF       | String                                                                                      | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section.                                                                                    |
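
The semicolon-separated encoding of roles can be unpacked as sketched here. The authoritative rules are in the [Multiple values](#multiple-values) section; this sketch assumes that accession numbers and source references line up with the roles by position, and that surrounding whitespace may be trimmed. The helpers `split_multi` and `person_roles` are hypothetical.

```python
def split_multi(cell):
    """Split a semicolon-separated cell into trimmed values; an empty cell yields []."""
    if not cell or not cell.strip():
        return []
    return [part.strip() for part in cell.split(";")]


def person_roles(roles, accessions="", source_refs=""):
    """Pair each role with its accession number and source REF by position."""
    names = split_multi(roles)

    def pad(cell):
        # Assumption: missing trailing entries are treated as empty strings.
        values = split_multi(cell)
        return values + [""] * (len(names) - len(values))

    return list(zip(names, pad(accessions), pad(source_refs)))
```

With the example from the table, `split_multi("submitter;funder;sponsor")` gives three separate role names; free-text roles simply keep empty accession and source entries.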

**Example**

For example, the `INVESTIGATION CONTACTS` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:

|                                |          |          |       |
|--------------------------------|----------|----------|-------|
| INVESTIGATION CONTACTS |
| Investigation Person Last Name | Venn  | Zimmer | Mühlhaus  |
| Investigation Person First Name | Benedikt   | David     | Timo   |
| Investigation Person Mid Initials |        |         |      |
| Investigation Person Email     | venn@rptu.de         | d_zimmer@rptu.de         | timo.muehlhaus@rptu.de      |
| Investigation Person Phone     |          |          |       |
| Investigation Person Fax       |          |          |       |
| Investigation Person Address   | TU Kaiserslautern, Kaiserslautern, 67663, Germany | TU Kaiserslautern, Kaiserslautern, 67663, Germany | TU Kaiserslautern, Kaiserslautern, 67663, Germany |
| Investigation Person Affiliation | Computational Systems Biology | Computational Systems Biology | Computational Systems Biology |
| Investigation Person Roles     | author | author | corresponding author |
| Investigation Person Roles Term Accession Number |          |          |       |
| Investigation Person Roles Term Source REF |          |          |       |
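
In the example above each value column describes one person, so a section with several values is naturally read column by column. The following sketch transposes a mapping of label to value list into one record per column; `section_records` is a hypothetical name, and padding short rows with empty strings is an assumption of this sketch.

```python
def section_records(fields):
    """Turn {label: [v1, v2, ...]} into one {label: value} dict per value column."""
    width = max((len(values) for values in fields.values()), default=0)
    records = []
    for index in range(width):
        records.append({
            # Rows shorter than the widest row are padded with "" (assumption).
            label: (values[index] if index < len(values) else "")
            for label, values in fields.items()
        })
    return records
```

For the contacts shown above this yields three records, one each for Venn, Zimmer and Mühlhaus.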

## STUDY section

This section is organized in several subsections, described in detail below. This section also represents a
**repeatable block**, which is replicated according to the number of Studies to report (e.g. for two Studies, two Study
blocks are represented in the Investigation file). The subsections in the block are arranged vertically; the intent
being to enhance readability and presentation, and possibly to help with parsing. These subsections MUST remain within
this repeatable block, although their order MAY vary; the fields MUST remain within their subsection.

These sections implement the metadata for a `Study` from the ISA Abstract Model and a list of `Assay` (i.e. `Study` and
`Assay` **without** graphs; graphs are implemented in ISA-XLSX as `Annotation Table sheets`).

### STUDY

This section MUST contain zero or one values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                     | Datatype                             | Description                                                                                                                                                                                            |
|---------------------------|--------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Study Identifier          | String                               | A mandatory unique identifier, either a temporary identifier supplied by users or one generated by a repository or other database. For example, it could be an identifier complying with the LSID specification. A value MUST be given for this label. |
| Study Title               | String                               | A mandatory concise phrase used to encapsulate the purpose and goal of the study. A value MUST be given for this label.                                                                                                                                |
| Study Description         | String                               | A textual description of the study, with components such as objective or goals.                                                                                                                        |
| Study Submission Date     | String formatted as ISO8601 date     | The date on which the study is submitted to an archive.                                                                                                                                                |
| Study Public Release Date | String formatted as ISO8601 date     | The date on which the study SHOULD be released publicly.                                                                                                                                               |
| Study File Name           | String formatted as file name or URI | A field to specify the name of the Study Table file corresponding to the definition of that Study. There can be only one file per cell.                                                                |

**Example**

For example, the `STUDY` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:

|                        |          |
|------------------------|----------|
| STUDY |
| Study Identifier       | HeatstressExperiment  |
| Study Title            | Systems-wide investigation of responses to moderate and acute high temperatures in the green alga Chlamydomonas reinhardtii. |
| Study Description      | Algae cultures were grown mixotrophically (TAP). After 24h of 35°C/40°C the cells were shifted back to room temperature for 48h. 'omics samples were taken. |
| Study Submission Date  | 2022-05-13 |
| Study Public Release Date |  |
| Study File Name        | studies/HeatstressExperiment/isa.study.xlsx |


### STUDY DESIGN DESCRIPTORS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                                   | Datatype   | Description                                                                                                                                                                                                                                                                                                                             |
|-----------------------------------------|------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Study Design Type                       | String     | A term allowing the classification of the study based on the overall experimental design, e.g. cross-over design or parallel group design. The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. |
| Study Design Type Term Accession Number | String     | The accession number from the Term Source associated with the selected term.                                                                                                                                                                                                                                                            |
| Study Design Type Term Source REF       | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Study Design Type Term Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section.                                                                                                                          |

**Example**

For example, the `STUDY DESIGN DESCRIPTORS` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:

|                                |                   | |
|--------------------------------|-------------------|-|
| STUDY DESIGN DESCRIPTORS |
| Study Design Type              | time series design | heat exposure |
| Study Design Type Term Accession Number | http://purl.obolibrary.org/obo/OBI_0500020 | http://purl.obolibrary.org/obo/XCO_0000308 |
| Study Design Type Term Source REF | OBI               | XCO |


### STUDY PUBLICATIONS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                                          | Datatype                                                                                           | Description                                                                                                                                                                                |
|------------------------------------------------|----------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Study PubMed ID                                | String formatted as valid PubMed ID                                                                | The PubMed IDs of the described publication(s) associated with this study.                                                                                                                 |
| Study Publication DOI                          | String formatted as valid DOI                                                                      | A Digital Object Identifier (DOI) for that publication (where available).                                                                                                                  |
| Study Publication Author List                  | String                                                                                             | The list of authors associated with that publication.                                                                                                                                      |
| Study Publication Title                        | String                                                                                             | The title of the publication associated with the study.                                                                                                                                    |
| Study Publication Status                       | String, or Ontology Annotation by providing accompanying Term Accession Number and Term Source REF | A term describing the status of that publication (e.g. submitted, in preparation, published).                                                                                              |
| Study Publication Status Term Accession Number | String or URI                                                                                      | The accession number from the Term Source associated with the selected term.                                                                                                               |
| Study Publication Status Term Source REF       | String                                                                                             | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section.    |

**Example**

For example, the `STUDY PUBLICATIONS` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:

|                                        |                  |
|----------------------------------------|------------------|
| STUDY PUBLICATIONS |
| Study PubMed ID                | PMC9106746         |
| Study Publication DOI          | 10.1038/s42003-022-03359-z |
| Study Publication Author List  | Ningning Zhang, Erin M. Mattoon, Will McHargue, Benedikt Venn, David Zimmer, Kresti Pecani, Jooyeon Jeong, Cheyenne M. Anderson, Chen Chen, Jeffrey C. Berry, Ming Xia, Shin-Cheng Tzeng, Eric Becker, Leila Pazouki, Bradley Evans, Fred Cross, Jianlin Cheng, Kirk J. Czymmek, Michael Schroda, Timo Mühlhaus & Ru Zhang |
| Study Publication Title        | Systems-wide analysis revealed shared and unique responses to moderate and acute high temperatures in the green alga Chlamydomonas reinhardtii |
| Study Publication Status       | published |
| Study Publication Status Term Accession Number | http://purl.org/spar/pso/published |
| Study Publication Status Term Source REF | PSO |


### STUDY FACTORS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                                   | Datatype   | Description                                                                                                                                                                                                                                                                                                                                                                              |
|-----------------------------------------|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Study Factor Name                       | String     | The name of one factor used in the Study and/or Assay files. A factor corresponds to an independent variable manipulated by the experimentalist with the intention to affect biological systems in a way that can be measured by an assay. The value of a factor is given in the Study or Assay file, accordingly. If both Study and Assay have a Factor Value, these must be different. |
| Study Factor Type                       | String     | A term allowing the classification of this factor into categories. The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required.                                                                                                                         |
| Study Factor Type Term Accession Number | String     | The accession number from the Term Source associated with the selected term.                                                                                                                                                                                                                                                                                                             |
| Study Factor Type Term Source REF       | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Name declared in the Ontology Source Reference section.                                                                                                                                                                                                   |

**Example**

For example, the `STUDY FACTORS` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:

|                             |                    |                   | 
|-----------------------------|--------------------|-------------------|
| STUDY FACTORS |
| Study Factor Name | temperature | collection time  |
| Study Factor Type | temperature | time             |
| Study Factor Type Term Accession Number | http://purl.obolibrary.org/obo/PATO_0000146 | http://purl.obolibrary.org/obo/PATO_0000165 |
| Study Factor Type Term Source REF | PATO  | PATO |
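
Every `Term Source REF` value has to match a Term Source Name declared in the Ontology Source Reference section. The following sketch shows one way such a check could look. It assumes a section has already been read into a dictionary that maps each label to its list of cell values; the helper `undeclared_sources` and the set `declared_names` are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Set


def undeclared_sources(section: Dict[str, List[str]],
                       ref_label: str,
                       declared: Set[str]) -> List[str]:
    """Return Term Source REF values that are not declared Term Source Names."""
    refs = section.get(ref_label, [])
    # Empty cells mean that no ontology term was given, so they are skipped.
    return [ref.strip() for ref in refs
            if ref.strip() and ref.strip() not in declared]


# Hypothetical Term Source Names taken from an Ontology Source Reference section.
declared_names = {"PATO", "OBI", "NCIT"}

study_factors = {
    "Study Factor Name": ["temperature", "collection time"],
    "Study Factor Type": ["temperature", "time"],
    "Study Factor Type Term Source REF": ["PATO", "PATO"],
}

missing = undeclared_sources(
    study_factors, "Study Factor Type Term Source REF", declared_names
)
print(missing)
```

An empty result means that every referenced source is declared; any name in the list points to a missing declaration.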


### STUDY ASSAYS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                                              | Datatype   | Description                                                                                                                                                                                                                                                                                                         |
|----------------------------------------------------|------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Study Assay Identifier          | String                               | A mandatory unique identifier, either a temporary identifier supplied by users or one generated by a repository or other database. For example, it could be an identifier complying with the LSID specification. A value MUST be given for this label. |
| Study Assay Title               | String                               | A concise phrase used to encapsulate the purpose and goal of the assay.                                                                                                                                |
| Study Assay Description         | String                               | A textual description of the assay, with components such as objective or goals.                                                                                                                        |
| Study Assay Measurement Type                       | String     | A term to qualify the endpoint, or what is being measured (e.g. gene expression profiling or protein identification). The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. |
| Study Assay Measurement Type Term Accession Number | String     | The accession number from the Term Source associated with the selected term.                                                                                                                                                                                                                                        |
| Study Assay Measurement Type Term Source REF       | String     | The Source REF has to match one of the Term Source Name declared in the Ontology Source Reference section.                                                                                                                                                                                                          |
| Study Assay Technology Type                        | String     | Term to identify the technology used to perform the measurement, e.g. DNA microarray, mass spectrometry. The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required.              |
| Study Assay Technology Type Term Accession Number  | String     | The accession number from the Term Source associated with the selected term.                                                                                                                                                                                                                                        |
| Study Assay Technology Type Term Source REF        | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section.                                                                                                                             |
| Study Assay Technology Platform                    | String     | Manufacturer and platform name, e.g. Bruker AVANCE                                                                                                                                                                                                                                                                  |
| Study Assay File Name                              | String     | A field to specify the name of the Assay Table file corresponding the definition of that assay. There can be only one file per cell.                                                                                                                                                                                |

**Example**

For example, the `STUDY ASSAYS` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:

|                     |                                      |                              |
|---------------------|--------------------------------------|------------------------------|
| STUDY ASSAYS |
| Study Assay Identifier     | Proteomics     | Transcriptomics |
| Study Assay Title     | Mass spectrometrical analysis of harvested, heat-shocked Chlamydomonas cells.  |  RNA Sequencing analysis of harvested, heat-shocked Chlamydomonas cells. |
| Study Assay Description     | Proteins from harvested cells were extracted, digested and the proteome content was measured using Orbitrap Fusion Lumos. | RNA from harvested cells was extracted and quantified using an Illumina HiSeq 2000 Rapid Run. |
| Study Assay Measurement Type | Proteomics | transcription profiling      |
| Study Assay Measurement Type Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C20085 | http://purl.obolibrary.org/obo/OBI_0000424 |
| Study Assay Measurement Type Term Source REF     | NCIT | OBI                          |
| Study Assay Technology Type | Mass Spectrometry                | nucleotide sequencing         |
| Study Assay Technology Type Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C17156 | http://purl.obolibrary.org/obo/OBI_0000626 |
| Study Assay Technology Type Term Source REF  | NCIT | OBI                          |
| Study Assay Technology Platform | Orbitrap Fusion Lumos  | Illumina HiSeq 2000 Rapid Run |
| Study Assay File Name     | assays/Proteomics/isa.assay.xlsx     | assays/Transcriptomics/isa.assay.xlsx |
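
Sections such as `STUDY ASSAYS` are laid out with one label per row and one described object per value column. As an illustrative sketch (not a reference parser), the rows can be pivoted into one record per column. It assumes every row starts with its label, and the helper name `rows_to_records` is hypothetical.

```python
from typing import Dict, List


def rows_to_records(rows: List[List[str]]) -> List[Dict[str, str]]:
    """Turn label rows (label, value1, value2, ...) into one dict per value column."""
    width = max((len(row) - 1 for row in rows), default=0)
    records: List[Dict[str, str]] = [{} for _ in range(width)]
    for row in rows:
        label, values = row[0], row[1:]
        # Pad short rows so that every record carries every label.
        values = values + [""] * (width - len(values))
        for record, value in zip(records, values):
            record[label] = value.strip()
    return records


rows = [
    ["Study Assay Identifier", "Proteomics", "Transcriptomics"],
    ["Study Assay Measurement Type", "Proteomics", "transcription profiling"],
    ["Study Assay File Name", "assays/Proteomics/isa.assay.xlsx",
     "assays/Transcriptomics/isa.assay.xlsx"],
]
for record in rows_to_records(rows):
    print(record["Study Assay Identifier"], "->", record["Study Assay File Name"])
```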


### STUDY PROTOCOLS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label | Datatype | Description |
|-------|----------|-------------|
| Study Protocol Name                                  | String     | The name of the protocols used within the ISA-XLSX document. The names are used as identifiers within the ISA-XLSX document and will be referenced in the Study and Assay files in the Protocol REF columns. Names can be either local identifiers, unique within the ISA Archive which contains them, or fully qualified external accession numbers.                                                |
| Study Protocol Type                                  | String     | Term to classify the protocol. The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. |
| Study Protocol Type Term Accession Number            | String     | The accession number from the Term Source associated with the selected term. |
| Study Protocol Type Term Source REF                  | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |
| Study Protocol Description                           | String     | A free-text description of the protocol. |
| Study Protocol URI                                   | String     | Pointer to protocol resources external to the ISA-XLSX document that can be accessed by their Uniform Resource Identifier (URI). |
| Study Protocol Version                               | String     | An identifier for the version to ensure protocol tracking. |
| Study Protocol Parameters Name                       | String     | A semicolon-delimited (“;”) list of parameter names, used as an identifier within the ISA-XLSX document. These names are used in the Study and Assay files (in the “Parameter Value []” column heading) to list the values used for each protocol parameter. Refer to section [Multiple values](#multiple-values) on how to encode multiple values in one field and match term sources. |
| Study Protocol Parameters Term Accession Number      | String     | The accession number from the Term Source associated with the selected term. |
| Study Protocol Parameters Term Source REF            | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |
| Study Protocol Components Name                       | String     | A semicolon-delimited (“;”) list of a protocol’s components, e.g. instrument names, software names, and reagent names. |
| Study Protocol Components Type                       | String     | Term to classify the protocol components listed, for example instrument, software, detector or reagent. The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. Refer to section [Multiple values](#multiple-values) on how to encode multiple values in one field and match term sources. |
| Study Protocol Components Type Term Accession Number | String     | The accession number from the Term Source associated with the selected terms. |
| Study Protocol Components Type Term Source REF       | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |

**Example**

For example, the `STUDY PROTOCOLS` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:

| | | | |
|--|-|--|---|
| STUDY PROTOCOLS |
| Study Protocol Name           | Harvesting | Protein extraction | Measurement |
| Study Protocol Type           | Biospecimen Collection | nucleic acid extraction         | nucleic acid extraction  | 
| Study Protocol Type Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C70945 |  |  |
| Study Protocol Type Term Source REF | NCIT |  |   |
| Study Protocol Description    | Extraction and storage of algae cells from photo-bio reactor. Extracted and centrifuged cell pellets were frozen in liquid nitrogen. | Proteins were extracted from cells using a combination of chemical (lysis buffer) and physical (sonicator) methods. Digested peptides were purified and resuspended in LC loading buffer. | Peptides were separated by a nanoHPLC (C18 column) and detected using an Orbitrap mass spectrometry device. |
| Study Protocol URI            | | | |
| Study Protocol Version        | | | |
| Study Protocol Parameters Name | Centrifugation Time;sample volume setting  | frequency; duration  | duration;flow rate |
| Study Protocol Parameters Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C178881;http://purl.allotrope.org/ontologies/result#AFR_0002492 | http://purl.obolibrary.org/obo/PATO_0000044;http://purl.obolibrary.org/obo/PATO_0001309 | http://purl.obolibrary.org/obo/PATO_0001309;http://purl.obolibrary.org/obo/PATO_0001574 |
| Study Protocol Parameters Term Source REF | NCIT;AFO | PATO;PATO | PATO;PATO |
| Study Protocol Components Name | liquid nitrogen |  Sonicator; Extraction Kit | HPLC; Column; MS |
| Study Protocol Components Type | Liquid Nitrogen | VWR Aquasonic 250D; IST sample preparation kit (PreOmics GmbH, Germany) |  U3000 RSLCnano HPLC; C18 column (Fritted Glass Column, 25 cm × 75 μm); Orbitrap Fusion Lumos |
| Study Protocol Components Type Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C68796 | | ;;http://purl.obolibrary.org/obo/MS_1002732 |
| Study Protocol Components Type Term Source REF | NCIT | | ;;MS |
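
Cells such as `NCIT;AFO` or `;;MS` carry several values in one field. The sketch below assumes that values are matched by position, which is what the empty leading entries in `;;MS` suggest, and that an entirely empty cell means no term was given for any value. The names `Annotation`, `split_multi` and `zip_annotations` are hypothetical helpers, not part of the format.

```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    name: str
    accession: str
    source_ref: str


def split_multi(cell: str) -> List[str]:
    """Split a semicolon-delimited cell, keeping empty positions."""
    return [part.strip() for part in cell.split(";")]


def zip_annotations(names: str, accessions: str, source_refs: str) -> List[Annotation]:
    """Pair the n-th name with the n-th accession number and source REF."""
    name_list = split_multi(names)
    # An entirely empty cell is read as "no term given for any name".
    acc_list = split_multi(accessions) if accessions.strip() else [""] * len(name_list)
    ref_list = split_multi(source_refs) if source_refs.strip() else [""] * len(name_list)
    if not (len(name_list) == len(acc_list) == len(ref_list)):
        raise ValueError("cells hold different numbers of values")
    return [Annotation(*triple) for triple in zip(name_list, acc_list, ref_list)]


parameters = zip_annotations(
    "Centrifugation Time;sample volume setting",
    "http://purl.obolibrary.org/obo/NCIT_C178881;"
    "http://purl.allotrope.org/ontologies/result#AFR_0002492",
    "NCIT;AFO",
)
for parameter in parameters:
    print(parameter.name, parameter.source_ref)
```

Keeping empty positions during the split is the important design choice here: dropping them would shift later accession numbers onto the wrong names.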

### STUDY CONTACTS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                                    | Datatype                                                                                    | Description                                                                                 |
|------------------------------------------|---------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Study Person Last Name                   | String                                                                                      | The last name of a person associated with the study.                                                                                      |
| Study Person First Name                  | String                                                                                      | The first name of a person associated with the study.                                                                                     |
| Study Person Mid Initials                | String                                                                                      | The middle initials of a person associated with the study.                                                                            |
| Study Person Email                       | String formatted as email                                                                   | The email address of a person associated with the study.                                                                                      |
| Study Person Phone                       | String                                                                                      | The telephone number of a person associated with the study.                                                                                      |
| Study Person Fax                         | String                                                                                      | The fax number of a person associated with the study.                                                                                      |
| Study Person Address                     | String                                                                                      | The address of a person associated with the study.                                                                                      |
| Study Person Affiliation                 | String                                                                                      | The organization affiliation for a person associated with the study.                                                                                      |
| Study Person Roles                       | String or Ontology Annotation if accompanied by Term Accession Numbers and Term Source REFs | Term to classify the role(s) performed by this person in the context of the study, which means that the roles reported here need not correspond to roles held within their affiliated organization. Multiple annotations or values attached to one person can be provided by using a semicolon (“;”, Unicode U+003B) as a separator (e.g. submitter;funder;sponsor). The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. Refer to section [Multiple values](#multiple-values) on how to encode multiple values in one field and match term sources. |
| Study Person Roles Term Accession Number | String                                                                                      | The accession number from the Term Source associated with the selected term.                                                                                       |
| Study Person Roles Term Source REF       | String                                                                                      | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section.                                                                                    |

**Example**

For example, the `STUDY CONTACTS` section of an ISA-XLSX `isa.investigation.xlsx` file may look as follows:

|                                |          |          |       |
|--------------------------------|----------|----------|-------|
| STUDY CONTACTS |
| Study Person Last Name | Venn  | Zimmer | Mühlhaus  |
| Study Person First Name | Benedikt   | David     | Timo   |
| Study Person Mid Initials |        |         |      |
| Study Person Email     | venn@bio.rptu.de         | d_zimmer@rptu.de         | timo.muehlhaus@rptu.de      |
| Study Person Phone     |          |          |       |
| Study Person Fax       |          |          |       |
| Study Person Address   | TU Kaiserslautern, Kaiserslautern, 67663, Germany | TU Kaiserslautern, Kaiserslautern, 67663, Germany | TU Kaiserslautern, Kaiserslautern, 67663, Germany |
| Study Person Affiliation | Computational Systems Biology | Computational Systems Biology | Computational Systems Biology |
| Study Person Roles     | author | author | corresponding author |
| Study Person Roles Term Accession Number |          |          |       |
| Study Person Roles Term Source REF |          |          |       |


## ASSAY section

This section is organized in several subsections, described in detail below. The subsections in the block are arranged vertically; the intent is to enhance readability and presentation, and possibly to help with parsing. These subsections MUST remain within this block; the fields MUST remain within their subsection.

These sections implement the metadata for an `Assay` from the ISA Abstract Model.
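
Because fields must stay within their subsection, a reader can group the rows of a metadata sheet by the header row that precedes them. The sketch below is one possible approach and rests on an assumption that is not stated by the format itself: a header row is recognised as a row whose first cell is written entirely in upper case and whose remaining cells are empty. The function name `group_by_subsection` is hypothetical.

```python
from typing import Dict, List, Optional


def group_by_subsection(rows: List[List[str]]) -> Dict[str, List[List[str]]]:
    """Group label rows under the most recent subsection header row."""
    sections: Dict[str, List[List[str]]] = {}
    current: Optional[str] = None
    for row in rows:
        if not row or not row[0].strip():
            continue  # skip blank rows
        first = row[0].strip()
        # Assumption: headers are all upper case and have no values.
        is_header = first.isupper() and not any(cell.strip() for cell in row[1:])
        if is_header:
            current = first
            sections.setdefault(current, [])
        elif current is None:
            raise ValueError(f"label {first!r} appears before any subsection header")
        else:
            sections[current].append(row)
    return sections


sheet_rows = [
    ["ASSAY"],
    ["Assay Identifier", "Proteomics"],
    ["Assay Measurement Type", "Proteomics"],
]
grouped = group_by_subsection(sheet_rows)
print(list(grouped), len(grouped["ASSAY"]))
```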

### ASSAY

This section MUST contain zero or one values.

This section MUST contain the following labels, with the specified datatypes for values supported:


| Label                                              | Datatype   | Description                                                                                                                                                                                                                                                                                                         |
|----------------------------------------------------|------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Assay Identifier          | String                               | A mandatory unique identifier, either a temporary identifier supplied by users or one generated by a repository or other database. For example, it could be an identifier complying with the LSID specification. A value MUST be given for this label. |
| Assay Title               | String                               | A concise phrase used to encapsulate the purpose and goal of the assay.                                                                                                                                |
| Assay Description         | String                               | A textual description of the assay, with components such as objective or goals.                                                                                                                        |
| Assay Measurement Type                       | String     | A term to qualify the endpoint, or what is being measured (e.g. gene expression profiling or protein identification). The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. |
| Assay Measurement Type Term Accession Number | String     | The accession number from the Term Source associated with the selected term.                                                                                                                                                                                                                                        |
| Assay Measurement Type Term Source REF       | String     | The Source REF has to match one of the Term Source Name declared in the Ontology Source Reference section.                                                                                                                                                                                                          |
| Assay Technology Type                        | String     | Term to identify the technology used to perform the measurement, e.g. DNA microarray, mass spectrometry. The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required.              |
| Assay Technology Type Term Accession Number  | String     | The accession number from the Term Source associated with the selected term.                                                                                                                                                                                                                                        |
| Assay Technology Type Term Source REF        | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section.                                                                                                                             |
| Assay Technology Platform                    | String     | Manufacturer and platform name, e.g. Bruker AVANCE                                                                                                                                                                                                                                                                  |
| Assay File Name                              | String     | A field to specify the name of the Assay Table file corresponding to the definition of that assay. There can be only one file per cell. |

**Example**

For example, the `ASSAY` section of an ISA-XLSX `isa.assay.xlsx` file may look as follows:

|                     |                                      |
|---------------------|--------------------------------------|
| ASSAY |
| Assay Identifier     | Proteomics     | 
| Assay Title     | Mass spectrometrical analysis of harvested, heat-shocked Chlamydomonas cells  | 
| Assay Description     | Proteins from harvested cells were extracted, digested and the proteome content was measured using Orbitrap Fusion Lumos. | 
| Assay Measurement Type | Proteomics | 
| Assay Measurement Type Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C20085 |
| Assay Measurement Type Term Source REF | NCIT | 
| Assay Technology Type | Mass Spectrometry | 
| Assay Technology Type Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C17156 | 
| Assay Technology Type Term Source REF  | NCIT |
| Assay Technology Platform | Orbitrap Fusion Lumos  |
| Assay File Name     | assays/Proteomics/isa.assay.xlsx     | 


### ASSAY PERFORMERS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label                                     | Datatype                                                                                    | Description                                                                                 |
|------------------------------------------|---------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Assay Person Last Name | String  | The last name of a person associated with the Assay.  |
| Assay Person First Name  | String | The first name of a person associated with the Assay. |
| Assay Person Mid Initials  | String  | The middle initials of a person associated with the Assay.|
| Assay Person Email | String formatted as email | The email address of a person associated with the Assay. |
| Assay Person Phone  | String | The telephone number of a person associated with the Assay. |
| Assay Person Fax  | String | The fax number of a person associated with the assay.  |
| Assay Person Address  | String | The address of a person associated with the assay. |
| Assay Person Affiliation | String | The organization affiliation for a person associated with the assay. |
| Assay Person Roles | String or Ontology Annotation if accompanied by Term Accession Numbers and Term Source REFs | Term to classify the role(s) performed by this person in the context of the assay, which means that the roles reported here need not correspond to roles held within their affiliated organization. Multiple annotations or values attached to one person can be provided by using a semicolon (“;”, Unicode U+003B) as a separator (e.g. submitter;funder;sponsor). The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used, the Term Accession Number and Term Source REF fields below are required. Refer to section [Multiple values](#multiple-values) on how to encode multiple values in one field and match term sources. |
| Assay Person Roles Term Accession Number | String | The accession number from the Term Source associated with the selected term. |
| Assay Person Roles Term Source REF | String | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |

**Example**

For example, the `ASSAY PERFORMERS` section of an ISA-XLSX `isa.assay.xlsx` file may look as follows:

|                            |         |       | |
|----------------------------|---------|-------|-|
| ASSAY PERFORMERS |
| Assay Person Last Name     | Zhang | Tzeng | Evans |
| Assay Person First Name    | Ningning    | Shin-Cheng  | Bradley |
| Assay Person Mid Initials  |       |       |
| Assay Person Email         |  |       |
| Assay Person Phone         |         |       |
| Assay Person Fax           |         |       |
| Assay Person Address       | St. Louis, Missouri 63132, USA | St. Louis, Missouri 63132, USA | St. Louis, Missouri 63132, USA |
| Assay Person Affiliation   | Donald Danforth Plant Science Center | Donald Danforth Plant Science Center | Donald Danforth Plant Science Center |
| Assay Person Roles         | Investigator | Laboratory Technologist | Laboratory Technologist |
| Assay Person Roles Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C25936 | http://purl.obolibrary.org/obo/NCIT_C51830 | http://purl.obolibrary.org/obo/NCIT_C51830 |
| Assay Person Roles Term Source REF       | NCIT | NCIT | NCIT |

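As a minimal sketch of how such a section could be read, assume its rows have already been extracted as lists of cell strings. The function name `rows_to_records` and the padding of missing cells with empty strings are assumptions made for illustration and are not part of this specification. The column-wise values can then be regrouped into one record per person:

```python
from typing import Dict, List


def rows_to_records(rows: List[List[str]]) -> List[Dict[str, str]]:
    """Turn label-first rows into one dict per value column.

    Each row is [label, value1, value2, ...]. Short rows are padded
    with empty strings so that every record contains every label.
    """
    width = max((len(row) - 1 for row in rows), default=0)
    records: List[Dict[str, str]] = [{} for _ in range(width)]
    for row in rows:
        label, values = row[0], row[1:]
        padded = values + [""] * (width - len(values))
        for record, value in zip(records, padded):
            record[label] = value
    return records


rows = [
    ["Assay Person Last Name", "Zhang", "Tzeng", "Evans"],
    ["Assay Person First Name", "Ningning", "Shin-Cheng", "Bradley"],
    ["Assay Person Mid Initials", "", ""],
]
people = rows_to_records(rows)
print(people[2]["Assay Person First Name"])  # Bradley
```

The same regrouping would apply to any of the top-level sections that allow more than one value column.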
## RUN section

This section is organized in several subsections, described in detail below. The subsections in the block are arranged vertically to enhance readability and presentation, and possibly to help with parsing. These subsections MUST remain within this block; the fields MUST remain within their subsection.

These sections implement the metadata for a `Run`.

### RUN

This section MUST contain zero or one values.

This section MUST contain the following labels, with the specified datatypes for values supported:


| Label | Datatype | Description |
|-------|----------|-------------|
| Run Identifier          | String | A mandatory unique identifier, either a temporary identifier supplied by users or one generated by a repository or other database. For example, it could be an identifier complying with the LSID specification. A value MUST be given for this label. |
| Run Title               | String                               | A concise phrase used to encapsulate the purpose and goal of the run.  |
| Run Description         | String | A textual description of the run, with components such as objective or goals. |
| Run Workflow Identifiers | String | Identifiers of the Workflows whose execution was orchestrated in the context of this run. Multiple values can be provided by using a semicolon (“;”, Unicode U+003B) as a separator (e.g. Imputation;Testing;Plotting). |
| Run Measurement Type    | String     | A term to qualify the endpoint, or what is being computed (e.g. gene expression profiling or protein identification). The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. |
| Run Measurement Type Term Accession Number | String     | The accession number from the Term Source associated with the selected term.  |
| Run Measurement Type Term Source REF       | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |
| Run Technology Type     | String | Term to identify the technology used to perform the computation, e.g. statistical testing. The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. |
| Run Technology Type Term Accession Number  | String     | The accession number from the Term Source associated with the selected term.  |
| Run Technology Type Term Source REF        | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |
| Run Technology Platform | String     | Software Publisher/Creator and/or Software name used for the computation. |
| Run File Name           | String     | A field to specify the name of the Run Table file corresponding to the definition of that run. There can be only one file per cell. |

**Example**

For example, the `RUN` section of an ISA-XLSX `isa.run.xlsx` file may look as follows:

|      |      |
|------|------|
| RUN |
| Run Identifier     | ProteomicsAnalysis     | 
| Run Title     | Preparatory Proteome Quantification | 
| Run Description     | Raw Data Files measured from Orbitrap device were processed using ProteomIQon to calculate protein quantities. | 
| Run Workflow Identifiers | ProteomIQonPipeline | 
| Run Measurement Type | Proteomics | 
| Run Measurement Type Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C20085 |
| Run Measurement Type Term Source REF | NCIT | 
| Run Technology Type | Protein quantification | 
| Run Technology Type Term Accession Number | http://edamontology.org/operation_3630 | 
| Run Technology Type Term Source REF  | EDAM |
| Run Technology Platform | ProteomIQon  |
| Run File Name     | runs/ProteomicsAnalysis/isa.run.xlsx     | 


### RUN PERFORMERS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label | Datatype | Description |
|-------|----------|-------------|
| Run Person Last Name | String  | The last name of a person associated with the run.  |
| Run Person First Name  | String | The first name of a person associated with the run. |
| Run Person Mid Initials  | String  | The middle initials of a person associated with the run.|
| Run Person Email | String formatted as email | The email address of a person associated with the run. |
| Run Person Phone  | String | The telephone number of a person associated with the run. |
| Run Person Fax  | String | The fax number of a person associated with the run.  |
| Run Person Address  | String | The address of a person associated with the run. |
| Run Person Affiliation | String | The organization affiliation for a person associated with the run. |
| Run Person Roles | String or Ontology Annotation if accompanied by Term Accession Numbers and Term Source REFs | Term to classify the role(s) performed by this person in the context of the run, which means that the roles reported here need not correspond to roles held within their affiliated organization. Multiple annotations or values attached to one person can be provided by using a semicolon (“;”, Unicode U+003B) as a separator (e.g. submitter;funder;sponsor). The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used, the Term Accession Number and Term Source REF fields below are required. Refer to section [Multiple values](#multiple-values) on how to encode multiple values in one field and match term sources. |
| Run Person Roles Term Accession Number | String | The accession number from the Term Source associated with the selected term. |
| Run Person Roles Term Source REF | String | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |

**Example**

For example, the `RUN PERFORMERS` section of an ISA-XLSX `isa.run.xlsx` file may look as follows:

|                |         |       |
|----------------|---------|-------|
| RUN PERFORMERS |         |       |
| Run Person Last Name     | Ott | Zimmer  |
| Run Person First Name    | Caroline    | David   |
| Run Person Mid Initials  |       |       |
| Run Person Email         |  |       |
| Run Person Phone         |         |       |
| Run Person Fax           |         |       |
| Run Person Address       | Erwin-Schrödinger-Straße 56, 67663 Kaiserslautern, Germany | Erwin-Schrödinger-Straße 56, 67663 Kaiserslautern, Germany | 
| Run Person Affiliation   | RPTU Kaiserslautern-Landau | RPTU Kaiserslautern-Landau | 
| Run Person Roles         | Data Manager | Statistician | 
| Run Person Roles Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C51820 | http://purl.obolibrary.org/obo/NCIT_C51877 | 
| Run Person Roles Term Source REF       | NCIT | NCIT | 

## WORKFLOW section

This section is organized in several subsections, described in detail below. The subsections in the block are arranged vertically to enhance readability and presentation, and possibly to help with parsing. These subsections MUST remain within this block; the fields MUST remain within their subsection.

These sections implement the metadata for a `Workflow`.

### WORKFLOW

This section MUST contain zero or one values.

This section MUST contain the following labels, with the specified datatypes for values supported:


| Label | Datatype | Description |
|-------|----------|-------------|
| Workflow Identifier          | String | A mandatory unique identifier, either a temporary identifier supplied by users or one generated by a repository or other database. For example, it could be an identifier complying with the LSID specification. A value MUST be given for this label. |
| Workflow Title               | String                               | A concise phrase used to encapsulate the purpose and goal of the workflow.  |
| Workflow Description         | String | A textual description of the workflow, with components such as objective or goals. |
| Workflow Subworkflow Identifiers | String | Identifiers of the Workflows that are orchestrated in the context of this workflow. Multiple values can be provided by using a semicolon (“;”, Unicode U+003B) as a separator (e.g. Imputation;Testing;Plotting). |
| Workflow Type                                  | String     | Term to classify the workflow. The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used the Term Accession Number and Term Source REF fields below are required. |
| Workflow Type Term Accession Number            | String     | The accession number from the Term Source associated with the selected term. |
| Workflow Type Term Source REF                  | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |
| Workflow URI                                   | String     | Pointer to workflow resources external to the ISA-Tab that can be accessed by their Uniform Resource Identifier (URI). |
| Workflow Version                               | String     | An identifier for the version to ensure workflow tracking. |
| Workflow Parameters Name                       | String     | A semicolon-delimited (“;”) list of parameter names, used as an identifier within the ISA-XLSX document. These names are used in [Run Annotation Tables](#annotation-table-sheets) to list the values used for each Workflow parameter. Refer to section [Multiple values](#multiple-values) on how to encode multiple values in one field and match term sources. |
| Workflow Parameters Term Accession Number      | String     | The accession number from the Term Source associated with the selected term. |
| Workflow Parameters Term Source REF            | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |
| Workflow Components Name                       | String     | A semicolon-delimited (“;”) list of a workflow’s components; e.g. software names. Refer to section [Multiple values](#multiple-values) on how to encode multiple values in one field and match term sources. |
| Workflow Components Type                       | String     | Term to classify the workflow components listed, e.g. command line tool or software. The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used, the Term Accession Number and Term Source REF fields below are required. |
| Workflow Components Type Term Accession Number | String     | The accession number from the Term Source associated with the selected term. |
| Workflow Components Type Term Source REF       | String     | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |
| Workflow File Name           | String     | A field to specify the name of the Workflow Table file corresponding to the definition of that workflow. There can be only one file per cell. |



**Example**

For example, the `WORKFLOW` section of an ISA-XLSX `isa.workflow.xlsx` file may look as follows:

|      |      |
|------|------|
| WORKFLOW |
| Workflow Identifier     | ProteomIQonPipeline | 
| Workflow Title     | Proteome Quantification | 
| Workflow Description     | Calculate protein abundances using ProteomIQon. | 
| Workflow Subworkflow Identifiers | PeptideSpectrumMatching;ProteinInference;ProteinQuantification | 
| Workflow Type           | Proteomic Quantification Method | 
| Workflow Type Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C161810 |  
| Workflow Type Term Source REF | NCIT |  
| Workflow URI            | | 
| Workflow Version        | |
| Workflow Parameters Name | Protein quantification  | 
| Workflow Parameters Name Term Accession Number | http://edamontology.org/operation_3630 | 
| Workflow Parameters Name Term Source REF | EDAM | 
| Workflow Components Name | Proteomiqon | 
| Workflow Components Type | software | 
| Workflow Components Type Term Accession Number | http://purl.obolibrary.org/obo/IAO_0000010 | 
| Workflow Components Type Term Source REF | IAO | 
| Workflow File Name     | workflows/ProteomIQon/isa.workflow.xlsx | 

### WORKFLOW CONTACTS

This section MUST contain zero or more values.

This section MUST contain the following labels, with the specified datatypes for values supported:

| Label | Datatype | Description |
|-------|----------|-------------|
| Workflow Person Last Name | String  | The last name of a person associated with the workflow.  |
| Workflow Person First Name  | String | The first name of a person associated with the workflow. |
| Workflow Person Mid Initials  | String  | The middle initials of a person associated with the workflow.|
| Workflow Person Email | String formatted as email | The email address of a person associated with the workflow. |
| Workflow Person Phone  | String | The telephone number of a person associated with the workflow. |
| Workflow Person Fax  | String | The fax number of a person associated with the workflow.  |
| Workflow Person Address  | String | The address of a person associated with the workflow. |
| Workflow Person Affiliation | String | The organization affiliation for a person associated with the workflow. |
| Workflow Person Roles | String or Ontology Annotation if accompanied by Term Accession Numbers and Term Source REFs | Term to classify the role(s) performed by this person in the context of the workflow, which means that the roles reported here need not correspond to roles held within their affiliated organization. Multiple annotations or values attached to one person can be provided by using a semicolon (“;”, Unicode U+003B) as a separator (e.g. submitter;funder;sponsor). The term can be free text or from, for example, a controlled vocabulary or an ontology. If the latter source is used, the Term Accession Number and Term Source REF fields below are required. Refer to section [Multiple values](#multiple-values) on how to encode multiple values in one field and match term sources. |
| Workflow Person Roles Term Accession Number | String | The accession number from the Term Source associated with the selected term. |
| Workflow Person Roles Term Source REF | String | Identifies the controlled vocabulary or ontology that this term comes from. The Source REF has to match one of the Term Source Names declared in the Ontology Source Reference section. |

**Example**

For example, the `WORKFLOW CONTACTS` section of an ISA-XLSX `isa.workflow.xlsx` file may look as follows:

|                |         |       |
|----------------|---------|-------|
| WORKFLOW CONTACTS |         |       |
| Workflow Person Last Name     | Ott | Zimmer  |
| Workflow Person First Name    | Caroline    | David   |
| Workflow Person Mid Initials  |       |       |
| Workflow Person Email         |  |       |
| Workflow Person Phone         |         |       |
| Workflow Person Fax           |         |       |
| Workflow Person Address       | Erwin-Schrödinger-Straße 56, 67663 Kaiserslautern, Germany | Erwin-Schrödinger-Straße 56, 67663 Kaiserslautern, Germany | 
| Workflow Person Affiliation   | RPTU Kaiserslautern-Landau | RPTU Kaiserslautern-Landau | 
| Workflow Person Roles         | Data Manager | Statistician | 
| Workflow Person Roles Term Accession Number | http://purl.obolibrary.org/obo/NCIT_C51820 | http://purl.obolibrary.org/obo/NCIT_C51877 | 
| Workflow Person Roles Term Source REF       | NCIT | NCIT | 

## Multiple values

This section gives additional guidance for cases where cells in Top-level metadata sheets contain multiple annotation values. In this case, if there is more than one ontology term, the names of the ontology terms MUST be concatenated into a single, semicolon-delimited (“;”) string.

In the two additional qualifying rows, the `Term Accession Numbers` and the `Term Source REFs` MUST be added following the same rules, i.e. all three cells MUST contain the same number of semicolon-delimited (“;”) values, which can be connected by their position in the string.


E.g. for three ontology terms
   - Name1 (AO:0000001)
   - Name2 (BO:0000002)
   - Name3 (CO:0000003)

their rows in a metadata sheet would look like this:

|                |         |
|----------------|---------|
| Properties | Name1;Name2;Name3 |
| Properties Term Accession Number | AO:0000001;BO:0000002;CO:0000003 |
| Properties Term Source REF | AO;BO;CO |

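The positional matching described above can be sketched as follows. This is an illustration only: the function name `split_multi` and the choice to raise a `ValueError` on mismatched counts are assumptions, not behaviour mandated by this specification.

```python
from typing import List, Tuple


def split_multi(names: str, accessions: str, sources: str) -> List[Tuple[str, str, str]]:
    """Split three semicolon-delimited cells and pair their values by position."""
    parts = [cell.split(";") for cell in (names, accessions, sources)]
    # All three cells MUST contain the same number of values.
    if len({len(p) for p in parts}) != 1:
        raise ValueError("all three cells must hold the same number of values")
    return list(zip(*parts))


terms = split_multi(
    "Name1;Name2;Name3",
    "AO:0000001;BO:0000002;CO:0000003",
    "AO;BO;CO",
)
print(terms[1])  # ('Name2', 'BO:0000002', 'BO')
```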


# Annotation Table sheets

`Annotation Table sheets` are used to describe the experimental flow in a detailed, machine-readable way. In each sheet, there is a mapping from input entities to output entities, placed in the `Input` and `Output` columns, respectively. The other columns are then used to describe either those entities or the processes that led to this mapping.

In the `Annotation Table sheets`, column headers MUST have the first letter of each word in upper case, with the exception of the referencing label (REF).

The content of the annotation table MUST be placed in an `xlsx table` whose name starts with `annotationTable`. Each sheet MUST contain at most one such annotation table. Only cells inside this table are considered as part of the formatted metadata.

`Annotation Table sheets` are structured with fields organized on a per-row basis. The first row MUST be used for column headers. Each body row is an implementation of a `Process`.

## Inputs and Outputs

Each annotation table sheet MAY contain at most one `Input` column.
Its header MUST follow the pattern `Input [<NodeType>]` (see below for possible values of `<NodeType>`).
If the `Input` column is present, it MUST NOT contain empty cell values.

Each annotation table sheet MAY contain at most one `Output` column.
Its header MUST follow the pattern `Output [<NodeType>]` (see below for possible values of `<NodeType>`).
If the `Output` column is present, it MUST NOT contain empty cell values.

Cell values of the `Input` and `Output` columns represent the input and output nodes of the `Process`, respectively.

`NodeTypes` MUST be one of the following:

- A `Source` MUST be indicated with the node type `Source Name`. `Sources` MUST NOT be used as `Output` nodes.

- A `Sample` MUST be indicated with the node type `Sample Name`.

- An `Extract Material` MUST be indicated with the node type `Material Name`.

- A `Data` object MUST be indicated with the node type `Data`.

`Source Names`, `Sample Names`, `Material Names` MUST be unique across an ARC. If two of these entities with the same name exist in the same ARC, they are considered the same entity.

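The header patterns and node types listed above can be checked mechanically. The following is a minimal sketch; the regular expression and the function name `parse_io_header` are illustrative assumptions rather than part of this specification.

```python
import re
from typing import Optional, Tuple

# Node types as listed above; the regular expression is an illustrative assumption.
NODE_TYPES = ("Source Name", "Sample Name", "Material Name", "Data")
HEADER_RE = re.compile(r"^(Input|Output) \[(.+)\]$")


def parse_io_header(header: str) -> Optional[Tuple[str, str]]:
    """Return (direction, node type) for a valid header, else None."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None
    direction, node_type = match.groups()
    if node_type not in NODE_TYPES:
        return None
    # A Source MUST NOT be used as an Output node.
    if direction == "Output" and node_type == "Source Name":
        return None
    return direction, node_type


print(parse_io_header("Input [Sample Name]"))   # ('Input', 'Sample Name')
print(parse_io_header("Output [Source Name]"))  # None
```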
The `Data` node type MUST correspond to a relevant data resource location, following the [Data Path Annotation](/ARC%20specification.md#data-path-annotation) patterns. If the annotation of the `Data` node refers not to the complete resource but to a part of it, a `Selector` MAY be added. This Selector MUST be separated from the resource location using a `#`, with no whitespace in between: `location#selector`. If appropriate, the Selector SHOULD be formatted according to the IRI fragment selectors specified by the [W3C](https://www.w3.org/TR/annotation-model/#fragment-selector).

The format of the data resource MAY be further qualified using a `Data Format` column. The `Data Format` SHOULD be expressed using a [MIME format](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types), most commonly consisting of two parts, a type and a subtype, separated by a slash (`/`) with no whitespace in between: `type/subtype`. If appropriate, a format from the list maintained by [IANA](https://www.iana.org/assignments/media-types/media-types.xhtml) SHOULD be picked. Unregistered or niche encodings and file formats MAY be indicated instead via the most appropriate URL.

The format and usage info about the Selector MAY be further qualified using a `Data Selector Format` column. The `Data Selector Format` SHOULD point to a web resource containing instructions about how the Selector is formatted and how it should be interpreted.

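A `Data` cell value and a `Data Format` value could be handled as in the sketch below. Splitting at the first `#` and the simplified `type/subtype` pattern are assumptions made for illustration; the function names are hypothetical and the pattern does not implement the full media type grammar.

```python
import re
from typing import Optional, Tuple

# Simplified "type/subtype" check; an assumption, not the full media type grammar.
MIME_RE = re.compile(r"^[A-Za-z0-9][\w.+-]*/[A-Za-z0-9][\w.+-]*$")


def split_data_node(value: str) -> Tuple[str, Optional[str]]:
    """Split a Data cell into (location, selector) at the first '#'."""
    location, sep, selector = value.partition("#")
    return location, (selector if sep else None)


def looks_like_mime(value: str) -> bool:
    """Return True if the value has the shape 'type/subtype'."""
    return MIME_RE.match(value) is not None


print(split_data_node("result.csv#col=1"))  # ('result.csv', 'col=1')
print(split_data_node("result.csv"))        # ('result.csv', None)
print(looks_like_mime("text/csv"))          # True
```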

## Examples

### Data Location and Selector

In this example, two `Samples`, namely `input1` and `input2`, are measured. The measured values are both written into the same data resource at the location `result.csv`, whose format is tabular, as indicated by the `Data Format` `text/csv`. To distinguish between the measurement values stemming from the different inputs, selectors are added to the resource location (separated by a `#`), namely `col=1` and `col=2`. The specification for the formatting of these selectors can be found at the link given in the `Data Selector Format` column, namely `https://datatracker.ietf.org/doc/html/rfc7111`.


| Input [Sample Name] | Output [Data]          | Data Format | Data Selector Format | 
|-------------|---------------------------------|----------------------------------|--|
| input1       | result.csv#col=1 | text/csv | https://datatracker.ietf.org/doc/html/rfc7111 |
| input2       | result.csv#col=2 | text/csv | https://datatracker.ietf.org/doc/html/rfc7111 |

## Protocol Columns

`Protocol REF` columns MAY be used to specify the name of the `Protocol` node implemented by the `Process` node. Per Annotation Table sheet there MUST be at most one `Protocol REF` column. The value MUST be free text.

`Protocol Version` columns MAY be used to specify the version of the `Protocol` node implemented by the `Process` node. Per Annotation Table sheet there MUST be at most one `Protocol Version` column. The value MUST be free text.

`Protocol Description` columns MAY be used to specify the description of the `Protocol` node implemented by the `Process` node. Per Annotation Table sheet there MUST be at most one `Protocol Description` column. The value MUST be free text.

`Protocol Uri` columns MAY be used to specify the URI of the `Protocol` node implemented by the `Process` node. Per Annotation Table sheet there MUST be at most one `Protocol Uri` column. The value MUST be either a URI or a file path corresponding to a relevant protocol file location.

`Protocol Type` columns MAY be used to specify the type of the `Protocol` node implemented by the `Process` node. Per Annotation Table sheet there MUST be at most one `Protocol Type` column. The value MUST be free text, or an [`Ontology Annotation`](#ontology-annotations).

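The "at most one column" rule for the protocol columns above can be checked against a sheet's header row. This is a minimal sketch; the function name `duplicated_protocol_columns` is hypothetical, and the header row is assumed to be available as a list of strings.

```python
from collections import Counter
from typing import List

PROTOCOL_COLUMNS = (
    "Protocol REF",
    "Protocol Version",
    "Protocol Description",
    "Protocol Uri",
    "Protocol Type",
)


def duplicated_protocol_columns(headers: List[str]) -> List[str]:
    """Return the protocol column headers that occur more than once."""
    counts = Counter(headers)
    return [name for name in PROTOCOL_COLUMNS if counts[name] > 1]


headers = ["Input [Sample Name]", "Protocol REF", "Protocol Type", "Protocol REF"]
print(duplicated_protocol_columns(headers))  # ['Protocol REF']
```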

## Ontology Annotations

Where a value is an `Ontology Annotation` in an annotation table, `Term Accession Number` and `Term Source REF` columns MUST follow the main column. 

An `Ontology Annotation` MAY be applied to any appropriate `Characteristic`, `Parameter`, `Factor`, `Component` or `Protocol Type`.

This implements `Ontology Annotation` from the ISA Abstract Model.

#### Ontology Annotation Headers

The header of the main column MUST contain the structural column type followed by the `name` of the ontology term in `[]` brackets.
There SHOULD be a `space` between the column type and the `[` bracket.

The headers of the two annotation columns SHOULD contain further ontological information about the ontology term of the main header. 
In this case, following the static header string, separated by a single space, there MUST be a short ontology term identifier formatted as CURIEs (prefixed identifiers) of the form `<IDSPACE>:<LOCALID>` (specified [here](http://obofoundry.org/id-policy)) inside `()` brackets.

In the other case, i.e. when the annotation columns do not contain further ontological information, the static header strings MUST be either followed by a single space and empty `()` brackets or nothing.

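The three permitted shapes of an annotation column header (with a CURIE, with empty brackets, or with nothing after the static string) can be recognised as in the following sketch. The regular expression is a simplified assumption about what a CURIE looks like, and the function name `parse_annotation_header` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Static header, optionally followed by " (<IDSPACE>:<LOCALID>)" or " ()".
ANNOTATION_RE = re.compile(
    r"^(Term Source REF|Term Accession Number)(?: \((?:(\w+):(\w+))?\))?$"
)


def parse_annotation_header(header: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (static header, CURIE or None); None if the header is malformed."""
    match = ANNOTATION_RE.match(header)
    if match is None:
        return None
    static, idspace, localid = match.groups()
    curie = f"{idspace}:{localid}" if idspace else None
    return static, curie


print(parse_annotation_header("Term Source REF (OBI:0100026)"))
# ('Term Source REF', 'OBI:0100026')
print(parse_annotation_header("Term Accession Number ()"))
# ('Term Accession Number', None)
```

A header such as `Term Source REF(OBI:0100026)`, which lacks the single separating space, is rejected by this sketch.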
#### Ontology Annotation Values

The value in the main column MUST contain the name of the ontology term.

The value in the `Term Source REF` column MUST either contain a short identifier for the `IDSPACE`, which identifies the ontology containing the term, or be left empty. 

The value in the `Term Accession Number` column MUST either contain a value formatted in one of the following formats, or be left empty:
  - `LOCALID` of the ontology, which is only applicable if the matching `IDSPACE` is given in the `Term Source REF` column
  - short ontology term identifier formatted as CURIEs (prefixed identifiers) of the form `<IDSPACE>:<LOCALID>` (specified [here](http://obofoundry.org/id-policy))
  - `URL` pointing to the ontology term

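The permitted value formats of the `Term Accession Number` column can be told apart with a few simple checks. The sketch below is an illustration under stated assumptions: only `http://` and `https://` prefixes are treated as URLs, the CURIE pattern is simplified, and the function name `classify_accession` is hypothetical.

```python
import re

# Simplified CURIE shape "<IDSPACE>:<LOCALID>"; an assumption for illustration.
CURIE_RE = re.compile(r"^\w+:\w+$")


def classify_accession(value: str) -> str:
    """Classify a Term Accession Number cell value."""
    if value == "":
        return "empty"
    if value.startswith(("http://", "https://")):
        return "url"
    if CURIE_RE.match(value):
        return "curie"
    # Anything else is treated as a LOCALID, which is only meaningful
    # together with an IDSPACE in the Term Source REF column.
    return "localid"


print(classify_accession("http://purl.obolibrary.org/obo/UO_0000012"))  # url
print(classify_accession("UO:0000012"))                                 # curie
print(classify_accession("0000012"))                                    # localid
```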
#### Ontology Annotation Example

For example, a characteristic type `organism` with a value of `Homo sapiens` can be qualified with an `Ontology Annotation` of a term from NCBI Taxonomy as follows:

| Characteristic [organism]   | Term Source REF (OBI:0100026)  | Term Accession Number (OBI:0100026) |
|-----------------------------|-------------------|------------------------------------------------------|
| Homo sapiens                | NCBITaxon         | [http://…/NCBITAXON_9606](http://purl.obolibrary.org/obo/NCBITAXON_9606) |

> [!NOTE]
> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting.

## Unit

Where a value is numeric, a `Unit` MAY be used to qualify the quantity. 
In this case, the main column MUST be followed by a `Unit` column, which in turn SHOULD be further annotated as an [`Ontology Annotation`](#ontology-annotations), being followed by `Term Accession Number` and `Term Source REF` columns.

- The headers of the annotation columns then refer to the header of the main column.
- The values of the annotation columns then refer to the unit, and not to the numeric value of the main column.

For example, in the following, the header term `temperature` is further qualified with the CURIE `PATO:0000146`.
The value `300` is qualified with a `Unit` `Kelvin`, which is further qualified as an [`Ontology Annotation`](#ontology-annotations) from the Units Ontology declared in the Ontology Sources with `UO`:

|   Parameter [temperature] | Unit   | Term Source REF (PATO:0000146)  | Term Accession Number (PATO:0000146)  |
|--------------------------------|--------|-------------------|------------------------------------------------------|
|                            300 | Kelvin | UO                | [http://…/obo/UO_0000012](http://purl.obolibrary.org/obo/UO_0000012) |

> [!NOTE]
> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting.
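
To make the relationship between the columns concrete, the sketch below collects one body row into a single record: the numeric value comes from the main column, while the annotation columns describe the unit. The `Quantity` class, the `read_quantity` helper and the representation of a row as a dictionary keyed by column header are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Quantity:
    value: float
    unit_name: Optional[str] = None       # from the `Unit` column
    unit_source: Optional[str] = None     # from `Term Source REF (...)`
    unit_accession: Optional[str] = None  # from `Term Accession Number (...)`


def read_quantity(row: Dict[str, str], main_header: str, curie: str) -> Quantity:
    """Build a Quantity from one body row keyed by column header."""
    def cell(header: str) -> Optional[str]:
        text = row.get(header, "").strip()
        return text or None

    return Quantity(
        value=float(row[main_header]),
        unit_name=cell("Unit"),
        # The annotation headers carry the CURIE of the main column header,
        # but their values describe the unit.
        unit_source=cell(f"Term Source REF ({curie})"),
        unit_accession=cell(f"Term Accession Number ({curie})"),
    )


row = {
    "Parameter [temperature]": "300",
    "Unit": "Kelvin",
    "Term Source REF (PATO:0000146)": "UO",
    "Term Accession Number (PATO:0000146)": "http://purl.obolibrary.org/obo/UO_0000012",
}
quantity = read_quantity(row, "Parameter [temperature]", "PATO:0000146")
```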


## Characteristics

A `Characteristic` is an attribute column that describes a material. It SHOULD follow an `Input [Source Name]` or `Input [Sample Name]` column (see [`Sources`](#inputs-and-outputs) and [`Samples`](#inputs-and-outputs)). The column contains terms describing each material according to the characteristics category indicated in the column header, which follows the pattern `Characteristic [<category term>]`.
For example, a column with the header `Characteristic [organ part]` would contain terms describing an organ part. The value MUST be free text, numeric, or an [`Ontology Annotation`](#ontology-annotations).

For example, a characteristic type `organ part` with a value of `Liver` can be qualified with an [`Ontology Annotation`](#ontology-annotations) of a term from MeSH as follows:

| Characteristic [organ part]   | Term Source REF (UBERON:0000064)  | Term Accession Number (UBERON:0000064)  |
|-------------------------------|-------------------|-------------------------|
| Liver                         | MeSH              | D008099                 |

> [!NOTE]
> In this example, the value in the `Term Accession Number` column is formatted as a `LOCALID`. The associated `IDSPACE` to identify the ontology term is given in the `Term Source REF` column.

## Factors

A `Factor` is an independent variable manipulated by an experimentalist with the intention to affect biological systems in a way that can be measured by an assay. This column holds the actual data for the `Factor` named between the square brackets, for example `Factor [compound]`. The name MUST match a factor declared in the `Study Factors` section of a top-level metadata sheet. The value MUST be free text, numeric, or an [`Ontology Annotation`](#ontology-annotations).

| Factor [Gender]   | Term Source REF (NCIT:C17357)  | Term Accession Number (NCIT:C17357)  |
|------------------------|-------------------|-------------------------|
| Male                   | MeSH              | D008297                 |

> [!NOTE]
> In this example, the value in the `Term Accession Number` column is formatted as a `LOCALID`. The associated `IDSPACE` to identify the ontology term is given in the `Term Source REF` column.

## Components

A `Component` is a consumable or reusable physical entity used in the experimental workflow. It is formatted in the pattern `Component [<category term>]`. The value MUST be free text, numeric, or an [`Ontology Annotation`](#ontology-annotations).

| Component [Measurement Device]   | Term Source REF (NCIT:C81182)  | Term Accession Number (NCIT:C81182)  |
|------------------------|-------------------|-------------------------|
| Illumina MiniSeq                   | OBI              | [http://…/obo/OBI_0003114](http://purl.obolibrary.org/obo/OBI_0003114)                 |

> [!NOTE]
> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting.


## Parameters

A `Parameter` can be used to specify any additional information about the experimental setup that does not fall under the three aforementioned categories. It is formatted in the pattern `Parameter [<category term>]`. The value MUST be free text, numeric, or an [`Ontology Annotation`](#ontology-annotations).

| Parameter [temperature] | Unit   | Term Source REF (NCRO:0000029)  | Term Accession Number (NCRO:0000029)  |
|--------------------------------|--------|-------------------|------------------------------------------------------|
|                            300 | Kelvin | UO                | [http://…/obo/UO_0000012](http://purl.obolibrary.org/obo/UO_0000012) |

> [!NOTE]
> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting.


## Comments

A `Comment` can be used to provide some additional information. Columns headed with `Comment [<comment name>]` MAY appear anywhere in the Annotation Table. The comment always refers to the Annotation Table. The value MUST be free text.

| Comment [Answer to everything] |
|--------------------------------|
|                      forty-two |

## Others

Columns whose headers do not follow any of the formats described above are considered additional payload and are out of the scope of this specification.
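
The bracketed header patterns from the preceding sections can be recognised with a single expression. The sketch below is an assumption-laden illustration: it covers only the five bracketed kinds `Characteristic`, `Factor`, `Component`, `Parameter` and `Comment`, it tolerates a missing space before the bracket, and the helper name is hypothetical. A result of `None` only means that the header is not one of these five kinds; other formats defined in this specification (for example input, output, protocol and unit columns) are not handled here.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: "<Kind> [<category term>]", space before "[" optional.
_BRACKETED_RE = re.compile(
    r"(Characteristic|Factor|Component|Parameter|Comment) ?\[(.+)\]"
)


def split_bracketed_header(header: str) -> Optional[Tuple[str, str]]:
    """Return (kind, category term), or None if the pattern does not apply."""
    match = _BRACKETED_RE.fullmatch(header.strip())
    if match is None:
        return None
    return match.group(1), match.group(2).strip()
```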

## Examples

For example, a simple process that turns a [source](#inputs-and-outputs) into a [sample](#inputs-and-outputs) may be represented as:

| Input [Source Name]   | Protocol REF      | Output [Sample Name]   |
|---------------|-------------------|---------------|
| source1       | sample collection | sample1       |

Where a graph splits or pools, we repeat the same node name in the [Input](#inputs-and-outputs) or [Output](#inputs-and-outputs) column, so that several rows refer to the same node.

For example, if we split a source into two samples, we might represent this as:

| Input [Source Name]   | Protocol REF      | Output [Sample Name]   |
|---------------|-------------------|---------------|
| source1       | sample collection | sample1       |
| source1       | sample collection | sample2       |

If we pool two sources into a single sample, we might represent this as:

| Input [Source Name]   | Protocol REF      | Output [Sample Name]   |
|---------------|-------------------|---------------|
| source1       | sample collection | sample1       |
| source2       | sample collection | sample1       |
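
Because splitting and pooling are expressed only through repeated node names, a consumer can recover them by grouping rows. The following is a minimal sketch, assuming each row has already been reduced to a tuple of input name, `Protocol REF` value and output name; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Row = Tuple[str, str, str]  # (input name, Protocol REF, output name)


def find_splits_and_pools(
    rows: Iterable[Row],
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Return (splits, pools): inputs with several outputs and vice versa."""
    outputs_of: Dict[str, Set[str]] = defaultdict(set)
    inputs_of: Dict[str, Set[str]] = defaultdict(set)
    for input_name, _protocol, output_name in rows:
        outputs_of[input_name].add(output_name)
        inputs_of[output_name].add(input_name)
    # A split is one input feeding more than one distinct output.
    splits = {n: sorted(o) for n, o in outputs_of.items() if len(o) > 1}
    # A pool is one output fed by more than one distinct input.
    pools = {n: sorted(i) for n, i in inputs_of.items() if len(i) > 1}
    return splits, pools


split_rows = [
    ("source1", "sample collection", "sample1"),
    ("source1", "sample collection", "sample2"),
]
pool_rows = [
    ("source1", "sample collection", "sample1"),
    ("source2", "sample collection", "sample1"),
]
```

Applied to the two tables above, `split_rows` reports `source1` as a split and `pool_rows` reports `sample1` as a pool.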

# Datamap Table sheets

`Datamap Table sheets` are used to describe the contents of data files.

In the `Datamap Table sheets`, column headers MUST have the first letter of each word in upper case, with the exception of the referencing label (REF).
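
A simple check for this capitalisation rule could look as follows. This is only a sketch under assumptions: it inspects the static part of the header before any `[` or `(` bracket, since bracketed content such as a comment name is user-defined, and the function name is hypothetical. The label `REF` passes trivially because it is written entirely in upper case.

```python
import re


def has_valid_capitalisation(header: str) -> bool:
    """Check that every word of the static header part starts in upper case."""
    # Assumption: only the text before the first "[" or "(" is checked.
    static_part = re.split(r"[\[(]", header, maxsplit=1)[0]
    words = static_part.split()
    return bool(words) and all(word[0].isupper() for word in words)
```

Note that such a check cannot detect a missing space between words, for example `GeneratedBy` instead of `Generated By`.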

The content of the datamap table MUST be placed in an `xlsx table` whose name starts with `datamapTable`. Each sheet MUST contain at most one such datamap table. Only cells inside this table are considered as part of the formatted metadata.

`Datamap Table sheets` are structured with fields organized on a per-row basis. The first row MUST be used for column headers. Each body row is an implementation of a `data` node.

## Data column

Every `Datamap Table sheet` MUST contain a `Data` column. Every object in this column MUST correspond to a relevant data resource location, following the [Data Path Annotation](/ARC%20specification.md#data-path-annotation) patterns. If the annotation of the `Data` node refers not to the complete resource but to a part of it, a `Selector` MAY be added. This Selector MUST be separated from the resource location by a `#`, with no whitespace in between: `location#selector`. If appropriate, the Selector SHOULD be formatted according to the IRI fragment selectors specified by the [W3C](https://www.w3.org/TR/annotation-model/#fragment-selector).
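
Separating the resource location from the Selector can be sketched as below. The helper name is hypothetical, and splitting at the first `#` is an assumption, since the text above does not say how a `#` inside the location itself would be treated.

```python
from typing import Optional, Tuple


def split_data_cell(cell: str) -> Tuple[str, Optional[str]]:
    """Split a `Data` cell into (resource location, Selector or None)."""
    # Assumption: the first "#" separates location and Selector.
    location, separator, selector = cell.partition("#")
    if not separator:
        return cell, None
    if location != location.rstrip() or selector != selector.lstrip():
        raise ValueError("no whitespace is allowed around '#'")
    return location, selector
```

For a value such as `MyData.csv#col=1`, this returns the location `MyData.csv` and the Selector `col=1`.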

The format of the data resource MAY be further qualified using a `Data Format` column. The `Data Format` SHOULD be expressed as a [MIME type](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types), most commonly consisting of two parts, a type and a subtype, separated by a slash (`/`) with no whitespace in between: `type/subtype`. If appropriate, a format from the list maintained by [IANA](https://www.iana.org/assignments/media-types/media-types.xhtml) SHOULD be picked. Unregistered or niche encodings and file formats MAY instead be indicated via the most appropriate URL.
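
Whether a `Data Format` value has the `type/subtype` shape or is a URL can be checked roughly as follows. This sketch only tests the syntactic shape, using the character set that RFC 6838 allows in media type names; it does not consult the IANA registry, and the function name and the `http`/`https` restriction are assumptions.

```python
import re

# Characters permitted in type and subtype names by RFC 6838.
_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*"
_MIME_RE = re.compile(_NAME + "/" + _NAME)
# Assumption: only http(s) URLs are recognised as URLs here.
_URL_RE = re.compile(r"https?://\S+")


def describe_data_format(value: str) -> str:
    """Return 'mime', 'url' or 'other' for a `Data Format` cell."""
    if _MIME_RE.fullmatch(value):
        return "mime"
    if _URL_RE.fullmatch(value):
        return "url"
    return "other"
```

Since the MIME format is only recommended (SHOULD), a result of `other` marks a value worth reviewing rather than an error.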

The format and usage info about the Selector MAY be further qualified using a `Data Selector Format` column. The `Data Selector Format` SHOULD point to a web resource containing instructions about how the Selector is formatted and how it should be interpreted.

## Explication column

Every `Datamap Table sheet` SHOULD contain an `Explication` column. The `Explication` adds explicit meaning to the data node. The value MUST be free text, or an [`Ontology Annotation`](#ontology-annotations).

| Explication | Term Source REF | Term Accession Number |
|------------------------|-------------------|-------------------------|
| average value | OBI | [http://…/obo/OBI_0000679](http://purl.obolibrary.org/obo/OBI_0000679) |

## Unit column

Every `Datamap Table sheet` SHOULD contain a `Unit` column. The `Unit` adds a unit of measurement to the data node. The value MUST be free text, or an [`Ontology Annotation`](#ontology-annotations).

| Unit | Term Source REF | Term Accession Number |
|------------------------|-------------------|-------------------------|
| milligram per milliliter | UO | [http://…/obo/UO_0000176](http://purl.obolibrary.org/obo/UO_0000176) |

> [!NOTE]
> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting.

## Object Type column

Every `Datamap Table sheet` SHOULD contain an `Object Type` column. The `Object Type` defines the shape or format in which the data node is represented. The value MUST be free text, or an [`Ontology Annotation`](#ontology-annotations).

| Object Type | Term Source REF | Term Accession Number |
|------------------------|-------------------|-------------------------|
| Float | NCIT | [http://…/obo/NCIT_C48150](http://purl.obolibrary.org/obo/NCIT_C48150) |

> [!NOTE]
> In this example, the value in the `Term Accession Number` column is formatted as a `URL`, but shortened for the purpose of markdown-formatting.

## Label column

Every `Datamap Table sheet` SHOULD contain a `Label` column. The `Label` is the free-text identifier used to name the data entity in the given location (e.g. the column header in tabular data). The value MUST be free text and SHOULD match the actual text it refers to in the data file.

> [!NOTE]
> In tabular data, values in the label column can be an indicator for the usage of column-headers in the annotated data file.

| Label | 
|------------------------|
| ProtConc |

## Description column

Every `Datamap Table sheet` SHOULD contain a `Description` column. The `Description` gives additional, human readable context about the data node. The value MUST be free text.

| Description | 
|------------------------|
| The average protein concentration for the given gene |

## Generated By column

Every `Datamap Table sheet` SHOULD contain a `Generated By` column. The `Generated By` names the tool which led to the creation of the data node. The value MUST be free text.

If possible, the value in this column MUST correspond to a relevant data resource location, following the [Data Path Annotation](/ARC%20specification.md#data-path-annotation) patterns.

| Generated By | 
|------------------------|
| GeneStatisticsTool.exe |

## Comments

A `Comment` can be used to provide some additional information. Columns headed with `Comment [<comment name>]` MAY appear anywhere in the Datamap Table. The comment always refers to the Datamap Table. The value MUST be free text.

| Comment [Answer to everything] |
|--------------------------------|
|                      forty-two |

## Examples

For example, a simple `datamap` table representing a tabular datafile might look as follows:

| Data | Explication | Term Source REF | Term Accession Number | Unit | Term Source REF | Term Accession Number | Object Type | Term Source REF | Term Accession Number | Label | Description | Generated By |
|---------------|---------------|-------------------|---------------|---------------|-------------------|---------------|---------------|-------------------|---------------|---------------|---------------|---------------|
| MyData.csv#col=1 | Gene Identifier | NCIT | [http://…/obo/NCIT_C48664](http://purl.obolibrary.org/obo/NCIT_C48664) | | | | String | NCIT | [http://…/obo/NCIT_C45253](http://purl.obolibrary.org/obo/NCIT_C45253) | GeneID | Short hand identifier of the gene coding for the protein. | GeneStatisticsTool.exe |
| MyData.csv#col=2 | average value | OBI | [http://…/obo/OBI_0000679](http://purl.obolibrary.org/obo/OBI_0000679) | milligram per milliliter | UO | [http://…/obo/UO_0000176](http://purl.obolibrary.org/obo/UO_0000176) | Float | NCIT | [http://…/obo/NCIT_C48150](http://purl.obolibrary.org/obo/NCIT_C48150) | ProtConc | The average protein concentration for the given gene |GeneStatisticsTool.exe |
| MyData.csv#col=3 |  p-value | OBI | [http://…/obo/OBI_0000175](http://purl.obolibrary.org/obo/OBI_0000175) | | | | Float | NCIT | [http://…/obo/NCIT_C48150](http://purl.obolibrary.org/obo/NCIT_C48150) | p-val | p-value of t-test against control. | GeneStatisticsTool.exe |

In this example, the `datamap` table describes a single data file named `MyData.csv`. This file contains three columns. The first column contains gene identifiers; the other two contain the results of a statistical analysis performed by the tool `GeneStatisticsTool.exe`.