This scheme provides a framework for describing aquatic microcosm experiments in the sense used in the papers in the [References] section.
It aims to provide a structured way to describe the data and to make the data findable through the provided metadata.
This scheme aims neither at covering aspects of the actual analysis of the extracted data nor at giving all the information needed to re-run the experiment.
The dmdScheme is a metadata scheme for Ecological Microcosm Experiments. As these are essentially ecological data, the use of other schemes geared towards ecological data also comes to mind. One widely used scheme is the eml scheme.
The main difference between the dmdScheme and other ecological metadata schemes is that the dmdScheme was developed specifically for a certain type of experiment. This specificity went together with the objective of keeping the scheme simple to fill in and to understand. The result is a metadata scheme which contains all information necessary to describe the data generated in these aquatic ecological microcosm experiments, while being simple enough for the researcher to fill in without too much time required. It should be possible to use e.g. the aforementioned eml scheme to store the metadata contained in the dmdScheme, but filling in the eml scheme would require a much larger investment of time, as it is by definition much more general.
The dmdScheme is neither intended nor suited for experiments outside of ecological and aquatic microcosm experiments.
The dmdScheme provides metadata for a data file bundle. A data bundle is an archive (e.g. tar.gz or zip) consisting of multiple data files and one file with the metadata (TODO: we have to decide on the format of this file; should it be a text file?). If the data files represent tabular data, they should be in csv format; otherwise any open format can be used.
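As a rough illustration of the bundle layout described above, the following Python sketch builds a small zip archive in memory and checks that it holds exactly one metadata file next to the data files. The metadata file name `metadata.xml`, its placeholder content, the made-up csv values, and the helper `check_bundle` are assumptions for this example only; the format of the metadata file has not been decided, and this is not part of the dmdScheme package.

```python
import io
import zipfile

# Hypothetical name: the actual metadata file format is still undecided.
METADATA_NAME = "metadata.xml"


def check_bundle(archive_bytes, metadata_name=METADATA_NAME):
    """Return (data_files, problems) for a zip bundle given as bytes."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as bundle:
        # Skip directory entries, keep files only.
        names = [n for n in bundle.namelist() if not n.endswith("/")]
    metadata = [n for n in names if n.rsplit("/", 1)[-1] == metadata_name]
    data_files = [n for n in names if n not in metadata]
    if len(metadata) != 1:
        problems.append(f"expected one metadata file, found {len(metadata)}")
    if not data_files:
        problems.append("bundle contains no data files")
    return data_files, problems


# Build a tiny example bundle in memory (contents are placeholders).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as bundle:
    bundle.writestr("metadata.xml", "<placeholder/>")
    bundle.writestr("dissolved_oxygen_measures.csv", "Jar_ID,DO\nA1,98.2\n")
    bundle.writestr("smell.csv", "Jar_ID,smell\nA1,no\n")

data_files, problems = check_bundle(buffer.getvalue())
print(data_files)
print(problems)
```

A tar.gz bundle could be checked the same way with the standard `tarfile` module.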
It is a property set (dmdSchemeSet) which consists of five different sets of data properties (dmdSchemeData), which are tables of metadata.
These five data properties are:
It is important to note that suggestedValues are suggestions only: any value can be entered but, if possible, one should choose a value from the list.
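The non-binding nature of suggested values can be sketched as a check that warns instead of rejecting. This is only an illustration in Python; the helper `check_suggested` is hypothetical and not part of the dmdScheme package. The suggested values shown are those listed for cultureConditions in the Experiment table of this document.

```python
import warnings

# Suggested values for "cultureConditions", as listed in the Experiment table.
SUGGESTED = {"cultureConditions": ["axenic", "dirty", "clean"]}


def check_suggested(prop, value, suggested=SUGGESTED):
    """Accept any value, but warn when it is not one of the suggested ones."""
    options = suggested.get(prop)
    if options is not None and value not in options:
        warnings.warn(f"{prop}: {value!r} is not among the suggested values {options}")
        return False
    # Either the value is suggested or the property has no suggestions at all.
    return True


print(check_suggested("cultureConditions", "dirty"))
```

A value outside the list still passes through; the return value and the warning only signal that a suggested value might have been a better choice.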
This document was then re-ordered and thinned, which resulted in the initial versions of the dmdScheme.
Here we describe the structure of the dmdScheme.
print(dmdScheme, printData = FALSE, printAttr = FALSE, printExtAttr = FALSE)
#> dmdScheme - dmdSchemeSet
#> Experiment - dmdSchemeData
#> Genus - dmdSchemeData
#> Treatments - dmdSchemeData
#> Measurement - dmdSchemeData
#> DataExtraction - dmdSchemeData
#> DataFileMetaData - dmdSchemeData

| propertySet | valueProperty | unit | type | suggestedValues | Description | DATA_v0.9.5 |
|---|---|---|---|---|---|---|
| Experiment | name | NA | character | NA | The name of the experiment. | ASR-expt1 |
| NA | temperature | NA | character | treatment, in degrees celsius, measurement | Temperature used for all treatments. If different between treatments, use “treatment” and specify in the Treatment sheet. | 20 |
| NA | light | NA | character | treatment, light, dark, cycle, e.g. 16:8 LD | Light used for all treatments. If different between treatments, use “treatment” and specify in the Treatment sheet. | semi-ambient |
| NA | humidity | NA | character | treatment, relative humidity in % | Humidity used for all treatments. If different between treatments, use “treatment” and specify in the Treatment sheet. | ambient |
| NA | incubator | NA | character | none, bench | What type of incubator is used. | not given here |
| NA | container | NA | character | NA | What type of container is used. | Duran type bottle, red lids, 250ml |
| NA | microcosmVolume | ml | numeric | NA | Volume of the microcosm container. Not the volume of the culture medium! | 100 |
| NA | mediaType | NA | character | NA | NA | PPM |
| NA | mediaConcentration | g/l | numeric | NA | NA | 0.55 |
| NA | cultureConditions | NA | character | axenic, dirty, clean | Conditions of the cultures for all treatments. | dirty |
| NA | comunityType | NA | character | treatment, single trophic level, multiple trophic level | Characterisation of the microbe community. | initially unknown |
| NA | mediaAdditions | NA | character | NA | NA | Wheat seeds added on specific dates, see file wheat_seed_additions.csv |
| NA | duration | days | integer | NA | Length of the experiment in days. This should only include the time in which the measurements were taken! | 100 |
| NA | comment | NA | character | NA | Additional features of the Experiment you want to provide | NA |

| propertySet | Treatments | …3 | …4 |
|---|---|---|---|
| valueProperty | treatmentID | treatmentLevelHeight | comment |
| unit | NA | NA | NA |
| type | character | character | character |
| suggestedValues | species, temperatur, light, initial density, comunity composition, densities, dispersal, viscosity, disturbance, communityType | value, variable: freetext | NA |
| Description | ID of the treatment described in this row. Each treatmentID can occur multiple times as it can contain multiple treatment levels. | The value of the parameter if the parameter is constant over time, or a description of the variability. If unit is speciesId, a comma-separated list of all species in the treatment. | NA |
| DATA | Lid_treatment | Loose | NA |
| MULTIPLE ROWS | Lid_treatment | Tight | NA |
| NA | species_1 | tt_1, unknown | NA |
| NA | species_2 | unknown | NA |
| NA | species_3 | tt_1 | NA |

| propertySet | Measurement | …3 | …4 | …5 | …6 | …7 | …8 | …9 | …10 | …11 |
|---|---|---|---|---|---|---|---|---|---|---|
| valueProperty | measurementID | variable | method | unit | object | noOfSamplesInTimeSeries | samplingVolume | dataExtractionID | measuredFrom | comment |
| unit | NA | NA | NA | NA | NA | NA | ml | NA | NA | NA |
| type | character | character | character | character | character | integer | numeric | character | character | character |
| suggestedValues | NA | O2 concentration, video, manual count, abundance, DNA | presens Optode, microscopy | %, mmol, count | species, OUT, gene, community, particles | NA | NA | NA | NA | NA |
| Description | ID of the Measurement process. This includes methodology and variables. Each measurementID specifies one Measurement process and must be unique in this column. Should be in the mapping column in the DataFileMetaData tab. | The variable measured. | Name of the method used. | Unit of the measured variable. | The object measured, e.g. species in the case of manual count, gene for genetic analysis, particle for particle counters. | Total number of all samples in the time series. | The sampling volume. If e.g. the atmosphere in the container is sampled (oxygen measurements), then enter 0. Please use NA if the sampling volume is variable. | As used in the sheet DataExtraction, column dataExtractionID. | If measured from the experiment, raw; otherwise the measurementID (first column) of the Measurement it is based on. | NA |
| DATA | oxygen concentration | DO | presens Optode | % | community | 50 | 0 | none | raw | NA |
| MULTIPLE ROWS | abundance | abundance | molecular | count | species | 6 | 0.5 | Mol_Analy_pipeline1 | sequenceData | NA |
| NA | smell | smell | nose | rotten eggs or not | community | 6 | 0 | none | raw | NA |
| NA | sequenceData | DNA | NGS | Nucleotide | DNA fragment | 6 | 0 | none | raw | NA |
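
The two rules stated in the Measurement descriptions above (each measurementID is unique, and measuredFrom is either raw or the measurementID of another Measurement) can be sketched in Python as follows. The helpers `find_duplicates` and `derivation_chain` are hypothetical names for this illustration and not part of the dmdScheme package; the data is copied from the example rows.

```python
# measurementID -> measuredFrom, copied from the example rows above.
MEASURED_FROM = {
    "oxygen concentration": "raw",
    "abundance": "sequenceData",
    "smell": "raw",
    "sequenceData": "raw",
}


def find_duplicates(ids):
    """Return the measurementIDs that occur more than once, in order of appearance."""
    seen, duplicates = set(), []
    for measurement_id in ids:
        if measurement_id in seen and measurement_id not in duplicates:
            duplicates.append(measurement_id)
        seen.add(measurement_id)
    return duplicates


def derivation_chain(measurement_id, measured_from=MEASURED_FROM):
    """Follow measuredFrom links until a Measurement taken from the experiment ("raw")."""
    chain = [measurement_id]
    while True:
        current = chain[-1]
        if current not in measured_from:
            raise ValueError(f"unknown measurementID {current!r}")
        parent = measured_from[current]
        if parent == "raw":
            return chain
        if parent in chain:
            raise ValueError(f"circular measuredFrom reference at {parent!r}")
        chain.append(parent)


print(find_duplicates(list(MEASURED_FROM)))
print(derivation_chain("abundance"))
```

In the example, abundance is derived from sequenceData, which in turn is measured directly from the experiment.
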

| propertySet | DataExtraction | …3 | …4 | …5 | …6 |
|---|---|---|---|---|---|
| valueProperty | dataExtractionID | method | parameter | value | comment |
| unit | NA | NA | NA | NA | NA |
| type | character | character | character | character | character |
| suggestedValues | NA | bemovi x.y.z | NA | NA | NA |
| Description | Name of the DataExtraction process. This includes methodology and variables. Each name specifies one extraction process and can occur multiple times in the case of multiple parameters in the analysis. | Method used for the DataExtraction process, including the version if possible (in the case of R packages). | Parameter in the analysis. Only needs to be specified if it varies from the default. | Value of the parameter (you can enter a number or a word). | NA |
| DATA | Mol_Analy_pipeline1 | NA | NA | NA | See description in file xxx.yyy |
| MULTIPLE ROWS | NA | NA | NA | NA | NA |

| propertySet | DataFileMetaData | …3 | …4 | …5 | …6 | …7 | …8 |
|---|---|---|---|---|---|---|---|
| valueProperty | dataFileName | columnName | columnData | mappingColumn | type | description | comment |
| unit | NA | NA | NA | NA | NA | NA | NA |
| type | character | character | character | character | character | character | character |
| allowedValues | NA | NA | ID, Treatment, Measurement, Species, other | NA | integer, numeric, character, logical, datetime, date, time | NA | NA |
| Description | The name of the data set. | Name of the column in the data file. Each column in the data file needs to be documented! Use NA if the entry is for the whole data file and not specified in the dataFileName. | The type of the data in the column. ID: ID field (unique ID of unit of replication); Treatment: specifies treatment; Measurement: contains measurements; Species: contains species; other: other type of data. | columnData = Treatment: treatmentID as in the Treatment tab; columnData = Species: treatmentID referring to species composition as in the Treatment tab; columnData = Measurement: measurementID as in the Measurement tab; otherwise: NA | Type of the column. | If the column contains a measurement: general description. If type is datetime, date, or time, give the order of year, month, day, hour, minute, second as e.g. ymdhms, ymd, or hms. Do not give any other information, e.g. nothing about how months are entered (number or name) or how years, months, days, etc. are separated. | NA |
| DATA | dissolved_oxygen_measures.csv | Jar_ID | ID | NA | character | NA | NA |
| MULTIPLE ROWS | dissolved_oxygen_measures.csv | DO | Measurement | oxygen concentration | numeric | NA | NA |
| NA | dissolved_oxygen_measures.csv | Unit_1 | other | NA | character | NA | NA |
| NA | dissolved_oxygen_measures.csv | Mode | other | NA | character | NA | NA |
| NA | dissolved_oxygen_measures.csv | Location | other | NA | character | NA | NA |
| NA | dissolved_oxygen_measures.csv | Date_time | other | NA | datetime | ymdhms | NA |
| NA | dissolved_oxygen_measures.csv | Lid_treatment | Treatment | Lid_treatment | character | NA | NA |
| NA | dissolved_oxygen_measures.csv | Jar_type | other | NA | character | NA | NA |
| NA | dissolved_oxygen_measures.csv | Jar_ID | ID | NA | character | NA | NA |
| NA | smell.csv | NA | Species | species_1 | character | NA | NA |
| NA | smell.csv | smell | Measurement | smell | character | NA | NA |
| NA | smell.csv | Date | other | NA | datetime | ymdhms | NA |
| NA | smell.csv | Lid_treatment | Treatment | Lid_treatment | character | NA | NA |
| NA | smell.csv | Jar_type | other | NA | character | NA | NA |
| NA | abundances.csv | NA | Species | species_3 | character | NA | NA |
| NA | abundances.csv | Jar_ID | ID | NA | character | NA | NA |
| NA | abundances.csv | Date_time | other | NA | datetime | ymdhms | NA |
| NA | abundances.csv | Lid_treatment | Treatment | Lid_treatment | character | NA | NA |
| NA | abundances.csv | Jar_type | other | NA | character | NA | NA |
| NA | abundances.csv | count_number | Measurement | abundance | numeric | NA | NA |
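
The mappingColumn rule described above links the DataFileMetaData table to the Treatments and Measurement tables. A minimal Python sketch of that cross-reference check, using a subset of the example rows, could look as follows. The helper `check_mapping` is a hypothetical name for this illustration and not part of the dmdScheme package; `None` stands for NA.

```python
# IDs copied from the Treatments and Measurement example tables.
TREATMENT_IDS = {"Lid_treatment", "species_1", "species_2", "species_3"}
MEASUREMENT_IDS = {"oxygen concentration", "abundance", "smell", "sequenceData"}

# (dataFileName, columnName, columnData, mappingColumn); a subset of the example rows.
ROWS = [
    ("dissolved_oxygen_measures.csv", "Jar_ID", "ID", None),
    ("dissolved_oxygen_measures.csv", "DO", "Measurement", "oxygen concentration"),
    ("dissolved_oxygen_measures.csv", "Lid_treatment", "Treatment", "Lid_treatment"),
    ("smell.csv", None, "Species", "species_1"),
    ("abundances.csv", "count_number", "Measurement", "abundance"),
]


def check_mapping(rows, treatment_ids=TREATMENT_IDS, measurement_ids=MEASUREMENT_IDS):
    """Return a list of messages for rows whose mappingColumn does not resolve."""
    problems = []
    for file_name, column, column_data, mapping in rows:
        if column_data in ("Treatment", "Species"):
            valid = treatment_ids
        elif column_data == "Measurement":
            valid = measurement_ids
        else:
            # ID and other: mappingColumn is expected to be NA.
            if mapping is not None:
                problems.append(f"{file_name}/{column}: mappingColumn should be NA")
            continue
        if mapping not in valid:
            problems.append(
                f"{file_name}/{column}: unknown {column_data} reference {mapping!r}"
            )
    return problems


print(check_mapping(ROWS))
```

An empty list means every Treatment, Species, and Measurement column points at an ID that exists in the corresponding table.
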

A property set that contains data properties must be seen as a table and must therefore have the same number of entries for each data property.
The structure as of 2019-05-24 16:34:28 GMT is as follows:
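The requirement that every data property has the same number of entries can be sketched as a simple check in Python. The helpers `entry_counts` and `is_rectangular` are hypothetical names for this illustration and not part of the dmdScheme package; the data is the Treatments example from this document, with `None` standing for NA.

```python
def entry_counts(table):
    """Map each value property (column) to its number of entries."""
    return {name: len(values) for name, values in table.items()}


def is_rectangular(table):
    """True if every value property has the same number of entries."""
    return len(set(entry_counts(table).values())) <= 1


# Treatments data from the example, one list per value property.
treatments = {
    "treatmentID": ["Lid_treatment", "Lid_treatment", "species_1", "species_2", "species_3"],
    "treatmentLevelHeight": ["Loose", "Tight", "tt_1, unknown", "unknown", "tt_1"],
    "comment": [None, None, None, None, None],
}

print(entry_counts(treatments))
print(is_rectangular(treatments))
```

Missing values are entered explicitly as NA, so that each column keeps its full length.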
An xml file with the example data can be downloaded from here
The xsd grammar has been generated using xmlgrid.net. You can download it from here (right mouse click, then Save Linked Content).