# Photocatalysis subproject example (synthesis_bots)

## Workflow preparation and file system setup

The files in `DATA/INPUT_MINIMUM` are required to execute the first part of the workflow. All subsequent input files (in `DATA/INPUT`) are expected outputs that have been generated by the decision maker based on the experimental results.

### Potential MS hits

Monoisotopic masses for the expected product (and a number of common adducts) were calculated using the [pyISOPACh package](https://github.com/AberystwythSystemsBiology/pyISOPACh). The possible expected m/z values for each sample are kept in a nested dictionary with the following structure:

```python
{
    "SAMPLE_ID": {
        "[M+H]+": {
            "1": MZ_VALUE
        },
        "[M+Na]+": {
            "1": MZ_VALUE,
        },
        ...
    },
    ...
}
```
The resulting dictionary can be found in the input data folder as `PHOTOCATALYSIS-SCREENING-EXPECTED-MS.json`.
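
As a rough illustration of how such a dictionary might be consumed, the sketch below parses a small JSON string with the same nesting and checks a list of observed m/z values against it. The sample ID, the m/z values, the observed peaks and the helper `find_hits` are all hypothetical; the default tolerance of 0.4 mirrors `peak_match_tolerance` from the settings file, but the way the package itself applies that tolerance is an assumption here.

```python
import json

# Hypothetical content mirroring the nested structure shown above.
# The sample ID and m/z values are placeholders, not real data.
expected_text = """
{
    "SAMPLE-1": {
        "[M+H]+": {"1": 301.14},
        "[M+Na]+": {"1": 323.12}
    }
}
"""
expected = json.loads(expected_text)


def find_hits(expected_for_sample, observed_mz, tolerance=0.4):
    """Return (adduct, key, expected m/z, observed m/z) for every match.

    A match is an observed peak within `tolerance` of an expected value
    (an assumed matching rule, not necessarily the one used by the package).
    """
    hits = []
    for adduct, entries in expected_for_sample.items():
        for key, mz in entries.items():
            for peak in observed_mz:
                if abs(peak - mz) <= tolerance:
                    hits.append((adduct, key, mz, peak))
    return hits


observed = [150.02, 301.30, 412.88]  # hypothetical observed peaks
hits = find_hits(expected["SAMPLE-1"], observed)
print(hits)
```

Only the `[M+H]+` entry is matched in this toy input, because no observed peak lies within 0.4 of the `[M+Na]+` value.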

### Settings file

The `settings.toml` file contains the settings used throughout the workflow, including the paths for archiving data, names for the workflows, paths for the corresponding Python files and default settings for NMR and LCMS acquisition. In particular, when extending the workflow it is important to take note of these lines:

```toml
[workflows.LCMS]
Photochem    = "synthesis_bots.workflows.ms.photocat.photocat"
```

as they indicate which file will be run when a given command is received. In the example above, upon receiving the string `Photochem`, the LCMS will execute `main()` from `synthesis_bots/workflows/ms/photocat/photocat.py`.
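
The relationship between a command string, its dotted module path and the file on disk can be sketched as follows. This is only an illustration of the naming convention: the dictionary `WORKFLOWS_LCMS` and the helper `module_file_for` are hypothetical and are not part of `synthesis_bots`, whose actual dispatch code is not shown here.

```python
from pathlib import PurePosixPath

# Hypothetical in-memory copy of the [workflows.LCMS] table.
WORKFLOWS_LCMS = {
    "Photochem": "synthesis_bots.workflows.ms.photocat.photocat",
}


def module_file_for(command, table):
    """Translate a command string into the relative path of its module file."""
    dotted = table[command]  # raises KeyError for an unknown command
    return PurePosixPath(*dotted.split(".")).with_suffix(".py")


print(module_file_for("Photochem", WORKFLOWS_LCMS))
```

Adding a new workflow under this convention would amount to adding one more `Command = "package.module"` line to the relevant table and providing the module it points to.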

```python
settings_toml = '''[dry]

dry = true

[tcp]

HOST      = "tcp://172.31.1.17:5558"
NMR       = "tcp://172.31.1.15:5552"
CHEMSPEED = "tcp://172.31.1.16:5553"
LCMS      = "tcp://172.31.1.18:5554"

[paths]

LCMS_archive   = "."                                 # Path to archive from LCMS PC
LCMS_queue     = "."                                 # Path to LCMS queue for MassLynx
LCMS_data      = "DATA/PHOTOCATALYSIS/DATA/LCMS"     # Raw LCMS data on LCMS control PC
LCMS_to_NMR    = "."                                 # Path to NMR data from LCMS PC
NMR_data       = "."                                 # Raw NMR data on NMR control PC
NMR_archive    = "."                                 # Path to archive from NMR PC
CS_csv_supra   = "."                                 # CSV on ChemSpeed Computer
CS_csv_medchem = "."                                 # CSV on ChemSpeed Computer

[defaults.MS]

injection_volume      = 0.5
peak_match_tolerance  = 0.4
analog_peak_threshold = 0.1
tic_peak_params       = { "height" = 0.2, "distance" = 50 }
analog_peaks_params   = { "height" = 0.1, "distance" = 50 }
ms_peak_params        = { "height" = 0.1, "distance" = 10 }
solvent_front         = 0.55
lc_run_end            = 3.0
integral_rel_height   = 0.95
lc_ms_flowpath        = 2.84
# Time (s) for sample to reach MS after LC detector

[workflows.PREFIX]

Photocat             = "PHOTOCATALYSIS"

[workflows.LCMS]

InsertRack1  = "synthesis_bots.workflows.ms.insert_rack_one"
InsertRack2  = "synthesis_bots.workflows.ms.insert_rack_two"
ExtractRack1 = "synthesis_bots.workflows.ms.eject_rack_one"
ExtractRack2 = "synthesis_bots.workflows.ms.eject_rack_two"
Photochem    = "synthesis_bots.workflows.ms.photocat.photocat"

'''

with open("settings.toml", "w") as f:
    f.write(settings_toml)
```

## Photocatalyst screening with UPLC-MS

The first and only analysis step of this workflow is UPLC-MS analysis of the post-irradiation reaction mixtures obtained with different catalysts. The goal of this example is to demonstrate how easily the setup can be expanded with a simple off-line reaction station, rather than to perform any novel reaction analysis (see Examples 1 and 2). As before, note that all success criteria can be modified by domain experts in the settings file if they so wish.

It can easily be seen from the resulting chromatograms that conditions 1, 2, 6 and 7 did not progress, condition 3 did not reach completion, and conditions 4 and 5 performed best. The peaks of interest overlap because they belong to very similar isomers, but only their sum (i.e., reaction progress) is of interest.
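
Why overlapping peaks are not a problem when only their sum matters can be shown with a small synthetic example: integrating over a single window that spans both peaks recovers the combined area without any need to separate them. The chromatogram, peak positions, widths and the helper `window_area` below are all hypothetical and are not taken from the measured data or from the package's own integration routine.

```python
import math


def gaussian(t, centre, sigma, height):
    """Value of a Gaussian peak at time t."""
    return height * math.exp(-0.5 * ((t - centre) / sigma) ** 2)


def window_area(times, signal, start, end):
    """Trapezoidal area of `signal` over samples with start <= t <= end."""
    points = [(t, s) for t, s in zip(times, signal) if start <= t <= end]
    return sum(
        0.5 * (s0 + s1) * (t1 - t0)
        for (t0, s0), (t1, s1) in zip(points, points[1:])
    )


# Synthetic trace: two overlapping isomer peaks (placeholder parameters).
times = [i * 0.005 for i in range(601)]  # 0.0 to 3.0 in steps of 0.005
signal = [
    gaussian(t, 1.50, 0.04, 1.0) + gaussian(t, 1.58, 0.04, 0.6)
    for t in times
]

# One window covering both peaks gives their summed area directly.
total_area = window_area(times, signal, 1.2, 1.9)

# Analytical area of a Gaussian is height * sigma * sqrt(2 * pi).
expected_area = (1.0 + 0.6) * 0.04 * math.sqrt(2 * math.pi)
print(round(total_area, 4), round(expected_area, 4))
```

The two printed values agree closely, which is the property the analysis relies on: the summed area is insensitive to how strongly the individual isomer peaks overlap.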

```python
from pathlib import Path
from synthesis_bots.utils.constants import PATHS

DATA = Path.cwd() / "DATA"
INPUT = DATA / "INPUT"
RAW_NMR = Path.cwd() / "NMR"
PLOTS = Path.cwd() / "PLOTS"
SUMMARY = Path.cwd() / "SUMMARY"
PATHS["LCMS_data"] = DATA / "PHOTOCATALYSIS" / "DATA" / "LCMS"

from synthesis_bots.workflows.ms.photocat.photocat import results_analysis

results_analysis(
    expected_json=INPUT / "PHOTOCATALYSIS-EXPECTED-MS.json",
    archive_path=DATA / "PHOTOCATALYSIS" / "DATA" / "LCMS",
    summary_path=DATA / "PHOTOCATALYSIS" / "DATA" / "SUMMARY_MS.json"
)
```