# Methods and Reproducibility

## Run metadata
- Run timestamp (UTC): `2026-02-11T13:22:16Z`
- Working directory: repository root (relative paths only)
- Environment:
  - `python 3.13.2`
  - `pandas 2.2.3`
  - `numpy 2.3.3`
  - `matplotlib 3.10.6`
  - `pdftotext 24.02.0`
- Platform: `Linux-6.17.0-14-generic-x86_64-with-glibc2.39`
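
The metadata above can be captured programmatically at the start of each run. The following is a minimal sketch, not the script used for this run; the function name `collect_run_metadata` and the dictionary keys are illustrative assumptions. It does not cover `pdftotext`, whose version would have to be read from the command-line tool separately.

```python
import json
import platform
from datetime import datetime, timezone
from importlib import metadata


def collect_run_metadata(packages=("pandas", "numpy", "matplotlib")):
    """Return a dict describing the interpreter, platform, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        # Same UTC format as the run timestamp recorded above.
        "timestamp_utc": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    print(json.dumps(collect_run_metadata(), indent=2))
```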

## Preflight (Phase 0)
### A) Draft PDF detection
- The directory listing found a single PDF in the repository root: the input draft PDF.
- Selection rule outcome:
  - The filename matched `Crohn` / `Causal Model`, so this root-level PDF was selected as `DRAFT_PDF` under the filename/size rule.
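
One way to express a filename/size selection rule like the one above is sketched here. The exact rule is not spelled out in this document, so the details are assumptions: keywords are matched case-insensitively, and the largest file wins when several candidates remain. `select_draft_pdf` and `KEYWORDS` are hypothetical names.

```python
from pathlib import Path

# Assumed keywords, taken from the selection outcome above; matched case-insensitively.
KEYWORDS = ("crohn", "causal model")


def select_draft_pdf(root="."):
    """Pick the root-level PDF whose name matches a keyword; break ties by file size."""
    pdfs = [p for p in Path(root).iterdir() if p.is_file() and p.suffix.lower() == ".pdf"]
    if not pdfs:
        raise FileNotFoundError(f"no PDF found in {root!r}")
    matches = [p for p in pdfs if any(k in p.name.lower() for k in KEYWORDS)]
    candidates = matches or pdfs  # fall back to every PDF if no filename matches
    return max(candidates, key=lambda p: p.stat().st_size)


if __name__ == "__main__":
    print(select_draft_pdf("."))
```

The printed path could then be exported as `DRAFT_PDF` before the extraction step.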

### B) Full-text extraction
- Extraction command:
```bash
mkdir -p output/tables output/figures/code
pdftotext -layout "$DRAFT_PDF" output/extracted_draft.txt
```
- Output: `output/extracted_draft.txt`

### C) Content map creation
- Generated:
  - `output/draft_outline.json`
  - `output/draft_sections.json`

## Real-data quantitative execution
### Public dataset downloads (executed)
```bash
mkdir -p output/data  # wget -O does not create missing parent directories
wget -O output/data/ti_immune_crohns_normal.h5ad \
  https://datasets.cellxgene.cziscience.com/6cfb8c33-9cfb-4e16-b868-18aff944e55a.h5ad

wget -O output/data/ti_epithelial_crohns_normal.h5ad \
  https://datasets.cellxgene.cziscience.com/b7c3c27c-97c2-4983-97df-1d537d138a43.h5ad

wget -O output/data/ti_stromal_crohns_normal.h5ad \
  https://datasets.cellxgene.cziscience.com/1a640ddc-ea3c-4711-ba8e-07084cc40a88.h5ad
```

### Table generation from real sources (executed)
```bash
python output/figures/code/build_real_data_tables.py
```
- Sources used by the script:
  - Open Targets Platform GraphQL (`Crohn disease`, `EFO_0000384`)
  - STRING API (human, species `9606`)
  - CELLxGENE TI Crohn/normal scRNA datasets above
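
The queries issued by `build_real_data_tables.py` are not reproduced in this document. As a rough illustration of what requests to the first two sources can look like, the sketch below only builds the requests and sends nothing. The endpoints are the public Open Targets GraphQL and STRING API endpoints; the page size, the selected fields, the helper names, and the example gene symbols are assumptions and are not taken from the script.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

OPEN_TARGETS_URL = "https://api.platform.opentargets.org/api/v4/graphql"
STRING_URL = "https://string-db.org/api/tsv/network"

# Assumed field selection; the real script may request different fields.
ASSOCIATED_TARGETS_QUERY = """
query associatedTargets($efoId: String!, $size: Int!) {
  disease(efoId: $efoId) {
    id
    name
    associatedTargets(page: {index: 0, size: $size}) {
      rows { target { id approvedSymbol } score }
    }
  }
}
"""


def open_targets_request(efo_id="EFO_0000384", size=50):
    """Build (but do not send) a GraphQL POST for targets associated with a disease."""
    payload = {"query": ASSOCIATED_TARGETS_QUERY,
               "variables": {"efoId": efo_id, "size": size}}
    return Request(OPEN_TARGETS_URL,
                   data=json.dumps(payload).encode("utf-8"),
                   headers={"Content-Type": "application/json"},
                   method="POST")


def string_network_url(genes, species=9606):
    """Build a STRING network URL; identifiers are separated by carriage returns."""
    params = {"identifiers": "\r".join(genes), "species": species}
    return f"{STRING_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder gene symbols for illustration only, not the manuscript's node list.
    print(string_network_url(["NOD2", "IL23R"]))
    print(open_targets_request().full_url)
```

Passing the returned `Request` or URL to `urllib.request.urlopen` would perform the actual call; saving the raw responses next to the derived CSVs keeps the snapshot auditable.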

### Figure regeneration (executed)
```bash
python output/figures/code/generate_figures.py
```

### Manuscript format generation (executed)
```bash
python -m pip install pypandoc-binary
python - <<'PY'
from pathlib import Path
import pypandoc
root=Path('output')
src=root/'manuscript.md'
tmp=root/'.manuscript_body.md'
text=src.read_text(encoding='utf-8').splitlines()
# Drop the leading H1 and any blank lines after it; the title is passed via -M below.
if text and text[0].startswith('# '):
    text=text[1:]
    while text and text[0].strip()=='':
        text=text[1:]
tmp.write_text('\n'.join(text)+'\n',encoding='utf-8')
common=['--standalone','--from=gfm+raw_html',
        '-M',"title=A five-node causal circuit for ileal Crohn's disease",
        '-M','author=Alexander Humphries','-M','date=']
pypandoc.convert_file(str(tmp),'html5',outputfile=str(root/'manuscript.html'),
                      extra_args=common+['--css','manuscript.css'])
pypandoc.convert_file(str(tmp),'latex',outputfile=str(root/'manuscript.tex'),
                      extra_args=common+['-V','geometry:margin=1in','-V','fontsize=11pt','-V','linestretch=1.1'])
PY
google-chrome --headless --disable-gpu --no-pdf-header-footer \
  --print-to-pdf='output/manuscript.pdf' \
  'output/manuscript.html'
```

## Key outputs
- Manuscript: `output/manuscript.md`
- LaTeX: `output/manuscript.tex`
- PDF: `output/manuscript.pdf`
- Tables:
  - `output/tables/node_rank_table.csv`
  - `output/tables/edge_evidence_scores.csv`
  - `output/tables/phenotype_mapping_scores.csv`
  - `output/tables/claims_evidence_table.csv`
- Figures:
  - `output/figures/fig1_minimal_circuit.png`
  - `output/figures/fig2_node_leverage.png`
  - `output/figures/fig3_edge_evidence_heatmap.png`
  - `output/figures/fig4_phenotype_mapping.png`

## Intermediate reproducibility artifacts
- Open Targets snapshot: `output/data/opentargets_crohns_associated_targets.csv`
- STRING pair scores: `output/data/string_pair_scores.csv`
- Cell-type means: `output/data/crohn_normal_celltype_means.csv`
- Crohn-normal deltas: `output/data/crohn_minus_normal_celltype_deltas.csv`
- Dataset manifest: `output/data/scRNA_dataset_manifest.csv`
- Missing genes list: `output/data/missing_genes_in_scrna.json`
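
The relationship between the cell-type means and the Crohn-normal deltas can be illustrated with a small sketch. The column names used here (`cell_type`, `gene`, `condition`, `mean_expr`) and the condition labels are hypothetical, since the schema of the CSV files is not shown in this document; the only grounded part is that each delta is a Crohn mean minus the matching normal mean.

```python
from collections import defaultdict


def crohn_minus_normal(means):
    """Compute per (cell_type, gene) deltas from rows of condition-level means.

    `means` is an iterable of dicts, e.g. rows from csv.DictReader, with the
    assumed keys cell_type, gene, condition ('crohn' or 'normal'), mean_expr.
    """
    by_key = defaultdict(dict)
    for row in means:
        key = (row["cell_type"], row["gene"])
        by_key[key][row["condition"]] = float(row["mean_expr"])

    deltas = []
    for (cell_type, gene), cond in sorted(by_key.items()):
        # Skip pairs that lack either arm instead of treating the gap as zero.
        if "crohn" in cond and "normal" in cond:
            deltas.append({"cell_type": cell_type, "gene": gene,
                           "delta": cond["crohn"] - cond["normal"]})
    return deltas
```

Skipping incomplete pairs, rather than filling them with zeros, avoids reporting a spurious difference for a cell type observed in only one condition.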

## End-to-end rerun checklist
1. Ensure the three CELLxGENE H5AD files exist in `output/data/`.
2. Run `python output/figures/code/build_real_data_tables.py`.
3. Run `python output/figures/code/generate_figures.py`.
4. Rebuild manuscript assets (`.html`, `.tex`, `.pdf`) after text edits.
5. Sync `zenodo preprint/` and `nature/` packaging folders from updated `output/`.
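
Step 1 of the checklist can be automated with a small preflight check. This is a sketch under the assumption that the file names match the `wget -O` targets used in the download step; `missing_datasets` is a hypothetical helper. Empty files are flagged as well, because `wget -O` can leave a zero-byte file behind after a failed download.

```python
import sys
from pathlib import Path

# File names taken from the download commands in this document.
REQUIRED_H5AD = (
    "ti_immune_crohns_normal.h5ad",
    "ti_epithelial_crohns_normal.h5ad",
    "ti_stromal_crohns_normal.h5ad",
)


def missing_datasets(data_dir="output/data"):
    """Return the required H5AD file names that are absent or empty in data_dir."""
    base = Path(data_dir)
    return [name for name in REQUIRED_H5AD
            if not (base / name).is_file() or (base / name).stat().st_size == 0]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        sys.exit("Missing or empty datasets: " + ", ".join(missing))
    print("All three CELLxGENE H5AD files are present.")
```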
