Published January 17, 2026 | Version V1.0
Preprint Open

Structure Before Function: Comparative Genome Analysis Reveals Stable Species-specific Organization

Authors/Creators

Description

This repository contains the data and code supporting the study
“Structure Before Function: Comparative Genome Analysis Reveals Stable Species-specific Organization.”

The dataset presents genome-wide structural summaries derived from publicly available reference genomes of nine species:
human, chimpanzee, bonobo, gorilla, orangutan, dog, wolf, horse, and donkey.

Rather than focusing on gene annotation or functional interpretation, this work analyzes genomic sequences using an annotation-free, windowed statistical representation.
The aim is to characterize large-scale organizational patterns across entire genomes, including regions commonly classified as non-coding, repetitive, or low-complexity.
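
As a rough illustration of what an annotation-free, windowed representation can look like, the sketch below slides a fixed-size window along a sequence and records two simple per-window statistics: GC fraction and Shannon entropy of base composition. The function name `window_stats`, the window and step sizes, and the choice of statistics are assumptions made for illustration; they are not taken from the repository's implementation.

```python
import math
from collections import Counter


def window_stats(seq, window=1000, step=1000):
    """Yield (start, gc_fraction, entropy_bits) for each full window.

    Hypothetical helper: the statistics used in the study may differ.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        # Count only unambiguous bases; N and other codes are ignored.
        counts = Counter(base for base in chunk if base in "ACGT")
        total = sum(counts.values())
        if total == 0:
            continue  # window contains no unambiguous bases
        gc = (counts["G"] + counts["C"]) / total
        # Shannon entropy of base composition, in bits (0 to 2).
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        yield start, gc, entropy
```

Because no annotation is consulted, every window is treated the same way, whether it falls in coding, non-coding, repetitive, or low-complexity sequence.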

The results demonstrate that such regions exhibit stable, species-specific structural signatures when analyzed at a genome-wide scale.
These findings suggest that genome organization itself constitutes a meaningful descriptive layer, complementary to but distinct from local gene-centric functional analysis.

This repository includes:

  • A Python implementation used to generate windowed genome representations and summary statistics.

  • Per-species output files containing window-level measurements, detected outliers, and aggregated structural summaries.

  • Documentation describing data provenance, analysis scope, and reproducibility considerations.
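
The per-species outputs listed above combine window-level measurements, detected outliers, and aggregated summaries. One possible way to derive the latter two from a list of per-window values is sketched below, using a median/MAD-based modified z-score. The function name `summarize`, the 3.5 threshold, and the outlier rule itself are assumptions for illustration only; the repository's code may define outliers differently.

```python
import statistics


def summarize(values, threshold=3.5):
    """Aggregate per-window values and flag outlier windows.

    Hypothetical sketch: outliers are windows whose modified z-score
    (based on the median absolute deviation, MAD) exceeds `threshold`.
    """
    values = list(values)
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    outliers = []
    if mad > 0:  # with MAD == 0 the score is undefined; flag nothing
        outliers = [i for i, v in enumerate(values)
                    if 0.6745 * abs(v - med) / mad > threshold]
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "median": med,
        "mad": mad,
        "outliers": outliers,  # indices of flagged windows
    }
```

A median-based rule is used here because a handful of extreme windows would otherwise inflate a mean and standard deviation computed over the same data.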

No claims are made regarding biological function, evolutionary mechanism, or phenotypic causality.
The materials are intended to support structural analysis and comparative interpretation at the level of genome-wide organization.

The dataset is provided to support transparency and reproducibility, and to enable reuse in related studies of statistical structure, complexity, and organization in biological sequences.

Files (734.7 MB)

01_Structure_Before_Function.pdf

Checksum                               Size
md5:b68147f5769a8be137f43827dae4dc30   308.8 kB
md5:405c913a1719ffab2807d95e0d61aa3a   734.4 MB
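
Each file is listed with an MD5 checksum, which can be used to confirm that a download is complete and unmodified. A minimal sketch using Python's standard `hashlib` module follows; the function name `md5_of_file` and the chunk size are illustrative choices, and the local file path is whatever name the download was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

For the 734.4 MB file, the returned string should equal `405c913a1719ffab2807d95e0d61aa3a` (the listed value without the `md5:` prefix).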

Additional details

Dates

Issued
2026-01-17