seismicrna.table package

Submodules

class seismicrna.table.base.AvgTable

Bases: RelTypeTable, ABC

Average over an ensemble of RNA structures.

classmethod header_type()

Type of the header for the table.

class seismicrna.table.base.ClustFreqTable

Bases: FreqTable, ABC

classmethod header_type()

Type of the header for the table.

classmethod index_depth()

Number of columns in the index.

classmethod kind()

Kind of table.

class seismicrna.table.base.ClustPosTable

Bases: ClustTable, ProfilePosTable, ABC

class seismicrna.table.base.ClustReadTable

Bases: ClustTable, ReadTable, ABC

class seismicrna.table.base.ClustTable

Bases: RelTypeTable, ABC

Cluster for each RNA structure in an ensemble.

classmethod header_type()

Type of the header for the table.

classmethod kind()

Kind of table.

class seismicrna.table.base.FreqTable

Bases: Table, ABC

Table of frequencies.

classmethod by_read()

Whether the table contains data for each read.

property data: Series

Table’s data.

classmethod path_segs()

Table’s path segments.

class seismicrna.table.base.MaskPosTable

Bases: MaskTable, ProfilePosTable, ABC

class seismicrna.table.base.MaskReadTable

Bases: MaskTable, ReadTable, ABC

class seismicrna.table.base.MaskTable

Bases: AvgTable, ABC

classmethod kind()

Kind of table.

class seismicrna.table.base.PosTable

Bases: RelTypeTable, ABC

Table indexed by position.

MASK = 'pos-mask'

classmethod by_read()

Whether the table contains data for each read.

ci_count(confidence: float, **kwargs)

Confidence intervals of counts, under these simplifications:

  • Counts are independent of each other.

  • Counts follow binomial distributions.

  • Coverage counts are constant.

Parameters:
  • confidence (float) – Confidence level; must be in [0, 1).

  • **kwargs – Keyword arguments for fetch methods.

Returns:

Lower and upper bounds of the confidence interval.

Return type:

tuple[pandas.DataFrame, pandas.DataFrame]
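The three simplifications above can be illustrated for a single position with a minimal sketch. It is not the library's implementation: the helper names `binom_quantile` and `ci_count_sketch` are hypothetical, the observed ratio is assumed to serve as the binomial probability, and the interval is assumed to be equal-tailed. Only the standard library is used, so it handles one count at a time rather than whole DataFrames.

```python
from math import comb


def binom_quantile(n: int, p: float, q: float) -> int:
    """Smallest k such that P(X <= k) >= q for X ~ Binomial(n, p)."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1.0 - p) ** (n - k)
        if cumulative >= q:
            return k
    # Rounding can leave the cumulative sum marginally below q.
    return n


def ci_count_sketch(count: int, coverage: int, confidence: float) -> tuple[int, int]:
    """Equal-tailed interval for one count, treating coverage as constant."""
    if not 0.0 <= confidence < 1.0:
        raise ValueError(f"confidence must be in [0, 1), but got {confidence}")
    # ASSUMPTION: the observed ratio is used as the binomial probability.
    p = count / coverage
    tail = (1.0 - confidence) / 2.0
    return (binom_quantile(coverage, p, tail),
            binom_quantile(coverage, p, 1.0 - tail))


# Hypothetical example: 10 mutations observed among 200 covering reads.
count_lower, count_upper = ci_count_sketch(count=10, coverage=200, confidence=0.95)
print(count_lower, count_upper)
```

The sketch sums the probability mass function directly, which is adequate for modest coverage; a statistics library would be the natural choice for large tables.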

ci_ratio(confidence: float, **kwargs)

Confidence intervals of ratios, under these simplifications:

  • Ratios are independent of each other.

  • Ratios follow beta distributions.

  • Coverage counts are constant.

Parameters:
  • confidence (float) – Confidence level; must be in [0, 1).

  • **kwargs – Keyword arguments for fetch methods.

Returns:

Lower and upper bounds of the confidence interval.

Return type:

tuple[pandas.DataFrame, pandas.DataFrame]
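A beta-distributed ratio can be sketched for a single position as follows. This is an illustration under stated assumptions, not the library's method: the shape parameters `count + 1` and `coverage - count + 1` (a uniform prior) are an assumption, and the helper names are hypothetical. Because both shape parameters are integers here, the beta CDF can be computed exactly from a binomial sum, and the quantile is found by bisection.

```python
from math import comb


def beta_cdf(x: float, a: int, b: int) -> float:
    """CDF of Beta(a, b) for positive integers a and b.

    Uses the identity I_x(a, b) = P(Y >= a) for Y ~ Binomial(a + b - 1, x).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1.0 - x) ** (n - j)
               for j in range(a, n + 1))


def beta_quantile(q: float, a: int, b: int, tol: float = 1e-9) -> float:
    """Invert the CDF by bisection on the interval [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def ci_ratio_sketch(count: int, coverage: int, confidence: float) -> tuple[float, float]:
    """Equal-tailed interval for one ratio, treating coverage as constant."""
    if not 0.0 <= confidence < 1.0:
        raise ValueError(f"confidence must be in [0, 1), but got {confidence}")
    # ASSUMPTION: Beta(count + 1, coverage - count + 1), i.e. a uniform prior.
    a = count + 1
    b = coverage - count + 1
    tail = (1.0 - confidence) / 2.0
    return beta_quantile(tail, a, b), beta_quantile(1.0 - tail, a, b)


# Hypothetical example: 10 mutations observed among 200 covering reads.
ratio_lower, ratio_upper = ci_ratio_sketch(count=10, coverage=200, confidence=0.95)
print(ratio_lower, ratio_upper)
```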

property end3

property end5

classmethod index_depth()

Number of columns in the index.

iter_profiles(*, sections: Iterable[Section] | None = None, quantile: float = 0.0, rel: str = 'Mutated', order: int | None = None, clust: int | None = None)

Yield RNA mutational profiles from the table.

classmethod path_segs()

Table’s path segments.

property range

property range_int

resample(fraction: float = 1.0, *, exclude_masked: bool = False, seed: int | None = None, max_seed: int = 4294967296)

Resample the reads and return a new DataFrame.

Parameters:
  • fraction (float = 1.) – Number of reads to resample, expressed as a fraction of the original number of reads. Must be ≥ 0; may be > 1.

  • exclude_masked (bool = False) – Exclude positions that have been masked.

  • seed (int | None = None) – Seed for the random number generator.

  • max_seed (int = 2 ** 32) – Maximum seed to pass to the next random number generator.
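Since fraction may exceed 1, the resampling is presumably done with replacement. The sketch below shows that idea on a plain list of read names. It is an assumption-laden illustration, not the library's code: `resample_reads` is a hypothetical helper, and drawing a follow-up seed below max_seed is only one plausible reading of that parameter.

```python
import random


def resample_reads(read_ids, fraction=1.0, *, seed=None, max_seed=2**32):
    """Draw round(fraction * n) reads with replacement.

    Returns the resampled reads and a seed for a downstream generator.
    """
    if fraction < 0.0:
        raise ValueError(f"fraction must be >= 0, but got {fraction}")
    rng = random.Random(seed)
    n_draws = round(fraction * len(read_ids))
    # Sampling with replacement lets n_draws exceed the number of reads.
    sampled = rng.choices(read_ids, k=n_draws)
    # ASSUMPTION: the next generator is seeded with an integer in [0, max_seed).
    next_seed = rng.randrange(max_seed)
    return sampled, next_seed


# Hypothetical read names.
reads = ["read1", "read2", "read3", "read4"]
sampled_reads, next_seed = resample_reads(reads, 1.5, seed=0)
print(len(sampled_reads), next_seed)
```

Passing the same seed reproduces the same resample, which is useful for bootstrapping confidence intervals reproducibly.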

property section

Section covered by the table.

property seq

class seismicrna.table.base.ProfilePosTable

Bases: PosTable, ABC

class seismicrna.table.base.ReadTable

Bases: RelTypeTable, ABC

Table indexed by read.

classmethod by_read()

Whether the table contains data for each read.

classmethod index_depth()

Number of columns in the index.

classmethod path_segs()

Table’s path segments.

property reads

class seismicrna.table.base.RelPosTable

Bases: RelTable, PosTable, ABC

class seismicrna.table.base.RelReadTable

Bases: RelTable, ReadTable, ABC

class seismicrna.table.base.RelTable

Bases: AvgTable, ABC

classmethod kind()

Kind of table.

class seismicrna.table.base.RelTypeTable

Bases: Table, ABC

Table with multiple types of relationships.

property data: DataFrame

Table’s data.

fetch_count(*, exclude_masked: bool = False, squeeze: bool = False, **kwargs) Series | DataFrame

Fetch counts of one or more columns.

fetch_ratio(*, exclude_masked: bool = False, squeeze: bool = False, precision: int | None = None, quantile: float = 0.0, **kwargs) Series | DataFrame

Fetch ratios of one or more columns.

classmethod header_rows() list[int]

Row(s) of the file to use as the columns.

class seismicrna.table.base.Table

Bases: ABC

Table base class.

abstract classmethod by_read() bool

Whether the table contains data for each read.

property data: DataFrame | Series

Table’s data.

classmethod ext()

Table’s file extension: either ‘.csv’ or ‘.csv.gz’.

classmethod gzipped()

Whether the table’s file is compressed with gzip.
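The relationship between ext() and gzipped() can be sketched with two stand-in classes. These classes are hypothetical and not part of seismicrna; which table types are actually gzipped is not stated here, so the subclass below is purely illustrative. The sketch also shows how a reader might pick the right opener from the file name.

```python
import csv
import gzip
import tempfile
from pathlib import Path


class TableSketch:
    """Hypothetical stand-in for a table class (not part of seismicrna)."""

    @classmethod
    def gzipped(cls) -> bool:
        return False

    @classmethod
    def ext(cls) -> str:
        # The extension follows from whether the file is gzipped.
        return ".csv.gz" if cls.gzipped() else ".csv"


class GzippedTableSketch(TableSketch):
    """Hypothetical table type whose file is compressed."""

    @classmethod
    def gzipped(cls) -> bool:
        return True


def read_rows(path: Path) -> list:
    """Read a CSV file, decompressing it if the name ends with '.gz'."""
    opener = gzip.open if path.name.endswith(".gz") else open
    with opener(path, "rt", newline="") as file:
        return list(csv.reader(file))


rows = [["Position", "Mutated"], ["1", "3"]]
loaded = {}
with tempfile.TemporaryDirectory() as tmp:
    for table_type in (TableSketch, GzippedTableSketch):
        path = Path(tmp) / f"demo{table_type.ext()}"
        opener = gzip.open if table_type.gzipped() else open
        with opener(path, "wt", newline="") as file:
            csv.writer(file).writerows(rows)
        loaded[path.name] = read_rows(path)
print(sorted(loaded))
```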

property header

Header for the table’s data.

classmethod header_depth()

abstract classmethod header_type() type[Header]

Type of the header for the table.

classmethod index_cols() list[int]

Column(s) of the file to use as the index.

abstract classmethod index_depth() int

Number of columns in the index.

abstract classmethod kind() str

Kind of table.

property path

Path of the table’s CSV file (possibly gzipped).

property path_fields: dict[str, Any]

Table’s path fields.

abstract classmethod path_segs() tuple[Segment, ...]

Table’s path segments.

abstract property ref: str

Name of the table’s reference.

property refseq: DNA

Reference sequence.

abstract property sample: str

Name of the table’s sample.

abstract property sect: str

Name of the table’s section.

abstract property top: Path

Path of the table’s output directory.

seismicrna.table.base.get_rel_name(rel_code: str)

Get the name of a relationship from its code.

class seismicrna.table.calc.AvgTabulator(dataset: MutsDataset | ClusterMutsDataset | UnbiasDataset)

Bases: Tabulator, ABC

property max_order: int

Number of clusters, or 0 if not clustered.

class seismicrna.table.calc.ClustTabulator(dataset: MutsDataset | ClusterMutsDataset | UnbiasDataset)

Bases: PartialTabulator, ABC

property clust_header

Header of the per-cluster data.

property max_order

Number of clusters, or 0 if not clustered.

property table_per_clust

Number of reads in each cluster.

class seismicrna.table.calc.FullTabulator(dataset: MutsDataset | ClusterMutsDataset | UnbiasDataset)

Bases: Tabulator, ABC

classmethod get_null_value()

The null value for a count: either 0 or NaN.

class seismicrna.table.calc.MaskTabulator(dataset: MutsDataset | ClusterMutsDataset | UnbiasDataset)

Bases: PartialTabulator, AvgTabulator

class seismicrna.table.calc.PartialTabulator(dataset: MutsDataset | ClusterMutsDataset | UnbiasDataset)

Bases: Tabulator, ABC

classmethod get_null_value()

The null value for a count: either 0 or NaN.

property table_per_pos

class seismicrna.table.calc.RelateTabulator(dataset: MutsDataset | ClusterMutsDataset | UnbiasDataset)

Bases: FullTabulator, AvgTabulator

class seismicrna.table.calc.Tabulator(dataset: MutsDataset | ClusterMutsDataset | UnbiasDataset)

Bases: ABC

Base class for tabulating data for multiple tables from a report loader.

abstract classmethod get_null_value() int | float

The null value for a count: either 0 or NaN.

abstract property max_order: int

Number of clusters, or 0 if not clustered.

property p_ends_given_noclose

Probability of each end coordinate.

property pos_header

Header of the per-position data.

property read_header

Header of the per-read data.

property ref

property refseq

property sample

property section

property table_per_pos

property table_per_read

property top

seismicrna.table.calc.adjust_counts(table_per_pos: DataFrame, p_ends_given_noclose: ndarray, n_reads_clust: Series | int, section: Section, min_mut_gap: int, quick_unbias: bool, quick_unbias_thresh: float)

Adjust the given table of masked/clustered counts per position to correct for observer bias.

seismicrna.table.calc.all_patterns(mask: RelPattern | None = None)

Every RelPattern, keyed by its name.

seismicrna.table.calc.tabulate_loader(dataset: MutsDataset | ClusterMutsDataset | UnbiasDataset)

Return a new Tabulator, choosing the subclass based on the type of the argument dataset.

class seismicrna.table.load.ClustFreqTableLoader(table_file: Path)

Bases: TableLoader, ClustFreqTable

Load cluster data indexed by cluster.

property data: Series

Table’s data.

class seismicrna.table.load.ClustPosTableLoader(table_file: Path)

Bases: PosTableLoader, ClustPosTable

Load cluster data indexed by position.

class seismicrna.table.load.ClustReadTableLoader(table_file: Path)

Bases: ReadTableLoader, ClustReadTable

Load cluster data indexed by read.

class seismicrna.table.load.MaskPosTableLoader(table_file: Path)

Bases: PosTableLoader, MaskPosTable

Load masked bit vector data indexed by position.

class seismicrna.table.load.MaskReadTableLoader(table_file: Path)

Bases: ReadTableLoader, MaskReadTable

Load masked bit vector data indexed by read.

class seismicrna.table.load.PosTableLoader(table_file: Path)

Bases: RelTypeTableLoader, PosTable, ABC

Load data indexed by position.

class seismicrna.table.load.ReadTableLoader(table_file: Path)

Bases: RelTypeTableLoader, ReadTable, ABC

Load data indexed by read.

class seismicrna.table.load.RelPosTableLoader(table_file: Path)

Bases: PosTableLoader, RelPosTable

Load relation data indexed by position.

class seismicrna.table.load.RelReadTableLoader(table_file: Path)

Bases: ReadTableLoader, RelReadTable

Load relation data indexed by read.

class seismicrna.table.load.RelTypeTableLoader(table_file: Path)

Bases: TableLoader, RelTypeTable, ABC

Load a table of relationship types.

property data: DataFrame

Table’s data.

class seismicrna.table.load.TableLoader(table_file: Path)

Bases: Table, ABC

Load a table from a file.

property ref: str

Name of the table’s reference.

property sample: str

Name of the table’s sample.

property sect: str

Name of the table’s section.

property top: Path

Path of the table’s output directory.

seismicrna.table.load.find_all_tables(files: Iterable[str | Path])

seismicrna.table.load.find_pos_tables(files: Iterable[str | Path])

seismicrna.table.load.find_read_tables(files: Iterable[str | Path])

seismicrna.table.load.find_tables(segments: Iterable[Segment], files: Iterable[str | Path])

Yield every table file with the given type of segment from among the given paths.
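Searching a mix of files and directories for matching table files can be sketched as below. This is a simplified stand-in, not the library's path machinery: it matches on file-name suffixes instead of Segment objects, and the function name and the file names in the example are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Iterator


def find_files_sketch(suffixes: Iterable[str], files: Iterable) -> Iterator[Path]:
    """Yield each file, given directly or found under a given directory,
    whose name ends with one of the suffixes; skip duplicates."""
    suffixes = tuple(suffixes)
    seen = set()
    for item in map(Path, files):
        # Directories are searched recursively; files are checked directly.
        candidates = sorted(item.rglob("*")) if item.is_dir() else [item]
        for path in candidates:
            if (path.is_file()
                    and path.name.endswith(suffixes)
                    and path not in seen):
                seen.add(path)
                yield path


# Hypothetical output directory with illustrative file names.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "sample" / "ref"
    out_dir.mkdir(parents=True)
    for name in ("demo-per-pos.csv", "demo-per-read.csv.gz", "notes.txt"):
        (out_dir / name).touch()
    found_names = [path.name
                   for path in find_files_sketch(["-per-pos.csv"], [tmp])]
print(found_names)
```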

seismicrna.table.load.load_all_tables(files: Iterable[str | Path])

Yield every table among the given paths.

seismicrna.table.load.load_any_table(table_file: Path)

seismicrna.table.load.load_pos_table(table_file: Path) PosTableLoader

seismicrna.table.load.load_pos_tables(files: Iterable[str | Path])

Yield every positional table among the given paths.

seismicrna.table.load.load_read_table(table_file: Path) ReadTableLoader

seismicrna.table.load.load_read_tables(files: Iterable[str | Path])

Yield every per-read table among the given paths.

seismicrna.table.load.load_table(types: Iterable[type[PosTableLoader | ReadTableLoader | ClustFreqTableLoader]], table_file: Path)

Load a Table of one of several types from a file.
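One common way to load "one of several types" is to try each candidate type in order and return the first that accepts the file. The sketch below shows that pattern. Whether load_table works this way is an assumption, and every class and function name here is hypothetical; the stand-in loaders accept or reject a file by its name only.

```python
from pathlib import Path


class LoaderSketch:
    """Hypothetical loader that accepts only file names with its suffix."""

    suffix = ""

    def __init__(self, table_file: Path):
        if not table_file.name.endswith(self.suffix):
            raise ValueError(f"{type(self).__name__} cannot load {table_file}")
        self.table_file = table_file


class PosLoaderSketch(LoaderSketch):
    suffix = "-per-pos.csv"


class ReadLoaderSketch(LoaderSketch):
    suffix = "-per-read.csv.gz"


def load_table_sketch(types, table_file: Path):
    """Return an instance of the first type that accepts the file."""
    for table_type in types:
        try:
            return table_type(table_file)
        except ValueError:
            # This type rejected the file; try the next one.
            continue
    raise ValueError(f"No loader among the given types accepts {table_file}")


loader_types = [PosLoaderSketch, ReadLoaderSketch]
table = load_table_sketch(loader_types, Path("demo-per-read.csv.gz"))
print(type(table).__name__)
```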

seismicrna.table.load.load_tables(finder: Callable[[Iterable[str | Path]], Iterable[Path]], loader: Callable[[Path], TableLoader], files: Iterable[str | Path])

Yield every table that the finder locates among the given paths, loading each one with the loader.

seismicrna.table.main.run(input_path: tuple[str, ...], *, table_pos: bool = True, table_read: bool = True, table_clust: bool = True, force: bool = False, max_procs: int = 16, parallel: bool = True)

Count mutations for each read and position; output tables.

Parameters:
  • table_pos (bool) – Make a table counting relationships per position [keyword-only, default: True]

  • table_read (bool) – Make a table counting relationships per read [keyword-only, default: True]

  • table_clust (bool) – Make a table counting reads per cluster (only for clustered data) [keyword-only, default: True]

  • force (bool) – Force all tasks to run, overwriting any existing output files [keyword-only, default: False]

  • max_procs (int) – Run up to this many processes simultaneously [keyword-only, default: 16]

  • parallel (bool) – Run tasks in parallel or in series [keyword-only, default: True]
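The interplay of max_procs and parallel can be illustrated with a small dispatcher. This is a sketch only: `dispatch_sketch` is a hypothetical helper, and it uses a thread pool as a lightweight stand-in for the process pool that the parameter descriptions imply.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch_sketch(func, tasks, *, max_procs=16, parallel=True):
    """Apply func to every task, concurrently if allowed and worthwhile."""
    tasks = list(tasks)
    if parallel and max_procs > 1 and len(tasks) > 1:
        # Never start more workers than there are tasks.
        n_workers = min(max_procs, len(tasks))
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            # map() returns the results in the order of the tasks.
            return list(pool.map(func, tasks))
    # Fall back to running the tasks one after another.
    return [func(task) for task in tasks]


def square(x):
    return x * x


in_parallel = dispatch_sketch(square, [1, 2, 3], max_procs=2)
in_series = dispatch_sketch(square, [1, 2, 3], parallel=False)
print(in_parallel == in_series)
```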

class seismicrna.table.write.ClustFreqTableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: TableWriter, ClustFreqTable

property data

Table’s data.

class seismicrna.table.write.ClustPosTableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: PosTableWriter, ClustPosTable

class seismicrna.table.write.ClustReadTableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: ReadTableWriter, ClustReadTable

class seismicrna.table.write.MaskPosTableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: PosTableWriter, MaskPosTable

class seismicrna.table.write.MaskReadTableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: ReadTableWriter, MaskReadTable

class seismicrna.table.write.PosTableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: TableWriter, PosTable, ABC

property data

Table’s data.

class seismicrna.table.write.ReadTableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: TableWriter, ReadTable, ABC

property data

Table’s data.

class seismicrna.table.write.RelPosTableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: PosTableWriter, RelPosTable

class seismicrna.table.write.RelReadTableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: ReadTableWriter, RelReadTable

class seismicrna.table.write.TableWriter(tabulator: AvgTabulator | ClustTabulator)

Bases: Table, ABC

Write a table to a file.

property columns

property ref

Name of the table’s reference.

property sample

Name of the table’s sample.

property sect

Name of the table’s section.

property top

Path of the table’s output directory.

write(force: bool)

Write the table’s rounded data to the table’s CSV file.

seismicrna.table.write.get_tabulator_writer_types(tabulator: Tabulator)

seismicrna.table.write.get_tabulator_writers(tabulator: AvgTabulator | ClustTabulator, *, table_pos: bool = True, table_read: bool = True, table_clust: bool = True)

seismicrna.table.write.write(report_file: Path, *, force: bool, **kwargs)

Helper function to write a table from a report file.