Published June 3, 2026 | Version v0.4.0
Software Open

harens/AnomaLog: v0.4.0

Authors/Creators

  • 1. @ImperialCollegeLondon

Description

0.4.0 (2026-06-03)

Features

  • add next-event prediction diagnostics to DeepCASE and DeepLog models (1b48275)
  • add OpenStackDeepLogParser and SpellTemplateParser for OpenStack log processing (2892c72)
  • add optional per-event anomaly labels to TemplateSequence and validate alignment (3820888)
  • add Slurm job wrappers and configuration management (5ca223b)
  • ait-ads: suppress sequence-level metrics in reports and add related tests (a0f6808)
  • anomalog: add preprocessed DeepLog session source (a2aeffc)
  • baselines: run with all ML models (8638cf2)
  • bgl: split CCS 2017 and How Far Are We protocols (51a2e4f)
  • build_templated_dataset: coarse lock over dataset name + cache path (5ab01b5)
  • datasets: add AIT-ADS scenario support (1a7ec8c)
  • deepcase: add cluster-labelling ablations to registry (d451235)
  • deepcase: add event-level prediction metrics, not just sequence (436e154)
  • deepcase: enhance training feedback by reporting progress per epoch (7693743)
  • deepcase: update documentation and tests for zero-query iterations in scoring (c2d0629)
  • deeplog: add HDFS short-session padding variant (e63086e)
  • deeplog: add OpenStack parameter CI approximation (ce38437)
  • deeplog: add OpenStack regression coverage (b2d8470)
  • deeplog: add progress reporting for key model and parameter schema preparation (a631783)
  • deeplog: add short-session padding fidelity mode (e5ada94)
  • deeplog: align BGL 2022 path with Drain3 and CI highlights (834e308)
  • deeplog: carry parameter history across openstack entity windows (d7d4a13)
  • deeplog: disable parameter model by default, correct g default value (fe77e1c)
  • deeplog: emit parameter ci report artifacts (c243b1f)
  • deeplog: left pad by default (c41195d)
  • deeplog: make top-g replay configurable (2b55195)
  • deeplog: merge continuous context series in parameter dataset construction (5967617)
  • deeplog: separate parameter summary from debug trace (3556dca)
  • deeplog: support parameter-only scoring and registry updates (ea44148)
  • deeplog: surface OpenStack Figure 9 metadata in parameter reports (945ae45)
  • dependencies: add spellpy as a git dependency and update related markers (b0d079c)
  • detectors: implement BatchExperimentDetector for bulk scoring and enhance evaluation logic (b1f0ff8)
  • docs: add reference documentation for experiments package (80e20ae)
  • enhance continuous context handling in DeepLog (e464816)
  • enhance materialization to handle stale Prefect cache paths and update related tests (9feaa97)
  • entity_chronological: add AIT-ADS entity-chronological dataset and update related configurations (dcd771a)
  • experiment_logger: enhance logging with concrete run names (413f52a)
  • experiment_runner: add --write-predictions option to persist predictions.jsonl (ef06f46)
  • experiment_runner: enhance failure handling to log bundle failures without stopping execution (a9a97e4)
  • experiments: add DeepLog and DeepCASE detectors (7b3745e)
  • experiments: add event-level baseline reporting (8cf2c13)
  • experiments: add HDFS Table-IV compatibility suite (488bf5e)
  • experiments: add Prefect telemetry configuration to Slurm wrapper (cd70adb)
  • experiments: allow for multi-process model sweeps (46e0a05)
  • experiments: commit registry-backed experiment overhaul (af97052)
  • experiments: enhance error handling in submit_experiments for sbatch failures (c0d6e9b)
  • experiments: lazy load models for optional extra dependencies (a58e50c)
  • experiments: make data/cache root configurable for slurm jobs (caf620e)
  • experiments: publish scoped metric blocks in run outputs (1fcae3d)
  • experiments: separate run_groups to execute (5759e81)
  • experiments: share progress totals and logging (f0019a6)
  • experiments: streamline run metric metadata (c8ce126)
  • experiments: submit Slurm arrays in one job (18bd438)
  • group missing suite reruns by experiment (0133d2e)
  • markov: introduce markov baseline (d53a315)
  • metadata: record DeepCASE version in environment metadata (0250d97)
  • models: add SingleFitMixin for single fit state management in detectors (9f4247b)
  • models: support bounded train progress hints (2b24a2a)
  • modify train/test fractions in tandem (28a21f0)
  • naive_bayes: add type annotations and improve docstrings (b611c63)
  • openstack: make the Figure 9 anomaly slice explicit (2ca83e0)
  • parquet: implement chronological entity grouping and persist entity chronology index (28d7f48)
  • parsers: add context manager for spellpy logging and update tests (16a01f5)
  • parsers: implement lightweight mode for SpellTemplateParser and add corresponding test (47869ca)
  • parsing: add thunderbird parser and timing logs (f750391)
  • registry: add HDFS DeepLog Drain3 ablations (0be1600)
  • run_suite: add check for missing concrete runs and update tests (6b5e28e)
  • run_suite: enhance output handling for missing runs and update rerun command format (d824176)
  • sequences: infer split labels from preprocessed entity prefixes (c276b7c)
  • set paper-faithful deepcase iterations, expose next event prediction metrics (0131ccf)
  • slurm: add initial sbatch experiment scripts (a2aa3ba)
  • slurm: enhance job scripts to dynamically set repository root and cache directory (674f476)
  • slurm: simplify REPO_ROOT assignment in Slurm scripts and tests (26b0f0d)
  • slurm: update wrap script to export EXPERIMENT_NAME for nested commands (d82fa79)
  • slurm: update wrap script to use loop for experiment indexing (f038698)
  • slurm: update wrap script to use set -- for experiment indexing (9b84a8d)
  • sources: add preprocessed and thunderbird tests (c80f5be)
  • sources: add tar.gz support (4964c07)
  • template_frequency: clarify documentation on threshold calibration and update label checks (ff609ea)
  • tests: add fixtures to mirror Prefect API URL in subprocess environments (71c1c23)
  • tests: add test for entity counting using grouped rows in ParquetStructuredSink (236ea9e)
  • tests: add test for parquet dataset schema exposing timestamp and partition fields (bbbc128)
  • tests: add unit tests for DeepLog preprocessed dataset source helpers (60f9be6)
  • tests: update wrap script assertions to reflect changes in error handling (1f55820)
  • thunderbird: add entity-grouped DeepCASE extension and update registry (d3c2561)
  • thunderbird: consolidate registry entries (3336713)
  • thunderbird: enhance parser to handle message tails and update entity ID extraction (a684bdf)
  • thunderbird: normalise benchmark slice input (0d96003)
  • torch_runtime: add shared helpers for managing torch device and seed (b7c0485)
  • update experiment configurations to use chronological datasets and remove obsolete files (0c34300)
  • update metrics structure to use canonical metric blocks and remove legacy fields (7256c05)

Bug Fixes

  • accept all-normal inline-labelled structured datasets (dde55eb)
  • ait-ads: split by chronology before entity grouping (7268f8a)
  • align BGL 10% DeepLog with normal-entry split (bc349b3)
  • align Thunderbird fixed-window contract (75bffbe)
  • bump deepcase commit version for batching fix (1043917)
  • cache: enhance result storage handling and cache policy configuration (5025ef5)
  • cache: stop cached materialisation results pointing to stale per-run paths (fbac953)
  • deepcase: align HDFS compat split with entity grouping (9030c01)
  • deepcase: align query defaults with paper (3f5a000)
  • deepcase: align workload reduction with paper semantics (9b6b520)
  • deepcase: cache training chunks and separate query budget (913bd90)
  • deepcase: chunk adapter batches to bound memory (1a07946)
  • deepcase: don't treat abstained scores as anomalous (0b99151)
  • deepcase: exclude evaluation_event_mask events from score (c736cd8)
  • deepcase: optimise template access and label resolution in training batch (9155033)
  • deepcase: preserve optimiser state in chunked training (0e27c5f)
  • deepcase: reduce fit-time memory pressure (08352da)
  • deepcase: use finer grained anomaly labels where available (03741f0)
  • deeplog: evaluator no longer repeatedly charges warm-up penalty (8d9b32f)
  • deeplog: harden OpenStack Figure 9 parameter CI (7aa7bae)
  • deeplog: keep compatibility variant on deeplog_default (95df577)
  • deeplog: microbatch key-model training (6a391c2)
  • deeplog: compute next-event predictions over all logs, not just the latest one (c20858e)
  • deeplog: remove g=11 from default replay cutoffs (742bb69)
  • deeplog: remove short-session padding fallback (bf94416)
  • deeplog: stream Spell input during training (7f740a0)
  • deeplog: use increased epochs/batch size for hdfs pre-processed (d137477)
  • dependencies: add diff-cover to dev dependency group (66f41b4)
  • don't consume log sequences between different models (7a53d32)
  • drop incompatible baselines from paper configs (f020967)
  • drop obsolete spellpy persist_state argument (e536a43)
  • evaluator no longer treats every DeepCASE outcome as abstained (9e8a1ef)
  • experiments: apply deepcase model-set overrides (64f0f12)
  • experiments: correct supervised entity split fractions (0e27773)
  • experiments: invalidate dataset cache on force reruns (826abfb)
  • experiments: raise Prefect startup timeout for dataset builds (e38d2e5)
  • hdfs_v1: add support for non-integer csv anomaly labels (7a2f4df)
  • keep test set the same across different train splits (d27fc42)
  • make OpenStack DeepLog preprocessing paper-faithful (6c37d8d)
  • markov: accept mixed BGL training chunks (823c94f)
  • markov: remove quadratic calibration scan (7638d2b)
  • openstack-deeplog: normalise volatile session tokens (958e522)
  • openstack: group by instance id (c9b75c7)
  • paper-faithful split contract changes (da2d6c5)
  • parquet-sink: rebuild missing structured cache directories (19557ca)
  • parquet: sort rows within each entity by line order for accurate grouping (24e4253)
  • parquet: stop stale parquet cache making entity scan look empty (4c85634)
  • parquet: validate cached parquet fragments before reuse (409d915)
  • parsers: fix openstack dropping traceback/error continuation rows (6407caf)
  • parsers: simplify SpellTemplateParser after spellpy fix (f383178)
  • record file-boundary split provenance in manifests (56a2c47)
  • recover stale Prefect cache paths (b01eae2)
  • repair chronological stream and audit typing (4d7f94b)
  • resolve paths in ExperimentBundle for consistent path handling (029fe01)
  • ruff: update external linting rules to include only "DOC" (f68660f)
  • run_bundle: improve error handling for existing result paths (0a283de)
  • spell: collapse back to direct spellpy parsing (3b55bbf)
  • template: clear stale spell cache before retraining (f142c57)
  • tests: keep deeplog/case runners independent of dataset.toml (f003c94)
  • thunderbird: invalidate template cache by raw slice (33afd82)
  • thunderbird: skip empty Thunderbird message rows (a225a79)

Performance Improvements

  • cache registry bundle decoding (fce28e2)
  • deeplog: speed up Thunderbird key training (3272761)

Documentation

  • anomalog: apply strict pydoclint across modules (8429e76)
  • deeplog: audit templates (82eede4)
  • deeplog: document duplicate-session findings in HDFS preprocessed test files (74b389a)
  • experiments: document caching and reruns (5240b0d)
  • experiments: document DeepLog and DeepCASE support (8b5bda5)
  • experiments: pydoclint docstrings (8c082df)
  • experiments: refresh Slurm registry examples (3b73ca0)
  • experiments: satisfy pydoclint for Slurm backend (d594306)
  • tests: apply strict pydoclint (0978c04)
  • thunderbird: note cache-key fix and template contract (541ecc7)
  • thunderbird: record parser skips and DeepLog comparison (c75c635)
  • thunderbird: refresh reproduction notes and config (a3f3172)
  • tighten docstrings for pydoclint (85db1be)

Files

harens/AnomaLog-v0.4.0.zip

Size: 862.3 kB
md5: f95a6cb74f854d5eefc04d33ae467b8e

Additional details

Related works

Is supplement to
Software: https://github.com/harens/AnomaLog/tree/v0.4.0 (URL)

Software

Repository URL
https://github.com/harens/AnomaLog
Programming language
Python