Published May 18, 2026 | Version v1.3.0
Software
Open
dsgoficial/pytorch_segmentation_models_trainer: v1.3.0
Authors/Creators
- 1. UFMG
- 2. @anthropics
- 3. @ultralytics @viddexa
Description
[1.3.0] - 2026-05-18
Dataset Distillation
- Added VAE-backed DDOQ image distillation (`tools/dataset_distillation/vae_ddoq_distillation.py`) that loads a trained VAE checkpoint, extracts embeddings for all configured input images, reuses `KMeansClusteringTool` for Mini-Batch K-Means, decodes one cluster center per distilled image, and writes both `embeddings.parquet` (source image path, embedding, cluster id) and `distilled_images.parquet` (distilled image path, cluster id, center embedding, DDOQ weight).
- Added `pytorch-smt-tools ddoq-vae` and Hydra mode `ddoq-vae-distill` so the VAE DDOQ pipeline can run either as a tool command or from YAML configuration. Added `conf/examples/ddoq_vae_distillation.yaml`.
- Extended `KMeansClusteringTool` with `predict()` and exact label-based Voronoi weight calculation, so DDOQ weights can be tied to the full input-image cluster assignments instead of only Mini-Batch K-Means internal update counts.
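As a rough illustration of label-based Voronoi weights, the sketch below counts how many inputs fall in each cluster and normalises by the total. The function name `voronoi_weights` is hypothetical, and normalising to fractions is an assumption: the entry above does not say whether the tool uses raw counts, fractions, or a further transform.

```python
from collections import Counter
from typing import List, Sequence


def voronoi_weights(labels: Sequence[int], n_clusters: int) -> List[float]:
    """Fraction of input samples assigned to each cluster (its Voronoi cell)."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    # Clusters that received no samples get weight 0.0.
    return [counts.get(k, 0) / total for k in range(n_clusters)]


# Cluster ids as they might come back from a predict() call on all inputs.
labels = [0, 0, 1, 2, 2, 2, 0, 1]
weights = voronoi_weights(labels, n_clusters=4)
print(weights)
```

Because the weights come from the assignments of every input image, they do not depend on how many mini-batch updates each center happened to receive.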
CLI Tools
- New subcommand `pytorch-smt-tools export-tb-images`: exports images from TensorBoard event files to PNG without requiring the `tensorboard` package. It uses the `tensorboardX` protos (already a project dependency) to parse TFRecords. Supports filtering by tag (`--tags`) and by step/epoch (`--steps`, which accepts integers, ranges, and combinations such as `"0-10,20"`). `--list-tags` lists all tags available in the log directory. Intended for the workflow of training with `delete_after_log=True` (no PNG accumulation) and then selectively exporting only the images of interest after inspecting them in TensorBoard.
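The `--steps` syntax can be pictured with a small parser. This is only a sketch: `parse_steps` is a hypothetical name, and treating ranges as inclusive on both ends is an assumption not confirmed by the entry above.

```python
from typing import Set


def parse_steps(spec: str) -> Set[int]:
    """Expand a spec such as "0-10,20" into a set of step numbers."""
    steps: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            # Assumption: both ends of the range are included.
            steps.update(range(start, end + 1))
        else:
            steps.add(int(part))
    return steps


print(sorted(parse_steps("0-10,20")))
```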
Training
- `FinalMetricsCallback` is now injected automatically into every training run via `train.py`, with no need to declare it manually in the YAML. To suppress it, add `add_final_metrics_callback: false` to the config. If the user has already declared `FinalMetricsCallback` in the `callbacks` field, no duplicate is created. Added the field `add_final_metrics_callback: bool = True` to the `TrainConfig` dataclass.
Image Callbacks
- Added `delete_after_log: bool = False` parameter to `ImageSegmentationResultCallback` (and all subclasses via inheritance) and `EnhancedImageSegmentationResultCallback`. When `True`, the PNG file saved in `image_logs/` is deleted immediately after it is sent to TensorBoard, reducing disk usage in long experiments. The default `False` keeps the previous behaviour.
- Added `use_basename_as_title: bool = False` option to all image callbacks (`ImageSegmentationResultCallback`, `EnhancedImageSegmentationResultCallback`, `FrameFieldResultCallback`, `FrameFieldOverlayedResultCallback`, `ObjectDetectionResultCallback`, `PolygonRNNResultCallback`, `ModPolyMapperResultCallback`, `AutoencoderResultCallback`). When `True`, plot titles and TensorBoard tags use only the file stem (e.g. `tile_001` instead of `/data/images/tile_001.tif`), making TensorBoard runs easier to read and compare when images come from deeply nested directories. Access via the `_get_title(path)` helper on each class.
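The effect of `use_basename_as_title` can be sketched with the standard library. The free function `get_title` below is a hypothetical stand-in for the `_get_title(path)` helper, whose real implementation is not shown in this changelog.

```python
from pathlib import PurePosixPath


def get_title(path: str, use_basename_as_title: bool = False) -> str:
    """Return the file stem when the option is on, else the full path."""
    return PurePosixPath(path).stem if use_basename_as_title else path


print(get_title("/data/images/tile_001.tif", use_basename_as_title=True))
```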
Autoencoder Clustering Losses
- Added `DECSoftAssignmentLoss` (`custom_losses/autoencoder_clustering_losses.py`): soft-assignment KL loss (DEC, Xie et al. ICML 2016) that pushes the encoder toward confident, well-separated cluster assignments via a Student-t kernel and a sharpened target distribution P. Includes `initialize_centers` for K-Means warm-start.
- Added `CenterLoss`: intra-cluster compactness loss (Wen et al. ECCV 2016) that minimises the mean squared distance from each embedding to its nearest cluster center, with a configurable `lambda_center` weight.
- Added `ClusteringAwareVAELoss`: composite VAE loss combining reconstruction, KL, DEC, and center losses in a single module (`L = L_recon + β·L_KL + γ·L_DEC + δ·L_center`). Owns a single shared `cluster_centers` parameter to avoid duplicate optimizer updates. Provides `initialize_centers_from_embeddings` for Phase-2 DCEC-style fine-tuning: pre-train with MSE-only (Phase 1), then fine-tune with this loss to maintain PSNR while improving latent cluster structure. Supports both flat `(B, D)` and spatial `(B, C, H, W)` latents via `latent_reduction`.
- Added `conf/examples/autoencoder_clustering_phase2.yaml` with recommended hyperparameters for Phase-2 training.
- Added user documentation at `website/docs/user-guide/autoencoder_clustering_losses.md` with training protocol and metric monitoring guide.
- Added `ClusterCentersWarmStartCallback` (`custom_callbacks/cluster_centers_warm_start_callback.py`): runs once at `on_train_start`, iterates the training dataloader, collects latent embeddings (`mu` or `z`), fits K-Means, and initialises `ClusteringAwareVAELoss.cluster_centers` before the first training epoch. No-op if `pl_module.loss_function` is not a `ClusteringAwareVAELoss`. Supports flat and spatial latents. Added corresponding `ClusterCentersWarmStartCallbackConfig` dataclass registered in Hydra ConfigStore under `callbacks/cluster_centers_warm_start`.
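For readers unfamiliar with DEC, the sketch below works through the Student-t soft assignment, the sharpened target distribution P, and the KL(P || Q) term from Xie et al. in plain Python. It is not the project's `DECSoftAssignmentLoss`: the function names, the `alpha=1.0` default, and averaging the KL over samples are assumptions made for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]


def soft_assign(z: Matrix, centers: Matrix, alpha: float = 1.0) -> Matrix:
    """Student-t soft assignment q_ij of each embedding to each center."""
    q = []
    for zi in z:
        kernel = [
            (1.0 + sum((a - b) ** 2 for a, b in zip(zi, mu)) / alpha)
            ** (-(alpha + 1.0) / 2.0)
            for mu in centers
        ]
        total = sum(kernel)
        q.append([k / total for k in kernel])
    return q


def target_distribution(q: Matrix) -> Matrix:
    """Sharpened target p_ij = (q_ij^2 / f_j) / sum_j'(q_ij'^2 / f_j')."""
    freq = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = []
    for row in q:
        weighted = [qij ** 2 / fj for qij, fj in zip(row, freq)]
        total = sum(weighted)
        p.append([w / total for w in weighted])
    return p


def dec_loss(p: Matrix, q: Matrix) -> float:
    """KL(P || Q), summed over clusters and averaged over samples (assumed)."""
    kl = sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )
    return kl / len(p)


embeddings = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 4.0]]
centers = [[0.0, 0.0], [4.0, 4.0]]
q = soft_assign(embeddings, centers)
p = target_distribution(q)
loss = dec_loss(p, q)
print(q[0], p[0], loss)
```

In this toy example each row of P is more peaked than the matching row of Q, which is what pulls the encoder toward confident assignments.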
Training Callbacks
- Added `PatienceWarmupCallback` (`custom_callbacks/training_callbacks.py`): freezes the encoder at `on_fit_start` and unfreezes it when a monitored metric stops improving for `patience` consecutive validation epochs. Works like `EarlyStopping` but for encoder unfreezing, which removes the need to guess a fixed `warmup_epochs` value. Supports `mode="min"` (loss) or `mode="max"` (e.g. silhouette), a `min_delta` threshold, and a `min_epochs` guard. It calls `pl_module.set_encoder_trainable`, so it is compatible with `Model` and `VariationalAutoencoderModel`. Added `PatienceWarmupCallbackConfig` dataclass registered in Hydra ConfigStore under `callbacks/patience_warmup`.
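The patience rule can be sketched independently of Lightning. The class `PatienceTracker` is hypothetical, and the exact way `min_delta` and `min_epochs` interact in the real callback is an assumption here.

```python
class PatienceTracker:
    """Signals once when a metric has stalled for `patience` epochs."""

    def __init__(self, patience, mode="min", min_delta=0.0, min_epochs=0):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.patience = patience
        self.mode = mode
        self.min_delta = min_delta
        self.min_epochs = min_epochs
        self.best = None
        self.wait = 0
        self.epoch = 0
        self.unfrozen = False

    def _improved(self, value):
        if self.mode == "min":
            return value < self.best - self.min_delta
        return value > self.best + self.min_delta

    def update(self, value):
        """Record one validation epoch; return True when unfreezing is due."""
        self.epoch += 1
        if self.unfrozen:
            return False
        if self.best is None or self._improved(value):
            self.best = value
            self.wait = 0
        else:
            self.wait += 1
        # Assumption: min_epochs only delays the trigger, it does not reset it.
        if self.wait >= self.patience and self.epoch >= self.min_epochs:
            self.unfrozen = True
            return True
        return False


tracker = PatienceTracker(patience=2, mode="min")
val_losses = [1.0, 0.8, 0.81, 0.79, 0.80, 0.80]
fired = [tracker.update(v) for v in val_losses]
print(fired)
```

In a real callback, the epoch where `update` returns `True` is where the encoder would be made trainable again.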
CI / Coverage
- Fixed `.codecov.yml` by moving `after_n_builds` under `codecov.notify`, matching Codecov's current schema so repository YAML validation succeeds.
- Added focused coverage tests so `model_loader/domain_adaptation_model.py`, `model_loader/variational_autoencoder_model.py`, `custom_models/moe_layers.py`, `custom_models/transformer_adapters.py`, and `custom_models/upernet_moe.py` report 100% line coverage in the fast unit suite.
Autoencoder Latent Metrics
- Added `AutoencoderLatentClusteringCallback` for validation/test epoch diagnostics of autoencoder encoder spaces. It reuses the framework's PyTorch `MiniBatchKMeans` backend and TorchMetrics clustering functions so embeddings, cluster labels, and scores stay on GPU when training uses GPU. `GenericAutoencoder` now exposes `encode(x)` and reuses it in `forward()`, allowing deterministic autoencoders to report latent metrics without duplicating encoder logic.
- Autoencoder latent clustering is configured through `callbacks:` instead of a model-specific top-level `latent_metrics` block, keeping diagnostic orchestration out of `AutoencoderModel` and `VariationalAutoencoderModel`. The callback accumulates validation/test latents, logs Calinski-Harabasz and Davies-Bouldin by default, optionally logs Dunn and PyTorch Silhouette, and logs ARI/NMI when a configured batch label key is available. VAEs use posterior `mu` by default, with `vae_latent: z` available for sampled latents.
- Added `AutoencoderLatentClusteringCallbackConfig`, user documentation, `conf/examples/autoencoder_latent_clustering.yaml`, and focused tests covering device preservation, spatial latent reduction, optional supervised labels, existing `MiniBatchKMeans` reuse, CUDA execution, and callback integration.
Sliding-Window Full-Image Test Evaluation
- Added `SlidingWindowCore` (`tools/inference/sliding_window.py`): pure tensor-in / tensor-out sliding-window inference engine using `pytorch_toolbelt` `ImageSlicer` / `TileMerger`. Supports three tile blending modes (`mean`, `pyramid`, `gaussian`), TTA with all D4 dihedral augmentations (or any subset), MC Dropout with `n_mc_samples` stochastic passes, and all four combinations of those modes. Returns a `SlidingWindowOutput` dataclass with `prediction`, optional `tta_uncertainty` (per-pixel std across TTA passes), and optional `mc_uncertainty` (per-pixel entropy or mutual information from MC Dropout).
- Added `FullImageSegmentationDataset` (`dataset_loader/dataset.py`): subclass of `SegmentationDataset` that appends an `"image_path"` key to each sample so `Model.test_step` can georeference prediction output back to the source raster. Designed for use with `batch_size=1` when images have variable spatial dimensions.
- Extended `Model` (`model_loader/model.py`) with sliding-window test mode: set `use_sliding_window_test: true` in the model config to activate full-image inference during `trainer.test()`. New methods `_test_step_sliding_window`, `_build_sw_core`, and `_save_test_prediction` handle per-image inference, lazy `SlidingWindowCore` construction from `cfg.sliding_window_test`, and optional GeoTIFF export via `rasterio`. Metrics now use torchmetrics `update()` per step and `compute()` in `on_test_epoch_end`, so IoU/F1 is computed over complete images rather than averaged over tiles, which is also correct under DDP.
- Added `SlidingWindowTestConfig` dataclass (`config_definitions/sliding_window_test_config.py`) with Hydra `ConfigStore` registration under group `sliding_window_test/default`. Covers all `SlidingWindowCore` parameters plus `output_dir` for GeoTIFF export.
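The difference between accumulating statistics and averaging per-step scores can be seen with a toy binary IoU. `RunningIoU` below is a plain-Python stand-in for the update/compute pattern, not the torchmetrics API, and the flat 0/1 lists are made-up data.

```python
class RunningIoU:
    """Accumulates intersection and union, computes IoU once at the end."""

    def __init__(self):
        self.intersection = 0
        self.union = 0

    def update(self, pred, target):
        for p, t in zip(pred, target):
            self.intersection += int(bool(p) and bool(t))
            self.union += int(bool(p) or bool(t))

    def compute(self):
        return self.intersection / self.union if self.union else float("nan")


def iou(pred, target):
    metric = RunningIoU()
    metric.update(pred, target)
    return metric.compute()


# Two steps: one predicted perfectly, one with a single misplaced pixel.
steps = [
    ([1, 1, 1, 1], [1, 1, 1, 1]),
    ([1, 0, 0, 0], [0, 1, 0, 0]),
]

running = RunningIoU()
for pred, target in steps:
    running.update(pred, target)

accumulated = running.compute()
mean_of_steps = sum(iou(p, t) for p, t in steps) / len(steps)
print(accumulated, mean_of_steps)
```

The accumulated value weights every pixel equally, while the per-step mean lets a small, badly predicted step count as much as a large one.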
Training Callbacks
- Added `FinalMetricsCallback`: saves all epoch-averaged metrics and losses from the last training epoch to a JSON file via `trainer.callback_metrics`. Relative `output_path` values resolve against `trainer.log_dir` so the file lands alongside TensorBoard/CSV logs. A second hook (`on_test_end`) merges test-set metrics into the same file without overwriting train/val entries. Safe for multi-GPU training via `@rank_zero_only`. Exported from `custom_callbacks` and usable as a Hydra callback with `_target_: pytorch_segmentation_models_trainer.custom_callbacks.FinalMetricsCallback`.
Image Callbacks
- Added `shuffle_indices_seed` parameter (default `None`) to `ImageSegmentationResultCallback` (and all subclasses via inheritance, including `AutoencoderResultCallback`). When set, `AutoencoderResultCallback` samples a reproducible random subset of validation indices instead of always visualizing the first N rows. The seed is applied once per epoch via `numpy.random.RandomState`, so the shown subset is stable across epochs. When `None`, behaviour is unchanged (first N samples).
Decoder Upsampling Modes
- Added `upsample_mode` parameter (default `"bilinear"`, fully backward-compatible) to `ProgressiveDecoder`, `GenericDecoder`, `GenericAutoencoder`, and `GenericVariationalAutoencoder`. Three modes available:
  - `"bilinear"`: existing behaviour, `F.interpolate` or `nn.Upsample` + conv refinement.
  - `"transposed_conv"`: learnable `ConvTranspose2d`-based upsampling; supported by both `GenericDecoder` and `ProgressiveDecoder`. Each `ProgressiveDecoder` stage uses a 4×4 / stride-2 transposed conv; `GenericDecoder` uses a single stride-`scale_factor` kernel.
  - `"pixel_shuffle"`: sub-pixel convolution (`Conv2d` → `PixelShuffle(2)`) per stage; supported by `ProgressiveDecoder` only (single-shot channel expansion of `out_channels × scale_factor²` is impractical in `GenericDecoder`, which raises `ValueError` instead).
- Added module-level `_make_upsample_block(mode, ch_in, ch_out)` factory in `generic_autoencoder.py` that constructs a 2× upsampling `nn.Sequential` for any supported mode.
- Updated `GenericVariationalAutoencoderConfig` with `upsample_mode: str = "bilinear"`.
- Added tests for all three modes: shape contracts, gradient flow, dtype preservation, and error guards for invalid/unsupported modes.
ProgressiveDecoder
- Added `ProgressiveDecoder` to `custom_models/generic_autoencoder.py`: multi-stage convolutional decoder that doubles spatial resolution at each step with two conv+ReLU layers, replacing the single bilinear interpolation of `GenericDecoder`. Supports the same `output_activation` options (`None`, `"sigmoid"`, `"tanh"`) and validates that `scale_factor` is a power of 2.
- Wired `use_progressive_decoder` parameter (default `False`) into `GenericAutoencoder` and `GenericVariationalAutoencoder`. Set to `True` to swap in `ProgressiveDecoder`; also exposed `base_channels` (default 128) and `min_channels` (default 32) for channel schedule control.
- Updated `GenericVariationalAutoencoderConfig` dataclass with `use_progressive_decoder`, `base_channels`, and `min_channels` fields for Hydra/YAML configuration.
- Added 22 tests in `tests/test_generic_autoencoder.py` covering standalone shape contracts (scale factors 2–32), activation bounds, non-power-of-2 and invalid-activation guards, gradient flow, and end-to-end integration with both `GenericAutoencoder` and `GenericVariationalAutoencoder`.
KL Annealing
- Added `start_after` parameter (default `0`) to `KLAnnealingCallback`. When set to a positive integer, beta is held at `min_beta` for the first `start_after` steps (or epochs when `use_epochs=True`) before the annealing ramp begins. The ramp then spans exactly `annealing_steps` units starting from `start_after`. `KLAnnealingCallbackConfig` exposes the same field. Zero preserves previous behaviour unchanged.
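A minimal sketch of the delayed ramp, assuming a linear schedule. The function `linear_beta` is hypothetical and only mirrors the parameter names mentioned above; the callback's actual schedule code is not reproduced here.

```python
def linear_beta(step, annealing_steps, start_after=0, min_beta=0.0, max_beta=1.0):
    """Hold beta at min_beta until start_after, then ramp linearly."""
    if annealing_steps <= 0:
        raise ValueError("annealing_steps must be positive")
    if step < start_after:
        return min_beta
    # The ramp covers exactly annealing_steps units after start_after.
    progress = min((step - start_after) / annealing_steps, 1.0)
    return min_beta + (max_beta - min_beta) * progress


for step in (0, 50, 100, 150, 1000):
    print(step, linear_beta(step, annealing_steps=100, start_after=50))
```

With `start_after=0` the function reduces to an ordinary linear ramp, which matches the "zero preserves previous behaviour" note.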
VAE Loss
- Added `smooth_l1` reconstruction mode to `VariationalAutoencoderLoss` using `F.smooth_l1_loss` (Huber loss). New `smooth_l1_beta` parameter (default `0.1`) controls the L2-to-L1 transition threshold and is independent of the KL `beta` weight. The term is governed by `reconstruction_weight` like all other reconstruction modes.
- Added `ms_ssim` and `smooth_l1_ms_ssim` reconstruction modes to `VariationalAutoencoderLoss` using the existing Kornia `MS_SSIMLoss` configured as pure MS-SSIM by default (`ms_ssim_alpha=1.0`, `ms_ssim_compensation=1.0`). The combined mode exposes `smooth_l1_weight`, `ms_ssim_weight`, `ms_ssim_data_range`, and MS-SSIM kernel parameters, and logs `smooth_l1_loss`, `ms_ssim_loss`, `weighted_smooth_l1_loss`, and `weighted_ms_ssim_loss` alongside the existing VAE loss components.
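To make the role of `smooth_l1_beta` concrete, here is the smooth L1 formula with mean reduction written out in plain Python. It follows the piecewise definition used by `F.smooth_l1_loss`; the standalone function and the sample values are illustrative only.

```python
def smooth_l1(pred, target, beta=0.1):
    """Mean smooth L1: quadratic below beta, linear above it."""
    if beta <= 0:
        raise ValueError("this sketch requires beta > 0")
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta  # L2-like region
        else:
            total += diff - 0.5 * beta  # L1-like region
    return total / len(pred)


small_error = smooth_l1([0.0], [0.05], beta=0.1)
large_error = smooth_l1([0.0], [1.0], beta=0.1)
print(small_error, large_error)
```

Errors smaller than `beta` are penalised quadratically, larger ones linearly, so a small `beta` such as `0.1` behaves like L1 for most pixel errors on `[0, 1]` images.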
Image Callbacks
- Fixed `ImageSegmentationResultCallback.on_validation_epoch_end` ignoring `log_every_k_epochs`: it now skips visualization on non-matching epochs, consistent with `AutoencoderResultCallback` and `EnhancedImageSegmentationResultCallback`.
- Fixed `ImageSegmentationResultCallback.on_validation_epoch_end` mutating `self.n_samples` as a side effect; it now uses a local variable.
- Fixed `FrameFieldResultCallback`, `FrameFieldOverlayedResultCallback`, and `PolygonRNNResultCallback` ignoring `log_every_k_epochs`; all now respect the frequency setting.
- Fixed `AutoencoderResultCallback` calling `val_dataloader()` twice; the result is now cached in a local variable.
- All callbacks now warn (`logger.warning`) and cap gracefully when `n_samples > len(val_ds)`, instead of silently showing fewer images without explanation.
- Fixed duplicate filenames when multiple crops of the same image are logged: `save_plot_to_disk` now accepts `sample_idx` and embeds it as `_idx{i}` in the filename; TensorBoard tags also include the index.
- Added 5 new tests: `test_save_plot_to_disk_includes_sample_idx`, `test_autoencoder_callback_n_samples_capped_with_warning`, `test_autoencoder_callback_filename_includes_idx`, `test_log_every_k_epochs_skips_non_matching_epochs`, `test_autoencoder_log_every_k_epochs_respected`.
Reproducibility / Seed
- `set_training_seed` now delegates to `pytorch_lightning.seed_everything` instead of calling `random.seed` / `np.random.seed` / `torch.manual_seed` directly. This ensures `PL_GLOBAL_SEED` is set so DDP-spawned subprocesses inherit the seed automatically.
- `deterministic_cudnn=True` now also calls `torch.use_deterministic_algorithms(True)`, matching the behaviour of `Trainer(deterministic=True)` and covering all ops, not just CuDNN.
- `train()` passes `deterministic=True` to the `Trainer` when `cfg.deterministic_cudnn` is `True`, completing the Lightning integration.
- Added tests: `test_sets_pl_global_seed`, `test_pl_global_seed_uses_seed32`, `test_trainer_gets_deterministic_true_when_deterministic_cudnn`, `test_trainer_no_deterministic_when_cudnn_false`; updated `test_deterministic_cudnn_true` and `test_deterministic_cudnn_false_by_default` to also assert on `torch.are_deterministic_algorithms_enabled()`.
KL Annealing (VAE)
- Added `KLAnnealingCallback` (`custom_callbacks/kl_annealing_callback.py`): PyTorch Lightning callback that gradually increases the KL weight (beta) in `VariationalAutoencoderLoss` during training. Supports three schedules: `linear`, `cosine`, and `cyclical`. Can operate step-based (default) or epoch-based via `use_epochs`. Logs the current beta to TensorBoard as `scheduler/kl_beta` on every update.
- Added `KLAnnealingCallbackConfig` dataclass to `config_definitions/autoencoder_config.py` with Hydra ConfigStore registration under `group="callbacks"`, `name="kl_annealing"`.
- Extended `VariationalAutoencoderLoss.forward()` to return two additional keys, `weighted_reconstruction_loss` and `weighted_kl_loss`, representing the actual contribution of each term to the total ELBO loss. These are automatically logged to TensorBoard alongside the existing `reconstruction_loss` and `kl_loss` keys.
- Added example config `conf/examples/vae_with_kl_annealing.yaml` demonstrating cosine KL annealing over 5000 steps with a free-reconstruction warmup (`min_beta=0`, `max_beta=1`).
- Added 20 unit tests in `tests/test_kl_annealing.py` covering schedule shapes, hook routing, beta clamping, TensorBoard logging, and config validation.
- Added `free_bits: float = 0.0` parameter to `VariationalAutoencoderLoss`. When positive, clamps each latent spatial position's KL to at least `free_bits` nats before averaging. This blocks the KL gradient for collapsed positions so they are not over-regularised, while the decoder still gets reconstruction gradients through them. Prevents posterior collapse without requiring careful beta tuning. Recommended: 0.1–0.5 nats for spatial VAEs.
- Added `kl_balance: bool = False` parameter to `VariationalAutoencoderLoss`. When `True`, scales the KL term by `(C × H × W) / (Cz × Hz × Wz)` so reconstruction and KL are proportional to the same total-information budget (matching the theoretical ELBO). Particularly useful when `encoder_depth` is high (e.g. depth=5 yields a ~24× ratio for 224×224 inputs). The raw `kl_loss` logging key is unaffected; only `weighted_kl_loss` and `loss` reflect the scaling.
- Updated `conf/examples/vae_with_kl_annealing.yaml` to demonstrate `free_bits: 0.25` and `kl_balance: true` alongside cosine KL annealing.
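The two KL options reduce to simple arithmetic, sketched below. Both helper names are hypothetical, and the latent shape `(128, 7, 7)` is an assumed example: the changelog only states the ~24× ratio for depth 5 and 224×224 inputs, not the channel count that produces it.

```python
from math import prod


def kl_balance_scale(input_shape, latent_shape):
    """Ratio (C*H*W) / (Cz*Hz*Wz) used to rescale the KL term."""
    return prod(input_shape) / prod(latent_shape)


def free_bits_kl(kl_per_position, free_bits=0.0):
    """Clamp each position's KL to at least free_bits nats, then average."""
    clamped = [max(kl, free_bits) for kl in kl_per_position]
    return sum(clamped) / len(clamped)


# Assumed shapes: RGB 224x224 input, 128-channel 7x7 latent (stride 32).
scale = kl_balance_scale((3, 224, 224), (128, 7, 7))
kl = free_bits_kl([0.0, 0.1, 1.0], free_bits=0.25)
print(scale, kl)
```

Positions whose KL is already below the floor contribute the constant `free_bits`, so they add no KL gradient; only the third position in the example is still being regularised.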
Bug fixes
- Fixed `smooth_l1_ms_ssim` occasionally returning tiny negative losses for near-identical reconstructions. Kornia's pure MS-SSIM branch can produce `1 - MS_SSIM < 0` at floating-point precision when similarity is slightly above `1.0`; the MS-SSIM loss component is now clamped at zero while preserving normal positive values.
- Fixed `VariationalAutoencoderModel` and `AutoencoderModel` not computing or logging YAML-configured `metrics` (e.g. PSNR, SSIM) during training/validation. Both `_shared_step` overrides now extract the reconstruction tensor and call the `MetricCollection` when present.
- Fixed `AutoencoderResultCallback` (and base `ImageSegmentationResultCallback`) producing sepia-tinted images when the dataset uses `albumentations.ToFloat` instead of `albumentations.Normalize`. Root cause: `normalized_input=True` was the default, so ImageNet denormalization (`x * std + mean`) was applied to `[0, 1]` images that were never mean/std-normalized, compressing the range and adding an unequal warm channel shift. Fix: (1) added `normalized_input: false` to the `AutoencoderResultCallback` in `generic_variational_autoencoder_random_crop_folder.yaml`; (2) added `np.clip(0, 1)` after denormalization in `prepare_image_to_plot` to prevent out-of-range values from causing rendering artifacts.
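The sepia tint described in the last fix follows directly from the arithmetic. The sketch below applies `x * std + mean` with the standard ImageNet statistics to a black and a white pixel that were never normalized; the helper `denormalize` is illustrative, not the project's `prepare_image_to_plot`.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def denormalize(pixel):
    """Apply x * std + mean per channel (R, G, B)."""
    return tuple(v * s + m for v, s, m in zip(pixel, IMAGENET_STD, IMAGENET_MEAN))


# Pixels from a [0, 1] image that was never mean/std-normalized.
black = denormalize((0.0, 0.0, 0.0))
white = denormalize((1.0, 1.0, 1.0))
print(black)  # black is lifted to a reddish-brown tone
print(white)  # white is pulled down to a warm grey
```

The output range shrinks to roughly `[0.41, 0.71]`, and red ends up above green above blue at both ends, which is the warm cast seen in the logged images.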
Autoencoder / VAE stability
- Added `output_activation` parameter to `GenericDecoder`, `GenericAutoencoder`, and `GenericVariationalAutoencoder`. Supported values: `None` (default, unchanged behaviour), `"sigmoid"` (output bounded to `[0, 1]`, correct for uint8/255 targets), `"tanh"` (output bounded to `[-1, 1]`). Without this, the unbounded decoder logits caused PSNR to go negative whenever MSE > 1.
- Fixed KL divergence scaling in `VariationalAutoencoderLoss`: replaced `torch.sum(...) / batch_size` with `torch.mean(...)` so the KL term is a per-element average, matching the scale of `F.mse_loss` and preventing the KL from dominating the total loss with large spatial latents.
- Added `logvar_clamp` parameter to `GenericVariationalAutoencoder`. Clamping log-variance before exponentiation prevents fp16 overflow in `exp(logvar)` and eliminates numerically negative KL values seen with `precision="16"`. Recommended value: `(-4.0, 4.0)` for fp16 training.
- Updated `GenericVariationalAutoencoderConfig` to expose `output_activation` and `logvar_clamp`.
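Two of the numerical claims above can be checked with a few lines of arithmetic: PSNR with a data range of 1 turns negative once MSE exceeds 1, and `exp(logvar)` exceeds the largest finite fp16 value (65504) for fairly modest log-variances. The helper names are illustrative, and the sample log-variance of 12 is an assumed value.

```python
import math

FP16_MAX = 65504.0  # largest finite half-precision value


def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def clamp(value, low, high):
    return max(low, min(high, value))


print(psnr(0.01))  # bounded output, small error
print(psnr(4.0))   # unbounded logits, MSE > 1

logvar = 12.0  # assumed raw encoder output
print(math.exp(logvar) > FP16_MAX)
print(math.exp(clamp(logvar, -4.0, 4.0)))
```

With the recommended `(-4.0, 4.0)` clamp the variance stays between about 0.018 and 54.6, far inside the fp16 range.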
Tooling CLI
- Added `pytorch-smt-tools` entry point as a generic CLI tooling hub for command-line utilities.
- Added `compute-stats` subcommand (`pytorch-smt-tools compute-stats <yaml>`) that instantiates the training dataset defined in a YAML config, streams all samples through a DataLoader to compute per-channel mean and standard deviation, and writes the results back to the same YAML file.
- The command inserts an `albumentations.Normalize` entry (with the computed mean/std and the correct `max_pixel_value` for the dataset's `image_dtype`) into the `augmentation_list` of every dataset key (`train_dataset`, `val_dataset`, `test_dataset`) found in the YAML.
- If the YAML contains image-visualization callbacks (e.g. `AutoencoderResultCallback`, `ImageSegmentationResultCallback`), the command also updates their `norm_params` and sets `normalized_input: true`, eliminating the need to copy-paste statistics manually.
- Added `--dry-run`, `--skip-callbacks`, `--dataset-key`, `--batch-size`, and `--num-workers` flags.
- Added `click>=8.0.0` and `ruamel.yaml>=0.18.0` as project dependencies (ruamel.yaml preserves YAML comments and formatting on round-trip).
- Added unit tests in `tests/test_compute_dataset_stats.py` with 100% coverage.
- Changed `compute-stats` output format: mean/std values are now written to a top-level `normalization_parameters: {mean: [...], std: [...]}` key, and the `albumentations.Normalize` entry and callback `norm_params` reference them via Hydra interpolation (`${normalization_parameters.mean}`, `${normalization_parameters.std}`) instead of inlining the values directly. This makes it easy to override both at once from the command line or a child config without touching every dataset key.
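Streaming per-channel statistics can be computed from running sums, without holding the dataset in memory. The class below is a stdlib-only sketch of that idea: `ChannelStats` is a hypothetical name, images are represented as nested lists, and using the population standard deviation is an assumption about the estimator.

```python
import math


class ChannelStats:
    """Running per-channel mean/std from sums and sums of squares."""

    def __init__(self, n_channels):
        self.count = 0
        self.total = [0.0] * n_channels
        self.total_sq = [0.0] * n_channels

    def update(self, image):
        """image: list of channels, each a flat list of pixel values."""
        for c, channel in enumerate(image):
            self.total[c] += sum(channel)
            self.total_sq[c] += sum(v * v for v in channel)
        self.count += len(image[0])

    def finalize(self):
        mean = [t / self.count for t in self.total]
        # Var[x] = E[x^2] - E[x]^2, floored at 0 against rounding error.
        std = [
            math.sqrt(max(sq / self.count - m * m, 0.0))
            for sq, m in zip(self.total_sq, mean)
        ]
        return mean, std


stats = ChannelStats(n_channels=2)
stats.update([[0.0, 1.0], [2.0, 2.0]])  # first sample
stats.update([[1.0, 0.0], [2.0, 2.0]])  # second sample
mean, std = stats.finalize()
print(mean, std)
```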
Dataset
- Added `WindowedImageDataset` for deterministic sliding-window (grid) patch extraction from rasters without requiring pre-generated masks.
- Added `WindowedImageAutoencoderDataset` specifically for Autoencoder validation/testing, yielding `(image, target)` pairs where `image` can be optionally corrupted while `target` remains clean.
- Added `IterableWindowedImageDataset` and `IterableWindowedImageAutoencoderDataset`, which shard whole source rasters across DataLoader workers to avoid concurrent reads from the same GeoTIFF.
- Both datasets support global indexing across multiple images of varying sizes using efficient binary search (bisect).
- Added `verify_windows` and `window_index_cache` to windowed image datasets so unreadable raster windows can be excluded from indexing during initialisation and the verified window coordinates can be reused on later runs.
- Added `serialize_rasterio_reads`, `rasterio_lock_dir`, and `reopen_rasterio_on_read` to windowed image and random-crop raster datasets so DataLoader workers can serialize reads from the same compressed GeoTIFF on shared storage and optionally avoid persistent GDAL handles.
- Fixed `RasterPatchDataset` linting by defining its module logger before the window-read error path uses it.
- Added example configuration `conf/examples/windowed_image_autoencoder.yaml`.
- Added unit tests in `tests/test_windowed_datasets.py` with 100% coverage.
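Global indexing over images with different window counts is commonly done with a cumulative-count table and `bisect`. The sketch below shows that approach; `WindowIndex` is a hypothetical class and the window counts are made up, so it should be read as one plausible shape of the lookup rather than the dataset's actual code.

```python
from bisect import bisect_right
from itertools import accumulate


class WindowIndex:
    """Maps a global window index to (image index, local window index)."""

    def __init__(self, windows_per_image):
        self.cumulative = list(accumulate(windows_per_image))

    def __len__(self):
        return self.cumulative[-1] if self.cumulative else 0

    def locate(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # First image whose cumulative count exceeds the global index.
        image_idx = bisect_right(self.cumulative, index)
        start = self.cumulative[image_idx - 1] if image_idx else 0
        return image_idx, index - start


# Three rasters with 4, 0, and 6 valid windows respectively.
index = WindowIndex([4, 0, 6])
print(len(index), index.locate(3), index.locate(4), index.locate(9))
```

The lookup costs O(log n) in the number of images, and an image with no valid windows (for instance after window verification) is skipped automatically.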
Version 1.2.0 - 2026-05-11
Tests
- Achieved 100% test coverage for the modules `pytorch_segmentation_models_trainer/tools/inference/inference_csv_builder.py` and `pytorch_segmentation_models_trainer/tools/evaluation/csv_builder.py`.
- Added new unit test files: `tests/test_inference_csv_builder.py`, `tests/test_csv_builder.py`, and `tests/test_image_processing_worker.py`.
- Increased overall coverage of the `inference` and `evaluation` modules through tests for edge cases and error paths.
- Achieved 100% test coverage for all files in the `pytorch_segmentation_models_trainer/config_definitions/` directory.
- Created individual test files for all configuration dataclasses: `test_coco_dataset_config.py`, `test_dataset_config.py`, `test_dataset_distillation_config.py`, `test_edl_config.py`, `test_evaluation_config.py`, `test_experiments_runner_config.py`, `test_fine_tuning_config.py`, `test_inference_config.py`, `test_loss_config_definition.py`, `test_mc_dropout_config.py`, `test_predict_config.py`, `test_tools_config_def.py`, and `test_train_config.py`.
Bug fixes
- Fixed `TrainConfig` dataclass in `config_definitions/train_config.py`, which had an invalid `default_factory` for `callbacks` (was a list instead of a callable) and a problematic `default_factory` for `pl_model` and `model` (was calling `Model()` without required arguments).
- Fixed `LossParamsConfig` in `config_definitions/loss_config_definition.py`, where `seg_loss_params` had an incorrect type hint (`SegParamsConfig` instead of `SegLossParamsConfig`).
- Fixed `LossWeightConfig` in `config_definitions/loss_config_definition.py` to use `Any` for the `weight` field, resolving an `OmegaConf` limitation with `Union[float, List[float]]`.
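The `default_factory` issue in the first fix comes from how `dataclasses.field` works: it expects a zero-argument callable, which is called once per instance. The class below is a hypothetical miniature, not the project's `TrainConfig`.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class TrainConfigSketch:
    # default_factory must be a callable such as `list`, not a list instance.
    callbacks: List[Any] = field(default_factory=list)


first = TrainConfigSketch()
second = TrainConfigSketch()
first.callbacks.append("FinalMetricsCallback")
print(first.callbacks, second.callbacks)
```

Because the factory runs for every instance, appending to one config's `callbacks` leaves the other untouched.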
Dataset
- Added Apache Parquet support for all datasets inheriting from `AbstractDataset`.
- Implemented an automatic caching mechanism that converts `.csv` metadata files to `.cache.parquet` on the first run. Subsequent runs read the Parquet file if the CSV has not been modified, significantly improving metadata loading speed and memory efficiency.
- Metadata files (e.g., `input_csv_path`) can now be provided directly as `.parquet` files.
- Added a CLI tool `csv-to-parquet` for manual conversion of CSV datasets to Parquet (supports single files and recursive directory conversion).
- Integrated `pyarrow` as a new dependency.
- Added unit tests for Parquet reading, caching logic, and CLI tool in `tests/test_dataframe_utils.py`.
Experiments Runner
- Added `ExperimentsRunner` class (`tools/experiments_runner/experiments_runner.py`) that runs successive training experiments in series, each with an isolated seed and output directory.
- Seeds can be specified explicitly (`seeds: [42, 101, 28]`) or generated automatically at runtime by supplying only `n_runs: 5`. Providing both values is accepted when they are consistent; a conflict raises a `ValueError` with a clear message.
- Each run receives its seed via the existing `set_training_seed()` mechanism (Python `random`, NumPy, PyTorch CPU/CUDA, DataLoader workers), and writes checkpoints and logs to `<output_base_dir>/run_<idx:02d>_seed<seed>/`, so the seed is visible directly in the filesystem path.
- Wall-clock training time and all Lightning `callback_metrics` (`train/*`, `val/*`, `test/*`) are captured per run, including test metrics when a `test_dataset` block is present.
- When `save_summary: true` (default), a `summary.csv` is updated incrementally after each run with per-run rows plus aggregated `mean` and `std` rows for all metrics and the training duration.
- After every completed run a `runner_state.json` is written to `output_base_dir`. Set `resume: true` in the config to skip already-completed runs on restart; auto-generated seeds are loaded from the state file so they remain stable across restarts.
- If a `logger:` block is present in the training config, the runner stamps the run identity into it: `version` is set to `"run_<idx:02d>_seed<seed>"` for TensorBoard/CSV loggers; `name` is appended with the same tag for WandB-style loggers.
- Added `ExperimentsRunnerConfig` dataclass in `config_definitions/experiments_runner_config.py` with new `resume` field, registered in Hydra's ConfigStore.
- Added new dispatch mode `run-experiments` in `main.py`.
- Added example configuration `conf/examples/experiments_runner.yaml` (UNet / ResNet-34, 3 fixed seeds).
- Added user documentation `website/docs/user-guide/experiments_runner.md`.
- Added 51 unit tests in `tests/test_experiments_runner.py` covering validation, seed resolution, config mutation, logger injection, metric collection, state file persistence, resume logic, and the full `run()` integration with a mocked trainer.
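The seed-resolution rules and the run directory naming can be sketched as two small functions. Both names are hypothetical, the 32-bit seed range is an assumption, and whether `idx` starts at 0 or 1 is not stated above, so the caller passes it in.

```python
import random


def resolve_seeds(seeds=None, n_runs=None, rng=None):
    """Return the list of seeds to run, validating seeds against n_runs."""
    if seeds is not None:
        if n_runs is not None and n_runs != len(seeds):
            raise ValueError(
                f"n_runs={n_runs} conflicts with {len(seeds)} explicit seeds"
            )
        return list(seeds)
    if n_runs is None:
        raise ValueError("provide either seeds or n_runs")
    rng = rng or random.Random()
    # Assumption: seeds are drawn from the unsigned 32-bit range.
    return [rng.randrange(2 ** 32) for _ in range(n_runs)]


def run_dir_name(idx, seed):
    """Directory tag following the run_<idx:02d>_seed<seed> pattern."""
    return f"run_{idx:02d}_seed{seed}"


for idx, seed in enumerate(resolve_seeds(seeds=[42, 101, 28], n_runs=3)):
    print(run_dir_name(idx, seed))
```

Persisting the generated list, as the runner does with `runner_state.json`, is what keeps auto-generated seeds stable across a resumed session.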
Dataset Distillation
- Added `dataset_distillation.py` utilities for dataset distillation using Coreset of Medoids (Optimal Quantization).
- Implemented `extract_all_latents` for high-throughput embedding extraction from trained Autoencoders.
- Implemented `find_coreset_medoids` using K-Means and L2 distance (`torch.cdist`) to find representative real samples closest to cluster centroids.
- Added `KMeansClusteringTool` for orchestrating DDOQ pipelines, including OOM-safe Medoid search and Voronoi weight calculation.
- Implemented DDOQ (Dataset Distillation by Optimal Quantization) with variance reduction heuristics (Square Root) and a toggle for Vanilla density weights.
- Added `DDOQDistilledDataset`, which returns the triplet `(image, mask, weight)`, supporting both hard-labels and teacher-generated soft-labels.
- Added `StudentSegmentationModel` (`pl.LightningModule`) implementing the DDOQ weighted loss: `min_theta sum(w * Loss(x, y, theta))`.
- Added support for Adaptive K search via the Elbow Method (Perpendicular Distance) in `kmeans_calculator.py`.
- Added utilities to save and load DDOQ results (indices and weights).
- Added `DatasetDistillationConfig` Hydra dataclass and registered it in the ConfigStore.
- Added example configuration `conf/examples/dataset_distillation.yaml`.
- Added user documentation in `website/docs/user-guide/dataset_distillation.md`.
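The perpendicular-distance elbow rule picks the K whose point on the inertia curve lies farthest from the straight line joining the first and last points. The sketch below implements that textbook rule on raw values; `elbow_k` is a hypothetical name, the inertia numbers are made up, and whether `kmeans_calculator.py` rescales the axes first is not known from this changelog.

```python
import math


def elbow_k(ks, inertias):
    """Return the k with the largest distance to the first-to-last chord."""
    if len(ks) != len(inertias) or len(ks) < 2:
        raise ValueError("need at least two (k, inertia) pairs")
    x1, y1 = ks[0], inertias[0]
    x2, y2 = ks[-1], inertias[-1]
    norm = math.hypot(x2 - x1, y2 - y1)
    best_k, best_dist = ks[0], -1.0
    for k, inertia in zip(ks, inertias):
        # Point-to-line distance for the line through (x1, y1) and (x2, y2).
        dist = abs((y2 - y1) * k - (x2 - x1) * inertia + x2 * y1 - y2 * x1) / norm
        if dist > best_dist:
            best_k, best_dist = k, dist
    return best_k


ks = [1, 2, 3, 4, 5, 6]
inertias = [100.0, 50.0, 20.0, 15.0, 12.0, 10.0]
print(elbow_k(ks, inertias))
```

Because K and inertia live on very different scales, normalising both axes before measuring distance can change the selected K; the sketch leaves the values unscaled for clarity.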
GPU K-Means
- Added `MiniBatchKMeans` PyTorch implementation for high-performance clustering on GPU.
- Added `KMeansClusteringTool` for orchestrating clustering pipelines with GeoPandas and PostGIS support.
- Supports K-Means++ initialization and efficient mini-batch updates for large datasets.
- Added GeoParquet and PostGIS export capabilities for clustered spatial data.
Autoencoder
- Added `GenericVariationalAutoencoder` with SMP/HuggingFace encoder support, posterior `mu` / `logvar` projections, and the reparameterization trick for differentiable latent sampling.
- Added `VariationalAutoencoderLoss`, a composite reconstruction-plus-analytic-KL objective supporting MSE, L1, and BCE-with-logits reconstruction terms.
- Added `VariationalAutoencoderModel` to log total, reconstruction, and KL losses across train, validation, and test steps.
- Added Hydra dataclasses, API/user documentation, and a complete `AutoencoderRandomCropDataset` VAE example config with random horizontal, vertical, and transpose mirror flips.
- Moved image-only datasets to `dataset_loader/image_dataset.py` (`ImageDataset`, `CSVWindowedImageDataset`, `TiledInferenceImageDataset`, `AutoencoderDataset`, and `AutoencoderRandomCropDataset`), while keeping lazy compatibility exports from `dataset_loader.dataset` for existing configs.
- Added `AutoencoderRandomCropDataset` for self-supervised reconstruction from unlabeled image folders or CSV-backed full-size rasters. It discovers images recursively and supports deterministic train/validation splits, rasterio windowed random crops, selected bands, dtype handling, and input-only corruption augmentations.
- Added Hydra dataclasses and a folder-based example config for random-crop autoencoder training.
- Added `GenericAutoencoder` model supporting SMP and Transformers encoders.
- Added `AutoencoderModel` LightningModule for image reconstruction tasks.
- Added `AutoencoderDataset` for self-supervised learning and reconstruction.
- Added `AutoencoderResultCallback` for side-by-side visualization of input and reconstructed images during validation.
- Added example configuration `conf/examples/generic_autoencoder.yaml`.
- Updated documentation in `website/docs/user-guide/generic_autoencoder.md`.
Version 1.1.0 - 2026-04-28
Bug fixes
- Fixed `tests/test_build_mask.py::Test_BuildMask::test_build_output_dirs_raises_exception`: changed the output path to be a sub-directory of the input path in the test, ensuring the path validation logic in `build_destination_dirs` is correctly triggered and the expected exception is raised.
- Fixed `tests/test_configs/predict.yaml` Hydra composition: added the `@_global_` package override to the `train_config_used_in_predict_test` default inclusion. This ensures the configuration fields are merged into the root scope, allowing `train_dataset.input_csv_path` to be successfully overridden during prediction tests.
- Fixed `NameError: name 'DictConfig' is not defined` in `dataset_loader/dataset.py` (`load_augmentation_object`): `DictConfig` was used but not imported from `omegaconf`, causing all augmentation loading to silently fall back to raw OmegaConf objects and crash with `albumentations >= 2.x` when `A.Compose` tried to access `.available_keys` on them.
- Fixed `Model._unpack_batch` (`model_loader/model.py`) to respect the `image_key` and `mask_key` config fields (was hardcoded to `"image"`/`"mask"`). Also propagated the fix to `_shared_step` so training/validation steps honour custom keys end-to-end.
- Fixed `ValueError: prefetch_factor option could only be specified in multiprocessing` in `Model.train_dataloader`, `val_dataloader`, and `test_dataloader`: when `num_workers=0`, `prefetch_factor` is now set to `None` as required by PyTorch's DataLoader API.
- Fixed `MCDropoutInferenceProcessor` (`tools/inference/mc_dropout_inference_processor.py`) to save uncertainty rasters with suffix `_mc_uncertainty` instead of the generic `_uncertainty` from the parent class, matching the test expectation and making it distinguishable from TTA uncertainty maps.
- Fixed `merge_lora_weights` (`fine_tuning/lora_utils.py`) to use `is_peft_model()` for the PEFT check rather than a local `isinstance` guard; this makes the function testable without a real PEFT installation and consistent with `is_peft_model`.
- Fixed `TimmEncoderWithSMPDecoder` (`custom_models/timm_models.py`) for SMP 0.5.0 compatibility: replaced removed `use_batchnorm` kwarg with `use_norm` in `UnetDecoder.__init__`, and changed the decoder `forward` call to pass features as a single list instead of splatted positional arguments (both `UnetDecoder` and `FPNDecoder` now use `forward(features: list)` in SMP 0.5.x).
- Updated `test_caches.py::TestClassPresenceCacheAutoSave::test_auto_save_json_structure` to account for the `_config` metadata key now present in the class-presence cache JSON, filtering it out before counting data entries.
- Fixed `Model._shared_step` (`model_loader/model.py`) to apply `_prepare_preds_for_metrics` before computing train/val metrics: binary models output `[B, 1, H, W]` but torchmetrics expects `[B, H, W]`, causing a shape mismatch `RuntimeError` during training. The same fix was applied to the per-class IoU path.
- Fixed `test_frame_field_model.py::_make_ff_batch`: `class_freq` was built as `torch.ones(B)` (1-D) but `compute_seg_loss_weights` in `base_loss.py` sums over `dim=1`, requiring `[B, C]`. Also added the missing `gt_crossfield_angle: torch.zeros(B, 1, H, W)` key, required by `compute_gt_field` when `compute_crossfield: true`.
- Fixed `mod_polymapper.py` validation metric logging: added `numel() > 0` guard to skip empty tensors (produced when no polygons are detected in `fast_dev_run`), and added `.float()` before `.mean()` to handle Long-dtype metric values, preventing `ValueError: tensor must have a single element` and `RuntimeError: mean() dtype`.
- Fixed `test_polygonizer.py::test_polygonizer_acm_processor`: replaced `geopandas.testing.geom_almost_equals` (fixed 5×10⁻⁷ m tolerance, too strict for GeoJSON coordinate rounding) with a 1 cm tolerance wrapper; regenerated the `acm_polygonizer.geojson` baseline because the ACM algorithm produces different vertex coordinates with newer shapely/geopandas.
- Fixed `custom_callbacks/image_callbacks.py`: `CombinedLoader.loaders` was renamed to `CombinedLoader.iterables` in PyTorch Lightning ≥ 2.x; added a `getattr` fallback so the callback works on both old and new PL versions. Also fixed the follow-up `DataLoader.loader.dataset` chain: PL ≥ 2.x exposes plain `DataLoader` objects in `iterables`, so `.dataset` is now accessed directly with a fallback to the old `.loader.dataset` path.
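The `prefetch_factor` fix in this list can be pictured with a small helper. This is only a sketch: `dataloader_kwargs` is a hypothetical name, not a function from this project, and it merely builds the keyword dictionary one might pass to `torch.utils.data.DataLoader`.

```python
from typing import Any, Dict, Optional


def dataloader_kwargs(
    batch_size: int, num_workers: int, prefetch_factor: Optional[int] = 2
) -> Dict[str, Any]:
    """Build DataLoader keyword arguments (hypothetical helper)."""
    return {
        "batch_size": batch_size,
        "num_workers": num_workers,
        # PyTorch rejects a non-None prefetch_factor when loading runs in
        # the main process (num_workers=0), so it is dropped in that case.
        "prefetch_factor": prefetch_factor if num_workers > 0 else None,
    }
```

With `num_workers=0` the returned dictionary always carries `prefetch_factor=None`, which is the behaviour the changelog entry describes.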
Installation & Environment
- Migrated to `uv` as the primary project and dependency manager.
- Added `uv sync` as the recommended installation method in `README.md` and documentation.
- Updated project requirements to Python 3.12+ and PyTorch 2.0+ (Lightning 2.4+).
- Updated website documentation (`website/docs/getting-started/installation.md`) with a comprehensive `uv` installation guide and updated troubleshooting for CUDA 11.8.
Reproducibility (Training Seed)
- Added `seed: Optional[int]` and `deterministic_cudnn: bool` fields to the `TrainConfig` dataclass (`config_definitions/train_config.py`). They default to `None` and `False` respectively, so all existing configs are fully backward-compatible.
- Added `set_training_seed(seed, deterministic_cudnn=False)` utility function (`utils/seed_utils.py`). A single call seeds all randomness sources before model or dataset creation: Python `random`, NumPy `np.random`, `torch.manual_seed`, `torch.cuda.manual_seed_all`, and `PYTHONHASHSEED`. Optionally sets `torch.backends.cudnn.deterministic = True` and `torch.backends.cudnn.benchmark = False`. Returns a `torch.Generator` seeded with the same value for use in DataLoaders.
- Modified `train()` (`train.py`) to call `set_training_seed` as the very first operation when `cfg.seed` is present, ensuring model weight initialisation (SMP, timm, HuggingFace, custom architectures) is also reproducible.
- Modified `_worker_init_fn` (`dataset_loader/dataset.py`) to additionally seed Python's `random` module alongside NumPy. PyTorch automatically sets each worker's `torch.initial_seed()` to `base_seed + worker_id`, where `base_seed` is drawn from the main-process generator, so both seeds are deterministic and unique per worker once a global seed is set.
- Modified `Model.train_dataloader`, `Model.val_dataloader`, and `Model.test_dataloader` (`model_loader/model.py`) to pass a `torch.Generator().manual_seed(cfg.seed)` to each `DataLoader` when `cfg.seed` is set, making the shuffle sampler sequence reproducible. Added `Model._make_dataloader_generator()` helper method.
- Added reference YAML config `conf/examples/reproducible_training.yaml` with inline comments explaining every field and listing all controlled randomness sources.
- Added reproducibility block (commented) to `conf/examples/smp_mit_b2.yaml` so users can enable it in one line.
- Added user documentation `website/docs/user-guide/reproducibility.md` covering the `seed` field, `deterministic_cudnn` trade-offs, Python API usage, and known limitations.
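The seeding steps listed above can be sketched roughly as follows. This is not the project's `set_training_seed`; `seed_everything_sketch` is a hypothetical name, and the sketch assumes that PyTorch may be absent, in which case only Python and NumPy are seeded and `None` is returned.

```python
import os
import random

import numpy as np


def seed_everything_sketch(seed: int, deterministic_cudnn: bool = False):
    """Seed common randomness sources; returns a torch.Generator when torch is installed."""
    # Only affects interpreters started after this point: hash randomisation
    # of the current process is fixed at start-up.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch
    except ImportError:
        return None  # assumption of this sketch: degrade gracefully without torch
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # safe to call on CPU-only machines
    if deterministic_cudnn:
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    # A generator seeded with the same value can be handed to DataLoaders.
    return torch.Generator().manual_seed(seed)
```

Calling the function twice with the same seed should make subsequent `random` and `np.random` draws repeat exactly.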
MC Dropout (test-time uncertainty)
- Added `mc_dropout_utils.py` (`utils/mc_dropout_utils.py`): three pure utility functions with no inference-framework dependency. `enable_mc_dropout(model)` sets all `Dropout`/`Dropout2d`/`Dropout3d` layers to train mode while leaving the rest of the model in eval mode (BatchNorm keeps running statistics; only dropout randomness is re-enabled). `warn_if_no_dropout(model)` emits a `UserWarning` if the model has no dropout layers; in that case all T samples are identical and uncertainty is zero. `compute_uncertainty(samples, mode)` accepts a `[T, B, C, H, W]` tensor of softmax probabilities and returns `[B, 1, H, W]` uncertainty: `"entropy"` computes predictive entropy of the mean distribution (total uncertainty); `"mutual_information"` computes BALD, the difference between the entropy of the mean and the mean of the individual entropies (epistemic uncertainty only).
- Added `MCDropoutInferenceProcessor` (`tools/inference/mc_dropout_inference_processor.py`): extends `MultiClassInferenceProcessor` with MC Dropout inference. Overrides `predict_and_merge` to run `n_samples` stochastic forward passes per tile (after calling `enable_mc_dropout`), average the softmax probabilities for the class prediction, and, only when `export_uncertainty_map=True`, compute and merge the per-pixel uncertainty map. Uncertainty tensors are only allocated when requested, keeping the no-uncertainty path free of extra memory cost. Overrides `process` to skip the striped inference path (which uses a single TileMerger and cannot merge uncertainty); a warning is logged for large images. Exports `{stem}_mc_uncertainty.tif` (float32 single-band GeoTIFF, range `[0, log C]`) alongside the segmentation output when `export_uncertainty_map=True`.
- Modified `MultiClassInferenceProcessor` (`tools/inference/inference_processors.py`): added three new parameters `export_uncertainty_map` (default `False`), `uncertainty_mode` (default `"entropy"`), and `output_uncertainty_dir` (default `None`). When `tta_mode` is set and `export_uncertainty_map=True`, per-sample softmax probabilities are kept during the TTA loop and used to compute uncertainty via `compute_uncertainty()`; the result is merged via a separate `TileMerger` and exported as `{stem}_uncertainty.tif`. The no-uncertainty path (default) is fully unchanged. Added `_save_uncertainty_raster` helper for writing the float32 GeoTIFF with source CRS and transform preserved.
- Added `MCDropoutInferenceProcessorConfig` (`config_definitions/mc_dropout_config.py`): Hydra dataclass registered in the ConfigStore under `group="inference_processor"`, `name="mc_dropout"`. Fields: `n_samples`, `uncertainty_mode`, `export_uncertainty_map`, `num_classes`, `model_input_shape`, `step_shape`, `output_uncertainty_dir`.
- Added reference YAML config `mc_dropout_inference.yaml` under `conf/examples/` with inline comments explaining the `decoder_dropout` requirement and the `export_uncertainty_map` flag.
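The two uncertainty modes described above can be illustrated with a NumPy sketch. It mirrors the stated shapes (`[T, B, C, H, W]` in, `[B, 1, H, W]` out) but is not the project's `compute_uncertainty`, which operates on torch tensors; the name `uncertainty_sketch` and the `eps` constant are assumptions made for this example.

```python
import numpy as np


def uncertainty_sketch(samples: np.ndarray, mode: str = "entropy") -> np.ndarray:
    """samples: [T, B, C, H, W] softmax probabilities -> [B, 1, H, W]."""
    eps = 1e-12  # avoids log(0)
    mean_probs = samples.mean(axis=0)  # [B, C, H, W]
    # Entropy of the mean distribution: total predictive uncertainty.
    total = -(mean_probs * np.log(mean_probs + eps)).sum(axis=1, keepdims=True)
    if mode == "entropy":
        return total
    if mode == "mutual_information":
        # Mean of the per-sample entropies (the aleatoric part).
        per_sample = -(samples * np.log(samples + eps)).sum(axis=2)  # [T, B, H, W]
        expected = per_sample.mean(axis=0)[:, None]  # [B, 1, H, W]
        # BALD: entropy of the mean minus mean of the entropies.
        return total - expected
    raise ValueError(f"unknown mode: {mode}")
```

If all T samples are identical, the mutual information is zero, which is the situation `warn_if_no_dropout` is meant to flag.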
Evidential Deep Learning (EDL)
- Added `EvidentialWrapper` (`custom_models/edl_wrapper.py`): architecture-agnostic wrapper that replaces Softmax with a Dirichlet parameterisation (`evidence = Softplus(logits)`, `alpha = evidence + 1`). Works with any model that returns `[B, K, H, W]` tensors including SMP, HuggingFace, timm, and custom models. Forward pass returns a dict with `logits`, `evidence`, `alpha`, `probs`, and `uncertainty` keys. Handles tuple, dict, and plain-tensor model outputs automatically.
- Added `EvidentialMSELoss` and `EvidentialKLLoss` (`custom_losses/edl_loss.py`): two-component EDL loss designed for use with the existing `CompoundLoss`/`MultiLoss` epoch-weight scheduling mechanism. `EvidentialMSELoss` computes the MSE integrated analytically over the Dirichlet distribution (bias² + variance terms). `EvidentialKLLoss` computes `KL[Dir(α̃) || Dir(1,...,1)]` after removing evidence for the correct class, penalising residual wrong-class evidence. KL annealing is expressed as a dynamic weight list in the YAML, so no custom scheduler is needed.
- Added `edl_utils.py` (`custom_losses/edl_utils.py`): analytic helpers for Dirichlet statistics: `one_hot_encode` (hard + soft label support, ignore_index handling), `dirichlet_strength`, `epistemic_uncertainty` (`u = K/S`), `dirichlet_kl_divergence`, `kl_divergence_to_uniform`, `edl_kl_regulariser`.
- Modified `Model._shared_step` to detect `EvidentialWrapper` output (dict with `"alpha"` key): `probs` is extracted for metrics and logging; mean uncertainty `edl/{prefix}_uncertainty` is logged automatically at each step. The change is backward-compatible: non-EDL models are unaffected.
- Added `EvidentialWarmupCallback` (`custom_callbacks/edl_callbacks.py`): three-phase encoder freeze schedule for fine-tuning. Phase 1 (`epoch < warmup_epochs`): encoder frozen. Phase 2 (`warmup_epochs ≤ epoch < partial_unfreeze_epoch`): last two encoder stages unfrozen. Phase 3 (`epoch ≥ partial_unfreeze_epoch`): full encoder unfrozen. Setting `freeze_encoder=False` (training from scratch) disables all freezing while preserving the logging hooks.
- Added `EvidentialUncertaintyVisualizationCallback` (`custom_callbacks/edl_callbacks.py`): logs a 4-column diagnostic grid `[input | predicted class | uncertainty map | ground truth]` every N validation epochs. Compatible with TensorBoard, WandB, and plain file-system fallback.
- Added `EvidentialInferenceProcessor` (`tools/inference/edl_inference_processor.py`): extends `SingleImageInfereceProcessor` with separate `TileMerger` instances for probabilities and uncertainty. Exports a single-band float32 GeoTIFF of the uncertainty map (`u = K/S`) via `save_uncertainty_raster`, preserving source CRS and transform exactly. Optional alpha band export via `export_alpha=True`.
- Added `edl_config.py` (`config_definitions/edl_config.py`): Hydra dataclass configs for `EvidentialWrapper`, `EvidentialMSELoss`, `EvidentialKLLoss`, `EvidentialWarmupCallback`, `EvidentialUncertaintyVisualizationCallback`, and `EvidentialInferenceProcessor`, all registered in the ConfigStore.
- Added reference YAML configs `edl_from_scratch.yaml` and `edl_finetune.yaml` under `conf/examples/` with inline comments explaining every field.
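The Dirichlet parameterisation named above (`evidence = Softplus(logits)`, `alpha = evidence + 1`, `u = K/S`) can be written out in a few lines of NumPy. This is an illustrative sketch, not the `EvidentialWrapper` code; the function name `dirichlet_outputs` is hypothetical.

```python
import numpy as np


def dirichlet_outputs(logits: np.ndarray) -> dict:
    """logits: [B, K, H, W] -> evidence, alpha, probs and uncertainty."""
    evidence = np.logaddexp(0.0, logits)  # numerically stable Softplus
    alpha = evidence + 1.0
    strength = alpha.sum(axis=1, keepdims=True)  # S, shape [B, 1, H, W]
    num_classes = logits.shape[1]  # K
    return {
        "evidence": evidence,
        "alpha": alpha,
        "probs": alpha / strength,  # expected class probabilities
        "uncertainty": num_classes / strength,  # u = K / S, in (0, 1]
    }
```

With no evidence for any class, every `alpha` equals 1, so `S = K` and the uncertainty reaches its maximum of 1; strong evidence for one class drives it toward 0.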
RandomCropSegmentationDataset
- Added `RandomCropSegmentationDataset`: reads large GeoTIFF images on-the-fly using rasterio windowed reads instead of pre-generating tiles on disk. Eliminates the disk-space overhead of a tile library and allows crop size, augmentation, and sampling strategy to be changed without reprocessing data.
- Per-worker LRU cache (`_RasterioLRUCache`) keeps a configurable number of open `DatasetReader` handles (`lru_cache_size`, default `64`) to avoid repeated rasterio open/close overhead.
- `class_balanced_sampling`: weights image selection by inverse class frequency so images containing rare classes are sampled more often. Computed once at dataset initialisation from mask histograms in the CSV.
- Class-aware `CutMix` (`cutmix_prob`, `cutmix_alpha`): pastes a rectangular region from a second crop chosen to maximise class diversity.
- `ClassMix` (`classmix_prob`): copies a randomly selected class region from a second image and pastes it onto the primary crop; particularly effective for rare classes.
- `soft_labels` mode: returns float masks in `[0, 1]` for label-noise and probabilistic annotation workflows. The `_shared_step` training loop detects soft labels automatically and uses the appropriate loss path.
- `grid_mode` / `grid_step`: switches from random crop positions to a deterministic sliding-window grid for reproducible validation coverage and pseudo-labelling. `configure_optimizers` accounts for grid mode when computing `steps_per_epoch` for `OneCycleLR`.
- Added `RandomCropSegmentationDatasetConfig` dataclass.
Mixture of Experts Models
- Added `UPerNetMoE` (`custom_models/upernet_moe.py`): UPerNet variant that replaces fusion and/or FPN convolutions with `MoEConv2dReLU` blocks. Supports `token_choice` (each token picks top-k experts) and `expert_choice` (each expert picks top-k tokens) routing, configurable noise injection, capacity factor, and an optional shared dense expert. Load-balancing auxiliary loss is automatically detected and added to the training loss in `_shared_step`.
- Added `UPerNetMEDoE` (`custom_models/upernet_medoe.py`): extends `UPerNetMoE` with structured expert dropout during training (randomly drops a fraction of experts per forward pass) to improve regularisation and reduce reliance on any single expert. `_shared_step` automatically logs `extra/train_medoe_expert_utilization` and `extra/train_medoe_expert_entropy` when a MEDoE model is detected.
Dual-Head Training
- Added `UPerNetDualHead` (`custom_models/upernet_dual_head.py`): UPerNet with two independent decoders sharing a single encoder. Head A is supervised with hard labels (integer class indices); Head B is supervised with soft labels (float probabilities). A consistency loss couples the two heads during training. `inference_head` controls which head is active at inference time: `"A"`, `"B"`, or `"average"` (default).
TTA improvements
- Added `tta_mode` compact interface: passing `tta_mode: "d4"` or `tta_mode: "flip"` to any inference processor or to `test_step` automatically selects the corresponding augmentation preset (`d4` = all 8 D4 symmetries; `flip` = 4 flip/rotation combinations), without listing augmentations explicitly.
- `SingleImageInfereceProcessor` now accepts `tta_mode` as an alias for `use_tta=True` + the corresponding augmentations list. Backward-compatible with the existing `use_tta` + `tta_augmentations` interface.
- `MultiClassInferenceProcessor` now accepts `tta_mode` (replaces the previous `use_tta` parameter on that class).
- `_get_tta_augmentations()` in `Model` checks `cfg.tta_mode` first; falls back to `cfg.use_tta` + `cfg.tta_augmentations` for backward compatibility.
MultiClassInferenceProcessor improvements
- Added striped inference (`make_inference_striped`): very large images are automatically split into horizontal stripes when pixel count exceeds `striped_threshold_pixels` (default 50 MP), processed in parallel via `ThreadPoolExecutor`, and reassembled in-memory. Stripe height is configurable via `stripe_height` (default 4096 px).
- Added confidence map output: `confidence_mode` (`"max_prob"` or `"entropy"`) computes per-pixel confidence scores alongside the class prediction. Saved to `output_probs_dir` when provided.
- Added `tile_weight` parameter (passed through to `AbstractInferenceProcessor`) to control how overlapping tile predictions are merged.
- `process()` override: automatically routes each image to the striped or standard inference path based on image size.
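The routing rule for striped inference can be sketched as a row-range planner. This is an assumption-laden illustration, not `make_inference_striped`: the function name `plan_stripes` is hypothetical, and any overlap between neighbouring stripes that the real implementation may use is not modelled.

```python
from typing import List, Tuple


def plan_stripes(
    height: int,
    width: int,
    striped_threshold_pixels: int = 50_000_000,
    stripe_height: int = 4096,
) -> List[Tuple[int, int]]:
    """Return (row_start, row_stop) ranges; a single range means no striping."""
    if height * width <= striped_threshold_pixels:
        return [(0, height)]  # small image: standard inference path
    return [
        (start, min(start + stripe_height, height))
        for start in range(0, height, stripe_height)
    ]
```

For example, a 10000 × 10000 image (100 MP) exceeds the default threshold and is split into three stripes, the last one shorter than `stripe_height`.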
Callbacks
- Added `EMACallback`: maintains an exponential moving average of model weights (configurable `decay`). During validation the shadow EMA weights are swapped in so validation metrics reflect the averaged model; the original weights are restored immediately after. Checkpoints saved during a validation epoch contain the EMA weights.
- Added `MixStyleCallback`: applies MixStyle feature-level domain augmentation via forward hooks on selected encoder stages. `stages` controls which encoder stage indices receive the hook; `p` and `alpha` control application probability and Beta distribution parameter respectively.
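The exponential moving average kept by a callback such as `EMACallback` follows the usual update rule `shadow = decay * shadow + (1 - decay) * weight`. The sketch below uses plain floats in a dictionary instead of model tensors; `ema_update` is a hypothetical helper, not part of the project.

```python
from typing import Dict


def ema_update(shadow: Dict[str, float], weights: Dict[str, float], decay: float) -> None:
    """In-place update: shadow = decay * shadow + (1 - decay) * weights."""
    for name, value in weights.items():
        shadow[name] = decay * shadow[name] + (1.0 - decay) * value


shadow = {"w": 0.0}
for _ in range(3):
    ema_update(shadow, {"w": 1.0}, decay=0.9)
# After n updates toward a constant weight of 1.0, the shadow equals 1 - decay**n.
```

A `decay` close to 1 makes the shadow weights change slowly, which is why the averaged model tends to give smoother validation metrics.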
LR Warmup
- Added `warmup_epochs` support in `hyperparameters`: a linear LR warmup is prepended to any scheduler (except `OneCycleLR`, which has its own warmup via `pct_start`). The framework automatically subtracts `warmup_epochs` from `T_max` (or equivalent period parameters) so the total scheduled duration remains correct.
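To see why `warmup_epochs` is subtracted from `T_max`, consider a closed-form sketch of linear warmup followed by cosine annealing. Cosine annealing is chosen here only as an example scheduler, and the exact warmup ramp (`(epoch + 1) / warmup_epochs`) is an assumption; `lr_at_epoch` is a hypothetical function, not the framework's implementation.

```python
import math


def lr_at_epoch(
    epoch: int,
    total_epochs: int,
    warmup_epochs: int,
    base_lr: float,
    eta_min: float = 0.0,
) -> float:
    """Linear warmup followed by cosine annealing over the remaining epochs."""
    if epoch < warmup_epochs:
        # Ramp linearly so the last warmup epoch reaches base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    t_max = total_epochs - warmup_epochs  # period shortened by the warmup
    progress = (epoch - warmup_epochs) / t_max
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * progress))
```

Because the cosine period is `total_epochs - warmup_epochs`, the learning rate reaches `eta_min` exactly at `total_epochs` rather than `warmup_epochs` epochs too late.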
Inference pipeline
- `predict_from_batch.py` updated: when `inference_processor` is present in the config, batch prediction is routed through `instantiate_inference_processor`, enabling sliding-window, striped, and TTA inference modes. The legacy `trainer.predict()` path is preserved for backward compatibility.
- `InferenceProcessorConfig` and `PredictSingleImageConfig` updated with new fields: `tta_mode`, `tile_weight`, `confidence_mode`, `striped_threshold_pixels`, `stripe_height`, `output_probs_dir`.
Domain Adaptation — DANN with Gradient Reversal
- Added `GradientReversalFunction` and `GradientReversalLayer` (`domain_adaptation/methods/gradient_reversal.py`): a `torch.autograd.Function` implementing the identity forward pass with gradient negation in backward, wrapped in a parameter-free `nn.Module` with a `set_lambda()` method.
- Added `DomainClassifier` (`domain_adaptation/methods/dann.py`): resolution-agnostic MLP domain classifier using `AdaptiveAvgPool2d(1)` so it works with any encoder spatial size.
- Added `DANNMethod` (`domain_adaptation/methods/dann.py`): full DANN implementation via `BaseDomainAdaptationMethod`. Features: `requires_features=True`, configurable `feature_layer`, dedicated `discriminator_lr` parameter group, and GRL lambda initialized to 0 to avoid adversarial pressure at epoch 0.
- Added `step_mode` parameter to `DANNMethod` (`"epoch"` or `"batch"`): controls the granularity at which the lambda schedule is applied. `"batch"` mode updates λ every training step via the new `on_train_batch_start` hook, producing smoother growth closer to the original Ganin et al. implementation; `"epoch"` (default) updates once per epoch.
- Added `_current_lambda` attribute to `DANNMethod`: caches the most recently computed lambda so `DomainAdaptationModel._get_lambda_da()` always returns the correct value regardless of update granularity.
- Added `on_train_batch_start` lifecycle hook to `BaseDomainAdaptationMethod` (no-op default) and forwarded it in `DomainAdaptationModel.on_train_batch_start`.
- Updated `DomainAdaptationModel._get_lambda_da()` to prefer `method._current_lambda` when present, falling back to the epoch-level schedule query.
- Refactored `BaseLambdaScheduler.get_lambda` signature from `(epoch, total_epochs)` to `(step, total_steps)`: the same formula, now granularity-agnostic. All scheduler subclasses (`ConstantScheduler`, `LinearScheduler`, `DANNScheduler`) updated accordingly.
- Added full working example config (`conf/examples/dann_domain_adaptation.yaml`) for U-Net ResNet-34 with DANN.
- Added test suite for GRL (`tests/test_gradient_reversal.py`, 17 tests) and DANN (`tests/test_dann_method.py`, 49 tests including 12 new `TestDANNMethodStepMode` tests).
- Added dedicated DANN user guide (`website/docs/advanced/dann-method.md`) covering mechanism, `in_channels` lookup table, full config reference, `step_mode` guidance, monitoring, and limitations.
- Updated `website/docs/advanced/domain-adaptation.md` with a "Built-in Methods" section linking to the DANN guide.
- Updated `website/docs/advanced/domain-adaptation-implementing-methods.md` Example 2 to use the built-in `GradientReversalLayer` and add a callout pointing to the dedicated DANN guide.
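The lambda schedule from Ganin et al. that the DANN entries refer to is λ(p) = 2 / (1 + exp(−γ·p)) − 1, with progress p in [0, 1] and γ = 10 in the paper. The sketch below takes `(step, total_steps)` to match the granularity-agnostic signature described above, but `dann_lambda` itself is a hypothetical function; whether `DANNScheduler` uses exactly these defaults is an assumption.

```python
import math


def dann_lambda(step: int, total_steps: int, gamma: float = 10.0) -> float:
    """Ganin et al. schedule: grows smoothly from 0 toward 1 over training."""
    # Clamp progress to [0, 1] so out-of-range steps stay well-defined.
    progress = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0
```

Passing epochs gives one update per epoch, while passing global batch indices gives the smoother per-step growth associated with `step_mode="batch"`; in both cases the value starts at 0, so there is no adversarial pressure at the very beginning.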
Domain Adaptation — Initial structure
- Added initial domain adaptation module (`domain_adaptation/`) with an extensible base class (`BaseDomainAdaptationMethod`), feature hook extraction (`feature_hooks.py`), adaptation schedulers (`schedulers.py`), and a monitoring callback (`callbacks/monitor_callback.py`).
- Added `DomainAdaptationModel` (inherits `Model`) that orchestrates source/target dataloaders, adaptation loss weighting, and per-epoch scheduler stepping.
- Added `DomainAdaptationConfig` dataclass in `config_definitions/domain_adaptation_config.py`.
- Added comprehensive test suite for domain adaptation (`test_base_method.py`, `test_domain_adaptation_config.py`, `test_domain_adaptation_model.py`, `test_feature_hooks.py`, `test_schedulers.py`).
- Added documentation: user guide (`advanced/domain-adaptation.md`), config reference (`advanced/domain-adaptation-config-reference.md`), and implementing custom methods guide (`advanced/domain-adaptation-implementing-methods.md`).
Test Time Augmentation (TTA)
- Added `tools/tta/tta.py` implementing TTA with all 8 symmetries of the D4 dihedral group (4 rotations × 2 flips). Each augmentation has an exact inverse so predictions are de-augmented and averaged with no spatial artifacts.
- Added `apply_tta()` helper for applying TTA to any segmentation model callable.
- Exposed `use_tta` and `tta_augmentations` fields on all inference processor classes and `InferenceProcessorConfig`/`PredictSingleImageConfig`. `SingleImageFromFrameFieldProcessor` automatically skips the `crossfield` output during TTA de-augmentation.
- `test_step()` in `Model` now applies TTA when `cfg.use_tta=True`.
- Added TTA user guide (`website/docs/advanced/tta.md`) and inference documentation section.
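The augment, predict, de-augment, average loop over the 8 D4 symmetries can be sketched with NumPy. This is not the project's `apply_tta()`; `d4_tta_average` is a hypothetical name, and `model` is assumed to be any callable that maps an array of shape `[..., H, W]` to a prediction with the same spatial layout.

```python
import numpy as np


def d4_tta_average(model, image: np.ndarray) -> np.ndarray:
    """Average model(image) over the 8 D4 symmetries; image is [..., H, W]."""
    outputs = []
    for k in range(4):  # rotations by 0, 90, 180 and 270 degrees
        for flip in (False, True):  # optional horizontal flip after rotating
            aug = np.rot90(image, k, axes=(-2, -1))
            if flip:
                aug = np.flip(aug, axis=-1)
            pred = model(aug)
            # Undo the augmentation in reverse order: unflip, then rotate back.
            if flip:
                pred = np.flip(pred, axis=-1)
            outputs.append(np.rot90(pred, -k, axes=(-2, -1)))
    return np.mean(outputs, axis=0)
```

Because every transform is inverted exactly, a model that returns its input unchanged yields the original image after averaging, which is a convenient sanity check for the inverse logic.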
Transformer & Foundation Model Support
- Added `HuggingFaceSegmentationWrapper` (`custom_models/huggingface_models.py`): loads any `AutoModelForSemanticSegmentation` from the Hub or local path, bypasses the HF processor, and upsamples logits back to input resolution.
- Added `TimmEncoderWithSMPDecoder` (`custom_models/timm_models.py`): combines a `timm` `features_only` backbone with SMP UNet/FPN/PAN decoders.
- Added `TerraTorchSegmentationWrapper` (`custom_models/terratorch_models.py`): bridges TerraTorch foundation model encoders (Prithvi, Clay, SatMAE) with a linear or FPN segmentation head; supports single- and multi-temporal inputs.
- Added `ModelOutputAdapter` (`custom_models/transformer_adapters.py`): normalises any model output (HF dataclass, dict, tuple, or plain tensor) to a `(B, C, H, W)` tensor with optional bilinear upsampling.
- Added LoRA / PEFT fine-tuning support (`fine_tuning/lora_utils.py`): `apply_fine_tuning_strategy()` supports `full`, `freeze_backbone`, `linear_probe`, and `lora` strategies; `LoraAdapterConfig` and `FineTuningConfig` dataclasses added; `merge_lora_weights()` for deployment.
- `predict.py` now auto-merges LoRA adapter weights before inference; a `keep_lora_adapters` flag skips the merge for fine-tuning resumption.
- Hardened training loop in `Model`: `_unpack_batch()` replaces fragile `batch.values()` unpacking with configurable `image_key`/`mask_key`; `set_encoder_trainable()` no longer assumes a `.encoder` attribute; `_prepare_preds_for_metrics()` guards metric calls against malformed outputs.
- Added 6 example YAML configs: `smp_mit_b2.yaml`, `smp_tu_convnext.yaml`, `segformer_hf.yaml`, `segformer_lora.yaml`, `vit_linear_probe.yaml`, `prithvi_terratorch.yaml`.
- Added `[transformers]` pip extras group and a dedicated CI job for the transformer test suite.
RasterPatchDataset
- Added `RasterPatchDataset` (`dataset_loader/raster_patch_dataset.py`): scans image/mask directory pairs recursively and exposes every `patch_size × patch_size` window (with configurable stride) as an independent dataset item. Global index to `(image, row, col)` mapping runs in O(log N) via `bisect` over cumulative patch counts; rasterio windowed reads ensure full images never enter RAM.
- Supports augmentations, `selected_bands`, `image_dtype`, `mask_extension`, `n_classes` (binary binarisation when `n_classes=2`), and `reset_augmentation_function`.
- Emits `UserWarning` for orphaned mask files (mask without a corresponding image) to surface dataset misconfiguration.
- Added `RasterPatchDatasetConfig` dataclass and example YAML (`conf/examples/raster_patch_segmentation.yaml`).
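The O(log N) index lookup mentioned for `RasterPatchDataset` can be sketched with the standard-library `bisect` module. The helper names below are hypothetical, and the sketch assumes that only full windows are counted (no padding of partial windows at image borders), which may differ from the actual dataset.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


def patches_per_axis(size: int, patch_size: int, stride: int) -> int:
    """Number of full windows that fit along one axis."""
    return 0 if size < patch_size else (size - patch_size) // stride + 1


def build_index(shapes: List[Tuple[int, int]], patch_size: int, stride: int):
    """Return cumulative patch counts and per-image column counts for (H, W) shapes."""
    rows = [patches_per_axis(h, patch_size, stride) for h, _ in shapes]
    cols = [patches_per_axis(w, patch_size, stride) for _, w in shapes]
    cumulative = list(accumulate(r * c for r, c in zip(rows, cols)))
    return cumulative, cols


def locate(index: int, cumulative: List[int], cols: List[int], stride: int):
    """Map a global index to (image_idx, row_offset, col_offset) in pixels."""
    image_idx = bisect_right(cumulative, index)  # binary search over images
    local = index - (cumulative[image_idx - 1] if image_idx else 0)
    n_cols = cols[image_idx]
    return image_idx, (local // n_cols) * stride, (local % n_cols) * stride
```

The returned offsets can be fed directly into a rasterio `Window(col_off, row_off, patch_size, patch_size)`, so only the requested patch is read from disk.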
Dataset improvements
- Added `test_dataset` support with `test_step()`, `test_dataloader()`, and `test_metrics` (prefixed `test/`) in `Model` and `FrameFieldSegmentationPLModel`. `trainer.test()` is now called automatically after `trainer.fit()` when `test_dataset` is present in the config. This completes the three-way dataset split: `train_dataset` → training loop; `val_dataset` → per-epoch monitoring during fit; `test_dataset` → final held-out evaluation after fit.
- Added `test_dataset` field to `TrainConfig` dataclass and `test_dataset` block to all example YAML configs.
- Added `SegmentationDatasetFromFolder`: a new dataset class that discovers image/mask pairs recursively from two root folders, without requiring a CSV file. Matching is done by relative subfolder path and file stem. Supports all parameters of `SegmentationDataset`. Raises `ValueError` when no valid pairs are found.
- `SegmentationDataset.__init__` now accepts an optional `df` parameter (pre-built `pd.DataFrame`) in addition to `input_csv_path`, enabling programmatic dataset creation without a CSV file on disk. Fully backwards-compatible.
- Added configurable `image_dtype` field to `SegmentationDataset`, `RandomCropSegmentationDataset`, and their configs, accepting `uint8` (default), `uint16`, `float32`, or `native`. Auto-normalization scales correctly per dtype (`/255`, `/65535`, or no division). Fully backwards-compatible.
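The per-dtype scaling described for `image_dtype` (`/255`, `/65535`, or no division) could look like the sketch below. `normalize_image` is a hypothetical function, and treating `native` as "return the array untouched" is an assumption of this example rather than documented behaviour.

```python
import numpy as np

# Divisors taken from the changelog entry; float32 input is assumed pre-scaled.
_SCALE = {"uint8": 255.0, "uint16": 65535.0}


def normalize_image(image: np.ndarray, image_dtype: str = "uint8") -> np.ndarray:
    """Cast to float32 and scale integer imagery into [0, 1]."""
    if image_dtype not in ("uint8", "uint16", "float32", "native"):
        raise ValueError(f"unsupported image_dtype: {image_dtype}")
    if image_dtype == "native":
        return image  # assumption: keep the source dtype and values as-is
    out = image.astype(np.float32)
    scale = _SCALE.get(image_dtype)
    return out / scale if scale is not None else out
```

The same divisor is what `normalize_max_value` would need to match at inference time, for instance 65535.0 for uint16 imagery.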
Inference improvements
- Added `normalize_max_value` parameter to all inference processor classes (`AbstractInferenceProcessor` and all subclasses), exposing Albumentations' `max_pixel_value` for the normalization step. Default `None` preserves the previous behaviour (255.0). Use `normalize_max_value: 65535.0` for uint16 imagery or `1.0` for pre-normalized float32.
Bug fixes
- Fixed bug in `Model.__init__`: `gpu_val_transform` and `gpu_train_transform` were accessed via `cfg.val_dataset`/`cfg.train_dataset` without checking if those keys exist, causing `AttributeError` when the corresponding dataset config was omitted.
- Fixed `val_dataloader()` to return `None` gracefully when `val_ds` is `None` (i.e. `val_dataset` absent from config), allowing training-only runs without a validation loop.
- Removed the orphan `set_test_dataset()` method from `FrameFieldSegmentationPLModel` (superseded by the new `test_ds` attribute set in `Model.__init__`).
Version 1.0.1
- Bug fix on prediction in a multi-GPU environment;
Version 1.0.0
- Updated to newest pytorch lightning;
- Semantic Segmentation model updated for multi class;
- New image callback for multi class semantic segmentation;
- Added improvements for OneCycleLR scheduler (automatic calculation of steps_per_epoch);
- Added band selection on SemanticSegmentation dataset;
- Added compound_loss for Semantic Segmentation models;
- Added experiment evaluation pipeline;
- New multi class semantic segmentation inference pipeline (the old one worked only on binary semantic segmentation);
- Bug fixes on inferences;
Version 0.17.0
- New evaluation metrics;
- New evaluation on test set;
- New inference service with image upload;
- Bug fixes;
Version 0.16.4
- sahi version bump;
- Bug fix with parameter grid on image callbacks and mod polymapper.
Version 0.16.3
- Bug fixes on ModPolymapper training when some parts are frozen.
Version 0.16.2
- Bug fix on ModPolyMapper when choosing not to evaluate while training.
- Added the option of freezing some parts of ModPolyMapper.
Version 0.16.1
- Dependencies fix.
Version 0.16.0
- New Mod PolyMapper model;
- Matching methods added;
- Evaluation methods added;
Version 0.15.0
- New Naive Mod PolyMapper model (Object Detection + PolygonRNN);
- New Naive Mod Polymapper dataset;
- New callback: Frame Field Only Crossfield Warmup Callback;
- New inference processors for Object Detection and PolygonRNN;
- Bug fix on object detection model;
- Bug fix on bounding box mask building;
- Bug fix on polygon iou with invalid geometries;
- Minor code refactor;
Version 0.14.2
- Bug fix on PolygonRNN polygon tokenizer.
Version 0.14.1
- Bug fix on convert dataset;
- Bug fix on PolygonRNNDataset;
- Bug fix on PolygonRNNResultCallback when using GPU;
- Bug fix on PolygonRNNPLModel;
Version 0.14.0
- Vector IOU;
- Polis metric added;
- IoU added to PolygonRNN training loop;
- Object detection visualization callback added;
- PolygonRNN visualization callback added;
- Bug fix on polygon building on build mask geometry handling;
Version 0.13.1
- Bug fixes on SegLoss parameters;
Version 0.13.0
- Dataset conversion added. It is now possible to convert between some dataset formats;
- Tversky Loss and Focal Tversky Loss added;
- LabelSmoothingLoss added;
- MixUpAugmentationLoss added;
- KnowledgeDistillationLoss added;
- Mixup augmentation added to Frame Field Model;
Version 0.12.1
- Bug fixes on mask building;
- Bug fixes on detection model training.
- New mode on build masks;
Version 0.12.0
- Minor improvements on polygonization methods;
- Inference server added;
Version 0.11.0
- Gradient Centralization added;
Version 0.10.0
- Object Detection added;
- Instance Segmentation added;
Version 0.9.0
- PolygonRNN model added;
- Added the option of choosing the number of images on ImageCallback;
- Added the option of adding created masks to existing csv;
- Added the option of generating bounding boxes in create masks;
- Added the option of converting csv dataset to coco dataset;
Version 0.8.2
- Fixes on requirements;
Version 0.8.1
- Minor improvements and bug fixes on polygon building inference;
- Bug fixes on mask builder;
- Performance improvement on mask builder using coco format;
Version 0.8.0
- Added inference features;
- Improved polygon inference;
Version 0.7.2
- Changed the versions of pytorch and torchvision.
Version 0.7.1
- Added MANIFEST.in to include missing yml on pypi packaging.
Version 0.7.0
- Bug fix on loss sync;
- Custom models from Frame Field implementation (to compare training results);
- New HRNet-OCR-W48 backbone;
- Fixed bugs on new versions of pytorch-lightning;
- Build mask from COCO dataset format;
Version 0.6.0
- Polygon inference;
- Unit tests for polygon inference;
- Bug fixes on warmup callback (invalid signature on method);
- FrameFieldResultCallback renamed to FrameFieldOverlayedResultCallback;
- New implementation of FrameFieldResultCallback;
- Invalid mask handling (frame field training mask with only polygon mask and empty vertex and boundary masks);
- Added multiple schedulers option;
- Added IoU 10, 25, 50, 75 and 90;
- Added GPU augmentation using kornia;
Version 0.5.1
- Bug fixes when inputs are RGBA images;
- Bug fixes on frame field model with models other than U-Net;
- Bug fixes on FrameFieldResultCallback (all black image fixed).
Version 0.5.0
- Added frame field training image visualization callback.
Version 0.4.1
Bug fixes on missing entrypoints and mask process execution.
Version 0.4
Polygonization by Frame Field Learning features
- FrameField dataset
- FrameField Learning
- Polygonization
Version 0.3.2
Bug fixes on image callback when Pytorch Lightning DDP is used.
Version 0.3.1
Bug fixes when Pytorch Lightning DDP is used.
Version 0.3.0
- Custom metric option in the model config;
- pytorch_toolbelt added as required package. This enables usage of the models, losses and metrics in the training;
- Added the option of setting a limit of rows to be read in the csv dataset;
- Added the option of setting a root_dir to the dataset. This root_dir will be concatenated to the entry in the csv dataset before loading the image;
- Bug fixes on image_callback;
Version 0.2.1
Fixes relative path bug on dataset
Version 0.2.0
New custom callbacks:
- ImageSegmentationResultCallback: Callback that logs the results of the training on TensorBoard and on saved files; and
- WarmupCallback: Freezes the encoder weights during the warmup epochs and unfreezes them afterwards.
Metrics added to Segmentation Model:
- Accuracy;
- Precision;
- Recall; and
- Jaccard Index (IoU).
Version 0.1.4
First version of metrics added.
Bug fixes on dataset reading with prefix path.
Version 0.1.3
Bug fix on entry points and --config-dir syntax.
Version 0.1.2
Bug fix on Python's version.
Minor bug fix
Bug fix.
First Release
Files
| Name | Size |
|---|---|
| dsgoficial/pytorch_segmentation_models_trainer-v1.3.0.zip (md5:9eb52c479a5ee330946f4a9728264673) | 78.2 MB |
Additional details
Related works
- Is supplement to
- Software: https://github.com/dsgoficial/pytorch_segmentation_models_trainer/tree/v1.3.0 (URL)