Published December 12, 2023 | Version 0.5.0
Software Open

eqcorrscan/EQcorrscan: EQcorrscan 0.5.0

  • 1. Victoria University of Wellington
  • 2. Lawrence Berkeley National Lab
  • 3. University of Bergen
  • 4. National Institute for Occupational Safety and Health
  • 5. International Institute of Earthquake Engineering and Seismology
  • 6. Gitter
  • 7. @Snyk
  • 8. Karlsruhe Institute of Technology - Geophysical Institute

Description

This release represents a significant increase in the efficiency of large-scale matched-filters in EQcorrscan. Lots of work has gone into reducing memory usage in the non-correlation components of the matched-filter workflow, streamlining the code, making better use of shared-memory multi-threaded parallelism and increasing CPU loads. In our testing we can now achieve and maintain >190% CPU efficiency (i.e. >95% hyperthreaded performance). We can also better load GPUs by making use of concurrent CPU and GPU processing of workflow steps. You should not need to change your code to make use of most of these speed-ups. Hopefully you will notice that you can run larger datasets faster than ever!

Changelog

  • core.match_filter.tribe
    • Significant re-write of detect logic to take advantage of parallel steps (see #544)
    • Significant re-structure of hidden functions.
  • core.match_filter.matched_filter
    • 5x speed up for MAD threshold calculation with parallel (threaded) MAD calculation (#531).
  • core.match_filter.detect
    • 1000x speedup for retrieving unique detections for all templates.
    • 30x speedup in handling detections (50x speedup in selecting detections, 4x speedup in adding prepick time)
  • core.match_filter.template
    • new quick_group_templates function for 50x quicker template grouping.
    • Templates with nan channels will be considered equal to other templates with shared nan channels.
    • New grouping strategy to minimise nan-channels - templates are grouped by similar seed-ids. This should speed up both correlations and prep_data_for_correlation. See PR #457.
  • utils.pre_processing
    • _prep_data_for_correlation: 3x speedup for filling NaN-traces in templates
    • New function quick_trace_select for very efficient selection of traces by seed ID without wildcards (4x speedup).
    • process, dayproc and shortproc replaced by multi_process. Deprecation warning added.
    • multi_process implements multithreaded GIL-releasing parallelism of slow sections (detrending, resampling and filtering) of the processing workflow. Multiprocessing is no longer supported or needed for processing. See PR #540 for benchmarks. The new approach is slightly faster overall, and significantly more memory efficient (uses c. 6x less memory than the old multiprocessing approach on a 12-core machine).
  • utils.correlate
    • 25 % speedup for _get_array_dicts with quicker access to properties.
  • utils.catalog_to_dd
    • _prepare_stream
      • Now more consistently slices templates to length = extract_len * samp_rate so that the user receives fewer warnings about insufficient data.
    • write_correlations
      • New option use_shared_memory to speed up correlation of many events by ca. 20 % by moving trace data into shared memory.
      • Add ability to weight correlations by raw correlation rather than just correlation squared.
  • utils.cluster.decluster_distance_time
    • Bug-fix: fix segmentation fault when declustering more than 46340 detections with hypocentral_separation.
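
The threaded MAD threshold calculation listed under core.match_filter.matched_filter can be pictured with the rough sketch below. It is not EQcorrscan's implementation: the function names are hypothetical, and it assumes the threshold for each template is a multiplier times the median of the absolute correlation sum (treating the sum as zero-mean). Pure-Python threads like these are still serialised by the GIL, so any real speedup would come from doing the median in compiled code that releases the GIL; the sketch only shows the one-task-per-template pattern.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def mad_threshold(cccsum, multiplier):
    """Threshold for one template: multiplier * median(|cccsum|).

    Assumption: the correlation sum is treated as zero-mean, so the
    median of absolute values stands in for the MAD.
    """
    return multiplier * median(abs(value) for value in cccsum)


def mad_thresholds(cccsums, multiplier, max_workers=4):
    """Compute one threshold per template, one thread-pool task each."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the input order of the templates.
        return list(pool.map(lambda c: mad_threshold(c, multiplier), cccsums))


# Toy correlation sums for two hypothetical templates.
cccsums = [
    [0.1, -0.2, 0.3, -0.1, 2.5],
    [0.05, 0.05, -0.1, 0.2, -0.05],
]
thresholds = mad_thresholds(cccsums, multiplier=8.0)
```

Each template's threshold is independent of the others, which is what makes this step easy to spread across threads.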
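
The 46340 limit in the decluster_distance_time bug-fix coincides with the largest n for which n * n still fits in a signed 32-bit integer. The release notes do not state the cause of the segmentation fault, so reading it as a 32-bit overflow when indexing an n-by-n array is an assumption; the sketch below only checks the arithmetic.

```python
import math

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits_int32_square(n):
    """Return True if n * n can be stored in a signed 32-bit integer."""
    return n * n <= INT32_MAX


# Largest n whose square does not exceed INT32_MAX.
largest_safe_n = math.isqrt(INT32_MAX)
```

Here largest_safe_n evaluates to 46340, and fits_int32_square(46341) is False.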

Files

eqcorrscan/EQcorrscan-0.5.0.zip

Size: 111.7 MB
md5:371068df7ed8363051da5af27ce70093

Additional details

Related works