There is a newer version of the record available.

Published December 31, 2022 | Version 0.18.0
Software Open

datalad/datalad: 0.18.0

  • 1. Institute of Neuroscience and Medicine, Brain & Behaviour (INM-7), Research Centre Jülich, Jülich, Germany and Institute of Systems Neuroscience, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
  • 2. Dartmouth College, Hanover, NH, United States
  • 3. Institute of Neuroscience and Medicine, Brain & Behaviour (INM-7), Research Centre Jülich, Jülich, Germany
  • 4. University of Texas at Austin
  • 5. UC Berkeley - UCSF Graduate Program in Bioengineering
  • 6. UC Berkeley
  • 7. Stanford University, Stanford, CA, United States
  • 8. Psychoinformatics Lab, INM-7, Research Centre Juelich
  • 9. Maze Therapeutics, South San Francisco, CA, United States
  • 10. Potsdam Institute for Climate Impact Research (PIK) e. V.
  • 11. Université catholique de Louvain, Louvain la neuve, Belgium
  • 12. Institute of Energy and Climate Research - Stratosphere (IEK-7), Research Centre Jülich, Jülich, Germany

Description

💥 Breaking Changes

  • Automatic reconfiguration of the ORA special remote when cloning from RIA stores now only applies locally rather than being committed. PR #7235 (by @bpoldrack)
🚀 Enhancements and New Features
  • Saving removed dataset content was sped up, and the reported type of added and removed subdatasets is now accurately 'dataset' instead of 'file'. Moreover, saving previously staged deletions is now also reported. PR #6784 (by @mih)

  • The foreach-dataset command got a new possible value, 'relpath', for the --output-streams|--o-s option. It captures the output and passes it through, prefixing it with the path to the subdataset. Very handy for e.g. running a git grep command across subdatasets. PR #7071 (by @yarikoptic)

  • New config datalad.create-sibling-ghlike.extra-remote-settings.NETLOC.KEY=VALUE allows adding and/or overwriting local configuration for the sibling created by the commands create-sibling-<gin|gitea|github|gitlab|gogs>. PR #7213 (by @matrss)

  • The siblings command no longer bothers the user with messages about an inconsequential failure to annex-enable a remote. PR #7217 (by @bpoldrack)

  • The ORA special remote now allows overriding its configuration locally. PR #7235 (by @bpoldrack)

  • Added a 'ria' special remote to provide backwards compatibility with datasets that were set up with the deprecated ria-remote. PR #7235 (by @bpoldrack)
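
To illustrate what the 'relpath' output mode of foreach-dataset described above is useful for, here is a minimal sketch of the idea: output captured inside a subdataset is re-emitted with the subdataset's relative path in front of each line, so that paths reported by a tool such as git grep become valid relative to the superdataset. The helper name prefix_with_relpath, the sample output, and the choice of "/" as the joining character are assumptions made for this illustration. This is not DataLad's implementation.

```python
def prefix_with_relpath(output, subds_relpath):
    """Prefix every line of captured output with a subdataset's relative path.

    Hypothetical helper for illustration only; "/" as the separator is an
    assumption, not taken from DataLad's code.
    """
    return [f"{subds_relpath}/{line}" for line in output.splitlines()]


# Made-up `git grep` output, as it might be captured inside the
# subdataset located at "sub/ds1" (paths are relative to that subdataset).
captured = "code/run.py:import datalad\nREADME.md:datalad is used here\n"

for line in prefix_with_relpath(captured, "sub/ds1"):
    # Each line now starts with a path relative to the superdataset.
    print(line)
```

On the command line, the corresponding call would presumably look something like datalad foreach-dataset --o-s relpath git grep PATTERN, with PATTERN being a placeholder.
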
🐛 Bug Fixes
  • When create-sibling-ria was invoked with a sibling name of a pre-existing sibling, a duplicate key in the result record caused a crash. Fixes #6950 via PR #6952 (by @adswa)
📝 Documentation
  • create-sibling-ria's docstring now defines the schema of RIA URLs and clarifies internal layout of a RIA store. PR #6861 (by @adswa)

  • Move maintenance team info from issue to CONTRIBUTING. PR #6904 (by @adswa)

  • Describe specifications for a DataLad GitHub Action. PR #6931 (by @thewtex)

  • Fix capitalization of some service names. PR #6936 (by @aqw)

  • Command categories in help text are more consistently named. PR #7027 (by @aqw)

  • DOC: Add design document on Tests and CI. PR #7195 (by @adswa)

  • CONTRIBUTING.md was extended with up-to-date information on CI logging, changelog and release procedures. PR #7204 (by @yarikoptic)

🏠 Internal
  • Use looseversion.LooseVersion as a drop-in replacement for distutils.version.LooseVersion. Fixes #6307 via PR #6839 (by @effigies)

  • Use --pathspec-from-file where possible instead of passing long lists of paths to git/git-annex calls. Fixes #6922 via PR #6932 (by @yarikoptic)

  • Make clone_dataset() better patchable by extensions and less monolithic. PR #7017 (by @mih)

  • Remove simplejson in favor of using json. Fixes #7034 via PR #7035 (by @christian-monch)

  • Fix an error in the command group names-test. PR #7044 (by @christian-monch)

  • Move eval_results() into interface.base to simplify imports for command implementations. Deprecate use from interface.utils accordingly. Fixes #6694 via PR #7170 (by @adswa)
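
The --pathspec-from-file item above addresses a general problem: very long path lists can exceed the operating system's command-line length limit. A minimal sketch of the technique, assuming a git version that supports --pathspec-from-file and --pathspec-file-nul for the subcommand in question, passes the paths on standard input instead. The helper names build_pathspec_call and git_add_many are hypothetical, and this is not the code DataLad uses.

```python
import subprocess


def build_pathspec_call(git_args, paths):
    """Return (argv, stdin_bytes) for a git call that reads its paths from stdin.

    Hypothetical helper for illustration; not part of DataLad.
    """
    argv = ["git", *git_args, "--pathspec-from-file=-", "--pathspec-file-nul"]
    # With --pathspec-file-nul the paths are NUL-separated and taken
    # literally, so spaces, quotes or newlines in names need no escaping.
    payload = b"\0".join(p.encode("utf-8") for p in paths)
    return argv, payload


def git_add_many(repo_dir, paths):
    """Stage many paths without placing them on the command line."""
    argv, payload = build_pathspec_call(["add"], paths)
    subprocess.run(argv, input=payload, cwd=repo_dir, check=True)


# Only build and show the call here; nothing is executed against a repository.
argv, payload = build_pathspec_call(["add"], ["a.txt", "dir/b c.txt"])
print(argv)
print(payload)
```

The command line stays the same length no matter how many paths are passed, which is the point of the approach.
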

🏎 Performance
  • Use regular dicts instead of OrderedDicts for speedier operations. Fixes #6566 via PR #7174 (by @adswa)

  • Reimplement get_submodules_() without get_content_info() for substantial performance boosts, especially for large datasets with few subdatasets. Originally proposed in PR #6942 by @mih, fixing #6940. PR #7189 (by @adswa). Complemented with PR #7220 (by @yarikoptic) to avoid O(N^2) (instead of O(N*log(N))) performance in some cases.

  • Use --include=* or --anything instead of --copies 0 to speed up get_content_annexinfo. PR #7230 (by @yarikoptic)
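
As background for the OrderedDict item above: regular dicts have guaranteed insertion order since Python 3.7, which is what makes such a swap possible in the first place. The following sketch shows that property, one behavioural difference to watch for when replacing one with the other, and a rough way to compare construction cost. The data and repetition counts are arbitrary choices for this illustration. It does not reproduce any measurement from the DataLad project.

```python
from collections import OrderedDict
from timeit import timeit

pairs = [(f"key{i}", i) for i in range(1000)]

plain = dict(pairs)
ordered = OrderedDict(pairs)

# Both containers iterate in insertion order.
assert list(plain) == list(ordered)

# Difference to keep in mind: comparing two OrderedDicts is
# order-sensitive, comparing two plain dicts is not.
assert dict(a=1, b=2) == dict(b=2, a=1)
assert OrderedDict(a=1, b=2) != OrderedDict(b=2, a=1)

# Rough construction timing; absolute numbers depend on the machine
# and the Python version, so none are quoted here.
t_plain = timeit(lambda: dict(pairs), number=200)
t_ordered = timeit(lambda: OrderedDict(pairs), number=200)
print(f"dict: {t_plain:.4f}s  OrderedDict: {t_ordered:.4f}s")
```
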

🧪 Tests
  • Reenable two now-passing core tests on Windows CI. PR #7152 (by @adswa)

  • Remove the with_testrepos decorator and associated tests for it. Fixes #6752 via PR #7176 (by @adswa)

Breaking Changes
  • Move all old-style metadata commands (aggregate_metadata, search, metadata and extract-metadata), as well as the cfg_metadatatypes procedure and the old metadata extractors, into the datalad-deprecated extension. The now recommended way of handling metadata is to install the datalad-metalad extension instead. Fixes #7012 via PR #7014
Internal Enhancements and New Features
  • A repository description can be specified with a new --description option when creating siblings using create-sibling-[gin|gitea|github|gogs]. Fixes #6816 via PR #7109 (by @mslw)

  • Make validation failure of alternative constraints more informative. Fixes #7092 via PR #7132 (by @bpoldrack)

Files

datalad/datalad-0.18.0.zip

Files (1.7 MB)

md5:956a9c1d0cd8e802fb962fd2a7f7cc82
Size: 1.7 MB

Additional details

Funding

CRCNS US-German Data Sharing: DataGit - converging catalogues, warehouses, and deployment logistics into a federated 'data distribution' 1429999
U.S. National Science Foundation