Labeled high-resolution orthoimagery time-series of an alluvial river corridor; Elwha River, Washington, USA.
Description
Daniel Buscombe, Marda Science LLC
There are two datasets in this data release:
1. Model training dataset. A manually (or semi-manually) labeled image dataset that was used to train and evaluate a machine (deep) learning model designed to identify subaerial accumulations of large wood, alluvial sediment, water, and vegetation in orthoimagery of alluvial river corridors in forested catchments.
2. Model output dataset. A labeled image dataset that uses the aforementioned model to estimate subaerial accumulations of large wood, alluvial sediment, water, and vegetation in a larger orthoimagery dataset of alluvial river corridors in forested catchments.
All of these label data are derived from raw gridded data that originate from the U.S. Geological Survey (Ritchie et al., 2018). That dataset consists of 14 orthoimages of the Middle Reach (MR, between the former Aldwell and Mills reservoirs) and 14 corresponding orthoimages of the Lower Reach (LR, downstream of the former Mills reservoir) of the Elwha River, Washington, collected between 2012-04-07 and 2017-09-22. The orthoimagery was generated using structure-from-motion (SfM) photogrammetry (following Over et al., 2021) from photographs taken with a camera mounted to an aircraft wing. The imagery captures channel change as it evolved under a ~20 Mt sediment pulse initiated by the removal of the two dams. The MR is ~8 km long, and the lower-gradient LR is ~7 km long.
The orthoimagery has been labeled pixelwise, either manually or by an automated process, according to the following classes (integer class code in the label data in parentheses):
1. vegetation / other (0)
2. water (1)
3. sediment (2)
4. large wood (3)
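As a minimal sketch of how the integer codes above might be decoded, the following assumes a label grid has already been read into a nested list of integers. The names `CLASS_NAMES` and `class_fractions` are illustrative and are not part of the data release, and the toy grid is made up.

```python
from collections import Counter

# Integer codes as listed above; the Python names here are illustrative only.
CLASS_NAMES = {0: "vegetation / other", 1: "water", 2: "sediment", 3: "large wood"}


def class_fractions(label_grid):
    """Return the fraction of pixels per class name for a 2-D grid of integer codes."""
    counts = Counter(code for row in label_grid for code in row)
    total = sum(counts.values())
    return {name: counts.get(code, 0) / total for code, name in CLASS_NAMES.items()}


# Tiny made-up 2 x 4 grid, for illustration only (not real data).
toy = [[0, 0, 1, 1],
       [2, 3, 1, 0]]
fractions = class_fractions(toy)
```

For the toy grid, three of the eight pixels carry code 1, so the water fraction is 0.375.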
1. Model training dataset.
Imagery was labeled using a combination of the open-source software Doodler (Buscombe et al., 2021; https://github.com/Doodleverse/dash_doodler) and hand-digitization in QGIS at 1:300 scale. The hand-digitized polygons were rasterized, then gridded and clipped in the same way as all other gridded data. Doodler facilitates relatively labor-free dense multiclass labeling of natural imagery, enabling relatively rapid training dataset creation. The final training dataset consists of 4382 images and corresponding labels, each 1024 x 1024 pixels, representing just over 5% of the total dataset. The training data are sampled approximately equally in time and in space across both reaches. All training and validation samples purposefully included all four label classes, to avoid model training and evaluation problems associated with class imbalance (Buscombe and Goldstein, 2022).
Data are provided in GeoTIFF format. The imagery and label grids are reprojected to be co-located in the NAD83(2011) / UTM zone 10N projection, and consist of 0.125 x 0.125 m pixels.
Pixel-wise label measurements such as these facilitate development and evaluation of image segmentation, image classification, object-based image-analysis (OBIA), and object-in-image detection models, as well as numerous other potential machine learning models for the general purposes of river corridor classification, description, enumeration, inventory, and process or state quantification. For example, this dataset may serve in transfer learning contexts for application in different river or coastal environments, or for different tasks or class ontologies.
Files:
1. Labels_used_for_model_training_Buscombe_Labeled_high_resolution_orthoimagery_time_series_of_an_alluvial_river_corridor_Elwha_River_Washington_USA.zip, 63 MB, label tiffs
2. Model_training_ images1of4.zip, 1.5 GB, imagery tiffs
3. Model_training_ images2of4.zip, 1.5 GB, imagery tiffs
4. Model_training_ images3of4.zip, 1.7 GB, imagery tiffs
5. Model_training_ images4of4.zip, 1.6 GB, imagery tiffs
2. Model output dataset.
Imagery was labeled using a deep-learning-based semantic segmentation model (Buscombe, 2023) trained specifically for the task using the Segmentation Gym (Buscombe and Goldstein, 2022) modeling suite. Segmentation Gym was used to fine-tune a SegFormer (Xie et al., 2021) deep learning model for semantic image segmentation. The starting point was the instance (i.e. model architecture and trained weights) of the model of Xie et al. (2021), itself fine-tuned on the ADE20k dataset (Zhou et al., 2019) at a resolution of 512x512 pixels, which was then fine-tuned on our 1024x1024 pixel training data consisting of 4-class label images.
The spatial extent of the imagery in the MR is [455157.2494695878122002,5316532.9804129302501678 : 457076.1244695878122002,5323771.7304129302501678] (NAD83(2011) / UTM zone 10N). Imagery width is 15351 pixels and imagery height is 57910 pixels. The spatial extent of the imagery in the LR is [457704.9227139975992031,5326631.3750646486878395 : 459241.6727139975992031,5333311.0000646486878395] (NAD83(2011) / UTM zone 10N). Imagery width is 12294 pixels and imagery height is 53437 pixels. Data are provided in Cloud-Optimized GeoTIFF (COG) format. The imagery and label grids are reprojected to be co-located in the NAD83(2011) / UTM zone 10N projection, and consist of 0.125 x 0.125 m pixels. All grids have been clipped to the union of extents of active channel margins during the period of interest.
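The stated grid dimensions follow from the extents and pixel size quoted above. The sketch below checks that arithmetic; the names `EXTENTS` and `grid_shape` are illustrative, and the extents are assumed to be ordered as (xmin, ymin, xmax, ymax).

```python
PIXEL_SIZE_M = 0.125  # pixel size stated in the text

# Extents copied from the text, assumed to be (xmin, ymin, xmax, ymax)
# in NAD83(2011) / UTM zone 10N.
EXTENTS = {
    "MR": (455157.2494695878122002, 5316532.9804129302501678,
           457076.1244695878122002, 5323771.7304129302501678),
    "LR": (457704.9227139975992031, 5326631.3750646486878395,
           459241.6727139975992031, 5333311.0000646486878395),
}


def grid_shape(extent, pixel_size=PIXEL_SIZE_M):
    """Return (width, height) in pixels for a bounding box and square pixel size."""
    xmin, ymin, xmax, ymax = extent
    width = round((xmax - xmin) / pixel_size)
    height = round((ymax - ymin) / pixel_size)
    return width, height


mr_shape = grid_shape(EXTENTS["MR"])
lr_shape = grid_shape(EXTENTS["LR"])
```

With these inputs, `mr_shape` is (15351, 57910) and `lr_shape` is (12294, 53437), matching the widths and heights given in the text.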
Reach-wide pixel-wise measurements such as these facilitate comparison of wood and sediment storage at any scale or location. These data may be useful for studying the morphodynamics of wood-sediment interactions in other geomorphically complex channels, wood storage in channels, and the role of wood in ecosystems and in conservation or restoration efforts.
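One way such comparisons might be made is to convert per-class pixel counts into planform areas using the 0.125 x 0.125 m pixel size. The sketch below is only an illustration of that arithmetic; the function names and the pixel counts are hypothetical, not values from the dataset.

```python
PIXEL_AREA_M2 = 0.125 * 0.125  # one 0.125 x 0.125 m pixel covers 0.015625 m^2


def class_area_m2(pixel_count):
    """Planform area in square metres covered by a number of labeled pixels."""
    return pixel_count * PIXEL_AREA_M2


def area_change_m2(count_before, count_after):
    """Change in class area between two survey dates, in square metres."""
    return class_area_m2(count_after) - class_area_m2(count_before)


# Hypothetical large-wood pixel counts on two dates (made-up numbers).
wood_change = area_change_m2(6400, 12800)
```

Because 64 pixels cover exactly 1 m^2, the hypothetical increase from 6400 to 12800 pixels corresponds to a gain of 100 m^2.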
Files:
1. Elwha_MR_labels_Buscombe_Labeled_high_resolution_orthoimagery_time_series_of_an_alluvial_river_corridor_Elwha_River_Washington_USA.zip, 9.67 MB, label COGs from Elwha River Middle Reach (MR)
2. ElwhaMR_ imagery_ part1_ of_ 2.zip, 566 MB, imagery COGs from Elwha River Middle Reach (MR)
3. ElwhaMR_ imagery_ part2_ of_ 2.zip, 618 MB, imagery COGs from Elwha River Middle Reach (MR)
4. Elwha_LR_labels_Buscombe_Labeled_high_resolution_orthoimagery_time_series_of_an_alluvial_river_corridor_Elwha_River_Washington_USA.zip, 10.96 MB, label COGs from Elwha River Lower Reach (LR)
5. ElwhaLR_ imagery_ part1_ of_ 2.zip, 622 MB, imagery COGs from Elwha River Lower Reach (LR)
6. ElwhaLR_ imagery_ part2_ of_ 2.zip, 617 MB, imagery COGs from Elwha River Lower Reach (LR)
This dataset was created using open-source tools of the Doodleverse, a software ecosystem for geoscientific image segmentation, by Daniel Buscombe (https://github.com/dbuscombe-usgs) and Evan Goldstein (https://github.com/ebgoldstein). Thanks to the contributors of the Doodleverse! Thanks especially to Sharon Fitzpatrick (https://github.com/2320sharon) and Jaycee Favela for contributing labels.
References
• Buscombe, D. (2023). Doodleverse/Segmentation Gym SegFormer models for 4-class (other, water, sediment, wood) segmentation of RGB aerial orthomosaic imagery (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8172858
• Buscombe, D., Goldstein, E. B., Sherwood, C. R., Bodine, C., Brown, J. A., Favela, J., et al. (2021). Human-in-the-loop segmentation of Earth surface imagery. Earth and Space Science, 9, e2021EA002085. https://doi.org/10.1029/2021EA002085
• Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym
• Over, J.R., Ritchie, A.C., Kranenburg, C.J., Brown, J.A., Buscombe, D., Noble, T., Sherwood, C.R., Warrick, J.A., and Wernette, P.A., 2021, Processing coastal imagery with Agisoft Metashape Professional Edition, version 1.6—Structure from motion workflow documentation: U.S. Geological Survey Open-File Report 2021–1039, 46 p., https://doi.org/10.3133/ofr20211039.
• Ritchie, A.C., Curran, C.A., Magirl, C.S., Bountry, J.A., Hilldale, R.C., Randle, T.J., and Duda, J.J., 2018, Data in support of 5-year sediment budget and morphodynamic analysis of Elwha River following dam removals: U.S. Geological Survey data release, https://doi.org/10.5066/F7PG1QWC.
• Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J.M. and Luo, P., 2021. SegFormer: Simple and efficient design for semantic segmentation with transformers. Advances in Neural Information Processing Systems, 34, pp.12077-12090.
• Zhou, B., Zhao, H., Puig, X., Xiao, T., Fidler, S., Barriuso, A. and Torralba, A., 2019. Semantic understanding of scenes through the ade20k dataset. International Journal of Computer Vision, 127, pp.302-321.
Files (8.8 GB)
Checksum | Size |
---|---|
md5:0e95098907c4d52f8d9c4f679ba2db1e | 621.8 MB |
md5:19c8182d73e62666ab5f75dee6998f77 | 616.6 MB |
md5:afb2a18f5f45f6ee2d6e768c39979202 | 11.0 MB |
md5:97775604389757683eff8683c4ac1c89 | 565.4 MB |
md5:f16801c5012e5a5a9e8df3720c47c77a | 617.6 MB |
md5:d714bc189e83847dcfcbf4216df3cfcc | 9.7 MB |
md5:e3fb69cc0da04c876fefff7584263490 | 63.0 MB |
md5:bc06e31bd6aadcf3e1598d94fcd8ae92 | 1.5 GB |
md5:4024b873ba3966f0591a0bc87a366d34 | 1.5 GB |
md5:03b94917faa5a57840934d519b6bf06d | 1.7 GB |
md5:69e3e3bdb2c3104f11b1d49a4fc87aea | 1.6 GB |
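The md5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are illustrative, and the local file path passed to them is whatever path the archive was saved to.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hexadecimal MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:0e95...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low, which matters for the multi-gigabyte imagery archives. A call such as `matches_checksum(local_path, "md5:0e95098907c4d52f8d9c4f679ba2db1e")` returns `True` only if the file at `local_path` has that digest.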
Additional details
Related works
- Is required by
- Dataset: https://zenodo.org/records/8172858 (Other)
- Requires
- Dataset: https://zenodo.org/records/8172858 (Other)
Dates
- Available
- 2023-11-18 (v1.0)