Handling DTS data is an iterative process, often requiring some adjustments to decisions like mapping before the continuous, automated processing can occur. To accommodate this workflow, pyfocs includes several functions that can be used to give a quick view of the data.
In this notebook we look at how to make some of these decisions, such as how many reference sections to use for calibration versus validation and how to map and align sections, and at how to spot potential issues in the data quality.
import xarray as xr
import pyfocs
import os
dir_example = os.path.join('../tests/data/')
# Grab a configuration file for the twisted pair pvc fiber and for the stainless steel fiber
config_names = [
'example_configuration_steelfiber.yml',
'example_twistedpair_bothwls.yml',
'example_twistedpair_p1wls.yml',
'example_twistedpair_p2wls.yml',
]
cfg_fname = os.path.join(dir_example, config_names[0])
cfg_ss, lib_ss = pyfocs.check.config(cfg_fname, ignore_flags=True)
cfg_fname = os.path.join(dir_example, config_names[1])
cfg_both, lib_both = pyfocs.check.config(cfg_fname, ignore_flags=True)
cfg_fname = os.path.join(dir_example, config_names[2])
cfg_p1, lib_p1 = pyfocs.check.config(cfg_fname, ignore_flags=True)
cfg_fname = os.path.join(dir_example, config_names[3])
cfg_p2, lib_p2 = pyfocs.check.config(cfg_fname, ignore_flags=True)
The dataset has new coordinates, calibration and unheated, that name segments of the fiber along LAF and allow us to select pieces of the data.

The data variables are cal_temp, warmProbe, and coldProbe. cal_temp is the calibrated temperature (as you might expect). warmProbe and coldProbe are the external reference PT100s that were named in the calibration section of the configuration file. The external data field has been merged into the Dataset.

The attributes specify properties of the fiber, such as the method selected and whether the data are double-ended. The calibration and unheated attributes correspond to the new coordinates and let us know which sections are named in the data.

Each fiber was calibrated individually with no reference to the others, since they all have slightly different differential attenuations. We also have an example of calibrating both twisted pairs simultaneously, leading to 4 types of calibrated files.
| Name | Validation Baths | Notes |
|---|---|---|
| both-wls | Cold far | Single-ended wls for both twisted pairs simultaneously |
| p1-wls | Cold far | Single-ended wls for only p1 |
| p2-wls | Cold far | Single-ended wls for only p2 |
| ss-wls | Cold far | Single-ended wls for only ss |
ds_both = xr.open_dataset(os.path.join(dir_example, 'multifiledemo', 'calibrated', 'multifiledemo_cal_channel 1_20190722-0000_both-wls.nc'))
ds_p1 = xr.open_dataset(os.path.join(dir_example, 'multifiledemo', 'calibrated', 'multifiledemo_cal_channel 1_20190722-0000_p1-wls.nc'))
ds_p2 = xr.open_dataset(os.path.join(dir_example, 'multifiledemo', 'calibrated', 'multifiledemo_cal_channel 1_20190722-0000_p2-wls.nc'))
ds_ss = xr.open_dataset(os.path.join(dir_example, 'multifiledemo', 'calibrated', 'multifiledemo_cal_channel 1_20190722-0000_ss-wls.nc'))
print(ds_both)
f = pyfocs.bias_violin(
ds_both,
cfg_both['calibration']['library'],
title='Both twisted pairs',
fig_kwargs={'figsize': (9, 4)},
plot_lims=[-1.5, 1.5])
f = pyfocs.bias_violin(
ds_p1,
cfg_p1['calibration']['library'],
title='p1 wls',
fig_kwargs={'figsize': (9, 4)},
plot_lims=[-1.5, 1.5])
f = pyfocs.bias_violin(
ds_p2,
cfg_p2['calibration']['library'],
title='p2 wls',
fig_kwargs={'figsize': (9, 4)},
plot_lims=[-1.5, 1.5])
f = pyfocs.bias_violin(
ds_ss,
cfg_ss['calibration']['library'],
title='ss wls',
fig_kwargs={'figsize': (9, 4)},
plot_lims=[-1.5, 1.5])
The "Both twisted pairs" plot shows why we want to calibrate each fiber separately. The overall biases are larger, with the biases at the end of the fiber being the largest (pair 2's far baths). The spread of biases within a given reference section is also larger than when each fiber is calibrated separately.
The stainless steel fiber is mostly located after 2 km of fiber, so the noise, and thus the uncertainty in the baths, is larger.
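As a rough illustration of what the violin plots summarize, the sketch below computes the mean bias and its spread for one reference section. It assumes that the bias is the calibrated fiber temperature minus the reference probe temperature; this definition, the function name `bias_summary`, and all numbers are illustrative assumptions and are not taken from pyfocs.

```python
from statistics import mean, stdev


def bias_summary(fiber_temps, reference_temp):
    """Return (mean bias, spread) for one reference section.

    Assumes bias = calibrated fiber temperature - reference probe temperature.
    """
    biases = [t - reference_temp for t in fiber_temps]
    return mean(biases), stdev(biases)


# Hypothetical temperatures in degrees Celsius, not taken from the demo data.
near_bath = [20.02, 19.97, 20.05, 19.99]
far_bath = [20.30, 19.75, 20.42, 19.60]

near_mean, near_spread = bias_summary(near_bath, reference_temp=20.0)
far_mean, far_spread = bias_summary(far_bath, reference_temp=20.0)

print(f"near bath: mean bias {near_mean:+.4f} K, spread {near_spread:.4f} K")
print(f"far bath:  mean bias {far_mean:+.4f} K, spread {far_spread:.4f} K")
```

A noisier section, such as a bath far along the fiber, shows up as a larger spread even when its mean bias stays small.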
f = pyfocs.bath_validation(
ds_both,
cfg_both['calibration']['library'],
title='Both twisted pairs',
fig_kwargs={'figsize': (9, 8)},
temp_field_name='cal_temp')
f = pyfocs.bath_validation(
ds_p1,
cfg_p1['calibration']['library'],
title='Twisted pair: p1',
fig_kwargs={'figsize': (9, 8)},
temp_field_name='cal_temp')
f = pyfocs.bath_validation(
ds_p2,
cfg_p2['calibration']['library'],
title='Twisted pair: p2',
fig_kwargs={'figsize': (9, 8)},
temp_field_name='cal_temp')
Bath validation plots are useful for verifying that a calibration is well-behaved within the reference sections, because they break the biases out into space and time components. These plots are largely similar to the plots first implemented within dtscalibration.
The "Twisted pair: p2" plot suggests that, within this time slice, the coldfar_p2 and warmfar_p2 baths behave poorly due to an edge effect.
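The idea of splitting biases into space and time components can be sketched with plain Python. The values below are hypothetical and the variable names are illustrative; this is not how pyfocs or dtscalibration compute their plots, only a minimal picture of the two averages involved.

```python
from statistics import mean

# Hypothetical biases (K) for one bath: rows are time steps, columns are LAF points.
bias = [
    [0.10, 0.02, -0.01, 0.12],
    [0.08, 0.00, 0.01, 0.10],
    [0.12, 0.01, 0.00, 0.14],
]

# Averaging over time leaves one value per LAF point (the spatial component).
bias_along_fiber = [mean(column) for column in zip(*bias)]

# Averaging over space leaves one value per time step (the temporal component).
bias_over_time = [mean(row) for row in bias]

print("along fiber:", [round(b, 3) for b in bias_along_fiber])
print("over time:  ", [round(b, 3) for b in bias_over_time])
```

In this made-up example the first and last LAF points carry a larger time-averaged bias than the interior points, which is the kind of pattern an edge effect would produce.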
For heated/unheated fibers it is necessary to line them up in space such that they represent the same physical locations. pyfocs includes a tool that automatically aligns these sections and interpolates them to a common index that replaces LAF. This enables each section to:
1) Have an equal number of points, which is necessary for the calibration option of temperature matching. Even though we aren't using it here, it could be applied to the twisted-pair fiber as a calibration option.
2) Have points that line up in space, allowing us to easily calculate other quantities such as wind speed, wind direction, gradients, etc.
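To make the two points above concrete, here is a minimal sketch of interpolating two sections onto a common index using simple linear interpolation. The helper `linear_interp`, the relative-distance values, and the temperatures are all hypothetical; the interpolation scheme pyfocs actually uses may differ.

```python
from bisect import bisect_right


def linear_interp(x_new, x, y):
    """Linearly interpolate y(x) at each point of x_new.

    x must be strictly increasing and every x_new must lie in [x[0], x[-1]].
    """
    out = []
    for xn in x_new:
        if not x[0] <= xn <= x[-1]:
            raise ValueError(f"{xn} is outside the section")
        # Index of the left neighbor, clamped so that i + 1 stays valid.
        i = min(bisect_right(x, xn) - 1, len(x) - 2)
        weight = (xn - x[i]) / (x[i + 1] - x[i])
        out.append(y[i] + weight * (y[i + 1] - y[i]))
    return out


# Hypothetical relative distances (m) and temperatures (degrees Celsius).
x_unheated = [0.0, 0.25, 0.5, 0.75, 1.0]
t_unheated = [5.0, 5.5, 6.0, 6.5, 7.0]
x_heated = [0.1, 0.35, 0.6, 0.85, 1.1]
t_heated = [10.2, 10.7, 11.2, 11.7, 12.2]

# A common index that lies inside both sections.
x_common = [0.25, 0.5, 0.75]
unheated_common = linear_interp(x_common, x_unheated, t_unheated)
heated_common = linear_interp(x_common, x_heated, t_heated)

# Both sections now have the same number of points at the same locations.
differences = [h - u for h, u in zip(heated_common, unheated_common)]
print(differences)
```

Once both sections share `x_common`, point-by-point quantities such as the heated minus unheated temperature difference can be computed directly.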
Aligning the section labels in these can be difficult, especially when needing to pair locations that have different names due to being different location types. To make it easier, the calibration library can be written in a stacked form. Here we use a stacked location library option, found in the dataProperties section:
# Default method is to list each location with a unique name. If we "stack"
# names then different location types can share section names.
phys_locs_labeling: stacked
The location library then needs to be re-written using the below format.
library:
  location_type_A:
    non_unique_name:
      LAF:
        - val1
        - val2
  location_type_B:
    non_unique_name:
      LAF:
        - val3
        - val4
This is in contrast to the (default) unstacked version that looks like the below:
location_library:
  unique_name_A:
    long name: Outer Rim NE_1, Unheated
    loc_type: location_type_A
    LAF:
      - val1
      - val2
  .
  .
  .
  unique_name_B:
    long name: Outer Rim NE_1, Heated
    loc_type: location_type_B
    LAF:
      - val3
      - val4
Here we can verify that the stacked location types we provided in the example all share the same section names.
print(ds_ss.attrs['unheated'].split(';'))
print(ds_ss.attrs['heated'].split(';'))
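Rather than comparing the two printed lists by eye, the check can be done with sets. The sketch below uses hypothetical attribute strings that mimic the `;`-separated format split above; the helper `section_names` is illustrative and not part of pyfocs.

```python
def section_names(attr_value):
    """Split a ';'-separated attribute string into a set of section names."""
    return {name for name in attr_value.split(';') if name}


# Hypothetical stand-ins for ds_ss.attrs['unheated'] and ds_ss.attrs['heated'].
attrs = {
    'unheated': 'OR_NE1;OR_NE2;OR_NW;OR_SW2;OR_SW1;OR_SE',
    'heated': 'OR_SE;OR_SW1;OR_SW2;OR_NW;OR_NE2;OR_NE1',
}

unheated = section_names(attrs['unheated'])
heated = section_names(attrs['heated'])

# Names that appear in only one of the two location types.
mismatch = unheated ^ heated
if mismatch:
    print('Sections missing a partner:', sorted(mismatch))
else:
    print('All section names are shared.')
```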
The automatic alignment relies on a maximum cross-correlation. This method therefore has two major requirements:
1) We cannot use a time period when the fiber is heated. The factors that drive the heated fiber's behavior (namely wind) do not impact the unheated fiber, so the cross-correlation is nonsense.
2) We must use a longer time period in order to minimize the effect of instrument noise on defining a best shift for alignment.
To accommodate both of these requirements, we provide the unheated demo data, which is the time average of an hour of data from a period with no heating.
no_heat = xr.open_dataset(os.path.join(dir_example, 'unheateddemo', 'calibrated', 'unheateddemo_timeavg.nc'))
no_heat.cal_temp.plot()
print(no_heat)
No heating is applied to the stainless steel fiber in this period.
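The maximum cross-correlation idea behind the automatic alignment can be sketched as follows: try a range of integer lags, correlate the overlapping parts of the two sections, and keep the lag with the highest correlation. The functions `correlation` and `best_lag` and the temperature profiles are hypothetical; this is a sketch of the general technique, not the pyfocs implementation.

```python
from statistics import mean


def correlation(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mu, mv = mean(u), mean(v)
    num = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    den = (sum((x - mu) ** 2 for x in u) * sum((y - mv) ** 2 for y in v)) ** 0.5
    return num / den if den > 0 else 0.0


def best_lag(a, b, max_lag):
    """Return (lag, r) where pairing a[i] with b[i + lag] maximizes the correlation."""
    best_r, best = float('-inf'), 0
    for lag in range(-max_lag, max_lag + 1):
        # Pair a[i] with b[i + lag] wherever both indices are valid.
        pairs = [(a[i], b[i + lag]) for i in range(len(a)) if 0 <= i + lag < len(b)]
        if len(pairs) < 3:
            continue
        u, v = zip(*pairs)
        r = correlation(u, v)
        if r > best_r:
            best_r, best = r, lag
    return best, best_r


# Hypothetical time-averaged temperature anomalies along two sections.
section_a = [0, 0, 1, 3, 6, 3, 1, 0, 0, 0, 0, 0]
# The same feature, displaced by two indices along the fiber.
section_b = [0, 0, 0, 0, 1, 3, 6, 3, 1, 0, 0, 0]

lag, r = best_lag(section_a, section_b, max_lag=4)
print(f"best lag: {lag} indices, correlation {r:.3f}")
```

With noisy or heated data the correlation peak becomes flat or spurious, which is why a long, unheated averaging period is used to pick the shift.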
In this example we will map the heated location type onto the unheated location type. We first verify that these two location types can be aligned with each other. This step isn't strictly necessary, since section_shift_x is called by the interpolation scheme anyway, but it is illustrative.
# Physical location type 1
ploc1 = 'unheated'
# Physical location type 2
ploc2 = 'heated'
# The section label
label = 'OR_NE2'
# How far on either side of a section to look when doing the cross-correlation.
dl = 10
# What is the temperature field called?
temp_field = 'cal_temp'
# How many indices to look on either side of zero lag
lag = 50
# s1 and s2 have been aligned to a new arbitrary dimension called 'x'
# using a shift given by `shift`
s1, s2, shift, _, _ = pyfocs.align.section_shift_x(
no_heat,
lib_ss,
ploc1,
ploc2,
label,
dl=dl,
temp_field=temp_field,
plot_results=True)
print('Unheated location')
print(s1)
print('')
print('Heated location')
print(s2)
If the section limits were far off from each other we would have a much harder time aligning them. Since they represent mostly the same underlying data we can make them nicely line up on top of each other. x is a new quasi-LAF index that both datasets have been aligned to. However, note that they do not share exact values in this view but rather the sections are labeled by a similar relative distance variable.
In order to have the two datasets use the exact same spatial index we must interpolate them to a common value. To do this in pyfocs we use the location_matching section of the configuration file. Here is the relevant section from the stainless steel example:
# Default method is to list each location with a unique name. If we "stack"
# names then different location types can share section names.
phys_locs_labeling: stacked
# Section matching.
location_matching:
  # Means that the location listed is mapped onto the location in `map_to`
  # My intuition is that it is better to map other location types onto the
  # unheated location or the location with the smallest LAFs (?)
  heated:
    map_to: unheated
    # These are the fixed distances that `location` is shifted when
    # interpolating to `map_to` in pyfocs. These need to be derived by
    # running the `interp_section` outside of PyFOX.py on a longer period
    # of data. Given in meters.
    fixed_shift:
      OR_NE1: -1.25
      OR_NE2: -2.05
      OR_NW: -1.55
      OR_SW2: -1.3
      OR_SW1: -1.8
      OR_SE: -1.85
A couple of requirements:
1) The location library must be stacked, as explained above.
2) A fixed_shift should be provided. If no fixed_shift is provided, the interpolation will use whichever shift yields the best cross-correlation. Periods with noisy data, or with paired heated and unheated fibers, will yield a poor selection of the shift parameter, creating nonsense coordinates.
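Both requirements can be checked before running the interpolation. The sketch below works on plain dictionaries shaped like the stacked library and the location_matching section shown above; the function `missing_fixed_shifts`, the LAF values, and the deliberately incomplete `fixed_shift` are hypothetical and not part of pyfocs.

```python
def missing_fixed_shifts(library, location_matching):
    """List (location_type, section) pairs that have no fixed_shift entry."""
    missing = []
    for loc_type, settings in location_matching.items():
        shifts = settings.get('fixed_shift') or {}
        for section in library.get(loc_type, {}):
            if section not in shifts:
                missing.append((loc_type, section))
    return missing


# Hypothetical stacked library; the LAF limits are made up for illustration.
library = {
    'unheated': {
        'OR_NE1': {'LAF': [100.0, 150.0]},
        'OR_NE2': {'LAF': [160.0, 210.0]},
        'OR_NW': {'LAF': [220.0, 270.0]},
    },
    'heated': {
        'OR_NE1': {'LAF': [1100.0, 1150.0]},
        'OR_NE2': {'LAF': [1160.0, 1210.0]},
        'OR_NW': {'LAF': [1220.0, 1270.0]},
    },
}

# Mirrors the YAML above, with OR_NW left out on purpose.
location_matching = {
    'heated': {
        'map_to': 'unheated',
        'fixed_shift': {'OR_NE1': -1.25, 'OR_NE2': -2.05},
    },
}

missing = missing_fixed_shifts(library, location_matching)
print(missing)
```

Any pair reported here would fall back to the best cross-correlation shift for the period being processed, which is the situation the second requirement warns against.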
To derive the fixed_shift we must first run the automatic alignment scheme outside of pyfocs. This method will return an adjusted library with the new LAF limits that can be used for aligning the two location types.
s1, s2, adj_lib = pyfocs.interp_section(
no_heat,
lib_ss,
ploc1,
ploc2,
label,
dl=10,
plot_results=True,
fig_kwargs={'figsize': (10, 10)})
The best shift given in the plot is the value that should be entered into the fixed_shift section of the location_matching. We choose to not return this value in order to force the user to verify that the interpolation functioned correctly as this step can and does go wrong. Finally, the adj_lib returned by the function should be used instead of the previous location library as it automatically found the closest LAF values to the interpolated section.
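The last point, finding the closest LAF values to the interpolated section, amounts to snapping target distances onto the instrument's sampling grid. A minimal sketch of that idea follows; the helper `nearest`, the 0.25 m sampling, and the limits are hypothetical, and how pyfocs builds adj_lib internally may differ.

```python
def nearest(values, target):
    """Return the element of values that is closest to target."""
    return min(values, key=lambda v: abs(v - target))


# Hypothetical LAF grid with 0.25 m sampling from 0 m to 10 m.
laf = [i * 0.25 for i in range(41)]

# Hypothetical section limits after alignment, which fall between grid points.
limits = [2.07, 7.88]

# Snap each limit to the closest LAF value that actually exists in the data.
snapped = [nearest(laf, limit) for limit in limits]
print(snapped)
```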