Python DTS Calibration
Release 0.6.3
1 Overview
1.1 Installation
1.2 Learn by examples
1.3 Documentation
2 Installation
3 Usage
4 Learn by Examples
4.1 1. Load your first measurement files
4.2 2. Common DataStore functions
4.3 3. Define calibration sections
4.4 4. Calculate variance of Stokes and anti-Stokes measurements
4.5 5. Calibration of single ended measurement with OLS
4.6 6. Calibration of double ended measurement with OLS
4.7 7. Calibration of single ended measurement with WLS and confidence intervals
4.8 8. Calibration of double ended measurement with WLS and confidence intervals
4.9 9. Import a time series
4.10 10. Align double ended measurements
5 Reference
5.1 dtscalibration
6 Contributing
6.1 Bug reports
6.2 Documentation improvements
6.3 Feature requests and feedback
6.4 Development
7 Authors
8 Changelog
8.1 0.6.3 (2019-04-03)
8.2 0.6.2 (2019-02-26)
8.3 0.6.1 (2019-01-04)
8.4 0.6.0 (2018-12-08)
8.5 0.5.3 (2018-10-26)
8.6 0.5.2 (2018-10-26)
8.7 0.5.1 (2018-10-19)
8.8 0.4.0 (2018-09-06)
8.9 0.2.0 (2018-08-16)
8.10 0.1.0 (2018-08-01)
CHAPTER 1
Overview
A Python package to load raw DTS files, perform a calibration, and plot the result
• Free software: BSD 3-Clause License
1.1 Installation
1.2 Learn by examples
Interactively run the example notebooks online by clicking the launch-binder button.
1.3 Documentation
https://python-dts-calibration.readthedocs.io/
CHAPTER 2
Installation
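The package is distributed on PyPI, so a minimal installation is a standard pip invocation (assuming pip and network access are available):

```shell
# Install the latest release of dtscalibration from PyPI
pip install dtscalibration
```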
CHAPTER 3
Usage
import dtscalibration
CHAPTER 4
Learn by Examples
import os
import glob

from dtscalibration import read_silixa_files

filepath = os.path.join('..', '..', 'tests', 'data', 'double_ended2')
print(filepath)
../../tests/data/double_ended2
filenamelist = sorted(glob.glob(os.path.join(filepath, '*.xml')))
for fn in filenamelist:
    print(fn)
channel 1_20180328014052498.xml
channel 1_20180328014057119.xml
channel 1_20180328014101652.xml
channel 1_20180328014106243.xml
channel 1_20180328014110917.xml
channel 1_20180328014115480.xml
Define the timezone in which the measurements were taken. In this case the timezone of the Silixa Ultima computer was set to 'Europe/Amsterdam'. The default timezone of netCDF files is UTC; all steps after loading the raw files are performed in that timezone. Any timezone name from the tz database is supported. We also explicitly define the file extension (.xml) because the folder is polluted with files other than measurement files.
ds = read_silixa_files(directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')
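The timezone conversion performed by the loader can be illustrated with the standard library alone (a sketch independent of dtscalibration; requires Python 3.9+ for zoneinfo):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# A timestamp as written by the Ultima, whose clock was set to 'Europe/Amsterdam'
local = datetime(2018, 3, 28, 1, 40, 52, tzinfo=ZoneInfo('Europe/Amsterdam'))

# The DataStore stores times in the netCDF default timezone, UTC
utc = local.astimezone(ZoneInfo('UTC'))
print(utc.isoformat())  # 2018-03-27T23:40:52+00:00 (Amsterdam is UTC+2 in DST)
```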
The object tries to gather as much metadata from the measurement files as possible (temporal and spatial coordinates, filenames, temperature probe measurements). All other configuration settings are loaded from the first file and stored as attributes of the DataStore.
print(ds)
<dtscalibration.DataStore>
Sections: ()
Dimensions: (time: 6, x: 1693)
Coordinates:
* x (x) float64 -80.5 -80.38 -80.25 ... 134.3 134.4 134.5
filename (time) <U31 'channel 1_20180328014052498.xml' ... 'channel 1_20180328014115480.xml'
import os
First we load the raw measurements into a DataStore object, as we learned from the previous notebook.
ds = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')
The implemented read routines try to read as much data from the raw DTS files as possible. Usually they would have coordinates (time and space) and Stokes and anti-Stokes measurements. We can access the data by key; it is presented as a DataArray. More examples are found at http://xarray.pydata.org/en/stable/indexing.html
ds['TMP'].plot(figsize=(12, 8));
The first argument is the dimension; the function is applied along that dimension. dim can be any dimension (e.g., time, x). The returned DataStore does not contain that dimension anymore.
Normally you would like to keep the attributes (the informative texts from the loaded files), so set keep_attrs to True. They don't take any space compared to your Stokes data, so keep them.
Note that the sections are also stored as an attribute. If you delete the attributes, you would have to redefine the sections.
4.2.3 Selecting
What if you would like to get the maximum temperature between 𝑥 >= 20 m and 𝑥 < 35 m over time? We first have to select a section along the cable.
section_of_interest = ds.sel(x=slice(20., 35.))
section_of_interest_max = section_of_interest.max(dim='x')
What if you would like to see what the values at the first timestep are? We can use isel (index select):
section_of_interest = ds.isel(time=0)
We currently have measurements at 3 time steps, with 30.001 seconds in between. For our next exercise we would like to downsample the measurements to 2 time steps with 47 seconds in between. The calculated variances are not valid anymore. We use the function resample_datastore.
So we have measurements every 0.12 m, starting at 𝑥 = 0 m. What if we would like to change our coordinate system to have a value every 12 cm, starting at 𝑥 = 0.05 m? We use (linear) interpolation; extrapolation is not supported. The calculated variances are not valid anymore.
x_old = ds.x.data
x_new = x_old[:-1] + 0.05 # no extrapolation
ds_xinterped = ds.interp(coords={'x': x_new})
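The same regridding can be sketched with plain numpy for a single, hypothetical temperature profile (ds.interp does this for every variable at once):

```python
import numpy as np

x_old = np.arange(0.0, 1.0, 0.12)            # original positions, every 0.12 m
tmp_old = 20.0 + 0.5 * x_old                 # a hypothetical, linear temperature profile

x_new = x_old[:-1] + 0.05                    # shifted grid; stays inside the old range
tmp_new = np.interp(x_new, x_old, tmp_old)   # linear interpolation, no extrapolation

# For a linear profile the interpolation is exact
print(np.allclose(tmp_new, 20.0 + 0.5 * x_new))  # True
```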
import numpy as np
time_old = ds.time.data
time_new = time_old + np.timedelta64(10, 's')
ds_tinterped = ds.interp(coords={'time': time_new})
The goal of this notebook is to show how you can define calibration sections. That means that we assign certain parts of the fiber to a timeseries of temperature measurements. Here, we assume the temperature timeseries is already part of the DataStore object.
import os
First we have a look at which temperature timeseries are available for calibration. Therefore we access ds.data_vars and we find probe1Temperature and probe2Temperature that refer to the temperature measurement timeseries of the two probes attached to the Ultima.
Alternatively, we can access the ds.timeseries_keys property to list all timeseries that can be used for calibration.
print(ds.timeseries_keys)  # list the available timeseries
ds.probe1Temperature.plot(figsize=(12, 8)); # plot one of the timeseries
A calibration is needed to estimate temperature from Stokes and anti-Stokes measurements. There are three unknowns for a single ended calibration procedure: 𝛾, 𝐶, and 𝛼. The parameters 𝛾 and 𝛼 remain constant over time, while 𝐶 may vary.
At least two calibration sections of different temperatures are needed to perform a decent calibration procedure.
This setup has two baths, named ‘cold’ and ‘warm’. Each bath has 2 sections. probe1Temperature is the
temperature timeseries of the cold bath and probe2Temperature is the temperature timeseries of the warm bath.
Name section   Name reference temperature time series   Number of sections   Location of sections (m)
Cold bath      probe1Temperature                        2                    7.5-17.0; 70.0-80.0
Warm bath      probe2Temperature                        2                    24.0-34.0; 85.0-95.0
Sections are defined in a dictionary whose keys are the names of the reference temperature time series. Its values are lists of slice objects, where each slice object is a section.
Note that slice is part of the standard Python library and no import is required.
sections = {
'probe1Temperature': [slice(7.5, 17.), slice(70., 80.)], # cold bath
'probe2Temperature': [slice(24., 34.), slice(85., 95.)], # warm bath
}
ds.sections = sections
ds.sections
NetCDF files do not support reading/writing Python dictionaries. Internally, the sections dictionary is stored in ds._sections as a yaml-encoded string, which can be saved to a netCDF file. Each time the sections dictionary is requested, the string is decoded from yaml into the Python dictionary.
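The round trip can be sketched with PyYAML (assuming it is installed; dtscalibration's internal encoding of slice objects differs, so each slice is written here as a [start, stop] pair for illustration only):

```python
import yaml

# The sections mapping, with each slice represented as a [start, stop] pair
sections = {
    'probe1Temperature': [[7.5, 17.0], [70.0, 80.0]],
    'probe2Temperature': [[24.0, 34.0], [85.0, 95.0]],
}

encoded = yaml.dump(sections)       # a plain string, safe to store in netCDF attributes
decoded = yaml.safe_load(encoded)   # back to a Python dictionary
print(decoded == sections)          # True
```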
The goal of this notebook is to estimate the variance of the noise of the Stokes measurement. The measured Stokes and anti-Stokes signals contain noise that is approximately normally distributed. We need to estimate the variance of the noise to:
- Perform a weighted calibration
- Construct confidence intervals
import os
%matplotlib inline
ds = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')
And we define the sections as we learned from the previous notebook. Sections are required to calculate the variance
in the Stokes.
sections = {
'probe1Temperature': [slice(7.5, 17.), slice(70., 80.)], # cold bath
'probe2Temperature': [slice(24., 34.), slice(85., 95.)], # warm bath
}
ds.sections = sections
print(ds.variance_stokes.__doc__)
Parameters
----------
reshape_residuals
st_label : str
label of the Stokes, anti-Stokes measurement.
E.g., ST, AST, REV-ST, REV-AST
sections : dict, optional
Define sections. See documentation
Returns
-------
I_var : float
Variance of the residuals between measured and best fit
resid : array_like
Residuals between measured and best fit
Notes
-----
Because there are a large number of unknowns, spend time on
calculating an initial estimate. Can be turned off by setting to False.
The variance of the Stokes signal along the reference sections is approximately 8.181920419777416 on a 2.0 sec acquisition time
I_var, residuals = ds.variance_stokes(st_label='ST', sections=sections)

from dtscalibration import plot

fig_handle = plot.plot_residuals_reference_sections(
    residuals,
    sections,
    title='Distribution of the noise in the Stokes signal',
    plot_avg_std=I_var ** 0.5,
    plot_names=True,
    robust=True,
    units='',
    method='single')
The residuals should be normally distributed and independent from previous time steps and other points along the cable. If you observe patterns in the residuals plot (above), it might be caused by:
- The temperature in the calibration bath not being uniform
- Attenuation caused by coils/sharp bends in the cable
- Attenuation caused by a splice
import scipy.stats
import numpy as np
import matplotlib.pyplot as plt

sigma = residuals.std()
mean = residuals.mean()
x = np.linspace(mean - 3 * sigma, mean + 3 * sigma, 100)
approximated_normal_fit = scipy.stats.norm.pdf(x, mean, sigma)
residuals.plot.hist(bins=50, figsize=(12, 8), density=True)
plt.plot(x, approximated_normal_fit);
We can follow the same steps to calculate the variance from the noise in the anti-Stokes measurements by setting st_label='AST' and redoing the steps.
A single ended calibration is performed with ordinary least squares (OLS), over all time steps simultaneously. 𝛾 and 𝛼 remain constant, while 𝐶 varies over time. The weights are considered equal here and no variance or confidence interval is calculated.
Note that the internal reference section cannot be used, since there is a connector between the internal and external fiber and therefore the integrated differential attenuation cannot be considered linear anymore.
import os
%matplotlib inline
ds = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')
print(ds100.calibration_single_ended.__doc__)
Parameters
----------
store_p_cov : str
Key to store the covariance matrix of the calibrated parameters
store_p_val : str
Key to store the values of the calibrated parameters
nt : int, optional
Number of timesteps. Should be defined if method=='external'
z : array-like, optional
Distances. Should be defined if method=='external'
p_val
p_var
p_cov
sections : dict, optional
st_label : str
Label of the forward stokes measurement
ast_label : str
Label of the anti-Stoke measurement
st_var : float, optional
The variance of the measurement noise of the Stokes signals in
the forward
direction Required if method is wls.
ast_var : float, optional
The variance of the measurement noise of the anti-Stokes signals
in the forward
direction. Required if method is wls.
store_c : str
Label of where to store C
store_gamma : str
Label of where to store gamma
store_dalpha : str
Label of where to store dalpha; the spatial derivative of alpha.
store_alpha : str
Label of where to store alpha; The integrated differential
attenuation.
alpha(x=0) = 0
store_tmpf : str
Label of where to store the calibrated temperature of the forward
direction
variance_suffix : str, optional
Returns
-------
ds100.calibration_single_ended(st_label='ST',
ast_label='AST',
method='ols')
ds = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')
print(ds100.calibration_double_ended.__doc__)
Parameters
----------
store_p_cov
store_p_val
nt
z
p_val
p_var
p_cov
sections : dict, optional
st_label : str
Label of the forward stokes measurement
ast_label : str
Label of the anti-Stoke measurement
rst_label : str
Label of the reversed Stoke measurement
rast_label : str
Label of the reversed anti-Stoke measurement
st_var : float, optional
The variance of the measurement noise of the Stokes signals in
the forward
direction Required if method is wls.
Returns
-------
st_label = 'ST'
ast_label = 'AST'
rst_label = 'REV-ST'
rast_label = 'REV-AST'
ds100.calibration_double_ended(sections=sections,
st_label=st_label,
ast_label=ast_label,
rst_label=rst_label,
rast_label=rast_label,
method='ols')
After calibration, two data variables are added to the DataStore object:
- TMPF, temperature calculated along the forward direction
- TMPB, temperature calculated along the backward direction
A better estimate, with a lower expected variance, of the temperature along the fiber is the average of the two. We cannot weigh one more than the other, as we do not have more information about the weighing.
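The variance reduction from averaging two independent, equally uncertain estimates can be sketched with numpy (synthetic noise; not the package's averaging routine):

```python
import numpy as np

rng = np.random.default_rng(42)
true_temp = 15.0
n = 100_000

# Two independent, equally noisy estimates of the same temperature,
# standing in for the forward and backward channels
tmpf = true_temp + rng.normal(0.0, 0.1, n)
tmpb = true_temp + rng.normal(0.0, 0.1, n)

# Equal weights: we have no reason to prefer one direction over the other
tmpw = 0.5 * (tmpf + tmpb)

# The averaged estimate has roughly half the variance of either input
print(tmpw.var() / tmpf.var())  # close to 0.5
```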
plt.legend();
Let's compare our calibrated values with the device calibration. Let's average the temperature of the forward channel and the backward channel first.
ds1_diff.plot(figsize=(12, 8));
The device calibration sections and calibration sections defined by us differ. The device only allows for 2 sections,
one per thermometer. And most likely the 𝛾 is fixed in the device calibration.
A single ended calibration is performed with weighted least squares, over all time steps simultaneously. 𝛾 and 𝛼 remain constant, while 𝐶 varies over time. The weights are not considered equal here; the weights quadratically decrease with the signal strength of the measured Stokes and anti-Stokes signals.
The confidence intervals can be calculated because the weights are correctly defined. The confidence intervals consist of two sources of uncertainty:
1. Measurement noise in the measured Stokes and anti-Stokes signals, expressed in a single variance value.
2. Inherent to least squares procedures / overdetermined systems: the parameters are estimated with limited certainty and all parameters are correlated. This is expressed in the covariance matrix.
Both sources of uncertainty are propagated to an uncertainty in the estimated temperature via Monte Carlo.
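The weighting idea can be sketched with a toy weighted least squares fit in numpy, where each observation is scaled by the inverse of its noise standard deviation (a generic WLS sketch with synthetic data, not the package's solver):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
sigma = 0.1 + 0.05 * x                     # noise grows along the fiber
y = 2.0 + 3.0 * x + rng.normal(0.0, sigma)

# Weighted least squares: scale the rows of the design matrix and the
# observations by 1/std, so noisy points count for less
A = np.column_stack([np.ones_like(x), x])
w = 1.0 / sigma                            # weight**2 = 1 / variance
p, *_ = np.linalg.lstsq(A * w[:, None], y * w, rcond=None)
print(p)  # close to the true parameters [2.0, 3.0]
```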
import os
sections = {
    'probe1Temperature': [slice(7.5, 17.), slice(70., 80.)],  # cold bath
    'probe2Temperature': [slice(24., 34.), slice(85., 95.)],  # warm bath
}
ds.sections = sections
print(ds.calibration_single_ended.__doc__)
Parameters
----------
store_p_cov : str
Key to store the covariance matrix of the calibrated parameters
store_p_val : str
Key to store the values of the calibrated parameters
nt : int, optional
Number of timesteps. Should be defined if method=='external'
z : array-like, optional
Distances. Should be defined if method=='external'
p_val
p_var
p_cov
sections : dict, optional
st_label : str
Label of the forward stokes measurement
ast_label : str
Label of the anti-Stoke measurement
st_var : float, optional
The variance of the measurement noise of the Stokes signals in
the forward
direction Required if method is wls.
ast_var : float, optional
The variance of the measurement noise of the anti-Stokes signals
in the forward
direction. Required if method is wls.
store_c : str
Label of where to store C
store_gamma : str
Label of where to store gamma
store_dalpha : str
Label of where to store dalpha; the spatial derivative of alpha.
store_alpha : str
Returns
-------
st_label = 'ST'
ast_label = 'AST'
First calculate the variance in the measured Stokes and anti-Stokes signals, in the forward and backward direction.
The Stokes and anti-Stokes signals should follow a smooth decaying exponential. This function fits a decaying exponential to each reference section for each time step. The variance of the residuals between the measured Stokes and anti-Stokes signals and the fitted signals is used as an estimate of the variance in measured signals.
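The idea — fit a decaying exponential per reference section and take the variance of the fit residuals as the noise variance — can be sketched in numpy with a synthetic Stokes signal (variance_stokes does this per section and per time step):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(20.0, 30.0, 200)               # one reference section along the fiber
st_clean = 4000.0 * np.exp(-0.01 * x)          # ideal decaying Stokes signal
st = st_clean + rng.normal(0.0, 8.0, x.size)   # add measurement noise, std = 8

# Fit the exponential with a linear fit in log space
coef = np.polyfit(x, np.log(st), 1)
st_fit = np.exp(np.polyval(coef, x))

# The residual variance estimates the noise variance (true value: 8**2 = 64)
resid = st - st_fit
print(resid.var())  # close to 64
```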
Similar to the ols procedure, we make a single function call to calibrate the temperature. If the method is wls and confidence intervals are passed to conf_ints, confidence intervals are calculated. As weights are correctly passed to the least squares procedure, the covariance matrix can be used. This matrix holds the covariances between all the parameters. A large parameter set is generated from this matrix, assuming the parameter space is normally distributed with its mean at the best estimate of the least squares procedure.
The large parameter set is used to calculate a large set of temperatures. By using percentiles or quantiles, the 95% confidence interval of the calibrated temperature, between 2.5% and 97.5%, is calculated.
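The Monte Carlo step can be sketched in numpy: draw a large parameter set from the covariance of the fit, evaluate the temperature model for every draw, and take percentiles. The parameter values, covariance, and model below are hypothetical stand-ins, not the package's routine:

```python
import numpy as np

rng = np.random.default_rng(2)

# Best estimate and covariance of two correlated parameters (hypothetical)
p_val = np.array([482.6, 1.465])
p_cov = np.array([[0.039, 0.001],
                  [0.001, 0.0005]])

# Large parameter set drawn around the best estimate
p_mc = rng.multivariate_normal(p_val, p_cov, size=10_000)

# A hypothetical temperature model evaluated for every draw
def model(p):
    gamma, c = p[..., 0], p[..., 1]
    return gamma / (c + np.log(4000.0 / 3300.0))

tmp_mc = model(p_mc)
lo, hi = np.percentile(tmp_mc, [2.5, 97.5])  # 95% confidence interval
print(lo < model(p_val) < hi)                # True: the CI brackets the best estimate
```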
The confidence intervals differ per time step. If you would like to calculate confidence intervals of all time steps
together you have the option ci_avg_time_flag=True. ‘We can say with 95% confidence that the temperature
remained between this line and this line during the entire measurement period’.
ds.calibration_single_ended(sections=sections,
st_label=st_label,
ast_label=ast_label,
st_var=st_var,
ast_var=ast_var,
method='wls',
solver='sparse',
store_p_val='p_val',
store_p_cov='p_cov'
)
ds.conf_int_single_ended(
p_val='p_val',
p_cov='p_cov',
st_label=st_label,
ast_label=ast_label,
st_var=st_var,
ast_var=ast_var,
store_tmpf='TMPF',
store_tempvar='_var',
conf_ints=[2.5, 97.5],
mc_sample_size=500,
ci_avg_time_flag=False)
ds.TMPF_MC_var.plot(figsize=(12, 8));
We can tell from the graph above that the 95% confidence interval widens further down the cable. Let's have a look at the calculated variance along the cable for a single timestep. According to the device manufacturer this should be around 0.0059 degC.
ds1.TMPF_MC_var.plot(figsize=(12, 8));
The variance of the temperature measurement appears to be larger than what the manufacturer reports. This is already
the case for the internal cable; it is not caused by a dirty connector/bad splice on our side. Maybe the length of the
calibration section was not sufficient.
At 30 m the variance sharply increases. There are several possible explanations, e.g., high temperatures or decreased signal strength.
Let's have a look at the Stokes and anti-Stokes signal.
ds1.ST.plot(figsize=(12, 8))
ds1.AST.plot();
Clearly there was a bad splice at 30 m that resulted in the sharp increase of measurement uncertainty for the cable
section after the bad splice.
A double ended calibration is performed with weighted least squares, over all time steps simultaneously. 𝛾 and 𝛼 remain constant, while 𝐶 varies over time. The weights are not considered equal here; the weights quadratically decrease with the signal strength of the measured Stokes and anti-Stokes signals.
The confidence intervals can be calculated because the weights are correctly defined. The confidence intervals consist of two sources of uncertainty:
1. Measurement noise in the measured Stokes and anti-Stokes signals, expressed in a single variance value.
2. Inherent to least squares procedures / overdetermined systems: the parameters are estimated with limited certainty and all parameters are correlated. This is expressed in the covariance matrix.
Both sources of uncertainty are propagated to an uncertainty in the estimated temperature via Monte Carlo.
import os
ds_ = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')
st_label = 'ST'
ast_label = 'AST'
rst_label = 'REV-ST'
rast_label = 'REV-AST'
First calculate the variance in the measured Stokes and anti-Stokes signals, in the forward and backward direction.
The Stokes and anti-Stokes signals should follow a smooth decaying exponential. This function fits a decaying exponential to each reference section for each time step. The variance of the residuals between the measured Stokes and anti-Stokes signals and the fitted signals is used as an estimate of the variance in measured signals.
resid.plot(figsize=(12, 8));
We calibrate the measurement with a single method call. The labels refer to the keys in the DataStore object containing the Stokes, anti-Stokes, reverse Stokes and reverse anti-Stokes. The variances in those measurements were calculated in the previous step. We use a sparse solver because it saves us memory.
ds.calibration_double_ended(
st_label=st_label,
ast_label=ast_label,
rst_label=rst_label,
rast_label=rast_label,
st_var=st_var,
ast_var=ast_var,
rst_var=rst_var,
rast_var=rast_var,
store_tmpw='TMPW',
method='wls',
solver='sparse')
ds.TMPW.plot()
With another method call we estimate the confidence intervals. If the method is wls and confidence intervals are passed to conf_ints, confidence intervals are calculated. As weights are correctly passed to the least squares procedure, the covariance matrix can be used as an estimator for the uncertainty in the parameters. This matrix holds the covariances between all the parameters. A large parameter set is generated from this matrix as part of the Monte Carlo routine, assuming the parameter space is normally distributed with its mean at the best estimate of the least squares procedure.
The large parameter set is used to calculate a large set of temperatures. By using percentiles or quantiles, the 95% confidence interval of the calibrated temperature, between 2.5% and 97.5%, is calculated.
The confidence intervals differ per time step. If you would like to calculate confidence intervals of all time steps
together you have the option ci_avg_time_flag=True. ‘We can say with 95% confidence that the temperature
remained between this line and this line during the entire measurement period’. This is ideal if you’d like to calculate
the background temperature with a confidence interval.
ds.conf_int_double_ended(
p_val='p_val',
p_cov='p_cov',
st_label=st_label,
ast_label=ast_label,
rst_label=rst_label,
rast_label=rast_label,
st_var=st_var,
ast_var=ast_var,
rst_var=rst_var,
rast_var=rast_var,
store_tmpf='TMPF',
The DataArrays TMPF_MC and TMPB_MC and the dimension CI are added. MC stands for Monte Carlo and the CI dimension holds the confidence interval 'coordinates'.
(ds1.TMPW_MC_var**0.5).plot(figsize=(12, 4));
plt.ylabel(r'$\sigma$ ($^\circ$C)');
ds.data_vars
Data variables:
ST (x, time) float64 4.049e+03 4.044e+03 ... 3.501e+03
AST (x, time) float64 3.293e+03 3.296e+03 ... 2.803e+03
REV-ST (x, time) float64 4.061e+03 4.037e+03 ... 4.584e+03
REV-AST (x, time) float64 3.35e+03 3.333e+03 ... 3.707e+03
TMP (x, time) float64 16.69 16.87 16.51 ... 13.6 13.69
acquisitionTime (time) float32 2.098 2.075 2.076 2.133 2.085 2.062
referenceTemperature (time) float32 21.0536 21.054 ... 21.0531 21.057
probe1Temperature (time) float32 4.36149 4.36025 ... 4.36021 4.36118
probe2Temperature (time) float32 18.5792 18.5785 ... 18.5805 18.5723
referenceProbeVoltage (time) float32 0.121704 0.121704 ... 0.121705
probe1Voltage (time) float32 0.114 0.114 0.114 0.114 0.114 0.114
probe2Voltage (time) float32 0.121 0.121 0.121 0.121 0.121 0.121
userAcquisitionTimeFW (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
userAcquisitionTimeBW (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
gamma float64 482.6
alpha (x) float64 -0.007156 -0.003301 ... -0.005165
d (time) float64 1.465 1.465 1.464 1.465 1.465 1.465
gamma_var float64 0.03927
alpha_var (x) float64 1.734e-07 1.814e-07 ... 1.835e-07
d_var (time) float64 4.854e-07 4.854e-07 ... 4.854e-07
TMPF (x, time) float64 16.8 17.05 16.32 ... 13.49 13.78
TMPB (x, time) float64 16.8 16.83 16.88 ... 13.74 13.69
TMPF_MC_var (x, time) float64 dask.array<shape=(787, 6), chunksize=(699, 6)>
In this tutorial we are adding a timeseries to the DataStore object. This might be useful if the temperature in one of the calibration baths was measured with an external device. It requires three steps to add the measurement files to the DataStore object:
1. Load the measurement files (e.g., csv, txt) with pandas into a pandas.Series object
2. Add the pandas.Series object to the DataStore
3. Align the time to that of the DTS measurement (required for calibration)
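The three steps can be sketched with pandas and numpy alone (synthetic readings standing in for pd.read_csv; dtscalibration adds the series to a DataStore instead):

```python
import numpy as np
import pandas as pd

# Step 1: a temperature series from an external logger, localized to its timezone
idx = pd.date_range('2018-03-28 02:00:05', periods=5, freq='5s',
                    tz='Europe/Amsterdam')
ts = pd.Series([12.748, 12.747, 12.746, 12.747, 12.747],
               index=idx, name='Pt100 2')

# Steps 2 and 3: interpolate the logger readings onto the DTS acquisition times
dts_times = pd.date_range('2018-03-28 02:00:07', periods=3, freq='10s',
                          tz='Europe/Amsterdam')
aligned = np.interp(dts_times.asi8, ts.index.asi8, ts.values)
print(aligned.round(3))  # logger temperatures at the DTS acquisition times
```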
import pandas as pd
import os
# Bonus:
print(filepath, '\n')
with open(filepath, 'r') as f:
head = [next(f) for _ in range(5)]
print(' '.join(head))
../../tests/data/external_temperature_timeseries/Loodswaternet2018-03-28 02h.csv
"time","Pt100 2"
2018-03-28 02:00:05, 12.748
2018-03-28 02:00:10, 12.747
2018-03-28 02:00:15, 12.746
2018-03-28 02:00:20, 12.747
time
2018-03-28 02:00:05+02:00 12.748
2018-03-28 02:00:10+02:00 12.747
2018-03-28 02:00:15+02:00 12.746
2018-03-28 02:00:20+02:00 12.747
2018-03-28 02:00:26+02:00 12.747
Name: Pt100 2, dtype: float64
Now we quickly create a DataStore from xml-files with Stokes measurements to add the external timeseries to.
4.9.2 Step 2: Add the temperature measurements of the external probe to the DataStore
ds.coords['time_external'] = ts.index.values
4.9.3 Step 3: Align the time of the external measurements to the Stokes measurement times
We linearly interpolate the measurements of the external sensor to the times we have DTS measurements.
ds['external_probe_dts'] = ds['external_probe'].interp(time_external=ds.time)
print(ds.data_vars)
Data variables:
ST (x, time) float64 1.281 -0.5321 ... -43.44 -41.08
AST (x, time) float64 0.4917 1.243 ... -30.14 -32.09
REV-ST (x, time) float64 0.4086 -0.568 ... 4.822e+03
REV-AST (x, time) float64 2.569 -1.603 ... 4.224e+03
TMP (x, time) float64 196.1 639.1 218.7 ... 8.442 18.47
acquisitionTime (time) float32 2.098 2.075 2.076 2.133 2.085 2.062
referenceTemperature (time) float32 21.0536 21.054 ... 21.0531 21.057
probe1Temperature (time) float32 4.36149 4.36025 ... 4.36021 4.36118
probe2Temperature (time) float32 18.5792 18.5785 ... 18.5805 18.5723
referenceProbeVoltage (time) float32 0.121704 0.121704 ... 0.121705
probe1Voltage (time) float32 0.114 0.114 0.114 0.114 0.114 0.114
probe2Voltage (time) float32 0.121 0.121 0.121 0.121 0.121 0.121
userAcquisitionTimeFW (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
userAcquisitionTimeBW (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
external_probe (time_external) float64 12.75 12.75 ... 12.76 12.76
external_probe_dts (time) float64 12.75 12.75 12.75 12.75 12.75 12.75
Now we can use external_probe_dts when we define sections and use it for calibration.
The cable length was initially configured during the DTS measurement. For double ended measurements it is important
to enter the correct length so that the forward channel and the backward channel are aligned.
This notebook shows how to better align the forward and the backward measurements. Do this before the calibration
steps.
import os
from dtscalibration import read_silixa_files
from dtscalibration.datastore_utils import suggest_cable_shift_double_ended, shift_double_ended
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
suggest_cable_shift_double_ended?
ds_aligned = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml') # this one is already correctly aligned
Because our loaded files were already nicely aligned, we purposely offset the forward and backward channels by 3 'spatial indices'.
ds_notaligned = shift_double_ended(ds_aligned, 3)
The device-calibrated temperature does not have a valid meaning anymore and is dropped.
suggested_shift = suggest_cable_shift_double_ended(
ds_notaligned,
np.arange(-5, 5),
plot_result=True,
figsize=(12,8))
The two approaches suggest a shift of -3 and -4. It is up to the user which suggestion to follow; usually the two suggested shifts are close.
Note that our fiber has become shorter by 2*3 spatial indices.
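The bookkeeping behind this can be sketched with numpy index arithmetic: shifting by n points keeps only the overlapping part of the grid, and shifting back trims the other end, so the round trip loses 2·n points (a sketch, not the package's implementation):

```python
import numpy as np

n = 3
x = np.arange(100)              # spatial indices of the original fiber

def shift(arr, i):
    """Keep only the part of the grid shared by channels offset by i points."""
    return arr[i:] if i >= 0 else arr[:i]

shifted = shift(x, n)           # purposely misalign by n points: drops n at the start
restored = shift(shifted, -n)   # shift back to re-align: drops n at the end

print(x.size - restored.size)   # 2 * n = 6: the fiber is shorter by 2n points
```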
CHAPTER 5
Reference
5.1 dtscalibration
• variance_suffix (str, optional) – String appended for storing the variance. Only used when
method is wls.
• method ({‘ols’, ‘wls’, ‘external’}) – Use ‘ols’ for ordinary least squares and ‘wls’ for
weighted least squares
• solver ({‘sparse’, ‘stats’}) – Either use the homemade weighted sparse solver or the
weighted dense matrix solver of statsmodels
calibration_single_ended(sections=None, st_label='ST', ast_label='AST', st_var=None, ast_var=None, store_c='c', store_gamma='gamma', store_dalpha='dalpha', store_alpha='alpha', store_tmpf='TMPF', store_p_cov='p_cov', store_p_val='p_val', variance_suffix='_var', method='ols', solver='sparse', nt=None, z=None, p_val=None, p_var=None, p_cov=None)
Parameters
• store_p_cov (str) – Key to store the covariance matrix of the calibrated parameters
• store_p_val (str) – Key to store the values of the calibrated parameters
• nt (int, optional) – Number of timesteps. Should be defined if method==’external’
• z (array-like, optional) – Distances. Should be defined if method==’external’
• p_val
• p_var
• p_cov
• sections (dict, optional)
• st_label (str) – Label of the forward Stokes measurement
• ast_label (str) – Label of the anti-Stokes measurement
• st_var (float, optional) – The variance of the measurement noise of the Stokes signals in
the forward direction. Required if method is wls.
• ast_var (float, optional) – The variance of the measurement noise of the anti-Stokes sig-
nals in the forward direction. Required if method is wls.
• store_c (str) – Label of where to store C
• store_gamma (str) – Label of where to store gamma
• store_dalpha (str) – Label of where to store dalpha; the spatial derivative of alpha.
• store_alpha (str) – Label of where to store alpha; The integrated differential attenuation.
alpha(x=0) = 0
• store_tmpf (str) – Label of where to store the calibrated temperature of the forward direc-
tion
• variance_suffix (str, optional) – String appended for storing the variance. Only used when
method is wls.
• method ({‘ols’, ‘wls’}) – Use ‘ols’ for ordinary least squares and ‘wls’ for weighted least
squares
• solver ({‘sparse’, ‘stats’}) – Either use the homemade weighted sparse solver or the
weighted dense matrix solver of statsmodels
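As a sketch of how these parameters fit together for a weighted single-ended calibration (the variance values below are placeholders that would normally come from a prior variance estimate, and the call itself is commented out because it needs a loaded DataStore):

```python
# Hypothetical keyword set for a weighted single-ended calibration.
calib_kwargs = dict(
    st_label='ST', ast_label='AST',
    st_var=5.0, ast_var=5.0,          # placeholder noise variances
    method='wls', solver='sparse',
    store_tmpf='TMPF', variance_suffix='_var',
)
# ds.calibration_single_ended(sections=sections, **calib_kwargs)
```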
channel_configuration
chbw
chfw
conf_int_double_ended(p_val='p_val', p_cov='p_cov', st_label='ST', ast_label='AST', rst_label='REV-ST', rast_label='REV-AST', st_var=None, ast_var=None, rst_var=None, rast_var=None, store_tmpf='TMPF', store_tmpb='TMPB', store_tmpw='TMPW', store_tempvar='_var', conf_ints=None, mc_sample_size=100, ci_avg_time_flag=False, ci_avg_x_flag=False, var_only_sections=False, da_random_state=None, remove_mc_set_flag=True, reduce_memory_usage=False)
Parameters
• p_val (array-like or string) – parameter solution directly from calibration_double_ended_wls
• p_cov (array-like or string) – parameter covariance at the solution, directly from calibration_double_ended_wls. If set to False, no uncertainty in the parameters is propagated into the confidence intervals. This is similar to the spec sheets of the DTS manufacturers, and to passing an array filled with zeros.
• st_label (str) – Key of the forward Stokes
• ast_label (str) – Key of the forward anti-Stokes
• rst_label (str) – Key of the backward Stokes
• rast_label (str) – Key of the backward anti-Stokes
• st_var (float) – Float of the variance of the Stokes signal
• ast_var (float) – Float of the variance of the anti-Stokes signal
• rst_var (float) – Float of the variance of the backward Stokes signal
• rast_var (float) – Float of the variance of the backward anti-Stokes signal
• store_tmpf (str) – Key of how to store the Forward calculated temperature. Is calculated
using the forward Stokes and anti-Stokes observations.
• store_tmpb (str) – Key of how to store the Backward calculated temperature. Is calculated
using the backward Stokes and anti-Stokes observations.
• store_tmpw (str) – Key of how to store the forward-backward-weighted temperature.
First, the variances of TMPF and TMPB are calculated. The Monte Carlo sets of TMPF
and TMPB are averaged, weighted by their variance. The median of this set is taken as
a reasonable estimate of the temperature.
• store_tempvar (str) – A string that is appended to the store_tmp_ keys, under which the
variance calculated for those keys is stored.
• conf_ints (iterable object of float) – A list with the confidence boundaries that are calcu-
lated. Valid values are between [0, 1].
• mc_sample_size (int) – Size of the monte carlo parameter set used to calculate the confi-
dence interval
• ci_avg_time_flag (bool) – The confidence intervals differ per time step. Set to True to
calculate confidence intervals of all time steps together: 'We can say with 95%
confidence that the temperature remained between this line and this line during the entire
measurement period'.
• ci_avg_x_flag (bool) – Similar to ci_avg_time_flag, but the averaging takes place over
the x dimension, so that the variance over time can be observed.
• var_only_sections (bool) – Useful in combination with ci_avg_x_flag. Only calculates the
variance over the reference sections, so that the values can be compared with the accuracy
along the reference sections, where the accuracy is the variance of the residuals between
the estimated temperature and the temperature of the water baths.
• da_random_state – For testing purposes. Similar to a random seed; the seed for dask.
Makes random not so random. To produce reproducible results for testing environments.
• remove_mc_set_flag (bool) – Remove the Monte Carlo data set from which the CI and
the variance are calculated.
• reduce_memory_usage (bool) – Use less memory but at the expense of longer computa-
tion time
conf_int_single_ended(p_val='p_val', p_cov='p_cov', st_label='ST', ast_label='AST', st_var=None, ast_var=None, store_tmpf='TMPF', store_tempvar='_var', conf_ints=None, mc_sample_size=100, ci_avg_time_flag=False, ci_avg_x_flag=False, da_random_state=None, remove_mc_set_flag=True, reduce_memory_usage=False)
Parameters
• p_val (array-like or string) – parameter solution directly from calibration_double_ended_wls
• p_cov (array-like or string or bool) – parameter covariance at p_val, directly from calibration_double_ended_wls. If set to False, no uncertainty in the parameters is propagated into the confidence intervals. This is similar to the spec sheets of the DTS manufacturers, and to passing an array filled with zeros. If set to a string, the p_cov is retrieved by accessing ds[p_cov]. See the p_cov keyword argument in the calibration routine.
• st_label (str) – Key of the forward Stokes
• ast_label (str) – Key of the forward anti-Stokes
• st_var (float) – Float of the variance of the Stokes signal
• ast_var (float) – Float of the variance of the anti-Stokes signal
• store_tmpf (str) – Key of how to store the Forward calculated temperature. Is calculated
using the forward Stokes and anti-Stokes observations.
• store_tempvar (str) – A string that is appended to the store_tmp_ keys, under which the
variance calculated for those keys is stored.
• conf_ints (iterable object of float) – A list with the confidence boundaries that are calcu-
lated. Valid values are between [0, 1].
• mc_sample_size (int) – Size of the monte carlo parameter set used to calculate the confi-
dence interval
• ci_avg_time_flag (bool) – The confidence intervals differ per time step. Set to True to
calculate confidence intervals of all time steps together: 'We can say with 95%
confidence that the temperature remained between this line and this line during the entire
measurement period'.
• ci_avg_x_flag (bool) – Similar to ci_avg_time_flag but then over the x-dimension instead
of the time-dimension
• da_random_state – For testing purposes. Similar to a random seed; the seed for dask.
Makes random not so random. To produce reproducible results for testing environments.
• remove_mc_set_flag (bool) – Remove the Monte Carlo data set from which the CI and
the variance are calculated.
• reduce_memory_usage (bool) – Use less memory but at the expense of longer computa-
tion time
get_default_encoding()
get_time_dim(data_var_key=None)
Find the relevant time dimension by educated guessing.
Parameters data_var_key (str) – The data variable key that contains a relevant time dimension.
If None, ‘ST’ is used.
get_x_dim(data_var_key=None)
Find the relevant x dimension by educated guessing.
Parameters data_var_key (str) – The data variable key that contains a relevant x dimension.
If None, 'ST' is used.
in_confidence_interval(ci_label, conf_ints, sections=None)
Returns an array of bools indicating whether the temperatures of the reference sections are within the
confidence intervals.
Parameters
• sections (Dict[str, List[slice]])
• ci_label
• conf_ints
inverse_variance_weighted_mean(tmp1='TMPF', tmp2='TMPB', tmp1_var='TMPF_MC_var', tmp2_var='TMPB_MC_var', tmpw_store='TMPW', tmpw_var_store='TMPW_var')
Average two temperature datasets with the inverse of the variance as weights. The two temperature datasets
tmp1 and tmp2, with their variances tmp1_var and tmp2_var respectively, are averaged and stored in the
DataStore.
Parameters
Parameters
• tmp1 (str) – The label of the first temperature dataset that is averaged
• tmp2 (str) – The label of the second temperature dataset that is averaged
• tmp1_var (str) – The variance of tmp1
• tmp2_var (str) – The variance of tmp2
• tmpw_store (str) – The label of the averaged temperature dataset
• tmpw_var_store (str) – The label of the variance of the averaged temperature dataset
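The weighting scheme is standard inverse-variance averaging; a self-contained numpy sketch of what is computed per data point (the values are made up for illustration):

```python
import numpy as np

tmpf = np.array([20.1, 20.3])      # forward temperature estimates
tmpb = np.array([19.9, 20.5])      # backward temperature estimates
tmpf_var = np.array([0.01, 0.04])  # their variances
tmpb_var = np.array([0.01, 0.04])

w_f, w_b = 1.0 / tmpf_var, 1.0 / tmpb_var
tmpw = (w_f * tmpf + w_b * tmpb) / (w_f + w_b)   # weighted mean temperature
tmpw_var = 1.0 / (w_f + w_b)                     # variance of the weighted mean
```

With equal variances the result is the plain mean; a less noisy channel pulls the weighted mean towards itself.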
inverse_variance_weighted_mean_array(tmp_label='TMPF', tmp_var_label='TMPF_MC_var', tmpw_store='TMPW', tmpw_var_store='TMPW_var', dim='time')
Calculates the weighted average across a dimension.
is_double_ended
sections
Define calibration sections. Each section requires a reference temperature time series, such as the temper-
ature measured by an external temperature sensor. They should already be part of the DataStore object.
Please look at the example notebook on sections if you encounter difficulties.
Parameters sections (Dict[str, List[slice]]) – Sections are defined in a dictionary whose keywords
are the names of the reference temperature time series. Its values are lists of slice
objects, where each slice object is a stretch.
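A sketch of a sections definition (the series names and distances below are hypothetical; the time series must already exist in the DataStore before assignment):

```python
# Keys are names of reference temperature time series already present in
# the DataStore; values are lists of slices, each a stretch along the
# fibre in metres.
sections = {
    'probe1Temperature': [slice(7.5, 17.0), slice(70.0, 80.0)],   # cold bath
    'probe2Temperature': [slice(24.0, 34.0), slice(85.0, 95.0)],  # warm bath
}
# ds.sections = sections   # assign to an existing DataStore `ds`
```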
temperature_residuals(label=None)
Parameters label (str) – The key of the temperature DataArray
Returns resid_da (xarray.DataArray) – The residuals as DataArray
timeseries_keys
Returns the keys of all timeseries that can be used for calibration.
to_netcdf(path=None, mode='w', format=None, group=None, engine=None, encoding=None, unlimited_dims=None, compute=True)
Write datastore contents to a netCDF file.
Parameters
• path (str, Path or file-like object, optional) – Path to which to save this dataset. File-like objects are only supported by the scipy engine. If no path is provided, this function returns the resulting netCDF file as bytes; in this case, we need to use scipy, which does not support netCDF version 4 (the default format becomes NETCDF3_64BIT).
• mode ({'w', 'a'}, optional) – Write ('w') or append ('a') mode. If mode='w', any existing file at this location will be overwritten. If mode='a', existing variables will be overwritten.
• format ({'NETCDF4', 'NETCDF4_CLASSIC', 'NETCDF3_64BIT', 'NETCDF3_CLASSIC'}, optional) – File format for the resulting netCDF file: NETCDF4 stores data in an HDF5 file, using netCDF4 API features.
All formats are supported by the netCDF4-python library. scipy.io.netcdf only supports the
last two formats. The default format is NETCDF4 if you are saving a file to disk and have
the netCDF4-python library available. Otherwise, xarray falls back to using scipy to write
netCDF files and defaults to the NETCDF3_64BIT format (scipy does not support netCDF4).
• group (str, optional) – Path to the netCDF4 group in the given file to open (only works for
format=’NETCDF4’). The group(s) will be created if necessary.
• engine ({‘netcdf4’, ‘scipy’, ‘h5netcdf’}, optional) – Engine to use when writing netCDF
files. If not provided, the default engine is chosen based on available dependencies, with a
preference for ‘netcdf4’ if writing to a file on disk.
• encoding (dict, optional) – Defaults to reasonable compression. Use encoding={} to disable
encoding. Nested dictionary with variable names as keys and dictionaries of variable specific
encodings as values, e.g., {'my_variable': {'dtype': 'int16', 'scale_factor': 0.1, 'zlib': True}, ...}
The h5netcdf engine supports both the NetCDF4-style compression encoding parameters
{'zlib': True, 'complevel': 9} and the h5py ones {'compression': 'gzip',
'compression_opts': 9}. This allows using any compression plugin installed in the
HDF5 library, e.g. LZF.
• unlimited_dims (sequence of str, optional) – Dimension(s) that should be serialized as unlimited
dimensions. By default, no dimensions are treated as unlimited dimensions. Note that
unlimited_dims may also be set via dataset.encoding['unlimited_dims'].
• compute (boolean) – If True, compute immediately; otherwise return a dask.delayed.Delayed
object that can be computed later.
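A sketch of a per-variable encoding dictionary (the variable name 'TMPF' is an assumption about what is stored in the DataStore):

```python
# Compress the forward temperature with zlib when writing; variables not
# listed fall back to the default encoding.
encoding = {'TMPF': {'dtype': 'float32', 'zlib': True, 'complevel': 9}}
# ds.to_netcdf('calibrated.nc', encoding=encoding)
```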
• x_indices (bool) – To retrieve an integer array with the indices of the x-coordinates in the
section/stretch
• ref_temp_broadcasted (bool)
• calc_per ({'all', 'section', 'stretch'})
• func_kwargs (dict) – Dictionary with options that are passed to func
• TODO – Spend time on creating a slice instead of appending everything to a list and
concatenating after.
Examples

# Calculate the variance of the residuals along ALL the reference
# sections wrt the temperature of the water baths
TMPF_var = d.ufunc_per_section(
    func='var', calc_per='all', label='TMPF', temp_err=True)

# Calculate the variance of the residuals PER reference section
# wrt the temperature of the water baths
TMPF_var = d.ufunc_per_section(
    func='var', calc_per='stretch', label='TMPF', temp_err=True)

# Calculate the variance of the residuals PER water bath
# wrt the temperature of the water baths
TMPF_var = d.ufunc_per_section(
    func='var', calc_per='section', label='TMPF', temp_err=True)

# Obtain the coordinates of the measurements per section
locs = d.ufunc_per_section(
    func=None, label='x', temp_err=False, ref_temp_broadcasted=False,
    calc_per='stretch')

# Number of observations per stretch
nlocs = d.ufunc_per_section(
    func=len, label='x', temp_err=False, ref_temp_broadcasted=False,
    calc_per='stretch')

# Broadcast the temperature of the reference sections to the
# stretch/section/all dimensions. The reference temperature (a time
# series) is broadcast to the shape of self[label]; self[label] is not
# used for anything else.
temp_ref = d.ufunc_per_section(
    label='ST', ref_temp_broadcasted=True, calc_per='all')

# x-coordinate index
ix_loc = d.ufunc_per_section(x_indices=True)
Note: If self[label] or self[subtract_from_label] is a Dask array, a Dask array is returned; else a numpy
array is returned.
• st_label (str) – label of the Stokes, anti-Stokes measurement. E.g., ST, AST, REV-ST,
REV-AST
• sections (dict, optional) – Define sections. See documentation
Returns
• I_var (float) – Variance of the residuals between measured and best fit
• resid (array_like) – Residuals between measured and best fit
Notes
Because there is a large number of unknowns, time is spent on calculating an initial estimate. This can be
turned off by setting it to False.
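Conceptually, the returned I_var is the variance of the residuals between the measured Stokes signal and the best fit. A self-contained sketch with synthetic data, where the true decay stands in for the fitted one (the decay rate and noise level are made up):

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.arange(200)
st_fit = 1000.0 * np.exp(-0.002 * x)             # idealised best-fit Stokes decay
st_meas = st_fit + rng.normal(0.0, 5.0, x.size)  # measurement = fit + noise

resid = st_meas - st_fit   # residuals wrt the best fit
i_var = resid.var()        # estimates the noise variance, here about 25
```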
variance_stokes_exponential(st_label, sections=None, use_statsmodels=False, suppress_info=True, reshape_residuals=True)
Calculates the variance between the measurements and a best fit exponential at each reference section.
This fits a two-parameter exponential to the Stokes measurements. The temperature is assumed constant,
with no splices/sharp bends in each reference section; therefore, all signal decrease is due to differential
attenuation, which is the same for each reference section. The scale of the exponential does differ per
reference section.
Assumptions: 1) the temperature is the same along a reference section. 2) no sharp bends and splices in
the reference sections. 3) Same type of optical cable in each reference section.
Idea from the discussion at page 127 in Richter, P. H. (1995), Estimating errors in least-squares fitting. For
the weights, error propagation is used: w^2 = 1/sigma(ln y)^2 = y^2/sigma(y)^2 = y^2
Parameters
• reshape_residuals
• use_statsmodels
• suppress_info
• st_label (str) – label of the Stokes, anti-Stokes measurement. E.g., ST, AST, REV-ST,
REV-AST
• sections (dict, optional) – Define sections. See documentation
Returns
• I_var (float) – Variance of the residuals between measured and best fit
• resid (array_like) – Residuals between measured and best fit
dtscalibration.open_datastore(filename_or_obj, group=None, decode_cf=True, mask_and_scale=None, decode_times=True, concat_characters=True, decode_coords=True, engine=None, chunks=None, lock=None, cache=None, drop_variables=None, backend_kwargs=None, **kwargs)
Load and decode a datastore from a file or file-like object.
Parameters
• filename_or_obj (str, Path, file or xarray.backends.*DataStore) – Strings and Path objects are interpreted as a path to a netCDF file or an OpenDAP URL and opened with python-netCDF4, unless the filename ends with .gz, in which case the file is gunzipped and opened with scipy.io.netcdf (only netCDF3 supported). File-like objects are opened with scipy.io.netcdf (only netCDF3 supported).
• group (str, optional) – Path to the netCDF4 group in the given file to open (only works for
netCDF4 files).
• decode_cf (bool, optional) – Whether to decode these variables, assuming they were saved
according to CF conventions.
• mask_and_scale (bool, optional) – If True, replace array values equal to _FillValue with NA
and scale values according to the formula original_values * scale_factor + add_offset, where
_FillValue, scale_factor and add_offset are taken from variable attributes (if they exist). If the
_FillValue or missing_value attribute contains multiple values, a warning will be issued and all
array values matching one of the multiple values will be replaced by NA. mask_and_scale
defaults to True except for the pseudonetcdf backend.
• decode_times (bool, optional) – If True, decode times encoded in the standard NetCDF datetime
format into datetime objects. Otherwise, leave them encoded as numbers.
• concat_characters (bool, optional) – If True, concatenate along the last dimension of character
arrays to form string arrays. Dimensions will only be concatenated over (and removed) if they
have no corresponding variable and if they are only used as the last dimension of character
arrays.
• decode_coords (bool, optional) – If True, decode the 'coordinates' attribute to identify
coordinates in the resulting dataset.
• engine ({'netcdf4', 'scipy', 'pydap', 'h5netcdf', 'pynio', 'pseudonetcdf'}, optional) – Engine
to use when reading files. If not provided, the default engine is chosen based on available
dependencies, with a preference for 'netcdf4'.
• chunks (int or dict, optional) – If chunks is provided, it is used to load the new dataset into dask
arrays. chunks={} loads the dataset with dask using a single chunk for all arrays.
• lock (False, True or threading.Lock, optional) – If chunks is provided, this argument is passed
on to dask.array.from_array(). By default, a global lock is used when reading data
from netCDF files with the netcdf4 and h5netcdf engines to avoid issues with concurrent access
when using dask’s multithreaded backend.
• cache (bool, optional) – If True, cache data loaded from the underlying datastore in memory as
NumPy arrays when accessed to avoid reading from the underlying data- store multiple times.
Defaults to True unless you specify the chunks argument to use dask, in which case it defaults to
False. Does not change the behavior of coordinates corresponding to dimensions, which always
load their data from disk into a pandas.Index.
• drop_variables (string or iterable, optional) – A variable or list of variables to exclude from
being parsed from the dataset. This may be useful to drop variables with problems or inconsistent
values.
• backend_kwargs (dictionary, optional) – A dictionary of keyword arguments to pass on to the
backend. This may be useful when backend options would improve performance or allow user
control of dataset processing.
See also:
read_xml_dir()
dtscalibration.read_sensornet_files(filepathlist=None, directory=None, file_ext='*.ddf', timezone_netcdf='UTC', timezone_input_files='UTC', silent=False, **kwargs)
Read a folder with measurement files. Each measurement file contains values for a single timestep. Remember
to check which timezone you are working in.
Parameters
• filepathlist (list of str, optional) – List of paths that point to the sensornet files
• directory (str, Path, optional) – Path to folder
• timezone_netcdf (str, optional) – Timezone string of the netcdf file. UTC follows CF-
conventions.
• timezone_input_files (str, optional) – Timezone string of the measurement files. Remember
to check when the measurements were taken, and whether daylight saving time was in use.
• file_ext (str, optional) – file extension of the measurement files
• silent (bool) – If set to True, some verbose texts are not printed to stdout/screen
• kwargs (dict-like, optional) – keyword-arguments are passed to DataStore initialization
Notes
Compressed sensornet files cannot be directly decoded, because the files are encoded with
encoding='windows-1252' instead of UTF-8.
Returns datastore (DataStore) – The newly created datastore.
dtscalibration.read_silixa_files(filepathlist=None, directory=None, zip_handle=None, file_ext='*.xml', timezone_netcdf='UTC', silent=False, load_in_memory='auto', **kwargs)
Read a folder with measurement files. Each measurement file contains values for a single timestep. Remember
to check which timezone you are working in.
The silixa files are already timezone aware.
Parameters
• filepathlist (list of str, optional) – List of paths that point to the silixa files
• directory (str, Path, optional) – Path to folder
• timezone_netcdf (str, optional) – Timezone string of the netcdf file. UTC follows CF-
conventions.
• file_ext (str, optional) – file extension of the measurement files
• silent (bool) – If set to True, some verbose texts are not printed to stdout/screen
• load_in_memory ({'auto', True, False}) – If 'auto', the Stokes data is only loaded into memory
for small files.
• kwargs (dict-like, optional) – keyword-arguments are passed to DataStore initialization
Returns datastore (DataStore) – The newly created datastore.
dtscalibration.plot_dask(arr, file_path=None)
For debugging the scheduling of the calculation of dask arrays. Requires additional libraries to be installed.
Parameters
• arr (Dask-Array) – An uncomputed dask array
• file_path (Path-like, str, optional) – Path to save graph
Returns out (array-like) – The calculated array
CHAPTER 6
Contributing
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
dtscalibration could always use more documentation, whether as part of the official dtscalibration docs, in docstrings,
or even on the web in blog posts, articles, and such.
6.4 Development
tox
git add .
git commit -m "Your detailed description of your changes."
git push origin name-of-your-bugfix-or-feature
If you need some code review or feedback while you’re developing the code just make the pull request.
For merging, you should:
1. Include passing tests (run tox)¹.
2. Update documentation when there’s new API, functionality etc.
3. Add a note to CHANGELOG.rst about the changes.
4. Add yourself to AUTHORS.rst.
6.4.2 Tips
To run all the test environments in parallel (you need to pip install detox):
detox
¹ If you don't have all the necessary Python versions available locally, you can rely on Travis: it will run the tests for each change you add in the
pull request. It will be slower, though.
CHAPTER 7
Authors
CHAPTER 8
Changelog
• Added reading support for zipped silixa files. Still rarely fails due to upstream bug.
• pretty __repr__
• Reworked double ended calibration procedure. Integrated differential attenuation outside of reference sections
is now calculated separately.
• New approach for estimation of Stokes variance. Not restricted to a decaying exponential
• Bug in averaging TMPF and TMPB to TMPW
• Modified residuals plot, especially useful for long fibers (Great work Bart!)
• Example notebooks updated accordingly
• Bug in to_netcdf when passing encodings
• Better support for sections that are not related to a timeseries.
• Double-ended weighted calibration procedure is rewritten so that the integrated differential attenuation outside
of the reference sections is calculated separately. Better memory usage and faster
• Other calibration routines cleaned up
• Official support for Python 3.7
• Coverage figures are now trustworthy
• String representation improved
• Include test for aligning double ended measurements
• Example for aligning double ended measurements
• Reworked the double-ended calibration routine and the routine for confidence intervals. The integrated differ-
ential attenuation is not zero at x=0 anymore.
• Verbose commands carpentry
• Bug fixed that would make the read_silixa routine crash if there are copies of the same file in the same folder
• Routine to read sensornet files. Only single-ended configurations supported for now. Anyone has double-ended
measurements?
• Lazy calculation of the confidence intervals
• Bug solved. The x-coordinates were not calculated correctly. The bug only appeared for measurements along
long cables.
• Example notebook of importing a timeseries. For example, importing measurements from an external temperature
sensor for calibration.
• Updated documentation
• No changes