Analyzing time series¶
Although the Raven server by no means offer a complete suite of tools for time series analysis, we strive to provide basic algorithms that can reduce the volume of data to be downloaded and analyzed locally. To this end, Raven offers a couple of indicators for stream flow series.
Here we’ll test those indicators on a simple test file with around ten years of daily streamflow generated by a Raven simulation.
[1]:
%matplotlib inline
from birdy import WPSClient
from example_data import TESTDATA
import os
# Set environment variable WPS_URL to "http://localhost:9099" to run on the default local server
url = os.environ.get("WPS_URL", "https://pavics.ouranos.ca/twitcher/ows/proxy/raven/wps")
wps = WPSClient(url)
fn = str(TESTDATA['simfile_single'])
Base flow index¶
The base flow index is the minimum 7-day average flow divided by the mean flow.
[2]:
help(wps.base_flow_index)
Help on method base_flow_index in module birdy.client.base:
base_flow_index(q=None, variable=None, freq='YS') method of birdy.client.base.WPSClient instance
Return the base flow index, defined as the minimum 7-day average flow divided by the mean flow.
Parameters
----------
q : ComplexData:mimetype:`application/x-netcdf`
NetCDF Files or archive (tar/zip) containing netCDF files.
variable : string
Name of variable to analyze in netCDF file.
freq : {'YS', 'MS', 'QS-DEC', 'AS-JUL'}string
Resampling frequency
Returns
-------
output : ComplexData:mimetype:`application/x-netcdf`
The indicator values computed on the original input grid.
output_log : ComplexData:mimetype:`text/plain`
Collected logs during process run.
The base flow index needs as input arguments the link to a NetCDF file storing the stream flow time series, the name of the stream flow variable, and the frequency at which the index is computed (YS
: yearly, QS-DEC
: seasonally).
[3]:
resp = wps.base_flow_index(fn, variable='q_sim')
out, log = resp.get(asobj=True)
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
out.base_flow_index.plot()
[3]:
[<matplotlib.lines.Line2D at 0x7f0b243cbda0>]
To compute generic statistics of a time series, use the ts_stats
process.
[4]:
help(wps.ts_stats)
Help on method ts_stats in module birdy.client.base:
ts_stats(da, op=None, variable=None, freq='YS', season=None, month=None) method of birdy.client.base.WPSClient instance
Parameters
----------
da : ComplexData:mimetype:`application/x-netcdf`
NetCDF Files or archive (tar/zip) containing netCDF files.
variable : string
Name of variable to analyze in netCDF file.
op : {'min', 'max', 'mean', 'std', 'var', 'count', 'sum', 'argmax', 'argmin'}string
Operation name
freq : {'YS', 'MS', 'QS-DEC', 'AS-JUL'}string
Resampling frequency
season : {'DJF', 'MAM', 'JJA', 'SON'}string
Season selection specification.
month : {'range(1, 13)'}string
Month selection specification
Returns
-------
output : ComplexData:mimetype:`application/x-netcdf`
The indicator values computed on the original input grid.
output_log : ComplexData:mimetype:`text/plain`
Collected logs during process run.
[5]:
# Here we compute the annual summer (JJA) minimum
resp = wps.ts_stats(fn, variable='q_sim', op='min', season='JJA')
out, log = resp.get(asobj=True)
out.ts_stats.plot()
[5]:
[<matplotlib.lines.Line2D at 0x7f0b242e8358>]
Frequency analysis¶
The process freq_analysis
is similar to the previous stat sin the it fits a series of annual maxima or minima to a statistical distribution, and returns the values corresponding to different return periods.
[6]:
help(wps.freq_analysis)
Help on method freq_analysis in module birdy.client.base:
freq_analysis(da, mode, t, dist=None, variable=None, window=1, freq=None, season=None, month=None) method of birdy.client.base.WPSClient instance
Parameters
----------
da : ComplexData:mimetype:`application/x-netcdf`
NetCDF Files or archive (tar/zip) containing netCDF files.
variable : string
Name of variable to analyze in netCDF file.
mode : {'min', 'max'}string
Mode
t : integer
Return period
dist : string
Distribution
window : integer
Window size
freq : {'YS', 'MS', 'QS-DEC', 'AS-JUL'}string
Resampling frequency
season : {'DJF', 'MAM', 'JJA', 'SON'}string
Season selection specification.
month : {'range(1, 13)'}string
Month selection specification
Returns
-------
output : ComplexData:mimetype:`application/x-netcdf`
The indicator values computed on the original input grid.
output_log : ComplexData:mimetype:`text/plain`
Collected logs during process run.
For example, computing the Q27, the minimum 7-days streamflow of reccurrence two years, can be done using the following.
[7]:
resp = wps.freq_analysis(fn, variable='q_sim', mode='min', t=2, dist='gumbel_r', window=7)
[8]:
out, log = resp.get(asobj=True)
out.freq_analysis
[8]:
<xarray.DataArray 'freq_analysis' (return_period: 1, nbasins: 1)>
array([[8.88304]])
Coordinates:
basin_name (nbasins) object ...
* return_period (return_period) int64 2
Dimensions without coordinates: nbasins
Attributes:
units: m^3 s-1
long_name: N-year return period min annual 7-day flow
original_name:
description: Streamflow frequency analysis for the min annual 7-day fl...
estimator: Maximum likelihood
scipy_dist: gumbel_r
standard_name: discharge
cell_methods:
mode: min
history: [2020-03-09 15:39:43] freq_analysis: freq_analysis(da, mo...
An array of return periods can be passed.
[9]:
resp = wps.freq_analysis(fn, variable='q_sim', mode='max', t=(2, 5, 10, 25, 50, 100), dist='gumbel_r')
out, log = resp.get(asobj=True)
out.freq_analysis.plot()
[9]:
[<matplotlib.lines.Line2D at 0x7f0b1db6cf28>]
Getting the parameters of the distribution and comparing the fit¶
It’s sometimes more useful to store the fitted parameters of the distribution rather than storing only the quantiles. In the example below, we’re first computing the annual maxima of the simulated time series, then fitting them to a gumbel distribution using the fit
process.
[10]:
resp = wps.ts_stats(fn, variable="q_sim", op="max")
ts = resp.get()[0]
resp = wps.fit(ts, dist='gumbel_r')
pa_fn = resp.get()[0]
pa = resp.get(True)[0]
pa.params
[10]:
<xarray.DataArray 'params' (dparams: 2, nbasins: 1)>
array([[154.27106 ],
[ 44.071605]])
Coordinates:
basin_name (nbasins) object ...
* dparams (dparams) object 'loc' 'scale'
Dimensions without coordinates: nbasins
Attributes:
units:
long_name: gumbel_r distribution parameters
history: [2020-03-09 15:39:46] ts_stats: ts_stats(da, op, freq='YS...
cell_methods: time: fit
standard_name: gumbel_r parameters
description: Parameters of the gumbel_r distribution fitted
original_name: discharge
estimator: Maximum likelihood
scipy_dist: gumbel_r
To see how the distribution defined by those parameters compare with the original data, the graph_fit
process can help by creating a graphic showing an histogram of the data, overlayed with the fitted probability density function. The left y-axis displays the density (pdf) while the right y-axis displays the histogram count.
[11]:
resp = wps.graph_fit(ts=ts, params=pa_fn)
resp.get(asobj=True)[0]
[11]: