Perform regionalization when no parameter set is available
Here we call the Regionalization WPS service to estimate streamflow (best estimate and ensemble) at an ungauged site, using one of three pre-calibrated hydrological models and a large hydrometeorological database with catchment attributes (Extended CANOPEX). Several regionalization strategies are available.
[1]:
from birdy import WPSClient
from example_data import TESTDATA
import datetime as dt
from urllib.request import urlretrieve
import xarray as xr
import numpy as np
from matplotlib import pyplot as plt
import json
import os
# Set environment variable WPS_URL to "http://localhost:9099" to run on the default local server
url = os.environ.get("WPS_URL", "https://pavics.ouranos.ca/twitcher/ows/proxy/raven/wps")
wps = WPSClient(url)
[2]:
# Get the documentation for the method's usage:
help(wps.regionalisation)
Help on method regionalisation in module birdy.client.base:
regionalisation(ts, latitude, longitude, model_name, properties=None, elevation=None, start_date=datetime.datetime(1, 1, 1, 0, 0), end_date=datetime.datetime(1, 1, 1, 0, 0), name='watershed', ndonors=5, min_nse=0.6, method='SP_IDW', area=0.0) method of birdy.client.base.WPSClient instance
Compute the hydrograph for an ungauged catchment using a regionalization method.
Parameters
----------
ts : ComplexData:mimetype:`application/x-netcdf`, :mimetype:`application/x-ogc-dods`, :mimetype:`text/plain`, :mimetype:`application/x-zipped-shp`
Files (text or netCDF) storing daily liquid precipitation (pr), solid precipitation (prsn), minimum temperature (tasmin), maximum temperature (tasmax), potential evapotranspiration (evspsbl) and observed streamflow (qobs [m3/s]).
start_date : dateTime
Start date of the simulation (AAAA-MM-DD). Defaults to the start of the forcing file.
end_date : dateTime
End date of the simulation (AAAA-MM-DD). Defaults to the end of the forcing file.
latitude : float
Watershed's centroid latitude
longitude : float
Watershed's centroid longitude
name : string
The name of the watershed the model is run for.
model_name : {'HMETS', 'GR4JCN', 'MOHYSE'}string
Hydrological model identifier: {HMETS, GR4JCN, MOHYSE}
ndonors : integer
Number of close or similar catchments to use to generate the representative hydrograph at the ungauged site.
min_nse : float
Minimum calibration NSE value required to be considered in the regionalization.
method : {'MLR', 'SP', 'PS', 'SP_IDW', 'PS_IDW', 'SP_IDW_RA', 'PS_IDW_RA'}string
Regionalisation method to use, one of MLR, SP, PS, SP_IDW,
PS_IDW, SP_IDW_RA, PS_IDW_RA.
The available regionalization methods are:
Multiple linear regression (MLR)
Ungauged catchment parameters are estimated individually by a linear regression
against catchment properties.
Spatial proximity (SP)
The ungauged hydrograph is an average of the `n` closest catchments' hydrographs.
Physical similarity (PS)
The ungauged hydrograph is an average of the `n` most similar catchments' hydrographs.
Spatial proximity with inverse distance weighting (SP_IDW)
The ungauged hydrograph is an average of the `n` closest catchments' hydrographs, but
the average is weighted using inverse distance weighting
Physical similarity with inverse distance weighting (PS_IDW)
The ungauged hydrograph is an average of the `n` most similar catchments' hydrographs, but
the average is weighted using inverse distance weighting
Spatial proximity with IDW and regression-based augmentation (SP_IDW_RA)
The ungauged hydrograph is an average of the `n` closest catchments' hydrographs, but
the average is weighted using inverse distance weighting. Furthermore, the method uses the CANOPEX/USGS
dataset to estimate model parameters using Multiple Linear Regression. Parameters whose regression r-squared
is higher than 0.5 are replaced by the MLR-estimated value.
Physical Similarity with IDW and regression-based augmentation (PS_IDW_RA)
The ungauged hydrograph is an average of the `n` most similar catchments' hydrographs, but
the average is weighted using inverse distance weighting. Furthermore, the method uses the CANOPEX/USGS
dataset to estimate model parameters using Multiple Linear Regression. Parameters whose regression r-squared
is higher than 0.5 are replaced by the MLR-estimated value.
properties : ComplexData:mimetype:`application/json`
json string storing dictionary of properties. The available properties are: area (km2), longitude (dec.degrees), latitude (dec. degrees), gravelius, perimeter (m), elevation (m), slope(%), aspect, forest (%), grass (%), wetland (%), water (%), urban (%), shrubs (%), crops (%) and snowIce (%).
area : float
Watershed area (km2)
elevation : float
Watershed's mean elevation (m)
Returns
-------
hydrograph : ComplexData:mimetype:`application/x-netcdf`, :mimetype:`application/zip`
A netCDF file containing the outflow hydrographs (in m3/s) for all subbasins specified as `gauged` in the .rvh file. It reports period-ending time-averaged flows for the preceding time step, as is consistent with most measured stream gauge data (again, the initial flow conditions at the start of the first time step are included). If observed hydrographs are specified, they will be output adjacent to the corresponding modelled hydrograph.
ensemble : ComplexData:mimetype:`application/x-netcdf`
A netCDF file containing the outflow hydrographs (in m3/s) for the basin on which the regionalization method has been applied. The number of outflow hydrographs is equal to the number of donors (ndonors) passed to the method. The average of these hydrographs (either using equal or Inverse-Distance Weights) is the hydrograph generated in "hydrograph".
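The help text above distinguishes a plain average of donor hydrographs (SP, PS) from an inverse-distance-weighted average (SP_IDW, PS_IDW). The sketch below illustrates the difference on made-up numbers. It is not the service's implementation: the function name `idw_average`, the exponent `power=1.0` and the sample distances are assumptions, and for the physical-similarity methods the "distance" would presumably be a dissimilarity in catchment-attribute space rather than a geographic one.

```python
import numpy as np


def idw_average(donor_flows, distances, power=1.0):
    """Weighted mean of donor hydrographs using inverse-distance weights.

    donor_flows: array of shape (n_donors, n_time)
    distances:   array of shape (n_donors,), all strictly positive
    power:       exponent applied to the distance (assumed to be 1 here;
                 the exponent used by the service is not stated above)
    """
    donor_flows = np.asarray(donor_flows, dtype=float)
    distances = np.asarray(distances, dtype=float)
    weights = 1.0 / distances ** power
    weights /= weights.sum()      # normalize so the weights sum to 1
    return weights @ donor_flows  # shape (n_time,)


# Three hypothetical donors, two time steps (m3/s)
flows = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
dist = np.array([1.0, 2.0, 4.0])  # hypothetical distances to the ungauged site

equal_mean = flows.mean(axis=0)      # equal weights, as in SP or PS
idw_mean = idw_average(flows, dist)  # closer donors count more, as in SP_IDW or PS_IDW
print(equal_mean, idw_mean)
```

With these numbers the closest donor receives a weight of 4/7, so the weighted hydrograph is pulled toward its lower flows compared with the equal-weight mean.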
[3]:
# Forcing file. It should contain only weather data (tmin, tmax, rain, snow and, if desired, pet).
# No streamflow is required. Here the path comes from the test data, but you can pass the path to your own netCDF file as a string.
ts = str(TESTDATA['raven-hmets-nc-ts'])
# Model configuration parameters
config = dict(
start_date=dt.datetime(2000, 1, 1),
end_date=dt.datetime(2002, 1, 1),
area=4250.6,
name='Saumon',
elevation=843.0,
latitude=54.4848,
longitude=-123.3659,
method='PS', # One of the methods described above
model_name='HMETS', # One of three models is allowed: HMETS, GR4JCN or MOHYSE
min_nse=0.7, # Minimum calibration NSE required to be considered a donor (for selecting good donor catchments)
ndonors=5, # Number of donors we want to use. Usually between 4 and 8 is a robust number.
properties=json.dumps({'latitude':54.4848, 'longitude':-123.3659, 'forest':0.4}),
)
# Let's call the model with the timeseries, model parameters and other configuration parameters
resp = wps.regionalisation(ts=ts, **config)
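The `min_nse` setting screens donor catchments by their calibration Nash-Sutcliffe efficiency (NSE). As a reminder of what that score measures, here is a minimal sketch of the standard NSE formula followed by a donor filter. The donor names and scores are invented for illustration, and whether the service compares with `>=` or `>` is an assumption.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


obs = np.array([1.0, 2.0, 3.0, 4.0])
perfect = nse(obs, obs)                              # simulation equals observations
mean_only = nse(obs, np.full_like(obs, obs.mean()))  # simulation is the observed mean

# Hypothetical calibration scores for three candidate donors
donor_nse = {"A": 0.82, "B": 0.65, "C": 0.71}
min_nse = 0.7
eligible = [name for name, score in donor_nse.items() if score >= min_nse]
print(perfect, mean_only, eligible)
```

Raising `min_nse` keeps only the better-calibrated donors, at the cost of a smaller pool from which to draw the `ndonors` catchments.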
[4]:
# And get the response
# With `asobj` set to False, only the reference (URL) to each output is returned in the response.
# Setting `asobj` to True will retrieve the actual files and copy them locally.
[hydrograph, ensemble] = resp.get(asobj=True)
The hydrograph and ensemble outputs are netCDF files storing the time series. These files are opened by default using xarray, which provides convenient and powerful time series analysis and plotting tools.
[5]:
hydrograph.q_sim
[5]:
<xarray.DataArray 'q_sim' (time: 732, nbasins: 1)>
array([[  0.      ],
[277.916569],
[539.369893],
...,
[ 23.374975],
[ 22.916863],
[ 22.479806]])
Coordinates:
basin_name (nbasins) object 'Saumon'
* time (time) datetime64[ns] 2000-01-01 ... 2002-01-01
Dimensions without coordinates: nbasins
Attributes:
units: m**3 s**-1
long_name: Simulated outflows
[6]:
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
hydrograph.q_sim.plot()
[6]:
[<matplotlib.lines.Line2D at 0x7f66bfc71c18>]
[7]:
print("Max: ", hydrograph.q_sim.max())
print("Mean: ", hydrograph.q_sim.mean())
print("Monthly means: ", hydrograph.q_sim.groupby('time.month').mean(dim='time'))
Max: <xarray.DataArray 'q_sim' ()>
array(539.36989292)
Mean: <xarray.DataArray 'q_sim' ()>
array(119.92177053)
Monthly means: <xarray.DataArray 'q_sim' (month: 12, nbasins: 1)>
array([[160.93082366],
[ 82.52531876],
[ 73.72276794],
[159.31442212],
[153.70776744],
[130.69632646],
[149.09884217],
[135.01868718],
[127.69917325],
[109.88942476],
[109.48369515],
[ 44.82918717]])
Coordinates:
basin_name (nbasins) object 'Saumon'
* month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
Dimensions without coordinates: nbasins
We can also look at the individual simulations from the 5 donors using the `ensemble` output.
[8]:
# Plot the simulations from the 5 donor parameter sets
ensemble.q_sim.isel(nbasins=0).plot.line(hue='realization')
[8]:
[<matplotlib.lines.Line2D at 0x7f66bc4bc208>,
<matplotlib.lines.Line2D at 0x7f66bc483080>,
<matplotlib.lines.Line2D at 0x7f66bc4831d0>,
<matplotlib.lines.Line2D at 0x7f66bc483320>,
<matplotlib.lines.Line2D at 0x7f66bc483470>]
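Beyond plotting each member, the ensemble can be summarized along its `realization` dimension. The sketch below uses a small synthetic dataset in place of the real `ensemble` output, so the values and the dimension order are assumptions. Only the dimension names (`realization`, `time`, `nbasins`) and the variable name `q_sim` are taken from the cells above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the `ensemble` output: 5 donors, 4 days, 1 basin
time = pd.date_range("2000-01-01", periods=4, freq="D")
rng = np.random.default_rng(0)
q = rng.uniform(10.0, 100.0, size=(5, 4, 1))
ens = xr.Dataset(
    {"q_sim": (("realization", "time", "nbasins"), q)},
    coords={"time": time},
)

# Equal-weight mean across donors, comparable to the best estimate for SP or PS
mean_flow = ens.q_sim.mean(dim="realization")

# Range between the highest and lowest donor simulation at each time step
spread = ens.q_sim.max(dim="realization") - ens.q_sim.min(dim="realization")
print(mean_flow.isel(nbasins=0).values)
print(spread.isel(nbasins=0).values)
```

For the equal-weight methods, the mean over `realization` should match the `hydrograph` output according to the help text. For the IDW variants the service applies distance-based weights, so a plain mean would differ.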
[9]:
# You can also get the URLs of the netCDF output files, instead of xarray objects, by setting asobj to False:
[hydrograph, ensemble] = resp.get(asobj=False)
print(hydrograph)
print(ensemble)
http://localhost:9099/outputs/a3317734-8bf3-11ea-99bc-b052162515fb/qsim.nc
http://localhost:9099/outputs/a3317734-8bf3-11ea-99bc-b052162515fb/ensemble.nc
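Once you have the URLs, the files can be saved locally with the standard library. The following is a minimal sketch: the helper names `local_name` and `fetch` are hypothetical, and the download itself is left commented out because it needs a running server that still holds the outputs.

```python
import os
import posixpath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_name(url):
    """Return the file name at the end of a URL path, e.g. 'qsim.nc'."""
    return posixpath.basename(urlparse(url).path)


def fetch(url, directory="."):
    """Download `url` into `directory` and return the local file path."""
    dest = os.path.join(directory, local_name(url))
    path, _headers = urlretrieve(url, dest)
    return path


example_url = "http://localhost:9099/outputs/a3317734-8bf3-11ea-99bc-b052162515fb/qsim.nc"
print(local_name(example_url))

# With a live server, the saved file could then be opened with xarray:
# import xarray as xr
# ds = xr.open_dataset(fetch(hydrograph))
```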