Perform regionalization when no parameter set is available

Here we call the Regionalization WPS service to estimate streamflow (a best estimate and an ensemble) at an ungauged site. The service relies on three pre-calibrated hydrological models and a large hydrometeorological database with catchment attributes (Extended CANOPEX), and it supports several regionalization strategies.

[1]:
from birdy import WPSClient

from example_data import TESTDATA
import datetime as dt
from urllib.request import urlretrieve
import xarray as xr
import numpy as np
from matplotlib import pyplot as plt
import json
import os

# Set environment variable WPS_URL to "http://localhost:9099" to run on the default local server
url = os.environ.get("WPS_URL", "https://pavics.ouranos.ca/twitcher/ows/proxy/raven/wps")
wps = WPSClient(url)
[2]:
# Get the documentation for the method's usage:
help(wps.regionalisation)
Help on method regionalisation in module birdy.client.base:

regionalisation(ts, latitude, longitude, model_name, properties=None, elevation=None, start_date=datetime.datetime(1, 1, 1, 0, 0), end_date=datetime.datetime(1, 1, 1, 0, 0), name='watershed', ndonors=5, min_nse=0.6, method='SP_IDW', area=0.0) method of birdy.client.base.WPSClient instance
    Compute the hydrograph for an ungauged catchment using a regionalization method.

    Parameters
    ----------
    ts : ComplexData:mimetype:`application/x-netcdf`, :mimetype:`application/x-ogc-dods`, :mimetype:`text/plain`, :mimetype:`application/x-zipped-shp`
        Files (text or netCDF) storingdaily liquid precipitation (pr), solid precipitation (prsn), minimum temperature (tasmin), maximum temperature (tasmax), potential evapotranspiration (evspsbl) and observed streamflow (qobs [m3/s]).
    start_date : dateTime
        Start date of the simulation (AAAA-MM-DD). Defaults to the start of the forcing file.
    end_date : dateTime
        End date of the simulation (AAAA-MM-DD). Defaults to the end of the forcing file.
    latitude : float
        Watershed's centroid latitude
    longitude : float
        Watershed's centroid longitude
    name : string
        The name of the watershed the model is run for.
    model_name : {'HMETS', 'GR4JCN', 'MOHYSE'}string
        Hydrological model identifier: {HMETS, GR4JCN, MOHYSE}
    ndonors : integer
        Number of close or similar catchments to use to generate the representative hydrograph at the ungauged site.
    min_nse : float
        Minimum calibration NSE value required to be considered in the regionalization.
    method : {'MLR', 'SP', 'PS', 'SP_IDW', 'PS_IDW', 'SP_IDW_RA', 'PS_IDW_RA'}string
        Regionalisation method to use, one of MLR, SP, PS, SP_IDW,
        PS_IDW, SP_IDW_RA, PS_IDW_RA.

        The available regionalization methods are:

            Multiple linear regression (MLR)
                Ungauged catchment parameters are estimated individually by a linear regression
                against catchment properties.

            Spatial proximity (SP)
                The ungauged hydrograph is an average of the `n` closest catchments' hydrographs.

            Physical similarity (PS)
                The ungauged hydrograph is an average of the `n` most similar catchments' hydrographs.

            Spatial proximity with inverse distance weighting (SP_IDW)
                The ungauged hydrograph is an average of the `n` closest catchments' hydrographs, but
                the average is weighted using inverse distance weighting

            Physical similarity with inverse distance weighting (PS_IDW)
                The ungauged hydrograph is an average of the `n` most similar catchments' hydrographs, but
                the average is weighted using inverse distance weighting

            Spatial proximity with IDW and regression-based augmentation (SP_IDW_RA)
                The ungauged hydrograph is an average of the `n` closest catchments' hydrographs, but
                the average is weighted using inverse distance weighting. Furthermore, the method uses the CANOPEX/USGS
                dataset to estimate model parameters using Multiple Linear Regression. Parameters whose regression r-squared
                is higher than 0.5 are replaced by the MLR-estimated value.

            Physical Similarity with IDW and regression-based augmentation (PS_IDW_RA)
                The ungauged hydrograph is an average of the `n` most similar catchments' hydrographs, but
                the average is weighted using inverse distance weighting. Furthermore, the method uses the CANOPEX/USGS
                dataset to estimate model parameters using Multiple Linear Regression. Parameters whose regression r-squared
                is higher than 0.5 are replaced by the MLR-estimated value.
    properties : ComplexData:mimetype:`application/json`
        json string storing dictionary of properties. The available properties are: area (km2), longitude (dec.degrees), latitude (dec. degrees), gravelius, perimeter (m), elevation (m), slope(%), aspect, forest (%), grass (%), wetland (%), water (%), urban (%), shrubs (%), crops (%) and snowIce (%).
    area : float
        Watershed area (km2)
    elevation : float
        Watershed's mean elevation (m)

    Returns
    -------
    hydrograph : ComplexData:mimetype:`application/x-netcdf`, :mimetype:`application/zip`
        A netCDF file containing the outflow hydrographs (in m3/s) for all subbasins specified as `gauged` in the .rvh file. It reports period-ending time-averaged flows for the preceding time step, as is consistent with most measured stream gauge data (again, the initial flow conditions at the start of the first time step are included). If observed hydrographs are specified, they will be output adjacent to the corresponding modelled  hydrograph.
    ensemble : ComplexData:mimetype:`application/x-netcdf`
        A netCDF file containing the outflow hydrographs (in m3/s) for the basin on which the regionalization method has been applied. The number of outflow hydrographs is equal to the number of donors (ndonors) passed to the method. The average of these hydrographs (either using equal or Inverse-Distance Weights) is the hydrograph generated in "hydrograph".
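
To make the SP_IDW description above concrete, here is a minimal sketch of donor selection followed by inverse distance weighting. It is not the service's implementation: the haversine distance, the plain `1/d` weights, the dictionary layout, and the catchments `A`, `B` and `C` with their values are all assumptions made for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    r = 6371.0  # mean Earth radius (km)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def select_donors(target, catchments, ndonors=5, min_nse=0.6):
    """Return (distance, catchment) pairs for the closest well-calibrated catchments."""
    # Keep only catchments whose calibration NSE reaches the threshold.
    eligible = [c for c in catchments if c["nse"] >= min_nse]
    ranked = sorted(
        (
            (haversine_km(target["latitude"], target["longitude"],
                          c["latitude"], c["longitude"]), c)
            for c in eligible
        ),
        key=lambda pair: pair[0],
    )
    return ranked[:ndonors]


def idw_hydrograph(donors):
    """Inverse-distance-weighted average of the donors' hydrographs.

    Assumes no donor sits exactly on the target (distance > 0).
    """
    weights = [1.0 / dist for dist, _ in donors]
    total = sum(weights)
    n_steps = len(donors[0][1]["q_sim"])
    return [
        sum(w * c["q_sim"][t] for w, (_, c) in zip(weights, donors)) / total
        for t in range(n_steps)
    ]


# Hypothetical target and donor catalogue (made-up values).
target = {"latitude": 54.4848, "longitude": -123.3659}
catchments = [
    {"name": "A", "latitude": 54.0, "longitude": -123.0, "nse": 0.82, "q_sim": [10.0, 12.0, 11.0]},
    {"name": "B", "latitude": 55.5, "longitude": -124.5, "nse": 0.75, "q_sim": [20.0, 22.0, 21.0]},
    # Closest catchment, but rejected because its NSE is below min_nse.
    {"name": "C", "latitude": 54.4, "longitude": -123.3, "nse": 0.40, "q_sim": [90.0, 95.0, 92.0]},
]

donors = select_donors(target, catchments, ndonors=2, min_nse=0.7)
q_idw = idw_hydrograph(donors)
print([c["name"] for _, c in donors])
print(q_idw)
```

Dropping the weights (a plain mean over the donors) would correspond to the unweighted SP variant; replacing the geographic distance with a distance in catchment-attribute space would give a PS-style selection.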

[3]:
# Forcing file. It should contain only weather data (tmin, tmax, rain, snow and, if desired, pet).
# No streamflow is required. Here the path comes from TESTDATA, but you can pass the path to your own netCDF file as a string.
ts = str(TESTDATA['raven-hmets-nc-ts'])

# Model configuration parameters
config = dict(
    start_date=dt.datetime(2000, 1, 1),
    end_date=dt.datetime(2002, 1, 1),
    area=4250.6,
    name='Saumon',
    elevation=843.0,
    latitude=54.4848,
    longitude=-123.3659,
    method='PS', # One of the methods described above
    model_name='HMETS', # One of the three allowed models: HMETS, GR4JCN or MOHYSE
    min_nse=0.7, # Minimum calibration NSE required for a catchment to be considered a donor
    ndonors=5, # Number of donors to use. Between 4 and 8 is usually a robust choice.
    properties=json.dumps({'latitude':54.4848, 'longitude':-123.3659, 'forest':0.4}),
    )

# Call the regionalization service with the time series and the configuration defined above
resp = wps.regionalisation(ts=ts, **config)
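
The `properties` argument is a JSON string, so a typo in a key can easily go unnoticed on the client side. The following is a minimal sketch of a helper that builds the string and checks the keys against the property names listed in the help text above. `make_properties` and `KNOWN_PROPERTIES` are hypothetical names introduced here; how the service itself treats unknown keys is not shown in this notebook.

```python
import json

# Property names copied from the `properties` entry of the help text.
KNOWN_PROPERTIES = {
    "area", "longitude", "latitude", "gravelius", "perimeter", "elevation",
    "slope", "aspect", "forest", "grass", "wetland", "water", "urban",
    "shrubs", "crops", "snowIce",
}


def make_properties(**props):
    """Return a JSON string of catchment properties, rejecting unknown names."""
    unknown = sorted(set(props) - KNOWN_PROPERTIES)
    if unknown:
        raise ValueError(f"Unknown catchment properties: {unknown}")
    return json.dumps(props)


# Same values as in the configuration above.
properties = make_properties(latitude=54.4848, longitude=-123.3659, forest=0.4)
print(properties)
```

The resulting string can be passed as `properties=properties` in place of the inline `json.dumps(...)` call.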
[4]:
# And get the response
# With `asobj` set to False, the response contains only references (URLs) to the outputs.
# Setting `asobj` to True retrieves the actual files and copies them locally.
[hydrograph, ensemble] = resp.get(asobj=True)

The hydrograph and ensemble outputs are netCDF files storing the time series. These files are opened by default using xarray, which provides convenient and powerful time series analysis and plotting tools.

[5]:
hydrograph.q_sim
[5]:
xarray.DataArray 'q_sim' (time: 732, nbasins: 1)
array([[  0.      ],
       [277.916569],
       [539.369893],
       ...,
       [ 23.374975],
       [ 22.916863],
       [ 22.479806]])
Coordinates:
    basin_name  (nbasins) object 'Saumon'
        long_name: Name/ID of sub-basins with simulated outflows
        cf_role: timeseries_id
        units: 1
  * time        (time) datetime64[ns] 2000-01-01 ... 2002-01-01
        standard_name: time
Attributes:
    units: m**3 s**-1
    long_name: Simulated outflows
[6]:
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

hydrograph.q_sim.plot()
[6]:
[<matplotlib.lines.Line2D at 0x7f66bfc71c18>]
[Figure: time series plot of hydrograph.q_sim]
[7]:
print("Max: ", hydrograph.q_sim.max())
print("Mean: ", hydrograph.q_sim.mean())
print("Monthly means: ", hydrograph.q_sim.groupby('time.month').mean(dim='time'))
Max:  <xarray.DataArray 'q_sim' ()>
array(539.36989292)
Mean:  <xarray.DataArray 'q_sim' ()>
array(119.92177053)
Monthly means:  <xarray.DataArray 'q_sim' (month: 12, nbasins: 1)>
array([[160.93082366],
       [ 82.52531876],
       [ 73.72276794],
       [159.31442212],
       [153.70776744],
       [130.69632646],
       [149.09884217],
       [135.01868718],
       [127.69917325],
       [109.88942476],
       [109.48369515],
       [ 44.82918717]])
Coordinates:
    basin_name  (nbasins) object 'Saumon'
  * month       (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
Dimensions without coordinates: nbasins

We can also inspect the individual simulations from the 5 donors, which are stored in the `ensemble` output.

[8]:
# Plot the simulations from the 5 donor parameter sets
ensemble.q_sim.isel(nbasins=0).plot.line(hue='realization')
[8]:
[<matplotlib.lines.Line2D at 0x7f66bc4bc208>,
 <matplotlib.lines.Line2D at 0x7f66bc483080>,
 <matplotlib.lines.Line2D at 0x7f66bc4831d0>,
 <matplotlib.lines.Line2D at 0x7f66bc483320>,
 <matplotlib.lines.Line2D at 0x7f66bc483470>]
[Figure: q_sim time series, one line per donor realization]
[9]:
# With asobj=False, the response contains links to the netCDF files instead of xarray objects:
[hydrograph, ensemble] = resp.get(asobj=False)
print(hydrograph)
print(ensemble)
http://localhost:9099/outputs/a3317734-8bf3-11ea-99bc-b052162515fb/qsim.nc
http://localhost:9099/outputs/a3317734-8bf3-11ea-99bc-b052162515fb/ensemble.nc
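
Once you have these references, the files can be saved locally with `urlretrieve`, which is already imported in the first cell. The helper below is a minimal sketch: `download_output` is a hypothetical name, and it assumes the URL path ends with the file name (as in `qsim.nc` above) and that the output is reachable without authentication.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def download_output(url, dest_dir="."):
    """Download a WPS output reference and return the local file path."""
    # Use the last component of the URL path (e.g. "qsim.nc") as the file name.
    filename = os.path.basename(urlparse(url).path)
    local_path = os.path.join(dest_dir, filename)
    urlretrieve(url, local_path)
    return local_path
```

With the references from the cell above, `xr.open_dataset(download_output(hydrograph))` would then open the downloaded file with xarray.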