03 April 2014

[R] Following the Hydromad Tutorial by Felix Andrews

1. Getting to know Cotter River Catchment I made this post to share my experience following the Hydromad Tutorial by Felix Andrew. Hydromad is an R package that deals with hydrological modeling. I think mostly it works with hydrograph analysis. I am a newbie at this and try to learn from the tutorial.
You can visit download the package via R install.package command and the official website is (http://hydromad.catchment.org). You can download the tutorial at (http://hydromad.catchment.org/downloads/tutorial.pdf).
I will breakdown the post in some sub sections based on my learning steps. And here we go...

1. Getting to know Cotter River Catchment

The tutorial uses Cotter River Catchment data. The Cotter River catchment upper reaches are part of Namadgi National Park and the catchment lies within the borders of the ACT. The river itself starts high up on the Brindabella Mountains and is fed by melting snow and rainfall. Corin, Bendora and Cotter dams are part of the Cotter River catchment, which are heavily protected from outside activities that may affect water quality. Water stored in the highest dam, Corin, is released into the Cotter River to add to the level of storage in Bendora dam. From Bendora, a large diameter pipeline brings the water by gravity flow to the treatment plant on Mount Stromlo to the west of Canberra. The Bendora gravity main is 20 kilometres long and has the capacity to carry 310 million litres of water per day. You may visit The ACT Water Website.

2. Getting to know the data

Let's type this code.
> # R script following Hydromad Tutorial by Felix Andrews
> # link: http://hydromad.catchment.org/downloads/tutorial.pdf
> # ------------------------------

> # open hydromad
> library(hydromad)

> # setting up working directory 
> setwd("~/Documents/2014/SYDNEY/R practice-hydromad") 
Open the Cotter dataset. It is embedded in the Hydromad package. So you don't have to download it separately. Basically the Cotter dataset is a cluster of three measurements: daily rainfall (in P column), daily river water discharge (in Q column), and daily max Temp (in E column). The measurement started in 1966 and stopped in 2003. We can see it by typing this code.
> # checking data
> head(Cotter)
> tail(Cotter)
> plot(Cotter, screens=c("streamflow (mm/day)", 
>                         "areal rain (mm/day)", 
>                          "temperature (deg.c)"), xlab=NULL)
> plot(window(Cotter, start="1974-01-01", 
>               end="1975-01-01"))
> monthlyPQE <- aggregate="" as.yearmon="" mean="" otter=""> plot(monthlyPQE, screens=c("streamflow (mm/day)", 
>                              "areal rain (mm/day)",  
>                               "temperature (deg.c)"),
>                              xlab=NULL)
You should see this result following the "head" command.
               P         Q    E
1966-05-01  0.00 0.1755797 14.1
1966-05-02  0.00 0.1689892 16.7
1966-05-03  0.00 0.1670385 18.4
1966-05-04  0.00 0.1651716 19.0
1966-05-05 48.12 1.1826149 16.6
1966-05-06  1.20 0.5459561 14.7
And this one, following the "tail" command.
P         Q    E
2003-06-07 10.3 0.2978946 10.5
2003-06-08  0.0 0.2852959 12.9
2003-06-09  0.0 0.2649878 13.4
2003-06-10  0.0 0.2475432 14.9
2003-06-11  0.0 0.2368203 15.0
2003-06-12  0.0 0.2325378 16.1
We can also check the missing data in our dataset using complete.case and calculating the runoff ratio value dividing sum of Q with sum of P.
> ok <- 1:2="" complete.cases="" otter=""> with(Cotter, sum(Q[ok])/sum(P[ok]))
Then we should see the result is [1] 0.279 and according to the tutorial, we can assume that we probably have the right data series and units.

2. Cross-correlation

Why do we do this? To estimate the delay (lag) time between rainfall and a consequent stream. What function do we need? By using: • rollccf function • estimateDelay function
What do the functions do? Both functions pick out the lag time corresponding to the maximum correlation between rainfall and rises in stream.
> # calculate rolling cross-correlation
> # between rainfall and streamflow rises
> # from 1980 to 1990 data period
> x <- otter="" rollccf=""> plot(x, xlim=extendrange(as.Date(c("1980-01-01", 
                                     "1990-01-01"))))

> # estimateDelay(x)
> delay <- estimatedelay="" otter=""> delay
And the result in the Cotter Case this is 0 days. [1] 0

3. Model specification and calibration

3.1 Model specification

At this level, we are going to define the model:
  1. Define the model time period
  2. Define the SMA (Soil Moisture Accounting) component
  3. Define the routing component
The tutorial uses IHACRES model of Jakeman and Hornberger (1993).
We will divide it in to three periods: 70s, 80s, and 90s, as the following code. Then we set the 90s period as calibration data.
> # model spec
> # define data period name "data70s", "data80s", "data90s"
> data70s <- otter="" start="1970-01-01" window="">                   end="1979-12-31")
> data80s <- otter="" start="1980-01-01" window="">                   end="1989-12-31")
> data90s <- code="" end="1999-12-31" otter="" start="1990-01-01" window="">
According to the tutorial, the Hydromad is consisting of two components: the sma (Soil Moisture Accounting) and routing. So we will use the "expuh" routing modul and define:
  • sma = cwi (Catchment Wetness Index)
  • tau_s = change from 5 to 100 (slow flow component)
  • tau_q = change from 0 to 5 (quick flow component)
  • v_s = change from 0 to 1 (fraction of volume in slow flow component)
> # define routing and parameters
> # using: data90s periode, "expuh" routing
> # cwi (catchment wetness index) as 
> # sma (soil moisture accounting)
> # tau_s=5-100 (slow flow component), 
> # tau_q=0-5 (fast flow comp), 
> # v_s=0-1 (fraction volume in slow comp)
> cotterModel <- data90s="" hydromad="" routing="expuh" sma="cwi" tau_q="c(0,5)," tau_s="c(5,100),"> v_s=c(0,1))
> print(cotterModel)
> # saving result to txt file
> out<-capture .output="" cottermodel="" print=""> cat(out,file="cotterModel1.txt",sep="\n",append=TRUE)
You should see the following result.
Hydromad model with "cwi" SMA and "expuh" routing:
Start = 1990-01-01, End = 1999-12-31

SMA Parameters:
      lower upper     
tw        0   100     
f         0     8     
scale    NA    NA     
l         0     0 (==)
p         1     1 (==)
t_ref    20    20 (==)
Routing Parameters:
      lower upper  
tau_s     5   100  
tau_q     0     5  
v_s       0     1 

3.2 Model calibration using"fitByOptim"

We need to the calibration to make sure that our model can perform as we expect it to. Hydromad provide "fitByoptim" function to do that. We will calibrate our "cotterModel" and the result will be stored as "cotterFit". If your machine kind of freeze, and you will see stop sign (on your console window)going red. Don't be afraid, because Hydromad is working.
> # model calibration using"fitByOptim"
> cotterModel <- cottermodel="" rfit="list(" routing="armax" sriv="" update="">                order=c(n=2, m=1)))
> cotterFit <- code="" cottermodel="" fitbyoptim="" method="PORT" samples="100,">

4. Try to simulate other period

Then we can try to simulate other period in the dataset using our model. Remember, We have divided our Cotter dataset in to: 70s, 80s, and 90s. Also, we have used the 90s data as calibration data in the model. We will store the result of 70s model to sim70s and 80s model to sim80s.
> # remember data90s is the calib period, 
> # then we try to simulate data70s and data80s 
> sim70s <- cotterfit="" newdata="data70s)" update=""> sim80s <- cotterfit="" newdata="data80s)" update=""> sim70s
> sim80s
We can see the results by typing the last two commands. Then you should see the following for 70s data and 80s data respectively.
Hydromad model with "cwi" SMA and "armax" routing:
Start = 1970-01-01, End = 1979-12-31

SMA Parameters:
       tw          f      scale          l          p      t_ref  
29.181451   1.891884   0.001705   0.000000   1.000000  20.000000  
Routing Parameters:
    a_1      a_2      b_0      b_1    delay  
 1.5227  -0.5379   0.1567  -0.1415   0.0000  
TF Structure: S + Q (two stores in parallel)
    Poles:0.557, 0.9656 

Fit: ($fit.result)
fitByOptim(MODEL = cotterModel, method = "PORT", samples = 100)
     194 function evaluations in 71.61 seconds 

Routing fit info:  list(converged = TRUE, iteration = 6) 
and
Hydromad model with "cwi" SMA and "armax" routing:
Start = 1980-01-01, End = 1989-12-31

SMA Parameters:
       tw          f      scale          l          p      t_ref  
29.181451   1.891884   0.001705   0.000000   1.000000  20.000000  
Routing Parameters:
    a_1      a_2      b_0      b_1    delay  
 1.5227  -0.5379   0.1567  -0.1415   0.0000  
TF Structure: S + Q (two stores in parallel)
    Poles:0.557, 0.9656 

Fit: ($fit.result)
fitByOptim(MODEL = cotterModel, method = "PORT", samples = 100)
     194 function evaluations in 71.61 seconds 

Routing fit info:  list(converged = TRUE, iteration = 6) 

5. Run model verification

Then we can run model verification to our new models: the 70s and 80s (excluding 90s data). Why? It's because we have already used the 90s data as calibration data.
> # Run model verification to whole dataset 
> (70s,80s), excluding data90s
> # make dataVerif frame to store 70s and 80s period from Cotter
> dataVerif <- cotter=""> # we verify Q and excluding data90s by making the values as "NA"
> dataVerif$Q[time(data90s)] <- na=""> # run dataVerif, save in simVerif
> simVerif <- code="" cotterfit="" newdata="dataVerif)" update="">
We also can group all the model using "runlist" and store it to "allModels".
> # grouping all models using "runlist" func
> allModels <- calibration="cotterFit," runlist="" sim70s="" sim80s="" simverif=""> # summary of model performance
> summary(allModels) 
> print(allModels)
You should see the result like this.
List of model runs:
$calibration

Hydromad model with "cwi" SMA and "armax" routing:
Start = 1990-01-01, End = 1999-12-31

SMA Parameters:
       tw          f      scale          l          p      t_ref  
29.181451   1.891884   0.001705   0.000000   1.000000  20.000000  
Routing Parameters:
    a_1      a_2      b_0      b_1    delay  
 1.5227  -0.5379   0.1567  -0.1415   0.0000  
TF Structure: S + Q (two stores in parallel)
    Poles:0.557, 0.9656 

Fit: ($fit.result)
fitByOptim(MODEL = cotterModel, method = "PORT", samples = 100)
     194 function evaluations in 71.61 seconds 

Routing fit info:  list(converged = TRUE, iteration = 6) 

$sim70s

Hydromad model with "cwi" SMA and "armax" routing:
Start = 1970-01-01, End = 1979-12-31

SMA Parameters:
       tw          f      scale          l          p      t_ref  
29.181451   1.891884   0.001705   0.000000   1.000000  20.000000  
Routing Parameters:
    a_1      a_2      b_0      b_1    delay  
 1.5227  -0.5379   0.1567  -0.1415   0.0000  
TF Structure: S + Q (two stores in parallel)
    Poles:0.557, 0.9656 

Fit: ($fit.result)
fitByOptim(MODEL = cotterModel, method = "PORT", samples = 100)
     194 function evaluations in 71.61 seconds 

Routing fit info:  list(converged = TRUE, iteration = 6) 

$sim80s

Hydromad model with "cwi" SMA and "armax" routing:
Start = 1980-01-01, End = 1989-12-31

SMA Parameters:
       tw          f      scale          l          p      t_ref  
29.181451   1.891884   0.001705   0.000000   1.000000  20.000000  
Routing Parameters:
    a_1      a_2      b_0      b_1    delay  
 1.5227  -0.5379   0.1567  -0.1415   0.0000  
TF Structure: S + Q (two stores in parallel)
    Poles:0.557, 0.9656 

Fit: ($fit.result)
fitByOptim(MODEL = cotterModel, method = "PORT", samples = 100)
     194 function evaluations in 71.61 seconds 

Routing fit info:  list(converged = TRUE, iteration = 6) 

$simVerif

Hydromad model with "cwi" SMA and "armax" routing:
Start = 1966-05-01, End = 2003-06-12

SMA Parameters:
       tw          f      scale          l          p      t_ref  
29.181451   1.891884   0.001705   0.000000   1.000000  20.000000  
Routing Parameters:
    a_1      a_2      b_0      b_1    delay  
 1.5227  -0.5379   0.1567  -0.1415   0.0000  
TF Structure: S + Q (two stores in parallel)
    Poles:0.557, 0.9656 

Fit: ($fit.result)
fitByOptim(MODEL = cotterModel, method = "PORT", samples = 100)
     194 function evaluations in 71.61 seconds 

Routing fit info:  list(converged = TRUE, iteration = 6) 
Then we can continue evaluate our model by using this code provided in the tutorial. However I am still continuing to learn it to know the other evaluation functions. After each chunck of code, you should see result plots.
> # plotting cotterFit with defined period
> plot(fitted(cotterFit), with.P=TRUE, xlim=as.Date(c("1994-01-01", "1997-01-01")))
then
> # plotting allModels
> plot(allMods[2:3], scales = list(y = list(log = TRUE)))
> summary(simAll, breaks="5 years")
I will update this post as I continue to exercise with the Hydromad. Later on I might show case how it works with my own data.
{@dasaptaerwin}

No comments: