History Match Freyberg using K and Recharge
This notebook continues where the freyberg_k.ipynb notebook left off. In the previous notebook we calibrated the Freyberg model by adjusting a single parameter, K. We were able to obtain excellent fits with measured heads, and apparently forecast uncertainties were very low. This was of course wrong, as we will see over the course of the next few tutorials.
In this tutorial we introduce the concepts of non-uniqueness and parameter correlation. We will repeat the same calibration and uncertainty analysis, but with two adjustable parameters: hydraulic conductivity (K) and recharge (rch0).
In the previous tutorial, recharge parameters were fixed. Remember that when a parameter is fixed (i.e. not adjustable during calibration or uncertainty analysis), this implies that it is perfectly known. Rarely, if ever, is a parameter value perfectly known. Recharge in particular is hard to measure in the field and difficult to upscale to a model domain. It is never perfectly known and should be included in the parameter estimation process. Let's get to it.
Admin
We have provided some pre-cooked PEST dataset files, wrapped around the modified Freyberg model. This is the same dataset introduced in the "freyberg_pest_setup" and "freyberg_k" notebooks.
The functions in the next cell import required dependencies and prepare a folder for you. This folder contains the model files and a preliminary PEST setup. Run the cells, then inspect the new folder named "freyberg_mf6" which has been created in your tutorial directory. (Just press shift+enter to run the cells).
import sys
import os
import warnings
warnings.filterwarnings("ignore")
warnings.filterwarnings("ignore", category=DeprecationWarning)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import shutil
sys.path.insert(0,os.path.join("..", "..", "dependencies"))
import pyemu
import flopy
assert "dependencies" in flopy.__file__
assert "dependencies" in pyemu.__file__
sys.path.insert(0,"..")
import herebedragons as hbd
plt.rcParams['font.size'] = 10
pyemu.plot_utils.font =10
# folder containing original model files
org_d = os.path.join('..', '..', 'models', 'monthly_model_files_1lyr_newstress')
# a dir to hold a copy of the org model files
tmp_d = os.path.join('freyberg_mf6')
if os.path.exists(tmp_d):
shutil.rmtree(tmp_d)
shutil.copytree(org_d,tmp_d)
# get executables
hbd.prep_bins(tmp_d)
# get dependency folders
hbd.prep_deps(tmp_d)
# run our convenience functions to prepare the PEST and model folder
hbd.prep_pest(tmp_d)
ins file for heads.csv prepared.
ins file for sfr.csv prepared.
noptmax:0, npar_adj:1, nnz_obs:24
written pest control file: freyberg_mf6\freyberg.pst
<pyemu.pst.pst_handler.Pst at 0x167fe946ac0>
Reminder - the modified-Freyberg model
Just a quick reminder of what the model looks like and what we are doing.
It is a one-layer model. A river runs north-south, represented with the SFR package (green cells in the figure). On the southern border there is a GHB (cyan cells). No-flow cells are shown in black. Pumping wells are shown with red cells.
Time-series of measured heads are available at the locations marked with black X’s. River flux is also measured at three locations (headwater, tailwater and gage; not displayed).
The simulation starts with a steady state stress period, followed by twelve transient stress periods. These represent the historic period, for which measured data are available.
A subsequent twelve transient stress periods representing a period in the future. Modelling is undertaken to assess selected forecasts during the simulated period.
hbd.plot_freyberg(tmp_d)
The PEST Control File
You may wish to explore the freyberg_mf6 folder which has been created in the tutorial directory. In it you will find a PEST control file named freyberg.pst.
Let's use pyemu to load the PEST control file and check some details.
pst = pyemu.Pst(os.path.join(tmp_d, 'freyberg.pst'))
Let’s check what parameters are listed in the control file:
pst.par_names
['hk1', 'ne1', 'rch0', 'rch1']
Which of these are adjustable?
pst.adj_par_names
['hk1']
So only hk1 is adjustable. We want to make rch0 adjustable as well. (rch0 is a factor by which mean historical recharge is multiplied.)
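As a quick sketch of what such a multiplier parameter does (hypothetical array values, not the actual model input):

```python
import numpy as np

# hypothetical mean historic recharge array (model length/time units)
base_recharge = np.full((3, 3), 1.0e-4)

# rch0 scales the whole array; rch0 = 1.0 reproduces the mean historic rates
rch0 = 1.3
scaled_recharge = base_recharge * rch0
print(scaled_recharge[0, 0])
```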
We can do so by changing the parameter transform in the parameter data section of the control file:
par = pst.parameter_data
par.head()
| parnme | partrans | parchglim | parval1 | parlbnd | parubnd | pargp | scale | offset | extra | dercom |
|---|---|---|---|---|---|---|---|---|---|---|
| hk1 | log | factor | 5.00 | 0.050 | 500.00 | hk | 1.0 | 0.0 | NaN | 1.0 |
| ne1 | fixed | factor | 0.01 | 0.005 | 0.02 | porosity | 1.0 | 0.0 | NaN | 1.0 |
| rch0 | fixed | factor | 1.00 | 0.500 | 2.00 | rch0 | 1.0 | 0.0 | NaN | 1.0 |
| rch1 | fixed | factor | 1.00 | 0.500 | 2.00 | rch1 | 1.0 | 0.0 | NaN | 1.0 |
Change partrans of the rch0 parameter to log:
par.loc['rch0', 'partrans'] = 'log'
par.head()
| parnme | partrans | parchglim | parval1 | parlbnd | parubnd | pargp | scale | offset | extra | dercom |
|---|---|---|---|---|---|---|---|---|---|---|
| hk1 | log | factor | 5.00 | 0.050 | 500.00 | hk | 1.0 | 0.0 | NaN | 1.0 |
| ne1 | fixed | factor | 0.01 | 0.005 | 0.02 | porosity | 1.0 | 0.0 | NaN | 1.0 |
| rch0 | log | factor | 1.00 | 0.500 | 2.00 | rch0 | 1.0 | 0.0 | NaN | 1.0 |
| rch1 | fixed | factor | 1.00 | 0.500 | 2.00 | rch1 | 1.0 | 0.0 | NaN | 1.0 |
We can now re-write the control file with the updated parameter data section:
pst.write(os.path.join(tmp_d, 'freyberg_k_r.pst'))
noptmax:0, npar_adj:2, nnz_obs:24
Run PEST once, to check everything is copacetic:
pyemu.os_utils.run("pestpp-glm freyberg_k_r.pst",cwd=tmp_d)
Let’s just check that it worked by loading the control file again and checking results:
pst = pyemu.Pst(os.path.join(tmp_d, 'freyberg_k_r.pst'))
assert pst.phi, "Something ain't right cap'n"
pst.phi
29.203134704019217
Awesome, good to go. We can set NOPTMAX to 20 and let PEST loose. We are actually going to do this twice: once with K and rch0 adjustable (freyberg_k_r.pst) and once with only K adjustable (freyberg_k.pst). The latter is a repeat of the previous tutorial; we do it again just so that we can compare outcomes.
pst.control_data.noptmax = 20
pst.write(os.path.join(tmp_d, 'freyberg_k_r.pst'))
noptmax:20, npar_adj:2, nnz_obs:24
pst = pyemu.Pst(os.path.join(tmp_d, 'freyberg.pst'))
pst.control_data.noptmax = 20
pst.write(os.path.join(tmp_d, 'freyberg_k.pst'))
noptmax:20, npar_adj:1, nnz_obs:24
Run PEST
Okay, let’s run this thing.
Because we call a program from within the Jupyter Notebook, you have to look at the terminal window that you used to start the notebook to see the screen report of the run. So, when executing this next block, watch your terminal window. It will say "Simulation complete..." when finished.
Note: And/or wait until the standard out reports a “0” below this next block (=when the run is finished) before going on.
# use pyemu to run a command line, run pestpp on the pst file defined on the import
pyemu.os_utils.run("pestpp-glm freyberg_k.pst", cwd=tmp_d)
# use pyemu to run a command line, run pestpp on the pst file defined on the import
pyemu.os_utils.run("pestpp-glm freyberg_k_r.pst", cwd=tmp_d)
Explore Results
PEST writes lots of useful information to the *.rec file. It also outputs a series of other useful files. What outputs are recorded depends on which version of PEST or PEST++ is being used. Here we use PEST++GLM. The following section demonstrates useful information that can be found in some of these outputs. Subsequent tutorials will address others.
Objective Function
First let’s look at the measurement objective function (Phi), which is calculated using the sum of squared weighted residuals. First we’ll look at a table, then plots.
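As a reminder of the definition, here is a minimal sketch of how Phi is computed (hypothetical residuals and weights, not values read from the PEST files):

```python
import numpy as np

# hypothetical residuals (simulated - observed) and observation weights
residuals = np.array([0.5, -0.2, 1.0])
weights = np.array([1.0, 1.0, 0.5])

# Phi: sum of squared weighted residuals
phi = np.sum((weights * residuals) ** 2)
print(phi)  # 0.5^2 + 0.2^2 + 0.5^2 = 0.54
```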
# make a dataframe "df_obj" that shows the contents of the pst file casename with the extension .iobj
# .iobj = PEST++ output file that has the objective function by iteration
df_obj_k = pd.read_csv(os.path.join(tmp_d, "freyberg_k.iobj"),index_col=0)
df_obj_k_r = pd.read_csv(os.path.join(tmp_d, "freyberg_k_r.iobj"),index_col=0)
Let's compare the progress in Phi for both cases:
# plot out the dataframe that was shown as a table above
plt.rcParams["font.size"] = 10
fig, ax = plt.subplots()
df_obj_k.loc[:,["total_phi","model_runs_completed"]].plot(y='total_phi', ax=ax, label='k')
df_obj_k_r.loc[:,["total_phi","model_runs_completed"]].plot(y='total_phi', ax=ax, label='k+r');
Looks like giving PEST more "flexibility" (i.e. more parameters) allows it to get a better fit with measured data (the Phi obtained by k+r is lower).
What about the parameter uncertainties?
Read in the parameter uncertainty *.csv files which PEST++GLM wrote:
df_paru_k = pd.read_csv(os.path.join(tmp_d, "freyberg_k.par.usum.csv"),index_col=0)
df_paru_k
| name | prior_mean | prior_stdev | prior_lower_bound | prior_upper_bound | post_mean | post_stdev | post_lower_bound | post_upper_bound |
|---|---|---|---|---|---|---|---|---|
| hk1 | 0.69897 | 1 | -1.30103 | 2.69897 | 0.505884 | 0.029466 | 0.446952 | 0.564816 |
df_paru_k_r = pd.read_csv(os.path.join(tmp_d, "freyberg_k_r.par.usum.csv"),index_col=0)
df_paru_k_r
| name | prior_mean | prior_stdev | prior_lower_bound | prior_upper_bound | post_mean | post_stdev | post_lower_bound | post_upper_bound |
|---|---|---|---|---|---|---|---|---|
| hk1 | 0.69897 | 1.000000 | -1.30103 | 2.69897 | 0.688363 | 0.144337 | 0.399689 | 0.977037 |
| rch0 | 0.00000 | 0.150515 | -0.30103 | 0.30103 | 0.120730 | 0.100419 | -0.080109 | 0.321568 |
Hmm - recharge prior uncertainty = 0.00000? No, it's not 0.0. Recall that we log-transformed this parameter, so its uncertainty is reported in log10 space. So 10^0 = 1, which is the initial value in the PEST control file.
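A quick sketch of the back-transform from log10 space to native parameter space (values taken from the table above):

```python
# rch0 summary statistics are reported in log10 space
prior_mean_log = 0.0       # prior mean of rch0 (log10 space)
post_mean_log = 0.120730   # posterior mean of rch0 (log10 space)

# back-transform: native value = 10 ** (log10 value)
print(10 ** prior_mean_log)   # 1.0 -> the initial value in the control file
print(10 ** post_mean_log)    # roughly 1.32 in native space
```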
How does the uncertainty reduction for hk1 change when rch0 is included? It's easier to see if we plot them. Here's the posterior and prior standard deviation for the new K+R results (left) next to the K-only results (right):
df_paru_concat = pd.concat([df_paru_k_r,df_paru_k],join="outer",axis=1,keys=["k+r","k_only"])
df_paru_concat.sort_index(inplace=True,axis=1)
for pname in df_paru_concat.index:
ax = df_paru_concat.loc[pname,(slice(None),("prior_stdev","post_stdev"))].plot(kind="bar")
ax.set_title(pname)
plt.show()
Damn…so hk1 uncertainty increases if we include rch0.
Forecasts
Let's look at the forecast uncertainties, first for the single-parameter k case. Sweet - nice and tight.
df_predu_k = pd.read_csv(os.path.join(tmp_d, "freyberg_k.pred.usum.csv"),index_col=0)
df_predu_k.loc[:,"reduction"] = 100.0 * (1.0 - (df_predu_k.post_stdev / df_predu_k.prior_stdev))
figs, axes = pyemu.plot_utils.plot_summary_distributions(df_predu_k,subplots=True)
figs[0].tight_layout()
And now for the two-parameter case (k+r). Let's compare it to the k-only run (shown in green in the plot below). Some forecast uncertainties have increased (the distributions are shorter and wider).
df_predu_k_r = pd.read_csv(os.path.join(tmp_d, "freyberg_k_r.pred.usum.csv"),index_col=0)
df_predu_k_r.loc[:,"reduction"] = 100.0 * (1.0 - (df_predu_k_r.post_stdev / df_predu_k_r.prior_stdev))
i=1
fig = plt.figure(figsize=(15,5))
for forecast in df_predu_k_r.index:
ax = fig.add_subplot(1,4,i)
i+=1
pyemu.plot_utils.plot_summary_distributions(df_predu_k_r.loc[[forecast],:],ax=ax)
pyemu.plot_utils.plot_summary_distributions(df_predu_k.loc[[forecast],:],ax=ax,pt_color='g')
ax.set_title(forecast)
ax.grid()
fig.tight_layout()
So…which forecasts are influenced by the rch0 parameter? The higher and narrower the peak, the more certain we are. Did the uncertainty decrease or increase when we added the second parameter?
Which forecasts had little change? Why do you think that is?
And which case (K or K+R) provides the more robust uncertainty estimate?
Hold up!
Look at this slightly modified version of the groundwater governing equation from Anderson et al. (2015) below. Is this problem well posed? That is, if recharge increased (represented by an increase in $W^*$) and at the same time K increased (represented by an increase in $q$), could they offset each other so that the right-hand side stays the same? What is this called? Yeah, that's right: non-uniqueness.
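For reference, one common form of that governing equation (reconstructed here as a sketch after Anderson et al., 2015, with $W^*$ the source/sink term, e.g. recharge, and $S_s$ specific storage):

$$
\frac{\partial}{\partial x}\left(K_x \frac{\partial h}{\partial x}\right) +
\frac{\partial}{\partial y}\left(K_y \frac{\partial h}{\partial y}\right) +
\frac{\partial}{\partial z}\left(K_z \frac{\partial h}{\partial z}\right) + W^* = S_s \frac{\partial h}{\partial t}
$$

An increase in $K$ scales up the flux-divergence terms, which an increase in $W^*$ can offset, leaving the heads (and hence the fit to head data) unchanged.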
Recall the Bravo et al. (2002) trough when calibrating K and R with only heads:
Let’s look at the correlation of our two parameters
In the next cell we will use the pyemu.Schur object. (See the "intro to pyemu" and "intro to FOSM" tutorials for more details.)
sc = pyemu.Schur(os.path.join(tmp_d, "freyberg_k_r.jcb"))
cov = pyemu.Cov(sc.xtqx.x, names=sc.pst.adj_par_names)
R = cov.to_pearson()
R.df()
| | hk1 | rch0 |
|---|---|---|
| hk1 | 1.000000 | -0.987643 |
| rch0 | -0.987643 | 1.000000 |
Wow - correlated. Recall that Hill and Tiedeman (2007) suggest that correlation > 0.95 or so is not estimable. Even though estimating both R and K using only head observations is not possible because of this correlation, PEST++ gave you an answer.
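For intuition, the Pearson correlation matrix returned by to_pearson() can be sketched directly from a covariance matrix (hypothetical 2x2 covariance below, not the one computed above):

```python
import numpy as np

# hypothetical parameter covariance matrix for [hk1, rch0]
cov = np.array([[4.0, -1.9],
                [-1.9, 1.0]])

# Pearson correlation: R_ij = C_ij / sqrt(C_ii * C_jj)
std = np.sqrt(np.diag(cov))
R = cov / np.outer(std, std)
print(R)  # off-diagonal terms are -0.95: strongly (negatively) correlated
```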
So how did PEST++ deal with this intractable correlation? Let's check how parameter values changed throughout the PEST run. Parameter values per iteration are recorded in the freyberg_k_r.ipar file.
ipar = pd.read_csv(os.path.join(tmp_d, 'freyberg_k_r.ipar'))
ipar
| | iteration | hk1 | ne1 | rch0 | rch1 |
|---|---|---|---|---|---|
| 0 | 0 | 5.00000 | 0.01 | 1.00000 | 1 |
| 1 | 1 | 5.05697 | 0.01 | 1.30061 | 1 |
| 2 | 2 | 4.94950 | 0.01 | 1.33807 | 1 |
| 3 | 3 | 4.87936 | 0.01 | 1.32047 | 1 |
| 4 | 4 | 4.89034 | 0.01 | 1.32352 | 1 |
Let's check the ratio of hk1 to rch0:
ipar.loc[:, 'ratio'] = ipar.rch0 / ipar.hk1
ipar.ratio.plot()
plt.ylabel('rch0 / hk1')
plt.xlabel('iteration');
So PEST++ found a combination of hk1 and rch0 that provides an "optimal" fit. It then continued to test changes in parameter values; however, it retained the ratio between these correlated parameters. As this resulted in no improvement of Phi, PEST halted operation (in this case after 3 iterations of no improvement).
Effectively, PEST has determined the optimal ratio of these two parameters, but not necessarily their "correct" absolute values.
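A toy illustration of why only the ratio is constrained (a purely hypothetical model, not the Freyberg equations): if simulated head depends on recharge and K only through their ratio, any pair with the same ratio fits the head data equally well.

```python
# toy head model: h = h0 + (R / K) * c, so only the ratio R/K is identifiable
def sim_head(K, R, h0=30.0, c=100.0):
    return h0 + (R / K) * c

# two very different (K, R) pairs with the same ratio give identical heads,
# hence identical Phi -- head observations alone cannot tell them apart
h1 = sim_head(K=5.0, R=1.25)
h2 = sim_head(K=50.0, R=12.5)
print(h1, h2)  # 55.0 55.0
```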
Last point
Do you believe that the forecast values are optimal, or even defensible? Should we believe the forecast uncertainty either? Let’s again look at the forecasts with the “Truth” (dashed black lines in the plot below).
figs, axes = pyemu.plot_utils.plot_summary_distributions(df_predu_k_r,subplots=True)
for ax in axes:
fname = ax.get_title().lower()
ylim = ax.get_ylim()
v = pst.observation_data.loc[fname,"obsval"]
ax.plot([v,v],ylim,"k--")
ax.set_ylim(0, ylim[-1])
figs[0].tight_layout()
Yikes - very much missing the truth. Uncertainty analysis is still