Uncertainty analysis involves propagating the uncertainty in model inputs (parameters) to estimate its impact on model outputs. From an uncertainty analysis you can also learn how important specific parameters (and other model components) are to an end result.
It is recommended that uncertainty analysis be performed iteratively. Collecting detailed data for parameters is time consuming (and therefore costly). Sensitivity analysis can help identify which parameters are most important, and therefore where to focus the data gathering effort.
An uncertainty analysis in MERLIN-Expo starts with identifying uncertain parameters. These parameters are assigned probability density functions that describe the knowledge you have about each parameter value. Before a simulation is run, you enter settings such as the number of iterations and which parameters to include. You then run the simulations, after which you create charts and tables.
Many of the uncertain parameters in the MERLIN-Expo library have been given probability density functions (PDFs) for several contaminants. The documentation of each model describes how the PDFs have been derived, and also suggests how you can improve the data for your site and how you can derive PDFs for contaminants that are not included in the library.
Not all parameters are associated with (relevant) uncertainties, though it depends on your situation. If you study a specific agricultural land, the parameter for the surface area of the soil would not be uncertain. If you have a single local measurement of the soil density, this parameter could also be considered certain.
The pre-defined PDFs for site-specific parameters often have very wide distributions because they are based on measurements from many sites. If you have access to local data, you should use it instead of the pre-defined PDF. If you have only one measurement, use it and state that the parameter is not uncertain. If you have several measurements, you should use them to derive a new PDF.
MERLIN-Expo does not include functions to fit your data to a distribution, but the software resources section lists some tools you can use. Also, there are generic distributions in MERLIN-Expo that allow you to enter measurements directly such as the general distribution and the histogram distribution.
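As an illustration of deriving a PDF from several local measurements, the following Python sketch estimates a mean and standard deviation and prints them in the named parameter form used for normal distributions in MERLIN-Expo. The measurement values and the function name are hypothetical, and assuming a normal distribution is itself a modelling choice that you should check against your data. A dedicated fitting tool will usually be a better option for anything beyond this simple case.

```python
import statistics

def normal_pdf_from_measurements(measurements, trmin=0.0):
    """Estimate a truncated normal PDF string from local measurements.

    Assumes the data are roughly normally distributed; this is an
    illustrative helper, not part of MERLIN-Expo.
    """
    if len(measurements) < 2:
        raise ValueError("at least two measurements are needed to estimate a spread")
    mean = statistics.mean(measurements)
    sd = statistics.stdev(measurements)  # sample standard deviation (n - 1)
    return mean, sd, f"norm(mean={mean:.4g},sd={sd:.4g},trmin={trmin})"

# Hypothetical soil density measurements from one site
soil_density = [1.31, 1.42, 1.28, 1.39, 1.35]
mean, sd, pdf_text = normal_pdf_from_measurements(soil_density)
print(pdf_text)
```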
The parameters screen lists all parameters used by the sub-systems of your model. The table where you enter data for a specific parameter contains a column named PDF.
A PDF is given using a named parameter approach. The following PDF is for a normal distribution, with a mean value of 0.025 and a standard deviation of 0.012. It is also truncated at 0 in order to avoid potential negative values:
norm(mean=0.025,sd=0.012,trmin=0.0)
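To show what the truncation means in practice, here is a minimal Python sketch that draws samples from a normal distribution and discards values below the lower bound. Rejection sampling is an assumption made for this illustration; how MERLIN-Expo implements truncation internally is not described here, and the function name is hypothetical.

```python
import random

def sample_truncated_normal(rng, mean, sd, trmin, count):
    """Draw `count` values from a normal distribution, rejecting values below trmin."""
    samples = []
    while len(samples) < count:
        value = rng.gauss(mean, sd)
        if value >= trmin:  # keep only values at or above the truncation point
            samples.append(value)
    return samples

rng = random.Random(12345)  # fixed seed so the run is repeatable
values = sample_truncated_normal(rng, mean=0.025, sd=0.012, trmin=0.0, count=1000)
print(min(values), max(values))
```

With a mean of 0.025 and a standard deviation of 0.012, only about 2 % of the untruncated draws fall below zero, so few samples are rejected.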
Remembering the names of all distribution functions and their parameters is difficult, so you rarely enter distributions this way. Instead you use the PDF editor which appears when you click a cell in the table for which a PDF is required.
The editor asks you for the type of distribution (in this case normal). When a distribution type is selected, a row of boxes appears into which you enter the parameters for the distribution.
The editor offers more functionality; read more here.
From the simulation screen you can open the probabilistic settings window by clicking the Probabilistic settings button in the toolbar.
The Parameters page lets you select which of the uncertain parameters to include in your study. The more uncertain parameters you have, the more iterations you must run in order to cover a realistic range of combinations of all parameters. More iterations mean longer simulations and more data to process after the simulation completes.
All parameters that have been assigned PDFs are listed on the left-hand side. Choose the parameters you wish to include by selecting them in the list and clicking the > button.
When running a probabilistic simulation, a set of values is generated for each uncertain parameter by random sampling of its corresponding distribution. The random sampling can result in very unlikely combinations.
Example:
You are studying a generic river and have identified the width and depth of the river as uncertain:
Parameter | Distribution | Min | Max |
---|---|---|---|
Width | uniform | 10 | 30 |
Depth | uniform | 2 | 6 |
When the simulation starts, a set of values is generated for each:
Iteration | Width | Depth |
---|---|---|
1 | 15 | 2 |
2 | 10 | 6 |
3 | 29 | 2 |
4 | 14 | 5 |
5 | 23 | 3 |
… | … | … |
This would mean that the river in the second iteration is narrow but deep, in the third iteration it is wide but very shallow. These might be very unlikely situations, and your results would not be realistic. The depth and width are correlated - when there is much water in the river both the depth and width should grow and vice versa.
Parameter correlations are described by assigning weights between -1 and 1 for each parameter pair. A value larger than zero implies a positive correlation - when the first parameter has a large value, the second parameter should also have a large value. A negative weight implies a negative correlation - a large value for the first parameter should be combined with a small value for the second parameter.
After the sampling has been performed, MERLIN-Expo will try to sort the samples to accommodate the correlation weights. With a value of 0.9 between width and depth, we would get the following sets of values:
Iteration | Width | Depth |
---|---|---|
1 | 10 | 2 |
4 | 14 | 2 |
2 | 15 | 3 |
3 | 23 | 5 |
5 | 29 | 6 |
… | … | … |
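The idea of re-sorting independent samples to honour a correlation weight can be sketched in Python as follows. This is a simplified rank-based reordering (in the spirit of the Iman-Conover method) chosen for illustration; it is an assumption, not a description of the algorithm MERLIN-Expo actually uses, and all function names are hypothetical. The sampled values themselves are unchanged; only their pairing across iterations is rearranged.

```python
import math
import random

def ranks(values):
    """Return the rank (0 = smallest) of each element of `values`."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result

def reorder_to_correlate(first, second, weight, rng):
    """Reorder two sample lists so that their rank correlation approaches `weight`."""
    scale = math.sqrt(1.0 - weight * weight)
    # Correlated standard normal scores provide the target rank pattern.
    score_a = [rng.gauss(0.0, 1.0) for _ in first]
    score_b = [weight * a + scale * rng.gauss(0.0, 1.0) for a in score_a]
    sorted_first, sorted_second = sorted(first), sorted(second)
    new_first = [sorted_first[r] for r in ranks(score_a)]
    new_second = [sorted_second[r] for r in ranks(score_b)]
    return new_first, new_second

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(1)
width = [rng.uniform(10.0, 30.0) for _ in range(2000)]  # river width samples
depth = [rng.uniform(2.0, 6.0) for _ in range(2000)]    # river depth samples
new_width, new_depth = reorder_to_correlate(width, depth, 0.9, rng)

# Rank correlation before and after the reordering
print(pearson(ranks(width), ranks(depth)))
print(pearson(ranks(new_width), ranks(new_depth)))
```

After the reordering, wide rivers are mostly paired with deep ones, while the distribution of each parameter on its own is exactly what was sampled.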
The Correlation page lets you set up correlation for parameters. You must first click Enabled to enable correlations. Then click the Add button to add each correlation.
Probabilistic simulations generate a lot of data. It is a good idea to select only the simulation outputs and time points you are interested in.
To run a probabilistic simulation, you first need to change Simulation type from Deterministic to Probabilistic. Then click the Run button.
Many new types of charts and tables are available for probabilistic results. Please refer to the charts screen and the tables screen pages.
After a probabilistic simulation you have access to the same charts and tables as after a sensitivity analysis using random methods.
A lot of problems can arise during a probabilistic simulation which would not happen in a deterministic case. When parameter samples are drawn from the probability density functions, you will inevitably end up with some extreme values. As discussed in the correlation section, it is also easy to end up with combinations of parameter values which are unlikely or extreme. This can cause iterations to take hundreds of times longer than iterations with less extreme parameter values. In bad cases, the numerical solver has to abort because it cannot meet error tolerances.
Problems are not always easy to identify - is it a specific PDF which causes problems or a combination of samples?
Problem | Solution |
---|---|
Simulation message: X is NaN at t=Y | NaN |
Simulation message: could not without reducing | Error tolerance |
Simulation never finishes | Memory problem, Time out |
NaN (Not a Number, see http://en.wikipedia.org/wiki/NaN) is the result of an undefined calculation. Examples are 0/0 (zero divided by zero), log(-3) (logarithm of a negative number) and (-1.5)^(10/3) (a negative number raised to a fractional power).
MERLIN-Expo will abort a simulation as soon as a NaN is detected in any intermediate calculation result. For debugging purposes, however, it is sometimes useful to continue the simulation.
NaN is sometimes used as a default value for parameters to ensure that users have entered values. So, if the NaN is reported for a parameter, verify that its values are correctly set in the parameters screen.
The error message contains the name of the block for which the NaN/Infinity value was discovered. In the information box of the model screen, find the equation for the block. This will show you candidate parameters for the error.
Example:
River.Mass transfer coefficient at the surface water-sediment interface is NaN at t=0
The equation is D_{water}·φ_{sed}^(4.0/3.0)/(Δ_{sed}+Δw·φ_{sed}^(4.0/3.0))
It is easy to tell that if φ_{sed} < 0, a NaN would be reported, because a negative number raised to the fractional power 4/3 is an undefined calculation.
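One way to narrow down the candidate parameters is to evaluate the block's equation outside the tool for the sampled parameter sets and see which ones fail. The Python sketch below does this for the equation above. The parameter names are transliterations of the symbols in the equation, the sample values are made up, and the helper functions are hypothetical; Python raises an error where a numerical solver would report NaN, so the helper converts that error into NaN.

```python
import math

def real_power(base, exponent):
    """Real-valued power that returns NaN where the result is undefined."""
    try:
        return math.pow(base, exponent)
    except ValueError:  # e.g. negative base with a fractional exponent
        return math.nan

def mass_transfer_coefficient(d_water, phi_sed, delta_sed, delta_w):
    """Evaluate D_water * phi^(4/3) / (delta_sed + delta_w * phi^(4/3))."""
    phi_term = real_power(phi_sed, 4.0 / 3.0)
    denominator = delta_sed + delta_w * phi_term
    if denominator == 0.0:
        return math.nan  # division by zero, reported as NaN here for simplicity
    return d_water * phi_term / denominator

# Made-up parameter sets, one per iteration: (d_water, phi_sed, delta_sed, delta_w)
parameter_sets = [
    (1.0e-9, 0.60, 0.01, 0.001),
    (1.0e-9, 0.45, 0.02, 0.001),
    (1.0e-9, -0.05, 0.01, 0.001),  # negative phi_sed, e.g. from an untruncated PDF
]

failed = [
    iteration
    for iteration, values in enumerate(parameter_sets, start=1)
    if not math.isfinite(mass_transfer_coefficient(*values))
]
print("Failed iterations:", failed)
```

If a negative φ_{sed} turns out to be the cause, truncating its PDF at zero is the natural remedy.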
MERLIN-Expo can be told to continue simulations even when infinity/NaN occurs and to produce statistics on which iterations failed. After the simulation is finished, a raw data table can be created to see exactly for which parameter sets the simulation fails.
When it is not possible to understand by observing the parameter values what is wrong, only one thing remains: the process of elimination.