====Introduction====

[[wp>Uncertainty analysis]] involves the [[wp>propagation of uncertainty]] in model inputs ([[parameter|parameters]]) to estimate their impact on model outputs. From uncertainty analysis one can also learn how important specific parameters (and other model components) are to an end result. 

Uncertainty analysis should be performed iteratively, because collecting detailed data for parameters is time consuming (and therefore costly). [[Sensitivity analysis]] can help you identify which parameters are most important, and therefore where to focus your data gathering effort. 

An uncertainty analysis in MERLIN-Expo starts with [[#identifying uncertain parameters]]. These parameters are [[#Assigning probability density functions|assigned probability density functions]] that describe the knowledge you have about each parameter value. Before a simulation is run, [[#setting up a probabilistic simulation|settings]] such as number of iterations, which parameters to include etc. are entered. You then proceed to [[#running simulations]] after which you [[#creating charts and tables|create charts and tables]].  

====Identifying uncertain parameters====

Many of the uncertain [[parameter|parameters]] in the MERLIN-Expo [[library]] have been given [[PDF|probability density functions (PDFs)]] for several [[contaminant|contaminants]]. The documentation of each model describes not only how the PDFs have been derived but also suggests how you can improve the data for your site and how you can derive PDFs for contaminants that are not included in the library. 

Not all parameters are associated with (relevant) uncertainties, though it depends on your situation. If you study a specific agricultural land, the parameter for the surface area of the soil would not be uncertain. If you have a single local measurement of the soil density, this parameter could also be considered certain. 

The pre-defined PDFs for site-specific parameters often have very wide distributions because they are based on measurements from many sites. If you have access to local data you should use it instead of the pre-defined PDF. If you have only one measurement, use it and state that the parameter is not uncertain. If you have several measurements you should use them to derive a new PDF. 

MERLIN-Expo does not include functions to fit your data to a distribution, but the [[#software resources]] section lists some tools you can use. Also, there are generic distributions in MERLIN-Expo that allow you to enter measurements directly such as [[https://userguide.intelligentscenariomodelling.com/doku.php?id=general_distribution|the general distribution]] and [[https://userguide.intelligentscenariomodelling.com/doku.php?id=histogram_distribution|the histogram distribution]].
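
Deriving a PDF from several local measurements can be as simple as estimating a mean and a standard deviation outside MERLIN-Expo. The following is a minimal Python sketch, assuming a normal distribution is appropriate for your data. The function name and the measurement values are hypothetical; the output string follows the named-parameter style used in the **PDF** column (see [[#Assigning probability density functions]]).

```python
import statistics

def normal_pdf_from_measurements(values, trmin=None):
    """Build a PDF string from the sample mean and standard deviation.

    Assumes the measurements are reasonably described by a normal distribution.
    """
    if len(values) < 2:
        raise ValueError("need at least two measurements to estimate a standard deviation")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    parts = [f"mean={mean:.4g}", f"sd={sd:.4g}"]
    if trmin is not None:
        parts.append(f"trmin={trmin}")  # optional lower truncation
    return "norm(" + ",".join(parts) + ")"

# Hypothetical local measurements (illustrative values only)
measurements = [0.021, 0.030, 0.018, 0.027, 0.029]
pdf_string = normal_pdf_from_measurements(measurements, trmin=0.0)
print(pdf_string)
```

With only a handful of measurements the estimated standard deviation is itself uncertain, so a dedicated fitting tool is preferable when more data is available.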

====Assigning probability density functions====

The [[parameters screen]] lists all parameters used by the [[sub-system|sub-systems]] of your [[model]]. The table where you enter data for a specific parameter has a column named **PDF**. 

A PDF is given using a [[http://en.wikipedia.org/wiki/Named_parameter|named parameter]] approach. The following PDF is for a normal distribution, with a mean value of 0.025 and a standard deviation of 0.012. It is also truncated at 0 in order to avoid potential negative values:

^PDF^
|norm(mean=0.025,sd=0.012,trmin=0.0)|
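
To see why the truncation matters, the sketch below draws samples from the same normal distribution with and without a lower bound. It is an illustration in Python using simple rejection sampling; how MERLIN-Expo implements truncation internally is not described here, so treat the method as an assumption.

```python
import random

def sample_truncated_normal(mean, sd, trmin, rng):
    """Draw one value from a normal distribution, redrawing values below trmin."""
    while True:
        value = rng.gauss(mean, sd)
        if value >= trmin:
            return value

rng = random.Random(42)  # fixed seed so the run is repeatable

# Samples corresponding to norm(mean=0.025,sd=0.012,trmin=0.0)
truncated = [sample_truncated_normal(0.025, 0.012, 0.0, rng) for _ in range(10_000)]

# The same distribution without truncation
untruncated = [rng.gauss(0.025, 0.012) for _ in range(10_000)]

negatives = sum(1 for v in untruncated if v < 0.0)
print("negative values without truncation:", negatives)
print("smallest truncated value is non-negative:", min(truncated) >= 0.0)
```

Since zero lies only about two standard deviations below the mean, a small share of untruncated samples is negative, which would be physically meaningless for many parameters.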

Remembering the names of all distribution functions and their parameters is difficult, so you rarely enter distributions this way. Instead you use the [[parameters screen#PDF Editor|PDF editor]] which appears when you click a cell in the table for which a PDF is required.

{{ :pdf_editor.png?nolink |}}

The editor asks you for the type of distribution (in this case //normal//). When a distribution type is selected, a row of boxes appears into which you enter parameters for the distribution.

The editor offers more functionality; read more [[parameters screen#PDF Editor|here]].

====Setting up a probabilistic simulation====

From the [[simulation screen]] you can open the [[simulation_screen#probabilistic settings]] window by clicking the {{:probabilisticsettings24.png?nolink|}} **Probabilistic settings** button in the [[user_interface#toolbar]].

===General settings===

  * **Number of simulations** - Choose the number of iterations to run. The more uncertain parameters you include in your analysis, the more iterations are needed to cover all the combinations of randomized parameter values. 
  * **Seed** - The seed for the random number generator is by default itself random. This means that a completely new data set will be generated each time you start a simulation. To get the same sampling each time you start a simulation, enter any number in this box.
  * **Sampling** - Specifies the scheme for generating parameter samples, either **Monte Carlo** (pseudo random numbers) or **[[wp>Latin hypercube sampling]]**. Latin hypercube sampling gives you better coverage of each distribution, but can result in extreme outliers. 
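
The difference between the two sampling schemes, and the effect of a fixed seed, can be illustrated outside MERLIN-Expo. The following Python sketch handles a single uniform distribution only; the function names are hypothetical and the code does not claim to reproduce MERLIN-Expo's own sampler.

```python
import random

def monte_carlo_uniform(low, high, n, rng):
    """Plain Monte Carlo: n independent pseudo-random draws."""
    return [rng.uniform(low, high) for _ in range(n)]

def latin_hypercube_uniform(low, high, n, rng):
    """Latin hypercube: one draw from each of n equally probable strata, then shuffled."""
    width = (high - low) / n
    samples = [low + (i + rng.random()) * width for i in range(n)]
    rng.shuffle(samples)
    return samples

def occupied_strata(samples, low, high, n):
    """Count how many of the n strata contain at least one sample."""
    width = (high - low) / n
    return len({min(int((s - low) / width), n - 1) for s in samples})

n = 10
mc = monte_carlo_uniform(10, 30, n, random.Random(1))
lhs = latin_hypercube_uniform(10, 30, n, random.Random(1))
print("Monte Carlo strata covered:", occupied_strata(mc, 10, 30, n))
print("Latin hypercube strata covered:", occupied_strata(lhs, 10, 30, n))

# The same seed reproduces the same samples
print(mc == monte_carlo_uniform(10, 30, n, random.Random(1)))
```

Latin hypercube sampling covers every stratum by construction, whereas plain Monte Carlo with few iterations usually leaves some strata empty.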

===Selecting parameters for the uncertainty study===

The **Parameters** page lets you select which of the uncertain parameters to include in your study. The more uncertain parameters you have, the more iterations you must run in order to cover a realistic combination of all parameters. More iterations mean longer simulations and more data to process after the simulation completes. 

All parameters that have been assigned [[PDF|PDFs]] are listed on the left-hand side. Choose the parameters you wish to include by selecting them in the list and clicking the **>** button.

===Definining parameter correlations===

When running a probabilistic simulation, a set of values is generated for each uncertain parameter by random sampling of their corresponding distributions. The random sampling can result in very unlikely combinations. 

Example:

You are studying a generic river and have identified the width and depth of the river as uncertain: 

^Parameter^Distribution^Min^Max^
|Width|uniform|10|30|
|Depth|uniform|2|6|

When the simulation starts, a set of values is generated for each:

^Iteration^Width^Depth^
|1|15|2|
|2|10|6|
|3|29|2|
|4|14|5|
|5|23|3|
|...|...|...|

This would mean that the river is narrow but deep in the second iteration, and wide but very shallow in the third. These might be very unlikely situations, and your results would not be realistic. The depth and width are correlated: when there is much water in the river both the depth and width should grow, and vice versa. 

Parameter correlations are described by assigning weights between -1 and 1 for each parameter pair. A value larger than zero implies a positive correlation - when the first parameter has a large value, the second parameter should also have a large value. A negative weight implies a negative correlation - a large value for the first parameter should be combined with a small value for the second parameter.

After the sampling has been performed, MERLIN-Expo will try to sort the samples to accommodate the correlation weights. With a value of 0.9 between width and depth, we would get the following sets of values:

^Iteration^Width^Depth^
|1|10|2|
|4|14|2|
|2|15|3|
|3|23|5|
|5|29|6|
|...|...|...|
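
The idea of re-ordering already drawn samples so that they follow a correlation weight can be sketched in Python as follows. This is a rank-based re-ordering in the spirit of the Iman-Conover method, used here as an assumption for illustration; the text above does not state which algorithm MERLIN-Expo applies, and all function names are hypothetical.

```python
import math
import random

def ranks(values):
    """Return the rank (0 = smallest) of each element."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result

def reorder_with_correlation(x, y, weight, rng):
    """Re-order x and y so their ranks follow correlated normal scores."""
    n = len(x)
    score_x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    score_y = [weight * s + math.sqrt(1.0 - weight ** 2) * rng.gauss(0.0, 1.0)
               for s in score_x]
    sorted_x, sorted_y = sorted(x), sorted(y)
    # The sampled values are kept; only their pairing changes
    new_x = [sorted_x[r] for r in ranks(score_x)]
    new_y = [sorted_y[r] for r in ranks(score_y)]
    return new_x, new_y

def pearson(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(7)
width = [rng.uniform(10, 30) for _ in range(1000)]
depth = [rng.uniform(2, 6) for _ in range(1000)]
new_width, new_depth = reorder_with_correlation(width, depth, 0.9, rng)

print("before:", round(pearson(width, depth), 2))
print("after: ", round(pearson(new_width, new_depth), 2))
```

Each parameter keeps exactly the values that were sampled from its distribution, so the individual distributions are unchanged; only the way widths and depths are combined into iterations differs.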

The **Correlation** page lets you set up correlation for parameters. You must first click **Enabled** to enable correlations. Then click the **Add** button to add each correlation.  

====Running simulations====

Probabilistic simulations generate a lot of data. It is a good idea to select only the [[simulation output|simulation outputs]] and time points you are interested in.
 
===Selecting time points===

  - Click the **Simulation settings...** button in the [[user_interface#toolbar]] of the [[simulation screen]].
  - By default, MERLIN-Expo will output values for each day: the **Time series** is a linear series with an increment of 1 day. 
  - Edit the time series by clicking the **Edit** button.
  - Either change the increment (for instance to 10 or 50), or choose some other type of series. The **Custom** series allows you to enter exactly the time points for which you want results. Read more [[simulation_screen#simulation_settings|here]].
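
The effect of the time series on the amount of stored data can be estimated with simple arithmetic. The Python sketch below assumes one stored value per iteration, output and time point; the study size (1000 iterations, 20 outputs, 365 days) is purely illustrative.

```python
def time_points(end_day, increment):
    """Linear time series from day 0 to end_day (inclusive) with a fixed increment."""
    return list(range(0, end_day + 1, increment))

def stored_values(iterations, outputs, points):
    """Number of values kept: one per iteration, output and time point."""
    return iterations * outputs * len(points)

# Assumed, illustrative study size
iterations, outputs, end_day = 1000, 20, 365

daily = time_points(end_day, 1)
every_10_days = time_points(end_day, 10)
custom = [0, 30, 90, 365]  # analogous to a Custom series

for label, points in [("daily", daily),
                      ("every 10 days", every_10_days),
                      ("custom", custom)]:
    print(label, len(points), stored_values(iterations, outputs, points))
```

Under these assumptions, going from a daily series to an increment of 10 days reduces the stored data by roughly a factor of ten.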

===Selecting outputs===

  - In the **Simulation settings** window, click the **Outputs** tab. 
  - Remove all the [[block|blocks]] you are not interested in.

===Starting the simulation===

To run a probabilistic simulation, you first need to change **Simulation type** from **Deterministic** to **Probabilistic**. Then click the {{:runall24.png?nolink|}} **Run** button.

====Creating charts and tables====

Many new types of charts and tables are available for probabilistic results. Please refer to the [[charts screen]] and the [[tables screen]] pages.

====Sensitivity analysis: random methods====

After a probabilistic simulation you have access to the same charts and tables as after a sensitivity analysis using random methods. 

====Troubleshooting====

A lot of problems can arise during a probabilistic simulation which would not happen in a deterministic case. When parameter samples are drawn from the [[PDF|probability density functions]], you will inevitably end up with some extreme values. As discussed in the [[#Definining parameter correlations|correlation]] section, it is also easy to end up with //combinations// of parameter values which are unlikely or extreme. This can cause iterations to take hundreds of times longer than iterations with less extreme parameter values. In bad cases, the [[solver|numerical solver]] has to abort because it cannot meet error tolerances. 

Problems are not always easy to identify - is it a specific [[PDF]] which causes problems or a combination of samples?

^Problem^Solution^
|Simulation message: **X** is NaN at t=**Y**|[[#NaN]]|
|Simulation message: could not without reducing|[[#Error tolerance]]|
|Simulation never finishes|[[#Memory problem]],[[#Time out]]|


===NaN===

[[wp>NaN]] (Not a Number) is the result of an undefined calculation. Examples are 0/0 (zero divided by zero), log(-3) (logarithm of a negative number) and (-1.5)^(10/3) (a negative number raised to a fractional power). 

MERLIN-Expo will abort a simulation as soon as a NaN is detected in any intermediate calculation result. For debugging purposes however, it is sometimes useful to [[#Debugging NaN and Infinite values|continue the simulation]].

NaN is sometimes used as a default value for parameters to ensure that users enter values. So, if the NaN is reported for a parameter, check that its values are correctly set in the [[parameters_screen]].

===Debugging NaN and Infinite values===

==Step 1==

The error message contains the name of the [[block]] for which the NaN/Infinity value was discovered. In the [[user_interface#information|information box]] of the [[model_screen]], find the equation for the block. This will show you candidate parameters for the error.

Example: 

//River.Mass transfer coefficient at the surface water-sediment interface is NaN at t=0//

The equation is //D<sub>water</sub>·φ<sub>sed</sub>^(4.0/3.0)/(Δ<sub>sed</sub>+Δ<sub>w</sub>·φ<sub>sed</sub>^(4.0/3.0))//

Because φ<sub>sed</sub> is raised to a fractional power (4/3), a negative value of φ<sub>sed</sub> would result in a NaN being reported.
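
The equation above can be evaluated by hand to confirm the suspicion. The Python sketch below uses hypothetical variable names and input values. Note that Python's ''math.pow'' raises an error for a negative base with a fractional exponent, so the helper converts that error to NaN to mimic what a numerical solver would report.

```python
import math

def pow_or_nan(base, exponent):
    """Power that returns NaN where the real-valued result is undefined."""
    try:
        return math.pow(base, exponent)
    except ValueError:
        return math.nan

def mass_transfer_coefficient(d_water, phi_sed, delta_sed, delta_w):
    """Evaluate D_water * phi^(4/3) / (delta_sed + delta_w * phi^(4/3))."""
    p = pow_or_nan(phi_sed, 4.0 / 3.0)
    return d_water * p / (delta_sed + delta_w * p)

# Hypothetical input values, for illustration only
ok = mass_transfer_coefficient(d_water=1e-9, phi_sed=0.5, delta_sed=0.01, delta_w=0.001)
bad = mass_transfer_coefficient(d_water=1e-9, phi_sed=-0.1, delta_sed=0.01, delta_w=0.001)

print("positive phi_sed gives NaN:", math.isnan(ok))
print("negative phi_sed gives NaN:", math.isnan(bad))
```

A PDF for φ<sub>sed</sub> that allows negative samples (for example an untruncated normal distribution) is therefore a natural first candidate to check.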

==Step 2==

MERLIN-Expo can be told to continue simulations even when infinity/NaN occurs and to produce statistics on which iterations failed. After the simulation is finished, a [[tables_screen#raw data table]] can be created to see exactly for which parameter sets the simulation fails.

  - Open the [[simulation_screen#simulation settings|Simulation settings window]].
    - In the **Output** page, make sure that all uncertain parameters are among the simulation outputs. 
    - In the **Advanced** page
      - Deselect **Halt on error**. This will make MERLIN-Expo continue with the next iteration instead of aborting the simulation.
      - Select **Output statistics**. After the simulation is complete, a set of outputs will be available with information on which iterations failed. 
  - Open [[simulation_screen#probabilistic settings|Probabilistic settings window]].
    - In the **General** page, decrease the **Number of simulations** as much as you can; there is no need to wait for 1000 iterations if the first error is reported for iteration 10. 
    - Enter a number for the **Seed** instead of using **Auto**. This way the same sampling will be repeated every time you run a new simulation.
  - Start a simulation
  - Go to the [[tables screen]]. 
    - Among the [[simulation output|simulation outputs]] there should be a folder named //_Statistics//. In it, right click //Failed_NaN// and select **Raw data table** from the menu.
    - //Failed_NaN// will be 1.0 for each failed iteration. 
  - Finally, create a table with all the parameters: in the **Type** drop-down list (below the results), select **Parameter**.
    - Select all parameters with CTRL+A.
    - Right-click one of the parameters and choose **Raw data table** from the menu.
    - Edit the table by right-clicking and choosing **Edit...**
    - Add a column for Failed_NaN by clicking the **Add column** button. The new column will be located last in the table.
    - Select this new, blank, column. In the **Output** field above, click the **...** button, and add **Failed_NaN**. 
    - Sort the table by clicking the **Failed_NaN** column so that the failed iterations end up at the top of the table.
    - Now, try to see if there is anything obviously wrong with any of the parameter values.
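
If the table is large, the comparison between failed and successful iterations can also be done programmatically. The Python sketch below assumes you have transferred the raw data table into a list of rows by some means of your own; the row layout, parameter names and values are hypothetical. It reports parameters whose values in all failed iterations lie outside the range seen in successful iterations.

```python
def suspicious_parameters(rows, flag="Failed_NaN"):
    """Return parameters whose failed-iteration values all fall outside the successful range."""
    failed = [r for r in rows if r[flag] == 1.0]
    passed = [r for r in rows if r[flag] != 1.0]
    result = {}
    if not failed or not passed:
        return result  # nothing to compare
    for name in rows[0]:
        if name == flag:
            continue
        ok_min = min(r[name] for r in passed)
        ok_max = max(r[name] for r in passed)
        outside = [r[name] for r in failed if not ok_min <= r[name] <= ok_max]
        if len(outside) == len(failed):
            result[name] = outside
    return result

# Hypothetical raw data: one row per iteration
rows = [
    {"Porosity": 0.42, "Depth": 3.1, "Failed_NaN": 0.0},
    {"Porosity": -0.03, "Depth": 4.0, "Failed_NaN": 1.0},
    {"Porosity": 0.35, "Depth": 5.2, "Failed_NaN": 0.0},
    {"Porosity": -0.10, "Depth": 2.4, "Failed_NaN": 1.0},
    {"Porosity": 0.51, "Depth": 2.2, "Failed_NaN": 0.0},
]

suspects = suspicious_parameters(rows)
print(suspects)
```

This check only finds problems caused by a single parameter being out of range. Failures caused by a combination of otherwise plausible values need the process of elimination in Step 3.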

==Step 3==

When it is not possible to understand by observing the parameter values what is wrong, only one thing remains: the process of elimination.

  - In the [[simulation_screen#probabilistic settings]] window, go to the **Parameters** page.
  - Remove all except for one parameter.
  - Run a simulation.
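  - If the simulation is successful, go back to the [[simulation_screen#probabilistic settings]] window and add another parameter.
  - Continue until a simulation fails. The parameter added last is then the likely cause, possibly in combination with the parameters already included.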
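
The elimination procedure is a simple loop, sketched below in Python. In practice each run is started manually in MERLIN-Expo; ''run_simulation'' is a hypothetical stand-in for "run with these parameters and report whether it succeeded", and the parameter names are illustrative.

```python
def find_first_failing_parameter(parameters, run_simulation):
    """Add parameters one at a time until a run fails.

    Returns the parameter added last and the set included in the failing run,
    or (None, all parameters) if every run succeeds.
    """
    included = []
    for name in parameters:
        included.append(name)
        if not run_simulation(included):
            return name, list(included)
    return None, included

def fake_run(included):
    """Stand-in for a real simulation: fails whenever 'Porosity' is included."""
    return "Porosity" not in included

culprit, failing_set = find_first_failing_parameter(
    ["Width", "Depth", "Porosity", "Density"], fake_run
)
print(culprit, failing_set)
```

Keeping a fixed **Seed** during this procedure ensures that each parameter receives the same samples in every run, so a failure can be attributed to the newly added parameter rather than to a new random draw.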

====Software resources====