# How to calibrate a computational model?

Calibration, or parameter estimation, consists of solving the inverse problem of finding descriptor values (for species, parameters or compartments) such that the computational model outputs match some user-defined fitness criteria.

## Fitness criteria

A fitness criterion measures how well the simulation outputs match the expected behaviors described by the scoring sets and/or data tables used as inputs.

The "Objective function" tab of the calibration lists scoring sets and/or data tables; at least one item must be selected.

### Scoring set

A scoring set is a set of constraints, quantitative or qualitative, that the model simulation should respect. For now, scoring sets can only be created with Nova internal tools; reach out to your contact point for support in defining them for your models.

### Data table

A data table usable in a calibration is typically the time series corresponding to a species, a parameter or a compartment volume of the model. Data tables can be mapped to make them compatible with the calibration of a computational model, provided that they are in a compatible format. See the dedicated documentation to read more about data tables and mappings.

Data tables transformed for use in a calibration have the following columns:

- *obsId* - **mandatory**. The ID of the model component that is to be compared to the data.
- *time* - **mandatory**. Time point, as an ISO 8601 duration.
- One of the following is **mandatory**. Note that if both are provided, the narrow bounds prevail:
  - *value*: target numerical value of the obsId at *time*, or
  - *narrowRangeLowBound* and *narrowRangeHighBound*: the score equals 1 if the simulated value is within the narrow range. Default is (value, value); in that case, the score equals 1 only if the simulated value is exactly equal to the target. The value has to be within the narrow range.
- *wideRangeLowBound*, *wideRangeHighBound* - optional. Wide range around the target values. The score is:
  - in ]0,1[ if the simulated value is within the wide bounds but outside of the narrow bounds,
  - = 0 if the simulated value is equal to one of the wide bounds, with a quadratic evolution,
  - < 0 if the simulated value is outside of the wide bounds.

  Default is (narrowRangeLowBound - 1, narrowRangeHighBound + 1). In that case the cost function is equivalent to a mean squared error: `f(x) = 1 - (x - x*)^2`, where `x*` is either narrowRangeLowBound or narrowRangeHighBound.
- *armScope* - optional. This row will only be evaluated on that protocol arm of the simulation. If null, the observation will be applied to all arms.
- *weight* - optional. Weight of the observation. It is used to compute a weighted mean for the score tied to each experimentName. Default is 1.
- *unit* - optional.
- *experimentRef* - optional. A link to track where the data comes from, like the "links" field of a computational model.
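The scoring rules above can be sketched in a few lines of Python. This is only an illustration, not jinkō's actual implementation: the function name `observation_score` is hypothetical, and the way the quadratic decay is scaled for non-default wide bounds (distance to the nearest narrow bound divided by the gap to the matching wide bound) is an assumption chosen so that it reduces to the documented `f(x) = 1 - (x - x*)^2` when the wide bounds take their default values.

```python
from typing import Optional


def observation_score(
    simulated: float,
    narrow_low: float,
    narrow_high: float,
    wide_low: Optional[float] = None,
    wide_high: Optional[float] = None,
) -> float:
    """Hypothetical per-observation score following the rules listed above."""
    # Documented defaults: wide bounds sit one unit outside the narrow bounds.
    if wide_low is None:
        wide_low = narrow_low - 1.0
    if wide_high is None:
        wide_high = narrow_high + 1.0
    if not (wide_low < narrow_low <= narrow_high < wide_high):
        raise ValueError("expected wideLow < narrowLow <= narrowHigh < wideHigh")

    # Inside the narrow range the score is exactly 1.
    if narrow_low <= simulated <= narrow_high:
        return 1.0

    # Outside it, take the distance to the nearest narrow bound (x*) and
    # normalise it by the gap to the matching wide bound (assumption).
    if simulated < narrow_low:
        ratio = (narrow_low - simulated) / (narrow_low - wide_low)
    else:
        ratio = (simulated - narrow_high) / (wide_high - narrow_high)
    # Quadratic decay: 0 on a wide bound, negative beyond it.
    return 1.0 - ratio ** 2
```

With a narrow range of (44, 54) and a wide range of (32.8, 172.8), a simulated value of 49 scores 1, a value of 40 scores between 0 and 1, a value of 32.8 scores 0, and a value of 30 scores below 0.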

Here is an example of a table that can be uploaded to jinkō for calibration purposes:

obsId* | time* | value* | narrowRangeLowBound* | narrowRangeHighBound* | wideRangeLowBound | wideRangeHighBound | armScope | weight | unit | experimentRef |
---|---|---|---|---|---|---|---|---|---|---|
String | String (ISO8601 convention) | Numeric | Numeric | Numeric | Numeric | Numeric | String | Positive Numeric | String | String |
observable_1 | P0M | 30 | 44 | 54 | 32.8 | 172.8 | control | 1 | mg | https://... |
observable_1 | P12M | 34 | 44 | 54 | 32.8 | 172.8 | control | 1 | mg | https://... |
observable_1 | P0M | 41 | 48 | 58 | 37.6 | 185.6 | treated | 1 | mg | https://... |
observable_1 | P12M | 42 | 48 | 58 | 37.6 | 185.6 | treated | 1 | mg | https://... |
observable_2 | P0M | 5 | 2 | 10 | 0 | 32 | control | 1 | mol / L | https://... |
observable_2 | P12M | 5 | 2 | 10 | 0 | 32 | control | 1 | mol / L | https://... |
observable_2 | P0M | 5 | 2 | 10 | 0 | 32 | treated | 1 | mol / L | https://... |
observable_2 | P12M | 5 | 2 | 10 | 0 | 32 | treated | 1 | mol / L | https://... |
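Before uploading such a table, it can help to check the mandatory columns locally. The sketch below assumes the table is stored as CSV (the upload format is not specified here) and uses a simplified regular expression for ISO 8601 durations. The names `ISO_DURATION`, `row_problems` and `validate` are hypothetical helpers, the last two rows of `EXAMPLE_CSV` are deliberately broken for illustration, and none of this reflects the checks jinkō performs itself.

```python
import csv
import io
import re

# Simplified ISO 8601 duration pattern (accepts e.g. "P0M", "P12M", "P1DT6H").
# An assumption for this sketch, not the validation jinkō itself applies.
ISO_DURATION = re.compile(
    r"^P(?!$)(\d+Y)?(\d+M)?(\d+W)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$"
)

# Illustrative rows: the first three are well formed, the last two are not.
EXAMPLE_CSV = """obsId,time,value,narrowRangeLowBound,narrowRangeHighBound,armScope,weight,unit
observable_1,P0M,30,44,54,control,1,mg
observable_1,P12M,34,44,54,control,1,mg
observable_2,P0M,5,,,treated,1,mol / L
observable_2,12M,5,,,treated,1,mol / L
observable_2,P12M,,,,treated,1,mol / L
"""


def row_problems(row: dict) -> list:
    """Return the list of mandatory-column problems found in one row."""
    problems = []
    if not row.get("obsId"):
        problems.append("missing obsId")
    if not ISO_DURATION.match(row.get("time") or ""):
        problems.append("time is not an ISO 8601 duration")
    has_value = bool(row.get("value"))
    has_narrow = bool(row.get("narrowRangeLowBound")) and bool(
        row.get("narrowRangeHighBound")
    )
    # Either a target value or both narrow bounds must be present.
    if not (has_value or has_narrow):
        problems.append("need value or both narrow bounds")
    return problems


def validate(csv_text: str) -> dict:
    """Map 1-based data row numbers to their list of problems."""
    reader = csv.DictReader(io.StringIO(csv_text))
    report = {}
    for number, row in enumerate(reader, start=1):
        problems = row_problems(row)
        if problems:
            report[number] = problems
    return report


report = validate(EXAMPLE_CSV)
```

Here `report` flags row 4 (its time `12M` lacks the leading `P`) and row 5 (neither a value nor narrow bounds).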

## Algorithm

The calibration fitness function is defined as the weighted sum of the different fitness criteria. It produces a single scalar value, ranging from -∞ to 1, which the calibration algorithm aims to maximize.
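As a rough illustration of this aggregation, the sketch below combines per-criterion scores into one scalar. The function name `weighted_fitness` is hypothetical, and dividing by the total weight is an assumption made here so that the result stays at most 1 when every individual score is at most 1; the exact normalization used by jinkō is not described in this page.

```python
def weighted_fitness(scores_and_weights):
    """Combine (score, weight) pairs into one scalar to maximise.

    Assumption: the weighted sum is normalised by the total weight, so the
    result cannot exceed 1 when each individual score is at most 1.
    """
    total_weight = sum(weight for _, weight in scores_and_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(score * weight for score, weight in scores_and_weights)
    return weighted_sum / total_weight


# Three criteria: one fully satisfied, one partially, one violated (weight 2).
fitness = weighted_fitness([(1.0, 1), (0.75, 1), (-0.5625, 2)])
```

A single strongly negative score can dominate the total, which is why the fitness has no lower bound.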

Under the hood, Jinkō uses the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm to perform this maximization.
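To give an intuition of how an evolution strategy searches for the maximum, here is a deliberately simplified sketch. It is not CMA-ES: it samples from an isotropic Gaussian and shrinks the step size by a fixed factor, whereas CMA-ES adapts both a full covariance matrix and the step size from the search history. The names `evolve` and `toy_fitness`, and all numeric settings, are illustrative assumptions.

```python
import random


def evolve(fitness, mean, sigma, generations=60, population=20, parents=5, seed=0):
    """Maximise `fitness` with a basic (mu, lambda) evolution strategy.

    Simplification: isotropic Gaussian sampling with a fixed step-size decay.
    Real CMA-ES also adapts a covariance matrix and the step size.
    """
    rng = random.Random(seed)
    best_point, best_score = list(mean), fitness(mean)
    for _ in range(generations):
        # Sample candidate points around the current mean.
        candidates = [
            [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            for _ in range(population)
        ]
        # Rank candidates by fitness, best first.
        ranked = sorted(candidates, key=fitness, reverse=True)
        top_score = fitness(ranked[0])
        if top_score > best_score:
            best_point, best_score = ranked[0], top_score
        # Recombine: the new mean is the average of the selected parents.
        elite = ranked[:parents]
        mean = [sum(axis) / len(elite) for axis in zip(*elite)]
        # Fixed decay of the sampling width (CMA-ES adapts this instead).
        sigma *= 0.9
    return best_point, best_score


def toy_fitness(point):
    """Toy objective with maximum 1 at (3, -2), mimicking a score in ]-inf, 1]."""
    target = (3.0, -2.0)
    return 1.0 - sum((p - t) ** 2 for p, t in zip(point, target))


best_point, best_score = evolve(toy_fitness, mean=[0.0, 0.0], sigma=1.0)
```

The loop has the same shape as a calibration run: draw candidates, simulate and score each one, then move the sampling distribution toward the best candidates.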

## Running a calibration

### Computational model, Output set and Protocol arms

To start off, you’ll need a computational model to calibrate. This model will be used to run the simulations with varying inputs (see "Inputs to calibrate" below).

You can also add an Output Set, if there are outputs of the model you’d like to observe. Note that unlike the scorings, which may also be output as scalar results, the output set will not be used for the optimization, but only for visualization.

You can also add Protocol arms, in order to run your model in several different circumstances (typically, different types of experiments, or different doses of a trial).

### Inputs to calibrate

The inputs to calibrate are the model components that will be used for the optimization of the weighted score. Select here the parameters, species or compartments whose values (respectively value, initial condition or volume) are unknown. You also need to ensure that the selected inputs actually have an impact on the outputs you want to optimize: you can use the contribution analysis tool for that.

For each selected input, you need to define:

- The mean and SD: the default values are taken from your computational model, and will be used to define a normal distribution from which the patients for the first iteration will be drawn.
- The bounds: they are used in the calibration to penalize unacceptable values of a given parameter, typically to avoid wrong behaviors of the model, such as a KM parameter becoming negative.
- Optionally, a log transformation: for parameters that you need to explore over a large range. In that case we perform a change of variables and calibrate the log of the parameter. This means in particular that the mean and SD values are those of the log (i.e. you should put a mean of 1 if you want your parameter to be sampled around 10).
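The effect of the log transformation on the initial draw can be illustrated as follows. This sketch assumes a base-10 logarithm, which is what the "mean of 1 gives values around 10" example implies; the function `draw_initial_values` and the sample sizes are hypothetical and only meant to show the change of variables.

```python
import random
import statistics


def draw_initial_values(mean, sd, n, log_transform=False, seed=0):
    """Draw n starting values from a normal distribution.

    With log_transform=True, mean and sd describe log10(parameter): the draw
    happens in log space and is mapped back with 10 ** x (base 10 assumed).
    """
    rng = random.Random(seed)
    draws = [rng.gauss(mean, sd) for _ in range(n)]
    return [10.0 ** x for x in draws] if log_transform else draws


# Linear scale: values spread additively around 10.
linear = draw_initial_values(mean=10.0, sd=0.5, n=10_000)

# Log scale: mean=1 centres the draw around 10 ** 1 = 10, and an SD of 0.5
# spreads the values multiplicatively over several orders of magnitude.
logged = draw_initial_values(mean=1.0, sd=0.5, n=10_000, log_transform=True)

median_logged = statistics.median(logged)
```

Values drawn on the log scale are always positive, which also makes the transformation convenient for parameters such as rate constants that must never become negative.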

Selecting too many inputs to calibrate, or very wide ranges, makes the parameter space so large that it may not be fully explored. On the other hand, a parameter space that is too small may not allow you to optimize your weighted score. It is usually recommended to have 3 to 10 inputs to calibrate, possibly by doing several iterative calibration steps.