Optimization

Jan Salecker

2024-01-29

Optimization with nlrx

Here we present two simple examples for running an optimization algorithm on a NetLogo model with nlrx. In our example, we use the Simulated Annealing simdesign (simdesign_GenSA()). However, except for the parameter definitions in the simdesign function and the output of the function, the genetic algorithm optimization (simdesign_GenAlg()) works in the same way.

We use the Wolf Sheep Predation model from the models library to show a basic example of the optimization workflow. Example 1 shows, how a NetLogo reporter can be used as a fitness criterion for optimization. Example 2 uses a self-defined evaluation function that calculates landscape metrics that are then used as fitness criterion.

Example 1: NetLogo reporter as fitness criterion

Step 1: Create a nl object:

library(nlrx)
# Windows default NetLogo installation path (adjust to your needs!):
netlogopath <- file.path("C:/Program Files/NetLogo 6.0.3")
modelpath <- file.path(netlogopath, "app/models/Sample Models/Biology/Wolf Sheep Predation.nlogo")
outpath <- file.path("C:/out")
# Unix default NetLogo installation path (adjust to your needs!):
netlogopath <- file.path("/home/NetLogo 6.0.3")
modelpath <- file.path(netlogopath, "app/models/Sample Models/Biology/Wolf Sheep Predation.nlogo")
outpath <- file.path("/home/out")

nl <- nl(nlversion = "6.0.3",
         nlpath = netlogopath,
         modelpath = modelpath,
         jvmmem = 1024)

Step 2: Attach an experiment

Because we want to apply an optimization algorithm, we need to define proper variable ranges. The algorithm is allowed to change the values of these parameters within these ranges in order to minimize our fitness criterion. In this example we want to use a reporter from the metrics slot for evaluating our model runs. Here we want to find a parameterization that leads to the maximum number of wolfs after 50 ticks. Because the algorithm automatically searches for minimum values, we add "1 / count wolves" to the metrics vector in order to find the maximum number of wolves.

It is also important to think about the settings for tickmetrics, runtime and evalticks. Because we only want to consider the last tick of the simulation, we set tickmetrics to “false” and runtime to 50. If more than one tick would be measured, the algorithm automatically calculates the mean value of the selected reporter. If you wish to apply other functions to aggregate temporal information into one value, you can use a self-defined evaluation function (see Example 2).

nl@experiment <- experiment(expname="wolf-sheep-GenSA1",
                            outpath=outpath,
                            repetition=1,
                            tickmetrics="false",
                            idsetup="setup",
                            idgo="go",
                            runtime=50,
                            metrics=c("(1 / count wolves)"),
                            variables = list('initial-number-sheep' = list(min=50, max=150, qfun="qunif"),
                                             'initial-number-wolves' = list(min=50, max=150, qfun="qunif")),
                            constants = list("model-version" = "\"sheep-wolves-grass\"",
                                             "grass-regrowth-time" = 30,
                                             "sheep-gain-from-food" = 4,
                                             "wolf-gain-from-food" = 20,
                                             "sheep-reproduce" = 4,
                                             "wolf-reproduce" = 5,
                                             "show-energy?" = "false"))

Step 3: Attach a simulation design

We use the simdesgin_GenSA() function to attach a Simulated Annealing simdesign. We select the evaluation criterion (evalcrit) by assigning the position of the reporter that we want to evaluate within the metrics vector of the experiment. In our case, there is only one reporter in the metrics vector thus we set evalcrit to use the first reporter (evalcrit = 1). The control parameter allows us to provide additional parameters for the GenSA function (see ?GenSA for details). For demonstration purposes, we set the maximum number of iterations to 20.

nl@simdesign <- simdesign_GenSA(nl, 
                                evalcrit = 1, 
                                nseeds = 1, 
                                control=list(maxit = 20))

Step 4: Run simulations

For optimization simdesign, the run_nl_dyn() function lets you execute the simulations. There are some notable differences between run_nl_all() and run_nl_dyn(). First, because parameterizations depend of results from previous runs, run_nl_dyn() can not be parallelized. Second, the procedure does not automatically loop over created random seeds of the simdesign. If you want to repeat the same algorithm several times, just embed the run_nl_dyn() function in any kind of loop and iterate through the nl@simdesign@simseeds vector. Third, the output of run_nl_dyn() is reported as objects from the specific optimization procedures and not in tibble format. In order to attach these results to the nl object, the output needs to be converted to tibble format first. However, attaching optimization results to the nl does not enable any further post-processing functions of the nlrx package and is only relevant for storing results together with the nl object. This design decision was made in order to allow application of the method specific summary functions to the results of the optimization.

results <- run_nl_dyn(nl, seed = nl@simdesign@simseeds[1])

Step 5: Investigate output

The output list of the Simulated Annealing procedure contains four elements: value reports the minimum final value of the evaluation criterion. par reports the parameter settings of the final parameterisation in the same order as defined in the experiment of the nl object. trace.mat gives you detailed information on the optimization process over all iterations. counts indicates how often the optimization procedure was executed in total.

results

In order to store our results together with the nl object we need to attach the results to the nl object first. As explained above, we need to enframe the results as a tibble.

setsim(nl, "simoutput") <- tibble::enframe(results)
saveRDS(nl, file.path(nl@experiment@outpath, "genSA_1.rds"))

Example 2: Evaluation function as fitness criterion

Step 1: Create a nl object:

library(nlrx)
# Windows default NetLogo installation path (adjust to your needs!):
netlogopath <- file.path("C:/Program Files/NetLogo 6.0.3")
modelpath <- file.path(netlogopath, "app/models/Sample Models/Biology/Wolf Sheep Predation.nlogo")
outpath <- file.path("C:/out")
# Unix default NetLogo installation path (adjust to your needs!):
netlogopath <- file.path("/home/NetLogo 6.0.3")
modelpath <- file.path(netlogopath, "app/models/Sample Models/Biology/Wolf Sheep Predation.nlogo")
outpath <- file.path("/home/out")

nl <- nl(nlversion = "6.0.3",
         nlpath = netlogopath,
         modelpath = modelpath,
         jvmmem = 1024)

Step 2: Attach an experiment

Because we want to apply an optimization algorithm, we need to define proper variable ranges. The algorithm is allowed to change the values of these parameters within these ranges in order to minimize our fitness criterion. In this example we want to use a self-defined evaluation function to calculate a fitness criterion. Thus, we add the patch coordinates and patch color (as a patch class indicator) to the metrics.patches vector. We want to use spatial data to calculate the landscape edge density index of the final tick and find a parameterization that leads to the edge density. Because we only want to consider the last tick of the simulation, we set tickmetrics to “false” and runtime to 50.

nl@experiment <- experiment(expname="wolf-sheep-GenSA2",
                            outpath=outpath,
                            repetition=1,
                            tickmetrics="false",
                            idsetup="setup",
                            idgo="go",
                            runtime=50,
                            metrics.patches = c("pxcor", "pycor", "pcolor"),
                            variables = list('initial-number-sheep' = list(min=50, max=150),
                                             'initial-number-wolves' = list(min=50, max=150)),
                            constants = list("model-version" = "\"sheep-wolves-grass\"",
                                             "grass-regrowth-time" = 30,
                                             "sheep-gain-from-food" = 4,
                                             "wolf-gain-from-food" = 20,
                                             "sheep-reproduce" = 4,
                                             "wolf-reproduce" = 5,
                                             "show-energy?" = "false"))

Step 3: Attach a simulation design

We use the simdesgin_GenSA() function to attach a Simulated Annealing simdesign. Because we want to post-process our simulation results, we need to define an evaluation function. The evaluation function needs to accept the nl object as input and must return a single numeric value. First we load the package landscapemetrics. We then convert the spatial data to a raster format and calculate the landscape edge density index. Finally, we report only the index value of the resulting tibble.

critfun <- function(nl) {
  library(landscapemetrics)
  res_spat <- nl_to_raster(nl)
  res_spat_raster <- res_spat$spatial.raster[[1]]
  lsm <- lsm_l_ed(res_spat_raster)
  crit <- lsm$value
  return(crit)
}

In the simdesign_GenSA() function we now provide our evaluation function (critfun) as evaluation criterion (evalcrit). The control parameter allows us to provide additional parameters for the GenSA function (see ?GenSA for details). For demonstration purposes, we set the maximum number of iterations to 20.

nl@simdesign <- simdesign_GenSA(nl, 
                                evalcrit = critfun, 
                                nseeds = 1, 
                                control=list(maxit = 20))

Step 4: Run simulations

For optimization simdesign, the run_nl_dyn() function lets you execute the simulations. There are some notable differences between run_nl_all() and run_nl_dyn(). First, because parameterizations depend of results from previous runs, run_nl_dyn() can not be parallelized. Second, the procedure does not automatically loop over created random seeds of the simdesign. If you want to repeat the same algorithm several times, just embed the run_nl_dyn() function in any kind of loop and iterate through the nl@simdesign@simseeds vector. Third, the output of run_nl_dyn() is reported as objects from the specific optimization procedures and not in tibble format. In order to attach these results to the nl object, the output needs to be converted to tibble format first. However, attaching optimization results to the nl does not enable any further post-processing functions of the nlrx package and is only relevant for storing results together with the nl object. This design decision was made in order to allow application of the method specific summary functions to the results of the optimization.

results <- run_nl_dyn(nl, seed = nl@simdesign@simseeds[1])

Step 5: Investigate output

The output list of the Simulated Annealing procedure contains four elements: value reports the minimum final value of the evaluation criterion. par reports the parameter settings of the final parameterisation in the same order as defined in the experiment of the nl object. trace.mat gives you detailed information on the optimization process over all iterations. counts indicates how often the optimization procedure was executed in total.

results

In order to store our results together with the nl object we need to attach the results to the nl object first. As explained above, we need to enframe the results as a tibble.

setsim(nl, "simoutput") <- tibble::enframe(results)
saveRDS(nl, file.path(nl@experiment@outpath, "genSA_2.rds"))