This notebook tracks progress on the development of the GMSE R package for game-theoretic management strategy evaluation, and on related issues surrounding the development and application of game theory for addressing questions of biodiversity and food security.
2019
2018
2017
2016
Towards a Game-theoretic Management Strategy Evaluation (G-MSE)
Game-theory modelling (game.c; green box above)
Some side-notes that might be of use
Potentially relevant conferences and workshops
References consulted and annotated (Mendeley)
GMSE v0.4.0.11 is now available on GitHub. Some minor changes and bug fixes are included in the updated version.
[...] paras vector in development. It also made repeated calls to gmse_apply crash when the exact same arguments were specified but some values (e.g., resource abundance) changed.
[...] gmse_apply, noting some special considerations that could potentially cause confusion.
I will update once more when GMSE v0.4.0.11 is successfully submitted to CRAN.
GMSE v0.4.0.7 is now available on GitHub, and the official GMSE repository is now transferred to the ConFooBio organisation. I am currently in the process of fixing some website issues, but most of the important stuff has transferred to the new website location.
GMSE v0.4.0.3 is now available on CRAN. A new website for GMSE has also been launched. This website was built with the R package pkgdown, recently released on CRAN. The site contains all of the vignettes and documentation for GMSE, and also includes a link to this lab notebook. A submission of the accompanying manuscript will soon be uploaded on bioRxiv.
A new GMSE v0.4.0.3 has now been pushed to the master branch on GitHub and has been submitted to CRAN. The biggest update in this new version is a series of vignettes, plus a minor improvement to the genetic algorithm. More updates will follow soon, including some re-organisation of the GMSE project and a new manuscript submission.
I have re-worked the way that a manager estimates how a change in their policy affects users’ actions. The new new_act function in the genetic algorithm (game.c) performs well for getting more precise cost settings. The former way of doing it was much more of a blunt instrument, and it had a ceiling issue: the manager would believe that higher costs caused fewer actions even when the resulting cost was over the users’ budgets.
/* =============================================================================
 * This function updates an action based on the change in costs & paras
 *     old_cost: The old cost of an action, as calculated in policy_to_counts
 *     new_cost: The new cost of an action, as calculated in policy_to_counts
 *     old_act:  The number of actions performed at the old cost
 *     paras:    Vector of global parameters
 * ========================================================================== */
int new_act(double old_cost, double new_cost, double old_act, double *paras){
    int total_acts;
    double users, max_to_spend, acts_per_user, cost_per_user, total_cost;
    double pr_on_act, budget_for_act, mgr_budget, min_cost;

    users        = paras[54] - 1; /* Minus one for the manager */
    min_cost     = paras[96];     /* Minimum cost of an action */
    max_to_spend = paras[97];     /* Maximum per user budget   */
    mgr_budget   = paras[105];    /* Manager's total budget    */

    total_cost = 0.0;
    if(old_cost < mgr_budget){
        total_cost = old_act * old_cost; /* Total cost devoted to action */
    }
    cost_per_user = (total_cost / users);         /* Cost devoted per user */
    pr_on_act     = cost_per_user / max_to_spend; /* Pr. devoted to action */

    /* Assume that the proportion of the budget a user spends will not change */
    budget_for_act = max_to_spend * pr_on_act;

    /* Calculate how many actions to expect given acts per user and users */
    acts_per_user = budget_for_act / (new_cost + min_cost);
    total_acts    = (int) (users * acts_per_user); /* Truncate to whole acts */

    return total_acts;
}
This new way of assessing how users will act is now the function to be run in the background of all manager genetic algorithms. Very nicely, this also resolves an annoyance with the maximum allowed budgets. Previously, it was unclear why maximum budgets greater than 10000 were causing problems (managers were making bad predictions). I have now set the maximum budget to an order of magnitude higher, and there are no longer any apparent issues. A new version of GMSE will soon have this update.
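To make the arithmetic in new_act concrete, here is a minimal standalone sketch of the same calculation. The act_paras struct and the name predict_acts are hypothetical stand-ins for the paras vector entries and are not part of GMSE; the logic otherwise follows the function above step by step.

```c
#include <assert.h>

/* Hypothetical stand-in for the entries of paras that new_act reads. */
typedef struct {
    double users;        /* number of users (manager already excluded) */
    double min_cost;     /* minimum cost of an action                  */
    double max_to_spend; /* maximum per-user budget                    */
    double mgr_budget;   /* manager's total budget                     */
} act_paras;

/* Predict total actions at new_cost, assuming users keep devoting the same
 * proportion of their budget to this action as they did at old_cost. */
int predict_acts(double old_cost, double new_cost, double old_act,
                 act_paras p){
    double total_cost, cost_per_user, pr_on_act, budget_for_act;
    double acts_per_user;

    assert(p.users > 0.0 && p.max_to_spend > 0.0);
    assert(new_cost + p.min_cost > 0.0);

    total_cost = 0.0;
    if(old_cost < p.mgr_budget){ /* Unaffordable old costs imply no spending */
        total_cost = old_act * old_cost;
    }
    cost_per_user  = total_cost / p.users;
    pr_on_act      = cost_per_user / p.max_to_spend;
    budget_for_act = p.max_to_spend * pr_on_act;
    acts_per_user  = budget_for_act / (new_cost + p.min_cost);

    return (int) (p.users * acts_per_user);
}
```

With four users, a minimum cost of 10, and per-user and manager budgets of 1000, 200 actions at an old cost of 10 mean 500 was spent per user. At a new cost of 40, each user is then expected to afford 500 / (40 + 10) = 10 actions, so the call predict_acts(10.0, 40.0, 200.0, p) returns 40.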
New Issue #40: Age distribution bump
Running simulations using gmse_apply, jeremycusack noticed a small but sharp decline in the population size at a generation equal to the maximum age of resources in the population (a maximum age of 20 was used). This decline is caused by the initial seed of resources having a uniform age distribution. In the first generation, these resources produce offspring that all have an age of zero, leading to an age structure in the population with many zero-age individuals and a uniform distribution of ages greater than zero. The initial seed of individuals with random ages died gradually, but enough individuals in the initial offspring cohort made it to the maximum age to have a noticeable effect in decreasing population size (i.e., all of these resources died on the maximum_age + 1 time step).
This effect can be avoided entirely given sufficient burn-in generations of a model, and is less of a problem when the maximum age is low because this allows the age distribution to stabilise sooner. Further, using gmse_apply can avoid the issue by directly manipulating resource ages after the initial generation. Nevertheless, it would be useful to have a different default age distribution instead of a uniform distribution.
One way to do this would be to find the age (\(A\)) at which a resource is expected to be alive with a probability of \(0.5\), after accounting for mortality (\(\mu\)). This is simply calculated below:
\((1 - \mu)^A = 0.5\)
The above can be re-arranged to find \(A\),
\(A = \frac{\log(0.5)}{\log(1 - \mu)}\).
Note that we could use a switch function (or something like it in R) to make \(A = 0\) when \(\mu = 1\), and revert to a uniform distribution when \(\mu = 0\) (though this should rarely happen).
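As a rough illustration of the calculation and its two special cases, the sketch below finds the smallest whole age at which \((1 - \mu)^A \leq 0.5\), which is \(A\) rounded up to a whole time step (e.g., \(\mu = 0.1\) gives \(A \approx 6.58\), hence 7). The function name half_survival_age, the max_age cap, and the return value of -1 to signal a fallback to the uniform distribution are all assumptions made for this example, not GMSE code.

```c
#include <assert.h>

/* Smallest whole age at which survival probability (1 - mu)^age is <= 0.5.
 * Returns 0 when mu >= 1, and -1 when mu <= 0 to signal that the caller
 * should fall back to a uniform age distribution. Ages are capped at
 * max_age, so very low mortality simply returns max_age. */
int half_survival_age(double mu, int max_age){
    double alive = 1.0; /* probability of still being alive at current age */
    int age      = 0;

    assert(max_age >= 0);
    if(mu >= 1.0){
        return 0;
    }
    if(mu <= 0.0){
        return -1;
    }
    while(alive > 0.5 && age < max_age){
        alive *= (1.0 - mu); /* survive one more time step */
        age++;
    }
    return age;
}
```

Repeated multiplication is used instead of the logarithm so that the result is already a whole number of time steps and the loop can respect the maximum age directly.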
The value of \(\mu\) would depend on res_death_type, and be processed in make_resource, which is used in both gmse and gmse_apply. If res_death_type = 1 (density independent, rarely used), then \(\mu\) is simply equal to remove_pr. If res_death_type = 2 (density dependent), then \(\mu\) could perhaps be found using something like the following:
mu = (RESOURCE_ini * lambda) / (RESOURCE_ini + RESOURCE_ini * lambda)
This would get a value that is at least proportional to the expected mortality rate of a resource (if res_death_type = 3, then we could use the sum of types 1 and 2). Overall, the documentation should perhaps recommend finding a stable age distribution for a particular set of parameter combinations when using gmse_apply (i.e., through simulation), then using that distribution as an initial condition. But something like the above could probably get close to whatever the stable age distribution would be, at least close enough to make the decline in population size trivial.
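The three cases can be collected into one small function, sketched here under stated assumptions: the name approx_mu is hypothetical, and capping the type 3 sum at 1 is a choice made for this example so that the result stays a valid probability. Note that the density-dependent expression simplifies to lambda / (1 + lambda) whenever RESOURCE_ini is positive, so it does not actually depend on the initial abundance.

```c
#include <assert.h>

/* Approximate per-time-step mortality (mu) for each res_death_type.
 *     1: density independent, mu = remove_pr
 *     2: density dependent, mu from the expression in the notes
 *     3: both, taken as the sum of types 1 and 2 (capped at 1 here) */
double approx_mu(int res_death_type, double remove_pr,
                 double resource_ini, double lambda){
    double mu_dd, mu;

    assert(resource_ini > 0.0 && lambda >= 0.0);
    mu_dd = (resource_ini * lambda) / (resource_ini + resource_ini * lambda);

    switch(res_death_type){
        case 1:
            mu = remove_pr;
            break;
        case 2:
            mu = mu_dd;
            break;
        case 3:
            mu = remove_pr + mu_dd;
            break;
        default:
            mu = 0.0; /* unknown type: no mortality assumed */
            break;
    }
    if(mu > 1.0){
        mu = 1.0;
    }
    return mu;
}
```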
I will start to consider some of the above as a potential default for the next version of GMSE. The best way to do this is probably to look at how code from the res_remove function in the resource.c file can best be integrated into a function called by the R function make_resource (i.e., either use the equations, or estimates of them, or somehow call res_remove directly).
Improved convergence criteria
I have introduced and then immediately resolved Issue #39.
The convergence criterion has now been fixed with commit f598d8e52b47ef2017cac13d09aac1fb7aa6b506. To do this, I re-configured some of the genetic algorithm code into easier to read functions for checking the fitness increase. Now two separate ways of checking the increase in fitness from one genetic algorithm generation to the next exist; one for managers and one for users. This is needed because user fitness values are greater than zero and increase as their utility is maximised, but manager fitness values are less than zero and increase toward zero as their utility is maximised. The genetic algorithm now checks for a percentage improvement in fitness.
Now the default value of converge_crit equals 1, which means it does actually play a role sometimes (or is expected to). The genetic algorithm will continue until the percent increase in fitness from the previous generation is less than one percent. In practice, this doesn’t noticeably affect much, but it does allow better strategies to be found more quickly, and without having to play with ga_mingen to find them under extreme parameter settings (e.g., huge budgets and rapid shifts in abundance).
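One way such a check could look is sketched below. This is an illustration of the idea only, not the code from the commit: the function names pct_gain and ga_converged are hypothetical, and it is an assumption here that the percentage is taken relative to the magnitude of the previous generation's fitness, which is what lets the same rule work for positive user fitness and negative manager fitness.

```c
#include <assert.h>

/* Percent improvement in fitness from prev to curr. User fitness is assumed
 * positive, manager fitness negative (rising toward zero), so the change is
 * scaled by the magnitude of prev in both cases. */
double pct_gain(double prev, double curr, int is_manager){
    double scale = is_manager ? -prev : prev;

    if(scale <= 0.0){
        return 0.0; /* undefined percentage; treated as no gain in this sketch */
    }
    return 100.0 * (curr - prev) / scale;
}

/* Returns 1 when the gain has dropped below converge_crit percent. */
int ga_converged(double prev, double curr, int is_manager,
                 double converge_crit){
    assert(converge_crit >= 0.0);
    return pct_gain(prev, curr, is_manager) < converge_crit;
}
```

For example, a manager fitness moving from -200 to -150 is a 25 percent improvement, so the algorithm would keep going under converge_crit = 1, whereas a move from -200 to -199 is a 0.5 percent improvement and would count as converged.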
The new fix has now been checked and built with Winbuilder into v0.3.2.0, but I am leaving this on the development branch for now in anticipation of other potential improvements to be made soon.
CRAN ready GMSE v0.3.1.7 – more flexibility, better error messages
I have now completed some substantial coding of error messages, which will be called in both gmse and gmse_apply. Essentially, these provide some help to software users who parameterise their models in a way that does not work with GMSE. For example, if the parameter stakeholders is set to equal a negative number, an error message will be returned that informs the user that at least one stakeholder is required in the model. These error messages become a bit more important in gmse_apply, where it is possible for users to include arguments that don’t make sense (e.g., arrays of incorrect dimensions, or arguments that contradict one another).
The function gmse_apply has also been improved to make looping it easier. What had been happening during testing was that we were finding it all too easy to crash R by reading in parameters that contradicted one another (e.g., setting the landscape dimensions through land_dim_1 and land_dim_2 caused a crash when also trying to add in a LAND of different dimensions; now this returns an error that LAND and land_dim_1 disagree about landscape size). This has been resolved in two ways. First, I have included many error messages meant to catch bad and contradictory arguments in gmse_apply (and, to a lesser extent, gmse); it is still possible to crash R by setting things incorrectly, but you have to work very hard to do it (i.e., it almost has to be deliberate, as far as I can tell). Second, I have added the argument old_list to gmse_apply, which is FALSE by default, but can instead take the output of a previous full list return of gmse_apply (where get_res = "Full"). An element of the full list includes the basic output from which key parameters can be pulled. As a reminder, the basic gmse_apply output looks like the below.
$resource_results
[1] 1062
$observation_results
[1] 680.2721
$manager_results
resource_type scaring culling castration feeding help_offspring
policy_1 1 NA 110 NA NA NA
$user_results
resource_type scaring culling castration feeding help_offspring tend_crops kill_crops
Manager 1 NA 0 NA NA NA NA NA
user_1 1 NA 9 NA NA NA NA NA
user_2 1 NA 9 NA NA NA NA NA
user_3 1 NA 9 NA NA NA NA NA
user_4 1 NA 9 NA NA NA NA NA
An example of gmse_apply used in a loop is below.
to_scare <- FALSE;
sim_old  <- gmse_apply(scaring = to_scare, get_res = "Full", stakeholders = 6);
sim_sum  <- matrix(data = NA, nrow = 20, ncol = 7);
for(time_step in 1:20){
    sim_new <- gmse_apply(scaring = to_scare, get_res = "Full",
                          old_list = sim_old);
    sim_sum[time_step, 1] <- time_step;
    sim_sum[time_step, 2] <- sim_new$basic_output$resource_results[1];
    sim_sum[time_step, 3] <- sim_new$basic_output$observation_results[1];
    sim_sum[time_step, 4] <- sim_new$basic_output$manager_results[2];
    sim_sum[time_step, 5] <- sim_new$basic_output$manager_results[3];
    sim_sum[time_step, 6] <- sum(sim_new$basic_output$user_results[,2]);
    sim_sum[time_step, 7] <- sum(sim_new$basic_output$user_results[,3]);
    sim_old <- sim_new; # Feed this time step's output into the next call
    print(time_step);
}
colnames(sim_sum) <- c("Time", "Pop_size", "Pop_est", "Scare_cost",
                       "Cull_cost", "Scare_count", "Cull_count");
The output sim_sum is shown below.
Time Pop_size Pop_est Scare_cost Cull_cost Scare_count Cull_count
[1,] 1 733 839.0023 NA 110 NA 54
[2,] 2 768 702.9478 NA 110 NA 54
[3,] 3 824 725.6236 NA 110 NA 54
[4,] 4 933 907.0295 NA 110 NA 54
[5,] 5 1180 816.3265 NA 110 NA 54
[6,] 6 1345 1224.4898 NA 10 NA 426
[7,] 7 1114 1269.8413 NA 10 NA 425
[8,] 8 820 884.3537 NA 110 NA 54
[9,] 9 952 793.6508 NA 110 NA 54
[10,] 10 1101 884.3537 NA 110 NA 54
[11,] 11 1299 1111.1111 NA 12 NA 402
[12,] 12 1079 907.0295 NA 110 NA 54
[13,] 13 1227 1564.6259 NA 10 NA 431
[14,] 14 934 839.0023 NA 110 NA 54
[15,] 15 1065 1133.7868 NA 10 NA 423
[16,] 16 768 725.6236 NA 110 NA 54
[17,] 17 869 929.7052 NA 110 NA 54
[18,] 18 949 907.0295 NA 110 NA 54
[19,] 19 1049 884.3537 NA 110 NA 54
[20,] 20 1200 1020.4082 NA 64 NA 90
We can take advantage of gmse_apply to dynamically change parameter values mid-loop. For example, below shows the same code, but with a policy of scaring switched on at the end of time step 10, so that it first takes effect in time step 11.
to_scare <- FALSE;
sim_old  <- gmse_apply(scaring = to_scare, get_res = "Full", stakeholders = 6);
sim_sum  <- matrix(data = NA, nrow = 20, ncol = 7);
for(time_step in 1:20){
    sim_new <- gmse_apply(scaring = to_scare, get_res = "Full",
                          old_list = sim_old);
    sim_sum[time_step, 1] <- time_step;
    sim_sum[time_step, 2] <- sim_new$basic_output$resource_results[1];
    sim_sum[time_step, 3] <- sim_new$basic_output$observation_results[1];
    sim_sum[time_step, 4] <- sim_new$basic_output$manager_results[2];
    sim_sum[time_step, 5] <- sim_new$basic_output$manager_results[3];
    sim_sum[time_step, 6] <- sum(sim_new$basic_output$user_results[,2]);
    sim_sum[time_step, 7] <- sum(sim_new$basic_output$user_results[,3]);
    sim_old <- sim_new;
    if(time_step == 10){
        to_scare <- TRUE; # Scaring becomes available from the next time step
    }
    print(time_step);
}
colnames(sim_sum) <- c("Time", "Pop_size", "Pop_est", "Scare_cost",
                       "Cull_cost", "Scare_count", "Cull_count");
The above simulation results in the following output for sim_sum.
Time Pop_size Pop_est Scare_cost Cull_cost Scare_count Cull_count
[1,] 1 745 657.5964 NA 110 NA 54
[2,] 2 805 1111.1111 NA 12 NA 400
[3,] 3 473 634.9206 NA 110 NA 54
[4,] 4 504 566.8934 NA 110 NA 54
[5,] 5 577 498.8662 NA 110 NA 54
[6,] 6 600 430.8390 NA 110 NA 54
[7,] 7 648 612.2449 NA 110 NA 54
[8,] 8 714 702.9478 NA 110 NA 54
[9,] 9 813 612.2449 NA 110 NA 54
[10,] 10 914 1020.4082 NA 64 NA 90
[11,] 11 1011 1179.1383 57 10 49 301
[12,] 12 858 725.6236 10 110 193 37
[13,] 13 1011 1043.0839 37 30 0 198
[14,] 14 989 1043.0839 57 30 0 198
[15,] 15 983 1065.7596 48 20 10 270
[16,] 16 851 839.0023 10 110 193 37
[17,] 17 962 1111.1111 38 12 58 306
[18,] 18 783 612.2449 10 110 193 37
[19,] 19 862 816.3265 10 110 193 37
[20,] 20 963 702.9478 10 110 182 38
Hence, in addition to all of the other benefits of gmse_apply, one new feature is that we can use it to study change in policy availability (in this case, what happens when scaring is introduced as a possible policy option). Similar things can be done, for example, to see how manager or user power changes over time. In the example below, users’ budgets increase by 100 every time step, with the manager’s budget remaining the same. The consequence appears to be decreased population stability and a higher likelihood of extinction.
ub      <- 500;
sim_old <- gmse_apply(get_res = "Full", stakeholders = 6, user_budget = ub);
sim_sum <- matrix(data = NA, nrow = 20, ncol = 6);
for(time_step in 1:20){
    sim_new <- gmse_apply(get_res = "Full", old_list = sim_old,
                          user_budget = ub);
    sim_sum[time_step, 1] <- time_step;
    sim_sum[time_step, 2] <- sim_new$basic_output$resource_results[1];
    sim_sum[time_step, 3] <- sim_new$basic_output$observation_results[1];
    sim_sum[time_step, 4] <- sim_new$basic_output$manager_results[3];
    sim_sum[time_step, 5] <- sum(sim_new$basic_output$user_results[,3]);
    sim_sum[time_step, 6] <- ub;
    sim_old <- sim_new;
    ub      <- ub + 100; # Users gain budget each time step
    print(time_step);
}
colnames(sim_sum) <- c("Time", "Pop_size", "Pop_est", "Cull_cost", "Cull_count",
                       "User_budget");
The output of sim_sum is below.
Time Pop_size Pop_est Cull_cost Cull_count User_budget
[1,] 1 1215 1405.8957 10 292 500
[2,] 2 1065 1224.4898 10 336 600
[3,] 3 833 680.2721 110 36 700
[4,] 4 936 907.0295 110 42 800
[5,] 5 1174 1224.4898 10 401 900
[6,] 6 887 521.5420 110 54 1000
[7,] 7 988 680.2721 110 60 1100
[8,] 8 1084 975.0567 110 60 1200
[9,] 9 1208 861.6780 110 66 1300
[10,] 10 1360 1133.7868 10 520 1400
[11,] 11 975 861.6780 110 78 1500
[12,] 12 1079 1156.4626 10 560 1600
[13,] 13 597 770.9751 110 90 1700
[14,] 14 595 476.1905 110 96 1800
[15,] 15 586 612.2449 110 102 1900
[16,] 16 584 770.9751 110 108 2000
[17,] 17 557 589.5692 110 114 2100
[18,] 18 519 521.5420 110 120 2200
[19,] 19 469 521.5420 110 120 2300
[20,] 20 430 453.5147 110 126 2400
There is an important note to make about changing arguments to gmse_apply when old_list is being used: The function gmse_apply is trying to avoid a crash, so the function will accommodate parameter changes by rebuilding data structures if necessary. For example, if you change the number of stakeholders (and by including an argument stakeholders to gmse_apply, it is assumed that stakeholders are changing even if they are not), then a new array of agents will need to be built. If you change landscape dimensions (or just include the argument land_dim_1 or land_dim_2), then a new landscape will be built. This is mentioned in the documentation.
GMSE v0.3.1.7 passes all CRAN checks in Rstudio. I will make sure that the code works with win-builder, then prepare the new submission. Alternatively, as always, the newest GMSE version can be downloaded through GitHub if you have devtools installed in R.
devtools::install_github("bradduthie/GMSE")
I will soon update the manuscript for GMSE and upload it to bioRxiv.
Bug fix concerning density-based estimation
An error with density-based resource estimation (observe_type = 0) at very high values of agent_view was identified by Jeremy. When managers had a view of the landscape that encompassed a number of cells that was calculated to be larger than the actual number of landscape cells (as defined by land_dim_1 * land_dim_2), the manager would underestimate actual population size. This occurred only in the manager.c file and not in the equivalent R function shown during plotting. The bug was fixed in commit a916b8f8a40041b5f08984cf73348108482dde59 with a simple if statement. This has therefore been resolved in a patched GMSE v0.3.1.3, which is now available on GitHub.
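The kind of guard described can be sketched as follows. This is not the code from manager.c: the function name density_estimate is hypothetical, and it is an assumption of the example that the density-based estimate scales the observed count by the ratio of total landscape cells to cells viewed.

```c
#include <assert.h>

/* Density-based abundance estimate: scale the count observed within the
 * viewed cells up to the whole landscape. If the calculated number of viewed
 * cells exceeds the landscape size, cap it, because dividing by too many
 * cells would underestimate the population. */
double density_estimate(int observed, long cells_viewed,
                        int land_dim_1, int land_dim_2){
    long total_cells = (long) land_dim_1 * (long) land_dim_2;

    assert(observed >= 0 && cells_viewed > 0 && total_cells > 0);
    if(cells_viewed > total_cells){
        cells_viewed = total_cells; /* cannot view more cells than exist */
    }
    return (double) observed * (double) total_cells / (double) cells_viewed;
}
```

On a 100 by 100 landscape, 1000 observed resources with a calculated view of 20000 cells would be estimated as 500 without the cap, and as 1000 with it.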
Bug fix concerning resource movement
An error with the res_move_obs parameter was identified by Jeremy. This parameter was supposed to only affect resource movement during observation, but an if statement (corrected in commit 5eeb88d285af57984171e7d72410659b3b441af3) was causing res_move_obs = FALSE to stop resources from moving entirely in the resource model. This has now been resolved in a patched GMSE v0.3.1.1, which is now available on GitHub.
New option for removal of resources
A new option has been included for the argument res_death_type. By setting res_death_type = 3 in gmse or gmse_apply, resources can experience both density dependent (caused by res_death_K) and density independent (caused by remove_pr) removal simultaneously. Effects of each are independent of one another (i.e., both processes occur simultaneously, so the calculation of population size affecting removal due to carrying capacity includes resources that might experience density independent mortality).
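A back-of-the-envelope version of this combined removal is sketched below as an expected-value calculation. It is only an illustration: the name expected_survivors is hypothetical, and the density dependent removal probability of 1 - K/N when N exceeds K is an assumption of this example rather than something confirmed from resource.c. The point it shows is that the density dependent probability is computed from the full population size, before any density independent removal is applied.

```c
#include <assert.h>

/* Expected number of survivors when density independent (remove_pr) and
 * density dependent (res_death_K) removal act independently on each
 * resource in the same time step. */
double expected_survivors(double n, double res_death_K, double remove_pr){
    double pr_dd = 0.0; /* probability of density dependent removal */

    assert(n >= 0.0 && res_death_K > 0.0);
    assert(remove_pr >= 0.0 && remove_pr <= 1.0);
    if(n > res_death_K){
        pr_dd = 1.0 - res_death_K / n; /* uses n before any other removal */
    }
    return n * (1.0 - remove_pr) * (1.0 - pr_dd);
}
```

With 2000 resources, res_death_K = 1000 and remove_pr = 0.1, each resource survives with probability 0.9 * 0.5 = 0.45, giving about 900 expected survivors, which is below the carrying capacity because both processes remove resources in the same time step.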
New group_think parameter in GMSE v0.3.1.0
A new group_think parameter has been developed by Jeremy and me, and included in an updated v0.3.1.0. This parameter is FALSE by default, but when set to TRUE will cause all users to act as a single block instead of independently. In the code, a single user (user ID number 2) runs through the genetic algorithm, but then instead of having the resulting actions apply to only this user, they apply to all users, so that the genetic algorithm only needs to be run once in the user model. This decreases simulation time, particularly when there are a lot of users to model, but at the cost of removing all variation in actions among users. The group_think parameter can be defined in both gmse() and gmse_apply(), but I have not added it as an option in gmse_gui().
GMSE v0.3.0.0 now available with gmse_apply
The gmse_apply function is now available in a new GMSE version 0.3.0.0 (minor tweaks to other functions have also been made, but nothing that changes the user experience of gmse; these are mostly typos corrected in the documentation). The new function allows software users to integrate their own submodels (resource, observation, manager, and user) into GMSE, or to use their own submodels entirely within a single function.
GMSE apply function
The gmse_apply function is a flexible function that allows for user-defined sub-functions calling resource, observation, manager, and user models. Where such models are not specified, GMSE submodels ‘resource’, ‘observation’, ‘manager’, and ‘user’ are run by default. Any type of sub-model (e.g., numerical, individual-based) is permitted as long as the input and output are appropriately specified. Only one time step is simulated per call to gmse_apply, so the function must be looped for simulation over time. Where model parameters are needed but not specified, defaults from gmse are used.
gmse_apply arguments
res_mod The function specifying the resource model. By default, the individual-based resource model from gmse is called with default parameter values. User-defined functions must either return an unnamed matrix or vector, or return a named list in which one list element is named either ‘resource_array’ or ‘resource_vector’, and arrays must follow the format of GMSE in terms of column number and type (if there is only one resource type, then the model can also just return a scalar value).
obs_mod The function specifying the observation model. By default, the individual-based observation model from gmse is called with default parameter values. User-defined functions must either return an unnamed matrix or vector, or return a named list in which one list element is named either ‘observation_array’ or ‘observation_vector’, and arrays must follow the format of GMSE in terms of column number and type (if there is only one resource type, then the model can also just return a scalar value).
man_mod The function specifying the manager model. By default, the individual-based manager model that calls the genetic algorithm from gmse is used with default parameter values. User-defined functions must either return an unnamed matrix or vector, or return a named list in which one list element is named either ‘manager_array’ or ‘manager_vector’, and arrays must follow the (3 dimensional) format of the ‘COST’ array in GMSE in terms of column numbers and types, with appropriate rows for interactions and layers for agents (see documentation of GMSE for constructing these, if desired). User defined manager outputs will be recognised as costs by the default user model in gmse, but can be interpreted differently (e.g., total allowable catch) if specifying a custom user model.
use_mod The function specifying the user model. By default, the individual-based user model that calls the genetic algorithm from gmse is used with default parameter values. User-defined functions must either return an unnamed matrix or vector, or return a named list in which one list element is named either ‘user_array’ or ‘user_vector’, and arrays must follow the (3 dimensional) format of the ‘ACTION’ array in GMSE in terms of column numbers and types, with appropriate rows for interactions and layers for agents (see documentation of GMSE for constructing these, if desired).
get_res How the output should be organised. The default ‘basic’ attempts to distill results down to their key values from submodel outputs, including resource abundances and estimates, and manager policy and actions. An option ‘custom’ simply returns a large list that includes the output of every submodel. Any other option (e.g. ‘full’) will return a massive list with all of the input, output, and parameters used to run gmse_apply.
… Arguments passed to user-defined functions, and passed to modify default parameter values that would otherwise be called for gmse default models. Any argument that can be passed to gmse can be specified explicitly, just as if it were an argument to gmse. Similarly, any argument taken by a user-defined function should be specified, though the function will work if the user-defined function has a default that is not specified explicitly.
Example uses of gmse_apply
A simple run of gmse_apply() will return one generation of gmse using default submodels and parameter values.
sim <- gmse_apply();
For sim, the default ‘basic’ results are returned as below.
$resource_results
[1] 1102
$observation_results
[1] 1179.138
$manager_results
scaring culling castration feeding help_offspring
policy NA 10 NA NA NA
$user_results
resource_type scaring culling castration feeding help_offspring tend_crops kill_crops
Manager 1 NA 0 NA NA NA NA NA
user_2 1 NA 70 NA NA NA NA NA
user_3 1 NA 75 NA NA NA NA NA
user_4 1 NA 69 NA NA NA NA NA
user_5 1 NA 74 NA NA NA NA NA
Note in the case above we have the total abundance of resources returned, the estimate of resource abundance from the observation function, the costs the manager sets for the only available action of culling, and the number of culls attempted by each user.
The above was produced by all of the individual-based functions that are default in GMSE; custom generated subfunctions can instead be included provided that they fit the specifications described above. For example, we can define a very simple logistic growth function to send to res_mod instead.
alt_res <- function(X, K = 2000, rate = 1){
    X_1 <- X + rate*X*(1 - X/K);
    return(X_1);
}
The above function takes in a population size of X and returns a value X_1 based on the population intrinsic growth rate (rate) and carrying capacity K. Iterating the logistic growth model by itself under default parameter values with a starting population of 100 will cause the population to increase to carrying capacity in roughly 7 generations. The function can be substituted into gmse_apply to use it instead of the default GMSE resource model.
sim <- gmse_apply(res_mod = alt_res, X = 100, rate = 0.3);
The gmse_apply function will find the parameters it needs to run the alt_res function in place of the default resource function, either by running the default function values (e.g., K = 2000) or values specified directly into gmse_apply (e.g., X = 100 and rate = 0.3). If an argument to a custom function is required but not provided either as a default or specified in gmse_apply, then an error will be returned.
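The statement that the logistic model alone takes roughly 7 generations to reach carrying capacity from a starting population of 100 can be checked with a short C port of alt_res. The names logistic_step and steps_to_fraction are hypothetical, and "reaching carrying capacity" is taken here to mean getting within a chosen fraction of K, which is an assumption of the sketch.

```c
#include <assert.h>

/* One time step of discrete logistic growth; a C port of alt_res. */
double logistic_step(double X, double K, double rate){
    return X + rate * X * (1.0 - X / K);
}

/* Number of time steps needed for X to reach fraction * K, or max_steps if
 * that level is not reached in time. */
int steps_to_fraction(double X, double K, double rate,
                      double fraction, int max_steps){
    int steps = 0;

    assert(K > 0.0 && fraction > 0.0 && fraction <= 1.0);
    while(X < fraction * K && steps < max_steps){
        X = logistic_step(X, K, rate);
        steps++;
    }
    return steps;
}
```

Starting from 100 with K = 2000 and rate = 1, the population passes 95 percent of K after 6 steps (about 1925) and 99 percent of K after 7 steps (about 1997).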
To integrate across different types of submodels, gmse_apply translates between vectors and arrays between each submodel. For example, because the default GMSE observation model requires a resource array with particular requirements for column identities, when a resource model subfunction returns a vector, or a list with a named element ‘resource_vector’, this vector is translated into an array that can be used by the observation model. Specifically, each element of the vector identifies the abundance of a resource type (and hence will usually be just a single value denoting abundance of the only focal population). If this is all the information provided, then a resource_array will be made with default GMSE parameter values with an identical number of rows to the abundance value (floored if the value is a non-integer; non-default values can also be put into this transformation from vector to array if they are specified in gmse_apply, e.g., through an argument such as lambda = 0.8). Similarly, a resource_array is also translated into a vector after the default individual-based resource model is run, should the observation model require simple abundances instead of an array. The same is true of observation_vector and observation_array objects returned by observation models, of manager_vector and manager_array (i.e., COST) objects returned by manager models, and of user_vector and user_array (i.e., ACTION) objects returned by user models. At each step, a translation between the two is made, with necessary adjustments that can be tweaked through arguments to gmse_apply when needed. Alternative observation, manager, and user submodels, for example, are defined below; note that each requires a vector from the preceding model.
# Alternative observation submodel
alt_obs <- function(resource_vector){
    X_obs <- resource_vector - 0.1 * resource_vector;
    return(X_obs);
}
# Alternative manager submodel
alt_man <- function(observation_vector){
    policy <- observation_vector - 1000;
    if(policy < 0){
        policy <- 0;
    }
    return(policy);
}
# Alternative user submodel
alt_usr <- function(manager_vector){
    harvest <- manager_vector + manager_vector * 0.1;
    return(harvest);
}
All of these submodels are completely deterministic, so when run with the same parameter combinations, they produce replicable outputs.
gmse_apply(res_mod = alt_res, obs_mod = alt_obs,
man_mod = alt_man, use_mod = alt_usr, X = 1000);
The above, for example, produces the following output (note that the X argument needs to be specified, but the rest of the subfunctions take vectors that gmse_apply recognises will become available after a previous submodel is run).
$resource_results
[1] 1500
$observation_results
[1] 1350
$manager_results
[1] 350
$user_results
[1] 385
Note that the manager_results and user_results are ambiguous here, and can be interpreted as desired (e.g., as total allowable catch and catches made, or as something like costs of catching set by the manager and effort put into catching by the user). Hence while manager output is set in terms of costs of performing each action, and user output is set in terms of action attempts, this need not be the case when using gmse_apply (though it should be recognised when using default GMSE manager and user functions).
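Because all four alternative submodels are deterministic, the output of 1500, 1350, 350 and 385 for X = 1000 can be reproduced by chaining them by hand. The sketch below is a C port of the R functions for checking that arithmetic; the step_out struct and the one_step function are hypothetical and only mimic the order in which the submodels are called, not gmse_apply itself.

```c
#include <assert.h>

/* C ports of the alternative submodels defined in R above. */
double alt_res(double X, double K, double rate){
    return X + rate * X * (1.0 - X / K);
}

double alt_obs(double resource){
    return resource - 0.1 * resource; /* always underestimates by 10% */
}

double alt_man(double observation){
    double policy = observation - 1000.0;
    return policy < 0.0 ? 0.0 : policy;
}

double alt_usr(double manager){
    return manager + manager * 0.1;
}

/* Results of one time step: resource, observation, manager, then user. */
typedef struct {
    double resource, observation, manager, user;
} step_out;

step_out one_step(double X){
    step_out out;

    assert(X >= 0.0);
    out.resource    = alt_res(X, 2000.0, 1.0); /* defaults K = 2000, rate = 1 */
    out.observation = alt_obs(out.resource);
    out.manager     = alt_man(out.observation);
    out.user        = alt_usr(out.manager);
    return out;
}
```

Calling one_step(1000.0) gives 1500 resources, an observation of 1350, a manager value of 350 (the excess over 1000) and a user value of 385, matching the gmse_apply output shown above.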
GMSE default submodels can be added in at any point.
gmse_apply(res_mod = alt_res, obs_mod = observation,
man_mod = alt_man, use_mod = alt_usr, X = 1000)
The above produces the results below.
$resource_results
[1] 1500
$observation_results
[1] 1655.329
$manager_results
[1] 655.3288
$user_results
[1] 720.8617
If we wanted to, for example, specify a simple resource and observation model, but then take advantage of the genetic algorithm to predict policy decisions and user actions, we could use the default GMSE manager and user functions (written below explicitly, though this is not necessary).
gmse_apply(res_mod = alt_res, obs_mod = alt_obs,
man_mod = manager, use_mod = user, X = 1000)
The above produces the output below returning culling costs and culling actions attempted by four users (note that the default manager target abundance is 1000).
$resource_results
[1] 1500
$observation_results
[1] 1350
$manager_results
scaring culling castration feeding help_offspring
policy NA 10 NA NA NA
$user_results
resource_type scaring culling castration feeding help_offspring tend_crops kill_crops
Manager 1 NA 0 NA NA NA NA NA
user_2 1 NA 70 NA NA NA NA NA
user_3 1 NA 70 NA NA NA NA NA
user_4 1 NA 71 NA NA NA NA NA
user_5 1 NA 73 NA NA NA NA NA
Instead of using the gmse function, we might simulate multiple generations by calling gmse_apply through a loop, reassigning outputs where necessary for the next generation (where outputs are not reassigned, new defaults will be inserted in their place, so, e.g., if we were to just loop without reassigning any variables, nothing would update and we would be running the same model, effectively, multiple times). Below shows how this might be done.
sim1      <- gmse_apply(get_res = "full", lambda = 0.3);
RESOURCES <- sim1$resource_array;
LAND      <- sim1$LAND;
PARAS     <- sim1$PARAS;
# COST and ACTION must exist before the first loop iteration; taking them from
# sim1 assumes that the full output holds them, as it does for sim_new below
COST      <- sim1$COST;
ACTION    <- sim1$ACTION;
results   <- matrix(data = NA, nrow = 40, ncol = 4);
for(time_step in 1:40){
    sim_new <- gmse_apply(RESOURCES = RESOURCES, LAND = LAND, PARAS = PARAS,
                          COST = COST, ACTION = ACTION, stakeholders = 10,
                          get_res = "full", agent_view = 20);
    results[time_step, 1] <- sim_new$resource_vector;
    results[time_step, 2] <- sim_new$observation_vector;
    results[time_step, 3] <- sim_new$manager_vector;
    results[time_step, 4] <- sim_new$user_vector;
    # Reassign outputs so that the next time step starts from this one
    RESOURCES <- sim_new$resource_array;
    LAND      <- sim_new$LAND;
    PARAS     <- sim_new$PARAS;
    COST      <- sim_new$COST;
    ACTION    <- sim_new$ACTION;
}
colnames(results) <- c("Abundance", "Estimate", "Cull_cost", "Cull_attempts");
The above results in the following output for results.
Abundance Estimate Cull_cost Cull_attempts
[1,] 1195 1165.9726 10 716
[2,] 1045 939.9167 110 461
[3,] 1160 1160.0238 10 715
[4,] 1056 1183.8192 10 715
[5,] 1014 850.6841 110 468
[6,] 1171 1237.3587 10 717
[7,] 1026 993.4563 110 464
[8,] 1202 957.7632 110 464
[9,] 1394 1469.3635 10 702
[10,] 1333 1457.4658 10 702
[11,] 1277 1397.9774 10 702
[12,] 1175 1415.8239 10 702
[13,] 1088 701.9631 110 468
[14,] 1275 1207.6145 10 718
[15,] 1200 1332.5402 10 718
[16,] 1116 1029.1493 45 512
[17,] 1249 1814.3962 10 699
[18,] 1141 1273.0518 10 722
[19,] 1019 963.7121 110 455
[20,] 1216 1629.9822 10 708
[21,] 1088 1130.2796 10 708
[22,] 988 1035.0982 38 537
[23,] 1056 1029.1493 45 505
[24,] 1154 749.5538 110 463
[25,] 1344 1499.1077 10 722
[26,] 1268 1386.0797 10 712
[27,] 1165 1493.1588 10 707
[28,] 1061 1070.7912 19 633
[29,] 1019 1076.7400 17 663
[30,] 961 600.8328 110 457
[31,] 1135 874.4795 110 450
[32,] 1338 1189.7680 10 701
[33,] 1275 1600.2380 10 710
[34,] 1174 1362.2844 10 709
[35,] 1104 1112.4331 12 685
[36,] 1003 1302.7960 10 715
[37,] 828 1183.8192 10 712
[38,] 649 785.2469 110 462
[39,] 739 1023.2005 56 488
[40,] 813 910.1725 110 455
Note that managers increase the cost of culling based on the time step’s estimated abundance, and user culling attempts decrease when culling costs increase.
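The relationship between culling cost and culling attempts can be checked directly from the table. The sketch below copies the first eight rows of the output above by hand, so it runs without GMSE; with the full loop output, the same calls could presumably be made on results[, "Cull_cost"] and results[, "Cull_attempts"].

```r
# First eight rows of the results table above, copied by hand
cull_cost     <- c(10, 110, 10, 10, 110, 10, 110, 110);
cull_attempts <- c(716, 461, 715, 715, 468, 717, 464, 464);
# Correlation between the cost set by the manager and attempts by users
cost_attempt_cor <- cor(cull_cost, cull_attempts);
# Mean number of culling attempts at each cost level
mean_by_cost <- tapply(cull_attempts, cull_cost, mean);
print(cost_attempt_cor);
print(mean_by_cost);
```

In these eight rows the correlation is strongly negative, and attempts fall from roughly 716 to roughly 464 when the cost rises from 10 to 110.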
In addition to the flexibility of allowing user-defined submodels, gmse_apply is also useful for modellers who might be interested in simulating processes not currently available in gmse by itself. For example, if we wanted to model a sudden environmental perturbation decreasing population size, or a sudden influx of new users, after 30 generations, we could do so in the loop.
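As a minimal sketch of the idea, the loop below applies a perturbation at time step 30. To keep it runnable without GMSE, it iterates only the simple logistic alt_res model from this notebook rather than calling gmse_apply; the 50% loss and the growth rate of 0.3 are arbitrary values chosen for illustration. In a gmse_apply loop, the same if statement would instead modify whatever is being carried between iterations (e.g., the resource array, or the number of stakeholders).

```r
# Simple logistic growth model, as defined elsewhere in this notebook
alt_res <- function(X = 1000, K = 2000, r = 1){
    X_1 <- X + r * X * (1 - X / K);
    return(X_1);
}
time_steps <- 40;
abundance  <- numeric(time_steps);
X          <- 1000;
for(time_step in 1:time_steps){
    X <- alt_res(X = X, r = 0.3);
    if(time_step == 30){
        # Hypothetical perturbation: half the population is lost
        X <- 0.5 * X;
    }
    abundance[time_step] <- X;
}
print(round(abundance[28:32]));
```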
In the near future, the gmse_apply function will be included in the GMSE vignette and submitted to CRAN with the rest of v0.3.0.0 – in the meantime, I believe that all major bugs have been ironed out, but please let me know or report an issue if you are able to crash the function (i.e., if you run it and it causes R to crash – you should always get an error message before this happens).
To download the latest GMSE v0.3.0.0, simply run the below in R (make sure that devtools is installed).
devtools::install_github("bradduthie/GMSE")
I welcome any feedback, and I expect to submit an update to CRAN around late October.
New function gmse_apply complete and tested
I have now completed the gmse_apply function, which exploits the full modularity of GMSE by allowing software users to develop their own sub-functions and string them together with any combination of GMSE default sub-functions. As a brief summary, gmse_apply includes the following features:
Any arguments for custom user functions can simply be passed along by specifying them in gmse_apply. For example, if we have a custom resource function alt_res below:
alt_res <- function(X = 1000, K = 2000, r = 1){
    X_1 <- X + r*X*(1 - X/K);
    return(X_1);
}
We can simply include the above in gmse_apply as follows to use the very simple logistic growth sub-model with the individual-based submodels that are defaults of GMSE.
sim_app <- gmse_apply(res_mod = alt_res);
The gmse_apply function simply adds in GMSE defaults for unspecified models, but we can specify them too.
sim_app <- gmse_apply(res_mod = alt_res, obs_mod = observation);
To adjust parameters in the alternative resource model, simply add in the arguments as below.
sim_app <- gmse_apply(res_mod = alt_res, X = 2000, K = 5000, r = 1.2);
The gmse_apply function will know where to place them, and update them should they be needed for other models.
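One way that this kind of argument routing can be done in base R is to compare the names of the supplied arguments against the formal arguments of each submodel. The sketch below is an illustration of that general technique only; call_with_matching is a hypothetical helper, and the actual internals of gmse_apply may well differ.

```r
# Simple logistic growth model, as defined above
alt_res <- function(X = 1000, K = 2000, r = 1){
    X_1 <- X + r * X * (1 - X / K);
    return(X_1);
}
# Hypothetical helper: call 'fun' using only the arguments in '...' whose
# names match the formal arguments of 'fun', ignoring everything else
call_with_matching <- function(fun, ...){
    args <- list(...);
    keep <- names(args) %in% names(formals(fun));
    return(do.call(fun, args[keep]));
}
# 'stakeholders' is not an argument of alt_res, so it is silently dropped
new_X <- call_with_matching(alt_res, X = 2000, K = 5000, r = 1.2,
                            stakeholders = 10);
print(new_X);
```

Arguments that do not belong to alt_res (here, stakeholders) are left for whichever other submodel declares them.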
I will give a more lengthy description of how to use gmse_apply tomorrow, when I push GMSE v0.3.0.0 to the master branch of GitHub and advertise the update.
Compensation suggestion
A suggestion from Jeremy to include a compensation option for users. Users could devote some of their budget to compensation, then managers could compensate a proportion of their damaged yield. Implementing this will require consideration from the manager’s perspective with respect to the genetic algorithm – the users’ perspective will be easier because a user can remember their previous losses and assess compensation versus culling. Managers might have to think about how compensation could incentivise non-culling, but this might actually already work given the way the manager anticipates actions; more investigation into this will be useful following the finalisation of gmse_apply(), which is in progress.
Progress has been made on the gmse_apply() function. My goal is to make this as modular as possible – to allow any four functions to be included in the GMSE framework, including arbitrary arguments to each function. The gmse_apply() function will recognise which arguments go along with which functions, and naturally string together results from one sub-function to the input of the next sub-function (though this will demand that the output from functions is labelled in a way that matches the arguments of the next function; e.g., if you have a ‘N_total’ as input for the observation model, then ‘N_total’ will either need to be labelled output of the resource model or specified directly in gmse_apply()). Default submodels will be the IBMs used in gmse(), and where arguments are not specified by the software user in gmse_apply() (e.g., LAND) they will be built from default gmse() parameters.
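The name-matching requirement can be illustrated with a small base R sketch in which each submodel returns a named list, and each named output is made available to the next submodel if that submodel has an argument of the same name. Everything here (res_mod, obs_mod, run_chain, and the value names) is hypothetical and written only to show the idea; it is not the gmse_apply() implementation.

```r
# Hypothetical resource model: labels its output 'N_total'
res_mod <- function(N_total = 100, r = 0.1){
    return(list(N_total = N_total * (1 + r)));
}
# Hypothetical observation model: takes 'N_total' as input by name
obs_mod <- function(N_total, bias = 0.9){
    return(list(N_est = N_total * bias));
}
# Run submodels in order, keeping a pool of named values; each submodel
# receives the pooled values whose names match its own arguments
run_chain <- function(mods, ...){
    pool <- list(...);
    for(mod in mods){
        args <- pool[names(pool) %in% names(formals(mod))];
        out  <- do.call(mod, args);
        pool[names(out)] <- out;  # labelled outputs update the pool
    }
    return(pool);
}
chain_out <- run_chain(list(res_mod, obs_mod), N_total = 200);
print(chain_out$N_est);
```

If the resource model had labelled its output something other than N_total, the observation model would have received only the value specified directly in the call, which is the situation that the labelling requirement is meant to avoid.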
The GMSE GUI has been updated with all of the new features in version 0.2.3.0. The gmse_gui() function is likewise updated in a new patch version 0.2.3.1. I did this quickly because the GUI was actually easy to update; plans for the gmse_apply function are now also clear, and I hope to have a working function and version 0.3.0.0 by the end of the week, or by early next week.
GMSE Version 0.2.3.0 on GitHub
I have pushed a new version 0.2.3.0 of GMSE onto the master branch of GitHub, which means that the most up-to-date version can be installed using the code below (make sure the devtools library is installed).
devtools::install_github("bradduthie/GMSE")
The new version includes multiple new features:
- plot_gmse_effort function, which shows the conflict between manager targets and user actions more directly (see plots from 23 AUG 2017 notes).
- gmse_summary function, which takes the large output produced from gmse and returns a much easier to understand set of tables.
- gmse_gui function that has better defaults and parameter organisation, which has been uploaded to shiny for use in a browser.
To run a simple default simulation, the gmse function remains unchanged.
sim <- gmse();
To plot the effort of managers and users, use the below.
plot_gmse_effort(agents = sim$agents, paras = sim$paras,
ACTION = sim$action, COST = sim$cost);
Below summarises the results more cleanly, extracting key information from sim.
gmse_summary(sim);
And as before, the GUI can be called directly from the R console.
gmse_gui();
The GUI does not yet allow you to get a view of the plot_gmse_effort output, or a gmse_summary, but this will be a goal for future versions of GMSE.
If able, I recommend updating to version 0.2.3.0 as soon as possible. In the coming few days, I will also add the gmse_apply function, primarily for developers who will benefit from a more modular way of using GMSE, allowing for different types of submodules to be used within the broader GMSE framework. When the new apply function has been added (and possibly the GUI improved), I will submit a new version 0.3.x.x to CRAN.
Bug Fix and tweaks to agent prediction
I have now fixed a bug in the code that was causing confusion between culling and castration. After recompiling and running simulations, manager and user actions improve. I have also made some minor changes to default gmse() options. Regarding the predicted consequences of manager and user actions (i.e., the predictions from the agents’ perspective that guide their decision making), I have adjusted some things to make them more in line with what is expected in the simulation as follows (recall that managers are interested in global abundance and users are interested specifically in how abundance affects themselves):
These values are a bit more in line with what will actually happen, so we assume that managers and users are a bit more informed now. It also allows for a bit more differentiation among actions. Overall, the model appears to perform better now – meaning that managers and users appear to be better predictors of the consequences of their actions.
Before finishing the gmse_apply() function, I will push an updated version of GMSE to GitHub with these changes, plus new plotting options.
I have written a gmse_summary function (see below), which returns a simplified list that includes four elements, each of which is a table of data:
1. resources, a table showing time step in the first column, followed by resource abundance in the second column.
2. observations, a table showing time step in the first column, followed by the estimate of population size (produced by the manager) in the second column.
3. costs, a table showing time step in the first column, manager number in the second column (should always be zero), followed by the costs of each action set by the manager (policy); the far-right column indicates budget that is unused and therefore not allocated to any policy.
4. actions, a table showing time step in the first column, user number in the second column, followed by the actions of each user in the time step; additional columns indicate unused actions, crop yield on the user’s land (if applicable), and the number of resources that a user successfully harvests (i.e., ‘culls’).
At the moment, I have not added in the actual number of resources that a user culls. This will be added shortly, after which I will post a new function. Doing so is a bit more complicated because it requires me to go into the C code and make a recording every time it happens (see how I plan to do this below the function).
gmse_summary <- function(gmse_results){
    time_steps <- dim(gmse_results$paras)[1];
    parameters <- gmse_results$paras[1,];
    #--- First get the resource abundances
    res_types <- unique(gmse_results$resource[[1]][,2]);
    resources <- matrix(data = 0, nrow = time_steps,
                        ncol = length(res_types) + 1);
    res_colna <- rep(x = NA, times = dim(resources)[2]);
    res_colna[1] <- "time_step";
    for(i in 1:length(res_types)){
        res_colna[i+1] <- paste("type_", res_types[i], sep = "");
    }
    colnames(resources) <- res_colna;
    #--- Next get estimates and the costs set by the manager
    observations <- matrix(data = 0, nrow = time_steps,
                           ncol = length(res_types) + 1);
    costs <- matrix(data = NA, nrow = time_steps*length(res_types), ncol = 10);
    agents <- gmse_results$agents[[1]];
    users <- agents[agents[,2] > 0, 1];
    actions <- matrix(data = NA, ncol = 13,
                      nrow = time_steps * length(res_types) * length(users));
    c_row <- 1;
    a_row <- 1;
    for(i in 1:time_steps){
        the_res <- gmse_results$resource[[i]][,2];
        manager_acts <- gmse_results$action[[i]][,,1];
        resources[i, 1] <- i;
        observations[i, 1] <- i;
        land_prod <- gmse_results$land[[i]][,,2];
        land_own <- gmse_results$land[[i]][,,3];
        for(j in 1:length(res_types)){
            #---- Resource abundance below
            resources[i,j+1] <- sum(the_res == res_types[j]);
            #---- Manager estimates below
            target_row <- which(manager_acts[,1] == -2 &
                                manager_acts[,2] == res_types[j]);
            estim_row <- which(manager_acts[,1] == 1 &
                               manager_acts[,2] == res_types[j]);
            target <- manager_acts[target_row, 5];
            adjusr <- manager_acts[estim_row, 5];
            observations[i,j+1] <- target - adjusr;
            #---- Cost setting below
            costs[c_row, 1] <- i;
            costs[c_row, 2] <- res_types[j];
            estim_row <- which(manager_acts[,1] == 1 &
                               manager_acts[,2] == res_types[j]);
            if(parameters[89] == TRUE){
                costs[c_row, 3] <- manager_acts[estim_row, 8];
            }
            if(parameters[90] == TRUE){
                costs[c_row, 4] <- manager_acts[estim_row, 9];
            }
            if(parameters[91] == TRUE){
                costs[c_row, 5] <- manager_acts[estim_row, 10];
            }
            if(parameters[92] == TRUE){
                costs[c_row, 6] <- manager_acts[estim_row, 11];
            }
            if(parameters[93] == TRUE){
                costs[c_row, 7] <- manager_acts[estim_row, 12];
            }
            if(parameters[94] == TRUE){
                costs[c_row, 8] <- parameters[97];
            }
            if(parameters[95] == TRUE){
                costs[c_row, 9] <- parameters[97];
            }
            costs[c_row, 10] <- manager_acts[estim_row, 13] - parameters[97];
            c_row <- c_row + 1;
            #--- Action setting below
            for(k in 1:length(users)){
                usr_acts <- gmse_results$action[[i]][,,users[k]];
                actions[a_row, 1] <- i;
                actions[a_row, 2] <- users[k];
                actions[a_row, 3] <- res_types[j];
                res_row <- which(usr_acts[,1] == -2 &
                                 usr_acts[,2] == res_types[j]);
                if(parameters[89] == TRUE){
                    actions[a_row, 4] <- usr_acts[res_row, 8];
                }
                if(parameters[90] == TRUE){
                    actions[a_row, 5] <- usr_acts[res_row, 9];
                }
                if(parameters[91] == TRUE){
                    actions[a_row, 6] <- usr_acts[res_row, 10];
                }
                if(parameters[92] == TRUE){
                    actions[a_row, 7] <- usr_acts[res_row, 11];
                }
                if(parameters[93] == TRUE){
                    actions[a_row, 8] <- usr_acts[res_row, 12];
                }
                if(j == length(res_types)){
                    if(parameters[104] > 0){
                        land_row <- which(usr_acts[,1] == -1);
                        if(parameters[95] > 0){
                            actions[a_row, 9] <- usr_acts[land_row, 10];
                        }
                        if(parameters[94] > 0){
                            actions[a_row, 10] <- usr_acts[land_row, 11];
                        }
                    }
                    actions[a_row, 11] <- sum(usr_acts[, 13]);
                }
                if(parameters[104] > 0){
                    max_yield <- sum(land_own == users[k]);
                    usr_yield <- sum(land_prod[land_own == users[k]]);
                    actions[a_row, 12] <- 100 * (usr_yield / max_yield);
                }
                a_row <- a_row + 1;
            }
        }
    }
    cost_col <- c("time_step", "resource_type", "scaring", "culling",
                  "castration", "feeding", "helping", "tend_crop",
                  "kill_crop", "unused");
    colnames(costs) <- cost_col;
    colnames(resources) <- res_colna;
    colnames(observations) <- res_colna;
    action_col <- c("time_step", "user_ID", "resource_type", "scaring",
                    "culling", "castration", "feeding", "helping", "tend_crop",
                    "kill_crop", "unused", "crop_yield", "harvested");
    colnames(actions) <- action_col;
    the_summary <- list(resources = resources,
                        observations = observations,
                        costs = costs,
                        actions = actions);
    return(the_summary);
}
To record kills, I think that the best way is to use the resource mortality adjustment column (at the moment, column 17 in C and 18 in R of the resource array). Mortality as of now is just adjusted to 1 in the event of a kill, and mortality occurs whenever a random probability is greater than or equal to 1. Hence, I can replace the 1 value with the user’s ID (for non-managers, this must be at least 1), and then the resource array will record the ID of the user that killed it at the particular time step. Note that this cannot be done for other adjustments such as growth rate or offspring production because the values are not interpreted as probabilities.
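A rough base R sketch of this bookkeeping is below. The six-row resource array, the user IDs, and the rule that a resource dies when its adjustment value is at least as large as a uniform random draw are all assumptions made for illustration; the real change would be made in the C code.

```r
# Hypothetical resource array: one row per resource, mortality
# adjustment stored in column 18 (the R index of the column)
n_res     <- 6;
mort_col  <- 18;
resources <- matrix(data = 0, nrow = n_res, ncol = mort_col);
# Instead of writing 1 on a kill, write the ID of the user who made it
resources[c(2, 5), mort_col] <- 3;  # user 3 kills resources 2 and 5
resources[4, mort_col]       <- 1;  # user 1 kills resource 4
# Assumed mortality rule: any adjustment of 1 or more always exceeds a
# uniform draw on (0, 1), so using the ID still guarantees death
dies <- resources[, mort_col] >= runif(n_res);
# The surviving record identifies who killed each resource
killed_by      <- resources[dies, mort_col];
kills_per_user <- table(killed_by);
print(kills_per_user);
```

Tabulating the column by user ID would then give the 'harvested' count for each user in a time step.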
I will do the above tomorrow, which should not take too long. I will then continue work on the gmse_apply function.
Currently, the gmse() function returns a list that includes all of the data produced by the model, some details of which are required for plotting.
sim_results <- list(resource = RESOURCE_REC,
                    observation = OBSERVATION_REC,
                    paras = PARAS_REC,
                    land = LANDSCAPE_REC,
                    time_taken = total_time,
                    agents = AGENT_REC,
                    cost = COST_REC,
                    action = ACTION_REC
                    );
I think that this list is fine, perhaps necessary, to keep, but the ConFooBio group has also concluded that there should be some easier to understand summary of the data. I propose writing a function, gmse_summary(), that summarises the results in a way that is easier to understand. The function could just be run as below.
sim <- gmse();
sim_summary <- gmse_summary(sim);
The output of gmse_summary() should be a list of all of the relevant information that a user might want to plot or analyse. It should include the following list elements.
sim_summary$resources
sim_summary$observations
sim_summary$costs
sim_summary$actions
More might be needed, but the above should be a good starting point that will provide four clear data tables for the user. The tables will look like the below.
1. Resource abundances over time
time_step | abundance |
---|---|
1 | 100 |
2 | 104 |
… | … |
99 | 116 |
100 | 108 |
In the above, only the resource abundance is reported to the software user, though it might also be useful to have additional columns as well eventually.
2. Observation estimates of abundance over time
time_step | estimated_abundance |
---|---|
1 | 102 |
2 | 101 |
… | … |
99 | 121 |
100 | 112 |
In the above, only the estimate from the observation submodel is reported to the software user. Additional columns might also be useful for things like confidence intervals, though for now I’m not sure if this is needed.
3. Costs set in each time step
time_step | manager | scaring | castration | culling | feeding | helping | unused |
---|---|---|---|---|---|---|---|
1 | 0 | 40 | NA | 60 | NA | NA | 0 |
2 | 0 | 36 | NA | 62 | NA | NA | 2 |
… | … | … | … | … | … | … | … |
99 | 0 | 0 | NA | 100 | NA | NA | 0 |
100 | 0 | 3 | NA | 97 | NA | NA | 0 |
In the above, the manager number is always 0 because this is the number of the agent that has that role in GMSE. All impossible actions (specified by the simulation) are labelled NA, while the possible scaring and culling actions are given values that correspond to the cost of each action for users in each time step. Hence the table summarises policy for each time step in a way that software users can interpret more cleanly.
4. Actions in each time step
time_step | user | scaring | castration | culling | feeding | helping | tend_crop | kill_crop | unused | crop_yield | harvested |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 50 | NA | 50 | NA | NA | NA | NA | 0 | 90 | 12 |
1 | 2 | 59 | NA | 40 | NA | NA | NA | NA | 1 | 92 | 9 |
1 | 3 | 100 | NA | 0 | NA | NA | NA | NA | 0 | 89 | 0 |
2 | 1 | 44 | NA | 66 | NA | NA | NA | NA | 0 | 88 | 16 |
2 | 2 | 52 | NA | 48 | NA | NA | NA | NA | 0 | 94 | 12 |
2 | 3 | 98 | NA | 0 | NA | NA | NA | NA | 2 | 90 | 0 |
… | … | … | … | … | … | … | … | … | … | … | … |
99 | 1 | 36 | NA | 63 | NA | NA | NA | NA | 1 | 79 | 20 |
99 | 2 | 40 | NA | 60 | NA | NA | NA | NA | 0 | 83 | 18 |
99 | 3 | 28 | NA | 72 | NA | NA | NA | NA | 0 | 88 | 12 |
100 | 1 | 35 | NA | 62 | NA | NA | NA | NA | 3 | 82 | 18 |
100 | 2 | 37 | NA | 63 | NA | NA | NA | NA | 0 | 84 | 22 |
100 | 3 | 23 | NA | 77 | NA | NA | NA | NA | 0 | 84 | 13 |
The above action table has more rows in it than the cost table because a row is needed for each user in each time step. This gives the software user full access to each individual user’s actions, and their results. Note that as above, castration, feeding, and helping are not options. Additionally, in this hypothetical simulation, tending or killing crops are not options, so no actions are performed. Users divide their budget between scaring and culling in each time step. The last two columns also give useful information to the software user. The first is crop yield on the user’s owned land (should probably be NA if land_ownership = FALSE), which will reflect the percentage of the total possible yield (or maybe raw yield?) for each user – hence allowing the table to directly correlate actions with yield. The last column is the number of resources ‘harvested’, which I think should count successful ‘kills’ (rather than just actions devoted to culling). The realised culling might be lower than the actions devoted to culling, for example, if not enough resources are actually on the user’s land to cull. Additional statistics for each user could be added in as columns, but this seems a good place to start. This gmse_summary function, producing a list of the above four tables, will be included in the next version of GMSE, along with the new plotting function highlighting the conflict itself, and the gmse_apply function discussed on 6 SEPT.
Continued progress has been made on slides for an upcoming talk.
I will be giving a talk on 19 September 2017 for the Mathematics and Statistics Group at the University of Stirling on GMSE as a general tool for management strategy evaluation. Slides for this talk will be available on GitHub.
The alternative approach from Wednesday is being implemented smoothly. Passing user-defined functions in a modular way is possible, but inputs and outputs need to be carefully considered within gmse_apply(). The objective is to make things as easy and flexible as possible for the user, while also making sure that the function runs efficiently.
A modular function for modellers
I am beginning work on a gmse_apply() function, which will improve the modularity of GMSE for developers. The goal behind this function will be to provide a simple tool for allowing developers to use their own resource and observation models and, with the correct inputs, take advantage of the manager and user functions. Hence, simple resource and observation models will be possible, but the flexibility of GMSE should be retained as much as possible. A few starting points include the following:
- gmse_apply() function will run a single cycle of GMSE instead of multiple time steps.
- gmse() options will be acceptable, but not obvious, appearing as ... passes that the user can add if they want to change things. Otherwise, defaults will be used.
Inputs and outputs of different functions will then include the following:
- gmse_apply(), the only thing required of the user-defined function will be the population size vector, and other parameters will be specified in the logistic function, e.g.: gmse_apply(resource = LV(pop, K = 100, r = 1, ...), observation = ..., type = "Numerical").
- resource() function, though some options to input landscape and starting conditions will be needed (though these should also switch to default).
- observation() function, as with the resource model.
- estimate_abundances function in manager.c can be bypassed entirely (i.e., I still think that we’ll want to call the c function as normal, but add an option). This can be arranged simply by reading in the observation vector (might have to re-structure to an array a bit better?) as the OBSERVATION array in c, but then instead of running estimate_abundances, allow an if-else statement to read this array as abun_est given a new value in paras, after which the genetic algorithm and set_action_costs can be run as normal. The irrelevant output can just be ignored by the user model in gmse_apply.
- manager() function in its normal form. As with the others, this will require some eventual decisions about initialisation, but I can worry about this later.
- gmse_apply() that would allow users to specify their own manager functions, but for the moment this is just going to be hidden because the genetic algorithm requires the COST and ACTION arrays to be in the correct form for use. Hence, if someone later wants to apply their own manager or user function, they will either need to get the input-output correct (at the moment) or (eventually) use some different input-output data structure that I make up later; but at some point, things would just collapse down to MSEtools. Or, rather, gmse_apply() just becomes a trivial function that includes four lines of compatible code, calling each model.
- ACTION output from user() will just be translated in R to adjusting the vector.
- user() function in its normal form. As with the others, this will require some eventual decisions about initialisation to be determined later.
As an alternative, at least to the implementation, I think that the call could be made at the level of the individual resource() and observation() R functions. This was kind of always the plan, but there’s a semi-dirty way to mix numerical resource and observation models with the full individual-based manager and user models. This can be done by adding a model option to be user defined through an if( is.function(model) == TRUE ) in the resource.R function. If the condition is satisfied, then resource() will shift to the user generated model. This can actually be done for all of the submodels very easily.
This alternative might be a better way to go. The aforementioned ‘dirty’ part of the technique might be to check to see if the output is in the correct form, then, if only a vector is returned – turn it into the correct form by making a data frame that has the same number of rows by calling make_resource. The type 1 values could correspond to vector elements. Admittedly, this could get slow for huge population sizes, but population sizes would have to be massive for R to slow down from simply making a matrix with a lot of rows. In any case, it would at least standardise the input and output for the user of gmse_apply in a way that plays nice with everything else in GMSE.
Similarly, the observation function could also call make_resource if a vector is returned (since individual variation wouldn’t be relevant in the numerical model).
With this alternative approach, no changes to the C code need to be made – the inputs and outputs just need to be tweaked into a standardised way when a vector or scalar is returned from any user-defined model (small detail – population size needs to be an integer). This can be an option later for the user and manager models – though I’m not sure how this would work, exactly. A benefit here is that some parts of the model could conceivably be individual-based, with others being numerical – the trade-off being the requirement for discrete resource numbers and a very small amount of slowdown (which will almost certainly not be noticeable for any reasonable model).
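A minimal sketch of this dispatch-and-standardise idea is below, under stated assumptions. Both resource_step and vector_to_resource_array are hypothetical names; the latter stands in for the role that make_resource would play, and the three-column layout (ID, type, and one spare column) is invented for illustration rather than being the real GMSE resource array.

```r
# Hypothetical stand-in for building a resource array from a single
# abundance value: one row per individual, all of resource type 1
vector_to_resource_array <- function(N, n_col = 3){
    N   <- as.integer(floor(N));  # population size must be an integer
    res <- matrix(data = 0, nrow = N, ncol = n_col);
    res[, 1] <- seq_len(N);       # individual IDs
    res[, 2] <- 1;                # resource type
    return(res);
}
# Hypothetical wrapper: if a function is supplied, use it as the model
# and standardise a scalar output into array form
resource_step <- function(model, N, ...){
    if(is.function(model) == TRUE){
        out <- model(N, ...);
        if(is.vector(out) & length(out) == 1){
            out <- vector_to_resource_array(out);
        }
        return(out);
    }
    stop("Only user-defined models are sketched here.");
}
logistic <- function(N, K = 2000, r = 1){
    return(N + r * N * (1 - N / K));
}
new_res <- resource_step(model = logistic, N = 1000.4);
print(dim(new_res));
```

The numerical model returns a non-integer abundance, which is floored before being expanded into rows, reflecting the requirement for discrete resource numbers.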
The gmse_apply function would then initialise a very small number of agents, and a small landscape (unless otherwise specified) in every run. The possibility of passing more options could be applied with a simple .... This would also require a sub-function build_para_vec, which would be used for the sole purpose of taking the list of options included (same as in gmse()) and passing it to the sub-function, with any functions not passed being assumed defaults (and most would be irrelevant). So the default function should then look like
gmse_apply(resource_function = "IBM", observation_function = "IBM", manager_function = "IBM", user_function = "IBM", res_number = 100, ...);
I think at least an initial population needs to be specified, but everything else can be left up to the user, with the ellipsis passing to the function building the parameter vector (which can also be called by gmse(), replacing some clutter). Overall, the function will run without any input if none is specified, defaulting to an IBM with a population size of 100 for one generation. All other options, including non-standard functions, are left to the user.
Additional thoughts
Working this through, I’m slightly apprehensive about the motivation for including the gmse_apply function. Once you strip the mechanistic approach from the resource and observation models, all you really have are two values: (1) the population abundance or density and (2) the estimate of the population size or density. Once you include the manage_target into gmse_apply (necessary, I believe), then the genetic algorithm is really just a fancy way of getting the difference between the population estimate and the target size, and then setting a number of culling actions acceptable for users. Users then cull as much as possible because they’re assumed to want to use the resource as much as possible. Of course, we can consider other parameters that affect user actions (e.g., maximum catch, effort), but if we’re interested in learning about how these concepts affect harvesting in theory, then they can and should probably be studied using a simpler model. The real point of the genetic algorithm is that it allows for complex, multi-variable goal driven behaviour, as might occur given indirect effects (e.g., organisms on crop yield) or multiple options (e.g., culling versus scaring or growing) and spatial complexities. There seems little to be gained by calling the genetic algorithm to tell users to cull as much as possible, which can be done with a (very) simple function.
I have finally fixed the annoyance in the shiny app of GMSE that caused the bottom of the browser to go black, hence making it difficult to set parameter values in some tabs.
Additionally, by hovering over the different options in the application, the software user can now see a brief description of what each option does in the simulation.
I am experimenting with ways of demonstrating the conflict between what a manager incentivises, and what the users actually do, in GMSE. Below are some plots that show this for a few sample simulations. The five panels in each plot correspond to the five possible actions where policy is set. Policy set by the manager is shown with the black solid line, with the thin coloured lines reflecting individual user effort expended into each action.
The right axis is fairly easy to interpret – it’s just the percentage of the user’s total budget devoted to a particular action (note, this is not necessarily the number of actions a user performs because different actions can cost different amounts – hence the term ‘effort’).
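As a small worked example of why effort differs from the number of actions, the sketch below assumes that the budget spent on an action is the number of times it is performed multiplied by its cost. That relationship, and all of the numbers used, are assumptions for illustration rather than values taken from a GMSE simulation.

```r
# Hypothetical user budget, per-action costs, and actions performed
budget <- 1000;
cost   <- c(scaring = 10, culling = 60);
acts   <- c(scaring = 10, culling = 5);
# Effort: percentage of the total budget devoted to each action
effort_pct <- 100 * (acts * cost) / budget;
print(effort_pct);
```

Here the user scares twice as often as it culls, yet culling takes three times the share of the budget (30 percent versus 10 percent) because each cull is more costly.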
The left axis is a bit trickier – it’s how permissive of an action the manager is in practice. High values correspond to an action being highly permitted by the manager (i.e., the manager invests no effort in making these actions costly), whereas low values correspond to an action being less permitted (i.e., the manager invests highly in making these actions costly for users).
The end result is that the lines indicating manager permissiveness are typically correlated with user effort towards any particular action. In the first example below, this is true for scaring and culling (as the manager becomes more permissive of these actions, users tend to take advantage and spend more effort doing them). Note that users do not feed because they have nothing to gain by feeding the resources, even though the manager usually permits feeding (around generation 75, the population started going way over the manager’s target).
In the second example (below), the option for scaring has been removed. Because users want resources off of their land, the only option is to cull, so users will cull as much as permitted even though the manager is incentivising them not to as much as possible.
The below is a final example where all actions except helping are possible options.
While playing with the proto-type GUI, I discovered a minor bug in the plotting function, which I fixed so that the plot doesn’t make an error. I have also updated the list of contributors in the description file, and the list of recommended packages (shiny packages for the new gmse_gui function).
I have now also added a new release version 0.2.2.8 to GitHub. This version requires three additional libraries:
The above three libraries will be imported as dependencies (or should be) in the new version of GMSE.
A proto-type GUI for the GMSE package is now up on shiny. I’m going to make this look nicer with a CSS style-sheet, but for now this gets the job done.
I am currently trying to get a handle on creating a GMSE GUI in shiny by looking at the elementR package. The authors of this package, to get their very impressive shiny application running, need to nest multiple sub-functions inside the long (10000+ line) runElementR function. GMSE won’t need to have this much code for the user interface – I have figured out roughly how to make the input look good and functional in a browser, but a tricky part will be to link that input with the gmse() function parameters, then run things.
In writing a draft manuscript, the term ‘stakeholder’ is being applied to mean both managers and users. This differs from the model itself, and therefore from how the term is used in GMSE. To resolve this, I think that it would be worthwhile to change the documentation to match the manuscript. But I don’t want to change the input stakeholder for any existing users of GMSE that might be inconvenienced. Instead, I think just defining stakeholder to be the number of managers and users could be fine by changing stakeholder <- stakeholder + 1 in the gmse() function. This might need revisiting in later versions (if we wanted to have multiple managers and stakeholders), but such a change would likely be part of a much bigger release in which major (and potentially inconvenient) changes would be unavoidable.
Following the release of GMSE v0.2.2.7 on CRAN, with extended documentation, as introduced on the ConFooBio blog and my blog, I shift my attention to the vignette. The vignette in development will eventually be packaged into a future version of GMSE, then submitted as a separate methods paper.
GMSE v0.2.2.5 is now up on CRAN (13:32 GMT), and my hope is that v0.2.2.7 will replace it soon following some clarification of the documentation. I am avoiding a public announcement of the package on CRAN until I receive confirmation that the new version is accepted.
New logo
v0.2.2.0: Bug fixes, new feature
While beginning to write up the vignette, I worked out a bug that applied to simulations in which stakeholder number was greater than 4 (tl;dr, these stakeholders were not acting according to their interests). This was fixed with commit 6ae58ec374f48464a0706fcf585dd5f1534e4511, and in fixing this I made the distinction between hunting type scenarios (where stakeholders have an interest in directly using resources) and farmer type scenarios (where stakeholders care about their land, and resources only indirectly because the resources affect the land).
I also added a new feature allowing the software user to adjust the proportion of the landscape that is public_land (commit f88545569a4c3e39906291759f376403b8e665f3). This can be interpreted as land that is unmanaged and therefore available for resources to use without fear of scaring or culling when land_ownership = TRUE. Also, now when land_ownership = FALSE, all land is considered public and this is now reflected accurately with the plots.
I have also opted to change the default res_min_age, the age at which resources can be seen, to zero instead of one. This sometimes results in plots that are above the defined carrying capacity, because carrying capacity is applied to adults, not juveniles, when res_death_K is set. The result is a total carrying capacity of (res_death_K + (1 * lambda)), which accounts for birth of juveniles in a population at carrying capacity.
The fixed_recapt
is now running as intended as of commit ad9d9e10ead215a703f9accdfbd149d35b350567.
New issues – proposed enhancements for the future GMSE
Before I lose track of all the proposed ideas for improving upon the GMSE package, I want to get all of them up as issues on GitHub. For completeness, I have also included the unresolved Issue 9. I will add to the below to form an organised list of future ideas to work on, all laid out as enhancement issues on GitHub. Anyone should be able to add to this list, or comment on the issues (e.g., if they would be especially useful ones to resolve).
Issue 9: Observation Error
It would be useful to incorporate observation error into the simulations more directly. This could be affected by one or more variables attached to each agent, which would potentially cause the mis-identification (e.g., incorrect return of seeme
) or mis-labelling (incorrect traits read into the observation array) of resources. This could be done in either of two ways:
Cause the errors to happen in ‘real time’ – that is, while the observations are happening in the simulation. This would probably be slightly inefficient, but have the benefit of being able to assign errors specifically to agents more directly.
Wait until the resource_array
is marked in the observation
function, then introduce errors to the array itself, including errors to whether or not resources are recorded and what their trait values are. These errors would then be read into the obs_array
, which is returned by the function.
Issue 30: Manager assumptions about user actions
It would be useful to allow for simulations to dynamically adjust the caution that the manager has when changing actions. At the moment, managers always assume that some specified number of actions
will be performed by users, and this number does not change over the course of the simulation. But managers might be able to use the history of user actions to learn to be more or less cautious when setting new policy.
Issue 31: Modify manager’s predicted effects
Currently, the predicted effects of a manager’s actions are set to values that, heuristically, appear to work in the genetic algorithm. This is adjusted with the manager_sense parameter, which has a default of 0.1, such that the manager assumes that if they set costs to increase culling by 100 percent, it will actually only increase by 10 percent (as not all users are necessarily going to cull if given the opportunity). Like real-world management, this is heuristic and results in uncertainty, but future versions of GMSE could dynamically modify this value during the course of the simulation based on real knowledge of how policy changes have affected user actions in previous time steps.
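As a rough illustration of the heuristic above, the sketch below scales the change in user actions that a manager intends by manager_sense to get the change the manager actually expects. The function name predicted_change and the example numbers are hypothetical and are not part of GMSE; only manager_sense and its default of 0.1 come from the description above.

```r
# Hypothetical helper (not a GMSE function): scale the intended change in
# user actions by manager_sense to get the change the manager predicts.
predicted_change <- function(intended_change, manager_sense = 0.1) {
  intended_change * manager_sense
}

# A policy meant to increase culling by 100 percent is predicted to
# increase it by only 10 percent under the default manager_sense.
predicted_change(intended_change = 100)
```

A dynamic version could, in principle, re-estimate manager_sense each time step from the ratio of observed to intended change, but that is only one possible design.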
Issue 32: Long-term histories affect genetic algorithm
Currently, only the history of interactions from the previous time step directly affects the genetic algorithm for stakeholders and managers. For managers especially, this could be made a bit more nuanced. The entire history of total actions and resource dynamics is recorded, and this could easily be made available (e.g., in PARAS_REC) for managers to make decisions. Incorporating these data into the genetic algorithm, and therefore into agent decision making, could be tricky, but one simple example of this could be having managers use the per-time step mean number of stakeholder actions in the last 2-3 time steps to predict future user actions with a bit more inertia. Managers could also use stakeholder action history from earlier time steps, weighting each time step by how long ago it occurred.
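The two ideas above (a mean over the last few time steps, and a weighting by how long ago actions occurred) can be sketched as follows. This is not GMSE code: the function names, the geometric decay of weights, and the action totals are all assumptions made for illustration.

```r
# Hypothetical sketch (not GMSE code): predict the next number of user
# actions from the action history, down-weighting older time steps.
predict_actions <- function(history, decay = 0.5) {
  n       <- length(history)
  age     <- (n - 1):0          # 0 for the most recent time step
  weights <- decay^age          # assumed geometric decay with age
  sum(weights * history) / sum(weights)
}

# Simple unweighted mean over the last few time steps
recent_mean <- function(history, steps = 3) {
  mean(tail(history, steps))
}

culls <- c(10, 20, 40, 80)      # made-up totals, oldest first
recent_mean(culls)              # mean of 20, 40, and 80
predict_actions(culls)          # recent time steps count for more
```

Setting decay = 1 recovers the plain mean of the whole history, so a single parameter would control how much inertia the manager has.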
Issue 33: Non fixed mark-recapture sampling number
Currently, to simulate mark-recapture observation and analysis, values for fixed_mark and fixed_recapt need to be specified in GMSE, and the manager would have exactly these numbers of marks and recaptures in each generation, respectively. Instead of specifying exact numbers, it would also be useful to have the manager search a general area, then mark all resources in that area. Next, the manager could search again and recapture, so the exact number is not always set and the observation process probably mimics more closely what happens in the field. This type of sampling is actually already available (observe_type = 0), so I would just need to add some code to have managers interpret some observations as marks and others as recaptures.
Issue 34: Resource interactions
Currently, more than one resource type is permitted, but this is not offered/visible to users of the software. A next major version of GMSE could have multiple resource types with resources actually interacting with one another (could borrow future development code from EcoEdu). Simple interactions could include competition and predator-prey functions in the resource model. The code is also already ready for managers and users to consider multiple resources in making policy decisions and actions, respectively.
Issue 35: Stakeholder lobbying
Currently, GMSE assumes that stakeholders have a negative relationship with resources – they either want to hunt them or scare them from their land. Future versions of GMSE should include an option for a stakeholder type (e.g., activist) that lobbies the manager to adjust the manager’s utilities, effectively increasing or decreasing the target. The data structure to do this already exists; it’s just a matter of figuring out how best to enact it and why. For example, would adding this have any actual effect that differs from just assuming that the manager is being lobbied by conservationists continuously, and that their target is a reflection of that?
I need to double check that fixed_recapt
is doing what I said it did on 23 JUN. My concern is that it is not being implemented properly in the observation model – there needs to be a difference between the first and second times_obs
, or times_obs
might need to be redefined for the first and second rounds of observation. It looks like the observation model is just doing times_obs
observations with the same number of samples in each one.
A better mark-recapture observation model estimator
Setting the parameters for the mark-recapture observation model (observe_type = 1) was confusing, so much so that I had to remember how to do it. In v0.2.1.3, I have fixed this so that the sampling is clearer. Rather than having a fixed_observe argument in gmse(), I’ve included fixed_mark and fixed_recapt arguments, which only apply when observe_type = 1. Under these conditions, times_observe is ignored and fixed_mark defines how many resources will be marked in each time step; fixed_recapt defines how many recaptures will be made. If the value of fixed_mark or fixed_recapt is greater than the actual size of the resource population, then all resources in the population will be sampled.
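A minimal sketch of the sampling described above is given below, assuming that each round samples resources without replacement and that population size is estimated with a Chapman estimator (the estimator named in the draft gmse() documentation elsewhere in this notebook). The function mark_recapture_estimate and the integer-ID representation of resources are hypothetical; the real observation model is written in C and will differ in detail.

```r
# Sketch only: resources are represented by integer IDs, and the function
# name mark_recapture_estimate is hypothetical (not part of GMSE).
mark_recapture_estimate <- function(pop_size, fixed_mark, fixed_recapt) {
  ids <- seq_len(pop_size)
  # If a requested sample exceeds the population, sample every resource
  n_mark   <- min(fixed_mark, pop_size)
  n_recapt <- min(fixed_recapt, pop_size)
  marked   <- sample(ids, size = n_mark)    # first round: marks
  caught   <- sample(ids, size = n_recapt)  # second round: recaptures
  recaught <- sum(caught %in% marked)       # marked resources seen again
  # Chapman estimator of population size
  ((n_mark + 1) * (n_recapt + 1)) / (recaught + 1) - 1
}

set.seed(1)
mark_recapture_estimate(pop_size = 200, fixed_mark = 50, fixed_recapt = 50)
```

When both fixed_mark and fixed_recapt exceed the population size, every resource is marked and recaptured, and the estimate equals the true population size.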
Get a better confidence interval for the density estimator
The density estimator is giving too few Type 1 errors because of times_observe > 1. This doesn’t affect anything but the visualisation, since managers don’t make decisions based on confidence intervals. Still, fixing the CIs would be a good idea. The CIs should also be correct when times_observe = 1. Really, times_observe > 1 is simulating a weird case in which the central limit theorem would apply to the times_observe estimates, and hence the mean estimate among the times observed should be normally distributed around the mean.
Double-check for memory leaks with Valgrind
Running Valgrind on the R package GMSE revealed no memory leaks.
==15438==
==15438== HEAP SUMMARY:
==15438== in use at exit: 67,722,174 bytes in 20,290 blocks
==15438== total heap usage: 6,398,005 allocs, 6,377,715 frees, 1,211,596,543 bytes allocated
==15438==
==15438== LEAK SUMMARY:
==15438== definitely lost: 0 bytes in 0 blocks
==15438== indirectly lost: 0 bytes in 0 blocks
==15438== possibly lost: 0 bytes in 0 blocks
==15438== still reachable: 67,722,174 bytes in 20,290 blocks
==15438== suppressed: 0 bytes in 0 blocks
==15438== Reachable blocks (those to which a pointer was found) are not shown.
==15438== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==15438==
==15438== For counts of detected and suppressed errors, rerun with: -v
==15438== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I have changed some default parameters so that when I write up the default example, it provides a description that will be useful to new users.
I am in the process of organising the vignette. As I’ve done in a previous manuscript, I’ll start with notes and an outline and update now on the Rmarkdown file. The vignette will therefore evolve and be tracked through git, just like the code.
Also, while compensation payments are not yet included as a feature of GMSE, I think that an option to include them should be relatively easy to implement through the manager layers of the COST and ACTION arrays where column 1 equals -1 (landscape). The killem and helpem columns remove and increase crops, respectively, but the additional available columns could be used to track compensation owed (to stakeholders) and paid (by managers) – both at a cost, of course.
Unit tests are written
Unit tests for all sub-functions of the model are written, with the exception of functions used for plotting, which I don’t think are necessary to unit test because errors to the plot will be very obvious in development. Everything now passes CRAN checks except the licensing, which we’ll need to agree on at some point. As of Monday, I will be able to start on the vignette (manuscript).
Unit testing for long-term code maintenance
To ensure that the gmse package functions as intended in the long term, I am writing an extensive battery of unit tests that will need to be passed to ensure that any new features do not introduce bugs or break existing functions. To do this, I will use the testthat package in R and follow the advice in Hadley Wickham’s chapter on unit testing R code. I’ve done this already for the gamesGA package, which is now on CRAN, though the gmse package will require many more tests simply because there are many more functions to test.
The unit testing already helped by identifying a potential bug later on down the line when initialising cost arrays for simulations with more than two resources (see commit 65088054481266e67f06513dc368c515e4a9fed0). Unit tests for all initialisation functions except landscape functions are now complete. Next, I will need to do landscape functions and perhaps some (but probably not all) functions associated with plotting, then the four main GMSE model functions.
The density-based observation estimates were giving incorrect values. Looking into this, the reason for the error was that I was using confidence intervals for proportions (e.g., the proportion of cells with resources on them) rather than counts (which, we assume, have a Poisson error structure). I have replaced the previous estimate of confidence intervals around local density with a Poisson estimate
\[ \hat{\lambda} \pm 1.96 \times \sqrt{\frac{\hat{\lambda}}{vision^2}} \]
In the above \(\hat{\lambda}\) is the estimated local density and \(vision\) refers to the total number of cells that managers can see.
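As a worked example, the expression above can be applied literally as follows. The helper name density_ci and the input values are made up for illustration, and the function simply evaluates the formula as written; it is not the code used in GMSE.

```r
# Hypothetical helper: confidence interval from the expression above,
# lambda_hat +/- z * sqrt(lambda_hat / vision^2)
density_ci <- function(lambda_hat, vision, z = 1.96) {
  half_width <- z * sqrt(lambda_hat / vision^2)
  c(lower = lambda_hat - half_width, upper = lambda_hat + half_width)
}

density_ci(lambda_hat = 0.5, vision = 10)   # made-up values
```

Note that for small values of the estimated density the lower bound can fall below zero, so a plotting function might want to truncate it at zero.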
With this new correction, and also fairly major bug fixes (fixing an error in landscape actions that caused an infinite loop in commit 310fb76b7e3b3499ab74e2f94c61c3276f3c4118, and fixing user actions to tend crops or kill crops appropriately in commit 424fc2eb4f6274763f5ead0fc48ad5dd7f68c422), I am now pushing to master and releasing v0.2.1.1, which effectively patches some major issues and improves plotting (including a new legend for costs and actions).
Bug fix to the user function
An erroneous condition in a while
loop was causing an infinite loop when manager budgets were very high and user actions were not restricted to landscape that they owned. This has been fixed on the development branch but not yet pushed to the master branch.
Some initial notes: GMSE (beta) package v0.2.1.0
A beta version of GMSE is now available, and is ready to be experimented with and tested as an R package. To download and begin using GMSE, it is necessary to first download the devtools
library.
install.packages("devtools")
library(devtools)
Use install_github to install using devtools.
install_github("bradduthie/gmse")
From here, it is possible to run GMSE simulations using the gmse()
function. For help using this function, all documentation can be accessed by simply calling the help files.
help(gmse)
The documentation contains a basic description of the gmse() function, which is the only function needed to run simulations. Subfunctions for the resource, observation, manager, and user models are all accessible as independent R functions, but they are not very useful at the moment without the initialisation in the main function; nevertheless, their documentation can be accessed with help(resource), help(observation), help(manager), and help(user). The documentation also contains arguments for most of the variables that might be usefully changed to simulate different types of management scenarios; additional options are not shown for the moment either because more coding is needed to make them useful or because I don’t expect they will be needed. The explanations of the arguments are detailed, along with documentation explaining the (extensive) amount of data that is returned after running a simulation. To get started though, the default simulation can be run simply.
sim <- gmse();
Parameter values can then be adjusted by varying the options in the gmse() function.
R vignette, and the beginning of a methods paper
I will soon begin work on an R vignette, which is essentially a long form documentation that can also be a manuscript to submit to a journal.
Add some formal testing functions for future development
I will also need to add some formal R tests, which are basically ways of automating the kind of testing that is done continually while writing the code. The idea with formal unit tests is to have a process that checks to see if the code breaks when a new feature (and therefore new code) is added. Since the results of the simulation are stochastic, I think the best way to test is to set a seed and use default parameter values, then check to make sure that the results match using the expect_equal_to_reference() function in testthat. It might be useful to do this for each of the resource(), observation(), manager(), user(), and gmse() models – perhaps testing the ith time step for each of the sub-functions, but then the gmse() function also as a whole (perhaps using just 10 time steps would be sufficient for this instead of a default of 100).
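The seed-and-compare idea above can be sketched in base R as follows. Here toy_sim is a hypothetical stand-in for a stochastic model function such as gmse(), and the reference file is written to a temporary location; a testthat-based test would wrap the same workflow in an expectation instead of all.equal().

```r
# toy_sim is a hypothetical stand-in for a stochastic model function such
# as gmse(); it is not GMSE code.
toy_sim <- function(time_max = 10, start = 100) {
  abundance <- numeric(time_max)
  abundance[1] <- start
  for (t in seq_len(time_max)[-1]) {
    # Stochastic growth: Poisson draw around 1.05 times the last value
    abundance[t] <- rpois(1, lambda = 1.05 * abundance[t - 1])
  }
  abundance
}

# Store a reference result once, using a fixed seed
ref_file <- tempfile(fileext = ".rds")
set.seed(2017)
saveRDS(toy_sim(), ref_file)

# Later (e.g., after adding a feature), rerun with the same seed and
# compare against the stored reference
set.seed(2017)
matches_reference <- isTRUE(all.equal(toy_sim(), readRDS(ref_file)))
matches_reference
```

Using only 10 time steps keeps the stored reference small and the test fast, which matters if the same check is repeated for each sub-function.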
Introduce Issue #29: No edge effect causes crash
When edge_effect = 0, and therefore nothing happens when resources and agents move off of the edge of the landscape, R crashes. This is almost certainly due to some sort of memory leak. This is a low priority issue at the moment because I cannot think of a reason why anyone would explicitly want the model to just ignore resources moving off of the landscape. If someone wants something other than a torus (edge_effect = 1), such as a reflective edge or emigration upon leaving the landscape, this should be explicitly coded into the edge_effect function in utilities.c. Until someone asks for it, I’ll stick with a torus.
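For reference, a torus edge simply wraps positions around the landscape so that they always stay within bounds. The sketch below shows the idea in R with a hypothetical wrap_torus function and zero-based cell indices; the actual edge_effect function is written in C in utilities.c and may be implemented differently.

```r
# Hypothetical R sketch of a torus edge; GMSE's own edge_effect function
# lives in utilities.c and may differ. Cells are indexed 0 to dim_size - 1.
wrap_torus <- function(position, dim_size) {
  position %% dim_size   # R's %% is non-negative for a positive divisor
}

# Positions off either edge of a 100-cell dimension wrap back onto it
wrap_torus(c(-1, 0, 99, 100, 103), dim_size = 100)
```

Without some rule like this, a position outside the landscape could be used to index the landscape array, which is one plausible way for the simulation to fail.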
New (draft) documentation for the gmse() function
DESCRIPTION: GMSE simulation
The gmse function is the primary function to call to run a simulation. It calls other functions that run resource, observation, management, and user models in each time step. Hence while individual models can be used on their own, gmse() is really all that is needed to run a simulation.
res_move_type settings. Under default settings, during each time step, resources move from zero to res_movement cells away from their starting cell in any direction. Hence res_movement is the maximum distance away from a resource’s starting cell that it can move in a time step; other types of resource movement, however, interpret res_movement differently to get the raw distance moved (see res_move_type). The default value is 4.
res_movement cells away during a time step. Movement direction is random and the cell distance moved is randomly selected from zero to res_movement. (2) Poisson selected movement in the x and y dimensions where distance in each direction is determined by Poisson(res_movement) and direction (e.g., left versus right) is randomly selected for each dimension. This type of movement tends to look a bit odd with low res_movement values because it results in very little diagonal movement. It also is not especially biologically realistic, so should probably not be used without a good reason. (3) Uniform movement in any direction up to res_movement cells away during a time step res_movement times. In other words, the res_movement variable of each resource is acting to determine the times that a resource moves in a time step and the maximum distance it travels each time it moves. This type of movement has been simulated in ecological models, particularly plant-pollinator systems. The default movement type is (1).
lambda is the population growth rate also set as an argument in gmse simulations.
removal_pr for each resource (which may be further affected by agent actions or interactions with landscape cells). A value of (2) causes death to be density-dependent (though potentially independently affected by agents and landscape), with mortality probability calculated based on the carrying capacity res_death_K set as an argument in gmse simulations. The default res_death_type is (2), as values of (1) must be used carefully because they can result in exponential growth that leads to massive population sizes that slow down simulations.
agent_view
from the cell of the manager. Managers sample times_observe subsets, where times_observe is a parameter value set in the gmse simulation. Managers then extrapolate the density of resources in the subset to estimate the total number of resources on a landscape. (1) Mark-recapture estimate of the population, in which managers randomly sample times_observe resources in the population without any spatial bias (if there are fewer than times_observe resources, managers sample all resources) times_observe times with replacement. The first fixed_observe times are interpreted as marks, while the remaining times are interpreted as recaptures (note that fixed_observe must be less than times_observe). Hence if a resource is observed at any time in fixed_observe independent observations, then it is considered marked; if it is observed again at any time in times_observe - fixed_observe independent observations, then it is considered recaptured. A Chapman estimate is used in the manager model to estimate population size from these observation data. (2) Transect-based sampling (linear), in which a manager samples an entire row of the landscape and counts the resources on the row, then moves onto the next row of the landscape until the entire landscape has been covered. The number of cells in each row (i.e., the height) equals agent_view, so fewer transects are needed if agents can see farther. If res_move_obs == TRUE, then resources can move on the landscape between each transect sampling, potentially causing observation error if some resources are double counted or not counted at all due to movement. If res_move_obs == FALSE, then this type of observation should produce no error, and resource estimation will be exact. (3) Transect-based sampling (block), in which a manager samples a block of the landscape and counts the resources in the block, then moves on to the next (equally sized) block until the entire landscape has been covered. Blocks are square, with the length of each side equaling agent_view, so fewer blocks are needed if agents can see farther. If res_move_obs == TRUE, then resources can move on the landscape between each block sampling, potentially causing observation error if some resources are double counted or not counted at all due to movement. If res_move_obs == FALSE, then this type of observation should produce no error, and resource estimation will be exact. The default observation type is 0 for density-based sampling.
times_observe.
observe_type = 0) and mark-recapture sampling (observe_type = 1). In the former case, the value determines how many times the manager goes out to sample resources from a subset of the landscape. In the latter case, the value determines how many times the manager goes out to attempt to find new resources to mark or recapture (hence its value must be greater than fixed_observe).
agent_move
cells away during a time step. Movement direction is random and the cell distance moved is randomly selected from zero to agent_move. (2) Poisson selected movement in the x and y dimensions where distance in each direction is determined by Poisson(agent_move) and direction (e.g., left versus right) is randomly selected for each dimension. This type of movement tends to look a bit odd with low agent_move values because it results in very little diagonal movement. It also is not especially realistic, so should probably not be used without a good reason. (3) Uniform movement in any direction up to agent_move cells away during a time step agent_move times. In other words, the agent_move variable of each agent is acting to determine the times that an agent moves in a time step and the maximum distance it travels each time it moves. This type of movement has been simulated in ecological models, particularly plant-pollinator systems. The default movement type is (1).
times_observe times being observed. The default value is TRUE, but if the option is set to FALSE then it shuts down all resource movement during sampling (making observe_type = 2 and observe_type = 3 error free).
land_ownership in the gmse() function), then this gives some idea of where actions are being performed and where resources are affecting the landscape. (3) Middle left panel: Shows the actual population abundance (black solid line) and the population abundance estimated by the manager (blue solid line) over time. The dotted red line shows the resource carrying capacity (death-based) and the dotted blue line shows the target for resource abundance as set in the gmse() function; the orange line shows the total percent yield of the landscape (i.e., 100 percent means that resources have not decreased yield at all, 0 percent means that resources have completely destroyed all yield). (4) Middle right panel: Shows the raw landscape yield for each stakeholder (can be ignored if land_ownership is FALSE) over time; colours correspond to land ownership shown in the upper right panel. (5) Lower left panel: The cost of stakeholders performing actions over time, as set by the manager. (6) Lower right panel: The total number of actions performed by all stakeholders over time.
start_hunting time steps to ask the user how many resources they want to hunt (some management information is given to help make this choice). This feature will be expanded upon in later versions. Right now, the human is playing the role of agent number 2, the first stake-holder in the simulation. By default, this value is set to FALSE.
hunt = TRUE. The default value is 95.
ga_popsize
times, and this population of individual agent actions undergoes a process of natural selection to find an adaptive strategy. Selection is naturally stronger in larger populations, but a default population size of 100 is more than sufficient to find adaptive strategies.
ga_popsize times, and this population of individual agent actions undergoes a process of natural selection at least ga_mingen times to find an adaptive strategy. If the convergence criterion converge_crit is set to a default value of 100, then the genetic algorithm will almost always continue for exactly ga_mingen generations. The default value is 20, which is usually plenty for finding adaptive agent strategies – the objective is not to find optimal strategies, but strategies that are strongly in line with agent interests.
ga_popsize replicate agents are produced; ga_seedrep of these replicates are exact replicates, while the rest have random actions to introduce variation into the population. Because adaptive agent strategies are not likely to change wildly from one generation to the next, it is highly recommended to use some value of ga_seedrep greater than zero; the default value is 20, which does a good job of finding adaptive strategies.
ga_sampleK strategies at random and with replacement from the population of ga_popsize to be included in the tournament. The default value is 20.
ga_sampleK strategies at random and with replacement from the population of ga_popsize to be included in the tournament, and from these randomly selected strategies, the top ga_chooseK strategies are selected. The default value is 2, so the top 10 percent of the random sample in a tournament makes it into the next generation (note that multiple tournaments are run until ga_popsize strategies are selected for the next generation).
ga_popsize.
max_ages is 5.
minimum_cost
of actions, and the policy set by the manager. The default user_budget is 1000.
manager_budget: This is the total budget for the manager when setting policy. Higher budgets make it easier to restrict the actions of stakeholders; lower budgets make it more difficult for managers to limit the actions of stakeholders by setting policy. The default manager_budget is 1000.
lambda value to zero. The default value of this is FALSE.
tend_crops. Actions on the landscape cannot be regulated by managers, so the cost of this action is always minimum_cost. The default value of this is FALSE.
tend_crops. Actions on the landscape cannot be regulated by managers, so the cost of this action is always minimum_cost.
manage_caution of each possible action will always be performed by stakeholders. A manager will therefore not ignore policy for one action because no stakeholder is engaging in it; the default value of manage_caution is 1.
ga_mingen, the genetic algorithm will terminate if the convergence criterion is met. Usually making this criterion low doesn’t do much to improve adaptive strategies, so the default value is 100, which in practice causes the genetic algorithm to simply terminate after ga_mingen generations.
Returns: A large list is returned that includes detailed simulation histories for the resource, observation, management, and user models. This list includes eight elements, most of which are themselves complex lists of arrays: (1) A list of length time_max in which each element is an array of resources as they exist at the end of each time step. Resource arrays include all resources and their attributes (e.g., locations, growth rates, offspring, how they are affected by stakeholders, etc.). (2) A list of length time_max in which each element is an array of resource observations from the observation model. Observation arrays are similar to resource arrays, except that they can have a smaller number of rows if not all resources are observed, and they have additional columns that show the history of each resource being observed over the course of times_observe observations in the observation model. (3) A 2D array showing parameter values at each time step (unique rows); most of these values are static but some (e.g., resource number) change over time steps. (4) A list of length time_max in which each element is an array of the landscape that identifies proportion of crop production per cell. This allows for looking at where crop production is increased or decreased over time steps as a consequence of resource and stakeholder actions. (5) The total time the simulation took to run (not counting plotting time). (6) A 2D array of agents and their traits. (7) A list of length time_max in which each element is a 3D array of the costs of performing each action for managers and stakeholders (each agent gets its own array layer with an identical number of rows and columns); the change in costs of particular actions can therefore be examined over time. (8) A list of length time_max in which each element is a 3D array of the actions performed by managers and stakeholders (each agent gets its own array layer with an identical number of rows and columns); the change in actions of agents can therefore be examined over time. Because the above lists cannot possibly be interpreted by eye all at once in the simulation output, it is highly recommended that the contents of a simulation be stored and interpreted individually if need be; alternatively, simulations can more easily be interpreted through plots when plotting = TRUE.
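The tournament selection described for ga_sampleK and ga_chooseK in the draft documentation above can be sketched roughly as follows. This is an illustration in R under simplifying assumptions (one fitness value per strategy, made-up fitness values), not a copy of the genetic algorithm in GMSE, which is written in C; the function name tournament_select is hypothetical.

```r
# Hypothetical sketch (not GMSE's C code): build the next generation by
# repeated tournaments. fitness holds one value per strategy.
tournament_select <- function(fitness, ga_sampleK = 20, ga_chooseK = 2) {
  ga_popsize <- length(fitness)
  winners    <- integer(0)
  while (length(winners) < ga_popsize) {
    # Sample strategies at random, with replacement
    entrants <- sample.int(ga_popsize, size = ga_sampleK, replace = TRUE)
    # Keep the ga_chooseK entrants with the highest fitness
    ranked   <- entrants[order(fitness[entrants], decreasing = TRUE)]
    winners  <- c(winners, ranked[seq_len(ga_chooseK)])
  }
  winners[seq_len(ga_popsize)]   # indices of the selected strategies
}

set.seed(42)
fitness  <- runif(100)           # made-up fitness values
selected <- tournament_select(fitness)
c(before = mean(fitness), after = mean(fitness[selected]))
```

With the defaults, each tournament keeps the top 2 of 20 sampled strategies, so 50 tournaments are needed to fill a population of 100.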
GMSE is now a package
I have now made GMSE a package, including documentation for all of the R code except the main gmse() function, which I will complete soon. The package should be available to use as early as tomorrow evening. There are still some additional tweaks that I will continue to make, particularly to the plotting, and I want to add some tests to the model as well. Uploading to CRAN will be done after some beta testing – I’ll mainly follow Hadley Wickham’s book for advice here.
Progress on new features
A new six by two plot for case 2
and case 3
observation functions has been added. Additionally, G-MSE now records a new array PARAS_REC
, which holds parameters each generation, including observation estimates and confidence intervals. The PARAS_REC
will allow me to simplify the plotting functions because the relevant data will be calculated in C on the fly and neatly held in PARAS_REC
. Additionally, I will add in the total actions for all stake-holders as seven elements in paras
(five actions on resource type1
and two landscape actions), and also the cost of each action. This will not only make all of the plotting code much simpler, it will also allow the potential for the history of actions and costs to affect manager and stake-holder actions in future software development.
It’s always tempting to push the model a bit further with new features or more efficient algorithms and code, but I think that now is the time to turn G-MSE into a package and send it off to colleagues to experiment with, which I will do tomorrow. Nevertheless, I want to hit a few points that will be very useful for future G-MSE features:
PARAS_REC
, but using these data will be tricky.
Each of the above will take a bit of planning in addition to coding. I’m not sure if they would also require the addition of new data arrays, but I think they are worth considering.
Resolve Issue #27
The start column (and, because observation column number equals times
observed, the end column) was specified incorrectly in the density and mark-recapture estimates in R. This meant that three columns were sampled with values all equaling zero, and three columns were not sampled with values equaling one and zero, to estimate population size. Hence, this produced an underestimate of population size in plots. The issue has now been resolved.
Additional user options
Additional user options now include the following (defaults shown):
stakeholders = 4, # Number of stake-holders
manage_caution = 1, # Caution rate of the manager
land_ownership = FALSE, # Do stake-holders act on their land?
manage_freq = 1 # Frequency that management enacted
Within the week the following features will be added:
case 2
and case 3
observation functions
manager.c
instead of in R (and saved in paras
)
The above points should not take more than a day to complete, at most, and upon completing them I will then make G-MSE into a package that can be downloaded using devtools from GitHub. More long-term, I want to do the following, but this might not happen until after a draft of the methods paper is written.
Introduce Issue #27: Observation estimate underestimates real population size
The case 0 observation type is consistently underestimating the true population size. This could be caused by a calculation that assumes that the size of the sampled area is larger than it actually is, or that the size of the landscape is smaller than it actually is; either way, the observation.c file needs to be double-checked and potentially debugged.
Playing around with parameter values
I have made some of the simulation inputs easier to work with on the user end and played around with different variable combinations on a relatively low-power laptop (Lenovo X201 Thinkpad). The simulation is a bit slower than desirable, but not so slow as to cause major issues (it takes about a minute or so to simulate a fairly big population of ca 200 with 12 stake-holders).
Introduce Issue #28: More stake-holders have fewer actions
For some reason, having more stakeholders appears to lead to less culling of resources even when all of them are attempting to do it. If there are more stakeholders to act, then actions should happen more often because each has the same budget.
Note that this appears to even occur when users are not restricted to their landscape; it might be something to do with double killing? There just aren’t enough resources dying in the model to match with the actions.
Resolved Issue #28:
Resolved – just input the stakeholder number incorrectly. See commit 6b63439b384cab90680f6a36a79f2c94eba46c45
Code is finally stable
I have now deliberately tried to crash G-MSE in multiple ways – the goal being to throw parameter combinations or options at the model in such a way as to cause the model to not work accurately. At first, I was successful at this when I forced managers to only allow for one management option (culling, scaring, etc.). After much debugging and testing, I have fixed this so that I am confident that the code runs as advertised, for the moment. Features that have now been added to G-MSE as a consequence of this process include the following.
movem
, killem
, castem
, feedem
, and helpem
) are allowed; actions that are not allowed can never be performed. At the moment, these actions are still plotted as zeros, but soon they will be removed from the plots.
killem
at 1 instead of 0. Yet if they wanted something between keeping culling the same and doubling it, they were out of luck. If actions always cost at least some value (default = 10), then some increment just above that value is always available – hence it is better to simply give everyone a bigger budget and set a minimum cost, giving more precision to managers to fine-tune policy.
Finally shift to Friday’s goals
I have started to change some parameter inputs to make it easier to play with parameters, but I’m going to do more of this now that I’m much more confident in the G-MSE software. Once this is done, I will make the whole thing an R package that can be downloaded using GitHub developer tools, and I’ll add documentation before sending instructions around. Additionally, I will then start to write up some sample case studies (e.g., hunters on a public landscape or farmers trying to maximise yield) to show what G-MSE can do. Writing these out into an R Markdown file, I’ll have the start of a methods paper introducing the software.
Minor debugging
There are still a few minor bugs to work out, some of which I was able to take care of (see commit history if need be). I’m now trying to give the option to restrict the number of possible actions, but restricting them seems to still produce some errors – namely, the genetic algorithm for managers doesn’t seem to be responding appropriately.
Landscape actions added to the user.c
file
I have added the function act_on_landscape
in user.c
so that users can perform actions on the landscape. The only two actions that the users can do, at the moment, are killem
and feedem
which effectively kill the crop yield and increase it, respectively. All other action columns do nothing. I’ve also added a new element to paras
that modifies how much a user can increase crop yield (previously, I was allowing users to double crop production on a cell only). Testing confirms that when users value crop yield and can greatly increase it by feedem
, they will find this option and do so to increase crop yield.
Resolution of Issue #21: paras
now used everywhere
I have now cleaned the code so that paras
is effectively used across all G-MSE functions. This effectively resolves Issue #21 and makes the code more readable. Likewise, I have also cleaned up the functions in a few places and introduced get_rand_int
for easier sampling.
Next steps: Making it easier for users
The next steps as outlined on Friday are to do the following:
I don’t think that this will be too time-consuming because there is likely to be very little trouble-shooting and debugging for the above. Once all of this is done though, I will want to add the browser interface for G-MSE. This will be challenging, but the recently developed elementR package can provide some inspiration for working with shiny in a package that requires a lot of options to be set.
Rewritten do_actions
successful
I am largely satisfied with a rewrite of the do_actions
function, which affects the way that users perform actions on resources by changing the rules to make actions simultaneous instead of sequential by user. Instead of having one user perform actions on resources, then another user perform actions, etc., the new do_actions
program instead just grabs the ACTIONS
array after the genetic algorithm is called for all users and randomly performs actions until no more actions exist. In other words, the order in which the actions of all users are performed is effectively randomised so that, for example, one user does not have an advantage of acting last and therefore moving all of their resources to a neighbour’s territory after their neighbour has performed all of their actions in a time step. This implementation is probably slightly less efficient, but probably not too much.
Landscape actions are not implemented at the moment, and will need to be rewritten, though this should be considerably easier as there are fewer actions to perform and the actions occur directly on the landscape. The existing landscape_actions
(still not deleted from user.c
) might be easy to edit even. Once this is done, the whole model should be in place without any major issues; I’m not sure if Issue #26 is actually a bug, or just a consequence to be expected from a low-seeded genetic algorithm, but the algorithm works either way, and not having a seed is probably always a bad idea.
There are a few things that are definitely left to do.
landscape_actions
function in the user.c
file.
paras
. Some added elements might even help with the plotting later and the management function (predicting growth rates) – maybe make a paras_rec
that is a two dimensional array with time_max
rows.
One weird thing to address, which I actually don’t think is a bug: Sometimes when resource movement is very low (ca 1), the resources become highly autocorrelated on the landscape. I hypothesise that this is caused by some stake-holders doing a relatively poor job of killing resources at some point in the past and leading to a threshold of population growth that is localised and out of control. This happens often, but not consistently to the same agent, and sometimes to more than one agent in a simulation; it would be good to check to make sure that this is the correct interpretation of the patterns from the model.
One potential idea is to give the manager a bit more information: in addition to seeing empirically measured growth rates of species, they could also see how policy is enacted relative to how it is set. For example, if a manager sets a cost of 10 for killing, does that overshoot or undershoot the target? The degree to which the target is over- or undershot could be multiplied by the existing value to get a clearer prediction (e.g., if the manager wants 50 resources to die, but the way that those resources are distributed only allows 30 deaths because some resources are autocorrelated among different users’ land).
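A minimal sketch of that correction, assuming the manager can compare the number of actions it intended with the number actually realised in the previous time step. The function names are hypothetical and are not part of G-MSE.

```c
#include <assert.h>

/* Hypothetical helper (not part of G-MSE): ratio of realised to intended
 * actions in the previous time step; returns 1.0 when nothing was intended,
 * so that the prediction is left unchanged. */
double realised_ratio(double intended, double realised){
    assert(intended >= 0.0 && realised >= 0.0);
    if(intended == 0.0){
        return 1.0;
    }
    return realised / intended;
}

/* Scale a naive prediction by how much the last policy over- or undershot */
double corrected_prediction(double naive, double intended, double realised){
    return naive * realised_ratio(intended, realised);
}
```

With the numbers from the example above (50 intended deaths, 30 realised), the ratio is 0.6, so any naive prediction of deaths under the same policy would be scaled down by that factor.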
Resolved Issue #22
I’ve finally tracked down the bug that causes multiple resource types to crash in the user function. The error was in the land_to_counts
, which had conditions in the main while
loop that couldn’t be met and were unnecessary (might have to add one more at some point if we want multiple landscape layers to work – later though). I’ve removed this condition, and also initialised the COST
and ACTION
arrays without the extraneous rows caused by landscape levels being repeated for resource types; if nothing else, these were distracting, but I could see them causing bugs later. As of commit 102018fc0457e510f87e812a97681860bed1a382, G-MSE should be, in theory, functional with multiple resources, though I still have the rewrite of do_actions
to do.
Major test fails – rewrite of do_actions
needed
The user function has a major bug that is causing strange things to happen to resources. Consistently, resources are piling up on one or another user’s land – I’ve found little rhyme or reason why, but it is caused at least partly by the location-specific nature of user actions (in other words, once u_loc = 0
and users can affect any resource on the landscape, no spatial pattern exists). Note that this happens even when users cannot move resources (e.g., only kill on their land), so it’s not just that the last agent to act clears all the resources from their landscape. The resources always seem to collect on one owner’s land, and it’s not consistent whose (nor is there any seeming connection from the spatial distribution and the agent actions).
This bug gives me an excuse to re-write do_actions
, which I probably needed to do anyway because Issue #22 is still unresolved. A rewrite of do_actions
and everything downstream might fix the resource type specification error. As of now the do_actions
function is called for each agent sequentially, and each agent then performs their actions on each type of resource and landscape level by moving through rows of the ACTION
array (with error if there is more than one type of resource). Hence, one user does all of their business, then another, and so on.
It would be much better to do this all simultaneously, and it shouldn’t take too much computation time or coding time. Instead of going through agents sequentially, the idea is to copy the entire ACTION
array (all agents having gone through their genetic algorithms) into a function. Next, calculate the total number of actions to be performed. Then, sample a random row, column, and layer of ACTION
, which will be associated with a randomly selected agent. The lucky winner will then randomly sample rows of the RESOURCE
array until they find one that they can affect (e.g., is on their land, has not been killed, and is of the correct type); if they exhaustively search all resources but cannot find one to affect, then they don’t perform the action – note the element in the copied ACTION
array should not necessarily be set to zero because another agent might subsequently kick a resource onto their land to kill; it should decrement the action by one though (else a clear risk of infinite loop). Landscape actions can proceed the same way; the random selection simulates people doing things simultaneously over the course of a season.
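The idea can be sketched in a much-reduced form. The code below is only an illustration under stated assumptions: it handles a single action type (killing), replaces the 3D ACTION array with one count per agent, and uses a hypothetical `resource` struct in place of a RESOURCE row. It also scans resources once from a random starting row instead of sampling rows repeatedly, which keeps the search exhaustive and bounded.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for a RESOURCE row: only what this sketch needs */
typedef struct {
    int owner;  /* ID of the agent on whose land the resource sits */
    int alive;  /* 1 if alive, 0 if already killed                 */
} resource;

/* Perform all kill actions in random order across agents.
 * actions[a] is the number of kills agent a still wants to perform; it is
 * decremented after every attempt, successful or not, so the loop must end.
 * Returns the number of resources actually killed. */
int do_kills_simultaneously(int *actions, int n_agents,
                            resource *res, int n_res){
    int a, i, r, start, pick, target, total = 0, killed = 0;

    assert(n_agents > 0 && n_res > 0);
    for(a = 0; a < n_agents; a++){
        total += actions[a];
    }
    while(total > 0){
        /* Pick one remaining action at random; its owner is the actor */
        pick = rand() % total;
        a    = 0;
        while(pick >= actions[a]){
            pick -= actions[a];
            a++;
        }
        /* Search every resource once, from a random start, for a target */
        start  = rand() % n_res;
        target = -1;
        for(i = 0; i < n_res && target < 0; i++){
            r = (start + i) % n_res;
            if(res[r].alive == 1 && res[r].owner == a){
                target = r;
            }
        }
        if(target >= 0){
            res[target].alive = 0;
            killed++;
        }
        actions[a]--;  /* Spent either way; prevents an infinite loop */
        total--;
    }
    return killed;
}
```

Because an action is drawn in proportion to how many each agent has left, agents with more actions act more often, but no agent systematically acts first or last. The caller is assumed to seed the random number generator.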
Introduce Issue #26: Genetic algorithm seed reliance
For some reason, the initial seed of the genetic algorithm appears to be having an effect that it shouldn’t. When there are no individuals seeded in the genetic algorithm from the previous generation, the agents appear to go under-budget. It’s not clear why this is the case. Oddly, managers appear to use a budget of 250 despite it being set at 300 given any seed greater than zero. When the seed is zero, the budget for setting costs drops to ca 100 for reasons that are not at all clear to me. For stake-holders, the cost drops to a fraction of its set budget (about a 30th of it). Yet, the stake-holder cost is still too low even when a seed of 20 is set; most stake-holders spend ca 1/6 of their budget when they should be forced to spend all of it.
Stake-holders are helping resources when they should not
Stake-holders are helping resources. This was caused by some issues in resource_actions
in the user model, which have now been partially resolved in commit f1ce95e092739e6e53df05b326c491d917679eb9. Essentially, resources were being helped out too much (i.e., growth rates went from 0.05 to 1 when helping – changed now to 0.05 times two – increasing birth rate 100 percent), and sometimes being helped out even after having been killed or castrated. This is still happening, as is evident when looking at RESOURCE_REC
. Resolving it is priority one.
Cleanup and toward resolving Issue #21
I have done some more clean-up of the manager.c
file, mainly reducing the number of arguments passed to functions using the paras
vector. I’ve also removed some more hard-coded values, particularly by defining columns for things like resource types by holding column numbers in paras
. I’m not sure whether I want to do this for the action and cost array cols 7-12 in set_action_costs
yet. It might be a good idea. One thing to keep an eye on is the para[66]
value, which now is just the number of resources (also 1 minus the lookup
table rows). It holds together for now, and nicely can be affected globally, but I need to pay attention to how it’s affecting management in the set_action_costs
function.
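As an illustration of the pattern, here is a minimal sketch of a function that reads both the number of resources and a column index from the parameter vector instead of hard-coding them. The positions `PAR_RES_NUM` and `PAR_TYPE_COL` and the function `count_type` are hypothetical and do not reflect the real layout of paras.

```c
#include <assert.h>

/* Hypothetical positions in the parameter vector (not G-MSE's real layout) */
enum { PAR_RES_NUM = 0, PAR_TYPE_COL = 1, PAR_LEN = 2 };

/* Count resources of a given type; the column holding the type and the number
 * of rows both come from paras rather than from hard-coded constants. */
int count_type(double **resources, const double *paras, int type){
    int row, count = 0;
    int res_num  = (int) paras[PAR_RES_NUM];
    int type_col = (int) paras[PAR_TYPE_COL];

    assert(res_num >= 0 && type_col >= 0);
    for(row = 0; row < res_num; row++){
        if((int) resources[row][type_col] == type){
            count++;
        }
    }
    return count;
}
```

The function signature stays the same if more values are needed later; only the vector grows, which is the main appeal of passing paras everywhere.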
Note on managing-observing trade-off
We could introduce a trade-off between observation and allocating costs for the manager in G-MSE, as in Milner-Gulland (2011). Running this through the genetic algorithm could be a challenge – somehow the observation intensity would need to be put into the fitness function. Storing it would be fairly trivial – could just use bankem
, but converting observation time to manager fitness would require some more thought.
Introduce Issue #25: Agent’s action error
For some reason, some initial testing seemed to suggest that resource population growth increases with the number of stake-holders, even if those stake-holders are hostile to the resource. Some further testing confirmed that stake-holders don’t engage in actions when there are more than two of them – it’s possible that I hard-coded something during testing, but it needs to be fixed. For now, I’m shifting the default testing options to 3 stake-holders to isolate the issue.
Resolve Issue #25: Agent’s action error
That was quick. What happened was an issue with the COST
and ACTION
array – I basically had the code to initialise three but not four stake-holders accurately. When a fourth was initialised (or nine, in the case of one test), the stake-holder did nothing because it was devoting itself to costly nonsense actions from the start and couldn’t get out of them. Resources then did better because there were fewer agents able to affect them (those agents owning a smaller amount of land). With this resolved, a fourth stake-holder performs the expected actions and the population dynamics oscillate even more as a result because more total actions are being performed (and on more land, as I’ve set it).
More progress toward resolving Issue #21
I’ve now reduced the number of arguments and hard-coded values in the functions of observation.c
, leaving only the transect
and sample_fixed_res
functions to go. Overall, I do think that this makes the code more readable, and everything goes back to the paras
vector, which will be useful later in input and output during software testing and use.
Progress toward resolving Issue #21
The functions in resource.c
now take the paras
vector as an argument where practical (most of the time). This cleans the code up quite a bit and has the nice side effect of giving me an excuse to also remove some of the hard coded values (even if they don’t change, this is probably a good idea).
Concrete plans for cleaning up the code
Now that the main engine of G-MSE is in place, there are a few things that I want to do in the next week or two to clean up the code.
para
vector as originally intended. It shouldn’t take too much extra work to do this, and it can be done systematically for each c file by adding any new arguments to para
if they are not in it already.
gmse
. It’s not necessary for running the model, but it would be very nice to somehow create some options to build the COST
and ACTION
arrays for some simple scenarios – perhaps even have a way to edit these arrays easily within the code (then, eventually, as input into gmse()
).
Following these things, it would also be helpful to do the following.
Resolved Issue #24 Resources retain helpem
and feedem
Issue #24 appears to be resolved, although it was a bit trickier than anticipated to do so. I created three additional columns in the RESOURCE
array to store the change in the baseline values of birthrate, death probability, and offspring number. As far as I can tell, there is no longer any carryover in these demographic values, nor do parents pass on their adjusted values to their offspring. Fixing this required several changes to user.c
and resource.c
. As a consequence, death rate caused by killing is now completely independent of carrying capacity (as seems sensible). Another thing to decide is if increases in birth rate or offspring number caused by user actions should also be independent of carrying capacity; that is, when users helpem
or feedem
, are they increasing the carrying capacity itself, or just the population growth rate to carrying capacity (as of now, it’s the latter).
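A minimal sketch of the adjustment-column idea, assuming a resource row that stores baseline demographic values alongside temporary adjustments. The column positions and function names are hypothetical and chosen only for illustration; they are not the real RESOURCE layout.

```c
#include <assert.h>

/* Hypothetical column positions in one resource row */
enum { COL_BIRTH = 0, COL_DEATH = 1, COL_OFFSPRING = 2,
       COL_ADJ_BIRTH = 3, COL_ADJ_DEATH = 4, COL_ADJ_OFFSPRING = 5,
       RES_COLS = 6 };

/* Realised birth rate this time step: baseline plus temporary adjustment */
double realised_birth(const double *res_row){
    return res_row[COL_BIRTH] + res_row[COL_ADJ_BIRTH];
}

/* Clear temporary adjustments at the end of a time step (no carryover) */
void reset_adjustments(double *res_row){
    res_row[COL_ADJ_BIRTH]     = 0.0;
    res_row[COL_ADJ_DEATH]     = 0.0;
    res_row[COL_ADJ_OFFSPRING] = 0.0;
}

/* Offspring inherit baseline values only, never the parent's adjustments */
void make_offspring(const double *parent, double *offspring){
    int col;

    assert(parent != offspring);
    for(col = 0; col < RES_COLS; col++){
        offspring[col] = parent[col];
    }
    reset_adjustments(offspring);
}
```

Keeping the baseline untouched means that a user action can be undone simply by zeroing the adjustment columns, with no need to remember what the original value was.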
A working example – but still some debugging to do
After resolving Issue #24, some initial testing shows that the model appears to be working as intended (more testing is obviously needed). The below shows a scenario in which one resource has a small effect on crop yield. The upper left panel shows resources on the landscape. The upper right panel shows land ownership (the blue is public – manager owned – land). The middle left shows population abundance (black) and its estimate (blue); carrying capacity is 400 (red dashed line), but the manager is trying to keep the population around 200 (dashed blue line) – mean percent yield of the crop is shown in the orange line. The mid right panel shows yields of each plot. The lower left panel shows the changing policy set by the manager – red lines show the cost of stakeholders killing or castrating resources (very high values effectively prohibit it). Green shows the cost of moving (scaring) resources off the stake-holder’s land, and blue shows the cost of helping the resource (increasing its birthrate or offspring production). The lower right shows what stake-holders do in response to policy – colours show actions corresponding to the same colour costs in the lower left panel.
So in the above example, we have the manager effectively prohibiting killing or castrating resources until about generation 18, when the population gets higher than desired. At this point, the manager switches to allow killing and castrating, and makes moving and helping resources more costly – stake-holders respond to this by doing a bit more killing and castrating, and the population goes down in response.
Looking good, but still need to clean the code
The above example is encouraging, but there is still quite a bit of clean-up to do. More unit testing is necessary to make sure that all resources are doing what they should, and I think the interaction between resources and landscape could be made a bit better in the resource model. Also, setting the initial costs and utilities is quite messy – I need to fix this up a bit so that there is at least one easy place to do this in the code, then an easy way to do it as an argument in the gmse
function. It would also be nice not to have managers or stake-holders be quite so short-sighted – but having decisions be made based on history will require quite a bit more work, though the structure is there for it to be done in the code.
One more debugging in the genetic algorithm
A bug in the code was causing managers to set their marginal fitnesses to zero within the genetic algorithm. The reason for this was that the functions crossover
and mutation
allowed for util
, u_loc
, and u_land
columns to be changed when the zero column of the action array was positive – i.e., when the actions corresponded to affecting other agents’ utilities in some way. The reason for this coding is so that agents can potentially affect one another’s utilities (e.g., a stake-holder lobbying the manager), but it does not make sense for stake-holders to affect their own utilities. The bug was caused because when the manager mutated (or crossed over) to change their own utilities in some way, the high cost recognised this as over-budget and set the value to zero, hence replacing the marginal utility set in the manager model. This was easily fixed by not allowing an agent to affect its own utility values (i.e., disallow the utility columns to be changed when the first column of the ACTION
array equals the agent’s own ID). This would have caused an issue later anyway, so it’s better to spot it now. Re-running the model, the bug is fixed and the manager marginal utilities are retained in the appropriate row of ACTION
(see commit 4dacbe83ed1be0d1216b692a1db18f5323ed22f2).
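The guard itself can be sketched as a small predicate that mutation and crossover would consult before changing an element. This is an illustration only: `can_change` is a hypothetical name, and the column layout (target ID in column 0; util, u_loc, and u_land in columns 4 to 6) is an assumption.

```c
#include <assert.h>

/* Assumed layout of an action row: column 0 holds the ID of the agent being
 * acted upon, columns 4-6 hold util, u_loc, and u_land. */
enum { COL_TARGET_ID = 0, COL_UTIL = 4, COL_U_LAND = 6 };

/* Returns 1 if mutation or crossover may change this element, else 0 */
int can_change(const double *action_row, int col, int agentID){
    int is_util_col = (col >= COL_UTIL && col <= COL_U_LAND);

    assert(col >= 0);
    if(is_util_col && (int) action_row[COL_TARGET_ID] == agentID){
        return 0;  /* An agent may not rewrite its own utilities */
    }
    return 1;
}
```

Only the self-utility guard is shown here; any other conditions on which columns may change would need to be checked separately.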
Another thing that needs de-bugging
For some reason, managers are going way over budget in allocating actions. Fortunately, they’re at least allocating their actions well, but I need to find out why their budget looks more like 500 when I set it to 100. Note that this only happens when managers want more of a resource, not fewer. Perhaps the marginal utility is getting added into the budget? Yes, this appears to be the case and has been fixed with commit d60312da590630fc2a680a57b8daed8e6d6bfafd, and now the costs no longer go over the manager’s budget.
Valgrind summary
Some initial testing revealed that some memory might have been poorly allocated; allocating space for an int
instead of a double
in the genetic algorithm was flagged by valgrind
. After fixing this (see commit 57d0c78de7e421687870749549d309cf85d31dab), valgrind
returns no errors or leaks.
==8048==
==8048== HEAP SUMMARY:
==8048== in use at exit: 194,900,716 bytes in 19,004 blocks
==8048== total heap usage: 12,387,437 allocs, 12,368,433 frees, 2,482,975,628 bytes allocated
==8048==
==8048== LEAK SUMMARY:
==8048== definitely lost: 0 bytes in 0 blocks
==8048== indirectly lost: 0 bytes in 0 blocks
==8048== possibly lost: 0 bytes in 0 blocks
==8048== still reachable: 194,900,716 bytes in 19,004 blocks
==8048== suppressed: 0 bytes in 0 blocks
==8048== Reachable blocks (those to which a pointer was found) are not shown.
==8048== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==8048==
==8048== For counts of detected and suppressed errors, rerun with: -v
==8048== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I dare say that this might nearly be an alpha version of the software. I just need to get some more clever ways to input array values, and make sure that I resolve Issue #22.
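For reference, one common way to avoid the kind of allocation error that valgrind flagged is to take the element size from the pointer being assigned instead of naming a type. The helper below is a hypothetical sketch, not a function in G-MSE.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate n doubles initialised to zero; the caller must free the result.
 * Writing sizeof *vec ties the element size to the pointer's type, so the
 * allocation cannot silently be sized for int when vec points to double. */
double *alloc_double_vec(int n){
    double *vec;

    assert(n > 0);
    vec = calloc((size_t) n, sizeof *vec);
    return vec;  /* NULL if the allocation failed */
}
```

On most platforms an int is smaller than a double, so sizing a double array with sizeof(int) under-allocates and later writes run past the end of the block, which is what memory checkers report as invalid writes.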
Issue #24 Resources retain helpem
and feedem
Resources are retaining their values of helpem
and feedem
after being helped for one generation. Worse, they are passing their inherited characteristics on to their offspring. This needs to be changed so that agent actions have the temporary effect of increasing offspring survival probability or reproduction – else populations will never run the risk of crashing.
Some success with manager fitness function debugging – more testing needed
After much time working on debugging the manager fitness function, I believe that all of the bugs are worked out of it, and that the managers are now responding dynamically to agent actions and resource abundances. I now need to test the whole function in multiple different ways to confirm this, and to make sure that the manager sets policy as predicted for some very simple scenarios.
helpem
and feedem
costly when resource abundance is higher than what managers want it to be, but stake-holders want more resources.
One potential issue I’ve already noticed – if managers make stake-holder actions so costly that they never perform them, then the manager might operate under the assumption that they will never perform the action even if costs drop. It might therefore be necessary to add an increment to the total actions (e.g., add 10 to each, just to give managers the ability to consider the possibility) or somehow have managers tie predicted actions to stake-holder utilities (I don’t like this as much – too speculative and computationally intense).
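A minimal sketch of the increment idea, assuming that predicted actions scale with the ratio of old to new cost. The function `predict_actions` and its `baseline` argument are hypothetical, and the fallback cost of 0.5 for a zero cost is an assumption made here to avoid division by zero.

```c
#include <assert.h>

/* Hypothetical prediction of how many times an action will be performed
 * after a cost change. A small baseline is added to the observed count so
 * that an action priced out of use (observed == 0) can still be predicted
 * to return when its cost falls. */
double predict_actions(double observed, double old_cost, double new_cost,
                       double baseline){
    assert(observed >= 0.0 && old_cost > 0.0 && baseline >= 0.0);
    if(new_cost <= 0.0){
        new_cost = 0.5;  /* Avoid dividing by zero */
    }
    return (observed + baseline) * (old_cost / new_cost);
}
```

With a baseline of zero, an action that was never performed is predicted to stay at zero no matter how cheap it becomes, which is the trap described above. The price of a non-zero baseline is a small upward bias in every prediction.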
Debugging the manager fitness function in the genetic algorithm
Today I have spent my time attempting to completely debug the newly created manager_fitness
function and its sub-functions. Unfortunately, one bug still appears to remain. For some reason, the function adds actions to the POPULATION
array in the first row. This issue has been isolated, and I’m almost sure that it is caused by something in manager_fitness
. Tomorrow, the goal is to fix this so that actions are applied correctly where the row’s first column is 1 (the manager agentID
).
New Issue #23: Revise predicted consequences of user and manager actions
In functions in the genetic algorithm res_to_counts
and policy_to_counts
, the projected consequences of actions need to be fine-tuned. As of now, one fewer resource is predicted from movem
, killem
, and castem
, and one more resource from feedem
and helpem
in res_to_counts
. In policy_to_counts
, it predicts one fewer resource for killem
and one more resource for feedem
and helpem
. Really, there should probably at least be an option to use more precise estimates of what will happen. For the user function, this matters a bit less because stake-holders typically just want more or less of a resource. Managers, however, are trying to hit a middle ground a lot of the time; it is also more reasonable to assume that they have demographic information on the resources of interest.
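The current one-action-one-resource rule can be written as a small function, sketched here for a single row of actions. Columns 9 to 11 follow the indices used in manager_fitness; placing movem and castem in columns 7 and 8 is an assumption, and `predicted_change` is a hypothetical name.

```c
#include <assert.h>

/* Assumed action columns; 9-11 match manager_fitness, 7-8 are a guess */
enum { COL_MOVEM = 7, COL_CASTEM = 8, COL_KILLEM = 9,
       COL_FEEDEM = 10, COL_HELPEM = 11 };

/* Net predicted change in resource count from one row of actions, using the
 * current one-action-one-resource rule. If for_manager is non-zero, only
 * killem counts as a loss (as in policy_to_counts); otherwise movem and
 * castem do as well (as in res_to_counts). */
double predicted_change(const double *action_row, int for_manager){
    double change = 0.0;

    assert(action_row != 0);
    change -= action_row[COL_KILLEM];
    if(!for_manager){
        change -= action_row[COL_MOVEM];
        change -= action_row[COL_CASTEM];
    }
    change += action_row[COL_FEEDEM];
    change += action_row[COL_HELPEM];
    return change;
}
```

A more precise version would replace the implicit weights of one with per-action effect sizes, for example derived from the demographic parameters of the resource.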
More writing and re-writing the manager genetic algorithm
I have completed an initial draft of the manager fitness function manager_fitness
and its associated sub-functions policy_to_counts
and sum_array_layers
. The function manager_fitness
might need to be pruned a bit by adding a third sub-function, as it’s a bit long at the moment.
/* =============================================================================
* This is a preliminary function that checks the fitness of each agent by
* passing through a loop to payoffs_to_fitness
* fitnesses: Array to order fitnesses of the agents in the population
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* agent_array: The agent array
* jaco: The jacobian matrix of resource and landscape interactions
* interact_table: Lookup table for figuring out rows of jaco and types
* interest_num: The number of rows and cols in jac, and rows in lookup
* ========================================================================== */
void manager_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, double **agent_array, double **jaco,
int **interact_table, int interest_num, int agentID,
double ***COST, double ***ACTION, int COLS, int layers){
int agent, i, row, act_type, action_row, manager_row, type1, type2, type3;
double agent_fitness, *count_change, foc_effect, change_dev, max_dev;
double movem, castem, killem, feedem, helpem, *dev_from_util;
double utility, *utils, **merged_acts, **merged_costs, **act_change;
count_change = malloc(interest_num * sizeof(double));
utils = malloc(interest_num * sizeof(double));
dev_from_util = malloc(pop_size * sizeof(double)); /* Indexed by agent */
merged_acts = malloc(ROWS * sizeof(double *));
for(i = 0; i < ROWS; i++){
merged_acts[i] = malloc(COLS * sizeof(double));
}
merged_costs = malloc(ROWS * sizeof(double *));
for(i = 0; i < ROWS; i++){
merged_costs[i] = malloc(COLS * sizeof(double));
}
act_change = malloc(ROWS * sizeof(double *));
for(i = 0; i < ROWS; i++){
act_change[i] = malloc(COLS * sizeof(double));
}
sum_array_layers(ACTION, merged_acts, 0, ROWS, COLS, layers);
sum_array_layers(COST, merged_costs, 1, ROWS, COLS, layers);
max_dev = 0;
for(agent = 0; agent < pop_size; agent++){
for(action_row = 0; action_row < interest_num; action_row++){
count_change[action_row] = 0; /* Initialise at zero */
utils[action_row] = 0; /* Same for utilities */
while(population[action_row][0][agent] < -1){
type1 = population[action_row][1][agent];
type2 = population[action_row][2][agent];
type3 = population[action_row][3][agent];
manager_row = 0;
while(population[manager_row][0][agent] == agentID &&
population[manager_row][1][agent] == type1 &&
population[manager_row][2][agent] == type2 &&
population[manager_row][3][agent] == type3
){
manager_row++;
}
}
policy_to_counts(population, merged_acts, agent, merged_costs,
act_change, action_row, manager_row, COLS);
foc_effect = 0.0;
foc_effect -= act_change[action_row][9]; /* See Issue #23 */
foc_effect += act_change[action_row][10];
foc_effect += act_change[action_row][11];
for(i = 0; i < interest_num; i++){
count_change[i] += foc_effect * jaco[action_row][i];
}
utils[action_row] = population[manager_row][4][agent];
}
change_dev = 0; /* Reset the summed deviation for each agent */
for(i = 0; i < interest_num; i++){ /* Minimises dev from marg util*/
change_dev += (count_change[i]-utils[i])*(count_change[i]-utils[i]);
}
if(change_dev > max_dev){
max_dev = change_dev;
}
dev_from_util[agent] = change_dev;
}
for(agent = 0; agent < pop_size; agent++){
fitnesses[agent] = max_dev - dev_from_util[agent];
}
for(i = 0; i < ROWS; i++){
free(act_change[i]);
}
free(act_change);
for(i = 0; i < ROWS; i++){
free(merged_costs[i]);
}
free(merged_costs);
for(i = 0; i < ROWS; i++){
free(merged_acts[i]);
}
free(merged_acts);
free(dev_from_util);
free(utils);
free(count_change);
}
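The final step of manager_fitness above converts each agent's summed squared deviation into a fitness by subtracting it from the largest deviation found in the population. A minimal sketch of just that step is below; the helper name dev_to_fitness is hypothetical and is not part of game.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in game.c): turns squared deviations into
 * fitnesses so that the smallest deviation gets the highest fitness. */
void dev_to_fitness(const double *dev, double *fitness, size_t pop_size){
    size_t agent;
    double max_dev = 0.0;

    assert(dev != NULL && fitness != NULL);

    /* Find the largest deviation across the whole population */
    for(agent = 0; agent < pop_size; agent++){
        if(dev[agent] > max_dev){
            max_dev = dev[agent];
        }
    }
    /* Fitness is the distance below the worst deviation, so it is >= 0 */
    for(agent = 0; agent < pop_size; agent++){
        fitness[agent] = max_dev - dev[agent];
    }
}
```

With this scaling the worst agent in the population always has a fitness of zero, so fitness values are only comparable within a single generation.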
The policy_to_counts
function feeds new actions back to the main manager fitness function based on the new costs imposed by managers. We assume that new actions scale inversely with the proportional change in costs (e.g., twice as many killem
actions if the manager makes it cost half as much). In cases where the cost drops to zero (debating whether I want this to be possible – probably not), we assume the new cost is 0.5 and calculate accordingly.
/* =============================================================================
* This function updates a temporary action array for changes in policy
* population: The population array of agents in the genetic algorithm
* merged_acts: The action 2D array of summed elements across 3D ACTION
* agent: The agent (layer) in the population being simulated
* merged_costs: The mean cost paid for each element in the ACTION array
* act_change: The array of predicted new actions given new costs
* action_row: The row where the action and old costs are located
* manager_row: The row where the new costs from the manager are located
* COLS: The number of columns in the ACTION and COST arrays
* ========================================================================== */
void policy_to_counts(double ***population, double **merged_acts, int agent,
double **merged_costs, double **act_change,
int action_row, int manager_row, int COLS){
int col;
double old_cost, new_cost, cost_change, new_action;
for(col = 0; col < COLS; col++){
old_cost = merged_costs[action_row][col];
new_cost = population[manager_row][col][agent];
if(new_cost == 0){
new_cost = 0.5; /* Avoids dividing by zero (an Inf increase in actions) */
}
cost_change = old_cost / new_cost;
new_action = merged_acts[action_row][col] * cost_change;
act_change[action_row][col] = floor(new_action);
}
}
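To make the scaling rule concrete, here is a minimal sketch of the calculation for a single action column. The function name predicted_actions is hypothetical, and it truncates with a cast rather than calling floor (equivalent here because all values are assumed non-negative).

```c
#include <assert.h>

/* Hypothetical single-column version of the rule in policy_to_counts:
 * predicted actions scale by old_cost / new_cost, and a new cost of zero
 * is treated as 0.5 to avoid dividing by zero. */
double predicted_actions(double old_actions, double old_cost, double new_cost){
    double scaled;

    assert(old_actions >= 0.0 && old_cost >= 0.0 && new_cost >= 0.0);

    if(new_cost == 0.0){
        new_cost = 0.5;
    }
    scaled = old_actions * (old_cost / new_cost);

    /* Truncation equals floor for non-negative values */
    return (double) (long) scaled;
}
```

For example, 16 killem actions at an old cost of 10 become 32 predicted actions if the manager halves the cost to 5.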
The function sum_array_layers
is basically an apply function in R, except that it only works with the COST
or ACTION
arrays, and only in one dimension.
/* =============================================================================
* This function sums (or averages) a row of COST or ACTION across all layers
* array: The 3D array that is meant to be summed or averaged
* out: The 2D array where the summed/average values are to be stored
* get_mean: TRUE (1) or FALSE (0) indicating whether to get mean vs sum
* ROWS: Number of rows in array
* COLS: Number of cols in array
* layers: How many layers there are in array (depth)
* ========================================================================== */
void sum_array_layers(double ***array, double **out, int get_mean, int ROWS,
int COLS, int layers){
int row, col, layer;
for(row = 0; row < ROWS; row++){
for(col = 0; col < COLS; col++){
out[row][col] = 0; /* Start at zero; malloc does not initialise memory */
if(get_mean == 1){
for(layer = 0; layer < layers; layer++){
out[row][col] += (array[row][col][layer] / layers);
}
}else{
for(layer = 0; layer < layers; layer++){
out[row][col] += array[row][col][layer];
}
}
}
}
}
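As a self-contained illustration of the same idea, the sketch below collapses layers for a small array stored in one flat block of memory. The flat layout and the name collapse_layers are assumptions made for the example; game.c itself uses triple pointers.

```c
#include <assert.h>

/* Hypothetical flat-array version of sum_array_layers. The element at
 * (row, col, layer) is assumed to live at
 * array[(row * cols + col) * layers + layer]. */
void collapse_layers(const double *array, double *out, int get_mean,
                     int rows, int cols, int layers){
    int row, col, layer;
    double total;

    assert(array != 0 && out != 0 && layers > 0);

    for(row = 0; row < rows; row++){
        for(col = 0; col < cols; col++){
            total = 0.0; /* Each output cell starts from zero */
            for(layer = 0; layer < layers; layer++){
                total += array[(row * cols + col) * layers + layer];
            }
            if(get_mean == 1){
                total /= layers;
            }
            out[row * cols + col] = total;
        }
    }
}
```

Accumulating into a local total and writing it once means the output array never needs to be zeroed by the caller.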
I have not tested any of these functions at all. They almost certainly contain some bugs at the moment, so a lot of work is going to need to go into debugging them and making sure that they actually are doing what I want them to do. Tomorrow might be a good time for thorough debugging and memory leak checks. If all this works though, managers should be able to dynamically change costs in response to stake-holders to manage resources – once the appropriate call from manager.c
is in place (it hasn’t been coded yet, but this should be trivial to write). Note that the git history immediately prior to commit 79446e394133bb9e6b4792d334ab863e32ef0881 will show some attempts at getting the above functions working in different ways. I settled on the above after restructuring the code considerably for both speed and readability.
Linking manager marginal utilities and manager actions remains difficult, but I have decided on the following plan to move forward.
It will be useful to develop a very simple criterion for assessing the fitness of adjusting costs in strategy_fitness
in game.c
. Do this in the switch
statement under case 1:
, but include an if
statement to make sure that if(act_type == agentID)
, then the genetic algorithm knows that it’s affecting all other user actions in the -2
row. A new function policy_to_counts
will be created in game.c
which takes in the ACTION
and COST
arrays. This new function will assume two things.
The proportion of +
, -
, and 0
actions (from the perspective of a stake-holder) will not change – i.e., stake-holders will try to achieve the same ends in the next time step as they did in the previous time step. The movem
column will be defined as -
if util_loc = 1
and util_land = 1
, else it will be defined as 0
(again, from the stake-holder perspective; from the manager’s perspective this is always 0
– at least, I can’t think of any reason why we would want it not to be zero).
Stake-holders will invest in whatever +
, -
, or 0
action is least costly. Hence to loosely predict stake-holder actions, the manager could simply assume that the stake-holder invests a proportion of their total budget to the least costly action, and based on the manager’s set cost, puts their budget into those actions accordingly. Hence, if we had a farmer that wanted to increase crop yield by reducing resource abundance, and had to choose between movem
, killem
, and castem
with costs of 10, 2, and 5, respectively, then the farmer would put all of their budget into killem
(or a high proportion, at least). This requires the manager (whose actions are already set within the genetic algorithm) to get a proportion for each of the stake-holders’ actions, then divvy their actions out based on the revised costs.
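The farmer example can be sketched as picking the cheapest of the candidate actions and spending the budget there. Both function names and the budget of 100 used in the usage note are hypothetical; only the costs of 10, 2, and 5 come from the example above.

```c
#include <assert.h>

/* Hypothetical helper: index of the least costly action in costs[] */
int cheapest_action(const double *costs, int n){
    int i, best;

    assert(costs != 0 && n > 0);

    best = 0;
    for(i = 1; i < n; i++){
        if(costs[i] < costs[best]){
            best = i;
        }
    }
    return best;
}

/* Hypothetical helper: whole actions that a budget buys at a given cost */
double affordable_actions(double budget, double cost){
    assert(budget >= 0.0 && cost > 0.0);

    /* Truncation equals floor for non-negative values */
    return (double) (long) (budget / cost);
}
```

With costs ordered as movem, killem, castem = {10, 2, 5}, cheapest_action returns index 1 (killem), and an assumed budget of 100 would buy 50 killem actions.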
The initial plan: getting something to work
Let’s try all of the above again. We’re trying to get from cost adjustments to fitness. We have the cost adjustments in hand; the manager population in the genetic algorithm is in the process of selecting which of these adjustments are best. The difficulty is now translating the cost adjustments to stake-holder actions, and figuring out just how good we want managers to be at assessing stake-holder actions. One extreme is to run the genetic algorithm for each stake-holder within the manager’s decision making to figure out how stake-holders will respond to policy change with a high degree of accuracy; this would take a massive amount of computation time and be a bit unrealistic in that it would kind of assume that managers can read the minds of stake-holders.
Another extreme is to assume the sum total of each action will not change and to adjust costs accordingly. Perhaps, to start, we could define a new array within the new function policy_to_counts
, **sum_actions
, which would sum up all stake-holder actions for each resource type.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 10 | 10 | 10 | 32 | 2 | 16 | 0 | 0 | 0 |
-2 | 2 | 0 | 0 | 301 | 10 | 10 | 4 | 1 | 0 | 0 | 1 | 1 |
The hypothetical sum_actions
array above adds up all of the rows in the ACTION
array where column 1 equals -2. For each resource we then get a picture of what is going to happen in the next generation (to some extent it is unrealistic to assume that managers have even this much detailed information, but then again, these are actions from the previous time step, so it’s perhaps not too much of a stretch to assume that the manager has some idea of what actions were taken by stake-holders). We also get a picture of the sum utilities for each resource type. To project the consequences of manager cost adjustment, managers can compute the proportional change in cost (which will require reading the COST
array into the fitness function) and assume that the proportion of actions changes accordingly. For example, if the manager makes it half as costly to killem
for resource type1 = 1
above, then they could assume that killem
will be 32 total actions in the next generation. These sum total actions, adjusted by manager changes in costs, could then be run through the interaction array to project the change in resource abundance – fitness could be assessed by minimising the difference between the projected change in resources and the marginal utilities.
This isn’t perfect prediction. Sometimes stake-holders will probably radically change behaviour after some cost threshold is met, but I think this is kind of okay (at the very least, managers will respond in the next generation).
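A minimal sketch of that projection is below, assuming two resource types and a small fixed-size interaction (Jacobian) matrix. The function name projected_deviation and the N_TYPES constant are hypothetical; the arithmetic mirrors the count_change and change_dev calculations in manager_fitness.

```c
#include <assert.h>

#define N_TYPES 2 /* Hypothetical number of resource types */

/* Hypothetical helper: runs the focal effect of actions on each resource
 * through the interaction matrix, then returns the summed squared deviation
 * of the projected changes from the manager's marginal utilities. */
double projected_deviation(double jaco[N_TYPES][N_TYPES],
                           const double *foc_effect,
                           const double *marg_util){
    int row, i;
    double count_change[N_TYPES], diff, dev;

    assert(foc_effect != 0 && marg_util != 0);

    for(i = 0; i < N_TYPES; i++){
        count_change[i] = 0.0;
    }
    /* A direct effect on resource 'row' spills over to resource i */
    for(row = 0; row < N_TYPES; row++){
        for(i = 0; i < N_TYPES; i++){
            count_change[i] += foc_effect[row] * jaco[row][i];
        }
    }
    dev = 0.0;
    for(i = 0; i < N_TYPES; i++){
        diff = count_change[i] - marg_util[i];
        dev += diff * diff;
    }
    return dev; /* Lower is better; zero means a perfect projected match */
}
```

If the manager wants 330 fewer of the first resource and the projected actions remove 300, an identity interaction matrix gives a deviation of 30 squared, or 900.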
Other ideas
I will start coding the above plan, but there are probably other reasonable options to consider. I would like to also add the option of enacting policy via a second resource – representing resources as something like hunting licenses. The effects of these licenses could be understood through the interaction matrix (essentially, they’d be like introducing a predator, but one that goes through stake-holders). The manager could set the number of hunting licenses using the feedem
(increases birth rate by action number) and castem
(causes one fewer so resource doesn’t reproduce) columns (licenses would otherwise have a birthrate and death rate of one, so each replaces itself in the next generation) – birth type would also need to be changed to not be selected from a random Poisson. The bankem
column could be interpreted as buying a license, somehow.
Implementation of the initial plan
This was somewhat difficult because of the way that marginal utilities are handled in the manager.c
file. A new vector was needed to grab the correct utilities and actions for adjusting costs, and it was easier and more readable to just write a separate manager_fitness
function (it can still be called by non-managers, though I’m struggling to think of when this would be desirable). The manager_fitness
function is unfinished.
/* =============================================================================
* This is a preliminary function that checks the fitness of each agent by
* passing through a loop to payoffs_to_fitness
* fitnesses: Array to order fitnesses of the agents in the population
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* agent_array: The agent array
* jaco: The jacobian matrix of resource and landscape interactions
* interact_table: Lookup table for figuring out rows of jaco and types
* interest_num: The number of rows and cols in jac, and rows in lookup
* ========================================================================== */
void manager_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, double **agent_array, double **jaco,
int **interact_table, int interest_num, int agentID){
int agent, i, row, act_type, action_row, manager_row, type1, type2, type3;
double agent_fitness, *count_change, foc_effect, change_dev;
double movem, castem, killem, feedem, helpem;
double utility, *utilities;
count_change = malloc(interest_num * sizeof(double));
utilities = malloc(interest_num * sizeof(double));
for(agent = 0; agent < pop_size; agent++){
for(i = 0; i < interest_num; i++){
count_change[i] = 0; /* Initialise all count changes at zero */
utilities[i] = 0; /* Same for utilities */
}
action_row = 0;
while(population[action_row][0][agent] < -1){
type1 = population[action_row][1][agent];
type2 = population[action_row][2][agent];
type3 = population[action_row][3][agent];
manager_row = 0;
while(population[manager_row][0][agent] == agentID &&
population[manager_row][1][agent] == type1 &&
population[manager_row][2][agent] == type2 &&
population[manager_row][3][agent] == type3
){
manager_row++;
}
}
/* Get the marginal utilities into utilities by running policy_to_counts
* and get the count_change the same way. The above runs through this
* for each agent and for each resource. Here, still within the agent loop,
* we need to get the vectors summed appropriately to a reasonable
* fitness metric (keeping in mind that it's not just ordinal)
*/
fitnesses[agent] = 0;
for(i = 0; i < interest_num; i++){ /* Minimises dev from marg util*/
change_dev = (count_change[i] - utilities[i]) *
(count_change[i] - utilities[i]) + 1;
fitnesses[agent] += (1 / change_dev);
}
}
free(utilities);
free(count_change);
}
Likewise, a sub-function that manager_fitness
will call also needs some work.
/* =============================================================================
* This function updates count change and utility arrays for changes in policy
* population: The population array of agents in the genetic algorithm
* interact_table: The lookup table for figuring out how resources interact
* int_num: The number of rows and cols in jac, and rows in the lookup
* utilities: A vector of the utilities of each resource/landscape level
* agent: The agent in the population whose fitness is being assessed
* layers: The number of layers (z dimension) in the COST and ACTION arrays
* COST: The cost array, for comparison with how costs change with actions
* ACTION: The action array to summarise current stake-holder actions
* agentID: The ID of the agent doing policy (should probably always be 1)
* ========================================================================== */
void policy_to_counts(double ***population, int **interact_table, int int_num,
double *utilities, int agent, int layers, double **jaco,
double *count_change, double ***COST, double ***ACTION,
int agentID, int ROWS, int action_row, int manager_row){
int row, col, layer, act_type, i, type1, type2, type3, cost_row;
double old_cost, new_cost, cost_change, new_action, mean_cost, sum_actions;
double **mean_costs, *hold_actions;
hold_actions = malloc(13 * sizeof(double));
for(i = 0; i < 13; i++){
hold_actions[i] = population[action_row][i][agent];
}
for(col = 7; col < 13; col++){
sum_actions = 0;
mean_cost = 0;
for(layer = 0; layer < layers; layer++){
sum_actions += ACTION[action_row][col][layer];
mean_cost += (COST[action_row][col][layer] / layers);
}
old_cost = mean_cost;
new_cost = population[manager_row][col][agent];
cost_change = old_cost / new_cost;
new_action = sum_actions * cost_change;
population[action_row][col][agent] = floor(new_action);
}
res_to_counts(population, interact_table, int_num, count_change, utilities,
jaco, action_row, agent);
for(i = 0; i < 13; i++){
population[action_row][i][agent] = hold_actions[i];
}
free(hold_actions);
}
The history of struggling with these two functions in a way that is accurate, readable, and efficient is in the git history. I’ll consider both functions with fresh eyes tomorrow with the goal of getting something working.
We have now reached a point where we have a clear link from manager utility to a manager’s desired change in resources. The util
column of a manager (layer = 1) action array defines how many resources of a particular type the manager wants there to be when column 1 equals -2 (added below for clarity).
-2.000000 1.000000 0.000000 0.000000 100.000000 1.000000 1.000000
-1.000000 1.000000 0.000000 0.000000 1.000000 1.000000 1.000000
1.000000 1.000000 0.000000 0.000000 -330.696014 0.000000 0.000000
2.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000
3.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000
From this util
value and the estimated abundance from the observation model, we can get to the marginal utility, which is placed in the same action array layer where column 1 equals the manager ID. We now need this value to have some effect; e.g., in the above where the population size is 330 individuals more than the manager wants, the manager needs to adjust the cost array in some way that has the predicted effect of lowering population size by roughly this amount. The way that the genetic algorithm can learn to do this is by assuming that the action array (which will have been the actions run in the last user
model) represents what stake-holders will do when constrained appropriately by costs. So, for example, we can consider the ACTION
array below.
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] -2 1 0 0 100 1 1 0 0 0 0 20 0
[2,] -1 1 0 0 1 1 1 0 0 1 0 0 0
[3,] 1 1 0 0 0 0 0 0 0 0 0 0 1
[4,] 2 1 0 0 0 0 0 0 1 1 0 0 0
[5,] 3 1 0 0 0 0 0 0 0 0 0 0 0
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] -2 1 0 0 1 1 1 81 0 0 0 0 0
[2,] -1 1 0 0 100 1 1 0 0 0 0 0 0
[3,] 1 1 0 0 0 0 0 0 0 0 0 0 0
[4,] 2 1 0 0 0 0 0 0 0 0 0 0 0
[5,] 3 1 0 0 0 0 0 0 0 0 0 0 0
, , 3
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] -2 1 0 0 1 1 1 72 2 0 0 0 1
[2,] -1 1 0 0 100 1 1 1 0 0 0 0 0
[3,] 1 1 0 0 0 0 0 0 1 0 0 0 0
[4,] 2 1 0 0 0 0 0 0 0 0 1 0 0
[5,] 3 1 0 0 0 0 0 0 0 1 0 0 0
The first layer is the manager, and the second two are stake-holders that have util = 100
for the landscape layer 1, and util = 1
for the resource defined by type1 = 1
, type2 = 0
, and type3 = 0
. In the above example, the resource might be geese disturbing crops, and the stake-holders might be farmers. In both cases, the stake-holders devote all or nearly all of their budget to moving the resource (column 8 corresponds to movem
). The manager can then project the total number of resources increased or decreased by these actions, and – perhaps eventually – whose land they will be on (the code is there for the manager to prefer them on public or private land, but this might need to be implemented later). Assuming that the manager just cares about total resource abundance for now, they should be able to recognise that movem
will not decrease total resource abundance; hence, the manager might prefer stake-holders to switch to killem
(column 10).
It’s this switch that is a challenge for the model. It’s easy now to have the manager recognise that stake-holder actions are not optimal in terms of policy – more actions are being devoted to something that doesn’t kill resources, and more would be better placed by increasing stake-holder column 10 values. Still, how much the manager should lower costs to get the desired abundance is unclear (and even more so if we were to add multiple resources). The manager can’t exactly use the stake-holder utilities for the resource per se either, because the actions are determined by how resources interact; we also don’t want to run a sub-genetic algorithm for the manager to anticipate stake-holder actions, as this would end up being computationally intense.
Perhaps the manager should simply recognise the plus-minus-neutral effects of each column (from columns 8-13 above, 0 - - + + 0
). This gets part of the way there; if the manager wants resources killed, then they could crank up the costs associated with all 0
and +
actions (perhaps this shouldn’t be allowed though for bankem
, which would effectively prohibit stake-holders from inaction). The magnitude of costs for -
actions such as killem
or castem
could then be decided by assuming that stake-holders would transfer -
and 0
(again, exclude bankem
) to the lowest column of -
if the cost were the lowest.
Maybe managers should make a judgement a priori about what stake-holders are trying to do; classifying them as either wanting more or less of a particular resource. Wanting less of a resource would be associated with high values in action columns for killem
and castem
, and also movem
but only if u_loc = 1
and u_land = 1
for the stakeholder (i.e., if the desires and ability to move them depends on the resource being on their land). Wanting more of a resource would be associated with high values in action columns for feedem
and helpem
. The manager could then assume that stake-holders would allocate their total budget actions proportionally to -
, +
, or 0
columns, but without discrimination between columns. It could be easy to summarise budget and action totals as in the table below.
Action type | Total budget |
---|---|
Increasing | 500 |
Decreasing | 200 |
Neutral | 300 |
In the above, a net 300 more resources would appear if the manager does nothing (ignoring the resource model and consequences of carrying capacity for the moment). Note that the actions should really be run through the interaction array so that interactions between two resources could be hypothetically projected. This wouldn’t be much extra work – the increase in a resource could just be multiplied by the appropriate column in the Jacobian matrix. Also note that costs are only one way to adjust resources – another would be having something like licenses to kill be a resource that stake-holders might want to buy – the manager could make more of these and they could themselves be modelled as a dynamic and affecting the Jacobian matrix.
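The plus-minus-neutral bookkeeping can be sketched as a sign vector applied to the six action columns. The function name net_resource_change is hypothetical, and the sketch assumes (as the table does) that each action changes abundance by exactly one, ignoring the resource model and the Jacobian.

```c
#include <assert.h>

#define N_ACTIONS 6 /* movem, castem, killem, feedem, helpem, bankem */

/* Hypothetical helper: net projected change in abundance, given the number
 * of actions in each of columns 8-13, using the signs 0 - - + + 0. */
double net_resource_change(const double *actions){
    static const int effect[N_ACTIONS] = {0, -1, -1, 1, 1, 0};
    int col;
    double net;

    assert(actions != 0);

    net = 0.0;
    for(col = 0; col < N_ACTIONS; col++){
        net += effect[col] * actions[col];
    }
    return net;
}
```

For instance, 500 increasing actions (feedem plus helpem) against 200 decreasing actions (castem plus killem) gives a net change of 300, regardless of how many neutral actions there are.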
Manager model looped resources and mark-recapture
The density estimate within the manager model manager.c
now returns accurate abundances for multiple resources (storing them in a vector est_abun
). This was confirmed by shutting off the user
function (see Issue: #22).
Additionally, the mark-recapture estimator has been successfully initialised in manager.c
, and works for multiple resources. There was a bit of a hiccup here because the test printouts of abundances were consistently different from what was seen in the R plot. This turned out to be a minor error in R, not C (one too many columns were being read in in R to estimate resources marked). By fixing the error in R, both R and C estimates now match and are accurate. The next step is to return estimates for transect-based sampling abundances.
The mark-recapture analysis uses Chapman estimation, which is calculated in two functions. The function rmr_est
calls chapman_est
for each individual resource, inputting the results into the abun_est
vector.
/* =============================================================================
* This function calculates mark-recapture-based (Chapman) abundance estimates
* obs_array: The observation array
* para: A vector of parameters needed to handle the obs_array
* obs_array_rows: Number of rows in the observation array obs_array
* obs_array_cols: Number of cols in the observation array obs_array
* abun_est: Vector where abundance estimates for each type are placed
* interact_table: Lookup table to get all types of resource values
* int_table_rows: The number of rows in the interact_table
* trait_number: The number of traits in the resource array
* ========================================================================== */
void rmr_est(double **obs_array, double *para, int obs_array_rows,
int obs_array_cols, double *abun_est, int **interact_table,
int int_table_rows, int trait_number){
int resource, type1, type2, type3;
double estimate;
for(resource = 0; resource < int_table_rows; resource++){
abun_est[resource] = 0;
if(interact_table[resource][0] == 0){ /* Change when turn off type? */
type1 = interact_table[resource][1];
type2 = interact_table[resource][2];
type3 = interact_table[resource][3];
estimate = chapman_est(obs_array, para, obs_array_rows,
obs_array_cols, trait_number, type1, type2,
type3);
abun_est[resource] = estimate;
}
}
}
The function chapman_est
itself does all of the maths for estimating population abundance from mark-recapture data in the OBSERVATION
ARRAY.
/* =============================================================================
* This function calculates RMR (chapman) for one resource type
* obs_array: The observation array
* para: A vector of parameters needed to handle the obs_array
* obs_array_rows: Number of rows in the observation array obs_array
* obs_array_cols: Number of cols in the observation array obs_array
* trait_number: The number of traits in the resource array
* type1: Resource type 1
* type2: Resource type 2
* type3: Resource type 3
* ========================================================================== */
double chapman_est(double **obs_array, double *para, int obs_array_rows,
int obs_array_cols, int trait_number, int type1, int type2,
int type3){
int row, col;
int total_marks, recaptures, mark_start, recapture_start;
int *marked, sum_marked, n, K, k;
double estimate, floored_est;
total_marks = (int) para[11];
recaptures = (int) para[10];
mark_start = trait_number + 1;
recapture_start = mark_start + (total_marks - recaptures);
if(total_marks < 2 || recaptures < 1){
printf("ERROR: Not enough marks or recaptures for management\n");
return 0;
}
n = 0;
marked = malloc(obs_array_rows * sizeof(int));
for(row = 0; row < obs_array_rows; row++){
marked[row] = 0;
if(obs_array[row][1] == type1 &&
obs_array[row][2] == type2 &&
obs_array[row][3] == type3
){
for(col = mark_start; col < recapture_start; col++){
if(obs_array[row][col] > 0){
marked[row] = 1;
n++;
break;
}
}
}
}
K = 0;
k = 0;
for(row = 0; row < obs_array_rows; row++){
if(obs_array[row][1] == type1 &&
obs_array[row][2] == type2 &&
obs_array[row][3] == type3
){
for(col = recapture_start; col < obs_array_cols; col++){
if(obs_array[row][col] > 0){
K++;
if(marked[row] > 0){
k++;
}
break;
}
}
}
}
estimate = ((double) (n + 1) * (K + 1) / (k + 1)) - 1; /* Avoid int maths */
floored_est = floor(estimate);
free(marked);
return floored_est;
}
No confidence intervals are calculated at the moment, since I’m not sure how the simulated manager would use the uncertainty, but if we eventually want real people to be able to ‘play’ the game as managers, then it shouldn’t be too difficult to add confidence intervals to all population size estimates within the C functions of manager.c
.
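For reference, the Chapman estimate itself is (n + 1)(K + 1) / (k + 1) - 1, where n is the number marked, K the number caught in the recapture period, and k the number of those that were already marked. A minimal stand-alone sketch is below; the function name chapman is hypothetical and it takes the three counts directly rather than reading them from the observation array.

```c
#include <assert.h>

/* Hypothetical stand-alone Chapman estimator.
 *   n: resources marked in the marking period
 *   K: resources caught in the recapture period
 *   k: recaptured resources that already carried a mark */
double chapman(long n, long K, long k){
    assert(n >= 0 && K >= 0 && k >= 0);
    assert(k <= n && k <= K); /* Cannot recapture more than were marked */

    /* Cast before multiplying so the arithmetic is done in doubles */
    return ((double) (n + 1) * (double) (K + 1)) / (double) (k + 1) - 1.0;
}
```

With 9 marked, 19 caught later, and 4 of those already marked, the estimate is (10 * 20) / 5 - 1 = 39. Unlike chapman_est, this sketch does not floor the result.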
Transect estimation of resource abundances
Manager estimation of abundances collected from transect type sampling (i.e., case 2
and case 3
) is considerably easier than density-based and mark-recapture metrics. The number of times a resource is observed is simply stored in column 12 (zero-indexed in C; column 13 in R) of the observation matrix. The transect_est
function does the job for any number of resources all in one go.
/* =============================================================================
* This function calculates transect-based abundance estimates
* obs_array: The observation array
* para: A vector of parameters needed to handle the obs_array
* obs_array_rows: Number of rows in the observation array obs_array
* abun_est: Vector where abundance estimates for each type are placed
* interact_table: Lookup table to get all types of resource values
* int_table_rows: The number of rows in the interact_table
* ========================================================================== */
void transect_est(double **obs_array, double *para, int obs_array_rows,
double *abun_est, int **interact_table, int int_table_rows){
int resource, observation, type1, type2, type3;
for(resource = 0; resource < int_table_rows; resource++){
abun_est[resource] = 0;
if(interact_table[resource][0] == 0){ /* Change when turn off type? */
type1 = interact_table[resource][1];
type2 = interact_table[resource][2];
type3 = interact_table[resource][3];
for(observation = 0; observation < obs_array_rows; observation++){
if(obs_array[observation][1] == type1 &&
obs_array[observation][2] == type2 &&
obs_array[observation][3] == type3
){
abun_est[resource] += obs_array[observation][12];
}
}
}
}
}
Abundances need to now be compared to manager utilities (and for now, I’m just going to assume that the agent with agentID = 1
is the head manager (other type1 = 0
agents can be ‘managers’ collecting data, but I don’t see how or why we would want multiple managers bargaining over resources with different util
values; not yet at least, and probably not ever)).
Getting marginal utilities for management and putting them in ACTION
Back to the big picture, I have now finished the first five (easiest) of the tasks below.
/* 1. Get summary statistics for resources from the observation array */
/* 2. Place estimated resource abundances in a vector the size of int_d0 */
/* 3. Initialise new vector of size int_d0 with temp utilities of manager */
/* 4. Subtract abundances from temp utilities to get marginal utilities */
/* 5. Insert the marginal utilities into the agent = 1 col1 of ACTION */
/* 6. Run the genetic algorithm (add extension to interpret cost effects) */
/* 7. Put in place the new ACTION array from 6 */
/* 8. Adjust the COST array appropriately from the new manager actions */
Essentially, the manager.c
function now gets estimates for the abundances of each resource, then places those estimates in a temporary vector. Elements in this vector (corresponding to resource abundances) are then subtracted from the manager’s utility values (corresponding to desired resource counts). What’s left is then the marginal utility of resources – if there are more resources than the manager desires, then the marginal utility is negative, and if there are fewer, then the marginal utility is positive. The marginal utility is then placed back into the first layer of the ACTION
array (corresponding to the manager) where column 1 equals 1 (i.e., interpreted as actions of the manager affecting their own costs – existing values of which aren’t really being used because the concept doesn’t make a lot of sense, and the values are really just there as place-holders for where they mean things in other layers of the array). Hence the util
column then includes values for the ideal resource abundance (where column 1 equals -2 – util
is in column 5) and the marginal utility given estimated resource abundance (where column 1 equals 1). See below.
-2.000000 1.000000 0.000000 0.000000 100.000000 1.000000 1.000000
-1.000000 1.000000 0.000000 0.000000 1.000000 1.000000 1.000000
1.000000 1.000000 0.000000 0.000000 -330.696014 0.000000 0.000000
2.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000
3.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000
In the above, the manager sees 100 as the ideal population size, but there are ca 430 resources of type1 = 1
, type2 = 0
, type3 = 0
in the population. Hence the manager would like to see ca 330 fewer of these kinds of resources. The -330.696014
printed from a test simulation above will allow the genetic algorithm to adjust the COST
array accordingly, decreasing COST
columns that correspond to the killing or castrating (but not moving, I suppose) of resources.
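Steps 2 to 4 of the task list reduce to an element-wise subtraction. A minimal sketch is below; the function name marginal_utilities is hypothetical, and it works on plain vectors rather than reading from and writing to the ACTION array as manager.c does.

```c
#include <assert.h>

/* Hypothetical helper: marginal utility for each resource type is the
 * manager's desired abundance minus the estimated abundance. Negative
 * values mean there are more resources than the manager wants. */
void marginal_utilities(const double *target, const double *estimate,
                        double *marg, int n){
    int i;

    assert(target != 0 && estimate != 0 && marg != 0 && n >= 0);

    for(i = 0; i < n; i++){
        marg[i] = target[i] - estimate[i];
    }
}
```

A target of 100 with an estimated abundance of about 430.7 gives a marginal utility of about -330.7, matching the value printed in the test simulation above.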
Resolved Issue: #20
Issue: #20 has now been resolved. The res_type
has now been removed from the observation model, and observations simply occur for all unique resource types – if some are not needed, then they are not analysed. Doing this required very little modification for transect type sampling (case 2
and case 3
), but considerably more for density-based sampling (case 0
) and especially mark and recapture (case 1
). In these cases, I decided to split the sampling functions up more clearly. Testing revealed some initial errors, but these were ironed out and fixed. Currently, the correct OBSERVATION
array is returned, although this array is not analysed correctly when plotting in R for more than one resource (the code for this has not yet been written).
NOTE: there is no code written to ignore subdivisions yet. I’m not sure whether or not we’ll actually want this, but the option could simply be something placed in para
and checked in the subfunctions in observation.c
Major changes to observation.c
In resolving Issue: #20, I have re-worked the code in `observation.c` to be more readable. Previously, the `switch(methods)` in the main observation function called either density-based estimation or mark-recapture, but both of these functions, confusingly, called the same `mark_res` sub-function. I now have `mark_res` being called for density-based estimation only. Hence, each method of observation has its own (considerably smaller) sub-function, each of which calls another sub-function. For example, with density-based estimation, the following function is called `times_obs` times.

    /* =============================================================================
     * Density method of estimation
     * ===========================================================================*/
    /* =============================================================================
     * This simulates the capture-mark-recapture of a resource type
     * Inputs include:
     *     resource_array: data frame of resources to be marked and/or recaptured
     *     agent_array: data frame of agents, potentially doing the marking
     *     land: The landscape on which interactions occur
     *     paras: vector of parameter values
     *     res_rows: Total number of resources that can be sampled
     *     a_row: Total number of agents that could possibly sample
     *     obs_col: The number of columns in the observational array
     *     a_type: The type of agent that is doing the marking
     *     by_type: The type column that is being used
     *     find_type: The type of finding that observers do (view-based or rand)
     * Output:
     *     Accumulated markings of resources by agents
     * ========================================================================== */
    void mark_res(double **resource_array, double **agent_array, double ***land,
                  double *paras, int res_rows, int a_row, int obs_col, int a_type,
                  int by_type, int find_type){
        int resource;
        int agent;
        int count;
        int edge;       /* How does edge work? (Affects agent vision & movement) */
        int samp_res;   /* A randomly sampled resource */
        int ldx, ldy;
        int move_t;
        int sample_num; /* Times resources observed during one time step */
        edge       = (int) paras[1];  /* What type of edge is on the landscape */
        sample_num = (int) paras[11];
        ldx        = (int) paras[12]; /* dimensions of landscape -- x and y */
        ldy        = (int) paras[13];
        move_t     = (int) paras[14]; /* Type of movement being used */
        for(agent = 0; agent < a_row; agent++){
            if(agent_array[agent][by_type] == a_type){
                mark_in_view(resource_array, agent_array, paras, res_rows, agent,
                             find_type, obs_col);
            }
            if(sample_num > 1){
                a_mover(agent_array, 4, 5, 6, edge, agent, land, ldx, ldy, move_t);
            }
        }
    }

The above function calls `mark_in_view`, which marks all resources in the agent’s view (regardless of type, which will get sorted out later).

    /* =============================================================================
     * This simulates an individual agent doing some field work (observing)
     * Inputs include:
     *     resource_array: data frame of resources to be marked and/or recaptured
     *     agent_array: data frame of agents, potentially doing the marking
     *     paras: vector of parameter values
     *     res_rows: Total number of rows in the res_adding data frame
     *     worker: The row of the agent that is doing the working
     *     find_proc: The procedure used for finding and marking resources
     *     obs_col: The number of columns in the observation array
     * Output:
     *     The resource_array is marked by a particular agent
     * ========================================================================== */
    void mark_in_view(double **resource_array, double **agent_array, double *paras,
                      int res_rows, int worker, int find_proc, int obs_col){
        int xloc;       /* x location of the agent doing work */
        int yloc;       /* y location of the agent doing work */
        int view;       /* The 'view' (sampling range) around agent's location */
        int edge;       /* What type of edge is being used in the simulation */
        int resource;   /* Index for resource array */
        int r_x;        /* x location of a resource */
        int r_y;        /* y location of a resource */
        int seeme;      /* Test if observer sees/captures the resource */
        int ldx;        /* Landscape dimension on the x-axis */
        int ldy;        /* Landscape dimension on the y-axis */
        int EucD;       /* Is vision based on Euclidean distance? */
        double min_age; /* Minimum age at which sampling can occur */
        xloc    = (int) agent_array[worker][4];
        yloc    = (int) agent_array[worker][5];
        view    = (int) agent_array[worker][8];
        edge    = (int) paras[1];
        ldx     = (int) paras[12];
        ldy     = (int) paras[13];
        EucD    = (int) paras[20];
        min_age = paras[16];
        for(resource = 0; resource < res_rows; resource++){
            if(resource_array[resource][11] >= min_age){
                r_x   = (int) resource_array[resource][4];
                r_y   = (int) resource_array[resource][5];
                seeme = binos(xloc, yloc, r_x, r_y, edge, view, ldx, ldy, EucD);
                agent_array[worker][10]          += seeme;
                resource_array[resource][obs_col] += seeme;
                resource_array[resource][12]      += seeme;
            }
        }
    }

The mark-recapture technique, in contrast, calls the new function `sample_fixed_res` once (`times_obs` is taken care of in the sub-function).

    /* =============================================================================
     * Mark re-capture method of estimation
     * ===========================================================================*/
    /* =============================================================================
     * This simulates the capture-mark-recapture of a resource type
     * Inputs include:
     *     resource_array: data frame of resources to be marked and/or recaptured
     *     agent_array: data frame of agents, potentially doing the marking
     *     land: The landscape on which interactions occur
     *     paras: vector of parameter values
     *     lookup: The table listing resources and landscape layers to lookup
     *     res_rows: Total number of resources that can be sampled
     *     agent_number: Total number of agents in the agent array
     *     a_type: The type of agent that is doing the marking
     *     trait_number: The number of traits (columns) of the resource array
     *     lookup_rows: The number of rows in the lookup table
     * Output:
     *     Accumulated markings of resources by agents
     * ========================================================================== */
    void sample_fixed_res(double **resource_array, double **agent_array,
                          double ***land, double *paras, int **lookup, int res_rows,
                          int agent_number, int a_type, int trait_number,
                          int lookup_rows){
        int edge_type, move_type, fixed_sample, times_obs, move_res, by_type;
        int land_x, land_y;
        int obs_iter, agent;
        int row, type1, type2, type3;
        edge_type    = (int) paras[1];
        move_type    = (int) paras[2];
        fixed_sample = (int) paras[10];
        land_x       = (int) paras[12];
        land_y       = (int) paras[13];
        by_type      = (int) paras[17];
        move_res     = (int) paras[19];
        if(fixed_sample < 1){
            printf("ERROR: Fixed sample must be >= 1 \n ... Making = 1 \n");
            paras[10]    = 1;
            fixed_sample = 1;
        }
        for(row = 0; row < lookup_rows; row++){
            if(lookup[row][0] == 0){
                obs_iter  = trait_number + 1;
                times_obs = (int) paras[11];
                type1     = lookup[row][1];
                type2     = lookup[row][2];
                type3     = lookup[row][3];
                while(times_obs > 0){
                    for(agent = 0; agent < agent_number; agent++){
                        if(agent_array[agent][by_type] == a_type){
                            mark_fixed(resource_array, agent_array, paras, res_rows,
                                       agent, obs_iter, type1, type2, type3);
                        }
                    }
                    obs_iter++;
                    times_obs--;
                    if(move_res == 1){ /* Move resources if need for new sample */
                        res_mover(resource_array, 4, 5, 6, res_rows, edge_type,
                                  land, land_x, land_y, move_type);
                    }
                }
            }
        }
    }

The sub-function `mark_fixed` marks a fixed number of a specific type of resource.

    /* =============================================================================
     * This simulates an individual agent marking a fixed number of resources
     * Inputs include:
     *     resource_array: data frame of resources to be marked and/or recaptured
     *     agent_array: data frame of agents, potentially doing the marking
     *     paras: vector of parameter values
     *     res_rows: Total number of rows in the res_adding data frame
     *     worker: The row of the agent that is doing the working
     *     obs_col: The number of columns in the observation array
     *     type1: Resource type 1 being marked
     *     type2: Resource type 2 being marked
     *     type3: Resource type 3 being marked
     * Output:
     *     Specific resources in resource_array are marked by a particular agent
     * ========================================================================== */
    void mark_fixed(double **resource_array, double **agent_array, double *paras,
                    int res_rows, int worker, int obs_col, int type1, int type2,
                    int type3){
        int xloc;       /* x location of the agent doing work */
        int yloc;       /* y location of the agent doing work */
        int view;       /* The 'view' (sampling range) around agent's location */
        int edge;       /* What type of edge is being used in the simulation */
        int resource;   /* Index for resource array */
        int r_x;        /* x location of a resource */
        int r_y;        /* y location of a resource */
        int seeme;      /* Test if observer sees/captures the resource */
        int ldx;        /* Landscape dimension on the x-axis */
        int ldy;        /* Landscape dimension on the y-axis */
        int fixn;       /* If procedure is to sample a fixed number; how many? */
        int count;      /* Index for sampling a fixed number of resource */
        int sampled;    /* The resource randomly sampled */
        int type_num;   /* Number of the type of resource to be fixed sampled */
        int EucD;       /* Is vision based on Euclidean distance? */
        double sampl;   /* Random uniform sampling of a resource */
        double min_age; /* Minimum age at which sampling can occur */
        xloc     = (int) agent_array[worker][4];
        yloc     = (int) agent_array[worker][5];
        view     = (int) agent_array[worker][8];
        edge     = (int) paras[1];
        ldx      = (int) paras[12];
        ldy      = (int) paras[13];
        EucD     = (int) paras[20];
        min_age  = (int) paras[16];
        fixn     = (int) paras[10];
        type_num = 0;
        for(resource = 0; resource < res_rows; resource++){
            if(resource_array[resource][1]  == type1 &&
               resource_array[resource][2]  == type2 &&
               resource_array[resource][3]  == type3 &&
               resource_array[resource][11] >= min_age
            ){
                type_num++;
            }
        }
        if(type_num > fixn){ /* If more resources than the sample number */
            /* Temp tallies are used here to sample without replacement */
            for(resource = 0; resource < res_rows; resource++){
                if(resource_array[resource][1] == type1 &&
                   resource_array[resource][2] == type2 &&
                   resource_array[resource][3] == type3
                ){
                    resource_array[resource][13] = 0; /* Start untallied */
                }
            }
            count = fixn;
            sampl = 0;
            while(count > 0){
                do{ /* Find an un-tallied resource in the array */
                    sampl   = runif(0, 1) * res_rows;
                    sampled = (int) sampl;
                    /* The bounds check comes first so that the array is never
                     * read at index res_rows (in case sample returns 1) */
                } while(sampled == res_rows                  ||
                        resource_array[sampled][13] == 1     ||
                        resource_array[sampled][1]  != type1 ||
                        resource_array[sampled][2]  != type2 ||
                        resource_array[sampled][3]  != type3 ||
                        resource_array[sampled][11] < min_age
                );
                resource_array[sampled][obs_col]++; /* Marks accumulate */
                resource_array[sampled][12]++;
                resource_array[sampled][13] = 1;    /* Tally is noted */
                count--;
            }
            agent_array[worker][10] += fixn;
        }else{ /* Else all of the resources should be marked */
            for(resource = 0; resource < res_rows; resource++){
                if(resource_array[resource][1]  == type1 &&
                   resource_array[resource][2]  == type2 &&
                   resource_array[resource][3]  == type3 &&
                   resource_array[resource][11] >= min_age
                ){
                    resource_array[resource][obs_col]++; /* Mark all */
                    resource_array[resource][12]++;
                }
            }
            agent_array[worker][10] += type_num; /* All resources marked */
        }
    }

This still isn’t the most readable code, but it’s better than what it was before. Eventually, I would prefer to get rid of as many of the function arguments as possible and place these as elements within `para`. The identities of `para` elements could then be explained in the function more clearly (and consistently). This would lead to less bulky functions and a bit clearer structure for the code, but I think it can be implemented later as the code is given a more general clean-up.
Issue: #22 User function crashes with multiple resources
For some reason, the `user.c` call appears to crash when there is more than one resource. I only noticed this after the overhaul of the observation model, but I doubt they are related. More likely, one of the many arrays with dimensions that depend on resource number is being built or called improperly, leading to a segmentation fault. This should be fixed, of course, but I suspect the problem is not too far buried.
Next steps still to get abundance estimate from observation array in manager.c
The above issues were progress, but they have set back the pace of the manager model a bit. The next item on the agenda is still to get individual abundance estimates for each unique resource type in the manager model. The density estimate is completed, and the whole thing should work because the code is already written to loop through the interaction table and get abundance estimates for each unique resource. This should be tested by including more than one resource and turning off the user function (see Issue: #22). Once it works, I need to do the same thing for mark-recapture and transect-based estimates of abundance. Then, I’ll try to get through items 2-5 from Monday’s list.
Issue: #20 Remove `res_type` from observation model
Currently, the observation model only records resources into the observation array if they are of a particular `type1`, which is specified in the `para` vector and used to produce a data array with only one type of specified resource. Originally, this seemed like a good idea, but after spending some initial time writing the management model, I don’t think there is any need or good reason to restrict observation to a specific resource type. Instead, all types should be marked and moved to the observation array. Then, if management analysis wants statistics for only one type of resource, it’s very easy to use an `if` statement to check that the type is appropriate. It’s much easier to ignore parts of the array than to make more than one array when needed through multiple calls of the observation function.
To fix this, it shouldn’t take much more than simply removing the specification of `res_type` values in `observation.c`. When there is only one resource type, all calculations should proceed normally, but when more resources are introduced, an `if` is needed for both management and plotting (different groups of resources could even be made, ignoring subdivisions, by skipping the `if` when the type specified to look at equals `-1`).
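A minimal sketch of that `if` check, assuming that a requested type of `-1` means "ignore this subdivision". Both function names are hypothetical, and the type column positions (1 and 2) follow the array layout used in the functions elsewhere in this notebook.

```c
/* Hypothetical helper: returns 1 if a resource's type value should be
 * included, where wanted == -1 means that the subdivision is ignored. */
int type_included(int res_type, int wanted){
    return (wanted == -1 || res_type == wanted);
}

/* Counts the rows of an array whose type1 and type2 values (columns 1 and 2)
 * match the requested types; either request may be -1 to pool across it. */
int count_types(double **arr, int rows, int type1, int type2){
    int i, n = 0;
    for(i = 0; i < rows; i++){
        if(type_included((int) arr[i][1], type1) &&
           type_included((int) arr[i][2], type2)){
            n++;
        }
    }
    return n;
}
```

Calling `count_types(arr, rows, 1, -1)` would then count every resource of `type1 = 1` regardless of its `type2` value, so one observation array can serve both the pooled and the subdivided analysis.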
Issue: #21 Improve code readability using para
Originally, I had the idea to use the global vector `para` as a way of storing information easily and using it across all of the models. The vector `para` would store key information about pretty much everything, then be dynamically updated as need be from higher level functions in the model. In the last two months of coding, I have been specifying parameter names in functions explicitly, which has made sense during the coding process for my own writing, but it will be beneficial to clean all of this up later by reading `para` into these sub-functions, which otherwise sometimes have about a dozen arguments. Most functions would then have considerably fewer arguments, and the description of variables stored as vector elements in `para` could be immediately defined within sub-functions and used by name thereafter. The whole program would then have a similar feel of reading in key arrays and vectors and then specifying the key variables within sub-functions.
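One way this clean-up might look is sketched below. The `enum` names and the `landscape_cells` function are hypothetical; only the index positions are taken from how `paras` is read in the observation functions in this notebook.

```c
/* Hypothetical names for para elements; the positions are those read from
 * paras in mark_res, sample_fixed_res, and mark_fixed. */
enum para_index {
    PARA_EDGE      = 1,  /* Type of edge on the landscape              */
    PARA_FIXED_OBS = 10, /* Fixed number sampled during mark-recapture */
    PARA_TIMES_OBS = 11, /* Times resources are observed per time step */
    PARA_LAND_X    = 12, /* Landscape dimension on the x-axis          */
    PARA_LAND_Y    = 13  /* Landscape dimension on the y-axis          */
};

/* A sub-function then takes para alone and defines what it needs by name */
int landscape_cells(const double *para){
    int land_x = (int) para[PARA_LAND_X];
    int land_y = (int) para[PARA_LAND_Y];
    return land_x * land_y;
}
```

Naming the positions in one place would also keep the meaning of each element consistent across sub-functions, rather than relying on comments next to each bare index.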
1. Get summary statistics for resources from the observation array
I have now written the functions for `case 0` (density-based estimation) getting abundance estimates from the `observation` array, as outlined yesterday. This took slightly longer than anticipated because it turns out that there was a minor error in R’s estimation of abundances due to an incorrect column being summed. I had to figure out why C and R did not agree on the same abundance estimates; they now do (i.e., the two independent codings give the same estimate from the same observation array). I now need to do the same for the other three types of observation and get abundance estimates from them. In the highest level function for this part of the manager model, the `estimate_abundances` function is called.

    /* =============================================================================
     * This function uses the observation array to estimate resource abundances
     *     obs_array: The observation array
     *     para: A vector of parameters needed to handle the obs_array
     *     interact_table: Lookup table to get all types of resource values
     *     agent_array: Agent array, including managers (agent type 0)
     *     agents: Total number of agents (rows) in the agents array
     *     obs_x: Number of rows in the observation array
     *     obs_y: Number of cols in the observation array
     *     abun_est: Vector where abundance estimates for each type are placed
     *     int_table_rows: The number of rows in the interact_table
     * ========================================================================== */
    void estimate_abundances(double **obs_array, double *para, int **interact_table,
                             double **agent_array, int agents, int obs_x, int obs_y,
                             double *abun_est, int int_table_rows){
        int estimate_type, recaptures;
        double abun;
        estimate_type = (int) para[8];
        switch(estimate_type){
            case 0:
                dens_est(obs_array, para, agent_array, agents, obs_x, obs_y,
                         abun_est, interact_table, int_table_rows);
                break;
            case 1:
                recaptures = (int) para[10];
                break;
            case 2:
                break;
            case 3:
                break;
            default:
                break;
        }
    }

The above function will call subfunctions based on estimate type (0 to 3). So far, only the density function `dens_est` has been written and tested.

    /* =============================================================================
     * This function calculates density-based abundance estimates
     *     obs_array: The observation array
     *     para: A vector of parameters needed to handle the obs_array
     *     agent_array: Agent array, including managers (agent type 0)
     *     agents: Total number of agents (rows) in the agents array
     *     obs_array_rows: Number of rows in the observation array obs_array
     *     obs_array_cols: Number of cols in the observation array obs_array
     *     abun_est: Vector where abundance estimates for each type are placed
     *     interact_table: Lookup table to get all types of resource values
     *     int_table_rows: The number of rows in the interact_table
     * ========================================================================== */
    void dens_est(double **obs_array, double *para, double **agent_array,
                  int agents, int obs_array_rows, int obs_array_cols,
                  double *abun_est, int **interact_table, int int_table_rows){
        int i, j, resource;
        int view, a_type, land_x, land_y, type1, type2, type3;
        int vision, area, cells, times_obs, tot_obs;
        double prop_obs, estimate;
        a_type    = (int) para[7]; /* What type of agent does the observing */
        times_obs = (int) para[11];
        land_x    = (int) para[12];
        land_y    = (int) para[13];
        view      = 0;
        for(i = 0; i < agents; i++){
            if(agent_array[i][1] == a_type){
                view += agent_array[i][8];
            }
        }
        vision  = (2 * view) + 1;
        area    = vision * vision * times_obs;
        cells   = land_x * land_y; /* Total number of cells on the landscape */
        tot_obs = 0;
        for(resource = 0; resource < int_table_rows; resource++){
            abun_est[resource] = 0;
            if(interact_table[resource][0] == 0){ /* Change when turn off type? */
                type1    = interact_table[resource][1];
                type2    = interact_table[resource][2];
                type3    = interact_table[resource][3];
                tot_obs  = res_obs(obs_array, obs_array_rows, obs_array_cols, type1,
                                   type2, type3);
                prop_obs = (double) tot_obs / area;
                estimate = prop_obs * cells;
                abun_est[resource] = estimate;
            }
        }
    }

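The arithmetic in `dens_est` can be checked by hand with a stand-alone sketch. The function `density_estimate` is hypothetical and assumes a single observer, so that `view` is just that observer’s view rather than a sum across agents.

```c
/* Hypothetical stand-alone version of the arithmetic in dens_est, assuming
 * one observer. Returns the density-based abundance estimate. */
double density_estimate(int tot_obs, int view, int times_obs,
                        int land_x, int land_y){
    int vision      = (2 * view) + 1;              /* Side of the viewed square */
    int area        = vision * vision * times_obs; /* Cells viewed, all samples */
    int cells       = land_x * land_y;             /* Cells on whole landscape  */
    double prop_obs = (double) tot_obs / area;     /* Observations per cell     */
    return prop_obs * cells;
}
```

For example, with `view = 2` and `times_obs = 4`, the viewed area is 5 × 5 × 4 = 100 cells. On a 20 by 20 landscape (400 cells) with 30 total observations, the estimate is 30 / 100 × 400 = 120 resources.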
The `dens_est` function above calls the `res_obs` function, which returns the number of observations for a specific resource type.

    /* =============================================================================
     * This function counts the observations of a specific resource type
     *     obs_array: The observation array
     *     obs_rows: Number of rows in the observation array obs_array
     *     obs_cols: Number of cols in the observation array obs_array
     *     type1: Resources of type 1 being observed
     *     type2: Resources of type 2 being observed
     *     type3: Resources of type 3 being observed
     * ========================================================================== */
    int res_obs(double **obs_array, int obs_rows, int obs_cols, int type1,
                int type2, int type3){
        int i, j, obs_count;
        obs_count = 0;
        for(i = 0; i < obs_rows; i++){
            if( (obs_array[i][1] == type1 || obs_array[i][1] < 0) &&
                (obs_array[i][2] == type2 || obs_array[i][2] < 0) &&
                (obs_array[i][3] == type3 || obs_array[i][3] < 0)
            ){
                for(j = 15; j < obs_cols; j++){
                    obs_count += obs_array[i][j];
                }
            }
        }
        return obs_count;
    }

The point of the above break-down, aside from making things more readable, is that we might want to get abundance estimates for each resource type, or at least have G-MSE produce them even if we pretend that managers cannot see them. When Issue: #20 is resolved, all resources will then be estimated. NOTE: This could be an issue because if a fixed number of resource types are sampled, as with mark-recapture, then it could sample different resources. It might be best to change mark-recapture so that it takes a `fixed_obs` for each unique resource type, somehow. The point is that it’s easier and more computationally efficient to ignore some data (and not allow managers to notice it) than it is to have to run `observation` multiple times to re-collect using the same protocol.
Eventually, I also want the type columns `obs_array[i][1]`, `obs_array[i][2]`, and `obs_array[i][3]` to be able to take -1 as a value in here somehow; basically, I want the counts to be able to ignore one of a resource’s types. For example, we could imagine wanting to have separate sexes indicated by `type2 = 0` or `type2 = 1` in column 2 of the `obs_array`, but perhaps not want managers to actually use this when estimating abundance (alternatively, observations could be combined later).
Manager function to genetic algorithm link
There is a minor conceptual issue regarding the implementation of the genetic algorithm with the manager function. The manager’s actions need to be based on the `OBSERVATION` array, but stake-holders need not use this information. There are two options for implementing the genetic algorithm regarding observations.
1. The `OBSERVATION` array could just be read into the genetic algorithm and not used for stake-holders. This might require the `user` function to be initially called after the `manager` function so that an `OBSERVATION` array exists (or a dummy could be made easily enough in the `user.R` function).
2. [...] `ACTION` and `COST` arrays into the genetic algorithm to zero in on the actions. For example, if there are too many resources, then the `util` could be adjusted within `manager.c` (or `manager.R`) to be negative, hence making the genetic algorithm select strategies that lower costs on `killem` actions proportional to how many the manager wants killed.

I think that option 2 is actually a bit faster, and will probably be easier to implement in terms of coding.
Isolating effects of uncertainty
It is worth pointing out in passing that option 2 above offers a very straightforward way of looking at uncertainty with respect to management decisions. When passing resource abundances to update temporary `util` values for managers, we could compare the estimates of abundances produced from the observation model to the actual abundances from the resource model. This could be a very simple option in the software, and it might be useful to run the genetic algorithm twice for managers in each time step to simulate side-by-side how decisions would be made in the presence and absence of uncertainty.
Initialising the manager model
To get the ball rolling on the manager function’s implementation of the genetic algorithm, now is as good a time as any to initialise `manager.R` and `manager.c`, since the arguments passed to `game.c` need to be coaxed into the right form via the manager model. It’s important to keep in mind that I still need to implement the lobbying option for stake-holders, but I think that this will be easier once the manager’s genetic algorithm is built. It’s also noteworthy that we’re probably not going to need managers to adjust stake-holders’ utilities. So really, in a pinch, there are three types of actions that are really going to be important, probably always.
To this list of three essential types of actions, there are a few additional actions that would be good to have, ideally fitting seamlessly within the general framework of the model, but which if necessary could be add-ons for future development.
Beyond these, there are a few other possible options that I can’t, at the moment, see why anyone would want to model. I’m not entirely sure that they really make sense, actually.
Framework for manager actions
The framework for manager actions in both R and C is now entirely built; data structures can be read in and out, so now all that is left is to do the modelling. I’ve commented what will happen within `manager.c`; each of the numbers below might or might not represent unique sub-functions.

    /* Do the biology here now */
    /* ====================================================================== */
    /* 1. Get summary statistics for resources from the observation array     */
    /* 2. Place estimated resource abundances in a vector the size of int_d0  */
    /* 3. Initialise new vector of size int_d0 with temp utilities of manager */
    /* 4. Subtract abundances from temp utilities to get marginal utilities   */
    /* 5. Insert the marginal utilities into the agent = 1 col1 of ACTION     */
    /* 6. Run the genetic algorithm (add extension to interpret cost effects) */
    /* 7. Put in place the new ACTION array from 6                            */
    /* 8. Adjust the COST array appropriately from the new manager actions    */
    /* This code switches from C back to R */
    /* ====================================================================== */

With all of these in place, the end result should be a new `COST` array based on manager actions. It will be important to make sure that the manager’s costs are defined appropriately so that the manager doesn’t start doing actions themselves. This could actually be a bit of a problem; if we want the manager to do things themselves, then it’s hard to see why they wouldn’t just perform the actions instead of adjusting the costs. Then again, perhaps this is kind of the point? Maybe sufficiently high costs of actions and sufficiently low costs of policy adjustment should cause the genetic algorithm to naturally find policy as a better means of achieving what the manager wants. In fact, this seems almost certain; if managers in the real world could achieve all policy aims single-handedly, then that’s what they would probably be hired to do. In the real world, changing policy is more effective. It’s also possible that we could allow them to do their own direct actions to resources in the `user` model, like the stake-holders. Costs of setting policy could then be independent from costs of doing actions by changing manager `COST` between models.
Concrete plan for manager fitness function
The next step in the coding is to allow managers to generate policy by using their utilities to affect the costs of other agents. This will require that managers recognise how the actions of other agents will affect resources and the landscape, then adjust costs to encourage agents to act in a particular way. There are several things to keep in mind here.
- [...] `bankem` column could be used to suck up actions in the genetic algorithm if doing nothing is advantageous.
- [...] `killem`) should stop managers from feeding resources on public land. Maybe it does, but it doesn’t seem like it should have to be this way.

Some solutions to account for the above
Given that the manager already has a special status in the rest of G-MSE, maybe it’s not too much of a stretch to make their cost-adjusting actions apply to all non-managers by default by making all cost-adjusting rows in the `ACTION` array (on the manager’s layer) equivalent. Or, even better, the first row, which corresponds to the manager’s own costs (or any agent’s own cost row, since the cost of adjusting their own cost doesn’t really come into play, or really make much sense), could simply define the cost of affecting all stake-holders’ cost values.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 101 | 101 | 101 | 3 | 8 | 4 | 4 | 2 | 1 |
-2 | 2 | 0 | 0 | 101 | 101 | 101 | 4 | 3 | 6 | 2 | 3 | 1 |
-1 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
1 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
1 | 2 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
2 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
2 | 2 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
3 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
3 | 2 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
Assume that the above `COST` array layer corresponds to the manager: `agent = 1`. In the above `COST` array, we have three agents (`agent = 1` is a manager, while `agent = 2` and `agent = 3` are stake-holders), two resource types, and one landscape layer. So in rows 4 and 5 above, where the first column `agent = 1`, we have, essentially, the costs of what it takes to affect the entire array of stake-holders’ actions (all layers where `agent = -2` or `agent = -1`) on a particular resource. This value can then be used to directly implement actions in the first two rows of all layers of the `ACTION` array. Note that the structure of the code does not allow managers to make policy on landscape use, only on resources, which might or might not have to be changed. What we’re essentially saying with this is that a manager cannot tell a farmer to not kill or fertilise crops on the farmer’s own land. Only resources are affected by manager policy (we could, of course, make crops resources, though this would be a bit of a time-consuming work-around). We could find a way around this if need be, but I can’t think of many situations in which we would want a manager to be able to tell a stake-holder that they can’t increase their own crop yield or kill their crops.
Issue #19: Cost and Action arrays: landscape level initialisation
The number of landscape layers in the `COST` and `ACTION` arrays is too many; the `utility_layer` adds one in for every unique resource instead of for every unique landscape layer. This is an easy fix by adding an option to the function to specify landscape layers.
More concrete ideas
A general algorithm-sketch for the managers could be as follows in the fitness function of the genetic algorithm. Note that this will be called in the manager model, not the user model, so there isn’t a worry about the stake-holders updating their actions at the wrong time – their actions will be static while the manager is considering what to do.
- [...] `ACTION` array, column `util`.
- [...] `ACTION` matrix of stake-holders (e.g., from all rows where `agent = -2`). Hence, resources will be decreased by some columns and increased by others based on the past actions of stake-holders. This might or might not also need to include resource abundance projections based on birth and death rates; ideally this would be the case, with projections using estimates from the observation model, but maybe just use the abundances as a first step? Note, the ‘projected’ birth rate need not use the `RESOURCE` array explicitly, but could be estimated and applied from the history of observation; in fact, this would probably be better.

The algorithm above should be fairly fast, and while it won’t provide the optimal solution for managers, it isn’t actually intended to do so. The point is to find an adaptive strategy based on the tools available to the manager and the limited information that the manager realistically has about resource abundances and stake-holder strategies. Following from the above, I dare say that the fitness function for stake-holders affecting the manager’s `util` might not be too difficult, but I’ll need to think carefully about the best way to implement it.
Resolved issue #12
I have finally resolved issue #12, which was always a bit annoying but never terribly serious. The problem was that density-based estimation as done in Nuno et al. (2013) would only plot correctly when times_observe = 1; that is, when managers went out to observe a sample of the population exactly once per time step. Obviously we want the option to allow multiple trips to sample in a single time step, as sampling once (unless the number of cells viewed on the landscape includes almost the entire landscape) leads to highly variable results – and even more so now that resource distributions tend to become clumped on the landscape when agents scare them off their land. Previously the proportion estimate used the number of unique resources observed, but what we really needed was the number of unique observations. By simply summing all values in columns 16 to 16 + times_observe - 1 of the observation array, we get the total number of observations.
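The sum described above can be sketched as follows. This is a minimal illustration, not the code in observation.c: the function name count_observations and the first_col argument are hypothetical, and it assumes the observation array is held as an array of row pointers with one contiguous tally column per sampling trip.

```c
/* Hypothetical helper: totals the per-trip tallies over the whole array.
 * obs:           observation array, one row per observed resource
 * obs_rows:      number of rows in obs
 * first_col:     index of the first tally column (assumed; 16 in the notes)
 * times_observe: number of sampling trips made in the time step            */
double count_observations(double **obs, int obs_rows, int first_col,
                          int times_observe){
    int row, col;
    double total;

    total = 0.0;
    for(row = 0; row < obs_rows; row++){
        for(col = first_col; col < first_col + times_observe; col++){
            total += obs[row][col]; /* Add this trip's tally for this row */
        }
    }
    return total;
}
```

A resource seen on two of three trips contributes two to the total here, whereas counting unique resources would contribute only one.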
Quick note about agents affecting each other’s costs
It’s worth noting that there is no need to save any additional data structure to have agents affect each other’s costs – at least not at the moment. This is because the user and manager models are run at separate points within the broader time step. When a manager uses the genetic algorithm, the ACTION array that they have to use has already been updated for stake-holders, so the managers are effectively seeing the most recent actions of stake-holders and will be able to adjust costs and use the recent actions to predict changes (can perhaps assume proportional allocation of actions, so if the manager makes one action more costly, then the stake-holder will shift to increase other actions – or perhaps this is too much to predict; maybe managers should just assume that increasing cost will decrease an action as if the action were isolated. How strategic do we assume managers think?). Likewise, stake-holders are seeing the manager’s most recent priorities and might lobby them accordingly.
Resolved issue #18
The landscape actions within the user model are now affected by the appropriate diagonal element of the interaction matrix. This effectively adjusts the effect of a user’s actions to increase a cell value by some magnitude. For example, if an agent wants to increase their crop yield, they will now do so by multiplying the current cell yield by one plus whatever the appropriate element is in the interaction matrix (default could be one – doubling yield). Initial testing shows that this works as intended; stake-holders interested in maximising crop yield do so reliably when they can increase yield on a cell twenty-fold; mean crop yield on the whole landscape increases in turn. When the increase is smaller (50%), then a range of strategies appears possible – one stake-holder chose to kill resources while the other chose to directly increase yield (dependent on costs, which varied among stake-holders). Next, the plan is to address some of the minor clean-up tasks (bulleted list from yesterday) before getting to the ultimate goal of allowing agents to affect one another’s costs in the genetic algorithm.
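A minimal sketch of that scaling, assuming the multiplicative reading suggested by the doubling default (new yield = old yield × (1 + diagonal element)). The function name scaled_yield is hypothetical and is not the code in landscape_actions.

```c
/* Hypothetical helper: returns a cell's yield after one 'increase' action.
 * cell_yield:   current value of the landscape cell
 * jaco:         interaction (Jacobian) matrix
 * interest_row: row and column of the landscape layer in jaco (assumed)   */
double scaled_yield(double cell_yield, double **jaco, int interest_row){
    double effect;

    effect = jaco[interest_row][interest_row]; /* Diagonal element */
    return cell_yield * (1.0 + effect);        /* 1.0 doubles, 0.5 adds 50% */
}
```

Under this reading a diagonal element of 19 gives the twenty-fold increase used in the first test, and 0.5 gives the 50% case.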
Resolved issue #11
I appear to have resolved issue #11 by calling the a_mover function in observation.c from within the anecdotal function. This gives the option to move agents when the R function anecdotal is called during a time step. Later I might also consider giving the option to specify moving agents onto land that they own; this would probably be best accomplished by calling send_agents_home, which is currently in user.c, but could be moved to utilities.c. Tests of anecdotal in R confirm that it is moving agents as expected.
I have begun to implement actions on the landscape as an option. For now, these actions will include increasing crop yield directly in some way (magnitude to be affected by the interaction array, see below), and killing crops.
New issue #18: Make landscape actions from interaction array
It will be helpful to link the appropriate element of the interaction array (Jacobian matrix) to the actions in the landscape_actions function in user.c. As of now, the amount of increase in crop yield (and decrease) is hard-coded in the function, but it really should be linked with the appropriate diagonal element in the interaction array – increasing or decreasing a cell’s value by the magnitude in the array element.
More testing, success
More testing shows that the genetic algorithm and user function are working as intended, and I have looped the genetic algorithm so that it is run for all simulated agents with no issues – even when landscape dimensions or agent number changes. There are, however, some minor things that need to be tweaked.
observe_type = 0 still isn’t plotting correctly. I’m not sure why this is, but it seems to produce underestimates of population abundance that are off consistently by half the times_obs. It will be a good idea to get this settled, finally.
The minor bug from yesterday has been resolved. The following bit of code needed to be within the larger loop that cycled through the population of agents in the genetic algorithm.
for(i = 0; i < interest_num; i++){
    count_change[i] = 0; /* Initialise all count changes at zero */
    utilities[i]    = 0; /* Same for utilities */
}
The count changes and utilities were not being initialised at zero for each agent, meaning that count_change was cumulative over agents. With this fixed, agents that highly value the resource (utility = 100) evolve to either feedem or helpem as much as possible, as reflected in the ACTION array.
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,]   -2    1    0    0  100    1    1    0    0     0    92     0     0
[2,]   -1    1    0    0    1    1    1    0    0     0     0     0     0
[3,]    1    1    0    0    0    0    0    0    0     0     0     0     0
[4,]    2    1    0    0    0    0    0    0    0     0     0     1     0
So with this very simplified test, the function is doing what it is supposed to do. Next, I fixed the utility of the landscape to 100 to see if the agent recognises that it can increase crop yield by killing or scaring the resource via the interaction array.
Test of killing resource to maximise crop yield
After some further debugging, the agents in the genetic algorithm now figure out to kill resources when resources destroy crops.
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,]   -2    1    0    0    1    1    1   94    0     0     0     0     0
[2,]   -1    1    0    0  100    1    1    0    0     0     0     0     0
[3,]    1    1    0    0    0    0    0    0    0     0     0     0     0
[4,]    2    1    0    0    0    0    0    0    0     0     0     0     1
In the above ACTION array, utility of the crop yield is 100, and the interaction matrix indicates that resource of type1 = 0 decreases crop yield on a cell by one half. In response to this, the stake-holders find the solution of killing resources on their land (indicated by the 94 in the eighth column above). The code for doing this is not terribly readable at the moment.
/* =============================================================================
* This is a preliminary function that checks the fitness of each agent by
* passing through a loop to payoffs_to_fitness
* fitnesses: Array to order fitnesses of the agents in the population
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* agent_array: The agent array
* jaco: The jacobian matrix of resource and landscape interactions
* interact_table: Lookup table for figuring out rows of jaco and types
* interest_num: The number of rows and cols in jac, and rows in lookup
* ========================================================================== */
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, int COLS, double **agent_array, double **jaco,
int **interact_table, int interest_num){
int agent, i, row, act_type, type1, type2, type3, interest_row;
double agent_fitness, *count_change, foc_effect;
double movem, castem, killem, feedem, helpem;
double utility, *utilities;
count_change = malloc(interest_num * sizeof(double));
utilities = malloc(interest_num * sizeof(double));
for(agent = 0; agent < pop_size; agent++){
for(i = 0; i < interest_num; i++){
count_change[i] = 0; /* Initialise all count changes at zero */
utilities[i] = 0; /* Same for utilities */
}
for(row = 0; row < ROWS; row++){
foc_effect = 0;
act_type = (int) population[row][0][agent];
type1 = population[row][1][agent];
type2 = population[row][2][agent];
type3 = population[row][3][agent];
utility = population[row][4][agent];
movem = population[row][7][agent];
castem = population[row][8][agent];
killem = population[row][9][agent];
feedem = population[row][10][agent];
helpem = population[row][11][agent];
switch(act_type){
case -2:
foc_effect -= movem; /* Times birth to account for repr? */
foc_effect -= castem; /* But only remove E offspring? */
foc_effect -= killem; /* But also remove E offspring? */
foc_effect += feedem; /* But should less mortality */
foc_effect += helpem; /* But should affect offspring? */
interest_row = 0;
while(interest_row < interest_num){
if(interact_table[interest_row][0] == 0 &&
interact_table[interest_row][1] == type1 &&
interact_table[interest_row][2] == type2 &&
interact_table[interest_row][3] == type3
){
break;
}else{
interest_row++;
}
}
for(i = 0; i < interest_num; i++){
count_change[i] += foc_effect * jaco[interest_row][i];
}
utilities[interest_row] = utility;
break; /* Do not fall through into the landscape case */
case -1:
interest_row = 0;
while(interest_row < interest_num){
if(interact_table[interest_row][0] == 1 &&
interact_table[interest_row][1] == type1 &&
interact_table[interest_row][2] == type2 &&
interact_table[interest_row][3] == type3
){
break;
}else{
interest_row++;
}
}
utilities[interest_row] = utility;
break; /* Add landscape effects here */
default:
break;
}
}
fitnesses[agent] = 0;
for(i = 0; i < interest_num; i++){
fitnesses[agent] += count_change[i] * utilities[i];
}
}
free(utilities);
free(count_change);
}
The above can be greatly simplified and made clearer, with the goal of simple fitness functions for the case in which agents directly affect resources or crops. The indirect interactions will be sub-functions in the above called in the switch where case is greater than zero.
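To make the arithmetic in strategy_fitness concrete, here is a stripped-down sketch for a single strategy with two interests (index 0 for the resource, index 1 for crop yield). The function toy_fitness is hypothetical, and the Jacobian values used with it are assumptions: a diagonal of 1 and a resource-to-crop element of -0.5, standing in for the one-half reduction in yield described earlier in this entry.

```c
/* Illustrative only: fitness of one strategy acting directly on the resource.
 * removed:   summed movem, castem and killem actions
 * added:     summed feedem and helpem actions
 * jaco:      2 x 2 interaction matrix (row 0 = resource, assumed layout)
 * utilities: utility of the resource and of crop yield                     */
double toy_fitness(double removed, double added, double jaco[2][2],
                   double utilities[2]){
    double count_change[2] = {0.0, 0.0};
    double foc_effect, fitness;
    int i;

    foc_effect = added - removed; /* Net direct change to the resource */
    for(i = 0; i < 2; i++){
        count_change[i] += foc_effect * jaco[0][i];
    }
    fitness = 0.0;
    for(i = 0; i < 2; i++){
        fitness += count_change[i] * utilities[i];
    }
    return fitness;
}
```

With 94 removals, a resource utility of 1 and a crop utility of 100, the count changes are -94 and +47, so fitness is -94 × 1 + 47 × 100 = 4606. The same 94 actions spent on feeding would give -4606, which is why the evolved strategy removes resources.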
Breaking down the strategy_fitness function
I’ve broken down the strategy_fitness function into three more manageable functions that can be further developed as necessary. The strategy_fitness function now calls functions that update the count_change and utilities arrays as a result of direct actions to resources and the landscape.
/* =============================================================================
* This is a preliminary function that checks the fitness of each agent by
* passing through a loop to payoffs_to_fitness
* fitnesses: Array to order fitnesses of the agents in the population
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* agent_array: The agent array
* jaco: The jacobian matrix of resource and landscape interactions
* interact_table: Lookup table for figuring out rows of jaco and types
* interest_num: The number of rows and cols in jac, and rows in lookup
* ========================================================================== */
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, int COLS, double **agent_array, double **jaco,
int **interact_table, int interest_num){
int agent, i, row, act_type, type1, type2, type3, interest_row;
double agent_fitness, *count_change, foc_effect;
double movem, castem, killem, feedem, helpem;
double utility, *utilities;
count_change = malloc(interest_num * sizeof(double));
utilities = malloc(interest_num * sizeof(double));
for(agent = 0; agent < pop_size; agent++){
for(i = 0; i < interest_num; i++){
count_change[i] = 0; /* Initialise all count changes at zero */
utilities[i] = 0; /* Same for utilities */
}
for(row = 0; row < ROWS; row++){
act_type = (int) population[row][0][agent];
switch(act_type){
case -2:
res_to_counts(population, interact_table, interest_num,
count_change, utilities, jaco, row, agent);
break;
case -1:
land_to_counts(population, interact_table, interest_num,
utilities, row, agent);
break;
default:
break;
}
}
fitnesses[agent] = 0;
for(i = 0; i < interest_num; i++){
fitnesses[agent] += count_change[i] * utilities[i];
}
}
free(utilities);
free(count_change);
}
The case -2 calls res_to_counts below.
/* =============================================================================
* This function updates count change and utility arrays for direct actions on
* resources
* population: The population array of agents in the genetic algorithm
* interact_table: The lookup table for figuring out how resources interact
* int_num: The number of rows and cols in jac, and rows in the lookup
* count_change: A vector of how counts have changed as a result of actions
* utilities: A vector of the utilities of each resource/landscape level
* jaco: The interaction table itself (i.e., Jacobian matrix)
* row: The row of the interaction and lookup table being examined
* agent: The agent in the population whose fitness is being assessed
* ========================================================================== */
void res_to_counts(double ***population, int **interact_table, int int_num,
double *count_change, double *utilities, double **jaco,
int row, int agent){
int i, act_type, interest_row;
double foc_effect;
foc_effect = 0.0;
foc_effect -= population[row][7][agent]; /* Times birth account for repr?*/
foc_effect -= population[row][8][agent]; /* But only remove E offspring? */
foc_effect -= population[row][9][agent]; /* But also remove E offspring? */
foc_effect += population[row][10][agent]; /* But should less mortality */
foc_effect += population[row][11][agent]; /* But should affect offspring? */
interest_row = 0;
while(interest_row < int_num){
if(interact_table[interest_row][0] == 0 &&
interact_table[interest_row][1] == population[row][1][agent] &&
interact_table[interest_row][2] == population[row][2][agent] &&
interact_table[interest_row][3] == population[row][3][agent]
){
break;
}else{
interest_row++;
}
}
for(i = 0; i < int_num; i++){
count_change[i] += foc_effect * jaco[interest_row][i];
}
utilities[interest_row] = population[row][4][agent];
}
And the case -1 calls land_to_counts below.
/* =============================================================================
* This function updates count change and utility arrays for direct actions on
* a landscape
* population: The population array of agents in the genetic algorithm
* interact_table: The lookup table for figuring out how resources interact
* int_num: The number of rows and cols in jac, and rows in the lookup
* utilities: A vector of the utilities of each resource/landscape level
* row: The row of the interaction and lookup table being examined
* agent: The agent in the population whose fitness is being assessed
* ========================================================================== */
void land_to_counts(double ***population, int **interact_table, int int_num,
double *utilities, int row, int agent){
int i, act_type, interest_row;
double foc_effect;
interest_row = 0;
while(interest_row < int_num){
if(interact_table[interest_row][0] == 1 &&
interact_table[interest_row][1] == population[row][1][agent] &&
interact_table[interest_row][2] == population[row][2][agent] &&
interact_table[interest_row][3] == population[row][3][agent]
){
break;
}else{
interest_row++;
}
}
utilities[interest_row] = population[row][4][agent];
}
For each sub-case, how the population array is interpreted can be specialised. For example, if castem doesn’t really mean anything on the landscape, then it can simply be ignored and agents will adapt by not doing it. In this sense, these two sub-functions become easy things to tinker with for translating actions to utilities.
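Both sub-functions repeat the same search through the lookup table, differing only in whether the first column must be 0 (resource) or 1 (landscape). One possible way to share that loop is sketched below. The helper find_interest_row is hypothetical and not part of the code above; it also returns -1 when no row matches, so that a caller could skip the update rather than index past the end of utilities.

```c
/* Hypothetical helper: finds the row of the lookup table for a given type.
 * interact_table: lookup table (column 0: 0 = resource, 1 = landscape)
 * int_num:        number of rows in the lookup table
 * is_landscape:   value required in column 0
 * type1..type3:   type values required in columns 1 to 3
 * Returns the matching row, or -1 if there is none.                        */
int find_interest_row(int **interact_table, int int_num, int is_landscape,
                      int type1, int type2, int type3){
    int interest_row;

    for(interest_row = 0; interest_row < int_num; interest_row++){
        if(interact_table[interest_row][0] == is_landscape &&
           interact_table[interest_row][1] == type1 &&
           interact_table[interest_row][2] == type2 &&
           interact_table[interest_row][3] == type3){
            return interest_row;
        }
    }
    return -1; /* No matching row in the table */
}
```

A caller would cast the three type values from the population array to int before passing them in, then check for -1 before writing to count_change or utilities.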
Version v0.0.9: A working genetic algorithm
I moved the function do_actions and its dependency resource_actions to the user.c file so that the actions of a particular agent could be performed on the actual population after the genetic algorithm simulated and selected an adaptive strategy. As a test drive, I simulated the actions of only one stake-holder who is trying to maximise crop yield, and whose only avenue for doing so is getting resources off their land one way or another. The figure below shows the output.
The figure tells an interesting story. The light blue individual in the right-hand panels represents a farmer, who has quickly figured out that little black dots on their farm are decreasing crop yield, which the farmer wants to maximise. Initially, the growing population of black dots causes crop yield to decline, but by generation 8 or 9, the farmer has opted to scare these dots to public land. Consequently, mean farm yield over all land goes up a bit (due to intraspecific competition between black dots), and almost all of the crop damage occurs on the public land (dark blue) while the farmer’s land (light blue) has better yield. The spatial distribution of the black dots is very easy to see – all of the black dots have been ‘scared’ into the public land.
This is exciting – we have a working model in which a genetic algorithm is being used to identify and enact a stake-holder’s strategy given their specific interests. The only major conceptual hurdle now is likely to be the manager’s response, enacting policy by affecting costs of actions, and stake-holders’ actions that affect other agents’ costs (e.g., lobbying a manager). This isn’t even much of a jump though – really, the framework is in place and a lot of the work from here is just grunt work in terms of coding the specifics of what options will be available to what agents. It shouldn’t be too long before we have a working model of conflict that can be applied to real-world case studies. Hence, I’m calling this v0.0.9 and pushing to master. This implementation of the genetic algorithm is also not noticeably slower than previous versions – it took about a second to run the above; my goal is to keep it low.
The next step will be to figure out what options should be available for directly affecting the landscape, and what needs to be done to apply the genetic algorithm to costs of other agents’ actions (hooks for this are already coded in switch statements of the genetic algorithm). I would also like to build the manager.c function with the ability to empirically derive the Jacobian matrix, and (eventually) make it possible for agents to consider the histories of each other’s actions (shouldn’t be too much of a stretch, but this is an extension of the genetic algorithm that can come later).
It’s worth pointing out that the interaction array from yesterday’s make_interaction_array function can be defined more generally as a Jacobian matrix. I think it’s worth doing this for the sake of clarity and generality, and thinking about the elements of the array as first-order partial derivatives.
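One way to write this down, using symbols that are assumptions for illustration rather than names from the code: let e_r be the net direct effect of a strategy on row r (foc_effect in strategy_fitness), J_{ri} the array element in row r and column i, n_i the count or value for row i, and u_i the utility attached to row i. The fitness loop in strategy_fitness then amounts to the following, with each J_{ri} read as a first-order partial derivative.

```latex
\Delta n_i = \sum_{r} e_r \, J_{ri},
\qquad
J_{ri} \approx \frac{\partial n_i}{\partial n_r},
\qquad
W = \sum_{i} u_i \, \Delta n_i
```

Here W is the fitness assigned to the strategy. The row-then-column indexing follows how the code reads jaco[interest_row][i], which is the transpose of the more common textbook layout.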
Planning with the Jacobian matrix
Note that one benefit of individual-based modelling is that each individual can be unique – for example, an individual’s consumption rate of crops does not have to be completely defined by its type; there can be individual variation within types too. Hence, it is probably undesirable to have the Jacobian matrix of type and landscape cell layers define how individual interactions should occur. There should be some variation and uncertainty at least as an option. Hence, the interaction array should probably be calculated a posteriori as much as possible – ideally from looking at interactions on the landscape (e.g., by the eventual manager.c), or perhaps a function should go through the resource and landscape arrays and figure out the average interaction for each data type somewhere within G-MSE (but not from within the genetic algorithm; it would take too long). These details can be worked out later in the manager.c file, or perhaps somehow with the anecdotal function, which I think now can be made more general. For now, I’m going to manually set values in the matrix and use them to build an efficient genetic algorithm.
Dealing with issues of order in the fitness function
Resource type order needs to be identified for all resource types and landscape layers. The easiest way to do this is to just have a new array that lists all resource and landscape types, such as the below.
Res | Type 1 | Type 2 | Type 3 |
---|---|---|---|
1 | 1 | 0 | 0 |
1 | 2 | 0 | 0 |
0 | 1 | 0 | 0 |
The first column just identifies whether the row refers to a resource or a landscape level. The second through fourth columns identify a type (2 and 3 are always zero for landscape levels). This strikes me as the clearest way of keeping track of which rows go with which types in both the Jacobian matrix and the resource array, in which I might eventually want to include columns associated with each row.
The table is initialised with a simple function now in initialise.R to be called from the main gmse.R.
#' Initialise array of resource and landscape-level interactions
#'
#'@param resources the resource array
#'@param landscape the landscape array
#'@export
make_interaction_table <- function(resources, landscape){
resource_types <- unique(resources[,2:4]);
resource_part <- matrix(data=0, nrow=dim(resource_types)[1], ncol=4);
resource_part[,2:4] <- resource_types;
landscape_count <- dim(landscape)[3] - 2; # Again, maybe all in later?
landscape_part <- matrix(data = 0, nrow = landscape_count, ncol = 4);
landscape_part[,1] <- 1;
landscape_part[,2] <- 1:landscape_count;
the_table <- rbind(resource_part, landscape_part);
return(the_table);
}
The table, along with the Jacobian matrix, is now passed to the user function and into the genetic algorithm where it can be used by the fitness function.
A revised fitness function
A revised fitness function is below, which has not passed unit tests because it doesn’t appear to be maximising utility correctly. There are likely one or more minor bugs in the code that need to be fixed, and it would be better anyway to break the below down into a couple of smaller functions.
/* =============================================================================
* This is a preliminary function that checks the fitness of each agent by
* passing through a loop to payoffs_to_fitness
* fitnesses: Array to order fitnesses of the agents in the population
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* agent_array: The agent array
* jaco: The jacobian matrix of resource and landscape interactions
* interact_table: Lookup table for figuring out rows of jaco and types
* interest_num: The number of rows and cols in jac, and rows in lookup
* ========================================================================== */
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, int COLS, double **agent_array, double **jaco,
int **interact_table, int interest_num){
int agent, i, row, act_type, type1, type2, type3, interest_row;
double agent_fitness, *count_change, foc_effect;
double movem, castem, killem, feedem, helpem;
double utility, *utilities;
count_change = malloc(interest_num * sizeof(double));
utilities = malloc(interest_num * sizeof(double));
for(i = 0; i < interest_num; i++){
count_change[i] = 0; /* Initialise all count changes at zero */
utilities[i] = 0; /* Same for utilities */
}
for(agent = 0; agent < pop_size; agent++){
for(row = 0; row < ROWS; row++){
foc_effect = 0;
act_type = (int) population[row][0][agent];
type1 = population[row][1][agent];
type2 = population[row][2][agent];
type3 = population[row][3][agent];
utility = population[row][4][agent];
movem = population[row][7][agent];
castem = population[row][8][agent];
killem = population[row][9][agent];
feedem = population[row][10][agent];
helpem = population[row][11][agent];
switch(act_type){
case -2:
foc_effect -= movem; /* Times birth to account for repr? */
foc_effect -= castem; /* But only remove E offspring? */
foc_effect -= killem; /* But also remove E offspring? */
foc_effect += feedem; /* But should less mortality */
foc_effect += helpem; /* But should affect offspring? */
interest_row = 0;
while(interest_row < interest_num){
if(interact_table[interest_row][1] == type1 &&
interact_table[interest_row][2] == type2 &&
interact_table[interest_row][3] == type3
){
break;
}else{
interest_row++;
}
} /* Found the right row in the look-up table */
for(i = 0; i < interest_num; i++){
count_change[i] += foc_effect * jaco[interest_row][i];
}
utilities[interest_row] = utility;
case -1:
break; /* Add landscape effects here */
default:
break;
}
}
fitnesses[agent] = 0;
for(i = 0; i < interest_num; i++){
fitnesses[agent] += count_change[i] * utilities[i];
}
/* The below will be removed -- once a minor bug is found */
/* fitnesses[agent] = population[0][12][agent]; */
}
free(utilities);
free(count_change);
}
Nevertheless, this is definitely some progress – and the code is still fast. The next step is to print output from the above function to track down what is incorrect.
An additional thought that could be useful for the genetic algorithm is that it might make sense for the AGENT array to also include abundances of each resource type and landscape level in columns at the end of the array, the order of which matches the order of the 2D array described yesterday. Something like the former anecdotal function could then be used to fill in abundance values as appropriate (e.g., matching resource on the agent’s owned land, or on public land, or nearby to the location of the agent).
Initialise interaction array
A new function in R initialises an array of interactions among resource types and landscape layers.
#' Initialise array of resource and landscape-level interactions
#'
#'@param resources the resource array
#'@param landscape the landscape array
#'@export
make_interaction_array <- function(resources, landscape){
resource_types <- unique(resources[,2:4]);
resource_count <- dim(resource_types)[1];
landscape_count <- dim(landscape)[3] - 2; # Maybe put all of them in later?
total_dims <- resource_count + landscape_count;
INTERACTIONS <- matrix(data = 0, nrow = total_dims, ncol = total_dims);
name_vec <- NULL;
for(i in 1:dim(resource_types)[1]){
name_vec <- c( name_vec,
paste(resource_types[i,1],
resource_types[i,2],
resource_types[i,3],
sep = "" )
);
}
name_vec <- c(name_vec, as.character(paste("L",1:landscape_count,sep="")));
rownames(INTERACTIONS) <- name_vec;
colnames(INTERACTIONS) <- name_vec;
return(INTERACTIONS);
}
Specific values can be added in outside the make_interaction_array function and updated as need be by G-MSE.
It’s a bit painful, but I’m going to delete some major pieces of code in the genetic algorithm (which will obviously be preserved in version control). The following functions are slowing things down, and given the new approach outlined in option 3 from 13 APR, I’m going to remove them and focus on the ACTION array only, assuming that agents act as if their actions will yield the intended results.
/* =============================================================================
* This function calculates an individual agent's fitness
* ========================================================================== */
double calc_agent_fitness(double ***population, int ROWS, int COLS,
int landowner, double ***landscape,
double **resources, int res_number, int land_x,
int land_y, int land_z, int trait_number,
double *fitnesses, double *paras){
int agent, resource, resource_new, trait, row, col, xloc, yloc, zloc;
int res_on_land, res_nums_added, res_nums_subtracted, res_num_total;
double *payoff_vector, *payoffs_after_actions, *payoff_change;
double **TEMP_RESOURCE, **TEMP_ACTION, ***TEMP_LANDSCAPE;
double **ADD_RESOURCES, **NEW_RESOURCES;
double a_fitness;
payoff_vector = malloc(ROWS * sizeof(double));
payoffs_after_actions = malloc(ROWS * sizeof(double));
payoff_change = malloc(ROWS * sizeof(double));
/* --- Make temporary resource, action, and landscape arrays below --- */
TEMP_RESOURCE = malloc(res_number * sizeof(double *));
for(resource = 0; resource < res_number; resource++){
TEMP_RESOURCE[resource] = malloc(trait_number * sizeof(double));
}
for(resource = 0; resource < res_number; resource++){
for(trait = 0; trait < trait_number; trait++){
TEMP_RESOURCE[resource][trait] = resources[resource][trait];
}
}
TEMP_ACTION = malloc(ROWS * sizeof(double *));
for(row = 0; row < ROWS; row++){
TEMP_ACTION[row] = malloc(COLS * sizeof(double));
}
for(row = 0; row < ROWS; row++){
for(col = 0; col < COLS; col++){
TEMP_ACTION[row][col] = population[row][col][landowner];
}
}
TEMP_LANDSCAPE = malloc(land_x * sizeof(double **));
for(xloc = 0; xloc < land_x; xloc++){
TEMP_LANDSCAPE[xloc] = malloc(land_y * sizeof(double *));
for(yloc = 0; yloc < land_y; yloc++){
TEMP_LANDSCAPE[xloc][yloc] = malloc(land_z * sizeof(double));
}
}
for(zloc = 0; zloc < land_z; zloc++){
for(yloc = 0; yloc < land_y; yloc++){
for(xloc = 0; xloc < land_x; xloc++){
TEMP_LANDSCAPE[xloc][yloc][zloc] = landscape[xloc][yloc][zloc];
}
}
}
/* ----------------------------------------------------------- */
calc_payoffs(TEMP_ACTION, ROWS, landscape, TEMP_RESOURCE, res_number,
landowner, land_x, land_y, payoff_vector);
do_actions(landscape, TEMP_RESOURCE, land_x, land_y, TEMP_ACTION, ROWS,
landowner, res_number, COLS);
/* ===== Below re-creates key parts of the resource model ===== */
project_res_abund(TEMP_RESOURCE, paras, res_number);
res_nums_added = 0;
res_nums_subtracted = 0;
for(resource = 0; resource < res_number; resource++){
res_nums_added += TEMP_RESOURCE[resource][10];
if(TEMP_RESOURCE[resource][8] < 0){
res_nums_subtracted += 1;
}
}
ADD_RESOURCES = malloc(res_nums_added * sizeof(double *));
for(resource = 0; resource < res_nums_added; resource++){
ADD_RESOURCES[resource] = malloc(trait_number * sizeof(double));
}
res_place(ADD_RESOURCES, TEMP_RESOURCE, res_nums_added, res_number,
trait_number, 10, 11);
res_num_total = res_number + res_nums_added - res_nums_subtracted;
NEW_RESOURCES = malloc(res_num_total * sizeof(double *));
for(resource = 0; resource < res_num_total; resource++){
NEW_RESOURCES[resource] = malloc(trait_number * sizeof(double));
}
resource_new = 0;
for(resource = 0; resource < res_number; resource++){
if(TEMP_RESOURCE[resource][8] >= 0){
for(trait=0; trait < trait_number; trait++){
NEW_RESOURCES[resource_new][trait] =
TEMP_RESOURCE[resource][trait];
}
resource_new++;
}
}
for(resource = 0; resource < res_nums_added; resource++){
for(trait = 0; trait < trait_number; trait++){
NEW_RESOURCES[resource_new][trait] = ADD_RESOURCES[resource][trait];
}
resource_new++;
}
res_landscape_interaction(NEW_RESOURCES, 1, 1, 8, res_num_total, 14,
TEMP_LANDSCAPE, 1);
/* ============================================================*/
calc_payoffs(TEMP_ACTION, ROWS, landscape, NEW_RESOURCES, res_num_total,
landowner, land_x, land_y, payoffs_after_actions);
a_fitness = payoffs_to_fitness(TEMP_ACTION, ROWS, payoffs_after_actions);
/* ----------------------------------------------------------- */
for(resource = 0; resource < res_num_total; resource++){
free(NEW_RESOURCES[resource]);
}
free(NEW_RESOURCES);
for(resource = 0; resource < res_nums_added; resource++){
free(ADD_RESOURCES[resource]);
}
free(ADD_RESOURCES);
for(xloc = 0; xloc < land_x; xloc++){
for(yloc = 0; yloc < land_y; yloc++){
free(TEMP_LANDSCAPE[xloc][yloc]);
}
free(TEMP_LANDSCAPE[xloc]);
}
free(TEMP_LANDSCAPE);
for(row = 0; row < ROWS; row++){
free(TEMP_ACTION[row]);
}
free(TEMP_ACTION);
for(resource = 0; resource < res_number; resource++){
free(TEMP_RESOURCE[resource]);
}
free(TEMP_RESOURCE);
free(payoff_change);
free(payoffs_after_actions);
free(payoff_vector);
return a_fitness;
}
The above re-creation of the resource model was particularly slow – essentially running a big chunk of resource.c 2000 times (once for each of 100 simulated agents in the genetic algorithm for 20 generations). With more stake-holders or longer convergence times, this would become very time-consuming without much benefit.
The above function calls project_res_abund (below) which is no longer needed.
/* =============================================================================
* This function looks at the resources and projects how many new resources
* there will be after deaths and births.
* resources: The resource array
* paras: Relevant parameter values
* res_number: The number of rows in the resource array
* ========================================================================== */
void project_res_abund(double **resources, double *paras, int res_number){
int birthtype, deathtype;
int birth_K, death_K;
int resource;
birthtype = (int) paras[3];
deathtype = (int) paras[4];
birth_K = (int) paras[5];
death_K = (int) paras[6];
res_add(resources, res_number, 9, birthtype, birth_K);
res_remove(resources, res_number, 8, deathtype, death_K);
}
The function calc_agent_fitness
also calls calc_payoffs
, which can be removed.
/* =============================================================================
* This function calculated each payoff for rows in the action matrix
* population: array of the population that is made (malloc needed earlier)
* ROWS: Number of rows in the COST and ACTION arrays
* landscape: The landscape array
* resources: The resource array
* res_number: The number of rows in the resource array
* landowner: The agent ID of interest -- also the landowner
* land_x: The x dimension of the landscape
* land_y: The y dimension of the landscape
* payoff_vector: A vector of payoffs for each row of the action array
* ========================================================================== */
void calc_payoffs(double **population, int ROWS, double ***landscape,
double **resources, int res_number, int landowner,
int land_x, int land_y, double *payoff_vector){
int xloc, yloc, yield_layer;
int resource, row;
int landscape_specific;
int res_count;
double cell_yield;
for(row = 0; row < ROWS; row++){
payoff_vector[row] = 0;
res_count = 0; /* Reset the count for each row (was uninitialised) */
if(population[row][0] == -2){
for(resource = 0; resource < res_number; resource++){
if(population[row][1] == resources[resource][1] &&
population[row][2] == resources[resource][2] &&
population[row][3] == resources[resource][3]
){
landscape_specific = population[row][6];
if(landscape_specific == 0){
res_count++;
}else{
xloc = resources[resource][4];
yloc = resources[resource][5];
if(landscape[xloc][yloc][2] == landowner){
res_count++;
}
}
}
}
payoff_vector[row] += res_count;
}
if(population[row][0] == -1){
yield_layer = population[row][1];
for(xloc = 0; xloc < land_x; xloc++){
for(yloc = 0; yloc < land_y; yloc++){
if(landscape[xloc][yloc][2] == landowner){
cell_yield = landscape[xloc][yloc][yield_layer];
payoff_vector[row] += cell_yield;
}
}
}
}
if(population[row][0] > -1){
payoff_vector[row] = 0;
}
}
}
I’m leaving in the functions do_actions
and resource_actions
, which, while not part of the genetic algorithm now, might be used in user.c
to enact the strategies selected by the genetic algorithm.
In place of all these functions, I’m going to write a modified version of payoffs_to_fitness
(below, which will also be removed) called actions_to_fitness
, which will need the ACTION
array and RESOURCE
array to return a value the_fitness
.
/* =============================================================================
* This function translates resource abundances and crop yields to the fitness
* of an agent
* action: The action array
* ROWS: Number of rows in the COST and ACTION arrays
* payoffs: Payoffs associated with each row of the action array
* ========================================================================== */
double payoffs_to_fitness(double **action, int ROWS, double *payoffs){
int row;
double utility, abundance, the_fitness;
the_fitness = 0; /* Initialise before accumulating (was uninitialised) */
for(row = 0; row < ROWS; row++){
utility = action[row][4];
abundance = payoffs[row];
the_fitness += utility * abundance;
}
return the_fitness;
}
The idea will be to have agents assume that their actions will have the intended results (e.g., killing 5 resources) without using the entire resource model to project whether or not this is really expected (e.g., if only 3 resources are available to kill). Since the ACTION
array includes utilities, we can multiply the assumed action effects by utility to calculate fitness. One necessary added complication is that there needs to be some way to model indirect effects on fitness, for example, if resources increase or decrease crop cell values (or other resource abundances) or vice versa. There needs to be some way for agents to recognise that they can, e.g., kill resources to increase crop yield. Rather than go through the computationally intense task of replicating full interactions within the genetic algorithm, I think it would be better to have G-MSE create a 2D array that identifies the effect of each resource type or landscape layer on each other resource type or landscape layer. This array wouldn’t need to be re-created ex nihilo every time the genetic algorithm is run, but could instead be either produced at a higher level from the parameters of the genetic algorithm, or perhaps calculated somehow in the manager function (not yet written). Hence, the consequences of an action on any given resource type or landscape layer could be followed through the 2D array instead of by re-creating the resource algorithm. This would allow us to directly manipulate error as well, if for example some stake-holders don’t recognise certain consequences of affecting one or another resource. For proof of concept, only a two by two array needs to be used.
 | Resource_1 | Landscape_1 |
---|---|---|
Resource_1 | 0 | -0.5 |
Landscape_1 | 0.1 | 0 |
Where the rows above are the focal thing of interest and the columns show what the focal thing is having an effect on, per capita. This could be challenging because the per capita effect might vary with resource abundances, and might be factored through other parameters (e.g., landscape cells affecting resource birth or death). Getting expected change in abundance could be a bit challenging, though would be certainly less computationally intense than the way I was doing it before. I’ll start with using defined parameter values for proof of concept, but I do think that this array would be best built in the manager model, perhaps with multiple options that can incorporate error and uncertainty.
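A minimal sketch of how such an effect array might be used, assuming the row-affects-column layout of the table above and following only first-order indirect effects. The names `N_TYPES`, `project_change`, and `assumed_fitness` are hypothetical and not part of G-MSE; the `direct` vector stands in for the changes an agent assumes its actions will cause.

```c
#define N_TYPES 2 /* Hypothetical: index 0 = Resource_1, 1 = Landscape_1 */

/* Project the total change in each type from the assumed direct changes.
 * effect[i][j] is the per capita effect of type i on type j, so
 * total[j] = direct[j] + sum over i of direct[i] * effect[i][j].
 * Only first-order indirect effects are followed (an assumption). */
void project_change(double effect[N_TYPES][N_TYPES], const double *direct,
                    double *total){
    int i, j;
    for(j = 0; j < N_TYPES; j++){
        total[j] = direct[j];
        for(i = 0; i < N_TYPES; i++){
            total[j] += direct[i] * effect[i][j];
        }
    }
}

/* Fitness is the utility-weighted sum of the projected changes */
double assumed_fitness(double effect[N_TYPES][N_TYPES], const double *direct,
                       const double *utility){
    int j;
    double total[N_TYPES];
    double fitness = 0.0;
    project_change(effect, direct, total);
    for(j = 0; j < N_TYPES; j++){
        fitness += utility[j] * total[j];
    }
    return fitness;
}
```

With the example table, an agent that assumes it kills 5 resources (`direct = {-5, 0}`) would project a crop yield change of 2.5 on `Landscape_1`. Error could be introduced by giving different agents different copies of the effect array.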
Double-check resource functions
It’s now time to simulate the recreation of the RESOURCE
array within the genetic algorithm, so it was useful to re-check the resource functions to remember what the RESOURCE
array looks like after the resource model portion of G-MSE and why. The genetic algorithm needs to simulate births and deaths, making the code below from resource.c
particularly relevant.
for(resource = 0; resource < rows; resource++){
res_adding[resource][realised] = 0;
rand_pois = rpois(res_adding[resource][add]);
res_adding[resource][realised] = rand_pois;
added += (int) rand_pois;
}
The above is in a switch
statement that is currently superfluous but might later model different types of reproduction. Hence it is probably best to just run the whole function res_add
, which will add the number of new resources each existing resource produces to column 10
in C.
In fact, it will probably be considerably cleaner and more readable to just make the biology-centred part of the whole resource
function in resource.c
its own function, resource_dynamics
, then run resource_dynamics
in the genetic algorithm with appropriate links. As a bonus, this would take care of the landscape-level effects of resources too.
Calculate agent fitness function almost finished
The function calc_agent_fitness
is almost complete, and it will serve as an initial draft of the genetic algorithm once I write the code to translate resource abundances and crop yields to realised utilities. The meat of the function (excluding initialisation and memory management) is below.
/* ----------------------------------------------------------- */
calc_payoffs(TEMP_ACTION, ROWS, landscape, TEMP_RESOURCE, res_number,
landowner, land_x, land_y, payoff_vector);
do_actions(landscape, TEMP_RESOURCE, land_x, land_y, TEMP_ACTION, ROWS,
landowner, res_number, COLS);
/* ===== Below re-creates key parts of the resource model ===== */
project_res_abund(TEMP_RESOURCE, paras, res_number);
res_nums_added = 0;
res_nums_subtracted = 0;
for(resource = 0; resource < res_number; resource++){
res_nums_added += TEMP_RESOURCE[resource][10];
if(TEMP_RESOURCE[resource][8] < 0){
res_nums_subtracted += 1;
}
}
ADD_RESOURCES = malloc(res_nums_added * sizeof(double *));
for(resource = 0; resource < res_nums_added; resource++){
ADD_RESOURCES[resource] = malloc(trait_number * sizeof(double));
}
res_place(ADD_RESOURCES, TEMP_RESOURCE, res_nums_added, res_number,
trait_number, 10, 11);
res_num_total = res_number + res_nums_added - res_nums_subtracted;
NEW_RESOURCES = malloc(res_num_total * sizeof(double *));
for(resource = 0; resource < res_num_total; resource++){
NEW_RESOURCES[resource] = malloc(trait_number * sizeof(double));
}
resource_new = 0;
for(resource = 0; resource < res_number; resource++){
if(TEMP_RESOURCE[resource][8] >= 0){
for(trait=0; trait < trait_number; trait++){
NEW_RESOURCES[resource_new][trait] =
TEMP_RESOURCE[resource][trait];
}
resource_new++;
}
}
for(resource = 0; resource < res_nums_added; resource++){
for(trait = 0; trait < trait_number; trait++){
NEW_RESOURCES[resource_new][trait] = ADD_RESOURCES[resource][trait];
}
resource_new++;
}
res_landscape_interaction(NEW_RESOURCES, 1, 1, 8, res_num_total, 14,
TEMP_LANDSCAPE, 1);
/* ============================================================*/
calc_payoffs(TEMP_ACTION, ROWS, landscape, NEW_RESOURCES, res_num_total,
landowner, land_x, land_y, payoffs_after_actions);
/* Need a calc_utilities function */
/* ----------------------------------------------------------- */
The next step is to write the calc_utilities
function. Overall, the whole program is noticeably slower, so I will want to optimise a bit if possible. I also need to do some unit testing for all of this to make sure that the genetic algorithm is doing what I intend it to do.
Moving forward: optimisation and error in ACTION
Having completed some initial coding and testing, there is a lot to do on everything downstream of calc_agent_fitness
. The function doesn’t appear to alter agent utilities somehow, and slows down the simulations dramatically, from about half a second to several minutes to get through 100 generations. Options for addressing this include:
calc_payoffs
function and changes the nature of the payoffs_to_fitness
function to directly assess fitness by assuming actions will be successful.
I find the third option most tempting. Perhaps there will be a case for extreme accuracy in predicting the effects of actions, but I think that it’s unlikely that we will lose much if we assume that agent actions are successful in the genetic algorithm. This also builds in the kind of error that would seem to be realistic in terms of human behaviour. It will be necessary, however, to still have agents link resource abundance with changes on the landscape – e.g., the indirect fitness benefit in terms of crop production increase caused by killing a resource needs to be realised in some way. The best way might be to rewrite res_landscape_interaction
somehow to link the two without looping through each landscape cell for each resource. I don’t know the best way to do this yet – perhaps something in an observation model that estimates mean crop loss due to resources of type X
?
Resolved Issue #17
I am now closing Issue #17 introduced yesterday, as the issue is resolved such that action array columns util
, u_loc
, and u_land
are now not touched by the genetic algorithm where the first column of the action array takes a negative value.
Moving on to castration
I have tested to confirm that the moving (i.e., scaring) action is working and actually moving resources as intended. I am now moving on to the code for castrating (decreasing birth rate to zero) resources. As with the moving, there is really no analog on the landscape for this (since crops modelled using the landscape don’t reproduce explicitly – if we wanted them to, we could just model them as a different kind of resource), so I am also only doing a function of this for resources. Any positive values in the ACTION
array therefore have no effect on landscape rows (i.e., where the first column equals -1
).
The castration function (up and working) reuses a lot of code from the moving function, which initially led me to trying to make all of the actions part of one function.
/* =============================================================================
* This function causes the agents to castrate a resource
* land: The landscape array
* resources: The resource array
* owner: The agent ID of interest -- also the landowner
* u_loc: Whether or not an agent's actions depend on owning land cell
* casts_left: The number of remaining times an agent will castrate
* res_number: The total number of resources in the resources array
* land_x: The x dimension of the landscape
* land_y: The y dimension of the landscape
* res_type1: Type 1 category of resources being castrated
* res_type2: Type 2 category of resources being castrated
* res_type3: Type 3 category of resources being castrated
* ========================================================================== */
void castrate_resource(double ***land, double **resources, int owner, int u_loc,
int casts_left, int res_number, int land_x, int land_y,
int res_type1, int res_type2, int res_type3){
int xpos, ypos, xloc, yloc;
int cell, cast;
int resource, t1, t2, t3;
resource = 0;
while(casts_left > 0 && resource < res_number){
t1 = (int) resources[resource][1];
t2 = (int) resources[resource][2];
t3 = (int) resources[resource][3];
if(t1 == res_type1 && t2 == res_type2 && t3 == res_type3){
xpos = (int) resources[resource][4];
ypos = (int) resources[resource][5];
cell = land[xpos][ypos][2];
cast = check_if_can_act(u_loc, cell, owner);
if(cast == 1){
resources[resource][9] = 0;
casts_left--;
}
}
resource++;
}
}
Nevertheless, having these modular actions makes the code a bit more readable, and to combine all of them would require multiple while
loops within the function anyway – the resource type check could be pulled out, but then this would defeat the whole point of being able to switch the order of actions. Then again, combining them could make it easier to avoid having the same resource experience multiple actions, which is probably undesirable.
Even more importantly, there is an issue here that all of these actions will start out with resource = 0
, so the first resource will by default experience multiple actions wherever this is possible. Clearly this needs to be either randomised or done systematically in some way. I think that the best solution is to create a function to sample without replacement, put that function in utilities.c
, then use it to select resources to be acted on – hence each resource will only experience one action. In the unlikely event that there are more actions than resources, it would be useful to somehow randomise which actions are taken – perhaps smaller action specific functions should operate within the larger function ordering actions. In any case, the above castrate_resource
function and the moving function should change.
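A minimal sketch of what a sample-without-replacement helper for utilities.c might look like, using a partial Fisher-Yates shuffle. The name `sample_no_replace` and its signature are hypothetical, and the standard library `rand()` is used here only so that the example is self-contained; inside the package, R's random number generator would presumably be used instead.

```c
#include <stdlib.h>

/* Fill `out` with k distinct indices drawn from 0..n-1 in random order.
 * Returns 0 on success, or -1 if k is out of range or allocation fails.
 * Uses a partial Fisher-Yates shuffle over a temporary index pool. */
int sample_no_replace(int n, int k, int *out){
    int i, j, tmp;
    int *pool;
    if(k < 0 || k > n){
        return -1; /* Cannot draw more distinct indices than exist */
    }
    if(k == 0){
        return 0;  /* Nothing to draw */
    }
    pool = malloc(n * sizeof(int));
    if(pool == NULL){
        return -1;
    }
    for(i = 0; i < n; i++){
        pool[i] = i;
    }
    for(i = 0; i < k; i++){
        /* Pick from the not-yet-chosen tail; modulo bias is ignored here */
        j = i + rand() % (n - i);
        tmp     = pool[i];
        pool[i] = pool[j];
        pool[j] = tmp;
        out[i]  = pool[i];
    }
    free(pool);
    return 0;
}
```

Each sampled index could then be treated as the row of a resource to act on, so that no resource experiences more than one action.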
Major restructure of actions successful
In working through the separate user actions, I found it challenging to try to code things such that correct resources were affected, but these actionable resources were not affected in any particular order (e.g., all scaring first, then killing, etc.). If there was a particular order, then it’s possible that users could systematically run out of resources to do things to (perhaps because they exhausted the resources on their land) and hence always move resources but not kill them for arbitrary reasons. To work around this, it is necessary to randomly select an action and perform it on an actionable resource. This is solved with a new function resource_actions
, which initial testing finds to work as intended.
/* =============================================================================
* This function enacts all user actions in a random order
* resources: The resource array
* row: The row of the action array (should be 0)
* action: The action array
* can_act: Binary vector length res_number where 1 if resource actionable
* res_number: The number of rows in the resource array
* land_x: The x dimension of the landscape
* land_y: The y dimension of the landscape
* ========================================================================== */
void resource_actions(double **resources, int row, double **action,
int *can_act, int res_number, int land_x, int land_y){
int resource, xloc, yloc, i;
int util, u_loc, u_land;
int movem, castem, killem, feedem, helpem;
int *actions, total_actions, action_col, sample;
actions = malloc(5 * sizeof(int));
total_actions = 0;
for(i = 0; i < 5; i++){
action_col = i + 7;
actions[i] = action[row][action_col];
total_actions += action[row][action_col];
}
resource = 0;
while(resource < res_number && total_actions > 0){
if(can_act[resource] == 1){
do{ /* Sampling avoids having some actions always first */
sample = floor( runif(0, 5) );
}while(sample == 5 || actions[sample] == 0); /* Check bounds first */
/* Enact whichever action was randomly sampled */
switch(sample){
case 0: /* Move resource */
xloc = (int) floor( runif(0, land_x) );
yloc = (int) floor( runif(0, land_y) );
resources[resource][4] = xloc;
resources[resource][5] = yloc;
actions[0]--;
break;
case 1: /* Castrate resource */
resources[resource][9] = 0;
actions[1]--;
break;
case 2: /* Kill resource */
resources[resource][8] = 1;
actions[2]--;
break;
case 3: /* Feed resource (increase birth-rate)*/
resources[resource][9]++;
actions[3]--;
break;
case 4: /* Help resource (increase offspring number directly) */
resources[resource][10]++;
actions[4]--;
break;
default:
break;
}
total_actions--;
}
resource++;
}
free(actions);
}
The above function is called by do_actions
within a switch
statement. Recall that the resources
array here is a temporary array that will later be used to assess the impact of actions with respect to user utility to ultimately assign each agent in the genetic algorithm a fitness value. Some functions within resource.c
are going to need to be used for this.
Since I have added multiple functions that allocate memory, now is probably a good time to check for any errors or memory leaks.
R -d "valgrind --tool=memcheck --leak-check=yes --track-origins=yes" --vanilla < gmse.R
After running valgrind, all appears to be clear.
==5787== HEAP SUMMARY:
==5787== in use at exit: 95,076,495 bytes in 17,685 blocks
==5787== total heap usage: 12,369,394 allocs, 12,351,709 frees, 2,246,425,240 bytes allocated
==5787==
==5787== LEAK SUMMARY:
==5787== definitely lost: 0 bytes in 0 blocks
==5787== indirectly lost: 0 bytes in 0 blocks
==5787== possibly lost: 0 bytes in 0 blocks
==5787== still reachable: 95,076,495 bytes in 17,685 blocks
==5787== suppressed: 0 bytes in 0 blocks
==5787== Reachable blocks (those to which a pointer was found) are not shown.
==5787== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5787==
==5787== For counts of detected and suppressed errors, rerun with: -v
==5787== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
The changes now have been pushed from a local branch to dev
. The next thing to work on is to get key parameters from the temporary resource array that was changed (how many resources added, lost, moved, etc.). After this information is collected, then another calc_payoffs
can be run on the changed array to get an updated estimate of key values and compare before and after user actions.
I’ve decided that u_loc
should actually refer to the actions of a particular agent being taken on their own land (u_loc = 1
) or on all land (u_loc = 0
). I’m debating whether a third option u_loc = -1
should be available for forcing action to only occur on public land. I have also created a more readable structure for the do_actions
function in game.c
, which is going to be a bit on the long side, and will therefore need to be written in a way that is easy to follow, going through each action and performing the action through a series of nested while
loops.
Cost issue: Issue #17
The action array has three columns of util
, u_loc
, and u_land
, which represent the utility of a resource, whether or not actions on the resource are restricted to the user’s land, and whether or not the utility of the resource is dependent on it being on the user’s land. Currently, any positive values correspond to some cost in the cost array, which means that they are changed to zero when the cost is high. In essence, these three columns represent identity, while the remaining columns to the right represent actions. Ideally, we don’t want the users to be affecting, or the constrain_costs
function changing, util
, u_loc
, and u_land
columns – only the ones to the right.
What needs to happen next is for util
, u_loc
, and u_land
columns to be untouchable by the genetic algorithm when the first column in the action array (agent
) is negative – corresponding to direct actions of the user on resources or landscape layers. Remaining util
, u_loc
, and u_land
should be touchable. Hence, within constrain_costs
, it is necessary to block adjustment to the relevant columns.
I appear to have found a fix for this, but I’m going to wait a day before I call the issue resolved. The fix basically involved telling the program not to touch columns below 7 if the first column of the row is less than zero.
start_col = 4;
if(population[row][0][agent] < 0){
start_col = 7;
}
The new variable start_col
then defines the column to start on when considering whether or not to constrain costs. The above also needs to appear in the functions affecting population initialisation, mutation, and crossover of the genetic algorithm. I’m not sure if there is a more elegant or more readable solution, but the above appears to work fine. The appropriate columns are untouchable in rows where the first column is negative. The constraining part of the constrain_costs
function also looks a bit messy.
while(tot_cost > budget){
do{ /* This do assures xpos never equals ROWS (unlikely) */
xpos = (int) floor( runif(0,ROWS) );
}while(xpos == ROWS);
if(population[xpos][0][agent] > 0){
do{
ypos = (int) floor( runif(4,COLS) );
}while(ypos == COLS);
}else{
do{
ypos = (int) floor( runif(7,COLS) );
}while(ypos == COLS);
}
if(population[xpos][ypos][agent] > 0){
population[xpos][ypos][agent]--;
tot_cost -= COST[xpos][ypos][layer];
}
}
I think the messiness is really mostly caused by the do
loops, which are there as a safety precaution against the unlikely event that the random number selected exactly equals ROWS
or COLS
, and hence causes a segfault.
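One possible way to tidy this up, sketched under the assumption that clamping is acceptable in place of re-drawing: a small helper that converts a uniform draw on [lower, upper] into a valid index. The name `draw_to_index` is hypothetical. Clamping maps the (very unlikely) draw of exactly `upper` onto the last valid index rather than sampling again, which is a slightly different technique from the do loops above.

```c
/* Convert a uniform draw on [lower, upper] into an index in
 * lower..(upper - 1). Truncation equals floor() for non-negative draws.
 * A draw exactly equal to `upper` is clamped to the last valid index. */
int draw_to_index(double draw, int lower, int upper){
    int index;
    index = (int) draw;
    if(index >= upper){
        index = upper - 1;
    }
    if(index < lower){
        index = lower;
    }
    return index;
}
```

In the constraining loop, `ypos = draw_to_index(runif(4, COLS), 4, COLS);` would then replace each do loop, assuming the same `runif` call used above.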
Next steps
With the util
, u_loc
, and u_land
column situation seemingly resolved in the action array, I’ve done some initial testing again on the move_resource
function. The move_resource
function now appears to only move resources when it’s supposed to (i.e., when they’re on the land and the action array says to move them – assuming u_loc = 1
). Now, before moving on, I should check to make sure that resources are actually being moved in the resources array. Once this is finished, I will double check Issue 17, then move on to a castrate_resource
function.
I have re-arranged the fitness function structure to calculate fitness payoffs more clearly. One top level strategy_fitness
function will calculate all strategy fitness in the genetic algorithm by looping through calc_agent_fitness
for each agent in the population (note: this is each agent in the genetic algorithm population, from which the new strategy for one agent in the bigger G-MSE will be selected). The calc_agent_fitness
will itself call the calc_payoffs
function (see below) to get a vector with the same rows as the ACTION
and COST
arrays. Each element will eventually represent a change in the resource or landscape, corresponding to some utility value which will make it possible to calculate and compare overall fitness of the strategy.
void calc_payoffs(double ***population, int ROWS, double ***landscape,
double **resources, int res_number, int landowner,
int land_x, int land_y, double *payoff_vector, int agent){
int xloc, yloc, yield_layer;
int resource, row;
int landscape_specific;
int res_count;
double cell_yield;
for(row = 0; row < ROWS; row++){
payoff_vector[row] = 0;
res_count = 0; /* Reset the count for each row (was uninitialised) */
if(population[row][0][agent] == -2){
for(resource = 0; resource < res_number; resource++){
if(population[row][1][agent] == resources[resource][1] &&
population[row][2][agent] == resources[resource][2] &&
population[row][3][agent] == resources[resource][3]
){
landscape_specific = population[row][6][agent];
if(landscape_specific == 0){
res_count++;
}else{
xloc = resources[resource][4];
yloc = resources[resource][5];
if(landscape[xloc][yloc][2] == landowner){
res_count++;
}
}
}
}
payoff_vector[row] += res_count;
}
if(population[row][0][agent] == -1){
yield_layer = population[row][1][agent];
for(xloc = 0; xloc < land_x; xloc++){
for(yloc = 0; yloc < land_y; yloc++){
if(landscape[xloc][yloc][2] == landowner){
cell_yield = landscape[xloc][yloc][yield_layer];
payoff_vector[row] += cell_yield;
}
}
}
}
if(population[row][0][agent] > -1){
payoff_vector[row] = 0;
}
}
}
The above will need to be called twice in calc_agent_fitness
so that the difference between vector elements can be calculated.
Use of memcpy
to copy whole arrays
I have saved a bit of hassle by switching from the multiple loops to the simple use of memcpy
in C, which works as follows in the calc_agent_fitness
function.
void calc_agent_fitness(double ***population, int ROWS, int COLS, int landowner,
double ***landscape, double **resources, int res_number,
int land_x, int land_y, int trait_number,
double *fitnesses){
int agent, resource;
int res_on_land;
double *payoff_vector;
double **TEMP_RESOURCE;
payoff_vector = malloc(ROWS * sizeof(double));
TEMP_RESOURCE = malloc(res_number * sizeof(double *));
for(resource = 0; resource < res_number; resource++){
TEMP_RESOURCE[resource] = malloc(trait_number * sizeof(double));
}
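/* Copy row by row: memcpy on &TEMP_RESOURCE would only copy the pointer */
for(resource = 0; resource < res_number; resource++){
memcpy(TEMP_RESOURCE[resource], resources[resource],
trait_number * sizeof(double));
}
for(resource = 0; resource < 10 && resource < res_number; resource++){
printf("%f\t%f\t || %f\t%f\n", resources[resource][0],
resources[resource][1], TEMP_RESOURCE[resource][0],
TEMP_RESOURCE[resource][1]);
}
/*
calc_payoffs(population, ROWS, landscape, resources, res_number, landowner,
land_x, land_y, payoff_vector, agent);
*/
for(resource = 0; resource < res_number; resource++){
free(TEMP_RESOURCE[resource]);
}
free(TEMP_RESOURCE);
free(payoff_vector);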
}
The temporary vector TEMP_RESOURCE
needs to be made and remade so many times, and it appears that memcpy
is slightly faster than for loops. Nevertheless, I fear that use of memcpy
might make the code less readable and its implementation could depend on the hardware and compiler, which I don’t want. For now, I’m going to do this the more readable way.
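For reference, a minimal sketch of a row-by-row deep copy that keeps memcpy but hides it behind a helper, which might address the readability concern. The names `copy_2d` and `free_2d` are hypothetical. Note that memcpy is part of the standard C library (declared in string.h), so its behaviour for non-overlapping buffers should not depend on the compiler; only its speed might.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated deep copy of a rows x cols array of doubles,
 * or NULL if any allocation fails. Each row is copied with memcpy because
 * the rows are separate allocations and not one contiguous block. */
double **copy_2d(double **src, int rows, int cols){
    int r;
    double **dst;
    dst = malloc(rows * sizeof(double *));
    if(dst == NULL){
        return NULL;
    }
    for(r = 0; r < rows; r++){
        dst[r] = malloc(cols * sizeof(double));
        if(dst[r] == NULL){
            while(r-- > 0){ /* Free the rows already allocated */
                free(dst[r]);
            }
            free(dst);
            return NULL;
        }
        memcpy(dst[r], src[r], cols * sizeof(double));
    }
    return dst;
}

/* Free an array made by copy_2d */
void free_2d(double **arr, int rows){
    int r;
    for(r = 0; r < rows; r++){
        free(arr[r]);
    }
    free(arr);
}
```

Changes made to the copy leave the original untouched, which is the property the temporary resource array needs.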
Using do_actions
function
A do_actions
function will enact the actions of one (usually out of 100) member of the population in the genetic algorithm. So the general procedure will be to do the following.
1. In strategy_fitness, loop through each member of the population (agent ACTION array) running calc_agent_fitness.
2. In calc_agent_fitness, copy a dummy version of the resource, action, and landscape arrays.
3. In calc_agent_fitness, run calc_payoffs to get the payoffs before performing actions.
4. In calc_agent_fitness, then run do_actions, which causes a change in the temporary resource array as a consequence of the temporary action array.
5. In calc_agent_fitness, re-run calc_payoffs to get new payoffs after having performed the actions.

This should give a fitness value that is then returned to strategy_fitness (might want to have calc_agent_fitness return an int), which will store all fitnesses in a vector after the above loops. More bells and whistles can be added on to this later, but when this is finished, it should be a working genetic algorithm for modelling complex stake-holder behaviour.
Working through implementing the ideas for the fitness function from yesterday. I’ve linked some key parameters now through user
and ga
so they can be run in strategy_fitness
(namely the resource number and agent_ID, landowner
). Now it’s important to note that building local resources has to be conditional – if an agent has no land, they cannot do things on their land. And if their interests are global, this needs to be considered too. I think a collection of small functions called according to parameter options is needed, and landscape specific changes are really a subset of general actions, so maybe there’s a better way to do this. Really though, it would be nice to have a way for the cost of performing actions on land owned versus land not owned to be different. Then again, it would be nice to have different utilities for resources on and off your land, but this could get very complex very quickly (might, however, be interesting in that maybe a farmer values crops positively on their own land but negatively on the land of other farmers). I think it will also shake out when the manager actions affecting costs come into play – so a manager will naturally up the cost if shooting somehow becomes not tied to a location in a way that affects other stake-holders or management decisions, for whatever reason. The bottom line is that I think it’s okay for now to run the ga
with the constraint that if a resource/landscape utility value is tied to land ownership, then the actions should also be tied to owned landscape cells. If not, then actions should happen either on all cells or only public land.
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, int COLS, double ***landscape,
double **resources, double **agent_array,
int res_number, int landowner){
int xloc, yloc;
int agent, resource;
int res_on_land;
double **RESOURCE_LOCAL;
/* Need something here -- check if:
*
* 1) agent has landscape-specific utility
* 2) agent actually owns some land
*
* If neither are true, then RESOURCE_LOCAL should not be built, and actions
* of stake-holders should be interpreted accordingly (e.g., agents could be
* allowed to do some actions on public land, or not at all -- perhaps an
* option added to paras?
*/
res_on_land = 0; /* Make a sub-function returning an int for this */
for(resource = 0; resource < res_number; resource++){
xloc = (int) resources[resource][4];
yloc = (int) resources[resource][5];
if(landscape[xloc][yloc][2] == landowner){
res_on_land++;
}
}
for(agent = 0; agent < pop_size; agent++){
fitnesses[agent] = population[0][12][agent];
}
}
Definitions of the COST
and ACTION
arrays
A quick reminder to myself what’s going on in the COST
and ACTION
arrays, as they are most relevant for fitness functions. In the example ACTION
array there are two agents (very simple – just a manager and a stake-holder) and one resource. The table below is one layer of a 3D 2-layer array where each layer identifies actions for each unique agent.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
The way to read the above in the code is as follows. All of the actions are for the one stake-holder (i.e., assume we’re looking at the second layer in the 3D array). The first row of actions is special because it represents the degree to which the stake-holder themselves values, and the things they will do to, a resource of type1 = 1
, type2 = 0
, and type3 = 0
. Value is indicated by util
, whether or not that value is ‘visible’ to the agent (which may be implemented in different ways, but for now is within some distance of their location) is indicated by u_loc
, and whether or not that value is tied to it being on land that the stake-holder owns is indicated by u_land
. Actions are indicated by the remaining columns, and if u_land = TRUE
, then we assume that actions are restricted to resources on the agent’s owned land – though I am tempted to change u_loc
to mean whether or not actions are (1
) or are not (0
) restricted in this way.
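To keep the column meanings straight in code, one option is a named enum for the column indices. This is only a sketch: the `ACT_` names and the `row_utility` helper are hypothetical and do not exist in G-MSE, but the index values follow the column order of the table above (zero-indexed, as in C).

```c
/* Hypothetical names for the columns of one ACTION (or COST) layer */
enum action_col {
    ACT_AGENT  = 0,  /* -2: resource row, -1: landscape row, > 0: agent ID */
    ACT_TYPE1  = 1,
    ACT_TYPE2  = 2,
    ACT_TYPE3  = 3,
    ACT_UTIL   = 4,  /* Value placed on the resource or landscape layer */
    ACT_U_LOC  = 5,
    ACT_U_LAND = 6,
    ACT_MOVEM  = 7,  /* First of the action columns */
    ACT_CASTEM = 8,
    ACT_KILLEM = 9,
    ACT_FEEDEM = 10,
    ACT_HELPEM = 11,
    ACT_BANKEM = 12,
    ACT_NCOLS  = 13  /* Total number of columns */
};

/* Read the utility from one row of an ACTION layer */
double row_utility(const double *row){
    return row[ACT_UTIL];
}
```

Writing `action[row][ACT_KILLEM]` instead of `action[row][9]` would make functions such as resource_actions a little easier to check against the table.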
So the first row where agent = -2
represents values and actions of the focal agent (indicated by this layer of the 3D array) for a particular type of resource (note that more rows where agent = -2
would be needed for more resources). The second row where agent = -1
refers specifically to the values and actions of a particular layer of the landscape, which is indicated in the type1
column (making type2
and type3
effectively useless in this row, for now). Right now this is type1 = 1
, which is the index (for C – R is of course 2) where the values of crop production are stored; I’m not sure if it’s worth adding more rows for additional layers later, but this framework at least allows the possibility for other landscape properties to be valued. Of course, whenever agent = -1
and we’re looking at the landscape, actions such as movem
and castem
will need to have different meanings – or no meaning, but feedem
and killem
could be fairly straightforward. The third row is action taken to the agent whose ID is 1
with reference to resource type1 = 1
, type2 = 0
, and type3 = 0
. Any nonzero values here in util
, u_loc
, and u_land
cause the focal agent to change the value of another agent, while any nonzero values in movem
, castem
, …, bankem
cause the focal agent to change the cost of another agent taking a particular action (i.e., it affects the other agent’s layer at agent = -2
), increasing it or decreasing it (NOTE: I just noticed that I really need to set up these tables with values for type1 = -1
or something here – to allow agents to affect other agents’ costs of actions affecting the landscape). So in theory the agent in the above table could change the manager’s values (e.g., modelling lobbying) or the cost of them performing actions (e.g., modelling something like protesting or lobbying third parties?). They could also in theory change their own values and costs of performing actions, though I think this should almost always be prohibited by making the cost of doing so effectively infinite, which brings me to the cost array.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 101 | 101 | 101 | 3 | 8 | 4 | 4 | 2 | 1 |
-1 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
1 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
2 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
Assume that each agent gets a total budget of 100
. In that case, all of the table elements equal to 101 are effectively off limits because they are too costly (I might actually want to make them 1000 just in case some agent gets crafty and tries to lower another agent’s cost). So in the above, the stake-holder can only take six possible actions, all of which directly affect resources and not other stake-holders’ values or costs. By setting it up this way, the genetic algorithm will converge on the best set of these actions and the ACTION
array above will never change values where COST
elements are 101
. Later, we might decrease the value of util
in row 3 to allow the stake-holder to lobby the manager. Managers (different layer) would simply have cost arrays that have lower values allowing them to affect actions of stake-holders. With these tables now (I hope) completely clear, the code writing itself should become much smoother – I’ve anchored to the title immediately above because I know it’s going to be necessary to come back to these two difficult to remember tables.
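As a reminder of how the two tables combine, here is a minimal sketch that sums action times cost over one agent's layer and compares the total against the budget. The names `layer_cost`, `within_budget`, `N_COLS`, and `FIRST_COST_COL`, and the use of a plain 2D array for a single layer, are assumptions for illustration only; this is not the code in game.c.

```c
#include <stddef.h>

#define N_COLS 13        /* agent, type1-3, util, u_loc, u_land, six actions */
#define FIRST_COST_COL 4 /* assumed: columns 0-3 are identifiers, not costed */

/* Sum (action value * action cost) over every costed element of one layer */
double layer_cost(double action[][N_COLS], double cost[][N_COLS], size_t rows){
    size_t row, col;
    double tot_cost = 0.0;
    for(row = 0; row < rows; row++){
        for(col = FIRST_COST_COL; col < N_COLS; col++){
            tot_cost += action[row][col] * cost[row][col];
        }
    }
    return tot_cost;
}

/* Returns 1 if the whole strategy is affordable, 0 otherwise */
int within_budget(double action[][N_COLS], double cost[][N_COLS], size_t rows,
                  double budget){
    return layer_cost(action, cost, rows) <= budget;
}
```

With a budget of 100, two units of killem at a cost of 4 each total 8 and are affordable, whereas a single unit placed in any element costing 101 pushes the strategy over budget on its own.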
Objectives for fitness function
Eventually we’ll want to find some kind of multi-objective fitness function (Lee 2012), especially for the managers. For now I’m going to simplify and just make fitness a simple matter of abundance times deviation from utility – where for a landscape ‘abundance’ is replaced by owned cell crop yield, then summed over all resources and landscape layers. For now, utilities can be set unreasonably high – so conservationists might want 1000000 geese and farmers might want 1000000 in yield – much higher than is possible so that more is always better. The fitness function will then assign fitnesses minimising the deviation from utility somehow. The deviation from utility will eventually allow managers to have more reasonable goals – allowing the genetic algorithm to find more flexible and dynamic strategies.
Quick check on neural networks
I want to make sure I understand neural networks well enough to be able to explain why I’m not using them (yet). Daniel Shiffman’s book chapter in The Nature of Code helps out here. Because the simulated agents aren’t trying to recognise a particular pattern, I don’t think a neural network is how I would describe the COST
and ACTION
arrays – nor would a more explicit network structure going from input (e.g., costs and resource abundances) to output (actions) be terribly useful for current purposes. Nevertheless, a neural network will be useful if combined with empirical data to mimic human behaviour. For example, if we want to make an agent that predicts stake-holders’ actions based off of empirically collected data from behavioural games, then a neural network could be fed input and then calibrated to the ‘correct’ behaviour observed in humans through correlations with specific conditions. This would, in effect, create an artificial bot that does what a human would do based on correlating situations with actions. I’m not sure how much data it would require to parameterise effectively, but I suspect it would be a lot – we would probably need to have dozens of people act as stake-holders or managers within G-MSE and collect their actions.
Back to fitness functions
I now need to complete the genetic algorithm with a useful fitness function. The fitness function should calculate the change in resources caused by a stake-holder’s actions, then match them to utility. Note that this doesn’t require all resources to be calculated to figure out what total utility is before and after an agent has acted – only how the agent’s actions have increased or decreased resources, and the weight (utility) assigned to each.
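A minimal sketch of that idea, assuming the change in each resource type caused by the agent's actions has already been worked out: fitness is just the utility-weighted sum of those changes. The function name `utility_change` and the flat vectors `delta` and `util` are hypothetical and not part of game.c.

```c
/* Net utility change for one agent: for each resource type, multiply the
 * change in abundance caused by the agent's actions by the utility (weight)
 * that the agent assigns to that type, then sum over all types. */
double utility_change(const double *delta, const double *util, int n_types){
    int type;
    double total = 0.0;
    for(type = 0; type < n_types; type++){
        total += delta[type] * util[type];
    }
    return total;
}
```

For example, removing three individuals of a resource valued at 2 while adding five units of something valued at 1 gives a net change of -1; nothing about total abundance is needed to get that number.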
A starting point is to do some clean-up. There are several values hard coded in to the ga()
function that need to be assigned variables to be set in gmse.R
. Once I have the ga()
function a bit more readable, then I can move on to the specifics of the fitness function.
Having done the clean-up, I now note again that the utility from an action isn’t always direct – e.g., killing one resource might increase crop yield. Somehow, the action of removing an individual from the population must therefore be recognised by the agent as increasing yield by a particular amount. There are a few ways that I could think to do this:
Have stake-holders correlate resources with crop production on cells. This would be the most complex way of doing things – probably the most flexible too, but I’m not sure if it would actually be the most realistic. Not for farmers watching their crops being eaten at least; the cause and effect is something I think stake-holders could probably observe pretty clearly.
Give stake-holders complete access to the resource array and have them figure out exactly how much damage their land is going to sustain by seeing the number of resources on it and the amount of damage that each resource does per cell (column 14 in C, 15 in R). Maybe this is the best starting point, though it does seem to be a bit too exact; no farmer is going to know exactly how many animals are on their farm and exactly how much damage they will do. Still, perhaps we assume this and add in error later.
Give stake-holders access to the resource array column in which crop damage is specified, then have them associate mean damage per cell with each resource type. Do not, however, give them access to resource locations, and require that they instead estimate the density of resources on their landscape in the same way that managers might in an observation model type 0
(i.e., look at a few cells on their property, then infer the total number of resources and how much damage they’ll do). I like this because it seems reasonable that a farmer could know roughly how much damage an animal does to their crop in the area, but probably doesn’t have the time or ability to sample every corner of their land to find exactly the number of animals on it. It also doesn’t give stake-holders a superior ability to estimate local population size.
res_landscape_interaction
from resource.c
directly). If some resources are created or destroyed, then this would need to be accounted for by making a dummy resource array. Perhaps the following:
RESOURCE
array on the stake-holder’s land
RESOURCE_LOCAL
with only resources on the stake-holder’s land
RESOURCE_LOCAL
array in relevant columns (e.g., birth
, remove_pr
, etc.)
res_add
and res_remove
to get the number of individuals being added or removed.
res_landscape_interaction
function to find the effect of the added and removed individuals on landscape
Although I initially thought option 3 was pretty good, I’m now leaning toward option 4 as being the best one to try out first; it seems more flexible. Eventually, of course, we can specify options for different ways of calculating fitness, but I think it’s best to pick one option first and go with it. I think option 4 will be slightly slower than option 3, but I’m curious as to exactly how much slower. Hence, I’ll try number 4 first, then potentially move to 3 as a default if it’s too clunky. I have to keep in mind as well that managers are probably going to need to run the user
functions to set policy eventually (unless I can find a work-around that gets managers to anticipate stake-holder actions in making policy), and this will likely slow things down exponentially. Time isn’t much of an issue now, but I want to keep things as efficient as possible. Also important, I need to make sure that there is some if
statement that only deals with the landscape if the stake-holder owns land. If they don’t own any, then their actions need to be restricted accordingly – maybe to lobbying the manager or only doing things on public land? As a next step, I will attempt to write the code for option 4 above, perhaps excluding (for now) stake-holders that own no land.
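For comparison, the fallback (option 3) could be sketched roughly as below: the stake-holder counts resources on a few sampled cells, scales the mean density up to all owned cells, and multiplies by the mean damage per resource. The function `estimate_damage` and all of its arguments are hypothetical names, and the simple density-times-area scaling is an assumption rather than anything already in user.c.

```c
/* Estimate expected crop damage on a stake-holder's land from a few cells.
 *   counts:         number of resources counted in each sampled cell
 *   n_sampled:      number of cells sampled
 *   n_owned:        total number of cells that the stake-holder owns
 *   damage_per_res: mean damage that one resource does (assumed known)
 */
double estimate_damage(const int *counts, int n_sampled, int n_owned,
                       double damage_per_res){
    int cell;
    double total = 0.0, density;
    if(n_sampled <= 0){
        return 0.0; /* Nothing sampled, so no basis for an estimate */
    }
    for(cell = 0; cell < n_sampled; cell++){
        total += counts[cell];
    }
    density = total / n_sampled; /* Mean resources per sampled cell */
    return density * n_owned * damage_per_res;
}
```

This never touches resource locations directly, which keeps the stake-holder's information comparable to what a manager gets from observation type 0.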
More thoughts on genetic algorithms
I’ve come back to thinking more about how to write the fitness function of the G-MSE genetic algorithm, and about the relationship between evolution and individual learning, more generally. Watson and Szathmary (2016) argue that learning and (adaptive) evolution are formally linked. In practice, they note that ‘’In a good model space, desirable future behaviours should be similar (nearby) to behaviours that were useful in the past. For example, perhaps ’eating apples’ should be close to ‘eating pears’ but far from ‘eating red things’.’’ Watson and Szathmary (2016) also note that ‘’The representation of associations or correlations has the same fundamental relationship to learning as transistors have to electronics or logic gates to computation (and synapses to neural networks). Although mechanisms to learn a single correlation between two features can be trivial, these are also sufficient, when built up in appropriate networks, to learn arbitrarily complex functions’’. A potentially confusing aspect of this with respect to G-MSE is that we have two scales of time of interest. The first scale is within a single time step (i.e., inside the user model), and the second scale is over multiple time steps (population model \(\to\) observation model \(\to\) management model \(\to\) user model). Most of the time, when we focus on learning, we’re talking about the program learning to make a decision within a time step rather than stake-holders learning to make decisions across time steps. I’m not opposed to modelling the latter, but the former needs to come first in software development. So when we model learning through the genetic algorithm, it’s the iterative processes in ga()
– there is less worry, I think, about the correlations that Watson and Szathmary (2016) describe; rather, the associations are explicit. A value in the ACTION
array is associated with a particular outcome that can be tied to stake-holder interests. More abstract learning over G-MSE generations can be added in later with estimates of correlations between actions and outcomes.
Major updates merged to master
I have merged all of the recent updates on the genetic algorithm to the master branch. We now have a bug-free G-MSE model v0.0.8
that has all of the necessary framework of proper machine learning once a fitness function is written that links costs and utilities of each agent to agent actions. There are a few things that will need to be updated thereafter, which I am putting off until later when the full genetic algorithm is complete and I am sure how it should be called by user.c
. As of now, the function runs only once for the first agent in the AGENT
array. Eventually, the function ga
will need to be looped within user.c
for each stake-holder (and called in manager.c
, not yet written, for the manager). I also still need to pass the parameter vector to ga
with values for the genetic algorithm which are currently hard coded into ga
.
After some additional debugging of the find_descending_order
in utilities.c
(which was returning the incorrect index and therefore not selecting for high fitness strategies), I have a working genetic algorithm with a very simple fitness function.
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
                      int ROWS, int COLS, double ***landscape,
                      double **resources, double **agent_array){
    int agent;
    for(agent = 0; agent < pop_size; agent++){
        fitnesses[agent] = population[0][12][agent];
    }
}
Essentially, the above function checks row zero and column 12 in an agent’s action array, and defines fitness as whatever value is in this array element. Fitness cannot increase indefinitely because of the cost constraints from the COST
array. Hence the genetic algorithm should increase fitness up to the point where it can’t any longer because it is constrained by costs. We can see this over 20 generations of the genetic algorithm (note, this is different than simulation time steps – each simulated time step of G-MSE includes, in this example, a genetic algorithm where strategies are updated over 20 generations). The plot below therefore represents an agent ‘’evolving’’ the best strategy for one G-MSE time step.
The ACTION
array for the zero agent (the only one run for a genetic algorithm in test simulations) showed a corresponding change in each simulated G-MSE time step, with agents having the actions below (or very similar actions).
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,]   -2    1    0    0    0    0    0    0    0     0     0     0    12
[2,]   -1    1    0    0    0    0    0    0    0     0     0     0     0
[3,]    1    1    0    0    0    0    0    0    0     0     0     0     0
[4,]    2    1    0    0    0    0    0    0    0     0     0     0     0
In the above, the agent’s only action is to invest all of their energy to doing the action in ACTION[1,13]
(bankem
), as predicted given the simple fitness function assigned a priori. Hence, with a working genetic algorithm for agents, what is necessary now is to clarify the fitness function to reflect agent utilities. Some clean-up is also necessary to call genetic algorithm specific parameters from the main gmse.R
file – right now there are some hard-coded values in the ga
function, and user.c
doesn’t loop through multiple agents (or check and use only stake-holders).
Part of the problem from last Friday was that the arrays fitnesses
and winners
were uninitialised in the genetic algorithm before being used. Fixing this and running Valgrind returns no errors and no memory leaks.
==32451==
==32451== HEAP SUMMARY:
==32451== in use at exit: 89,001,346 bytes in 13,024 blocks
==32451== total heap usage: 5,218,764 allocs, 5,205,740 frees, 621,820,827 bytes allocated
==32451==
==32451== LEAK SUMMARY:
==32451== definitely lost: 0 bytes in 0 blocks
==32451== indirectly lost: 0 bytes in 0 blocks
==32451== possibly lost: 0 bytes in 0 blocks
==32451== still reachable: 89,001,346 bytes in 13,024 blocks
==32451== suppressed: 0 bytes in 0 blocks
==32451== Reachable blocks (those to which a pointer was found) are not shown.
==32451== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==32451==
==32451== For counts of detected and suppressed errors, rerun with: -v
==32451== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
So we have an error that Valgrind can’t figure out for some reason. It’s worth noting that the crash never occurs on the first simulation; it always takes a couple of re-runs of gmse
in succession for it to crash. It could just be overloading Rstudio, but I want to keep pushing to figure it out. I’m now going to test by simply running 100 times in succession.
test <- NULL;
for(i in 1:100){
    test <- gmse( observe_type  = 0,
                  agent_view    = 20,
                  res_death_K   = 400,
                  plotting      = FALSE,
                  hunt          = FALSE,
                  start_hunting = 95,
                  fixed_observe = 1,
                  times_observe = 1,
                  land_dim_1    = 100,
                  land_dim_2    = 100,
                  res_consume   = 0.5
    )
    print(i);
}
Unfortunately, the above crashed in the first loop upon running. Then in the second attempt, it crashed on the 11th loop. When I stop running the genetic algorithm, it never crashes though, so I can at least start to isolate the problem. I have a feeling it’s in the utilities.c
file.
Except that now the crash occurs when the ga
is commented out. The issue actually appears to be somewhere else in user.c
because I can run the above for i in 1:1000
and not get an error if I don’t call user.c
from R at all. Now I need to try to really examine user.c
and see what’s happening. I’m going to start by not calling send_agents_home
or count_cell_yield
within the user function (while still calling the genetic algorithm) to see if a crash occurs.
The problem, as it turns out, was in the function send_agents_home
. After much hassle and multiple times running Valgrind, I found that initialising agent_xloc
or agent_yloc
to the agent’s array values would occasionally produce a segfault because the values were not always within the landscape. This was corrected by initialising these values to zero before ‘’sending agents home’’, but I’m not sure why it arose at all in the first place. Where the agents are has never been a focus, except for when manager agents (type1 = 0
) are observing (the code for which is stable). To solve the problem more flexibly, I’ve replaced a straight assignment with the below code.
agent_xloc = agent_array[agent][4];
agent_yloc = agent_array[agent][5];
if(agent_xloc < 0 || agent_xloc >= xdim){
    agent_xloc = 0;
}
if(agent_yloc < 0 || agent_yloc >= ydim){
    agent_yloc = 0;
}
Now in the very rare cases where agent locations are off the map (and it might be worth figuring out why – perhaps they’re getting moved somewhere arbitrarily and not moved back?), they will be placed on a cell that they own. This was the point of the function anyway, so it’s not a huge deal. It’s still a bit odd though, and I’m not sure why it was affecting only about one in thirty simulations. I’ll consider Issue #16: Potential bug: In user.c
closed now, and move on to the genetic algorithm again.
Placing tournament winners into a new array
At the end of the tournament function, we have a vector of winners with high fitness. These winners represent the array layers that need to comprise the new 3D array, which will be the start of the next generation of the genetic algorithm. Hence the need for a place_winners
function to make a new POPULATION
array to replace the old one. This could be done by individually replacing elements of a NEW_POPULATION
into the old array POPULATION
, but a handy swapping of pointers can do this without the multiple loops.
/* =============================================================================
 * Swap pointers to rewrite ARRAY_B into ARRAY_A for an array of any dimension
 * ========================================================================== */
void swap_arrays(void **ARRAY_A, void **ARRAY_B){
    void *TEMP_ARRAY;
    TEMP_ARRAY = *ARRAY_A;
    *ARRAY_A   = *ARRAY_B;
    *ARRAY_B   = TEMP_ARRAY;
}
The above function works for 2D and 3D arrays by running the below.
swap_arrays((void*)&MAT1, (void*)&MAT2);
We can see the arrays swapped in the output (the first 3 columns before the “|” partition denotes layer 1, and after denotes layer 2, so the array is \(3 \times 3 \times 2\) dimensions).
=========================================
---------------- Pre-swap MAT 1 ------------
0 0 1 | 6 5 0
2 8 6 | 9 2 4
1 2 1 | 9 2 5
---------------- Pre-swap MAT 2 ------------
1 4 3 | 8 8 6
1 5 8 | 3 2 4
3 9 2 | 8 3 8
---------------- Post-swap MAT 1 ------------
1 4 3 | 8 8 6
1 5 8 | 3 2 4
3 9 2 | 8 3 8
---------------- Post-swap MAT 2 ------------
0 0 1 | 6 5 0
2 8 6 | 9 2 4
1 2 1 | 9 2 5
Since this works, we can use swap_arrays
to write a concise function for placing the new individuals.
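One caveat worth noting for later: passing the address of a double *** through a void ** works on common platforms, but as far as I understand the C standard it is not strictly guaranteed, because void ** is not a generic pointer-to-pointer type. A typed variant for the 3D case is sketched below in case a stricter compiler ever complains; the name `swap_3d` is hypothetical and this is not a function in utilities.c.

```c
/* Typed alternative to swap_arrays for a 3D array of doubles. Only the two
 * top-level pointers are exchanged; no array elements are copied. */
void swap_3d(double ****array_a, double ****array_b){
    double ***temp;
    temp     = *array_a;
    *array_a = *array_b;
    *array_b = temp;
}
```

It would be called as swap_3d(&MAT1, &MAT2) with no casts needed, at the cost of needing one such function per array dimension.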
Potential bug: In user.c
I can’t tell if I’m just overloading R by running the simulation too many times too quickly (clicking too fast), or if there’s actually a bug here. But when I comment out the below lines of code in the send_agents_home
function of user.c
, things seem fine.
while(agent_ID != landowner){
    do{
        agent_xloc = (int) floor( runif(0, xdim) );
    }while(agent_xloc == xdim);
    do{
        agent_yloc = (int) floor( runif(0, ydim) );
    }while(agent_yloc == ydim);
    landowner = (int) landscape[agent_xloc][agent_yloc][layer];
}
When I re-run the code quickly in succession, the above (I think) will very rarely crash the G-MSE program. I can’t figure out why yet. It’s logged as an issue now. Valgrind report below.
==15500== Invalid read of size 8
==15500== at 0xC298756: is_number_on_landscape (user.c:19)
==15500== by 0xC298811: send_agents_home (user.c:50)
==15500== by 0xC299166: user (user.c:303)
Valgrind doesn’t appear to like the comparing of a landscape value (double
) with an int
, so I’m going to change this now. So the function is_number_on_landscape
now defines land_num = (int) landscape[xval][yval][layer];
instead of calling the landscape value directly. I have also gotten rid of the sub-function is_number_on_landscape
, but the crash still sometimes happens. It’s possible that this was actually two bugs though, one affecting the ga
. From Valgrind below now (invalid read is gone).
==16758== Conditional jump or move depends on uninitialised value(s)
==16758== at 0xC29819E: sort_vector_by (utilities.c:63)
==16758== by 0xC29A1E1: tournament (game.c:280)
==16758== by 0xC29A66D: ga (game.c:415)
==16758== by 0xC29914F: user (user.c:294)
==16758== by 0x4F0A57F: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F4272E: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F43DDC: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F422FC: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F45FB5: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== Uninitialised value was created by a heap allocation
==16758== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16758== by 0xC29A583: ga (game.c:390)
==16758== by 0xC29914F: user (user.c:294)
==16758== by 0x4F0A57F: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F4272E: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F43DDC: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F422FC: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F45FB5: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==16758==
This all goes back to the sort_vector_by
, which I should probably look at and potentially rewrite. The sort function is called by the tournament
function.
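A rough idea of what a rewrite could look like, assuming all that the tournament needs is the indices of individuals ordered from highest to lowest fitness. The function name `order_by_fitness` is hypothetical; the point of the sketch is that every element of the index vector is written before it is ever compared, which is the kind of uninitialised read that Valgrind flagged above.

```c
/* Fill 'order' with the indices 0 to n-1, sorted so that fitnesses[order[0]]
 * is the highest fitness. The fitnesses vector itself is left unchanged. */
void order_by_fitness(const double *fitnesses, int *order, int n){
    int i, j, idx;
    for(i = 0; i < n; i++){
        order[i] = i; /* Initialise every index before any comparison */
    }
    for(i = 1; i < n; i++){ /* Insertion sort on the index vector */
        idx = order[i];
        j   = i - 1;
        while(j >= 0 && fitnesses[order[j]] < fitnesses[idx]){
            order[j + 1] = order[j];
            j--;
        }
        order[j + 1] = idx;
    }
}
```

Insertion sort is quadratic, but the vectors being sorted in a tournament are small, so simplicity seems worth more than speed here.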
Some progress on the genetic algorithm
Despite this, there has been progress on the genetic algorithm. Enough that I want to merge the local branch to dev and rev, but not master yet. The place_winners
function appears to work fine.
void place_winners(double ****population, int *winners, int pop_size, int ROWS,
                   int COLS){
    int i, row, col, winner;
    double a_value;
    double ***NEW_POP;
    /* Allocate a ROWS by COLS by pop_size array for the next generation */
    NEW_POP = malloc(ROWS * sizeof(double **));
    for(row = 0; row < ROWS; row++){
        NEW_POP[row] = malloc(COLS * sizeof(double *));
        for(col = 0; col < COLS; col++){
            NEW_POP[row][col] = malloc(pop_size * sizeof(double));
        }
    }
    /* Copy the layer of each tournament winner into the new array */
    for(i = 0; i < pop_size; i++){
        winner = winners[i];
        for(row = 0; row < ROWS; row++){
            for(col = 0; col < COLS; col++){
                a_value              = (*population)[row][col][winner];
                NEW_POP[row][col][i] = a_value;
            }
        }
    }
    /* After the swap, NEW_POP points to the old population, freed below */
    swap_arrays((void*)&(*population), (void*)&NEW_POP);
    for(row = 0; row < ROWS; row++){
        for(col = 0; col < COLS; col++){
            free(NEW_POP[row][col]);
        }
        free(NEW_POP[row]);
    }
    free(NEW_POP);
}
Once I get the bugs worked out of it, the genetic algorithm should start to work. Then a fitness function needs to be made that is more realistic. Fortunately, all of the bugs now appear to be isolated in the genetic algorithm, but I might need to keep testing to be sure.
Initialise new function to constrain costs in the genetic algorithm
A new function has been written to constrain costs in the genetic algorithm when they go over budget as a consequence of crossover and mutation.
/* =============================================================================
 * This function will ensure that the actions of individuals in the population
 * are within the cost budget after crossover and mutation has taken place
 * Necessary variable inputs include:
 *     population: array of the population that is made (malloc needed earlier)
 *     COST: A 3D array of costs of performing actions
 *     layer: The 'z' layer of the COST and ACTION arrays to be initialised
 *     pop_size: The size of the total population (layers to population)
 *     ROWS: Number of rows in the COST and ACTION arrays
 *     COLS: Number of columns in the COST and ACTION arrays
 *     budget: The budget that random agents have to work with
 * ========================================================================== */
void constrain_costs(double ***population, double ***COST, int layer,
                     int pop_size, int ROWS, int COLS, double budget){
    int xpos, ypos;
    int agent, row, col;
    double tot_cost, action_val, action_cost;
    for(agent = 0; agent < pop_size; agent++){
        tot_cost = 0;
        for(row = 0; row < ROWS; row++){
            for(col = 4; col < COLS; col++){
                action_val  = population[row][col][agent];
                action_cost = COST[row][col][layer];
                tot_cost   += (action_val * action_cost);
            }
        }
        while(tot_cost > budget){
            do{ /* This do assures xpos never equals ROWS (unlikely) */
                xpos = floor( runif(0,ROWS) );
            }while(xpos == ROWS);
            do{
                ypos = floor( runif(4,COLS) );
            }while(ypos == COLS);
            if(population[xpos][ypos][agent] > 0){
                population[xpos][ypos][agent]--;
                tot_cost -= COST[xpos][ypos][layer];
            }
        }
    }
}
The function has been tested, and works as intended. When the sum of the action elements of an individual multiplied by the cost of each action (tot_cost
in the above function) is higher than the allowable budget
, actions are randomly removed until the total cost is at or under budget. Note that lower-cost actions are not removed preferentially so as not to bias evolution toward low-cost actions.
Initial thoughts on the fitness function
Having now completed functions modelling crossover, mutation, and cost-constraints in C, there are two functions left in the genetic algorithm that are needed. The second is a tournament function modelling selection – this will be relatively easy to code once I have individual fitnesses in the population. The first is the fitness function, which will be very complex – so much so that I’m planning to write a very quick simplified version of the fitness function before expanding it out to deal with more difficult questions. What has to happen with the fitness function is that each simulated individual in the population has to use whatever information is available to an agent (e.g., manager observations, anecdotal surveys, past decisions of other agents, landscape status, etc.) to predict what the future status of the resources and landscape will be, then assign a fitness to that prediction. Utilities of each resource are in the (truncated) action and cost arrays, as below.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 1 |
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
Above we have the utilities of each resource type (type1
), but I’m just realising that the utilities of the landscape are absent. There isn’t really anything in the above table, for example, to say that a stake-holder assigns a utility to the value of a given landscape cell. But this needs to be the case if we want something like crop yield (perhaps I should more generally be calling it ‘’food security’’) to be modelled as part of the landscape. I think the best solution for this is to include the landscape in type1
as a negative integer. The landscape layer identifying crop yield is 1 in C (2 in R) – if I placed a new row of type1 = -1
in the COST
and ACTION
arrays for each agent, then the negative could simply indicate that we are looking at the LANDSCAPE
array instead of the RESOURCE
array. I also don’t think more than one layer of landscape will ever be used, so I’m not seeing a confusing mess of negative and positive types. The corresponding action columns (movem
, castem
, etc.) could have interpretations for landscape, some of them such as feedem
are obvious, while others could just be ignored because they don’t really apply. In the end the arrays would then look something like the below.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-1 | -1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 1 |
1 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
Maybe not the most elegant solution, but it keeps everything on a single array and the interpretations of types are fairly straightforward. I’ll implement this next as new array initialisations, then build a prototype fitness function that attempts to maximise crop yield through feedem
(not sure if this should actually be an action in the model).
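A minimal sketch of how a row might be interpreted under this scheme. The helper names `row_is_landscape` and `landscape_layer` are hypothetical, and mapping type1 = -1 onto landscape layer 1 (the C index for crop yield) by flipping the sign is my assumption about how the encoding would be read, not something yet written in the code.

```c
/* Returns 1 if a row of the COST or ACTION array refers to the landscape
 * (negative type1), and 0 if it refers to a resource type. */
int row_is_landscape(double type1){
    return type1 < 0;
}

/* Landscape layer (C index) encoded by a negative type1, or -1 if the row
 * refers to a resource instead. Assumed mapping: type1 = -1 gives layer 1. */
int landscape_layer(double type1){
    if(type1 >= 0){
        return -1;
    }
    return (int) (-type1);
}
```

If only one landscape layer is ever valued, the second helper is overkill, but it leaves room for other layers without changing the array layout.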
Manager summary missing
In working with the fitness function in the user model, I realised that the manager information was obviously missing, so this will have to be added in later (should be easy to do so). One reason for doing the user model first is because the manager model (particularly the genetic algorithm) is going to get much more complicated. Nevertheless, the manager model’s use of the genetic algorithm necessitates that the genetic algorithm be able to use both the OBSERVATION
array and the manager’s OBS_SUMMARY
of the array. Different users will have access to different information, but I’m starting small to make sure everything is built clearly.
/* =============================================================================
* This is a preliminary function that checks the fitness of each agent -- as of
* now, fitness is just defined by how much action is placed into bankem (last
* column). Things will get much more complex in a bit, but there needs to be
* some sort of framework in place to first check to see that everything else is
* working so that I can isolate the fitness function's effect later.
* fitnesses: Array to order fitnesses of the agents in the population
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* landscape: The landscape array
* resources: The resource array
* agent_array: The agent array
* ========================================================================== */
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, int COLS, double ***landscape,
double **resources, double **agent_array){
int agent;
for(agent = 0; agent < pop_size; agent++){
fitnesses[agent] += population[0][12][agent];
}
}
The above function therefore simply returns the last column (bankem
) as the individual’s fitness. I’m now going to maximise this using a tournament approach to fitness, as suggested by Hamblin (2013).
Functioning tournament function
After some toiling with swaps and pointers, I’ve managed to come up with a somewhat concise and clear function that randomly samples sampleK
individuals from the population and selects the chooseK
individuals with highest fitness.
/* =============================================================================
* This function takes an array of fitnesses and returns an equal size array of
* indices, the values of which will define which new individuals will make it
* into the next population array, and in what proportions.
* fitnesses: Array to order fitnesses of the agents in the population
* winners: Array of the winners of the tournament
* pop_size: The size of the total population (layers to population)
* sampleK: The size of the subset of fitnesses sampled to compete
* chooseK: The number of individuals selected from the sample
* ========================================================================== */
void tournament(double *fitnesses, int *winners, int pop_size,
int sampleK, int chooseK){
int samp;
int *samples;
int left_to_place, placed;
int rand_samp;
double *samp_fit;
samples = malloc(sampleK * sizeof(int));
samp_fit = malloc(sampleK * sizeof(double));
placed = 0;
while(placed < pop_size){ /* Note sampling is done with replacement */
for(samp = 0; samp < sampleK; samp++){
do{ /* Resample first so fitnesses[pop_size] is never read */
rand_samp = floor( runif(0, pop_size) );
}while(rand_samp == pop_size);
samples[samp] = rand_samp;
samp_fit[samp] = fitnesses[rand_samp];
}
sort_vector_by(samples, samp_fit, sampleK);
if( (chooseK + placed) >= pop_size){
chooseK = pop_size - placed;
}
samp = 0;
while(samp < chooseK && placed < pop_size){
winners[placed] = samples[samp];
placed++;
samp++;
}
}
free(samp_fit);
free(samples);
}
Note that in writing the above, I had to write a simple sort (sort_vector_by
) and swap function in utilities.c
. I also need to write some error messages into the above (or in ga
itself); chooseK
cannot be larger than sampleK
. Next up will be to iterate the ga
functions and make sure that fitnesses asymptote to high fitness. The framework for the genetic algorithm will then be in place, and it will be time to switch to the complex part of more interesting fitness functions.
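The `sort_vector_by` helper and the argument check mentioned above are not shown in these notes. The following is only a sketch of how they might look, not the code in utilities.c. It assumes that `sort_vector_by` orders the sampled indices from highest to lowest fitness, which is what `tournament` needs in order to take the first `chooseK` entries. The names `swap_int`, `swap_double`, and `tournament_args_ok` are hypothetical.

```c
#include <stdio.h>

/* Swap two ints (hypothetical helper, not the swap in utilities.c) */
void swap_int(int *a, int *b){
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Swap two doubles (hypothetical helper) */
void swap_double(double *a, double *b){
    double tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Selection sort of 'by' from highest to lowest, applying the same swaps to
 * 'vec' so that vec[i] stays paired with by[i]. ASSUMPTION: descending order
 * is what the tournament function expects. */
void sort_vector_by(int *vec, double *by, int length){
    int i, j, best;
    for(i = 0; i < length - 1; i++){
        best = i;
        for(j = i + 1; j < length; j++){
            if(by[j] > by[best]){
                best = j;
            }
        }
        if(best != i){
            swap_double(&by[i], &by[best]);
            swap_int(&vec[i], &vec[best]);
        }
    }
}

/* Returns 1 if the tournament arguments are usable, 0 otherwise */
int tournament_args_ok(int pop_size, int sampleK, int chooseK){
    if(pop_size < 1 || sampleK < 1 || chooseK < 1){
        fprintf(stderr, "pop_size, sampleK, and chooseK must be positive\n");
        return 0;
    }
    if(chooseK > sampleK){
        fprintf(stderr, "chooseK cannot be larger than sampleK\n");
        return 0;
    }
    return 1;
}
```

A selection sort is used because `sampleK` is expected to be small; calling `tournament_args_ok` at the top of `ga` would keep the check out of the sampling loop.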
Initialisation of action populations
A new function has been written to initialise a population of agents, duplicated from a single agent in the larger G-MSE model and to be used for the genetic algorithm. Initial testing of this function shows that it returns appropriate arrays, in which actions are selected appropriately based on their cost values in the COST
array.
/* =============================================================================
* This function will initialise a population from the ACTION and COST arrays, a
* particular focal agent, and specification of how many times an agent should
* be exactly replicated versus how many times random values should be used.
* Necessary variable inputs include:
* ACTION: A 3D array of action values
* COST: A 3D array of costs of performing actions
* layer: The 'z' layer of the COST and ACTION arrays to be initialised
* pop_size: The size of the total population (layers to population)
* carbon_copies: The number of identical agents used as seeds
* budget: The budget that random agents have to work with
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* population: array of the population that is made (malloc needed earlier)
* ========================================================================== */
void initialise_pop(double ***ACTION, double ***COST, int layer, int pop_size,
int budget, int carbon_copies, int ROWS, int COLS,
double ***population){
int xpos, ypos;
int agent;
int row, col;
double lowest_cost;
double budget_count;
double check_cost;
/* First read in pop_size copies of the ACTION layer of interest */
for(agent = 0; agent < pop_size; agent++){
for(row = 0; row < ROWS; row++){
population[row][0][agent] = ACTION[row][0][layer];
population[row][1][agent] = ACTION[row][1][layer];
population[row][2][agent] = ACTION[row][2][layer];
population[row][3][agent] = ACTION[row][3][layer];
if(agent < carbon_copies){
for(col = 4; col < COLS; col++){
population[row][col][agent] = ACTION[row][col][layer];
}
}else{
for(col = 4; col < COLS; col++){
population[row][col][agent] = 0;
}
}
}
lowest_cost = min_cost(COST, layer, budget, ROWS, COLS);
budget_count = budget;
if(lowest_cost <= 0){
printf("Lowest cost is too low (must be positive) \n");
break;
}
while(budget_count > lowest_cost){
do{
do{ /* This do assures xpos never equals ROWS (unlikely) */
xpos = floor( runif(0,ROWS) );
}while(xpos == ROWS);
do{
ypos = floor( runif(4,COLS) );
}while(ypos == COLS);
}while(COST[xpos][ypos][layer] > budget_count);
population[xpos][ypos][agent]++;
budget_count -= COST[xpos][ypos][layer];
} /* Should now make random actions allowed by budget */
}
}
The above function calls the min_cost
function, which simply examines the COST
array to find the lowest cost action. It keeps filling up actions in the ACTION
array until the remaining budget no longer exceeds the lowest cost.
/* =============================================================================
* This function will find the minimum cost of an action in the COST array
* for a particular agent (layer). Inputs include:
* COST: A full 3D COST array
* layer: The layer on which the minimum is going to be found
* budget: The total budget that the agent has to work with (initialises)
* rows: The total number of rows in the COST array
* cols: The total number of cols in the COST array
* ========================================================================== */
double min_cost(double ***COST, int layer, double budget, int rows, int cols){
int i, j;
double the_min;
the_min = budget;
for(i = 0; i < rows; i++){
for(j = 4; j < cols; j++){ /* Columns 0-3 hold IDs and types, not costs */
if(COST[i][j][layer] < the_min){
the_min = COST[i][j][layer];
}
}
}
return the_min;
}
We now have a functioning way to initialise a population of agents that will later go through a genetic algorithm to select the best actions. In working through this, I’ve seen that an earlier idea of mine (not sure if I wrote this down below) might be useful – have a column in both COST
and ACTION
that is simply bankem
– essentially stashing costs in a way that doesn’t do anything. This might be important for situations in which an agent actually benefits by doing nothing, or when we want some general way to consider the benefits of stake-holder actions that affect utility but have no effect on resources or other stake-holders (e.g., holiday time).
Add new bankem
action on COST
and ACTION
arrays
I have added a new action bankem
onto the COST
and ACTION
arrays, which was not too difficult at all in practice. I envision this category of actions as (probably) always having a cost equal to one. Essentially, it’s a way to shift unspent costs to a category, which might or might not affect the agent’s overall utility.
Initialise a new crossover function
I have written a crossover function that, for each individual in the population, assigns a crossover partner (e.g., as would occur in sexual reproduction). With the partner assigned, the function then swaps ACTION
array elements with some fixed probability (uniform crossover method). I don’t see any reason to consider multiple types of crossover at this point, so I believe this method will be sufficient.
/* =============================================================================
* This function will use the initialised population from initialise_pop to make
* the population array undergo crossing over at random locations for
* individuals in the population. Note that we'll later keep things in budget
* Necessary variable inputs include:
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* pr: Probability of a crossover site occurring at an element.
* ========================================================================== */
void crossover(double ***population, int pop_size, int ROWS, int COLS,
double pr){
int agent, row, col;
int cross_partner;
double do_cross;
double agent_val, partner_val;
for(agent = 0; agent < pop_size; agent++){
do{
cross_partner = floor( runif(0, pop_size) );
}while(cross_partner == agent || cross_partner == pop_size);
for(row = 0; row < ROWS; row++){
for(col = 4; col < COLS; col++){
do_cross = runif(0,1);
if(do_cross < pr){
agent_val = population[row][col][agent];
partner_val = population[row][col][cross_partner];
population[row][col][agent] = partner_val;
population[row][col][cross_partner] = agent_val;
}
}
}
}
}
Originally, I was going to use a swap function to swap agent and partner values. The swap function is still in the utilities.c
file, but I think the above code is more readable.
I think it will make more sense to deal with the budget after mutation. That is, as a result of crossover and mutation, some individuals might go over budget on their actions. I think randomly removing actions in the event of being over budget is best done after mutation to prevent redundancy; this was a constrain_cost
command originally written in R, so I can use this as a template.
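A minimal sketch of what such a constraint step could look like in C is below. This is not the original R `constrain_cost`; it is an illustration under several assumptions: it works on one agent's layer stored as flat row-major vectors (rather than the `population[row][col][agent]` layout), actions are whole numbers that are never negative, costs are never negative, and `rand()` stands in for R's `runif()`. The name `layer_cost` is hypothetical.

```c
#include <stdlib.h>

/* Total cost of all actions in one agent's layer. Columns 0-3 are skipped
 * because they hold IDs and types. Elements with no action are skipped so
 * that 0 * Inf never produces NaN. */
double layer_cost(const double *action, const double *cost,
                  int ROWS, int COLS){
    int row, col, pos;
    double total = 0.0;
    for(row = 0; row < ROWS; row++){
        for(col = 4; col < COLS; col++){
            pos = row * COLS + col;
            if(action[pos] > 0){
                total += action[pos] * cost[pos];
            }
        }
    }
    return total;
}

/* Randomly remove single actions until the layer is at or under budget.
 * ASSUMES whole-number actions and non-negative costs, so that whenever the
 * layer is over budget there is at least one action that can be removed. */
void constrain_cost(double *action, const double *cost, double budget,
                    int ROWS, int COLS){
    int row, col, pos;
    if(ROWS < 1 || COLS <= 4 || budget < 0){
        return; /* Nothing sensible to do with these inputs */
    }
    while(layer_cost(action, cost, ROWS, COLS) > budget){
        do{ /* rand() % n is slightly biased, which is fine for a sketch */
            row = rand() % ROWS;
            col = 4 + rand() % (COLS - 4);
            pos = row * COLS + col;
        }while(action[pos] < 1);
        action[pos]--;
    }
}
```

Recomputing the full layer cost on every pass is wasteful but keeps the sketch short; a running total would be the obvious refinement.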
Mutation function created
I have written a function to cause random mutations in the population array during the genetic algorithm.
/* =============================================================================
* This function will use the initialised population from initialise_pop to make
* the population array undergo mutations at random elements in their array
* Necessary variable inputs include:
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* pr: Probability of a mutation occurring at an element.
* ========================================================================== */
void mutation(double ***population, int pop_size, int ROWS, int COLS,
double pr){
int agent, row, col;
double do_mutation;
double agent_val;
double half_pr;
half_pr = 0.5 * pr;
/* Mutate each action element up or down by one with total probability pr */
for(agent = 0; agent < pop_size; agent++){
for(row = 0; row < ROWS; row++){
for(col = 4; col < COLS; col++){
do_mutation = runif(0,1);
if( do_mutation < half_pr ){
population[row][col][agent]--;
}
if( do_mutation > (1 - half_pr) ){
population[row][col][agent]++;
}
if( population[row][col][agent] < 0 ){
population[row][col][agent] *= -1;
} /* Change sign if mutates to a negative value */
}
}
}
}
I might or might not want to tweak this later on because I’m not sure if the type of mutation is aggressive enough to search for adaptive strategies. This issue will be greatly mitigated by the seeding of random action arrays and crossover, but I might want to come back to allow mutation to a wider range of numbers later. For now, there is simply a probability of a mutation occurring at each element, then, if a mutation occurs, the action value will either increase by one or decrease by one (if the original value was zero, it will increase to one). It’s tempting to allow for bigger jumps, but if they are too big then they will regularly go over budget and hence cause the whole array to reshuffle again (essentially creating a random array and removing a potential opportunity for increased fitness).
The next function that needs to be written is one that constrains the costs to be at or under budget after crossover and mutation, then a fitness function is needed (which will probably require several sub-functions to keep the code readable).
Separate ACTION
and COST
arrays
I’ve now separated the arrays that hold an agent’s actions and the agent’s costs (from a total budget) for performing those actions. The indices of these arrays will match at all times, such that COST[i][j][k]
will be the cost of an agent k
performing ACTION[i][j][k]
. Each agent will therefore have its own 2D layer that will include rows of other agents and columns of utilities and actions. This adds an extra array to a considerable number of things that we already need to keep track of, but I think it is less confusing than what I was doing before, and in the end separating costs from actions will be worth it. Ideally, all of this would just be some special struct
in C, but, as mentioned yesterday, this won’t work because R and C need to work together seamlessly.
This is much more comprehensible in another respect; the genetic algorithm only needs to deal with the ACTION
array, using the COST
array as a reference. The readability of the code alone will probably make this worthwhile. As another bonus, while re-writing the code, it has become obvious that it is unnecessary to mutate, crossover, etc., only a select few rows; in the ACTION
array, they are all fair game as determined by COST
(columns 0-3 cannot be changed, but this is easy to remember).
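Because COST and ACTION must always share dimensions, one way to guarantee this is to build both with the same allocation helper. The sketch below is an assumption about how that could be done, not the allocation code used in the package; `alloc_3d` and `free_3d` are hypothetical names, and the arrays are indexed `[row][col][layer]` to match the functions in these notes.

```c
#include <stdlib.h>

/* Free an array made by alloc_3d. Safe to call on a partly built array
 * because unallocated pointers are NULL and free(NULL) does nothing. */
void free_3d(double ***arr, int rows, int cols){
    int row, col;
    if(arr == NULL){
        return;
    }
    for(row = 0; row < rows; row++){
        if(arr[row] != NULL){
            for(col = 0; col < cols; col++){
                free(arr[row][col]);
            }
            free(arr[row]);
        }
    }
    free(arr);
}

/* Allocate a rows x cols x layers array indexed as arr[row][col][layer].
 * calloc zero-fills the memory, which reads as 0.0 (and as NULL pointers)
 * on the usual platforms. Returns NULL if any allocation fails. */
double ***alloc_3d(int rows, int cols, int layers){
    int row, col;
    double ***arr = calloc(rows, sizeof(double **));
    if(arr == NULL){
        return NULL;
    }
    for(row = 0; row < rows; row++){
        arr[row] = calloc(cols, sizeof(double *));
        if(arr[row] == NULL){
            free_3d(arr, rows, cols);
            return NULL;
        }
        for(col = 0; col < cols; col++){
            arr[row][col] = calloc(layers, sizeof(double));
            if(arr[row][col] == NULL){
                free_3d(arr, rows, cols);
                return NULL;
            }
        }
    }
    return arr;
}
```

With both arrays built this way, `ACTION[i][j][k] * COST[i][j][k]` is the amount agent `k` spends on that element, and the genetic algorithm only ever writes to `ACTION`.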
Working call to game.c
, but bad action return
There is now a working game.c
file that user.c
functions call, with proper header files to link. For some reason, the action arrays returned right now are incorrect, so this is the next thing that needs to be done. In general, I think it will be a good idea to make sure that calls from gmse.R
are maintained without crash.
Begin working on the genetic algorithm
I have now initialised the file game.c
, which will hold everything related to the genetic algorithm, including multiple functions for running each individual process. The file will include a high-level function that brings in five arrays.
UTILITY
array. The whole thing will need to be read in because agents need to have the option to affect one another’s arrays (e.g., the potential to affect the cost of each other’s actions). I’ll need to be careful, eventually, regarding the order of agent actions to make sure that the order in which stake-holders are put through the genetic algorithm doesn’t affect resulting agent strategies (or, if this is inevitable, then stake-holder order should be randomised).
AGENTS
array will be necessary for agents to look up one another’s (and their own) locations, yield, etc.
RESOURCES
array will be needed for agents to look up how many resources there are of each type, where they are located, and what consequences of these agents might be expected.
para
array of parameter values will be needed for any specifications of the genetic algorithm (e.g., mutation and crossover rate) we might want to implement from R.
LANDSCAPE
array needs to be read in to identify both the owners of cells and the yield from cells, and anything else that might be of interest.
A couple of other challenges that I need to keep in mind (but do not want to implement yet).
parameters
, for now) will need to be included. Or, at least, the histories back to some arbitrary point in time. The reason for this is that we’ll eventually want agents to be able to look back on past decisions and adjust their behaviours to maximise their own utilities. This will get nasty, and I think the best thing to do might be to read in histories as separate arrays (e.g., have a UTILITY
and a UTILITY_REC
), or at least immediately separate them after reading them into the ga
function. Nevertheless, doing so will be a challenge, in the case of UTILITY
requiring a 4D array that agents will search through. I will build the framework of the genetic algorithm with this in mind, making it flexible enough to expand into histories. This needs to be done in C, else it will be extremely slow, and it might take some time even with good coding in C.
ga
function to be able to call from R and C. This isn’t actually difficult, but worth mentioning because I think it will be helpful for users of the G-MSE R package. Really, the ga
function will be called by default within user.c
and manager.c
, being linked to each in compiling – keeping the genetic algorithm code in its own file seems like a good idea.
I think it will be best to force ga
to specify a single agent whose fitness will be maximised (as this agent will need to be replicated 100ish times for the evolution of a single agent to be simulated). If nothing else, this will make the code easier to follow. Hence, the main functions of both manager.c
and user.c
will call ga
(linked with the game header file #include "game.c"
), reading in all of the five arrays above and specifying for which agent it is running the genetic algorithm. In manager.c
, for example, only type1 = 0
agents will be run, while these agents will be excluded in user.c
.
Progress while coding the initialisation of a population
I think it makes sense to keep these functions general and very explicit about what can and cannot be tweaked. For example, given a 2D array, I am using x0
, x1
, y0
and y1
as indices that determine where to start and stop in terms of changing things. For example, this function that will be called from the initialise_pop
function specifies the bounds within which to search the UTILITY
layer for the lowest possible cost (needed for later).
/* =============================================================================
* This function will find the minimum cost of an action in the UTILITY array
* for a particular agent (layer). Inputs include:
* UTILITY: A full 3D utility array
* layer: The layer on which the minimum is going to be found
* budget: The total budget that the agent has to work with (initialises)
* ========================================================================== */
double min_cost(double ***UTILITY, int layer, double budget, int x0, int x1,
int y0, int y1){
int i, j;
double the_min;
the_min = budget;
for(i = x0; i < x1; i++){
for(j = y0; j < y1; j++){
if(UTILITY[i][j][layer] < the_min){
the_min = UTILITY[i][j][layer];
}
}
}
return the_min;
}
This requires more input, but I think it’s also clearer what is meant to happen. The above function compiles without error.
Change to the UTILITY array
Having started coding in C, I’ve decided that it will be much easier to code if I switch what is represented in the first four rows of a layer of the UTILITY
array. Now, the first two rows in which agent = -2
will be the focal agent’s cost, while rows 3 and 4 will be the focal agent’s actions. This will make it easier to code for the manager’s actions later.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 2 | 3 | 3 |
-2 | 2 | 0 | 0 | 0 | 1 | 0 | 5 | 20 | 12 | 5 | 10 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
1 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
3 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
3 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
The reason this is easier is because now I can just randomise elements in the genetic algorithm below some value of rows. Agents should never be able to change their own costs, but can always change their own actions (agent = -1
), and potentially the actions of other agents (agent
> -1). Fortunately, this doesn’t require any extra coding of the initialisation of UTILITY
– I just need to note that I’m doing it this way from now on.
Scrap the above idea completely
It was better the way it was – I confused myself with the 3 dimensions. The only actions on resources are in the focal agent’s first two rows. The second two rows will always be the costs of the focal agent for performing the first two rows of actions, and every other row is a cost associated with adjusting the cost of each other agent – but the actual change that is made where these costs are not infinite (i.e., for the managers) will be made in other layers of the UTILITY array.
Here’s how it will work: Agents can do things to resources movem
, castem
, killem
, feedem
, helpem
at a cost. What they do is specified by the first two rows of their UTILITY
layer (agent = -2
). The cost of doing each of these is specified in the second two rows (agent = -1
). They can also potentially change the cost of other agents doing things to resources; this is determined by other remaining rows. But the tricky bit is that their actions need to take effect in the other layers of UTILITY
. Hence, we need to somehow hold the actions as they apply to UTILITY
without affecting the UTILITY
array itself throughout the process of the genetic algorithm (if we start changing UTILITY
, then we need some way to test changes with respect to agent fitness and then put the array back as it was – actions therefore need to be recorded).
I didn’t want to do this, but I think it might actually be necessary to have two arrays instead of one UTILITY
array. These two arrays would include:
COST
array, which would be a 3D array (layers are agents) that identifies the cost of each agent changing something that affects agent actions.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 2 | 3 | 3 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 5 | 20 | 12 | 5 | 10 |
1 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
1 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
3 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
3 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
The agent = -1
here would just be the direct cost of the focal agent in the layer affecting resources.
ACTION
array, which would be a 3D array of dimensions identical to that of COST
that would determine what an agent actually does.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 |
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
The benefit here is that the elements would line up completely so that it would be easy to keep track of actions and costs, and the ACTION
array would be all that needs to be tweaked for the genetic algorithm.
It would be nice to specify a new struct
in C for all of this, but that wouldn’t change the fact that everything needs to be read in and out seamlessly with R, so I don’t think that this is possible.
Regrouping and finding a way forward on the utility functions
I am reviewing my old thoughts on getting the genetic algorithm to work and getting agents to do something to maximise their own utilities. The first thing to do is to initialise a UTILITY
array. I don’t see any way around this – what is needed is a three dimensional array where each layer in dimension z
is an agent. A single agent’s utility and decision-making process is therefore represented in a matrix like the one below.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-2 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 2 | 3 | 3 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 5 | 20 | 12 | 5 | 10 |
1 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
1 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
3 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
3 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Each agent will need to have a total cost budget, which will be specified in the AGENT
array in its own column. In the UTILITY ARRAY
above, the rows where agent = -2
(column 1) identify the actions of an agent – these are the things that an agent can do to resources. In the above example, the agent is not doing anything to resources (all are zeros). The rows where agent = -1
indicate the costs of doing things that affect the resources (i.e., the actions in the rows where agent = -2
). The agent represented by this z
layer of the 3D array can therefore spend from their total budget where agent = -1
to add actions where agent = -2
, which in turn affects resources in one way or another. All of the remaining rows (agent = 1
to agent = 3
) define actions that would affect the costs of other agents. Essentially, values (all currently Inf
) represent the cost of changing another agent’s cost by 1. So if we imagine a manager that wants to change the cost of movem
for a stake-holder from 5 to 10, and their cost value in the table is 0.5, then it will cost 2.5 from their budget (a change of 5 at 0.5 per unit) to make this increase; a decrease of the same size would cost the same. Note that there is also the opportunity for stake-holders to directly affect the utilities of other stake-holders – for a cost. I’m not going to play around with these options yet because it will get very complicated. Instead, I will now write a function for initialising this array in R. Once the simple case of a genetic algorithm for affecting resources based on utilities and budgets is up and running, then I will start doing more complex things like having stake-holders affect one another’s utilities and costs.
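The arithmetic in the manager example can be written out as a short C sketch. The function names `cost_of_change` and `can_afford_change` are hypothetical and nothing like them exists in the model yet; the only assumptions are that the price scales linearly with the size of the change and that an `Inf` entry means the change is not available to that agent.

```c
#include <float.h>

/* Budget needed to move another agent's cost from old_cost to new_cost,
 * where unit_cost is the price of changing that cost by 1. Increases and
 * decreases are charged alike. */
double cost_of_change(double old_cost, double new_cost, double unit_cost){
    double diff = new_cost - old_cost;
    if(diff < 0){
        diff = -diff;
    }
    return diff * unit_cost;
}

/* Returns 1 if the change fits within budget, 0 otherwise. An infinite
 * unit_cost (the Inf entries in the table) is never affordable. */
int can_afford_change(double old_cost, double new_cost, double unit_cost,
                      double budget){
    if(unit_cost > DBL_MAX){ /* True only for +Inf */
        return 0;
    }
    return cost_of_change(old_cost, new_cost, unit_cost) <= budget;
}
```

For the example in the text, `cost_of_change(5, 10, 0.5)` gives 2.5. Checking for `Inf` before multiplying avoids the `0 * Inf = NaN` case when no change is requested.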
Note that column 1 refers to the agent ID, not the agent type. Hence, agent = 1
will be a manager, not a stake-holder. It’s possible that there could be other managers too, but the status of an agent can be accessed with the AGENT array.
Initial function making utility array
A function below returns all of the necessary information for the table above, but with random numbers placed for all columns after type3
.
make_utilities <- function(AGENTS, RESOURCES){
UTILITY <- NULL;
agent_IDs <- c(-2, -1, unique(AGENTS[,1]) );
agent_number <- length(agent_IDs);
res_types <- unique(RESOURCES[,2:4]);
unique_types <- dim(res_types)[1];
types_data <- lapply(X = 1:agent_number,
FUN = function(quick_rep_list) res_types);
column_1 <- sort( rep(x = agent_IDs, times = unique_types) );
columns_2_4 <- do.call(what = rbind, args = types_data);
static_types <- cbind(column_1, columns_2_4);
dynamic_types <- matrix(data = 0, nrow = dim(static_types)[1], ncol = 8);
dynamic_vals <- sample(x = 1:10, size = length(dynamic_types),
replace = TRUE);
dynamic_types <- matrix(data = dynamic_vals, nrow = dim(static_types)[1],
ncol = 8);
colnames(static_types) <- c("agent", "type1", "type2", "type3");
colnames(dynamic_types) <- c("util", "u_loc", "u_land", "movem", "castem",
"killem", "feedem", "helpem");
UTILITY <- cbind(static_types, dynamic_types);
return( UTILITY );
}
I’m not sure of the best way to add the currently random numbers in a function, except that these values might need to be put into the array by the user, who will want to specify which agents care about which resources and how much it will cost to change things. Better, the user could perhaps, eventually, just specify the utilities of each stake-holder with each type (this is less to input). Then, once the genetic algorithm for the manager is up and running, all of the costs will be initialised by the manager, somehow – with default costs for the manager to affect stake-holder costs. This scheme would minimise user input and have the costs arise organically from the model and management system, while the utilities would be specified by the user. For now though, I’ll have to input the cost values by hand.
Function tweak to make 3D array
The previous function wasn’t quite right because it only made one layer of the 3D UTILITY
array. Really, each layer needs to be replicated for each agent, as below.
#' Utility initialisation
#'
#' Function to initialise the utilities of the G-MSE model
#'
#'@param AGENTS The agent array
#'@param RESOURCES The resource array
#'@export
make_utilities <- function(AGENTS, RESOURCES){
agent_IDs <- c(-2, -1, unique(AGENTS[,1]) );
agent_number <- length(agent_IDs);
res_types <- unique(RESOURCES[,2:4]);
UTIL_LIST <- NULL;
agent <- 1;
agents <- agent_number - 2;
while(agent <= agents){
UTIL_LIST[[agent]] <- utility_layer(agent_IDs, agent_number, res_types);
agent <- agent + 1;
}
dim_u <- c( dim(UTIL_LIST[[1]]), length(UTIL_LIST) );
UTILITY <- array(data = unlist(UTIL_LIST), dim = dim_u);
return( UTILITY );
}
#' Utility layer for initialisation
#'
#' Function to initialise a layer of the UTILITY array of the G-MSE model
#'
#'@param agent_IDs Vector of agent IDs to use (including -1 and -2)
#'@param agent_number The number of agents to use (length of agent_IDs)
#'@param res_types The unique resource types (unique rows of cols 2-4 of RESOURCES)
#'@export
utility_layer <- function(agent_IDs, agent_number, res_types){
LAYER <- NULL;
unique_types <- dim(res_types)[1];
types_data <- lapply(X = 1:agent_number,
FUN = function(quick_rep_list) res_types);
column_1 <- sort( rep(x = agent_IDs, times = unique_types) );
columns_2_4 <- do.call(what = rbind, args = types_data);
static_types <- cbind(column_1, columns_2_4);
dynamic_types <- matrix(data = 0, nrow = dim(static_types)[1], ncol = 8);
dynamic_vals <- sample(x = 1:10, size = length(dynamic_types),
replace = TRUE); # TODO: Change me?
dynamic_types <- matrix(data = dynamic_vals, nrow = dim(static_types)[1],
ncol = 8);
colnames(static_types) <- c("agent", "type1", "type2", "type3");
colnames(dynamic_types) <- c("util", "u_loc", "u_land", "movem", "castem",
"killem", "feedem", "helpem");
LAYER <- cbind(static_types, dynamic_types);
return( LAYER );
}
So when there are two agents, the make_utilities
function returns a 3D array of 4 rows, 12 columns, and 2 layers.
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] -2 1 0 0 9 2 1
[2,] -1 1 0 0 7 3 1
[3,] 1 1 0 0 9 8 4
[4,] 2 1 0 0 8 5 1
[,8] [,9] [,10] [,11] [,12]
[1,] 8 8 8 3 9
[2,] 2 7 10 2 1
[3,] 5 10 6 3 8
[4,] 2 6 6 1 5
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] -2 1 0 0 1 8 9
[2,] -1 1 0 0 3 7 2
[3,] 1 1 0 0 6 2 4
[4,] 2 1 0 0 4 3 7
[,8] [,9] [,10] [,11] [,12]
[1,] 9 5 3 9 10
[2,] 6 5 7 5 7
[3,] 1 2 2 2 9
[4,] 4 8 9 7 10
I’ll record changes in the UTILITY array over time to track social changes and game strategy. For now, the next goal is to write a genetic algorithm that will work on the UTILITY array (with input from the AGENT, LANDSCAPE, and RESOURCE arrays) to optimise stake-holder actions. The simplest case will be maximising crop yield.
Plans for the genetic algorithm, short and long term
In the short term, it is therefore necessary to write a set of functions for a genetic algorithm, starting first with the functions written in R on 7 FEB 2017 to show proof of concept. I will use these on the UTILITY arrays that I made today and show how agent actions can be simulated to maximise a simple scenario – trying to make as much crop yield as possible, where resources decrease yield if they are on the land. The most difficult part of this will be the fitness function. Essentially, stake-holder agents are going to need to learn or know the relationship between resources and their crop yields, then do something to affect the resources. There are two ways that the relationship between resource and crop yield could be implemented in the model:
consume column in the RESOURCES array. This is pretty straightforward to implement. Each agent could simply count the number of resources on its cells, look at the landscape cell values, then calculate the proportion their crop yield is predicted to decrease and act accordingly to maximise yield (e.g., by killing resources). This is probably the first implementation to try.
Bringing in the manager will, of course, make things even more complex. I think the best order to do all of this is to focus on 1 above first, then build managers into the model with 1, and then work on thinking about how to implement 2.
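As a rough sketch of that first option (counting resources on owned cells and predicting the proportional loss of yield), something like the following could work. The function name, the flat `[x * ydim + y]` indexing, and the argument layout are all assumptions made for illustration; none of this is existing G-MSE code.

```c
/* Hypothetical helper (not part of G-MSE): predicts the proportion of an
 * agent's yield that would be lost to the resources currently on its cells.
 * All landscape arrays are flat and indexed as [x * ydim + y].
 *   yield:   cell yield values (the crop yield layer in these notes)
 *   owner:   cell owner IDs (the ownership layer in these notes)
 *   res_x, res_y, consume: position and consumption rate of each resource */
double predicted_yield_loss(const double *yield, const int *owner,
                            int xdim, int ydim, int agent_ID,
                            const int *res_x, const int *res_y,
                            const double *consume, int res_count){
    double before = 0.0, after = 0.0, cell_val;
    int x, y, r;

    for(x = 0; x < xdim; x++){
        for(y = 0; y < ydim; y++){
            if(owner[x * ydim + y] != agent_ID){
                continue; /* Not this agent's land */
            }
            cell_val = yield[x * ydim + y];
            before  += cell_val;
            for(r = 0; r < res_count; r++){ /* Each resource on the cell eats */
                if(res_x[r] == x && res_y[r] == y){
                    cell_val *= (1.0 - consume[r]);
                }
            }
            after += cell_val;
        }
    }
    if(before <= 0.0){
        return 0.0; /* No land, or no yield to lose */
    }
    return (before - after) / before;
}
```

A fitness function in the genetic algorithm could then compare this predicted loss across candidate actions (e.g., with and without some resources removed), though how actions map onto the resource list is left open here.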
The plotting of \(2 \times 2\) figures that include maps of land ownership and individual stake-holder yields is now complete for observation types 2 and 3. With this complete, I will now turn to writing yesterday’s R function in C (which needs to happen anyway – may as well do it now to keep things fast). Once this is complete, then it will be easier to start building a genetic algorithm for maximising the utility of one stake-holder. Ignoring manager decision-making and conflicting stake-holders for the time being, I will focus on a stake-holder type with a relatively clear goal: maximise crop yield. Using the utility matrices and genetic algorithm notes from earlier, I’ll be able to write a general function in C that affects user behaviour.
User function now written in C
The user function that was written originally in R has now been coded in C. This makes it much faster to first place agents on their own land (if they own land), then count up their yield from the landscape. Testing of this function finds that everything appears to work normally for all observation types and different land dimensions.
I have run valgrind to check for memory leaks again (since it’s been a while).
R -d "valgrind --tool=memcheck --leak-check=yes --track-origins=yes" --vanilla < gmse.R
No memory leaks were reported.
==26147== HEAP SUMMARY:
==26147== in use at exit: 104,719,416 bytes in 18,583 blocks
==26147== total heap usage: 5,168,708 allocs, 5,150,125 frees, 953,760,506 bytes allocated
==26147==
==26147== LEAK SUMMARY:
==26147== definitely lost: 0 bytes in 0 blocks
==26147== indirectly lost: 0 bytes in 0 blocks
==26147== possibly lost: 0 bytes in 0 blocks
==26147== still reachable: 104,719,416 bytes in 18,583 blocks
==26147== suppressed: 0 bytes in 0 blocks
==26147== Reachable blocks (those to which a pointer was found) are not shown.
==26147== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==26147==
==26147== For counts of detected and suppressed errors, rerun with: -v
==26147== ERROR SUMMARY: 196884 errors from 2 contexts (suppressed: 0 from 0)
Next, I can start to make the users actually do things that might maximise their own yield (e.g., shoot resources or farm cells more effectively). I plan to write a flexible genetic algorithm function in C. The function itself could be called from a higher-level function so as to be used directly in R (I don’t plan to do this for normal G-MSE operations, but it might be useful to include direct R call options once the package is complete).
New landscape layer identifying land ownership
There is now a new layer of landscape, and I have tweaked things to make the current default three layers. These layers include:
1. The terrain type of each cell
2. The value of each cell (e.g., percent crop yield)
3. The owner of each cell
When the cell owner is 0, this effectively means the land is under manager (e.g., public) control. The new initialise landscape function now allows the user to explicitly set the proportion of cells that should go to each owner (vector input).
make_landscape <- function(model, rows, cols, cell_types, cell_val_mn,
                           cell_val_sd, cell_val_max = 1, cell_val_min = 0,
                           layers = 3, ownership = 0, owner_pr = NULL){
    the_land <- NULL;
    if(model == "IBM"){
        if(rows < 2){
            stop("Landscape dimensions in IBM must be 2 by 2 or greater");
        }
        if(cols < 2){ # Check to make sure the landscape is big enough
            stop("Landscape dimensions in IBM must be 2 by 2 or greater");
        }
        cell_count <- cols * rows;
        the_terrain <- sample(x = cell_types, size = cell_count,
                              replace = TRUE);
        the_terrain2 <- rnorm(n = cell_count, mean = cell_val_mn,
                              sd = cell_val_sd);
        if( length(ownership) == 1 ){
            who_owns <- sample(x = 0:ownership, size = cell_count,
                               replace = TRUE);
            the_terrain3 <- sort(who_owns); # Make contiguous for now
        }else{
            who_owns <- sample(x = ownership, size = cell_count,
                               replace = TRUE, prob = owner_pr);
            the_terrain3 <- sort(who_owns);
        }
        the_terrain2[the_terrain2 > cell_val_max] <- cell_val_max;
        the_terrain2[the_terrain2 < cell_val_min] <- cell_val_min;
        alldata <- c(the_terrain, the_terrain2, the_terrain3);
        the_land <- array(data = alldata, dim = c(rows, cols, layers));
    }
    if( is.null(the_land) ){
        stop("Invalid model selected (Must be 'IBM')");
    }
    return(the_land);
}
Hence in the above, if ownership = 0, then the layer is effectively ignored, or if it is a scalar, then ownership of landscape cells is divided equally among integer values from zero to the scalar. However, the most thorough way to set ownership will be by setting ownership to a vector of possible owners and owner_pr to their relative proportions of cells owned. Addition of this landscape layer has been tested and runs without error.
Linking cell yield with agents
I have now begun a user R function, which currently (1) moves agents to somewhere on their owned landscape (if not already there) and (2) calculates the amount of their total yield from the landscape and stores this total amount in the AGENTS array.
user <- function(resource  = NULL,
                 agent     = NULL,
                 landscape = NULL,
                 paras     = NULL,
                 model     = "IBM"
                 ) {
    check_model <- 0;
    if(model == "IBM"){
        # Relevant warnings below if the inputs are not of the right type
        if(!is.array(resource)){
            stop("Warning: Resources need to be in an array");
        }
        if(!is.array(agent)){
            stop("Warning: Agents need to be in an array");
        }
        if(!is.array(landscape)){
            stop("Warning: Landscape need to be in an array");
        } # TODO: make sure paras is right length below
        if(!is.vector(paras) | !is.numeric(paras)){
            stop("Warning: Parameters must be in a numeric vector");
        }
        # If all checks out, then run the population model
        #======================================================================
        # TEMPORARY R CODE TO DO USER ACTIONS (WILL BE RUN FROM C EVENTUALLY)
        #======================================================================
        for(agent_ID in 1:dim(agent)[1]){
            owned_cells <- sum(landscape[,,3] == agent_ID);
            # --- Put the agent on its own land
            if(owned_cells > 0){ # If the agent owns some land
                a_xloc <- agent[agent_ID, 5];
                a_yloc <- agent[agent_ID, 6];
                while(agent[agent_ID,1] != landscape[a_xloc, a_yloc, 3]){
                    a_xloc <- sample(x = 1:dim(landscape)[1], size = 1);
                    a_yloc <- sample(x = 1:dim(landscape)[2], size = 1);
                }
                agent[agent_ID, 5] <- a_xloc;
                agent[agent_ID, 6] <- a_yloc;
            }
            # --- count up yield on cells
            agent_yield <- 0;
            xdim <- dim(landscape[,,3])[1]
            ydim <- dim(landscape[,,3])[2]
            for(i in 1:xdim){
                for(j in 1:ydim){
                    if(landscape[i,j,3] == agent[agent_ID,1]){
                        agent_yield <- agent_yield + landscape[i,j,2];
                    }
                }
            }
            agent[agent_ID, 15] <- agent_yield
        }
        USER_OUT <- list(resource, landscape, agent);
        # TODO: User actions are next...
        #======================================================================
        check_model <- 1;
    }
    if(check_model == 0){
        stop("Invalid model selected (Must be 'IBM')");
    }
    return(USER_OUT);
}
It might be useful to also have a column in the AGENTS array that records percent capacity of yield for stake-holders, perhaps by saving the original landscape (before resources remove yield) and calculating a proportion. A couple of notes: the indicated code above will need to be put into C – it’s much too slow for R already. Also, for some reason, if I don’t store a_xloc and a_yloc back into the appropriate agent[agent_ID, 5] and agent[agent_ID, 6], respectively, a weird bug appears. The actual resource population (but not its estimate) flatlines after 20 or so generations at some value. This is very weird because the file gmse.R doesn’t even return the resource or landscape arrays – not yet. I’m not sure why a bug in this code affects population demographics, but fixing it also appears to correct the problem completely. This is something to watch out for, however.
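For the yield count in particular, a minimal C sketch might look like the below. It assumes the landscape arrives as the flat vector that R passes for a rows by cols by layers array, with yield in the second layer and ownership in the third; the function name and the hard-coded layer positions are assumptions for illustration, not the code that ended up in G-MSE.

```c
/* Hypothetical sketch: sums the yield-layer values over every cell that the
 * ownership layer assigns to agent_ID. land is the flat vector R passes for
 * a rows x cols x layers array, so zero-indexed element [i, j, k] sits at
 * k * rows * cols + j * rows + i. */
double count_agent_yield(const double *land, int rows, int cols, int agent_ID){
    int i, j;
    int yield_layer = 1; /* Second layer, zero-indexed (assumed) */
    int owner_layer = 2; /* Third layer, zero-indexed (assumed)  */
    int cells = rows * cols;
    double agent_yield = 0.0;

    for(j = 0; j < cols; j++){
        for(i = 0; i < rows; i++){
            if((int) land[owner_layer * cells + j * rows + i] == agent_ID){
                agent_yield += land[yield_layer * cells + j * rows + i];
            }
        }
    }
    return agent_yield;
}
```

This replaces the nested R loops with one pass over the cells per agent; looping once over cells and accumulating into a per-agent vector would avoid repeating the pass for each agent.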
Plotting owned landscape and stake-holder yield
The figure below shows some new output for G-MSE. The left column of the figure is familiar, but the right column now provides some feedback for five simulated stake-holders that own roughly equal amounts of land. The actual plots of land are shown in the upper right, while the individual yields for each stake-holder’s plots are shown over time in the lower right.
As of now, this image is only produced for the first two observation functions (case 0 and 1), so I need to replicate it in the other two observation functions. Eventually, it would be better to just have one function for plotting so that any changes made would really be global.
Tracking crop yield over time
Given that resources now can affect the second layer of the landscape, which can model the percent crop yield (or anything else), we can now plot the mean percent yield per cell (orange) over time along with resource abundance (black) and its estimate (blue). The figure below shows this for an example in which each independent visit by a resource reduces crop yield by 50% (e.g., the individual consumes half of the resources on a cell if it arrives there at a time step).
This has now only been coded for the mark-recapture plot, so the next task is to fill this out for all of the plot types, then add a new layer of the landscape that will designate each cell with a number that identifies the owner of the land, or if the land is public (type 0). This will allow me to link crop yield to a specific agent.
Fix read in and out of landscape array from R to C
While testing the resource-landscape interaction, there was an issue with the landscape array being read into C correctly. When R sends an array or vector into C, it is sending the contents of a list (i.e., what might be a \(2 \times 2\) array in R gets read in as if each element were in a list of four elements). The structure of the array then needs to be correctly defined in C so that it matches what it was in R. This requires placing the contents of the elements coming in from R in the correct order with respect to pointers in C, and this occurs in reverse order, so if we had a table in R
| | Y1 | Y2 |
|---|---|---|
| X1 | 1 | 2 |
| X2 | 3 | 4 |
The list would be read in (apparently) as [1, 3, 2, 4], i.e., column by column, because R stores arrays in column-major order. So if we want to read this into an array in C, and we prefer to make a pointer to X1 and X2 location (which is easier for my brain because it allows array[i][j] to refer to the i individual and j trait), then we need to read in the array as follows:
the_array = malloc(x_size * sizeof(double *));
for(i = 0; i < x_size; i++){
    the_array[i] = malloc(y_size * sizeof(double));
}
vec_pos = 0;
for(j = 0; j < y_size; j++){
    for(i = 0; i < x_size; i++){
        the_array[i][j] = R_ptr[vec_pos];
        vec_pos++;
    }
}
This is not quite intuitive at first, but doing it this way gets R and C on the same page. For example, here is the RESOURCES array moving from R to C and back again. Printed in each environment, the array is the same (note, they could be differently structured and still be technically consistent – e.g., if all arrays were transposed – but this would be a nightmare to code).
> RESOURCES[1:4,1:4]
IDs type1 type2 type3
[1,] 1 1 0 0
[2,] 2 1 0 0
[3,] 3 1 0 0
[4,] 4 1 0 0
> RESOURCE_NEW <- resource(resource = RESOURCES,
+ landscape = LANDSCAPE_r,
+ paras = paras,
+ move_res = TRUE,
+ model = "IBM"
+ );
1.000000 1.000000 0.000000 0.000000
2.000000 1.000000 0.000000 0.000000
3.000000 1.000000 0.000000 0.000000
4.000000 1.000000 0.000000 0.000000
> RESOURCES <- RESOURCE_NEW[[1]];
> RESOURCES[1:4,1:4]
[,1] [,2] [,3] [,4]
[1,] 1 1 0 0
[2,] 2 1 0 0
[3,] 3 1 0 0
[4,] 4 1 0 0
When reading in the landscape, this got confusing because the same thing had to be done in three dimensions, and initially I lost track of the pointers, causing the layers to mix. This has been resolved now, and I have tested to ensure that landscape elements are identical when read into C and when returned back into R:
> LANDSCAPE_r[1:4,1:4,1:2]
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 2 2 2
[2,] 2 2 2 2
[3,] 1 1 1 2
[4,] 1 1 2 2
, , 2
[,1] [,2] [,3]
[1,] 1.540700 -1.7960987 2.7759525
[2,] 1.483312 0.5855166 -0.4789347
[3,] 1.579536 1.0600302 2.2923279
[4,] 1.745043 0.2437264 0.6171671
[,4]
[1,] 0.6596141
[2,] 0.8117666
[3,] 2.0330554
[4,] 1.1496975
> RESOURCE_NEW <- resource(resource = RESOURCES,
+ landscape = LANDSCAPE_r,
+ paras = paras,
+ move_res = TRUE,
+ model = "IBM"
+ );
1.000000 2.000000 2.000000 2.000000
2.000000 2.000000 2.000000 2.000000
1.000000 1.000000 1.000000 2.000000
1.000000 1.000000 2.000000 2.000000
1.540700 -1.796099 2.775952 0.659614
1.483312 0.585517 -0.478935 0.811767
1.579536 1.060030 2.292328 2.033055
1.745043 0.243726 0.617167 1.149697
> RESOURCES <- RESOURCE_NEW[[1]];
> RESOURCE_REC[[time]] <- RESOURCES;
>
> LANDSCAPE_r <- RESOURCE_NEW[[2]];
> LANDSCAPE_r[1:4,1:4,1:2]
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 2 2 2
[2,] 2 2 2 2
[3,] 1 1 1 2
[4,] 1 1 2 2
, , 2
[,1] [,2] [,3] [,4]
[1,] 1.540700 -1.7960987 2.7759525 0.6596141
[2,] 1.483312 0.5855166 -0.4789347 0.8117666
[3,] 1.579536 1.0600302 2.2923279 2.0330554
[4,] 1.745043 0.2437264 0.6171671 1.1496975
>
The biological interactions code (i.e., the function from 3 MAR) now does what it is supposed to do, and I will move on to make the landscape interactions more interesting.
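The three-dimensional read-in described above follows the same pattern as the two-dimensional one, with the layer index varying slowest. The sketch below is one way to write it; the function names (land_from_R, land_to_R, free_land) are made up for illustration and are not the names used in G-MSE, and the failure path is deliberately simplified.

```c
#include <stdlib.h>

/* Allocates land[x][y][layer] and fills it from the flat vector that R
 * passes for an array with dim = c(xdim, ydim, layers). Returns NULL if an
 * allocation fails (sketch only: earlier allocations are not cleaned up). */
double ***land_from_R(const double *R_ptr, int xdim, int ydim, int layers){
    int i, j, k, vec_pos;
    double ***land;

    land = malloc(xdim * sizeof(double **));
    if(land == NULL){ return NULL; }
    for(i = 0; i < xdim; i++){
        land[i] = malloc(ydim * sizeof(double *));
        if(land[i] == NULL){ return NULL; }
        for(j = 0; j < ydim; j++){
            land[i][j] = malloc(layers * sizeof(double));
            if(land[i][j] == NULL){ return NULL; }
        }
    }
    vec_pos = 0;
    for(k = 0; k < layers; k++){       /* Layers vary slowest   */
        for(j = 0; j < ydim; j++){     /* then columns          */
            for(i = 0; i < xdim; i++){ /* and rows vary fastest */
                land[i][j][k] = R_ptr[vec_pos];
                vec_pos++;
            }
        }
    }
    return land;
}

/* Writes land back into a flat vector in the order R expects */
void land_to_R(double ***land, double *R_ptr, int xdim, int ydim, int layers){
    int i, j, k, vec_pos = 0;
    for(k = 0; k < layers; k++){
        for(j = 0; j < ydim; j++){
            for(i = 0; i < xdim; i++){
                R_ptr[vec_pos] = land[i][j][k];
                vec_pos++;
            }
        }
    }
}

/* Frees everything allocated by land_from_R */
void free_land(double ***land, int xdim, int ydim){
    int i, j;
    for(i = 0; i < xdim; i++){
        for(j = 0; j < ydim; j++){
            free(land[i][j]);
        }
        free(land[i]);
    }
    free(land);
}
```

Mixing up the loop order (e.g., putting the layer loop innermost) is one way for values from different layers to end up interleaved, which would look like layers mixing.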
Allow layers to change by themselves each generation
Given that some resources will affect layers of the landscape, modelling consumption of biomass on cells, it is necessary to also include a function that changes the landscape cell values without any input from resources. This can model the growth of biomass on cells between time steps. I’ve therefore written a new function that does this in R (I don’t think this will be complex enough to require it in C).
update_landscape <- function(model = "IBM", landscape, layer, mean_change,
                             sd_change = 0, max_val = 1, min_val = 0){
    the_land <- NULL;
    if(model == "IBM"){
        xlength <- dim(landscape[,,layer])[1];
        ylength <- dim(landscape[,,layer])[2];
        lsize <- xlength * ylength;
        adj_vals <- rnorm(n = lsize, mean = mean_change, sd = sd_change);
        adj_layer <- matrix(data = adj_vals, nrow = xlength, ncol = ylength);
        new_layer <- landscape[,,layer] + adj_layer;
        new_layer[new_layer > max_val] <- max_val;
        new_layer[new_layer < min_val] <- min_val;
        landscape[,,layer] <- new_layer;
        the_land <- landscape;
    }else{
        stop("Invalid model selected (Must be 'IBM')");
    }
    return(the_land);
}
One feature of the G-MSE model is now that, in addition to a hard imposed carrying capacity on resource types, it is also possible to make the carrying capacity a natural function of the landscape. For example, we might force individuals on the landscape to consume a certain amount of resources on the landscape to survive or reproduce. Hence, as landscape cell values decrease modelling the consumption of biomass, fewer individual resources can survive or reproduce.
Ideally, it will then be possible to parameterise the model using data for, e.g., how much damage to biomass a goose can do to a patch of land. As of now, by default, I’m just assuming that it decreases crop yield by 10%, and increases its own survival probability by the same when it lands on a cell.
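To make the idea of a landscape-driven carrying capacity concrete, here is a minimal sketch in which each resource must eat a fixed amount of biomass from its cell to survive. The fixed `need`, the flat cell indexing, and the function name are all assumptions for illustration; the actual G-MSE rule described in these notes adjusts survival probability rather than imposing a hard threshold.

```c
/* Hypothetical sketch: each resource tries to eat `need` units of biomass
 * from the cell it is on (res_cell[r] is a flat cell index). If the cell
 * holds enough, the biomass is removed and the resource survives
 * (alive[r] = 1); otherwise the cell is left alone and the resource dies
 * (alive[r] = 0). Resources are processed in array order, so earlier rows
 * get first access to food. Returns the number of survivors. */
int consume_to_survive(double *biomass, int cells, const int *res_cell,
                       int *alive, int res_count, double need){
    int r, cell, survivors = 0;

    for(r = 0; r < res_count; r++){
        cell     = res_cell[r];
        alive[r] = 0;
        if(cell < 0 || cell >= cells){
            continue; /* Off the landscape: treated as finding no food */
        }
        if(biomass[cell] >= need){
            biomass[cell] -= need;
            alive[r]       = 1;
            survivors++;
        }
    }
    return survivors;
}
```

With this rule, the number of resources a cell can support is roughly its biomass divided by `need`, so population size is capped by the landscape rather than by a separate carrying capacity parameter.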
For some reason, a function that I wrote to reset the landscape values screwed with the resource abundances (flat-lined after 20 gens for no clear reason). I’ve reverted to a simpler function, and will build up off of this tomorrow, but it would be nice to know why the R function was affecting the population dynamics even when it returned the same landscape that it took in. Tomorrow, I will build up a new function with similar features piece by piece to make sure it works. Then, I will do some initial simulations modelling crop growth as affected by resources on a landscape, and resource dynamics in turn affected by crops. Things to add after include:
I’m not sure which to tackle first just yet – perhaps the former because the latter doesn’t seem necessary now.
Resource-landscape interactions
Having now resolved the issue concerning multi-layered landscapes, it’s time to actually use one of these layers in the model. The goal here is to do the following:
It would be nice if, for example, individuals could have their probability of death decrease if they are on a cell of high value (modelling increased food consumption), or their probability of giving birth (or number of offspring) increase. Movement rules could also allow individuals to gravitate towards high value cells (or stop when landing on one), thereby modelling behavioural change to move toward areas where opportunities for foraging (or nesting, or something else) are high. This could affect consumption of food on different landscape types (e.g., cropland) and hence make it possible to also model management strategies of diversionary feeding.
To incorporate the above, a new function in C is going to be needed that models interaction between resources and landscapes. This function will require input of:
resource array
landscape array
I will program this in a flexible way within C, and use some default features that will probably decrease a trait and landscape value by a uniform proportion each time (which seems intuitively more reasonable than a uniform value if we’re thinking about probabilities of mortality and proportion of food on a landscape eaten). Key options will be called from R.
Progress on resource-landscape interactions
The initial code to allow interaction is written in the form of the following function, located on a local branch (not pushed to GitHub).
/* =============================================================================
* This function reads in resources and landscape values, then determines how
* each should affect the other based on resource position and trait values
* Inputs include:
* resource_array: resource array of individuals to interact
* resource_type_col: which type column defines the type of resource
* resource_type: type of resources to do the interacting
* resource_col: the column of the resources that affects or is affected
* rows: the number of resources (represented by rows) in the array
* resource_effect: the column of the resources of landscape effect size
* landscape: landscape array of cell values that affect individuals
* landscape_layer: layer of the landscape that is affected
* ========================================================================== */
void res_landscape_interaction(double **resource_array, int resource_type_col,
                               int resource_type, int resource_col, int rows,
                               int resource_effect, double ***landscape,
                               int landscape_layer){

    int resource;
    int x_pos, y_pos;
    double c_rate;
    double current_val;
    double esize;

    for(resource = 0; resource < rows; resource++){
        if(resource_array[resource][resource_type_col] == resource_type){
            x_pos = resource_array[resource][4];
            y_pos = resource_array[resource][5];
            c_rate = resource_array[resource][14];
            landscape[x_pos][y_pos][landscape_layer] *= (1 - c_rate);
            current_val = resource_array[resource][resource_col];
            esize = resource_array[resource][resource_effect];
            resource_array[resource][resource_col] += (1 - current_val) * esize;
        }
    }
}
This needs to be tested more carefully – for some reason both layers are being affected, and I need to make sure that the landscape is being read in correctly.
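One way to check the arithmetic separately from the array handling is to pull the two update lines out into a function that works on a single cell and a single trait value. The helper below is hypothetical (it is not in G-MSE) and only mirrors the two update rules in the function above.

```c
/* Hypothetical helper mirroring the update rules above for one visit by one
 * resource: the cell loses a proportion c_rate of its current value, and the
 * resource trait moves a proportion esize of the remaining way towards 1. */
void one_visit(double *cell_val, double *trait, double c_rate, double esize){
    *cell_val *= (1.0 - c_rate);
    *trait    += (1.0 - *trait) * esize;
}
```

Starting from a cell value of 1 and a trait value of 0.5, with c_rate and esize both 0.1, one visit gives a cell value of 0.9 and a trait value of 0.55; a second visit gives 0.81 and 0.595. Because both changes are proportional, values that start between 0 and 1 stay between 0 and 1 as long as c_rate and esize are also between 0 and 1.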
RESOLVED ISSUE #14: Success on multi-layered landscapes
Initial testing suggests that I have successfully coded landscapes into G-MSE that have more than one layer. The G-MSE program now initialises (for the moment) landscapes that have depth of two layers, such as the below.
## , , 1
##
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 9 3 7 8 3 2 8 4 1 1
## [2,] 8 5 10 6 5 10 6 1 9 2
## [3,] 6 8 5 9 2 3 4 9 7 9
## [4,] 9 2 8 9 10 4 8 7 2 1
## [5,] 1 7 5 3 9 9 3 7 1 1
## [6,] 3 7 4 10 7 5 9 1 8 8
## [7,] 10 9 4 4 5 7 10 8 3 5
## [8,] 3 4 4 2 5 5 2 5 7 2
## [9,] 3 3 3 7 6 5 3 10 9 7
## [10,] 8 8 2 7 4 7 7 8 10 6
##
## , , 2
##
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 0.99 0.76 0.12 0.83 0.67 0.28 0.26 0.30 0.51 0.68
## [2,] 0.68 0.71 0.79 0.74 0.28 0.66 0.53 0.59 0.18 0.39
## [3,] 0.85 0.42 0.56 0.50 0.92 0.05 0.78 0.98 0.14 0.59
## [4,] 0.90 0.29 0.42 0.82 0.07 0.29 0.37 0.95 0.32 0.14
## [5,] 0.63 0.11 0.33 0.46 0.18 0.26 0.73 0.66 0.77 0.42
## [6,] 0.36 0.69 0.15 0.96 0.94 0.52 0.04 0.27 0.79 0.83
## [7,] 0.33 0.85 0.16 0.30 0.99 0.95 0.44 0.27 0.41 0.44
## [8,] 0.93 0.68 0.75 0.15 0.67 0.20 0.67 0.71 0.91 0.87
## [9,] 0.06 0.53 0.94 0.13 0.14 0.14 0.42 0.22 0.12 0.31
## [10,] 0.28 0.46 0.27 0.01 0.86 0.68 0.05 0.69 0.18 0.33
Hence, we can now have different layers representing different aspects of the landscape. For example, the first layer of the array above (,,1) could represent the kind of terrain type for each cell, while the second layer (,,2) could represent the potential crop yield of the cell. The resource function now returns both the resource array and the multi-layer landscape, meaning the code structure is now in place to do some actual biology. We might have resources located on a particular cell increase or decrease the values of one or more layers. This can then be returned as information to agents or retained somehow. It might get fairly memory-intensive if G-MSE is saving every layer of landscape for every time step, so it’s worth thinking about how to use the dynamic landscape in each generation.
Running valgrind on the new program reassures me that I’ve not done anything too bone-headed in allocating memory for a three dimensional landscape array.
R -d "valgrind --tool=memcheck --leak-check=yes --track-origins=yes" --vanilla < gmse.R
It all appears to be freed successfully, with no memory leaks picked up.
==27105==
==27105== HEAP SUMMARY:
==27105== in use at exit: 78,219,488 bytes in 16,679 blocks
==27105== total heap usage: 2,791,027 allocs, 2,774,348 frees, 621,207,897 bytes allocated
==27105==
==27105== LEAK SUMMARY:
==27105== definitely lost: 0 bytes in 0 blocks
==27105== indirectly lost: 0 bytes in 0 blocks
==27105== possibly lost: 0 bytes in 0 blocks
==27105== still reachable: 78,219,488 bytes in 16,679 blocks
==27105== suppressed: 0 bytes in 0 blocks
==27105== Reachable blocks (those to which a pointer was found) are not shown.
==27105== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==27105==
==27105== For counts of detected and suppressed errors, rerun with: -v
==27105== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Hence, I have merged from a local branch to branch dev. I have also read in the new landscape into the observation function.
Where all of this is going now
Now that I have the hang of returning multiple elements from C to R simultaneously (which is accomplished basically by making a structure SEXP allocating to a list VECSXP, each element of which can be an array), it will be easier to think about the code more holistically – each part of the model can potentially take in and return every type of object, hence there are no restrictions on what one model can affect. To model geese, which is probably the first type of conflict I’m tempted to look at, I can use the population model to allow landscape layers to affect probability of geese mortality and for geese (RESOURCE array) in turn to affect the landscape, and hence crop output. Next week, I’m hoping to get the required code for doing this in place, and to also get some feedback regarding how utility functions of agents should be modelled at the upcoming workshop after my presentation. The game-theoretic component can probably be a work in progress though, and it should be possible to model geese without adding these complexities until after receiving feedback, though thinking about the game-theoretic algorithm and data structures continues to be a high priority of mine.
Future decision-making algorithm
A recent paper by Miyasaka et al. (2017) looks at land use in a social-ecological system using an agent-based model and some interesting decision-making rules. Individuals calculate utility “(expressed in terms of probability) for all land-use and location options […] and select the option with the highest utility”. Basically, agents in this model rank probabilities of all land-use options, then try the one with highest probability, then go down the line if the first is not successful (I assume that the payoffs after success are identical, though this isn’t entirely clear to me). Agents also shift decisions and labour allocation based on similar households (imitation).
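The rank-then-try rule as I have paraphrased it above can be sketched as follows. This is only my reading of the rule, not the implementation from the paper; the function names and the `success` flags (standing in for whatever determines whether an option can actually be carried out) are assumptions.

```c
/* Fills order[] with option indices sorted from highest to lowest utility
 * (insertion sort; tied options keep their original order). */
void rank_options(const double *utility, int n, int *order){
    int i, j, tmp;

    for(i = 0; i < n; i++){
        order[i] = i;
    }
    for(i = 1; i < n; i++){
        tmp = order[i];
        j   = i - 1;
        while(j >= 0 && utility[order[j]] < utility[tmp]){
            order[j + 1] = order[j];
            j--;
        }
        order[j + 1] = tmp;
    }
}

/* Tries options from highest to lowest utility and returns the first one
 * for which success[option] is non-zero, or -1 if none succeeds. order[]
 * is scratch space of length n that holds the ranking on return. */
int choose_option(const double *utility, const int *success, int n,
                  int *order){
    int i;

    rank_options(utility, n, order);
    for(i = 0; i < n; i++){
        if(success[order[i]]){
            return order[i];
        }
    }
    return -1;
}
```

For utilities of 0.2, 0.7, and 0.5, the ranking is option 1, then option 2, then option 0, so an agent for which option 1 fails would fall back on option 2.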
Coding goals for the day
movement_dir: Causes movement in the x or y direction
edge_effect: Does something at the edge when encountered
Further progress
Goal 1 has been completed, and, as I’ve been tempted to do before, I have added a new utilities.c file for holding functions that need to be called by other C files (e.g., moving resources).
The next thing I have done (on an unpushed branch, now merged) is to tweak the observation function in C to return a list with two elements. The first element is the set of observations in array form (as before), and the second element is the AGENTS array. With the new C code, I can now return multiple things from C to R with the same function, which opens up some new possibilities, in particular allowing the landscape to change along with the resource and observation arrays within the same C function. It also potentially makes the anecdotal function obsolete, as the change in the AGENTS array can just be made within observation instead of calling a new R function.
The next challenge is to get a multi-layered landscape in and out of C from R. I’m not sure how many ways there are to do this, exactly, but the simplest might actually be to write a three dimensional array to read in the same way as the two dimensional arrays. This will require a very nasty set of loops allocating pointers of pointers of pointers (i.e., ***land), but the idea should be easy enough.
Moving back to machine learning in G-MSE
Having now completed a new R package modelling a genetic algorithm for iterated games played on \(2 \times 2\) symmetric payoff matrices, and applied this package to an upcoming presentation on the future of G-MSE, I now turn back to coming up with a functional genetic algorithm for G-MSE software.
Additional issues
Currently, there are five outstanding issues in G-MSE. Issues 9, 11, and 12 are rather trivial and easy to implement. Issues 10 (dealing with the implementation of multiple resources) and 14 (dealing with multiple landscape dimensions) require more serious consideration. Perhaps partly because of the recent ConFooBio focus on geese and the sizeable special issue in Ambio that just came out, I am thinking about first coding in multiple layers to LANDSCAPE. As noted in issue 10, this can be done fairly straightforwardly in R and C – the different layers can simply be list elements in R (so LANDSCAPE[[i]] is one layer that is actually a matrix of real values) and read in as a three dimensional array in C. I was able to do this while making the gamesGA package, with the agents array being set in R, then being unlisted with unlist() before being changed into array form and read into C in fitness.R. This isn’t a particularly elegant solution, but it’s one that could work with code such as the following:
land_r_vec <- unlist(LANDSCAPE_r);
land_r_arr <- matrix(data = land_r_vec, nrow = dim(LANDSCAPE_r)[2]);
Alternatively, and probably better in the long run for efficiency (though I doubt that the above call would lose much), the lists could be read directly into C and returned as lists. I’m not sure if this is possible, but if it is, I’ll have to make use of C data structures that are read in from R’s C interface.
The reason that I’m keen to start with the landscape layer implementation is that I think this might be the best way to initially model crop production. The more flexible way to do it would be to put crops in the RESOURCE array, but this would require much more memory and computation time than I really think is necessary for what, in all cases that I can currently conceive, really comes down to just a real number at a location. By adding this real number to the landscape and letting it be increased or decreased by agents and resources, we can have the most straightforward method of modelling crop production as affected by farmers, managers, and animals. Another layer, however, is potentially needed mapping x and y locations to ownership of a particular stake-holder. Thus, I can imagine a landscape with three layers, the combination of which will let us address some basic questions concerning conflict between farmers and conservationists:
## [[1]]
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 1 2 4 2 4 4 3 4 1 1
## [2,] 2 1 1 1 1 1 1 3 4 2
## [3,] 4 4 1 2 2 1 1 3 1 1
## [4,] 1 2 4 3 4 3 1 2 1 2
## [5,] 3 3 4 2 2 2 4 3 3 3
## [6,] 1 1 3 2 3 2 1 1 3 3
## [7,] 4 3 4 4 2 4 2 2 3 3
## [8,] 4 4 4 3 2 2 2 2 2 3
## [9,] 3 3 3 3 2 3 1 4 4 2
## [10,] 4 3 1 4 2 1 4 3 4 4
##
## [[2]]
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.1216903 0.87620468 0.11585701 0.63549572 0.5499610 0.7247075
## [2,] 0.7033615 0.19920331 0.16832448 0.75077540 0.2100236 0.4698127
## [3,] 0.5296123 0.67776576 0.53901556 0.66568077 0.3636397 0.1929119
## [4,] 0.3904591 0.35405551 0.05141652 0.02382895 0.7146165 0.4595963
## [5,] 0.7282723 0.87061639 0.52147139 0.95831419 0.0962034 0.1766737
## [6,] 0.5647454 0.71690912 0.14866888 0.24325914 0.1882054 0.2663904
## [7,] 0.9803632 0.09975753 0.61459262 0.06857492 0.8732086 0.6268729
## [8,] 0.5676452 0.69077684 0.57366414 0.18195190 0.1786565 0.3679847
## [9,] 0.4064361 0.61380977 0.36716846 0.89387672 0.3675970 0.6841347
## [10,] 0.6857836 0.81209502 0.99363538 0.94351049 0.5384275 0.3058252
## [,7] [,8] [,9] [,10]
## [1,] 0.08078416 0.40033129 0.36557829 0.3415115
## [2,] 0.32981568 0.44815580 0.66727945 0.6074423
## [3,] 0.27434346 0.55946386 0.35760666 0.1560804
## [4,] 0.65027376 0.85410670 0.02720575 0.7191878
## [5,] 0.10279874 0.10008282 0.38078379 0.4057249
## [6,] 0.55986298 0.08748274 0.29607922 0.0820963
## [7,] 0.64130285 0.14691714 0.28797907 0.6648468
## [8,] 0.23655619 0.42326755 0.91423327 0.1083085
## [9,] 0.06163966 0.15931278 0.39174257 0.8304182
## [10,] 0.71175442 0.02287805 0.28183796 0.2976164
##
## [[3]]
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 1 1 2 1 2 1 1 2 1 2
## [2,] 2 1 2 1 2 2 1 2 1 2
## [3,] 2 2 1 2 1 1 2 2 2 1
## [4,] 2 2 2 2 2 1 2 2 1 1
## [5,] 1 2 2 1 1 1 2 2 2 2
## [6,] 1 2 1 2 1 1 1 2 1 1
## [7,] 1 2 1 1 2 2 1 2 2 1
## [8,] 1 1 2 2 1 1 1 2 1 2
## [9,] 2 1 1 1 2 2 1 1 2 1
## [10,] 2 2 1 1 2 2 2 2 2 2
Where, above, the first element (i.e., layer) is terrain type (already in G-MSE), the second element is the production potential of crops, and the third element is the stake-holder that owns the land (a zero could be included for public land). It would make sense if the land was contiguous – I don’t see a good algorithm for this, so it might be necessary to make one. I could imagine something that breaks down the map into equal segments (e.g., like programs created to avoid gerrymandering, but easier because we don’t have to worry about population size – at the moment). Of course, if the number of cells does not evenly divide by the number of simulated farmers, then some farmers are going to have bigger farms than others, but perhaps we want this to be the case? It would be nice to be able to specify the variation in farm size. In fact, it would probably be good to use this to also incorporate the total amount of farm space. Something like the below:
#farmers <- 10;
#total_cells <- dim(landscape[[3]])[1] * dim(landscape[[3]])[2]
#pr_farmland <- 0.9;
#farm_cells <- floor( pr_farmland * total_cells );
#extras <- farm_cells - farmers; # Every farmer needs at least one cell
#farm_prp <- rep(x = extras / farmers, times = farmers); # Vec can change
#farm_cells <- sample(x = farmers, size = extras,
# prob = farm_prp, replace = TRUE);
#farm_cells <- c(farm_cells, 1:farmers);
#cell_table <- table(farm_cells);
#print(cell_table);
NOTE: The above has been commented out due to errors in making the page, for some reason.
The above cell_table therefore shows how many cells each farmer gets. Some sort of (simple) cluster algorithm is needed to distribute each farmer’s allocated cells to an area of the landscape. Remaining cells will be 0 cells, indicating other land. Note that pr_farmland could also be a function of the number of cell types in landscape[[1]]. I haven’t decided if this should be done in R (as above) or C. I’m leaning toward C (or, at least, kicking things to C when they get complicated) because I can imagine the need to specify some detail in these landscapes.
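One simple way to get contiguous farms, sketched here only as a possibility, is to walk the landscape along a serpentine path (left to right on odd rows, right to left on even rows) and hand out consecutive runs of cells to each farmer. Because consecutive cells on the path always share an edge, every farm is connected. The function name allocate_farms and its arguments are hypothetical and not part of G-MSE; cells_per_farmer plays the role of the cell counts in cell_table.

```r
# Hypothetical helper (not part of G-MSE): give each farmer a contiguous
# run of cells along a serpentine path through the landscape.
allocate_farms <- function(n_row, n_col, cells_per_farmer) {
  total_cells <- n_row * n_col
  stopifnot(sum(cells_per_farmer) <= total_cells)
  # Owner of each position along the path; 0 marks land that is not farmed
  owner_seq <- c(rep(seq_along(cells_per_farmer), times = cells_per_farmer),
                 rep(0, total_cells - sum(cells_per_farmer)))
  owners <- matrix(0, nrow = n_row, ncol = n_col)
  pos <- 1
  for (i in seq_len(n_row)) {
    cols <- seq_len(n_col)
    if (i %% 2 == 0) {
      cols <- rev(cols)  # Reverse direction on every second row
    }
    for (j in cols) {
      owners[i, j] <- owner_seq[pos]
      pos <- pos + 1
    }
  }
  owners
}

# Three farmers with unequal farm sizes on a 6 x 6 landscape
owners <- allocate_farms(n_row = 6, n_col = 6, cells_per_farmer = c(12, 8, 10))
print(owners)
```

Farms made this way are long strips rather than compact blocks, so a proper cluster algorithm would still be preferable if farm shape matters.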
Landscape connections to birth rate
Given the above proposed additions to the landscape, it’s worth perhaps considering an option in the resource function to link resource birth rate to properties of the landscape. As of now, carrying capacity is just assumed to be a parameter of the model that is static, but it would be particularly interesting if the parameter value could change based on properties of the landscape. For example, all of the values of landscape[[2]] could be summed up (perhaps multiplied by a scalar) to determine number of offspring produced on any given cell. If, instead of random uniforms between zero and one, landscape[[2]] represented something more concrete such as kilograms of edible biomass produced (or whatever), this could be directly translated to offspring reared. Of course, with the geese, this opens up some issues – mainly that breeding is done somewhere else; perhaps landscape[[2]] should affect survival instead of birth rate?
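As a rough illustration of one reading of this idea, the sketch below lets the expected number of offspring of each resource depend on the production value of the cell it occupies. The matrix production stands in for landscape[[2]], and offspring_per_unit, res_x, and res_y are hypothetical names and values chosen only for the example.

```r
set.seed(42)

# Stand-in for landscape[[2]]: production potential of each cell (random
# uniform values, as in the printed example above)
production <- matrix(runif(100), nrow = 10, ncol = 10)

# Hypothetical scalar converting production into expected offspring
offspring_per_unit <- 2

# Row and column positions of five resources on the landscape
res_x <- c(1, 3, 3, 7, 10)
res_y <- c(2, 2, 9, 5, 10)

# Expected offspring of each resource depends on the cell it occupies
lambda <- offspring_per_unit * production[cbind(res_x, res_y)]

# Realised offspring are sampled from a Poisson distribution
offspring <- rpois(n = length(lambda), lambda = lambda)
print(offspring)
```

The same cell values could instead be used to scale a survival probability if birth happens elsewhere, as with the geese.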
It’s important to think about the scale here too – as a habit, I’ve been thinking of cells as kind of mid-sized things, perhaps a square kilometer at most, but it might be better to think of them as much larger, so each cell could be its own farm with potentially many geese. Of course, we’ll want the option to do both, but given the scale of the geese scenario, I’m thinking bigger might be better. It also would be useful to have managers be able to allocate refuge space from their budgets.
Big picture notes regarding G-MSE presentation
In presenting G-MSE, I think it is important to emphasise that game theory is the standard, formal tool for understanding conflict between rational agents. Hence, it is the natural tool for addressing issues of cooperation and conflict in conservation (Colyvan et al. 2011; Lee 2012; Kark et al. 2015; Adami et al. 2016; Tilman et al. 2016). It’s important to also recognise that game theory is broader in scope than the application of standard pay-off matrices, and includes extensions such as adaptive dynamics. Where strategies are complex, machine learning techniques such as the use of genetic algorithms can be used to find adaptive strategies for games (e.g., Balmann and Happe 2000; Tu et al. 2000). And a game-theoretic framework is entirely compatible with agent-based modelling (An 2012; Tesfatsion et al. 2017). Bonabeau (2002), citing Axelrod, argues strongly for an agent-based approach to game theory. Hence, a good summary of the key concepts of G-MSE is shown below.
The big green circle is the engine that drives the decision-making of rational agents (i.e., managers and stake-holders) under complex choices and payoffs.
New considerations for machine learning in gmse
Finishing the gamesGA R package has given me a bit more perspective on the eventual structure of the genetic algorithm of gmse. Taking into account the history of interactions between two agents was straightforward in the case of Prisoner’s dilemma, or any symmetrical \(2 \times 2\) payoff matrix. Because there were only two options to consider (‘cooperate’ and ‘defect’), every locus of each agent’s strategy just represented a response to a different interaction history. By changing the default parameters, I was also able to recreate the results of Darwen and Yao (1995), who found that strategy evolution under the following payoff values tended to result in long periods of defection or cooperation punctuated by rapid transitions from one to the other.
| | Opponent cooperates | Opponent defects |
|---|---|---|
| Focal player cooperates | 3, 3 | 0, 5 |
| Focal player defects | 5, 0 | 1, 1 |
Typical evolution of strategies given the above payoff matrix and a three-move memory history with 100 rounds per opponent looks like the below. Periods of low fitness show areas where most strategies have evolved to defect, while periods of high fitness show areas where most strategies cooperate.
This simple example highlights something that is potentially important for understanding conflict in conservation scenarios: cooperation (and conflict) might be fragile, with rapid shifts from one strategy dominating to another taking over without much external pressure.
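To make the idea of a locus as a response to an interaction history concrete, the payoff matrix and three-move memory described above can be sketched as follows. The encoding is an assumption for illustration and is not necessarily how gamesGA stores strategies: each strategy is a vector of eight responses, one for every possible sequence of the opponent’s last three moves, and both players are assumed to start with a remembered history of mutual cooperation. The names history_index and play_game are hypothetical.

```r
# Payoffs to the focal player in the Prisoner's dilemma table above: rows
# are the focal player's move and columns are the opponent's move (first
# row or column is cooperate, second is defect).
payoff <- matrix(c(3, 0,
                   5, 1), nrow = 2, byrow = TRUE)

# Hypothetical encoding: moves are 0 (cooperate) or 1 (defect), and the
# opponent's last three moves are read as a binary number to pick a locus.
history_index <- function(last_three) {
  sum(last_three * c(4, 2, 1)) + 1
}

play_game <- function(strat_a, strat_b, rounds = 100) {
  seen_by_a <- c(0, 0, 0)  # Moves of B remembered by A (assumed start)
  seen_by_b <- c(0, 0, 0)  # Moves of A remembered by B (assumed start)
  score <- c(a = 0, b = 0)
  for (r in seq_len(rounds)) {
    move_a <- strat_a[history_index(seen_by_a)]
    move_b <- strat_b[history_index(seen_by_b)]
    score["a"] <- score["a"] + payoff[move_a + 1, move_b + 1]
    score["b"] <- score["b"] + payoff[move_b + 1, move_a + 1]
    seen_by_a <- c(seen_by_a[2:3], move_b)  # Drop oldest, add newest
    seen_by_b <- c(seen_by_b[2:3], move_a)
  }
  score
}

tit_for_tat   <- rep(c(0, 1), times = 4)  # Copy the opponent's last move
always_defect <- rep(1, times = 8)

print(play_game(tit_for_tat, always_defect))
```

Over 100 rounds, tit-for-tat is exploited once and then defects, so it scores 99 against 104 for the constant defector, whereas two tit-for-tat players score 300 each. A genetic algorithm would evolve the eight responses rather than fixing them by hand.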
The fragility or robustness of conflict in conservation could have major influences on policy, particularly where we’re interested in coming up with long-term sustainable solutions. Two key questions immediately come to mind:
gamesGA scale up with complexity? That is, real-world conflicts are much more complex than this simplified model, so will this complexity make existing cooperation and conflict more robust, or more fragile? We can draw a comparison here, perhaps, with the community ecology literature, where the questions have long been posed: are more complex communities more stable and more productive? There is a lot of recent literature on this, both from theoretical and empirical studies, and stretching all the way back to the early works of Elton and May. Applying similar ideas to social-ecological systems could be useful – it could be that, like community ecology, there are qualities of such systems that make cooperation or conflict more robust (I’ve been particularly interested in degeneracy). I’ll revisit some of the community ecology literature to remind myself what the key points are.
I’m wondering whether a couple of papers could be especially useful to the conservation literature – one could be a perspective piece just on the application of game theory to understanding conflict and management, and things that will need to be taken into account (more on this later – but would include time lags, interactions among stake-holders, degeneracy of effects, etc.); the idea of applying game theory to management questions and conflict is now familiar to ConFooBio, but a lot of the thinking we’ve done could risk being lost if not published as a lead-in to more complex modelling, behavioural games, or time-series analyses. A second paper could be a basic starting point for addressing how robust cooperation and conflict are predicted to be – this might be answerable without the full power of a complex gmse software, focusing instead on modelling some simplified scenarios (using a version of gmse with a more simplified ‘g’), then (probably) concluding that additional work will be needed to really get at complex real-world case studies (which we’ll do with gmse).
Other thoughts on strategy
It’s also worth noting that gamesGA does not allow for some strategies that might be relevant, such as the ‘win-stay, lose-shift’ strategy (Nowak et al. 1995). There is also considerable work on the robustness of cooperation in games such as Prisoner’s dilemma – a lot of which considers spatial effects (local networks, grid-based cooperation) explicitly. I’ve not found anything that looks at this in the context of conservation though, so I think there could well be scope for a high-level perspective paper here. It could also be worth considering other types of games, such as the snowdrift (chicken, hawk-dove) game, in which cooperation and conflict are both potential outcomes.
| | Opponent cooperates | Opponent defects |
|---|---|---|
| Focal player cooperates | 2, 2 | 3, 1 |
| Focal player defects | 1, 3 | 0, 0 |
The above payoff matrix produces even more fragile results (shown below).
The robustness of these results gets stronger though when the payoff differences get more severe so that mutual defection is much worse. For example, consider the following payoff matrix.
| | Opponent cooperates | Opponent defects |
|---|---|---|
| Focal player cooperates | 12, 12 | 13, 11 |
| Focal player defects | 11, 13 | 0, 0 |
Defection given the payoffs above has a much more difficult time getting a foothold because any time defectors become sufficiently frequent in a population, their fitness drops dramatically compared to cooperators.
The point is that the differences between payoff values will matter by increasing the risk associated with defection (or cooperation). Note that values near zero in the first plot implied a population of mostly defectors, whereas an equal magnitude of change in mean fitness does not correspond to as great a difference in the proportion of cooperators and defectors in the second figure.
A perspective paper on theory of conflict and cooperation in conservation?
Given that there is little to nothing on application of game theory to conflict and cooperation in conservation, it strikes me that a forward-looking paper could be useful for establishing some things – perhaps making use of the gamesGA R package as a conceptual tool for demonstrating some key points. Relevant topics include:
Key questions in applying game theory to conflict and cooperation in conservation social-ecological systems.
gamesGA point about robustness.
Specific points regarding the complexity inherent to predicting social-ecological conflict: how to address these in a way that can be beneficial for management recommendations
The ultimate goal: towards a modelling framework that simulates adaptive management of populations under the influence of conflicting stake-holders, and is capable of simulating management options in silico to predict the efficacy of policy.
I’m not sure if this is the best structure or not, but I think I could see a paper like this succeeding in setting up the importance for future work.
gameGA: a new R package that also can be run from a browser
In preparation for two upcoming workshop talks, I have developed a new R package to demonstrate the potential of machine learning and genetic algorithms in understanding human conflicts between food security and biodiversity. Package installation instructions are available on the GitHub repository, and the program can also be run directly from a browser courtesy of shiny. This package is mostly to serve as a proof of concept; while limited in its applications (though it could later be incorporated into gmse, if desired), it demonstrates a relevant and flexible application of machine learning to game theory. Further, the fact that the processing time of simulations is very rapid – not noticeable even when run from a browser (note, the fitness function is coded in c; had it been coded in R, most simulations would take up to a minute) – shows that it is realistic to simulate multiple genetic algorithms (underlying multiple agent strategies) within a program. I have no desire to upload gameGA onto the CRAN Repository, unless it is requested.
In the coming days, I will continue to put together a forward-looking talk that outlines how management strategy evaluation can be combined with game theory (making use of genetic algorithms to understand behaviour) to better understand and potentially help resolve conflicts over food security and biodiversity. I think that it will be reasonable to argue that the range of strategies predicted by even a simple iterated Prisoner’s dilemma (or any other two-player, two-decision symmetric payoff scenario) probably reflects, reasonably well, the kind of variation in human behaviour that might be predicted in real systems. Most humans will not act completely rationally, therefore we might expect a lot of uncertainty in human behaviour where conflict arises; most strategies will be aligned with the interests of individual stake-holders, even if they are not perfect at maximising stake-holder interests.
A central purpose of G-MSE software will be to provide a user-friendly yet flexible tool for simulating the management of populations, with particular emphasis on a mechanistic simulation of uncertainty and interactions among managers and stake-holders. Hence, the software will be able to address key questions concerning conflict in all of the specific ConFooBio case studies, but also provide a general framework for developing social-ecological theory. My hope is to introduce v1.0 by the end of the year, which will take advantage of shiny to let users run simulations and view results within a browser, giving as many users as possible access to the key features of G-MSE. Because shiny is called directly from R, users who are familiar with R will be able to use functions within a gmse package (the name is not yet taken on the CRAN list). All of the code underlying G-MSE, and its complete development history, will be available on GitHub for maintenance, further development, and collaboration (currently, the repository is private, meaning it is viewable by invite only, but I’ve no qualms with making it public). My goal now is to summarise what aspects of G-MSE have already been developed, and to outline my plans for future development. Feedback at this stage is very welcome, particularly concerning what features of the software are most (or least) important. The figure below illustrates a general overview of G-MSE. The left panel represents how users will interact with the software, and the right panel represents the model itself, which uses the G-MSE concept proposed in the ConFooBio ERC proposal.
We now have a working, stable (i.e., bug-free, as far as I can tell), resource model (blue box above) and observation model (yellow box above). Details of how these models are coded and used can be found in the notes below, and I am happy to summarise them. For now, I will avoid the technical details and focus on what these models can do; the code is written with future development in mind, meaning that if there is a feature that is not in either of these models that should be, adding it will almost certainly be a matter of inserting a bit of additional code rather than re-coding major chunks of the model. I’ll start by talking about the resource model.
The resource model is, by default, individual-based. What this means is that each resource is represented as a discrete entity with a number of different attributes. I use ‘resource’ as a general term because these resources can really be anything that we want them to be; potential resources include grouse, hen harriers, geese, fish, elephants, crops, hunting licenses, etc. Basically, anything that we want to represent discretely that is not an agent (a manager or stake-holder) can be considered a resource. Each resource has its own ID, and can take any natural number of types in three type categories (i.e., type1 can take any natural number to group resources in some way, as can type2 and type3). Types could be used for different populations of resources within the same simulation (e.g., hen harriers and grouse; wild and farmed salmon), or further define life-history stages, sex, or something else.
Resources occupy some x, y location on a landscape. The landscape can be of any length and width combination, and has a torus edge whereby opposite edges are joined so that resources moving off of one side appear on the other (I can easily add a hard boundary, or a reflecting edge if desired). Currently, the landscape has one layer (more could be added), with cells on the landscape taking any real number. I’m not using cell values at the moment, but these could represent anything from terrain types to environmental variables. During one iteration of the resource() function, resources move according to one of four pre-specified rules:
x and y direction selected from a uniform distribution.
x and y direction selected from a Poisson distribution.
x and y direction selected from a uniform distribution (Duthie and Falcy 2013)
After movement, each resource can potentially reproduce according to a growth parameter. The number of offspring that a resource produces is determined by using this as the rate parameter in sampling from a Poisson distribution. A carrying capacity can be applied to birth such that if too many offspring are born, then offspring are randomly removed until carrying capacity is reached. Offspring resources have identical traits to their parent resources. It is also possible to not allow any birth for some resources.
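The movement and birth steps described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the resource() code itself (which is written in c): resources are held in a data frame with hypothetical columns id, type1, x, and y; positions are zero-indexed so that the torus edge is a modulo operation; movement uses a uniform displacement; and the carrying capacity is assumed to apply to the total number of resources after birth. The name move_and_breed and its arguments are made up for the example.

```r
# Hypothetical sketch of one time step of movement and birth
move_and_breed <- function(res, land_x, land_y, max_move, growth, K) {
  n <- nrow(res)
  # Uniform displacement in each direction, from -max_move to max_move
  dx <- sample(-max_move:max_move, size = n, replace = TRUE)
  dy <- sample(-max_move:max_move, size = n, replace = TRUE)
  # Torus edge: resources leaving one side reappear on the opposite side
  res$x <- (res$x + dx) %% land_x
  res$y <- (res$y + dy) %% land_y
  # Birth: number of offspring sampled from a Poisson distribution
  births <- rpois(n = n, lambda = growth)
  offspring <- res[rep(seq_len(n), times = births), , drop = FALSE]
  # Carrying capacity on birth: randomly remove excess offspring
  room <- max(K - n, 0)
  if (nrow(offspring) > room) {
    keep <- sample.int(nrow(offspring), size = room)
    offspring <- offspring[keep, , drop = FALSE]
  }
  # Offspring copy parent traits but get their own ID
  if (nrow(offspring) > 0) {
    offspring$id <- max(res$id) + seq_len(nrow(offspring))
  }
  rbind(res, offspring)
}

set.seed(1)
res <- data.frame(id = 1:20, type1 = 1,
                  x = sample(0:9, size = 20, replace = TRUE),
                  y = sample(0:9, size = 20, replace = TRUE))
new_res <- move_and_breed(res, land_x = 10, land_y = 10, max_move = 2,
                          growth = 2, K = 30)
print(nrow(new_res))
```

Death could be added in the same way by dropping rows after birth, for example by removing resources at random whenever the total exceeds a carrying capacity.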
After birth, resources that were not just born can be removed (i.e., death) in one of three ways:
remove trait.
The resource model then returns the new set of individuals; we therefore have the basic processes of movement, birth, and death. These processes could be made more complex (e.g., sexual reproduction, more complex movement rules – toward or away from other resources, perhaps), and any number of other processes might be added into the resource model, including interactions between resources, if desired. I’m considering what we have now as a starting point.
The observation model simulates the process of data collection (but not data analysis, which is done elsewhere – eventually probably in the manager model). It basically generates an uncertain snapshot of the real population(s) by sampling from it in one of four ways (A-D):
The figure below shows the dynamics of a real population (black line) with a carrying capacity on death of 800 (dotted red line), as estimated by each method (blue lines, panels A-D).
We can run 100 time steps of 800 resources in a trivial amount of time (less than half a second) using any observation method. Of course, things slow down when adding more resources or generations, but even hundreds of thousands of resources can be simulated over 100 time steps in under a minute.
In addition to the resource and observation models, I have played around with a few more minor things that can be called on when desired. This includes a function called anecdotal that allows agents (managers and stake-holders) to see any resources within their local view – essentially mimicking anecdotal observation through seeing how many resources are around them at any given time (this might later affect stake-holder attitudes or decisions).
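A minimal sketch of this kind of local counting is below. It is not the anecdotal function itself; the name count_in_view is hypothetical, the view is assumed to be a Euclidean distance, and the torus edge is ignored for simplicity.

```r
# Hypothetical sketch: count the resources within an agent's view distance
count_in_view <- function(agent_x, agent_y, res_x, res_y, view) {
  dist <- sqrt((res_x - agent_x)^2 + (res_y - agent_y)^2)
  sum(dist <= view)
}

# An agent at (5, 5) with a view of 2 sees the first two of three resources
seen <- count_in_view(agent_x = 5, agent_y = 5,
                      res_x = c(5, 6, 9), res_y = c(5, 5, 9), view = 2)
print(seen)
```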
The most interesting other thing that I’ve added is a prompt for user input. Basically, after a certain number of time steps (or right from the start of the simulation), an option in the program allows the user to act like a stake-holder or manager. After a time step has finished, the user is prompted with a message like the following on the R command line:
Year: 95
The manager says the population size is 181
You observe 11 animals on the farm
Enter the number of animals to shoot
The user then types in how many animals that they wish to shoot, and these animals are removed from the population.
A detailed journal of recent development history is below. Here I will summarise how I plan to complete the software, and the rationale behind some (tentative) decisions. Right now, I am focused on getting through the main engine of G-MSE (red, green, and orange boxes from the first figure above), with the primary challenge of integrating game theory into G-MSE. The manager and users models are unique in that both require agents to make decisions that potentially affect each other and the resources. I am simulating agents as discrete individuals, but unlike resources, agents have different traits and are represented by different data structures in the code. Like resources, however, agents can take on any number of three different category types. Category type1 is the type of most importance, which is used primarily for distinguishing among managers and different types of stake-holders. The manager(s) is always of type1 = 0, and plays a special role in observing the population, and will make policy decisions based on the observational model and (eventually) the numbers and past behaviour of stake-holders through the manager model. Other type1 agents act as stake-holders instead of managers, and act through the user model.
I’ve spent some time trying to decide how to incorporate game theory into the G-MSE software. There is more than one way to do it. My first thought was to model games using the traditional \(2 \times 2\) payoff structure, with managers setting the payoffs and two stake-holders acting as players trying to maximise their gains. Given this sort of structure, solving for optimal strategies can be easily accomplished, and we can certainly add this type of mathematical solution as an option in G-MSE. The utility of this kind of mathematical approach starts to unravel, however, when games become more complex (discussion and references all below, mostly from late January). In particular, solving for optimal strategies and equilibria (of which there can be multiple) can become increasingly difficult, or even intractable, given any of the following:
Any realistic social-ecological conflict is probably going to include one or more of the above complications. While I really like simple mathematical and conceptual models (particularly those that provide unifying explanations), and believe that they are especially useful in developing theory, I don’t believe that the case studies that we are interested in will be tractable if we exclude the above bulleted possibilities. Nor will the software be very flexible if we confine users to very simple games. Hence, I think a different approach is needed to model the strategies of rational agents.
An increasingly used method of simulating complex, goal-oriented strategies is through the use of machine learning. The idea behind the machine learning approach is to teach a program to learn, so the program can figure out how to solve a problem (e.g., find the best strategy) without actually being told the solution; figuring out the best solution directly would be effectively impossible because there are too many possible solutions to explore and compare. One technique for narrowing down the possibilities and arriving at a very good (though possibly not best) solution is to simulate the process of natural selection using a genetic algorithm. The genetic algorithm starts off with a random set of simulated genomes (genotype), each of which maps to a random strategy (phenotype). It then allows for recombination between genomes, and mutation, and checks each strategy to see how good each is at solving the problem at hand (e.g., maximising the payoff in a game). The most fit strategies reproduce, and more generations are simulated until some criterion is fulfilled (e.g., the mean fitness of strategies is no longer improving, or 100 generations have passed). Once this criterion is met, the evolved strategies have been selected to solve the problem.
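The loop described above can be illustrated with a deliberately trivial problem. In the sketch below, each genome is a string of zeros and ones and fitness is simply the number of ones, which is an assumption made only so that the best solution is obvious; in G-MSE the fitness function would instead score a strategy against an agent’s objectives. The function name run_ga, its arguments, and the use of a four-genome tournament for selection are all choices made for this example.

```r
# Toy genetic algorithm: evolve binary genomes to maximise the number of 1s
run_ga <- function(pop_size = 50, loci = 20, generations = 100,
                   cross_prob = 0.1, mut_prob = 0.01) {
  # Start with a random set of genomes
  pop <- matrix(rbinom(pop_size * loci, size = 1, prob = 0.5),
                nrow = pop_size, ncol = loci)
  mean_fitness <- numeric(generations)
  for (gen in seq_len(generations)) {
    # Recombination: swap single loci with a randomly chosen partner
    for (i in seq_len(pop_size)) {
      for (j in seq_len(loci)) {
        if (runif(1) < cross_prob) {
          partner <- sample.int(pop_size, size = 1)
          temp <- pop[partner, j]
          pop[partner, j] <- pop[i, j]
          pop[i, j] <- temp
        }
      }
    }
    # Mutation: flip each locus with a small probability
    flip <- matrix(runif(pop_size * loci) < mut_prob, nrow = pop_size)
    pop[flip] <- 1 - pop[flip]
    # Fitness of each genome (toy objective: count the 1s)
    fitness <- rowSums(pop)
    mean_fitness[gen] <- mean(fitness)
    # Selection: the fittest of four random genomes fills each new slot
    winners <- integer(pop_size)
    for (i in seq_len(pop_size)) {
      entrants <- sample.int(pop_size, size = 4)
      winners[i] <- entrants[which.max(fitness[entrants])]
    }
    pop <- pop[winners, , drop = FALSE]
  }
  list(population = pop, mean_fitness = mean_fitness)
}

set.seed(123)
out <- run_ga()
plot(out$mean_fitness, type = "l", xlab = "Generation",
     ylab = "Mean fitness")
```

This version always runs for a fixed number of generations; a stopping rule based on mean fitness no longer improving could replace the fixed loop.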
Additionally, a machine learning approach can use data to learn how to behave in a particular scenario. Google uses this in some of their software, perhaps most familiarly in gmail sorting incoming emails into different categories. Most timely, and perhaps most excitingly for those of us who are interested in games, a machine learning algorithm has been very recently developed that can consistently beat professional poker players. From the linked article in MIT Technology Review,
‘’DeepStack learned to play poker by playing hands against itself. After each game, it revisits and refines its strategy, resulting in a more optimized approach. Due to the complexity of no-limit poker, this approach normally involves practicing with a more limited version of the game. The DeepStack team coped with this complexity by applying a fast approximation technique that they refined by feeding previous poker situations into a deep-learning algorithm.’’
My goal is to apply a genetic algorithm to G-MSE, which will allow manager and stake-holder behaviour to be modelled for any potential objectives (e.g., maximise crop yield, keep populations near carrying capacity, keep all stake-holders happy, etc.) and allowing for multiple types of actions (e.g., hunt, scare, plant crops, protect offspring, pester the manager, forbid stake-holders). I’m not yet sure if this is realistic or not, but I think the genetic algorithm approach will at least get us further than anything else (save for some sort of brilliant new conceptual theory that shows how we can avoid the aforementioned complications). I’ve drafted a prototype genetic algorithm in R, which conceptually looks like the following figure.
The end result of this kind of implementation of G-MSE could allow us to:
This concludes the summary. There are a lot of challenges to implementing the genetic algorithm, but a very initial prototype below shows the idea. My hope is to have a beta version of G-MSE up and running sometime in the summer (with a polished version later in the year), and to continue to build upon the model as needed to allow for new scenarios and improved genetic algorithms. I am very open to feedback on what is and is not important for initial versions of G-MSE.
Prototype of genetic algorithm in R
I have constructed a prototype of a genetic algorithm, written in R, but deliberately avoiding most base R functions that are not available (or that I won’t want to use) in c. Once I have a prototype that I’m happy with, I will write it up in c and start to implement it into G-MSE. There are a few tricks that I’m going to want to use, particularly to swap arrays in the tournament, which I believe can be accomplished just by swapping all pointer addresses. Additional optimisation ideas might be found here; I’ll probably need to be careful to keep this process speedy, but even the initial R code is fairly efficient, so I’m optimistic.
I’ve broken the R code down into five basic functions, representing the boxes from the most recent conceptual diagram from 3 FEB, with the exception of ‘replace’, which is done automatically within ‘tournament’. The first function identifies the focal agent in UTILITY, then initialises a population of 100 agents, 10 of which are identical to the focal agent, and 90 of which are identical in all except their five action columns, which are randomised instead (note, the whole file is recorded by git in scratch.R, which might later be overwritten). We also need a min_cost function to run in initialise_pop to allocate actions according to costs cleanly.
min_cost <- function(budget_total, util, row){
    the_min <- budget_total;
    for(check in 1:5){
        index <- (2*check) + 5; # Cost columns are 7, 9, 11, 13, and 15
        if(util[row, index] < the_min){
            the_min <- util[row, index];
        }
    }
    return( as.numeric(the_min) );
}
# Add row 10X to 90 random (first brown box)
# Note: clone_seed and budget_total are not arguments; they are read from the
# enclosing (global) environment, so both must be defined before calling
initialise_pop <- function(UTILITY, focal_agent, population){
    for(agent in 1:dim(population)[1]){
        if(agent < clone_seed){
            for(u_trait in 1:dim(population)[2]){
                population[agent, u_trait] <- UTILITY[focal_agent, u_trait];
            }
        }else{ # No need to bother with a loop here -- unroll to save some time
            population[agent, 1]  <- UTILITY[focal_agent, 1];
            population[agent, 2]  <- UTILITY[focal_agent, 2];
            population[agent, 3]  <- UTILITY[focal_agent, 3];
            population[agent, 4]  <- UTILITY[focal_agent, 4];
            population[agent, 5]  <- UTILITY[focal_agent, 5];
            population[agent, 6]  <- UTILITY[focal_agent, 6];
            population[agent, 7]  <- UTILITY[focal_agent, 7];
            population[agent, 9]  <- UTILITY[focal_agent, 9];
            population[agent, 11] <- UTILITY[focal_agent, 11];
            population[agent, 13] <- UTILITY[focal_agent, 13];
            population[agent, 15] <- UTILITY[focal_agent, 15];
            population[agent, 8]  <- 0;
            population[agent, 10] <- 0;
            population[agent, 12] <- 0;
            population[agent, 14] <- 0;
            population[agent, 16] <- 0;
            lowest_cost  <- min_cost(budget_total = budget_total, util = UTILITY,
                                     row = focal_agent);
            budget_count <- budget_total;
            while(budget_count > lowest_cost){
                affect_it <- 2 * floor( runif(n=1) * 5); # In c, do{ }while(!=6)
                cost_col  <- affect_it + 7;
                act_col   <- affect_it + 8;
                the_cost  <- population[agent, cost_col];
                if(budget_count - the_cost > 0){
                    population[agent, act_col] <- population[agent, act_col]+1;
                    budget_count <- budget_count - the_cost;
                } # Inf possible if keeps looping and can't remove 1
            }
        }
    }
    return(population);
}
After the initial population of agents is made, we include functions through which the genetic algorithm will loop, simulating key evolutionary processes. The first such process is crossing over.
# Crossover (second brown box)
# Would really help to define the SWAP function in c here -- use int trick
crossover <- function(population){
    agents     <- dim(population)[1];
    cross_prob <- 0.1;
    for(agent in 1:dim(population)[1]){
        c1 <- runif(n=1);
        if(c1 < cross_prob){
            # Random partner from 1 to agents
            cross_with <- floor(agents * runif(n=1)) + 1;
            temp       <- population[cross_with, 8];
            population[cross_with, 8] <- population[agent, 8];
            population[agent, 8]      <- temp;
        }
        c2 <- runif(n=1);
        if(c2 < cross_prob){
            cross_with <- floor(agents * runif(n=1)) + 1;
            temp       <- population[cross_with, 10];
            population[cross_with, 10] <- population[agent, 10];
            population[agent, 10]      <- temp;
        }
        c3 <- runif(n=1);
        if(c3 < cross_prob){
            cross_with <- floor(agents * runif(n=1)) + 1;
            temp       <- population[cross_with, 12];
            population[cross_with, 12] <- population[agent, 12];
            population[agent, 12]      <- temp;
        }
        c4 <- runif(n=1);
        if(c4 < cross_prob){
            cross_with <- floor(agents * runif(n=1)) + 1;
            temp       <- population[cross_with, 14];
            population[cross_with, 14] <- population[agent, 14];
            population[agent, 14]      <- temp;
        }
        c5 <- runif(n=1);
        if(c5 < cross_prob){
            cross_with <- floor(agents * runif(n=1)) + 1;
            temp       <- population[cross_with, 16];
            population[cross_with, 16] <- population[agent, 16];
            population[agent, 16]      <- temp;
        }
    }
    return(population);
}
Crossing over is followed by mutation.
# Mutation (third brown box)
# Note that negative values equate to zero -- there can be a sort of threshold
# evolution, therefore, a la Duthie et al. 2016 Evolution
mutation <- function(population, mutation_prob){
    mutation_prob <- mutation_prob * 0.5;
    for(agent in 1:dim(population)[1]){
        c1 <- runif(n=1);
        if(c1 < mutation_prob){
            population[agent, 8] <- population[agent, 8] - 1;
        }
        if(c1 > (1 - mutation_prob) ){
            population[agent, 8] <- population[agent, 8] + 1;
        }
        c2 <- runif(n=1);
        if(c2 < mutation_prob){
            population[agent, 10] <- population[agent, 10] - 1;
        }
        if(c2 > (1 - mutation_prob) ){
            population[agent, 10] <- population[agent, 10] + 1;
        }
        c3 <- runif(n=1);
        if(c3 < mutation_prob){
            population[agent, 12] <- population[agent, 12] - 1;
        }
        if(c3 > (1 - mutation_prob) ){
            population[agent, 12] <- population[agent, 12] + 1;
        }
        c4 <- runif(n=1);
        if(c4 < mutation_prob){
            population[agent, 14] <- population[agent, 14] - 1;
        }
        if(c4 > (1 - mutation_prob) ){
            population[agent, 14] <- population[agent, 14] + 1;
        }
        c5 <- runif(n=1);
        if(c5 < mutation_prob){
            population[agent, 16] <- population[agent, 16] - 1;
        }
        if(c5 > (1 - mutation_prob) ){
            population[agent, 16] <- population[agent, 16] + 1;
        }
    }
    return(population);
}
The function below ensures that the costs of agents’ actions are not over the total budget. If they are after crossover and mutation, then actions are randomly removed until they are within the budget.
# Need to incorporate selection on *going over budget* and *negative values*
constrain_cost <- function(population){
for(agent in 1:dim(population)[1]){
over <- 0;
if(population[agent, 8] < 0){
population[agent, 8] <- 0;
}
over <- over + (population[agent, 8] * population[agent, 7]);
if(population[agent, 10] < 0){
population[agent, 10] <- 0;
}
over <- over + (population[agent, 10] * population[agent, 9]);
if(population[agent, 12] < 0){
population[agent, 12] <- 0;
}
over <- over + (population[agent, 12] * population[agent, 11]);
if(population[agent, 14] < 0){
population[agent, 14] <- 0;
}
over <- over + (population[agent, 14] * population[agent, 13]);
if(population[agent, 16] < 0){
population[agent, 16] <- 0;
}
over <- over + (population[agent, 16] * population[agent, 15]);
while(over > budget_total){
affect_it <- 2 * floor( runif(n=1) * 5); # Must be a better way
cost_col <- affect_it + 7;
act_col <- affect_it + 8;
if(population[agent,act_col] > 0){
the_cost <- population[agent, cost_col];
population[agent, act_col] <- population[agent, act_col] - 1;
over <- over - the_cost;
}
}
}
return(population);
}
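The 'Must be a better way' comment above could be addressed along the lines of the sketch below. It is only an illustration: it assumes the same column layout as above (costs in columns 7, 9, ..., 15 and actions in columns 8, 10, ..., 16), it takes the budget as an argument instead of reading the global budget_total, and the name constrain_cost_v is hypothetical. Unlike the loop above, it only samples among actions that are currently positive and have a positive cost, so it cannot keep drawing a column that is already zero.

```r
# Sketch only: a more compact take on constrain_cost(), same column layout.
constrain_cost_v <- function(population, budget_total){
    cost_cols <- seq(from = 7, to = 15, by = 2); # cost_m, cost_c, ..., cost_h
    act_cols  <- cost_cols + 1;                  # movem, castem, ..., helpem
    for(agent in 1:dim(population)[1]){
        acts  <- pmax(population[agent, act_cols], 0); # Negatives become zero
        costs <- population[agent, cost_cols];
        over  <- sum(acts * costs);
        while(over > budget_total){
            # Only actions that can actually reduce the total cost
            can_cut <- which(acts > 0 & costs > 0);
            if(length(can_cut) == 0){
                break;
            }
            # Index into can_cut to avoid sample()'s behaviour on length-1 input
            cut       <- can_cut[sample.int(n = length(can_cut), size = 1)];
            acts[cut] <- acts[cut] - 1;
            over      <- over - costs[cut];
        }
        population[agent, act_cols] <- acts;
    }
    return(population);
}
```

The function would be called as constrain_cost_v(population = population, budget_total = 100) in place of the call to constrain_cost().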
After mutation, a fitness function checks the fitness of each agent. This will eventually be a complex function balancing actions according to costs and utility, but for now I’ve just given the agent with the highest 16th column the highest fitness (i.e., maximise helpem
).
# Fitness -- this is the most challenging function
# Just as proof of concept, let's just say fitness is maximised by helpem (16)
strat_fitness <- function(population){
fitness <- rep(0, dim(population)[1]);
for(agent in 1:length(fitness)){
fitness[agent] <- population[agent,16];
}
return(fitness);
}
Finally, we have tournament selection, which also replaces the original population. Tournament selection proceeds by randomly selecting four agents from the population, and passes the agent out of the four with the highest fitness to the next population. This kind of tournament selection seems effective, and will be more efficient in c than some other tournament types (e.g., best 4 out of 10), I think.
# Tournament selection on population
tournament <- function(population, fitness){
agents <- dim(population)[1];
traits <- dim(population)[2];
winners <- matrix(data = 0, nrow = agents, ncol=traits);
for(agent in 1:dim(winners)[1]){
r1 <- floor( runif(n=1) * dim(winners)[1] ) + 1
r2 <- floor( runif(n=1) * dim(winners)[1] ) + 1
r3 <- floor( runif(n=1) * dim(winners)[1] ) + 1
r4 <- floor( runif(n=1) * dim(winners)[1] ) + 1
wins <- r1;
if(fitness[wins] < fitness[r2]){
wins <- r2;
}
if(fitness[wins] < fitness[r3]){
wins <- r3;
}
if(fitness[wins] < fitness[r4]){
wins <- r4;
}
for(trait in 1:dim(winners)[2]){
winners[agent, trait] <- population[wins, trait];
}
}
return(winners);
}
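For the R prototype, the same tournament can be written without the inner loops. The sketch below is an illustration, not a replacement for the function above: tournament_v and its size argument are hypothetical names, and ties are broken in favour of the first competitor drawn, as in the loop version.

```r
# Sketch only: tournament selection without the inner loops.
tournament_v <- function(population, fitness, size = 4){
    agents <- dim(population)[1];
    # One row of 'size' randomly drawn competitors for each slot to fill
    draws  <- matrix(data = sample.int(n = agents, size = agents * size,
                                       replace = TRUE),
                     nrow = agents, ncol = size);
    # For each row, keep the competitor with the highest fitness
    # (which.max returns the first maximum, as in the loop version)
    best   <- apply(X = draws, MARGIN = 1,
                    FUN = function(r) r[which.max(fitness[r])]);
    return(population[best, , drop = FALSE]);
}
```

In the c version the explicit loops are probably still the better choice; the vectorised form is mainly useful for checking the R prototype quickly.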
We can therefore simulate the genetic algorithm with the following code, which simulates 30 iterations (i.e., generations) of crossover, mutation, and selection.
mean_fitness <- NULL;
clone_seed <- 11;
budget_total <- 100;
focal_agent <- 2;
# Add three agents, representing three stake-holders, to the utility array
a0 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
a1 <- c(1, 0, 0, 2, 0, 0, 8, 5, 30, 0, 20, 0, 10, 0, 10, 0);
a2 <- c(2, 0, 0, -1, 1, 1, 0, 0, 50, 0, 0, 1, 1, 2, 2, 1);
UTILITY <- rbind(a0, a1, a2);
population <- matrix(data = 0, ncol = 16, nrow = 100);
population <- initialise_pop(UTILITY = UTILITY, focal_agent = 2,
population = population);
mean_fit <- NULL;
iterations <- 30;
while(iterations > 0){
population <- crossover(population = population);
population <- mutation(population = population, mutation_prob = 0.2);
population <- constrain_cost(population = population);
fitness <- strat_fitness(population);
population <- tournament(population = population, fitness = fitness);
mean_fit <- c(mean_fit, mean(fitness));
iterations <- iterations - 1;
}
The plot below shows that the algorithm converges on the best fitness strategy (mean_fit
) quite rapidly.
Note that a strategy fitness of 10 is the highest possible because agents have a total budget of 100 and each helpem
costs 10 from this total budget. The rapid convergence is encouraging – the time taken from start to finish for this genetic algorithm is only 0.182 seconds in R (note also, that it found the solution in half the time), and will of course be much, much faster in c. Things will get slower as fitness functions become more complicated, and convergence might take a while given optimisation of multiple things.
Also note that the genetic algorithm will need to be run for multiple agents, slowing the processes down.
Re-structuring the UTILITY array
I’m now noticing that there is an error in the genetic algorithm as applied to G-MSE. While the algorithm shows a proof-of-concept well, agents aren’t actually individual rows in UTILITY
, they’re list elements made up of data frames. Hence, the code that I just constructed needs to be applied not to an individual row, but to lists.
I think that the above point might be a good excuse to improve upon the data frame itself, and specifically to incorporate costs in a more effective way, then improve upon the algorithm. It’s always important to keep in mind that the goal of the genetic algorithm is to teach agents to learn to maximise their own utility. Different agents will do this in different ways, so we need to keep everything broad – one idea might be to incorporate everyone’s UTILITY in the utility list; this could help out with the manager too. So now the data frame would look like the below.
agent | type1 | type2 | type3 | util | cost_util | … | cost_h | helpem |
---|---|---|---|---|---|---|---|---|
0 | 1 | 0 | 0 | 2 | 101 | … | 101 | 0 |
0 | 2 | 0 | 0 | 0 | 101 | … | 101 | 0 |
1 | 1 | 0 | 0 | 2 | 0 | … | 10 | 0 |
1 | 2 | 0 | 0 | -1 | 1 | … | 2 | 1 |
2 | 1 | 0 | 0 | 0 | 101 | … | 101 | 0 |
2 | 2 | 0 | 0 | 1 | 101 | … | 101 | 0 |
Something is still not quite perfect yet. Note that I’ve added a cost_util
, which could be the cost of lobbying another stake-holder, or manager. We could then see stake-holders using some of their budget
to affect manager utilities. Each agent has its own array too – so the costs need to be uniquely reflecting the cost of agent index
on affecting another agent’s action or utility.
Maybe this is the wrong way to go – first two columns might be the utility then costs of the focal agent (as partially imposed by other agents), where subsequent rows could just be costs for imposing on all other agents? The array could then look something like the below. Note that values of Inf
, in the code, should just be some value that is higher than the cost of any agent, making it impossible that such values can be altered.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-2 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 2 | 3 | 3 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 5 | 20 | 12 | 5 | 10 |
0 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
0 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
1 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
1 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Note the agent number -2 refers to the utility values of the focal agent – in its rawest form, what does the agent want or value. This is defined for each type of resource (type1
). In the above, for example, if we consider resources where type1 = 1
to be crops and type1 = 2
to be geese, then we might have a farmer represented (note – the farmer likes crops, but is neutral on geese per se – I’m just assuming that geese are fine as long as they don’t affect crop production). The farmer can also specify whether the utility of each resource is dependent upon its location u_loc
being within its view
, or some other range – I’m a bit nervous about this column as I fear that it might constrain the kinds of questions that can be addressed. It might be good to actually make this any natural number, rather than a TRUE
or FALSE
, then have utility attached to a natural number on some layer of the landscape, so that one layer of LANDSCAPE
can be the number of who owns it (zero for manager); a -1 could always just code for within view
. The u_land
specifies whether or not utility is attached to the value of some landscape layer (e.g., perhaps representing quality of land, or production in some cases). And finally the variables movem
, castem
, killem
, feedem
, and helpem
are all actions – what the focal agent does (initialised at zero above).
The agent number -1 is identical for all but the last five columns, which refer to the cost values for affecting each of the actions; cost is drawn from some total budget. Finally, the remaining rows are the costs for changing all other agents’ costs with respect to all resource types (note that before now, we’ve effectively only had the first four rows – this adds to things). This means that the focal agent might, potentially, increase or decrease the cost of a different stake-holder performing an action – this will mostly be applied when the manager is the focal agent, changing, for example, how much it costs for stake-holders to scare or hunt resources. But the model allows stake-holders to affect each other’s costs too, if this is useful. Focal agents might also try to affect others’ utility values; for example, a stake-holder might pay some cost to try to close the gap between their utility value for a particular resource and the manager’s (or rival stake-holder’s) value (i.e. ‘lobbying’). Note that a focal agent’s costs are represented twice, once where agent = -1
, and once where agent
equals the agent’s natural number. However, the latter represents the ‘cost’ of changing its own cost, which I’ve outlawed by setting it to Inf
. There are potentially some other uses for this redundancy. It might be useful in the future if we want to simulate negotiations, so an agent has their original cost/utility values, and a copy of what they are after they have been altered in some way.
The data frame above is therefore one of three data frames (one for each agent), each of which is an element in a list. The data frame above shows the utilities, actions, and costs of a focal agent, and the costs of affecting other agents’ costs. In the above, affecting other agents’ costs is always forbidden, so these values are all Inf
. A manager should be able to adjust these costs to enact policy – for example, by outlawing killing of resources.
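The structure described above can be sketched in R as follows. This is a minimal illustration under stated assumptions: the tables are stored as matrices (not data frames) so that they map directly onto a c array, NO_CHANGE is a hypothetical stand-in for Inf, make_agent_table is a hypothetical helper, and list element 2 is chosen arbitrarily to hold the farmer from the table above.

```r
# Sketch only: one utility/cost table per focal agent, stored in a list.
col_names <- c("agent", "type1", "type2", "type3", "util", "u_loc", "u_land",
               "movem", "castem", "killem", "feedem", "helpem");
NO_CHANGE <- 100001; # Stand-in for Inf: assumed higher than any agent's budget

make_agent_table <- function(n_agents = 3, n_types = 2){
    # Rows: -2 (utilities and actions), -1 (own costs), then 0..(n_agents - 1)
    ids <- rep(x = c(-2, -1, 0:(n_agents - 1)), each = n_types);
    tab <- matrix(data = 0, nrow = length(ids), ncol = length(col_names));
    colnames(tab)  <- col_names;
    tab[, "agent"] <- ids;
    tab[, "type1"] <- rep(x = 1:n_types, times = n_agents + 2);
    # Affecting other agents' utilities and costs is forbidden by default
    tab[ids >= 0, 5:12] <- NO_CHANGE;
    return(tab);
}

UTILITY <- lapply(X = 1:3, FUN = function(i) make_agent_table());
# Fill in the farmer from the table above (list element 2, chosen arbitrarily)
UTILITY[[2]][1, c("util", "u_loc")] <- c(2, 1);
UTILITY[[2]][2, c("util", "u_loc")] <- c(0, 1);
UTILITY[[2]][3, 5:12] <- c(2, 1, 0, 1, 1, 2, 3, 3);
UTILITY[[2]][4, 5:12] <- c(0, 1, 0, 5, 20, 12, 5, 10);
```

A manager enacting policy would then lower selected NO_CHANGE entries in its own table to some affordable cost, while a value left at NO_CHANGE stays out of reach of any budget.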
Modelling crop yields and compensation
The compensation scheme and importance of government funding that Saro has noted in her summary of geese and farming conflicts leads me to believe that some direct form of compensation needs to be included in the genetic algorithm. Adjusting the government funding is simple enough – we can just change the budget
of the manager. How compensation – and farming more generally – will work is a bit more complicated. Here are three ways that I see could work:
LANDSCAPE to represent maximum farm yield. Reduce this yield for each organism on the land – assign one type of stake-holder to a patch of land using type as an index of an individual farmer (note that the extra rows in the AGENTS array might need to be ignored for determining agent actions).
RESOURCE, thereby allowing it to interact more directly with other resources.
RESOURCE array could get quite big, and more loops would probably be required to manage it.
utility and cost columns in the UTILITY array.
Of course, there’s no reason that all of these options can’t be implemented depending on the situation; they aren’t mutually exclusive in the code. I’m inclined to try the first option as a default. Adding a real number to each landscape cell could represent expected crop yield, and this number could change depending on the presence of resources and the actions of farmers. Note that the landscape cells are already initialised with a real number that was meant to represent types of landscape – this can just be changed to a real number that represents crop yield. The files resource.c and observation.c already read the landscape in c as an array of type double; it’s really just a matter of using the landscape that is already available.
Crop yield could therefore affect utility – in that some utility value is assigned to (and multiplied by) the value of each cell. Presence of an organism could decrease this value (sidenote: we’ll need to think about order of operations in the model). Compensation could directly off-set the loss of utility. So we could take the data frame from Friday:
type1 | type2 | type3 | util | u_loc | u_land | cost_m | movem | cost_c | castem | cost_k | killem | cost_f | feedem | cost_h | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0 | 2 | 0 | 0 | 8 | 5 | 30 | 0 | 20 | 0 | 10 | 0 | 10 | 0 |
2 | 0 | 0 | -1 | 1 | 1 | 0 | 0 | 50 | 0 | 0 | 1 | 1 | 2 | 2 | 1 |
The column util
could just be the direct effect that resources of type1
have on the agent, u_loc could be whether or not utility is affected by location, and if it is, then u_land could be its effect on the landscape layer (for now, there is only one layer of landscape values, but another column should probably be added to specify). Hence, representing a goose to a farmer that doesn’t care about geese at all but wants high crop yield could be util = 0
, u_loc = 0
, and u_land = -1
. A farmer that kind of likes geese but also crop yield could be something like util = 1
, u_loc = 0
, and u_land = -1
. Note that the last farmer likes geese, but does not care where the geese are (u_loc = 0
), but does not want their effects on the farmer’s land (u_land = -1
). I think that this is probably the right way to go, though optimising fitness given these multiple interests will be a challenge.
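The crop-yield idea above (a real number on each landscape cell, reduced by the resources present, with compensation off-setting the loss) can be sketched as below. Everything specific here is an assumption made for illustration: the 10 by 10 landscape, the two farmers owning half of the cells each, the 30 geese, and the damage value of 0.1 yield units per goose.

```r
# Sketch only: crop yield on a landscape, reduced by geese, offset by payments.
set.seed(1);
land_dim <- 10;
yield    <- matrix(data = 1, nrow = land_dim, ncol = land_dim); # Maximum yield
owner    <- matrix(data = rep(x = 1:2, each = land_dim * land_dim / 2),
                   nrow = land_dim, ncol = land_dim); # Two farmers
geese    <- data.frame(xloc = sample.int(land_dim, 30, replace = TRUE),
                       yloc = sample.int(land_dim, 30, replace = TRUE));
damage   <- 0.1; # Assumed yield lost per goose on a cell

# Count geese per cell, then reduce yield (never below zero)
counts   <- table(factor(geese$xloc, levels = 1:land_dim),
                  factor(geese$yloc, levels = 1:land_dim));
realised <- pmax(yield - damage * unclass(counts), 0);

# Yield per farmer, and the compensation that would offset each loss
farm_yield   <- tapply(X = realised, INDEX = owner, FUN = sum);
farm_max     <- tapply(X = yield, INDEX = owner, FUN = sum);
compensation <- farm_max - farm_yield;
```

In this sketch a farmer with u_land = -1 loses utility in proportion to the difference between farm_max and farm_yield, and a manager paying compensation would cancel exactly that loss.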
It might also be useful to have a compensation
column, which managers could affect, though this could be done in other ways too. Also, I think I have interpreted util
in a couple different ways throughout the notes, so it might be worth having a different column – perhaps target
, or something that identifies the target value that affects util
in some way. If a resource is at carrying capacity, for example, we might want some mechanism in the model by which stake-holders no longer want it to increase.
Given Saro’s notes, it might also be worth just allowing an option for a traditional \(2 \times 2\) payoff matrix too. This would allow another use of a genetic algorithm – to simulate the bargaining process as in Tu et al. (2000).
Minor error correction
At 12:50, I have corrected a typo in the function ind_to_land()
, which was making it impossible to plot non-square landscapes correctly.
Late night (or early morning) idea
One way to solve the first major concern at the end of yesterday’s notes could be to do the following: Create a new list strategy
in which each element of the list corresponds to an individual agent, strategy[[agent_i]]
. The element itself would be a data frame that is eight columns wide and 1000 (ish) rows down. The first six of eight columns would identify a particular kind of individual – an agent or a resource. The columns would indicate a type as follows:
In the above, any negative values would indicate all individuals (i.e., disregard ID if 1 is -2). In all cases, some value
The remaining columns would define:
It could be a messy optimisation procedure, but this would ensure that agents could pinpoint which individuals to target, what to target, and by how much. The cost of doing so could be factored in perhaps by either: only enacting rows until actions are below the cost or constraining all actions to have an effect that is lower than the cost (e.g., by normalisation). There would need to be some error checks in it, but this could be the most flexible way to handle the search algorithm. The new structure would be either a list of data frames or (perhaps interpreted in c) a 3D array that is \(1000 \times 8 \times agent_{number}\).
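As a rough illustration of the two layouts mentioned above, the sketch below builds the strategy list and the equivalent 3D array. It assumes three agents and uses integer matrices (not data frames) for each list element; the object names other than strategy are hypothetical.

```r
# Sketch only: one integer 'genome' per agent, as a list and as a 3D array.
n_agents <- 3;
n_rows   <- 1000;
n_cols   <- 8;

# List of tables, one element per agent: strategy[[agent_i]]
strategy <- lapply(X = 1:n_agents, FUN = function(agent_i){
    matrix(data = 0L, nrow = n_rows, ncol = n_cols);
});

# The same information as a 1000 x 8 x n_agents array, closer to the c layout
strategy_arr <- array(data = unlist(strategy),
                      dim  = c(n_rows, n_cols, n_agents));

# strategy[[2]][5, 3] and strategy_arr[5, 3, 2] refer to the same element
```

Because unlist() concatenates the matrices in column-major order, the third array dimension indexes agents in the same order as the list.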
Another look at the search algorithm
The above proposal seems reasonable, though I now need to think a bit more about the implementation. If doing something is the consequence of an if
statement in the code, then inapplicable values (e.g., non-existent types or locations) simply don’t add to the actions (or cost) – they would just be junk. Alternatively, they could add to the cost, and therefore be selected out in favour of better actions. I kind of like the latter more, for now, because I suspect it would cause convergence to happen more quickly and make the ‘genome’ more readable.
I also think that the entire strategy
array should probably be an int
, with random sampling during a mutation – note that this is a change from my earlier notes (anything before yesterday) in which I was planning to use double
values within the AGENTS
array. The previous plan would have just been a mess to implement, and this way we have a separate structure that acts as a ‘genome’ for strategies (I’ve not decided on how the utilities of strategies will be held yet – probably a separate UTILITY
array). The first seven columns will always be integers anyway, and it will be faster and easier to understand if mutation just causes an integer change – just sample with new_mutation = floor( runif(0, 1) * maxcol). Or, because we don’t have to do this biologically, maybe we come up with something like the pseudocode below:
mutate = runif(0, 1); /* Uniform draw between 0 and 1 */
if(mutate < 0.05){
    effect = floor( runif(0, 1) * maxcol);
    value -= effect;
}
if(mutate > 0.95){
    effect = floor( runif(0, 1) * maxcol);
    value += effect;
}
Note that the above avoids calling runif more than necessary (the second draw only happens when a mutation occurs), and it is fairly aggressive in searching. My other thought was to just have mutation cause either value-- or value++. My fear is that this could result in local maxima issues because ‘jumping over’ a type would be impossible. This could be fixed by something like the below:
mutate = runif(0, 1); /* Uniform draw between 0 and 1 */
if(mutate < 0.05){
    effect = floor(mutate * 100) + 1;         /* 1 to 5 */
    value -= effect;
}
if(mutate > 0.95){
    effect = floor( (1 - mutate) * 100) + 1;  /* 1 to 5 */
    value += effect;
}
This avoids calling runif more than necessary, and avoids local maxima by letting mutations jump over types. It should also result in a mutation rate of 0.1, with equal probability of incrementing by 1 to 5 or decrementing by 1 to 5. But I’m still not terribly excited about the idea of making it add or subtract from value above, as types are not ordinal.
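The single-draw mutation scheme can be checked quickly in R before writing it in c. The sketch below is an illustration only: mutate_value is a hypothetical name, and it assumes that jump sizes are shifted by one so that they run from 1 to 5 (floor(mutate * 100) on its own would give 0 to 4).

```r
# Sketch only: check the rate and jump sizes of the single-draw mutation.
mutate_value <- function(value, mutate = runif(n = 1)){
    if(mutate < 0.05){
        value <- value - (floor(mutate * 100) + 1);       # Decrement by 1 to 5
    }
    if(mutate > 0.95){
        value <- value + (floor((1 - mutate) * 100) + 1); # Increment by 1 to 5
    }
    return(value);
}

set.seed(1);
changes <- replicate(n = 100000, expr = mutate_value(value = 0));
mean(changes != 0);           # Proportion mutating; expected to be near 0.1
table(changes[changes != 0]); # Jumps of -5 to -1 and 1 to 5, roughly equal
```

The same draw decides both whether a mutation happens and how large it is, which is what keeps the number of random number calls to one per value.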
Here’s another idea, maybe start with the aforementioned utility list/array. This array could be a list of arrays UTILITY[[agent]][row][col]
in R and a UTILITY[agent][row][col]
3D array in c in which each agent
is a list element or dimension. Rows could exhaust all possible types with their utilities in the final column such that the element corresponding to one agent, e.g., could be:
type0 | type1 | type2 | type3 | utility | cost |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 1000 |
0 | 1 | 0 | 0 | 0 | 1000 |
0 | 2 | 0 | 0 | 0 | 1000 |
1 | 1 | 0 | 0 | 2 | 8 |
1 | 2 | 0 | 0 | -1 | 12 |
In the above, there are three types of agents (type0 = 0
), which includes the manager (type1 = 0
) and two stake-holders (type1 = 1 & type1 = 2
). There are also two different types of resources (type0 = 1
), which include type1 = 1
and type1 = 2
. Types 2 and 3 are unused for both agents and resources. Each type has a utility
– how much the particular agent values the identified agent or resource (though I’m not sure how this will be interpreted for agents, particularly when the identity becomes self-referential). The cost
identifies how much expenditure is required to affect the agent or resource – note that this already creates the problem that different attributes of resources and agents should cost different amounts to affect. At the very least, we don’t want the cost of culling versus scaring to have to be the same. At the same time, we don’t want the output of this software to be so messy that end users won’t be able to interpret what is going on – the possibilities should correspond to clear management options, I think (though we also don’t want to constrain the model to force it to do what we presume is best; we want it to find novel solutions, where possible).
Ideally, it would be nice if both managers and stake-holders could potentially affect each other’s costs, but this creates a kind of infinite regress problem – the need for meta-costs for how much it costs to affect another agent’s cost; this is probably too much. Instead, maybe costs are a function of the manager’s utility
, and that all lobbying occurs on utility
. This avoids the ‘cost of costs’ problem – the only thing we lose is that one stake-holder might not be able to directly affect how easy it is to hunt or scare, or do anything else – though they might still affect each other’s utilities? Even this seems to get a bit too complex.
Maybe there’s a starting point that gets the model working but also leaves room for further development. What if the UTILITY
array looked more like this:
type1 | type2 | type3 | utility | cost_m | movem | cost_c | castem | cost_k | killem | cost_f | feedem | cost_h | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0 | 2 | 8 | 5 | 30 | 0 | 20 | 0 | 10 | 0 | 10 | 0 |
2 | 0 | 0 | -1 | 0 | 0 | 50 | 0 | 0 | 1 | 1 | 2 | 2 | 1 |
Now in the above, only resources are actually considered in the UTILITY
array. There are five basic things that an agent can do to a resource – two ways to benefit it and three ways to have a negative effect on it. Agents can move resources (movem
), castrate resources (castem
), or kill resources (killem
). And agents can feed resources (feedem
) or help resources (helpem
). Doing each of these things comes with an associated cost. It’s important not to take these categories too literally, but for now they could loosely correspond to:
xloc and yloc (movem)
(castem)
(killem)
(feedem)
(helpem)
I’m not sure how this last one will work yet. This sacrifices some of the generality of the code, but in the context of what G-MSE is for, I don’t think that we lose much, and it’s easy to see how we could add columns to UTILITY later as necessary. Then, following the diagram from yesterday, when managers use the genetic algorithm, they could affect their own cost columns and everyone else’s as a function of their own utilities and the population sizes of each resource. The costs would then be updated for the stake-holders, who could adjust their own parameters accordingly to maximise their utility.
Note that the index of each array in the list UTILITY
will correspond to the ID in the AGENT
array. This could be useful, if we want, for example, to eventually let stake-holders affect one another. The first agent is also always the manager (and it’s hard to see a situation where we have more than one, but even if so, we can always have one head manager), so stake-holders can lobby the manager by indexing the zero index of UTILITY
with ease.
Focusing first on the stake-holder genetic algorithm
The genetic algorithm can be fairly straightforward, and (I think), fairly efficient given the data structure above. All of the five aforementioned columns can be mutated by any integer value, and unlike the case in which types were random, the numbers are ordinal so that the following code isn’t too much of a problem:
mutate = runif(0, 1); /* Uniform draw between 0 and 1 */
if(mutate < 0.05){
    value--;
}
if(mutate > 0.95){
    value++;
}
So the code would do the following in c for a single agent:
malloc a 3D array that is 100 deep, and copy the agent’s entire UTILITY data frame each time.
Given the above, we need to update the internal structure of the genetic algorithm to the below:
The result is simpler, and therefore it should be faster and easier to implement. I’m hoping that it will be possible to code this to be as flexible as possible – enough to really allow for some complex interactions with agents affecting agents in different ways, but I think this will have to be addressed when I actually start writing the code. For the moment, I think the above is a good balance for stake-holder genetic algorithms. I also think that the spatial effects will also better emerge organically through restricting stake-holders to affecting only their own cell (or the view around their cell). If we let the genetic algorithm try to evolve to find the locations where an agent should do something, I think it would slow down considerably. We can always use view = 100
, or turn off spatial implementation, to have stake-holders affect across the whole region, and it’s hard to see why we would want stake-holders to arbitrarily pick out parts of the map to care about (if it’s caused by the presence of another resource, then the algorithm should find the right actions based on the resource’s utility).
Looking specifically at the fitness function
Let’s assume that costs (‘policy’ in the updated figure) are fixed for now, and the genetic algorithm takes these as a given. Then, instead of accounting for every other agent’s actions (like the manager might have to do – figure this one out later), each stake-holder could just check to see how their actions affect the abundance or local density of each resource. In fact, why don’t we add another column:
type1 | type2 | type3 | util | u_loc | cost_m | movem | cost_c | castem | cost_k | killem | cost_f | feedem | cost_h | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0 | 2 | 0 | 8 | 5 | 30 | 0 | 20 | 0 | 10 | 0 | 10 | 0 |
2 | 0 | 0 | -1 | 1 | 0 | 0 | 50 | 0 | 0 | 1 | 1 | 2 | 2 | 1 |
For space, utility
has been shortened to util
, but I’ve also added a column u_loc
, which codes for whether or not the utility of the resource depends on that resource being ‘local’, in whatever sense this could be relevant. For example, a zero could be a simple FALSE
, while a 1 could be TRUE
in the sense that resources do not affect utility or therefore fitness unless they are on the same cell as the agent, or within view
. Better, values within this column might correspond to different definitions of ‘local’ – It could mean utility is important when the resource is on any cell of the same type of agent, or on a cell with a particular landscape property.
Now, to implement this, we can’t really just use the RESOURCE
array, because neither managers nor stake-holders should have access to it. We have to use either the observation array (or summary statistics from it) or the agent array. When stake-holders have an estimate of how many resources there are, their utility is affected and they can act in a certain way. Note that I’ve placed some odd utility values in the rows above, but maybe they should actually be much different, reflecting the ideal number of resources – perhaps more utility values are needed too:
I wonder if a second utility parameter is needed so that utility does not need to increase or decrease linearly with resource abundance. Perhaps up to 3-5 util and u_loc values are needed – unneeded ones can be ignored later.
The above could see stake-holders switch strategies when satisfied. It might also be worth having some sort of a dummy resource, or a reluctance to spend cost
if not necessary – stake-holders have other things to do, and if everything is working fine, then they don’t need to do anything in the model, including use up costs, and perhaps there should be some pressure against it anyway.
In any case, I could see two types of information about resources being most relevant for stake-holders decisions, thereby affecting the fitness function:
anecdotal
).
I can’t, offhand, think of why anything else would be absolutely necessary – not as a starting point, at least. Stake-holders could use one or both estimates to make their decisions, so those values and UTILITY
could be read into the fitness function. The fitness function would then estimate how resource abundances would change as a consequence of the stake-holder’s action, with higher fitness being awarded to actions that match utility values.
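One way that such a fitness function might look is sketched below. This is only a starting point under strong assumptions that are not part of the notes above: the effect of each action on abundance is a fixed number per unit of action (the effect vector is entirely made up), utility is linear in projected abundance, and strat_fitness_util is a hypothetical name.

```r
# Sketch only: fitness as the match between projected abundance and utility.
# 'est' is the stake-holder's abundance estimate for each resource type, and
# 'effect' is an assumed change in abundance per unit of each action.
strat_fitness_util <- function(actions, util, est,
                               effect = c(movem = 0, castem = -0.5,
                                          killem = -1, feedem = 0.5,
                                          helpem = 1)){
    # Projected abundance of each resource type after the agent acts
    projected <- pmax(est + as.vector(actions %*% effect), 0);
    # Positive util rewards abundance; negative util penalises it
    return(sum(util * projected));
}

# Two resource types (rows) by five actions (columns), as in the table above
actions <- rbind(c(5, 0, 0, 0, 0),
                 c(0, 0, 1, 2, 1));
util    <- c(2, -1);
est     <- c(100, 50);
strat_fitness_util(actions = actions, util = util, est = est);
```

A target-based version (fitness highest when projected abundance is closest to some ideal number) would replace the last line of the function, which is probably where the extra util columns mentioned above would come in.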
To make things even more complicated, it might be necessary to record the UTILITY
actions over time – eventually, stake-holders should be able to correlate them with changes in utility (both as a consequence of their own actions and other stake-holder responses) and use this in the genetic algorithm. For now, I think establishing a record-keeping method is enough. I’ll worry about how to incorporate the game history into decision making after I have a simpler working model.
Additional thoughts
While this is going on, I’ll want to keep in mind the four categories of Tu et al. (2000), which I mentioned on 30 JAN 2017. These define conflict by different utility functions. This might be especially relevant because next I’ll need to figure out how the manager is going to strategise given both stake-holders and resources.
For next week: Consider writing a proto-type for the genetic algorithm in R. Make sure that it works and trouble-shoot any unforeseen issues before trying to write the whole thing in c.
I have now started using GitHub projects to better organise G-MSE development. This appears to improve the workflow a bit, which is good because the workflow is probably going to get more complicated once I start coding the genetic algorithm and integrating it into G-MSE. Below is an updated overview of how I expect G-MSE to work in light of the genetic algorithm approach.
Note that the genetic algorithm (in green above) is being used twice, once by managers and once by users (stake-holders). What is happening here is that managers are taking in the output of the observation model and updating their management plan by tapping into the genetic algorithm. Likewise, after the managers do this, the stake-holders respond by also tapping into the same genetic algorithm to update their off-take. The flow of the model here needs to be planned carefully. I think the best place to start after the observation model is by reading the observation data into the manager model. Managers can then use the observation data for analysis in manager.R
, which will call manager.c
(I don’t think that this will need to be a very intense program, but using c for the analysis will at least allow us to build upon the code in a complex way, if need be). The output from the manager model should then be the kinds of summary statistics that are relevant for policy-making. This could be as simple as an estimate of population size or density, or perhaps the abundance of different age classes. To keep things flexible, I think that a new object is needed – a new list or array – as output that is relevant for policy making. Perhaps if the output is just a scalar, then this can be interpreted as an abundance or density estimate, while if it is a vector, then it can represent ages or types?
Perhaps what we really want to do is tidy up the observational data? The manager.c function could serve as a special type of apply or tapply R function, which takes in different column arguments and then calculates summary statistics for the specified column. So, if no column is specified, then manager.c just estimates the population size or density of the entire population based on the observed data provided (using the mark-recapture or density procedures currently implemented by the chapman_est() and dens_est() R functions written in gmse.R). If we instead specify a type column, then an estimate is made for each value of that type. If a type and an age column, for example, are specified, then estimates for all unique combinations of type and age are performed. The output is then an array with fewer rows – each row corresponds to an averaged out resource array where some columns don’t mean anything (e.g., ID). This will require a lot of error checking, as bad user input could cause the manager to do odd things – in fact, I think only column types and age (4 columns total) should be allowed to be uniquely estimated; even that is a bit much. The zero column of the array could then stand for an estimate instead of an ID in the output. For the simplest of cases where only one resource is being estimated, therefore, the relevant array manage would have a zero index manage[0][0].
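As a rough illustration of the tapply-like idea above, the sketch below estimates abundance separately for each resource type with the standard Chapman mark-recapture estimator. It is only a sketch under stated assumptions: the function names, the flat row-major layout, and the 0/1 mark and recapture columns are hypothetical and are not taken from manager.c or from chapman_est() in gmse.R.

```c
#include <assert.h>
#include <stddef.h>

/* Chapman estimator for a two-sample mark-recapture study:
 * marked     = individuals marked in the first sample,
 * caught     = individuals seen in the second sample,
 * recaptured = individuals in the second sample that were already marked. */
double chapman_estimate(int marked, int caught, int recaptured){
    assert(marked >= 0 && caught >= 0 && recaptured >= 0);
    return ((marked + 1.0) * (caught + 1.0)) / (recaptured + 1.0) - 1.0;
}

/* Hypothetical tapply-like summary. obs is a flat, row-major array with
 * `cols` columns; type_col holds the resource type (0 .. n_types - 1), and
 * mark_col / recap_col hold 0/1 flags (assumed layout) for whether the
 * individual was seen in the marking and recapture samples.
 * One estimate per type is written to est[type]. */
void estimate_by_type(const double *obs, int rows, int cols, int type_col,
                      int mark_col, int recap_col, int n_types, double *est){
    int row, type, marked, caught, recaptured;
    const double *r;
    assert(obs != NULL && est != NULL && rows >= 0 && cols > 0);
    for(type = 0; type < n_types; type++){
        marked = 0;
        caught = 0;
        recaptured = 0;
        for(row = 0; row < rows; row++){
            r = obs + row * cols;
            if((int) r[type_col] != type){
                continue; /* Row belongs to a different resource type */
            }
            if(r[mark_col] > 0){ marked++; }
            if(r[recap_col] > 0){ caught++; }
            if(r[mark_col] > 0 && r[recap_col] > 0){ recaptured++; }
        }
        est[type] = chapman_estimate(marked, caught, recaptured);
    }
}
```

Grouping on a second column (e.g., age) would presumably just add another loop over its unique values, at the cost of the extra error checking mentioned above.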
There is a bit of fuzziness here that should perhaps be better cleared up with planning. Currently the observation.R function requires a type and category type to be specified. In other words, the observation function has been constructed (deliberately) to look at one type at a time. This is probably good – no need to change it, but for a simulation where multiple resources are being managed, observation() will need to be run multiple times. I think that the best thing to do in this case is to store observation data frames in a list, such that OBSERVATION[[1]] stores type 1 and OBSERVATION[[2]] type 2. This will keep the high-level resource types separate; within the list elements, other types (e.g. sex) and age can be managed, and we can pass each list element to the manager model separately to produce some output. The output can also be stored as a list MANAGED[[1]], with all list elements subsequently being merged into one data frame – ideally in c, but possibly before in R – to go into the genetic algorithm. Again, a lot of the time all of this will just end up as one vector, with really just one number being of interest, but we want the model to stay flexible so that we can deal with eventual demands. The end result will be all of the information the manager is actually going to use to make decisions – which may then be separated by resource type(s) and ages.
G-MSE will run an independent observation model for each type of resource of interest. The output of observations will be represented by a list of arrays in R. Each array in the list will then independently be run by the manager model, each run of which will return an array of summary statistics that will be added to a new list. The list (or an array of concatenated elements – merged data frames) will then be read into the genetic algorithm. Additionally, the manager model should also affect the landscape list/array – this will give the option of using resource distribution in making decisions; a simple increment for each time a resource is observed on an x y location should do.
Difficulties remain with the genetic algorithm
This brings us to the genetic algorithm itself. Once the “managed” data has been finalised by the manager model (or after the manager model has been run for each resource), then the genetic algorithm will take the managed data, the agent array, the landscape, and the parameter vector and output a new agent array in which elements of the manager’s row have potentially been changed. This models the process of the manager potentially receiving summary statistics, information about stake-holders, distributions of resources and stake-holders on the landscape (along with other landscape-level properties), and other globally relevant parameters and potentially adjusting their policy and even interests accordingly. Actions, interests, and costs are encoded in the agent’s rows:
I’m not sure if we need managers to be able to have their own actions given the point about zero costs, but I think leaving this option open is easy, even if it’s rarely used. Having three blocks of column types (actions, utilities, and costs) also might allow stake-holders to affect each other’s costs, potentially, so it’s worth planning this way. I think the best way to do this is probably to have the dimensions of AGENTS be adjustable, based on how many columns are needed. Four columns would actually be needed to optimise
All of these values can be optimised as a consequence of data or other agents (for example, a high population size might cause the manager to allow more resources to be hunted or scared – but more or angrier stake-holders might also cause this to happen). The interests of managers, obviously, should change before the actions, as how they act will depend on what they are interested in.
The actions of managers, as changed through the genetic algorithm will be directly interpreted by users as policy. Hence the relevant row(s) of the agent array will feed into the user model. These rows will affect the costs of stake-holder actions (recall that each stake-holder has a total budget). The stake-holders react to these policies and simultaneously adjust their own actions (and potentially utility).
A major challenge here is the sheer number of things that an agent could potentially do, and getting all of those options into the AGENTS array. Things that a set of columns is going to have to specify for an agent include:
It seems as though there should be an action to directly affect the landscape in some way. E.g., fencing – though I suppose a fence could just be a type of resource in the RESOURCE array. The problem is that if we allow users to directly affect resources or the landscape, then there has to be some sort of switch in the code to allow this. If, however, things like fences or crops are resources, and therefore in the resource array, then agent actions could be restricted to subtracting or adding resources by adjusting the resource array. Then again, if a particular resource (e.g., a fence) is not in the resource array, then there is nothing to adjust. Of course, code-wise, is there really a difference between a fence and scaring? They both adjust the x y location of a resource. Maybe we really don’t need much to do with the landscape – just work with a displacement cost, leaving how resources are displaced to be abstract and interpreted by the end user of the software.
Just working with the idea that displacement is all the same (maybe leave some hooks in the code for adjusting the landscape), what we really need to know then is the following:
Complicating things even more, costs might be different when adding or subtracting values from resources – e.g., it might be not so costly to greatly decrease the birth rate of an organism, but increasing it by the same amount should be near impossible. I’m not yet sure the best way to structure the arrays, or anything else, to handle all of these complications, so this will be a major project in the near future (added to GitHub projects).
Once I resolve the issue of how to structure the data so that the appropriate values on AGENTS, RESOURCES, and possibly even LAND arrays can be tweaked through the genetic algorithm, the internal structure will look something like the below.
Essentially, the relevant row from AGENTS will be brought into the genetic algorithm; ten copies of the row will be added to 90 copies with random numbers. The fitness of all 100 copies will be checked in the fitness function – the fitness function will adjust the resource and agent values according to the copy being assessed, and then some function needs to be called to predict what will happen to resources and (possibly) agents and the landscape. Originally, I thought that this might be accomplished using the resource function itself, but I don’t think this is best anymore – mainly because it misses an error step (i.e., agents shouldn’t be able to perfectly predict effects on resources). Perhaps it should just loop back through the observation data? Or the resources within view? It could then consider the effect of resources being removed on the agent’s own utility. It could be something as simple instead as: directly scare or remove resources from location – does this decrease undesired resources from the location? Then lobby the manager: does the change in manager policy affect undesired resources from the location? In other words, should we just have agents look at the direct and immediate effects of what they’re doing on a particular location? In the case of the manager, the location could be the entire landscape, perhaps incorporating birth and death rates into the fitness function?
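The seeding step described above (ten exact copies of the agent’s row plus 90 random candidates) could be sketched in c roughly as follows. This is a minimal sketch, not the planned implementation: the function name, the flat row-major layout, and the upper bound max_val on random values are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define POP_SIZE 100 /* Total candidate strategies for the focal agent  */
#define N_COPIES 10  /* Exact copies of the agent's current row         */

/* Fill pop (POP_SIZE rows by n_cols columns, row-major) from one agent row.
 * Rows 0 .. N_COPIES - 1 are exact copies of agent_row; the remaining rows
 * get uniform random values in [0, max_val] (max_val is a hypothetical
 * bound; real columns would likely need column-specific ranges). */
void init_population(const double *agent_row, int n_cols, double max_val,
                     double *pop){
    int row, col;
    assert(agent_row != NULL && pop != NULL);
    assert(n_cols > 0 && max_val >= 0.0);
    for(row = 0; row < POP_SIZE; row++){
        for(col = 0; col < n_cols; col++){
            if(row < N_COPIES){
                pop[row * n_cols + col] = agent_row[col];
            }else{
                pop[row * n_cols + col] = max_val * rand() / (double) RAND_MAX;
            }
        }
    }
}
```

Keeping the unmodified copies in the initial population means that the agent’s current strategy is always among the candidates assessed by the fitness function.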
For tomorrow, it might be worth just working through some of the things that will definitely need to be coded. Alternatively, there are two definite challenges that remain for using the genetic algorithm:
More thinking about agent fitness functions
While I have a general idea of how to implement the genetic algorithm now, how agents make decisions and act on them is still not clear from a modelling perspective, so more critical thinking needs to be done here before any coding. Unlike the resource and observation models, I also think it might be better, given the complexity, to write a prototype of the code in R to show proof of concept before optimising the code in c. One thing that I think every agent needs will be some sort of total budget (note, this budget is not necessarily currency – at the moment, I’m thinking about it more like a time budget; it’s also possible we’ll need two budgets, giving the option of one used explicitly for time and the other for currency, but I’m keeping it simple for now). This will give us the option of constraining agents’ behaviours if desired so that agents cannot take unrealistic actions to increase their utility, and instead might have to consider trade-offs between different actions. For example, a farmer might be able to either tend crops, scare or kill organisms, build fence, or lobby the manager to increase utility, even though the best thing to do would be all four. A utility function would then determine how a combination of actions maps to utility, and a genetic algorithm could find the optimal behaviour to get the highest utility. We might consider different stake-holders, or different types of stake-holders, to have different total budgets from which to make decisions – these budgets could also be affected by, and affect, RESOURCES.
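To make the budget idea concrete, here is a small sketch of how a candidate set of actions might be forced to fit within an agent’s total budget. The per-unit cost vector and the repair rule (drop one unit of the most expensive action until the plan is affordable) are assumptions for illustration only; other repair rules, or simply assigning low fitness to over-budget strategies, would be equally plausible.

```c
#include <assert.h>

/* Total cost of a plan: units of each action times its per-unit cost. */
int total_cost(const int *actions, const int *costs, int n){
    int i, total = 0;
    for(i = 0; i < n; i++){
        total += actions[i] * costs[i];
    }
    return total;
}

/* Hypothetical repair rule: while the plan costs more than the budget,
 * remove one unit of the action with the highest per-unit cost that is
 * still being performed. Assumes non-negative actions, costs, and budget. */
void constrain_to_budget(int *actions, const int *costs, int n, int budget){
    int i, priciest;
    assert(budget >= 0);
    for(i = 0; i < n; i++){
        assert(actions[i] >= 0 && costs[i] >= 0);
    }
    while(total_cost(actions, costs, n) > budget){
        priciest = -1;
        for(i = 0; i < n; i++){
            if(actions[i] > 0 && (priciest < 0 || costs[i] > costs[priciest])){
                priciest = i;
            }
        }
        actions[priciest]--; /* An over-budget plan always has such an action */
    }
}
```

For the farmer example, the four entries could stand for tending crops, scaring or killing organisms, building fence, and lobbying the manager, so that a limited budget forces a trade-off among them.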
In the software, what this might look like is each AGENT having the opportunity to modify the following:
Note that this way of conceptualising the implementation of actions is broad enough to include managers (who might lobby stake-holders, or intimidate them through laws to not do something). There might be other things to consider, but this suggests to me at least seven potential variables that an agent could affect, and agents will need to maximise their utility using a genetic algorithm that tweaks all of these parameter values – ideally it would also take into account past actions of other agents to predict utility.
Agents might also be spatially restricted in their ability to perform any of these actions, thus making strategy dependent upon location (e.g., a farmer might not be able to hunt in certain areas). Here the option to define type2 agents could come in handy – one agent might be represented by multiple rows of the AGENT array with actions for each type type2, but each row having a unique xloc and yloc, thereby representing land owned. Managers could own all land, or just public land if they cannot do anything on stake-holder land. Some agents might have locations of -1 (or lower), meaning that they cannot do anything that requires control of land.
Implementing this type of system could be challenging, and will require that the landscape be a three-dimensional array (or list) with the third dimension or [[layer]] list element representing a different layer of the landscape. I’ll make this an ISSUE later.
The game implementation of G-MSE will require several additional AGENT columns corresponding to the bullets above, but also type specifications. These columns would correspond to the G1 to Gn columns suggested earlier. More concretely, they will look something like this:
IDs | type1 | type2 | … | see2 | see3 | … | budget | lobby_type_1 | lobby_col_1 | lobby_val_1 | … | farm_product |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | … | 0 | 0 | … | 1000 | 0 | 0 | 0 | … | 4.5 |
1 | 1 | 0 | … | 0 | 0 | … | 100 | 2 | 14 | 11.4 | … | 19.6 |
2 | 1 | 0 | … | 0 | 0 | … | 100 | 2 | 14 | 16.1 | … | 10.3 |
… | … | … | … | … | … | … | … | … | … | … | … | … |
N-1 | 2 | 0 | … | 0 | 0 | … | 75 | 0 | 12 | 3.2 | … | 12.3 |
N | 2 | 0 | … | 0 | 0 | … | 75 | 0 | 12 | 4.2 | … | 8.8 |
Hence in the above, AGENTS has a budget column, and columns for each type of action that can be performed. For lobbying, agents can select a type to lobby (lobby_type_1), the column to try to affect (lobby_col_1), and a value to affect it by (lobby_val_1). This raises an issue that an agent might want to lobby multiple types of agents, or even multiple columns of multiple types of agents. It might be worth thinking about if there is a better way to organise what parameters can be affected and how. Of course, we can make global changes that change the number of columns in AGENTS, giving all of the columns needed, but maybe there is a better way to do it. Between see3 and budget, columns will include utility values on resource types and perhaps cell types on the landscape (these are what lobby_val_1 affects – agents should also be able to potentially affect each other’s lobby_val_1, but probably not lobby types or columns – I can’t see how realistic it would be to convince a different stake-holder to do something with the same value to a different type of agent or a different resource). It might just have to be the other agents’ own cells (or all those of an agent’s type, if we represent type as an individual), or the cells within view.
This setup could offer some interesting insights – potentially figuring out the conditions under which it benefits stake-holders to take different actions for themselves (increasing yield, hunting, scaring, etc.), or taking different types of actions (e.g., doing work for oneself, lobbying managers, harassing other stake-holders). Perhaps it’s possible that conflicts could lead to energy being invested in different types of actions depending upon different costs of those actions (is it easier to lobby the manager or shoot an organism?), or arms races could develop that don’t make a whole lot of sense until we understand the history of the conflict (easy to bother other stake-holder, which causes a retaliation, which escalates, etc., with not much action taken to manage). A key here will be to adequately parameterise how much investment each type of action requires so that we have an idea of the kinds of trade-offs that stake-holders experience. Again, in the absence of these trade-offs, it seems like stake-holders should and would try to do everything – interacting with managers and other stake-holders, adding fences, maximising yield, hunting, etc. But I don’t think such an unlimited model would reflect how people actually budget their time and money (i.e., I don’t think that the assumption that agents have unlimited resources is realistic or useful; it would be both more realistic and more useful to allow for a limited budget).
To summarise briefly here – what we’re going to do is have those columns in the AGENTS array, then use a genetic algorithm allowing each agent to tweak these values, which will produce the effect of changing other values of the AGENTS, RESOURCES, and LANDSCAPE arrays – constrained to a certain budget – to affect the focal agent’s utility. This requires a utility function for the focal agent to somehow predict the consequences of these actions (perhaps by simulating a run of the G-MSE to predict what happens in the next generation if only their actions were to be applied). This could get computationally intense, but I don’t see a speedier option just yet.
Implementation of agent strategies
More needs to be planned for the input and output of agent strategies. That is, what variables should and should not be available to managers and stake-holders when optimising strategies through the genetic algorithm, and how should these variables be incorporated into a strategy that causes agents to take one or more actions? Note, there are plenty of resources for incorporating multiple objectives into genetic algorithms (e.g., Fonseca and Fleming 1993, 1998; Horn et al. 1993; Jaszkiewicz 2002), so agents can be complex in their utility functions. What I’m talking more about is what do agents get to consider when optimising to maximise their own utility functions? And what kinds of actions do stake-holders engage in upon formulating a strategy? Once the answers to these questions are clear, it will be possible to start the process of coding manager and user functions. Some potential things to consider as variables affecting manager and stake-holder strategies:
RESOURCE
array if resources are meant to be known (e.g., if hunting licenses or crop yields are modelled as resources).
Some things to consider as potential actions (outputs) of manager and user functions:
birth_rate = 0
), perhaps at some cost that should be considered explicitly? There are probably some high-level decisions to consider here, and it would be ideal to have many possibilities to choose from.
Neither of these lists is exhaustive, and the input and output options could get very complex. I think that this is okay as long as it doesn’t cause the program to be too inefficient, intractable, or unrealistic. We want the options available to managers and stake-holders to reflect those of real systems as much as possible, but it is also worth thinking about whether some options can be safely pruned out of the software, or at least tabled for a later time.
Note, it might be that for most stake-holders, the strategy is really obvious – always act in such a way as to maximise the resources that you’re interested in – no need to optimise much then because the action to take is clear. For managers, however, I can imagine that the decision will always be a bit more challenging, requiring trade-offs between the interests of different stake-holders in determining policy.
Also note, there should be no need to tell managers what kind of approach to take with respect to policy (though this should be an option, of course). The genetic algorithm should be able to handle this sort of thing – indeed, we might just see very different approaches come out of this model organically as a consequence of different resource abundances and distributions and stake-holder interactions. For example, between time steps, we might see managers switch from establishing a global hunting quota to prohibiting hunting and constructing fences (protected areas of landscape) instead; all we need to do is allow some sort of switch to affect the manager’s general approach, then incorporate this switch variable into the genetic algorithm.
Use of genetic algorithms in ecology and evolution
Hamblin (2013) has a nice methods paper on the use of genetic algorithms, focused especially on an ecology and evolution audience. He cites a highly relevant book by Sean Luke, which includes a general introduction to genetic algorithms, but also chapters on coevolution (competing strategies), multiobjective optimisation, and policy optimisation (Luke 2015). Luke (2015) is particularly cited for a quote on the utility of metaheuristics (which include genetic algorithms), which I’ll just include here in full:
“Metaheuristics are applied to I know it when I see it problems. They’re algorithms used to find answers to problems when you have very little to help you: you don’t know what the optimal solution looks like, you don’t know how to go about finding it in a principled way, you have very little heuristic information to go on, and brute-force search is out of the question because the space is too large. But if you’re given a candidate solution to your problem, you can test it and assess how good it is. That is, you know a good one when you see it.”
I think this probably applies well to G-MSE. Hamblin (2013) notes that “fitness evaluation” is the largest performance bottleneck, so it is probably not worth investing too much energy on optimising the specifics of structure types, or crossover, mutation, and reproduction algorithms; instead, more attention might be paid to making speedy assessments of the fitness (payoffs) of agent strategies. It’s also possible to control recombination (I’m going to call it that sometimes – “crossover” strikes me as a bit of a confused term from the computer science literature) and mutation frequency through a parameter, so they could effectively be turned off if the parameter were set to zero. Hamblin (2013) notes that mutation type (e.g., random per locus or chromosome) is not terribly important (but it’s worth pointing out that the mutation rates from the literature search in Table 3 are generally much lower than Luo et al. (2014) mentioned – 0.1 still seems reasonable to me), but recombination parameters can be important – one point crossover (i.e., forcing cross-over to happen once for all individuals) can break up good linkage combinations – better to just use uniform (probabilistic) crossover. The references in Table 2 of Hamblin (2013) show that population sizes of around 100-200 are common (some are much lower, and nearly all are less than or equal to 2000), with run lengths commonly around 500 (1000 is also common); reals are about as common as binaries. The most popular selection algorithm is truncation, making up well over half of ecology and evolutionary biology applications of genetic algorithms (Table 1 of Hamblin 2013). To my surprise, truncation selection is not the consensus recommendation for genetic algorithms (and proportional methods are quite bad when multiple strategies are near an optimum, resulting in premature convergence). The recommended selection method according to Correia (2010) is actually tournament selection.
The algorithm is described by the quote below:
“It randomly picks k individuals from the population and copies the fittest of them to the mating pool. All the k individuals go back to the population. The process is repeated until the mating pool has the desired size.”
So tournament selection is not probabilistic – in that sense, it is like truncation selection, but there is an extra sampling step that is iterated until the new generation is formed. If k is the same size as the mating pool, then this is effectively truncation selection, so really tournament selection is a generalisation of this that will be useful to code. Hamblin (2013) also cites a book chapter by Syswerda (I’m still waiting on the full text, but the link has all of it) that shows that overlapping generations (termed “steady state” in the computer science literature) perform better than non-overlapping (termed “generational”) algorithms. This can be easy to implement – allow selected agents to be placed in a new array, but have mutation and crossover occur in only half of them. This will fit especially well with G-MSE given that agent strategies might not be expected to change much from one time step (of the model, not the genetic algorithm) to the next. Hence, the optimal solution from the previous time step will be included in the next time step, and if nothing changes, then convergence will occur as soon as possible.
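A minimal c sketch of tournament selection as described in the quote might look like the following. The function names are hypothetical, and the k candidates are drawn with replacement using rand() for simplicity (an assumption; the quoted description does not say whether sampling is with or without replacement).

```c
#include <assert.h>
#include <stdlib.h>

/* Run one tournament: draw k candidates at random (with replacement) and
 * return the index of the fittest candidate drawn. */
int tournament_winner(const double *fitness, int pop_size, int k){
    int i, candidate, best;
    assert(fitness != NULL && pop_size > 0 && k > 0);
    best = rand() % pop_size;
    for(i = 1; i < k; i++){
        candidate = rand() % pop_size;
        if(fitness[candidate] > fitness[best]){
            best = candidate;
        }
    }
    return best;
}

/* Repeat tournaments until the mating pool has the desired size. The pool
 * stores indices into the population, so every individual "goes back"
 * after each tournament and can be selected again. */
void fill_mating_pool(const double *fitness, int pop_size, int k,
                      int *pool, int pool_size){
    int i;
    assert(pool != NULL && pool_size >= 0);
    for(i = 0; i < pool_size; i++){
        pool[i] = tournament_winner(fitness, pop_size, k);
    }
}
```

Larger values of k make selection stronger, so k could serve as the single parameter for tuning the balance between exploration and exploitation.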
It will obviously be important to run diagnostic tests on the G-MSE genetic algorithm. Hamblin (2013) recommends,
“Plots of mean population fitness (and its variance) and the fitness of the best individual over time can be important for both diagnostic and reporting purposes; populations that reach a single solution (close to zero variance) within a few generations are a clear sign of premature convergence, likely stemming from a problem in the balance of exploration and exploitation (selection too strong, too little mutation/crossover, population size too small, etc).”
Testing shouldn’t be too difficult – the results of genetic algorithms can be printed off to a c file, then read in by R and presented in a figure. Hamblin (2013) suggests that genetic algorithms are robust, so it’s unlikely that parameter value choices will cause major problems or affect things greatly, but it’s worth doing all of the quality checks.
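The diagnostics recommended by Hamblin (2013) only need three numbers per generation, so a sketch of the bookkeeping could be as simple as the following. The struct, the function names, and the comma-separated output format are assumptions made here so that the file could be read into R with something like read.csv.

```c
#include <assert.h>
#include <stdio.h>

/* Per-generation diagnostics: mean fitness, its (population) variance,
 * and the fitness of the best individual. */
typedef struct {
    double mean;
    double variance;
    double best;
} fitness_summary;

fitness_summary summarise_fitness(const double *fitness, int n){
    fitness_summary s;
    double sum = 0.0, sq_dev = 0.0;
    int i;
    assert(fitness != NULL && n > 0);
    s.best = fitness[0];
    for(i = 0; i < n; i++){
        sum += fitness[i];
        if(fitness[i] > s.best){
            s.best = fitness[i];
        }
    }
    s.mean = sum / n;
    for(i = 0; i < n; i++){
        sq_dev += (fitness[i] - s.mean) * (fitness[i] - s.mean);
    }
    s.variance = sq_dev / n;
    return s;
}

/* Append one line per generation: generation, mean, variance, best. */
void write_summary(FILE *out, int generation, fitness_summary s){
    fprintf(out, "%d,%f,%f,%f\n", generation, s.mean, s.variance, s.best);
}
```

A variance that collapses towards zero within a few generations would then show up immediately in a plot of the second column against generation.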
More review of utility functions in genetic algorithms
I’m turning now to the use of utility functions, particularly the use of them in genetic algorithms and games. It appears that these can be found in economics and business. For example, Luo et al. (2014) address an optimisation problem for product demand using a utility function to be maximised and a genetic algorithm. Luo et al. (2014) use fuzzy numbers to model market segments, which include three numbers representing most pessimistic, most likely, and most optimistic values. The authors use conjoint analysis, apparently a technique to figure out what people will pay for, combined with a ‘part-worth utility model’. Utility is modelled as a USD amount, and as a linear function (summation) of the product of weights, part-worth utilities, and a binary variable linking product profiles and product attributes – summations over levels and product attributes. I’m not too worried about the details here, just that total utility is measured in currency in this case, and is calculated as weighted sub-utilities – this kind of logic is relevant for G-MSE.
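The general logic of total utility as a sum of weighted sub-utilities can be sketched in a few lines of c. This is not the model of Luo et al. (2014), whose summations run over levels and product attributes; it is a simplified, single-index illustration with hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Total utility as a weighted sum of sub-utilities. selected[i] is a binary
 * indicator (0 or 1) of whether component i applies to the option being
 * assessed; weights and part_worths are on whatever scale utility uses. */
double total_utility(const double *weights, const double *part_worths,
                     const int *selected, int n){
    double total = 0.0;
    int i;
    assert(weights != NULL && part_worths != NULL && selected != NULL);
    for(i = 0; i < n; i++){
        assert(selected[i] == 0 || selected[i] == 1);
        total += weights[i] * part_worths[i] * selected[i];
    }
    return total;
}
```

For G-MSE, the components might be an agent’s interests in different resource types, with the weights reflecting how much each one matters to that agent.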
Luo et al. (2014) then go on to model how utility determines a consumer’s choice of product (essentially, consumers pick the product of highest utility, or none at all). Several constraints on product choice and product attribute-profiles are noted in the model, but the genetic algorithm is implemented using int coding – one chromosome has consumer choice and product configuration sections. Genes within the consumer choice section each represent consumers in a particular market segment – values of these genes correspond to choice of different product profiles (if the value is zero, no product is chosen). The product configuration section contains subsections related to product profiles; each subsection has genes whose integer values indicate the level selected for a product attribute. A population’s gene values are initialised randomly, and individual fitness is calculated using the linear models introduced prior to the genetic algorithm. The authors use a uniform crossover procedure, which might be a useful type of algorithm – apparently searching a lot of strategy space, though the costs and benefits of different crossover methods are still unclear to me. Parameters for the genetic algorithm seemed unusual, to me at least; Luo et al. (2014) set a population size of 30, a crossover probability of 0.7, and a mutation probability of 0.4. I would have considered the population size much too low, and the mutation probability much too high, but it’s worth keeping in mind that perhaps these parameter combinations are useful in genetic algorithms even if they appear odd biologically – it’s worth experimenting with them, at least (their Figure 2 suggests that my presumed ideal parameters might be on the low side for crossover and mutation). Surprisingly (at least, to me), Luo et al. (2014) concluded that their algorithm had the best performance (in terms of profit maximisation) when “crossover probability was 0.7 and mutation probability was 0.7”. The authors used MATLAB to implement the genetic algorithm, so they might have been constrained by computational efficiency – it took 85 seconds for 50 generations on a Pentium IV processor; had the analysis been run in c, it surely would have been faster.
Luo et al. (2014) do note that “low mutation probability (e.g. 0.1) is a good choice” for genetic algorithms (as a biologist, of course, this seems very, very high!), but their problem was an exception because the space that needed to be explored was very large. The general take-home I get from this is that the relatively low mutation and recombination rates that we observe as biologists are probably not appropriate for a good genetic algorithm; higher ones should be used by default – of course, this will require citation to reassure reviewers that this is standard practice.
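Since crossover and mutation rates are likely to need experimenting with, it may help to see how both can be controlled by a single probability each, and switched off entirely by setting that probability to zero. The sketch below uses integer loci and applies both probabilities per locus; those choices, and all of the names, are assumptions for illustration rather than the method of any paper cited here.

```c
#include <assert.h>
#include <stdlib.h>

/* Uniform random number in [0, 1) */
static double unif01(void){
    return rand() / (RAND_MAX + 1.0);
}

/* Uniform (probabilistic) crossover: each locus is swapped between the two
 * genomes independently with probability cross_prob. */
void uniform_crossover(int *genome_a, int *genome_b, int n_loci,
                       double cross_prob){
    int i, tmp;
    assert(cross_prob >= 0.0 && cross_prob <= 1.0);
    for(i = 0; i < n_loci; i++){
        if(unif01() < cross_prob){
            tmp         = genome_a[i];
            genome_a[i] = genome_b[i];
            genome_b[i] = tmp;
        }
    }
}

/* Per-locus mutation: each locus is replaced by a new random value in
 * 0 .. max_val with probability mut_prob. */
void mutate(int *genome, int n_loci, double mut_prob, int max_val){
    int i;
    assert(mut_prob >= 0.0 && mut_prob <= 1.0 && max_val >= 0);
    for(i = 0; i < n_loci; i++){
        if(unif01() < mut_prob){
            genome[i] = rand() % (max_val + 1);
        }
    }
}
```

With this structure, comparing a mutation probability of 0.1 against much higher values is only a matter of changing one argument.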
Tu et al. (2000) look at genetic algorithms for negotiations among agents using utility functions, which is exactly the kind of thing that we’re interested in for G-MSE. In addition to being a useful resource for showing an overlap between utility functions and genetic algorithms, this conference proceedings is very interesting in that it has interacting agents, and considers negotiation as “a search for an optimal negotiation outcome with respect to the utility functions of each partner” (Tu et al. 2000). I’m not sure if we’ve proposed it this way before, but given that I’ve been conceptualising the manager in G-MSE as a special kind of agent (and, in that sense, similar to stake-holders, but following its utility to make rules rather than work within rules to maximise utility), it would be very interesting if we could use a genetic algorithm and the manager agent’s utility function to optimise negotiation outcomes in addition to management outcomes – or perhaps, define ideal management outcomes as the optimal negotiation outcomes that maximise the interests of stake-holders. We could then use the manager genetic algorithm as a tool in real-world case studies where real or simulated stake-holders play the role of agents.
The use of automated negotiation strategies in online commerce appears to follow a protocol using simple sequential rules and threshold utility values. Tu et al. (2000) created a generic framework for a genetic algorithm, implemented using Java. The three functions needed were mutation, crossover, and reproduction. The algorithm for selection seems a bit unclear. It appears that parent individuals (i.e., reproduction) are chosen based on probability, while selection of offspring simply draws the highest fitness offspring to become the next generation of parents? (“The parent individuals are chosen with a probability proportional to their fitness and the operators are chosen randomly. From the new population of size \(\lambda\), the \(\mu\) individuals with the highest fitness are propagated into the next generation as parents”). This isn’t entirely clear.
The method by which agents reach a consensus is really interesting as a way that an agreement – e.g., a policy – is reached. It occurs to me that there might need to be some utility in inaction as well – rather, some cost associated with doing something as a consequence of low utility, though I’m not yet entirely sure how this would be implemented practically. Stake-holders have other interests, of course. The authors consider four types of scenarios on which negotiations take place:
Tu et al. (2000) tweaked crossover and mutation probabilities to get best results (unfortunately, the exact values they used aren’t reported anywhere in the proceedings, that I can find).
Sunday musings
As a bit of an aside, I’m thinking about how biological degeneracy might fit into the efficacy of management policies, given that multiple independent agents might affect a biological system in different ways. I think that degeneracy is interesting and probably greatly under-considered across all biological scales, but it appears entirely absent as a theoretical or practical consideration in conservation and the maintenance of ecosystem function. Man et al. (2016) very recently developed the theory to quantify degeneracy, doing so while simulating networks of complex neuronal systems characterised by non-linearity – specifically comparing degeneracy to redundancy and complexity, which were also defined mathematically. I think there’s a lot of room for theoretical development on degeneracy, and a lot of scope for the application of degeneracy theory to big questions in evolutionary ecology, community ecology, and conservation biology. The modelling in G-MSE is general enough to be potentially able to address these kinds of questions, perhaps using the mathematical definitions introduced by Man et al. (2016) for analysis of simulation results.
I’ve been doing a bit more literature review on the subject of genetic algorithms, particularly as applied to economic and social-ecological questions (e.g., Balmann and Happe 2000; Ascough et al. 2008). Given the need to keep things computationally efficient while also repeatedly updating agent strategies, I think it’s worth defining AGENTS as an integer array (I’m not sure why RESOURCES can’t also be one, actually, so it might be worth checking on this) instead of a double. Supporting this:
AGENT
array that needs to be a non-integer. the closest thing is a parameter affecting movement, but this can be made into an int
, I should think. It might also help if the parameter affecting error
was continuous, though I’m not yet convinced it must be – error
could just be the probability of error from zero to 100, interpreted as 0 to 1.0 by increments of 0.01.RESOURCES
that needs to be a non-integer either. The probabilities of removal (i.e., death) and growth (i.e., birth) are the closest, but I don’t know if there’s any good reason to have these be especially precise – i.e., why not just have an int
value from zero to 100, corresponding to a 0.01 to 1.0 probability of mortality later? That way, the whole array could be int
. I suspect the same can be done for the birth parameter, though the case is certainly less convincing than for the agent array.NEW ISSUE 13: Switch agent array to type int
In light of the above reasoning, I think I’ll plan to switch `AGENTS` to an `int` type, then see how this affects things. Using integers to define ‘genotypes’ that affect agent strategies would permit the use of bitwise operators to increase speed at a very computationally intense part of the model (genetic algorithm mutation and selection). The size of an `int` must be at least 16 bits in C, so even a minimal signed `int` could take \(2^{16} = 65,536\) unique values (with a maximum value of \(2^{15} - 1 = 32,767\)) – plenty, I would think, for coding a sufficient number of strategies. I’ll want to do a bit more digging to see how much this could be expected to speed up the genetic algorithm (see here). Of course, if the gain is trivial, then using `double` and columns affecting behaviour is probably just fine. But if speed is an issue, a vector of `int` values could really be better than several columns of `double` values; I’m just not sure what would have to be sacrificed yet. Quick random number sampling will be needed.
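A minimal sketch of the two ideas above, written in R rather than C for brevity: a probability stored as an integer from 0 to 100, and a mutation that flips one bit of an integer genotype. The function names `int_prob_event` and `flip_bit` are hypothetical and are not part of G-MSE.

```r
# Hypothetical helper: an event occurs with probability p_int / 100,
# where p_int is an integer from 0 to 100
int_prob_event <- function(p_int, n = 1) {
    stopifnot(p_int >= 0, p_int <= 100)
    draws <- sample.int(100, size = n, replace = TRUE)  # integers 1 to 100
    draws <= p_int
}

# Hypothetical helper: mutate an integer genotype by flipping the bit at
# a given position (position 0 is the least significant bit)
flip_bit <- function(genotype, position) {
    bitwXor(as.integer(genotype), bitwShiftL(1L, position))
}

int_prob_event(p_int = 25, n = 10)  # roughly a quarter should be TRUE
flip_bit(genotype = 5L, position = 1L)  # binary 101 becomes binary 111
```

Flipping the same bit twice restores the original genotype, which is one reason bitwise mutation is cheap to implement and to test.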
Having second thoughts about binary encoding
I’m not entirely convinced yet, actually, that binary instead of real encoding is needed. One advantage of real encoding, besides that it fits a bit more easily into the current data structure I’m using, is that it might converge on optimal strategies sooner even if the bitwise calculations are faster (Salomon 1996). Note that phenotypes in bitwise encoding are affected by both the position and value of bits, whereas phenotypes in real encoding are only affected by the value of real numbers (Kumar 2013). There are some techniques to map binary values to real numbers, though I’ve not yet found anything directly comparing the efficiency of binary versus real encoding. Salomon (1996) argued that real encoding was the best choice for applying genetic algorithms to optimisation – I think this might be the way to go, though I’ll want to think about how crossing over and mutation will work efficiently. I’m not entirely sure I do want to finish issue 13. In the end, using `int` instead of `double` could cut the memory in half, but this would be almost useless for the `AGENT` array – if it could be done for the `RESOURCES` array, it might be more useful, but R doesn’t differentiate, so it really won’t matter that much, if at all.
REMOVING ISSUE 13: Convinced myself that this was a bad idea
Note that Balmann and Happe (2000) write that ‘’population size usually ranges between 10 to 50’’, though from a population-genetics perspective, this seems too small to me.
Fleshing out the use of Genetic Algorithms for G-MSE
I’m becoming more convinced that some sort of genetic algorithm is the best way to model the strategies of all agents, including managers and stake-holders. Here is a rough overview of how I see the next step of the software development process:
- Insert columns into the `AGENT` data frame that represent utility values associated with each type of resource. This will effectively quantify how much of each resource managers and stake-holders want. For example, while managers might prefer a balance of resources (perhaps the average of stake-holders?), stake-holders might prefer to maximise only one resource with little or no concern for another (or to actually prefer some resource quantities to be minimised). The utility values of each agent will be used as variables in a utility function, which will calculate agents’ satisfaction (or happiness or contentment) with a current situation of resource quantities (note: this utility function need not be linear – for some stake-holders, I’d expect it to be more log-linear, but it might be good to try different functions and ask real stake-holders what they think). Hence, a function `calc_utility` will be needed.
- … `AGENT` that influences agent actions – how managers and stake-holders will do something in their environment. This can be thought of as analogous to genes affecting an organism’s phenotype in an evolutionary model, but will have different types of effects for agents:
  - … `manager` function will therefore be needed.
  - … `user` function will therefore be needed.
- `AGENT` columns affecting manager and stake-holder actions will be updated before every decision using a genetic algorithm. This will require a separate `opt_utility` function. This general function will work as follows:
  1. … `calc_utility` to calculate the utility of the agent of interest.
  2. … `manager` or `user` function for managers or stake-holders (perhaps need an R and C version of these functions – C for here, R for later), respectively, to temporarily simulate each offspring’s decision if used in one or more previous time steps (e.g., by using the current `AGENT` values)
  3. … `calc_util` to find the utility associated with the simulated decision in 2 – this effectively tests each pseudo-agent to see if their action variables are good at maximising utility.

The above genetic algorithm can be used both for managers maximising utility through establishing game rules and for stake-holders maximising utility by affecting resources. The idea is to have the general `opt_utility` to optimise what an agent does to maximise their utility through the use of a general genetic algorithm (perhaps simulating human planning, if it were as good as adaptation by natural selection, which I don’t think it is).
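As a rough illustration of the utility calculation described in the first step above, the sketch below weights resource quantities by an agent's utility values, with an optional log transform for diminishing returns. The arguments, the `form` option, and the example values are all assumptions made for illustration; none of this is taken from the G-MSE code.

```r
# Hypothetical sketch of calc_utility: utility is a weighted sum of resource
# quantities, where the weights are the agent's utility values
calc_utility <- function(resources, util_vals, form = c("linear", "log")) {
    form <- match.arg(form)
    stopifnot(length(resources) == length(util_vals), all(resources >= 0))
    if (form == "log") {
        resources <- log(resources + 1)  # diminishing returns; +1 avoids log(0)
    }
    sum(util_vals * resources)
}

resources <- c(geese = 200, crop_yield = 50)
farmer    <- c(geese = -1,  crop_yield = 2)    # wants fewer geese, more yield
manager   <- c(geese = 0.5, crop_yield = 0.5)  # wants a balance of both

calc_utility(resources, farmer)                 # linear utility
calc_utility(resources, manager, form = "log")  # log-linear utility
```

Negative utility values let a stake-holder prefer that a resource be minimised, as suggested above.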
… C function for reasons of efficiency – we’re going to add some time onto these simulations, but I think it will be worth it provided we:
So, perhaps, the `manager.r` function will take in all of the necessary information and send it to C, and then C will go through the entire process of potentially automating manager interpretation of observation data and the decision of making game rules based on manager utility values. Other management options will of course be available.
Then, the `user.r` function will likewise take all of the necessary information and send it to C to go through the entire process of automating stake-holder interpretation of the manager’s rules and updated actions based on stake-holder utility values. Other user options will of course be available.
This removes the need for a specific game arena, `games.R`, because the game is defined by `manager.r` and effectively played by users in `user.r`. The novelty is that we’re using evolutionary game theory under the hood in both management and stake-holder actions to infer broader patterns about how cooperation and conflict might arise when all parties are acting according to their own interest.
I think this is getting on the right track, and I am starting to see how the code will look and run. We also might want to include a spatial component to all of this, affecting both manager and user actions. For example, perhaps some stake-holders can only have their utility functions affected by or act in resources within certain areas of the landscape.
NEW ISSUE 12: Observe multiple times for density estimator
Currently, estimating total population size using a sub-sample of observed area and assuming that the density of this sub-sample reflects global density (method = `case 0`) only works when one sub-sample is taken. There are multiple ways of fixing this so that the population size estimate takes into account multiple sub-samples. It would be a good idea to think about the most efficient way to do this and program it into R (perhaps with `tapply` to start, but eventually in the `manager.c` function, maybe).
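One possible way to handle multiple sub-samples with `tapply` is sketched below: count the individuals seen in each sub-sample, convert each count to a density, average the densities, and scale up to the whole landscape. The function name `est_pop_size` and its arguments are hypothetical, and the sketch assumes every sub-sample covers the same area.

```r
# Hypothetical sketch: obs_sample holds, for each observed individual, the
# index of the sub-sample in which it was seen
est_pop_size <- function(obs_sample, n_samples, sample_area, land_area) {
    # Using a factor with all levels keeps sub-samples in which nothing was seen
    ids    <- factor(obs_sample, levels = seq_len(n_samples))
    counts <- tapply(X = rep(1, length(ids)), INDEX = ids, FUN = sum)
    counts[is.na(counts)] <- 0  # empty sub-samples count as zero individuals
    mean(counts / sample_area) * land_area
}

# Nine individuals seen across four sub-samples (none seen in the fourth)
obs_sample <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)
est_pop_size(obs_sample, n_samples = 4, sample_area = 100, land_area = 10000)
```

Keeping the empty sub-samples matters here, because dropping them would bias the density estimate upwards.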
NEW ISSUE 11: Permanently move agents
Allow agents to move in each time step, permanently, in some way. This might be best done through the `anecdotal` function. As of now, they go back to their original place at the end of each time step, and it would be good to have an option to let them move all around the landscape.
Agent-based modelling in economics – potentially useful ideas
Phan (2003) briefly summarises the emerging (at least, at the time emerging) field of Agent-based Computational Economics, noting that agent-based models can complement mathematical theory in economics especially when equilibrium conditions cannot be easily computed or attained by agents. Relating agent-based models to cognitive economics, Phan (2003) notes that the latter ‘’is an attempt to take into account the incompleteness of information in the individual decision making process’’, which seems especially relevant to G-MSE. The program SWARM might be useful to explore – written in Java though. Software like SWARM, MODULECO, and CORMAS appear to have a similar interface as G-MSE has (or will have), but I think that writing G-MSE from the ground up was definitely the right choice. This makes G-MSE more targeted to a specific social-ecological problem, allowing it to be written in a way that is computationally efficient, but can also be accessible through a browser by end users without proficiency in R. Regarding efficiency, current simulation times for the model itself are: 100 time steps = `0.241` seconds, 1000 time steps = `3.179` seconds, and 10000 time steps = `27.740` seconds. I can’t imagine anyone would want simulations longer than 1000 time steps, but the efficiency allows many replicate simulations in a time frame that will not be an issue for serious research – especially if run in parallel. Things do slow a bit when more individuals are needed, but I’ve simulated 100 time steps with over 100000 individuals and found the simulation to take only `22.8` seconds. Memory might be an issue, but I’m currently storing entire resource and observation histories – an option to not do this would cut back massively.
Phan (2003) discusses how agents might optimise behaviour over the course of some number of iterations, which appears analogous to evolution of traits, except that it’s one individual essentially working through a trial-and-error process of finding the best behaviour to adopt to maximise some sort of utility function (in this case, profit). `Beliefs` are reported over time as numeric values that affect behaviour. Phan (2003) likewise considers the situation in which individuals buy or don’t buy something to maximise a surplus via a maximisation function that multiplies a binary variable by the difference between the benefits and costs of a good.
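The buy-or-don't-buy rule can be illustrated with a very small sketch, in which a binary decision variable multiplies the difference between benefit and cost. The names `choose_purchase`, `benefit`, and `cost` are hypothetical, and this is only my reading of the general idea rather than the notation used by Phan (2003).

```r
# Hypothetical sketch: each agent buys only if the benefit exceeds the cost,
# so the realised surplus is never negative
choose_purchase <- function(benefit, cost) {
    buy     <- as.integer(benefit > cost)  # binary decision variable (0 or 1)
    surplus <- buy * (benefit - cost)      # zero whenever the agent does not buy
    data.frame(buy = buy, surplus = surplus)
}

choose_purchase(benefit = c(5, 2, 8), cost = c(3, 4, 8))
```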
Marks (1992), in a now fairly dated paper, looked at modelling generalised prisoner’s dilemmas, which involve continuous rather than discrete strategies, and discusses solutions for optimal strategies, including evolutionarily stable strategies as pioneered by John Maynard Smith. The general approach in Marks (1992) overlaps with G-MSE, in that there are agents (perhaps rational agents) attempting to maximise something through interaction. Marks (1992) first introduces the oligopoly problem, stating, ‘’with a small number of competitive sellers, what is the equilibrium pattern of price and quantity across these sellers, if any?’’ The analogy to managers and stake-holders would seem to be appropriate, perhaps: given a small number of stake-holders, what is the equilibrium value of a set of resources (including population size, farm yield, etc.), if any? To do this we need to understand the agency of the stake-holders and the rules of the game as set by managers.
Marks (1992) considers an economic model of a generalised prisoner’s dilemma with three players, considering the genetic algorithm, a machine-learning technique that makes it unnecessary for a human being to consider a strategy (i.e., the strategies are derived from the conditions of the model). This is the kind of avenue that we want to go down. In fact Marks (1992) puts it quite clearly in the block below:
‘’Mathematically, the problem of generating winning strategies is equivalent to solving a multi-dimensional, non-linear optimization with many local optima. In population genetic terms, it is equivalent to selecting for fitness’’
Hence the overlap between evolutionary game theory and adaptive dynamics models with models that produce optimal strategies for maximising utility in economic situations appears to be quite large, as presumed. Therefore, using evolutionary game theory would appear to be a reasonable way of selecting stake-holder strategies in G-MSE. Delving a bit more into this literature might make the jargon clearer, and identify any subtle differences in the maths or algorithms though. And I’m still not sure how this fits in with machine learning (e.g., if machine learning is just adaptive dynamics under the hood – a quick search doesn’t give an answer to this, so I think it will be necessary to do a bit more reading to understand the two; Marks (1992) differentiates, ‘’[…] advent of [Genetic Algorithms] (and machine learning) means […]’’). Here is an interesting example from a course in machine learning, where the instructor first looks at genetic algorithms – the instructor describes them as the ‘’least practical’’ of machine learning algorithms in the course, but the instructor is also an engineer, so perhaps they’ll be more practical (probably more general, if I’m thinking correctly) for solving G-MSE type problems.
Perhaps one C function (e.g., `adaptive.c`) could go through a learning process of maximising utility for each type of agent (each agent might get intense, depending on how many agents there are). The rules of the game could be passed from `game.c` to `adaptive.c`, where `adaptive.c` also takes in the array of `AGENTS`. From the starting point of each agent’s traits, agents within the program could reproduce themselves with mutation, and selection could minimise some cost function until some sort of maximum is achieved that results in agent trait values that have the highest return on utility. The program `adaptive.c` could therefore take in `AGENTS`:
IDs | type1 | type2 | … | see2 | see3 | G1 | G2 | … | Gn |
---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | … | 0 | 0 | 0 | 0 | … | 0 |
1 | 1 | 0 | … | 0 | 0 | 0.2 | 1.1 | … | -0.1 |
2 | 1 | 0 | … | 0 | 0 | 1.0 | -0.1 | … | -2.7 |
… | … | … | … | … | … | … | … | … | … |
N-1 | 2 | 0 | … | 0 | 0 | 0.4 | -1.1 | … | 0.9 |
N | 2 | 0 | … | 0 | 0 | 2.1 | 3.0 | … | 0.5 |
Where the table above is the data frame of `AGENTS` as it currently exists with additional columns `G1` to `Gn` that could hold real numbers that affect agent behaviour. A dummy data frame could be created that allows for `evo_time` generations of reproduction with mutation and selection for minimising a cost function in an attempt to find appropriate values affecting components of an agent’s strategy. I’m not sure how long such an algorithm would take, but I suspect that it could be optimised to not be painfully long – different criteria could be set, e.g., to allow for a maximum number of evolving generations (the aforementioned instructor suggests 1000) or some convergence criterion. Essentially, each agent or type of agent would go through a process of learning an optimal strategy by creating a lineage of strategies, the descendants of which would be selected by strategy performance. Note that given a convergence criterion, strategies might not need to evolve much in each time step of G-MSE – the best strategy might be stable over time in some situations (and if we don’t want strategies to change over time steps, the question of optimal strategy could be solved when initialising agents – still the idea of allowing dynamic strategies seems interesting, and might be important if management is also changing).
While for some simulations we’ll want to take the time to allow evolution of optimal strategies, in others we might even embrace imperfect strategies evolving as a consequence of short evolution times – this might mimic the limited time that stake-holders have to consider a particular problem.
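The evolving lineage of strategies described above might be sketched as follows. This is a deliberately simple mutation-and-selection scheme with no crossover, written in R rather than C, in which the real-valued `G1` to `Gn` values of one agent are mutated and the best offspring is retained. It maximises a utility function, which is equivalent to minimising a cost function with the sign reversed. The function `evolve_strategy`, its arguments other than `evo_time`, and the toy utility function are assumptions for illustration only.

```r
# Hypothetical sketch: evolve one agent's strategy values (G1 ... Gn) for up
# to evo_time generations, stopping early if no improvement is found for
# `patience` consecutive generations (a simple convergence criterion)
evolve_strategy <- function(genes, utility_fn, evo_time = 1000,
                            n_offspring = 20, mut_sd = 0.1,
                            tol = 1e-6, patience = 25) {
    stopifnot(evo_time >= 1)
    best      <- genes
    best_util <- utility_fn(best)
    stalled   <- 0
    for (gen in seq_len(evo_time)) {
        # Reproduction with mutation: each row is one offspring strategy
        parents   <- matrix(rep(best, each = n_offspring), nrow = n_offspring)
        mutations <- matrix(rnorm(n_offspring * length(best), sd = mut_sd),
                            nrow = n_offspring)
        offspring <- parents + mutations
        # Selection: keep the best offspring only if it beats the parent
        utils <- apply(offspring, 1, utility_fn)
        top   <- which.max(utils)
        if (utils[top] > best_util + tol) {
            best      <- offspring[top, ]
            best_util <- utils[top]
            stalled   <- 0
        } else {
            stalled <- stalled + 1
        }
        if (stalled >= patience) break
    }
    list(genes = best, utility = best_util, generations = gen)
}

# Toy utility function with a single optimum at G1 = 1 and G2 = -2
toy_utility <- function(g) -sum((g - c(1, -2))^2)
set.seed(1)
evolve_strategy(genes = c(0, 0), utility_fn = toy_utility, evo_time = 200)
```

Setting `evo_time` to a small value is one way to produce the imperfect strategies mentioned above, since the lineage is stopped before it has converged.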
A general summary of G-MSE as it exists at the moment
A summary of some of the challenges of putting the ‘G’ in G-MSE
A summary of some ideas for moving forward with G-MSE
Short-term plan
I’m going to finish developing thoughts on evolutionary game theory, then move on to looking at game theory from an economic perspective. I think the biggest thing to consider on the immediate horizon is what kind of approach will be used to simulate agents (stake-holders) playing games and making decisions. Once this is clear, the details can follow. Some sort of utility function will be used. Of particular consideration is how much complexity should be incorporated – or, perhaps, how much mechanistic detail.
Leombruni and Richiardi (2005) make some interesting points regarding use of mathematical versus agent-based models, noting the tractability issues with mathematical (and game-theoretic) models as things become more complex due to unique individuals needing to be represented.
Quick efficiency fix
The best way to manage memory in R is going to be by avoiding `rbind` altogether and working instead with lists, as made very clear by the following quick experiment in `scratch.r`:
```r
################################################################################
# Testing list versus array efficiency
# ARRAY FIRST:
sam <- sample(x = 1:100, size = 14000, replace = TRUE);
dat <- matrix(data=sam, ncol=14);
obs <- NULL;
proc_start <- proc.time();
time <- 1000;
while(time > 0){
    obs  <- rbind(obs, dat);
    time <- time - 1;
}
proc_end   <- proc.time();
time_taken <- proc_end - proc_start;
# TIME TAKEN: 14.09 seconds
# NOW LIST:
sam <- sample(x = 1:100, size = 14000, replace = TRUE);
dat <- matrix(data=sam, ncol=14);
obs <- list();
proc_start <- proc.time();
time <- 1000;
elem <- 1;
i    <- 1;
while(time > 0){
    obs[[i]] <- dat;
    i        <- i + 1;
    time     <- time - 1;
}
proc_end   <- proc.time();
time_taken <- proc_end - proc_start;
# TIME TAKEN: 0.005 seconds
################################################################################
```
Depositing the output into a list is much, much faster. Enough to make me want to fix this immediately. Doing so was trivial – it was just a matter of replacing `RESOURCE_REC <- rbind(RESOURCE_REC, RESOURCES)` with `RESOURCE_REC[[time]] <- RESOURCES`, then editing the plotting functions accordingly given the new data type. The result is that simulations are now much faster, especially when `time` is high (i.e., when simulating many time steps). One hundred time steps used to take 10-12 seconds for some observation times – they now all take under a second. For more time steps, the efficiency difference would grow faster than linearly, because each call to `rbind` copies everything that has been recorded so far. The massively increased efficiency occurs because R no longer allocates a whole new massive chunk of memory for each new recorded data frame – it just adds the new data to the list without copying the older records.
CONCLUSION: THE TIME IT TAKES TO RUN 100 TIME STEPS HAS DECREASED BY AN ORDER OF MAGNITUDE BY SWITCHING FROM DATA FRAMES TO LISTS IN R (NOW LESS THAN 1 SECOND)
Note that plotting still happens slowly, deliberately, because we’re putting the system to sleep for a tenth of a second in each time step to make the animation smooth. When plotting is turned off, this no longer happens.
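For functions that still need a single matrix (e.g., for plotting), one option is to record into a list during the simulation and bind the records together only once at the end. The sketch below also creates the list at its full length up front, which is an optional extra step. The variable names follow the replacement described above, but the dimensions and values are made up for illustration.

```r
time_max     <- 100
RESOURCES    <- matrix(sample(1:100, size = 1400, replace = TRUE), ncol = 14)
RESOURCE_REC <- vector(mode = "list", length = time_max)  # one slot per step

for (time in seq_len(time_max)) {
    RESOURCE_REC[[time]] <- RESOURCES  # record this time step
}

# Bind once, after the simulation, rather than once per time step
all_records <- do.call(rbind, RESOURCE_REC)
dim(all_records)
```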
Proof of concept: Interactive user input as a stake-holder
The code below runs the `gmse` program in a way that is interactive. I have run the simulation and specified that the hunting begins in time step 95.
```r
> sim <- gmse( observe_type = 0,
+              agent_view = 10,
+              res_death_K = 400,
+              plotting = TRUE,
+              hunt = TRUE,
+              start_hunting = 95
+ );
```
This produces the following output. When prompted by the line ‘’`Enter the number of animals to shoot`’’, I have typed in a number and hit enter accordingly.
```
Year: 95
The manager says the population size is 181
You observe 11 animals on the farm
Enter the number of animals to shoot
10
Year: 96
The manager says the population size is 408
You observe 11 animals on the farm
Enter the number of animals to shoot
10
Year: 97
The manager says the population size is 272
You observe 6 animals on the farm
Enter the number of animals to shoot
10
You can't shoot animals that you can't see
6 animals shot
Year: 98
The manager says the population size is 226
You observe 10 animals on the farm
Enter the number of animals to shoot
0
Year: 99
The manager says the population size is 294
You observe 9 animals on the farm
Enter the number of animals to shoot
5
```
The output of this also shows the spatial distribution of resources and a population graph over time. My hope was to allow the `gmse.so` file to be sourced directly from a link so that it could be run by anyone remotely, but I think that this will take a bit more work – worth keeping in mind for later.
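The behaviour in year 97 of the transcript above, where a request to shoot 10 animals is reduced to the 6 that were observed, could be produced by a rule like the one sketched here. The function name `resolve_hunt` is hypothetical and this is not the code used in `gmse`; it only reproduces the messages shown in the transcript.

```r
# Hypothetical sketch: a user cannot shoot more animals than they observe
resolve_hunt <- function(requested, observed) {
    shot <- min(max(requested, 0), observed)  # also guards against negatives
    if (requested > observed) {
        message("You can't shoot animals that you can't see")
        message(shot, " animals shot")
    }
    shot
}

resolve_hunt(requested = 10, observed = 6)
```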
I am still trying to get a clear picture on how to incorporate management, user, and game-theoretic modelling components. Given uncertainty in all of these components, some unified approach would seem beneficial. Franco et al. (2016) have recently introduced a comprehensive approach to evaluate effects of disturbance on coral reefs using a Bayesian Belief Network (BBN) approach. This approach ‘’offers a methodological framework to address uncertainty.’’ It requires some defined outcome state, the probabilities of realisation of which are calculated. Use of BBNs requires an acyclic graph and conditional probability tables. It’s not entirely clear to me how BBNs would be incorporated into the G-MSE simulations, except maybe as a type of observation model? With the simulation, we can look at causality directly and thereby quantify direct and indirect effects, and measurement error. It could, however, be useful to know how well BBNs perform using simulated populations, simulated observational data, and appropriate analysis based on BBNs, as would be used on empirically derived data.
For coauthors, add the G-MSE files onto a public Dropbox so that they can be sourced and run remotely. There are also some useful resources for embedding R in a website. This might be faster than using Shiny, at least at first, so it could be useful for initial demonstrations. It might be useful to show a prototype of G-MSE, or what it might be:
```
## [1] "Managers estimate the population size is 4230"
## [1] "You encounter 35 animals around your farm"
## [1] "Estimated loss of yield is at 5%"
## [1] "Enter how many animals you intend to hunt"
```
Demonstrating this (and it would be quick to implement) might be useful for showing how management and games work.
Side note about computation efficiency
Note that it would really be faster to convert to a list type in R if anything computationally intense needs to be done (e.g., binding rows). C does not appear to let me read in a list via `.Call`, only a vector, so it’s worth thinking later about whether doing some things on the R side will be faster:
Updated scratch.R to show how option 2 could work, though the change itself might be less efficient than binding or other operations.
Issues related to agent-based complex modelling of human decisions
An (2012) reviews humans as agents in agent-based models of social-ecological systems. An (2012) ties this in with complexity theory, and distinguishes agent-based from individual-based models in a useful way – with agent-based models being defined more by attention to decision making processes (as in models of human behaviour). An (2012) asks,
- What methods, in what manner, have been used to model human decision-making and behavior?
- What are the potential strengths and caveats of these methods?
- What improvements can be made to better model human decisions in coupled human and natural systems?
An (2012) reviews nine different types of decision models, and notes that different types of decision models can be mixed and matched, as we’ll likely need to do for G-MSE. I’m not sure that we can assume that stake-holders are the same types of decision-makers. For example, I suspect that farmers might be better represented by a microeconomic model of decision making, with a focus on maximising some sort of revenue or yield. An (2012) notes the use of utility functions here (seeming to link with some of my earlier thoughts), including one in which ecological indicators are included in place of just money (Nautiyal and Kaechele 2009). Apparently, econometric work by McFadden (1973) is foundational to looking at decisions based on utility, modelling decisions as a probability of an agent choosing an option. An (2012) notes that decisions are unlikely to be completely rational, and humans will tend to seek ‘’satisfactory rather than optimal utility’’.
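In the logit form usually associated with McFadden's work, the probability of choosing option \(i\) is \(\exp(U_{i}) / \sum_{j} \exp(U_{j})\), where \(U_{i}\) is the utility of option \(i\). A minimal sketch of how an agent might then make a probabilistic rather than strictly optimal choice is below. The function name `choice_probs`, the option names, and the utility values are all made up for illustration.

```r
# Hypothetical sketch: convert utilities into logit choice probabilities
choice_probs <- function(utilities) {
    z <- exp(utilities - max(utilities))  # subtracting the max avoids overflow
    z / sum(z)
}

utilities <- c(cull = 1.2, scare = 0.4, do_nothing = 0)
probs     <- choice_probs(utilities)
probs

# The agent usually, but not always, picks the option with highest utility
sample(names(probs), size = 1, prob = probs)
```

Because lower-utility options keep a non-zero probability, this kind of rule is one way to represent decisions that are not completely rational.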
A second of the nine types of decision models includes the psychosocial and cognitive models, which attempt to model individuals’ thoughts based on beliefs and goals – institutions can also be modelled this way, though we might think of institutions as collections of the same type of individual for the purposes of G-MSE coding.
One type of modelling that could be especially interesting is what An (2012) defines as ‘’participatory agent-based modelling’’, wherein real stake-holders tell the modeller what they would do under some set of conditions, then the model runs with those decisions. This has been used, apparently, in an agricultural setting (Naivinit et al. 2010), and would be a very interesting addition to G-MSE. If we could have an option for letting a user take over the role of an agent in the model and play against a computer, it could be interesting – though I’d tend to still want to develop some game-theoretic algorithm that grounds predictions of stake-holder behaviour, rather than relying solely on empirically derived data (i.e., asking people what they would do). This could be accomplished in a couple of ways, in principle – one being through the use of a C standalone program (i.e., not linking with R) that prompts the user for input using the `scanf` function and repeatedly updates the simulation with information in every cycle of the G-MSE loop. The same effect can be accomplished in R with the following code as an example of the concept:
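```r
# Ask the user how many geese to shoot in each of `times` iterations, then
# print the net production that results from their choice
act_agent <- function(times){
    while(times > 0){
        cat("\n\n\n How many geese do you shoot? \n\n");
        shot_char  <- readLines(con=stdin(),1);     # one line of typed input
        shot_num   <- as.numeric(shot_char);        # NA if not a number
        gross_prod <- rpois(n=1, lambda=100);       # production before shooting
        net_prod   <- gross_prod - (2 * shot_num);
        cat("\n");
        output     <- paste("Net production = ", net_prod);
        print(output);
        times      <- times - 1;
    }
}
```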
If you read the function into R, then run it (e.g., `act_agent(times = 2)`), it will ask for input over `times` iterations, prompting once per iteration of the `while` loop. An option in G-MSE would be nice to allow:
All of these would be fun, and An (2012) notes that they are often quite useful. Ideally it would be nice to make the program more user-friendly than a command line interface, but that seems like a concern for a version 2.0, after an initial version has been released. More helpfully, using some sort of loop could make for easy input of the R options in the `gmse` function – it could ask, in plain language, for users to insert the numbers that are currently only input within `gmse()` itself (e.g., `gmse(time_max = 100)`).
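A loop of this kind might look something like the sketch below, which asks one plain-language question per option and falls back on a default when the answer is not a number. The function `ask_options`, its `reader` argument, and the wording of the prompts are hypothetical. The option names `time_max` and `res_death_K` are taken from the `gmse` calls shown in this notebook, but the default values used here are just placeholders.

```r
# Hypothetical sketch: prompt for each option in turn and return a list that
# could be passed on to gmse(); `reader` can be swapped out for testing
ask_options <- function(prompts, defaults, reader = readline) {
    stopifnot(identical(names(prompts), names(defaults)))
    opts <- defaults
    for (opt in names(prompts)) {
        question <- paste0(prompts[[opt]], " [", defaults[[opt]], "]: ")
        answer   <- reader(question)
        value    <- suppressWarnings(as.numeric(answer))
        if (!is.na(value)) {
            opts[[opt]] <- value  # otherwise the default is kept
        }
    }
    opts
}

prompts  <- c(time_max    = "How many time steps should be simulated?",
              res_death_K = "What value should be used for res_death_K?")
defaults <- list(time_max = 100, res_death_K = 400)

# In an interactive session, the answers could then be passed to gmse:
# opts <- ask_options(prompts, defaults)
# sim  <- do.call(gmse, opts)
```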
It’s possible that we could develop a type of rudimentary artificial intelligence by collecting data of user decisions (i.e., make a ‘bot’ that mimics human decisions). For example, we could have 100 people act as agents in G-MSE, collect data on the decisions that they make when trying to act like a stake-holder, then construct an algorithm based on real user decisions in different situations (alternatively, or in addition, we could also look at actual past decisions from the case studies to make an algorithm). This could be an interesting approach, albeit a somewhat atheoretical one – it doesn’t excite me quite as much, but it might be worth considering because the end result might predict human behaviour better than theory-driven approaches (as humans don’t always act rationally or think things through carefully – I don’t think a citation is needed for this; it’s 20 JAN 2017, and the current time is 17:00 GMT, or 12:00 EST). It could also be interesting to compare different types of approaches (i.e., have a theory-based approach and an empirically-based approach option). An (2012) warns though that ‘’Even though also based on data, researchers usually have to go through relatively complex data compiling, computation, and/or statistical analysis to obtain such rules’’. An (2012) also notes that this kind of data collection does not necessarily identify why decisions are being made. Hence, I do think game-theory will be absolutely important, with agents using underlying utility functions to maximise their own utilities as a consequence of games.
Some notes on the asymmetric nature of stake-holder games
Games between stake-holders, modelled by `agents` in G-MSE, are typically, if not always, going to be asymmetric. This means that the stake-holders are distinguished by more than their strategies – they are likely to have their own unique payoffs defined by their identities (e.g., as a conservationist, a farmer, etc.). It would seem as though the only way around this – if it’s even possible – might be to make identity part of the game itself. In other words, let agents attempt to maximise some general payoff by deciding to take on a particular role, and then a strategy given their chosen role. It’s an interesting thought, but I don’t think it makes much sense for the practical application of G-MSE. In the context of the games that we’re interested in, stake-holders effectively are conservationists, farmers, hunters, etc. (or some mixture of these roles). Hence, I think we need to work with the idea that the games our stake-holders play and that G-MSE will model are going to be asymmetric.
Maynard Smith and Parker (1976) outlined three specific ways that games might be asymmetric (they were thinking about animal contests, but the general principles apply):
- Pay-off asymmetry: Different players might stand to gain different amounts in the game – e.g., perhaps mutual cooperation returns a higher benefit for one player than another, or defection on the part of one player has a more negative effect on its opponent than vice versa.
- Resource asymmetry: Intrinsic differences between players might give one player an inherent advantage, allowing them to dominate in an interaction (i.e., there might not be much of a conflict because one side can always win).
- Uncorrelated asymmetry: Discussed earlier: Maynard Smith and Parker (1976) define these as asymmetries that ‘’do not affect either the payoffs or the’’ resources that might give one player an intrinsic advantage.
The authors offer some general conclusions about asymmetric games with unequal payoffs, but these are really more about encounters of conflict, and perhaps not so applicable to G-MSE. They state that, where payoffs are unequal but all parties have access to information, it is best to ‘’play high when you have more to gain and zero when you have less to gain’’. In other words, if there is a lot to gain by sticking it out and fighting hard in an interaction, do it – if there’s not much to gain, then back off. Such contests are the central focus of Maynard Smith and Parker (1976), but the general conclusion that ‘’mixed strategies will be the exception’’ when contests are asymmetrical would seem to apply more broadly. Given the many ways that a game can be asymmetrical – rather, that a symmetrical game could be changed to asymmetrical – it would seem likely that there are more ways that cause a strategy to become pure than not pure because there are more ways of adjusting payoffs to make one strategy the clear winner. This could simplify the game theory in G-MSE, in a sense, if mixed strategies do not require much consideration.
McAvoy and Hauert (2015) recently emphasised the importance of asymmetry in evolutionary games, noting that ‘’cooperation may be tied to individual energy or strength, which is, in turn, determined by a player’s role’’. This would seem to apply to social-ecological conflicts as well – cooperation might reasonably be tied to the power (economic, political, etc.) of stake-holders, meaning that it might be important to take this into account in G-MSE modelling. For something like Prisoner’s dilemma, we can represent an asymmetry using subscripts, so the standard game would be represented by a payoff matrix,
\[ \left( \begin{array}{ccc} & C & D \\ C & R, R & S, T \\ D & T, S & P, P \end{array} \right). \]
Where the above satisfies: \(T > R > P > S\). An asymmetric game can be represented by,
\[ \left( \begin{array}{ccc} & C & D \\ C & R_{i}, R_{j} & S_{i}, T_{j} \\ D & T_{i}, S_{j} & P_{i}, P_{j} \end{array} \right). \]
The above is for two different types of players, \(i\) and \(j\). Note that I tried working through the same basic concept with a bit different notation earlier on, with each matrix element being defined by a utility function that is unique to each agent type. In the code, this will all be defined by agent types and their respective traits (columns in the agent_array), but it’s good to link this up with theory and the general properties of asymmetric games.
McAvoy and Hauert (2015) go into the Prisoner’s Dilemma and Snowdrift games given environmental and genotypic asymmetry.
Such asymmetries can complicate evolution of strategies, and, perhaps more relevant for G-MSE, can cause different types of agents to experience different types of games as a result of asymmetry:
‘’[…] Thus, based on the social dilemma implied by the ranking of the payoffs, a player who incurs a cost of \(c_{1}\) for cooperating is always playing a Snowdrift Game while a player who incurs a cost of \(c_{2}\) is always playing a Prisoner’s Dilemma. It follows that ecological asymmetry can account for multiple social dilemmas being played within a single population, even if the players all use the same set of strategies’’ (McAvoy and Hauert 2015 p. 9).
The above quote is with respect to asymmetric payoffs caused by space, but the point is that the asymmetry of the payoff matrix can lead to different players experiencing different games and therefore having different – potentially conflicting – strategies.
We might also apply the concept of genotypic asymmetry with the process of cultural updating, which occurs when the ‘genotypes’ (perhaps stake-holder types) do not change, but the strategies of players can be updated over time. Note that although genetic asymmetry can be reduced to a broader symmetric game given genetic updating (i.e., births and deaths of players of particular types), this is probably not applicable to G-MSE.
Some thoughts on the application of game theory
I’m trying to step back a bit to consider the manager and user models, which will both affect and/or be affected by the game-theoretic component of the model. I’ve considered how the game-theoretic component will fit into G-MSE more generally, and also a bit of how it might be implemented and applied in the context of stake-holder actions. Overall, this will require three c files to be closely integrated, but the application (perhaps even development, if necessary) of game theory requires a lot of thought.
The model will be more general if we allow agents to take any number of actions, but the number of games that are possible increases exponentially with the number of different actions that agents can take (Zeeman 1980). If only two actions are possible (e.g., cooperate and defect), then there are only four types of games that can be played (Prisoner’s dilemma, Snowdrift, Anti-coordination, and Harmony). The number of games increases to 20 for three actions and 228 for four actions (Adami et al. 2016). If we want the software to somehow identify the type of game being played – or rather, if game type identification is to be an essential part of the program – then agent actions will probably need to be limited (there is, of course, always the option to identify games iff there are sufficiently few actions). If most conflicts can be described by a small number of types of agents with a small number of types of actions (and this seems reasonable, perhaps, especially if we think of actions qualitatively), then constraining the software to such cases might be preferable (at least, as a starting point). The benefit is that we might then make clearer predictions for management, e.g.: Right now, stake-holders are playing a Snowdrift game, but by adopting an alternative management decision, they will transition to playing Harmony.
This is appealing, but I think it also relies on payoff matrices being symmetric, meaning that players are distinguished by their strategies and nothing else (McAvoy and Hauert 2015). In the types of games that interest us, this almost certainly won’t be true. The games we’re interested in at ConFooBio will typically be characterised by uncorrelated asymmetry; that is, situations in which agents know that they are of a certain type and will receive payoffs associated with that type of agent. Hence, the payoff structure might look like a Prisoner’s dilemma to one stake-holder, but Harmony to another (i.e., the optimal strategy is always cooperate for one, but always defect for another because each knows the type of agent that they are and how payoffs differ between types).
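To make the idea of different agent types seeing different games concrete, here is a minimal sketch in C. It assumes that each agent type carries its own four payoffs (R, S, T, P) and labels the game from that type's point of view using the usual dominance comparisons for two-action games. The struct, the enum, and the function name are all hypothetical and are not part of the G-MSE code; ties between payoffs are simply treated as "not greater".

```c
#include <stdio.h>

/* Hypothetical container for one agent type's payoffs in a 2 x 2 game */
typedef struct {
    double R; /* reward: both cooperate                */
    double S; /* sucker: I cooperate, opponent defects */
    double T; /* temptation: I defect, opponent coops  */
    double P; /* punishment: both defect               */
} payoffs;

typedef enum {
    PRISONERS_DILEMMA, /* T > R and P > S: defect is always better      */
    SNOWDRIFT,         /* T > R but S >= P: best to do the opposite     */
    COORDINATION,      /* R >= T but P > S: best to match the opponent  */
    HARMONY            /* R >= T and S >= P: cooperate is always better */
} game_type;

/* Label the game as experienced by an agent type with payoffs p */
game_type classify_game(payoffs p){
    int tempted = p.T > p.R; /* defecting on a cooperator pays more */
    int fearful = p.P > p.S; /* defecting on a defector pays more   */
    if(tempted && fearful){
        return PRISONERS_DILEMMA;
    }
    if(tempted){
        return SNOWDRIFT;
    }
    if(fearful){
        return COORDINATION;
    }
    return HARMONY;
}
```

With payoffs of (R = 3, S = 0, T = 5, P = 1) for type \(i\) and (R = 5, S = 1, T = 3, P = 0) for type \(j\), the function labels the same interaction a Prisoner's dilemma for \(i\) and Harmony for \(j\), which is the kind of output that could eventually be reported back to a manager.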
I’m starting to work through these ideas with an initial focus on evolutionary games, as this is the application of game theory with which I’m most familiar, and because I think some of the general developments of evolutionary game theory are probably applicable for our purposes. I’ll also need to read more widely into economics and the social sciences, but some recent work by Adami et al. (2016) and McAvoy and Hauert (2015) seems relevant.
Adami et al. (2016) argue that the optimal strategies predicted by simple mathematical games are unlikely to be very useful for predicting agent actions given the complexities associated with real-world decisions; such complexity notably includes stochasticity, which applies to games among all kinds of agents from ‘’microbes to day traders’’ (Adami et al. 2016). Stochasticity can affect the stability of strategies (see also Adami and Hintze 2013). If strategies are conditional or based on memory of previous encounters, then the number of traits (Adami et al. 2016 assume loci, modelling genetics, but the same applies to agents making decisions) required to model decisions increases rapidly – 21 total traits are needed for conditional expression of strategy when agents can remember the previous two games. In practice, I suspect that there is some helpful way to simplify this – perhaps not every detail of the history of interactions and possible conditions really is needed to (or even would be expected to) model stake-holder behaviour. Instead, I suspect that game history could be boiled down into one or two representative variables that, among other things, are likely to influence agent behaviour. Agents are perhaps better thought of as modelling stake-holders guided primarily by heuristics rather than optimally rational behaviour? Hence the agent_array might better be thought of as containing variables underlying human values and traits in the context of games rather than as solutions to games. A couple recent and potentially relevant papers on decision rules in complex environments include Fawcett et al. (2014) and McNamara et al. (2014). Adami et al. (2016) conclude that ‘’[w]hile evolutionary games can be described succinctly in mathematical terms, they can only be solved exactly for the simplest of cases’’. Adami et al. (2016) were specifically considering games in an evolutionary context, but I don’t think that their conclusion is limited to evolutionary game theory. In the case of decision making stake-holders, the complexity associated with stochasticity and uncertainty, the possibility of more than two actions and payoffs, and the asymmetry of payoff matrices would all seem to contribute to the difficulty or impossibility of solving for exact solutions. Hence, when scenarios are complex in G-MSE (as we probably need them to be), it is unlikely that analytic solutions will be of much use. However, stake-holders won’t evolve in the same sense as biological organisms, so some techniques used in evolutionary game theory will be unavailable – or have to be modified. It might be worth thinking more about identifying the consequences of practical or observed strategies, or types of strategies, rather than trying to somehow solve for the best strategies. The Axelrod experiments kind of did this before a lot of complex techniques became available to analyse evolutionary games. Users proposed strategies, which were put into a tournament – the point wasn’t to solve the iterated Prisoner’s dilemma so much as to explore different strategies for playing the game.
In this browser app, you can play the iterated prisoner’s dilemma against ‘Lucifer’, an automated agent that responds to your decisions.
NEW ISSUE 9: Observation Error It would be useful to incorporate observation error into the simulations more directly. This could be affected by one or more variables attached to each agent, which would potentially cause the mis-identification (e.g., incorrect return of seeme) or mis-labelling (incorrect traits read into the observation array) of resources. This could be done in either of two ways:
Cause the errors to happen in ‘real time’ – that is, while the observations are happening in the simulation. This would probably be slightly inefficient, but have the benefit of being able to assign errors specifically to agents more directly.
Wait until the resource_array is marked in the observation function, then introduce errors to the array itself, including errors to whether or not resources are recorded and what their trait values are. These errors would then be read into the obs_array, which is returned by the function.
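As a rough illustration of the second option, the sketch below walks over an already-marked array and un-marks each observed resource with some probability. Everything here is an assumption made for illustration: the function name add_obs_error, the seen_col column index, and the single miss_pr parameter do not exist in the current code, and only missed detections (not mis-labelled traits) are covered. It uses the C standard library rand() for brevity; within the package it would make more sense to draw from R's own random number generator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical post-marking error step: each resource marked as seen is
 * missed (its seen flag set back to 0) with probability miss_pr.
 * Returns the number of resources that were un-marked. */
int add_obs_error(double **obs_array, int obs_rows, int seen_col,
                  double miss_pr){
    int row;
    int missed = 0;
    double u;

    assert(miss_pr >= 0.0 && miss_pr <= 1.0);

    for(row = 0; row < obs_rows; row++){
        u = rand() / (RAND_MAX + 1.0); /* uniform draw on [0, 1) */
        if(obs_array[row][seen_col] > 0 && u < miss_pr){
            obs_array[row][seen_col] = 0;
            missed++;
        }
    }
    return missed;
}
```

Because the error is applied after marking, it cannot easily be tied to the individual agent that made the observation, which is the trade-off against the ‘real time’ option.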
NEW ISSUE 10: Multiple resource
The resource-wide parameter values (e.g., carrying capacities, movement types) will need to be either:
resource function as necessary, and/or
gmse function, the length of which could determine how many times resource is called in one time step (one for each type of resource, potentially, if carrying capacity is type specific – or carrying capacity could be applied within a type in c – perhaps more efficient, but would require reading in multiple K somehow, either through the paras vector or in the resources array – or something else). How to do this best will need to consider both computational efficiency and clarity/ease of coding.
Note that:
res_remove can already be called in a type-specific way by resource, so it might just be better to call resource once and somehow input variable numbers of K into c. I’ll need to think more about this, but it could be something like assigning each individual a competition coefficient alpha for how it is affected by each other type of individual. Intra-type competition could then be modelled generally, with K defined by its inverse. Meanwhile, inter-type competition coefficients could also be useful.
Along these lines, it’s also worth considering an option allowing only one resource per cell (equating to a local alpha and K of one). This might be worth making its own issue later.
If we were to call resource multiple times, we would also need to paste arrays together in R. This wouldn’t be terrible, but it could lose some efficiency unnecessarily, and I don’t see the benefit.
MEMORY LEAK CHECK OF R CODE
I have tried running simulations at very high population sizes (>100000) to see how the simulation would react. Upon seeing quite a bit of memory being used up, I ran the following valgrind command:
R -d "valgrind --tool=memcheck --leak-check=yes" --vanilla < gmse.R
The program valgrind found a lot of large memory allocations and deallocations (as expected):
Warning: set address range perms: large range
The leak summary was as follows:
==14507== LEAK SUMMARY:
==14507== definitely lost: 133,373,728 bytes in 469 blocks
==14507== indirectly lost: 11,472,512 bytes in 55 blocks
==14507== possibly lost: 120,863,992 bytes in 563 blocks
==14507== still reachable: 2,319,742,586 bytes in 12,127 blocks
==14507== suppressed: 0 bytes in 0 blocks
==14507== Reachable blocks (those to which a pointer was found) are not shown.
==14507== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==14507==
==14507== For counts of detected and suppressed errors, rerun with: -v
If we shift to look only at one run of the resource model, which is run in the new script scratch.R, we get:
==14689== LEAK SUMMARY:
==14689== definitely lost: 3,584 bytes in 4 blocks
==14689== indirectly lost: 0 bytes in 0 blocks
==14689== possibly lost: 0 bytes in 0 blocks
==14689== still reachable: 28,837,506 bytes in 13,346 blocks
==14689== suppressed: 0 bytes in 0 blocks
==14689== Reachable blocks (those to which a pointer was found) are not shown.
==14689== To see them, rerun with: --leak-check=full --show-leak-kinds=all
And if we include one run of the observation model too, we get:
==14721== LEAK SUMMARY:
==14721== definitely lost: 6,296 bytes in 8 blocks
==14721== indirectly lost: 0 bytes in 0 blocks
==14721== possibly lost: 0 bytes in 0 blocks
==14721== still reachable: 28,948,434 bytes in 13,355 blocks
==14721== suppressed: 0 bytes in 0 blocks
==14721== Reachable blocks (those to which a pointer was found) are not shown.
==14721== To see them, rerun with: --leak-check=full --show-leak-kinds=all
A bit more worrisome, if I run an old R script (a simple individual-based model), I get the following:
==15050== LEAK SUMMARY:
==15050== definitely lost: 0 bytes in 0 blocks
==15050== indirectly lost: 0 bytes in 0 blocks
==15050== possibly lost: 0 bytes in 0 blocks
==15050== still reachable: 36,846,063 bytes in 15,996 blocks
==15050== suppressed: 0 bytes in 0 blocks
==15050== Reachable blocks (those to which a pointer was found) are not shown.
Originally, I feared that this might suggest a problem with my c code, or its call to R. All the memory allocated appears to be freed though. Some searching online suggests that valgrind is not always perfect on this front.
‘’You may be surprised to see that valgrind believes that R has leaked memory - unfortunately, it is not perfect, and in this particular case the memory is not so much ’leaked’ as it is ‘cached for the duration of that R session’, and valgrind fails to detect that ‘ownership’ of a particular block of memory is transfered.’’
This is likely what happened (given the original warning). In fact, if we run valgrind and try to track the origin of the leak with --track-origins=yes, it complains in exactly this way – about memory that is allocated but definitely freed:
R -d "valgrind --tool=memcheck --leak-check=yes --track-origins=yes" --vanilla < scratch.R
Below, for example, valgrind is complaining about line 468 in the resource.c file:
==15171== 1,560 bytes in 1 blocks are definitely lost in loss record 165 of 1,867
==15171== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15171== by 0xC2959DE: resource (resource.c:468)
==15171== by 0x4F0A57F: ??? (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F4272E: Rf_eval (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F43DDC: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F422FC: Rf_eval (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F45FB5: ??? (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
This line allocates memory for the res_new array:
res_new = malloc(res_num_total * sizeof(double *));
for(resource = 0; resource < res_num_total; resource++){
res_new[resource] = malloc(trait_number * sizeof(double));
}
This appeared to have been freed correctly, but on inspection, each malloc to an array was missing a corresponding free. I have fixed this (with thanks to this StackOverflow thread), and now the entire gmse.R program produces the following valgrind output:
==15405== LEAK SUMMARY:
==15405== definitely lost: 0 bytes in 0 blocks
==15405== indirectly lost: 0 bytes in 0 blocks
==15405== possibly lost: 0 bytes in 0 blocks
==15405== still reachable: 1,544,824,322 bytes in 12,119 blocks
==15405== suppressed: 0 bytes in 0 blocks
CONCLUSION: MEMORY LEAK HAS BEEN IDENTIFIED AND FIXED
While this wasn’t a huge deal for small scale simulations, for simulations with huge arrays caused by large population sizes, this would have made a difference. The code has therefore been corrected and pushed to dev.
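For future reference, the general pattern that avoids this kind of leak is to pair every row-level malloc with its own free before freeing the outer pointer. The sketch below is a generic illustration rather than the actual fix in resource.c; the helper names alloc_2d and free_2d are hypothetical.

```c
#include <stdlib.h>

/* Allocate a rows x cols array of doubles (zero-filled) as an array of row
 * pointers. Returns NULL if any allocation fails, cleaning up as it goes. */
double **alloc_2d(int rows, int cols){
    int r;
    double **arr = malloc((size_t) rows * sizeof(double *));
    if(arr == NULL){
        return NULL;
    }
    for(r = 0; r < rows; r++){
        arr[r] = calloc((size_t) cols, sizeof(double));
        if(arr[r] == NULL){
            while(r-- > 0){ /* free the rows allocated so far */
                free(arr[r]);
            }
            free(arr);
            return NULL;
        }
    }
    return arr;
}

/* Free each row first, then the array of row pointers itself */
void free_2d(double **arr, int rows){
    int r;
    if(arr == NULL){
        return;
    }
    for(r = 0; r < rows; r++){
        free(arr[r]);
    }
    free(arr);
}
```

Freeing only the outer pointer releases the array of row pointers but not the rows themselves, which is the kind of loss that valgrind reports as ‘definitely lost’ at the line of the row-level malloc.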
With all of this in mind, it is worth thinking about the R side of memory management as it becomes more relevant (see R memory management advice). It might be worth switching to a list structure for input and output so that entire frames are not copied for each operation (which I assume R is doing for the rbind() function). It might also be worth thinking about running rm() and gc() in tandem to release memory during the major loop – or also getting rid of some components of the data frame on the fly. The gmse.R program could potentially switch from a list to an array after the major simulation loop finishes and plotting or returning the array is necessary.
It appears that I’m correct regarding the use of rbind() (or c() or cbind()) – these are terribly inefficient with respect to what’s happening under the hood when R calls C (or C++). I’ve downloaded Svetlana Eden’s Efficiency tips for basic R loop, which might be a useful reference when working on the R side of optimisation. The rbinds really should be avoided, if possible. One way to do this, if nothing else, would be to write to a file instead of cbind (not sure if this would be helpful for a shiny app). StackOverflow suggests using rbindlist, but this would introduce dependencies that I’d prefer to avoid. In the end, it might be worth it to just write a quick add_data.c script in c for the sole purpose of joining old and new arrays. Alternatively, this might not be so important – in the end, it might not even be necessary to record the entire observation history; at least, not in the way it’s currently being done. The history might instead only record a few key things from each time period.
RESOLVED ISSUE 6: Sampling ability with agent number This issue has been resolved to my satisfaction. I did this using the second option of addressing it. Now for case 3, in which blocks of the landscape are iteratively sampled (and resources potentially move in between iterations), a transect_eff defining transect efficiency is set as equal to the number of observing agents (working_agents). The transect_eff is a counter, which, after it has counted down to zero, will permit resource movement. Hence, if there is only one agent observing, transect_eff hits zero and movement happens after every iteration; if there are two agents observing, then transect_eff hits zero after two iterations, then movement occurs and transect_eff is reset to working_agents.
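The countdown logic can be sketched in isolation as below. This is a stripped-down illustration and not the code in observation.c: the function count_res_moves is hypothetical, and it only counts how many times resources would be allowed to move over a given number of sampled blocks, with the sampling and movement themselves left as comments.

```c
#include <assert.h>

/* Count how many times resources get to move while total_blocks blocks
 * are sampled by working_agents observers. */
int count_res_moves(int total_blocks, int working_agents){
    int block;
    int moves = 0;
    int transect_eff = working_agents; /* countdown until resources move */

    assert(working_agents > 0);

    for(block = 0; block < total_blocks; block++){
        /* ... one block of the landscape would be sampled here ... */
        transect_eff--;
        if(transect_eff == 0){
            moves++; /* ... resources would move here ... */
            transect_eff = working_agents; /* reset the countdown */
        }
    }
    return moves;
}
```

Over 10 blocks, one observer lets resources move 10 times, two observers 5 times, and three observers 3 times, so more observers means fewer opportunities for resources to be missed or double counted.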
RESOLVED ISSUE 8: Clear up method sampling type in observation model This issue has been resolved, albeit with cases in a different order than suggested (the original suggestion, it turns out, was not ideal). Cases are now:
0. Sampling with a range of view (i.e., don’t rely on the fix_mark > 0 for switching methods)
1. Sampling fix_mark times randomly on the landscape.
2. Linear transect
3. Square transect
Of course, there is always room for more, but these are now four clear observation methods. Separating case 0 from case 1 is especially useful now. Now the variable fix_mark is just ignored for all cases except 1. In the code, both case 0 and case 1 still look similar, and both dig deep through mark_res and field_work functions to differentiate between observation methods, but I don’t think that this is necessarily a bad thing – a different argument to mark_res differentiates them now, at least, in the observation function, so it’s not too difficult to trace through what is going on. Note that both cases 0 and 1 add a new column for each times_obs, which isn’t done for the transect methods.
The specially created branch fix_home_bug has been merged. I will keep it alive for a while before removing it entirely.
Update – 14:33, after rewriting the gmse.R code to make an easier catch-all function, with appropriate analysis, I’ve noticed that the binos function in observation.c is defining distance in a way that is no longer really compatible with the point of case 0 (i.e., sample a small area and extrapolate based on the density). The binos function was looking at the Euclidean distance, making, e.g., 3 cells away diagonally farther than 3 cells away left or right (or up or down). This might be useful later, so I’m going to keep it in as an option, but I’m also going to make the default now as within view cells in any direction, such that a block forms around the focal individual, and diagonal distances are not assumed to be longer than length and width. This is the more common way of simulating things, and it makes movement and observation estimates easier – I think the only reason to change it back to Euclidean distance would be if we had an actual map and really needed to be precise with the distance of things on it.
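The difference between the two distance rules is easy to see in a small sketch. The function below is hypothetical (it is not the binos function) and makes two assumptions: a resource exactly view cells away counts as visible, and there is no wrapping around the landscape edge.

```c
#include <stdlib.h>

/* Is a resource at (res_x, res_y) within view of an agent at (ag_x, ag_y)?
 * If euclidean is non-zero, use straight-line distance; otherwise use a
 * square block extending view cells in every direction. */
int in_view(int ag_x, int ag_y, int res_x, int res_y, int view,
            int euclidean){
    int dx = abs(res_x - ag_x);
    int dy = abs(res_y - ag_y);
    if(euclidean){
        /* compare squared distances to avoid needing sqrt() */
        return dx * dx + dy * dy <= view * view;
    }
    return dx <= view && dy <= view;
}
```

With view = 3, a resource 3 cells away diagonally (dx = 3, dy = 3) is seen under the block rule but not under the Euclidean rule, while a resource 3 cells directly to the right is seen under both. The block rule also makes the sampled area a simple (2 * view + 1)^2 cells, which is convenient when extrapolating density.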
I have also simplified the master R file gmse.R to allow for one function to do all of the work, using several default options for simulations. Below, the main gmse() function is shown with its default values.
################################################################################
# PRIMARY FUNCTION (gmse) FOR RUNNING A SIMULATION
# NOTE: RELIES ON SOME OTHER FUNCTIONS BELOW: MIGHT WANT TO READ WHOLE FILE
################################################################################
gmse <- function( time_max = 100, # Max number of time steps in sim
land_dim_1 = 100, # x dimension of the landscape
land_dim_2 = 100, # y dimension of the landscape
res_movement = 1, # How far do resources move
remove_pr = 0.0, # Density independent resource death
lambda = 0.9, # Resource growth rate
agent_view = 10, # Number cells agent view around them
agent_move = 50, # Number cells agent can move
res_birth_K = 10000, # Carrying capacity applied to birth
res_death_K = 400, # Carrying capacity applied to death
edge_effect = 1, # What type of edge on the landscape
res_move_type = 2, # What type of movement for resources
res_birth_type = 2, # What type of birth for resources
res_death_type = 2, # What type of death for resources
observe_type = 0, # Type of observation used
fixed_observe = 1, # How many obs (if type = 1)
times_observe = 1, # How many times obs (if type = 0)
obs_move_type = 1, # Type of movement for agents
res_min_age = 1, # Minimum age recorded and observed
res_move_obs = TRUE, # Move resources while observing
Euclidean_dist = FALSE, # Use Euclidean distance in view
plotting = TRUE # Plot the results
){}
Using the function defined above, with most parameters set to default values, I looked at the four different observation types below given the following parameters.
# A: Sample of a 10 by 10 region to estimate density
# Simulation time: 1.8 seconds
gmse( observe_type = 0,
agent_view = 10,
res_death_K = 800,
plotting = TRUE
);
# B: Mark 30 resources 4 times, recapture 30 4 times
# Simulation time: 2.1 seconds
gmse( observe_type = 1,
fixed_observe = 30,
times_observe = 8,
res_death_K = 800,
plotting = TRUE
);
# C: Sample agent_view rows at a time -- all across
# Simulation time: 2.3 seconds
gmse( observe_type = 2,
agent_view = 10,
res_death_K = 800,
plotting = TRUE
);
# D: Sample agent_view by agent_view blocks at a time -- all across
# Simulation time: 6.5 seconds
gmse( observe_type = 3,
agent_view = 10,
res_death_K = 800,
plotting = TRUE
);
These four simulations A-D, which had identical population models and similar observation modes, produced the four graphs below.
Overall, these simulations have been stable throughout testing, and I am (finally) merging the dev branch to master, pushing to GitHub, and declaring this v0.0.5.
A couple updates that have been made, or need to be fixed. I’ll do these tomorrow, as they probably won’t require much more than a few hours in the morning.
I’ve created a new temporary branch, fix_home_bug, after noticing a crash from my home laptop. It seems that I hadn’t initialised the added variable at zero in the res_add function of the resource.c file. At the office computer, it seemed to initialise it at zero automatically (or I’d not played with the right parameters to get it to crash), but at home, it was often getting initialised to very high values and crashing. I’ve fixed the issue on the new branch, but it needs to be merged.
NEW UNRESOLVED ISSUE #8: Clear up method sampling type in observation model The method sampling for case 0 is too confusing. Sometimes it means randomly sampled fix_mark individuals from the population, and sometimes it means sample within a particular range of view. Change this so that the switch functions have four clear cases:
fix_mark times randomly on the landscape.
This will avoid a lot of hassle, even if the code for cases 0 and 3 end up looking the same, or very similar. It’s just very confusing to manage as it is now.
ISSUE #6 STILL NEEDS RESOLVING I was working on this when I found the bug resolved on the new branch. It didn’t take too long, and it should be an easy fix while I take care of issue 8.
TIME ISSUES: While the simulations run quickly in the office computer, 100 time steps now take about 8 seconds for the loop on my Lenovo Thinkpad X201 – something to be aware of as the coding continues.
FOR TOMORROW: Make a summary that includes an example of all 4 types of observation models and their appropriate analyses (quickly fix the plotting to do the correct analyses automatically):
case 0: View-based sampling in which the density is sampled and applied to the whole size of the landscape (as in Nuno et al. 2013),
case 1: Mark-recapture sampling where there is some fixed number marked at each time and estimates show Chapman style analysis,
case 2: Sampling along a linear transect as resources move, and
case 3: Sampling using blocks as resources move.
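For the mark-recapture analysis in case 1, the Chapman estimator of population size is \((M + 1)(C + 1)/(R + 1) - 1\), where \(M\) is the number marked in the first sample, \(C\) the number caught in the second, and \(R\) the number of marked individuals recaptured. A minimal sketch is below; the function name is hypothetical, and how the marking and recapture periods are pooled from the observation array is left open.

```c
/* Chapman estimate of population size from a mark-recapture sample:
 * marked     = number marked in the first sample (M)
 * caught     = number caught in the second sample (C)
 * recaptured = number in the second sample already marked (R) */
double chapman_estimate(int marked, int caught, int recaptured){
    return ((marked + 1.0) * (caught + 1.0)) / (recaptured + 1.0) - 1.0;
}
```

Unlike the simple ratio \(MC/R\), this form stays finite when no marked individuals are recaptured, which could happen in simulations where few resources are marked relative to population size.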
Some updated code is on the fix_home_bug branch, which can be merged into the dev branch once it’s done and is stable after some testing in the office (i.e., try to crash it).
Below is a bit of additional coding, which resulted in two new ways (which is really just one flexible way) that observation can occur. There are a couple of trivial fixes and additions to make (see new issues 6 and 7), but these should be easy to implement. For now, it’s time to take a step back and plan a bit more generally, especially with respect to implementing the game-theoretic component of the modelling.
RESOLVED ISSUE #5: Sweep observation This issue has now been resolved. There are now two additional ways to observe populations, as guided by the method variable used in the main switch of the observational model. In biological terms, the observational model allows us to sample in the following two ways:
1. By sampling view rows at a time, starting from the top of the landscape and working down to the bottom. Each time a new row is sampled, resources on the landscape can move (resource movement can also be turned off if desired). Hence, it is possible for observers to miss or double count resources. The bigger view is, the fewer iterations of sampling are needed to make it all the way across the landscape, hence fewer total times resources will move over the course of sampling.
2. Identical to 1, but instead of sampling a full row and working down, observers start in the upper left corner of the landscape and sample around a view by view block, and hence a total of view^2 cells. Sampling proceeds with blocks across rows until sampling of the very right side of the landscape has occurred. After sampling all to the end of the right side, observers move down, sampling another row of view by view blocks just beneath the first. This continues until the entire landscape has been sampled, and roughly simulates an observer working their way through the whole landscape over time (time in which resources might move).
Note: The first case is redundant, and therefore will probably be removed later, but it helped as a scaffold for the more general procedure and takes up little space; for now, I’ll leave it.
Testing on both of the above cases was successful (see the figure below). In each case, if resources are not allowed to move, then observers predict resource abundance with 100 percent accuracy (i.e., they sweep through the landscape and count all of the stationary resources). If resources can move, there is a bit of (normally distributed, it appears, and should be – can look later) error around the actual abundance. Either of these two methods of observation work fairly efficiently until view gets very low (ca 2), in which case a lot of sampling happens in each generation.
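To see why small values of view become expensive, it helps to count the sampling iterations needed to cover the landscape. The sketch below is a back-of-the-envelope helper, not part of the observation code; the function name is hypothetical, and it assumes that each band of view rows (first method) or each view by view block (second method) counts as one iteration, with a partial band or block at the edge still costing a full iteration.

```c
#include <assert.h>

/* Number of sampling iterations needed to sweep a land_x by land_y
 * landscape, view rows at a time (use_blocks = 0) or in view by view
 * blocks (use_blocks = 1). */
int sweep_iterations(int land_x, int land_y, int view, int use_blocks){
    int down;   /* bands needed to get from top to bottom */
    int across; /* blocks needed to get from left to right */

    assert(view > 0 && land_x > 0 && land_y > 0);

    down   = (land_y + view - 1) / view; /* integer ceiling of land_y/view */
    across = (land_x + view - 1) / view; /* integer ceiling of land_x/view */

    return use_blocks ? down * across : down;
}
```

On a 100 by 100 landscape with view = 10, this gives 10 iterations for rows and 100 for blocks; with view = 2, the block method needs 2500 iterations in every time step, each potentially followed by resource movement.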
After each sampling, resources moved an average of ca 5 cells away, with a distribution as shown below (the figure below shows the distance that an individual moves in one time step – between successive iterations of observer sampling along a transect).
I did not code sampling using the initially considered method, with agents physically moved to locations and then looking around. Instead, resources are just considered counted if they are within the row or block under consideration. To account for multiple agents sampling, view is actually first multiplied by the number of agents sampling (only 1 for now). This makes sense for case 1, but for case 2, sampling ability actually increases with the square of agent number, so this will need to be changed (adding a new issue).
INTRODUCE NEW ISSUE #6: Sampling ability with agent number
In case two of the observational model, the length and width of a sampling block will both increase linearly with the number of agents doing the sampling; hence, sampling area increases with the square of the number of observers, which is probably unrealistic. There are two ways to potentially address this:
+= (int) agent_array[agent][8]. Only allow resources to move when this countdown hits zero, and reset it thereafter. Hence, observers will observe n more blocks if there are n more observers.
INTRODUCE NEW ISSUE #7: DENSITY TYPE SAMPLING
Of course, it will be easy to make this kind of transect sampling random instead of comprehensive over the landscape. This can be done by simply randomly choosing the positions of blocks on a landscape some obs_iter number of times. This could allow an estimate of population size by considering density (i.e., assume that the number counted in a sampled block reflects the density of the larger landscape of known size), as was done by Nuno et al. (2013). This shouldn’t take much time to code and test.
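The density-based estimate itself is just a scaling of the count by the ratio of total to sampled area. A minimal sketch is below, with a hypothetical function name and the assumption that cells_sampled is the total number of cells covered across all sampled blocks (so overlapping blocks would need handling separately).

```c
#include <assert.h>

/* Estimate total abundance by assuming the density in the sampled cells
 * holds across the whole land_x by land_y landscape. */
double density_estimate(int counted, int cells_sampled, int land_x,
                        int land_y){
    assert(cells_sampled > 0);
    /* multiply before dividing to keep whole-number cases exact */
    return ((double) counted * land_x * land_y) / cells_sampled;
}
```

For example, counting 5 resources in a single 10 by 10 block of a 100 by 100 landscape gives an estimate of 500 resources.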
I’m going to start referring to issues that are introduced and resolved in the gmse GitHub repository by number.
RESOLVED ISSUE #4: Repeat calls of resource within resource.R
Now poorly named given the solution. The result is a brief update on the addition of a bit of a side function. The function anecdotal is now available in the observation.c file, and is called from the anecdotal() function in the file anecdotal.R. All this function does is cause agents of one or all types to count the number of a particular resource within the agents’ view. It is similar to the observation function, but instead of returning an array of observations of resources (augmented with columns for different observation periods – see 10 JAN) that is intended to be used by R separately, the anecdotal function adds the number of resources viewed in an agent’s vicinity to a column in the agent array. The name reflects the idea that the function adds to an agent’s general mood or impression of the quantity of a resource, based on anecdotal evidence for what’s going on around their location. We can imagine such anecdotal evidence as affecting the opinions and behaviours of stake-holders.
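The counting step described above can be sketched roughly as follows. This is an illustration only: the name anecdotal_count, its arguments, and the use of Euclidean distance to define an agent's view are assumptions, and the sketch does not filter by resource type or write to the agent array (the caller would store the returned count in the relevant agent column).

```c
#include <stddef.h>

/* Count the resources within `view` cells (Euclidean distance) of one agent.
 * All names are illustrative and are not those used in observation.c. */
int anecdotal_count(int agent_x, int agent_y, int view,
                    const int *res_x, const int *res_y, size_t n_res)
{
    int seen = 0;
    long view_sq = (long) view * view; /* compare squared distances */
    size_t i;

    for (i = 0; i < n_res; i++) {
        long dx = res_x[i] - agent_x;
        long dy = res_y[i] - agent_y;
        if (dx * dx + dy * dy <= view_sq) {
            seen++;
        }
    }
    return seen;
}
```

Comparing squared distances avoids a square root, so the sketch needs nothing beyond integer arithmetic.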
INTRODUCE NEW ISSUE #5: Sweep observation
Related to discussions with Jeremy and Tom regarding the Islay geese, we need to have a kind of observational model in which agents move to take measurements, but resources move along roughly the same time scale. This can of course be accomplished one way if we:
if(resource_movement == 1) type criteria at the tail end before the break (to avoid unnecessary movement). This will require also including the resource movement function (currently in resource.c) in the observation.c file. May as well just dump the whole thing in, in the interest of modularity, though if it stays the same, it will be tempting to create a utils.c file of some sort. This resource movement option can be applied to the existing method case 0, as appears in the switch function of the observation function.
To do a sweep of the landscape while allowing resources to move, I think we’ll want a completely different method of population size estimation (most upstream switch function). What this method will do is:
x location of x = 0 on the landscape
x locations x to x+view (i.e., observe view rows)
x = x+view
x+view is greater than the y dimension land_y
x to land_y.
The procedure above will simulate observations over a time that decreases as their view (and thus ability to census) increases – the more time it takes, the more the resources can move and potentially lead to measurement error. The observational array returned will still be output in the same way – resources will be marked as with the case 0 option and read out as an observational array.
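The sweep procedure above can be sketched in one dimension as follows, under several assumptions: resources are reduced to their x locations, a band covers locations x to x+view (exclusive of x+view, so view rows are observed), the final band is truncated at the landscape edge, and resources move once after each band is observed. The names sweep_count, move_fn, and shift_right are hypothetical; shift_right is a deterministic stand-in for the random movement in resource.c, used only so that the behaviour is easy to check by hand.

```c
#include <stddef.h>

/* Hypothetical callback type: moves all resources once between bands. */
typedef void (*move_fn)(int *x_loc, size_t n, int land_x);

/* Deterministic stand-in for resource movement: every resource shifts one
 * cell to the right, wrapping around at the landscape edge. */
void shift_right(int *x_loc, size_t n, int land_x)
{
    size_t i;
    for (i = 0; i < n; i++) {
        x_loc[i] = (x_loc[i] + 1) % land_x;
    }
}

/* Sweep the landscape in bands of width `view`, counting the resources in
 * each band and then letting resources move (if `mover` is not NULL).
 * Returns the total count over all bands, or -1 on invalid input. */
int sweep_count(int *x_loc, size_t n, int land_x, int view, move_fn mover)
{
    int total = 0;
    int x = 0;
    size_t i;

    if (view < 1 || land_x < 1) {
        return -1;
    }
    while (x < land_x) {
        int x_hi = x + view;
        if (x_hi > land_x) {
            x_hi = land_x; /* final, shorter band */
        }
        for (i = 0; i < n; i++) {
            if (x_loc[i] >= x && x_loc[i] < x_hi) {
                total++;
            }
        }
        if (mover != NULL) {
            mover(x_loc, n, land_x);
        }
        x = x_hi;
    }
    return total;
}
```

With no movement, every resource is counted exactly once. With shift_right, a single resource at x = 3 on a landscape of width 10 with view = 4 is counted in the first band, moves to x = 4, and is counted again in the second band, which is the kind of double counting that produces measurement error in a sweep.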
Note: It would be nice to eventually allow for blocks rather than long linear transects to be sampled, as square blocks might more realistically correspond to the kind of sampling that would be done by a real observer. I don’t think that this would make too much difference in terms of finding sampling error, as there is no bias to resource movement in one direction; hence, the turnover of resources for any particular number of cells will be the same for any N cells sampled. It also stands to reason that this error should be normally distributed as the number of sampling attempts becomes large, and the error should be mean centred around the actual population size, since the probabilities of missing and double counting would seem to cancel out exactly. This might eventually lead to an analytical estimate of observation and error actually being reasonable under some conditions.
Plan for the near future
I will try to implement this new idea tomorrow, as I don’t think it will take much more than a day’s work, if that. Then, it’s really time to take a step back and think – need to read Nuno et al. (2013) in more detail first, perhaps tonight, and potentially also add the observation model procedure used therein as different implementations of case – this should be very similar to the solution for ISSUE 5, except through the use of random sampling of area and density measuring of resources. We’ll then be in a position of having a stable resource and observation model with a few different options for observation, and I’ll need to think more carefully about the big picture, and how to proceed with the rest of the model.
We now have a working G-MSE v0.0.4, which includes a stable population model and a stable observation model. The figure below shows the visual output of the new version, with the landscape in the top panel (note: different tan colours don’t mean anything yet – the landscape is effectively uniform); resources (i.e., individuals in the population) are represented in black. In the bottom panel, the solid black line shows the actual change in (adult) population size over time, stabilising around a carrying capacity of 400 (red dotted line). The dark solid cyan line shows an estimate of the population size from the observation model, simulated through mark-recapture (other types of observation are available, see below). The shading around this line shows \(95\%\) confidence interval estimates. More details about this specific estimate below.
I’ve made a few minor updates to the population model code, and included one new type of movement that is allowed – borrowed from individual-based modelling literature on plant-pollinator-exploiter interactions (Bronstein et al. 2003; Duthie and Falcy 2013). This type of movement makes use of an individual’s movement parameter move by having an individual move Poisson(move) times each time step, and with each movement travelling up to move cells away (Euclidean distance). This type of movement is case 0 in the mover function in resource.c.
This update includes the major addition of the observation.c file, called by observation.R to simulate the sampling of resources (i.e., individuals) from the population model. The file observation.R holds the observation() function, which returns a data frame of observed resources. The observation function thereby simulates the process of acquiring observational data, but not analysing those data. Analysis of these data is left to R, or to a (not yet written) C function (note, current analyses are fairly simple).
The function observation.R requires the following three data frames:
resources: holds all of the resources simulated.
landscape: holds the landscape on which resources and agents are located.
agent: holds all of the agents simulated (this also includes at least one manager of type 0 – even if the manager does not eventually participate in games).
The observation.R function also requires the paras vector, which holds all parameters that might be important throughout the simulation.
Optional inputs include:
type, which specifies the type of resource being observed (default = 1).
fix_mark, which either sets a fixed number of resources to be sampled during an observation (positive integer value) or sets an observer to “observe” all resources in its view (0 or FALSE).
times, which sets how many times an observer will make observations during a time step (must be >0).
samp_age, which defines the minimum age at which resources are sampled (the default is set to 1, meaning that resources just added are not sampled – could conceptualise this as sampling only adults; for now, it also makes the initial testing easier because carrying capacity has not yet been applied to juveniles before observation – can change this, of course).
agent_type, which identifies which agents are doing the observing. The default value is 0, which identifies the managers in the model. For most purposes, we will only need to have managers doing the observing, but there is definitely some utility in allowing other agents to do their own observing; more on this below.
model, which currently has to be “IBM”. Eventually it might be nice to allow observation.R to shunt observations to something not individual-based, such as Nilsen’s model, or another analytical equivalent, but not yet.
The file observation.R calls the function observation in the file observation.c. This C file follows the following general protocol:
The function observation is called, which does the following:
observation.R function).
mark_res a total of times times – each time simulating a unique trip to do field work. mark_res is a general function for marking individuals. Other functions can eventually be called instead of, or in addition to, mark_res, but the function is already very flexible, so it’s hard to imagine what other function might be needed – mark_res is currently the default and only function called. Details on the function are below.
obs_array. This array includes a row for every resource observed and all of the columns that also exist in the resource array (e.g., identifying resource location, identity number, types, life-history parameter values, etc.). Additionally, the observational array also includes one column for each of the times outings – times being the number of times that observations are made. These columns hold values of 0 or 1, which indicate whether (1) or not (0) a resource was observed during a particular observation (can think of times as outings in the field, each producing a column of whether a resource was spotted/marked/recaptured or not).
obs_array into a format that can be returned to R
The function mark_res is called by observation, and does the following:
observation):
field_work causes the agent to go out and do some observational field work.
a_mover causes the agent to move according to some specified rules, as stored in the parameter vector and agent array. The default is simple uniform movement some Euclidean distance away after doing field work – setting up for field work in a different location. The code is almost identical to the code that moves resources in resource.c, so I’ll not explain this here.
The function field_work simulates the process of an agent looking for and tagging resources in some way (this can later be interpreted as viewing, tagging, marking, recapturing, etc.). There are currently two different tagging procedures possible (with the option to build more):
binos function (simulating, e.g., binoculars).
fix_mark resources on the landscape (note: which resources is not a function of space)
After the observation function is run, we thereby have an observational data frame in which rows are individual resources, and columns include traits of those resources (same as in the resource data frame) and whether or not the resource was observed during a particular simulated outing. Through a combination of specifications for the times and fix_mark options, the observational data frame can then be interpreted in multiple ways and used in a simulated analysis:
There are multiple ways to interpret the observation results. Examples of this are as follows (for now, I’m assuming that there is one observer, but we can substitute the below with any number of observers):
Details of the technique used to produce the above figure include the following:
gmse.R figures out what the estimate of the population size would be for each time step. The analysis uses a very simple chapman_est function that I wrote in R. This function, or something like it, might be later incorporated as part of the observation model itself (likely by having observation.R call a different C file or R function), or in the manager model, or somewhere in between. I haven’t decided.
For now, it’s time to take another step back and take stock of what needs to be done next. A manager model and user model will need to start looking at multiple resources for making decisions, and somehow both potentially feed into a game-theoretic model. The complexity involved with the integration of management, games, and user actions should be a bit mitigated by all of these eventual functions revolving mostly around the agent array, with some input from the observation array. Of course, at least one type of agent will need access to the observational data as input (perhaps only to ignore it, sometimes), and users will need access to the resource array for off-take and other things. Some careful planning is needed for what happens next. I am particularly becoming aware that the flexibility of this model, while definitely a good thing, has the potential to tempt me into creating a lot of end user options that no one will actually want. It might be a good idea to develop a list at some point separating key options that we definitely want to be visible to all end users from more obscure options that are available to us by editing the central gmse.R script. It’s also likely that a model of this scope will require a well written R function that translates different combinations of user-friendly inputs into an R list, which can then be interpreted by the script that calls resource.R, observation.R, manager.R, game.R, and user.R, and which places inputs into the vector paras appropriately.
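For reference, the kind of mark-recapture estimate mentioned above for the figure can be sketched from two of the 0/1 outing columns of the observational array. The sketch uses the standard two-sample Chapman estimator, \(\hat{N} = \frac{(n_1 + 1)(n_2 + 1)}{m + 1} - 1\), where \(n_1\) and \(n_2\) are the numbers of resources seen on the first and second outing and \(m\) is the number seen on both. The notebook's own chapman_est function is written in R and may differ in detail; the name chapman_from_marks and its arguments are hypothetical.

```c
#include <stddef.h>

/* Chapman estimate of population size from two 0/1 mark columns.
 *   mark1[i] == 1 if resource i was seen on the first outing
 *   mark2[i] == 1 if resource i was seen on the second outing
 * Names are illustrative; this is not the chapman_est function itself. */
double chapman_from_marks(const int *mark1, const int *mark2, size_t n_rows)
{
    long n1 = 0; /* seen on the first outing  */
    long n2 = 0; /* seen on the second outing */
    long m  = 0; /* seen on both outings      */
    size_t i;

    for (i = 0; i < n_rows; i++) {
        n1 += (mark1[i] == 1);
        n2 += (mark2[i] == 1);
        m  += (mark1[i] == 1 && mark2[i] == 1);
    }
    return ((double) (n1 + 1) * (double) (n2 + 1)) / (double) (m + 1) - 1.0;
}
```

Adding one to each count means that the estimate stays finite even when no resource is seen on both outings.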
It’s worth noting that the flexibility of the observation function might be used to address social questions that interest us. I’ve been mainly conceptualising the observation model as something done by a disinterested third party – a manager rather than a stake-holder per se. The manager would make some decision that then affected payoffs in a game among stake-holders. We can do this of course, but we can also allow the stake-holders themselves to observe, perhaps less thoroughly and with more potential for bias (as we assume that they have less time and expertise). For example, we might imagine some stake-holders to estimate population size or change over time for themselves by observing all of the resources within a short distance around their location – perhaps (incorrectly) biased by large population changes (e.g., way more geese around my location this year than last – estimate a lot of total geese this year overall). These observations could feed into the game and user models.
Also – and this might require some tweaking – the flexibility of the type columns (type1, type2, type3) means that observing can be flexible too. We could allow each individual to observe, or groups of individuals of the same type to observe. NEW: We can also specify the type of individuals doing the observing by any category, including individual ID. This means that we can tell specific agents (assuming they are represented by rows) to observe, or loop through the function with specific agents. The agent’s type (or ID) is stored in the observation output, indicating which agent did the observing if data frames get amalgamated from looping the observation function.
As a quick update, I now have a working population model for G-MSE, and have reached the point where it will probably be better for me to take a step back and plan a bit, then work on other aspects of the full model rather than add more bells and whistles to the population sub-component. The development that I have done includes five files (happy to send these for the curious):
gmse.R – A master file that I’m currently using to call everything else
landscape.R – A file that constructs an \(m \times n\) landscape (in the code, this is a simple 2D array, the elements of which can contain any real number). Currently, there is an option to make this landscape any size and randomly place any number of ‘resources’ onto it, if desired. In the past, I have used some code to produce autocorrelation of values on the landscape; if it suits us, I can rewrite this code (to improve the readability) for application to G-MSE. I also think it would be useful to have the option of reading in an image (i.e., a map) and converting it to an array to be used as the landscape (e.g., JPG, BMP, etc.) – I suspect some stakeholders might find this especially useful, as it might help them see the applicability more clearly. Also, I’ve left hooks in the R file to allow eventual development of a non-spatial model.
initialise.R – A file that generates a single ‘RESOURCE’ array, which will hold everything that might be of value to stakeholders; this includes, most obviously, individuals in populations of conservation interest, but can also be used to represent things like hunting licenses or crop plots. The idea is to have a data structure that provides maximum flexibility – individuals can be represented as rows (or sets of rows) within the array, and their types and attributes can be indexed by column:
##        IDs type_1 type_2 x_loc y_loc move time remov_pr growth offspr age
## res_1    1      1      0    10    10    2    0      0.1    1.1      0   0
## res_2    2      2      0     6    14    2    0      0.1    1.1      0   0
## res_3    3      2      0    20    18    2    0      0.1    1.1      0   0
## res_4    4      1      0    20    15    2    0      0.1    1.1      0   0
## res_5    5      1      0    12    11    2    0      0.1    1.1      0   0
## res_6    6      1      0     1     1    2    0      0.1    1.1      0   0
## res_7    7      2      0    18    14    2    0      0.1    1.1      0   0
## res_8    8      2      0    20    17    2    0      0.1    1.1      0   0
## res_9    9      2      0     4    17    2    0      0.1    1.1      0   0
## res_10  10      2      0     4    16    2    0      0.1    1.1      0   0
resource.R – This file has only one real job, and that is to read in the RESOURCE array, LANDSCAPE array, PARAMETER vector, and MODEL TYPE (currently only individual-based model, “IBM”), and then call the appropriate resource model. This intermediary R file allows us to be flexible in re-routing the whole G-MSE to different population models, if need be. We could even mix and match the extent to which components use simple equation-based modelling (e.g., as in Nilsen’s MSE), and which use the more computationally-intense agent-based simulation (though I really don’t think computation time will be much of an issue, even with the agent-based model). Currently, all this R file is doing is calling the C code and the file resource.c – or, more accurately, it is calling the compiled file resource.so, which allows R to link to C.
RESOURCE and LANDSCAPE arrays, and a PARAMETER vector (containing any key parameter values) from R, and returns a new RESOURCE array (hence, landscape and parameter values are unchanged). A rough outline of what this key function does is as follows:
add_time, which writes a time step and adds an age to all rows (see table above)
mover to move individuals some Euclidean distance according to a parameter (see above) and movement rules (currently: uniform probability of cell distances, Poisson probability of distances). This program also uses a parameter to determine what happens at the edge of the landscape – currently, either nothing happens (i.e., individuals are just ‘out of view’) or the landscape wraps around as a torus (i.e., if you leave on the left side, you come back on the right).
res_add and res_place to simulate the addition of new resources (e.g., birth of individuals) and place them in a new array, respectively. Currently, old rows (e.g., individuals) directly create new rows according to a growth parameter (see table above), simulating birth, but this can be changed. A carrying capacity can also be applied to addition of new rows. New rows are also identical to their ‘parent’ rows in everything except ID and age, but this can also be changed.
remove to remove some of the old rows from the input array – currently removal of rows occurs with some fixed probability (remov_pr, see table above), or probabilistically based on a set carrying capacity.
RESOURCE array that were not removed with the newly created resources to make one single array (might want to make this its own function later, for readability).
A small script can help us see the output of what’s going on in the population, both in terms of individual movement and change in population abundance over time. The run time of the below population is negligible – all of the data underlying the 100 time steps shown in the figure below is produced in a tenth of a second (4 JAN Update: Assuming instead a carrying capacity of 40000, closer to the ball-park of the Islay geese, 100 time steps takes 11 seconds). The upper panel of the figure below shows a landscape (light and dark brown – these colours don’t mean anything at the moment, but could represent different landscape properties) with individuals (black) that move around, reproduce, and die in each time step. The lower panel shows the abundance of these individuals as they increase to carrying capacity (red dashed line), whereafter the population size remains stable (of course, simulating a bigger population takes a bit more time – it takes about nine tenths of a second to simulate 100 time steps at a carrying capacity of 4000).
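The torus edge option described for the mover step above can be sketched with a small helper that wraps a coordinate back onto the landscape. The name wrap_coord is hypothetical and this is not the code in resource.c; the only subtlety is that the C remainder operator keeps the sign of its left operand, so negative locations need an extra correction.

```c
/* Wrap a coordinate onto a landscape dimension of size `dim`, so that an
 * individual leaving on one side re-enters on the opposite side (torus). */
int wrap_coord(int loc, int dim)
{
    int r;

    if (dim <= 0) {
        return loc; /* nothing sensible to wrap onto */
    }
    r = loc % dim;  /* in C, the result of % has the sign of loc */
    return (r < 0) ? r + dim : r;
}
```

On a landscape 10 cells wide, a move to x = -1 wraps to x = 9 and a move to x = 10 wraps to x = 0. The same helper would be applied separately to the x and y locations.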
I would like to develop one general, efficient, open-source, and user- and developer-friendly program for G-MSE that would be a general tool for applying game theory and management strategy evaluation to specific problems of conflict among stake-holders. I’m somewhat flexible on the development, but my preference would be to have software that is:
Open-source, with all version-controlled development history being publicly available on GitHub.
Written primarily or entirely in C (for efficiency and portability)
Easily called from R using an R package (see also) and appropriate R functions (as many scientists would likely want to integrate the program with other R packages and their own code or data). Note that this could be tricky for Windows users. See details on the most flexible way to call R from C.
Usable with a browser-based GUI (or perhaps an app, though I’d have to learn how to do this), probably ‘shiny’ on top of R.
Useful for scientists or stake-holders unfamiliar with R, or command line code more generally
Perhaps useful as a teaching tool for students or the general public
Could look similar to this: <https://tomhopper.shinyapps.io/TB_Cases_shiny/>, the code repository of which is available here: <https://github.com/tomhopper/TB_Cases_shiny>. Each tab could have a different set of related inputs and outputs, which together could produce a full report in the browser.
Comparable in scope to something like RangeShifter: http://rsdevs.github.io/RSwebsite/ (Bocedi et al. 2014)
MAJOR POINTS: Some major points fleshed out given the thinking below:
Question: The objects (i.e., populations, resources, commodities) will often be represented as discrete entities (individual animals in populations, but also things like licenses sold and crop patches saved or raided – which could have individual locations). Should the stake-holders also be modelled as (potentially multiple) discrete entities? This is easy to see if, e.g., stake-holders are potential hunters that do or do not buy licenses and engage in hunting, but maybe conservationists could also be considered as discrete – each individually affecting the decision of an organisation in a game.
Given the question above: Stake-holders could then also be represented by a data frame, which could generalise the model to allow many individual stake-holders to play a game (or not, if data frame is single row, or scalar). This could then more naturally incorporate mixed strategies (some will take one strategy, some another) and uncertainty. In the case that it is some sort of organisation making a decision, this would allow the individual stake-holders to collectively affect a single action or policy. This would appear to drift more into the realm of agent-based computational economics, which might be a good thing given the goals of ConFooBio. This could allow for maximum flexibility too, if agents could also be discrete individuals making decisions.
Should the model therefore be focused on at least four data frames modelling individuals? At least two modelling individual species or resources of interest (and at least one being a population of conservation interest), and at least two modelling individuals with interests in the former?
I think that the agent-based model is really going to be the default one to use, with other models being useful only if the end user is really tied to them in some way. In general, to find emergent phenomena and predict dynamics and decisions accurately, I think it will be useful to keep in mind the maxim of keeping situation rules simple while allowing agents to be complex (Volker Grimm said something like this in one of his talks or publications, and given the ConFooBio focus, I think it’s especially applicable).
Before getting into specifics, it will be useful to walk through the G-MSE model conceptually to figure out what kinds of approaches are going to be most useful for the following:
Each of these needs a general framework that will be most usefully applied to real-world problems of conflict. Ideally, these models will be modular – i.e., not depend on the type of modelling being done in other areas of G-MSE. That way, we might, e.g., decide to substitute an entirely different kind of natural resources model (e.g., simple numerical Lotka-Volterra versus spatially explicit individual-based model), but still be able to generate input/output in each component to be used by the next.
Nevertheless, there needs to be some conceptual framework that is consistent, in addition to the five above modules. I’ve written down some of these ideas, deliberately avoiding Nilsen’s MSEtools repository for now. Some potential things that are common to G-MSE:
The model is therefore going to need to generally hold two or more variables or objects that represent populations or resources (including biomass) that can both be affected by any of the sub-components (note: even something like fishing licenses sold can be observed, perhaps with trivial error – we can therefore apply the same process of MSE to both populations and the things with which they are in conflict).
In any case, there will be a need to model how properties of the population change from one time step to the next. Properties of interest for populations might include:
It would seem as though properties for conflicting resources would be more likely to boil down to one number (e.g., crop yield, licenses sold), but maybe not. We could, for example, assign a location to farms and licenses, or units of biomass in some way.
I think an individual-based model that represents individuals and resources with a table is probably the best way to go in most cases. We can perhaps broaden this out so that the observation model will recognise a table (IBM), a vector (classes), or a number (just size), with some indication of the type of data being returned, but most of the time a full table will be the way to go (in fact, we could probably just make everything a data frame, and have \(1 \times 1\) data frames be interpreted as scalar, and \(1 \times n\) data frames be interpreted as a vector). The information about the population will represent all of the relevant information about the natural population being modelled, so it can pass all of this information onto the observation model, which can then run some function to search through it and extract parameters of interest (with error, potentially). Within this model, we’ll want functions to model birth, death, immigration, and emigration.
For scalar or vector inputs, observation error could be more directly simulated – just with a parameter for bias and error (e.g., around population size, or sizes of each age or stage class).
Alternatively, a different, more general way of doing it might be to instead simulate some length of time \(t_{obs}\) for modelling the process of observation. Then each time step could include a probability of observing an individual. This might be even better because I think it would be more generalisable. In the case of the IBM, individuals could be observed following a Poisson process at each time step that:
The benefit here is that a scalar or vector could be modelled in the same way, just by sampling from a Poisson distribution to find observation number at each time step of some number of individuals (potentially of different ages or classes).
It will then spit out something that will affect both the game that agents play and therefore actions of users.
One job of the management model will be to calculate statistics associated with the uncertainty surrounding these observations (e.g., confidence intervals), which will affect management decisions that are simulated.
TODO: Need to figure out how management decisions are going to be implemented. These decisions will feed directly into the game model, and possibly the user model.
This part is especially tricky. Need some common framework to convert the dynamic things (resource, population) into a utility function, then into a payoff matrix (or perhaps something even more general). Questions that need addressing before building the model:
We also want to include uncertainty in the games.
The general structure of the program itself, I think, could fit into Figure 1 of Bunnefeld et al. (2011) (TREE paper), with a game-theoretic component added into the management model and harvester operating model. Would game theory among agents then be applied to the harvesters who are making decisions? A basic computational model would then proceed something like the following:
Master file: gmse.R [also create standalone gmse.c with int main(void)]
initialise.R: code within R to organise key data frames
STAKEHOLDER_1 (Stake-holders can be discrete)
STAKEHOLDER_2 (rows = individuals; cols = attributes)
RESOURCE_1 (note: resources can be populations)
RESOURCE_2 (rows = individuals; cols = attributes)
LANDSCAPE (start with an \(m \times m\) matrix)
resources.c: sub-functions affect dynamics of resources
RESOURCE_1, RESOURCE_2, and LANDSCAPE
move(double RESOURCE): move individuals or resources on LANDSCAPE
reproduce(double RESOURCE): New resources added based on some rules
die(double RESOURCE): Resources removed based on some rules
immigrate(double RESOURCE): resources added by different rules (later?)
emigrate(double RESOURCE): resources removed by different rules (later?)
interact(double RESOURCE_1, double RESOURCE_2): Resources interact
RESOURCE_1, RESOURCE_2, and LANDSCAPE
observe.c: sub functions affecting simulated data collection
manager.c: sub functions affecting management decision model
game.c: sub functions affecting game played based on management decisions
user.c: sub functions affecting implementation of users given game.c
summary.R: Summarise information and plot (also create C standalone)
Note: The c standalone will also need the file gmse_util.c, for all of the other components (e.g., random number generation) which would normally be done in R. In R, these components can be incorporated with the appropriate R.h and Rmath.h header files.
Note: The RESOURCE_2 will have to be optional, because in some scenarios, two stake-holders might simply be in conflict over the use of one resource.
Note that Erlend Nilsen has constructed the basic MSE framework in R already, and I’ve forked his repository on GitHub as a potential starting point. I’ve also starred a repository for calling C from R, as I think that this will be necessary. I’d like a standalone version of the model in C, but the focus should probably first be on writing the intense code in C while immediately making it callable from R – cloning and making a C standalone can come later (maybe avoid using too much of Rmath.h so that a C standalone is easier).
This would allow a harvester operating module or function to fit within the broad simulation or program, G-MSE.
The spatial aspect of some of the key case studies (e.g., Nellemann et al. 2000), and the importance of space more broadly in ecological processes, suggests to me that the G-MSE program will need to have a spatial component – landscapes need to be a part of it, perhaps?
Overall, based on the ERC proposal and Bunnefeld et al. (2011), the model will function something like the below (subject to change):
As long as not too many generations are run (e.g., not too much more than 100), I am cautiously optimistic that this program will be able to include an individual-based model of a focal population, and all of the other game-theoretic components, and not take more than a few minutes to run and produce simulated results (obviously less if it is called directly from C, but I’m shooting for this calling from shiny in a browser). For end users, dynamic graph production can make the wait time a bit more interesting, if it’s possible. For us, the time it will take for me to call in C, especially if using the cluster, will be trivial.
For the natural resources model, it might be nice to have an option of burning in several time steps before starting the loop (if, e.g., no empirical data are available, and the model instead relies on parameters plugged into a Lotka-Volterra or Ricker model). Or, if data are available, long-term demographic data could be used and assumed to represent the true population dynamics (i.e., just use these data to simulate N individuals) before starting the G-MSE model loop. It is worth thinking about how much population structure we might want to add – my inclination is to make the software as flexible as possible (e.g., allow sex, age, etc., to be attributes of discrete individuals), but this will depend on other aspects of the model.
In the interest of making this model as general as possible, I believe that we’ll eventually want to use an extensive-form game to allow for the sequence of moves to affect stakeholder actions. Nevertheless, just to get the basic framework underway, I think we can start out with a normal-form game, with the intent of generalising the model later (the code will be modular enough to allow this). Generalisation should be easy if we have a separate function to keep track of the game tree, and then allow agents to access the game tree (or parts of it, in the case of incomplete information) to make decisions about how to act. An extensive-form game package exists in R, published by Kenkel and Signorino (2014) with code available on GitHub, but the focus of this package is for ‘estimating recursive, sequential games, and not simultaneous move games or dynamic games with infinite time horizons’. Since the games excluded in that quote are probably the kinds of games that ConFooBio is interested in, I think the games package will be a useful reference, but not something to directly apply. It incorporates uncertainty, which could be something useful to return to for further reference.
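One way to picture a separate game-tree structure is a nested list that agents can query. The sketch below is only an assumption about how such a tree might look: `tree`, `backward_induct`, and the payoff values are hypothetical, and the solver uses plain backward induction under complete information rather than anything from the games package.

```r
# Leaf nodes hold payoffs c(player 1, player 2); internal nodes name the mover.
tree <- list(player = 1, moves = list(
    L = list(player = 2, moves = list(
        l = list(payoff = c(3, 1)),
        r = list(payoff = c(0, 0)))),
    R = list(player = 2, moves = list(
        l = list(payoff = c(2, 2)),
        r = list(payoff = c(1, 3))))));

# Recursively pick, at each node, the move that maximises the mover's payoff.
backward_induct <- function(node){
    if(!is.null(node$payoff)){ # Leaf: nothing left to decide
        return(list(payoff = node$payoff, path = character(0)));
    }
    results <- lapply(node$moves, backward_induct);
    values  <- sapply(results, function(res) res$payoff[node$player]);
    best    <- which.max(values);
    return(list(payoff = results[[best]]$payoff,
                path   = c(names(node$moves)[best], results[[best]]$path)));
}
solution <- backward_induct(tree);
```

Here player 2 would answer `L` with `l` and `R` with `r`, so player 1 chooses `L`. Incomplete information would require restricting which parts of `tree` each agent can see.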
A couple other (Java based) examples of games are available on GitHub, such as GTE, which has a GUI web application and a corresponding published paper (Savani and Stengel 2014). This model leads me to think that it’s probably best to give each player two matrices:
`do.call` function to be used – probably easier to deal with in R).
Another Java extensive-form games package exists, though it seems less useful for ConFooBio purposes.
Some notation to try out: For the purpose of the below, to keep things simple, I’m going to just start with payoff matrices, and assume that history of interactions is not yet used in decisions.
To further simplify, I am going to assume that there are only two players. The general payoff matrices can be represented as below (loosely following the notation of Débarre et al. (2014)):
\[ {\bf A^{1}} = \left( \begin{array}{cc} U^{1}_{a} & U^{1}_{b} \\ U^{1}_{c} & U^{1}_{d} \end{array} \right), {\bf A^{2}} = \left( \begin{array}{cc} U^{2}_{a} & U^{2}_{b} \\ U^{2}_{c} & U^{2}_{d} \end{array} \right). \]
In the above \(a\), \(b\), \(c\), and \(d\) are all different possible outcomes that depend upon the decisions of players 1 and 2. We can think about these in terms of the actions \(X^{1}_{i}\) and \(X^{2}_{i}\), and put these into the familiar payoff table below,
Player 1 (rows) / Player 2 (columns) | Strategy 3 | Strategy 4 |
---|---|---|
Strategy 1 | \(a \to \{U^{1}, U^{2}\}\) | \(b \to \{U^{1}, U^{2}\}\) |
Strategy 2 | \(c \to \{U^{1}, U^{2}\}\) | \(d \to \{U^{1}, U^{2}\}\) |
For doing the maths though, individual matrices will be used. Note that to keep things general, the above strategies are unique to each player. I think that this will be relevant to ConFooBio because each actor will have a unique role. Hence, a vector \(I\) can represent all possible options for action, with players (normally) only having access to a subset \(i \in I\), though we might conceive of some players being able to do the same thing despite having different roles.
Making payoff matrices a list with \(M\) elements of vectors is probably the best way to go in R, with \(M=2\) players for most of what we’ll do. Each player \(m\) will have its own options for acting within the list `M[m]`.
M <- 2; # Number of players in the game
S <- list(); # Strategy vectors (elements all possible strategies)
A <- list(); # Payoff vectors (elements all possible strategy combinations)
For now, let’s just assume that each player has two possible strategies, and we’ll just use the traditional matrix to calculate Nash equilibria; for future reference, Avis et al. (2009) might be useful for quick calculation of Nash equilibria for two player games. Continuing with the above, here’s a basic setup computing the Prisoner’s dilemma:
S[[1]] <- c("C","D"); # Cooperate or defect strategies (change to numeric?);
S[[2]] <- c("C","D");
A[[1]] <- c(3,0,5,1); # Payoffs for player 1
A[[2]] <- c(3,5,0,1); # Payoffs for player 2
A1 <- matrix(data=A[[1]], nrow=length(S[[1]]), byrow=FALSE);
A2 <- matrix(data=A[[2]], nrow=length(S[[2]]), byrow=FALSE);
print(A1); # Note the traditional Prisoner's dilemma payoff structure
## [,1] [,2]
## [1,] 3 5
## [2,] 0 1
print(A2);
## [,1] [,2]
## [1,] 3 0
## [2,] 5 1
Now check to see if the best possible response for each player is the same regardless of its opponent’s strategy.
best1 <- apply(A1,1,which.max); # Best strategies for Player 1
best2 <- apply(A2,2,which.max); # Best strategies for Player 2
tabl1 <- tabulate(best1); # Frequency of bests
tabl2 <- tabulate(best2);
str1 <- tabl1 / sum(tabl1); # Frequency of each strategy
str2 <- tabl2 / sum(tabl2);
summ1 <- matrix(data=str1,nrow=1); # Summary vector of strategies
summ2 <- matrix(data=str2,nrow=1);
colnames(summ1) <- S[[1]];
colnames(summ2) <- S[[2]];
rownames(summ1) <- "Proportion";
rownames(summ2) <- "Proportion";
print(summ1); print(summ2);
## C D
## Proportion 0 1
## C D
## Proportion 0 1
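The dominant-strategy check above can be complemented by a brute-force search for pure-strategy Nash equilibria. The sketch below is a hypothetical helper (`pure_nash` is not an existing function), and it assumes a different layout from the matrices printed above: rows are player 1's strategies and columns are player 2's, in both payoff matrices.

```r
# P1[i, j] and P2[i, j] are payoffs to players 1 and 2 when player 1 plays
# row i and player 2 plays column j (an assumed layout for this sketch).
pure_nash <- function(P1, P2){
    eq <- NULL;
    for(i in 1:nrow(P1)){
        for(j in 1:ncol(P1)){
            p1_best <- P1[i, j] >= max(P1[, j]); # Player 1 cannot gain by switching rows
            p2_best <- P2[i, j] >= max(P2[i, ]); # Player 2 cannot gain by switching columns
            if(p1_best && p2_best){
                eq <- rbind(eq, c(i, j));
            }
        }
    }
    return(eq); # One row per equilibrium; NULL if there is none
}
P1 <- matrix(data = c(3, 0, 5, 1), nrow = 2, byrow = TRUE); # Prisoner's dilemma
P2 <- matrix(data = c(3, 5, 0, 1), nrow = 2, byrow = TRUE);
eq <- pure_nash(P1, P2);
```

For the Prisoner's dilemma payoffs used here, the only cell that survives both checks is mutual defection (row 2, column 2).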
One goal will be to develop a function that can return optimal strategies for each player, including mixed strategies, for any given \(2 \times 2\) payoff matrix. The function below does not do this; it needs to be fixed. A starting point for looking at appropriate algorithms is Avis et al. (2009), who come up with an efficient solution.
Before investing too much time in this, let’s make sure that finding equilibrium solutions makes sense in the context of games with uncertainty. We might need a different approach, e.g., if the payoffs themselves are uncertain and the optimal strategies are reflected in this uncertainty.
One package in R can solve Nash equilibria, though the documentation for it is not excellent. There’s also a repository that can do it in C, but that might take more time than it is worth – the paper underlying it is Miltersen and Sørensen (2009). A benefit here is that it uses extensive-form games and computes quasi-perfect equilibria, which are specifically equilibria that assume that a player’s opponent is not perfect, and account for past mistakes.
## XXX FIXIT: There is an error in calculating what each player should play -- it is tabulating the frequency of best plays, but when mixed strategies occur, it returns a 1/2, 1/2 instead of the proportion based on the value.
solve.nash <- function(){ # Function to be made to solve Nash equilibrium
    return(NULL);
}
game <- function(payoff1, payoff2){
    if(length(payoff1) != length(payoff2)){
        warning("Payoff vectors must be the same length");
        return(NULL);
    }
    if(min(payoff1) < 0){ # Shift payoffs so that the minimum is zero
        payoff1 <- payoff1 - min(payoff1);
    }
    if(min(payoff2) < 0){
        payoff2 <- payoff2 - min(payoff2);
    }
    if(is.matrix(payoff1)==FALSE){
        payoff1 <- matrix(data=payoff1, nrow=2, byrow=TRUE);
    }
    if(is.matrix(payoff2)==FALSE){
        payoff2 <- matrix(data=payoff2, nrow=2, byrow=TRUE);
    }
    S <- list();
    S[[1]] <- c("Strategy_1","Strategy_2");
    S[[2]] <- c("Strategy_3","Strategy_4");
    best1 <- apply(payoff1,1,which.max); # Best strategies for Player 1
    best2 <- apply(payoff2,2,which.max); # Best strategies for Player 2
    tabl1 <- tabulate(best1, nbins=2); # Frequency of bests (always length 2)
    tabl2 <- tabulate(best2, nbins=2);
    expe1 <- apply(payoff1,2,sum) * tabl1;
    expe2 <- apply(payoff2,1,sum) * tabl2;
    str1 <- expe1 / sum(expe1); # Frequency of each strategy
    str2 <- expe2 / sum(expe2);
    summ1 <- matrix(data=str1,nrow=1); # Summary vector of strategies
    summ2 <- matrix(data=str2,nrow=1);
    colnames(summ1) <- S[[1]];
    colnames(summ2) <- S[[2]];
    rownames(summ1) <- "Proportion";
    rownames(summ2) <- "Proportion";
    strategy_pr <- list(player1=summ1,player2=summ2);
    return(strategy_pr);
}
We can now use the function above to figure out and return strategies for any given payoff vectors from \(a\), \(b\), \(c\), and \(d\) for each player (1 and 2).
library(shiny); # Needed for shinyUI, shinyServer, and the input/output widgets
u <- shinyUI(pageWithSidebar(
    headerPanel(""),
    sidebarPanel(
        textInput('vec1', 'Player 1: a, b, c, d', "3, 5, 0, 1"),
        textInput('vec2', 'Player 2: a, b, c, d', "3, 0, 5, 1")
    ),
    mainPanel(
        h4('Proportion strategy is optimally played: (DOES NOT WORK YET)'),
        verbatimTextOutput("oid1")
    )
))
s <- shinyServer(function(input, output) {
    output$oid1 <- renderPrint({
        # Parse the comma-separated payoffs typed into each text box
        p1 <- as.numeric(unlist(strsplit(input$vec1,",")))
        p2 <- as.numeric(unlist(strsplit(input$vec2,",")))
        pay <- game(payoff1=p1, payoff2=p2)
        o1 <- as.numeric(pay$player1)
        o2 <- as.numeric(pay$player2)
        cat("Player 1 (Strategy 1, 2):\n")
        print(o1)
        cat("\n\n")
        cat("Player 2 (Strategy 3, 4):\n")
        print(o2)
    })
})
#shinyApp(ui = u, server = s)
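Returning to the FIXIT note on mixed strategies: for a \(2 \times 2\) game, an interior mixed equilibrium can be found from the standard indifference conditions, where each player mixes so that the opponent is indifferent between its two strategies. The sketch below is a hypothetical helper (`mixed_2x2` is not the function from Avis et al. 2009, nor a replacement for `game`), and it assumes that rows are player 1's strategies and columns are player 2's in both matrices.

```r
# Returns the interior mixed-strategy equilibrium of a 2 x 2 game, or NULL
# if none exists (e.g., when a player has a strictly dominant strategy).
mixed_2x2 <- function(P1, P2){
    den_p <- P2[1, 1] - P2[2, 1] - P2[1, 2] + P2[2, 2];
    den_q <- P1[1, 1] - P1[1, 2] - P1[2, 1] + P1[2, 2];
    if(den_p == 0 || den_q == 0){
        return(NULL);
    }
    p <- (P2[2, 2] - P2[2, 1]) / den_p; # Pr(player 1 plays row 1)
    q <- (P1[2, 2] - P1[1, 2]) / den_q; # Pr(player 2 plays column 1)
    if(p <= 0 || p >= 1 || q <= 0 || q >= 1){
        return(NULL);
    }
    return(list(player1 = c(p, 1 - p), player2 = c(q, 1 - q)));
}
# Illustrative coordination game with asymmetric preferences
B1  <- matrix(data = c(2, 0, 0, 1), nrow = 2, byrow = TRUE);
B2  <- matrix(data = c(1, 0, 0, 2), nrow = 2, byrow = TRUE);
mix <- mixed_2x2(B1, B2);
# Prisoner's dilemma: defection dominates, so no interior mix exists
PD1 <- matrix(data = c(3, 0, 5, 1), nrow = 2, byrow = TRUE);
PD2 <- matrix(data = c(3, 5, 0, 1), nrow = 2, byrow = TRUE);
no_mix <- mixed_2x2(PD1, PD2);
```

This covers only the interior case; a full solution would combine it with a pure-strategy search, and it says nothing yet about uncertain payoffs.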
How do we quantify costs and benefits in situations in which there is conflict between conservation and food security? Game theoretic models rely on numeric values being maximised by individual agents, with games promoting cooperation or conflict depending on equilibrium solutions when each agent maximises its value. But for conservation and food security, the values do not seem to be straightforwardly assigned – how do we compare something like extinction risk against food production (or, e.g., tourism income)? It seems that we need to either figure out how to play games in which payoffs are in different, difficult-to-compare currencies, or figure out how to standardise disparate payoff types into a common currency to model games.
Note that there is a whole literature surrounding utility and utility functions, most of which appears to be based in economics. This is probably the best thing to tap into, although the question of what kind of utility functions to use (e.g., ordinal, continuous, etc.) is still something that will need to be worked out.
We could figure out some sort of way to rank order or bin preferences for each agent (Added note: this might link up with Jeremy’s idea of attitude in some way?). This might also help with dealing with uncertainty because the uncertainty of outcomes could be expressed as the likelihood or probability of hitting a rank or getting into a bin. Successful cooperation could then be defined by increasing, or perhaps maximising, ranks or bins of each agent. I had actually played around with an idea for using something like this in philosophy (ethics theory), in which ‘maximise well-being’ is sometimes considered a fundamental concept, but one that is hard to pin down (i.e., could have links to environmental ethics). As a bonus, the ranks or bins could be easier for real-world agents to understand.
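A minimal sketch of the rank-ordering idea is below. All values and object names are hypothetical: each agent's payoffs are in a different currency, but converting them to within-agent ranks puts the outcomes on a common ordinal scale. Picking the outcome that maximises the lowest rank is just one possible definition of successful cooperation, used here only for illustration.

```r
outcomes   <- c("a", "b", "c", "d");
ext_risk   <- c(0.05, 0.10, 0.20, 0.40); # Hypothetical extinction risk (lower is better)
food_yield <- c(120, 300, 180, 310);     # Hypothetical crop yield (higher is better)
rank_cons  <- rank(-ext_risk);           # 4 = most preferred by the conservation agent
rank_farm  <- rank(food_yield);          # 4 = most preferred by the farming agent
worst_rank <- pmin(rank_cons, rank_farm); # Rank held by the less satisfied agent
best       <- which.max(worst_rank);      # Outcome maximising that lowest rank
compromise <- outcomes[best];
```

With these numbers, outcome `b` gives both agents their second-best option, whereas each agent's favourite outcome is the other's least preferred.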
In any case, a game-theoretic model will need some sort of numbers to work with (even if they are just ordinal preferences), so I think this will be a key question early on.
Should we have tables, such as the hypothetical one below? This is the typical way that games are modelled, but it assumes that different agents are playing the same game. If there are conflicts among more than two types of agents (i.e., agents with three or more unique interests), then fitting games into two-by-two boxes could be difficult (this was mentioned in the project proposal).
Agent 1 (rows) / Agent 2 (columns) | Strategy 1 | Strategy 2 |
---|---|---|
Strategy 1 | A1 pay, A2 pay | A1 pay, A2 pay |
Strategy 2 | A1 pay, A2 pay | A1 pay, A2 pay |
I also think it is important to recognise early on that these games are unlikely to be symmetrical – the payoffs are unlikely to be the kinds of simple prisoner’s dilemmas that lead to both agents having the same effective strategy (see also Colyvan et al. 2011).
Note, I don’t think that this means that simple Nash equilibria are impossible to find – the solutions might just look a bit odd, depending on the payoff values in the matrices.
We should not, however, overlook the possibility of solutions that are optimally cooperative when played iteratively but not cooperative when played once. Prisoner’s dilemma is the classic example; see the Axelrod experiments and the work of Wilkinson (1990), Carter and Wilkinson (2013), Carter and Wilkinson (2015), and Trivers (1985) on reciprocal allocation. Dawkins (1976) also had a chapter on this, I think.
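To make the one-shot versus iterated contrast concrete, the sketch below plays a repeated Prisoner’s dilemma with the payoffs 3, 0, 5, and 1 used earlier. The function names and the choice of ten rounds are hypothetical, and only two simple strategies are compared.

```r
pd_payoff <- function(own, other){ # Payoff to the focal player in one round
    if(own == "C" && other == "C"){ return(3); }
    if(own == "C" && other == "D"){ return(0); }
    if(own == "D" && other == "C"){ return(5); }
    return(1);
}
tit_for_tat <- function(opp_history){ # Cooperate first, then copy the opponent
    if(length(opp_history) == 0){ return("C"); }
    return(opp_history[length(opp_history)]);
}
always_defect <- function(opp_history){ return("D"); }
play_iterated <- function(strat1, strat2, rounds = 10){
    h1    <- character(0); # Moves made so far by player 1
    h2    <- character(0); # Moves made so far by player 2
    total <- c(0, 0);
    for(t in 1:rounds){
        m1    <- strat1(h2);
        m2    <- strat2(h1);
        total <- total + c(pd_payoff(m1, m2), pd_payoff(m2, m1));
        h1    <- c(h1, m1);
        h2    <- c(h2, m2);
    }
    return(total);
}
tft_vs_tft <- play_iterated(tit_for_tat, tit_for_tat);
tft_vs_def <- play_iterated(tit_for_tat, always_defect);
def_vs_def <- play_iterated(always_defect, always_defect);
```

Two tit-for-tat players each earn 30 over ten rounds, while two defectors each earn 10, even though defection is the best response in any single round.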
Given the above, we should also, perhaps, consider that payoffs might change over time (e.g., one year to the next) with changing environmental conditions (defined very loosely as anything outside of the agent’s control that structures the payoff matrix), and that agents might capitalise on this stochasticity to maximise net gains. Further, they might change in a non-linear way such that one way of maximising payoffs is to let one agent ‘win’ in one year and another agent ‘win’ in the next year. This could benefit all if the payout in a given year has a huge benefit for the ‘winner’, but not an abnormally large loss for the ‘loser’ (probably should use different terminology than ‘winner’ and ‘loser’); in subsequent years then, the other agent might find themselves in a situation where they have an abnormally high amount to gain from ‘winning’ and the other agent does not have an unusually bad year by ‘losing’. Note that, I think, this implies that the changing payoff structure of a game over time might be dynamic in a way that is not purely a zero-sum situation; i.e., gains are non-additive (in the previous example ‘sub-additive’) over time. Non-additivity could work the opposite way too – it might be that when it is an unusually good time to ‘win’, it is an even worse time for the other agent to ‘lose’ – I’d need to flesh out this idea more; it has conceptual connections to the community ecology (species interactions) literature.
As a concrete example of the above – maybe, e.g., the conditions are particularly good for hen harrier conservation in the current year (i.e., a population is poised to grow especially well, or rebound in some critical way) – so good that maximising gains now would well compensate for the expected losses if grouse hunters enforced control in the subsequent few years. Perhaps banking these conservation gains would be the best solution, if at a later date the conditions would be such as to cause grouse hunters to benefit disproportionately from target control at a time in which the losses of control to conservation would not be especially severe. The net result of all this could be that each agent benefits by maximising its gains when times are tough at the cost of suffering higher losses when times are good. Again, this depends on variation in the payoff structure over time, and that the payoffs will vary in such a way as to cause sub-additive growth in gains. It also might require more certainty about gains than is reasonable.
Need to think about uncertainty more.
The following recreates Nilsen’s MSE modelling work.
The manager model receives the single estimate of population size (density or abundance), then returns a total allowable catch. A second function models hunter frustration, and is meant to be run after the first function. The second function checks to see if hunter frustration is within a set of bounds; if it is, then the function returns the original total allowable catch. If it is not, then the function adjusts the total allowable catch.
The user model (called the implementation model) includes four separate functions, including a very simple one that just samples from a binomial or Poisson distribution around the total allowable catch.
Hence, we can put four of these functions together to simulate a very simple MSE model:
pop_abund <- 100;
harvest <- 20;
growth_rate <- 1;
K <- 200;
pr_harvest <- 0.7;
time <- 1;
time_end <- 30;
track <- matrix(data=0, nrow=time_end, ncol=5);
while(time <= time_end){
    pop_vars <- PopMod1(X_t0=pop_abund, sigma2_e=0.2, N_Harv=harvest, K=K,
                        r_max = growth_rate);
    pop_abund <- as.numeric(pop_vars[4]);
    obs_vars <- obs_mod1(scale="Abund", value=pop_abund, bias=1, cv=0.4);
    if(obs_vars < 0){ # Nilsen's model allows estimate to be negative
        obs_vars <- 0; # Make it so that negative equates to est. of extinction
    }
    har_vars <- HarvDec1(HD_type="A", qu=0.2, PopState_est=obs_vars);
    imp_vars <- Impl1(TAC=floor(har_vars), ModType="B", p=pr_harvest);
    track[time,] <- c(time, pop_abund, obs_vars, har_vars, imp_vars);
    time <- time + 1;
}
colnames(track) <- c("time", "Pop. Size", "Pop. Est.", "Harv. Rate", "Harv.");
We run the above code, and we can look at how key population and management quantities change over time:
The below figure shows all of these quantities over time.
We can re-run the code at any point and essentially recreate a run of Nilsen’s MSE model. The hard work is now to come up with a G-MSE, which will allow for much more individual complexity through an agent-based approach.
The function `do.call` in R calls a function once and passes the arguments for the function from a list (e.g., if `A` is in a list form, or put in a list form with `list(A)`, then `do.call("f", list(A))` calls the function `f` with each list element supplied as a separate argument, so that `do.call("f", list(A))` is equivalent to `f(A)`; individual list elements can be vectors). This is a base R function.
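A short example may help pin down the behaviour of `do.call`: the function is called once, and each element of the list becomes one argument. The function `f` and the values below are made up for illustration.

```r
f    <- function(x, y){ return(x + y); }
args <- list(x = 1, y = 2);
res1 <- do.call("f", args);      # Same as f(x = 1, y = 2)
A    <- c(1, 2, 3);
res2 <- do.call("sum", list(A)); # Same as sum(A); the vector A is one argument
# By contrast, lapply() calls the function once per list element
res3 <- lapply(list(1, 2, 3), function(v) v^2);
```

This makes `do.call` convenient when each player's arguments are already stored as a list, which is the use suggested earlier for per-player matrices.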
Scottish Ecology, Environment, and Conservation Conference (“The conference aims to bring together researchers in ecology, conservation, and environmental sciences across Scotland”; “The conference is primarily for PhD, Masters and advanced undergraduate students”). University of Aberdeen: 3-4 APR 2017. Abstract submission deadline: 6 FEB.
Modelling Biological Evolution 2017: Developing Novel Approaches (topics include: Evolutionary Game Theory and Solving Social Dilemmas). http://www.math.le.ac.uk/people/ag153/homepage/MBE_2017/MBE_2017_1.htm University of Leicester: 4-5 APR 2017. Registration and abstract submission deadline: 1 FEB 2017.
Workshop on behavioural game theory (topic is Psychological Game Theory). https://www.uea.ac.uk/economics/news-and-events/workshop-on-behavioural-game-theory-2017 University of East Anglia (Norwich): 5-6 JUL 2017. Submission deadline: 28 FEB 2017 (no workshop fee).
Game theory and management (topics include: Game theory and management applications, cooperative games and applications, dynamic games and applications, stochastic games and applications). http://gsom.spbu.ru/en/gsom/research/conferences/gtm/ Saint Petersburg University: 28-30 JUN 2017.
6th workshop on stochastic methods in game theory (“Many decision problems involve elements of uncertainty and of strategy. Most often the two elements cannot be easily disentangled. The aim of this workshop is to examine several aspects of the interaction between strategy and stochastics. Various game theoretic models will be presented, where stochastic elements are particularly relevant either in the formulation of the model itself or in the computation of its solutions.” Example topics include: Large games and stochastic and dynamic games). https://sites.google.com/site/ericegametheory2017/home Sicily, Italy: 5-13 MAY 2017.
13th European Meeting on Game Theory (SING13) (topics include: cooperative games and their applications, dynamic games, stochastic games, learning and experimentation in games, computational game theory, game theory applications in fields such as management). http://www.lamsade.dauphine.fr/sing13/ Paris, France: 5-7 JUL 2017. Abstract submission deadline: 28 FEB.
Adami, C., Schossau, J., & Hintze, A. (2016). Evolutionary game theory using agent-based methods. Physics of Life Reviews, 19, 1–26. https://doi.org/10.1016/j.plrev.2016.08.015
An, L. (2012). Modeling human decisions in coupled human and natural systems: Review of agent-based models. Ecological Modelling, 229, 25–36. https://doi.org/10.1016/j.ecolmodel.2011.07.010
Ascough, J. C., Maier, H. R., Ravalico, J. K., & Strudley, M. W. (2008). Future research challenges for incorporation of uncertainty in environmental and ecological decision-making. Ecological Modelling, 219(3–4), 383–399. https://doi.org/10.1016/j.ecolmodel.2008.07.015
Bautista, C., Naves, J., Revilla, E., Fernández, N., Albrecht, J., Scharf, A. K., … Selva, N. (2016). Patterns and correlates of claims for brown bear damage on a continental scale. Journal of Applied Ecology. http://doi.org/10.1111/1365-2664.12708
Bennett, E. M. (2017). Changing the agriculture and environment conversation. Nature Ecology and Evolution, 1(January), 1–2. https://doi.org/10.1038/s41559-016-0018
Bischof, R., Nilsen, E. B., Brøseth, H., Männil, P., Ozoliņš, J., & Linnell, J. D. C. (2012). Implementation uncertainty when using recreational hunting to manage carnivores. Journal of Applied Ecology, 49(4), 824–832. https://doi.org/10.1111/j.1365-2664.2012.02167.x
Bjerketvedt, D. K., Reimers, E., Parker, H., & Borgstrøm, R. (2014). The Hardangervidda wild reindeer herd: a problematic management history. Rangifer, 34(1), 57–72.
Bonabeau, E. (2002). Agent-based modeling: methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences, 99, 7280–7287. https://doi.org/10.1073/pnas.082080899
Bunnefeld, N., & Keane, A. (2014). Managing wildlife for ecological, socioeconomic, and evolutionary sustainability. Proceedings of the National Academy of Sciences, 111(36), 12964–12965. http://doi.org/10.1073/pnas.1413571111
Bunnefeld, N., Hoshino, E., & Milner-Gulland, E. J. (2011). Management strategy evaluation: A powerful tool for conservation? Trends in Ecology and Evolution, 26(9), 441–447. http://doi.org/10.1016/j.tree.2011.05.003
Chollett, I., Garavelli, L., O’Farrell, S., Cherubin, L., Matthews, T. R., Mumby, P. J., & Box, S. J. (2016). A Genuine Win-Win: Resolving the “Conserve or Catch” Conflict in Marine Reserve Network Design. Conservation Letters, 0(0), 1–9. https://doi.org/10.1111/conl.12318
Cobano, J. A., Conde, R., Alejo, D., & Ollero, A. (2011). Path planning method based on Genetic Algorithms and the Monte-Carlo method to avoid aerial vehicle collisions under uncertainties. In Proceedings of the IEEE International Conference on Robotics and Automation (pp. 4429–4434). https://doi.org/10.1109/ICRA.2011.5980246
Colyvan, M., Justus, J., & Regan, H. M. (2011). The conservation game. Biological Conservation, 144(4), 1246–1253. http://doi.org/10.1016/j.biocon.2010.10.028
Duffy, R., St John, F. A. V, Büscher, B., & Brockington, D. (2016). Toward a new understanding of the links between poverty and illegal wildlife hunting. Conservation Biology, 30(1), 14–22. https://doi.org/10.1111/cobi.12622
Elston, D. A., Spezia, L., Baines, D., & Redpath, S. M. (2014). Working with stakeholders to reduce conflict-modelling the impact of varying hen harrier Circus cyaneus densities on red grouse Lagopus lagopus populations. Journal of Applied Ecology, 51(5), 1236–1245. http://doi.org/10.1111/1365-2664.12315
Eythórsson, E., Tombre, I. M., & Madsen, J. (2017). Goose management schemes to resolve conflicts with agriculture: Theory, practice and effects. Ambio, 46(S2), 231–240. https://doi.org/10.1007/s13280-016-0884-4
Farmer, J. D., & Foley, D. (2009). The economy needs agent-based modelling. Nature, 460(August), 685–686. https://doi.org/10.1038/460685a
Franco, C., Hepburn, L. A., Smith, D. J., Nimrod, S., & Tucker, A. (2016). A Bayesian Belief Network to assess rate of changes in coral reef ecosystems. Environmental Modelling and Software, 80, 132–142. https://doi.org/10.1016/j.envsoft.2016.02.029
Hake, M., Mansson, J., & Wiberg, A. (2010). A working model for preventing crop damage caused by increasing goose populations in Sweden. Ornis Svecica, 20(3-4), 225–233.
Hamblin, S. (2013). On the practical usage of genetic algorithms in ecology and evolution. Methods in Ecology and Evolution, 4(2), 184–194. https://doi.org/10.1111/2041-210X.12000
Heinonen, J. P. M., Palmer, S. C. F., Redpath, S. M., & Travis, J. M. J. (2014). Modelling hen harrier dynamics to inform human-wildlife conflict resolution: A spatially-realistic, individual-based approach. PLoS ONE, 9(11). http://doi.org/10.1371/journal.pone.0112492
Hindar, K., Fleming, I. A., McGinnity, P., & Diserud, O. (2006). Genetic and ecological effects of salmon farming on wild salmon: modelling from experimental results. ICES Journal of Marine Science, 63(7), 1234–1247. https://doi.org/10.1016/j.icesjms.2006.04.025
Janssen, M. A., Holahan, R., Lee, A., & Ostrom, E. (2010). Lab experiments for the study of socio-ecological systems. Science, 328, 613–618. http://doi.org/10.1126/science.1229223
Karlsson, S., Diserud, O. H., Fiske, P., & Hindar, K. (2016). Widespread genetic introgression of escaped farmed Atlantic salmon in wild salmon populations. ICES Journal of Marine Science, 0, fsw121. https://doi.org/10.1093/icesjms/fsw121
Liu, Y., Diserud, O. H., Hindar, K., & Skonhoft, A. (2013). An ecological-economic model on the effects of interactions between escaped farmed and wild salmon (Salmo salar). Fish and Fisheries, 14(2), 158–173. http://doi.org/10.1111/j.1467-2979.2012.00457.x
Luo, X., Yang, W., Kwong, C., Tang, J., & Tang, J. (2014). Linear programming embedded genetic algorithm for product family design optimization with maximizing imprecise part-worth utility function. Concurrent Engineering, 22(4), 309–319. https://doi.org/10.1177/1063293X14553068
Man, M., Zhang, Y., Ma, G., Friston, K., & Liu, S. (2016). Quantification of degeneracy in Hodgkin-Huxley neurons on Newman-Watts small world network. Journal of Theoretical Biology, 402, 62–74. http://doi.org/10.1016/j.jtbi.2016.05.004
Manfredo, M. J., Bruskotter, J. T., Teel, T. L., Fulton, D., Schwartz, S. H., Arlinghaus, R., … Sullivan, L. (2016). Why social values cannot be changed for the sake of conservation. Conservation Biology. Accepted. https://doi.org/10.1111/cobi.12855
Mansson, J., Nilsson, L., & Hake, M. (2013). Territory size and habitat selection of breeding Common Cranes (Grus grus) in a boreal landscape. Ornis Fennica, 90(2), 65–72.
Marks, R. E. (1992). Breeding hybrid strategies: optimal behaviour for oligopolists. Journal of Evolutionary Economics, 2(1), 17–38. https://doi.org/10.1007/BF01196459
McAvoy, A., & Hauert, C. (2015). Asymmetric evolutionary games. PLoS Computational Biology, 11(8), e1004349. https://doi.org/10.1371/journal.pcbi.1004349
McCann, R. K., Marcot, B. G., & Ellis, R. (2006). Bayesian belief networks: applications in ecology and natural resource. Canadian Journal of Forest Research, 36, 3053–3062.
Miyasaka, T., Le, Q. B., Okuro, T., Zhao, X., & Takeuchi, K. (2017). Agent-based modeling of complex social–ecological feedback loops to assess multi-dimensional trade-offs in dryland ecosystem services. Landscape Ecology. https://doi.org/10.1007/s10980-017-0495-x
Nellemann, C., Jordhoy, P., Stoen, O. G., & Strand, O. (2000). Cumulative impacts of tourist resorts on wild reindeer (Rangifer tarandus tarandus) during winter. Arctic, 53(1), 9–17. https://doi.org/10.14430/arctic829
Nellemann, C., Vistnes, I., Jordhoy, P., Strand, O., & Newton, A. (2003). Progressive impact of piecemeal infrastructure development on wild reindeer. Biological Conservation, 113(2), 307–317. https://doi.org/10.1016/S0006-3207(03)00048-X
Olaussen, J. O., & Skonhoft, A. (2008). On the economics of biological invasion: An application to recreational fishing. Natural Resource Modeling, 21(4), 625–653. https://doi.org/10.1111/j.1939-7445.2008.00026.x
Rumpff, L., Duncan, D. H., Vesk, P. A., Keith, D. A., & Wintle, B. A. (2011). State-and-transition modelling for Adaptive Management of native woodlands. Biological Conservation, 144(4), 1244–1235. http://doi.org/10.1016/j.biocon.2010.10.026
Strand, O., Nilsen, E. B., Solberg, E. J., & Linnell, J. C. D. (2012). Can management regulate the population size of wild reindeer (Rangifer tarandus) through harvest? Canadian Journal of Zoology, 90, 163–171. http://doi.org/10.1139/Z11-123
Tilman, A. R., Watson, J. R., & Levin, S. (2016). Maintaining cooperation in social-ecological systems. Theoretical Ecology. https://doi.org/10.1007/s12080-016-0318-8
Tu, M. T., Wolff, E., & Lamersdorf, W. (2000). Genetic algorithms for automated negotiations: a FSM-based application approach. Proceedings 11th International Workshop on Database and Expert Systems Applications, 1029–1033. https://doi.org/10.1109/DEXA.2000.875153
Wam, H. K., Bunnefeld, N., Clarke, N., & Hofstad, O. (2016). Conflicting interests of ecosystem services: Multi-criteria modelling and indirect evaluation to trade off monetary and non-monetary measures. Ecosystem Services.
Wang, P., Poe, G. L., & Wolf, S. A. (2017). Payments for ecosystem services and wealth distribution. Ecological Economics, 132, 63–68. https://doi.org/10.1016/j.ecolecon.2016.10.009
Wright, G. D., Andersson, K. P., Gibson, C. C., & Evans, T. P. (2016). Decentralization can help reduce deforestation when user groups engage with local government. Proceedings of the National Academy of Sciences, 201610650. https://doi.org/10.1073/pnas.1610650114
Adami, C., and A. Hintze. 2013. Evolutionary instability of zero-determinant strategies demonstrates that winning is not everything. Nature communications 4:2193. Nature Publishing Group.
Adami, C., J. Schossau, and A. Hintze. 2016. Evolutionary game theory using agent-based methods. Physics of Life Reviews 19:1–26. Elsevier B.V.
An, L. 2012. Modeling human decisions in coupled human and natural systems: Review of agent-based models. Ecological Modelling 229:25–36.
Ascough, J. C., H. R. Maier, J. K. Ravalico, and M. W. Strudley. 2008. Future research challenges for incorporation of uncertainty in environmental and ecological decision-making. Ecological Modelling 219:383–399.
Avis, D., G. D. Rosenberg, R. Savani, and B. von Stengel. 2009. Enumeration of Nash equilibria for two-player games. Economic Theory 42:9–37.
Balmann, A., and K. Happe. 2000. Applying parallel genetic algorithms to economic problems: The case of agricultural land markets. in IIFET conference “microbehavior and macroresults”. proceedings.
Bocedi, G., S. C. F. Palmer, G. Pe’er, R. K. Heikkinen, Y. G. Matsinos, K. Watts, and J. M. J. Travis. 2014. RangeShifter: a platform for modelling spatial eco-evolutionary dynamics and species’ responses to environmental changes. Methods in Ecology and Evolution 5:388–396.
Bonabeau, E. 2002. Agent-based modeling: methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences 99:7280–7287.
Bronstein, J. L., W. G. Wilson, and W. F. Morris. 2003. Ecological dynamics of mutualist/antagonist communities. American Naturalist 162:S24–39.
Bunnefeld, N., E. Hoshino, and E. J. Milner-Gulland. 2011. Management strategy evaluation: A powerful tool for conservation? Trends in Ecology and Evolution 26:441–447.
Carter, G. G., and G. S. Wilkinson. 2013. Food sharing in vampire bats: reciprocal help predicts donations more than relatedness or harassment. Proceedings of The Royal Society B 280:20122573.
Carter, G. G., and G. S. Wilkinson. 2015. Social benefits of non-kin food sharing by female vampire bats. Proceedings of The Royal Society B 282:20152524.
Colyvan, M., J. Justus, and H. M. Regan. 2011. The conservation game. Biological Conservation 144:1246–1253. Elsevier Ltd.
Correia, L. 2010. Computational evolution: Taking liberties. Theory in Biosciences 129:183–191.
Darwen, P. J., and X. Yao. 1995. On evolving robust strategies for iterated prisoner’s dilemma. Pp. 276–292 in Progress in evolutionary computation.
Dawkins, R. 1976. The Selfish Gene. Oxford University Press, Oxford.
Débarre, F., C. Hauert, and M. Doebeli. 2014. Social evolution in structured populations. Nature Communications 5:3409.
Duthie, A. B., and M. R. Falcy. 2013. The influence of habitat autocorrelation on plants and their seed-eating pollinators. Ecological Modelling 251:260–270.
Fawcett, T. W., B. Fallenstein, A. D. Higginson, A. I. Houston, D. E. W. Mallpress, P. C. Trimmer, and J. M. McNamara. 2014. The evolution of decision rules in complex environments. Trends in Cognitive Sciences 18:153–161. Elsevier Ltd.
Fonseca, C. M., and P. J. Fleming. 1993. Genetic algorithms for multiobjective optimization: formulation, discussion and generalization. Pp. 416–423 in Icga.
Fonseca, C. M., and P. J. Fleming. 1998. Multiobjective optimization and multiple constraint handling with evolutionary algorithms - Part I: A unified formulation. IEEE Transactions on Systems, Man, and Cybernetics Part A:Systems and Humans. 28:26–37.
Franco, C., L. A. Hepburn, D. J. Smith, S. Nimrod, and A. Tucker. 2016. A Bayesian Belief Network to assess rate of changes in coral reef ecosystems. Environmental Modelling and Software 80:132–142.
Hamblin, S. 2013. On the practical usage of genetic algorithms in ecology and evolution. Methods in Ecology and Evolution 4:184–194.
Horn, J., N. Nafpliotis, and D. E. Goldberg. 1993. Multiobjective optimization using the niched Pareto genetic algorithm.
Jaszkiewicz, A. 2002. Genetic local search for multi-objective combinatorial optimization. European Journal of Operational Research 137:50–71.
Kark, S., A. Tulloch, A. Gordon, T. Mazor, N. Bunnefeld, and N. Levin. 2015. Cross-boundary collaboration: Key to the conservation puzzle. Current Opinion in Environmental Sustainability 12:12–24.
Kenkel, B., and C. S. Signorino. 2014. Estimating Extensive Form Games in R. Journal of Statistical Software 56:1–27.
Kumar, A. 2013. Encoding schemes in genetic algorithm. International Journal of Advanced Research in IT and Engineering 2:1–7.
Lee, C. S. 2012. Multi-objective game-theory models for conflict analysis in reservoir watershed management. Chemosphere 87:608–613.
Leombruni, R., and M. Richiardi. 2005. Why are economists sceptical about agent-based simulations? Physica A 355:103–109.
Luke, S. 2015. Essentials of Metaheuristics.
Luo, X., W. Yang, C. Kwong, J. Tang, and J. Tang. 2014. Linear programming embedded genetic algorithm for product family design optimization with maximizing imprecise part-worth utility function. Concurrent Engineering 22:309–319.
Man, M., Y. Zhang, G. Ma, K. Friston, and S. Liu. 2016. Quantification of degeneracy in Hodgkin-Huxley neurons on Newman-Watts small world network. Journal of Theoretical Biology 402:62–74.
Marks, R. E. 1992. Breeding hybrid strategies: optimal behaviour for oligopolists. Journal of Evolutionary Economics 2:17–38.
Maynard Smith, J., and G. A. Parker. 1976. The logic of asymmetric contests. Animal Behaviour 24:159–175.
McAvoy, A., and C. Hauert. 2015. Asymmetric evolutionary games. PLoS Computational Biology 11:e1004349.
McFadden, D. 1973. Conditional logit analysis of qualitative choice behavior. Pp. 105–142 in P. Zarembka, ed. Frontiers in econometrics. Academic Press Inc, New York.
McNamara, J. M., P. C. Trimmer, and A. I. Houston. 2014. Natural selection can favour “irrational” behaviour. Biology Letters 10:20130935.
Milner-Gulland, E. J. 2011. Integrating fisheries approaches and household utility models for improved resource management. Proceedings of the National Academy of Sciences 108:1741–1746.
Miltersen, P. B., and T. B. Sørensen. 2009. Computing a quasi-perfect equilibrium of a two-player game. Economic Theory 42:175–192.
Miyasaka, T., Q. B. Le, T. Okuro, X. Zhao, and K. Takeuchi. 2017. Agent-based modeling of complex social–ecological feedback loops to assess multi-dimensional trade-offs in dryland ecosystem services. Landscape Ecology, doi: 10.1007/s10980-017-0495-x.
Naivinit, W., C. Le Page, G. Trébuil, and N. Gajaseni. 2010. Participatory agent-based modeling and simulation of rice production and labor migrations in Northeast Thailand. Environmental Modelling and Software 25:1345–1358.
Nautiyal, S., and H. Kaechele. 2009. Natural resource management in a protected area of the Indian Himalayas: A modeling approach for anthropogenic interactions on ecosystem. Environmental Monitoring and Assessment 153:253–271.
Nellemann, C., P. Jordhøy, O. G. Støen, and O. Strand. 2000. Cumulative impacts of tourist resorts on wild reindeer (Rangifer tarandus tarandus) during winter. Arctic 53:9–17.
Nowak, M. A., K. Sigmund, and E. El-Sedy. 1995. Automata, repeated games and noise. Journal of Mathematical Biology 33:703–722.
Nuno, A., N. Bunnefeld, and E. J. Milner-Gulland. 2013. Matching observations and reality: Using simulation models to improve monitoring under uncertainty in the Serengeti. Journal of Applied Ecology 50:488–498.
Phan, D. 2003. From agent-based computational economics toward cognitive economics. Pp. 369–396 in P. Bourgine and J.-P. Nadal, eds. Cognitive economics: An interdisciplinary approach. Springer, London.
Pollock, K. H., J. D. Nichols, C. Brownie, and J. E. Hines. 1990. Statistical inference for capture-recapture experiments. Wildlife Monographs 27:938–942.
Salomon, R. 1996. The influence of different coding schemes on the computational complexity of genetic algorithms in function optimization. Pp. 227–235 in Parallel problem solving from nature. Springer Berlin Heidelberg.
Savani, R., and B. von Stengel. 2014. Game theory explorer: software for the applied game theorist. Computational Management Science 5–33.
Tesfatsion, L., C. R. Rehmann, D. S. Cardoso, Y. Jie, and W. J. Gutowski. 2017. An agent-based platform for the study of watersheds as coupled natural and human systems. Environmental Modelling and Software 89:40–60.
Tilman, A. R., J. R. Watson, and S. Levin. 2016. Maintaining cooperation in social-ecological systems. Theoretical Ecology, doi: 10.1007/s12080-016-0318-8.
Trivers, R. 1985. Social Evolution. The Benjamin/Cummings Publishing Company, Inc., Menlo Park, California.
Tu, M. T., E. Wolff, and W. Lamersdorf. 2000. Genetic algorithms for automated negotiations: a FSM-based application approach. Proceedings 11th International Workshop on Database and Expert Systems Applications 1029–1033.
Watson, R. A., and E. Szathmary. 2016. How can evolution learn? Trends in Ecology and Evolution 31:147–157.
Wilkinson, G. S. 1990. Food sharing in vampire bats. Scientific American 262:64–70.
Zeeman, E. C. 1980. Population dynamics from game theory. Pp. 471–497 in Z. Nitecki and C. Robinson, eds. Global theory of dynamical systems. Springer-Verlag, Berlin.