4 Modelling

4.1 Guiding Questions

  • What decisions did you make when creating your iSSAs and why?
  • What is the biological or statistical justification for these decisions?
  • How might your decisions impact your inferences?

Practitioners often need to combine data from multiple individuals. If ‘population-level’ inference is the only goal, include the same number of clusters from each individual in a single model. Equal sampling intensity across individuals helps address potential bias.
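One way to equalise sampling intensity is to subsample clusters so that every individual contributes the same number. The following is a minimal base R sketch, not a prescribed workflow: the data frame `clusters`, its columns `id` and `step_id`, and the choice to keep the smallest individual total are all assumptions made for illustration.

```r
# Toy data (hypothetical): one row per cluster, three individuals with
# unequal numbers of clusters
set.seed(1)
clusters <- data.frame(
  id      = rep(c("A", "B", "C"), times = c(50, 80, 120)),
  step_id = seq_len(250)
)

# Keep as many clusters per individual as the least-sampled individual has
n_keep <- min(table(clusters$id))

# Randomly sample n_keep cluster ids within each individual
keep <- unlist(lapply(split(clusters$step_id, clusters$id), function(s) {
  s[sample.int(length(s), n_keep)]
}))

# Retain only the sampled clusters
balanced <- clusters[clusters$step_id %in% keep, ]
table(balanced$id)
```

In a real data set the same `keep` vector would be used to filter the full table of used and available steps, so that each retained cluster stays complete.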

4.2 Model Building

The model or model set requires justification. We direct readers to Fieberg and Johnson (2015) and Northrup et al. (2021) for detailed discussion of, and references on, model building. We advocate for global models or distinct competing candidate models that represent ecological processes. We do not recommend a dredging approach or large candidate model sets, as they often lead to the interpretation of spurious results.

4.3 Two-step approach

4.3.1 Step 1

Fit a global model, or models representing alternative hypotheses, when the goal is to describe the ecological processes.

The global or alternative models can be composed of core variables and variables of interest.

The concept of a core model is to identify key features of animal movement that are important but perhaps not the covariates of interest to the particular study or hypotheses. – Prokopenko et al 2017

# > code
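As a placeholder illustration of Step 1, the sketch below fits one conditional logistic regression per individual with `survival::clogit()`, the function that the clogit-based fitting in amt relies on. Everything about the data is an assumption: the simulated covariate `x`, the individuals "A" and "B", their true coefficients, and the helper `sim_individual()` are hypothetical and stand in for real used and available steps.

```r
library(survival)

# Hypothetical simulator: each stratum has 1 used and n_avail available steps,
# with the used step chosen with probability proportional to exp(beta * x)
set.seed(3)
sim_individual <- function(id, beta, n_strata = 200, n_avail = 5) {
  do.call(rbind, lapply(seq_len(n_strata), function(s) {
    x    <- rnorm(n_avail + 1)
    used <- sample.int(n_avail + 1, 1, prob = exp(beta * x))
    data.frame(id = id, step_id = s, x = x,
               case_ = as.integer(seq_along(x) == used))
  }))
}
dat <- rbind(sim_individual("A", 0.5), sim_individual("B", 1.0))

# Step 1: one model per individual, stratified by step
fits <- lapply(split(dat, dat$id), function(d) {
  clogit(case_ ~ x + strata(step_id), data = d)
})

# Collect each individual's estimate and standard error for use in Step 2
indiv <- t(sapply(fits, function(m) {
  c(estimate = unname(coef(m)), se = sqrt(vcov(m)[1, 1]))
}))
indiv
```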

4.3.1.1 Troubleshooting

Starting with simpler models can help to identify covariates that might be causing problems.

If there are no observations for certain categories or interactions, the model will likely not converge.
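A quick tabulation can reveal empty categories before fitting. This is a minimal sketch on made-up data; the data frame `steps` and its columns `id`, `lc`, and `case_` are hypothetical names chosen to resemble the example model later in this section.

```r
# Toy data (hypothetical): used (case_ = TRUE) and available (case_ = FALSE)
# steps for two individuals across three land cover classes
steps <- data.frame(
  id    = rep(c("A", "B"), each = 6),
  lc    = factor(c("forest", "open", "wetlands", "forest", "open", "wetlands",
                   "forest", "forest", "open", "forest", "open", "wetlands")),
  case_ = rep(c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE), times = 2)
)

# Count steps for every individual x category x used/available combination
counts <- as.data.frame(table(id = steps$id, lc = steps$lc, case_ = steps$case_))

# Combinations with no observations are candidates for convergence problems
empty <- counts[counts$Freq == 0, ]
empty
```

Here individual B has no used steps in wetlands, so a model estimating wetland selection for B would have nothing to estimate it from.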

4.3.2 Step 2

  1. Bootstrap individual models to get population mean and CIs (Prokopenko et al. 2017, Scrafford et al. 2018)

  2. Calculate a population-level average by modelling each variable's coefficient as a function of anything that interacted with that variable, with availability as an explanatory factor and inverse variance as the weighting (Dickie et al. 2020). See the Supplementary Information.
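The two options above can be sketched in base R as follows. This is a simplified illustration, not the procedure of the cited papers: the vectors `beta` and `se` hold made-up individual estimates and standard errors for a single coefficient, the bootstrap resamples individuals with a percentile interval, and the weighted model is intercept-only, whereas the approach described above would add interacting variables and availability on the right-hand side.

```r
# Hypothetical per-individual estimates of one coefficient and their
# standard errors (made-up values)
beta <- c(0.42, 0.10, 0.35, 0.58, 0.21, 0.30)
se   <- c(0.10, 0.25, 0.08, 0.30, 0.12, 0.15)

# Option 1: bootstrap individuals to get a population mean and percentile CI
set.seed(42)
boot_means <- replicate(2000, mean(sample(beta, replace = TRUE)))
boot_mean  <- mean(boot_means)
boot_ci    <- quantile(boot_means, c(0.025, 0.975))

# Option 2: inverse-variance weighted average, written as an intercept-only
# linear model so that explanatory variables could be added later
w        <- 1 / se^2
fit      <- lm(beta ~ 1, weights = w)
pop_mean <- unname(coef(fit)[1])
```

With no explanatory variables, the intercept from option 2 is simply the inverse-variance weighted mean of the individual estimates, so precisely estimated individuals contribute more.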

4.4 Mixed Model Approach

(Muff et al. 2020)

Regarding our discussion of nesting random effects, see https://stats.stackexchange.com/questions/228800/crossed-vs-nested-random-effects-how-do-they-differ-and-how-are-they-specified. Nested random effects are written as:

(1 | group1 / group2)
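In lme4-style formulas, `(1 | group1 / group2)` is shorthand for `(1 | group1) + (1 | group1:group2)`. The base R sketch below shows the same expansion of the `/` operator in an ordinary formula; `group1` and `group2` are the placeholder names from the notation above and no data are needed.

```r
# The `/` operator expands to the outer factor plus the inner factor
# nested within it
nested <- attr(terms(~ group1 / group2), "term.labels")
nested
```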

4.5 Output

Here we are just describing what you will see from your output.

4.5.1 Coefficients

The estimates are the selection or movement coefficients, either for individuals or the population depending on your input data and model structure.

For a mixed model, the random effects in the output are deviations relative to the fixed effect.

To calculate an individual's selection coefficient, add the two: individual coefficient = fixed effect + random effect.
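The arithmetic can be sketched with plain vectors and matrices. The numbers below are made up for illustration; with a fitted glmmTMB model the two components would come from `fixef()` and `ranef()`, whose conditional parts have a comparable layout.

```r
# Hypothetical population-level (fixed) coefficients
fixed <- c(forest = -3.4, open = -5.9)

# Hypothetical individual deviations (random effects), one row per individual
random <- rbind(
  id1 = c(forest =  0.2, open = -1.1),
  id2 = c(forest = -0.1, open =  0.4),
  id3 = c(forest = -0.1, open =  0.7)
)

# Individual coefficient = fixed effect + random effect, column by column
individual <- sweep(random, 2, fixed[colnames(random)], FUN = "+")
individual
```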

4.5.2 Std. Error/CIs

Check the fixed and random effect standard errors to see if they are really large or NAs.

For example, note the NaN standard errors in the example model using land cover. They appear at the bottom of the summary, under the second “Conditional model” heading.

summary(tar_read(model_lc))
##  Family: poisson  ( log )
## Formula:          
## case_ ~ -1 + I(log(sl_)) + I(log(sl_)):lc_adj + lc_adj + I(log(dist_to_water +  
##     1)) + I(log(dist_to_water + 1)):I(log(sl_)) + (1 | indiv_step_id) +  
##     (0 + I(log(sl_)) | id) + (0 + I(log(sl_)):lc_adj | id) +  
##     (0 + lc_adj | id) + (0 + I(log(dist_to_water + 1)) | id) +  
##     (0 + I(log(dist_to_water + 1)):I(log(sl_)) | id)
## Data: DT
## 
##      AIC      BIC   logLik deviance df.resid 
##       NA       NA       NA       NA    26488 
## 
## Random effects:
## 
## Conditional model:
##  Groups        Name                                  Variance Std.Dev. Corr 
##  indiv_step_id (Intercept)                           1.00e+06 1.00e+03      
##  id            I(log(sl_))                           1.87e-90 1.37e-45      
##  id.1          I(log(sl_)):lc_adjdisturbed           1.08e-02 1.04e-01      
##                I(log(sl_)):lc_adjforest              2.05e-03 4.52e-02 -1.00
##                I(log(sl_)):lc_adjopen                1.08e-01 3.28e-01  1.00
##                I(log(sl_)):lc_adjwetlands            2.05e-03 4.53e-02 -1.00
##  id.2          lc_adjdisturbed                       4.42e-01 6.64e-01      
##                lc_adjforest                          3.32e-02 1.82e-01 -1.00
##                lc_adjopen                            4.42e+00 2.10e+00  1.00
##                lc_adjwetlands                        2.38e-26 1.54e-13 -0.27
##  id.3          I(log(dist_to_water + 1))             1.34e-02 1.16e-01      
##  id.4          I(log(dist_to_water + 1)):I(log(sl_)) 5.26e-90 2.29e-45      
##              
##              
##              
##              
##              
##  -1.00       
##   1.00 -1.00 
##              
##              
##  -1.00       
##   0.30 -0.28 
##              
##              
## Number of obs: 26521, groups:  indiv_step_id, 2411; id, 6
## 
## Conditional model:
##                                       Estimate Std. Error z value Pr(>|z|)
## I(log(sl_))                           -0.13107        NaN     NaN      NaN
## lc_adjdisturbed                       -3.74565        NaN     NaN      NaN
## lc_adjforest                          -3.42646        NaN     NaN      NaN
## lc_adjopen                            -5.94910        NaN     NaN      NaN
## lc_adjwetlands                        -3.67823        NaN     NaN      NaN
## I(log(dist_to_water + 1))             -0.03581        NaN     NaN      NaN
## I(log(sl_)):lc_adjforest               0.23726        NaN     NaN      NaN
## I(log(sl_)):lc_adjopen                 0.29305        NaN     NaN      NaN
## I(log(sl_)):lc_adjwetlands             0.28385        NaN     NaN      NaN
## I(log(sl_)):I(log(dist_to_water + 1))  0.00072        NaN     NaN      NaN
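The same check can be done programmatically on a coefficient table. In this sketch the helper `flag_bad_se()` and its threshold `se_limit` are hypothetical, and the small table re-enters three rows from the summary above by hand; with a fitted glmmTMB model, a table of this shape is available from `coef(summary(model))$cond`.

```r
# Hypothetical helper: names of terms whose standard error is missing,
# non-finite, or larger than an arbitrary limit
flag_bad_se <- function(coef_table, se_limit = 10) {
  se <- coef_table[, "Std. Error"]
  rownames(coef_table)[!is.finite(se) | se > se_limit]
}

# Three rows copied from the summary output above
coef_table <- matrix(
  c(-0.13107, -3.74565, -0.03581, NaN, NaN, NaN),
  ncol = 2,
  dimnames = list(
    c("I(log(sl_))", "lc_adjdisturbed", "I(log(dist_to_water + 1))"),
    c("Estimate", "Std. Error")
  )
)

flagged <- flag_bad_se(coef_table)
flagged
```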

4.5.3 Troubleshooting

We have had success troubleshooting by searching for the error message on Google and looking for it in the GitHub issues for glmmTMB or lme4, since the two packages are built more or less the same way. Ben Bolker has left lots of hidden tips and tricks there and is also very responsive.

https://cran.r-project.org/web/packages/glmmTMB/vignettes/troubleshooting.html

Use set.seed() to get the same model output each time. Also check that the output does not vary greatly with different seeds or when no seed is set.

Be conservative in “trusting” the model. Don’t accept models with any NAs in the output.

Unlike with clogit in amt, for glmmTMB simpler models do not always improve convergence, but adding covariates with informative variation will improve model performance and convergence.

We have found through trial and error that cos(TA), the cosine of the turn angle, can make or break the model. These Poisson models seem to like lots of data and a fair number of variables, but the optimizer is cranky. If you have too few variables, and they are correlated or have a high variance inflation factor (VIF), you will get NAs.
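Collinearity among covariates can be screened before fitting. The base R sketch below computes a VIF for each covariate as 1 / (1 - R²) from regressing it on the others; the covariates `x1`, `x2`, and `x3` are simulated for illustration, with `x2` deliberately built as a near copy of `x1`.

```r
# Simulated covariates (hypothetical): x2 is nearly collinear with x1
set.seed(7)
n  <- 500
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

# VIF for each covariate: regress it on all the others and use 1 / (1 - R^2)
vif <- sapply(names(X), function(v) {
  f  <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
})

# Pairwise correlations give a complementary view
round(cor(X), 2)
vif
```

In this toy case `x1` and `x2` show very large VIFs while `x3` stays close to 1, which is the pattern to look for in real covariates.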

Use the performance package and its check_model() or model_performance() functions.

glmmTMB gives the “Model convergence problem; non-positive-definite Hessian matrix” warning very liberally. Generally, you don’t have to worry about it unless it comes with other errors.

EXERCISE: Take note of individuals or variables that are not converging or are at the extremes of the response. Do they have different availability, fewer points, or more NAs?

# > plot coefficient by sample size – is there a relationship?
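A minimal base R sketch of that plot follows. The table `indiv`, with one row per individual, is entirely made up; in practice `n_steps` and `estimate` would come from your own data and individual models.

```r
# Hypothetical individual-level summary: sample size and one coefficient
indiv <- data.frame(
  id       = paste0("id", 1:6),
  n_steps  = c(120, 340, 410, 95, 600, 280),
  estimate = c(1.90, 0.45, 0.38, -2.10, 0.41, 0.52)
)

# Plot the coefficient against sample size and mark zero for reference
plot(estimate ~ n_steps, data = indiv,
     xlab = "Number of steps", ylab = "Coefficient estimate")
abline(h = 0, lty = 2)

# A rank correlation gives a rough numerical summary of any relationship
rho <- cor(indiv$n_steps, indiv$estimate, method = "spearman")
rho
```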