
Fit a Spatial Model for Age Parameters using TMB
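Fits a joint spatial model for the scale and shape age parameters using Template Model Builder (TMB) and a C++ template. Spatial correlation enters through a distance matrix computed from the coordinates, and the scale and shape parameters are estimated simultaneously. Progress messages are printed when verbose = TRUE. If country_code is supplied, the fitted model is cached and can be loaded from that cache instead of being refitted.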
Usage
fit_spatial_model(
data,
country_code = NULL,
scale_outcome,
shape_outcome,
covariates,
cpp_script_name,
verbose = TRUE,
control_params = list(trace = 2),
manual_params = NULL,
output_dir = NULL,
ignore_cache = FALSE
)
Arguments
- data
A data frame containing the response variables, covariates, and spatial coordinates (web_x, web_y)
- country_code
Optional country code to save/load cached model. Default NULL runs model without caching.
- scale_outcome
Character string specifying the column name for the scale parameter response variable
- shape_outcome
Character string specifying the column name for the shape parameter response variable
- covariates
Character vector of covariate names to include in both scale and shape models
- cpp_script_name
Character string specifying the name of the C++ file (without extension) containing the TMB model definition
- verbose
Logical indicating whether to show progress updates. Default TRUE
- control_params
List of control parameters passed to nlminb optimizer. Default: list(trace = 2)
- manual_params
Optional list of manual parameter values. If NULL (default), initial parameters are estimated from linear regression. The list should contain:
beta1: Vector of coefficients for scale model
beta2: Vector of coefficients for shape model
gamma: Scalar value (default 1.0)
log_sigma2: Log of sigma squared (default log(1.0))
log_phi: Log of phi (estimated from variogram)
log_tau2_1: Log of tau squared (default log(1.0))
- output_dir
Directory to save cached models. Only used if country_code is provided.
- ignore_cache
Logical indicating whether to ignore an existing cached model. Default FALSE.
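As an illustration of the manual_params argument described above, the list could be built as follows. This is a minimal sketch: all values are hypothetical starting points, and the lengths of beta1 and beta2 (one coefficient each here) are an assumption that must match what the C++ template expects.

```r
# Hypothetical starting values for manual_params.
# Assumption: one coefficient in each of beta1 and beta2.
manual_params <- list(
  beta1 = 2.8,            # starting coefficient(s) for the scale model
  beta2 = 0.3,            # starting coefficient(s) for the shape model
  gamma = 1.0,            # documented default
  log_sigma2 = log(1.0),  # documented default
  log_phi = log(100),     # illustrative value, not a recommendation
  log_tau2_1 = log(1.0)   # documented default
)

# Inspect the structure before passing it to fit_spatial_model()
str(manual_params)
```

The list would then be supplied as manual_params = manual_params in the call to fit_spatial_model().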
Value
An object of class 'nlminb' containing:
par - Optimized parameter values
objective - Final value of objective function
convergence - Convergence code
message - Convergence message
iterations - Number of iterations
evaluations - Number of function/gradient evaluations
scale_formula - Formula used for scale model
shape_formula - Formula used for shape model
variogram - Fitted variogram model from automap containing:
range - Spatial correlation range parameter
psill - Partial sill (structured variance)
nugget - Nugget effect (unstructured variance)
kappa - Smoothness parameter for Matern models
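The optimizer fields listed above are the standard elements returned by nlminb, so they can be checked in the usual way. The sketch below uses a toy quadratic objective as a stand-in for the TMB likelihood (an assumption made only so the example runs on its own) and shows a convergence check that could be applied to the returned object.

```r
# Toy objective standing in for the TMB likelihood (not the package's model)
toy_objective <- function(p) sum((p - c(2, -1))^2)

# nlminb returns par, objective, convergence, message, iterations, evaluations
fit <- nlminb(start = c(0, 0), objective = toy_objective)

# A convergence code of 0 indicates successful convergence in nlminb
if (fit$convergence != 0) {
  warning("Optimizer did not converge: ", fit$message)
}

round(fit$par, 3)  # optimized parameter values
fit$objective      # final value of the objective function
```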
Details
The function performs the following steps with progress tracking:
1. Fits initial linear models for the scale and shape parameters
2. Calculates the spatial distance matrix from the web coordinates
3. Estimates the optimal phi parameter using a variogram:
- Computes the empirical variogram using automap
- Automatically selects the best theoretical variogram model
- Uses the range parameter to initialize the spatial correlation
- Uses a default range of 100 if estimation fails
4. Compiles and loads the TMB C++ template
5. Optimizes the joint likelihood using nlminb
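Steps 1 and 2 can be sketched in base R as follows. This is an illustration on simulated data, not the function's internal code: the data frame d and its values are made up, and the exact formulas the function builds (for example, whether an intercept is included) are an assumption here.

```r
set.seed(1)
n <- 50

# Simulated stand-in data with the documented coordinate columns
d <- data.frame(
  web_x = rnorm(n, -1764351, 50000),
  web_y = rnorm(n, 1510868, 50000),
  urban = rbinom(n, 1, 0.6),
  log_scale = rnorm(n, 2.82, 0.2),
  log_shape = rnorm(n, 0.331, 0.1)
)

# Step 1: initial linear models give starting values for the coefficients
# (assumption: formulas with an intercept and the single covariate "urban")
scale_formula <- reformulate("urban", response = "log_scale")
shape_formula <- reformulate("urban", response = "log_shape")
beta1_init <- coef(lm(scale_formula, data = d))
beta2_init <- coef(lm(shape_formula, data = d))

# Step 2: pairwise Euclidean distances between the web coordinates
dist_matrix <- as.matrix(dist(d[, c("web_x", "web_y")]))
dim(dist_matrix)
```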
The spatial correlation is modeled using an exponential variogram with parameters estimated from the data. The distance matrix is computed from the web coordinates (web_x, web_y) and used in the spatial covariance structure.
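To make the role of the distance matrix concrete, the sketch below builds an exponential covariance matrix from three points. The parameterisation sigma2 * exp(-d / phi) plus a nugget tau2 on the diagonal is a common choice and is assumed here for illustration; the C++ template supplied by the user may define the covariance differently.

```r
# Three illustrative points forming a 3-4-5 triangle
coords <- cbind(x = c(0, 3, 0), y = c(0, 0, 4))
D <- as.matrix(dist(coords))

# Hypothetical parameter values
sigma2 <- 1    # partial sill (structured variance)
phi <- 5       # range parameter controlling decay with distance
tau2 <- 0.1    # nugget (unstructured variance)

# Assumed form: exponential decay with distance plus a nugget on the diagonal
Sigma <- sigma2 * exp(-D / phi) + tau2 * diag(nrow(D))

# A valid covariance matrix is positive definite, so chol() succeeds
L <- chol(Sigma)
round(Sigma, 3)
```

Larger values of phi make the correlation decay more slowly with distance, which is why a sensible starting value taken from the variogram range can help the optimizer.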
The C++ template should implement the joint spatial model for both parameters.
Note
Requires TMB package and a working C++ compiler. The C++ template must be properly structured for TMB. The automap package is required for variogram fitting.
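Before calling the function, it can be worth checking that the required packages are available. The snippet below is a small sketch using base R only; it does not check for a C++ compiler, which has to be verified separately.

```r
# Packages named in the note above
required <- c("TMB", "automap")

# requireNamespace() returns TRUE when a package can be loaded
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```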
Examples
# \donttest{
set.seed(123)
# Set parameters for simulation
total_population <- 266
urban_proportion <- 0.602
total_coords <- 266
lon_range <- c(-16.802, -13.849)
lat_range <- c(13.149, 13.801)
mean_web_x <- -1764351
mean_web_y <- 1510868
# Simulate processed survey dataset for Gambia
df_gambia <- list()
df_gambia$age_param_data <- dplyr::tibble(
country = "Gambia",
country_code_iso3 = "GMB",
country_code_dhs = "GM",
year_of_survey = 2024,
id_coords = rep(1:total_coords, length.out = total_population),
lon = runif(total_population, lon_range[1], lon_range[2]),
lat = runif(total_population, lat_range[1], lat_range[2]),
web_x = rnorm(total_population, mean_web_x, 50000),
web_y = rnorm(total_population, mean_web_y, 50000),
log_scale = rnorm(total_population, 2.82, 0.2),
log_shape = rnorm(total_population, 0.331, 0.1),
urban = rep(c(1, 0), c(
round(total_population * urban_proportion),
total_population - round(total_population * urban_proportion)
)),
b1 = rnorm(total_population, 0.0142, 0.002),
c = rnorm(total_population, -0.00997, 0.001),
b2 = rnorm(total_population, 0.00997, 0.002),
nsampled = sample(180:220, total_population, replace = TRUE)
)
tf <- file.path(tempdir(), "test_env")
dir.create(tf, recursive = TRUE, showWarnings = FALSE)
# Initialise files and key scripts
init(
r_script_name = "full_pipeline.R",
cpp_script_name = "model.cpp",
path = tf,
open_r_script = FALSE
)
#>
#> ── Package Installation Required ──
#>
#> The following packages are missing:
#> 1. remotes
#> ! Non-interactive session detected. Skipping package installation.
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1a_survey_data/processed
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1a_survey_data/raw
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1b_rasters/urban_extent
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1b_rasters/pop_raster
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1c_shapefiles
#> ! Exists: /tmp/RtmpioKZmp/test_env/02_scripts
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3a_model_outputs
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3b_visualizations
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3c_table_outputs
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3d_compiled_results
#> ✔ Folder structure created successfully.
#> ℹ R script created but could not open automatically: RStudio not available.
#> ✔ C++ script '/tmp/RtmpioKZmp/test_env/02_scripts/model.cpp' successfully created.
mod <- fit_spatial_model(
df_gambia$age_param_data,
scale_outcome = "log_scale",
shape_outcome = "log_shape",
covariates = "urban",
cpp_script_name = file.path(tf, "02_scripts/model"),
country_code = "GMB",
output_dir = file.path(tf, "03_outputs/3a_model_outputs")
)
#> ℹ Fitting initial linear models...
#> ✔ Fitted initial linear models.
#>
#> ℹ Calculating empirical variogram...
#> ✔ Empirical variogram fitted.
#>
#> ℹ Initializing data and parameters for optimisation...
#> ✔ Initialization complete.
#>
#> ℹ Compiling TMB model
#> using C++ compiler: ‘g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0’
#> ✔ Compiled TMB model
#>
#> ℹ Optimizing model
#> 0: 7391.7541: 2.83720 0.323435 1.00000 0.00000 8.83716 0.00000
#> Warning: NA/NaN function evaluation
#> 2: -45.724052: 2.82851 0.313906 0.139014 0.00633131 8.83460 0.652761
#> 4: -112.76135: 2.67969 0.338834 0.140349 0.230713 8.76187 0.641797
#> 6: -148.55923: 2.59956 0.304243 0.135334 0.348598 8.77452 0.627783
#> 8: -154.74828: 2.59119 0.288384 0.126840 0.362371 8.77260 0.620797
#> 10: -187.95776: 2.43726 0.294003 0.122847 0.521362 8.82711 0.531083
#> 12: -203.40876: 2.32277 0.244920 0.139613 0.557847 8.93342 0.475095
#> 14: -214.09635: 2.27434 0.236596 0.126814 0.564783 8.97112 0.452987
#> 16: -392.53144: 0.902964 -0.00738539 0.145402 0.586521 10.0772 -0.0783819
#> 18: -471.84194: 0.424267 -0.0679535 0.141062 0.560513 10.4651 -0.230572
#> 20: -497.15737: 0.378278 -0.0430941 0.132131 0.555632 10.5040 -0.241520
#> 22: -549.04862: -0.0662097 -0.0736989 0.124765 0.538575 10.8866 -0.385386
#> 24: -555.47256: -0.0647047 -0.0682112 0.128599 0.540465 10.8952 -0.386791
#> 26: -564.60555: -0.0573346 -0.0526167 0.125055 0.553066 10.9454 -0.396557
#> 28: -639.98506: -0.0320552 -0.0834152 0.133506 1.38982 12.9236 -0.0509674
#> 30: -646.66141: 0.112249 -0.0232367 0.113961 1.48952 13.0390 0.108164
#> 32: -661.31186: 0.0182220 -0.00863173 0.118296 1.37007 12.9474 -0.0181810
#> 34: -665.47144: 0.0462376 -0.00381306 0.116628 1.08306 12.8775 -0.0713908
#> 36: -666.80581: 0.0431125 -0.0105411 0.120972 1.13366 13.0155 -0.107353
#> 38: -667.15127: 0.0446306 -0.00837370 0.120140 1.47726 13.3372 -0.122730
#> 40: -667.30019: 0.0439417 -0.00987688 0.120452 1.91549 13.7884 -0.123881
#> 42: -667.30316: 0.0437350 -0.00984942 0.120447 2.00270 13.8763 -0.123219
#> 44: -667.30316: 0.0437301 -0.00982706 0.120441 2.00456 13.8780 -0.123348
#> ✔ Optimized model
#>
#> ✔ Model fitted and saved at /tmp/RtmpioKZmp/test_env/03_outputs/3a_model_outputs/gmb_age_param_spatial.rds
# }