
Fits a spatial model for age parameters using Template Model Builder (TMB) and C++. The model incorporates spatial correlation through a distance matrix and estimates the scale and shape parameters jointly. Progress messages are printed at each stage when verbose is TRUE, and a previously fitted model can optionally be loaded from cache.

Usage

fit_spatial_model(
  data,
  country_code = NULL,
  scale_outcome,
  shape_outcome,
  covariates,
  cpp_script_name,
  verbose = TRUE,
  control_params = list(trace = 2),
  manual_params = NULL,
  output_dir = NULL,
  ignore_cache = FALSE
)

Arguments

data

A data frame containing the response variables, covariates, and spatial coordinates (web_x, web_y)

country_code

Optional country code used to save and load a cached model. The default, NULL, runs the model without caching.

scale_outcome

Character string specifying the column name for the scale parameter response variable

shape_outcome

Character string specifying the column name for the shape parameter response variable

covariates

Character vector of covariate names to include in both scale and shape models

cpp_script_name

Character string specifying the name of the C++ file (without extension) containing the TMB model definition

verbose

Logical indicating whether to show progress updates. Default TRUE

control_params

List of control parameters passed to the nlminb optimizer. Default: list(trace = 2)

manual_params

Optional list of manual parameter values. If NULL (default), initial parameters are estimated from linear regression. The list should contain:

  • beta1: Vector of coefficients for scale model

  • beta2: Vector of coefficients for shape model

  • gamma: Scalar value (default 1.0)

  • log_sigma2: Log of sigma squared (default log(1.0))

  • log_phi: Log of phi (estimated from variogram)

  • log_tau2_1: Log of tau squared (default log(1.0))
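
As a rough illustration, a manual_params list might be assembled as follows. The numeric values are placeholders rather than recommendations, and the lengths of beta1 and beta2 are assumptions: they need to match the number of coefficients the C++ template expects.

```r
# Hypothetical starting values; every number here is a placeholder.
manual_params <- list(
  beta1      = c(2.8),    # scale-model coefficients (length is an assumption)
  beta2      = c(0.3),    # shape-model coefficients (length is an assumption)
  gamma      = 1.0,       # documented default
  log_sigma2 = log(1.0),  # documented default
  log_phi    = log(100),  # log of the fallback range of 100 mentioned in Details
  log_tau2_1 = log(1.0)   # documented default
)

# Inspect the structure before handing it to the fitting function
str(manual_params)
```

The list would then be supplied through the manual_params argument of fit_spatial_model().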

output_dir

Directory in which cached models are saved. Only used if country_code is provided.

ignore_cache

Logical indicating whether to ignore an existing cache. Default FALSE.
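
The interaction between country_code, output_dir, and ignore_cache can be pictured with a small sketch. The helpers cache_path() and load_or_fit() are hypothetical, and the file-name pattern is inferred from the example output on this page rather than from the package source, so treat it as an assumption.

```r
# Hypothetical helper: file-name pattern inferred from the example output
cache_path <- function(output_dir, country_code) {
  file.path(output_dir, paste0(tolower(country_code), "_age_param_spatial.rds"))
}

# Hypothetical helper: reuse a cached fit unless told to ignore it
load_or_fit <- function(output_dir, country_code, ignore_cache = FALSE, fit_fun) {
  path <- cache_path(output_dir, country_code)
  if (!ignore_cache && file.exists(path)) {
    return(readRDS(path))  # cached fit found, skip refitting
  }
  fit <- fit_fun()         # otherwise fit from scratch
  saveRDS(fit, path)       # and store the result for next time
  fit
}

# Demonstration with a stand-in "model" so the snippet runs without TMB
out_dir <- tempfile("cache_demo")
dir.create(out_dir)
first  <- load_or_fit(out_dir, "GMB", fit_fun = function() list(objective = -1))
second <- load_or_fit(out_dir, "GMB", fit_fun = function() list(objective = -2))
second$objective  # value cached by the first call, not the second
```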

Value

An object of class 'nlminb' containing:

  • par - Optimized parameter values

  • objective - Final value of objective function

  • convergence - Convergence code

  • message - Convergence message

  • iterations - Number of iterations

  • evaluations - Number of function/gradient evaluations

  • scale_formula - Formula used for scale model

  • shape_formula - Formula used for shape model

  • variogram - Fitted variogram model from automap containing:

    • range - Spatial correlation range parameter

    • psill - Partial sill (structured variance)

    • nugget - Nugget effect (unstructured variance)

    • kappa - Smoothness parameter for Matern models
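
A minimal sketch of how the returned fields might be checked after fitting. The object below is a stand-in with made-up values so the snippet runs on its own; a real object would come from fit_spatial_model().

```r
# Stand-in for a fitted object: documented field names, invented values
mod <- list(
  par         = c(0.04, -0.01, 0.12),
  objective   = -667.3,
  convergence = 0L,
  message     = "relative convergence (4)",
  iterations  = 44L
)

# nlminb reports convergence = 0 when the optimizer converged successfully
converged <- mod$convergence == 0
if (converged) {
  message("Converged after ", mod$iterations, " iterations")
} else {
  warning("Optimizer did not converge: ", mod$message)
}
```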

Details

The function performs the following steps with progress tracking:

  1. Fits initial linear models for the scale and shape parameters

  2. Calculates the spatial distance matrix from the web coordinates

  3. Estimates the phi parameter using a variogram:

    • Computes the empirical variogram using automap

    • Automatically selects the best theoretical variogram model

    • Uses the range parameter to initialize the spatial correlation

    • Falls back to a default range of 100 if estimation fails

  4. Compiles and loads the TMB C++ template

  5. Optimizes the joint likelihood using nlminb
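
The first step can be sketched with base R alone. This is an illustration under assumptions, not the package's internal code: the toy data are simulated, and the formulas use R's default intercept, which the package may or may not include.

```r
set.seed(1)

# Toy data using the column names from the example on this page
toy <- data.frame(
  log_scale = rnorm(50, 2.8, 0.2),
  log_shape = rnorm(50, 0.33, 0.1),
  urban     = rep(c(1, 0), each = 25)
)
covariates <- "urban"

# Build one formula per outcome from the covariate names
scale_formula <- reformulate(covariates, response = "log_scale")
shape_formula <- reformulate(covariates, response = "log_shape")

# Ordinary least squares fits supply rough starting coefficients
beta1_init <- coef(lm(scale_formula, data = toy))
beta2_init <- coef(lm(shape_formula, data = toy))
```

Starting the optimizer from least squares estimates is a common way to keep it away from poorly behaved regions of the likelihood.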

The spatial correlation is modeled using an exponential variogram with parameters estimated from the data. The distance matrix is computed from the web coordinates (web_x, web_y) and used in the spatial covariance structure.
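
To make the covariance structure concrete, the sketch below builds a distance matrix from simulated web coordinates and applies an exponential decay. The parameterisation sigma2 * exp(-d / phi) and the values of sigma2 and phi are assumptions chosen for illustration; the C++ template may use a different form.

```r
set.seed(1)

# Simulated Web Mercator coordinates for five locations
coords <- cbind(
  web_x = rnorm(5, -1764351, 50000),
  web_y = rnorm(5, 1510868, 50000)
)

# Pairwise Euclidean distances, in the same units as the coordinates
D <- as.matrix(dist(coords))

# Assumed exponential covariance: sigma2 * exp(-d / phi)
sigma2 <- 1
phi    <- 100000  # placeholder range
Sigma  <- sigma2 * exp(-D / phi)

# The diagonal equals sigma2 and off-diagonal entries shrink with distance
round(Sigma, 3)
```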

The C++ template should implement the joint spatial model for both parameters.

Note

Requires the TMB package and a working C++ compiler. The C++ template must be properly structured for TMB. The automap package is required for variogram fitting.
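
Before fitting, the package requirements can be checked with a short snippet such as the one below. It only tests whether the packages are installed and does not verify that a C++ compiler is available.

```r
# Packages named in the note above
needed <- c("TMB", "automap")

# TRUE for each package that can be loaded, FALSE otherwise
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

if (!all(installed)) {
  message("Missing packages: ", paste(needed[!installed], collapse = ", "))
}
```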

Examples


# \donttest{
set.seed(123)
# Set parameters for simulation
total_population <- 266
urban_proportion <- 0.602
total_coords <- 266
lon_range <- c(-16.802, -13.849)
lat_range <- c(13.149, 13.801)
mean_web_x <- -1764351
mean_web_y <- 1510868

# Simulate processed survey dataset for Gambia
df_gambia <- list()
df_gambia$age_param_data <- dplyr::tibble(
  country = "Gambia",
  country_code_iso3 = "GMB",
  country_code_dhs = "GM",
  year_of_survey = 2024,
  id_coords = rep(1:total_coords, length.out = total_population),
  lon = runif(total_population, lon_range[1], lon_range[2]),
  lat = runif(total_population, lat_range[1], lat_range[2]),
  web_x = rnorm(total_population, mean_web_x, 50000),
  web_y = rnorm(total_population, mean_web_y, 50000),
  log_scale = rnorm(total_population, 2.82, 0.2),
  log_shape = rnorm(total_population, 0.331, 0.1),
  urban = rep(c(1, 0), c(
    round(total_population * urban_proportion),
    total_population - round(total_population * urban_proportion)
  )),
  b1 = rnorm(total_population, 0.0142, 0.002),
  c = rnorm(total_population, -0.00997, 0.001),
  b2 = rnorm(total_population, 0.00997, 0.002),
  nsampled = sample(180:220, total_population, replace = TRUE)
)


tf <- file.path(tempdir(), "test_env")
dir.create(tf, recursive = TRUE, showWarnings = FALSE)

# Initialise files and key scripts
init(
  r_script_name = "full_pipeline.R",
  cpp_script_name = "model.cpp",
  path = tf,
  open_r_script = FALSE
)
#> 
#> ── Package Installation Required ──
#> 
#> The following packages are missing:
#> 1. remotes
#> ! Non-interactive session detected. Skipping package installation.
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1a_survey_data/processed
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1a_survey_data/raw
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1b_rasters/urban_extent
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1b_rasters/pop_raster
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1c_shapefiles
#> ! Exists: /tmp/RtmpioKZmp/test_env/02_scripts
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3a_model_outputs
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3b_visualizations
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3c_table_outputs
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3d_compiled_results
#>  Folder structure created successfully.
#>  R script created but could not open automatically: RStudio not available.
#>  C++ script '/tmp/RtmpioKZmp/test_env/02_scripts/model.cpp' successfully created.

mod <- fit_spatial_model(
  df_gambia$age_param_data,
  scale_outcome = "log_scale",
  shape_outcome = "log_shape",
  covariates = "urban",
  cpp_script_name = file.path(tf, "02_scripts/model"),
  country_code = "GMB",
  output_dir = file.path(tf, "03_outputs/3a_model_outputs")
)
#>  Fitting initial linear models...
#>  Fitted initial linear models.
#> 
#>  Calculating empirical variogram...
#>  Empirical variogram fitted.
#> 
#>  Initializing data and parameters for optimisation...
#>  Initialization complete.
#> 
#>  Compiling TMB model
#> using C++ compiler: ‘g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0’
#>  Compiled TMB model
#> 
#>  Optimizing model
#>   0:     7391.7541:  2.83720 0.323435  1.00000  0.00000  8.83716  0.00000
#> Warning: NA/NaN function evaluation
#>   2:    -45.724052:  2.82851 0.313906 0.139014 0.00633131  8.83460 0.652761
#>   4:    -112.76135:  2.67969 0.338834 0.140349 0.230713  8.76187 0.641797
#>   6:    -148.55923:  2.59956 0.304243 0.135334 0.348598  8.77452 0.627783
#>   8:    -154.74828:  2.59119 0.288384 0.126840 0.362371  8.77260 0.620797
#>  10:    -187.95776:  2.43726 0.294003 0.122847 0.521362  8.82711 0.531083
#>  12:    -203.40876:  2.32277 0.244920 0.139613 0.557847  8.93342 0.475095
#>  14:    -214.09635:  2.27434 0.236596 0.126814 0.564783  8.97112 0.452987
#>  16:    -392.53144: 0.902964 -0.00738539 0.145402 0.586521  10.0772 -0.0783819
#>  18:    -471.84194: 0.424267 -0.0679535 0.141062 0.560513  10.4651 -0.230572
#>  20:    -497.15737: 0.378278 -0.0430941 0.132131 0.555632  10.5040 -0.241520
#>  22:    -549.04862: -0.0662097 -0.0736989 0.124765 0.538575  10.8866 -0.385386
#>  24:    -555.47256: -0.0647047 -0.0682112 0.128599 0.540465  10.8952 -0.386791
#>  26:    -564.60555: -0.0573346 -0.0526167 0.125055 0.553066  10.9454 -0.396557
#>  28:    -639.98506: -0.0320552 -0.0834152 0.133506  1.38982  12.9236 -0.0509674
#>  30:    -646.66141: 0.112249 -0.0232367 0.113961  1.48952  13.0390 0.108164
#>  32:    -661.31186: 0.0182220 -0.00863173 0.118296  1.37007  12.9474 -0.0181810
#>  34:    -665.47144: 0.0462376 -0.00381306 0.116628  1.08306  12.8775 -0.0713908
#>  36:    -666.80581: 0.0431125 -0.0105411 0.120972  1.13366  13.0155 -0.107353
#>  38:    -667.15127: 0.0446306 -0.00837370 0.120140  1.47726  13.3372 -0.122730
#>  40:    -667.30019: 0.0439417 -0.00987688 0.120452  1.91549  13.7884 -0.123881
#>  42:    -667.30316: 0.0437350 -0.00984942 0.120447  2.00270  13.8763 -0.123219
#>  44:    -667.30316: 0.0437301 -0.00982706 0.120441  2.00456  13.8780 -0.123348
#>  Optimized model
#> 
#>  Model fitted and saved at /tmp/RtmpioKZmp/test_env/03_outputs/3a_model_outputs/gmb_age_param_spatial.rds

# }