This function runs the entire spatial modeling workflow for a given country code and logs the results. It processes survey data, fits a spatial model, generates predictions, creates age-population tables, and produces raster outputs. The function is modular and can be reused for different countries with minimal adjustments.

Usage

run_full_workflow(
  country_code,
  survey_data_path = here::here("01_data", "1a_survey_data", "processed"),
  survey_data_suffix = "dhs_pr_records_combined.rds",
  shape_path = here::here("01_data", "1c_shapefiles"),
  shape_suffix = "district_shape.gpkg",
  pop_raster_path = here::here("01_data", "1b_rasters", "pop_raster"),
  pop_raster_suffix = "_ppp_2020_constrained.tif",
  ur_raster_path = here::here("01_data", "1b_rasters", "urban_extent"),
  ur_raster_suffix = "afurextent.asc",
  pred_save_file = FALSE,
  raster_width = 2500,
  raster_height = 2000,
  raster_resolution = 300,
  save_raster = TRUE,
  generate_pop_raster = FALSE,
  pyramid_line_color = "#67000d",
  pyramid_fill_high = "#fee0d2",
  pyramid_fill_low = "#a50f15",
  pyramid_caption = paste0("Note: Total population includes ",
    "ages 99+, pyramid shows ages 0-99"),
  output_paths = list(),
  model_params = list(),
  return_results = FALSE,
  n_cores = parallel::detectCores() - 2,
  ...
)

Arguments

country_code

Character. The ISO 3166-1 alpha-3 (ISO3) country code (e.g., "TZA").

survey_data_path

Character. Path to the processed survey data. Default: "01_data/1a_survey_data/processed".

survey_data_suffix

Character. Suffix for survey data files. Default: "dhs_pr_records_combined.rds".

shape_path

Character. Path to shapefile data. Default: "01_data/1c_shapefiles".

shape_suffix

Character. Suffix for shapefile data. Default: "district_shape.gpkg".

pop_raster_path

Character. Path to population raster data. Default: "01_data/1b_rasters/pop_raster".

pop_raster_suffix

Character. Suffix for population raster files. Default: "_ppp_2020_constrained.tif".

ur_raster_path

Character. Path to urban-rural extent data. Default: "01_data/1b_rasters/urban_extent".

ur_raster_suffix

Character. Suffix for urban-rural raster. Default: "afurextent.asc".
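
The path and suffix arguments come in pairs. The Examples save the survey file directly under survey_data_path with the suffix as its file name, and the leading underscore in pop_raster_suffix suggests that the population raster name is prefixed with the lower-case country code, as in WorldPop file names. The sketch below builds file paths on that assumption; the actual lookup inside run_full_workflow() may differ, and the object names are illustrative only.

```r
country_code <- "GMB"

# Survey data: the suffix is used as the full file name (as in the Examples)
survey_file <- file.path(
  "01_data", "1a_survey_data", "processed",
  "dhs_pr_records_combined.rds"
)

# Population raster: ASSUMED to be <lower-case ISO3 code> + suffix
pop_raster_file <- file.path(
  "01_data", "1b_rasters", "pop_raster",
  paste0(tolower(country_code), "_ppp_2020_constrained.tif")
)

basename(pop_raster_file)
```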

pred_save_file

Logical. Whether to save prediction files. Default: FALSE

raster_width

Integer. Width of raster plots in pixels. Default: 2500

raster_height

Integer. Height of raster plots in pixels. Default: 2000

raster_resolution

Integer. Resolution of PNG outputs. Default: 300
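
As a rough guide to how these three arguments interact, the sketch below converts the default pixel dimensions into a physical size. It assumes raster_resolution is expressed in pixels per inch, which the documentation does not state explicitly.

```r
raster_width <- 2500       # pixels
raster_height <- 2000      # pixels
raster_resolution <- 300   # ASSUMED to be pixels per inch

# Physical size of the saved PNG in inches under that assumption
width_in <- raster_width / raster_resolution
height_in <- raster_height / raster_resolution

round(c(width = width_in, height = height_in), 2)
```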

save_raster

Logical. Whether to save raster outputs to disk. Default: TRUE

generate_pop_raster

Logical. Whether to generate population raster. Default: FALSE

pyramid_line_color

Character. Hex color code for the age pyramid's outline. Default: "#67000d"

pyramid_fill_high

Character. Hex color code for the age pyramid's higher values fill. Default: "#fee0d2"

pyramid_fill_low

Character. Hex color code for the age pyramid's lower values fill. Default: "#a50f15"

pyramid_caption

Character. Caption text for the age pyramid plot. Default: "Note: Total population includes ages 99+, pyramid shows ages 0-99"

output_paths

List of output paths:

  • model: Path for model outputs. Default: "03_outputs/3a_model_outputs"

  • plot: Path for plots. Default: "03_outputs/3b_visualizations"

  • raster: Path for rasters. Default: "03_outputs/3c_raster_outputs"

  • table: Path for tables. Default: "03_outputs/3c_table_outputs"

  • compiled: Path for compiled results. Default: "03_outputs/3d_compiled_results"

  • excel: Path for Excel outputs. Default: "03_outputs/3d_compiled_results/age_pop_denom_compiled.xlsx"

  • log: Path for logs. Default: "03_outputs/3a_model_outputs/modelling_log.rds"
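
Because output_paths defaults to an empty list, a partial list is presumably completed with the defaults above. The sketch below shows one way such a merge could work using utils::modifyList(); the default_paths and user_paths objects and the merge step itself are assumptions for illustration, not the package's confirmed internals.

```r
# Hypothetical defaults, copied from the list above (subset for brevity)
default_paths <- list(
  model = file.path("03_outputs", "3a_model_outputs"),
  plot  = file.path("03_outputs", "3b_visualizations"),
  table = file.path("03_outputs", "3c_table_outputs")
)

# The user overrides only the plot directory
user_paths <- list(plot = file.path(tempdir(), "my_plots"))

# modifyList() keeps the defaults for any name the user did not supply
merged_paths <- utils::modifyList(default_paths, user_paths)

names(merged_paths)
```

If the function does merge this way, passing every entry explicitly, as the Examples do, is the safest option when the outputs must land in a non-default location.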

model_params

List of model parameters:

  • cell_size: Cell size in meters. Default: 5000

  • n_sim: Number of simulations. Default: 5000

  • ignore_cache: Whether to ignore cache. Default: FALSE

  • age_range: Age range vector. Default: c(0, 99)

  • age_interval: Age interval. Default: 1

  • return_prop: Return proportions. Default: TRUE

  • scale_outcome: Scale outcome variable. Default: "log_scale"

  • shape_outcome: Shape outcome variable. Default: "log_shape"

  • covariates: Model covariates. Default: "urban"

  • cpp_script: C++ script path. Default: "02_scripts/model"

  • control_params: Control parameters. Default: list(trace = 2)

  • manual_params: Manual parameters. Default: NULL

  • verbose: Verbose output. Default: TRUE

  • age_range_raster: Age range for raster output. Default: c(0, 10)

  • age_interval_raster: Age interval for raster output. Default: 1
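
The age_range and age_interval parameters together determine how many age intervals are processed. The sketch below uses a hypothetical helper, age_breaks(), that assumes interval lower bounds run from the first to the second element of the range in steps of the interval. This reading is consistent with the log output in the Examples (2 intervals for c(0, 1), 11 intervals for the raster default c(0, 10)), but it is not taken from the package source.

```r
# Hypothetical helper: lower bounds of the age intervals
age_breaks <- function(age_range, age_interval) {
  seq(from = age_range[1], to = age_range[2], by = age_interval)
}

# Default age_range_raster with a 1-year interval: 0, 1, ..., 10
n_raster <- length(age_breaks(c(0, 10), 1))

# age_range used in the Examples: 0, 1
n_table <- length(age_breaks(c(0, 1), 1))

c(raster = n_raster, table = n_table)
```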

return_results

Logical. Whether to return results. Default: FALSE.

n_cores

Integer. Number of cores used for parallel processing when building the age-population table. Default: parallel::detectCores() - 2.
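
On machines with one or two cores, or where the core count cannot be determined, the default expression can evaluate to zero, a negative number, or NA. The sketch below shows one defensive way to pick a value before calling the function; safe_cores is an illustrative name, and whether run_full_workflow() guards against this case itself is not documented here.

```r
# detectCores() may return NA, and subtracting 2 can give 0 or less
# on small machines, so clamp the result to at least one core
available <- parallel::detectCores()
safe_cores <- if (is.na(available)) 1L else max(1L, available - 2L)

safe_cores
```

The resulting value can then be passed as n_cores = safe_cores.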

...

Additional arguments passed to subfunctions.

Value

If return_results is TRUE, a list containing:

  • spat_model_param: Fitted spatial model parameters

  • predictor_data: Predictor dataset

  • gamma_prediction: Generated gamma predictions

  • pred_list: Processed gamma prediction results

  • final_age_pop_table: Age-population table data

  • final_pop: Compiled population data

  • all_mod_params: Compiled model parameters

If return_results is FALSE, the function saves all outputs to disk and returns NULL invisibly.

Examples


# \donttest{
# set country code
country_codeiso <- "GMB"

set.seed(123)
# Set parameters for simulation
total_population <- 266
urban_proportion <- 0.602
total_coords <- 266
lon_range <- c(-16.802, -13.849)
lat_range <- c(13.149, 13.801)
mean_web_x <- -1764351
mean_web_y <- 1510868

# Simulate processed survey dataset for Gambia
df_gambia <- NULL
df_gambia$age_param_data <- dplyr::tibble(
  country = "Gambia",
  country_code_iso3 = "GMB",
  country_code_dhs = "GM",
  year_of_survey = 2024,
  id_coords = rep(1:total_coords, length.out = total_population),
  lon = runif(total_population, lon_range[1], lon_range[2]),
  lat = runif(total_population, lat_range[1], lat_range[2]),
  web_x = rnorm(total_population, mean_web_x, 50000),
  web_y = rnorm(total_population, mean_web_y, 50000),
  log_scale = rnorm(total_population, 2.82, 0.2),
  log_shape = rnorm(total_population, 0.331, 0.1),
  urban = rep(c(1, 0), c(
    round(total_population * urban_proportion),
    total_population - round(total_population * urban_proportion)
  )),
  b1 = rnorm(total_population, 0.0142, 0.002),
  c = rnorm(total_population, -0.00997, 0.001),
  b2 = rnorm(total_population, 0.00997, 0.002),
  nsampled = sample(180:220, total_population, replace = TRUE)
)


# Create temp directory with normalized path
tf <- file.path(tempdir(), "test_env")
dir.create(tf, recursive = TRUE, showWarnings = FALSE)
tf <- normalizePath(tf, winslash = "/", mustWork = FALSE)

AgePopDenom::init(
  r_script_name = "full_pipeline.R",
  cpp_script_name = "model.cpp",
  path = tf,
  open_r_script = FALSE
)
#> 
#> ── Package Installation Required ──
#> 
#> The following packages are missing:
#> 1. remotes
#> ! Non-interactive session detected. Skipping package installation.
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1a_survey_data/processed
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1a_survey_data/raw
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1b_rasters/urban_extent
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1b_rasters/pop_raster
#> ! Exists: /tmp/RtmpioKZmp/test_env/01_data/1c_shapefiles
#> ! Exists: /tmp/RtmpioKZmp/test_env/02_scripts
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3a_model_outputs
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3b_visualizations
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3c_table_outputs
#> ! Exists: /tmp/RtmpioKZmp/test_env/03_outputs/3d_compiled_results
#>  Folder structure created successfully.
#>  R script created but could not open automatically: RStudio not available.
#>  C++ script '/tmp/RtmpioKZmp/test_env/02_scripts/model.cpp' successfully created.

# save as processed dhs data
saveRDS(
  df_gambia,
  file = file.path(
    tf, "01_data", "1a_survey_data", "processed",
    "dhs_pr_records_combined.rds"
  ) |>
    normalizePath(winslash = "/", mustWork = FALSE)
)

# Download shapefiles
download_shapefile(
  country_codes = country_codeiso,
  dest_file = file.path(
    tf, "01_data", "1c_shapefiles",
    "district_shape.gpkg"
  ) |>
    normalizePath(winslash = "/", mustWork = FALSE)
)
#>  Downloading missing WHO ADM2 data for: GMB
#>  Appended missing country codes to existing shapefile: GMB

# Download population rasters from worldpop
download_pop_rasters(
  country_codes = country_codeiso,
  dest_dir = file.path(tf, "01_data", "1b_rasters", "pop_raster") |>
    normalizePath(winslash = "/", mustWork = FALSE)
)
#>  Population raster files successfully processed.

# Extract urban extent raster
extract_afurextent(
  dest_dir = file.path(tf, "01_data", "1b_rasters", "urban_extent") |>
    normalizePath(winslash = "/", mustWork = FALSE)
)
#>  The file already exists at /tmp/RtmpioKZmp/test_env/01_data/1b_rasters/urban_extent/afurextent.asc and FALSE is FALSE.
#> [1] "/tmp/RtmpioKZmp/test_env/01_data/1b_rasters/urban_extent/afurextent.asc"

# Modelling --------------------------------------------------------------

run_full_workflow(
  country_code = country_codeiso,
  survey_data_path = file.path(
    tf, "01_data", "1a_survey_data", "processed"
  ) |>
    normalizePath(winslash = "/", mustWork = FALSE),
  survey_data_suffix = "dhs_pr_records_combined.rds",
  shape_path = file.path(
    tf, "01_data", "1c_shapefiles"
  ) |>
    normalizePath(winslash = "/", mustWork = FALSE),
  shape_suffix = "district_shape.gpkg",
  pop_raster_path = file.path(
    tf, "01_data", "1b_rasters", "pop_raster"
  ) |>
    normalizePath(winslash = "/", mustWork = FALSE),
  pop_raster_suffix = "_ppp_2020_constrained.tif",
  ur_raster_path = file.path(
    tf, "01_data", "1b_rasters", "urban_extent"
  ) |>
    normalizePath(winslash = "/", mustWork = FALSE),
  ur_raster_suffix = "afurextent.asc",
  pred_save_file = FALSE,
  raster_width = 2500,
  raster_height = 2000,
  raster_resolution = 300,
  save_raster = TRUE,
  pyramid_line_color = "#67000d",
  pyramid_fill_high = "#fee0d2",
  pyramid_fill_low = "#a50f15",
  pyramid_caption = paste0(
    "Note: Total population includes ",
    "ages 99+, pyramid shows ages 0-99"
  ),
  generate_pop_raster = TRUE,
  output_paths = list(
    model = file.path(tf, "03_outputs", "3a_model_outputs"),
    plot = file.path(tf, "03_outputs", "3b_visualizations"),
    raster = file.path(tf, "03_outputs", "3c_raster_outputs"),
    table = file.path(tf, "03_outputs", "3c_table_outputs"),
    compiled = file.path(tf, "03_outputs", "3d_compiled_results"),
    excel = file.path(
      tf, "03_outputs", "3d_compiled_results",
      "age_pop_denom_compiled.xlsx"
    ),
    log = file.path(
      tf, "03_outputs", "3a_model_outputs", "modelling_log.rds"
    )
  ) |> lapply(\(x) normalizePath(x, winslash = "/", mustWork = FALSE)),
  model_params = list(
    cell_size = 5000,
    n_sim = 10,
    ignore_cache = FALSE,
    age_range = c(0, 1),
    age_interval = 1,
    return_prop = TRUE,
    scale_outcome = "log_scale",
    shape_outcome = "log_shape",
    covariates = "urban",
    cpp_script = file.path(tf, "02_scripts", "model") |>
      normalizePath(winslash = "/", mustWork = FALSE),
    control_params = list(trace = 2),
    manual_params = NULL,
    verbose = TRUE
  ),
  return_results = FALSE,
  n_cores = 1
)
#> 
#> ── Fitting Spatial Model for Gambia ────────────────────────────────────────────
#>  Importing cached model results...
#>  Successfully imported cached model results.
#> 
#> 
#> ── Generating variogram plot for Gambia ──
#> 
#>  Variogram saved to /tmp/RtmpioKZmp/test_env/03_outputs/3b_visualizations/gmb_variogram.png
#> 
#> ── Creating Predictor Data for Gambia ──────────────────────────────────────────
#>  Processing shapefiles and grids...
#>  Processed shapefiles and grids.
#> 
#>  Processing population raster...
#>  Processed population raster.
#> 
#>  Processing urban-rural raster...
#>  Processed urban-rural raster.
#> 
#>  Extracting data from rasters onto grid...
#>  Extracted data from rasters onto grid.
#> 
#>  Creating grided data with predictors...
#>  Created grided data with predictors.
#> 
#>  Predictors data created and saved at /tmp/RtmpioKZmp/test_env/03_outputs/3a_model_outputs/gmb_predictor_data.rds
#> 
#> ── Running Prediction for Gambia ───────────────────────────────────────────────
#>  Making a prediction grid...
#>  Prediction grid successfully made.
#> 
#>  Setting model parameters...
#>  Model parameters set successfully.
#> 
#>  Calculating pairwise distances for prediction grid...
#>  Computed pairwise distances for prediction grid.
#> 
#>  Computing and inverting covariance matrix...
#>  Computed and inverted covariance matrix.
#> 
#>  Computing mean and standard deviation of predictions...
#>  Computed mean and standard deviation of predictions.
#> 
#>  Simulating random effects...
#>  Simulated random effects.
#> 
#>  Predicting gamma, scale, and shape parameters...
#>  Predicted gamma, scale, and shape parameters.
#> 
#> 
#> ── Producing Prediction Rasters for Gambia ─────────────────────────────────────
#>  Raster plot saved to /tmp/RtmpioKZmp/test_env/03_outputs/3b_visualizations/gmb_gamma_prediction_rasters.png
#> 
#> ── Producing District-level Age-Population Tables for Gambia ───────────────────
#>  Processing interval 1/2...
#>  Completed interval 1/2.
#> 
#>  Processing interval 2/2...
#>  Completed interval 2/2.
#> 
#>  Final age population data saved to /tmp/RtmpioKZmp/test_env/03_outputs/3c_table_outputs/gmb_age_tables_pop_0_1plus_yrs_by_1yrs.rds
#> 
#> ── Producing Age-Population Raster for Gambia ──────────────────────────────────
#>  Processing interval 1/11...
#>  Completed interval 1/11.
#> 
#>  Processing interval 2/11...
#>  Completed interval 2/11.
#> 
#>  Processing interval 3/11...
#>  Completed interval 3/11.
#> 
#>  Processing interval 4/11...
#>  Completed interval 4/11.
#> 
#>  Processing interval 5/11...
#>  Completed interval 5/11.
#> 
#>  Processing interval 6/11...
#>  Completed interval 6/11.
#> 
#>  Processing interval 7/11...
#>  Completed interval 7/11.
#> 
#>  Processing interval 8/11...
#>  Completed interval 8/11.
#> 
#>  Processing interval 9/11...
#>  Completed interval 9/11.
#> 
#>  Processing interval 10/11...
#>  Completed interval 10/11.
#> 
#>  Processing interval 11/11...
#>  Completed interval 11/11.
#> 
#>  Raster stack saved to /tmp/RtmpioKZmp/test_env/03_outputs/3b_visualizations/gmb_age_pop_grid_0_10_yrs_by_1yrs.tif
#> 
#> ── Producing Regional-level Age-pyramid for Gambia ─────────────────────────────
#>  Age pyramid count plot saved to /tmp/RtmpioKZmp/test_env/03_outputs/3b_visualizations/gmb_age_pyramid_count.png
#>  Age pyramid prop plot saved to /tmp/RtmpioKZmp/test_env/03_outputs/3b_visualizations/gmb_age_pyramid_prop.png
#> 
#> ── Compiling model parameter data for all countries ────────────────────────────
#>  Model paramters extracted and saved to /tmp/RtmpioKZmp/test_env/03_outputs/3d_compiled_results.
#> 
#> ── Compiling age-structured population data for all countries ──────────────────
#>  Final age-structured population and proportions saved to /tmp/RtmpioKZmp/test_env/03_outputs/3d_compiled_results/age_pop_denom_compiled.xlsx.
# }