

This package offers a tidy solution for managing and cleaning epidemiological data. It provides a range of functions for epidemiologists and public health data wizards. For more details on how to use this package, visit the epiCleanr website.


The package is available on CRAN and can be installed in the following way:
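A minimal sketch of the CRAN install, assuming the package is published under the same name used on this page (`epiCleanr`):

```r
# Install the released version from CRAN.
# Assumption: the CRAN package name is "epiCleanr", as used throughout this page.
install.packages("epiCleanr")
```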

Or install the development version from GitHub:

# If you haven't installed the 'devtools' package, run:
# install.packages("devtools")

Load the package:
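Assuming the package was installed under the name `epiCleanr`, loading it looks like this:

```r
# Attach the package so its functions are available in the session.
library(epiCleanr)
```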

Quick Workflow Overview

epiCleanr can be used as a helper package for end-to-end epidemiological data management, with functions covering data import, quality assessment, cleaning, and export. Below are some of the workflow steps this package streamlines:

Import Data

Use import() to read data from a wide array of file formats, from CSV to Excel to JSON, all within one function.
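To illustrate the idea of one function reading several formats, here is a rough base R sketch that picks a reader from the file extension. The helper `read_any()` is hypothetical and is not part of epiCleanr; it only handles a few formats that base R can read, and it does not reflect how import() is implemented.

```r
# Hypothetical helper (not an epiCleanr function): choose a base R reader
# based on the file extension.
read_any <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv = utils::read.csv(path, stringsAsFactors = FALSE),
    tsv = utils::read.delim(path, stringsAsFactors = FALSE),
    rds = readRDS(path),
    stop("Unsupported file extension: ", ext)
  )
}

# Round trip through a temporary CSV file with made-up data.
tmp <- tempfile(fileext = ".csv")
write.csv(
  data.frame(district = c("A", "B"), cases = c(10, 12)),
  tmp,
  row.names = FALSE
)
dat <- read_any(tmp)
```

A single entry point like this keeps analysis scripts unchanged when the input format changes; only the file path differs.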

Data Quality Checks

  • consistency_check(): Generate plots to identify inconsistencies, such as when the number of tests exceeds the number of cases.

  • missing_plot(): Visualize patterns of missing data or reporting rates across different variables and factors.

  • create_test(): Establish unit-testing functions to automate data validation, ensuring the robustness of your dataset.
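The kind of summaries behind these checks can be sketched in base R. The data and column names below are made up for illustration, and the code does not use or reproduce the epiCleanr functions listed above.

```r
# Made-up surveillance data; column names are illustrative only.
dat <- data.frame(
  district = c("A", "A", "B", "B"),
  tests    = c(120, NA, 95, 80),
  cases    = c(30, 25, NA, NA)
)

# Share of missing values per variable.
missing_rate <- colMeans(is.na(dat[, c("tests", "cases")]))

# Reporting rate (share of non-missing values) of `cases` by district.
reporting_rate <- tapply(!is.na(dat$cases), dat$district, mean)

# Simple automated validation checks, each evaluating to TRUE or FALSE.
checks <- c(
  no_negative_tests = all(dat$tests >= 0, na.rm = TRUE),
  no_duplicate_rows = !anyDuplicated(dat)
)
```

Printing `missing_rate`, `reporting_rate`, and `checks` gives a quick numeric view of the same information that a missingness plot or a set of unit tests would surface.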

Data Cleaning

  • clean_admin_names(): Normalize administrative names in your dataset using either user-supplied data or downloaded reference data via get_admin_names().

  • cleaning_names_strings(): Clean and standardize string columns in your data.

  • handle_outliers(): Detect and manage outliers using a variety of statistical methods, with options to either remove or impute them.
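As one example of a statistical outlier rule, the sketch below applies Tukey's IQR fences in base R and then shows both removal and median imputation. The helper `flag_outliers_iqr()` is hypothetical; it is not the handle_outliers() implementation, and the methods and defaults that function actually uses may differ.

```r
# Hypothetical helper (not the epiCleanr implementation): flag values that
# fall outside Tukey's fences, i.e. more than k * IQR beyond the quartiles.
flag_outliers_iqr <- function(x, k = 1.5) {
  q <- stats::quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- k * (q[2] - q[1])
  !is.na(x) & (x < q[1] - fence | x > q[2] + fence)
}

# Made-up weekly case counts with one implausible value.
cases <- c(12, 15, 14, 13, 16, 250)
is_outlier <- flag_outliers_iqr(cases)

# Option 1: drop the flagged values.
removed <- cases[!is_outlier]

# Option 2: replace flagged values with the median of the remaining values.
imputed <- replace(cases, is_outlier, median(cases[!is_outlier]))
```

Removal shortens the series, while imputation keeps its length, which matters when the data must stay aligned with reporting periods.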

Data Export

Finally, use export() to save your cleaned data to a range of file formats, including CSV, Excel, and other specialized formats.