---
title: "How To Use CGMissingDataR"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{How To Use CGMissingDataR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# CGMissingDataR

CGMissingDataR is an R package, based on the CGMissingData Python library, for evaluating model performance under feature missingness. It does this by:

* injecting missing values into feature columns at specified masking rates,
* imputing missing values using a Multiple Imputation by Chained Equations (MICE)-style iterative imputer, and
* training Random Forest and k-Nearest Neighbors regressors to report Mean Absolute Percentage Error (MAPE) and R² across missingness levels.
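
To make the masking step and the two reported metrics concrete, here is a minimal base R sketch. The helper names `mask_features()`, `mape()`, and `r_squared()` are hypothetical and are not part of the CGMissingDataR API; the package's own masking logic may differ (for example, in how cells are sampled).

```r
# Hypothetical helpers for illustration only; not CGMissingDataR functions.

# Replace a fraction `rate` of the values in each chosen column with NA.
mask_features <- function(data, cols, rate, seed = 1) {
  set.seed(seed)
  n_mask <- round(rate * nrow(data))
  for (col in cols) {
    idx <- sample.int(nrow(data), n_mask)
    data[idx, col] <- NA
  }
  data
}

# Mean Absolute Percentage Error, in percent (assumes no zeros in `actual`).
mape <- function(actual, predicted) {
  mean(abs((actual - predicted) / actual)) * 100
}

# Coefficient of determination (R squared).
r_squared <- function(actual, predicted) {
  1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)
}

# Toy data: mask 10% of each feature column, then check the NA fraction.
toy <- data.frame(x1 = 1:20, x2 = seq(2, 40, by = 2))
masked <- mask_features(toy, c("x1", "x2"), rate = 0.10)
colMeans(is.na(masked))

# Metrics on small hand-made vectors.
mape(c(100, 200), c(110, 180))
r_squared(c(1, 2, 3), c(1, 2, 3))
```

In this sketch each column is masked independently, so a given row can lose one feature, both, or neither.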

## Installation

Before installing CGMissingDataR, make sure the following R packages are installed:

```r
install.packages(c("FNN", "ranger", "mice"))
```

Install the development version of CGMissingDataR from GitHub:

``` r
# install.packages("devtools")  # if devtools is not already installed
devtools::install_github("saraswatsh/CGMissingDataR")
```
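
If you would rather check which prerequisites are already present before installing anything, a small helper along the following lines may be useful. `missing_packages()` is a hypothetical name introduced here for illustration; it relies only on base R's `requireNamespace()`.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Check the packages named in this section, plus devtools for the GitHub install.
to_install <- missing_packages(c("FNN", "ranger", "mice", "devtools"))
if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install them
}
```

The install call is left commented out so that running the snippet only reports what is missing.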

## Example

The example below runs the missingness benchmark on the bundled `CGMExampleData` dataset at four masking rates (5%, 10%, 15%, and 20%).

```{r, setup, cache = TRUE}
library(CGMissingDataR)

# Load the example dataset
data("CGMExampleData")

# Run the missingness benchmark at four masking rates
results <- run_missingness_benchmark(
  CGMExampleData,
  mask_rates = c(0.05, 0.10, 0.15, 0.20),
  target_col = "LBORRES",
  feature_cols = c("TimeDifferenceMinutes", "TimeSeries", "USUBJID")
)
print(results) # Display the results
```
