---
title: "Run your Pipeline"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{run-your-pipeline}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

Once you have built your full specification blueprint and feel comfortable with how the pipeline is executed, you can implement a full multiverse-style analysis. 

Simply use `run_multiverse(<your expanded grid object>)`:

```{r setup, message=FALSE}
library(tidyverse)
library(multitool)

# create some data
the_data <-
  data.frame(
    id  = 1:500,
    iv1 = rnorm(500),
    iv2 = rnorm(500),
    iv3 = rnorm(500),
    mod = rnorm(500),
    dv1 = rnorm(500),
    dv2 = rnorm(500),
    include1 = rbinom(500, size = 1, prob = .1),
    include2 = sample(1:3, size = 500, replace = TRUE),
    include3 = rnorm(500)
  )

# create a pipeline blueprint
full_pipeline <- 
  the_data |>
  add_filters(include1 == 0, include2 != 3, include3 > -2.5) |> 
  add_variables(var_group = "ivs", iv1, iv2, iv3) |> 
  add_variables(var_group = "dvs", dv1, dv2) |> 
  add_model("linear model", lm({dvs} ~ {ivs} * mod))

# expand the pipeline
expanded_pipeline <- expand_decisions(full_pipeline)

# Run the multiverse
multiverse_results <- run_multiverse(expanded_pipeline)

multiverse_results
```

The result will be another `tibble` with various list columns. 

It will always contain a list column named `specifications` containing all the information you generated in your blueprint. Next, there will be a list column for your fitted model, labelled `model_fitted`. 

## Unpacking a multiverse analysis

There are two main ways to unpack and examine `multitool` results. The first is `tidyr::unnest()`. The second is the `reveal()` family of wrappers, covered further down.

### Unnest

Inside the `model_fitted` column, `multitool` gives us four list columns: `model_parameters`, `model_performance`, `model_warnings`, and `model_messages`. 

```{r unnest}
multiverse_results |> unnest(model_fitted)
```
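
To see what these two levels of nesting look like in isolation, here is a minimal sketch on a hand-built tibble that imitates the shape of the results. The `toy_*` objects and their values are made up for illustration and are not produced by `multitool`.

```{r sketch-unnest}
library(tibble)
library(tidyr)

# hypothetical stand-in for multiverse results: one row per decision set,
# each holding a one-row tibble whose columns are themselves list columns
toy_results <- tibble(
  decision = c("1", "2"),
  model_fitted = list(
    tibble(
      model_parameters  = list(tibble(parameter = c("(Intercept)", "iv1"), coef = c(0.10, 0.50))),
      model_performance = list(tibble(r2 = 0.20))
    ),
    tibble(
      model_parameters  = list(tibble(parameter = c("(Intercept)", "iv2"), coef = c(0.00, 0.30))),
      model_performance = list(tibble(r2 = 0.10))
    )
  )
)

# first unnest: still one row per decision set, inner list columns become visible
toy_unnested <- toy_results |> unnest(model_fitted)
names(toy_unnested)

# second unnest: one row per model coefficient
toy_coefs <- toy_unnested |> unnest(model_parameters)
toy_coefs
```

Each call to `unnest()` peels off one layer, which is why the chunks that follow chain two calls to reach a specific result.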

The `model_parameters` column gives you the result of calling `parameters::parameters()` on each model in your grid, which is a `data.frame` of model coefficients and their associated standard errors, confidence intervals, test statistic, and p-values.

```{r tidy}
multiverse_results |> 
  unnest(model_fitted) |> 
  unnest(model_parameters)
```
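
For a single model, the same kind of table can be produced directly. The sketch below calls `parameters::parameters()` on one toy `lm()` fit; `toy_data`, `toy_fit`, and `toy_params` are illustrative names. Note that `parameters` itself uses capitalized column names such as `Parameter` and `Coefficient`, whereas the code later in this vignette refers to `parameter` and `unstd_coef`, so `multitool` appears to rename these columns.

```{r sketch-parameters}
library(parameters)

# toy data, unrelated to the multiverse above
set.seed(1)
toy_data <- data.frame(x = rnorm(100), y = rnorm(100))

# a single linear model
toy_fit <- lm(y ~ x, data = toy_data)

# one row per coefficient, with standard errors, confidence intervals,
# test statistics, and p-values
toy_params <- parameters::parameters(toy_fit)
toy_params
```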

The `model_performance` column gives fit statistics, such as R^2^, AIC, and BIC values, computed by running `performance::performance()` on each model in your grid.

```{r glance}
multiverse_results |> 
  unnest(model_fitted) |>
  unnest(model_performance)
```
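
As with the parameters, it can help to look at what `performance::performance()` returns for one model on its own. This is a small sketch on toy data; the object names are illustrative, and the column names shown come from `performance` itself, so they may differ from the ones `multitool` reports.

```{r sketch-performance}
library(performance)

# toy data and a single linear model
set.seed(1)
toy_data <- data.frame(x = rnorm(100), y = rnorm(100))
toy_fit <- lm(y ~ x, data = toy_data)

# a one-row data frame of fit statistics for this model
toy_fit_stats <- performance::performance(toy_fit)
toy_fit_stats
```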

The `model_messages` and `model_warnings` columns contain information provided by the modeling function. If something went wrong or you need to know something about a particular model, these columns will have captured messages and warnings printed by the modeling function.
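
The idea of storing messages and warnings next to a result can be sketched in base R with `withCallingHandlers()`. This is a general technique, not `multitool`'s actual implementation, and `capture_conditions()` is a hypothetical helper written only for this illustration.

```{r sketch-capture}
# hypothetical helper: evaluate an expression and collect any warnings
# and messages it signals instead of printing them
capture_conditions <- function(expr) {
  warns <- character()
  msgs <- character()
  result <- withCallingHandlers(
    expr,
    warning = function(w) {
      warns <<- c(warns, conditionMessage(w))
      invokeRestart("muffleWarning")
    },
    message = function(m) {
      msgs <<- c(msgs, trimws(conditionMessage(m)))
      invokeRestart("muffleMessage")
    }
  )
  list(result = result, warnings = warns, messages = msgs)
}

# a stand-in for a modeling call that is chatty but still returns a value
toy_capture <- capture_conditions({
  message("fitting model")
  warning("did not converge")
  42
})
toy_capture
```

Because the conditions are muffled after being recorded, the computation runs to completion and the notes travel with the result.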

### Reveal

I wrote wrappers around the `tidyr::unnest()` workflow. The main function is `reveal()`. Pass a multiverse results object to `reveal()` and tell it which columns to grab by indicating the column name in the `.what` argument:

```{r reveal}
multiverse_results |> 
  reveal(.what = model_fitted)
```

If you want to get straight to a specific result, you can specify a sub-list with the `.which` argument:

```{r which}
multiverse_results |> 
  reveal(.what = model_fitted, .which = model_parameters)
```
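
Conceptually, `.what` and `.which` correspond to the two `unnest()` calls shown earlier. The sketch below is a hypothetical re-implementation on toy data, assuming only that `reveal()` wraps the `tidyr::unnest()` workflow as described above; `reveal_sketch()` and `toy_nested` are not part of `multitool`.

```{r sketch-reveal}
library(tibble)
library(tidyr)

# hypothetical wrapper for illustration only; not multitool's code
reveal_sketch <- function(.data, .what, .which = NULL) {
  # first layer: the column named in .what
  out <- unnest(.data, {{ .what }})
  # optional second layer: the sub-list named in .which
  if (!rlang::quo_is_null(rlang::enquo(.which))) {
    out <- unnest(out, {{ .which }})
  }
  out
}

# toy stand-in for a results object
toy_nested <- tibble(
  decision = c("1", "2"),
  model_fitted = list(
    tibble(model_parameters = list(tibble(parameter = c("a", "b"), coef = c(1, 2)))),
    tibble(model_parameters = list(tibble(parameter = c("a", "b"), coef = c(3, 4))))
  )
)

toy_one_layer <- reveal_sketch(toy_nested, model_fitted)
toy_two_layers <- reveal_sketch(toy_nested, model_fitted, model_parameters)
toy_two_layers
```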

### `reveal_model_*`

`multitool` will run and save anything you put in your pipeline, but most often you will want to look at model parameters and/or performance. To that end, there is a set of convenience functions for getting at the most common multiverse results: `reveal_model_parameters`, `reveal_model_performance`, `reveal_model_messages`, and `reveal_model_warnings`.

`reveal_model_parameters` unpacks the model parameters in your multiverse:

```{r reveal-model-parameters}
multiverse_results |> 
  reveal_model_parameters()
```

`reveal_model_performance` unpacks the model performance:

```{r reveal-model-performance}
multiverse_results |> 
  reveal_model_performance()
```

### Unpacking Specifications

You can also choose to expand your decision grid with `.unpack_specs` to see which decisions produced which result. You have two options for unpacking your decisions: `wide` or `long`. If you set `.unpack_specs = 'wide'`, you get one column per decision variable. This is exactly how your decisions appeared in your grid.

```{r unpack-specs-wide}
multiverse_results |> 
  reveal_model_parameters(.unpack_specs = "wide")
```

If you set `.unpack_specs = 'long'`, your decisions get stacked into two columns: `decision_set` and `alternatives`. This format is nice for plotting a particular result from a multiverse analysis across the different decision alternatives.

```{r unpack-specs-long}
multiverse_results |> 
  reveal_model_performance(.unpack_specs = "long")
```
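
The relationship between the two formats, and why the long one suits plotting, can be sketched with `tidyr::pivot_longer()` and `ggplot2` on a toy table. The values below are invented, and this is only an analogy for what `.unpack_specs = "long"` returns, not how `multitool` builds it.

```{r sketch-long-format}
library(tibble)
library(tidyr)
library(ggplot2)

# toy "wide" results: one column per decision variable
toy_wide <- tibble(
  decision = c("1", "2", "3", "4"),
  ivs      = c("iv1", "iv1", "iv2", "iv2"),
  dvs      = c("dv1", "dv2", "dv1", "dv2"),
  r2       = c(0.02, 0.05, 0.01, 0.04)
)

# stack the decision columns into decision_set / alternatives
toy_long <- toy_wide |>
  pivot_longer(
    cols = c(ivs, dvs),
    names_to = "decision_set",
    values_to = "alternatives"
  )
toy_long

# one panel per decision set, one x position per alternative
toy_plot <- ggplot(toy_long, aes(x = alternatives, y = r2)) +
  geom_point() +
  facet_wrap(~decision_set, scales = "free_x")
toy_plot
```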

### Condense

Unpacking specifications alongside specific results allows us to examine the effects of our pipeline decisions. 

A powerful way to organize these results is to summarize a specific results column, say, the r^2^ values of our models over the entire multiverse. `condense()` takes a result column and summarizes it with the `.how` argument, which takes a list in the form of `list(<a name you pick> = <summary function>)`.

Each entry in `.how` creates a column named `<column being condensed>_<summary function name provided>`. In this case, we get `r2_mean` and `r2_median`.

```{r condense}
# model performance r2 summaries
multiverse_results |>
  reveal_model_performance() |> 
  condense(r2, list(mean = mean, median = median))

# model parameters for our predictor of interest
multiverse_results |>
  reveal_model_parameters() |> 
  filter(str_detect(parameter, "iv")) |>
  condense(unstd_coef, list(mean = mean, median = median))
```
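
The `<column>_<name>` naming scheme is the same one `dplyr::across()` uses when given a named list of functions. The sketch below reproduces it on a toy column. It is only an analogy for the summary columns described above, `condense()` may return more than this, and `toy_r2` is made up.

```{r sketch-condense}
library(tibble)
library(dplyr)

# toy r2 values, one per hypothetical decision set
toy_r2 <- tibble(
  decision = c("1", "2", "3", "4"),
  r2       = c(0.10, 0.20, 0.30, 0.80)
)

# a named list of summary functions yields columns r2_mean and r2_median
toy_summary <- toy_r2 |>
  summarize(across(r2, list(mean = mean, median = median)))
toy_summary
```

With a skewed column like this one, the mean and median diverge, which is one reason to ask for both.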

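In the last example, we filtered our multiverse results down to our predictors `iv*` to see what the mean and median effects were (over all combinations of decisions) on our outcomes. Keep in mind that the pattern `"iv"` should also match the interaction terms with `mod`, since their names contain the predictor name as well.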

However, we had three versions of our predictor and two outcomes, so combining `dplyr::group_by()` with `condense()` might be more informative:

```{r group_by-condense1}
multiverse_results |>
  reveal_model_parameters(.unpack_specs = "wide") |> 
  filter(str_detect(parameter, "iv")) |>
  group_by(ivs, dvs) |>
  condense(unstd_coef, list(mean = mean, median = median))
```
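
To check the grouped summaries by hand, the same idea can be sketched with plain `dplyr` on a toy table small enough to verify mentally. The coefficients are invented, and `summarize()` stands in for `condense()` here purely as an analogy.

```{r sketch-group-condense}
library(tibble)
library(dplyr)

# toy coefficients: three decision sets per predictor, one outcome
toy_effects <- tibble(
  ivs        = c("iv1", "iv1", "iv1", "iv2", "iv2", "iv2"),
  dvs        = "dv1",
  unstd_coef = c(0.1, 0.2, 0.6, -0.1, 0.0, 0.4)
)

# one summary row per predictor/outcome combination
toy_grouped <- toy_effects |>
  group_by(ivs, dvs) |>
  summarize(
    across(unstd_coef, list(mean = mean, median = median)),
    .groups = "drop"
  )
toy_grouped
```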

If we are interested in all the terms of the model, we can leverage `group_by` further:

```{r group_by-condense2}
multiverse_results |>
  reveal_model_parameters(.unpack_specs = "wide") |> 
  group_by(parameter, dvs) |>
  condense(unstd_coef, list(mean = mean, median = median))
```
