---
title: "Getting Started"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r, include=FALSE}
library(rix)
```

## Non-technical intro to Nix

*If you are familiar with the concept of package managers, you can skip
to the next section.*

To ensure that a project is reproducible you need to deal with at least four
things:

- Make sure that the required/correct version of R (or any other language) is installed;
- Make sure that the required versions of packages are installed;
- Make sure that system dependencies are installed (for example, you'd need a working Java installation to install the `{rJava}` R package on Linux);
- Make sure that you can install all of this for the hardware you have on hand.

For the three first bullet points, the consensus seems to be a mixture of Docker
to deal with system dependencies, `{renv}` for the packages (or `{groundhog}`,
or a fixed CRAN snapshot like those [Posit
provides](https://packagemanager.posit.co/__docs__/user/get-repo-url/#ui-frozen-urls))
and the [R installation manager](https://github.com/r-lib/rig) to install the
correct version of R (unless you use a Docker image as base that already ships
the required version by default).

As for the last bullet point, the only way out is to be able to compile the
software for the target architecture. There's a lot of moving pieces, and
knowledge that you need to have to get it right.

But it turns out that this is not the only solution. Docker + `{renv}` (or some
other way to deal with R packages) is likely the most popular way to ensure
reproducibility of your projects, but there are other tools to achieve this. One
such tool is called Nix.

Nix is a package manager for Linux distributions, macOS and it even
works on Windows if you enable WSL2. What's a package manager? If you're not a
Linux user, you may not be aware. Let me explain it this way: in R, if you want
to install a package to provide some functionality not included with a vanilla
installation of R, you'd run this:

```
install.packages("dplyr")
```

It turns out that Linux distributions, like Ubuntu for example, work in a
similar way, but for software that you'd usually install using an installer (at
least on Windows). For example you could install Firefox on Ubuntu using:

```
sudo apt-get install firefox
```

(there's also graphical interfaces that make this process "more user-friendly").
In Linux jargon, `packages` are simply what we call software (or I guess
it's all "apps" these days). These packages get downloaded from so-called
repositories (think of CRAN, the repository of R packages) but for any type of
software that you might need to make your computer work: web browsers, office
suites, multimedia software and so on.

So Nix is just another package manager that you can use to install software.

But what interests us is not using Nix to install Firefox, but instead to
install R and the R packages that we require for our analysis (or any other
programming language that we need). But why use Nix instead of the usual ways to
install software on our operating systems?

The first thing that you should know is that Nix's repository, `nixpkgs`, is
huge. Humongously huge. As I'm writing these lines, [there's more than 120'000
pieces of software available](https://search.nixos.org/packages), and the
*entirety of CRAN and Bioconductor* is also available through `nixpkgs`. So
instead of installing R as you usually do and then use `install.packages()` to
install packages, you could use Nix to handle everything. But still, why use Nix
at all?

Nix has an interesting feature: using Nix, it is possible to install software in
(relatively) isolated environments. So using Nix, you can install as many
versions of R and R packages that you need. Suppose that you start working on a
new project. As you start the project, with Nix, you would install a
project-specific version of R and R packages that you would only use for that
particular project. If you switch projects, you'd switch versions of R and R
packages.

If you are familiar with `{renv}`, you should see that this is exactly
the same thing: the difference is that not only will you have a project-specific
library of R packages, you will also have a project-specific R version. So if
you start a project now, you'd have R version 4.2.3 installed (the latest
version available in `nixpkgs` but not the latest version available, more on
this later), with the accompagnying versions of R packages, for as long as the
project lives (which can be a long time). If you start a project next year, then
that project will have its own R, maybe R version 4.4.2 or something like that,
and the set of required R packages that would be current at that time. This is
because Nix always installs the software that you need in separate, (isolated)
environments on your computer. So you can define an environment for one specific
project.

But Nix goes even further: not only can you install R and R packages using
Nix (in isolated) project-specific environments, Nix even installs the required
system dependencies. So for example if I need `{rJava}`, Nix will make sure to
install the correct version of Java as well, always in that project-specific
environment (so if you already some Java version installed on your system, there
won't be any interference).

What's also pretty awesome, is that you can use a specific version of `nixpkgs`
to *always* get *exactly* the same versions of **all** the software whenever you
build that environment to run your project's code. The environment gets defined
in a simple plain-text file, and anyone using that file to build the environment
will get exactly, byte by byte, the same environment as you when you initially
started the project. And this also regardless of the operating system that is
used.

## The Nix package manager

Nix is a package manager that can be installed on your computer (regardless of
OS) and can be used to install software like with any other package manager. If
you're familiar with the Ubuntu Linux distribution, you likely have used
`apt-get` to install software. On macOS, you may have used `homebrew` for
similar purposes. Nix functions in a similar way, but has many advantages over
classic package managers, as it focuses on reproducible builds and downloads
packages from `nixpkgs`, 
[currently the largest software repository](https://repology.org/repositories/graphs).

This means that using Nix, it is possible to install not only R, but also all
the packages required for your project. The obvious question is why use Nix
instead of simply installing R and R packages as usual. The answer is that Nix
makes sure to install every dependency of any package, up to required system
libraries. For example, the `{xlsx}` package requires the Java programming
language to be installed on your computer to successfully install. This can be
difficult to achieve, and `{xlsx}` bullied many R developers throughout the
years (especially those using a Linux distribution, `sudo R CMD javareconf`
still plagues my nightmares).

But with Nix, it suffices to declare that we want the `{xlsx}` package for our
project, and Nix figures out automatically that Java is required and installs
and configures it. It all just happens without any required intervention from
the user. The second advantage of Nix is that it is possible to *pin* a certain
*revision* of the Nix packages' repository (called `nixpkgs`) for our project.
Pinning a revision ensures that every package that Nix installs will always be
at exactly the same versions, regardless of when in the future the packages get
installed.

## rix workflow

The idea of `{rix}` is for you to declare the environment you need using the
provided `rix()` function. `rix()` is the package's main function and generates
a file called `default.nix` which is then used by the Nix package manager to
build that environment. Ideally, you would set up such an environment for each
of your projects. You can then use this environment to either work
interactively, or run R scripts. It is possible to have as many environments as
projects, and software that is common to environments will simply be re-used and
not get re-installed to save space. Environments are isolated for each other,
but can still interact with your system's files, unlike with Docker where a
volume must be mounted. Environments can also interact with the software
installed on your computer through the usual means, which can sometimes lead to
issues. For example, if you already have R installed, and a user library of R
packages, more caution is required to properly use environments managed by Nix.

It is important at this stage to understand that you should not call
`install.packages()` from a running Nix environment. If you want to add packages
to a Nix environment while analyzing data, you need to add it the `default.nix`
expression and rebuild the environment. This is explained in greater detail in
`vignette("installing-r-packages")`.

To avoid interference between your main R library of packages and your Nix
environments, calling `rix()` will also run `rix_init()`, which will create a
custom `.Rprofile` in the project's directory. This `.Rprofile` will ensure that
if you have a user library of packages, these won't get loaded by an R version
running in a Nix shell. It will also redefine `install.packages()` to throw an
error if you try to use it.

`rix()` has several arguments:

- the R version you need for your project or a snapshot date;
- a list of R packages that your project needs;
- an optional list of additional software (for example a Python interpreter, or Quarto);
- an optional list with packages to install from GitHub;
- an optional list of local packages in `.tar.gz` format to install;
- an optional list of LaTeX packages;
- whether you want to use RStudio as an IDE for your project (or VS Code, or another environment);
- the path to save the `default.nix` file (by default the current working directory)

For example:

```{r parsermd-chunk-1, eval = FALSE}
rix(
  r_ver = "latest-upstream",
  r_pkgs = c("dplyr", "chronicler"),
  ide = "none"
)
```

The call above writes a `default.nix` file in the current working directory.
This `default.nix` can in turn be used by Nix to build an environment containing
the latest version of R included in the upstream `nixpkgs`, with the `{dplyr}`
and `{chronicler}` packages.

Take note of the `ide = "none"` argument: this argument, and the values it
can take, are further discussed in the vignette `vignette("configuring-ide")`
but continue reading this vignette and then vignettes numbered by a "d".

You can instead provide a specific R version:

```{r, eval = FALSE}
rix(
  r_ver = "4.4.1",
  r_pkgs = c("dplyr", "chronicler"),
  ide = "none"
)
```

or a date:

```{r, eval = FALSE}
rix(
  date = "2024-12-14",
  r_pkgs = c("dplyr", "chronicler"),
  ide = "none"
)
```

Providing a specific R version or a date will not use the upstream/official
`nixpkgs` package repository, but our `rstats-on-nix` fork. We decided to use a
fork instead of the official `nixpkgs` for older releases, because it allows us
to backport fixes and improve package compatibility, which is espcially
important for Apple Silicon. It also allows us to provide newer releases of R
more quickly than through the official channels. For example, as of writing
(December 18th 2024), R version 4.4.2 is not yet included in upstream `nixpkgs` but
is already available through our fork. Thanks to our fork, we are also able to 
snapshot CRAN more frequently, and thus include more CRAN and Bioconductor
package versions than in the upstream `nixpkgs` repository.

### Converting from an renv.lock file

`{rix}` also includes an `renv2nix()` function that converts an `renv.lock`
file into a valid Nix expression. Read the vignette `vignette("renv2nix")`
to learn more.

### Using default.nix files

The Nix package manager can be used to build reproducible development
environments according to the specifications found in the generated
`default.nix` files, which contain a Nix *expression*. An *expression* is Nix
jargon for a function with multiple inputs and one output, this output being our
development environment. `{rix}` does not require Nix to be installed to
generate valid expressions (but does require an internet connection), so you
could generate expressions and use them on other machines. To actually build an
environment using a `default.nix` file, go to where you chose to write it
(ideally in a new, empty folder that will be the root folder of your project)
and use the Nix package manager to build the environment. Call the following
function in a terminal:

```
nix-build
```

Nix install packages in a dedicated folder on your computer, called the *Nix
store*.

Once Nix is done building the environment, you can start working on it
interactively by using the following command in a terminal emulator (not the R
console):

```
nix-shell
```

You will *drop* into a Nix shell which provides the installed software. It is not
mandatory to call `nix-build` first: you can immediately call `nix-shell`. The
advantage of using `nix-build` first is that it creates a file called `result`
which will prevent the environment to get garbage collected if you clean
the Nix store.

If you want to build an environment for an older version of R, you might get a
warning telling you that you cannot build the expression, but that you can
directly drop into it.

If you want to completely isolate your Nix environment from the rest of the
system, we recommend using `nix-shell --pure` to drop into the environment, as
described in the documentation of `rix_init()`.

Finally, if you want to delete an environment, delete the `result` file first
(if you used `nix-build`) and then call `nix-store --gc`, which will delete all
the orphaned packages.

Now that you know more about Nix and `{rix}`, it is time to get these tools
installed on your system.

- If you're running either Linux or Windows, read the Linux or Windows vignette: `vignette("setting-up-linux-windows")`
- If you're running macOS, read the macOS vignette: `vignette("setting-up-macos")`
