---
title: 'Multiblock basics: one projector, many tables'
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Multiblock basics: one projector, many tables}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
params:
  family: red
css: albers.css
resource_files:
- albers.css
- albers.js
includes:
  in_header: |-
    <script src="albers.js"></script>
    <script>document.addEventListener('DOMContentLoaded',()=>document.body.classList.add('palette-red'));</script>

---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse   = TRUE,
  comment    = "#>",
  fig.width  = 7,
  fig.height = 4
)
library(dplyr)
library(multivarious)
# multiblock_projector() and related verbs come from multivarious;
# during package development, devtools::load_all() works as well
```

# 1. Why multiblock?

Many studies collect several tables on the same samples, for example
transcriptomics plus metabolomics, or multiple sensor blocks.
Most single-table reductions (PCA, ICA, NMF, …) ignore that structure.
`multiblock_projector` is a thin wrapper that keeps track of which
original columns belong to which block, so you can

*   drop in any existing decomposition (PCA, SVD, NMF, …)
*   still know "these five loadings belong to block A, those three to block B"
*   project or reconstruct per block without manual index bookkeeping.

We demonstrate with a minimal two-block toy data set.

```{r data_multiblock}
set.seed(1)
n  <- 100
pA <- 7; pB <- 5                    # two blocks, different widths

XA <- matrix(rnorm(n * pA), n, pA)
XB <- matrix(rnorm(n * pB), n, pB)
X  <- cbind(XA, XB)                 # global data matrix
blk_idx <- list(A = 1:pA, B = (pA + 1):(pA + pB)) # Named list is good practice
```
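
With more than two blocks, writing the index ranges by hand gets error-prone. The following is a minimal base-R sketch, assuming the blocks sit side by side in the column order of the global matrix. The names `block_widths` and `block_ranges` are illustrative and not part of the package.

```{r sketch_block_indices}
# Hypothetical block widths, in the order the blocks are cbind()-ed
block_widths <- c(A = 7, B = 5, C = 3)

block_ends   <- cumsum(block_widths)
block_starts <- block_ends - block_widths + 1

# One index range per block; Map() carries the block names over
block_ranges <- Map(seq, block_starts, block_ends)
str(block_ranges)

# Sanity check: the ranges tile 1..p with no gaps or overlaps
stopifnot(all(unlist(block_ranges) == seq_len(sum(block_widths))))
```

A list built this way can be passed wherever a `block_indices` list is expected.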

# 2. Wrap a single PCA as a multiblock projector

```{r build_multiblock}
# 2-component centred PCA (using base SVD for brevity)
preproc_fitted <- fit(center(), X)
Xc        <- transform(preproc_fitted, X)          # Centered data
svd_res   <- svd(Xc, nu = 0, nv = 2)               # only V (loadings)
mb        <- multiblock_projector(
  v             = svd_res$v,                       # p × k loadings
  preproc       = preproc_fitted,                  # remembers centering
  block_indices = blk_idx
)

print(mb)
```
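
Conceptually, a block's loadings are just the rows of `v` listed in `block_indices`. The sketch below shows that bookkeeping in base R on the built-in `iris` measurements, treating the sepal and petal columns as two blocks. It illustrates the idea only and is not the package's internal code; all `iris_*` names are made up for this example.

```{r sketch_block_loadings}
iris_X      <- as.matrix(iris[, 1:4])
iris_blocks <- list(sepal = 1:2, petal = 3:4)   # illustrative block layout

iris_Xc <- sweep(iris_X, 2, colMeans(iris_X))   # column-centre
iris_V  <- svd(iris_Xc, nu = 0, nv = 2)$v       # 4 x 2 loadings

# Split the loading matrix into one sub-matrix per block
iris_V_by_block <- lapply(iris_blocks,
                          function(idx) iris_V[idx, , drop = FALSE])
sapply(iris_V_by_block, dim)                    # rows per block, shared k columns

# Stacking the pieces in block order restores the full loading matrix
stopifnot(isTRUE(all.equal(do.call(rbind, unname(iris_V_by_block)), iris_V)))
```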

## 2.1  Project the whole data

```{r project_multiblock_all}
scores_all <- project(mb, X)                       # n × 2
head(round(scores_all, 3))
```
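
For a centred PCA, projecting amounts to subtracting the training column means and multiplying by the loadings. The base-R sketch below spells this out on `iris`, assuming centring is the only preprocessing step (as in the object built above). `project_sketch()` is a hypothetical helper, and the comparison against `prcomp()` ignores the sign of each component.

```{r sketch_projection}
iris_X  <- as.matrix(iris[, 1:4])
iris_mu <- colMeans(iris_X)                       # what the preprocessor remembers
iris_V  <- svd(sweep(iris_X, 2, iris_mu), nu = 0, nv = 2)$v

# Hypothetical helper: centre new rows with the stored means, then project
project_sketch <- function(new_X, mu, V) sweep(new_X, 2, mu) %*% V

iris_scores <- project_sketch(iris_X, iris_mu, iris_V)

# Agrees with prcomp() scores once component signs are ignored
stopifnot(isTRUE(all.equal(abs(unname(iris_scores)),
                           abs(unname(prcomp(iris_X)$x[, 1:2])))))
```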

## 2.2  Project one block only

```{r project_multiblock_block}
# Project using only the columns of block A (same column order as in X)
scores_A <- project_block(mb, XA, block = 1)
# Project using only the columns of block B
scores_B <- project_block(mb, XB, block = 2)

# Agreement between full-data and block-A-only scores on component 1
cor(scores_all[, 1], scores_A[, 1])
```

Because the global PCA fits all columns jointly, the loadings for block A
still point into the shared latent space. Projecting only block A therefore
gives an estimate of the coordinates you would obtain from the whole matrix,
not an exact copy; how close the estimate is depends on how much of each
component block A carries. This is useful when a block is missing at
prediction time.
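
One common way to score samples from a subset of columns is a least-squares fit of the observed columns on the matching rows of the loading matrix. Whether `project_block()` uses exactly this estimator is an assumption here; the base-R sketch on `iris` only illustrates how block-only scores relate to the full-data scores.

```{r sketch_block_projection}
iris_X  <- as.matrix(iris[, 1:4])
iris_Xc <- sweep(iris_X, 2, colMeans(iris_X))
iris_V  <- svd(iris_Xc, nu = 0, nv = 2)$v
sepal   <- 1:2                                    # illustrative block: sepal columns

global_scores <- iris_Xc %*% iris_V               # uses all four columns

# Least-squares scores from the sepal block alone:
# minimise || Xc[, sepal] - S %*% t(V[sepal, ]) || over S
V_sepal      <- iris_V[sepal, , drop = FALSE]
sepal_scores <- iris_Xc[, sepal] %*% V_sepal %*% solve(crossprod(V_sepal))

# Per-component agreement (1 would mean identical up to scale)
diag(cor(global_scores, sepal_scores))
# Largest absolute gap between the two sets of scores
max(abs(global_scores - sepal_scores))
```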

## 2.3  Partial feature projection

Need to use just three variables from block B?

```{r project_multiblock_partial}
# Get the global indices for the first 3 columns of block B
sel_cols_global <- blk_idx[["B"]][1:3]
# Extract the corresponding data columns from the full matrix or block B
part_XB_data  <- X[, sel_cols_global, drop = FALSE] # Data must match global indices

scores_part <- partial_project(mb, part_XB_data,
                               colind = sel_cols_global)  # Use global indices
head(round(scores_part, 3))
```
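
The step from block-local positions to global column indices is easy to get wrong, so it can help to wrap it. A small sketch follows; `block_to_global()` is a hypothetical helper, not a function from the package.

```{r sketch_global_indices}
# Hypothetical helper: translate positions within a block to global columns
block_to_global <- function(block_indices, block, local) {
  idx <- block_indices[[block]]
  stopifnot(!is.null(idx), all(local >= 1), all(local <= length(idx)))
  idx[local]
}

demo_blocks <- list(A = 1:7, B = 8:12)            # same layout as the toy data
block_to_global(demo_blocks, "B", 1:3)            # global columns 8, 9, 10
block_to_global(demo_blocks, "A", c(2, 7))        # global columns 2, 7
```

The returned vector is what a `colind` argument expects, provided the data columns are supplied in the same order.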

# 3. Adding scores → multiblock_biprojector

If you also keep the sample scores from the original fit, you get two-way functionality:
reconstruct data, measure error, run permutation tests, and so on. That only
takes two extra arguments (`s` and `sdev`) when creating the object:

```{r build_biprojector}
bi <- multiblock_biprojector(
  v             = svd_res$v,
  s             = Xc %*% svd_res$v,             # sample scores: centred data times loadings
  sdev          = svd_res$d[1:2] / sqrt(n - 1), # singular values scaled to component standard deviations
  preproc       = preproc_fitted,
  block_indices = blk_idx
)
print(bi)
```
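
The ingredients passed above are tied together by the SVD of the centred data: scores are the centred data times the loadings, and singular values divided by `sqrt(n - 1)` give the component standard deviations. The base-R sketch below checks these identities on `iris` and reconstructs the data from scores and loadings. Variable names are illustrative, and the reconstruction formula assumes centring is the only preprocessing.

```{r sketch_reconstruction}
iris_X  <- as.matrix(iris[, 1:4])
iris_mu <- colMeans(iris_X)
iris_Xc <- sweep(iris_X, 2, iris_mu)
iris_k  <- 2

iris_svd <- svd(iris_Xc, nu = 0, nv = iris_k)
iris_V   <- iris_svd$v
iris_S   <- iris_Xc %*% iris_V                       # scores
iris_sd  <- iris_svd$d[seq_len(iris_k)] / sqrt(nrow(iris_X) - 1)

# Component standard deviations match the spread of the score columns
stopifnot(isTRUE(all.equal(iris_sd, unname(apply(iris_S, 2, sd)))))

# Rank-k reconstruction: scores times transposed loadings, plus the means
iris_hat <- sweep(iris_S %*% t(iris_V), 2, iris_mu, "+")
sqrt(mean((iris_X - iris_hat)^2))                    # reconstruction RMSE
```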

Now you can, for instance, test whether the component-wise consensus
between blocks is stronger than expected by chance.

```{r perm_test_multiblock}
# Quick permutation test (use more permutations for real analyses)
# use_rspectra=FALSE needed for this 2-block example; larger problems can use TRUE
perm_res <- perm_test(bi, Xlist = list(A = XA, B = XB), nperm = 99, use_rspectra = FALSE)
print(perm_res$component_results)
```

The `perm_test` method for `multiblock_biprojector` uses an eigen-based score consensus
statistic to assess whether blocks share more variance than expected by chance.
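
To make the idea concrete, here is one way such a test could look in base R. It is a sketch under stated assumptions, not the package's algorithm: the statistic (leading eigenvalue of the correlation matrix of per-block scores on one component) and the permutation scheme (shuffling the rows of the non-reference block) are choices made for illustration.

```{r sketch_perm_test}
iris_X      <- as.matrix(iris[, 1:4])
iris_Xc     <- sweep(iris_X, 2, colMeans(iris_X))
iris_blocks <- list(sepal = 1:2, petal = 3:4)        # illustrative blocks
iris_v1     <- svd(iris_Xc, nu = 0, nv = 1)$v[, 1]   # first loading vector

# Each block's contribution to the component-1 score, one column per block
block_scores <- sapply(iris_blocks, function(idx)
  drop(iris_Xc[, idx, drop = FALSE] %*% iris_v1[idx]))

# Assumed statistic: leading eigenvalue of the between-block correlation matrix
consensus <- function(S) {
  eigen(cor(S), symmetric = TRUE, only.values = TRUE)$values[1]
}

set.seed(42)
obs  <- consensus(block_scores)
perm <- replicate(199, {
  S <- block_scores
  S[, -1] <- S[sample(nrow(S)), -1]                  # break the row alignment
  consensus(S)
})
p_value <- (1 + sum(perm >= obs)) / (length(perm) + 1)
c(statistic = obs, p_value = p_value)
```

With two blocks the statistic equals one plus the absolute correlation between the two block scores, so it ranges from 1 (no agreement) to 2 (perfect agreement).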

# 4. Take-aways

*   Any decomposition that delivers a loading matrix `v` (and
    optionally scores `s`) can become multiblock-aware by supplying
    `block_indices`.
*   The wrapper introduces zero new maths – it only remembers the column
    grouping and plugs into the common verbs:

| Verb                  | What it does in multiblock context                   |
|-----------------------|------------------------------------------------------|
| `project()`           | whole-matrix projection (uses preprocessing)         |
| `project_block()`     | scores based on one block's data                     |
| `partial_project()`   | scores from an arbitrary subset of global columns    |
| `coef(..., block=)`   | retrieve loadings for a specific block               |
| `perm_test()`         | permutation test for block consensus (biprojector)   |

This light infrastructure lets you prototype block-aware analyses
quickly, while still tapping into the rest of the `multivarious` toolkit
(cross-validation, reconstruction metrics, composition with
`compose_projector`, etc.).

```{r sessionInfo}
sessionInfo()
```
