---
title: "Saving and Sharing Graphs with the Caugi Format"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Saving and Sharing Graphs with the Caugi Format}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(caugi)
```

## Overview

The caugi package provides a native JSON-based serialization format for saving
and loading causal graphs. This format enables reproducible research, data
sharing, and caching of graph structures.

## Quick Start

### Writing Graphs

First, create a causal graph:

```{r}
cg <- caugi(
  A %-->% B + C,
  B %-->% D,
  C %-->% D,
  class = "DAG"
)
```

Then, write it to a file in the caugi format:

```{r}
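# Use the recommended `.caugi.json` extension for the output file
tmp <- tempfile(fileext = ".caugi.json")
# `comment` and `tags` are optional metadata
write_caugi(cg, tmp,
  comment = "Example causal graph",
  tags = c("research", "example")
)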
```

That's it! The graph is now saved in a human-readable JSON file.

### Reading Graphs

Read the graph back from the file and check that its edges match the original:

```{r}
cg_loaded <- read_caugi(tmp)

identical(edges(cg), edges(cg_loaded))
```
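
The overview lists caching as one use of the format. A minimal sketch of that idea follows; `cached_graph()` and `build_dag()` are hypothetical helpers written for this example and are not part of caugi. The helper only combines `read_caugi()` and `write_caugi()` with a file-existence check.

```{r}
library(caugi)

# Hypothetical helper (not part of caugi): return the graph stored at `path`
# if the file exists; otherwise build the graph, save it, and return it.
cached_graph <- function(path, build) {
  if (file.exists(path)) {
    return(read_caugi(path))
  }
  g <- build()
  write_caugi(g, path)
  g
}

cache_file <- tempfile(fileext = ".caugi.json")
build_dag <- function() caugi(A %-->% B, B %-->% C, class = "DAG")

g_first <- cached_graph(cache_file, build_dag)  # file missing: builds and writes
cache_written <- file.exists(cache_file)
g_second <- cached_graph(cache_file, build_dag) # file present: reads from disk

# Compare the edges of the built graph and the cached copy
same_edges <- identical(edges(g_first), edges(g_second))
same_edges

unlink(cache_file)
```

In a real project the cache file would live at a stable path rather than in a temporary location, so that later sessions can reuse it.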

## The Caugi Format

### Structure

The caugi format uses a simple, human-readable JSON structure:

```{r echo=FALSE, comment=""}
cat(readLines(tmp), sep = "\n")
```

### Key Features

- **Versioned**: Schema version 1 with forward compatibility
- **Human-readable**: Uses node names and DSL operators (not indices)
- **Self-documenting**: Includes `$schema` reference for IDE validation
- **Metadata support**: Optional comments and tags

### Edge Types

The format supports all caugi edge types using their DSL operators:

| Operator   | Description                | Graph Types                |
|:-----------|:---------------------------|:---------------------------|
| `-->`      | Directed edge              | DAG, PDAG, ADMG, UNKNOWN   |
| `---`      | Undirected edge            | UG, PDAG, UNKNOWN          |
| `<->`      | Bidirected edge            | ADMG, UNKNOWN              |
| `o->`      | Partially directed         | PDAG, UNKNOWN              |
| `--o`      | Partially undirected       | PDAG, UNKNOWN              |
| `o-o`      | Partial (both circles)     | PDAG, UNKNOWN              |

## Working with the Format

### String Serialization

For programmatic use, you can serialize a graph to a JSON string and deserialize it again without touching the file system:

```{r}
# Serialize to JSON string
json_str <- caugi_serialize(cg)
cat(substr(json_str, 1, 200), "...\n")

# Deserialize from JSON string
cg_from_json <- caugi_deserialize(json_str)
```
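
To check a string round trip, the same comparison used for files in the Quick Start can be applied. This is a small sketch that rebuilds a graph so the chunk stands on its own; it assumes only the functions already shown in this vignette.

```{r}
library(caugi)

g <- caugi(A %-->% B, B %-->% C, class = "DAG")

# Serialize to a JSON string and immediately deserialize it
g_roundtrip <- caugi_deserialize(caugi_serialize(g))

# Compare the edges before and after the round trip
roundtrip_same <- identical(edges(g), edges(g_roundtrip))
roundtrip_same
```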

### Metadata

Add context to your graphs with comments and tags:

```{r}
write_caugi(cg, tmp,
  comment = "Mediation model from Study A",
  tags = c("mediation", "study-a", "validated")
)
```

## Different Graph Types

The format supports all caugi graph classes:

```{r}
# DAG
dag <- caugi(X %-->% Y, Y %-->% Z, class = "DAG")

# PDAG (with undirected edges)
pdag <- caugi(X %-->% Y, Y %---% Z, class = "PDAG")

# ADMG (with bidirected edges)
admg <- caugi(X %-->% Y, Y %<->% Z, class = "ADMG")

# UG (undirected graph)
ug <- caugi(X %---% Y, Y %---% Z, class = "UG")

# Save them all
write_caugi(dag, tempfile(fileext = ".caugi.json"))
write_caugi(pdag, tempfile(fileext = ".caugi.json"))
write_caugi(admg, tempfile(fileext = ".caugi.json"))
write_caugi(ug, tempfile(fileext = ".caugi.json"))
```

## File Extension Convention

We recommend using `.caugi.json` as the file extension to clearly indicate both the format and content type. This helps tools recognize the files and enables automatic handling by IDEs and validators.
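
A consistent extension also makes saved graphs easy to find from R. The sketch below uses only base R; the directory and file names are placeholders created for the example, and the files are empty stand-ins rather than real graphs.

```{r}
# Create a scratch directory containing a mix of file names
graph_dir <- tempfile("graphs")
dir.create(graph_dir)
invisible(file.create(
  file.path(graph_dir, c("model.caugi.json", "notes.json", "data.csv"))
))

# Keep only names that end in ".caugi.json"
caugi_files <- list.files(graph_dir, pattern = "\\.caugi\\.json$")
caugi_files

unlink(graph_dir, recursive = TRUE)
```

Adding `full.names = TRUE` to `list.files()` returns full paths that can be passed directly to `read_caugi()`.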

## Schema Validation

All files generated by `write_caugi()` include a `$schema` field pointing to the formal JSON Schema specification:

```
https://caugi.org/schemas/caugi-v1.schema.json
```

This enables:

- **IDE support**: Autocomplete and inline validation in VS Code, IntelliJ, etc.
- **Automated validation**: Use standard JSON Schema validators
- **Documentation**: Hover hints in editors show field descriptions
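
To confirm that a written file carries the schema reference, you can search its text for the `$schema` field. This is a quick base R check, not a schema validation; it only looks for the literal field name and assumes nothing else about the file layout.

```{r}
library(caugi)

# Write a small graph to a temporary file
schema_file <- tempfile(fileext = ".caugi.json")
write_caugi(caugi(A %-->% B, class = "DAG"), schema_file)

# fixed = TRUE matches "$schema" literally instead of as a regular expression
schema_lines <- grep(
  "$schema",
  readLines(schema_file, warn = FALSE),
  fixed = TRUE,
  value = TRUE
)
schema_lines

unlink(schema_file)
```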

## Performance

Serialization is implemented in Rust for performance. To see how it behaves on your machine, time a write and a read of a larger generated graph (this chunk is not evaluated when the vignette is built):

```{r eval=FALSE}
tmp_file <- tempfile(fileext = ".caugi.json")
large_dag <- generate_graph(n = 1000, m = 500, class = "DAG")
system.time(write_caugi(large_dag, tmp_file))
system.time(res <- read_caugi(tmp_file))
unlink(tmp_file)
```

```{r cleanup, include=FALSE}
# Clean up temp files
unlink(tmp)
```
