---
title: "BibTeX and CFF"
subtitle: A deterministic crosswalk between two non-equivalent citation schemas
bibliography: REFERENCES.bib
author:
  - name: Diego Hernangómez
    orcid: 0000-0001-8457-4658
    affiliations:
      - Independent Researcher
tbl-cap-location: bottom
description: >
  This article presents a crosswalk between BibTeX and the Citation File Format,
  as implemented by the cffr package.
abstract: >
  This article introduces a crosswalk between BibTeX and the Citation File
  Format (CFF) [@druskat_citation_2021], as implemented by the **cffr** package
  [@hernangomez2021]. Because the two formats differ in structure and
  expressiveness, the mapping is not bijective. Nevertheless, it provides a
  deterministic and reproducible strategy for practical interoperability across
  citation workflows.
link-citations: true
documentclass: article
toc: true
toc-depth: 3
vignette: >
  %\VignetteIndexEntry{BibTeX and CFF}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
knitr:
  opts_chunk:
    collapse: true
    comment: "#>"
---

```{r}
#| include: false
library(cffr)
# Load the table of tables

p2file <- system.file("extdata/crosswalk_tables.csv", package = "cffr")

table_master <- read.csv(p2file)
```

::: callout-important
Generative AI tools were used to assist in producing some of the material in
this article.
:::

## Citation

Please cite this article using this BibTeX entry:

``` bib
@article{hernangomez2022,
    title        = {{BibTeX} and {CFF}, A deterministic crosswalk between two non-equivalent citation schemas},
    author       = {Diego Hernangómez},
    year         = 2022,
    journal      = {The {cffr} package},
    volume       = {Vignettes},
    doi          = {10.21105/joss.03900},
    url          = {https://docs.ropensci.org/cffr/articles/bibtex-cff.html}
}
```

## BibTeX and R

[BibTeX](https://en.wikipedia.org/wiki/BibTeX) is a widely used format for
storing bibliographic references, originally designed in 1985 for
document‑centric workflows. It represents citations as relatively flat records
with loosely constrained fields, relying on convention rather than an explicit
schema.

The Citation File Format (CFF) [@druskat_citation_2021] provides a structured
and extensible alternative for representing citation metadata, particularly for
software and research outputs. It supports explicit typing, nested objects, and
richer semantics, including contributor roles and identifiers.

In modern research workflows, both formats frequently coexist. BibTeX remains
dominant in academic publishing pipelines, while CFF is increasingly adopted by
infrastructure platforms such as GitHub and Zenodo.

The **cffr** package [@hernangomez2021] provides a deterministic mapping
(crosswalk) between BibTeX and CFF, enabling practical interoperability. Due to
fundamental differences in their data models, this mapping is not bijective and
may involve structural transformations, heuristic parsing, and controlled
information loss.

## Conceptual differences between BibTeX and CFF

At a conceptual level, BibTeX and CFF embody different design philosophies.

BibTeX represents bibliographic information as flat records with loosely defined
fields, optimized for citation rendering in documents. Semantics are largely
implicit and contextual, depending on entry types and bibliography styles.

CFF, in contrast, defines a structured schema with explicit typing, nested
objects, and support for richer metadata. It is designed not only for producing
citations, but also for machine‑actionable interoperability across platforms.

These conceptual differences explain why a direct one‑to‑one correspondence
between BibTeX and CFF is not possible. Any crosswalk must therefore rely on
explicit design decisions that balance fidelity, usability, and schema
constraints.

## BibTeX Definitions

@patashnik1988 provides the canonical description of the BibTeX format. Two key
concepts are central: **entries** and **fields**.

### Entries {#entries}

The original BibTeX specification defines fourteen canonical entry types, each
corresponding to a class of cited work:

1.  **\@article**: An article from a journal or magazine.
2.  **\@book**: A book with an explicit publisher.
3.  **\@booklet**: A printed and bound work without a named publisher.
4.  **\@conference**: Equivalent to **\@inproceedings**, included for
    [Scribe](https://en.wikipedia.org/wiki/Scribe_(markup_language))
    compatibility.
5.  **\@inbook**: A part of a book, such as a chapter or page range.
6.  **\@incollection**: A part of a book with its own title.
7.  **\@inproceedings**: An article in conference proceedings.
8.  **\@manual**: Technical documentation.
9.  **\@mastersthesis**: A Master’s thesis.
10. **\@misc**: A fallback type when no other entry fits.
11. **\@phdthesis**: A PhD thesis.
12. **\@proceedings**: The proceedings of a conference.
13. **\@techreport**: A numbered technical report.
14. **\@unpublished**: A work not formally published.

Other implementations, notably BibLaTeX [@biblatexpack], extend this set with
additional entry types such as online resources, software, and datasets. In
standard BibTeX, such entries must typically be represented as **\@misc**.

In **R** [@R_2021], the base function `bibentry()` does not implement
**\@conference**. Instead, **\@inproceedings** is used, as both share the same
conceptual definition.
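
As a minimal sketch of this behavior, the code below builds an
**\@inproceedings** entry with base R and then retries the same fields with
`bibtype = "Conference"`. The helper `conference_entry()`, the key, and the
field values are illustrative only. The second call is expected to fail because
that entry type is not implemented, so the error is captured rather than raised.

```r
# Illustrative helper: same fields, different entry type
conference_entry <- function(bibtype) {
  utils::bibentry(
    bibtype = bibtype,
    key = "inproceedings-min",
    title = "On Notions of Information Transfer in {VLSI} Circuits",
    author = c(
      utils::person("Alfred V.", "Oaho"),
      utils::person("Jeffrey D.", "Ullman")
    ),
    booktitle = "Proc. Fifteenth Annual ACM Symposium on the Theory of Computing",
    year = "1983"
  )
}

# @inproceedings is implemented by bibentry()
inproc <- conference_entry("InProceedings")
bibtex_lines <- utils::toBibtex(inproc)
bibtex_lines

# @conference is not: capture the error instead of stopping
conference_attempt <- tryCatch(
  conference_entry("Conference"),
  error = function(e) e
)
inherits(conference_attempt, "error")
```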

### Fields {#fields}

BibTeX entries consist of fields whose relevance and requirement depend on the
entry type. Some fields are mandatory, others optional, and some are ignored by
standard bibliography styles but may still carry useful information.

The following table summarizes the relationship between BibTeX **entries** and
their **required or optional fields**, following @patashnik1988. Required fields
are marked with **x**, and optional fields with **o**.

```{r}
#| label: tbl-entry_fields1
#| echo: false
#| tbl-cap: BibTeX entries
#| tbl-subcap:
#|   - "BibTeX: required fields by entry"
#|   - "BibTeX: required fields by entry (cont.)"
#|
df_table <- table_master[table_master$table == "entry_fields", -1]

nms <- c(
  "field",
  "\\@article",
  "\\@book",
  "\\@booklet",
  "\\@inbook",
  "\\@incollection",
  "\\@conference, \\@inproceedings",
  "\\@manual",
  "\\@mastersthesis, \\@phdthesis",
  "\\@misc",
  "\\@proceedings",
  "\\@techreport",
  "\\@unpublished"
)

df_table[is.na(df_table)] <- ""
row.names(df_table) <- NULL
t1 <- df_table[, c(1:7)]
nm1 <- nms[1:7]

knitr::kable(
  t1,
  col.names = nm1,
  row.names = NA,
  align = c("l", rep("c", 6))
)

t2 <- df_table[, c(1, 8:13)]
nm2 <- nms[c(1, 8:13)]
knitr::kable(
  t2,
  col.names = nm2,
  row.names = NA,
  align = c("l", rep("c", 6))
)
```

Each entry type requires only a subset of fields. Fields such as **title**,
**author**, and **year** appear across most entry types, whereas many others are
only ever optional. This coupling between entry types and fields is a defining
feature of BibTeX and a key source of friction when interoperating with
schema‑driven formats such as CFF.
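
This coupling can be observed with base R's `bibentry()`, which checks the
required fields of each entry type. In the sketch below (illustrative key and
values, hypothetical helper `make_entry()`), the same fields are accepted for
**\@misc**, which has no required fields, but rejected for **\@article** because
**journal** is missing.

```r
# Illustrative helper: identical fields, only the entry type changes
make_entry <- function(bibtype) {
  utils::bibentry(
    bibtype = bibtype,
    key = "aamport1986",
    title = "The Gnats and Gnus Document Preparation System",
    author = utils::person("Leslie A.", "Aamport"),
    year = "1986"
  )
}

# @misc has no required fields, so this succeeds
as_misc <- make_entry("Misc")

# @article also requires journal, so the error is captured here
as_article <- tryCatch(make_entry("Article"), error = function(e) e)

inherits(as_misc, "bibentry")
inherits(as_article, "error")
conditionMessage(as_article)
```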

## Citation File Format

Citation File Format (CFF) files are plain‑text YAML files that encode human‑
and machine‑readable citation metadata for software and datasets.

Two keys play a central role in CFF reference modeling:

- `preferred-citation`: Identifies a work that should be cited instead of the
  software or dataset itself.
- `references`: Lists related creative works, analogous to references in a
  scholarly article.

Both keys expect `definition-reference` objects, as defined in the [CFF schema
guide](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#preferred-citation).
These objects support explicit typing, metadata nesting, and structured
identifiers, in contrast to BibTeX’s flat field model.

The following table summarizes the valid keys for CFF `definition-reference`:

```{r}
#| label: tbl-refkeys
#| echo: false
#| message: false
#| warning: false
#| results: asis
#| tbl-cap: "Valid keys in CFF `definition-reference` objects"
library(cffr)

# Pad with blank cells so the keys fill a 5-column matrix
init <- paste0("`", cff_schema_definitions_refs(), "`")

l <- c(init, rep("", 4))


refkeys <- matrix(l, ncol = 5, byrow = TRUE)

knitr::kable(
  refkeys,
  row.names = NA
)
```

Conceptually, most of these keys correspond to BibTeX [**fields**](#fields),
with one key playing a special role: `type`. In CFF, `type` explicitly
identifies the kind of referenced work, making it conceptually analogous to a
BibTeX [**entry**](#entries) type rather than a **field**[^1].

[^1]: See a complete list of possible values of CFF `type` in [Appendix
    B](#appendix_cff_type)

## Mapping strategy

The mapping implemented in **cffr** follows a three‑step pipeline:

1.  Parse BibTeX entries into an intermediate representation.
2.  Apply deterministic mapping rules to transform entry types and fields.
3.  Serialize the result as CFF (the same steps apply in reverse when converting
    CFF to BibTeX).

Whenever possible, mappings are deterministic. However, some fields require
heuristic parsing, particularly when normalizing names, dates, or page ranges.
As a result, semantic round‑trip reversibility is not guaranteed, even when the
conversion process itself is reproducible.

```{r}
#| label: cffbibread
#| comment: '#>'
string <- "@book{einstein1921,
    title        = {Relativity: The Special and the General Theory},
    author       = {Einstein, A.},
    year         = 1920,
    publisher    = {Henry Holt and Company},
    address      = {London, United Kingdom},
    isbn         = 9781587340925}"

# To cff
library(cffr)
cff_format <- cff_read_bib_text(string)

cff_format

# To BibTeX with S3 method
toBibtex(cff_format)
```

## Mapping tables

The following tables summarize the mapping rules implemented in **cffr**. They
are intended as implementation documentation rather than as a normative
specification.

Mapping types include:

- **direct**: preserved without modification.
- **transform**: renamed or structurally reorganized.
- **split**: one field mapped to multiple fields.
- **heuristic**: requires parsing or inference.
- **unsupported**: omitted due to lack of correspondence.
- **enrichment**: exists only in CFF.

These mappings are not bijective and may introduce normalization effects.

::: callout-note
For clarity throughout this document, **bold** formatting (e.g., **\@book**,
**edition**) is used for BibTeX entries and fields, whereas inline code
formatting (e.g., `book`, `edition`) is used for CFF keys.
:::

::: {#tbl-mapentry}
| BibTeX entry | CFF key `type` | Mapping type | Notes |
|:-------------------|:-------------------|:-------------------|:-------------------|
| **\@article** | `type: article` | direct | \- |
| **\@book** | `type: book` | direct | \- |
| **\@inbook** | `type: book` | transform | Treated as a book, with `section` and `start`/`end` describing the cited part |
| **\@incollection** | `type: generic` | heuristic | Fallback type, combined with `collection-title` |
| **\@manual** | `type: manual` | direct | \- |
| **\@misc** | `type: generic` | heuristic | `generic` is used as a fallback when no more specific CFF type applies. |
| **\@mastersthesis**, **\@phdthesis** | `type: thesis` | transform | The thesis subtype (master’s vs PhD) is carried by `thesis-type`, not by `type` |
| **\@techreport** | `type: report` | transform | \- |
| **other** | `type: generic` | heuristic | Default fallback |

: BibTeX entries and CFF key `type` mapping
:::
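
Read as code, this table amounts to a lookup with a fallback. The sketch below
only illustrates that idea in base R: `bibtex_to_cff_type()` is a hypothetical
helper, not a **cffr** function, and it covers just a subset of the rows
(**\@article**, **\@book**, **\@manual**, **\@misc**, **\@phdthesis**, and
**\@techreport**).

```r
# Hypothetical lookup covering a subset of the rows above
type_lookup <- c(
  article = "article",
  book = "book",
  manual = "manual",
  misc = "generic",
  phdthesis = "thesis",
  techreport = "report"
)

bibtex_to_cff_type <- function(entry) {
  # Accept "@Book", "book", "BOOK", ...
  entry <- tolower(sub("^@", "", entry))
  cff_type <- unname(type_lookup[entry])
  # Anything not listed falls back to "generic"
  cff_type[is.na(cff_type)] <- "generic"
  cff_type
}

bibtex_to_cff_type(c("@article", "TechReport", "@unknowntype"))
```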

::: {#tbl-mapcore}
| BibTeX field  | CFF key            | Mapping type | Notes                               |
|:-------------------|:-------------------|:-------------------|:--------------------|
| **title**     | `title`            | direct       | \-                                  |
| **author**    | `authors`          | transform    | Parsed into structured name objects |
| **editor**    | `editors`          | transform    | Same parsing logic as authors       |
| **year**      | `year`             | direct       | \-                                  |
| **month**     | `month`            | heuristic    | Format normalization required       |
| **journal**   | `journal`          | direct       | \-                                  |
| **booktitle** | `collection-title` | transform    | Container title                     |
| **volume**    | `volume`           | direct       | \-                                  |
| **number**    | `issue`            | transform    | Naming normalization                |
| **pages**     | `start`/`end`      | split        | Parsed from page range              |
| **doi**       | `doi`              | direct       | \-                                  |
| **url**       | `url`              | direct       | \-                                  |

: Core fields
:::
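
The direct and transform rows of this table can be pictured as a renaming step.
The following base R sketch is an illustration under that assumption:
`rename_core_fields()` is a hypothetical helper, not part of **cffr**, and
**author**, **editor**, **month**, and **pages** are left out because they need
parsing rather than renaming.

```r
# BibTeX field -> CFF key, for fields that need at most a new name
core_map <- c(
  title = "title",
  year = "year",
  journal = "journal",
  booktitle = "collection-title",
  volume = "volume",
  number = "issue",
  doi = "doi",
  url = "url"
)

rename_core_fields <- function(fields) {
  # Keep only the fields present in the lookup, then rename them
  known <- names(fields) %in% names(core_map)
  out <- fields[known]
  names(out) <- unname(core_map[names(out)])
  out
}

bib_fields <- list(
  title = "Semigroups of Recurrences",
  booktitle = "High Speed Computer and Algorithm Organization",
  number = "23",
  annote = "dropped: not in the lookup"
)

str(rename_core_fields(bib_fields))
```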

::: {#tbl-mapstruct}
| BibTeX field    | CFF key              | Mapping type | Notes                          |
|:-------------------|:-------------------|:-------------------|:--------------------|
| **publisher**   | `publisher.name`     | transform    | Converted to structured object |
| **address**     | `publisher.address`  | transform    | Combined with `publisher`      |
| **institution** | `institution`        | direct       | Used mainly in reports/thesis  |
| **school**      | `institution`        | transform    | Thesis normalization           |

: Structural fields
:::

::: {#tbl-mapparsed}
| BibTeX field | CFF key       | Mapping type | Notes                                 |
|:-------------|:--------------|:-------------|:--------------------------------------|
| **pages**    | `start`/`end` | heuristic    | Requires parsing, may fail            |
| **author**   | `authors`     | heuristic    | Name splitting is not always reliable |
| **month**    | `month`       | heuristic    | Multiple formats possible             |
| **note**     | `notes`       | heuristic    | Free text                             |

: Parsed fields
:::
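
To see why name splitting is heuristic, consider this deliberately naive base R
sketch. `split_authors()` is a hypothetical helper, not the parser used by
**cffr**: it splits the field on `and`, then takes either the text before a
comma or the last word as the family name. The output names `family-names` and
`given-names` are the CFF person keys.

```r
split_authors <- function(author) {
  # BibTeX separates people with the word "and"
  people <- strsplit(author, "[[:space:]]+and[[:space:]]+")[[1]]
  lapply(people, function(p) {
    p <- trimws(p)
    if (grepl(",", p, fixed = TRUE)) {
      # "Family, Given" form
      parts <- trimws(strsplit(p, ",", fixed = TRUE)[[1]])
      family <- parts[1]
      given <- paste(parts[-1], collapse = " ")
    } else {
      # "Given Family" form: the last word is taken as the family name
      words <- strsplit(p, "[[:space:]]+")[[1]]
      family <- words[length(words)]
      given <- paste(words[-length(words)], collapse = " ")
    }
    list("family-names" = family, "given-names" = given)
  })
}

str(split_authors("Einstein, A. and Donald E. Knuth"))
```

Names with particles (`von`, `de la`), suffixes, or braces protecting an
institutional author would be split incorrectly by this rule, which is the kind
of case the table flags as unreliable.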

::: {#tbl-mapnobib}
| CFF key              | BibTeX equivalent | Mapping type | Notes                |
|:---------------------|:------------------|:-------------|:---------------------|
| `preferred-citation` | \-                | enrichment   | CFF-specific concept |
| `references`         | \-                | enrichment   | Not representable    |
| `type: software`     | \-                | unsupported  | No BibTeX equivalent |
| `identifiers`        | partial           | transform    | Only DOI/URL mapped  |

: CFF keys with no BibTeX equivalence
:::

## Proposed Crosswalk

The **cffr** package provides utilities for mapping BibTeX entries (via
`bibentry()`) to CFF and back. This section describes how the crosswalk is
implemented, partially informed by @Haines_Ruby_CFF_Library_2021[^2].

[^2]: Note that this software only maps CFF to BibTeX, whereas **cffr** performs
    the mapping in both directions.

The crosswalk is primarily defined for BibTeX to CFF conversion; the reverse
mapping is a best-effort approximation and may be lossy.

This section presents the general mapping between BibTeX **entries** and
**fields** and CFF keys. The [**Entry Models**](#entrymodels) section then
refines these rules for specific BibTeX entry types.

The mapping is structured according to the transformation semantics defined
below.

## Transformation Semantics

The crosswalk applies different transformation strategies:

- **Direct mapping**: one-to-one correspondence between a BibTeX field and a CFF
  key.
- **Structural transformation**: a single field is expanded into multiple keys,
  such as **pages** to `start` and `end`.
- **Heuristic mapping**: equivalence is inferred based on entry type or context.
- **Lossy mapping**: information is discarded due to lack of an equivalent
  representation.

Unless stated otherwise, mappings are lossy in at least one direction.

All mappings and examples in the following sections are instances of these
transformation classes.

### Entry/Key `type` Crosswalk

For mapping general BibTeX entries to CFF key `type`, the following equivalence
is proposed:

```{r}
#| label: tbl-entry_bib2cff
#| echo: false
#| results: asis
#| tbl-cap: "Entry/Type crosswalk: From BibTeX to CFF"
df_table <- table_master[table_master$table == "entry_bib2cff", c(2:4)]
df_table[is.na(df_table)] <- ""
# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
row.names(df_table) <- NULL

knitr::kable(
  df_table,
  col.names = c("BibTeX Entry", "CFF key: `type`", "Notes"),
  row.names = NA
)
```

The mapping above has the following particularities:

- **\@book**, **\@inbook**, and **\@incollection** are closely related in
  BibTeX[^3]. While **\@inbook** and **\@incollection** both reference parts of
  a **\@book**, the former is used for citing sections, chapters, pages, or
  other specific parts, whereas the latter is used for citing parts with their
  own title. Since CFF allows both `type: book` and `collection-type: book`,
  these keys can be combined to tag each entry type in CFF accordingly.
- **\@mastersthesis** and **\@phdthesis** are tagged using a combination of
  `type: thesis` and `thesis-type` keys.

[^3]: Note that BibLaTeX [@biblatexpack] handles **\@inbook** differently, see
    [Appendix A](#appendix_inbook).

Additionally, considering that CFF allows for a wide range of values[^4] for the
key `type`, the following mapping is applied from CFF to BibTeX:

[^4]: See [Appendix B](#appendix_cff_type) for all possible values. Information
    extracted from @druskat2019.

The reverse mapping prioritizes BibTeX compatibility and normalization over
exact round-trip preservation.

```{r}
#| label: tbl-entry_cff2bib
#| echo: false
#| results: asis
#| tbl-cap: "Entry/Key `type` crosswalk: From CFF to BibTeX"
df_table <- table_master[table_master$table == "entry_cff2bib", c(2:4)]
df_table[is.na(df_table)] <- ""
# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
row.names(df_table) <- NULL

knitr::kable(
  df_table,
  col.names = c("CFF key `type`", "BibTeX Entry", "Notes")
)
```

### Fields/Key Crosswalk

There is a significant similarity between the definitions and names of certain
BibTeX fields and CFF keys. While the equivalence is straightforward in some
cases, there are instances where certain keys need to be processed depending on
the **entry** type.

```{r}
#| label: tbl-fields_bib2cff
#| echo: false
#| results: asis
#| tbl-cap: "BibTeX - CFF key crosswalk"
df_table <- table_master[table_master$table == "fields_bib2cff", c(2:4)]
df_table[is.na(df_table)] <- ""
# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
row.names(df_table) <- NULL

knitr::kable(
  df_table,
  col.names = c("BibTeX **Field**", "CFF key", "Notes")
)
```

We provide more detail on some of the mappings presented in the table above:

- Some fields are not mapped because they have no clear equivalent among the CFF
  keys (**annote**, **crossref**, and **key**). BibTeX **type** is not mapped
  either: the CFF key `type` identifies the kind of work (similar to an entry in
  BibTeX), so the two are not equivalent. All of these fields are always
  optional in BibTeX.
- The intended use of **address** in BibTeX depends on the entry type (e.g.,
  for **\@inproceedings** it denotes the **address** of the **conference**,
  while for **\@mastersthesis** / **\@phdthesis** it is the **address** of the
  **school**). The CFF key that receives it therefore varies by entry type,
  especially when institutional metadata is involved. When mapping from CFF to
  BibTeX, we propose to follow the same entry-based logic, using the key
  `location` as a fallback value for **address**.
- For the same reason, **institution**, **organization**, and **school** are all
  mapped to `institution`.
- **series** is mapped to `collection-title` only on those entries that do not
  require **booktitle**. In practice, this means that `collection-title`
  corresponds to **booktitle** (for **\@incollection** and **\@inproceedings**),
  and in the other cases it corresponds to **series**. As a consequence,
  **series** information is lost for **\@incollection** and **\@inproceedings**,
  but in those cases it is an optional field.
- When mapping from CFF to BibTeX, we propose to use `date-published` as a
  fallback for extracting **month** and **year** fields.
- When **pages** is provided as a range separated by `--`, e.g. **pages =
  {3--5}**, it is split into `start: 3` and `end: 5` in CFF.
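
The last two rules can be sketched in base R as follows. Both helpers are
hypothetical illustrations, not **cffr** internals: `split_pages()` assumes a
range written with one or more hyphens, and `date_fallback()` assumes that
`date-published` is an ISO 8601 date (`YYYY-MM-DD`).

```r
split_pages <- function(pages) {
  # Split on one or more hyphens and drop empty pieces
  parts <- trimws(strsplit(pages, "-+")[[1]])
  parts <- parts[nzchar(parts)]
  out <- list(start = parts[1])
  # Only add `end` when a second page is present
  if (length(parts) > 1) out$end <- parts[2]
  out
}

date_fallback <- function(date_published) {
  # Derive BibTeX year and month from a full date
  d <- as.Date(date_published, format = "%Y-%m-%d")
  list(
    year = as.integer(format(d, "%Y")),
    month = as.integer(format(d, "%m"))
  )
}

str(split_pages("3--5"))
str(split_pages("73+"))
str(date_fallback("1988-03-14"))
```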

#### BibLaTeX

Additionally, there are other CFF keys that correspond to BibLaTeX fields. We
propose to include these fields in the crosswalk[^5], even though they are not
part of the core BibTeX fields definition.

[^5]: See @biblatexcheatsheet for a preview of the accepted BibLaTeX fields.

```{r}
#| label: tbl-fields_biblatex2cff
#| echo: false
#| results: asis
#| tbl-cap: "BibLaTeX - CFF Field/Key crosswalk"
df_table <- table_master[table_master$table == "fields_biblatex2cff", c(2:3)]
df_table[is.na(df_table)] <- ""
# fix links
df_table$f2 <- gsub("link_to_entry_models", "#entrymodels", df_table$f2)
row.names(df_table) <- NULL

knitr::kable(
  df_table,
  col.names = c("**BibLaTeX Field**", "CFF key")
)
```

## Limitations

The proposed crosswalk is subject to several structural limitations arising from
differences between BibTeX and CFF schemas:

- **Lossy mappings**: Some BibTeX fields (e.g., **crossref**, **annote**,
  **key**) have no equivalent in CFF and are omitted during conversion.
- **Ambiguous semantics**: Certain fields (e.g., **address**, **series**) have
  context-dependent meanings in BibTeX, requiring heuristic mapping to CFF keys
  such as `institution`, `location`, or `collection-title`.
- **Type collapsing**: Multiple BibTeX entry types (e.g., **\@misc**,
  **\@incollection**) are mapped to a single CFF type (`generic`), which acts as
  a fallback when no more specific type is available.
- **Structural transformations**: Some fields require transformation rather than
  direct mapping (e.g., **pages** → `start`/`end`), altering the representation
  of the original data.
- **Non-reversibility**: Due to the above factors, round-trip conversion (BibTeX
  → CFF → BibTeX) is not guaranteed to preserve the original structure or
  semantics.

These limitations reflect fundamental differences between the two formats rather
than implementation-specific constraints.

## Design decisions

The mapping implemented in **cffr** is guided by the following principles:

- Prefer deterministic mappings whenever possible
- Use heuristics only when unavoidable
- Normalize metadata to align with CFF conventions
- Prioritize interoperability over exact round‑trip fidelity

## Entry Models {#entrymodels}

This section documents entry‑specific mapping behavior, expanding the general
crosswalk into concrete and testable models. Examples are adapted from the
[xampl.bib](https://tug.org/texmf-docs/bibtex/xampl.bib) distributed with BibTeX
[@patashnik].

### \@article {#article}

The crosswalk of **\@article** does not require any special treatment.

```{r}
#| label: tbl-model_article
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_article", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@article** Model"
)
```

**Round-trip**

```{r}
bib <- "@article{article-full,
    title        = {The Gnats and Gnus Document Preparation System},
    author       = {Leslie A. Aamport},
    year         = 1986,
    month        = jul,
    journal      = {{G-Animal's} Journal},
    volume       = 41,
    number       = 7,
    pages        = {73+},
    note         = {This is a full ARTICLE entry}}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

### \@book / \@inbook {#book-inbook}

The main difference between the fields required for **\@book** and **\@inbook**
is that **\@inbook** requires a **chapter** or **pages** field, while **\@book**
does not allow these fields even as optional. Therefore, we propose that an
**\@inbook** entry be represented in CFF as a **\@book** with the following
supplementary keys:

1.  `section`: To denote the specific **chapter** within the book.
2.  `start`/`end`: To indicate the range of **pages** covered by the section.

Additionally, **series** is mapped to `collection-title`, and **address** to the
`address` of the `publisher`. Finally, the key `collection-type` is populated
with `book-series`.

```{r}
#| label: tbl-model_book
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_book", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@book / \\@inbook** Model"
)
```

BibTeX and **BibLaTeX** handle the **\@inbook** entry in notably different ways
(see [Appendix A](#appendix_inbook)). We propose to treat a BibLaTeX
**\@inbook** as a BibTeX **\@incollection**.

**Round-trip: \@book**

```{r}
bib <- "@book{book-full,
    title        = {Seminumerical Algorithms},
    author       = {Donald E. Knuth},
    year         = 1981,
    month        = 10,
    publisher    = {Addison-Wesley},
    address      = {Reading, Massachusetts},
    series       = {The Art of Computer Programming},
    volume       = 2,
    note         = {This is a full BOOK entry},
    edition      = {Second}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

**Round-trip: \@inbook**

```{r}
bib <- "@inbook{inbook-full,
    title        = {Fundamental Algorithms},
    author       = {Donald E. Knuth},
    year         = 1973,
    month        = 10,
    publisher    = {Addison-Wesley},
    address      = {Reading, Massachusetts},
    series       = {The Art of Computer Programming},
    volume       = 1,
    pages        = {10--119},
    note         = {This is a full INBOOK entry},
    edition      = {Second},
    type         = {Section},
    chapter      = {1.2}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

### \@booklet {#booklet}

In **\@booklet** **address** is mapped to `location`.

```{r}
#| label: tbl-model_booklet
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_booklet", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@booklet** Model"
)
```

**Round-trip**

```{r}
bib <- "@booklet{booklet-full,
    title        = {The Programming of Computer Art},
    author       = {Jill C. Knvth},
    date         = {1988-03-14},
    month        = feb,
    address      = {Stanford, California},
    note         = {This is a full BOOKLET entry},
    howpublished = {Vernier Art Center}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

### \@conference / \@inproceedings {#conf_inproc}

Note that in this case, **organization** is mapped to `institution`.
Additionally, **series** is ignored because there is no clear mapping in CFF for
this field.

```{r}
#| label: tbl-model_inproceedings
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_inproceedings", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@conference / \\@inproceedings** Model"
)
```

**Round-trip**

```{r}
bib <- "@inproceedings{inproceedings-full,
    title        = {On Notions of Information Transfer in {VLSI} Circuits},
    author       = {Alfred V. Oaho and Jeffrey D. Ullman and Mihalis Yannakakis},
    year         = 1983,
    month        = mar,
    booktitle    = {Proc. Fifteenth Annual ACM Symposium on the Theory of Computing},
    publisher    = {Academic Press},
    address      = {Boston},
    series       = {All ACM Conferences},
    number       = 17,
    pages        = {133--139},
    editor       = {Wizard V. Oz and Mihalis Yannakakis},
    organization = {The OX Association for Computing Machinery}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

### \@incollection {#incol}

As **booktitle** is a required field, we propose to map that field to
`collection-title` and the `type` to `generic`. Therefore, an **\@incollection**
is a `type: generic` in CFF with a `collection-title` key. The `generic` type is
used as a fallback when no semantically equivalent CFF type exists.

Additionally, **series** and **type** are ignored because there is no clear
mapping in CFF for these fields.

```{r}
#| label: tbl-model_incollection
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_incollection", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@incollection** Model"
)
```

**Round-trip**

```{r}
bib <- "@incollection{incollection-full,
    title        = {Semigroups of Recurrences},
    author       = {Daniel D. Lincoll},
    year         = 1977,
    month        = sep,
    booktitle    = {High Speed Computer and Algorithm Organization},
    publisher    = {Academic Press},
    address      = {New York},
    series       = {Fast Computers},
    number       = 23,
    pages        = {179--183},
    note         = {This is a full INCOLLECTION entry},
    editor       = {David J. Lipcoll and D. H. Lawrie and A. H. Sameh},
    chapter      = 3,
    type         = {Part},
    edition      = {Third}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

### \@manual

As in the case of [**\@conference** / **\@inproceedings**](#conf_inproc),
**organization** is mapped to `institution`.

```{r}
#| label: tbl-model_manual
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_manual", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@manual** Model"
)
```

**Round-trip**

Note that **month** cannot be coerced to a single integer in the range `1--12`
as required in CFF, so it is ignored to avoid validation errors.

```{r}
bib <- "@manual{manual-full,
    title        = {The Definitive Computer Manual},
    author       = {Larry Manmaker},
    year         = 1986,
    month        = {apr-may},
    address      = {Silicon Valley},
    note         = {This is a full MANUAL entry},
    organization = {Chips-R-Us},
    edition      = {Silver}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```
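
The month handling noted above can be illustrated with a minimal sketch. The
helper `month_to_int()` is hypothetical and is not the **cffr** implementation;
it only assumes that a month is accepted when it is a number in `1--12` or an
exact English month name or abbreviation, and that anything else is discarded.

```{r}
# Hypothetical helper (not part of cffr): coerce a single BibTeX month value
# to an integer in 1-12, or NA when no single month can be identified.
month_to_int <- function(x) {
  x <- tolower(trimws(x))
  # One- or two-digit values are kept when they fall in the valid range
  if (grepl("^[0-9]{1,2}$", x)) {
    n <- as.integer(x)
    return(if (n >= 1L && n <= 12L) n else NA_integer_)
  }
  # Otherwise look for an exact English abbreviation or full month name
  idx <- match(x, tolower(month.abb))
  if (is.na(idx)) {
    idx <- match(x, tolower(month.name))
  }
  idx
}

month_to_int("apr")
month_to_int("April")
month_to_int("apr-may")
```

Under these assumptions a compound value such as `apr-may` yields `NA`, which
corresponds to dropping the field rather than emitting an invalid CFF value.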

### \@mastersthesis / \@phdthesis

BibTeX requires the same fields for **\@mastersthesis** and **\@phdthesis**.

We propose to identify each type of thesis using the key `thesis-type`. If
`thesis-type` matches the [regex pattern](https://regex101.com/r/mBWfbs/1)
`(?i)(phd)`, the entry is recognized as **\@phdthesis**.

Additionally, **school** is mapped to `institution`.
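
As a minimal sketch of this rule, the pattern can be tested with base R. The
helper `guess_thesis_entry()` is hypothetical and not part of **cffr**, and the
fallback to **\@mastersthesis** for non-matching values is an assumption made
for illustration.

```{r}
# Hypothetical helper (not part of cffr): classify CFF `thesis-type` values
# with the case-insensitive pattern described above.
guess_thesis_entry <- function(thesis_type) {
  # perl = TRUE is needed for the inline (?i) flag
  is_phd <- grepl("(?i)(phd)", thesis_type, perl = TRUE)
  # Assumption: anything that does not match is treated as a master's thesis
  ifelse(is_phd, "phdthesis", "mastersthesis")
}

guess_thesis_entry(c("PhD Dissertation", "Master's project", "phd thesis"))
```

Note that, with this pattern, a value such as `Doctoral thesis` would not be
recognized as a PhD thesis.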

```{r}
#| label: tbl-model_thesis
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_thesis", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@mastersthesis / \\@phdthesis** Model"
)
```

**Round-trip: \@mastersthesis**

```{r}
bib <- "@mastersthesis{mastersthesis-full,
    title        = {Mastering Thesis Writing},
    author       = {Edouard Masterly},
    year         = 1988,
    month        = jun,
    address      = {English Department},
    note         = {This is a full MASTERSTHESIS entry},
    school       = {Stanford University},
    type         = {Master's project}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

**Round-trip: \@phdthesis**

```{r}
bib <- "@phdthesis{phdthesis-full,
    title        = {Fighting Fire with Fire: Festooning {F}rench Phrases},
    author       = {F. Phidias Phony-Baloney},
    year         = 1988,
    month        = jun,
    address      = {Department of French},
    note         = {This is a full PHDTHESIS entry},
    school       = {Fanstord University},
    type         = {{PhD} Dissertation}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

### \@misc

The crosswalk of **\@misc** does not require any special treatment, as this
entry type has no required fields.

Like [**\@incollection**](#incol), it is mapped to `type: generic`. In this
case, however, **booktitle** is not even an optional field, so the proposed
definition should cover both **\@misc** and **\@incollection** without
conflict.
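
One way to read this is that the presence of `collection-title` separates the
two entries when going back from CFF to BibTeX. The sketch below works on plain
R lists; `guess_generic_entry()` is a hypothetical helper, not a **cffr**
function, and the decision rule is an assumption derived from the text above.

```{r}
# Hypothetical helper (not part of cffr): decide the BibTeX entry for a
# CFF reference of `type: generic`, represented here as a plain list.
guess_generic_entry <- function(ref) {
  # Only `generic` references are handled by this sketch
  if (!identical(ref$type, "generic")) {
    return(NA_character_)
  }
  # Assumption: `collection-title` (from booktitle) signals @incollection
  if (is.null(ref[["collection-title"]])) "misc" else "incollection"
}

ref_misc <- list(type = "generic", title = "Handing out random pamphlets")
ref_incol <- list(
  type = "generic",
  title = "Semigroups of Recurrences",
  `collection-title` = "High Speed Computer and Algorithm Organization"
)

guess_generic_entry(ref_misc)
guess_generic_entry(ref_incol)
```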

```{r}
#| label: tbl-model_misc
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_misc", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@misc** Model"
)
```

**Round-trip**

```{r}
bib <- "@misc{misc-full,
    title        = {Handing out random pamphlets in airports},
    author       = {Joe-Bob Missilany},
    year         = 1984,
    month        = oct,
    note         = {This is a full MISC entry},
    howpublished = {Handed out at O'Hare}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

### \@proceedings

The proposed model is consistent with [**\@conference** /
**\@inproceedings**](#conf_inproc). Note that **\@proceedings** does not
prescribe an **author** field. Because `authors` is required in CFF, we use
*anonymous*[^6] as the author when mapping to CFF and omit it when mapping from
CFF back to BibTeX.

[^6]: As proposed in [*How to deal with unknown individual
    authors?*](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#how-to-deal-with-unknown-individual-authors),
    **(Guide to Citation File Format schema version 1.2.0)**.

```{r}
#| label: tbl-model_proceedings
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_proceedings", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@proceedings** Model"
)
```

**Round-trip**

```{r}
bib <- "@proceedings{proceedings-full,
    title        = {Proc. Fifteenth Annual ACM Symposium on the Theory of Computing},
    year         = 1983,
    month        = mar,
    publisher    = {Academic Press},
    address      = {Boston},
    series       = {All ACM Conferences},
    number       = 17,
    note         = {This is a full PROCEEDINGS entry},
    editor       = {Wizard V. Oz and Mihalis Yannakakis},
    organization = {The OX Association for Computing Machinery}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```
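
The author handling described for **\@proceedings** could be sketched as
follows. Both helpers are hypothetical, operate on plain R lists and do not
reproduce **cffr** internals; they only assume that the placeholder is an
entity whose `name` is `anonymous`.

```{r}
# Hypothetical helper: add a placeholder author when none is available
# (BibTeX -> CFF direction).
add_anonymous <- function(ref) {
  if (length(ref$authors) == 0) {
    ref$authors <- list(list(name = "anonymous"))
  }
  ref
}

# Hypothetical helper: drop the placeholder again (CFF -> BibTeX direction).
drop_anonymous <- function(ref) {
  keep <- vapply(
    ref$authors,
    function(a) !identical(tolower(a$name), "anonymous"),
    logical(1)
  )
  ref$authors <- ref$authors[keep]
  ref
}

ref_proc <- list(type = "proceedings", title = "Proc. of a Symposium")
with_author <- add_anonymous(ref_proc)
with_author$authors[[1]]$name
length(drop_anonymous(with_author)$authors)
```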

### \@techreport

The crosswalk of **\@techreport** does not require any special treatment.

```{r}
#| label: tbl-model_techreport
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_techreport", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@techreport** Model"
)
```

**Round-trip**

```{r}
bib <- "@techreport{techreport-full,
    title        = {A Sorting Algorithm},
    author       = {Tom Terrific},
    year         = 1988,
    month        = oct,
    address      = {Computer Science Department, Fanstord, California},
    number       = 7,
    note         = {This is a full TECHREPORT entry},
    institution  = {Fanstord University},
    type         = {Wishful Research Result}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

### \@unpublished

The crosswalk of **\@unpublished** does not require any special treatment.

```{r}
#| label: tbl-model_unpublished
#| echo: false
#| results: asis
df_table <- table_master[table_master$table == "model_unpublished", c(2:4)]
df_table[is.na(df_table)] <- ""

# fix links
df_table$f3 <- gsub("link_to_entry_models", "#entrymodels", df_table$f3)
df_table$f3 <- gsub("link_to_article", "#article", df_table$f3)
df_table$f3 <- gsub("link_to_booklet", "#booklet", df_table$f3)
df_table$f3 <- gsub("link_to_book", "#book-inbook", df_table$f3)

row.names(df_table) <- NULL
knitr::kable(
  df_table,
  col.names = c("BibTeX", "CFF", "Notes"),
  caption = "**\\@unpublished** Model"
)
```

**Round-trip**

```{r}
bib <- "@unpublished{unpublished-minimal,
    title        = {Lower Bounds for Wishful Research Results},
    author       = {Ulrich Underwood and Ned Net and Paul Pot},
    note         = {Talk at Fanstord University (this is a minimal UNPUBLISHED entry)}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

## Conclusion

This article presents a practical and reproducible crosswalk between BibTeX and
CFF.

Although the formats are not fully equivalent, the deterministic mapping
implemented in **cffr** enables consistent transformations across heterogeneous
citation ecosystems. By making design decisions explicit and documenting
limitations, this work supports interoperable citation workflows bridging legacy
bibliographic systems and modern software-centric practices.

## Appendix A: **\@inbook** in BibTeX and BibLaTeX {#appendix_inbook .appendix}

The definition of **\@inbook** and **\@incollection** in BibTeX [@patashnik1988]
is as follows:

> - **\@inbook**: A part of a book, which may be a chapter (or section) and/or a
>   range of pages. Required fields: author or editor, title, chapter and/or
>   pages, publisher, year (...)
>
> - **\@incollection**: A part of a book having its own title. Required fields:
>   author, title, booktitle, publisher, year (...)

Whereas BibLaTeX [@biblatexpack] specifies:

> - **\@inbook:** A part of a book which forms a self-contained unit with its
>   own title. Note that the [profile]{.underline} of this entry type is
>   [different from standard BibTeX]{.underline}, see § 2.3.1. Required fields:
>   author, title, booktitle, year/date (...).

Regarding required fields, an important difference is that BibLaTeX requires
**booktitle**. Notably, BibTeX **\@incollection** also requires this field.
Moreover, both BibTeX **\@incollection** and BibLaTeX **\@inbook** are
described as *"a part of a book (...) with its own title"*.

In this document, the proposed crosswalk ensures full compatibility with BibTeX.
Hence, we propose to consider a BibLaTeX **\@inbook** entry as equivalent to a
BibTeX **\@incollection**, given the congruence in their definitions and field
requirements.
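
This equivalence can be expressed as a simple normalization step. The helper
`normalize_inbook()` below is hypothetical and not part of **cffr**; detecting
a BibLaTeX-style **\@inbook** through the presence of **booktitle** is an
assumption made for illustration.

```{r}
# Hypothetical helper (not part of cffr): treat a BibLaTeX-style @inbook
# as a BibTeX @incollection.
normalize_inbook <- function(entry_type, fields) {
  entry_type <- tolower(entry_type)
  # Assumption: a booktitle field marks the BibLaTeX profile of @inbook
  if (entry_type == "inbook" && "booktitle" %in% tolower(fields)) {
    return("incollection")
  }
  entry_type
}

normalize_inbook("inbook", c("author", "title", "booktitle", "date"))
normalize_inbook("inbook", c("author", "title", "chapter", "publisher", "year"))
```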

**Round-trip**

```{r}
bib <- "@inbook{inbook-biblatex,
	author       = {Yihui Xie and Christophe Dervieux and Emily Riederer},
	title        = {Bibliographies and citations},
	booktitle    = {{R} Markdown Cookbook},
	date         = {2023-12-30},
	publisher    = {Chapman and Hall/CRC},
	address      = {Boca Raton, Florida},
	series       = {The {R} Series},
	isbn         = 9780367563837,
	url          = {https://yihui.org/rmarkdown-cookbook/},
	chapter      = {4.5}
}"

cff_read_bib_text(bib)

toBibtex(cff_read_bib_text(bib))
```

## Appendix B: CFF key: `type` values {#appendix_cff_type .appendix}

The table below reproduces Table 4 of @druskat2019, the complete list of CFF
reference types for the key `type`. Only a subset of these types is actively
used in the proposed crosswalk.

```{r}
#| label: tbl-cff_types
#| echo: false
#| results: asis
#| tbl-cap: "Complete list of CFF reference types"
df_table <- table_master[table_master$table == "cff_types", c(2:3)]
df_table[is.na(df_table)] <- ""

row.names(df_table) <- NULL

knitr::kable(
  df_table,
  col.names = c("Reference type string", "Description"),
  row.names = NA
)
```
