---
title: "Using PxWebApi v2 with PxWebApiData"
author: "Øyvind Langsrud and Jan Bruusgaard"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using PxWebApi v2 with PxWebApiData}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
keywords: Statbank, PxWeb, PxWebApi, official statistics, json-stat
---



```{r include = FALSE}
library(knitr)
library(PxWebApiData)
options(max.print = 44)

# Re-define the comment functions to control line width and minimize excessive line breaks when printing.
comment <- function(x, fun = base::comment) {
     com <- fun(x)
     nchar_name <- min(103, 2 + max(nchar(com)))
     if (length(com)) 
       if (is.null(names(com)))
         names(com) <- paste0("[", seq_along(com), "]")
     for (name in names(com)) {
         cat(strrep(" ", max(0, (nchar_name - nchar(name)))),
             name, 
             "\n", 
             strrep(" ", max(0, (nchar_name - nchar(com[[name]]) - 2))),
             "\"",
             com[[name]],
             "\"",  "\n", sep = "")
     }
}
info <- function(x) comment(x, PxWebApiData::info)
note <- function(x) comment(x, PxWebApiData::note)
```


## Preface

This vignette describes the functionality in the R package **PxWebApiData**
related to PxWebApi v2. The new functions follow the recommended
snake_case naming convention.

For older functionality, see
[Using PxWebApi v1 with PxWebApiData](pxwebapi_v1.html).

The vignette first describes the function `get_api_data()` for retrieving
data from a pre-made URL. Note that this function is not limited to PxWebApi.
At the end of the vignette, it is also shown how data from Eurostat can be
retrieved using `get_api_data()`.

The main function for PxWebApi v2 in the package is `api_data()`, the
counterpart of `ApiData()` for PxWebApi v1.
The function `api_data()` specifies and retrieves data in one step
by calling `query_url()` and `get_api_data()` internally.

As described later in the vignette, the package also provides dedicated
functions for retrieving metadata associated with PxWebApi v2.



\

# `get_api_data()`: retrieve data from a pre-made URL

When a data URL is already available, the data can be retrieved using
`get_api_data()`, as illustrated in the example below.

```{r eval=TRUE, tidy = FALSE, comment=NA}

url <- paste0(
  "https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en",
  "&valueCodes[Region]=0301,324*",
  "&valueCodes[ContentsCode]=???????",
  "&valueCodes[Tid]=top(2)"
)

get_api_data(url)

```


By default, a list of two data frames is returned: one with labels and one with codes.
To return a single data frame with labels only, use the function `get_api_data_1()`.
The function `get_api_data_2()` returns codes only.
To return a data frame containing both labels and codes, use `get_api_data_12()`.


The output from `get_api_data()` is identical to the output from
`api_data()`, which is shown below.
As shown in the section on additional information below, the functions
`info()` and `note()` can also be applied to the output to display
additional information.


\

# `query_url()`: generate a URL from specifications

The function `query_url()` can be used to generate a data URL.
The URL used in the example above can be generated as follows:


```{r eval=TRUE, tidy = FALSE, comment=NA}
query_url("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
          Region = c("0301", "324*"), 
          ContentsCode = "???????", 
          Tid = "top(2)")
```

The function `query_url()` accepts the same input as `api_data()`, so a query
can be specified in several different ways.
These are described in the section on `api_data()` below.
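
To make the mapping from arguments to URL concrete, the base R sketch below
rebuilds the hand-written URL from the first example. The helper
`sketch_query_url()` is hypothetical and handles character codes only. It is
not the package's implementation, and the real `query_url()` may encode or
order the parameters differently.

```{r eval=TRUE, tidy = FALSE, comment=NA}
# Hypothetical helper, for illustration only (not part of PxWebApiData).
sketch_query_url <- function(base_url, ...) {
  spec <- list(...)
  # Each variable becomes "&valueCodes[<name>]=<code1>,<code2>,..."
  parts <- vapply(names(spec), function(name) {
    paste0("&valueCodes[", name, "]=", paste(spec[[name]], collapse = ","))
  }, character(1))
  paste0(base_url, paste(parts, collapse = ""))
}

sketch_url <- sketch_query_url(
  "https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en",
  Region = c("0301", "324*"),
  ContentsCode = "???????",
  Tid = "top(2)")
sketch_url
```

The sketch only concatenates strings; none of the other specification
types described later (indices, `TRUE`/`FALSE`, labels, lists) are covered.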


\

# `api_data()`: specify and retrieve data in one step


## Specification by codes, `*`, `?`, and `top(n)`

The dataset considered here has three variables: `Region`, `ContentsCode`, and `Tid`.
These variables can be used as input parameters.

Each variable can be specified using codes corresponding to the coding used
in PxWebApi URL queries.

Codes can be specified directly. It is also possible to truncate codes using
an asterisk (`*`) or to mask individual characters using a question mark (`?`).
In the example below, seven characters are masked.

Using `top(2)` returns the first two values from the start position.
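
As a rough local illustration of what the two wildcards mean, the base R
sketch below applies the same patterns to a small vector of codes using
`utils::glob2rx()`. The codes are made up for the example, and the real
matching is done by the API server, so this only mirrors the semantics
described above.

```{r eval=TRUE, tidy = FALSE, comment=NA}
# Made-up codes, used only to illustrate the wildcard semantics.
example_codes <- c("0301", "3240", "3242", "3301", "ABCDEFG")

# "324*": every code starting with "324"
match_star <- example_codes[grepl(utils::glob2rx("324*"), example_codes)]
match_star

# "???????": every code with exactly seven characters
match_mask <- example_codes[grepl(utils::glob2rx("???????"), example_codes)]
match_mask
```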


```{r eval=TRUE, tidy = FALSE, comment=NA}
api_data("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
         Region = c("0301", "324*"), 
         ContentsCode = "???????", 
         Tid = "top(2)")
```


A list of two data frames is returned: one with labels and one with codes.

To return a single data frame with labels only, use the function `api_data_1()`.
The function `api_data_2()` returns codes only.
To return a data frame containing both labels and codes, use `api_data_12()`.

Internally, a data URL is first constructed and the data are then retrieved
using the function `get_api_data()`.

To obtain the generated URL, replace `api_data()` with `query_url()`.
The URL for this example has already been generated using `query_url()` in the example above.



## Specification using (default) indexing

Numeric values are interpreted as indexing, either as row numbers in the
metadata or as indices.
See the parameter `use_index` for further details.

As specified by the parameter `default_query`, unspecified variables are set to
`c(1, -2, -1)`. In the example below, `Tid` is unspecified, which therefore
corresponds to the first and the two last years.


```{r eval=TRUE, tidy = FALSE, comment=NA}
api_data_12("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
           Region = 14:17, 
           ContentsCode = 2)
```
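
Note that negative numbers count from the end here, unlike ordinary base R
indexing, where negative numbers drop elements. A minimal base R sketch of
this interpretation, assuming only the description of `c(1, -2, -1)` given
above (the helper `sketch_index()` and the years are hypothetical and not
taken from the package):

```{r eval=TRUE, tidy = FALSE, comment=NA}
# Hypothetical helper: positive indices count from the start,
# negative indices count from the end (-1 is the last element).
sketch_index <- function(x, idx) {
  pos <- ifelse(idx < 0, length(x) + 1 + idx, idx)
  x[pos]
}

example_years <- as.character(2015:2024)
selected_years <- sketch_index(example_years, c(1, -2, -1))
selected_years
```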



## Specification using `TRUE`, `FALSE`, imaginary values (e.g. `3i`), and labels

`TRUE` selects all values of a variable and is equivalent to `"*"`.
`FALSE` eliminates the variable.
Imaginary values represent `top`; for example, `3i` is equivalent to `"top(3)"`.


```{r eval=TRUE, tidy = FALSE, comment=NA}
api_data_2("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
          Region = FALSE, 
          ContentsCode = TRUE, 
          Tid = 3i)
```
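
The imaginary-number notation is simply a compact way of writing `top(n)`.
A minimal base R sketch of the conversion, assuming only the equivalence
stated above (the helper `sketch_top()` is hypothetical and is not the
package's internal code):

```{r eval=TRUE, tidy = FALSE, comment=NA}
# Hypothetical helper: turn a purely imaginary number into a "top(n)" string.
sketch_top <- function(x) {
  stopifnot(is.complex(x), Re(x) == 0)
  paste0("top(", Im(x), ")")
}

top_string <- sketch_top(3i)
top_string
```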

Labels can also be used as an alternative to codes.


```{r eval=TRUE, tidy = FALSE, comment=NA}

obj <- api_data("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en", 
                Region = c("Asker", "Hurdal"), 
                ContentsCode = TRUE, 
                Tid = 2i)

```

The label version and the code version can then be shown separately:


```{r eval=TRUE, tidy = FALSE, comment=NA}

obj[[1]]
obj[[2]]
```


## Use `default_query = TRUE` to retrieve entire tables

```{r eval=TRUE, tidy = FALSE, comment=NA}
out <- api_data_2("https://data.ssb.no/api/pxwebapi/v2/tables/10172/data?lang=en", 
                   default_query = TRUE)
out[14:20, ]  # 7 rows printed
```

In this case, the `NAstatus` variable is included.
See the `api_data()` parameter `make_na_status`.


## Show additional information

Use `info()` and `note()` (or `comment()`) to list additional dataset information.
```{r, comment=NA}

info(obj)
note(obj)
```

Use `note()` for an explanation of the NA status codes:
```{r, comment=NA}
note(out)
```


## Specification by `list()` for advanced queries

Advanced queries can be specified using named lists, where the names correspond
to the encoding used in PxWebApi URL queries.


```{r eval=TRUE, tidy = FALSE, comment=NA}
 api_data_2("https://data.ssb.no/api/pxwebapi/v2/tables/07459/data?lang=en",
            Region = list(codelist = "agg_KommSummer", 
                          valueCodes = c("K-3101", "K-3103"), 
                          outputValues = "aggregated"),
            Kjonn = TRUE,
            Alder = list(codelist = "agg_TodeltGrupperingB", 
                         valueCodes = c("H17", "H18"),
                         outputValues = "aggregated"),
            ContentsCode = 1,
            Tid = 2i)  
```

In this case, the generated URL is:

```{r eval=TRUE, tidy = FALSE, comment=NA}
 url <- query_url("https://data.ssb.no/api/pxwebapi/v2/tables/07459/data?lang=en",
            Region = list(codelist = "agg_KommSummer", 
                          valueCodes = c("K-3101", "K-3103"), 
                          outputValues = "aggregated"),
            Kjonn = TRUE,
            Alder = list(codelist = "agg_TodeltGrupperingB", 
                         valueCodes = c("H17", "H18"),
                         outputValues = "aggregated"),
            ContentsCode = 1,
            Tid = 2i)
 cat(gsub("&", "\n&", url))
```


To improve readability, `cat()` together with `gsub()` is used to print the
long URL across multiple lines.

This query is constructed using information available in the metadata;
see the section below.


\

# Obtaining metadata

## `meta_frames()`

Metadata for a dataset can be obtained using `meta_frames()`.


```{r eval=TRUE, tidy = FALSE, comment=NA}
mf <- meta_frames("https://data.ssb.no/api/pxwebapi/v2/tables/04861/data?lang=en")
print(mf)
```


Information about whether variables can be eliminated is stored as an attribute
and can be retrieved for all variables at once:

```{r eval=TRUE, tidy = FALSE, comment=NA}
sapply(mf, attr, "elimination") # elimination info for all variables
```

Code list information is stored as a data frame in another attribute:

```{r eval=TRUE, tidy = FALSE, comment=NA}
attr(mf[["Region"]], "code_lists")
```
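
The attribute access pattern used above is plain base R. The sketch below
builds a small toy object with made-up content. It only mimics the general
shape assumed here (a named list of data frames, each carrying an
`elimination` attribute) and is not actual `meta_frames()` output; the column
names are invented for the example.

```{r eval=TRUE, tidy = FALSE, comment=NA}
# Toy object with invented codes and labels (not real metadata).
toy_meta <- list(
  VarA = data.frame(code = c("a1", "a2"), label = c("Value one", "Value two")),
  VarB = data.frame(code = c("b1", "b2"), label = c("First", "Second"))
)
attr(toy_meta[["VarA"]], "elimination") <- TRUE
attr(toy_meta[["VarB"]], "elimination") <- FALSE

# Same pattern as above: one attribute value per variable
toy_elimination <- sapply(toy_meta, attr, "elimination")
toy_elimination
```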


## `meta_code_list()`

Metadata for code lists referenced in this output can be retrieved using
`meta_code_list()`.

## `meta_data()`

To download raw metadata without further processing, use `meta_data()`.

Note that it does not matter whether the input URL refers to data or metadata;
this is handled automatically.
 
 
\

# Eurostat data

The Eurostat REST API offers JSON-stat version 2. It is possible to use this
package to obtain data from Eurostat by using `get_api_data()` or the related
functions `get_api_data_1()`, `get_api_data_2()`, and `get_api_data_12()`.

The example below retrieves the HICP total index for the latest two periods for the EU and Norway. See the [Eurostat guidelines](https://ec.europa.eu/eurostat/web/user-guides/data-browser/api-data-access/api-detailed-guidelines/api-statistics) for more details.

```{r eval=TRUE, tidy = FALSE, comment=NA, encoding = "UTF-8"}

url_eurostat <- paste0(   # Here the long url is split into several lines using paste0 
  "https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/prc_hicp_mv12r", 
  "?format=JSON&lang=EN&lastTimePeriod=2&coicop=CP00&geo=NO&geo=EU")
url_eurostat
get_api_data_12(url_eurostat)

```
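
The Eurostat URL above repeats the `geo` parameter once per value. As a
minimal base R sketch, the same URL can be assembled from a named list of
parameters. The variable names below are illustrative only, and no function
from the package is involved.

```{r eval=TRUE, tidy = FALSE, comment=NA}
eurostat_base <- paste0(
  "https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/",
  "prc_hicp_mv12r")

eurostat_params <- list(format = "JSON", lang = "EN", lastTimePeriod = 2,
                        coicop = "CP00", geo = c("NO", "EU"))

# One "name=value" pair per value, so geo becomes "geo=NO" and "geo=EU"
eurostat_pairs <- unlist(lapply(names(eurostat_params), function(name) {
  paste0(name, "=", eurostat_params[[name]])
}))

sketch_eurostat_url <- paste0(eurostat_base, "?",
                              paste(eurostat_pairs, collapse = "&"))
sketch_eurostat_url
```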


\

# Background

PxWeb and its API, PxWebApi, are used as the output database (Statbank) by many statistical agencies in the Nordic countries and elsewhere, e.g. Statistics Norway, Statistics Finland, and Statistics Sweden. See the [list of installations](https://www.scb.se/en/services/statistical-programs-for-px-files/px-web/pxweb-examples/).

For hints on using PxWebApi v2 in general, see the
[PxWebApi v2 User Guide](https://www.ssb.no/en/api/pxwebapiv2).






