Metadata Vocabularies for Input–Output Analysis

library(iotables)

Import and Normalisation Workflow

The five Eurostat vocabularies (ind_ava, ind_use, prd_ava, prd_use, and cpa2_1) were imported directly from the official Eurostat metadata registry (https://dd.eionet.europa.eu/vocabulary/eurostat/).
Each dataset mirrors the structure of its corresponding SDMX codelist and preserves Eurostat’s identifiers and validity information.

Data Sources

Vocabulary	Description	Source URL
`ind_ava`	Industries, adjustments and value added (rows for industry × industry SIOTs)	https://dd.eionet.europa.eu/vocabulary/eurostat/ind_ava/
`ind_use`	Industry uses (columns for industry × industry SIOTs)	https://dd.eionet.europa.eu/vocabulary/eurostat/ind_use/
`prd_ava`	Products, adjustments and value added (rows for product × product SIOTs)	https://dd.eionet.europa.eu/vocabulary/eurostat/prd_ava/
`prd_use`	Product uses (columns for product × product SIOTs)	https://dd.eionet.europa.eu/vocabulary/eurostat/prd_use/
`cpa2_1`	Statistical Classification of Products by Activity (CPA 2.1)	https://dd.eionet.europa.eu/vocabulary/eurostat/cpa2_1/

Import Steps

Raw Download
Each vocabulary was retrieved as an Excel export from EIONET’s vocabulary registry.
Column Standardisation
Columns were renamed to a unified schema: id, label, status, status_modified, notation, group, quadrant, numeric_order, iotables_label, block, uri
Quadrant and Block Assignment
Each item was assigned a quadrant and a semantic block consistent across vocabularies:
- 10 = intermediate (Quadrant 1)
- 20 = primary_inputs (Quadrant 3)
- 30 = final_use (Quadrant 2)
- 50 = extension / diagnostic
Control totals such as “Total supply at basic prices” were retained as block = "control_total".
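
The code-to-block mapping listed above can be held in a small lookup table. The following is a minimal base R sketch, not the package's actual import code. The `block_lookup` and `vocab` objects, the `code` column, and the toy item ids are hypothetical. Only the numeric codes, block names, and quadrant numbers come from the list above. The quadrant of the extension / diagnostic block is not stated, so it is left as `NA`.

```r
# Codes, block names and quadrants as listed in the text.
# The quadrant of the extension block is not stated, hence NA.
block_lookup <- data.frame(
  code     = c(10L, 20L, 30L, 50L),
  block    = c("intermediate", "primary_inputs", "final_use", "extension"),
  quadrant = c(1L, 3L, 2L, NA),
  stringsAsFactors = FALSE
)

# Toy vocabulary rows with a hypothetical `code` column.
vocab <- data.frame(
  id   = c("item_a", "item_b", "item_c"),
  code = c(10L, 20L, 30L),
  stringsAsFactors = FALSE
)

# match() keeps the row order of `vocab`, which merge() would not guarantee.
idx <- match(vocab$code, block_lookup$code)
vocab$block    <- block_lookup$block[idx]
vocab$quadrant <- block_lookup$quadrant[idx]
```

Keeping the mapping in one table means the same assignment can be reused for all vocabularies, which is what makes the blocks consistent across them.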
Ordinal Ordering
numeric_order was reindexed within each quadrant with consistent gaps (10, 20, …) to ensure reproducible ordering for matrix construction.
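
One way to reindex with gaps of 10 is sketched below in base R. The toy `vocab` table and its values are illustrative, and restarting the sequence at 10 inside every quadrant is an assumption about what "within each quadrant" means here.

```r
# Toy table; ids, quadrants and starting orders are illustrative only.
vocab <- data.frame(
  id            = c("a", "b", "c", "d", "e"),
  quadrant      = c(1, 1, 3, 1, 3),
  numeric_order = c(7, 3, 42, 5, 40),
  stringsAsFactors = FALSE
)

# Sort by quadrant, then by the existing order, so the new ranks
# preserve the original relative ordering.
vocab <- vocab[order(vocab$quadrant, vocab$numeric_order), ]

# Within each quadrant, replace the order with 10, 20, 30, ...
vocab$numeric_order <- ave(
  vocab$numeric_order,
  vocab$quadrant,
  FUN = function(x) seq_along(x) * 10
)
rownames(vocab) <- NULL
```

The gaps leave room to insert a new code between two existing ones without renumbering the rest of the quadrant.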
URI Generation
Each code was linked to its SKOS concept using:

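# Assumes `df` holds one vocabulary with a `notation` column and
# `vocabulary_id` is its registry name, for example "ind_ava".
df$uri <- sprintf(
  "https://dd.eionet.europa.eu/vocabularyconcept/eurostat/%s/%s",
  vocabulary_id,
  df$notation
)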

Validation
Each table was checked for:
- missing or duplicate IDs
- monotone numeric order
- alignment of quadrant ↔︎ block semantics
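
The three checks above could be bundled into a single helper. This is a hedged sketch in base R: `validate_vocabulary()` and `allowed_pairs` are hypothetical names and not part of iotables. The sketch treats "monotone" as strictly increasing within each quadrant, in row order, which is an assumption.

```r
validate_vocabulary <- function(vocab, allowed_pairs) {
  problems <- character(0)

  # Missing or duplicate ids.
  if (anyNA(vocab$id) || any(!nzchar(vocab$id))) {
    problems <- c(problems, "missing id")
  }
  if (anyDuplicated(vocab$id) > 0) {
    problems <- c(problems, "duplicate id")
  }

  # numeric_order must strictly increase in row order within each quadrant.
  increasing <- tapply(
    vocab$numeric_order, vocab$quadrant,
    function(x) all(diff(x) > 0)
  )
  if (!all(increasing)) {
    problems <- c(problems, "numeric_order not monotone")
  }

  # Every (quadrant, block) pair must be among the allowed pairs.
  seen    <- paste(vocab$quadrant, vocab$block)
  allowed <- paste(allowed_pairs$quadrant, allowed_pairs$block)
  if (!all(seen %in% allowed)) {
    problems <- c(problems, "quadrant/block mismatch")
  }

  problems
}

# Allowed pairs taken from the block list in this document.
allowed_pairs <- data.frame(
  quadrant = c(1, 2, 3),
  block    = c("intermediate", "final_use", "primary_inputs"),
  stringsAsFactors = FALSE
)

# Toy tables: one clean, one with a duplicate id and a mismatched block.
good <- data.frame(
  id            = c("a", "b", "c"),
  quadrant      = c(1, 1, 3),
  numeric_order = c(10, 20, 10),
  block         = c("intermediate", "intermediate", "primary_inputs"),
  stringsAsFactors = FALSE
)
bad <- good
bad$id[2]    <- "a"
bad$block[3] <- "final_use"

validate_vocabulary(good, allowed_pairs)
validate_vocabulary(bad, allowed_pairs)
```

Returning a character vector of problems, rather than stopping at the first one, lets all issues in a table be reported in a single pass.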
Storage and Naming

The cleaned tibbles were stored as exported data objects:

data/ind_ava.rda
data/ind_use.rda
data/prd_ava.rda
data/prd_use.rda

Each dataset can be loaded directly with data(<name>).

Adjustments to Vocabularies

Although the four Eurostat vocabularies (ind_ava, ind_use, prd_ava, prd_use) were imported directly from the official Eurostat metadata registry, some modifications were necessary to ensure compatibility with the actual Eurostat input–output datasets. The main data sources, in particular naio_10_cp1750 and naio_10_cp1700, occasionally include variables that are not coded according to the published and standardised vocabularies. While these inconsistencies are usually clear to a manual user, they can create ambiguity in a reproducible workflow where automated matching is required.

For example, the product × product SIOTs for the Slovak Republic contain a more detailed industry breakdown than that defined in prd_ava and prd_use. To maintain alignment across datasets, all 0-, 1-, and 2-digit codes from the cpa2_1 vocabulary were imputed into the four vocabularies. Each entry includes a validity flag in the status column, indicating whether the code is valid in the official Eurostat vocabulary or was adopted from observed but non-standard codes in the data. This approach preserves reproducibility while ensuring complete coverage of all codes encountered in current Eurostat data releases.
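
The imputation described above can be pictured as appending the missing codes and flagging them in the status column. The base R sketch below uses toy stand-ins throughout: `prd_toy`, `cpa_toy`, the notations, and the status values "valid" and "imputed" are all hypothetical, since the actual flag values are not listed here.

```r
# Toy stand-in for an official vocabulary.
prd_toy <- data.frame(
  notation = c("X01", "X02"),
  label    = c("Toy product 1", "Toy product 2"),
  status   = "valid",
  stringsAsFactors = FALSE
)

# Toy stand-in for the cpa2_1 code list.
cpa_toy <- data.frame(
  notation = c("X01", "X02", "X03"),
  label    = c("Toy product 1", "Toy product 2", "Toy product 3"),
  stringsAsFactors = FALSE
)

# Codes present in the CPA stand-in but absent from the vocabulary.
missing_codes <- cpa_toy[!cpa_toy$notation %in% prd_toy$notation, ]
missing_codes$status <- rep("imputed", nrow(missing_codes))

# Extended vocabulary: official rows first, imputed rows appended.
prd_extended <- rbind(prd_toy, missing_codes)

# The status flag allows restricting to officially valid codes when needed.
official_only <- prd_extended[prd_extended$status == "valid", ]
```

Because the flag is kept alongside each code, automated matching can use the extended table while any analysis that needs only the official vocabulary can filter on it.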

Versioning

All four vocabularies correspond to the 2025 Eurostat CPA 2.1 / ESA 2010 edition.
