Working With Eurostat Data

Symmetric input–output tables and their auxiliary tables are among the most complex and structured statistical products. They describe relationships across roughly 63×63 industries, covering the supply and use of GDP/GNP, as well as income distribution, trade and transport margins, and tax receipts for each industry.

Eurostat publishes symmetric input–output data in long form. Working with these data requires knowledge of the Eurostat SNA vocabulary to arrange them in the correct order of rows and columns. The preprocessing functions of the iotables package perform this task.

Beyond managing an extensive and sometimes poorly documented vocabulary, users often face “real-world” data issues such as missing rows or columns in some Member States or spelling inconsistencies in the vocabulary. Many hours of testing and exception handling have gone into the current processing functions, but further issues may still appear. Most testing has been carried out on product x product type SIOTs, as these are the most common among Member States. The functions also support industry x industry SIOTs, though additional vocabulary exceptions may occur in these cases. Please report such issues on GitHub.

You are most likely to work with the product × product tables. For information about the availability of other data, see ?iotables_download.

The analytical functions are described in more detail in the Germany 1990 vignette. The numbering of the equations follows the Eurostat Manual of Supply, Use and Input–Output Tables (Chapter 15). The manual was prepared in 2008 under the ESA95 standards. Eurostat now uses ESA2010 definitions, and some indicators in the manual are no longer available, though this generally does not cause problems.

Another useful reference on working with input–output tables is Input–Output Analytical Tables: Methods and Application to UK National Accounts. At the time of publication, the United Kingdom was a member of the European Union, and its national accounts followed EU ESA standards.

Getting and processing Eurostat data

The following code, if specifically run (it will not run on default), will download the latest relevant data from the Eurostat data warehouse, and save it to a directory named ../not_included . The iotables package uses the other rOpenGov package Eurostat with additional processing of the rather complex SIOT bulk files.

The contents of the not_included directory can be found on GitHub, but they are not released and distributed with the package.

# Not run
not_included_directory <- file.path("..", "not_included")
if (!dir.exists(not_included_directory)) dir.create(not_included_directory)
# The contents of the 'not_included' directory can be found on GitHub,
# but they are not released and distributed with the package.

naio_10_cp1700 <- iotables_download(
  "naio_10_cp1700", # SIOT
  data_directory = not_included_directory
)

# For inclusion in the package, the files must be smaller.
# Reducing the size of the bulk files will not affect
# the demonstration.

naio_10_cp1700 <- naio_10_cp1700 %>%
  dplyr::filter(geo %in% c("CZ", "SK")) %>%
  dplyr::filter(year %in% c(2010, 2015))

# Conforming employment data both sexes from 15 years old, year 2015.
# prod_na vocabulary for product x product conformity
emp_cz <- employment_get(
  geo = "CZ", year = "2015", sex = "Total",
  age = "Y_GE15", labelling = "prod_na",
  data_directory = not_included_directory,
  force_download = TRUE
)

# Conforming employment data #both sexes from 15 years old, year 2017.
emp_sk <- employment_get(
  geo = "SK",
  year = "2017", sex = "Total",
  age = "Y_GE15", labelling = "prod_na",
  data_directory = not_included_directory,
  force_download = TRUE
)

save(naio_10_cp1700, emp_sk, emp_cz,
  file = file.path(
    "..", "inst", "extdata",
    "naio_10_product_x_product.rda"
  )
)

For a quicker building of the vignette, the data is retrieved from the not_included directory. The chunk above can reproduce this data file.

# load from pre-saved file to increase running speed
load(system.file("extdata",
  "naio_10_product_x_product.rda",
  package = "iotables"
))

In this vignette example the Czech (2015) national currency unit and the Slovak (2010) euro tables are created. Since the Slovak national currency is the euro, there is no difference between the Slovak national currency unit and euro tables.

cz_io <- iotable_get(
  source = "naio_10_cp1700",
  geo = "CZ",
  year = 2015,
  unit = "MIO_NAC",
  stk_flow = "TOTAL",
  labelling = "short"
)

sk_io <- iotable_get(
  source = "naio_10_cp1700",
  geo = "SK",
  year = 2015,
  unit = "MIO_EUR",
  stk_flow = "TOTAL",
  labelling = "short"
)

cz_input_flow <- input_flow_get(data_table = cz_io)

sk_input_flow <- input_flow_get(data_table = sk_io)

cz_output <- output_get(data_table = cz_io)
sk_output <- output_get(data_table = sk_io)

By default, total rows and columns are removed when creating use tables. ## Analytic functions {#analytic-functions}

Input coefficients matrix

Iotables removes the columns and rows that are completely empty and creates the input coefficient matrix, which is used for most of the analytical functions.

The input_coefficient_matrix_create() function relies on the following equation. The numbering of the equations is the numbering of the Eurostat Manual.

  1. \(a_{ij}\) = \(X_{ij}\) / \(x_j\) [recap: (43) is the same]

It checks the correct ordering of columns, and furthermore it fills up 0 values with 0.000001 to avoid division with zero.

input_coeff_matrix_cz <- input_coefficient_matrix_create(
  data_table = cz_io
)

input_coeff_matrix_sk <- input_coefficient_matrix_create(
  data_table = sk_io
)
#> Columns and rows of CPA_L68A, CPA_T98, CPA_U are all zeros and will be removed.

knitr::kable(head(input_coeff_matrix_cz[, 1:8]))
prod_na CPA_A01 CPA_A02 CPA_A03 CPA_B CPA_C10-12 CPA_C13-15 CPA_C16
CPA_A01 0.1247278 0.0060946 0.0220441 0.0004689 0.3049289 0.0433435 0.0031216
CPA_A02 0.0027975 0.1304276 0.0066800 0.0023120 0.0002613 0.0003007 0.2438908
CPA_A03 0.0000000 0.0000000 0.0000000 0.0000000 0.0007234 0.0000000 0.0000000
CPA_B 0.0007114 0.0027908 0.1843687 0.0464026 0.0005672 0.0010852 0.0002330
CPA_C10-12 0.1481747 0.0003899 0.0494322 0.0001455 0.1537184 0.0004053 0.0002446
CPA_C13-15 0.0048691 0.0174423 0.0060120 0.0077445 0.0014149 0.2881724 0.0013162

In the Czech SIOTs, they are removed by the statistical authority, so the Czech SIOTs appear smaller.

Most countries do not use the L_68A, CPA_U [Services provided by extraterritorial organisations and bodies] and CPA_T [Services of households as employers; undifferentiated goods and services produced by households for own use]industries, instead they use L_68B for the income component of real estates, and they do not calculate the In order to prevent division by zero errors, they are symmetrically removed from rows and columns. This will not change the results.

Creating the Leontief-matrix and its inverse

The Leontief matrix is derived from Leontief equation system.

  1. \((I-A)x = y\)

The Leontief matrix is defined as \((I-A)\) and it is created with the leontieff_matrix_create() function.

The Leontief inverse is (I-A)-1 and it is created with the leontieff_inverse_create() function from the Leontief-matrix.

The code chunk below prints a small part of the Czech Leontief-inverse.

L_cz <- leontieff_matrix_create(input_coeff_matrix_cz)
#> Warning in leontieff_matrix_create(input_coeff_matrix_cz):
#> leontieff_matrix_create() is spelled correctly as leontief_matrix_create()
I_cz <- leontieff_inverse_create(input_coeff_matrix_cz)
#> Warning in leontieff_inverse_create(input_coeff_matrix_cz):
#> leontieff_inverse_create() is spelled correctly as leontief_inverse_create()

L_sk <- leontieff_matrix_create(input_coeff_matrix_sk)
#> Warning in leontieff_matrix_create(input_coeff_matrix_sk):
#> leontieff_matrix_create() is spelled correctly as leontief_matrix_create()
I_sk <- leontieff_inverse_create(input_coeff_matrix_sk)
#> Warning in leontieff_inverse_create(input_coeff_matrix_sk):
#> leontieff_inverse_create() is spelled correctly as leontief_inverse_create()

knitr::kable(head(I_cz[, 1:8]))
prod_na CPA_A01 CPA_A02 CPA_A03 CPA_B CPA_C10-12 CPA_C13-15 CPA_C16
CPA_A01 1.2259342 0.0145906 0.0556468 0.0077698 0.4477719 0.0841593 0.0160654
CPA_A02 0.0074116 1.1584704 0.0123554 0.0064437 0.0062990 0.0045882 0.3595795
CPA_A03 0.0001602 0.0000050 1.0000531 0.0000047 0.0009176 0.0000166 0.0000071
CPA_B 0.0613068 0.0268029 0.2513917 1.1010207 0.0515437 0.0512433 0.0487318
CPA_C10-12 0.2183097 0.0053221 0.0711943 0.0043237 1.2639393 0.0191994 0.0066710
CPA_C13-15 0.0164678 0.0339691 0.0192012 0.0192664 0.0143365 1.4139195 0.0184852

You can create the Leontief-matrix and its inverse from the output coefficient matrix, too, starting with output_coefficient_matrix_create() if you know what you are doing!

Direct effect indicators

The direct effects can be compared to intermediate production, domestic product or total supply.

The calculation follows the Eurostat Manual on p497-499

  1. \(a_{ij}\) = \(z_{ij}\) / \(x_j\) [recap: (43) is the same]

\(a_{ij}\) = input coefficient \(z_{ij}\) = input of type i in sector j (i.e. products or capital or labour) \(x_j\) = output of sector j

By default, direct_supply_effects_create() will compare inputs to total final demand / supply. You can make comparisons to total product or total output, too. The code below prints a part of the Czech direct effects rounded to 4 digits.

primary_inputs_cz <- coefficient_matrix_create(
  data_table = cz_io,
  total = "output",
  return = "primary_inputs"
)

primary_inputs_sk <- coefficient_matrix_create(
  data_table = sk_io,
  total = "output",
  return = "primary_inputs"
)
#> Columns and rows of CPA_L68A, CPA_T98, CPA_U are all zeros and will be removed.

direct_cz <- direct_effects_create(
  input_requirements = primary_inputs_cz,
  inverse = I_cz,
  digits = 4
)
direct_sk <- direct_effects_create(primary_inputs_sk, I_sk)

knitr::kable(head(direct_cz[, 1:8]), digits = 4)
prod_na CPA_A01 CPA_A02 CPA_A03 CPA_B CPA_C10-12 CPA_C13-15 CPA_C16
P7_effect 1.1504 0.5339 2.6106 3.7138 1.4740 2.7952 0.9758
P7_effect 0.4264 0.2608 0.5582 0.5140 0.4317 0.4720 0.3670
P7_effect 0.3257 0.1997 0.4212 0.3770 0.3281 0.3557 0.2792
P7_effect -0.1631 -0.0083 -0.1921 -0.0118 -0.0594 -0.0132 -0.0033
P7_effect 0.2271 0.1470 0.2732 0.2627 0.2105 0.2105 0.1824
P7_effect 0.6973 0.7187 0.6183 0.4769 0.5992 0.5174 0.6049

Direct effects measure the initial, direct impact of the change in demand and supply for a product. When production goes up, it will create demand in all supply industries (backward linkages) and create opportunities in the industries that use the product themselves (forward linkages.)

This is not the total effect, because some of the increased production will translate into income, which in turn will be used to create further demand in all parts of the domestic economy. The total effect is characterized by multipliers.

Total effects

Input multipliers

The input_multipliers_create function will create the various multipliers for each product.

  1. Z = B(I-A)-1

B = vector of input coefficients compared to final demand / supply.

Z = direct and indirect requirements for wages (or other income)

The calculation follows the Eurostat Manual p 499-502.

The code chunk below prints a part of the Czech multipliers, rounded to 4 digits.

primary_inputs_cz <- coefficient_matrix_create(
  data_table = cz_io,
  total = "output",
  return = "primary_inputs"
)

primary_inputs_sk <- coefficient_matrix_create(
  data_table = sk_io,
  total = "output",
  return = "primary_inputs"
)
#> Columns and rows of CPA_L68A, CPA_T98, CPA_U are all zeros and will be removed.

multipliers_cz <- input_multipliers_create(primary_inputs_cz, I_cz)
multipliers_sk <- input_multipliers_create(primary_inputs_sk, I_sk)

knitr::kable(head(multipliers_cz[, 1:8]), digits = 4)
prod_na CPA_A01 CPA_A02 CPA_A03 CPA_B CPA_C10-12 CPA_C13-15 CPA_C16
P7 4.3810 5.3101 2.3346 1.2878 2.4907 1.8916 4.4461
D1 2.5807 2.7261 2.0634 1.6869 3.8260 2.5307 3.3160
D11 2.5562 2.6990 2.0474 1.7224 3.8490 2.5445 3.3161
D29X39 1.2222 1.4523 1.0535 1.1609 1695.1985 7.1750 -6.9818
P51C 2.4184 2.4343 2.2227 1.8254 4.7038 2.9675 3.8517
B2A3G 2.0953 1.5894 2.2357 2.2601 4.5521 3.2666 4.0008

Employment indicators and multipliers

The creation of the employment indicator is facilitated with the data processing function employment_get. The employment data as input data is not part of the Eurostat SIOT files, and the Eurostat employment data uses a more disaggregated structure. This function downloads and brings the employment data to conforming aggregate vector.

Other inputs, for example, CO2 emissions may be used, but they are likely to be need a manual aggregation. The helper function conforming_vector_create will create an empty vector that you can save as a .csv or Excel file and fill up manually with customary input data.

# New function is needed to add employment vector to SIOT
names(emp_sk)[1] <- "prod_na"
names(emp_cz)[1] <- "prod_na"

emp_indicator_cz <- full_join(
  cz_io,
  emp_cz
) %>%
  coefficient_matrix_create(.,
    return_part = "primary_inputs"
  ) %>%
  filter(prod_na == "employment_total")


emp_effect_cz <- direct_effects_create(emp_indicator_cz, I_cz)

knitr::kable(emp_effect_cz[1:8], digits = 5)
prod_na CPA_A01 CPA_A02 CPA_A03 CPA_B CPA_C10-12 CPA_C13-15 CPA_C16
employment_total_effect 0.00121 0.00111 0.0024 0.0011 0.00119 0.00155 0.00137
# New function is needed to add employment vector to SIOT

emp_multiplier_cz <- input_multipliers_create(
  emp_indicator_cz, I_cz
)

knitr::kable(emp_multiplier_cz[1:8], digits = 5)
prod_na CPA_A01 CPA_A02 CPA_A03 CPA_B CPA_C10-12 CPA_C13-15 CPA_C16
employment_total 2.19587 1.75467 1.3926 1.81069 3.10277 2.03429 2.49164

Output multipliers

output_multipliers_cz <- output_multiplier_create(input_coeff_matrix_cz)
output_multipliers_sk <- output_multiplier_create(input_coeff_matrix_sk)

knitr::kable(head(output_multipliers_cz[, 1:8]), digits = 4)
prod_na CPA_A01 CPA_A02 CPA_A03 CPA_B CPA_C10-12 CPA_C13-15 CPA_C16
output_multipliers 2.6834 2.0952 2.687 2.3066 3.0105 2.7403 2.8346

Interindustrial linkage analysis

Backward linkages

Backward linkages show the effect on industries that are suppliers to the production (industry) in question.

cz_bw <- backward_linkages(I_cz)
sk_bw <- backward_linkages(I_sk)

knitr::kable(head(cz_bw[, 1:8]), digits = 4)
prod_na CPA_A01 CPA_A02 CPA_A03 CPA_B CPA_C10-12 CPA_C13-15 CPA_C16
backward linkages 2.6834 2.0952 2.687 2.3066 3.0105 2.7403 2.8346

Forward linkages

Forward linkages show the effects on other industries that use the product (industry output) at question as an input. Forward linkages can be derived from the ouput coefficient table.

output_coeff_cz <- output_coefficient_matrix_create(
  data_table = cz_io, total = "TFU", digits = 4
)

output_coeff_sk <- output_coefficient_matrix_create(
  data_table = sk_io, total = "TFU", digits = 4
)
#> Columns and rows of CPA_L68A, CPA_T98, CPA_U are all zeros and will be removed.

knitr::kable(head(output_coeff_cz[, 1:8]))
prod_na CPA_A01 CPA_A02 CPA_A03 CPA_B CPA_C10-12 CPA_C13-15 CPA_C16
CPA_A01 0.2408 0.0028 0.0003 0.0003 0.8878 0.0308 0.0025
CPA_A02 0.0280 0.3063 0.0005 0.0069 0.0040 0.0011 1.0091
CPA_A03 0.0000 0.0000 0.0000 0.0000 0.0777 0.0000 0.0000
CPA_B 0.0028 0.0025 0.0051 0.0534 0.0033 0.0015 0.0004
CPA_C10-12 0.0818 0.0001 0.0002 0.0000 0.1279 0.0001 0.0001
CPA_C13-15 0.0097 0.0082 0.0001 0.0046 0.0043 0.2117 0.0011

From the output coefficient matrix we can create the Leontief-matrix for outputs, its inverse, and summarize for forward linkages. These steps are taking place in the forward_linkages function.

cz_fw <- forward_linkages(output_coeff_cz)
sk_fw <- forward_linkages(output_coeff_sk)

knitr::kable(head(cz_fw), digits = 4)

Write results into Excel files

This code will not run, unless you run it separately. The resulting files can be used to check calculations in Excel or other application.

The contents of the not_included directory can be found on GitHub, but they are not released and distributed with the package.

require(xlsx)
cz_file_name <- file.path("..", "not_included", "CzechRep_test.xlsx")
# Czech Republic data
xlsx::write.xlsx(cz_io,
  file = cz_file_name, sheetName = "io_table",
  col.names = TRUE, row.names = TRUE, append = FALSE
)
xlsx::write.xlsx(cz_output,
  file = cz_file_name, sheetName = "cz_output",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(input_coeff_matrix_cz,
  file = cz_file_name,
  sheetName = "input_coeff_matrix_cz",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(L_cz,
  file = cz_file_name, sheetName = "L_cz",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(I_cz,
  file = cz_file_name, sheetName = "I_cz",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(direct_cz,
  file = cz_file_name,
  sheetName = "direct_cz",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(multipliers_cz,
  file = cz_file_name,
  sheetName = "multipliers_cz",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(emp_effect_cz,
  file = cz_file_name,
  sheetName = "emp_effect_cz_2015",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(emp_indicator_cz,
  file = cz_file_name,
  sheetName = "emp_indicator_cz",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(emp_multiplier_cz,
  file = cz_file_name,
  sheetName = "emp_multiplier_cz",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(cz_bw,
  file = cz_file_name,
  sheetName = "cz_backward_linkages",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(output_coeff_cz,
  file = cz_file_name,
  sheetName = "output_coeff_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(cz_fw,
  file = cz_file_name,
  sheetName = "cz_forward_linkages",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(output_multipliers_cz,
  file = cz_file_name,
  sheetName = "output_multipliers_cz",
  col.names = TRUE, row.names = TRUE, append = TRUE
)


sk_file_name <- file.path("..", "not_included", "SlovakRep_test.xlsx")
# Czech Republic data
xlsx::write.xlsx(sk_io,
  file = sk_file_name, sheetName = "io_table",
  col.names = TRUE, row.names = TRUE, append = FALSE
)
xlsx::write.xlsx(sk_output,
  file = sk_file_name, sheetName = "sk_output",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(input_coeff_matrix_sk,
  file = sk_file_name,
  sheetName = "input_coeff_matrix_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(L_sk,
  file = sk_file_name, sheetName = "L_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(I_sk,
  file = sk_file_name, sheetName = "I_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(direct_sk,
  file = sk_file_name,
  sheetName = "direct_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(multipliers_sk,
  file = sk_file_name,
  sheetName = "multipliers_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(emp_effect_sk,
  file = sk_file_name,
  sheetName = "emp_effect_sk_2015",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(emp_indicator_sk,
  file = sk_file_name,
  sheetName = "emp_indicator_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(emp_multiplier_sk,
  file = sk_file_name,
  sheetName = "employment_multipliers_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(sk_bw,
  file = sk_file_name,
  sheetName = "sk_backward_linkages",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(output_coeff_sk,
  file = sk_file_name,
  sheetName = "output_coeff_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(sk_fw,
  file = sk_file_name,
  sheetName = "sk_forward_linkages",
  col.names = TRUE, row.names = TRUE, append = TRUE
)
xlsx::write.xlsx(output_multipliers_sk,
  file = sk_file_name,
  sheetName = "output_multipliers_sk",
  col.names = TRUE, row.names = TRUE, append = TRUE
)