--- title: "fmi2: introduction" author: "Joona Lehtomäki" date: "`r Sys.Date()`" output: prettydoc::html_pretty: theme: leonids highlight: github vignette: > %\VignetteIndexEntry{fmi2 introduction} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` # Setup `fmi2` is not yet in CRAN, so you'll need to install it directly from GitHub. While you're at it, make sure you also install all the packages below as we'll be using them in this tutorial. ```{r installs, eval=FALSE} install.packages(c("DT", "ggplot2", "leaflet", "remotes", "sf", "tidyverse")) remotes::install_github("ropengov/fmi2") remotes::install_github("ropensci/skimr") ``` ```{r load-libraries, message=FALSE, warning=FALSE} library(DT) library(fmi2) library(ggplot2) library(knitr) library(leaflet) library(sf) library(skimr) ``` # Getting started You can retrieve weather stating observation data with various temporal resolution using `fmi2`. First thing you need to know is of course which location exactly you want to get the data from. The FMI API provides multiple different ways of defining the spatial query area: + **Bounding box** given by coordindates and defining an area + **Place name** for which to provide data. + **FMISID** numeric FMI observation station identifier + **GEOID** numeric geoid of the location + **WMO** code of the location We'll start off by using the **FMISID** identifies which is given to each [FMI observation stations](https://en.ilmatieteenlaitos.fi/observation-stations). The online table is also available in `fmi2` using the function `fmi_stations()`: ```{r show-stations} station_data <- fmi2::fmi_stations() station_data %>% DT::datatable() ``` We're going to pick "Hanko Tulliniemi" as an example here and use its FMISID (100946) to retrieve the data. As you can see from the table above, it also provides the latlon (geographical) coordinates for the observation station. Before we get the actual data, let's visualize Hanko region. ```{r plot-hanko-map} # Get data for Tulliniemi only tulliniemi_station <- station_data %>% dplyr::filter(fmisid == 100946) # Plot on a map using leaflet leaflet::leaflet(station_data) %>% leaflet::setView(lng = tulliniemi_station$lon, lat = tulliniemi_station$lat, zoom = 11) %>% leaflet::addTiles() %>% leaflet::addMarkers(~lon, ~lat, popup = ~name, label = ~as.character(fmisid)) ``` # Getting daily weather observation data Now that we know how the address a specific observation station, we can proceed to getting the actual data. `fmi2` providers several functions to retrieving data with different variables and temporal resolution. We'll start with `obs_weather_daily()` which returns daily average observation data from a given location. Let's get the daily weather observation data for the first 6 monhts of 2019: ```{r getting-data} # Use Hanko Tulliniemi weather station FMISID tulliniemi_data <- obs_weather_daily(starttime = "2019-01-01", endtime = "2019-06-30", fmisid = 100946) ``` In total, the function returned `r nrow(tulliniemi_data)` observations. You can also note the following: ```{r sf-class} class(tulliniemi_data) ``` which means that the data returned by `obs_weather_daily()` is a spatial `sf` object with the `geometry` column storing the geographical information of the weather station. We'll come back to this later. Now we are interested in what *kind* of data did we actually get? Let's find out: ```{r} unique(tulliniemi_data$variable) ``` So there are six variables with their corresponding values. `fmi2` provides a helper function `describe_variables()` that can be useful in finding out more about the variables: ```{r describe-variables} var_descriptions <- fmi2::describe_variables(tulliniemi_data$variable) var_descriptions %>% DT::datatable() ``` `obs_weather_daily()` returns data in so called long (or melted) format meaning that all variable (i.e. parameter) names are contained in column `variable` and corresponding values in `value` column. You can transform the data into a wide format using `tidyr`: ```{r spread-data} wide_data <- tulliniemi_data %>% tidyr::spread(variable, value) %>% # Let's convert the sf object into a regular tibble sf::st_set_geometry(NULL) wide_data %>% DT::datatable() ``` Looks like there aren't too much data for `rrday`, `snow` or `TG_PT12H_min`. Let's have a closer look at the data: ```{r skim-data, results='asis'} (skimr::skim(wide_data)) ``` Seems like the above mentioned variables indeed don't have data between the defined days. Let's get the same data from a couple of other observation stations around finland. Note that this time we're using place name instead of a FMISID. ```{r multiple stations, warning=FALSE} oulu_data <- obs_weather_daily(starttime = "2019-01-01", endtime = "2019-06-30", place = "Oulu") nuorgam_data <- obs_weather_daily(starttime = "2019-01-01", endtime = "2019-06-30", place = "Nuorgam") # Add location name to each data set and combine them oulu_data$location <- "Oulu" nuorgam_data$location <- "Nuorgam" tulliniemi_data$location <- "Hanko" all_data <- rbind(tulliniemi_data, oulu_data, nuorgam_data) # Factorize location and make order explicit all_data <- all_data %>% dplyr::mutate(location = factor(location, levels = c("Nuorgam", "Oulu", "Hanko"), ordered = TRUE)) ``` Let's plot the daily temperature data in different locations: ```{r plot-data} all_data %>% dplyr::filter(variable == "tday" | variable == "tmax" | variable == "tmin") %>% ggplot(aes(x = time, y = value, color = variable)) + geom_line() + facet_wrap(~ location, ncol=1) + ylab("Temperature (C)\n") + xlab("\nDate") + theme_minimal() ``` # Getting hourly weather observation data Instead of daily values, it is also possible to retrieve weather observation data with finer temporal resolution, such as hourly data, using the function `obs_weather_hourly()`. The data retrieved this has slightly different content as compared to the daily data: ```{r hourly-data} # Get the hourly observations for the first day of 2019 in Hanko Tulliniemi tulliniemi_data <- fmi2::obs_weather_hourly(starttime = "2019-02-01", endtime = "2019-02-02", fmisid = 100946) ``` Again, let's first have a look at what we actually got: ```{r descrive-variables-2} var_descriptions <- fmi2::describe_variables(tulliniemi_data$variable) var_descriptions %>% DT::datatable() ```