The helsinki R package provides tools to access open data from the Helsinki region in Finland.
For contact information, source code and bug reports, see the project’s GitHub page. For other similar packages and related blog posts, see the rOpenGov project website.
Release version for most users:
Development version for developers and other interested parties:
Load the package.
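A minimal sketch of the installation and loading steps. It assumes that the release version is published on CRAN under the name helsinki and that the development version lives in the rOpenGov/helsinki GitHub repository; both are assumptions, so check the project's GitHub page for the current instructions.

```r
# Release version (assumption: the package is on CRAN as "helsinki")
install.packages("helsinki")

# Development version (assumption: hosted at github.com/rOpenGov/helsinki).
# Uncomment to use it; this route needs the remotes package.
# install.packages("remotes")
# remotes::install_github("rOpenGov/helsinki")

# Load the package
library(helsinki)
```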
The package has basic functions for interacting with WFS APIs, courtesy of the FMI2 package: wfs_api() for returning “wfs_api” objects and to_sf() for turning these objects into sf objects.
All available features of a given API can be listed with the get_feature_list() function. The API functions are not tied to a single service: they can be used with a wide variety of different base.url parameters.
input_url <- "https://kartta.hsy.fi/geoserver/wfs"
hsy_features <- get_feature_list(base.url = input_url)
# Select only features which are related to water utilities and services
hsy_vesihuolto <- hsy_features[which(hsy_features$Namespace == "vesihuolto"), ]
hsy_vesihuolto
# We select our feature of interest from this list: Location of waterposts
feature_of_interest <- "vesihuolto:VH_Vesipostit_HSY"
When the wanted feature and its Name (in other words, the Namespace:Title combination) are known, the feature can be downloaded with get_feature() by providing the correct base.url and passing the Name as the typename parameter.
input_url <- "https://kartta.hsy.fi/geoserver/wfs"
feature_of_interest <- "vesihuolto:VH_Vesipostit_HSY"
# downloading a feature
waterposts <- get_feature(base.url = input_url, typename = feature_of_interest)
# Visualizing the location of waterposts
if (exists("waterposts") && !is.null(waterposts)) {
  plot(waterposts$geom)
}
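To make the roles of base.url and typename more concrete, here is a minimal sketch of how a WFS request can be written as a plain URL. The helper build_wfs_url() is hypothetical and not part of the helsinki package. The parameter names follow the WFS 2.0.0 key-value convention, and whether the package sends exactly these parameters is an assumption.

```r
# Hypothetical helper (not part of the helsinki package): build a WFS
# key-value-pair request URL from a base URL and a list of query parameters.
build_wfs_url <- function(base.url, params) {
  # Percent-encode each value, including reserved characters such as ":"
  values <- vapply(
    params,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )
  query <- paste(names(params), values, sep = "=", collapse = "&")
  paste0(base.url, "?", query)
}

input_url <- "https://kartta.hsy.fi/geoserver/wfs"

# Request the capabilities document that lists the available features
capabilities_url <- build_wfs_url(
  input_url,
  list(service = "WFS", version = "2.0.0", request = "GetCapabilities")
)

# Request one feature type; WFS 2.0.0 names this parameter "typeNames"
feature_url <- build_wfs_url(
  input_url,
  list(
    service = "WFS",
    version = "2.0.0",
    request = "GetFeature",
    typeNames = "vesihuolto:VH_Vesipostit_HSY"
  )
)

capabilities_url
feature_url
```

Pasting either URL into a browser is a quick way to check that a service responds before using it as base.url.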
Dots on a blank canvas do not make much sense, so the helsinki package has the get_city_map() function for downloading city district boundaries. An example of this is provided in the Helsinki region district maps section of this vignette.
The helsinki package provides an easy-to-use, menu-driven select_feature() function that effectively combines get_feature_list() and get_feature(). By default it only returns the Name of the wanted feature, but if the get parameter is set to TRUE, it returns an sf object which can be easily visualized.
input_url <- "https://kartta.hsy.fi/geoserver/wfs"
# Interactive example with select_feature
selected_feature <- select_feature(base.url = input_url)
feature <- get_feature(base.url = input_url, typename = selected_feature)
# Skipping a redundant step with parameter get = TRUE
feature <- select_feature(base.url = input_url, get = TRUE)
The above example shows a general use case which can easily be applied to Helsinki Region Environmental Services (HSY) WFS API as well as other service providers’ APIs.
For legacy reasons, the helsinki package also has some specialized functions that aim to make downloading frequently used data as easy as possible.
Specifically, there are two new functions that replace deprecated functionalities from the get_hsy() function: get_vaestotietoruudukko() (population grid) and get_rakennustietoruudukko() (building information grid).
library(ggplot2)
pop_grid <- get_vaestotietoruudukko(year = 2018)
building_grid <- get_rakennustietoruudukko(year = 2020)
# Logarithmic scales to make the visualizations more discernible
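# Plot only if both downloads succeeded. print() is needed because only
# the last value inside a block is auto-printed.
if (!any(is.null(pop_grid), is.null(building_grid))) {
  print(
    ggplot(pop_grid) +
      geom_sf(aes(colour = log(asukkaita), fill = log(asukkaita)))
  )
  print(
    ggplot(building_grid) +
      geom_sf(aes(colour = log(kerala_yht), fill = log(kerala_yht)))
  )
}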
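One thing to keep in mind with logarithmic scales is that log(0) is -Inf, so grid cells with a value of zero fall off the colour scale. The sketch below uses made-up numbers, not real HSY data, to compare log() with log1p(). Whether the downloaded grids actually contain zeros is an assumption to verify against the data.

```r
# Made-up example values, not real HSY data
asukkaita <- c(0, 5, 50, 500)

log(asukkaita)    # the first element is -Inf because log(0) diverges
log1p(asukkaita)  # log(1 + x): zero maps to zero, ordering is preserved

# Flag the cells that would be lost on a plain log scale
is.finite(log(asukkaita))
```

If zeros matter for the visualization, log1p() can be used in place of log() inside aes().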
With the previous version of the helsinki package, years 2015 to 2020 were supported. In 2022 a new year, 2011, was added, demonstrating how the API may be updated more regularly than the package. The get_feature_list() function can be used to find datasets that are not built into the included functions, and get_feature() to download them.
In addition to the datasets listed in the API getting updated, there are also legacy datasets that were never included in the API. We have added the functionality to download datasets from a wider selection of years, as zip files from a different file repository. These files may differ slightly from those downloaded via the API: for example, they can have different column names and larger grid squares.
library(ggplot2)
pop_grid2 <- get_vaestotietoruudukko(year = 2011)
building_grid2 <- get_rakennustietoruudukko(year = 2011)
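# Plot only if both downloads succeeded. print() is needed because only
# the last value inside a block is auto-printed.
if (!any(is.null(pop_grid2), is.null(building_grid2))) {
  print(
    ggplot(pop_grid2) +
      geom_sf(aes(colour = log(ASUKKAITA), fill = log(ASUKKAITA)))
  )
  print(
    ggplot(building_grid2) +
      geom_sf(aes(colour = log(ASVALJYYS), fill = log(ASVALJYYS)))
  )
}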
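Because the legacy files use upper-case column names (ASUKKAITA) where the API data uses lower-case ones (asukkaita), code written for one source does not work directly on the other. A minimal sketch of one way to align them is shown below, using mock data frames with invented values and a hypothetical normalise_names() helper. It is demonstrated on plain data frames; applying it to sf objects would also rename the geometry column, so check the result before plotting.

```r
# Mock attribute tables standing in for the two sources (invented values;
# the "index" column is made up for illustration)
api_grid    <- data.frame(index = 1:3, asukkaita = c(12, 40, 7))
legacy_grid <- data.frame(INDEX = 1:3, ASUKKAITA = c(10, 35, 9))

# Hypothetical helper: lower-case all column names
normalise_names <- function(x) {
  names(x) <- tolower(names(x))
  x
}

legacy_grid <- normalise_names(legacy_grid)

# Both tables can now be addressed with the same column names
identical(names(api_grid), names(legacy_grid))
mean(legacy_grid$asukkaita)
```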
While easy enough to build, specialized functions such as these are probably not something that power users want to rely on in their workflows. They also add more manual phases to package maintenance and are therefore probably not the direction we are heading in the future. If you feel differently about this and there is a dataset that gets a lot of use, feel free to drop us a suggestion on GitHub.
The get_servicemap() function retrieves regional service data from the City of Helsinki Service Map API, which contains data from the Service Map.
# Search for "puisto" (park) (specify q="query")
search_puisto <- get_servicemap(query = "search", q = "puisto")
# Study results: 47 variables in the data frame
str(search_puisto, max.level = 1)
We can see that this search returns a large number of results, over 2000. The results are returned as pages, where each page has 20 results by default. By giving no additional search parameters, we get 20 results from the first page of search results.
# Get names for the first 20 results
search_puisto$results$name.fi
# See what kind of data is given for services
names(search_puisto$results)
More results can be retrieved and viewed by giving additional search parameters.
search_puisto <- get_servicemap(query = "search", q = "puisto", page_size = 30, page = 2)
str(search_puisto)
search_puisto$results$name.fi
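Since results arrive one page at a time, collecting more than one page means requesting consecutive page numbers. The sketch below shows the general pattern with a hypothetical fetch_pages() helper, which is not part of the helsinki package, and an offline mock in place of the real API, so the numbers are invented.

```r
# Hypothetical helper (not part of the helsinki package): call `fetch_page`
# for consecutive pages until a page comes back short or `max_pages` is hit.
fetch_pages <- function(fetch_page, page_size = 20, max_pages = 5) {
  pages <- list()
  for (page in seq_len(max_pages)) {
    results <- fetch_page(page = page, page_size = page_size)
    # Stop when the page is empty
    if (is.null(results) || nrow(results) == 0) break
    pages[[length(pages) + 1]] <- results
    # A short page means there is nothing left to fetch
    if (nrow(results) < page_size) break
  }
  pages
}

# Offline stand-in for the API: 45 fake results served in pages
mock_results <- data.frame(id = 1:45, name.fi = paste("puisto", 1:45))
mock_fetch <- function(page, page_size) {
  first <- (page - 1) * page_size + 1
  if (first > nrow(mock_results)) return(mock_results[0, ])
  last <- min(page * page_size, nrow(mock_results))
  mock_results[first:last, ]
}

pages <- fetch_pages(mock_fetch, page_size = 20)
length(pages)        # 3 pages
sapply(pages, nrow)  # 20, 20 and 5 rows
```

With the real API, fetch_page could be a small wrapper that returns get_servicemap(query = "search", q = "puisto", page = page, page_size = page_size)$results. Keeping max_pages low avoids sending a large number of requests by accident.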
As the above example shows, the returned data frame has 30 observations with 29 variables. At full width this output can be messy to handle in the R console. One option is to turn it into a more manageable tibble (which is often a good idea); another is to limit the extent of the query at the start.
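A minimal sketch of taming wide output is shown below, using a mock data frame in place of search_puisto$results. The column names and values are invented for illustration. The tibble conversion is optional and only runs if the tibble package is installed.

```r
# Mock stand-in for search_puisto$results (invented values and columns)
results <- data.frame(
  id = 1:3,
  name.fi = c("Puisto A", "Puisto B", "Puisto C"),
  extra_1 = letters[1:3],
  extra_2 = LETTERS[1:3]
)

# Keep only the columns of interest
compact <- results[, c("id", "name.fi")]

# Optional: convert to a tibble for more compact printing, if available
if (requireNamespace("tibble", quietly = TRUE)) {
  compact <- tibble::as_tibble(compact)
}
compact
```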
The get_linkedevents() function retrieves regional event data from the new Linked Events API.
Helsinki region geographic data can be accessed from a WFS API by using the get_city_map() function. Data is available for all 4 cities in the capital region: Helsinki, Espoo, Vantaa and Kauniainen.
Administrative divisions can be accessed on 3 distinct levels: “suuralue”, “tilastoalue” and “pienalue”. Literal, completely unofficial translations for these could be “grand district”, “statistical area” and “(minor) district”. The naming convention of these levels is sometimes confusing even in Finnish documents and different names can vary by city and time.
The main takeaway is that “suuralue” is the highest-level division and “pienalue” is the most granular level of division. “Tilastoalue” is somewhere between these two. These are the names to be used even if the city of interest might not use them in their Finnish or English website.
As promised earlier in API Access, the following example gives an idea of how to visualize waterpost locations (and, of course, other types of spatial data) on a map of the capital region.
helsinki <- get_city_map(city = "helsinki", level = "suuralue")
espoo <- get_city_map(city = "espoo", level = "suuralue")
vantaa <- get_city_map(city = "vantaa", level = "suuralue")
kauniainen <- get_city_map(city = "kauniainen", level = "suuralue")
library(ggplot2)
# Plot only if every layer was downloaded successfully
if (!any(is.null(helsinki), is.null(espoo), is.null(vantaa), is.null(kauniainen), is.null(waterposts))) {
  ggplot() +
    geom_sf(data = helsinki) +
    geom_sf(data = espoo) +
    geom_sf(data = vantaa) +
    geom_sf(data = kauniainen) +
    geom_sf(data = waterposts)
}
In addition, it is possible to download “aanestysalue” (voting district) divisions for the city of Helsinki. Currently this data is not available for other cities and it must be accessed from other sources.
map <- get_city_map(city = "helsinki", level = "suuralue")
voting_district <- get_city_map(city = "helsinki", level = "aanestysalue")
Voting districts are currently not available for cities other than Helsinki.
See help() to get citation information for each function and related data sources.
If no such information is explicitly stated, see the data provider’s website for more information.