Package 'sorvi'

Title: Functions for Finnish Open Data
Description: Misc support functions for rOpenGov and open data downloads.
Authors: Leo Lahti [aut, cre], Juuso Parkkinen [aut], Joona Lehtomaki [aut], Pyry Kantanen [aut]
Maintainer: Leo Lahti
License: BSD_2_clause + file LICENSE
Version: 0.8.21
Built: 2024-09-14 02:51:26 UTC
Source: https://github.com/rOpenGov/sorvi

Algorithmic Tools for Open Data in Finland

Description

The sorvi package hosts various functions that are mainly helpful in rOpenGov package maintenance, package authoring and drawing graphs for presentations. Additionally, it has some functions that do not (yet) have their own package but are useful in some contexts.

Author(s)

Leo Lahti, Juuso Parkkinen, Jussi Paananen, Joona Lehtomaki, Einari Happonen, Juuso Haapanen, and Pyry Kantanen

References

See citation("sorvi") https://github.com/rOpenGov/sorvi

Examples

library(sorvi)

Get CRAN download statistics

Description

Produces a tibble or a visualization of package download statistics.

Usage

cran_downloads(
  pkgs = "all",
  output = "tibble",
  sum = "by_month",
  plot.scale = 11,
  use.cache = TRUE
)

Arguments

pkgs

Package name(s). Default is "all", which returns statistics for all rOpenGov packages. You can also pass one or more package names as a character vector.

output

"tibble" (default) or "plot". With sum = "by_month" or sum = "by_year", "plot" produces a line chart; with sum = "total", it produces a bar chart.

sum

"by_month" (default), "by_year" or "total"

plot.scale

Integer, default is 11. Smaller numbers shrink the plot elements; larger numbers enlarge them.

use.cache

Cache downloaded statistics. Default is TRUE.

Details

This function is intended for easy retrieval and visualization of rOpenGov package download statistics from CRAN. It is an evolution of an R script by antagomir. As such it retains some features that were present in the original R script and were deemed useful for rOpenGov's internal use. This function may or may not be useful in other instances.

Value

A tibble, or a ggplot2 line chart or bar chart, depending on output and sum.

Author(s)

Leo Lahti, Pyry Kantanen

Examples

## Not run: 
df <- cran_downloads(pkgs = "eurostat", sum = "total", use.cache = FALSE)
## kable() comes from the knitr package, which sorvi does not attach
knitr::kable(df)

## Compare two packages
p1 <- cran_downloads(pkgs = "eurostat", sum = "by_year", output = "plot")
p2 <- cran_downloads(pkgs = "osmar", sum = "by_year", output = "plot")
gridExtra::grid.arrange(p1, p2, nrow = 2)

## End(Not run)

Get IFPI Finland music consumption statistics

Description

Download chart position data from ifpi.fi.

Usage

get_ifpi_charts(channel = "radio", year = NA, week = NA)

Arguments

channel

Options: "radio", "albumit", "singlet", "fyysiset-albumit"

year

Year as numeric. Default is NA, which returns charts from the current year. Charts are available from 2014 onwards.

week

Week as numeric. Default is NA, which returns the most recent available chart. The week cannot be the current week. Please note that the number of weeks differs between years. For simplicity's sake, valid weeks are set to be between 1 and 53. Use e.g. lubridate::isoweek to check how many weeks a given year has.

Details

Web scraping function inspired by Saurav Kaushik's blog post "Beginner's Guide on Web Scraping in R" on analyticsvidhya.com. Downloads chart data from the Musiikkituottajat - IFPI Finland ry website. Please note that this function works only with the IFPI Finland website.

The output has the following columns:

  • rank: Rank on chart

  • artist: Artist name

  • song_title: Song title

  • rank_last_week: Rank on chart in the previous week, or "RE" if the song has re-entered the chart

  • chart_woc: Weeks on chart

  • week: Week number of observation

  • year: Year of observation

Value

tibble

Author(s)

Pyry Kantanen

See Also

Original tutorial in https://www.analyticsvidhya.com/blog/2017/03/beginners-guide-on-web-scraping-in-r-using-rvest-with-hands-on-knowledge/
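
The week argument depends on how many weeks a given year has. The following is a minimal sketch of that check, assuming the charts follow ISO 8601 week numbering (the documentation above only suggests lubridate::isoweek as a check). 28 December always falls in the last ISO week of its year, so its week number equals the number of weeks in that year. The helper name weeks_in_year is hypothetical and not part of sorvi; it uses base R's "%V" format, which should agree with lubridate::isoweek().

```r
# Hypothetical helper, not part of sorvi: number of ISO 8601 weeks in a year.
# 28 December is always in the last ISO week of its year.
weeks_in_year <- function(year) {
  dec28 <- as.Date(sprintf("%d-12-28", as.integer(year)))
  as.integer(format(dec28, "%V"))
}

weeks_in_year(2020)  # 53
weeks_in_year(2021)  # 52
```

With this check, a call such as get_ifpi_charts(channel = "radio", year = 2020, week = 53) requests a week that exists in that year; the call itself needs network access and is not run here.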


Select Municipalities by Year

Description

From a larger dataset containing historical municipalities, pick a certain year and return an output that contains the most recent information on each municipality.

Usage

get_municipalities(year = 2002, type = "sf")

Arguments

year

A year between 1865 and 2020.

type

either "data.frame", "tibble" or "sf"

Details

See dataset "kunnat1865_2021"

Value

A data.frame, tibble or sf object, depending on type.

Author(s)

Pyry Kantanen

Source

Data attribution: FinnONTO Consortium: https://seco.cs.aalto.fi/projects/finnonto/
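
The year-based selection can be illustrated on a toy table. This is a sketch of one plausible reading of the logic, not the package's implementation: an instance is assumed to be valid in a given year when startyear is at or before that year and endyear is either NA or at or after it. The column names follow kunnat1865_2021; the rows and the helper name valid_in_year are made up for illustration.

```r
# Toy data: illustrative rows only, not taken from kunnat1865_2021.
toy <- data.frame(
  kunta_name_fi = c("Kunta A", "Kunta A", "Kunta B"),
  startyear     = c(1865, 1973, 1908),
  endyear       = c(1972, NA, 1950),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the instances that were valid in the given year.
# Assumption: endyear is inclusive and NA means "still valid".
valid_in_year <- function(df, year) {
  keep <- df$startyear <= year & (is.na(df$endyear) | df$endyear >= year)
  df[keep, , drop = FALSE]
}

valid_in_year(toy, 1930)  # Kunta A (1865-1972) and Kunta B (1908-1950)
valid_in_year(toy, 2002)  # only the open-ended Kunta A instance
```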


GitHub issues statistics

Description

Get statistics about GitHub issues from GitHub API.

Usage

gh_issue_stats(
  owner = "ropengov",
  repo = "geofi",
  issue.type = NA,
  time.from = NA,
  time.to = NA
)

Arguments

owner

Repository owner / organization. Default is "ropengov"

repo

Repository name. Default is "geofi"

issue.type

Type of issues returned: "issue", "PR", or NA (default), which returns both.

time.from

Start date in ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ. With the default NA, "2010-09-01T00:00:00Z" is used.

time.to

End date in ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ. With the default NA, Sys.time() is used.

Details

This function is intended for easy retrieval of information about rOpenGov package issues and pull requests. More specifically, it returns a tibble with the following information on each issue: id, title, status (open or closed), number of comments, who opened it, when it was created, the opener's status (rOpenGov organization member, package contributor, or a regular user who opened, for example, a bug report) and the type of the issue.

The GitHub Issues API handles pull requests and issues similarly, and therefore this function returns both types by default. The types can be filtered with the issue.type parameter.

Kudos for this function go to Jennifer Bryan. The changes made here mostly consist of adding fields (opener_type, issue_type) to the output tibble and wrapping the original scripts in a function. The function is mainly meant to help the rOpenGov team analyze the user feedback received via GitHub issues, so its scope is very limited.

Value

tibble

Author(s)

Original scripts by Jennifer Bryan (jennybc), function by Pyry Kantanen

See Also

GitHub Issues API documentation: https://docs.github.com/en/rest/reference/issues

Original "analyze GitHub stuff with R" repository: https://github.com/jennybc/analyze-github-stuff-with-r
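
Two details from the description above can be sketched in base R: building timestamps in the YYYY-MM-DDTHH:MM:SSZ form expected by time.from and time.to, and telling pull requests apart from plain issues. In the GitHub Issues API, items that are pull requests carry a pull_request field. Whether gh_issue_stats classifies items exactly this way is an assumption; the helper name to_iso8601 and the toy records are hypothetical.

```r
# Hypothetical helper: format a date or date-time as an ISO 8601 UTC timestamp.
to_iso8601 <- function(x) {
  format(as.POSIXct(x, tz = "UTC"), "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
}

to_iso8601("2010-09-01")  # "2010-09-01T00:00:00Z"

# Toy records shaped like parsed Issues API items (made up for illustration).
items <- list(
  list(number = 1, title = "Example bug report"),
  list(number = 2, title = "Example fix",
       pull_request = list(url = "https://example.invalid/pr/2"))
)

# An item with a pull_request field is a PR; otherwise it is a plain issue.
issue_type <- vapply(
  items,
  function(it) if (is.null(it[["pull_request"]])) "issue" else "PR",
  character(1)
)
issue_type  # "issue" "PR"
```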


Municipality dataset

Description

A dataset containing information about each instance of individual municipalities.

Usage

kunnat1865_2021

Format

A simple feature with 1337 rows and 10 variables:

x

Universal Resource Identifier (URI) for each municipality instance in time. For example: http://www.yso.fi/onto/sapo/Maalahti(1908-1972)

kunta_nro

Municipality code, a unique number assigned for each municipality that stays the same as long as the municipality exists. For example: "475"

kunta_name_fi

The official name of the municipality in Finnish. For example: Maalahti

kunta_name_sv

The official name of the municipality in Swedish. For example: Malax

startyear

Start year of the municipality instance, e.g. founding year. For example: 1865

endyear

End year of the municipality instance, can be NA if still valid. For example: 1972

area

Area of the municipality, in square kilometers. For example: 185.00

muutos_kuvaus

A description of the change that occurred at the beginning of this specific instance. For example: "Ahlainen erotettiin Ulvilasta 1908"

muutos_tyyppi

Type of the change. For example: "Jakaantuminen"

muutos_tunniste

Identifiers for the changes that have happened, which can be used to link past and future instances of municipalities together. For example: "Jakaantuminen1534, Jakaantuminen2"

Details

Most of the Finnish municipalities were formed after the 1865 decree on municipal governance in the countryside (Asetus kunnallishallituksesta maalla 1865), but the dataset contains some municipalities that were allegedly formed even before that. There are two instances of "illegal municipalities" (Mustio and Rutakko) that were not recognized as actual municipalities but functioned as such in the late 1800s and early 1900s.

Source

Raw data downloaded from the ONKI.fi website on 04 Aug 2022: http://onki.fi/en/browser/overview/sapo

Data attribution: FinnONTO Consortium: https://seco.cs.aalto.fi/projects/finnonto/

Information on abolished municipalities and municipality name changes from Statistics Finland website: Municipalities and regional divisions based on municipalities in files and classification publications
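
The muutos_tunniste field can hold several identifiers in one string, as in the example value given in the format description. A minimal sketch for splitting such a value into individual identifiers, assuming the separator is always a comma followed by optional whitespace:

```r
# Example value copied from the muutos_tunniste description.
muutos_tunniste <- "Jakaantuminen1534, Jakaantuminen2"

# Split on a comma plus any following whitespace (assumed separator).
ids <- strsplit(muutos_tunniste, ",\\s*")[[1]]
ids  # "Jakaantuminen1534" "Jakaantuminen2"
```

Instances that share one of these identifiers could then be looked up to link past and future instances of a municipality.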


Supporting Data

Description

Load custom data sets.

Usage

load_sorvi_data(data.id, verbose = TRUE)

Arguments

data.id

data ID to download (see details)

verbose

Logical. If TRUE (default), print messages while loading the data.

Details

The following data sets are available:

  • translation_provinces: Translation of Finnish province (maakunta) names (Finnish, English).

Value

Data set. The format depends on the data.

Author(s)

Leo Lahti

References

See citation("sorvi")

Examples

translations <- load_sorvi_data("translation_provinces")

Municipality geometries

Description

A simple feature containing the URIs, municipality codes and geometries of municipalities in time. The starting point and end point of each municipality can be determined by combining polygons1909_2009 with another dataset that contains such information.

Usage

polygons1909_2009

Format

A simple feature with 860 rows and 3 variables:

x

Universal Resource Identifier (URI) for each municipality instance in time. For example: http://www.yso.fi/onto/sapo/Maalahti(1908-1972)

kunta_nro

Municipality code, a unique 3-digit code (001-999) assigned for each municipality that stays the same as long as the municipality exists. For example: "475"

geometry

A single list column with geometries

Source

Original data downloaded from the ONKI.fi website on 04 Aug 2022: http://onki.fi/en/browser/overview/sapo

Data attribution: FinnONTO Consortium: https://seco.cs.aalto.fi/projects/finnonto/
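
Combining the geometries with a dataset that holds start and end years can be sketched as a join on the URI column x. The example below uses plain toy data frames with made-up rows and no geometry column; it assumes that kunnat1865_2021 is the intended companion dataset and that x matches between the two, which is not verified here.

```r
# Toy stand-ins for the two datasets (illustrative rows, geometry omitted).
polygons_toy <- data.frame(
  x         = c("uri/A(1908-1972)", "uri/B(1865-1950)"),
  kunta_nro = c("475", "001"),
  stringsAsFactors = FALSE
)
kunnat_toy <- data.frame(
  x         = c("uri/A(1908-1972)", "uri/B(1865-1950)", "uri/C(1973-)"),
  startyear = c(1908, 1865, 1973),
  endyear   = c(1972, 1950, NA),
  stringsAsFactors = FALSE
)

# Left join: keep every polygon row and attach its start and end year.
joined <- merge(polygons_toy, kunnat_toy, by = "x", all.x = TRUE)
joined
```

A left join keeps every row of the geometry table even when no matching instance is found, so unmatched municipalities show up with NA years instead of silently disappearing.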