Title: | Structural Handling of Finnish Personal Identity Codes |
---|---|
Description: | Structural handling of Finnish identity codes (natural persons and organizations); extract information, check ID validity and diagnostics. |
Authors: | Pyry Kantanen [aut, cre], Mans Magnusson [aut], Jussi Paananen [aut], Juho Kopra [ctb], Oskari Luomala [ctb], Tuomo Nieminen [ctb], Leo Lahti [aut] |
Maintainer: | Pyry Kantanen <[email protected]> |
License: | BSD_2_clause + file LICENSE |
Version: | 1.0.7 |
Built: | 2024-11-09 03:06:58 UTC |
Source: | https://github.com/rOpenGov/hetu |
A function that checks whether a bid (Finnish Business ID) is valid. Returns TRUE or FALSE.
bid_ctrl(bid)
bid |
a vector of 1 or more business identity numbers |
bid_ctrl(c("0000000-0", "0000001-9")) # TRUE TRUE
bid_ctrl("0737546-1") # FALSE
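For orientation, the check digit of a Business ID can be reproduced in a few lines of base R. This is only a sketch: it assumes the commonly published Y-tunnus rule (weights 7, 9, 10, 5, 8, 4, 2 and modulus 11), which this reference does not spell out, and `bid_check_sketch` is a hypothetical helper rather than the package's implementation.

```r
# Hypothetical helper; assumes the commonly published Y-tunnus checksum rule.
bid_check_sketch <- function(bid) {
  weights <- c(7, 9, 10, 5, 8, 4, 2)
  vapply(bid, function(x) {
    # Expect seven digits, a hyphen and one check digit.
    if (is.na(x) || !grepl("^[0-9]{7}-[0-9]$", x)) return(FALSE)
    digits <- as.integer(strsplit(substr(x, 1, 7), "")[[1]])
    remainder <- sum(digits * weights) %% 11
    # Remainder 1 never yields a valid ID; remainder 0 maps to check digit 0.
    if (remainder == 1) return(FALSE)
    expected <- if (remainder == 0) 0 else 11 - remainder
    expected == as.integer(substr(x, 9, 9))
  }, logical(1), USE.NAMES = FALSE)
}

bid_check_sketch(c("0000000-0", "0000001-9", "0737546-1")) # TRUE TRUE FALSE
```

The sketch agrees with the three example values above, but it has not been compared against bid_ctrl on other inputs.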
Extract embedded information from Finnish personal identity codes (hetu).
hetu(pin, extract = NULL, allow.temp = FALSE, diagnostic = FALSE)
pin |
Finnish personal identity code(s) as a character vector |
extract |
Extract only selected part of the information.
Valid values are " |
allow.temp |
Allow artificial or temporary PINs (personal numbers
900-999). If |
diagnostic |
Print additional information about possible problems in
PINs. The checks are " |
A data.frame of Finnish personal identity code data or, if the extract parameter is set, the requested part of the information as a vector. Returns an error or NA if the given character vector is not a valid Finnish personal identity code.
hetu |
Finnish personal identity code as a character vector. A correct pin should be in the form DDMMYYCZZZQ, where DDMMYY stands for date, C for century sign, ZZZ for personal number and Q for control character. |
sex |
sex of the person as a character vector ("Male" or "Female"). |
p.num |
Personal number part of the identity code. |
ctrl.char |
Control character for the personal identity code. |
date |
Birthdate. |
day |
Day of the birthdate. |
month |
Month of the birthdate. |
year |
Year of the birthdate. |
century |
Century character of the birthdate: + (1800), - (1900) or A (2000). |
valid.pin |
Does the personal identity code pass all validity
checks: ( |
Pyry Kantanen, Jussi Paananen
pin_ctrl for validating Finnish personal identity codes.
hetu("111111-111C")
hetu("111111-111C")$date
hetu("111111-111C")$sex
# Same as previous, but using extract argument
hetu("111111-111C", extract="sex")
# Process a vector of hetu's
hetu(c("010101-0101", "111111-111C"))
# Process a vector of hetu's and extract sex information from each
hetu(c("010101-0101", "111111-111C"), extract="sex")
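To make the DDMMYYCZZZQ layout concrete, the following base R sketch splits a code into a few of the fields listed above. It is an illustration only: `parse_pin_sketch` is a hypothetical helper, it performs no validation, and it handles only the century signs +, - and A named in this reference.

```r
# Hypothetical helper; character positions follow the DDMMYYCZZZQ layout.
parse_pin_sketch <- function(pin) {
  century_base <- c("+" = 1800, "-" = 1900, "A" = 2000)
  year <- unname(century_base[substr(pin, 7, 7)]) + as.integer(substr(pin, 5, 6))
  p_num <- as.integer(substr(pin, 8, 10))
  iso_date <- sprintf("%04d-%s-%s", as.integer(year),
                      substr(pin, 3, 4), substr(pin, 1, 2))
  data.frame(
    hetu = pin,
    # Unparseable dates become NA instead of raising an error.
    date = as.Date(iso_date, format = "%Y-%m-%d"),
    # Odd personal numbers denote males, even ones females.
    sex = ifelse(p_num %% 2 == 1, "Male", "Female"),
    ctrl.char = substr(pin, 11, 11),
    stringsAsFactors = FALSE
  )
}

parse_pin_sketch(c("010101-0101", "111111-111C"))
```

For the two example codes this yields the birth dates 1901-01-01 and 1911-11-11 and the sexes "Female" and "Male".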
Calculate a valid control character for an incomplete Finnish personal identity code (hetu).
hetu_control_char(pin, with.century = TRUE)
pin |
An incomplete PIN that ONLY has a date, century marker (optional, see parameter with.century) and personal number |
with.century |
If TRUE (default), the function assumes that the PIN input contains a century marker (DDMMYYCZZZ). If FALSE, the function assumes that the PIN contains only date and personal number (DDMMYYZZZ). |
This method of calculating the control character was devised by mathematician Erkki Pale (1962) to detect input errors as well as errors produced by early punch card machines. The long number produced by writing the birth date and the personal number together is divided by 31, and the remainder is used to look up the control character from a separate table containing alphanumeric characters except the letters G, I, O, Q and Z.
The method of calculating the control character does not need century character and therefore the function has an option to omit it.
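The remainder-and-lookup rule described above can be sketched in base R as follows. `control_char_sketch` is a hypothetical helper, and the ordering of the lookup table (digits first, then the remaining letters alphabetically) is an assumption that is consistent with the example codes in this reference.

```r
# Hypothetical helper illustrating the remainder-and-lookup rule.
control_char_sketch <- function(pin, with.century = TRUE) {
  # Lookup table: digits and letters, skipping G, I, O, Q and Z (assumed order).
  lookup <- strsplit("0123456789ABCDEFHJKLMNPRSTUVWXY", "")[[1]]
  # Drop the century marker (7th character) when it is present.
  digits <- if (with.century) {
    paste0(substr(pin, 1, 6), substr(pin, 8, 10))
  } else {
    pin
  }
  # Remainders 0..30 map to table positions 1..31.
  lookup[as.numeric(digits) %% 31 + 1]
}

control_char_sketch("010101-010")                      # "1"
control_char_sketch("111111111", with.century = FALSE) # "C"
```

For "010101-010" the nine digits form the number 10101010, whose remainder modulo 31 is 1, giving the control character "1".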
Control character, either a number 0-9 or a letter.
Pyry Kantanen
hetu for extracting information from Finnish personal identity codes.
hetu_control_char("010101-010")
hetu_control_char("010101010", with.century = FALSE)
Prints information on the tests that are used to confirm or reject the validity of each personal identity code.
hetu_diagnostic(pin, extract = NULL)
pin_diagnostic(pin, extract = NULL)
pin |
Finnish personal identification number, or a vector of identification numbers, as a character vector |
extract |
Extract only selected part of the diagnostic information.
Valid values are " |
A data.frame containing diagnostic checks about PINs.
hetu, the main function on which hetu_diagnostic relies.
diagnosis_example <- c("010101-0102", "111111-111Q", "010101B0101", "320101-0101",
  "011301-0101", "010101-01010", "010101-0011")
## Print all diagnostics for various fake personal identity codes
hetu_diagnostic(diagnosis_example)
# Extract century-related checks
hetu_diagnostic(diagnosis_example, extract = "valid.century")
## Print all diagnoses
pin_diagnostic(diagnosis_example)
Calculate age in years, months, weeks or days from personal identity codes.
pin_age(pin, date = Sys.Date(), timespan = "years", allow.temp = FALSE)
hetu_age(pin, date = Sys.Date(), timespan = "years", allow.temp = FALSE)
pin |
Finnish personal identity code(s) as a character vector |
date |
Date at which age is calculated. If a vector is provided it
must be of the same length as the |
timespan |
Timespan to use to calculate age. The possible timespans are:
|
allow.temp |
Allow artificial or temporary PINs (personal numbers
900-999). If |
Age as an integer vector.
ex_pin <- c("010101-0101", "111111-111C")
pin_age(ex_pin, date = "2012-01-01")
hetu_age(ex_pin, date = "2012-01-01")
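As a rough illustration of an age-in-years calculation, the sketch below counts completed years between a birth date and a reference date in base R. It assumes the birth dates have already been extracted from the codes, and `age_years_sketch` is a hypothetical helper; whether the package counts years in exactly this way is not stated here.

```r
# Hypothetical helper: completed years between birth dates and a reference date.
age_years_sketch <- function(birth, date = Sys.Date()) {
  b <- as.POSIXlt(as.Date(birth))
  d <- as.POSIXlt(as.Date(date))
  # Subtract one year when the birthday has not yet occurred in the reference year.
  before_birthday <- (d$mon < b$mon) | (d$mon == b$mon & d$mday < b$mday)
  as.integer(d$year - b$year - before_birthday)
}

# Birth dates encoded in "010101-0101" and "111111-111C".
age_years_sketch(c("1901-01-01", "1911-11-11"), date = "2012-01-01") # 111 100
```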
Validate Finnish personal identity codes (hetu).
pin_ctrl(pin, allow.temp = FALSE)
hetu_ctrl(pin, allow.temp = FALSE)
pin |
Finnish personal identity code(s) as a character vector |
allow.temp |
If TRUE, temporary PINs (personal numbers 900-999) are
handled similarly to regular PINs (personal numbers 002-899), meaning
that an otherwise valid temporary PIN returns TRUE. Default
is |
A logical vector indicating whether the input vector contains valid Finnish personal identity codes.
Pyry Kantanen
hetu for extracting information from Finnish personal identity codes.
pin_ctrl("010101-0101") # TRUE
pin_ctrl("010101-010A") # FALSE
pin_ctrl(c("010101-0101", "010101-010A")) # TRUE FALSE
hetu_ctrl("010101-0101") # TRUE
hetu_ctrl("010101-010A") # FALSE
hetu_ctrl(c("010101-0101", "010101-010A")) # TRUE FALSE
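The distinction that allow.temp acts on can be illustrated by classifying the personal number part of a code by range. This is a sketch under the ranges given above (002-899 regular, 900-999 temporary); `personal_number_class` and its labels are hypothetical and not part of the package.

```r
# Hypothetical helper: classify the personal number (characters 8-10 of a PIN).
personal_number_class <- function(pin) {
  p_num <- as.integer(substr(pin, 8, 10))
  ifelse(p_num >= 900, "temporary",
         ifelse(p_num >= 2, "regular", "out of range"))
}

# A caller mimicking allow.temp = FALSE would reject the "temporary" class.
personal_number_class(c("010101-0101", "010101-9501", "010101-0011"))
# "regular" "temporary" "out of range"
```

The sketch ignores the control character and the date, so it says nothing about whether a code is otherwise valid.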
Returns the date of birth in date format.
pin_date(pin, allow.temp = FALSE)
hetu_date(pin, allow.temp = FALSE)
pin |
Finnish personal identity code(s) as a character vector |
allow.temp |
Allow artificial or temporary PINs (personal numbers
900-999). If |
Date of birth as a vector in date format.
pin_date(c("010101-0101", "111111-111C"))
hetu_date(c("010101-0101", "111111-111C"))
Extract sex (as binary) from a Finnish personal identity code.
pin_sex(pin, allow.temp = TRUE)
hetu_sex(pin, allow.temp = TRUE)
pin |
Finnish personal identity code(s) as a character vector |
allow.temp |
Allow artificial or temporary PINs (personal numbers
900-999). If |
Factor with labels 'Male' and 'Female'.
Pyry Kantanen, Leo Lahti
hetu for general information extraction.
pin_sex("010101-010A")
hetu_sex("010101-010A")
A function that generates random Finnish Business IDs (bid numbers, Y-tunnus).
rbid(n)
n |
number of generated BIDs |
A vector of generated BID numbers.
x <- rbid(3)
bid_ctrl(x)
A function that generates random Finnish personal identity codes (hetu codes).
rpin(
  n,
  start.date = as.Date("1895-01-01"),
  end.date = Sys.Date(),
  p.male = 0.4,
  p.temp = 0,
  num.cores = 1
)

rhetu(
  n,
  start.date = as.Date("1895-01-01"),
  end.date = Sys.Date(),
  p.male = 0.4,
  p.temp = 0,
  num.cores = 1
)
n |
number of generated |
start.date |
Lower limit of generated |
end.date |
Upper limit of generated |
p.male |
Probability of males, between 0.0 and 1.0. Default is 0.4. |
p.temp |
Probability of temporary identification numbers, between 0.0 and 1.0. Default is 0.0. |
num.cores |
The number of cores for parallel processing. The number
of available cores can be determined with |
There is a finite number of valid personal identity codes available per day. More specifically, there are 449 odd personal numbers for males and 449 even personal numbers for females in the range 002-899. Additionally, there are 50 odd numbers for males and 50 even numbers for females in the temporary personal identity code number range 900-999, which is not normally in use. If you try to generate too many codes for too short a timeframe, this function fails with the error "too few positive probabilities" raised by the sample.int function.
The theoretical upper limit of valid PINs is in the millions, since there are 898 PINs available for each day and 327770 for each year. In practice this number is much lower, since the same personal number component cannot be "recycled" if it has been used in the past. To illustrate, if the identity code "010101-0101" has already been assigned to someone born on 1901-01-01, the similar code "010101A0101" could not be used for someone born on 2001-01-01.
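The per-day and per-year counts can be checked with a few lines of base R. This is plain arithmetic on the number ranges and does not call the package; the variable names are illustrative.

```r
regular <- 2:899      # regular personal numbers 002-899
temporary <- 900:999  # temporary personal numbers 900-999

sum(regular %% 2 == 1)    # odd (male) regular numbers: 449
sum(regular %% 2 == 0)    # even (female) regular numbers: 449
sum(temporary %% 2 == 1)  # odd temporary numbers: 50
sum(temporary %% 2 == 0)  # even temporary numbers: 50
length(regular)           # 898 regular numbers per day
length(regular) * 365     # 327770 per non-leap year
```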
A vector of generated hetu PINs.
Pyry Kantanen, Jussi Paananen
x <- rpin(3)
hetu(x)
hetu(x, extract = "sex")
hetu(x, extract = "ctrl.char")
x <- rhetu(3)
x
Calculate a valid control character for an incomplete Finnish Unique Identification Number (FINUID, or sähköinen asiointitunnus SATU).
satu_control_char(pin, print.full = FALSE)
pin |
An incomplete FINUID that has 8 first numbers. |
print.full |
Should the function print the whole FINUID number (TRUE) or only the control character (FALSE). Default is FALSE. |
This method of calculating the control character was devised by mathematician Erkki Pale (1962) to detect input errors as well as errors produced by early punch card machines. The long number produced by writing the birth date and the personal number together is divided by 31, and the remainder is used to look up the control character from a separate table containing alphanumeric characters except the letters G, I, O, Q and Z.
The method of calculating the control character does not need century character and therefore the function has an option to omit it.
Control character, either a number 0-9 or a letter (length 1 character). If parameter print.full is set to TRUE, the function returns a complete FINUID / SATU number (length 9 characters).
Pyry Kantanen
For more detailed information about FINUID, see the website of the Finnish Digital and Population Data Services Agency: https://dvv.fi/en/citizen-certificate-and-electronic-identity
# The first assigned FINUID number, 10000001N.
satu_control_char("10000001")
A function that checks whether a satu (Finnish Unique Identification Number) is valid. Returns TRUE or FALSE.
satu_ctrl(satu)
satu |
a vector of 1 or more Unique Identification Numbers |
satu_ctrl("10000001N") # TRUE
satu_ctrl(c("10000001N", "20000001B")) # TRUE FALSE
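A validity check of this kind can be sketched in base R by recomputing the control character from the first eight digits and comparing it with the ninth character. The sketch assumes the modulo-31 lookup rule and a digits-then-letters table order; `satu_check_sketch` is a hypothetical helper, not the package's implementation.

```r
# Hypothetical helper; assumes the modulo-31 lookup rule over the first 8 digits.
satu_check_sketch <- function(satu) {
  # Lookup table: digits and letters, skipping G, I, O, Q and Z (assumed order).
  lookup <- strsplit("0123456789ABCDEFHJKLMNPRSTUVWXY", "")[[1]]
  # Expect eight digits followed by one control character.
  well_formed <- !is.na(satu) & grepl("^[0-9]{8}[0-9A-Y]$", satu)
  number <- suppressWarnings(as.numeric(substr(satu, 1, 8)))
  expected <- lookup[number %% 31 + 1]
  well_formed & !is.na(expected) & expected == substr(satu, 9, 9)
}

satu_check_sketch(c("10000001N", "20000001B")) # TRUE FALSE
```

Here 10000001 leaves remainder 21 when divided by 31, which maps to "N", while 20000001 leaves remainder 10, which maps to "A" rather than "B".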