Title: | Longitudinal Dataframes into Arrays for Machine Learning Training |
---|---|
Description: | A tool for transforming 2D longitudinal data into 3D arrays suitable for training long short-term memory (LSTM) neural networks. The output array can be used by the 'keras' package. Long short-term memory neural networks are described in: Hochreiter, S., & Schmidhuber, J. (1997) <doi:10.1162/neco.1997.9.8.1735>. |
Authors: | Luis Garcez [aut, cre, cph] |
Maintainer: | Luis Garcez <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.2.0 |
Built: | 2024-11-17 03:54:40 UTC |
Source: | https://github.com/luisgarcez11/long2lstmarray |
An example dataset containing Amyotrophic Lateral Sclerosis Functional Rating Scale - Revised (ALSFRS-R) item scores.
alsfrs_data
A data frame with 100 rows and 15 variables:
Subject ID
Visit day
Scale items (13 variables)
https://pubmed.ncbi.nlm.nih.gov/10540002/
Generate a matrix with various lags from a variable in the dataframe
get_var_array(data, subj_var, var, time_var, lags, label_length = 1, label_output = FALSE)
data |
A data frame, data frame extension (e.g. a |
subj_var |
A character string referring to the variable that specifies the "subject" variable. |
var |
A character string referring to the variable that contains the variable values. |
time_var |
A character string referring to the variable that contains the time variable values (e.g. visit day, minutes, years). |
lags |
The length of each sliced sequence. |
label_length |
How many values after each sliced sequence are considered to be the label? Defaults to 1. If |
label_output |
logical. if |
If label_output is FALSE, a matrix with the sliced sequences is returned.
If label_output is TRUE, a list with the matrix and a vector with the labels from the same variable is returned.
get_var_array(alsfrs_data, "subjid", "p2", "visdy", lags = 3, label_output = FALSE)
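The idea of slicing one variable per subject can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the `toy` data frame, its column names, and the helper `windows_for_subject` are all hypothetical, labels are ignored, and the package's handling of short series or `label_length` may differ.

```r
# Toy long-format data (hypothetical; not alsfrs_data)
toy <- data.frame(
  id    = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  day   = c(30, 0, 10, 20, 40, 0, 5, 10, 15),
  score = c(4, 1, 2, 3, 5, 10, 20, 30, 40)
)

lags <- 3

# Hypothetical helper: windows of length `lags` from one subject's values
windows_for_subject <- function(d) {
  v <- d$score[order(d$day)]       # sort by the time variable first
  n <- length(v) - lags + 1        # number of full windows available
  if (n < 1) return(NULL)          # too few visits for a single window
  # Row i of idx holds the positions i, i + 1, ..., i + lags - 1
  idx <- outer(seq_len(n), seq_len(lags) - 1, `+`)
  matrix(v[idx], nrow = n)
}

# Slice each subject separately so that no window spans two subjects,
# then stack the per-subject matrices row-wise
per_subject <- lapply(split(toy, toy$id), windows_for_subject)
stacked <- do.call(rbind, per_subject)
dim(stacked)
```

Sorting within each subject before slicing is what makes the time variable matter: the rows for subject 1 above are deliberately out of order in `toy`.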
Get variable values from subject/variable name pair
get_var_sequence(data, subj_var, subj, var)
data |
A data frame, data frame extension (e.g. a |
subj_var |
A character string referring to the variable that specifies the "subject" variable. |
subj |
Any value that the "subject" variable can take. |
var |
A character string referring to the variable that contains the variable values. |
A vector of the values of variable var for which subj_var is equal to subj.
get_var_sequence(sleep, subj_var = "ID", 1, "extra")
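For comparison, the same selection can be written with plain base R subsetting on the built-in `sleep` dataset. This is a sketch of the described behaviour, not the package's code, and it assumes that no reordering or other processing is applied to the result.

```r
# Built-in `sleep` data has the columns extra, group and ID
subject_rows <- sleep[["ID"]] == 1             # rows belonging to subject 1
extra_for_subject <- sleep[["extra"]][subject_rows]
extra_for_subject
```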
Generate a matrix with various lags from a dataframe
longitudinal_array(data, subj_var, vars, time_var, lags, label_length = 1, label_var = NULL, label_output = FALSE, time_var_output = FALSE)
data |
A data frame, data frame extension (e.g. a |
subj_var |
A character string referring to the variable that specifies the "subject" variable. |
vars |
A character vector naming the variables that contain the values to be sliced. |
time_var |
A character string referring to the variable that contains the time variable values (e.g. visit day, minutes, years). Important to get the sequences in the right order. |
lags |
The length of each sliced sequence. |
label_length |
How many values after each sliced sequence are considered to be the label? Defaults to 1. If |
label_var |
A character string referring to the variables that contain the label variable values. |
label_output |
logical. if |
time_var_output |
logical. Is |
If label_output is FALSE, a 3D array with the sliced sequences is returned. The array dimensions are subject, time and variable.
If label_output is TRUE, a list with the array and a vector with the labels is returned.
longitudinal_array(alsfrs_data, "subjid", c("p1", "p2", "p3"), "visdy", lags = 3, label_output = FALSE)
longitudinal_array(alsfrs_data, "subjid", c("p1", "p2", "p3"), "visdy", lags = 3, label_output = FALSE)[1,,]
longitudinal_array(alsfrs_data, "subjid", c("p1", "p2", "p3"), "visdy", lags = 3, label_output = FALSE)[,1,]
longitudinal_array(alsfrs_data, "subjid", c("p1", "p2", "p3"), "visdy", lags = 3, label_output = FALSE)[,,1]
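To make the three index positions used in the examples concrete, the sketch below stacks two per-variable window matrices into a 3D array with base R. The matrices `var_a` and `var_b` are hypothetical stand-ins, and the way the package assembles its array internally is not shown in this manual, so treat this purely as an illustration of the layout.

```r
# Two hypothetical per-variable window matrices
# (rows = sliced sequences, columns = time steps)
var_a <- matrix(1:6, nrow = 2, byrow = TRUE)       # rows: 1 2 3 / 4 5 6
var_b <- matrix(101:106, nrow = 2, byrow = TRUE)   # rows: 101 102 103 / 104 105 106

# Stack along a third dimension: sequence x time step x variable
arr <- array(c(var_a, var_b), dim = c(nrow(var_a), ncol(var_a), 2))

dim(arr)      # sequences, time steps, variables
arr[1, , ]    # first sequence: time steps by variables
arr[, 1, ]    # first time step of every sequence
arr[, , 1]    # every sequence for the first variable
```

This layout matches the (samples, time steps, features) input shape that Keras LSTM layers expect, which is why a 3D array is the target format.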
Generate a matrix with various lags from a sequence
slice_var_sequence(sequence, lags, label_length = 1, label_output = TRUE)
sequence |
A vector representing the sequence to be sliced into many rows. |
lags |
The length of each sliced sequence. |
label_length |
How many values after each sliced sequence are considered to be the label? Defaults to 1. If |
label_output |
logical. if |
If label_output is FALSE, a matrix with the sliced sequences is returned.
If label_output is TRUE, a list with the matrix and a vector with the labels is returned.
slice_var_sequence(sequence = 1:30, lags = 3, label_length = 1, label_output = TRUE)
slice_var_sequence(sequence = 1:30, lags = 3, label_length = 1, label_output = FALSE)
slice_var_sequence(sequence = 1:30, lags = 3, label_length = 2, label_output = FALSE)
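The sliding-window idea behind this kind of slicing can be sketched in a few lines of base R. The helper `slice_windows` below is hypothetical and is not the package's implementation; in particular, it assumes that each label consists of the `label_length` values immediately following a window, and the package's exact output shape and edge-case handling may differ.

```r
# Hypothetical helper: slice a vector into overlapping windows of length
# `lags`, keeping the next `label_length` values as each window's label
slice_windows <- function(x, lags, label_length = 1) {
  n_windows <- length(x) - lags - label_length + 1
  stopifnot(n_windows >= 1)
  starts <- seq_len(n_windows)

  # Row i of each index matrix holds consecutive positions starting at i
  window_idx <- outer(starts, seq_len(lags) - 1, `+`)
  label_idx  <- outer(starts + lags, seq_len(label_length) - 1, `+`)

  list(
    x = matrix(x[window_idx], nrow = n_windows),   # one row per window
    y = matrix(x[label_idx], nrow = n_windows)     # one row of labels per window
  )
}

res <- slice_windows(1:30, lags = 3, label_length = 1)
dim(res$x)      # number of windows by window length
res$x[1, ]      # first window
res$y[1, ]      # label that follows the first window
```

Building index matrices with `outer()` avoids an explicit loop and keeps the result a matrix even when `lags` or `label_length` is 1.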