| Title: | Plotting Trade-Off AUC-Dimensionality |
|---|---|
| Description: | Performs statistical comparisons of performance and runtime between models. This package aims to choose the best model for a particular dataset with regard to its discriminant power and runtime. |
| Authors: | Garcez Luis [aut, cre] |
| Maintainer: | Garcez Luis |
| License: | MIT + file LICENSE |
| Version: | 0.2.0 |
| Built: | 2026-05-28 08:40:10 UTC |
| Source: | https://github.com/luisgarcez11/tradeoffaucdim |
Apply model and create column with fit
Usage: apply_model(obj, models = c("SL.glm", "SL.rpart"), test_partition_prop = 0.2, perf_measure = "auc")

| obj | object returned by the preceding pipeline function (the example uses obj2) |
|---|---|
| models | models to be analyzed |
| test_partition_prop | proportion of the data held out as the test partition |
| perf_measure | performance measure |

Returns a list with the fitted models and parameters.

Example: apply_model(obj2)
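To make the roles of `test_partition_prop` and `perf_measure = "auc"` concrete, here is a minimal base-R sketch of the general idea: hold out a test partition, fit a model on the rest, and score it by AUC. It is not the package's implementation. It swaps in `stats::glm` directly instead of the `SL.glm` wrapper, and the toy data and the `auc_rank` helper are hypothetical names introduced only for illustration.

```r
# Illustrative sketch only; assumes a binary 0/1 outcome and a single predictor.
set.seed(1)  # fixed seed so the split is reproducible

# Hypothetical toy data
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(1.5 * x))
toy <- data.frame(y = y, x = x)

# Hold out a proportion of rows as the test partition
test_partition_prop <- 0.2
test_idx <- sample(n, size = round(test_partition_prop * n))
train <- toy[-test_idx, ]
test <- toy[test_idx, ]

# Fit a logistic regression on the training rows and predict on the test rows
fit <- glm(y ~ x, data = train, family = binomial())
pred <- predict(fit, newdata = test, type = "response")

# Hypothetical helper: AUC from the rank-sum (Mann-Whitney) statistic
auc_rank <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc <- auc_rank(pred, test$y)
auc
```

The rank-based formula avoids any extra dependency; a dedicated ROC package would usually be preferable in real analyses.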
Banana quality dataset
bananaquality
An object of class data.frame with 8000 rows and 8 columns.
Banana quality dataset subset
bananaquality_sample
An object of class data.frame with 50 rows and 8 columns.
Create a list with bootstrap samples
Usage: bootstrap_data(data, outcome = "Quality", indep_vars = c("Size", "Weight", "Sweetness", "Softness", "HarvestTime", "Ripeness", "Acidity"), n_samples = 50, n_maximum_dim = 5)

| data | a data frame to be analyzed |
|---|---|
| outcome | a string naming the outcome variable |
| indep_vars | a character vector naming the independent variables to be considered |
| n_samples | number of bootstrap samples |
| n_maximum_dim | maximum number of variables to be considered |

Returns a list.

Example: bootstrap_data(bananaquality_sample)
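As a rough illustration of what drawing bootstrap samples means, the following base-R sketch resamples the rows of a data frame with replacement. It is not the code behind `bootstrap_data`, and it ignores `n_maximum_dim`; the toy data and the `draw_bootstrap` helper are hypothetical.

```r
# Illustrative sketch only; NOT the package's implementation.
set.seed(42)  # fixed seed so the resamples are reproducible

# Hypothetical toy data standing in for a real dataset
toy <- data.frame(
  Quality = rep(c("Good", "Bad"), each = 10),
  Size = rnorm(20),
  Weight = rnorm(20)
)

# Hypothetical helper: draw n_samples resamples of the rows, with replacement
draw_bootstrap <- function(data, n_samples = 50) {
  lapply(seq_len(n_samples), function(i) {
    idx <- sample(nrow(data), size = nrow(data), replace = TRUE)
    data[idx, , drop = FALSE]
  })
}

boot <- draw_bootstrap(toy, n_samples = 5)
length(boot)     # number of resamples
dim(boot[[1]])   # each resample has the same dimensions as the input
```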
Performs statistical tests to compare performance and runtime.
Usage: compare_test(obj, x_label_offset = 1, y_label_offset = 10)

| obj | object returned by the preceding pipeline function (the example uses obj5) |
|---|---|
| x_label_offset | x coordinate at which the p-value is plotted |
| y_label_offset | y coordinate at which the p-value is plotted |

Returns a list with the statistical tests performed.

Example: compare_test(obj5)
Define independent variables to be tested
Usage: define_indepvars(obj, p_in = 0.5, p_out = 0.6)

| obj | object returned by the preceding pipeline function (the example uses obj1) |
|---|---|
| p_in | entry p-value used to determine variable order |
| p_out | removal p-value used to determine variable order |

Returns a list.

Example: define_indepvars(obj1)
bootstrap_data
obj1
An object of class list of length 5.

define_indepvars_outcome
obj2
An object of class list of length 7.

apply_model
obj3
An object of class list of length 10.

summary_statistics
obj4
An object of class list of length 11.

plot_curve
obj5
An object of class list of length 15.

compare_test
obj6
An object of class list of length 16.
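The example calls in this reference pass each stored object to the next function (`define_indepvars(obj1)`, `apply_model(obj2)`, and so on), which suggests the functions are meant to be chained. The sketch below strings them together under that assumption. It assumes the package is installed, that `bananaquality_sample` is available once the package is attached, and that all other arguments can stay at their defaults; the `step1` to `step6` names are hypothetical and chosen so the bundled `obj1` to `obj6` objects are not overwritten.

```r
# Sketch of the step-by-step pipeline, assuming each function accepts the
# previous function's output as its first argument (as the examples suggest).
if (requireNamespace("tradeoffaucdim", quietly = TRUE)) {
  library(tradeoffaucdim)

  step1 <- bootstrap_data(bananaquality_sample)  # bootstrap samples
  step2 <- define_indepvars(step1)               # variable order
  step3 <- apply_model(step2)                    # fit models
  step4 <- summary_stats(step3)                  # summary statistics
  step5 <- plot_curve(step4)                     # plot features
  step6 <- compare_test(step5)                   # statistical tests

  names(step6)  # inspect what the final list contains
}
```

`wrapper_aucdim` is documented as wrapping the whole pipeline, so the explicit chain is mainly useful when intermediate objects need to be inspected.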
Return plot features.
Usage: plot_curve(obj)

| obj | object returned by the preceding pipeline function (the example uses obj4) |
|---|---|

Returns a list with graphical features.

Example: plot_curve(obj4)
Return summary statistics
Usage: summary_stats(obj)

| obj | object returned by the preceding pipeline function (the example uses obj3) |
|---|---|

Returns a list with summary statistics and bootstrap confidence intervals.

Example: summary_stats(obj3)
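For readers unfamiliar with bootstrap confidence intervals, the following base-R sketch shows one common variant, the percentile interval, computed over per-resample AUC values. The source does not state which interval method `summary_stats` uses, so the percentile method here is an assumption, and the simulated `auc_boot` values are hypothetical stand-ins for real results.

```r
# Illustrative sketch only; the interval method is an assumption.
set.seed(7)  # fixed seed for reproducibility

# Hypothetical per-resample AUC values, simulated and clipped to [0, 1]
auc_boot <- pmin(pmax(rnorm(200, mean = 0.85, sd = 0.03), 0), 1)

# Mean with a 95% percentile bootstrap interval
summary_row <- c(
  mean = mean(auc_boot),
  lower = unname(quantile(auc_boot, 0.025)),
  upper = unname(quantile(auc_boot, 0.975))
)
summary_row
```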
Wrap the whole pipeline

Usage: wrapper_aucdim(data, outcome, indep_vars, n_samples = 100, n_maximum_dim = 5, p_in = 0.5, p_out = 0.6, models = c("SL.glm"), test_partition_prop = 0.2, perf_measure = "auc", x_label_offset = 1, y_label_offset = 10)

| data | a data frame to be analyzed |
|---|---|
| outcome | a string naming the outcome variable |
| indep_vars | a character vector naming the independent variables to be considered |
| n_samples | number of bootstrap samples |
| n_maximum_dim | maximum number of variables |
| p_in | entry p-value for choosing variable order |
| p_out | exclusion p-value for choosing variable order |
| models | a character vector naming the models to compare |
| test_partition_prop | test partition proportion |
| perf_measure | performance measure to be considered |
| x_label_offset | x coordinate for plotting |
| y_label_offset | y coordinate for plotting |

Returns a list with the final object.