It is meant to require fewer dependencies and to be more robust.
devtools::install_github("Laurae2/LauraeDS", dependencies = FALSE)
Dependencies installation:
install.packages(c("Matrix", "sparsio", "fst", "data.table", "pbapply")) # parallel ships with base R
devtools::install_github("fstpackage/fst@e060e62")
devtools::install_github("Laurae2/ez_xgb/R-package@2017-02-15-v1")
devtools::install_github("Microsoft/LightGBM/R-package@fc59fce") # Jul 14 2017, v2.0.4
Parallel functions make R fly on multi-core and multi-socket systems, as long as enough RAM is available.
Function | Packages | Description |
---|---|---|
parallel.csv | data.table, fst, parallel | Parallelizes and multithreads the reading of CSV files and writes to fst file format for fast reading. |
parallel.threading | parallel | Sets processor affinity correctly on Windows machines. Provides a boost of up to 200% in memory-bound applications. |
parallel.destroy | parallel | Stops a parallel cluster, or destroys any available clusters bound to the current R session. |
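
To show the cluster lifecycle these helpers build on, here is a minimal sketch that uses only base R's `parallel` package, not the LauraeDS functions themselves (their arguments are not documented in this section). The worker count of 2 and the toy task are assumptions chosen to keep the example light.

```r
library(parallel)

# Start a small socket cluster (2 workers is an arbitrary choice for the sketch).
cl <- makeCluster(2)

# Run a toy task on the workers: square each number.
squares <- parLapply(cl, 1:4, function(x) x^2)

# Always release the workers once the work is done.
stopCluster(cl)

print(unlist(squares))
```

Forgetting the `stopCluster()` call leaves worker processes alive, which is the situation a cleanup helper such as `parallel.destroy` is described as handling.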
I/O functions allow reading and writing sparse matrix files quickly.
Function | Packages | Description |
---|---|---|
sparse.read | sparsio, Matrix | Reads SVMLight file format (sparse matrices) |
sparse.write | sparsio, Matrix | Writes SVMLight file format (sparse matrices) |
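
For readers unfamiliar with the SVMLight format, the following base R sketch parses two lines by hand. It is only an illustration of the file layout, not how `sparse.read` is implemented; the helper name `parse_svmlight_sketch` is hypothetical, and treating feature indices as 1-based is an assumption (some tools write 0-based indices).

```r
# Two rows in SVMLight format: "<label> <index>:<value> ..."
# Only non-zero features are listed, which is what makes the format sparse.
lines <- c("1 1:0.5 3:2", "0 2:1.5")

# Hypothetical helper: returns the labels and a (row, col, value) triplet table.
parse_svmlight_sketch <- function(lines) {
  tokens <- strsplit(lines, " ", fixed = TRUE)
  labels <- as.numeric(vapply(tokens, `[`, character(1), 1))
  triplets <- do.call(rbind, lapply(seq_along(tokens), function(i) {
    # Drop the label, then split each "index:value" pair.
    pairs <- strsplit(tokens[[i]][-1], ":", fixed = TRUE)
    data.frame(
      row = i,
      col = as.integer(vapply(pairs, `[`, character(1), 1)),
      value = as.numeric(vapply(pairs, `[`, character(1), 2))
    )
  }))
  list(labels = labels, triplets = triplets)
}

parsed <- parse_svmlight_sketch(lines)

# Rebuild a small dense matrix from the triplets to inspect the result.
dense <- matrix(0, nrow = length(lines), ncol = max(parsed$triplets$col))
dense[cbind(parsed$triplets$row, parsed$triplets$col)] <- parsed$triplets$value
print(dense)
```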
Fold functions generate folds for cross-validation very quickly.
Function | Packages | Description |
---|---|---|
kfold | None | Generates cross-validated folds (stratified, treatment, pseudo-random, random) |
nkfold | None | Generates repeated cross-validated folds (stratified, treatment, pseudo-random, random) |
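
As an illustration of what stratified folds look like, here is a base R sketch. It is not the `kfold` implementation; the helper name `stratified_kfold_sketch`, its arguments, and the round-robin assignment are all assumptions made for the example.

```r
# Hypothetical helper: returns a list of k integer vectors of held-out row indices,
# with each class spread as evenly as possible across the folds.
stratified_kfold_sketch <- function(y, k = 5, seed = 1) {
  set.seed(seed)
  fold_id <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    # Shuffle the rows of this class. sample() on a single number would sample
    # from 1:idx instead, so skip the shuffle for one-row classes.
    if (length(idx) > 1) idx <- sample(idx)
    # Deal the shuffled rows into folds round-robin.
    fold_id[idx] <- rep_len(seq_len(k), length(idx))
  }
  split(seq_along(y), factor(fold_id, levels = seq_len(k)))
}

# 80 negatives and 20 positives: each of the 5 folds should hold 4 positives.
y <- rep(c(0, 1), times = c(80, 20))
folds <- stratified_kfold_sketch(y, k = 5)
print(vapply(folds, function(f) sum(y[f] == 1), integer(1)))
```

A repeated variant in the spirit of `nkfold` could call the same helper several times with different seeds.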
Optimized metrics can help you gain an edge wherever one is available.
Function | Packages | Description |
---|---|---|
metrics.acc.max | data.table | Maximum Binary Accuracy |
metrics.f1.max | data.table | Maximum F1 Score (Precision with Sensitivity Harmonic Mean) |
metrics.fallout.max | data.table | Minimum Fall-Out (False Positive Rate) |
metrics.kappa.max | data.table | Maximum Kappa Statistic |
metrics.mcc.max | data.table | Maximum Matthews Correlation Coefficient |
metrics.missrate.max | data.table | Minimum Miss-rate (False Negative Rate) |
metrics.precision.max | data.table | Maximum Precision (Positive Predictive Rate) |
metrics.sensitivity.max | data.table | Maximum Sensitivity (True Positive Rate) |
metrics.specifity.max | data.table | Maximum Specificity (True Negative Rate) |
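
One plausible reading of "maximum" here is the best value of a metric over all classification thresholds; that interpretation is an assumption, since this section does not spell it out. Under that assumption, a brute-force base R sketch for accuracy could look like the following. The helper name `max_accuracy_sketch` is hypothetical, and the loop is quadratic, whereas the packaged functions rely on data.table.

```r
# Hypothetical helper: tries every distinct prediction as a threshold and
# keeps the one giving the highest accuracy.
max_accuracy_sketch <- function(preds, labels) {
  # Inf is included so that "predict everything negative" is also a candidate.
  thresholds <- c(sort(unique(preds)), Inf)
  acc <- vapply(
    thresholds,
    function(t) mean((preds >= t) == (labels == 1)),
    numeric(1)
  )
  best <- which.max(acc)  # first threshold reaching the maximum
  list(threshold = thresholds[best], accuracy = acc[best])
}

preds  <- c(0.1, 0.4, 0.35, 0.8)
labels <- c(0, 0, 1, 1)
best <- max_accuracy_sketch(preds, labels)
print(best)
```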
Computing or solving a metric can help you understand which default values work best for it.
Function | Packages | Description |
---|---|---|
metrics.logloss | None | Logarithmic Loss (logloss) |
metrics.logloss.unsafe | None | Logarithmic Loss (logloss) without bound checking |
metrics.logloss.solve | stats | Logarithmic Loss Solver |
Generating binary matrices is easier when you can pass lists and data.frames directly.
Function | Packages | Description |
---|---|---|
Laurae.xgb.dmat | xgboost, Matrix | Wrapper for extensible xgb.DMatrix generation. |
Laurae.lgb.dmat | lightgbm, Matrix | Wrapper for extensible lgb.Dataset generation. |
Can't remember every hyperparameter? Now you can press Tab to autocomplete them.
Function | Packages | Description |
---|---|---|
Laurae.xgb.train | xgboost, Matrix | Wrapper for xgboost Models |
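
Tab completion works when hyperparameters are formal arguments of a function rather than entries typed by hand into a list. The sketch below shows that idea only; it is not the `Laurae.xgb.train` signature. The helper name `train_params_sketch` is hypothetical, and only three xgboost hyperparameters (`eta`, `max_depth`, `subsample`) are listed, shown with their usual xgboost defaults.

```r
# Hypothetical helper: named arguments are discoverable with Tab in most R
# consoles and editors, and are collected into a parameter list.
train_params_sketch <- function(eta = 0.3, max_depth = 6, subsample = 1, ...) {
  # Extra hyperparameters can still be passed through `...`.
  c(list(eta = eta, max_depth = max_depth, subsample = subsample), list(...))
}

params <- train_params_sketch(max_depth = 4, gamma = 1)
str(params)
```

The resulting list has the same shape as the `params` list that xgboost training functions accept.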
Creating losses and metrics can be tedious without templates. Use these wrappers as templates: focus on the loss or metric itself, then wrap it quickly.
Function | Packages | Description |
---|---|---|
xgb.wrap.loss | xgboost | Wrapper to quickly create an xgboost loss function. |
xgb.wrap.metric | xgboost | Wrapper to quickly create an xgboost metric function. |
lgb.wrap.loss | LightGBM | Wrapper to quickly create a LightGBM loss function. |
lgb.wrap.metric | LightGBM | Wrapper to quickly create a LightGBM metric function. |
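
The template idea can be sketched in base R. A custom xgboost objective is a function of `(preds, dtrain)` that returns a list with `grad` and `hess`, so a wrapper only needs to fetch the labels and delegate the math. The names `wrap_loss_sketch`, `squared_error`, and `get_labels` below are hypothetical and do not reflect the actual `xgb.wrap.loss` interface; a plain list stands in for the training set so the example runs without xgboost.

```r
# Hypothetical wrapper: turns a function of (preds, labels) into the
# (preds, dtrain) shape expected from a custom objective.
wrap_loss_sketch <- function(grad_hess, get_labels) {
  function(preds, dtrain) {
    labels <- get_labels(dtrain)
    out <- grad_hess(preds, labels)
    list(grad = out$grad, hess = out$hess)
  }
}

# Squared error, 0.5 * (pred - label)^2: gradient is the residual, hessian is 1.
squared_error <- function(preds, labels) {
  list(grad = preds - labels, hess = rep(1, length(preds)))
}

# Stand-in for a training set. With xgboost, get_labels would instead be
# something like function(d) xgboost::getinfo(d, "label").
fake_dtrain <- list(label = c(1, 0, 2))
obj <- wrap_loss_sketch(squared_error, get_labels = function(d) d$label)
res <- obj(c(0.5, 0.5, 0.5), fake_dtrain)
print(res)
```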
Need functions that compute metrics quickly? Here are some.
Function | Packages | Description |
---|---|---|
metrics.logloss | None | Computes the logarithmic loss. |
metrics.logloss.unsafe | None | Computes the logarithmic loss faster by skipping out-of-bounds checks. |
metrics.logloss.solve | stats | Solves for a parameter involving the logarithmic loss (minimal loss, constant prediction value, ratio). |
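
As a rough illustration of these three ideas, here is a base R sketch of a bounded logarithmic loss and of solving for the best constant prediction with `stats::optimize`. The helper name `logloss_sketch`, the clipping value `eps`, and the reading of "bound checking" as clipping predictions away from 0 and 1 are assumptions, not the package's implementation.

```r
# Hypothetical helper: binary logarithmic loss with predictions clipped to
# [eps, 1 - eps] so that log() never returns -Inf.
logloss_sketch <- function(preds, labels, eps = 1e-15) {
  preds <- pmin(pmax(preds, eps), 1 - eps)
  -mean(labels * log(preds) + (1 - labels) * log(1 - preds))
}

labels <- c(1, 0, 0, 0)

# Solve for the constant prediction that minimizes the loss on these labels.
best <- optimize(
  function(p) logloss_sketch(rep(p, length(labels)), labels),
  interval = c(0, 1)
)
print(best$minimum)    # constant prediction found by the solver
print(best$objective)  # loss reached at that constant
```

For the logarithmic loss, the best constant prediction is the share of positive labels (0.25 here), so the solver's answer can be checked against `mean(labels)`.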
I'm not liable for anything