It is meant to require fewer dependencies and to be more robust.
devtools::install_github("Laurae2/LauraeDS", dep = FALSE)

Dependencies installation:
install.packages(c("Matrix", "sparsio", "fst", "data.table", "pbapply", "parallel"))
devtools::install_github("fstpackage/fst@e060e62")
devtools::install_github("Laurae2/ez_xgb/R-package@2017-02-15-v1")
devtools::install_github("Microsoft/LightGBM/R-package@fc59fce") # Jul 14 2017, v2.0.4

Parallel functions are provided to make R fly on multi-core and multi-socket systems, provided there is enough RAM.
| Function | Packages | Description | 
|---|---|---|
| parallel.csv | data.table, fst, parallel | Parallelizes and multithreads the reading of CSV files and writes to fst file format for fast reading. | 
| parallel.threading | parallel | Sets processor affinity correctly on Windows machines. Provides a boost of up to 200% in memory-bound applications. | 
| parallel.destroy | parallel | Stops a parallel cluster, or destroys any available clusters bound to the current R session. | 
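
As a rough illustration of the cluster lifecycle these helpers build on, here is a minimal sketch that uses only the base `parallel` package. It does not call `parallel.threading` or `parallel.destroy`, whose arguments are not documented in this table, and the worker count of 2 is an arbitrary assumption.

```r
library(parallel)

# Start a small socket cluster (2 workers is an arbitrary choice for this sketch)
cl <- makeCluster(2)

# Run a toy computation on the workers
squares <- unlist(parLapply(cl, 1:4, function(i) i^2))

# Always stop the cluster when done to release the worker processes
stopCluster(cl)

print(squares)
```

Stopping the cluster explicitly matters in long sessions, since forgotten workers keep holding RAM.
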
I/O functions allow reading and writing sparse matrix files quickly.
| Function | Packages | Description | 
|---|---|---|
| sparse.read | sparsio, Matrix | Reads SVMLight file format (sparse matrices) | 
| sparse.write | sparsio, Matrix | Writes SVMLight file format (sparse matrices) | 
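
For readers unfamiliar with the SVMLight layout, the sketch below shows what one row of such a file looks like: a label followed by `index:value` pairs for the non-zero entries only. `to_svmlight_lines` is a hypothetical base R helper written for illustration; it is not part of LauraeDS or sparsio and is far slower than either.

```r
# Convert a dense numeric matrix plus labels into SVMLight-style text lines.
# Hypothetical helper for illustration only.
to_svmlight_lines <- function(x, y, zero_based = FALSE) {
  offset <- if (zero_based) 1L else 0L
  vapply(seq_len(nrow(x)), function(i) {
    nz <- which(x[i, ] != 0)                      # keep non-zero entries only
    if (length(nz) == 0) return(as.character(y[i]))
    feats <- paste0(nz - offset, ":", x[i, nz])   # "index:value" pairs
    paste(c(y[i], feats), collapse = " ")
  }, character(1))
}

x <- matrix(c(0, 1.5, 0,
              2, 0,   3), nrow = 2, byrow = TRUE)
y <- c(1, 0)
svm_lines <- to_svmlight_lines(x, y)
print(svm_lines)
```
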
Fold functions generate folds for cross-validation very quickly.
| Function | Packages | Description | 
|---|---|---|
| kfold | None | Generates cross-validated folds (stratified, treatment, pseudo-random, random) | 
| nkfold | None | Generates repeated cross-validated folds (stratified, treatment, pseudo-random, random) | 
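
To make the stratified case concrete, here is a minimal base R sketch of stratified fold assignment. `stratified_folds` and its arguments are hypothetical and do not reflect the actual `kfold` signature; the idea is simply to shuffle within each class and deal the indices out to folds in turn.

```r
# Hypothetical helper: returns a list of k integer index vectors (one per fold).
stratified_folds <- function(y, k = 5, seed = 1) {
  set.seed(seed)
  fold_id <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    idx <- idx[sample.int(length(idx))]               # shuffle within the class
    fold_id[idx] <- rep_len(seq_len(k), length(idx))  # deal out to folds 1..k
  }
  split(seq_along(y), fold_id)
}

y <- rep(c(0, 1), times = c(80, 20))
folds <- stratified_folds(y, k = 5)

# Each fold should keep roughly the overall positive rate
print(sapply(folds, function(i) mean(y[i])))
```

Repeated folds can be sketched the same way by calling the helper several times with different seeds.
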
Optimized metrics might help you get an edge where possible.
| Function | Packages | Description | 
|---|---|---|
| metrics.acc.max | data.table | Maximum Binary Accuracy | 
| metrics.f1.max | data.table | Maximum F1 Score (Harmonic Mean of Precision and Sensitivity) | 
| metrics.fallout.max | data.table | Minimum Fall-Out (False Positive Rate) | 
| metrics.kappa.max | data.table | Maximum Kappa Statistic | 
| metrics.mcc.max | data.table | Maximum Matthews Correlation Coefficient | 
| metrics.missrate.max | data.table | Minimum Miss Rate (False Negative Rate) | 
| metrics.precision.max | data.table | Maximum Precision (Positive Predictive Value) | 
| metrics.sensitivity.max | data.table | Maximum Sensitivity (True Positive Rate) | 
| metrics.specifity.max | data.table | Maximum Specificity (True Negative Rate) | 
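
The sketch below illustrates one plausible reading of "maximum" here: the best value of a metric over all candidate decision thresholds. That reading is an assumption, and `max_accuracy` is a hypothetical, unoptimized base R helper, not the data.table implementation in this package.

```r
# Hypothetical helper: scan every distinct prediction as a threshold
# (plus Inf, meaning "predict everything negative") and keep the best accuracy.
max_accuracy <- function(preds, labels) {
  thresholds <- c(sort(unique(preds)), Inf)
  acc <- vapply(thresholds, function(t) {
    mean((preds >= t) == (labels == 1))
  }, numeric(1))
  best <- which.max(acc)   # first threshold reaching the maximum
  list(value = acc[best], threshold = thresholds[best])
}

preds  <- c(0.1, 0.4, 0.35, 0.8)
labels <- c(0,   0,   1,    1)
res <- max_accuracy(preds, labels)
print(res)
```
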
Computing and/or solving metrics might help you understand which default values are best for a given metric.
| Function | Packages | Description | 
|---|---|---|
| metrics.logloss | None | Logarithmic Loss (logloss) | 
| metrics.logloss.unsafe | None | Logarithmic Loss (logloss) without bound checking | 
| metrics.logloss.solve | stats | Logarithmic Loss Solver | 
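
For reference, binary logarithmic loss can be sketched in a few lines of base R. `logloss_sketch` is a hypothetical helper, and treating "bound checking" as clipping predictions away from 0 and 1 is an assumption about what the safe variant does.

```r
# Hypothetical helper: binary logloss with predictions clipped to [eps, 1 - eps]
logloss_sketch <- function(preds, labels, eps = 1e-15) {
  p <- pmin(pmax(preds, eps), 1 - eps)   # avoid log(0)
  -mean(labels * log(p) + (1 - labels) * log(1 - p))
}

ll <- logloss_sketch(c(0.9, 0.1), c(1, 0))
print(ll)

# Without clipping, a prediction of exactly 0 or 1 on the wrong side gives Inf;
# with clipping the loss stays finite.
ll_extreme <- logloss_sketch(c(0, 1), c(1, 0))
print(is.finite(ll_extreme))
```
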
Generating binary matrices has never been easier: you can pass lists and data.frames directly.
| Function | Packages | Description | 
|---|---|---|
| Laurae.xgb.dmat | xgboost, Matrix | Wrapper for extensible xgb.DMatrix generation. | 
| Laurae.lgb.dmat | lightgbm, Matrix | Wrapper for extensible lgb.Dataset generation. | 
Can't remember every existing hyperparameter? Now you can press Tab to autocomplete hyperparameters.
| Function | Packages | Description | 
|---|---|---|
| Laurae.xgb.train | xgboost, Matrix | Wrapper for xgboost Models | 
Creating losses and metrics can be a tedious task without templates. Use these as template wrappers: focus on the loss or metric itself, then wrap it with a template quickly.
| Function | Packages | Description | 
|---|---|---|
| xgb.wrap.loss | xgboost | Wrapper to quickly make an xgboost loss function. | 
| xgb.wrap.metric | xgboost | Wrapper to quickly make an xgboost metric function. | 
| lgb.wrap.loss | lightgbm | Wrapper to quickly make a LightGBM loss function. | 
| lgb.wrap.metric | lightgbm | Wrapper to quickly make a LightGBM metric function. | 
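
The template idea can be sketched as a pair of function factories. `wrap_loss` and `wrap_metric` are hypothetical names and are not the signatures of the functions in the table above. The sketch assumes the classic xgboost R convention where a custom objective is `function(preds, dtrain)` returning `list(grad, hess)` and a custom metric returns `list(metric, value)`. The `get_label` argument is an assumption added so the example runs without xgboost installed.

```r
# Default label accessor for a real xgb.DMatrix (only evaluated when used)
default_label <- function(d) xgboost::getinfo(d, "label")

# Hypothetical factory: turn f(preds, labels) -> list(grad, hess) into an objective
wrap_loss <- function(f, get_label = default_label) {
  function(preds, dtrain) f(preds, get_label(dtrain))
}

# Hypothetical factory: turn f(preds, labels) -> number into a named metric
wrap_metric <- function(f, name, get_label = default_label) {
  function(preds, dtrain) list(metric = name, value = f(preds, get_label(dtrain)))
}

# Squared error written only in terms of predictions and labels
sq_loss <- function(preds, labels) {
  list(grad = preds - labels, hess = rep(1, length(preds)))
}
rmse <- function(preds, labels) sqrt(mean((preds - labels)^2))

# A plain list stands in for an xgb.DMatrix in this demonstration
fake_data <- list(label = c(1, 0, 1))
from_list <- function(d) d$label

obj  <- wrap_loss(sq_loss, get_label = from_list)
eval <- wrap_metric(rmse, "rmse", get_label = from_list)

print(obj(c(0.5, 0.5, 0.5), fake_data))
print(eval(c(0.5, 0.5, 0.5), fake_data))
```

With xgboost installed, the `get_label` argument can be left at its default and the returned functions passed as the custom objective and evaluation function.
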
Need functions that compute metrics quickly? Here are some.
| Function | Packages | Description | 
|---|---|---|
| metrics.logloss | None | Computes the logarithmic loss. | 
| metrics.logloss.unsafe | None | Computes the logarithmic loss faster by skipping out-of-bounds checks. | 
| metrics.logloss.solve | stats | Solves for a parameter involving the logarithmic loss (minimal loss, constant prediction value, ratio). | 
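
As an example of what solving for a constant prediction value can mean, the sketch below uses `stats::optimize` to find the single probability that minimizes logloss on a set of labels. This is an illustration under that assumption, not the method used by `metrics.logloss.solve`. The known closed-form answer is the mean of the labels, which the numeric result should approximate.

```r
labels <- c(1, 0, 0, 0)

# Logloss when every observation receives the same predicted probability p
const_logloss <- function(p) {
  -mean(labels * log(p) + (1 - labels) * log(1 - p))
}

# One-dimensional minimization over the open interval (0, 1)
opt <- stats::optimize(const_logloss, interval = c(1e-6, 1 - 1e-6))

print(opt$minimum)    # numerically close to mean(labels)
print(opt$objective)  # the minimal logloss for a constant prediction
```
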
I'm not liable for anything