Parallelizes the writing of separate CSV files (reading is still sequential) in order to store them in fst format (this also overwrites fst::threads_fst). Requires the data.table and fst packages.
parallel.csv(file, compress = 35, progress_bar = TRUE, clean_mem = FALSE, cl = NULL,
  max_threads = max(ifelse(is.null(cl), parallel::detectCores(),
    ifelse(!is.list(cl), round(parallel::detectCores()/cl),
      round(parallel::detectCores()/length(cl)))), 1),
  wkdir = NULL, ...)
file | Type: vector of characters. Path to all files to read. |
---|---|
compress | Type: numeric. Compression rate to use. Defaults to 35. |
progress_bar | Type: logical. Whether to print a progress bar. Defaults to TRUE. |
clean_mem | Type: logical. Whether to force garbage collection at the end of each file read in order to reclaim RAM. Defaults to FALSE. |
cl | Type: cluster or integer. A parallel cluster for parallelized calls. Used only when |
max_threads | Type: numeric. The maximum number of threads allowed to adapt |
wkdir | Type: character. The working directory, when using a cluster. Defaults to NULL. |
... | Other arguments to pass to |
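
The default for max_threads in the usage line is a nested ifelse that is hard to read at a glance. The sketch below unrolls it into a plain function to show how the thread budget appears to be derived from cl. The helper name default_max_threads and its cores argument are hypothetical and not part of the package; cores stands in for parallel::detectCores() so that the arithmetic is reproducible on any machine.

```r
# Hypothetical helper (not part of the package) mirroring the default
# expression for max_threads shown in the usage line.
# `cores` replaces parallel::detectCores() so results do not depend on the host.
default_max_threads <- function(cl = NULL, cores = parallel::detectCores()) {
  if (is.null(cl)) {
    per_worker <- cores                      # no cluster: all cores are available
  } else if (!is.list(cl)) {
    per_worker <- round(cores / cl)          # cl given as a number of workers
  } else {
    per_worker <- round(cores / length(cl))  # cl given as a cluster object (a list of workers)
  }
  max(per_worker, 1)                         # never go below one thread
}

default_max_threads(cl = NULL, cores = 8)  # 8
default_max_threads(cl = 4, cores = 8)     # 2
default_max_threads(cl = 16, cores = 8)    # 1, because round(0.5) is 0 and the floor is 1
```

In other words, the default splits the detected cores evenly across the workers and keeps at least one thread each. Passing max_threads explicitly, as the examples on this page do, bypasses this computation.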
The element or the list of fst file names.
# NOT RUN {
# Cannot pass CRAN checks. Disabled.
# Do it on your own files!
library(fst) # devtools::install_github("fstPackage/fst@e060e62")
library(data.table)
library(parallel)

parallel.csv(c("file_1.csv", "file_2.csv"), max_threads = 1, progress_bar = TRUE)
parallel.csv(paste0("file_", 1:100, ".csv"), max_threads = 1, progress_bar = TRUE, cl = 8)
# }