Package: ff 4.5.2

ff: Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM, by transparently mapping only a section (pagesize) into main memory; that section is the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer', and the non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned) and single (4 byte float with NAs). For example, 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for the close-to-atomic types 'factor', 'ordered', 'POSIXct' and 'Date', and for custom close-to-atomic types.

ff has native C support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays). It also provides an ffdf class not unlike data.frames, and import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options makes it possible to work with 'permanent' files as well as to create and remove 'temporary' ff files completely transparently to the user. On certain OS/filesystem combinations, creating the ff files works without notable delay thanks to sparse file allocation.

Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example a virtual matrix transpose that does not touch a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for both ff and ram objects, and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This makes it possible to work interactively with selections of large datasets and to modify selection criteria quickly. Further high-performance enhancements can be made available upon request.
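The description above can be made concrete with a small sketch. It assumes ff (and its dependency bit) is installed and that the default temporary ff directory is writable; the object names and sizes are made up for illustration, and this is a minimal usage sketch, not a recommended workflow.

```r
library(ff)  # attaches 'bit' as well, which supplies the chunk() generic

# Disk-backed double vector; with no filename given, a temporary ff file is used.
x <- ff(vmode = "double", length = 1e6)

# Reads and writes use ordinary subscript syntax.
x[1:5] <- c(1.5, 2.5, 3.5, 4.5, 5.5)
first5 <- x[1:5]

# A 'quad' vector holds unsigned 2-bit values (0..3).
q <- ff(vmode = "quad", length = 8)
q[1:4] <- 0:3
codes <- q[1:4]

# Chunked looping: chunk() yields range indices that fit a batch into RAM.
total <- 0
for (i in chunk(x)) total <- total + sum(x[i])

# Virtual transpose: vt() changes only metadata, not the file on disk.
m <- ff(1:6, dim = c(2, 3))
tm <- vt(m)
tdim <- dim(tm)

# Remove the backing files explicitly (tm shares its file with m).
delete(x); delete(q); delete(m)
rm(x, q, m, tm)
```

Only the elements touched by a subscript are pulled into RAM, so the loop over `chunk(x)` keeps memory use bounded regardless of the vector's length.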

Authors: Daniel Adler [aut], Christian Gläser [ctb], Oleg Nenadic [ctb], Jens Oehlschlägel [aut, cre], Martijn Schuemie [ctb], Walter Zucchini [ctb]

ff_4.5.2.tar.gz
ff_4.5.2.zip (r-4.5), ff_4.5.2.zip (r-4.4), ff_4.5.2.zip (r-4.3)
ff_4.5.2.tgz (r-4.5-x86_64), ff_4.5.2.tgz (r-4.5-arm64), ff_4.5.2.tgz (r-4.4-x86_64), ff_4.5.2.tgz (r-4.4-arm64), ff_4.5.2.tgz (r-4.3-x86_64), ff_4.5.2.tgz (r-4.3-arm64)
ff_4.5.2.tar.gz (r-4.5-noble), ff_4.5.2.tar.gz (r-4.4-noble)
ff.pdf | ff.html
ff/json (API)
NEWS

# Install 'ff' in R:

install.packages('ff', repos = c('https://truecluster.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/truecluster/ff/issues

Uses libs:

c++: GNU Standard C++ Library v3

On CRAN:

cpp

12.01 score, 27 stars, 71 packages, 764 scripts, 51k downloads, 14 mentions, 165 exports, 1 dependency

Last updated 2 months ago from:d9dd7552bd. Checks: 1 OK, 11 NOTE. Indexed: yes.

Target	Result	Latest binary
Doc / Vignettes	OK	Mar 13 2025
R-4.5-win-x86_64	NOTE	Mar 13 2025
R-4.5-mac-x86_64	NOTE	Mar 13 2025
R-4.5-mac-aarch64	NOTE	Mar 13 2025
R-4.5-linux-x86_64	NOTE	Mar 13 2025
R-4.4-win-x86_64	NOTE	Mar 13 2025
R-4.4-mac-x86_64	NOTE	Mar 13 2025
R-4.4-mac-aarch64	NOTE	Mar 13 2025
R-4.4-linux-x86_64	NOTE	Mar 13 2025
R-4.3-win-x86_64	NOTE	Mar 13 2025
R-4.3-mac-x86_64	NOTE	Mar 13 2025
R-4.3-mac-aarch64	NOTE	Mar 13 2025

Exports: .ffbytes .ffmode .rambytes .rammode .vcoerceable .vimplemented .vmax .vmin .vmode .vNA .vunsigned .vvalues add appendLevels array2vector arrayIndex2vectorIndex as.boolean as.byte as.ff as.ffdf as.hi as.nibble as.quad as.ram as.short as.ubyte as.ushort as.vmode bigsample boolean byte ccbind cfun clength clone.ff clone.ffdf cmean cmedian cquantile crbind csum csummary delete deleteIfOpen dforder dfsort dimorder dimorder<- dimorderCompatible dimorderStandard dummy.dimnames ff ffapply ffcolapply ffconform ffdf ffdfindexget ffdfindexset ffdforder ffdfsort ffdrop ffindexget ffindexorder ffindexordersize ffindexset ffinfo ffload fforder ffreturn ffrowapply ffsave ffsave.image ffsort ffsuitable ffsymmxtensions fftempfile ffvecapply ffxtensions file.move file.resize filename filename<- finalize finalizer finalizer<- fixdiag fixdiag<- get.ff getalignedpagesize getdefaultpagesize geterror.ff geterrstr.ff getpagesize getset.ff hi hiparse is.factor is.ff is.ffdf is.open is.ordered is.readonly matcomb matprint maxffmode maxlength mismatch ncol<- nibble nrow<- pagesize pattern pattern<- quad ram2ffcode ram2ramcode ramattribs ramclass ramdforder ramdfsort read.csv.ffdf read.csv2.ffdf read.delim.ffdf read.delim2.ffdf read.ff read.table.ffdf readwrite.ff recodeLevels regtest.fforder regtest.vmode repnam set.ff short sortLevels splitPathFile standardPathFile subscript2integer swap symmetric symmIndex2vectorIndex tempPathFile ubyte unclass<- undim unsort unsplitPathFile ushort vecprint vector.vmode vector2array vectorCompatible vectorIndex2arrayIndex vectorStandard vmode vmode<- vt vw vw<- write.csv write.csv.ffdf write.csv2 write.csv2.ffdf write.ff write.table.ffdf ymismatch

Dependencies: bit

Help page	Topics
Incrementing an ff or ram object	add add.default add.ff
Array: make vector from array	array2vector
Array: make vector positions from array index	arrayIndex2vectorIndex
Coercing ram to ff and ff to ram objects	as.ff as.ff.default as.ff.ff as.ram as.ram.default as.ram.ff
Conversion between bit and ff boolean	as.bit.ff as.ff.bit
Coercing to ffdf and data.frame	as.data.frame.ffdf as.ffdf as.ffdf.data.frame as.ffdf.ff_matrix as.ffdf.ff_vector
Hybrid Index, coercion to	as.hi as.hi.( as.hi.bit as.hi.bitwhich as.hi.call as.hi.character as.hi.double as.hi.hi as.hi.integer as.hi.logical as.hi.matrix as.hi.name as.hi.NULL as.hi.ri as.hi.which
Hybrid Index, coercing from	as.bit.hi as.bitwhich.hi as.character.hi as.integer.hi as.logical.hi as.matrix.hi as.which.hi
Coercing to virtual mode	as.boolean as.boolean.default as.byte as.byte.default as.nibble as.nibble.default as.quad as.quad.default as.short as.short.default as.ubyte as.ubyte.default as.ushort as.ushort.default as.vmode as.vmode.default as.vmode.ff
Sampling from large pools	bigsample bigsample.default bigsample.ff
Collapsing functions for batch processing	ccbind CFUN cfun clength cmean cmedian cquantile crbind csum csummary
Chunk ff_vector and ffdf	chunk.ffdf chunk.ff_vector
Cloning ff and ram objects	clone.ff
Cloning ffdf objects	clone.ffdf
Closing ff files	close.ff close.ffdf close.ff_pointer
Deleting the file behind an ff object	delete delete.default delete.ff delete.ffdf delete.ff_pointer deleteIfOpen deleteIfOpen.ff deleteIfOpen.ff_pointer
Getting and setting dim and dimorder	dim.ff dim.ffdf dim<-.ff dim<-.ffdf dimorder dimorder.default dimorder.ffdf dimorder.ff_array dimorder<- dimorder<-.ffdf dimorder<-.ff_array
Getting and setting dimnames	dimnames.ff dimnames.ff_array dimnames<-.ff_array
Getting and setting dimnames of ffdf	dimnames.ffdf dimnames<-.ffdf names.ffdf names<-.ffdf row.names.ffdf row.names<-.ffdf
Test for dimorder compatibility	dimorderCompatible dimorderStandard vectorCompatible vectorStandard
Array: make dimnames	dummy.dimnames
Reading and writing vectors and arrays (high-level)	Extract.ff [.ff [.ff_array [<-.ff [<-.ff_array [[.ff [[<-.ff
Reading and writing data.frames (ffdf)	$.ffdf $<-.ffdf Extract.ffdf [.ffdf [<-.ffdf [[.ffdf [[<-.ffdf
ff classes for representing (large) atomic data	ff ff_pointer
Apply for ff objects	ffapply ffcolapply ffrowapply ffvecapply
Get most conforming argument	ffconform
ff class for data.frames	ffdf
Reading and writing ffdf data.frame using ff subscripts	ffdfindexget ffdfindexset
Sorting: convenience wrappers for data.frames	dforder dfsort ffdforder ffdfsort ramdforder ramdfsort
Delete an ffarchive	ffdrop
Reading and writing ff vectors using ff subscripts	ffindexget ffindexset
Sorting: chunked ordering of integer subscript positions	ffindexorder ffindexordersize
Inspect content of ff saves	ffinfo
Reload ffSaved Datasets	ffload
Sorting: order from ff vectors	fforder
Return suitable ff object	ffreturn
Save R and ff objects	ffsave ffsave.image
Sorting of ff vectors	ffsort
Test ff object for suitability	ffsuitable ffsuitable_attribs
Test for availability of ff extensions	ffsymmxtensions ffxtensions
Change size of or move an existing file	file.move file.resize
Get or set filename	filename filename.default filename.ffdf filename.ff_pointer filename<- filename<-.ff pattern pattern.ff pattern<- pattern<-.ff pattern<-.ffdf
Call finalizer	finalize finalize.ff finalize.ffdf finalize.ff_pointer
Get and set finalizer (name)	finalizer finalizer.ff finalizer<- finalizer<-.ff
Test for fixed diagonal	fixdiag fixdiag.default fixdiag.dist fixdiag.ff fixdiag<-
Get error and error string	geterror.ff geterrstr.ff
Get page size information	getalignedpagesize getdefaultpagesize getpagesize
Reading and writing vectors of values (low-level)	get.ff getset.ff set.ff
Hybrid index class	hi print.hi str.hi
Hybrid Index, parsing	hiparse
Test for class ff	is.ff
Test for class ffdf	is.ffdf
Test if object is opened	is.open is.open.ff is.open.ffdf is.open.ff_pointer
Get readonly status	is.readonly is.readonly.ff
Getting and setting 'is.sorted' physical attribute	is.sorted.default is.sorted<-.default
Getting and setting length	length.ff length<-.ff
Getting length of a ffdf dataframe	length.ffdf
Hybrid Index, querying	length.hi maxindex.hi poslength.hi
Getting and setting factor levels	is.factor is.factor.default is.factor.ff is.ordered is.ordered.default is.ordered.ff levels.ff levels<-.ff
ff Limitations and Warnings	LimWarn
Array: make matrix indices from row and columns positions	matcomb
Print beginning and end of big matrix	matprint print.matprint
Lossless vmode coercibility	maxffmode
Get physical length of an ff or ram object	maxlength maxlength.default maxlength.ff
Test for recycle mismatch	mismatch ymismatch
Getting and setting 'na.count' physical attribute	na.count.default na.count.ff na.count<-.default na.count<-.ff
Getting and setting names	names.ff names.ff_array names<-.ff names<-.ff_array
Assigning the number of rows or columns	ncol<- nrow<-
Opening an ff file	open.ff open.ffdf
Pagesize of ff object	pagesize pagesize.ff
Getting and setting physical and virtual attributes of ff objects	physical.ff physical<-.ff virtual.ff virtual<-.ff
Getting physical and virtual attributes of ffdf objects	physical.ffdf virtual.ffdf
Print and str methods	print.ff print.ffdf print.ff_matrix print.ff_vector str.ff str.ffdf
Factor codings	ram2ffcode ram2ramcode
Get ramclass and ramattribs	ramattribs ramattribs.default ramattribs.ff ramattribs_excludes ramclass ramclass.default ramclass.ff ramclass_excludes
Sorting: order R vector in-RAM and in-place	keyorder.default mergeorder.default radixorder.default ramorder.default shellorder.default
Sorting: Sort R vector in-RAM and in-place	keysort.default mergesort.default radixsort.default ramsort.default shellsort.default
Importing csv files into ff data.frames	read.csv.ffdf read.csv2.ffdf read.delim.ffdf read.delim2.ffdf read.table.ffdf
Reading and writing vectors (low-level)	read.ff readwrite.ff write.ff
Sorting: regression tests	regtest.fforder
Replicate with names	repnam
Factor level manipulation	appendLevels recodeLevels recodeLevels.factor recodeLevels.ff sortLevels sortLevels.factor sortLevels.ff sortLevels.ffdf
Analyze pathfile-strings	fftempfile splitPathFile standardPathFile tempPathFile unsplitPathFile
Reading and writing in one operation (high-level)	swap swap.default swap.ff swap.ff_array
Test for symmetric structure	symmetric symmetric.default symmetric.dist symmetric.ff
Array: make vector positions from symmetric array index	symmIndex2vectorIndex
Unclassed assignment	unclass<-
Undim	undim
Hybrid Index, internal utilities	subscript2integer unsort unsort.ahi unsort.hi
Update ff content from another object	update.ff update.ffdf
Print beginning and end of big vector	print.vecprint vecprint
Create vector of virtual mode	boolean byte nibble quad short ubyte ushort vector.vmode vector.vmode.default vector.vmode.ff
Array: make array from vector	vector2array
Array: make array from index vector positions	vectorIndex2arrayIndex
Virtual storage mode	.ffbytes .ffmode .rambytes .rammode .vcoerceable .vimplemented .vmax .vmin .vmode .vNA .vunsigned .vvalues regtest.vmode vmode vmode.default vmode.ff vmode<- vmode<-.default vmode<-.ff
Virtual storage mode of ffdf	vmode.ffdf
Virtual transpose	t.ff vt vt.default vt.ff
Getting and setting virtual windows	vw vw.default vw.ff vw<- vw<-.ff_array vw<-.ff_vector
Exporting csv files from ff data.frames	write.csv write.csv.ffdf write.csv2 write.csv2.ffdf write.table.ffdf
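As an illustration of the ffdf and csv topics listed above, the following sketch round-trips a small data.frame through an ffdf and a csv file. The data and the temporary file path are invented for the example, and it relies on the default arguments of `write.csv.ffdf` and `read.csv.ffdf`; real imports of large files would usually also set `colClasses` and chunk sizes.

```r
library(ff)

# Small in-RAM data.frame used only for illustration.
df <- data.frame(id = 1:10, value = seq(0.5, 5, by = 0.5))

# Coerce to a disk-backed ffdf; each column becomes an ff vector.
fdf <- as.ffdf(df)

# Export to csv, then import the file again as a new ffdf.
csv <- tempfile(fileext = ".csv")
write.csv.ffdf(fdf, file = csv)
back <- read.csv.ffdf(file = csv)

# Columns are ff vectors; '[]' pulls their content into RAM.
n_rows <- nrow(back)
value_sum <- sum(back$value[])

# Clean up the backing ff files and the csv file.
delete(fdf); delete(back)
rm(fdf, back)
unlink(csv)
```

Both functions process the data in row chunks, which is what allows them to handle files larger than available RAM.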
