lzf is a lossless compression filter available for HDF5 and bundled with h5py. As with gzip, decompressed data is bit-for-bit identical to the original, but lzf is designed for speed rather than compression ratio. It compresses and decompresses much faster than gzip but produces larger files.
lzf fits real-time or streaming applications, in-memory caching, and network transmission: places where the time spent compressing and decompressing matters more than the disk space saved. For offline storage where compactness matters and write time is negligible, gzip is a better choice.
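The trade-off is easy to measure on your own data. The following is a minimal sketch, not a rigorous benchmark: the helper `time_filter`, the file name `bench.h5`, and the synthetic `matrix` are all assumptions made for illustration, and the numbers depend heavily on the data, chunking, and machine.

```python
import time

import h5py
import numpy as np


def time_filter(data, compression):
    """Write `data` with one filter; return (seconds, stored bytes).

    Hypothetical helper for illustration only.
    """
    # driver="core" with backing_store=False keeps the file in RAM,
    # so disk speed does not distort the comparison.
    with h5py.File("bench.h5", "w", driver="core", backing_store=False) as hdf:
        start = time.perf_counter()
        dset = hdf.create_dataset("x", data=data, compression=compression)
        hdf.flush()  # make sure all chunks have passed through the filter
        elapsed = time.perf_counter() - start
        return elapsed, dset.id.get_storage_size()


# Assumed test data: smooth, regular values (real data will behave differently).
matrix = np.cumsum(np.ones((1000, 1000), dtype=np.float64), axis=1)

results = {name: time_filter(matrix, name) for name in ("gzip", "lzf")}
for name, (seconds, nbytes) in results.items():
    print(f"{name}: {seconds:.4f} s, {nbytes} bytes stored")
```

Running this on a representative sample of your own arrays is usually a better guide than general rules, since compression ratio and speed both vary with the data.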
In h5py, lzf is selected the same way gzip is:
hdf.create_dataset('fast', data=matrix, compression='lzf')

lzf doesn’t take a compression level: it’s a fixed algorithm, with no slow-vs-small dial.
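For a version that runs on its own, here is a small round-trip sketch. The file name `lzf_demo.h5` and the contents of `matrix` are placeholders chosen for illustration; the file is kept in memory so nothing is written to disk.

```python
import h5py
import numpy as np

# Placeholder data standing in for `matrix` in the one-liner above.
matrix = np.arange(10_000, dtype=np.float64).reshape(100, 100)

# In-memory file: driver="core" with backing_store=False writes nothing to disk.
with h5py.File("lzf_demo.h5", "w", driver="core", backing_store=False) as hdf:
    dset = hdf.create_dataset("fast", data=matrix, compression="lzf")
    filter_name = dset.compression       # name of the filter applied
    filter_opts = dset.compression_opts  # options recorded for that filter
    restored = dset[...]                 # read back; decompression is transparent

print(filter_name, filter_opts)
print(np.array_equal(restored, matrix))  # checks the lossless round trip
```

Reading needs no special arguments: the filter is recorded in the dataset, and h5py decompresses each chunk as it is accessed.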
The third HDF5 compression option is szip, a lossless extended-Rice implementation tuned for correlated scientific arrays. Among the three, gzip is the safe default (slow but universally readable), lzf is the speed-optimized option, and szip is the specialized one for scientific data, with the caveat that it’s patent-encumbered and not bundled in every HDF5 build.
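Because szip may be missing from a given build, it can help to check which filters are present before relying on one. The sketch below uses h5py's low-level `h5z.filter_avail` call; the helper name `available_compression_filters` is an assumption made for illustration. Note that a filter can be available for reading only, and this check does not distinguish that case.

```python
import h5py
from h5py import h5z


def available_compression_filters():
    """Report which of the three filters this HDF5/h5py build can see.

    Hypothetical helper; it reports availability only, not whether
    encoding (writing) is enabled for the filter.
    """
    candidates = {
        "gzip": h5z.FILTER_DEFLATE,
        "lzf": h5z.FILTER_LZF,
        "szip": h5z.FILTER_SZIP,
    }
    return {name: bool(h5z.filter_avail(code)) for name, code in candidates.items()}


availability = available_compression_filters()
for name, present in availability.items():
    print(f"{name}: {'available' if present else 'not available'}")
```

If szip is reported as unavailable, falling back to gzip keeps the file readable by other HDF5 installations.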