Posts tagged p2p
Shuffling large data at constant memory in Dask
- 15 March 2023
With release 2023.2.1, dask.dataframe introduces a new shuffling method called P2P, which makes sorts, merges, and joins faster while using constant memory.
Benchmarks show impressive improvements: