Posts tagged xarray
Dask Array: Scaling Up for Terabyte-Level Performance | Pangeo Showcase 2025-04-09
- 09 April 2025
Dask Array is the distributed array framework that has been the de-facto standard for running large-scale Xarray computations for many years. Nonetheless, the experience has been mixed. Over the last couple of months, Dask and Xarray maintainers have worked together to improve several shortcomings that have made workloads painful to run (or fail altogether) in the past. In this talk, we will look at some of the improvements that were made, and how they combine with other changes like P2P rechunking to make running array computations at the Terabyte scale effortless.
Dask ❤️ Xarray: Geoscience at Massive Scale | PyData Global 2024
- 05 December 2024
Doing geoscience is hard. It’s even harder if you have to figure out how to handle large amounts of data!
Xarray is an open-source Python library designed to simplify the handling of labeled multi-dimensional arrays, like raster geospatial data, making it a favorite among geoscientists. It allows these scientists to easily express their computations, and is backed by Dask, a Python library for parallel and distributed computing, to scale computations to entire clusters of machines.
Geoscience at Massive Scale | PyData Paris 2024
- 24 September 2024
When scaling geoscience workloads to large datasets, many scientists and developers reach for Dask, a library for distributed computing that plugs seamlessly into Xarray and offers an Array API that wraps NumPy. Featuring a distributed environment capable of running your workload on large clusters, Dask promises to make it easy to scale from prototyping on your laptop to analyzing petabyte-scale datasets.