Introduction to Xarray and Dask for Geoscientists
Instructor: Abishek Gopal
Time: Friday, April 23, 10:00AM-12:30PM
Location: Zoom session only
Prerequisites: Terra account. Basic experience with Python and Jupyter notebooks recommended.
Dask's lazy loading and parallelization can allow researchers to scale their computations from their laptops to supercomputing clusters. We will use CESM model output in netCDF format to compute climatologies and time averages, visualize quantities, resample data, interpolate along the vertical coordinate, and look at how to parallelize xarray-based computations using the dask library.
The presentation slides are available as downloadable PDF files.
- Brief introduction to the Pangeo framework
- Data structures in xarray
- Reading and writing netCDF files using xarray
- Dask chunking and lazy loading
- Computations available through xarray and xgcm
- Visualizing xarray DataArrays with cartopy
- Explicitly parallelizing computations using dask
- Using the dask dashboard to understand memory usage
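
To give a flavor of the chunking, lazy loading, and climatology topics listed above, here is a minimal sketch. It is not taken from the workshop materials: it builds a small synthetic dataset in place of real CESM output, and the variable name `TS`, the grid sizes, and the chunk size are assumptions chosen only for illustration. It assumes xarray, dask, numpy, and pandas are installed.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for CESM output (assumption): two years of monthly
# data on a coarse 10-degree latitude/longitude grid.
time = pd.date_range("2000-01-01", periods=24, freq="MS")
lat = np.linspace(-90.0, 90.0, 19)
lon = np.arange(0.0, 360.0, 10.0)
rng = np.random.default_rng(0)
data = 280.0 + 10.0 * rng.standard_normal((time.size, lat.size, lon.size))

ds = xr.Dataset(
    {"TS": (("time", "lat", "lon"), data)},  # "TS" is a hypothetical variable name
    coords={"time": time, "lat": lat, "lon": lon},
)

# Chunk along time so the variable is backed by a dask array.
# With a real file, xr.open_dataset(path, chunks={"time": 12}) plays this role.
ds_chunked = ds.chunk({"time": 12})

# These lines only build a task graph; the data values are not reduced yet.
climatology = ds_chunked["TS"].groupby("time.month").mean("time")
time_mean = ds_chunked["TS"].mean("time")

# .compute() triggers the actual (potentially parallel) evaluation.
climatology_values = climatology.compute()
time_mean_values = time_mean.compute()
```

With real model output the same pattern applies, but the chunk sizes usually need tuning to the available memory and to the layout of the files on disk.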