🎀 Awesome Zarr resources
Zarr is a cloud-native, chunked, compressed, and hierarchical array data format.
The Zarr website is already an excellent resource for learning about Zarr and its ecosystem. This list is intended to complement the website with a curated and opinionated list of resources.
This list focuses on Geo/Earth Sciences, but is not limited to that domain.
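To make "chunked" concrete, here is a minimal pure-Python sketch of how an array's shape and chunk shape determine the set of chunk keys in a store. It assumes the Zarr V2 convention of joining chunk indices with `.` (Zarr V3's default encoding instead uses a `c/` prefix with `/` as the separator). The function names are hypothetical and not part of any Zarr library.

```python
import itertools
import math


def chunk_grid_shape(shape, chunks):
    """Number of chunks along each dimension; partial edge chunks count as one."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


def chunk_keys(shape, chunks, separator="."):
    """Yield one storage key per chunk, e.g. "0.0", "0.1", ... (V2 style)."""
    grid = chunk_grid_shape(shape, chunks)
    for index in itertools.product(*(range(n) for n in grid)):
        yield separator.join(str(i) for i in index)


def chunk_for_element(element_index, chunks):
    """Chunk grid coordinates of the chunk that holds a given array element."""
    return tuple(i // c for i, c in zip(element_index, chunks))


# Illustrative values only: a 1000 x 1000 array split into 400 x 500 chunks.
shape = (1000, 1000)
chunks = (400, 500)

grid = chunk_grid_shape(shape, chunks)
keys = list(chunk_keys(shape, chunks))
print(grid)
print(keys)
print(chunk_for_element((999, 499), chunks))
```

Because each chunk is a separate key, a reader only needs to fetch the chunks that overlap the requested slice, which is what makes the format a good fit for object storage.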
Lists
Introductory talks YouTube playlist
Two excellent and up-to-date introductory talks:
Zarr V3 is the upcoming version of Zarr. It is a major update that will bring many new features and improvements.
If you're getting into Zarr now, it might be a good idea to start with Zarr V3.
For an excellent in-depth overview, see the ESIP series of talks.
This list contains libraries that directly relate to Zarr in some way.
For implementations of Zarr, see Zarr Implementations.
Storage & I/O
ETL
Developer-oriented
Visualization: for visualization tools and libraries, see the Visualization section.
Kerchunk allows you to efficiently read chunked data formats such as GRIB, NetCDF, and COGs by exposing them as a Zarr store.
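The idea behind Kerchunk can be sketched without the library itself: a reference set maps Zarr-style keys either to inline content or to a `[path, offset, length]` byte range inside an existing file. The sketch below is a simplified illustration using only the standard library. The helper `read_reference`, the key names, and the temporary file are hypothetical, and it omits details such as base64-encoded inline values and remote URLs.

```python
import json
import os
import tempfile


def read_reference(refs, key):
    """Resolve one key of a Kerchunk-style reference mapping to bytes."""
    entry = refs[key]
    if isinstance(entry, str):
        # Inline content (typically small JSON metadata) is stored directly.
        return entry.encode("utf-8")
    # Otherwise the entry points at a byte range inside an existing file.
    path, offset, length = entry
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Stand-in for an existing binary file (e.g. NetCDF) that holds two chunks.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(b"HEADER--" + b"chunk-0-bytes" + b"chunk-1-bytes")
    data_path = tmp.name

references = {
    "version": 1,
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        "temperature/0": [data_path, 8, 13],   # hypothetical key and offsets
        "temperature/1": [data_path, 21, 13],
    },
}

group_meta = read_reference(references["refs"], ".zgroup")
first_chunk = read_reference(references["refs"], "temperature/0")
second_chunk = read_reference(references["refs"], "temperature/1")
os.remove(data_path)

print(first_chunk, second_chunk)
```

The original file is never copied or rewritten; only the small reference mapping is new, which is why this approach is cheap to apply to large existing archives.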
Talks and tutorials
In the future, Kerchunk will be split into upstream functionality in Zarr itself and a new VirtualiZarr package.
Existing lists
Talks
Zarr has seen great adoption in the life sciences domain.
Talks and resources
Most of the work on Zarr visualization so far has come from the bioimaging community:
For a general overview, see
Essentially all other common array data formats can be exposed as Zarr. See Kerchunk.
Zarr, NetCDF, and HDF5 are three separate data formats that nonetheless relate to each other in multiple ways.
Resources
Zarr and N5 are two similar array data formats that share common goals and development.
The Zarr V3 spec aims to provide a common implementation target (sources: 1, 2)
Links
GeoZarr is a proposal for a Zarr-based geospatial data format that is being submitted as an OGC standard.
GeoZarr will define a metadata convention for Zarr stores that contain geospatial data.
It will also define the relationship of Zarr with CF and NetCDF.
Links
STAC provides a common structure for describing and cataloging spatiotemporal assets.
With its hierarchical structure and key-value metadata support, Zarr's capabilities overlap significantly with STAC.
The communities have not yet converged on a canonical representation of Zarr datasets through STAC.
Today, a good example of exposing Zarr in STAC is Planetary Computer.
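As a rough illustration of one possible representation, a Zarr store can be listed as a single STAC asset. The sketch below assumes the `application/vnd+zarr` media type and the `xarray:open_kwargs` field from the xarray-assets STAC extension, which is our understanding of the pattern Planetary Computer follows. The helper `zarr_asset`, the asset key, and the URL are hypothetical, and this is not a canonical or complete STAC item.

```python
import json


def zarr_asset(href, consolidated=True):
    """Build a minimal STAC asset dictionary pointing at a Zarr store."""
    return {
        "href": href,
        # Assumed media type for Zarr; not a settled community standard.
        "type": "application/vnd+zarr",
        "roles": ["data", "zarr"],
        # Hint for clients on how to open the store with xarray.
        "xarray:open_kwargs": {"consolidated": consolidated},
    }


# Hypothetical asset key and URL, for illustration only.
item_assets = {"zarr-store": zarr_asset("https://example.com/data/example.zarr")}
print(json.dumps(item_assets, indent=2))
```

Here the whole store is one asset; describing individual arrays or chunks as separate STAC objects is one of the open questions the communities are still discussing.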
More discussion & Related links
In the future, the Zarr V3 Spec and GeoZarr convention will likely enable greater interoperability between STAC and Zarr.