Python tools for geographic data
Small bug-fix release with this critical fix:
- Fix a regression in reading a file from a remote URL using geopandas.read_file (#2908). This restores the behaviour of downloading all data up front before passing it to the underlying engine (fiona or pyogrio), except if the server supports partial requests (to support reading a subset of a large file).

New methods:
- Added sample_points method to sample random points from Polygon or LineString geometries (#2860).
- Added hilbert_distance() method that calculates the distance along a Hilbert curve for each geometry in a GeoSeries/GeoDataFrame (#2297).
- Support for sorting geometries (for example, using sort_values()) based on the distance along the Hilbert curve (#2070).
- Added get_coordinates() method from shapely to GeoSeries/GeoDataFrame (#2624).
- Added minimum_bounding_circle() method from shapely to GeoSeries/GeoDataFrame (#2621).
- Added minimum_bounding_radius() as GeoSeries method (#2827).

Other new features and improvements:
- Support for GeoSeries.fillna via another GeoSeries (#2535).
- Support for min_zoom and max_zoom inside the map_kwds argument for .explore() (#2599).
- Support for append (mode="a" or append=True) in to_file() using engine="pyogrio" (#2788).
- Added a to_wgs84 keyword to to_json allowing automatic re-projecting to follow the 2016 GeoJSON specification (#416).
- to_json output now includes a "crs" field if the CRS is not the default WGS84 (#1774).
- geometry attribute of GeoDataFrame without an active geometry column related to the default name "geometry" being provided in the constructor (#2577).

Deprecations and compatibility notes:
- unary_union will return 'GEOMETRYCOLLECTION EMPTY' instead of None for all-None GeoSeries (#2618).
- The query_bulk() method of the spatial index .sindex property is deprecated in favor of query() (#2823).

Bug fixes:
- Fix plot() if an empty or missing geometry is present (#2224).
- Fix in explore() (#2657).
- Fix to_parquet/to_feather to not write an invalid bbox (with NaNs) in the metadata in case of an empty GeoDataFrame (#2653).
- Fix to_parquet/to_feather to use correct WKB flavor for 3D geometries (#2654).
- Fix read_file to avoid reading all file bytes prior to calling Fiona or Pyogrio if provided a URL as input (#2796).
- Fix copy() downcasting GeoDataFrames without an active geometry column to a DataFrame (#2775).
- Fix iterfeatures() method of GeoDataFrame to correctly handle non-scalar values when na='drop' is specified (#2811).
- Fix in plot (#2886).

Notes on (optional) dependencies:
Acknowledgments
Thanks to everyone who contributed to this release! A total of 32 people contributed patches to this release. People with a "+" by their names contributed a patch for the first time.
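
As a rough illustration of three of the new methods listed in these notes (sample_points, hilbert_distance() and get_coordinates()), here is a minimal sketch. The two squares are made-up data, and the calls assume a GeoPandas version that includes these methods together with Shapely 2.0 or PyGEOS.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Two made-up squares, far apart from each other.
polys = gpd.GeoSeries(
    [
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(10, 10), (12, 10), (12, 12), (10, 12)]),
    ]
)

# sample_points: random points drawn inside each geometry,
# returned as one MultiPoint per input row when size > 1.
samples = polys.sample_points(size=5)

# hilbert_distance: one integer per geometry giving its position
# along a Hilbert curve covering the total bounds.
distances = polys.hilbert_distance()

# get_coordinates: DataFrame with "x" and "y" columns,
# one row per vertex, indexed by the original row labels.
coords = polys.get_coordinates()

print(samples)
print(distances)
print(coords.head())
```

Sorting by `distances` keeps geometries that are close in space close together in the result, which is the idea behind the Hilbert-curve based sorting mentioned in the list of new methods.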
Bug fixes:
- Fix to_crs() when using PyGEOS or Shapely >= 2.0 (previously the z coordinates were lost) (#1345).
- Fix in the naturalearth_lowres built-in dataset (#2670).

Small bug-fix release removing the shapely<2 pin in the installation requirements.
The highlight of this release is the support for Shapely 2.0. This makes it possible to test Shapely 2.0 (currently 2.0b1) alongside GeoPandas.
Note that if you also have PyGEOS installed, you need to set an environment variable (USE_PYGEOS=0) before importing geopandas to actually test Shapely 2.0 features instead of PyGEOS. See https://geopandas.org/en/latest/getting_started/install.html#using-the-optional-pygeos-dependency for more details.
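
One way to set the variable from within Python, rather than in the shell, is sketched below. It assumes this is the first import of geopandas in the process; setting the variable after the import has no effect. Releases that no longer support PyGEOS are expected to ignore it.

```python
import os

# Must happen before the first `import geopandas` in this process.
os.environ["USE_PYGEOS"] = "0"

import geopandas  # noqa: E402  (import deliberately placed after the env setup)

print(geopandas.__version__)
```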
New features and improvements:
- Added normalize() method from shapely to GeoSeries/GeoDataFrame (#2537).
- Added make_valid() method from shapely to GeoSeries/GeoDataFrame (#2539).
- Added where filter to read_file (#2552).

Deprecations and compatibility notes:
- Accessing the crs of a GeoDataFrame without active geometry column was deprecated and this now raises an AttributeError (#2578).
- .explore() for recent Matplotlib versions (#2596).

Bug fixes:
- Fix geopandas.clip() when clipping with an empty geometry (#2589).
- Accessing gdf.geometry where the active geometry column is missing, and a column named "geometry" is present, will now raise an AttributeError rather than returning gdf["geometry"] (#2575).
- pandas.concat will no longer silently override CRS information if not all inputs have the same CRS (#2056).

Acknowledgments
Thanks to everyone who contributed to this release! A total of 17 people contributed patches to this release. People with a "+" by their names contributed a patch for the first time.
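
The make_valid() and normalize() methods listed under the new features of this release can be tried on a deliberately invalid geometry. This is a small sketch with a made-up bow-tie polygon; it assumes Shapely 2.0 (or PyGEOS) is available.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A self-intersecting "bow-tie" ring: invalid as a polygon.
bowtie = gpd.GeoSeries([Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])])
print(bowtie.is_valid.tolist())

# make_valid() repairs the geometry (here it becomes two triangles).
repaired = bowtie.make_valid()
print(repaired.is_valid.tolist())

# normalize() rewrites each geometry in a canonical form, which helps
# when comparing geometries that differ only in vertex or part order.
normalized = repaired.normalize()
print(normalized.iloc[0].geom_type)
```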
Small bug-fix release:
- Fix for unstack() and pivot() involving MultiIndex, or GeoDataFrame construction with MultiIndex (#2486).
- Fix for GeoDataFrame.explode() with non-default geometry column name.
- Fix apply() causing row-wise all-NaN float columns to be cast to GeometryDtype (#2482).
- Fix for version in to_parquet and to_feather. As a result, the version parameter for the to_parquet and to_feather methods has been replaced with schema_version. version will be passed directly to the underlying feather or parquet writer. version will only be used to set schema_version if version is one of 0.1.0 or 0.4.0 (#2496).

Highlights of this release:
- The geopandas.read_file() and GeoDataFrame.to_file() methods to read and write GIS file formats can now optionally use the pyogrio package under the hood through the engine="pyogrio" keyword. The pyogrio package implements vectorized IO for GDAL/OGR vector data sources, and is faster compared to the fiona-based engine (#2225).
- UserWarning (#2327).

New features and improvements:
Improved handling of GeoDataFrame when the active geometry column is lost from the GeoDataFrame. Previously, square bracket indexing gdf[[...]] returned a GeoDataFrame when the active geometry column was retained and a DataFrame was returned otherwise. Other pandas indexing methods (loc, iloc, etc) did not follow the same rules. The new behaviour for all indexing/reshaping operations is now as follows (#2329, #2060):
- If operations produce a DataFrame containing the active geometry column, a GeoDataFrame is returned.
- If operations produce a DataFrame containing GeometryDtype columns, but not the active geometry column, a GeoDataFrame is returned, where the active geometry column is set to None (set the new geometry column with set_geometry()).
- If operations produce a DataFrame containing no GeometryDtype columns, a DataFrame is returned (this can be upcast again by calling set_geometry() or the GeoDataFrame constructor).
- If operations produce a Series of GeometryDtype, a GeoSeries is returned, otherwise Series is returned.
Datetime fields are now read and written correctly for GIS formats which support them (e.g. GPKG, GeoJSON) with fiona 1.8.14 or higher. Previously, datetimes were read as strings (#2202).
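
The indexing rules listed earlier in this section can be checked with a small sketch. The frame below is made up for illustration: it has one attribute column, the active geometry column, and a second geometry column (other_geom is a hypothetical name).

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

gdf = gpd.GeoDataFrame(
    {
        "value": [1, 2],
        "geometry": [Point(0, 0), Point(1, 1)],  # becomes the active geometry
        "other_geom": gpd.GeoSeries([Point(2, 2), Point(3, 3)]),
    }
)

with_active = gdf[["value", "geometry"]]  # active geometry kept
without_active = gdf[["other_geom"]]      # geometry dtype, but not the active column
no_geometry = gdf[["value"]]              # no geometry columns at all
geom_col = gdf["other_geom"]              # single column of geometry dtype
plain_col = gdf["value"]                  # single non-geometry column

for name, obj in [
    ("with_active", with_active),
    ("without_active", without_active),
    ("no_geometry", no_geometry),
    ("geom_col", geom_col),
    ("plain_col", plain_col),
]:
    print(name, type(obj).__name__)
```

For `without_active`, a new active geometry column can be chosen with `without_active.set_geometry("other_geom")`.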
folium.Map keyword arguments can now be specified as the map_kwds argument to GeoDataFrame.explore() method (#2315).
Add a new parameter style_function to GeoDataFrame.explore() to enable plot styling based on GeoJSON properties (#2377).
It is now possible to write an empty GeoDataFrame to a file for supported formats (#2240). Attempting to do so will now emit a UserWarning instead of a ValueError.
Fast rectangle clipping has been exposed as GeoSeries/GeoDataFrame.clip_by_rect() (#1928).
The mask parameter of GeoSeries/GeoDataFrame.clip() now accepts a rectangular mask as a list-like to perform fast rectangle clipping using the new GeoSeries/GeoDataFrame.clip_by_rect() (#2414).
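
A minimal sketch of both entry points to rectangle clipping, using two made-up squares. The rectangle is given as (minx, miny, maxx, maxy).

```python
import geopandas as gpd
from shapely.geometry import box

# One square overlapping the clip rectangle, one far outside it.
squares = gpd.GeoSeries([box(0, 0, 2, 2), box(5, 5, 6, 6)])

# clip_by_rect works row by row: the result has the same length as the
# input, and geometries outside the rectangle become empty.
by_rect = squares.clip_by_rect(0, 0, 1, 1)
print(by_rect.area.tolist())
print(by_rect.is_empty.tolist())

# clip() with a list-like mask takes the same rectangle as a single argument.
clipped = squares.clip((0, 0, 1, 1))
print(clipped.area.sum())
```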
Bundled demo dataset naturalearth_lowres has been updated to version 5.0.1 of the source, with field ISO_A3 manually corrected for some cases (#2418).
Deprecations and compatibility notes:
- GeometryArray.equals_exact() and GeometryArray.almost_equals() have been removed. They should be replaced with GeometryArray.geom_equals_exact() and GeometryArray.geom_almost_equals() respectively (#2267).
- explicit_crs_from_epsg(), epsg_from_crs() and get_epsg_file_contents() were removed (#2340).
- The warning about GeoSeries.isna() with empty geometries present has been removed (#2349).
- Specifying a CRS in the GeoDataFrame/GeoSeries constructor which contradicted the underlying GeometryArray now raises a ValueError (#2100).
- Specifying a CRS in the GeoDataFrame constructor when no geometry column is provided and calling GeoDataFrame.set_crs on a GeoDataFrame without an active geometry column now raise a ValueError (#2100).
- Passing non-geometry data to the GeoSeries constructor is now fully deprecated and will raise a TypeError (#2314). Previously, a pandas.Series was returned for non-geometry data.
- The deprecated GeoSeries/GeoDataFrame set operations __xor__(), __or__(), __and__() and __sub__(), geopandas.io.file.read_file/to_file and geopandas.io.sql.read_postgis now emit FutureWarning instead of DeprecationWarning and will be completely removed in a future release.
- Accessing the crs of a GeoDataFrame without active geometry column is deprecated and will be removed in GeoPandas 0.12 (#2373).

Bug fixes:
- GeoSeries.to_frame now creates a GeoDataFrame with the geometry column name set correctly (#2296).
- Fix UnboundLocalError in GeoDataFrame.plot() using legend=True and missing_kwds (#2281).
- Fix explode() incorrectly relating index to columns, including where the input index is not unique (#2292).
- Fix GeoSeries.[xyz] raising an IndexError when the underlying GeoSeries contains empty points (#2335). Rows corresponding to empty points now contain np.nan.
- Fix GeoDataFrame.iloc raising a TypeError when indexing a GeoDataFrame with only a single column of GeometryDtype (#1970).
- Fix GeoDataFrame.iterfeatures() not returning features with the same field order as GeoDataFrame.columns (#2396).
- Fix GeoDataFrame.from_features() to support reading GeoJSON with null properties (#2243).
- Fix GeoDataFrame.to_parquet() not intercepting engine keyword argument, breaking consistency with pandas (#2227).
- Fix GeoDataFrame.explore() producing an error when column is of boolean dtype (#2403).
- Fix an issue where GeoDataFrame.to_postgis() output the wrong SRID for ESRI authority CRS (#2414).
- Fix GeoDataFrame.from_dict/from_features classmethods using GeoDataFrame rather than cls as the constructor.
- Fix GeoDataFrame.plot() producing incorrect colors with mixed geometry types when colors keyword is provided (#2420).

Notes on (optional) dependencies:
Acknowledgments
Thanks to everyone who contributed to this release! A total of 31 people contributed patches to this release. People with a "+" by their names contributed a patch for the first time.
Small bug-fix release:
- Fix overlay() in case no geometries are intersecting (but have overlapping total bounds) (#2172).
- Fix overlay() with keep_geom_type=True in case the overlay of two geometries in a GeometryCollection with other geometry types (#2177).
- Fix overlay() to honor the keep_geom_type keyword for the how="difference" case (#2164).
- Fix plot() with a mapclassify scheme in case the formatted legend labels have duplicates (#2166).
- Fix explore() method ignoring the vmin and vmax keywords in case they are set to 0 (#2175).
- Fix unary_union to correctly handle a GeoSeries with missing values (#2181).
- Fix in clip() (#2179).

Small bug-fix release:
- Fix overlay() with non-overlapping geometries and a non-default how (i.e. not "intersection") (#2157).

Highlights of this release:
- A new sjoin_nearest() method to join based on proximity, with the ability to set a maximum search radius (#1865). In addition, the sindex attribute gained a new method for a "nearest" spatial index query (#1865, #2053).
- A new explore() method on GeoDataFrame and GeoSeries with native support for interactive visualization based on folium / leaflet.js (#1953).
- The geopandas.sjoin()/overlay()/clip() functions are now also available as methods on the GeoDataFrame (#2141, #1984, #2150).

New features and improvements:
- Support for the value_counts() method for geometry dtype (#2047).
- The explode() method has a new ignore_index keyword (consistent with pandas' explode method) to reset the index in the result, and a new index_parts keyword to control whether a cumulative count indexing the parts of the exploded multi-geometries should be added (#1871).
- points_from_xy() is now available as a GeoSeries method from_xy (#1936).
- The to_file() method will now attempt to detect the driver (if not specified) based on the extension of the provided filename, instead of defaulting to ESRI Shapefile (#1609).
- Support for the storage_options keyword in read_parquet() for specifying filesystem-specific options (e.g. for S3) based on fsspec (#2107).
- Support for ~ (user home directory) expansion (#1876).
- Support for the convert_dtypes() method from pandas to preserve the GeoDataFrame class (#2115).
- GeoSeries.from_wkb() (#2106).
- Updated the estimate_utm_crs() method to handle crossing the antimeridian with pyproj 3.1+ (#2049).
- Changed the default geocoding provider for geocode() from GeoCode.Farm to the Photon geocoding API (https://photon.komoot.io) (#2007).

Deprecations and compatibility notes:
- The op= keyword of sjoin() to indicate which spatial predicate to use for joining is being deprecated and renamed in favor of a new predicate= keyword (#1626).
- The cascaded_union attribute is deprecated, use unary_union instead (#2074).
- pd.concat(.., axis=1) function if this results in duplicated active geometry columns (#2046).
- The explode() method currently returns a GeoSeries/GeoDataFrame with a MultiIndex, with an additional level with indices of the parts of the exploded multi-geometries. For consistency with pandas, this will change in the future and the new index_parts keyword is added to control this.

Bug fixes:
- Fix the clip() function to correctly clip MultiPoints instead of leaving them intact when partly outside of the clip bounds (#2148).
- Fix GeoSeries.isna() to correctly return a boolean Series in case of an empty GeoSeries (#2073).
- Fix for GeoDataFrame(gdf) (#2138).
- Fix for GeoDataFrame.__setitem__ (#1963).
- Fix GeoDataFrame.apply() to preserve the active geometry column name (#1955).
- Fix sjoin() to not ignore the suffixes in case of a right-join (how="right") (#2065).
- Fix GeoDataFrame.explode() with a MultiIndex (#1945).
- Fix for to/from_wkb and to/from_wkt (#1891).
- Fix to_file() and to_json() when DataFrame has duplicate columns to raise an error (#1900).
- Fix for the path_effects keyword in plot() (#2127).
- Fix GeoDataFrame.explode() to preserve attrs (#1935).

Notes on (optional) dependencies:
Acknowledgments
Thanks to everyone who contributed to this release! A total of 29 people contributed patches to this release. People with a "+" by their names contributed a patch for the first time.
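
To illustrate two items from this release, the sjoin_nearest() method with a maximum search radius and the ignore_index / index_parts keywords of explode(), here is a short sketch. All data and column names (shop, stop, dist) are made up for the example, and the nearest join assumes Shapely 2.0 or PyGEOS is available.

```python
import geopandas as gpd
from shapely.geometry import MultiPoint, Point

shops = gpd.GeoDataFrame({"shop": ["a", "b"]}, geometry=[Point(0, 0), Point(10, 10)])
stops = gpd.GeoDataFrame({"stop": ["x", "y"]}, geometry=[Point(0, 1), Point(50, 50)])

# Join each shop to its nearest stop, but only within 5 units.
# Shop "b" has no stop that close, so the default inner join drops it.
nearest = shops.sjoin_nearest(stops, max_distance=5, distance_col="dist")
print(nearest[["shop", "stop", "dist"]])

multi = gpd.GeoSeries([MultiPoint([(0, 0), (1, 1)])])

# index_parts=True keeps the original label plus a counter per part.
parts = multi.explode(index_parts=True)
print(parts.index.tolist())

# ignore_index=True resets the result to a plain 0..n-1 index.
flat = multi.explode(ignore_index=True)
print(flat.index.tolist())
```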