Setting up a GeoPandas dev environment using the pandas main branch (Meson Edition)
I’ve previously written a post about how to set up a developer environment for GeoPandas based on the latest commit of pandas on GitHub. With pandas 2.0 (or perhaps a minor patch after), pandas switched to a Meson-based build backend, and this post updates the instructions accordingly. It draws from my previous one, plus elements of this post, with some of my own fixes added.
Environment Setup
I’ve been using mambaforge for environment management outside of work for quite a while, and these instructions lightly assume it. (Coming from plain conda, the biggest implicit assumption is that the default channel is conda-forge.)
- Clone your pandas fork and add the upstream remote:
git clone git@github.com:<fork_author>/pandas
git remote add upstream git@github.com:pandas-dev/pandas
- cd into the pandas fork checkout, then update it from upstream:
git fetch upstream
git merge upstream/main
- Fetch the tags as well:
git fetch --all --tags
This is sometimes a gotcha when dealing with compatibility code based on the pandas version: in a dev environment, pandas.__version__ is git aware.
- Create the environment:
mamba create -n pandas_dev_meson python=3.11 Cython versioneer pytest pytest-xdist numpy python-dateutil pytz matplotlib pyarrow scipy meson-python meson[ninja] hypothesis pytest-asyncio
I use this manual list because environment.yml has a zoo of optional dependencies I don’t overly care about from a GeoPandas context. It also means I get to freely pick which version of Python itself to use.
- Activate the environment:
mamba activate pandas_dev_meson
- Now we need to invoke Meson. I had this crash weirdly, and I’m not exactly sure what I needed to remove, but the following should be sufficient. Make sure there are no local changes in git, then run:
rm pandas/_libs/*.pyx
rm pandas/_libs/*.pxi
rm pandas/_libs/*.pyd
rm pandas/_libs/*.c
This removes some code we actually need, so restore the missing files from the index:
git checkout .
- Build and install pandas in editable mode:
pip install -ve . --no-build-isolation --config-settings builddir="builddir" --config-settings editable-verbose=true
If debug symbols are needed, then apparently this is the magic invocation:
pip install -ve . --no-build-isolation --config-settings builddir="builddir" --config-settings setup-args="-Dbuildtype=debug"
Note that the flag --config-settings editable-verbose=true shows output when importing pandas triggers a rebuild of the underlying extensions.
- Install the pre-commit hooks (technically optional if only working on geopandas):
pre-commit install
Add GeoPandas on top
- Install the GeoPandas dependencies:
mamba install -y fiona pyproj shapely pyogrio black pre-commit ipython jupyterlab
- cd to the geopandas fork dir and run:
python -m pip install -e .
pre-commit install
At this point, if everything has worked, you should be able to run the geopandas test suite successfully.
Reminder: how to update the pandas code
Despite being reasonably happy reading and editing (and on rare occasion, writing) Cython, I never seem to dig into it enough to remember what the right invocation for the compile step is. This is my self reminder:
One significant change of the new build backend is that in editable mode, Cython extensions are automatically rebuilt at import time, meaning you don’t have to compile Cython files manually, which is a nice quality of life change.
Extra: pyogrio dev env on Windows with OSGeo4W
After experimenting with this, I’ve reverted to using WSL as it seems less flaky overall; I ended up with a broken environment down the track and I’m not exactly sure why. These are my notes on installing pyogrio from source on Windows, which flesh out the notes in [the docs](https://pyogrio.readthedocs.io/en/latest/install.html#windows). I’ve done this most recently with GDAL 3.6.4 from OSGeo4W with QGIS 3.30, but also with GDAL 3.5.1 in the past.
- Download the OSGeo4W network installer from https://www.qgis.org/en/site/forusers/download.html
- (As administrator) run the installer for all users, and install gdal and gdal-devel (the latter adds header files and populates the \include dir).
- Create a conda env:
conda create -n pyogrio_dev python=3.11 pandas shapely Cython pyproj ipython pytest pyarrow versioneer
(Do not install fiona! This will cause DLL loading errors from the conflicting versions of GDAL. Perhaps this can work if building fiona from source as well, but I haven’t tried.)
- Activate the environment:
conda activate pyogrio_dev
- In the OSGeo4W shell, run
gdalinfo --version
We need to know the version of GDAL to pass to the installer.
- Switch to the dir containing the checkout of pyogrio.
- Install pyogrio:
python -m pip install --install-option=build_ext --install-option="-IC:\OSGeo4W\include" --install-option="-lgdal_i" --install-option="-LC:\OSGeo4W\lib" --no-deps --no-use-pep517 --install-option=--gdalversion --install-option=3.6.4 -e . -v
(where you replace 3.6.4 with whatever version of GDAL is reported by gdalinfo). It looks a bit odd supplying --gdalversion and 3.6.4 separately, but the pyogrio setup code looks specifically for the key --gdalversion, so we have to pass these as two consecutive arguments.
- Alternatively, set environment variables
$env:GDAL_VERSION="3.6.4"; $env:GDAL_LIBRARY_PATH="C:\OSGeo4W\lib"; $env:GDAL_INCLUDE_PATH="C:\OSGeo4W\include"
and run
python -m pip install --no-deps --force-reinstall --no-use-pep517 -e . -v
- You might have to set the environment variable GDAL_DATA. I’ve now set this with
$env:GDAL_DATA="C:\OSGeo4W\apps\gdal\share\gdal"
but I remember this “just working” in the past.
- If everything has gone well, importing pyogrio will work and the tests will pass when run.
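Rather than copying the GDAL version by hand, the output of gdalinfo --version can be parsed. The sketch below is written for a POSIX-style shell (WSL or Git Bash, say, which is my assumption here; the steps above use PowerShell syntax for environment variables). The function name gdal_version_from_info is hypothetical, and it assumes the output starts with a line of the form "GDAL 3.6.4, released 2023/04/17".

```shell
# Hypothetical helper: read `gdalinfo --version` output on stdin and print
# only the version number. Prints nothing if the input does not start
# with "GDAL <digits>".
gdal_version_from_info() {
    sed -n 's/^GDAL \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage:
#   export GDAL_VERSION="$(gdalinfo --version | gdal_version_from_info)"
```

An empty result is worth treating as an error before kicking off the build, since the version is required in the steps above.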
pip 23.1 compatibility
In pip 23.1, the --install-option flag was removed. For now, it seems that using --config-settings (the apparent replacement) doesn’t behave. Instead, supply the environment variables as in the “Alternatively, set environment variables” step above. There’s potentially some work to do on the packaging of pyogrio to make this a little easier, but I’m not a packaging expert.
Extra: pyogrio linux install
- Create a conda env:
mamba create -n pyogrio_dev python=3.11 pandas shapely Cython pyproj ipython pytest pyarrow versioneer gdal
- Clone pyogrio & fetch tags.
- Get us a GDAL to build against:
  - Using apt:
sudo apt install gdal-bin
sudo apt-get install libgdal-dev
  - Using conda: I’m yet to actually test this directly because it seems to require there to be no system GDAL (or no system GDAL on the path), and I can’t do that without breaking my existing environment. Originally I presumed this wouldn’t install the requisite header files to build other packages against, but pyogrio does this in its CI. In theory though, this is great because you’re not tied to the version of GDAL bundled into Debian stable, and don’t have to update Ubuntu to get a new version of GDAL.
mamba install gdal
- Build pyogrio in development mode:
python setup.py develop
- Install geopandas without its dependencies:
pip install --no-deps geopandas
We don’t want to install fiona, which has another version of GDAL bundled into the wheel (could perhaps use conda/mamba to install this too).
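Before building, it can help to confirm which GDAL development install is visible on the path. This is a minimal sketch: the function name gdal_dev_version is hypothetical, and it assumes the build locates GDAL through the gdal-config script that the GDAL development files provide.

```shell
# Hypothetical helper: print the GDAL version reported by gdal-config,
# or fail with a hint if gdal-config is not on the PATH.
gdal_dev_version() {
    if ! command -v gdal-config >/dev/null 2>&1; then
        echo "gdal-config not found; install libgdal-dev (apt) or gdal (conda)" >&2
        return 1
    fi
    gdal-config --version
}

# Usage:
#   gdal_dev_version && python setup.py develop
```

If both a system GDAL and a conda GDAL are installed, the version printed here shows which one comes first on the path.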
Author Matt Richards
LastMod September 10, 2023 (a5f2d7d)