
Alternatives

TorchGeo is not the only geospatial machine learning library out there; there are a number of alternatives you can consider using. The goal of this page is to provide an up-to-date listing of these libraries and the features they support, in order to help you decide which library is right for you. Criteria for inclusion on this list include:

  • geospatial: Must be primarily intended for working with geospatial, remote sensing, or satellite imagery data. This rules out libraries like torchvision, which provides little to no support for multispectral data or geospatial transforms.

  • machine learning: Must provide basic machine learning functionality. This rules out libraries like GDAL, which is useful for data loading but offers no support for machine learning.

  • library: Must be an actively developed software library with testing and releases on repositories like PyPI or CRAN. This rules out libraries like TorchSat, RoboSat, and Solaris, which have been abandoned and are no longer maintained.

When deciding which library is most useful to you, it is worth considering the features they support, how actively the library is being developed, and how popular the library is, roughly in that order.

Note

Software is a living, breathing organism and is constantly undergoing change. If any of the above information is incorrect or out of date, or if you want to add a new project to this list, please open a PR!

Last updated: 30 November 2024

Features

Key: ✅ full support, 🚧 partial support, ❌ no support

| Library | ML Backend | I/O Backend | Spatial Backend | Transform Backend | Datasets | Weights | CLI | Reprojection | STAC | Time-Series |
|---|---|---|---|---|---|---|---|---|---|---|
| TorchGeo | PyTorch | GDAL, h5py, laspy, OpenCV, pandas, pillow, scipy | R-tree | Kornia | 92 | 69 | ✅ | ✅ | ❌ | 🚧 |
| eo-learn | scikit-learn | GDAL, OpenCV, pandas | geopandas | numpy | 0 | 0 | ❌ | ✅ | ❌ | 🚧 |
| Raster Vision | PyTorch, TensorFlow* | GDAL, OpenCV, pandas, pillow, scipy, xarray | STAC | Albumentations | 0 | 6 | ✅ | ✅ | ✅ | ✅ |
| PaddleRS | PaddlePaddle | GDAL, OpenCV | shapely | numpy | 7 | 14 | 🚧 | ✅ | ❌ | 🚧 |
| segment-geospatial | PyTorch | GDAL, OpenCV, pandas | geopandas | numpy | 0 | 0 | ❌ | ✅ | ❌ | ❌ |
| DeepForest | PyTorch | GDAL, OpenCV, pandas, pillow, scipy | R-tree | Albumentations | 0 | 4 | ❌ | ❌ | ❌ | ❌ |
| TerraTorch | PyTorch | GDAL, h5py, pandas, xarray | R-tree | Albumentations | 22 | 1 | ✅ | ✅ | ❌ | 🚧 |
| SITS | R Torch | GDAL | | tidyverse | 22 | 0 | ❌ | ✅ | ✅ | ✅ |
| scikit-eo | scikit-learn, TensorFlow | pandas, scipy, numpy, rasterio | geopandas | numpy | 0 | 0 | ❌ | ❌ | ❌ | 🚧 |

*Support for TensorFlow was dropped in Raster Vision 0.12.

ML Backend: The machine learning libraries used by the project. For example, if you are a scikit-learn user, eo-learn may be perfect for you, but if you need more advanced deep learning support, you may want to choose a different library.

I/O Backend: The I/O libraries used by the project to read data. This gives you a rough idea of which file formats are supported. For example, if you need to work with lidar data, a project that uses laspy may be important to you.

Spatial Backend: The spatial library used to perform spatial joins and compute intersections based on geospatial metadata. This may be important to you if you intend to scale up to larger regions or datasets.

Transform Backend: The transform library used to perform data augmentation. For example, Kornia performs all augmentations on PyTorch Tensors, allowing you to run your transforms on the GPU for an entire mini-batch at a time.
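To illustrate the batch-level style that Kornia uses, here is a minimal sketch in plain PyTorch (a hypothetical 4-image mini-batch; Kornia wraps Tensor operations like this in composable augmentation modules):

```python
import torch

# A hypothetical mini-batch of 4 RGB images, shape (B, C, H, W).
batch = torch.rand(4, 3, 64, 64)

# Batch-level augmentation: one horizontal flip applied to every image
# at once by flipping the width axis. Moving `batch` to the GPU first
# would run the same operation on the GPU, which is the main advantage
# of Tensor-based transform backends like Kornia.
flipped = torch.flip(batch, dims=[-1])

assert flipped.shape == batch.shape
```

NumPy-based backends instead operate on one array at a time on the CPU, which is simpler but cannot amortize work across a mini-batch.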

Datasets: The number of geospatial datasets built into the library. Note that most projects have something similar to TorchGeo's RasterDataset and VectorDataset, allowing you to work with generic raster and vector files. Collections of datasets are only counted a single time, so data loaders for Landsats 1–9 are a single dataset, and data loaders for SpaceNets 1–8 are also a single dataset.

Weights: The number of model weights pre-trained on geospatial data that are offered by the library. Note that most projects support hundreds of model architectures via a library like PyTorch Image Models, and can use models pre-trained on ImageNet. There are far fewer libraries that provide foundation model weights pre-trained on multispectral satellite imagery.

CLI: Whether or not the library has a command-line interface. This low-code or no-code solution is convenient for users with limited programming experience, and can offer nice features for reproducing research and fast experimentation.

Reprojection: Whether or not the library supports automatic reprojection and resampling of data. Without this, users are forced to manually warp data using a library like GDAL if they want to combine datasets in different coordinate systems or spatial resolutions.
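To see why this matters, consider combining a 30 m raster with a 10 m one: without automatic resampling, the coarser grid must be resampled by hand. A stdlib-only sketch of nearest-neighbor upsampling by an integer factor (toy lists standing in for raster bands; real workflows would use GDAL or rasterio):

```python
def upsample_nearest(grid, factor):
    """Nearest-neighbor upsample a 2D grid (list of rows) by an integer factor."""
    out = []
    for row in grid:
        # Repeat each value `factor` times along the width...
        wide = [value for value in row for _ in range(factor)]
        # ...and repeat each widened row `factor` times along the height.
        for _ in range(factor):
            out.append(list(wide))
    return out


coarse = [[1, 2],
          [3, 4]]                       # e.g. a 2x2 patch of a 30 m raster
fine = upsample_nearest(coarse, 3)      # resampled to 10 m: now 6x6
assert len(fine) == 6 and len(fine[0]) == 6
assert fine[0][:3] == [1, 1, 1]
```

Real reprojection also involves warping between coordinate reference systems, not just resampling, which is why libraries delegate it to GDAL rather than reimplementing it.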

STAC: Whether or not the library supports the SpatioTemporal Asset Catalog (STAC) specification. STAC is becoming a popular means of indexing spatiotemporal data like satellite imagery.
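For reference, a STAC catalog describes data as JSON "Items". A heavily simplified, illustrative sketch of one (the ID, coordinates, and URL are made up; see the STAC Item specification for the full set of required fields):

```python
import json

# Illustrative, simplified STAC Item (not guaranteed to be spec-complete).
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-scene-001",                       # hypothetical ID
    "bbox": [-122.6, 37.5, -122.3, 37.9],
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-122.6, 37.5], [-122.3, 37.5],
            [-122.3, 37.9], [-122.6, 37.9],
            [-122.6, 37.5],
        ]],
    },
    "properties": {"datetime": "2024-11-30T00:00:00Z"},
    "assets": {
        "red": {"href": "https://example.com/B4.tif"}  # hypothetical URL
    },
    "links": [],
}

# Items are plain JSON, so a library can index space (bbox/geometry) and
# time (properties.datetime) without opening the imagery itself.
serialized = json.dumps(item)
```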

Time-Series: Whether or not the library supports time-series modeling. For many remote sensing applications, time-series data provide important signals.

GitHub

These are metrics that can be scraped from GitHub.

| Library | Contributors | Forks | Watchers | Stars | Issues | PRs | Releases | Commits | Core SLOCs | Test SLOCs | Test Coverage | License |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TorchGeo | 76 | 352 | 51 | 2,790 | 446 | 1,860 | 13 | 2,193 | 29,305 | 17,294 | 100% | MIT |
| eo-learn | 40 | 299 | 46 | 1,131 | 160 | 640 | 45 | 2,472 | 7,497 | 5,872 | 92% | MIT |
| Raster Vision | 32 | 388 | 71 | 2,090 | 701 | 1,430 | 23 | 3,614 | 21,734 | 8,792 | 90% | Apache-2.0 |
| PaddleRS | 23 | 91 | 13 | 400 | 93 | 116 | 3 | 644 | 20,679 | 3,239 | 48% | Apache-2.0 |
| segment-geospatial | 20 | 316 | 61 | 3,078 | 150 | 136 | 38 | 229 | 6,845 | 92 | 22% | MIT |
| DeepForest | 17 | 176 | 17 | 524 | 439 | 351 | 47 | 938 | 3,320 | 1,886 | 86% | MIT |
| TerraTorch | 16 | 24 | 12 | 171 | 78 | 185 | 8 | 606 | 14,933 | 2,077 | 44% | Apache-2.0 |
| SITS | 14 | 78 | 28 | 483 | 654 | 590 | 44 | 6,244 | 24,284 | 8,697 | 94% | GPL-2.0 |
| scikit-eo | 7 | 20 | 9 | 192 | 24 | 13 | 17 | 510 | 1,617 | 170 | 37% | Apache-2.0 |

Contributors: The number of contributors. This is one of the most important metrics for project development. The more developers you have, the higher the bus factor, and the more likely the project is to survive. More contributors also means more new features and bug fixes.

Forks: The number of times the git repository has been forked. This gives you an idea of how many people are attempting to modify the source code, even if they have not (yet) contributed back their changes.

Watchers: The number of people watching activity on the repository. These are people who are interested enough to get notifications for every issue, PR, release, or discussion.

Stars: The number of people who have starred the repository. This is not a good proxy for the number of users; it gives you a better idea of the amount of hype surrounding the project.

Issues: The total number of open and closed issues. Although it may seem counterintuitive, the more issues, the better. Large projects like PyTorch have tens of thousands of open issues. This does not mean that PyTorch is broken; it means that the project is popular enough to surface corner cases and open feature requests.

PRs: The total number of open and closed pull requests. This tells you how active development of the project has been. Note that this metric can be artificially inflated by bots like dependabot.

Releases: The number of software releases. The frequency of releases varies from project to project. The important thing to look for is multiple releases.

Commits: The number of commits on the main development branch. This is another metric for how active development has been. However, this can vary a lot depending on whether PRs are merged with or without squashing first.

Core SLOCs: The number of source lines of code in the core library, excluding empty lines and comments. This tells you how large the library is, and how long it would take someone to write something like it themselves. We use scc to compute SLOCs and exclude markup languages from the count.
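The idea behind an SLOC count can be sketched in a few lines of Python (a toy approximation of what a tool like scc does; it handles only blank lines and full-line `#` comments, not block comments or other languages):

```python
def count_sloc(lines):
    """Count source lines of code, skipping blanks and full-line '#' comments."""
    return sum(
        1
        for line in lines
        if line.strip() and not line.strip().startswith("#")
    )


sample = [
    "import os",
    "",
    "# a comment",
    "print(os.name)",
]
assert count_sloc(sample) == 2  # blank line and comment are excluded
```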

Test SLOCs: The number of source lines of code in the testing suite, excluding empty lines and comments. This tells you how well tested the project is. A good goal to strive for is a similar amount of code for testing as there is in the core library itself.

Test Coverage: The percentage of the core library that is hit by unit tests. This is especially important for interpreted languages like Python and R where there is no compiler type checking. 100% test coverage is ideal, but 80% is considered good.

License: The license the project is distributed under. For commercial researchers, this may be very important and decide whether or not they are able to use the software.

Downloads

These are download metrics for the project. Note that these numbers can be artificially inflated by mirrors and installs during continuous integration. They give you a better idea of the number of projects that depend on a library than the number of users of that library.

| Library | PyPI/CRAN Last Week | PyPI/CRAN Last Month | PyPI/CRAN All Time | Conda All Time | Total All Time |
|---|---|---|---|---|---|
| TorchGeo | 8,435 | 30,948 | 311,897 | 25,174 | 337,071 |
| eo-learn | 309 | 2,370 | 156,309 | 40,325 | 196,634 |
| Raster Vision | 9,198 | 31,588 | 115,670 | 3,968 | 119,638 |
| PaddleRS | 16 | 53 | 2,029 | 0 | 2,029 |
| segment-geospatial | 1,956 | 10,689 | 157,443 | 26,576 | 184,019 |
| DeepForest | 767 | 13,925 | 827,339 | 71,367 | 898,706 |
| TerraTorch | 318 | 1,322 | 7,037 | 0 | 7,037 |
| SITS | 120 | 539 | 14,618 | 78,976 | 91,743 |
| scikit-eo | 115 | 717 | 14,700 | 0 | 14,700 |

PyPI Downloads: The number of downloads from the Python Packaging Index. PyPI download metrics are computed by PyPI Stats and PePy.

CRAN Downloads: The number of downloads from the Comprehensive R Archive Network. CRAN download metrics are computed by Meta CRAN and DataScienceMeta.

Conda Downloads: The number of downloads from Conda Forge. Conda download metrics are computed by Conda Forge.
