
Alternatives

TorchGeo is not the only geospatial machine learning library out there; there are a number of alternatives you can consider using. The goal of this page is to provide an up-to-date listing of these libraries and the features they support in order to help you decide which library is right for you. Criteria for inclusion on this list include:

  • geospatial: Must be primarily intended for working with geospatial, remote sensing, or satellite imagery data. This rules out libraries like torchvision, which provides little to no support for multispectral data or geospatial transforms.

  • machine learning: Must provide basic machine learning functionality. This rules out libraries like GDAL, which is useful for data loading but offers no support for machine learning.

  • library: Must be an actively developed software library with testing and releases on repositories like PyPI or CRAN. This rules out libraries like TorchSat, RoboSat, and Solaris, which have been abandoned and are no longer maintained.

When deciding which library is most useful to you, it is worth considering the features they support, how actively the library is being developed, and how popular the library is, roughly in that order.

Note

Software is a living, breathing organism and is constantly undergoing change. If any of the above information is incorrect or out of date, or if you want to add a new project to this list, please open a PR!

Last updated: 28 August 2024

Features

Key: ✅ full support, 🚧 partial support, ❌ no support

| Library | ML Backend | I/O Backend | Spatial Backend | Transform Backend | Datasets | Weights | CLI | Reprojection | STAC | Time-Series |
|---|---|---|---|---|---|---|---|---|---|---|
| TorchGeo | PyTorch | GDAL, h5py, laspy, OpenCV, pandas, pillow, scipy | R-tree | Kornia | 82 | 68 | ✅ | ✅ | ❌ | 🚧 |
| eo-learn | scikit-learn | GDAL, OpenCV, pandas | geopandas | numpy | 0 | 0 | ❌ | ✅ | ❌ | 🚧 |
| Raster Vision | PyTorch, TensorFlow* | GDAL, OpenCV, pandas, pillow, scipy, xarray | STAC | Albumentations | 0 | 6 | ✅ | ✅ | ✅ | ✅ |
| PaddleRS | PaddlePaddle | GDAL, OpenCV | shapely | numpy | 7 | 14 | 🚧 | ✅ | ❌ | 🚧 |
| segment-geospatial | PyTorch | GDAL, OpenCV, pandas | geopandas | numpy | 0 | 0 | ❌ | ✅ | ❌ | ❌ |
| DeepForest | PyTorch | GDAL, OpenCV, pandas, pillow, scipy | R-tree | Albumentations | 0 | 3 | ❌ | ❌ | ❌ | ❌ |
| SITS | R Torch | GDAL | tidyverse | | 22 | 0 | ❌ | ✅ | ✅ | ✅ |
| TerraTorch | PyTorch | GDAL, h5py, pandas, xarray | R-tree | Albumentations | 16 | 1 | ✅ | ✅ | ❌ | 🚧 |
| scikit-eo | scikit-learn, TensorFlow | pandas, scipy, numpy, rasterio | geopandas | numpy | 0 | 0 | ❌ | ❌ | ❌ | 🚧 |

*Support for TensorFlow was dropped in Raster Vision 0.12.

ML Backend: The machine learning libraries used by the project. For example, if you are a scikit-learn user, eo-learn may be perfect for you, but if you need more advanced deep learning support, you may want to choose a different library.

I/O Backend: The I/O libraries used by the project to read data. This gives you a rough idea of which file formats are supported. For example, if you need to work with lidar data, a project that uses laspy may be important to you.

Spatial Backend: The spatial library used to perform spatial joins and compute intersections based on geospatial metadata. This may be important to you if you intend to scale up your experiments.
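The core operation any of these spatial backends provides is an intersection test over bounding boxes. A minimal pure-Python sketch (the `BoundingBox` class and the tile/mask names here are hypothetical, not any library's API):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned bounding box in map coordinates (hypothetical helper)."""

    minx: float
    maxx: float
    miny: float
    maxy: float

    def intersects(self, other: "BoundingBox") -> bool:
        # Two boxes overlap iff they overlap on both the x and y axes.
        return (
            self.minx <= other.maxx
            and other.minx <= self.maxx
            and self.miny <= other.maxy
            and other.miny <= self.maxy
        )


# A spatial join pairs every image tile with every label mask it overlaps.
images = {"tile_a": BoundingBox(0, 10, 0, 10), "tile_b": BoundingBox(20, 30, 0, 10)}
labels = {"mask_1": BoundingBox(5, 15, 5, 15)}
pairs = [
    (img, lab)
    for img, ibox in images.items()
    for lab, lbox in labels.items()
    if ibox.intersects(lbox)
]
print(pairs)  # → [('tile_a', 'mask_1')]
```

Spatial indexes such as R-tree avoid this brute-force pairwise scan by indexing the boxes, but the underlying intersection predicate is the same.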

Transform Backend: The transform library used to perform data augmentation. For example, Kornia performs all augmentations on PyTorch Tensors, allowing you to run your transforms on the GPU for an entire mini-batch at a time.

Datasets: The number of geospatial datasets built into the library. Note that most projects have something similar to TorchGeo’s RasterDataset and VectorDataset, allowing you to work with generic raster and vector files. Collections of datasets are only counted a single time, so data loaders for Landsats 1–9 are a single dataset, and data loaders for SpaceNets 1–8 are also a single dataset.

Weights: The number of model weights pre-trained on geospatial data that are offered by the library. Note that most projects support hundreds of model architectures via a library like PyTorch Image Models, and can use models pre-trained on ImageNet. There are far fewer libraries that provide foundation model weights pre-trained on multispectral satellite imagery.

CLI: Whether or not the library has a command-line interface. This low-code or no-code solution is convenient for users with limited programming experience, and can offer nice features for reproducing research and fast experimentation.

Reprojection: Whether or not the library supports automatic reprojection and resampling of data. Without this, users are forced to manually warp data using a library like GDAL if they want to combine datasets in different coordinate systems or spatial resolutions.
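Resampling to a common resolution is the simpler half of this problem. The following stdlib-only sketch shows nearest-neighbor resampling of a 2D grid (illustrative only: a real warper such as GDAL also transforms coordinates between coordinate reference systems):

```python
def resample_nearest(grid, out_rows, out_cols):
    """Nearest-neighbor resampling of a 2D grid to a new shape.

    Illustrative only: a real warper also reprojects coordinates between
    coordinate reference systems; this sketch only changes resolution.
    """
    in_rows, in_cols = len(grid), len(grid[0])
    out = []
    for r in range(out_rows):
        src_r = min(in_rows - 1, r * in_rows // out_rows)
        row = []
        for c in range(out_cols):
            src_c = min(in_cols - 1, c * in_cols // out_cols)
            row.append(grid[src_r][src_c])
        out.append(row)
    return out


# Downsample a 4x4 raster to 2x2: each output pixel takes the value of the
# upper-left pixel of its 2x2 source block.
grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
print(resample_nearest(grid, 2, 2))  # → [[1, 2], [3, 4]]
```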

STAC: Whether or not the library supports the SpatioTemporal Asset Catalog (STAC) specification. STAC is becoming a popular means of indexing spatiotemporal data like satellite imagery.
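To make the indexing idea concrete, here is a simplified STAC Item, the JSON record a catalog stores for one scene. Field names follow the STAC spec, but the id, bbox, and asset href are invented for this example, and the item is not schema-complete:

```python
import json

# A simplified STAC Item: the JSON record a catalog stores for one scene.
# Field names follow the STAC spec; the id, bbox, and href are invented.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-scene-20240828",
    "bbox": [-122.6, 37.6, -122.3, 37.9],
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [
                [-122.6, 37.6],
                [-122.3, 37.6],
                [-122.3, 37.9],
                [-122.6, 37.9],
                [-122.6, 37.6],
            ]
        ],
    },
    "properties": {"datetime": "2024-08-28T00:00:00Z"},
    "assets": {
        "image": {
            "href": "https://example.com/example-scene.tif",
            "type": "image/tiff; application=geotiff",
        }
    },
    "links": [],
}

# Spatiotemporal queries can filter on bbox and properties["datetime"]
# without ever opening the imagery itself.
encoded = json.dumps(item)
print(json.loads(encoded)["properties"]["datetime"])
```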

Time-Series: Whether or not the library supports time-series modeling. For many remote sensing applications, time-series data provide important signals.

GitHub

These are metrics that can be scraped from GitHub.

| Library | Contributors | Forks | Watchers | Stars | Issues | PRs | Releases | Commits | Core SLOCs | Test SLOCs | Test Coverage | License |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TorchGeo | 72 | 308 | 44 | 2,409 | 419 | 1,714 | 11 | 2,074 | 30,761 | 16,058 | 100% | MIT |
| eo-learn | 40 | 300 | 46 | 1,108 | 159 | 638 | 44 | 2,470 | 8,207 | 5,932 | 92% | MIT |
| Raster Vision | 32 | 381 | 71 | 2,046 | 697 | 1,382 | 22 | 3,614 | 22,779 | 9,429 | 90% | Apache-2.0 |
| PaddleRS | 23 | 89 | 13 | 374 | 91 | 116 | 3 | 644 | 21,859 | 3,384 | 48% | Apache-2.0 |
| segment-geospatial | 17 | 281 | 55 | 2,834 | 129 | 104 | 27 | 186 | 5,598 | 92 | 22% | MIT |
| DeepForest | 17 | 172 | 17 | 474 | 413 | 301 | 44 | 864 | 3,357 | 1,794 | 86% | MIT |
| SITS | 14 | 76 | 28 | 451 | 622 | 583 | 44 | 6,244 | 24,284 | 8,697 | 94% | GPL-2.0 |
| TerraTorch | 9 | 10 | 9 | 121 | 46 | 92 | 2 | 243 | 10,101 | 583 | 44% | Apache-2.0 |
| scikit-eo | 6 | 17 | 8 | 132 | 20 | 11 | 15 | 496 | 1,636 | 94 | 37% | Apache-2.0 |

Contributors: The number of contributors. This is one of the most important metrics for project development. The more developers you have, the higher the bus factor, and the more likely the project is to survive. More contributors also means more new features and bug fixes.

Forks: The number of times the git repository has been forked. This gives you an idea of how many people are attempting to modify the source code, even if they have not (yet) contributed back their changes.

Watchers: The number of people watching activity on the repository. These are people who are interested enough to get notifications for every issue, PR, release, or discussion.

Stars: The number of people who have starred the repository. This is not the best metric for number of users, and instead gives you a better idea about the amount of hype surrounding the project.

Issues: The total number of open and closed issues. Although it may seem counterintuitive, the more issues, the better. Large projects like PyTorch have tens of thousands of open issues. This does not mean that PyTorch is broken; it means that it is popular and has enough users to discover corner cases and open feature requests.

PRs: The total number of open and closed pull requests. This tells you how active development of the project has been. Note that this metric can be artificially inflated by bots like dependabot.

Releases: The number of software releases. The frequency of releases varies from project to project. The important thing to look for is multiple releases.

Commits: The number of commits on the main development branch. This is another metric for how active development has been. However, this can vary a lot depending on whether PRs are merged with or without squashing first.

Core SLOCs: The number of source lines of code in the core library, excluding empty lines and comments. This tells you how large the library is, and how long it would take someone to write something like it themselves. We use scc to compute SLOCs and exclude markdown languages from the count.
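As a rough picture of what a SLOC count measures, the following sketch counts non-empty, non-comment lines (a simplification: unlike scc, it ignores block comments, docstrings, and language-specific syntax):

```python
def count_sloc(source: str, comment_prefix: str = "#") -> int:
    """Count source lines of code: non-empty lines that are not pure comments.

    A rough approximation of what scc reports; it does not handle block
    comments, docstrings, or '#' characters inside string literals.
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


snippet = """\
# A comment

def add(a, b):
    return a + b
"""
print(count_sloc(snippet))  # → 2
```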

Test SLOCs: The number of source lines of code in the testing suite, excluding empty lines and comments. This tells you how well tested the project is. A good goal to strive for is a similar amount of code for testing as there is in the core library itself.

Test Coverage: The percentage of the core library that is hit by unit tests. This is especially important for interpreted languages like Python and R where there is no compiler type checking. 100% test coverage is ideal, but 80% is considered good.

License: The license the project is distributed under. For commercial researchers, this may be very important and decide whether or not they are able to use the software.

Downloads

These are download metrics for the project. Note that these numbers can be artificially inflated by mirrors and installs during continuous integration. They give you a better idea of the number of projects that depend on a library than the number of users of that library.

| Library | PyPI/CRAN Last Week | PyPI/CRAN Last Month | PyPI/CRAN All Time | Conda All Time | Total All Time |
|---|---|---|---|---|---|
| TorchGeo | 1,828 | 9,789 | 255,293 | 21,108 | 276,401 |
| eo-learn | 319 | 1,560 | 141,983 | 36,205 | 178,188 |
| Raster Vision | 138 | 652 | 61,938 | 3,254 | 65,192 |
| PaddleRS | 10 | 36 | 1,642 | 0 | 1,642 |
| segment-geospatial | 1,553 | 7,363 | 117,664 | 18,147 | 135,811 |
| DeepForest | 564 | 3,652 | 761,520 | 62,869 | 824,389 |
| SITS | 304 | 648 | 12,767 | 78,976 | 91,743 |
| TerraTorch | 259 | 988 | 2,378 | 0 | 2,378 |
| scikit-eo | 162 | 621 | 12,048 | 0 | 12,048 |

PyPI Downloads: The number of downloads from the Python Packaging Index. PyPI download metrics are computed by PyPI Stats and PePy.

CRAN Downloads: The number of downloads from the Comprehensive R Archive Network. CRAN download metrics are computed by Meta CRAN and DataScienceMeta.

Conda Downloads: The number of downloads from Conda Forge. Conda download metrics are computed by Conda Forge.
