torchgeo.datamodules¶
Geospatial DataModules¶
AgriFieldNet¶
- class torchgeo.datamodules.AgriFieldNetDataModule(batch_size=64, patch_size=256, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the AgriFieldNet dataset.
New in version 0.6.
- __init__(batch_size=64, patch_size=256, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new AgriFieldNetDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to AgriFieldNet.
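A minimal usage sketch; the paths argument is an assumption about the AgriFieldNet dataset signature and may differ by release:

    from torchgeo.datamodules import AgriFieldNetDataModule

    # Extra keyword arguments are forwarded to the AgriFieldNet dataset.
    datamodule = AgriFieldNetDataModule(
        batch_size=32,
        patch_size=256,
        length=1000,  # number of samples drawn per training epoch
        num_workers=4,
        paths="data/agrifieldnet",  # assumed dataset argument
    )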
Chesapeake Land Cover¶
- class torchgeo.datamodules.ChesapeakeCVPRDataModule(train_splits, val_splits, test_splits, batch_size=64, patch_size=256, length=None, num_workers=0, class_set=7, use_prior_labels=False, prior_smoothing_constant=0.0001, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the Chesapeake CVPR Land Cover dataset.
Uses the random splits defined per state to partition tiles into train, val, and test sets.
- __init__(train_splits, val_splits, test_splits, batch_size=64, patch_size=256, length=None, num_workers=0, class_set=7, use_prior_labels=False, prior_smoothing_constant=0.0001, **kwargs)[source]¶
Initialize a new ChesapeakeCVPRDataModule instance.
- Parameters:
train_splits (list[str]) – Splits used to train the model, e.g., [“ny-train”].
val_splits (list[str]) – Splits used to validate the model, e.g., [“ny-val”].
test_splits (list[str]) – Splits used to test the model, e.g., [“ny-test”].
batch_size (int) – Size of each mini-batch.
patch_size (int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
class_set (int) – The high-resolution land cover class set to use (5 or 7).
use_prior_labels (bool) – Flag for using a prior over high-resolution classes instead of the high-resolution labels themselves.
prior_smoothing_constant (float) – Additive smoothing to add when using prior labels.
**kwargs (Any) – Additional keyword arguments passed to ChesapeakeCVPR.
- Raises:
ValueError – If use_prior_labels=True is used with class_set=7.
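A usage sketch combining the state-wise splits with prior labels; root is an assumed ChesapeakeCVPR dataset argument:

    from torchgeo.datamodules import ChesapeakeCVPRDataModule

    datamodule = ChesapeakeCVPRDataModule(
        train_splits=["ny-train"],
        val_splits=["ny-val"],
        test_splits=["ny-test"],
        batch_size=64,
        patch_size=256,  # multiple of 32
        class_set=5,  # use_prior_labels=True requires class_set=5
        use_prior_labels=True,
        root="data/chesapeake/cvpr",  # assumed dataset argument
    )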
- setup(stage)[source]¶
Set up datasets and samplers.
- Parameters:
stage (str) – Either ‘fit’, ‘validate’, ‘test’, or ‘predict’.
- on_after_batch_transfer(batch, dataloader_idx)[source]¶
Apply batch augmentations to the batch after it is transferred to the device.
- Parameters:
batch (dict[str, torch.Tensor]) – A batch of data that needs to be altered or augmented.
dataloader_idx (int) – The index of the dataloader to which the batch belongs.
- Returns:
A batch of data.
- Return type:
dict[str, torch.Tensor]
L7 Irish¶
- class torchgeo.datamodules.L7IrishDataModule(batch_size=1, patch_size=224, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the L7 Irish dataset.
New in version 0.5.
- __init__(batch_size=1, patch_size=224, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new L7IrishDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to L7Irish.
L8 Biome¶
- class torchgeo.datamodules.L8BiomeDataModule(batch_size=1, patch_size=224, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the L8 Biome dataset.
New in version 0.5.
- __init__(batch_size=1, patch_size=224, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new L8BiomeDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to L8Biome.
MMFlood¶
- class torchgeo.datamodules.MMFloodDataModule(batch_size=32, patch_size=512, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the MMFlood dataset.
New in version 0.7.
- __init__(batch_size=32, patch_size=512, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new MMFloodDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to MMFlood.
NAIP¶
- class torchgeo.datamodules.NAIPChesapeakeDataModule(batch_size=64, patch_size=256, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the NAIP and Chesapeake datasets.
Uses the train/val/test splits from the dataset.
- __init__(batch_size=64, patch_size=256, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new NAIPChesapeakeDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to NAIP (prefix keys with naip_) and Chesapeake (prefix keys with chesapeake_).
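A sketch of the prefix convention; the paths arguments are assumptions about the NAIP and Chesapeake dataset signatures:

    from torchgeo.datamodules import NAIPChesapeakeDataModule

    datamodule = NAIPChesapeakeDataModule(
        batch_size=64,
        patch_size=256,
        num_workers=4,
        naip_paths="data/naip",              # "naip_" prefix is stripped, then passed to NAIP
        chesapeake_paths="data/chesapeake",  # "chesapeake_" prefix is stripped, then passed to Chesapeake
    )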
I/O Bench¶
- class torchgeo.datamodules.IOBenchDataModule(batch_size=32, patch_size=256, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the I/O benchmark dataset.
New in version 0.6.
- __init__(batch_size=32, patch_size=256, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new IOBenchDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to IOBench.
Sentinel¶
- class torchgeo.datamodules.Sentinel2CDLDataModule(batch_size=64, patch_size=64, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the Sentinel-2 and CDL datasets.
New in version 0.6.
- __init__(batch_size=64, patch_size=64, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new Sentinel2CDLDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to CDL (prefix keys with cdl_) and Sentinel2 (prefix keys with sentinel2_).
- class torchgeo.datamodules.Sentinel2EuroCropsDataModule(batch_size=64, patch_size=256, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the EuroCrops and Sentinel2 datasets.
Uses the train/val/test splits from the dataset.
New in version 0.6.
- __init__(batch_size=64, patch_size=256, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new Sentinel2EuroCropsDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to EuroCrops (prefix keys with eurocrops_) and Sentinel2 (prefix keys with sentinel2_).
- class torchgeo.datamodules.Sentinel2NCCMDataModule(batch_size=64, patch_size=64, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the Sentinel-2 and NCCM datasets.
New in version 0.6.
- __init__(batch_size=64, patch_size=64, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new Sentinel2NCCMDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to NCCM (prefix keys with nccm_) and Sentinel2 (prefix keys with sentinel2_).
- class torchgeo.datamodules.Sentinel2SouthAmericaSoybeanDataModule(batch_size=64, patch_size=64, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule for SouthAmericaSoybean and Sentinel2 datasets.
New in version 0.6.
- __init__(batch_size=64, patch_size=64, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new Sentinel2SouthAmericaSoybeanDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to SouthAmericaSoybean (prefix keys with south_america_soybean_) and Sentinel2 (prefix keys with sentinel2_).
SouthAfricaCropType¶
- class torchgeo.datamodules.SouthAfricaCropTypeDataModule(batch_size=64, patch_size=16, length=None, num_workers=0, **kwargs)[source]¶
Bases:
GeoDataModule
LightningDataModule implementation for the SouthAfricaCropType dataset.
New in version 0.6.
- __init__(batch_size=64, patch_size=16, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new SouthAfricaCropTypeDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to SouthAfricaCropType.
Non-geospatial DataModules¶
BigEarthNet¶
- class torchgeo.datamodules.BigEarthNetDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the BigEarthNet dataset.
Uses the train/val/test splits from the dataset.
CaBuAr¶
- class torchgeo.datamodules.CaBuArDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the CaBuAr dataset.
Uses the train/val/test splits from the dataset.
New in version 0.6.
CaFFe¶
- class torchgeo.datamodules.CaFFeDataModule(batch_size=64, num_workers=0, size=512, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the CaFFe dataset.
Implements the default splits that come with the dataset.
New in version 0.7.
ChaBuD¶
- class torchgeo.datamodules.ChaBuDDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the ChaBuD dataset.
Uses the train/val splits from the dataset.
New in version 0.6.
COWC¶
- class torchgeo.datamodules.COWCCountingDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the COWC Counting dataset.
Deep Globe Land Cover Challenge¶
- class torchgeo.datamodules.DeepGlobeLandCoverDataModule(batch_size=64, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the DeepGlobe Land Cover dataset.
Uses the train/test splits from the dataset.
- __init__(batch_size=64, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Initialize a new DeepGlobeLandCoverDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (tuple[int, int] | int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to DeepGlobeLandCover.
Digital Typhoon¶
- class torchgeo.datamodules.DigitalTyphoonDataModule(split_by='time', batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
Digital Typhoon Data Module.
New in version 0.6.
- __init__(split_by='time', batch_size=64, num_workers=0, **kwargs)[source]¶
Initialize a new DigitalTyphoonDataModule instance.
- Parameters:
split_by (str) – Either ‘time’ or ‘typhoon_id’, which decides how to split the dataset into train, val, and test sets.
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to DigitalTyphoon.
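A sketch of choosing the split strategy; root is an assumed DigitalTyphoon dataset argument. Splitting by typhoon_id keeps every frame of a storm in one split, avoiding leakage between train and test:

    from torchgeo.datamodules import DigitalTyphoonDataModule

    datamodule = DigitalTyphoonDataModule(
        split_by="typhoon_id",  # or "time"
        batch_size=64,
        num_workers=4,
        root="data/digital_typhoon",  # assumed dataset argument
    )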
ETCI2021 Flood Detection¶
- class torchgeo.datamodules.ETCI2021DataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the ETCI2021 dataset.
Splits the existing train split from the dataset into train/val with 80/20 proportions, then uses the existing val dataset as the test data.
New in version 0.2.
- __init__(batch_size=64, num_workers=0, **kwargs)[source]¶
Initialize a new ETCI2021DataModule instance.
- setup(stage)[source]¶
Set up datasets.
- Parameters:
stage (str) – Either ‘fit’, ‘validate’, ‘test’, or ‘predict’.
- on_after_batch_transfer(batch, dataloader_idx)[source]¶
Apply batch augmentations to the batch after it is transferred to the device.
- Parameters:
batch (dict[str, torch.Tensor]) – A batch of data that needs to be altered or augmented.
dataloader_idx (int) – The index of the dataloader to which the batch belongs.
- Returns:
A batch of data.
- Return type:
dict[str, torch.Tensor]
EuroSAT¶
- class torchgeo.datamodules.EuroSATDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the EuroSAT dataset.
Uses the train/val/test splits from the dataset.
New in version 0.2.
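A minimal end-to-end training sketch with Lightning; the ClassificationTask arguments and the root/download dataset kwargs are assumptions that may differ by release:

    import lightning.pytorch as pl

    from torchgeo.datamodules import EuroSATDataModule
    from torchgeo.trainers import ClassificationTask

    datamodule = EuroSATDataModule(
        batch_size=64,
        num_workers=4,
        root="data/eurosat",  # assumed dataset argument
        download=True,        # assumed dataset argument
    )
    # EuroSAT imagery has 13 spectral bands and 10 classes.
    task = ClassificationTask(model="resnet18", in_channels=13, num_classes=10)
    trainer = pl.Trainer(max_epochs=10)
    trainer.fit(model=task, datamodule=datamodule)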
- class torchgeo.datamodules.EuroSATSpatialDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the EuroSATSpatial dataset.
Uses the spatial train/val/test splits from the dataset.
New in version 0.6.
- __init__(batch_size=64, num_workers=0, **kwargs)[source]¶
Initialize a new EuroSATSpatialDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to EuroSATSpatial.
- class torchgeo.datamodules.EuroSAT100DataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the EuroSAT100 dataset.
Intended for tutorials and demonstrations, not for benchmarking.
New in version 0.5.
FAIR1M¶
- class torchgeo.datamodules.FAIR1MDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the FAIR1M dataset.
New in version 0.2.
Fields Of The World¶
- class torchgeo.datamodules.FieldsOfTheWorldDataModule(train_countries=['austria'], val_countries=['austria'], test_countries=['austria'], batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the FTW dataset.
New in version 0.7.
- __init__(train_countries=['austria'], val_countries=['austria'], test_countries=['austria'], batch_size=64, num_workers=0, **kwargs)[source]¶
Initialize a new FieldsOfTheWorldDataModule instance.
- Parameters:
train_countries (list[str]) – List of countries to use for training.
val_countries (list[str]) – List of countries to use for validation.
test_countries (list[str]) – List of countries to use for testing.
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to FieldsOfTheWorld.
- Raises:
AssertionError – If ‘countries’ is specified in kwargs.
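Countries are specified per split rather than through a single countries kwarg (which raises the AssertionError above); a sketch, where root is an assumed FieldsOfTheWorld dataset argument:

    from torchgeo.datamodules import FieldsOfTheWorldDataModule

    datamodule = FieldsOfTheWorldDataModule(
        train_countries=["austria", "france"],
        val_countries=["austria"],
        test_countries=["portugal"],
        batch_size=32,
        num_workers=4,
        root="data/ftw",  # assumed dataset argument
    )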
FireRisk¶
- class torchgeo.datamodules.FireRiskDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the FireRisk dataset.
New in version 0.5.
GeoNRW¶
- class torchgeo.datamodules.GeoNRWDataModule(batch_size=64, num_workers=0, size=256, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the GeoNRW dataset.
Implements 80/20 train/val splits based on city locations. See setup() for more details.
New in version 0.6.
GID-15¶
- class torchgeo.datamodules.GID15DataModule(batch_size=64, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the GID-15 dataset.
Uses the train/test splits from the dataset.
New in version 0.4.
- __init__(batch_size=64, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Initialize a new GID15DataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (tuple[int, int] | int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to GID15.
HySpecNet-11k¶
- class torchgeo.datamodules.HySpecNet11kDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the HySpecNet11k dataset.
New in version 0.7.
Inria Aerial Image Labeling¶
- class torchgeo.datamodules.InriaAerialImageLabelingDataModule(batch_size=64, patch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the InriaAerialImageLabeling dataset.
Uses the train/test splits from the dataset and further splits the train split into train/val splits.
New in version 0.3.
- __init__(batch_size=64, patch_size=64, num_workers=0, **kwargs)[source]¶
Initialize a new InriaAerialImageLabelingDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (tuple[int, int] | int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to InriaAerialImageLabeling.
- setup(stage)[source]¶
Set up datasets.
- Parameters:
stage (str) – Either ‘fit’, ‘validate’, ‘test’, or ‘predict’.
- on_after_batch_transfer(batch, dataloader_idx)[source]¶
Apply batch augmentations to the batch after it is transferred to the device.
- Parameters:
batch (dict[str, torch.Tensor]) – A batch of data that needs to be altered or augmented.
dataloader_idx (int) – The index of the dataloader to which the batch belongs.
- Returns:
A batch of data.
- Return type:
dict[str, torch.Tensor]
New in version 0.7.
LandCover.ai¶
- class torchgeo.datamodules.LandCoverAIDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the LandCover.ai dataset.
Uses the train/val/test splits from the dataset.
- class torchgeo.datamodules.LandCoverAI100DataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the LandCoverAI100 dataset.
Uses the train/val/test splits from the dataset.
New in version 0.7.
- __init__(batch_size=64, num_workers=0, **kwargs)[source]¶
Initialize a new LandCoverAI100DataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to LandCoverAI100.
LEVIR-CD¶
- class torchgeo.datamodules.LEVIRCDDataModule(batch_size=8, patch_size=256, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the LEVIR-CD dataset.
New in version 0.6.
- __init__(batch_size=8, patch_size=256, num_workers=0, **kwargs)[source]¶
Initialize a new LEVIRCDDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (tuple[int, int] | int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to LEVIRCD.
LEVIR-CD+¶
- class torchgeo.datamodules.LEVIRCDPlusDataModule(batch_size=8, patch_size=256, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the LEVIR-CD+ dataset.
Uses the train/test splits from the dataset and further splits the train split into train/val splits.
New in version 0.6.
- __init__(batch_size=8, patch_size=256, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Initialize a new LEVIRCDPlusDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (tuple[int, int] | int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to LEVIRCDPlus.
LoveDA¶
- class torchgeo.datamodules.LoveDADataModule(batch_size=32, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the LoveDA dataset.
Uses the train/val/test splits from the dataset.
New in version 0.2.
NASA Marine Debris¶
- class torchgeo.datamodules.NASAMarineDebrisDataModule(batch_size=64, num_workers=0, val_split_pct=0.2, test_split_pct=0.2, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the NASA Marine Debris dataset.
New in version 0.2.
- __init__(batch_size=64, num_workers=0, val_split_pct=0.2, test_split_pct=0.2, **kwargs)[source]¶
Initialize a new NASAMarineDebrisDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
test_split_pct (float) – Percentage of the dataset to use as a test set.
**kwargs (Any) – Additional keyword arguments passed to NASAMarineDebris.
OSCD¶
- class torchgeo.datamodules.OSCDDataModule(batch_size=32, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the OSCD dataset.
Uses the train/test splits from the dataset and further splits the train split into train/val splits.
New in version 0.2.
- __init__(batch_size=32, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Initialize a new OSCDDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (tuple[int, int] | int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to OSCD.
- setup(stage)[source]¶
Set up datasets.
- Parameters:
stage (str) – Either ‘fit’, ‘validate’, ‘test’, or ‘predict’.
- on_after_batch_transfer(batch, dataloader_idx)[source]¶
Apply batch augmentations to the batch after it is transferred to the device.
- Parameters:
batch (dict[str, torch.Tensor]) – A batch of data that needs to be altered or augmented.
dataloader_idx (int) – The index of the dataloader to which the batch belongs.
- Returns:
A batch of data.
- Return type:
dict[str, torch.Tensor]
New in version 0.7.
Potsdam¶
- class torchgeo.datamodules.Potsdam2DDataModule(batch_size=64, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the Potsdam2D dataset.
Uses the train/test splits from the dataset.
New in version 0.2.
- __init__(batch_size=64, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Initialize a new Potsdam2DDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (tuple[int, int] | int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to Potsdam2D.
QuakeSet¶
- class torchgeo.datamodules.QuakeSetDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the QuakeSet dataset.
New in version 0.6.
ReforesTree¶
- class torchgeo.datamodules.ReforesTreeDataModule(batch_size=64, patch_size=64, num_workers=0, val_split_pct=0.2, test_split_pct=0.2, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the ReforesTree dataset.
New in version 0.7.
- __init__(batch_size=64, patch_size=64, num_workers=0, val_split_pct=0.2, test_split_pct=0.2, **kwargs)[source]¶
Initialize a new ReforesTreeDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (tuple[int, int] | int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
num_workers (int) – Number of workers for parallel data loading.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
test_split_pct (float) – Percentage of the dataset to use as a test set.
**kwargs (Any) – Additional keyword arguments passed to ReforesTree.
RESISC45¶
- class torchgeo.datamodules.RESISC45DataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the RESISC45 dataset.
Uses the train/val/test splits from the dataset.
Seasonal Contrast¶
- class torchgeo.datamodules.SeasonalContrastS2DataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the Seasonal Contrast dataset.
New in version 0.5.
- __init__(batch_size=64, num_workers=0, **kwargs)[source]¶
Initialize a new SeasonalContrastS2DataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to SeasonalContrastS2.
SEN12MS¶
- class torchgeo.datamodules.SEN12MSDataModule(batch_size=64, num_workers=0, band_set='all', **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the SEN12MS dataset.
Implements 80/20 geographic train/val splits and uses the test split from the classification dataset definitions.
Uses the Simplified IGBP scheme defined in the 2020 Data Fusion Competition. See https://arxiv.org/abs/2002.08254.
- DFC2020_CLASS_MAPPING = tensor([ 0, 1, 1, 1, 1, 1, 2, 2, 3, 3, 4, 5, 6, 7, 6, 8, 9, 10])¶
Mapping from the IGBP class definitions to the DFC2020, taken from the dataloader here: https://github.com/lukasliebel/dfc2020_baseline.
- __init__(batch_size=64, num_workers=0, band_set='all', **kwargs)[source]¶
Initialize a new SEN12MSDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
band_set (str) – Subset of S1/S2 bands to use. Options are: “all”, “s1”, “s2-all”, and “s2-reduced” where the “s2-reduced” set includes: B2, B3, B4, B8, B11, and B12.
**kwargs (Any) – Additional keyword arguments passed to SEN12MS.
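A sketch selecting the reduced Sentinel-2 band set; root is an assumed SEN12MS dataset argument:

    from torchgeo.datamodules import SEN12MSDataModule

    datamodule = SEN12MSDataModule(
        batch_size=64,
        num_workers=4,
        band_set="s2-reduced",  # B2, B3, B4, B8, B11, B12
        root="data/sen12ms",    # assumed dataset argument
    )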
- setup(stage)[source]¶
Set up datasets.
- Parameters:
stage (str) – Either ‘fit’, ‘validate’, ‘test’, or ‘predict’.
- on_after_batch_transfer(batch, dataloader_idx)[source]¶
Apply batch augmentations to the batch after it is transferred to the device.
- Parameters:
batch (dict[str, torch.Tensor]) – A batch of data that needs to be altered or augmented.
dataloader_idx (int) – The index of the dataloader to which the batch belongs.
- Returns:
A batch of data.
- Return type:
dict[str, torch.Tensor]
So2Sat¶
- class torchgeo.datamodules.So2SatDataModule(batch_size=64, num_workers=0, band_set='all', val_split_pct=0.2, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the So2Sat dataset.
If using the version 2 dataset, we use the train/val/test splits from the dataset. If using the version 3 datasets, we use a random 80/20 train/val split from the “train” set and use the “test” set as the test set.
- __init__(batch_size=64, num_workers=0, band_set='all', val_split_pct=0.2, **kwargs)[source]¶
Initialize a new So2SatDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
band_set (str) – One of ‘all’, ‘s1’, ‘s2’, or ‘rgb’.
val_split_pct (float) – Percentage of training data to use for validation with the version 3 datasets.
**kwargs (Any) – Additional keyword arguments passed to So2Sat.
New in version 0.5: The val_split_pct parameter, and the ‘rgb’ argument to band_set.
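A sketch for a version 3 dataset, where val_split_pct controls the random 80/20 train/val split; the version and root kwargs are assumptions about the So2Sat dataset signature:

    from torchgeo.datamodules import So2SatDataModule

    datamodule = So2SatDataModule(
        batch_size=64,
        band_set="rgb",
        val_split_pct=0.2,   # only used with the version 3 datasets
        version="3_random",  # assumed So2Sat dataset argument
        root="data/so2sat",  # assumed dataset argument
    )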
SpaceNet¶
- class torchgeo.datamodules.SpaceNetBaseDataModule(spacenet_ds_class, batch_size=64, num_workers=0, val_split_pct=0.1, test_split_pct=0.2, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the SpaceNet datasets.
Randomly splits the train split into train/val/test. The test split does not have labels, and is only used for prediction.
New in version 0.7.
- __init__(spacenet_ds_class, batch_size=64, num_workers=0, val_split_pct=0.1, test_split_pct=0.2, **kwargs)[source]¶
Initialize a new SpaceNetBaseDataModule instance.
- Parameters:
spacenet_ds_class (type[torchgeo.datasets.spacenet.base.SpaceNet]) – The SpaceNet dataset class to use.
batch_size (int) – Size of each mini-batch.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
test_split_pct (float) – Percentage of the dataset to use as a test set.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to the SpaceNet dataset.
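A sketch wiring a concrete SpaceNet dataset class through the base datamodule; root is an assumed dataset argument:

    from torchgeo.datamodules import SpaceNetBaseDataModule
    from torchgeo.datasets import SpaceNet1

    datamodule = SpaceNetBaseDataModule(
        spacenet_ds_class=SpaceNet1,
        batch_size=32,
        val_split_pct=0.1,
        test_split_pct=0.2,
        root="data/spacenet1",  # assumed dataset argument
    )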
- setup(stage)[source]¶
Set up datasets.
- Parameters:
stage (str) – Either ‘fit’, ‘validate’, ‘test’, or ‘predict’.
- on_after_batch_transfer(batch, dataloader_idx)[source]¶
Apply batch augmentations to the batch after it is transferred to the device.
- Parameters:
batch (dict[str, torch.Tensor]) – A batch of data that needs to be altered or augmented.
dataloader_idx (int) – The index of the dataloader to which the batch belongs.
- Returns:
A batch of data.
- Return type:
dict[str, torch.Tensor]
- class torchgeo.datamodules.SpaceNet1DataModule(batch_size=64, num_workers=0, val_split_pct=0.1, test_split_pct=0.2, **kwargs)[source]¶
Bases:
SpaceNetBaseDataModule
LightningDataModule implementation for the SpaceNet1 dataset.
Randomly splits the train split into train/val/test. The test split does not have labels, and is only used for prediction.
New in version 0.4.
- __init__(batch_size=64, num_workers=0, val_split_pct=0.1, test_split_pct=0.2, **kwargs)[source]¶
Initialize a new SpaceNet1DataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
test_split_pct (float) – Percentage of the dataset to use as a test set.
**kwargs (Any) – Additional keyword arguments passed to SpaceNet1.
- class torchgeo.datamodules.SpaceNet6DataModule(batch_size=64, num_workers=0, val_split_pct=0.1, test_split_pct=0.2, **kwargs)[source]¶
Bases:
SpaceNetBaseDataModule
LightningDataModule implementation for the SpaceNet6 dataset.
Randomly splits the train split into train/val/test. The test split does not have labels, and is only used for prediction.
New in version 0.7.
- __init__(batch_size=64, num_workers=0, val_split_pct=0.1, test_split_pct=0.2, **kwargs)[source]¶
Initialize a new SpaceNet6DataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
test_split_pct (float) – Percentage of the dataset to use as a test set.
**kwargs (Any) – Additional keyword arguments passed to SpaceNet6.
SSL4EO¶
- class torchgeo.datamodules.SSL4EOLDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the SSL4EO-L dataset.
New in version 0.5.
- class torchgeo.datamodules.SSL4EOS12DataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the SSL4EO-S12 dataset.
New in version 0.5.
SSL4EO-L Benchmark¶
- class torchgeo.datamodules.SSL4EOLBenchmarkDataModule(batch_size=64, patch_size=224, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the SSL4EO-L Benchmark dataset.
New in version 0.5.
Substation¶
- class torchgeo.datamodules.SubstationDataModule(batch_size=64, num_workers=0, val_split_pct=0.2, test_split_pct=0.2, size=256, **kwargs)[source]¶
Bases:
NonGeoDataModule
Substation Data Module with train-test split and transformations.
New in version 0.7.
- __init__(batch_size=64, num_workers=0, val_split_pct=0.2, test_split_pct=0.2, size=256, **kwargs)[source]¶
Initialize a new SubstationDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for data loading.
val_split_pct (float) – Percentage of data to use for validation.
test_split_pct (float) – Percentage of data to use for testing.
size (int) – Size of the input images.
**kwargs (Any) – Additional keyword arguments passed to Substation.
SustainBench Crop Yield¶
- class torchgeo.datamodules.SustainBenchCropYieldDataModule(batch_size=32, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule for SustainBench Crop Yield dataset.
New in version 0.5.
- __init__(batch_size=32, num_workers=0, **kwargs)[source]¶
Initialize a new SustainBenchCropYieldDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to SustainBenchCropYield.
TreeSatAI¶
- class torchgeo.datamodules.TreeSatAIDataModule(batch_size=64, patch_size=304, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the TreeSatAI dataset.
New in version 0.7.
- __init__(batch_size=64, patch_size=304, num_workers=0, **kwargs)[source]¶
Initialize a new TreeSatAIDataModule instance.
- setup(stage)[source]¶
Set up datasets.
- Parameters:
stage (str) – Either ‘fit’, ‘validate’, ‘test’, or ‘predict’.
- on_after_batch_transfer(batch, dataloader_idx)[source]¶
Apply batch augmentations to the batch after it is transferred to the device.
- Parameters:
batch (dict[str, torch.Tensor]) – A batch of data that needs to be altered or augmented.
dataloader_idx (int) – The index of the dataloader to which the batch belongs.
- Returns:
A batch of data.
- Return type:
dict[str, torch.Tensor]
Tropical Cyclone¶
- class torchgeo.datamodules.TropicalCycloneDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the NASA Cyclone dataset.
Implements 80/20 train/val splits based on hurricane storm ids. See setup() for more details.
Changed in version 0.4: Class name changed from CycloneDataModule to TropicalCycloneDataModule to be consistent with TropicalCyclone dataset.
- __init__(batch_size=64, num_workers=0, **kwargs)[source]¶
Initialize a new TropicalCycloneDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to TropicalCyclone.
UC Merced¶
- class torchgeo.datamodules.UCMercedDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the UC Merced dataset.
Uses random train/val/test splits.
USAVars¶
- class torchgeo.datamodules.USAVarsDataModule(batch_size=64, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the USAVars dataset.
Uses random train/val/test splits.
New in version 0.3.
Vaihingen¶
- class torchgeo.datamodules.Vaihingen2DDataModule(batch_size=64, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the Vaihingen2D dataset.
Uses the train/test splits from the dataset.
New in version 0.2.
- __init__(batch_size=64, patch_size=64, val_split_pct=0.2, num_workers=0, **kwargs)[source]¶
Initialize a new Vaihingen2DDataModule instance.
- Parameters:
batch_size (int) – Size of each mini-batch.
patch_size (tuple[int, int] | int) – Size of each patch, either size or (height, width). Should be a multiple of 32 for most segmentation architectures.
val_split_pct (float) – Percentage of the dataset to use as a validation set.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to Vaihingen2D.
xView2¶
- class torchgeo.datamodules.XView2DataModule(batch_size=64, num_workers=0, val_split_pct=0.2, **kwargs)[source]¶
Bases:
NonGeoDataModule
LightningDataModule implementation for the xView2 dataset.
Uses the train/val/test splits from the dataset.
New in version 0.2.
Base Classes¶
BaseDataModule¶
- class torchgeo.datamodules.BaseDataModule(dataset_class, batch_size=1, num_workers=0, **kwargs)[source]¶
Bases:
LightningDataModule
Base class for all TorchGeo data modules.
New in version 0.5.
- __init__(dataset_class, batch_size=1, num_workers=0, **kwargs)[source]¶
Initialize a new BaseDataModule instance.
- Parameters:
dataset_class (type[torch.utils.data.dataset.Dataset[dict[str, torch.Tensor]]]) – Class used to instantiate a new dataset.
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to dataset_class.
- prepare_data()[source]¶
Download and prepare data.
During distributed training, this method is called only within a single process to avoid corrupted data. This method should not set state since it is not called on every device; use setup instead.
- on_after_batch_transfer(batch, dataloader_idx)[source]¶
Apply batch augmentations to the batch after it is transferred to the device.
- Parameters:
batch (dict[str, torch.Tensor]) – A batch of data that needs to be altered or augmented.
dataloader_idx (int) – The index of the dataloader to which the batch belongs.
- Returns:
A batch of data.
- Return type:
dict[str, torch.Tensor]
- plot(*args, **kwargs)[source]¶
Run the plot method of the validation dataset if one exists.
Should only be called during ‘fit’ or ‘validate’ stages, as val_dataset may not exist during other stages.
- Parameters:
*args (Any) – Arguments passed to plot method.
**kwargs (Any) – Keyword arguments passed to plot method.
- Returns:
A matplotlib Figure with the image, ground truth, and predictions.
- Return type:
matplotlib.figure.Figure | None
GeoDataModule¶
- class torchgeo.datamodules.GeoDataModule(dataset_class, batch_size=1, patch_size=64, length=None, num_workers=0, **kwargs)[source]¶
Bases:
BaseDataModule
Base class for data modules containing geospatial information.
New in version 0.4.
- __init__(dataset_class, batch_size=1, patch_size=64, length=None, num_workers=0, **kwargs)[source]¶
Initialize a new GeoDataModule instance.
- Parameters:
dataset_class (type[torchgeo.datasets.geo.GeoDataset]) – Class used to instantiate a new dataset.
batch_size (int) – Size of each mini-batch.
patch_size (int | tuple[int, int]) – Size of each patch, either size or (height, width).
length (int | None) – Length of each training epoch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to dataset_class.
- setup(stage)[source]¶
Set up datasets and samplers.
Called at the beginning of fit, validate, test, or predict. During distributed training, this method is called from every process across all the nodes. Setting state here is recommended.
- Parameters:
stage (str) – Either ‘fit’, ‘validate’, ‘test’, or ‘predict’.
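A hedged sketch of a subclass overriding setup() to build datasets and geospatial samplers; the attribute names (train_dataset, train_batch_sampler, val_sampler) follow the pattern the dataloader methods expect, but may differ by release:

    from torchgeo.datamodules import GeoDataModule
    from torchgeo.samplers import GridGeoSampler, RandomBatchGeoSampler

    class CustomGeoDataModule(GeoDataModule):
        def setup(self, stage):
            # self.kwargs holds the extra keyword arguments from __init__
            dataset = self.dataset_class(**self.kwargs)
            if stage == "fit":
                self.train_dataset = dataset
                self.train_batch_sampler = RandomBatchGeoSampler(
                    dataset, self.patch_size, self.batch_size, self.length
                )
            if stage in ("fit", "validate"):
                self.val_dataset = dataset
                self.val_sampler = GridGeoSampler(
                    dataset, self.patch_size, self.patch_size
                )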
- train_dataloader()[source]¶
Implement one or more PyTorch DataLoaders for training.
- Returns:
A collection of data loaders specifying training samples.
- Raises:
MisconfigurationException – If setup() does not define a dataset or sampler, or if the dataset or sampler has length 0.
- Return type:
DataLoader[dict[str, torch.Tensor]]
- val_dataloader()[source]¶
Implement one or more PyTorch DataLoaders for validation.
- Returns:
A collection of data loaders specifying validation samples.
- Raises:
MisconfigurationException – If setup() does not define a dataset or sampler, or if the dataset or sampler has length 0.
- Return type:
DataLoader[dict[str, torch.Tensor]]
- test_dataloader()[source]¶
Implement one or more PyTorch DataLoaders for testing.
- Returns:
A collection of data loaders specifying testing samples.
- Raises:
MisconfigurationException – If setup() does not define a dataset or sampler, or if the dataset or sampler has length 0.
- Return type:
DataLoader[dict[str, torch.Tensor]]
- predict_dataloader()[source]¶
Implement one or more PyTorch DataLoaders for prediction.
- Returns:
A collection of data loaders specifying prediction samples.
- Raises:
MisconfigurationException – If setup() does not define a dataset or sampler, or if the dataset or sampler has length 0.
- Return type:
DataLoader[dict[str, torch.Tensor]]
- transfer_batch_to_device(batch, device, dataloader_idx)[source]¶
Transfer batch to device.
Defines how custom data types are moved to the target device.
- Parameters:
batch (dict[str, torch.Tensor]) – A batch of data that needs to be transferred to a new device.
device (device) – The target device as defined in PyTorch.
dataloader_idx (int) – The index of the dataloader to which the batch belongs.
- Returns:
A reference to the data on the new device.
- Return type:
dict[str, torch.Tensor]
NonGeoDataModule¶
- class torchgeo.datamodules.NonGeoDataModule(dataset_class, batch_size=1, num_workers=0, **kwargs)[source]¶
Bases:
BaseDataModule
Base class for data modules lacking geospatial information.
New in version 0.4.
- __init__(dataset_class, batch_size=1, num_workers=0, **kwargs)[source]¶
Initialize a new NonGeoDataModule instance.
- Parameters:
dataset_class (type[torchgeo.datasets.geo.NonGeoDataset]) – Class used to instantiate a new dataset.
batch_size (int) – Size of each mini-batch.
num_workers (int) – Number of workers for parallel data loading.
**kwargs (Any) – Additional keyword arguments passed to dataset_class.
- setup(stage)[source]¶
Set up datasets.
Called at the beginning of fit, validate, test, or predict. During distributed training, this method is called from every process across all the nodes. Setting state here is recommended.
- Parameters:
stage (str) – Either ‘fit’, ‘validate’, ‘test’, or ‘predict’.
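A hedged sketch of the split-based pattern a subclass typically implements; the split kwarg is an assumption about the underlying dataset signature:

    from torchgeo.datamodules import NonGeoDataModule

    class CustomNonGeoDataModule(NonGeoDataModule):
        def setup(self, stage):
            if stage == "fit":
                self.train_dataset = self.dataset_class(split="train", **self.kwargs)
            if stage in ("fit", "validate"):
                self.val_dataset = self.dataset_class(split="val", **self.kwargs)
            if stage == "test":
                self.test_dataset = self.dataset_class(split="test", **self.kwargs)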
- train_dataloader()[source]¶
Implement one or more PyTorch DataLoaders for training.
- Returns:
A collection of data loaders specifying training samples.
- Raises:
MisconfigurationException – If setup() does not define a dataset, or if the dataset has length 0.
- Return type:
DataLoader[dict[str, torch.Tensor]]
- val_dataloader()[source]¶
Implement one or more PyTorch DataLoaders for validation.
- Returns:
A collection of data loaders specifying validation samples.
- Raises:
MisconfigurationException – If setup() does not define a dataset, or if the dataset has length 0.
- Return type:
DataLoader[dict[str, torch.Tensor]]
- test_dataloader()[source]¶
Implement one or more PyTorch DataLoaders for testing.
- Returns:
A collection of data loaders specifying testing samples.
- Raises:
MisconfigurationException – If setup() does not define a dataset, or if the dataset has length 0.
- Return type:
DataLoader[dict[str, torch.Tensor]]
- predict_dataloader()[source]¶
Implement one or more PyTorch DataLoaders for prediction.
- Returns:
A collection of data loaders specifying prediction samples.
- Raises:
MisconfigurationException – If setup() does not define a dataset, or if the dataset has length 0.
- Return type:
DataLoader[dict[str, torch.Tensor]]