Open in Colab Open on Planetary Computer

Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

Benchmarking

This tutorial measures how long one pass over a dataset takes with three TorchGeo samplers (RandomGeoSampler, GridGeoSampler, and RandomBatchGeoSampler), each with and without caching.

It’s recommended to run this notebook on Google Colab if you don’t have your own GPU. Click the “Open in Colab” button above to get started.

Setup

First, we install TorchGeo.

[ ]:
%pip install torchgeo

Imports

Next, we import TorchGeo and any other libraries we need.

[1]:
import os
import tempfile
import time
from typing import Tuple

from torch.utils.data import DataLoader

from torchgeo.datasets import NAIP, ChesapeakeDE
from torchgeo.datasets.utils import download_url, stack_samples
from torchgeo.samplers import RandomGeoSampler, GridGeoSampler, RandomBatchGeoSampler

Datasets

For this tutorial, we’ll be using imagery from the National Agriculture Imagery Program (NAIP) and labels from the Chesapeake Bay High-Resolution Land Cover Project. First, we manually download a few NAIP tiles.

[ ]:
data_root = tempfile.gettempdir()
naip_root = os.path.join(data_root, "naip")
naip_url = "https://naipblobs.blob.core.windows.net/naip/v002/de/2018/de_060cm_2018/38075/"
tiles = [
    "m_3807511_ne_18_060_20181104.tif",
    "m_3807511_se_18_060_20181104.tif",
    "m_3807512_nw_18_060_20180815.tif",
    "m_3807512_sw_18_060_20180815.tif",
]
for tile in tiles:
    download_url(naip_url + tile, naip_root)

Next, we tell TorchGeo to automatically download the corresponding Chesapeake labels.

[ ]:
chesapeake_root = os.path.join(data_root, "chesapeake")

chesapeake = ChesapeakeDE(chesapeake_root, download=True)

Timing function

[ ]:
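def time_epoch(dataloader: DataLoader) -> Tuple[float, int]:
    """Iterate over the dataloader once; return (seconds elapsed, batches seen)."""
    tic = time.time()
    i = 0
    for _ in dataloader:  # loading each batch is the work being timed
        i += 1
    toc = time.time()
    return toc - tic, i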
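
The same timing pattern can be tried without downloading any data. The sketch below is an illustration only: `time_iteration` and `fake_loader` are hypothetical names, not part of TorchGeo, and the generator merely stands in for a DataLoader. It uses `time.perf_counter`, a monotonic clock, in place of `time.time`.

```python
import time
from typing import Iterable, Tuple


def time_iteration(batches: Iterable) -> Tuple[float, int]:
    """Return (elapsed seconds, number of items) for one pass over `batches`."""
    start = time.perf_counter()  # monotonic, so system clock changes do not affect it
    count = 0
    for _ in batches:
        count += 1
    return time.perf_counter() - start, count


# Hypothetical stand-in for a DataLoader: a generator yielding 74 fake batches of 12 items.
fake_loader = (list(range(12)) for _ in range(74))
elapsed, n_batches = time_iteration(fake_loader)
print(f"{n_batches} batches in {elapsed:.6f} s")
```

Because nothing is done with each batch, the measured time reflects only the cost of producing batches, which is the quantity this tutorial compares.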

RandomGeoSampler

[ ]:
for cache in [False, True]:
    chesapeake = ChesapeakeDE(chesapeake_root, cache=cache)
    naip = NAIP(naip_root, crs=chesapeake.crs, res=chesapeake.res, cache=cache)
    dataset = chesapeake & naip
    sampler = RandomGeoSampler(naip, size=1000, length=888)
    dataloader = DataLoader(dataset, batch_size=12, sampler=sampler, collate_fn=stack_samples)
    duration, count = time_epoch(dataloader)
    print(duration, count)
296.582683801651 74
54.20210099220276 74
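
The two numbers printed per run are the elapsed seconds and the number of batches. The drop in time with `cache=True` is what we would expect if repeated reads of the same tile are served from memory. This tutorial does not show how TorchGeo implements its cache, so the sketch below is an assumption-laden illustration that uses the standard library's `functools.lru_cache`; `load_tile` and the file names are hypothetical.

```python
from functools import lru_cache

reads = {"count": 0}  # how many times the slow path actually runs


def load_tile(path: str) -> bytes:
    """Hypothetical stand-in for an expensive raster read."""
    reads["count"] += 1
    return path.encode()


# Wrap the same function in an in-memory least-recently-used cache.
cached_load_tile = lru_cache(maxsize=128)(load_tile)

# Samplers revisit the same tiles many times; mimic that with repeated paths.
requests = ["a.tif", "b.tif", "a.tif", "a.tif", "b.tif"]

for path in requests:
    load_tile(path)
uncached_reads = reads["count"]  # every request hits the slow path

reads["count"] = 0
for path in requests:
    cached_load_tile(path)
cached_reads = reads["count"]  # only the first request per tile hits the slow path

print(uncached_reads, cached_reads)
```

With only four tiles in this benchmark, most samples fall on a tile that has already been read, which is the situation where a cache of this kind helps most.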

GridGeoSampler

[ ]:
for cache in [False, True]:
    chesapeake = ChesapeakeDE(chesapeake_root, cache=cache)
    naip = NAIP(naip_root, crs=chesapeake.crs, res=chesapeake.res, cache=cache)
    dataset = chesapeake & naip
    sampler = GridGeoSampler(naip, size=1000, stride=500)
    dataloader = DataLoader(dataset, batch_size=12, sampler=sampler, collate_fn=stack_samples)
    duration, count = time_epoch(dataloader)
    print(duration, count)
391.90197944641113 74
118.0611424446106 74
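
To build intuition for how `size` and `stride` interact in a grid sampler, here is a minimal sketch of sliding-window enumeration over a rectangular extent. It is not TorchGeo's implementation: `grid_windows` is a hypothetical helper, the 3000 by 2000 extent is made up, units are left abstract, and windows that would extend past the edge are simply skipped, which a real sampler may handle differently.

```python
from typing import Iterator, Tuple

Window = Tuple[int, int, int, int]  # (minx, miny, maxx, maxy)


def grid_windows(width: int, height: int, size: int, stride: int) -> Iterator[Window]:
    """Yield square windows of side `size`, spaced `stride` apart, that fit in the extent."""
    if size > width or size > height:
        return  # no window fits
    cols = (width - size) // stride + 1
    rows = (height - size) // stride + 1
    for r in range(rows):
        for c in range(cols):
            minx, miny = c * stride, r * stride
            yield (minx, miny, minx + size, miny + size)


# Hypothetical extent; size and stride match the values used in the cell above.
windows = list(grid_windows(width=3000, height=2000, size=1000, stride=500))
print(len(windows))  # 5 columns * 3 rows = 15
```

With a stride of half the window size, neighbouring windows overlap by 50%, so each location is read more than once per epoch.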

RandomBatchGeoSampler

[ ]:
for cache in [False, True]:
    chesapeake = ChesapeakeDE(chesapeake_root, cache=cache)
    naip = NAIP(naip_root, crs=chesapeake.crs, res=chesapeake.res, cache=cache)
    dataset = chesapeake & naip
    sampler = RandomBatchGeoSampler(naip, size=1000, batch_size=12, length=888)
    dataloader = DataLoader(dataset, batch_sampler=sampler, collate_fn=stack_samples)
    duration, count = time_epoch(dataloader)
    print(duration, count)
230.51380324363708 74
53.99923872947693 74
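
The reported numbers can be summarized with a little arithmetic. The sketch below copies the timings printed above, checks that `length=888` with `batch_size=12` gives the 74 batches seen for the random samplers, and computes the speedup from caching. These are single runs on one machine, so treat the ratios as rough indications only.

```python
import math

# Random samplers: 888 samples per epoch in batches of 12.
length, batch_size = 888, 12
batches_per_epoch = math.ceil(length / batch_size)

# (seconds without cache, seconds with cache), copied from the outputs above.
timings = {
    "RandomGeoSampler": (296.582683801651, 54.20210099220276),
    "GridGeoSampler": (391.90197944641113, 118.0611424446106),
    "RandomBatchGeoSampler": (230.51380324363708, 53.99923872947693),
}

speedups = {}
for name, (no_cache, with_cache) in timings.items():
    speedups[name] = no_cache / with_cache
    rate = 74 / with_cache  # every run above reported 74 batches
    print(f"{name}: {speedups[name]:.1f}x faster with cache, {rate:.2f} batches/s cached")
```

In these runs caching shortens the epoch by a factor of roughly 3 to 5.5, depending on the sampler.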