✨ Support reading uint/int/float dtypes #18

Merged 4 commits on Sep 12, 2024
5 changes: 3 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -13,6 +13,7 @@ crate-type = ["cdylib", "rlib"]
bytes = "1.5.0"
geo = { git = "https://github.com/georust/geo.git", version = "0.28.0", rev = "481196b4e50a488442b3919e02496ad909fc5412" }
ndarray = "0.15.6"
+ num-traits = "0.2.19"
numpy = "0.21.0"
object_store = { version = "0.9.0", features = ["http"] }
pyo3 = { version = "0.21.1", features = ["abi3-py310", "extension-module"] }
17 changes: 9 additions & 8 deletions README.md
@@ -54,7 +54,7 @@ async fn main() {
};

// Read GeoTIFF into an ndarray::Array
- let arr: Array3<f32> = read_geotiff(stream).unwrap();
+ let arr: Array3<f32> = read_geotiff::<f32, _>(stream).unwrap();
assert_eq!(arr.dim(), (1, 549, 549));
assert_eq!(arr[[0, 500, 500]], 0.13482364);
}
@@ -91,9 +91,9 @@ assert dataarray.dtype == "float32"
```

> [!NOTE]
- > Currently, this crate/library only supports reading single or multi-band float32
- > GeoTIFF files, i.e. other dtypes (e.g. uint16) don't work yet. See roadmap below on
- > future plans.
+ > Currently, the Python library supports reading single or multi-band GeoTIFF files into
+ > a float32 array only, i.e. other dtypes (e.g. uint16) don't work yet. There is support
+ > for reading into different dtypes in the Rust crate via a turbofish operator though!


## Roadmap
@@ -104,19 +104,20 @@ Short term (Q1 2024):
- [x] Read from HTTP remote storage (using
[`object-store`](https://github.com/apache/arrow-rs/tree/object_store_0.9.0/object_store))

- Medium term (Q2 2024):
+ Medium term (Q2-Q4 2024):
- [x] Integration with `xarray` as a
[`BackendEntrypoint`](https://docs.xarray.dev/en/v2024.02.0/internals/how-to-add-new-backend.html)
- - [ ] Implement single-band GeoTIFF reader for multiple dtypes (uint/int/float) (relying
- on [`geotiff`](https://github.com/georust/geotiff) crate)
+ - [x] Implement single-band GeoTIFF reader for multiple dtypes (uint/int/float) (based
+ on [`geotiff`](https://github.com/georust/geotiff) crate, Rust-only)

- Longer term (Q3-Q4 2024):
+ Longer term (2025):
- [ ] Parallel reader (TBD on multi-threaded or asynchronous)
- [ ] Direct-to-GPU loading


## Related crates

- https://github.com/developmentseed/aiocogeo-rs
- https://github.com/georust/geotiff
- https://github.com/jblindsay/whitebox-tools
- https://github.com/pka/georaster
75 changes: 61 additions & 14 deletions src/io/geotiff.rs
@@ -2,6 +2,7 @@ use std::io::{Read, Seek};

use geo::AffineTransform;
use ndarray::{Array, Array1, Array3};
+ use num_traits::FromPrimitive;
use tiff::decoder::{Decoder, DecodingResult, Limits};
use tiff::tags::Tag;
use tiff::{ColorType, TiffError, TiffFormatError, TiffResult, TiffUnsupportedError};
@@ -23,7 +24,7 @@ impl<R: Read + Seek> CogReader<R> {
}

/// Decode GeoTIFF image to an [`ndarray::Array`]
- pub fn ndarray(&mut self) -> TiffResult<Array3<f32>> {
+ pub fn ndarray<T: FromPrimitive + 'static>(&mut self) -> TiffResult<Array3<T>> {
// Count number of bands
let color_type = self.decoder.colortype()?;
let num_bands: usize = match color_type {
@@ -44,19 +45,45 @@ impl<R: Read + Seek> CogReader<R> {

// Get image pixel data
let decode_result = self.decoder.read_image()?;
- let image_data: Vec<f32> = match decode_result {
- DecodingResult::F32(img_data) => img_data,
- _ => {
- return Err(TiffError::UnsupportedError(
- TiffUnsupportedError::UnsupportedDataType,
- ))
+ let image_data: Vec<T> = match decode_result {
+ DecodingResult::U8(img_data) => {
+ img_data.iter().map(|v| T::from_u8(*v).unwrap()).collect()
}
+ DecodingResult::U16(img_data) => {
+ img_data.iter().map(|v| T::from_u16(*v).unwrap()).collect()
+ }
+ DecodingResult::U32(img_data) => {
+ img_data.iter().map(|v| T::from_u32(*v).unwrap()).collect()
+ }
+ DecodingResult::U64(img_data) => {
+ img_data.iter().map(|v| T::from_u64(*v).unwrap()).collect()
+ }
+ DecodingResult::I8(img_data) => {
+ img_data.iter().map(|v| T::from_i8(*v).unwrap()).collect()
+ }
+ DecodingResult::I16(img_data) => {
+ img_data.iter().map(|v| T::from_i16(*v).unwrap()).collect()
+ }
+ DecodingResult::I32(img_data) => {
+ img_data.iter().map(|v| T::from_i32(*v).unwrap()).collect()
+ }
+ DecodingResult::I64(img_data) => {
+ img_data.iter().map(|v| T::from_i64(*v).unwrap()).collect()
+ }
+ DecodingResult::F32(img_data) => {
+ img_data.iter().map(|v| T::from_f32(*v).unwrap()).collect()
+ }
+ DecodingResult::F64(img_data) => {
+ img_data.iter().map(|v| T::from_f64(*v).unwrap()).collect()
+ }
};

// Put image pixel data into an ndarray
- let array_data =
- Array3::from_shape_vec((num_bands, height as usize, width as usize), image_data)
- .map_err(|_| TiffFormatError::InconsistentSizesEncountered)?;
+ let array_data: Array3<T> = Array3::from_shape_vec(
+ (num_bands, height as usize, width as usize),
+ image_data.into(),
+ )
+ .map_err(|_| TiffFormatError::InconsistentSizesEncountered)?;

Ok(array_data)
}
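
Review note on the conversion arms above: each arm calls `.unwrap()` on the `T::from_*` result, so a pixel value that does not fit the requested dtype (for example a `u16` value of 300 read into `u8`) would panic rather than surface as an error. Below is a minimal std-only sketch of a non-panicking alternative. `convert_pixels` is a hypothetical helper, not part of this PR, and it relies on `TryFrom` rather than `num_traits::FromPrimitive`, so it does not cover float-to-integer conversions.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of this PR): convert decoded pixels to the
/// requested dtype, returning an error instead of panicking on overflow.
fn convert_pixels<S, T>(pixels: &[S]) -> Result<Vec<T>, <T as TryFrom<S>>::Error>
where
    S: Copy,
    T: TryFrom<S>,
{
    // Collecting an iterator of `Result`s stops at the first error.
    pixels.iter().map(|&v| T::try_from(v)).collect()
}

fn main() {
    // Widening u8 -> u16 always succeeds.
    let widened: Vec<u16> = convert_pixels(&[0u8, 255]).unwrap();
    assert_eq!(widened, vec![0u16, 255]);

    // Narrowing u16 -> u8 fails cleanly when a value is out of range.
    let narrowed: Result<Vec<u8>, _> = convert_pixels(&[1u16, 300]);
    assert!(narrowed.is_err());
}
```

With `FromPrimitive`, the same idea would presumably map each `None` to a `TiffError` and collect into a `TiffResult`; that variant is not shown here because it depends on the crate's error handling choices.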
@@ -138,12 +165,14 @@ impl<R: Read + Seek> CogReader<R> {
}

/// Synchronously read a GeoTIFF file into an [`ndarray::Array`]
- pub fn read_geotiff<R: Read + Seek>(stream: R) -> TiffResult<Array3<f32>> {
+ pub fn read_geotiff<T: FromPrimitive + 'static, R: Read + Seek>(
+ stream: R,
+ ) -> TiffResult<Array3<T>> {
// Open TIFF stream with decoder
let mut reader = CogReader::new(stream)?;

// Decode TIFF into ndarray
- let array_data: Array3<f32> = reader.ndarray()?;
+ let array_data: Array3<T> = reader.ndarray()?;

Ok(array_data)
}
@@ -205,7 +234,25 @@ mod tests {
let array = reader.ndarray().unwrap();

assert_eq!(array.dim(), (2, 512, 512));
- assert_eq!(array.mean(), Some(225.17654));
+ assert_eq!(array.mean(), Some(225.17439122416545));
}
+
+ #[tokio::test]
+ async fn test_read_geotiff_uint16_dtype() {
+ let cog_url: &str =
+ "https://github.com/OSGeo/gdal/raw/v3.9.2/autotest/gcore/data/uint16.tif";
+ let tif_url = Url::parse(cog_url).unwrap();
+ let (store, location) = parse_url(&tif_url).unwrap();
+
+ let result = store.get(&location).await.unwrap();
+ let bytes = result.bytes().await.unwrap();
+ let stream = Cursor::new(bytes);
+
+ let mut reader = CogReader::new(stream).unwrap();
+ let array = reader.ndarray::<u16>().unwrap();
+
+ assert_eq!(array.dim(), (1, 20, 20));
+ assert_eq!(array.mean(), Some(126));
+ }

#[tokio::test]
@@ -219,7 +266,7 @@ mod tests {
let stream = Cursor::new(bytes);

let mut reader = CogReader::new(stream).unwrap();
- let array = reader.ndarray().unwrap();
+ let array = reader.ndarray::<f32>().unwrap();

assert_eq!(array.shape(), [1, 2, 3]);
assert_eq!(array, array![[[1.41, 1.23, 0.78], [0.32, -0.23, -1.88]]])
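
Review note on `test_read_geotiff_uint16_dtype`: for a `u16` array, `array.mean()` returns `Option<u16>`, so the expected `Some(126)` is an integer-typed mean. Assuming `ndarray` computes it as the sum divided by the element count in the element type, the fractional part is truncated, and the sum itself could overflow `u16` on larger images. The std-only sketch below illustrates the difference on made-up pixel values; `int_mean` and `float_mean` are hypothetical helpers, not `ndarray` APIs.

```rust
/// Hypothetical helper: mean computed entirely in u16 (integer division).
fn int_mean(pixels: &[u16]) -> u16 {
    pixels.iter().sum::<u16>() / pixels.len() as u16
}

/// Hypothetical helper: mean computed after widening each pixel to f64.
fn float_mean(pixels: &[u16]) -> f64 {
    pixels.iter().map(|&v| f64::from(v)).sum::<f64>() / pixels.len() as f64
}

fn main() {
    // Made-up values, not taken from uint16.tif.
    let pixels = [125u16, 126, 128];
    // 379 / 3 truncates to 126 under integer division.
    assert_eq!(int_mean(&pixels), 126);
    // The widened mean keeps the fractional part.
    assert!((float_mean(&pixels) - 379.0 / 3.0).abs() < 1e-12);
}
```

If a fractional mean matters for the assertion, reading the same file with `reader.ndarray::<f64>()` and comparing against a float would be one way to avoid the truncation.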
7 changes: 6 additions & 1 deletion src/lib.rs
@@ -38,11 +38,16 @@
//! Cursor::new(bytes)
//! };
//!
- //! let arr: Array3<f32> = read_geotiff(stream).unwrap();
+ //! let arr: Array3<f32> = read_geotiff::<f32, _>(stream).unwrap();
//! assert_eq!(arr.dim(), (1, 549, 549));
//! assert_eq!(arr[[0, 500, 500]], 0.13482364);
//! }
//! ```
+ //!
+ //! Note that the output dtype can be specified either with a type annotation
+ //! (`let arr: Array3<f32>`) or via the turbofish operator (`read_geotiff::<f32, _>`).
+ //! Currently supported dtypes include uint (u8/u16/u32/u64), int (i8/i16/i32/i64) and
+ //! float (f32/f64).

/// Modules for handling Input/Output of GeoTIFF data
pub mod io;
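
Review note on the doc comment about choosing the output dtype: `read_geotiff` has two generic parameters (the element type and the reader type), so a turbofish call needs a placeholder for the second one, as in `read_geotiff::<f32, _>`. The sketch below shows both spellings on a hypothetical std-only stand-in, `read_bytes_as`, which is not part of the crate and only mimics the shape of the signature.

```rust
use std::io::{Cursor, Read};

/// Hypothetical stand-in for `read_geotiff`: two generic parameters, the
/// output element type `T` and the reader type `R`.
fn read_bytes_as<T: From<u8>, R: Read>(mut stream: R) -> std::io::Result<Vec<T>> {
    let mut buf = Vec::new();
    stream.read_to_end(&mut buf)?;
    // Widen every byte into the requested element type.
    Ok(buf.into_iter().map(T::from).collect())
}

fn main() -> std::io::Result<()> {
    // Option 1: type annotation on the binding; both T and R are inferred.
    let a: Vec<f32> = read_bytes_as(Cursor::new(vec![1u8, 2, 3]))?;

    // Option 2: turbofish; `_` lets the compiler infer the reader type R.
    let b = read_bytes_as::<f32, _>(Cursor::new(vec![1u8, 2, 3]))?;

    assert_eq!(a, b);
    Ok(())
}
```

Supplying only one of the two generic arguments in a turbofish is rejected by the compiler, which is why the `_` placeholder appears in the updated README and doc examples.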