From ddceac0eb8a7b4612fcbebd07478d6a652744805 Mon Sep 17 00:00:00 2001 From: Michael Johns Date: Fri, 26 Jan 2024 10:19:54 -0500 Subject: [PATCH 1/2] formatting. cleanup gdal driver sections. --- docs/source/api/raster-format-readers.rst | 38 ++++++++++---------- docs/source/api/vector-format-readers.rst | 31 ++++++++++------- docs/source/index.rst | 42 ++++++++++++++++------- docs/source/usage/installation.rst | 28 ++++++++++----- 4 files changed, 88 insertions(+), 51 deletions(-) diff --git a/docs/source/api/raster-format-readers.rst b/docs/source/api/raster-format-readers.rst index 1e7a5d189..323ca5b0f 100644 --- a/docs/source/api/raster-format-readers.rst +++ b/docs/source/api/raster-format-readers.rst @@ -5,24 +5,26 @@ Raster Format Readers Intro ##### -Mosaic provides spark readers for the following raster formats: - - * GTiff (GeoTiff) using .tif file extension - https://gdal.org/drivers/raster/gtiff.html - * COG (Cloud Optimized GeoTiff) using .tif file extension - https://gdal.org/drivers/raster/cog.html - * HDF4 using .hdf file extension - https://gdal.org/drivers/raster/hdf4.html - * HDF5 using .h5 file extension - https://gdal.org/drivers/raster/hdf5.html - * NetCDF using .nc file extension - https://gdal.org/drivers/raster/netcdf.html - * JP2ECW using .jp2 file extension - https://gdal.org/drivers/raster/jp2ecw.html - * JP2KAK using .jp2 file extension - https://gdal.org/drivers/raster/jp2kak.html - * JP2OpenJPEG using .jp2 file extension - https://gdal.org/drivers/raster/jp2openjpeg.html - * PDF using .pdf file extension - https://gdal.org/drivers/raster/pdf.html - * PNG using .png file extension - https://gdal.org/drivers/raster/png.html - * VRT using .vrt file extension - https://gdal.org/drivers/raster/vrt.html - * XPM using .xpm file extension - https://gdal.org/drivers/raster/xpm.html - * GRIB using .grb file extension - https://gdal.org/drivers/raster/grib.html - * Zarr using .zarr file extension - https://gdal.org/drivers/raster/zarr.html - 
-Other formats are supported if supported by GDAL available drivers. +Mosaic provides spark readers for vector files supported by GDAL OGR drivers. +Only the drivers that are built by default are supported. +Here are some common useful file formats: + + * `GTiff `_ (GeoTiff) using .tif file extension + * `COG `_ (Cloud Optimized GeoTiff) using .tif file extension + * `HDF4 `_ using .hdf file extension + * `HDF5 `_ using .h5 file extension + * `NetCDF `_ using .nc file extension + * `JP2ECW `_ using .jp2 file extension + * `JP2KAK `_ using .jp2 file extension + * `JP2OpenJPEG `_ using .jp2 file extension + * `PDF `_ using .pdf file extension + * `PNG `_ using .png file extension + * `VRT `_ using .vrt file extension + * `XPM `_ using .xpm file extension + * `GRIB `_ using .grb file extension + * `Zarr `_ using .zarr file extension + +For more information please refer to gdal `raster driver `_ documentation. Mosaic provides two flavors of the readers: diff --git a/docs/source/api/vector-format-readers.rst b/docs/source/api/vector-format-readers.rst index e540a11f2..43e8a6e08 100644 --- a/docs/source/api/vector-format-readers.rst +++ b/docs/source/api/vector-format-readers.rst @@ -9,19 +9,24 @@ Mosaic provides spark readers for vector files supported by GDAL OGR drivers. Only the drivers that are built by default are supported. Here are some common useful file formats: - * GeoJSON (also ESRIJSON, TopoJSON) https://gdal.org/drivers/vector/geojson.html - * ESRI File Geodatabase (FileGDB) and ESRI File Geodatabase vector (OpenFileGDB). Mosaic implements named reader geo_db (described in this doc). 
https://gdal.org/drivers/vector/filegdb.html - * ESRI Shapefile / DBF (ESRI Shapefile) - Mosaic implements named reader shapefile (described in this doc) https://gdal.org/drivers/vector/shapefile.html - * Network Common Data Form (netCDF) - Mosaic implements raster reader also https://gdal.org/drivers/raster/netcdf.html - * (Geo)Parquet (Parquet) - Mosaic will be implementing a custom reader soon https://gdal.org/drivers/vector/parquet.html - * Spreadsheets (XLSX, XLS, ODS) https://gdal.org/drivers/vector/xls.html - * U.S. Census TIGER/Line (TIGER) https://gdal.org/drivers/vector/tiger.html - * PostgreSQL Dump (PGDump) https://gdal.org/drivers/vector/pgdump.html - * Keyhole Markup Language (KML) https://gdal.org/drivers/vector/kml.html - * Geography Markup Language (GML) https://gdal.org/drivers/vector/gml.html - * GRASS - option for Linear Referencing Systems (LRS) https://gdal.org/drivers/vector/grass.html - -For more information please refer to gdal documentation: https://gdal.org/drivers/vector/index.html + * `GeoJSON `_ (also `ESRIJSON `_, + `TopoJSON `_) + * `FileGDB `_ (ESRI File Geodatabase) and + `OpenFileGDB `_ (ESRI File Geodatabase vector) - + Mosaic implements named reader :ref:`spark.read.format("geo_db")` (described in this doc). + * `ESRI Shapefile `_ (ESRI Shapefile / DBF) - + Mosaic implements named reader :ref:`spark.read.format("shapefile")` (described in this doc). + * `netCDF `_ (Network Common Data Form) - + Mosaic supports GDAL netCDF raster reader also. + * `XLSX `_, `XLS `_, + `ODS `_ spreadsheets + * `TIGER `_ (U.S. Census TIGER/Line) + * `PGDump `_ (PostgreSQL Dump) + * `KML `_ (Keyhole Markup Language) + * `GML `_ (Geography Markup Language) + * `GRASS `_ - option for Linear Referencing Systems (LRS) + +For more information please refer to gdal `vector driver `_ documentation. 
Mosaic provides two flavors of the general readers: diff --git a/docs/source/index.rst b/docs/source/index.rst index 44e8e0665..8e63fcb49 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -65,27 +65,38 @@ Version 0.4.x Series We recommend using Databricks Runtime versions 13.3 LTS with Photon enabled. - Mosaic 0.4.x series only supports DBR 13.x DBRs. If running on a different DBR it will throw an exception: -**DEPRECATION ERROR: Mosaic v0.4.x series only supports Databricks Runtime 13. You can specify `%pip install 'databricks-mosaic<0.4,>=0.3'` for DBR < 13.** +**DEPRECATION ERROR: Mosaic v0.4.x series only supports Databricks Runtime 13. You can specify +`%pip install 'databricks-mosaic<0.4,>=0.3'` for DBR < 13.** -Mosaic 0.4.x series issues the following ERROR on a standard, non-Photon cluster `ADB `_ | `AWS `_ | `GCP `_ : +Mosaic 0.4.x series issues an ERROR on standard, non-Photon clusters `ADB `_ | +`AWS `_ | +`GCP `_ : -**DEPRECATION ERROR: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial AI benefits; Mosaic 0.4.x series restricts executing this cluster.** +**DEPRECATION ERROR: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial +AI benefits; Mosaic 0.4.x series restricts executing this cluster.** As of Mosaic 0.4.0 (subject to change in follow-on releases) * `Assigned Clusters `_ : Mosaic Python, SQL, R, and Scala APIs. - * `Shared Access Clusters `_ : Mosaic Scala API (JVM) with Admin `allowlisting `_ ; Python bindings to Mosaic Scala APIs are blocked by Py4J Security on Shared Access Clusters. - - Mosaic SQL expressions cannot yet be registered with `Unity Catalog `_ - due to API changes affecting DBRs >= 13, more `here `_. + * `Shared Access Clusters `_ : Mosaic Scala API (JVM) with + Admin `allowlisting `_ ; + Python bindings to Mosaic Scala APIs are blocked by Py4J Security on Shared Access Clusters. + +.. 
warning:: + Mosaic SQL expressions cannot yet be registered with `Unity Catalog `_ + due to API changes affecting DBRs >= 13, more `here `_. .. note:: As of Mosaic 0.4.0 (subject to change in follow-on releases) - * `Unity Catalog `_ : Enforces process isolation which is difficult to accomplish with custom JVM libraries; as such only built-in (aka platform provided) JVM APIs can be invoked from other supported languages in Shared Access Clusters. - * `Volumes `_ : Along the same principle of isolation, clusters (both assigned and shared access) can read Volumes via relevant built-in readers and writers or via custom python calls which do not involve any custom JVM code. + * `Unity Catalog `_ : Enforces process isolation which is difficult to + accomplish with custom JVM libraries; as such only built-in (aka platform provided) JVM APIs can be invoked from other + supported languages in Shared Access Clusters. + * `Volumes `_ : Along the same principle of isolation, + clusters (both assigned and shared access) can read Volumes via relevant built-in readers and writers or via custom + python calls which do not involve any custom JVM code. Version 0.3.x Series @@ -97,11 +108,18 @@ For Mosaic versions < 0.4.0 please use the `0.3.x docs `_ | `AWS `_ | `GCP `_ : +As of the 0.3.11 release, Mosaic issues the following WARNING when initialized on a cluster that is neither Photon Runtime +nor Databricks Runtime ML `ADB `_ | +`AWS `_ | +`GCP `_ : -**DEPRECATION WARNING: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial AI benefits; Mosaic will stop working on this cluster after v0.3.x.** +**DEPRECATION WARNING: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial +AI benefits; Mosaic will stop working on this cluster after v0.3.x.** -If you are receiving this warning in v0.3.11+, you will want to begin to plan for a supported runtime. 
The reason we are making this change is that we are streamlining Mosaic internals to be more aligned with future product APIs which are powered by Photon. Along this direction of change, Mosaic has standardized to JTS as its default and supported Vector Geometry Provider. +If you are receiving this warning in v0.3.11+, you will want to begin to plan for a supported runtime. The reason we are +making this change is that we are streamlining Mosaic internals to be more aligned with future product APIs which are +powered by Photon. Along this direction of change, Mosaic has standardized to JTS as its default and supported Vector +Geometry Provider. Documentation diff --git a/docs/source/usage/installation.rst b/docs/source/usage/installation.rst index 94e1ca134..b74e53f4b 100644 --- a/docs/source/usage/installation.rst +++ b/docs/source/usage/installation.rst @@ -10,24 +10,36 @@ Supported platforms Mosaic 0.4.x series only supports DBR 13.x DBRs. If running on a different DBR it will throw an exception: -**DEPRECATION ERROR: Mosaic v0.4.x series only supports Databricks Runtime 13. You can specify `%pip install 'databricks-mosaic<0.4,>=0.3'` for DBR < 13.** +**DEPRECATION ERROR: Mosaic v0.4.x series only supports Databricks Runtime 13. 
You can specify +`%pip install 'databricks-mosaic<0.4,>=0.3'` for DBR < 13.** -Mosaic 0.4.x series issues the following ERROR on a standard, non-Photon cluster `ADB `_ | `AWS `_ | `GCP `_ : +Mosaic 0.4.x series issues an ERROR on standard, non-Photon clusters `ADB `_ | +`AWS `_ | +`GCP `_ : -**DEPRECATION ERROR: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial AI benefits; Mosaic 0.4.x series restricts executing this cluster.** +**DEPRECATION ERROR: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial +AI benefits; Mosaic 0.4.x series restricts executing this cluster.** As of Mosaic 0.4.0 (subject to change in follow-on releases) * `Assigned Clusters `_ : Mosaic Python, SQL, R, and Scala APIs. - * `Shared Access Clusters `_ : Mosaic Scala API (JVM) with Admin `allowlisting `_ ; Python bindings to Mosaic Scala APIs are blocked by Py4J Security on Shared Access Clusters. - - Mosaic SQL expressions cannot yet be registered with `Unity Catalog `_ - due to API changes affecting DBRs >= 13, more `here `_. + * `Shared Access Clusters `_ : Mosaic Scala API (JVM) with + Admin `allowlisting `_ ; + Python bindings to Mosaic Scala APIs are blocked by Py4J Security on Shared Access Clusters. + +.. warning:: + Mosaic SQL expressions cannot yet be registered with `Unity Catalog `_ + due to API changes affecting DBRs >= 13, more `here `_. .. note:: As of Mosaic 0.4.0 (subject to change in follow-on releases) - * `Unity Catalog `_ : Enforces process isolation which is difficult to accomplish with custom JVM libraries; as such only built-in (aka platform provided) JVM APIs can be invoked from other supported languages in Shared Access Clusters. - * `Volumes `_ : Along the same principle of isolation, clusters (both assigned and shared access) can read Volumes via relevant built-in readers and writers or via custom python calls which do not involve any custom JVM code. 
+ * `Unity Catalog `_ : Enforces process isolation which is difficult to + accomplish with custom JVM libraries; as such only built-in (aka platform provided) JVM APIs can be invoked from other + supported languages in Shared Access Clusters. + * `Volumes `_ : Along the same principle of isolation, + clusters (both assigned and shared access) can read Volumes via relevant built-in readers and writers or via custom + python calls which do not involve any custom JVM code. If you have cluster creation permissions in your Databricks workspace, you can create a cluster using the instructions From 0df51541f088f176e89a21c1512ec28836e5dfd1 Mon Sep 17 00:00:00 2001 From: Michael Johns Date: Fri, 26 Jan 2024 11:08:24 -0500 Subject: [PATCH 2/2] formatting. --- docs/source/api/raster-format-readers.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/api/raster-format-readers.rst b/docs/source/api/raster-format-readers.rst index 323ca5b0f..7e77f39d6 100644 --- a/docs/source/api/raster-format-readers.rst +++ b/docs/source/api/raster-format-readers.rst @@ -5,7 +5,7 @@ Raster Format Readers Intro ##### -Mosaic provides spark readers for vector files supported by GDAL OGR drivers. +Mosaic provides spark readers for raster files supported by GDAL OGR drivers. Only the drivers that are built by default are supported. Here are some common useful file formats:
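Review note on the raster format list in this series: several of the listed drivers share a file extension (GTiff and COG both use .tif; the three JPEG 2000 drivers all use .jp2), so an extension alone does not identify a single driver. The sketch below encodes the list as a lookup table to make that visible. It is a minimal illustration, not part of Mosaic: `RASTER_DRIVERS_BY_EXTENSION` and `candidate_drivers` are hypothetical names, and the driver labels are copied as they are written in the docs list rather than checked against GDAL's registered short names.

```python
from pathlib import PurePosixPath
from typing import Dict, List

# Extension -> driver labels, copied from the list in raster-format-readers.rst.
# Assumption: labels are kept as written in the docs, not verified against GDAL.
RASTER_DRIVERS_BY_EXTENSION: Dict[str, List[str]] = {
    ".tif": ["GTiff", "COG"],
    ".hdf": ["HDF4"],
    ".h5": ["HDF5"],
    ".nc": ["NetCDF"],
    ".jp2": ["JP2ECW", "JP2KAK", "JP2OpenJPEG"],
    ".pdf": ["PDF"],
    ".png": ["PNG"],
    ".vrt": ["VRT"],
    ".xpm": ["XPM"],
    ".grb": ["GRIB"],
    ".zarr": ["Zarr"],
}


def candidate_drivers(path: str) -> List[str]:
    """Return the documented driver labels whose extension matches `path`.

    Hypothetical helper: matching is case-insensitive on the final suffix,
    and an unknown extension yields an empty list.
    """
    suffix = PurePosixPath(path).suffix.lower()
    # Return a copy so callers cannot mutate the lookup table.
    return list(RASTER_DRIVERS_BY_EXTENSION.get(suffix, []))


if __name__ == "__main__":
    for example in ("/data/scene.TIF", "tiles/image.jp2", "notes.txt"):
        print(example, "->", candidate_drivers(example))
```

When more than one label comes back, the caller still has to choose the driver explicitly; the table only narrows the candidates.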