GeoTIFF 1.0 and EPSG codes #642

Open
marhop opened this issue Sep 2, 2020 · 0 comments
Labels
bug A product defect that needs fixing feature New functionality to be developed OAG

Comments

Member

marhop commented Sep 2, 2020

Hello,

I would like to propose a revised definition of "valid" for GeoTIFF images. But first ...

Some background

GeoTIFF relies heavily on numerical codes to describe various geodetic "things" like coordinate systems and map projections. A lot of these codes are defined in the EPSG Geodetic Parameter Dataset. For example, the coordinate reference system "WGS 84 / UTM zone 60N" used in a GeoTIFF image may be identified by simply setting the ProjectedCSType GeoTIFF key to the value 32660 which stands for EPSG code 32660 (see also this example from the GeoTIFF 1.0 specification). This code is considered valid by JHOVE 1.24.

The GeoTIFF 1.0 specification, section 6.3.3.1, reserves the whole range of numbers from 20000 to 32760 in ProjectedCSType for EPSG codes. (There are similar definitions for other GeoTIFF keys.) Consequently, one would expect that EPSG code 25832 is also considered valid by JHOVE, right? It is not, and here's the reason:

When GeoTIFF 1.0 was specified in the 1990s, it defined not only the valid range of values for keys like ProjectedCSType but also the meaning of some of these codes, namely those defined in EPSG dataset version 2.1, which was current at that time. So the above-mentioned section of the spec reserves a range for EPSG codes, but it also adds a long list of explicit codes (in this range) taken from EPSG dataset 2.1. This list contains an entry PCS_WGS84_UTM_zone_60N = 32660 but no entry for 25832, because this code was only added to the EPSG dataset in a later version.

This raises the question: should all values from the specified range be valid, or only those on the explicit list? (Or maybe those in the EPSG dataset current at the time of validation?) Reading the specification, I could not find a definitive answer.
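To make the two readings concrete, here is a minimal sketch. It is not JHOVE's actual code: the class and method names are hypothetical, and the "explicit list" is a one-entry excerpt standing in for the much longer list in the specification.

```java
import java.util.Set;

/** Hypothetical sketch, not JHOVE code: two readings of "valid ProjectedCSType". */
class ProjectedCsTypeReadings {
    // Range reserved for EPSG codes by GeoTIFF 1.0, section 6.3.3.1.
    static final int EPSG_PCS_MIN = 20000;
    static final int EPSG_PCS_MAX = 32760;

    // One-entry excerpt of the spec's explicit list (taken from EPSG 2.1).
    // The real list is much longer; 32660 is PCS_WGS84_UTM_zone_60N.
    static final Set<Integer> EXPLICIT_LIST_EXCERPT = Set.of(32660);

    /** Reading A: any value in the reserved range is valid. */
    static boolean inReservedRange(int code) {
        return code >= EPSG_PCS_MIN && code <= EPSG_PCS_MAX;
    }

    /** Reading B: only values on the explicit list are valid. */
    static boolean onExplicitList(int code) {
        return EXPLICIT_LIST_EXCERPT.contains(code);
    }

    public static void main(String[] args) {
        for (int code : new int[] {32660, 25832}) {
            System.out.printf("%d: range=%b, list=%b%n",
                    code, inReservedRange(code), onExplicitList(code));
        }
    }
}
```

Both codes pass the range check, but only 32660 passes the list check, which is exactly the discrepancy described above.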

The problem

JHOVE 1.24 requires a valid code to be on the explicit list, i.e., it only accepts codes that were in EPSG dataset version 2.1, released some 20 years ago. This may (or may not) be true to the letter of the specification, but it is probably not true to its intention. Furthermore, I understand from talking to people working in the field that it does not reflect common practice in geodesy. This is supported by the following quote from the GeoTIFF FAQ:

The GeoTIFF 1.0 Specification references the EPSG 2.1 database. Without a specification revision this remains the case; however, it is common practice among vendors producing and consuming GeoTIFF files to support recent versions of the EPSG database, currently 6.4 or higher.

(NB, the EPSG dataset has meanwhile reached version 9.8 ...)

And to make an even stronger point, consider this quote from a recent advancement of the GeoTIFF standard by the Open Geospatial Consortium, resulting in OGC GeoTIFF 1.1 (Annex B.3.1):

In GeoTIFF, standard CRSs are identified through reference to an EPSG CRS code. [...] Note: This document removes the reference to the specific EPSG codes listed in the 1995 GeoTIFF v1.0 specification and replaces it by allowing reference to any code in the EPSG Dataset, including codes for any objects introduced into the EPSG Dataset after publication of this document.

So although this statement does not apply to GeoTIFF 1.0, it clearly shows that the explicit, hard-wired code lists are considered a mistake in retrospect.

Possible solutions

OK, that was an awful lot of text but I hope I have highlighted all aspects of the problem. Let me propose the following options in descending order of preference:

  1. Change JHOVE to accept all codes in the current EPSG dataset as valid for their respective GeoTIFF keys.

    This is IMHO the best approach to validation because exactly the codes in the EPSG dataset count as valid (no more, no less), but also the most involved because it requires regular updates of JHOVE's GeoTIFF modules with new releases of the EPSG dataset. (The dataset can be downloaded from https://epsg.org; requires registration though.)

  2. Change JHOVE to accept all values from the ranges (not the explicit code lists) defined in the GeoTIFF 1.0 specification as valid for their respective GeoTIFF keys.

    This is rather lax since it considers the whole possible code space as valid without accounting for undefined codes. Besides, it cannot print human-readable code descriptions as is done now (at least not for unknown codes). But still a good middle ground.

  3. Don't change anything.

    This implements the strictest possible interpretation of the GeoTIFF 1.0 specification (too strict if you ask me) but leads to a lot of "well-formed but not valid" results for files that are, at least (!) from a practical point of view, perfectly valid. Very unsatisfying.
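As a rough illustration of option 2, here is a sketch that accepts the whole reserved range but still prints a description where one is known. This is an assumption-laden example, not a proposal for the actual JHOVE classes: all names are hypothetical, the description table has a single entry, and other values the specification defines for this key are ignored.

```java
import java.util.Map;

/** Hypothetical sketch, not JHOVE code: option 2 with descriptions where known. */
class RangeWithDescriptions {
    // Range reserved for EPSG codes in ProjectedCSType (GeoTIFF 1.0, 6.3.3.1).
    // Other values the specification defines for this key are ignored here.
    static final int EPSG_PCS_MIN = 20000;
    static final int EPSG_PCS_MAX = 32760;

    // Stand-in for a table of known codes; a real table would be much larger.
    static final Map<Integer, String> KNOWN = Map.of(32660, "WGS 84 / UTM zone 60N");

    /** Option 2: validity depends on the range only. */
    static boolean isValid(int code) {
        return code >= EPSG_PCS_MIN && code <= EPSG_PCS_MAX;
    }

    /** Prints a description if the code is known, a generic label otherwise. */
    static String describe(int code) {
        if (!isValid(code)) {
            return "invalid ProjectedCSType: " + code;
        }
        String name = KNOWN.get(code);
        return name != null ? name : "EPSG:" + code + " (no description available)";
    }

    public static void main(String[] args) {
        System.out.println(describe(32660));
        System.out.println(describe(25832));
        System.out.println(describe(12345));
    }
}
```

Option 1 would differ in one respect: the table would be generated from a current EPSG dataset release, and validity would require membership in that table rather than membership in the range.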

What do you think? If that helps, I could possibly be tempted to work on an implementation ... But let's first discuss this!

Thanks,
Martin

PS: There are some existing issues and pull requests addressing parts of this problem (#368 + #321 and #624 + #623).

@carlwilson carlwilson added bug A product defect that needs fixing feature New functionality to be developed OAG labels Jun 21, 2022
@carlwilson carlwilson added this to the OPF Hackathon 2023 Tasks milestone Jun 21, 2023
@carlwilson carlwilson removed this from the OPF Hackathon 2023 Tasks milestone Mar 6, 2024