KML parsing fails as of DuckDB v1.1.2 #431

Open
marcoslot opened this issue Oct 16, 2024 · 2 comments

@marcoslot

I'm not sure whether this is a GDAL issue or a duckdb_spatial issue, but KML parsing sometimes fails in DuckDB v1.1.2. The same statements worked in v1.1.1.

v1.1.2:

copy (SELECT
            'hello-'||generate_series AS name,
            'world-'||generate_series AS desc,
            format('POINT(52.{} 4.{})', generate_series, generate_series)::geometry AS geom
          FROM generate_series(1,100)) to 'test.kml' with (format 'GDAL', driver 'KML');

select count(*) from st_read('test.kml');
IO Error: GDAL Error (1): XML parsing of KML file failed : unclosed token at line 278, column 2

select count(*) from st_read('test.kml');
IO Error: GDAL Error (1): GDALOpen() called on test.kml recursively

v1.1.1:

select count(*) from st_read('test.kml');
┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│          100 │
└──────────────┘
@Maxxen
Member

Maxxen commented Oct 16, 2024

Hi! Thanks for opening this issue!
I'm unable to reproduce the error using the code you provided, both when building DuckDB from source and when using the one provided by brew. What platform are you on?

maxxen@Maxs-MacBook-Pro-2 duckdb_spatial % duckdb
v1.1.2 f680b7d08f
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D load spatial;
D copy (SELECT
              'hello-'||generate_series AS name,
              'world-'||generate_series AS desc,
              format('POINT(52.{} 4.{})', generate_series, generate_series)::geometry AS geom
            FROM generate_series(1,100)) to 'test.kml' with (format 'GDAL', driver 'KML');
D
D select count(*) from st_read('test.kml');
┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│          100 │
└──────────────┘

@marcoslot
Author

I'm on Ubuntu 22.04 (x86_64), using the DuckDB CLI v1.1.2.

$ rm ~/.duckdb/extensions/v1.1.2/linux_amd64/*
$ duckdb
v1.1.2 f680b7d08f
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D install spatial; load spatial;
D copy (SELECT
              'hello-'||generate_series AS name,
              'world-'||generate_series AS desc,
              format('POINT(52.{} 4.{})', generate_series, generate_series)::geometry AS geom
            FROM generate_series(1,100)) to 'test.kml' with (format 'GDAL', driver 'KML');
D select count(*) from st_read('test.kml');
IO Error: GDAL Error (1): XML parsing of KML file failed : no element found at line 279, column 0
D 

The behaviour appears to be size related:

D copy (SELECT
              'hello-'||generate_series AS name,
              'world-'||generate_series AS desc,
              format('POINT(52.{} 4.{})', generate_series, generate_series)::geometry AS geom
            FROM generate_series(1,53)) to 'test.kml' with (format 'GDAL', driver 'KML');
D select count(*) from st_read('test.kml');
┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│           53 │
└──────────────┘
D copy (SELECT
              'hello-'||generate_series AS name,
              'world-'||generate_series AS desc,
              format('POINT(52.{} 4.{})', generate_series, generate_series)::geometry AS geom
            FROM generate_series(1,54)) to 'test.kml' with (format 'GDAL', driver 'KML');
D select count(*) from st_read('test.kml');
IO Error: GDAL Error (1): XML parsing of KML file failed : unclosed token at line 279, column 0

With 53 rows the file is 8058 bytes, and with 54 rows it is 8206 bytes, so the boundary is presumably 8192 bytes.
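To narrow down whether the writer or the reader is at fault, one option is to check the file on disk independently of GDAL. The following is only a rough sketch in Python using the standard library: `inspect_kml` and `BUFFER_GUESS` are hypothetical names, and the 8192-byte threshold is the guess from the sizes above, not a confirmed buffer size.

```python
import os
import sys
import xml.etree.ElementTree as ET

# Assumed boundary, inferred from the 8058-byte vs 8206-byte files above.
# This is a guess, not a documented GDAL or DuckDB constant.
BUFFER_GUESS = 8192


def inspect_kml(path):
    """Report the file size and whether the bytes on disk parse as complete XML."""
    size = os.path.getsize(path)
    try:
        # Parses the whole file; raises ParseError on truncated or malformed XML.
        ET.parse(path)
        well_formed = True
    except ET.ParseError:
        well_formed = False
    return {
        "size": size,
        "over_guess": size > BUFFER_GUESS,
        "well_formed": well_formed,
    }


if __name__ == "__main__":
    # Usage: python inspect_kml.py test.kml
    for path in sys.argv[1:]:
        print(path, inspect_kml(path))
```

If a file larger than the guessed boundary is reported as well formed here while `st_read` still fails on it, that would suggest the file was written correctly and the problem lies on the read path. If it is not well formed, the write path would be the more likely suspect.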

I tried both building DuckDB from source and downloading the CLI from https://github.com/duckdb/duckdb/releases/download/v1.1.2/duckdb_cli-linux-amd64.zip, and both give the same error.
