Commit

Unable to read date format columns (int96 type) from avro-parquet schema (databrickslabs#22)

INT96 is deprecated, so we must set the "parquet.avro.readInt96AsFixed" configuration to "true" when building the reader.
jeremihas-caruso authored Mar 25, 2024
1 parent 1855e83 commit 09429a3
Showing 1 changed file with 5 additions and 1 deletion.
@@ -98,10 +98,14 @@ public List<T> readN(Integer num) throws IOException {
      */
     private List<ParquetReader<T>> getReaders() throws IOException {
         List<ParquetReader<T>> readers = new LinkedList<>();
+        Configuration conf = new Configuration();
+
+        conf.set("parquet.avro.readInt96AsFixed", "true");
+
         for (Path path : paths) {
             LocalInputFile localInputFile = new LocalInputFile(path);
             ParquetReader<T> reader =
-                    AvroParquetReader.<T>builder(localInputFile).build();
+                    AvroParquetReader.<T>builder(localInputFile).withConf(conf).build();
             readers.add(reader);
         }
         return readers;
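
With "parquet.avro.readInt96AsFixed" enabled, an INT96 column should reach the caller as a 12-byte fixed value rather than a decoded timestamp, so downstream code still has to interpret the bytes. The sketch below shows one way to do that, assuming the files use the common Impala/Hive layout (8 little-endian bytes of nanoseconds within the day, followed by 4 little-endian bytes holding the Julian day number). The class `Int96Timestamps` and its methods are hypothetical helpers for illustration and are not part of this commit; whether the stored value is normalized to UTC depends on the writer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

// Hypothetical helper, not part of this commit: decodes the 12-byte fixed
// values that an INT96 column yields when it is read as fixed.
public final class Int96Timestamps {
    // Julian day number of 1970-01-01 (the Unix epoch).
    private static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
    private static final long SECONDS_PER_DAY = 86_400L;
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    private Int96Timestamps() {}

    /** Decodes a 12-byte INT96 value into an Instant, treating it as UTC (assumption). */
    public static Instant toInstant(byte[] int96) {
        if (int96 == null || int96.length != 12) {
            throw new IllegalArgumentException("INT96 timestamp must be exactly 12 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong(); // bytes 0..7: nanoseconds since midnight
        long julianDay = buf.getInt();   // bytes 8..11: Julian day number
        long epochSeconds = (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY
                + nanosOfDay / NANOS_PER_SECOND;
        return Instant.ofEpochSecond(epochSeconds, nanosOfDay % NANOS_PER_SECOND);
    }

    /** Encodes an Instant into the same 12-byte layout; mainly useful in tests. */
    public static byte[] fromInstant(Instant instant) {
        long epochDay = Math.floorDiv(instant.getEpochSecond(), SECONDS_PER_DAY);
        long secondOfDay = Math.floorMod(instant.getEpochSecond(), SECONDS_PER_DAY);
        long nanosOfDay = secondOfDay * NANOS_PER_SECOND + instant.getNano();
        return ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt((int) (epochDay + JULIAN_DAY_OF_EPOCH))
                .array();
    }
}
```

If the reader produces Avro generic records, the raw bytes can typically be obtained from the fixed value with `GenericFixed.bytes()` and passed to `toInstant`; the field name to look up depends on the file's schema.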
