Drop NULL columns from INSERT INTO #68

ryannedolan · 2024-07-14T00:24:31Z

Summary

Drop NULL columns (e.g. NULL AS KEY) from INSERT INTO pipelines.

Promote NULL fields to BYTES in DDL (e.g. KEY BYTES).

Details

Flink does not allow untyped NULL columns in queries (SQL) or NULL-typed fields in table definitions (DDL). This is problematic because we would like to be able to write NULL AS KEY in subscriptions.

In Hoptimator, sink tables are derived from subscription SQL. If we write NULL AS KEY in a subscription, the sink table would have a NULL-typed column, and the Flink job would fail. Instead, NULL AS KEY should result in the column being omitted from the pipeline entirely (defaulting to NULL).

At the same time, Flink DDL cannot have NULL-typed fields, so we promote these to BYTES.

The result is that SELECT ... NULL AS KEY ... results in a sink table with KEY BYTES and pipelines that omit the KEY field (defaulting to NULL).
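To make the two rules concrete, here is a minimal Python sketch of the idea, not the actual Hoptimator implementation. The `Column` tuple, the `sink_ddl` and `insert_into` helpers, and the table and column names are all hypothetical, and the exact SQL Hoptimator emits may differ.

```python
from typing import List, Tuple

# Hypothetical representation of a projected column: (name, SQL type, expression).
Column = Tuple[str, str, str]


def sink_ddl(table: str, columns: List[Column]) -> str:
    """Build sink DDL, promoting NULL-typed columns to BYTES."""
    fields = ", ".join(
        f"`{name}` {'BYTES' if sql_type == 'NULL' else sql_type}"
        for name, sql_type, _ in columns
    )
    return f"CREATE TABLE `{table}` ({fields})"


def insert_into(table: str, source: str, columns: List[Column]) -> str:
    """Build the INSERT INTO statement, dropping NULL-typed columns."""
    kept = [c for c in columns if c[1] != "NULL"]
    names = ", ".join(f"`{name}`" for name, _, _ in kept)
    exprs = ", ".join(expr for _, _, expr in kept)
    return f"INSERT INTO `{table}` ({names}) SELECT {exprs} FROM `{source}`"


# Corresponds to a subscription like: SELECT NULL AS KEY, NAME AS VALUE FROM source
columns = [("KEY", "NULL", "NULL"), ("VALUE", "VARCHAR", "`NAME`")]

print(sink_ddl("sink", columns))
print(insert_into("sink", "source", columns))
```

In this sketch the DDL still declares KEY (as BYTES), while the INSERT INTO statement never mentions it, so the sink receives the default NULL for that column.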

Testing

New unit tests. Tested locally and by manually tweaking production pipelines.

Changed integration tests to use NULL AS KEY.

ryannedolan force-pushed the computed-nulls branch from 11ef30a to 8af5711 Compare July 14, 2024 04:43

ryannedolan changed the title ~~Replace NULL fields with computed values~~ Drop NULL columns from INSERT INTO Jul 14, 2024

ryannedolan force-pushed the computed-nulls branch 3 times, most recently from 31a8cd7 to f24f866 Compare July 14, 2024 16:34

ryannedolan marked this pull request as ready for review July 14, 2024 16:47

Drop NULL columns from INSERT INTO

2ed7e4f

ryannedolan force-pushed the computed-nulls branch from f24f866 to 2ed7e4f Compare July 14, 2024 16:50

ryannedolan requested review from ehoner and hshukla July 14, 2024 23:53

hshukla approved these changes Jul 15, 2024


ryannedolan merged commit 701ce75 into main Jul 15, 2024
1 check passed

ryannedolan deleted the computed-nulls branch July 15, 2024 19:04
