Implicit cast change in DBR 13.3 can cause failures in Silver Spark modules #1311
Labels: bug, data quality, schema change
Milestone
Overwatch Version: 0.8.2.0
In raw Spark event logs, and therefore in the Overwatch table `spark_events_bronze`, the field `ExecutorID` is usually a number, but occasionally it takes the value `'driver'`.

- In Overwatch deployments where this special value is present in the first run for a given target storage location, Spark infers `STRING` type for that column and this particular issue never occurs.
- In Overwatch deployments where no such special value is present in the first run for a given target storage location, Spark infers `BIGINT`/`Long` for that column and creates the `spark_*_silver` target tables with the same type.

In some later Overwatch ETL run, when `'driver'` shows up in that column in `spark_events_bronze`, one of two things can happen when persisting results to the `spark_*_silver` tables, depending on the DBR version:

- DBR 11.3: Spark silently converts `'driver'` to `NULL` while implicitly casting values like `'0'` and `'105'` to `BIGINT`s. This behavior is available in later DBRs by setting the configuration property `spark.sql.storeAssignmentPolicy` to `legacy`, but this is not explicitly set anywhere in the Overwatch code as of release 0.8.2.0.
- DBR 13.3: by default, `spark.sql.storeAssignmentPolicy` is set to `ANSI`, which causes a runtime exception when attempting to implicitly cast `'driver'` to `BIGINT`. See "Safe casts enabled by default for Delta Lake operations" in the DBR 13.3 release notes and "ANSI compliance in Databricks Runtime" in the Databricks SQL language reference for details.

The Silver Spark modules should be future-proofed for DBR > 11.3 by explicitly designating `STRING` type for `ExecutorID` columns, or some equivalent solution.

Two workarounds are available in the meantime:

`spark_*_silver` tables, i.e. `ExecutorID` will be `NULL`, unlike its upstream source column of type `STRING` in `spark_events_bronze`.
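For readers who want the three outcomes side by side, here is a minimal plain-Python sketch of the behavior described above. It is a toy model, not Spark or Overwatch code: `store_assign` and `CastError` are hypothetical names invented for this illustration, and Python's `int()` only approximates how Spark parses numeric strings.

```python
# Toy model of the behavior described in this issue. This is NOT Spark code;
# store_assign and CastError are hypothetical names used only for illustration.


class CastError(ValueError):
    """Stands in for the runtime exception raised under the ANSI policy."""


def store_assign(value, target_type, policy):
    """Mimic writing one ExecutorID value into a column of target_type."""
    if target_type == "string":
        # A STRING target column accepts every value, so the policy is irrelevant.
        return str(value)
    if target_type != "bigint":
        raise ValueError(f"unsupported target type: {target_type}")
    try:
        return int(value)
    except ValueError:
        if policy == "legacy":
            return None  # silent data loss, as reported for DBR 11.3
        if policy == "ansi":
            raise CastError(f"cannot cast {value!r} to BIGINT") from None
        raise ValueError(f"unknown policy: {policy}")


executor_ids = ["0", "105", "driver"]

# BIGINT target, legacy policy: 'driver' is silently replaced by None.
legacy_result = [store_assign(v, "bigint", "legacy") for v in executor_ids]

# STRING target: every value survives unchanged, whatever the policy.
string_result = [store_assign(v, "string", "ansi") for v in executor_ids]

# BIGINT target, ANSI policy: the write fails on 'driver'.
try:
    [store_assign(v, "bigint", "ansi") for v in executor_ids]
    ansi_failed = False
except CastError:
    ansi_failed = True

print(legacy_result)  # [0, 105, None]
print(string_result)  # ['0', '105', 'driver']
print(ansi_failed)    # True
```

The `string` branch corresponds to the proposed fix: once the target column is declared `STRING`, no implicit cast takes place, so the outcome no longer depends on the store assignment policy.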