[BUG] Excel File with Macros Detected as "Potentially" Malicious. Unable to read Excel as a result. #832

nova-jj · 2024-02-22T19:41:40Z

Is there an existing issue for this?

I have searched the existing issues

Current Behavior

We use this library in an Azure Databricks environment to read Excel files stored in a Storage Account. The error occurs whether the file is accessed through the ABFSS or the DBFS protocol, which suggests this is a file issue and not a protocol issue.

Attempting to read the file with newer versions of the spark-excel library results in the following error, which we attribute to macros in the workbook: crealytics excel workbook java.io.IOException: The file appears to be potentially malicious. "This file embeds more internal file entries than expected."

We have reverted to a previous version that does not produce this error, and we are looking for a way to bypass the macro detection. Our workbook does contain macros, but they are a required part of the workbook.

Expected Behavior

Reading the file into a DataFrame should not raise this error. Alternatively, there should be an option to override the macro detection so that a file flagged as "potentially" malicious can still be force-read.

Steps To Reproduce

The following Python code produces the error:

file_path = "dbfs:/FileStore/our_excel_file.xlsm"
df = spark.read.format("com.crealytics.spark.excel").option("header", "true").load(file_path)
df = df.toPandas()
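
Because the error message refers to "internal file entries", it may help to check how many entries the workbook actually contains. An .xlsm file is a ZIP archive, so the standard library can count them. The following is only a diagnostic sketch: the helper name `summarize_entries` is hypothetical, and the `/dbfs/...` path in the usage comment assumes the DBFS FUSE mount is available on the driver.

```python
import zipfile
from collections import Counter


def summarize_entries(workbook):
    """Return (total_entries, entries_per_top_level_name) for an OOXML file.

    `workbook` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(workbook) as archive:
        names = archive.namelist()
    # Group by the first path component, e.g. "xl", "docProps", "_rels".
    per_folder = Counter(name.split("/", 1)[0] for name in names)
    return len(names), per_folder


# Hypothetical usage on a Databricks driver (path is an assumption):
# total, per_folder = summarize_entries("/dbfs/FileStore/our_excel_file.xlsm")
# print(total, per_folder.most_common())
```

An unusually large total would be consistent with the wording of the error, although it does not by itself show which library enforces the limit.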

Environment

- Spark version: 3.4.1 via Databricks Runtime 13.3
- Spark-Excel version: 3.5.0_0.20.3
- OS: Windows locally, but the code runs remotely on Databricks clusters
- Cluster environment: Multiple cluster configurations (dev/stg/prd), all using the same Databricks Runtime and Spark versions.

Anything else?

We have reverted to the previous version, Maven coordinates com.crealytics:spark-excel_2.12:0.13.7, which does not produce this issue.

nightscape · 2024-02-25T20:40:10Z

spark-excel doesn't do anything in that regard.
It must be an upstream library that performs this check. Can you try to find out if this comes from POI?
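
One way to check where the exception originates is to look at the Java stack frames in the error text. The sketch below filters a stack trace for frames whose package name contains a given fragment. The helper `library_frames` is hypothetical, the default fragment `.poi.` is an assumption (a relocated or shaded package may use a different prefix), and it only works if the Python exception text includes the Java stack trace.

```python
def library_frames(stack_trace, package_fragment=".poi."):
    """Return the stack-trace lines whose frame mentions `package_fragment`."""
    frames = []
    for line in stack_trace.splitlines():
        stripped = line.strip()
        # Java stack frames look like "at some.package.Class.method(File.java:123)".
        if stripped.startswith("at ") and package_fragment in stripped:
            frames.append(stripped)
    return frames


# Hypothetical usage in a notebook; `spark` and `file_path` come from the
# reproduction snippet in the issue:
# try:
#     spark.read.format("com.crealytics.spark.excel").option("header", "true").load(file_path)
# except Exception as exc:
#     print("\n".join(library_frames(str(exc))))
```

If the frames nearest the top of the trace match the fragment, the check most likely lives in that library and not in spark-excel itself.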
