Delta log getting too big, resulting in spark job failures while writing. #779

Open · nnani opened this issue Sep 9, 2021 · 6 comments
Labels: need author feedback (Issue is waiting for the author to respond), question (Questions on how to use Delta Lake)

Comments


nnani commented Sep 9, 2021

Hello,

We have been using the Delta library for more than 2 years now on an HDI cluster. Recently we came across a few cases where the Spark job starts failing when trying to append data to an existing partitioned table. It fails with a java.lang.OutOfMemoryError: Java heap space at ...... error.
Delta library used - version 0.5
Table partitioned on 3 columns.

We tried querying this Delta table through Jupyter, and with no filters applied it fails with the same error.

After searching for this issue, it looks like the Delta library is trying to read and store the list of Parquet files that need to be scanned into an array, but it fails to do so.

When we try with a huge (10 GB) driver memory, the Spark job goes through. However, we cannot afford to allocate that much driver memory due to the number of jobs and infrastructure limitations.

Based on this, we have the questions below. It would be great if you could help answer them.

  1. We find almost 1K JSON files under the _delta_log folder and many checkpoint files of around 90 MB each. When do these files get deleted from the folder, and what triggers this deletion?
  2. We have HDFS backed by Azure Blob storage. One of the Delta tables has almost 1.7 million blobs (files) but a latest checkpoint of only 35 MB, while another Delta table has the same number of blobs and a checkpoint of around 90 MB. How is this possible? The table structure is exactly the same.
  3. When Delta writes in any mode (overwrite / append), does it read the complete table first before writing? If yes, is this by design or done for a specific purpose? We see the Spark job goes through when reading, but fails every time when writing.
  4. When a Delta table is read from Spark, does it really need 10 GB to read a 90 MB Parquet file? Is anything else happening behind the scenes?
  5. What is the maximum size a checkpoint file can grow to?
  6. Is there a good way to compact the complete table? Even compaction is failing due to OOM issues; it seems to try to read all the data and then fails.

Note - We have already vacuumed all the data for these tables.

@dennyglee added the question label on Oct 11, 2021
@dennyglee (Contributor) commented

Hi @nnani - there are a number of questions here and it may be worth pinging us in the Delta Users Slack.

  1. If you have a lot of files in _delta_log, you can reduce the number by more aggressively removing them via VACUUM with the delta.logRetentionDuration property (see the sketch after this list). More information in the Delta documentation > Table Utility Commands.
  2. In this case, by any chance are you overwriting the table? If so, the number of actual files for the table would stay the same (hence the same 35 MB checkpoint size).
  3. Delta will overwrite or append based on your specification - i.e., the mode you pass to df.write.format("delta").mode("..."), which is either append or overwrite. More information in the Delta documentation > Write to a table.
  4. When Delta is read from Spark, the 10 GB may have to do with the fact that it needs to read such a large transaction log. Resolving (1) should reduce the size.
  5. The checkpoint file has no set maximum per se - a checkpoint is written every 10th commit to consolidate the transaction log so that Spark can read it faster.
  6. If possible, please vacuum both the log and the data.
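
A minimal sketch of (1), assuming Spark 3.x with Delta Lake 0.7+ (where table properties can be set on path-based tables) and an illustrative path /mnt/data/events that is not from this issue:

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = (
    SparkSession.builder.appName("delta-log-retention-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Keep less transaction-log history; commit JSON files older than the retention
# window are cleaned up automatically when the next checkpoint is written.
spark.sql("""
    ALTER TABLE delta.`/mnt/data/events`
    SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval 7 days')
""")

# Remove data files that are no longer referenced by the table
# (168 hours = the default 7-day retention).
DeltaTable.forPath(spark, "/mnt/data/events").vacuum(168)
```

Note that setting delta.logRetentionDuration as a table property requires Delta 0.7+, as discussed further down in this thread; it is not available on 0.5.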

@dennyglee added the need author feedback label on Oct 11, 2021

kikalyan commented Nov 2, 2021

@dennyglee Is the delta.logRetentionDuration property supported in version 0.6.1?
If so, can you share an example? I tried the ALTER TABLE ... SET TBLPROPERTIES command, passing the Delta table path, and it throws a "table not found" error.

@dennyglee (Contributor) commented

We started supporting delta.logRetentionDuration in Delta 0.7 per https://docs.delta.io/0.7.0/delta-batch.html#data-retention. HTH!
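
For anyone on 0.7+ who hits the same "table not found" error when passing a raw path, a minimal sketch of the path-based identifier syntax from that doc page (the path and interval are illustrative):

```python
# `spark` is a SparkSession configured with the Delta SQL extension and catalog,
# as in the earlier sketch. Wrapping the path in delta.`...` makes Spark resolve
# it as a path-based Delta table instead of looking the name up in the metastore.
spark.sql("""
    ALTER TABLE delta.`/mnt/data/events`
    SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval 30 days')
""")
```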

@chengshaoli commented

The checkpoint.parquet file of one of my Delta tables has reached 118 MB, which causes my Spark program to process each batch slowly. The job that merges the transaction log takes about 1 minute each time.

  1. Is this a normal phenomenon? Is it expected for checkpoint.parquet to get this big?
  2. In addition, I tried to increase the parallelism for processing this file, but the Spark parameter setting did not take effect (see the sketch below).

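Not from this thread, but possibly relevant to (2): in OSS Delta, the parallelism used when the table state is reconstructed from the checkpoint and log is controlled by the spark.databricks.delta.snapshotPartitions setting. Whether it helps here depends on the Delta version and on where the time is actually spent, so treat the sketch below (path and value are illustrative) as an assumption to verify:

```python
# Hypothetical illustration: raise the number of partitions Delta uses when it
# rebuilds the table snapshot from _delta_log. Set it before the table is first
# loaded in the session; an already-computed snapshot will not be recomputed.
spark.conf.set("spark.databricks.delta.snapshotPartitions", "200")

df = spark.read.format("delta").load("/mnt/data/events")  # illustrative path
```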

@Bennyelg commented

bump

@machielg commented

I have a Delta Lake table with hundreds of checkpoint files created per minute. The _delta_log folder has grown to over 8 terabytes and 3 million files, while the table itself is about 1 terabyte. The table is now beyond vacuuming, because the driver crashes, probably due to the vast number of checkpoint files.
