Hi,

My log bucket is fairly large, but we have moved anything older than three months to Glacier. When I run the job, it completes in a minute or two and I get the following:
19/09/25 13:26:02 WARN HadoopDataSource: Skipping Partition {} as no new files detected @ s3://<BUCKET>/ or path does not exist
where <BUCKET> is the name of my S3 access log storage bucket.
My logs are saved at the top level of the S3 bucket, i.e. all log files are at s3://<BUCKET>/
What could be happening here? I know there are logs in the bucket that are not partitioned, and the converted DB/tables are empty when I preview them. I have set the classification of the raw data table to CSV, but I am not sure whether that is correct.
Any pointers would be appreciated!
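In case it helps with debugging, one quick check is to list the prefix and see which storage classes the objects are in, since anything already transitioned to Glacier cannot be read by the job until it is restored. Below is a minimal sketch with boto3; the bucket name and prefix are placeholders, and default credentials are assumed:

```python
import boto3
from collections import Counter

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

# Tally objects under the log prefix by storage class; GLACIER / DEEP_ARCHIVE
# objects must be restored before a Glue/Spark job can read them.
storage_classes = Counter()
for page in paginator.paginate(Bucket="my-access-log-bucket", Prefix=""):
    for obj in page.get("Contents", []):
        storage_classes[obj.get("StorageClass", "STANDARD")] += 1

print(dict(storage_classes))  # e.g. {'STANDARD': ..., 'GLACIER': ...}
```

If most objects show up as GLACIER, the "no new files detected" warning would be consistent with the job simply having nothing readable to process.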
We get a similar issue: when a file is not in S3, an empty DataFrame is still created. Shouldn't this raise an exception?
22/06/30 08:52:18 WARN HadoopDataSource: Skipping Partition {} as no new files detected @ s3://sample-bucket/test/dict_most_common_names_old.csv or path does not exist
Empty DataFrame
Columns: []
Index: []
<class 'pandas.core.frame.DataFrame'>
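For reference, one way to fail fast instead of silently getting an empty frame is to check for the object before reading it. This is a minimal sketch assuming boto3 with default credentials; the helper name and the example path are just illustrative:

```python
import boto3
from botocore.exceptions import ClientError

def require_s3_object(bucket: str, key: str) -> None:
    """Raise instead of silently reading nothing when the object is missing."""
    try:
        boto3.client("s3").head_object(Bucket=bucket, Key=key)
    except ClientError as err:
        # head_object reports 404 when the key does not exist
        # (403 if we lack permission, which also means the read will fail).
        if err.response["Error"]["Code"] in ("404", "403"):
            raise FileNotFoundError(
                f"s3://{bucket}/{key} not found or not readable"
            ) from err
        raise

# Call before creating the frame, e.g.:
# require_s3_object("sample-bucket", "test/dict_most_common_names_old.csv")
```

This keeps the missing-file case as an explicit error in your own code, rather than relying on the downstream job to notice that the DataFrame is empty.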