This repository has been archived by the owner on May 1, 2024. It is now read-only.

exclude paths which do not exist
rao-abdul-mannan committed May 31, 2018
1 parent 01f6b2b commit fe100b0
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion edx/analytics/tasks/common/spark.py
@@ -184,7 +184,7 @@ def get_event_log_dataframe(self, spark, *args, **kwargs):
             pattern=self.pattern,
             date_pattern=self.date_pattern,
         ).output()
-        self.path_targets = [task.path for task in path_targets]
+        self.path_targets = [task.path for task in path_targets if task.exists()]
         dataframe = spark.read.format('json').load(self.path_targets, schema=self.get_log_schema())
         dataframe = dataframe.filter(dataframe['time'].isNotNull()) \
             .withColumn('event_date', date_format(to_date(dataframe['time']), 'yyyy-MM-dd'))
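
The change in this hunk keeps only the targets whose `exists()` check passes before their paths are handed to the reader. The following is a minimal, standalone sketch of that filtering step, not the repository's code: `StubTarget`, `existing_paths`, and `demo` are hypothetical names, and the stub assumes a target exposes a `path` attribute and an `exists()` method backed by the local filesystem.

```python
import os
import tempfile


class StubTarget(object):
    """Hypothetical stand-in for a target exposing `path` and `exists()`."""

    def __init__(self, path):
        self.path = path

    def exists(self):
        # Assumption: existence means the path is present on the local filesystem.
        return os.path.exists(self.path)


def existing_paths(targets):
    """Return the paths of targets that exist, preserving input order."""
    return [target.path for target in targets if target.exists()]


def demo():
    """Create one real file and one missing path, then filter them."""
    handle, real_path = tempfile.mkstemp(suffix='.log')
    os.close(handle)
    missing_path = real_path + '.missing'
    try:
        kept = existing_paths([StubTarget(real_path), StubTarget(missing_path)])
        return real_path, kept
    finally:
        # Clean up the temporary file created for the demonstration.
        os.remove(real_path)
```

If no target exists, the sketch returns an empty list, so a caller may want to handle that case explicitly before attempting to load anything.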
