FIX: Reset RT_ALERTS Ingestion dataset on schema change #457

Merged: 2 commits into main, Oct 25, 2024

Conversation

rymarczy
Collaborator

The RT_ALERTS dataset that we ingest produces parquet files containing columns of struct types. If the struct type of a column changes, a newly ingested dataset cannot be merged with the existing dataset from S3.

If this situation arises, the S3 dataset will be overwritten by the newly ingested dataset, avoiding the merge errors for the day. This will only be applied to Alerts datasets, because the contents of Alerts RT data do not change often throughout the day and Alerts events are persisted with started/complete/modified timestamp fields.

@rymarczy rymarczy requested a review from arkadyan October 25, 2024 12:46

Coverage of commit 404841d

Summary coverage rate:
  lines......: 75.3% (2491 of 3309 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate:
                                                                                     |Lines       |Functions  |Branches    
  Filename                                                                           |Rate     Num|Rate    Num|Rate     Num
  =========================================================================================================================
  src/lamp_py/ingestion/convert_gtfs_rt.py                                           |48.4%    219|    -     0|    -      0


Coverage of commit 38a2aa9

Summary coverage rate:
  lines......: 75.7% (2506 of 3309 lines)
  functions..: no data found
  branches...: no data found

Files changed coverage rate:
                                                                                     |Lines       |Functions  |Branches    
  Filename                                                                           |Rate     Num|Rate    Num|Rate     Num
  =========================================================================================================================
  src/lamp_py/ingestion/convert_gtfs_rt.py                                           |48.4%    219|    -     0|    -      0


@arkadyan arkadyan left a comment


👍

@rymarczy rymarczy merged commit e08367a into main Oct 25, 2024
5 checks passed