185 html report always cleaning #248
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## dev #248 +/- ##
=======================================
Coverage 98.18% 98.18%
=======================================
Files 21 21
Lines 1924 1927 +3
=======================================
+ Hits 1889 1892 +3
Misses 35 35
When running this with the clipped GTFS for Newport, the new flag clean_feed works as expected: with the flag set to False, the report includes some errors and warnings that would otherwise be removed.
However, when running the function with clean_feed=False on larger GTFS files (all Wales or Leeds), I get the following error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File Untitled-1:2
1 # %%
----> 2 instance.html_report(overwrite=True, clean_feed=False)
File ~/src/transport_performance/gtfs/validation.py:1501, in GtfsInstance.html_report(self, report_dir, overwrite, summary_type, extended_validation, clean_feed)
1499 # create extended reports if requested
1500 if extended_validation:
-> 1501 self._extended_validation(output_path=report_dir)
1502 info_href = (
1503 validation_dataframe["message"].apply(
1504 lambda x: "_".join(x.split(" "))
(...)
1508 + ".html"
1509 )
1510 validation_dataframe["info"] = [
1511 f"""<a href="{href}"> Further Info</a>"""
1512 if len(rows) > 1
1513 else "Unavailable"
1514 for href, rows in zip(info_href, validation_dataframe["rows"])
1515 ]
File ~/src/transport_performance/gtfs/validation.py:1375, in GtfsInstance._extended_validation(self, output_path, scheme)
1370 duplicate_counts[col] = impacted_rows[
1371 impacted_rows[f"{col}_original"]
1372 == impacted_rows[f"{col}_duplicate"]
1373 ].shape[0]
1374 else:
-> 1375 impacted_rows = table_map[table].copy().iloc[rows]
1377 # create the html to display the impacted rows (clean possibly)
1378 table_html = f"""
1379 <head>
1380 <link rel="stylesheet" href="styles.css">
(...)
1389 {msg_type}</span>
1390 </h1>"""
KeyError: 'full_stop_schedule'
I suspect something in those GTFS files doesn't work with the HTML report and was previously being cleaned by default, so setting the new flag to False causes the report to fail. Could you please investigate? Thanks!
PS: I made some minor changes (correcting a typo and merging with dev).
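The failure in the traceback above can be reproduced in miniature. The sketch below is not the transport_performance code: `table_map`, `select_impacted_rows` and the sample data are hypothetical stand-ins, with plain lists in place of DataFrames. It only illustrates how looking up an unregistered table name raises a KeyError, and one possible way to make the message more informative.

```python
# Hypothetical stand-in for the mapping from validation table names to data.
table_map = {
    "stops": [{"stop_id": "A"}, {"stop_id": "B"}, {"stop_id": "C"}],
    "trips": [{"trip_id": "t1"}, {"trip_id": "t2"}],
}


def select_impacted_rows(table_map, table, rows):
    """Return the rows at the given positions, failing with a clear message."""
    if table not in table_map:
        known = ", ".join(sorted(table_map))
        raise KeyError(
            f"'{table}' is not in the table map (known tables: {known})"
        )
    data = table_map[table]
    return [data[i] for i in rows]


# A registered table works as expected.
print(select_impacted_rows(table_map, "stops", [0, 2]))

# An unregistered name, such as the one in the traceback, raises KeyError.
try:
    select_impacted_rows(table_map, "full_stop_schedule", [0])
except KeyError as err:
    print(f"KeyError: {err}")
```

Listing the known tables in the message is a design choice for the sketch; it makes it easier to tell a missing registration apart from a misspelt table name.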
Hi @CBROWN-ONS. I am now getting a different KeyError using Leeds: KeyError: 'multiple_stops_invalid'.
Is this a case where we will need to manually add objects to the table map whenever a new issue appears (e.g. with other unfamiliar GTFS), or will there be a fixed number of them? Just wondering if this issue may appear again when trying a new location (e.g. Germany).
Hi Sergio. This shouldn't be a case of adding them one by one, unless we add new validation tables manually. Do you have the full error message?
Here's the code I ran:
Here's the full traceback:
This should be fixed now. It was the same error as before (an unfinished TODO).
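One way to catch this kind of gap before it reaches a report run is a consistency check between the table names the validators can emit and the keys of the table map. The sketch below assumes both are available as plain collections; `missing_table_map_entries`, `validation_tables` and the sample `table_map` are hypothetical names, not part of the package.

```python
def missing_table_map_entries(validation_tables, table_map):
    """Return, sorted, the validation table names with no table map entry."""
    return sorted(set(validation_tables) - set(table_map))


# Hypothetical names: the two keys from the tracebacks plus one ordinary table.
validation_tables = ["stops", "full_stop_schedule", "multiple_stops_invalid"]
table_map = {"stops": [], "full_stop_schedule": []}

# Prints ['multiple_stops_invalid']
print(missing_table_map_entries(validation_tables, table_map))
```

A check like this could sit in a unit test, so that adding a new validation table without registering it fails in CI and not on an unfamiliar feed.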
* feat: add doctstring;add param;add type defence
* feat: add condition on whether to clean or not
* fix: fixed small typo in error message
* fix: changed match in test to fit error message
* fix: update gtfs attr table for html reports
* test: Refactor tests that are now warning about expired feed
* refactor: Test targets new row thanks to feed expired warning
* refactor: Tests assert against fixture with expired feed warning
---------
Co-authored-by: Sergio Recio <[email protected]>
Co-authored-by: r-leyshon <[email protected]>
4300b44
Description
This PR makes it so that gtfs::validation::GtfsInstance.html_report() does not clean the feed when the clean_feed param is set to False.
Fixes #185
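The behaviour described here (a boolean parameter with type defence that decides whether cleaning runs before the report is built) can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: `FeedSketch`, its toy `clean` method, the error wording and the returned string are all hypothetical.

```python
class FeedSketch:
    """Hypothetical stand-in for a GTFS feed wrapper; not the real GtfsInstance."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.cleaned = False

    def clean(self):
        # Toy "cleaning": drop rows with a missing stop_id.
        self.rows = [r for r in self.rows if r.get("stop_id") is not None]
        self.cleaned = True

    def html_report(self, clean_feed=True):
        # Type defence: reject anything that is not a real bool.
        if not isinstance(clean_feed, bool):
            raise TypeError(
                f"`clean_feed` expected bool. Got {type(clean_feed).__name__}"
            )
        # Only clean when asked to, so problems in the raw feed stay visible.
        if clean_feed:
            self.clean()
        return f"report on {len(self.rows)} rows (cleaned={self.cleaned})"


feed = FeedSketch([{"stop_id": "A"}, {"stop_id": None}])
print(feed.html_report(clean_feed=False))  # report on 2 rows (cleaned=False)
print(feed.html_report(clean_feed=True))   # report on 1 rows (cleaned=True)
```

Defaulting `clean_feed` to True in the sketch mirrors the reviewer's observation that cleaning previously happened by default.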
Motivation and Context
Type of change
How Has This Been Tested?
Test configuration details:
Advice for reviewer
Checklist:
Additional comments