Store and retrieve user-specified network config, addresses #644 #941

Merged
abyrd merged 5 commits into dev from network-config
Oct 3, 2024

Member

@abyrd abyrd commented Oct 2, 2024

This is a draft implementation of the server side of the functionality described in #644, as a basis for discussion on integration with the UI.

At bundle creation, the POST request body can now include a "config" item containing JSON that is deserialized to a com.conveyal.r5.analyst.cluster.TransportNetworkConfig. This deserialization step provides some free validation and fail-fast behavior if the user makes mistakes in crafting the network configuration. For backward compatibility, the OSM and GTFS ID fields of that TransportNetworkConfig object are ignored and overwritten with the ones specified in the enclosing POST form's other fields.
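A minimal sketch of the backward-compatibility step described above, not the code from this PR. `NetworkConfigSketch` is a hypothetical, simplified stand-in for `TransportNetworkConfig`; its field names and the `complete` helper are assumptions for illustration. The ID fields supplied inside the user's JSON are discarded in favor of the values from the other form fields, while everything else is retained.

```java
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for TransportNetworkConfig (assumed field names).
class NetworkConfigSketch {
    String osmId;
    List<String> gtfsIds;
    Set<String> buildGridsForModes;

    /**
     * Overwrite the OSM and GTFS IDs with the values from the enclosing form,
     * leaving all other user-specified fields untouched.
     */
    static NetworkConfigSketch complete(
            NetworkConfigSketch userConfig, String formOsmId, List<String> formGtfsIds) {
        // A request without a "config" item still yields a usable config.
        NetworkConfigSketch config = (userConfig != null) ? userConfig : new NetworkConfigSketch();
        config.osmId = formOsmId;
        config.gtfsIds = List.copyOf(formGtfsIds);
        return config;
    }
}
```

With this shape, a user config that sets only `buildGridsForModes` comes out with that field intact and both ID fields filled in from the form.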

The completed network config is then included in the database entry for the bundle, as well as the JSON file in file storage that is visible to the workers. That file is considered the definitive source of truth for the network config, as it has been around a lot longer and is present for all bundles created in the past.

The contents of that definitive config file are exposed by a new endpoint, /api/bundle/[id]/config. For any newly created bundle, the response from this endpoint is expected to be identical to the config field of the bundle object in the database.
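For reference, a hedged sketch of how a client might build a request for the new endpoint using the JDK's `java.net.http` API. Only the path shape comes from the description above; the `BundleConfigRequest` class, the base URL, and the bundle ID are hypothetical, and any authentication the backend requires is omitted.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class BundleConfigRequest {
    /** Build (but do not send) a GET request for one bundle's network config. */
    static HttpRequest forBundle(String baseUrl, String bundleId) {
        // Path shape taken from the PR description: /api/bundle/[id]/config
        URI uri = URI.create(baseUrl + "/api/bundle/" + bundleId + "/config");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and the body compared against the config field of the bundle returned by the database-backed endpoint.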

This has been tested using curl for a simple case specifying "buildGridsForModes":["BICYCLE"]. The configuration was retained in both the database and the JSON file, and was available via both the database-backed and single-purpose API endpoints.

// However, only the instance fields specifying things other than OSM and GTFS IDs will be retained.
// Use strict object mapper (from the strict/lenient pair) to fail on misspelled field names.
String configString = files.get("config").get(0).getString();
bundle.config = JsonUtilities.objectMapper.readValue(configString, TransportNetworkConfig.class);

Member

Do we actually want strict validation of the JSON done here? If we have a new version of the worker that uses new config values, we would need to deploy a new backend with a new TransportNetworkConfig containing those new values in order to create bundles using those new config values. We lose part of the benefit of using a freeform JSON field, in that all allowable fields must be predefined in the backend at the time of deployment.

I already ran into this while creating the UI component, because the proposed defaults contain a value that is not currently in the config.

Member Author

Very good point. We've consistently tried to allow single-purpose or experimental worker versions to function without the backend understanding their particularities, and we need to do the same here. To that end, any validation of specific features should be done by the worker that builds the network, not by the backend. I do want to perform validation because it could be very confusing (and lead to trivial support requests and degraded user experience) if small faults in the configuration were accepted and silent failures ensued.

The backend should perform as much validation as it can in order to fail fast, so someone doesn't have to move on to starting a worker and running some analysis to discover a simple mistake in JSON quoting (for example). Any new special-purpose worker version should be adding fields to those already known by the backend. So I think we want the backend to validate that it's JSON and any fields it already knows about can be deserialized properly, while ignoring the fields it doesn't know about. That's the behavior I originally had, and I shouldn't have added the second commit to alter it.
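A rough sketch of that policy, written against an already-parsed JSON object so it needs only the JDK. The `LenientConfigCheck` class and its table of known fields are hypothetical and cover only a few fields for illustration. With Jackson, the comparable behavior is normally obtained by disabling `DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES` on the mapper; whether the lenient mapper in `JsonUtilities` is configured exactly that way is an assumption here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

class LenientConfigCheck {
    // Hypothetical subset of the fields this backend version knows about,
    // mapped to the Java type each parsed value is expected to have.
    static final Map<String, Class<?>> KNOWN_FIELDS = Map.of(
            "osmId", String.class,
            "gtfsIds", Collection.class,
            "buildGridsForModes", Collection.class);

    /** Report type errors in known fields; silently skip fields that are unknown. */
    static List<String> validate(Map<String, Object> config) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Object> entry : config.entrySet()) {
            Class<?> expected = KNOWN_FIELDS.get(entry.getKey());
            if (expected == null) {
                continue; // Unknown field: may belong to a newer worker, so ignore it.
            }
            Object value = entry.getValue();
            if (value != null && !expected.isInstance(value)) {
                errors.add(entry.getKey() + ": expected " + expected.getSimpleName());
            }
        }
        return errors;
    }
}
```

A config containing an unrecognized field passes, while a known field with the wrong shape (a bare string where an array is expected, say) is reported immediately rather than surfacing later on a worker.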

Member Author

@abyrd abyrd Oct 3, 2024

On further thought, even on the worker we always need to ignore unknown field names. There are legitimate reasons a network built with a certain config might later be used with an older worker that doesn't possess a particular new feature (if only for testing the effects of the new feature). This has been clarified in the code comments.

abyrd added 2 commits October 3, 2024 12:16
This reverts commit 3094f8a32aa5ef8be8a816ef6fe42bc5ea8513c.

The backend needs to allow sending config to workers with new features it doesn't know about.
@abyrd abyrd enabled auto-merge October 3, 2024 17:16
Member

@ansoncfit ansoncfit left a comment

Looks good

@abyrd abyrd merged commit 68c6772 into dev Oct 3, 2024
3 checks passed
@abyrd abyrd deleted the network-config branch October 3, 2024 19:18
@trevorgerhardt trevorgerhardt mentioned this pull request Oct 26, 2024