Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store and retrieve user-specified network config, addresses #644 #941

Merged
merged 5 commits into from
Oct 3, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,6 @@
import com.conveyal.file.FileUtils;
import com.conveyal.gtfs.GTFSCache;
import com.conveyal.gtfs.GTFSFeed;
import com.conveyal.gtfs.error.GTFSError;
import com.conveyal.gtfs.error.GeneralError;
import com.conveyal.gtfs.model.Stop;
import com.conveyal.gtfs.validator.PostLoadValidator;
Expand All @@ -22,6 +21,7 @@
import com.conveyal.r5.analyst.progress.ProgressInputStream;
import com.conveyal.r5.analyst.cluster.TransportNetworkConfig;
import com.conveyal.r5.analyst.progress.Task;
import com.conveyal.r5.common.JsonUtilities;
import com.conveyal.r5.streets.OSMCache;
import com.conveyal.r5.util.ExceptionUtils;
import com.mongodb.QueryBuilder;
Expand Down Expand Up @@ -81,6 +81,7 @@ public BundleController (BackendComponents components) {
public void registerEndpoints (Service sparkService) {
sparkService.path("/api/bundle", () -> {
sparkService.get("", this::getBundles, toJson);
sparkService.get("/:_id/config", this::getBundleConfig, toJson);
sparkService.get("/:_id", this::getBundle, toJson);
sparkService.post("", this::create, toJson);
sparkService.put("/:_id", this::update, toJson);
Expand Down Expand Up @@ -110,15 +111,13 @@ private Bundle create (Request req, Response res) {
try {
bundle.name = files.get("bundleName").get(0).getString("UTF-8");
bundle.regionId = files.get("regionId").get(0).getString("UTF-8");

if (files.get("osmId") != null) {
bundle.osmId = files.get("osmId").get(0).getString("UTF-8");
Bundle bundleWithOsm = Persistence.bundles.find(QueryBuilder.start("osmId").is(bundle.osmId).get()).next();
if (bundleWithOsm == null) {
throw AnalysisServerException.badRequest("Selected OSM does not exist.");
}
}

if (files.get("feedGroupId") != null) {
bundle.feedGroupId = files.get("feedGroupId").get(0).getString("UTF-8");
Bundle bundleWithFeed = Persistence.bundles.find(QueryBuilder.start("feedGroupId").is(bundle.feedGroupId).get()).next();
Expand All @@ -135,6 +134,13 @@ private Bundle create (Request req, Response res) {
bundle.feedsComplete = bundleWithFeed.feedsComplete;
bundle.totalFeeds = bundleWithFeed.totalFeeds;
}
if (files.get("config") != null) {
// For validation, rather than reading as freeform JSON, deserialize into a model class instance.
// However, only the instance fields specifying things other than OSM and GTFS IDs will be retained.
// Use strict object mapper (from the strict/lenient pair) to fail on misspelled field names.
String configString = files.get("config").get(0).getString();
bundle.config = JsonUtilities.objectMapper.readValue(configString, TransportNetworkConfig.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want actually want strict validation of the JSON done here? If we have a new version of the worker that uses new config values, we would need to deploy a new backend with a new TransportNetworkConfig containing those new values in order to create bundles using those new config values. We lose part of the benefit of using a freeform JSON field in that all allowable fields must be predefined in the backend at the time of deployment.

I already ran into this while creating the UI component, because the proposed defaults contain a value that is not currently in the config.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point. We've consistently tried to allow single-purpose or experimental worker versions to function without the backend understanding their particularities, and we need to do the same here. To that end, any validation of specific features should be done by the worker that builds the network, not by the backend. I do want to perform validation because it could be very confusing (and lead to trivial support requests and degraded user experience) if small faults in the configuration were accepted and silent failures ensued.

The backend should perform as much validation as it can in order to fail fast, so someone doesn't have to move on to starting a worker and running some analysis to discover a simple mistake in JSON quoting (for example). Any new special-purpose worker version should be adding fields to those already known by the backend. So I think we want the backend to validate that it's JSON and any fields it already knows about can be deserialized properly, while ignoring the fields it doesn't know about. That's the behavior I originally had, and I shouldn't have added the second commit to alter it.

Copy link
Member Author

@abyrd abyrd Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further thought, even on the worker we always need to ignore unknown field names. There are legitimate reasons a network built with a certain config might later be used with an older worker that doesn't possess a particular new feature (if only for testing the effects of the new feature). This has been clarified in the code comments.

}
UserPermissions userPermissions = UserPermissions.from(req);
bundle.accessGroup = userPermissions.accessGroup;
bundle.createdBy = userPermissions.email;
Expand Down Expand Up @@ -274,15 +280,19 @@ private Bundle create (Request req, Response res) {
return bundle;
}

/** SIDE EFFECTS: This method will change the field bundle.config before writing it. */
private void writeNetworkConfigToCache (Bundle bundle) throws IOException {
TransportNetworkConfig networkConfig = new TransportNetworkConfig();
networkConfig.osmId = bundle.osmId;
networkConfig.gtfsIds = bundle.feeds.stream().map(f -> f.bundleScopedFeedId).collect(Collectors.toList());

// If the user specified additional network configuration options, they should already be in bundle.config.
// If no custom options were specified, we start with a fresh, empty instance.
if (bundle.config == null) {
bundle.config = new TransportNetworkConfig();
}
// This will overwrite and override any inconsistent osm and gtfs IDs that were mistakenly supplied by the user.
bundle.config.osmId = bundle.osmId;
bundle.config.gtfsIds = bundle.feeds.stream().map(f -> f.bundleScopedFeedId).collect(Collectors.toList());
String configFileName = bundle._id + ".json";
File configFile = FileUtils.createScratchFile("json");
JsonUtil.objectMapper.writeValue(configFile, networkConfig);

JsonUtil.objectMapper.writeValue(configFile, bundle.config);
FileStorageKey key = new FileStorageKey(BUNDLES, configFileName);
fileStorage.moveIntoStorage(key, configFile);
}
Expand Down Expand Up @@ -312,6 +322,27 @@ private Bundle getBundle (Request req, Response res) {
return bundle;
}

/**
* There are two copies of the Bundle/Network config: one in the Bundle entry in the database and one in a JSON
* file (obtainable by the workers). This method always reads the one in the file, which has been around longer
* and is considered the definitive source of truth. The entry in the database is a newer addition and has only
* been around since September 2024.
*/
private TransportNetworkConfig getBundleConfig (Request request, Response res) {
// Unfortunately this mimics logic in TransportNetworkCache. Deduplicate in a static utility method?
String id = GTFSCache.cleanId(request.params("_id"));
FileStorageKey key = new FileStorageKey(BUNDLES, id, "json");
File networkConfigFile = fileStorage.getFile(key);
// Unlike in the worker, we expect the backend to have a model field for every known network/bundle option.
// Threfore, use the default objectMapper that does not tolerate unknown fields.
try {
return JsonUtil.objectMapper.readValue(networkConfigFile, TransportNetworkConfig.class);
} catch (Exception exception) {
LOG.error("Exception deserializing stored network config", exception);
return null;
}
}

private Collection<Bundle> getBundles (Request req, Response res) {
return Persistence.bundles.findPermittedForQuery(req);
}
Expand Down
6 changes: 6 additions & 0 deletions src/main/java/com/conveyal/analysis/models/Bundle.java
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
import com.conveyal.gtfs.error.GTFSError;
import com.conveyal.gtfs.model.FeedInfo;
import com.conveyal.gtfs.validator.model.Priority;
import com.conveyal.r5.analyst.cluster.TransportNetworkConfig;
import com.fasterxml.jackson.annotation.JsonIgnore;

import java.time.LocalDate;
Expand Down Expand Up @@ -47,6 +48,11 @@ public class Bundle extends Model implements Cloneable {
public int feedsComplete;
public int totalFeeds;

// The definitive TransportNetworkConfig is a JSON file stored alonside the feeds in file storage. It is
// duplicated here to record any additional user-specified options that were supplied when the bundle was created.
// It may contain redundant copies of information stored in the outer level Bundle such as OSM and GTFS feed IDs.
public TransportNetworkConfig config;

public static String bundleScopeFeedId (String feedId, String feedGroupId) {
return String.format("%s_%s", feedId, feedGroupId);
}
Expand Down
Loading