Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added an option to split the output proto files by top level category #367

Open
wants to merge 3 commits into
base: xml_converter
Choose a base branch
from

Conversation

klingbolt
Copy link
Contributor

No description provided.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full Diff
--- xml_converter/integration_tests/test_cases/xml_split_by_category/output_proto/toplevel1.guildpoint.textproto._old	2024-12-09 03:53:05.031311821 +0000
+++ xml_converter/integration_tests/test_cases/xml_split_by_category/output_proto/toplevel1.guildpoint.textproto._new	2024-12-09 03:53:05.035311909 +0000
@@ -0,0 +1,12 @@
+category {
+  name: "TopLevel1"
+  icon {
+    map_id: 50
+    position {
+      x: 169.81
+      y: 210.65
+      z: 215.83
+    }
+  }
+  id: "\010\007L]\332B\361\323"
+}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full Diff
--- xml_converter/integration_tests/test_cases/xml_split_by_category/output_proto/toplevel2.guildpoint.textproto._old	2024-12-09 03:53:05.043312083 +0000
+++ xml_converter/integration_tests/test_cases/xml_split_by_category/output_proto/toplevel2.guildpoint.textproto._new	2024-12-09 03:53:05.051312257 +0000
@@ -0,0 +1,12 @@
+category {
+  name: "TopLevel2"
+  icon {
+    map_id: 50
+    position {
+      x: 169.81
+      y: 210.65
+      z: 215.83
+    }
+  }
+  id: "\010\007L]\333B\361\323"
+}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full Diff
--- xml_converter/integration_tests/test_cases/xml_split_by_category/output_proto/toplevel3.guildpoint.textproto._old	2024-12-09 03:53:05.059312431 +0000
+++ xml_converter/integration_tests/test_cases/xml_split_by_category/output_proto/toplevel3.guildpoint.textproto._new	2024-12-09 03:53:05.067312606 +0000
@@ -0,0 +1,12 @@
+category {
+  name: "TopLevel3"
+  icon {
+    map_id: 50
+    position {
+      x: 169.81
+      y: 210.65
+      z: 215.83
+    }
+  }
+  id: "\010\007L]\334B\361\323"
+}

Copy link
Owner

@AsherGlick AsherGlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good proof of concept but I think some of the minor design decisions chosen here did not take enough into consideration and need to be reevaluated.

Comment on lines +2 to +6
<MarkerCategory DisplayName="TopLevel1" Name="toplevel1" XPos="169.81" YPos="210.65" ZPos="215.83" MapID="50" ></MarkerCategory>

<MarkerCategory DisplayName="TopLevel2" Name="toplevel2" XPos="169.81" YPos="210.65" ZPos="215.83" MapID="50" ></MarkerCategory>

<MarkerCategory DisplayName="TopLevel3" Name="toplevel3" XPos="169.81" YPos="210.65" ZPos="215.83" MapID="50" ></MarkerCategory>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets differentiate these by more than just DisplayName and Name. Otherwise this test does a silly job at not catching the issue of "we write the same top level node but with different display/names for each category". The attributes could also be given to the POIs instead of the categories because XPos YPos ZPos dont make a huge amount of sense as category attributes.

Lets also add some subcategories here too so that we can test that "more than just the top level category gets written". Because we do want the entire category tree.

Comment on lines +237 to +238
StringHierarchy category_filter;
category_filter.add_path({}, true);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you hardcoding the global wildcard here and preventing it from being passed in to the function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that category_filter is not global in the same sense as the other two large inputs. It is created just before this function is called, and always for a single path. In this sense, it is more similar in scope to the ProtoWriterState, which is also created in this function.

Comment on lines +248 to +284
void write_protobuf_file_per_category(
const string& marker_pack_root_directory,
const map<string, Category>* marker_categories,
const vector<Parseable*>* parsed_pois) {
std::map<string, std::vector<Parseable*>> category_to_pois;
ProtoWriterState state;
state.marker_pack_root_directory = marker_pack_root_directory;

for (size_t i = 0; i < parsed_pois->size(); i++) {
Parseable* parsed_poi = (*parsed_pois)[i];
if (parsed_poi->classname() == "POI") {
Icon* icon = dynamic_cast<Icon*>(parsed_poi);
// TODO(331): This is the wrong place to lowercase() the category and is hiding some crimes elsewhere
category_to_pois[lowercase(icon->category.category)].push_back(parsed_poi);
}
else if (parsed_poi->classname() == "Trail") {
Trail* trail = dynamic_cast<Trail*>(parsed_poi);
// TODO(331): This is the wrong place to lowercase() the category and is hiding some crimes elsewhere
category_to_pois[lowercase(trail->category.category)].push_back(parsed_poi);
}
else {
std::cout << "Unknown type" << std::endl;
}
}

for (map<string, Category>::const_iterator it = marker_categories->begin(); it != marker_categories->end(); it++) {
StringHierarchy category_filter;
category_filter.add_path({it->first}, true);
_write_protobuf_file(
join_file_paths(state.marker_pack_root_directory, it->first + ".guildpoint"),
category_filter,
marker_categories,
category_to_pois,
&state);
}
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then this function duplicates all of the logic of write_protobuf_file in order to access the earlier removed ability to pass in a category_fiter.

If the counter argument is the building of the category_to_pois dict for each file, aka: performance optimizations. I would necessarily have expected to see benchmarks somewhere in this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function actually duplicates write_protobuf_file_per_map_id. Since that is now 3 places, perhaps the creation of category_to_pois should be its own function.

My other thought was that I am already iterating over the marker_categories map in order to get the name and would need a second iteration if I were to do it for the category filter first before this function is called.

Comment on lines +309 to +311
StringHierarchy category_filter;
category_filter.add_path({}, true);

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Why are you hardcoding the global wildcard here and preventing it from being passed in to the function?

Comment on lines +149 to +154
if (split_proto_by_category) {
write_protobuf_file_per_category(output_path, marker_categories, parsed_pois);
}
else {
write_protobuf_file(output_path, marker_categories, parsed_pois);
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic should probably be nearby the logic for split by map id. It is convoluted that split by map id is in the parent function but split by category is in this child function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have actually considered moving map_id into this function but it would require reworking some other variables so thought it outside this PR scope.

Comment on lines +317 to +318
else if (!strcmp(argv[i], "--split-guildpoint-by-category")) {
split_proto_by_category = true;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if --output-split-guildpoint-path and --split-guildpoint-by-category are both specified. Does the program act in a way that a user could reasonably expect? I think it wont, because it will only split by map id as the current logic for splitting by map id does not interact with the splitting by category logic.

Maybe this is the wrong argument to be exposing to the user? Think through some user stories for this holistically because we can still alter how other command line flags for xml_converter are presented to the user if we think there are better paradigms to use than what we are currently using.

@klingbolt
Copy link
Contributor Author

I agree this PR is incomplete. It is also heavily affected by a bug in type that is addressed in #331. In hindsight I should have marked this as a draft.
I would like your opinion. This is now the third optional flag if you consider, split by mapid, split by category, and allow duplicates. I think it should be possible to want all three of these at the same time. Could we change the arguments to that there is one filepath argument and then all three of these, or any other writing modifiers, as options? Or should it be split so that every optional writing option is its own filepath.

@AsherGlick
Copy link
Owner

I have not given the ux much thought yet. But I would think about "is there a situation where we want to output multiple files". I could think that an automated service that is testing these would want to output multiple types at once, maybe for automatic downloads. And we use that within our test harness, but the harness could be changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants